Introduction:
This tool wraps the R 'glm' function and ranks peaks by the coefficients of the
fitted logistic regression model. Group information is supplied in a group design
file (tab-delimited text file). The tool supports binary classification only, so
the number of groups must be exactly 2.
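The ranking idea can be illustrated with a minimal sketch. It is not the tool's actual R code: it assumes one single-predictor logistic regression per peak, fitted by plain gradient ascent on standardized values, with peaks ordered by absolute coefficient. The function names (`fit_logistic_1d`, `rank_peaks`) and the 0/1 label coding are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def standardize(values):
    # Centre to mean 0 and scale to unit variance so coefficients are comparable.
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    sd = math.sqrt(var) or 1.0  # guard against constant peaks
    return [(v - mean) / sd for v in values]

def fit_logistic_1d(x, y, lr=0.1, n_iter=2000):
    # Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent on the log-likelihood.
    # Assumption: a fixed number of iterations stands in for glm's IRLS fitting.
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            err = yi - sigmoid(b0 + b1 * xi)
            g0 += err
            g1 += err * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def rank_peaks(peaks, labels):
    # peaks: {compound: [value per sample]}; labels: [0 or 1 per sample].
    coefs = {name: fit_logistic_1d(standardize(vals), labels)[1]
             for name, vals in peaks.items()}
    # Largest absolute coefficient first.
    return sorted(coefs.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

Calling `rank_peaks(peaks, labels)` returns a list of (compound, coefficient) pairs. Labels must be coded 0/1 here, which is why exactly two groups are required.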
Input files:
1. Peak table file in tab-delimited text format, with the first column as the
compound identifier and the remaining columns as samples.
For example:
| | HU_011 | HU_014 | HU_015 | HU_017 | HU_018 | HU_019 |
| (2-methoxyethoxy)propanoic acid isomer | 3.019766 | 3.814339 | 3.519691 | 2.562183 | 3.781922 | 4.161074 |
| (gamma)Glu-Leu/Ile | 3.888479 | 4.277149 | 4.195649 | 4.32376 | 4.629329 | 4.412266 |
| 1-Methyluric acid | 3.869006 | 3.837704 | 4.102254 | 4.53852 | 4.178829 | 4.516805 |
| 1-Methylxanthine | 3.717259 | 3.776851 | 4.291665 | 4.432216 | 4.11736 | 4.562052 |
| 1,3-Dimethyluric acid | 3.535461 | 3.932581 | 3.955376 | 4.228491 | 4.005545 | 4.320582 |
| 1,7-Dimethyluric acid | 3.325199 | 4.025125 | 3.972904 | 4.109927 | 4.024092 | 4.326856 |
| 2-acetamido-4-methylphenyl acetate | 4.204754 | 5.181858 | 3.88568 | 4.237915 | 1.852994 | 4.080681 |
| 2-Aminoadipic acid | 4.080204 | 4.359246 | 4.249111 | 4.231404 | 4.323679 | 4.244485 |
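A peak table in this layout can be read with a short sketch like the one below. It is an illustration only: the function name `read_peak_table` is hypothetical, and it assumes that the first row holds the sample names and that every value is numeric.

```python
import csv
import io

def read_peak_table(handle):
    # Returns (sample_names, {compound: [float per sample]}).
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # the first header cell belongs to the compound column
    peaks = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        compound, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(f"{compound}: expected {len(samples)} values, got {len(values)}")
        peaks[compound] = [float(v) for v in values]
    return samples, peaks

# Two rows taken from the example table above.
example = (
    "\tHU_011\tHU_014\tHU_015\n"
    "1-Methyluric acid\t3.869006\t3.837704\t4.102254\n"
    "1,3-Dimethyluric acid\t3.535461\t3.932581\t3.955376\n"
)
samples, peaks = read_peak_table(io.StringIO(example))
```

Because the delimiter is a tab, compound names that contain commas (such as 1,3-Dimethyluric acid) are read as a single field.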
2. Group design file in tab-delimited text format with two columns
(sample name, group name).
For example:
| HU_011 | M |
| HU_014 | F |
| HU_015 | M |
| HU_017 | M |
| HU_018 | M |
| HU_019 | M |
| HU_020 | M |
| HU_021 | M |
| HU_022 | F |
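Before running the tool, it may help to check a design file against the two-group requirement. The sketch below is one possible check, not part of the tool: the function names are hypothetical, and it assumes that the file has no header row and that every sample in the peak table must appear in the design file.

```python
import csv
import io

def read_group_design(handle):
    # Returns {sample_name: group_name}; group names are kept as strings.
    groups = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"expected 2 columns, got {len(row)}: {row}")
        groups[row[0].strip()] = row[1].strip()
    return groups

def check_design(groups, samples):
    # Binary classification needs exactly two distinct group names.
    levels = sorted(set(groups.values()))
    if len(levels) != 2:
        raise ValueError(f"expected exactly 2 groups, found {len(levels)}: {levels}")
    missing = [s for s in samples if s not in groups]
    if missing:
        raise ValueError(f"samples without a group: {missing}")
    return levels

design = "HU_011\tM\nHU_014\tF\nHU_015\tM\n"
groups = read_group_design(io.StringIO(design))
levels = check_design(groups, ["HU_011", "HU_014", "HU_015"])
```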
Output files:
1. 'LR_VarImp.txt', feature ranking results.
2. 'LR_Prediction.txt', sample predictions of the logistic regression model on
the input data.
3. 'LR_Prediction_Summary.txt', prediction summary.
4. 'ROC_Curve_Data.txt', ROC analysis results.
5. 'ROC_Curve.pdf', ROC curve plot.
6. 'PR_Curve_Data.txt', PR analysis results.
7. 'PR_Curve.pdf', PR curve plot.
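For readers unfamiliar with ROC analysis, the sketch below shows how ROC points and the area under the curve can be derived from predicted scores. It is a generic illustration: the function names are hypothetical, labels are assumed to be coded 0/1, and the exact layout of 'ROC_Curve_Data.txt' is not described here and may differ.

```python
def roc_points(labels, scores):
    # Returns [(false_positive_rate, true_positive_rate), ...] obtained by
    # lowering the decision threshold from the highest score to the lowest.
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    # Trapezoidal area under the ROC curve.
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

A PR curve can be built the same way by recording precision, tp / (tp + fp), against recall, tp / pos, at each threshold.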
Group names should preferably be character strings. Numeric group names are
also supported but not recommended.