Introduction:
This tool runs orthogonal partial least-squares discriminant analysis (OPLS-DA)
on the input peak table and ranks the peaks by variable importance in projection
(VIP). Group information is supplied in a group design file (Tab-delimited text
file). OPLS-DA supports binary classification only, so the design file must
define exactly 2 groups.
The orthogonal partial least-squares (OPLS) algorithm was introduced by Trygg
and Wold (2002) to model separately the variations of the predictors that are
correlated with the response and those that are orthogonal to it. Its
predictive capacity is similar to that of PLS, and it improves the
interpretation of the predictive components and of the systematic variation
(Pinto, Trygg, and Gottfries 2012). In particular, OPLS modeling of a single
response requires only one predictive component. Diagnostics such as the Q2Y
metric and permutation testing are important for avoiding overfitting and for
assessing the statistical significance of the model. The VIP, which reflects
both the loading weights for each component and the variability of the response
explained by this component (Pinto, Trygg, and Gottfries 2012; Mehmood et al.
2012), can be used for feature ranking and selection (Trygg and Wold 2002;
Pinto, Trygg, and Gottfries 2012).
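To illustrate how the VIP combines loading weights with the response variability explained by each component, the following minimal sketch computes the classical PLS VIP as covered in the review by Mehmood et al. (2012). The function name vip_scores and its inputs are hypothetical, and this tool may use an OPLS-specific variant (see Galindo-Prieto et al. 2014), so this is a sketch of the idea and not the tool's implementation.

```python
import math

def vip_scores(weights, ssy):
    """Classical PLS VIP (illustrative sketch; names are hypothetical).

    weights: list of components, each a list of p loading weights.
    ssy:     response sum of squares explained by each component.
    """
    p = len(weights[0])
    total = sum(ssy)
    vips = []
    for j in range(p):
        acc = 0.0
        for w, ss in zip(weights, ssy):
            norm_sq = sum(v * v for v in w)
            # Squared normalized weight of feature j, weighted by explained variance
            acc += ss * (w[j] ** 2) / norm_sq
        vips.append(math.sqrt(p * acc / total))
    return vips

# One predictive component with two features (made-up weights)
print(vip_scores([[3.0, 4.0]], [1.0]))
```

With this definition the squared VIP values average to 1 across features, which is why a cutoff near 1 is a common convention.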
Input files:
1. Peak table file in Tab-delimited text format, with the first column as the
compound identifier and the remaining columns as samples.
For example:
| | HU_011 | HU_014 | HU_015 | HU_017 | HU_018 | HU_019 |
| (2-methoxyethoxy)propanoic acid isomer | 3.019766 | 3.814339 | 3.519691 | 2.562183 | 3.781922 | 4.161074 |
| (gamma)Glu-Leu/Ile | 3.888479 | 4.277149 | 4.195649 | 4.32376 | 4.629329 | 4.412266 |
| 1-Methyluric acid | 3.869006 | 3.837704 | 4.102254 | 4.53852 | 4.178829 | 4.516805 |
| 1-Methylxanthine | 3.717259 | 3.776851 | 4.291665 | 4.432216 | 4.11736 | 4.562052 |
| 1,3-Dimethyluric acid | 3.535461 | 3.932581 | 3.955376 | 4.228491 | 4.005545 | 4.320582 |
| 1,7-Dimethyluric acid | 3.325199 | 4.025125 | 3.972904 | 4.109927 | 4.024092 | 4.326856 |
| 2-acetamido-4-methylphenyl acetate | 4.204754 | 5.181858 | 3.88568 | 4.237915 | 1.852994 | 4.080681 |
| 2-Aminoadipic acid | 4.080204 | 4.359246 | 4.249111 | 4.231404 | 4.323679 | 4.244485 |
2. Group design file in Tab-delimited text format with two columns: sample
name and group name.
For example:
| HU_011 | M |
| HU_014 | F |
| HU_015 | M |
| HU_017 | M |
| HU_018 | M |
| HU_019 | M |
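Before running the tool, it can help to check that the two input files agree. The sketch below parses both Tab-delimited files and verifies that every sample column in the peak table has a group and that exactly 2 groups are present. The helper names read_tsv and check_design are hypothetical, and the sketch assumes the group design file has no header row, as in the example above.

```python
import csv
import io

def read_tsv(text):
    """Parse Tab-delimited text into a list of non-empty rows."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]

def check_design(peak_rows, design_rows):
    """Return the two group names, or raise ValueError if the inputs disagree."""
    samples = peak_rows[0][1:]  # header row minus the compound identifier column
    groups = {name: grp for name, grp in design_rows}
    missing = [s for s in samples if s not in groups]
    if missing:
        raise ValueError("samples without a group: " + ", ".join(missing))
    labels = sorted(set(groups[s] for s in samples))
    if len(labels) != 2:
        raise ValueError("expected exactly 2 groups, found %d" % len(labels))
    return labels

# Two-sample excerpt of the example files shown above
peak_text = "\tHU_011\tHU_014\n1-Methylxanthine\t3.717259\t3.776851\n"
design_text = "HU_011\tM\nHU_014\tF\n"
print(check_design(read_tsv(peak_text), read_tsv(design_text)))
```

In practice the text would come from the files on disk; inline strings are used here only to keep the sketch self-contained.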
Parameter:
1. VIP-value threshold: A numeric value giving the cutoff for the Variable
Importance in Projection (VIP).
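As a rough illustration of how such a cutoff is typically applied, the sketch below keeps features whose VIP is at or above the threshold and sorts them from highest to lowest. The function name filter_by_vip, the default of 1.0, the use of "at or above" rather than "strictly above", and the VIP values are all assumptions made for illustration; this page does not state how the tool applies the threshold.

```python
def filter_by_vip(rows, threshold=1.0):
    """Keep (feature, vip) pairs with VIP at or above the cutoff, highest first."""
    kept = [(name, vip) for name, vip in rows if vip >= threshold]
    return sorted(kept, key=lambda r: r[1], reverse=True)

# Made-up VIP values, for illustration only
example = [("feature_A", 0.42), ("feature_B", 1.87), ("feature_C", 1.05)]
for name, vip in filter_by_vip(example, threshold=1.0):
    print(name, vip)
```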
Output files:
1. 'OPLSDA_Score.txt', component (scores) matrix; P1 refers to the first
score and O1 refers to the first orthogonal score.
2. 'OPLSDA_VIP.txt', columns: feature name, VIP, Corr.Coeffs (the correlation
coefficient between the raw data and the first score), Corr.P, FDR.
3. 'OPLSDA_VIP_Sig.txt', significant results.
4. 'OPLSDA_Permutation.txt', permutation results.
5. 'Fitted_Curve_Parameter.txt', parameters of the fitted curve in the
permutation plot.
6. 'OPLSDA_R2X_R2Y_Q2.txt', data frame with the model overview.
7. 'OPLSDA_VPlot.pdf', visualization of the 'OPLSDA_VIP.txt' data.
8. 'OPLSDA_Score_2D_Label.pdf', OPLS-DA scatter plot of P1 and O1 values with
sample name labels.
9. 'OPLSDA_Score_2D.pdf', OPLS-DA scatter plot of P1 and O1 values without
sample name labels.
10. 'OPLSDA_Permutation.pdf', visualization of the 'OPLSDA_Permutation.txt'
data.
11. 'OPLSDA_R2X_R2Y_Q2.pdf', visualization of the 'OPLSDA_R2X_R2Y_Q2.txt'
data.
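This page does not say how the FDR column in 'OPLSDA_VIP.txt' is derived. Assuming it is a Benjamini-Hochberg adjustment of the Corr.P values (an assumption, not something confirmed here), the adjustment could be sketched as follows. The function name benjamini_hochberg and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values, for illustration only
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```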
Note:
The sample group file must contain exactly 2 groups.
Group names should preferably be character strings. Numeric group names are
supported but not recommended.
Reference:
[1] Thevenot, E.A., Roux, A., Xu, Y., Ezan, E., Junot, C. 2015. Analysis of
the human adult urinary metabolome variations with age, body mass index and
gender by implementing a comprehensive workflow for univariate and OPLS
statistical analyses. Journal of Proteome Research. 14: 3322-3335.
[2] Trygg, J., Wold, S. 2002. Orthogonal projections to latent structures
(O-PLS). Journal of Chemometrics. 16: 119-128.
[3] Pinto, R.C., Trygg, J., Gottfries, J. 2012. Advantages of orthogonal
inspection in chemometrics. Journal of Chemometrics. 26(6): 231-235.
[4] Mehmood, T., Liland, K.H., Snipen, L., Saebo, S. 2012. A review of
variable selection methods in partial least squares regression. Chemometrics
and Intelligent Laboratory Systems. 118: 62-69.
[5] Galindo-Prieto, B., Eriksson, L., Trygg, J. 2014. Variable influence on
projection (VIP) for orthogonal projections to latent structures (OPLS).
Journal of Chemometrics. 28: 623-632.