Introduction:

This workflow takes a peak table file and a group design file as inputs. It runs the univariate and multivariate statistical analyses selected by the user to compare two groups.

Input files:

1. Peak table file in tab-delimited text format, with the first column as the compound identifier and the remaining columns as samples.

2. Group design file in tab-delimited text format, with two columns (samplename and groupname).
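As an illustration of the two input layouts described above, here is a minimal Python sketch that parses both files and checks that every sample in the peak table has a group. It assumes that both files start with a header row and that all intensities are numeric; the function names and the example compound and sample names are hypothetical and are not part of the workflow.

```python
import csv
import io


def read_peak_table(handle):
    """Return (sample_names, {compound_id: [intensity, ...]})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)          # assumed header: identifier column, then samples
    sample_names = header[1:]
    peaks = {}
    for row in reader:
        if not row:                # skip blank lines
            continue
        peaks[row[0]] = [float(v) for v in row[1:]]
    return sample_names, peaks


def read_group_design(handle):
    """Return {sample_name: group_name} from a two-column file."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader)                   # assumed header row: samplename, groupname
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def missing_samples(sample_names, groups):
    """List samples in the peak table that have no group assigned."""
    return [s for s in sample_names if s not in groups]


# Made-up example content, standing in for the two input files.
peak_text = (
    "compound\tS1\tS2\tS3\tS4\n"
    "M100T50\t10.5\t11.0\t20.1\t19.8\n"
    "M200T75\t5.0\t5.5\t5.2\t4.9\n"
)
design_text = (
    "samplename\tgroupname\n"
    "S1\tcontrol\nS2\tcontrol\nS3\ttreated\nS4\ttreated\n"
)

samples, peaks = read_peak_table(io.StringIO(peak_text))
groups = read_group_design(io.StringIO(design_text))
```

With real files, the same functions can be given handles from open(path, newline="") in place of the io.StringIO objects.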

Output files:

'pkTable_summary.txt', basic summary statistics for each column (sample).

't_test_results.txt', t-test results with p value, log2FC, and q value.

't_test_significant_results.txt', significant t-test results.

'wilcox_test_results.txt', Wilcoxon test results with p value, log2FC, and q value.

'wilcox_test_significant_results.txt', significant Wilcoxon test results.

'aov_results.txt', analysis of variance model results with p value and q value.

'aov_significant_results.txt', significant analysis of variance model results.

'kw_test_results.txt', Kruskal-Wallis rank sum test results with p value and q value.

'kw_test_significant_results.txt', significant Kruskal-Wallis rank sum test results.

'PCA_Score.txt', PCs (scores) matrix.

'PCA_R2.txt', importance of PCs.

'PCA_Screeplot.pdf', scree plot of variance explained (R2).

'PC12_Score_2D_Label.pdf', PCA score scatter plot with sample name labels; PC12 refers to PC1 vs. PC2.

'PC12_Score_2D.pdf', PCA score scatter plot without sample name labels; PC12 refers to PC1 vs. PC2.

'PLSDA_Score.txt', Component (scores) matrix.

'PLSDA_R2X_R2Y_Q2.txt', data frame with the model overview.

'PLSDA_Score_2D_Label.pdf', PLSDA scatter plot of Component 1 and Component 2 score values with sample name labels.

'PLSDA_Score_2D.pdf', PLSDA scatter plot of Component 1 and Component 2 score values without sample name labels.

'OPLSDA_Score.txt', Component (scores) matrix; P1 refers to the first predictive score and O1 refers to the first orthogonal score.

'OPLSDA_VIP.txt', Columns: feature name, VIP, Corr.Coeffs (correlation coefficient between the raw data and the first score), Corr.P, FDR.

'OPLSDA_VIP_Sig.txt', significant VIP results.

'OPLSDA_Permutation.txt', permutation result.

'Fitted_Curve_Parameter.txt', parameters of the fitted curve in the permutation plot.

'OPLSDA_R2X_R2Y_Q2.txt', data frame with the model overview.

'OPLSDA_VPlot.pdf', visualization of the 'OPLSDA_VIP.txt' data.

'OPLSDA_Score_2D_Label.pdf', OPLSDA scatter plot of P1 and O1 values with sample name labels.

'OPLSDA_Score_2D.pdf', OPLSDA scatter plot of P1 and O1 values without sample name labels.

'OPLSDA_Permutation.pdf', visualization of the 'OPLSDA_Permutation.txt' data.

'OPLSDA_R2X_R2Y_Q2.pdf', visualization of the 'OPLSDA_R2X_R2Y_Q2.txt' data.

'SVM_Prediction.txt', SVM model prediction results for the samples in the input data.

'SVM_Prediction_Summary.txt', prediction summary.

'SVM_Imp_Rank.txt', features ranked by SVM-RFE.

'SVM_Imp.pdf', scatter plot of feature importance.

'SVM_Top10_Imp.pdf', plot of the top 10 features.

'RF_Prediction.txt', RF model prediction results for the samples in the input data.

'RF_Prediction_Summary.txt', prediction summary.

'RF_Imp_Rank.txt', features ranked by MeanDecreaseGini.

'RF_Imp.pdf', scatter plot of feature importance.

'RF_Top10_Imp.pdf', plot of the top 10 features.

'Boruta_Decision_Info.txt', final result of feature selection.

'Boruta_Decision_Boxplot.pdf', important bands plot.

'biosigner_variable_results.txt', feature ranking results from the biosigner algorithm.

'biosigner_variable_significant_results.txt', significant feature results.

'biosigner_figure-tier.pdf', classifier tiers of the selected features.

'biosigner_figure-boxplot.pdf', individual boxplots of the selected features.
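The t-test and Wilcoxon test result tables listed above report a log2FC and a q value alongside the p value. This description does not state how the workflow computes them, so the following Python sketch only illustrates one common convention: it assumes that log2FC is the base-2 logarithm of the ratio of the two group means, and that the q value is a Benjamini-Hochberg adjusted p value. The function names are hypothetical.

```python
import math


def log2_fold_change(group_a, group_b):
    """log2 of mean(group_b) / mean(group_a); assumes positive intensities."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2(mean_b / mean_a)


def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Made-up values for illustration only.
fc = log2_fold_change([1.0, 1.0], [4.0, 4.0])
q = bh_q_values([0.01, 0.04, 0.03, 0.005])
```

Under these assumptions, a feature would count as significant when its q value falls below the chosen threshold; the direction of the fold change (which group is the numerator) is also an assumption here.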

Parameters:

Please refer to the corresponding modules for specific parameters.