Introduction:
This tool is a wrapper for the function 'runGC()' in the R 'metaMS' package, which is designed to process a series of GC-MS data files and produce a peak table. It performs a pseudospectrum-based analysis, where the basic entity is a collection of (m/z, I) pairs at specific retention times.
The standard workflow of metaMS for GC-MS data is the following:
1. peak picking;
2. definition of pseudospectra;
3. identification and elimination of artefacts;
4. annotation by comparison to a database of standards;
5. definition of unknowns;
6. output.
Input files:
1. Multiple GC-MS raw data files in netCDF, mzXML or mzML format.
Parameters:
1. RT range: the part of the chromatograms to be analyzed. If given, it should be a vector of two numbers giving the minimal and maximal retention time (in minutes), for example 5, 25.
2. FWHM: a number specifying the full width at half maximum of the matched filtration Gaussian model peak. It is used only to calculate the actual sigma.
3. RT_Diff: the allowed RT shift, in minutes, between different samples.
4. Min_Features: the minimum number of ions in a mass spectrum.
5. similarity_threshold: the minimum similarity required for mass spectra to be considered the same compound.
6. min.class.fract: the fraction of samples in which a pseudospectrum must be present before it is regarded as an unknown.
7. min.class.size: the absolute number of samples in which a pseudospectrum must be present before it is regarded as an unknown.
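The FWHM and unknown-detection parameters can be illustrated with a short Python sketch. The FWHM-to-sigma relation is the standard one for a Gaussian peak. The way the two unknown thresholds are combined (both must be met) is an assumption made for illustration, and the function names are hypothetical, not part of metaMS.

```python
import math

# For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.3548 * sigma).
FWHM_TO_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def sigma_from_fwhm(fwhm):
    """Return the standard deviation of a Gaussian peak with the given FWHM."""
    if fwhm <= 0:
        raise ValueError("FWHM must be positive")
    return fwhm / FWHM_TO_SIGMA


def passes_unknown_filter(n_present, n_samples, min_class_fract, min_class_size):
    """Hypothetical check of the two 'unknown' thresholds.

    n_present: number of samples in which the pseudospectrum was found.
    n_samples: number of samples in the group being considered.
    ASSUMPTION: both the absolute and the fractional threshold must be met.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    enough_samples = n_present >= min_class_size
    enough_fraction = (n_present / n_samples) >= min_class_fract
    return enough_samples and enough_fraction
```

For instance, sigma_from_fwhm(5) is roughly 2.12, and a pseudospectrum found in 3 of 4 samples passes with min_class_fract=0.75 and min_class_size=3 under the stated assumption.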
Output files:
1. 'gcms_raw_pkTable.txt': the raw peak table, with one line per "compound" and one column per sample.
For example:
| Name | alg7     | alg8     | alg9     | alg11    |
| MC1  | 0        | 0        | 13856154 | 16243519 |
| MC2  | 0        | 0        | 31899968 | 38644500 |
| MC3  | 0        | 0        | 3492761  | 3258346  |
| MC4  | 12240788 | 10174549 | 0        | 0        |
| MC5  | 0        | 12862868 | 10526781 | 0        |
2. 'gcms_mass_spectra.msp': the mass spectrum of each pseudospectrum (compound) in MSP format. The identifiers are the same as in the peak table file.
For example:
Name: MC1
DB.idx: -17
rt: 10.277
Class: Unknown
rt.sd: 0.0032
Num Peaks: 21
53 171375; 54 77970; 61 503846; 62 64082; 67 248848;
68 1989572; 69 1357050; 70 31899968; 71 1527535; 75 3753457;
76 301941; 96 67588; 103 1462811; 104 139832; 105 78760;
126 80663; 144 303486; 170 82419; 172 1232879; 173 172461;
187 77275;

Name: MC2
DB.idx: -18
rt: 10.103
Class: Unknown
rt.sd: 0.0039
Num Peaks: 20
53 135076; 54 142370; 55 440504; 65 93944; 67 253352;
68 155873; 80 137896; 81 74296; 82 1013110; 83 388082;
84 13856154; 85 911621; 92 227705; 110 46007; 138 19458;
186 440870; 187 71383; 199 17498; 201 66141; 436 2903;
3. 'gcms_mass_spectra_999norm.msp': the mass spectra in MSP format with normalized intensities (intensities sum to 999).
4. 'TICs.pdf': Total Ion Chromatograms.
5. 'BPCs.pdf': Base Peak Chromatograms.
6. 'EICs': Extracted Ion Chromatograms.
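As an illustration of how the MSP output might be read downstream, the following Python sketch parses MSP-style text into records and rescales intensities so that they sum to 999, as described for 'gcms_mass_spectra_999norm.msp'. The function names and the example record EX1 are hypothetical. The rounding used by the tool is not stated here, so the sketch keeps floating-point values.

```python
def parse_msp(text):
    """Parse MSP-style text into a list of dicts.

    Each dict holds the header fields as strings plus a 'peaks' list of
    (mz, intensity) tuples. Peak lines may wrap anywhere between numbers.
    """
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.lower().startswith("name:"):
            # A 'Name:' line starts a new record.
            records.append({"Name": line.split(":", 1)[1].strip(), "peaks": []})
        elif not records:
            continue  # ignore anything before the first record
        elif ":" in line:
            # Header field such as 'rt: 10.277' or 'Num Peaks: 21'.
            key, value = line.split(":", 1)
            records[-1][key.strip()] = value.strip()
        else:
            # Peak line: collect numbers as a flat list, pair them up later.
            values = [float(tok) for tok in line.replace(";", " ").split()]
            records[-1]["peaks"].extend(values)
    for rec in records:
        flat = rec["peaks"]
        rec["peaks"] = list(zip(flat[0::2], flat[1::2]))
    return records


def normalize_to_sum(peaks, total=999.0):
    """Rescale intensities so that they add up to `total`."""
    intensity_sum = sum(intensity for _, intensity in peaks)
    if intensity_sum <= 0:
        raise ValueError("spectrum has no positive intensity")
    return [(mz, intensity * total / intensity_sum) for mz, intensity in peaks]


# Made-up record for illustration only (not taken from real output).
EXAMPLE = """Name: EX1
rt: 10.0
Class: Unknown
Num Peaks: 3
53 100; 54 200;
55 700;
"""

records = parse_msp(EXAMPLE)
scaled = normalize_to_sum(records[0]["peaks"])
```

Comparing int(record["Num Peaks"]) with len(record["peaks"]) is a simple consistency check after parsing.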
Note:
ProteoWizard software (http://proteowizard.sourceforge.net/doc_users.html) is recommended for file format conversion. It supports reading and writing the following open formats on all platforms (vendor formats require Windows with vendor libraries):
mzML 1.1
mzML 1.0
mzXML
MGF
MS2/CMS2/BMS2
mzIdentML
Please read the license agreement of this software carefully. It cannot be used for any commercial purposes.
References:
[1] Wehrens R, Weingart G, Mattivi F. metaMS: An open-source pipeline for GC-MS-based untargeted metabolomics. J. Chrom. B, 2014, 966:109-116.
[2] Chambers M C, Maclean B, Burke R, et al. A cross-platform toolkit for mass spectrometry and proteomics. Nature Biotechnology, 2012, 30(10):918-920. http://proteowizard.sourceforge.net/doc_users.html