Introduction:

This tool is a wrapper for the function 'runGC()' in the R 'metaMS' package, which processes a series of GC-MS data files and produces a peak table. It performs a pseudospectrum-based analysis, in which the basic entity is a collection of (m/z, intensity) pairs at a specific retention time. The standard metaMS workflow for GC-MS data is as follows:

1. peak picking;

2. definition of pseudospectra;

3. identification and elimination of artefacts;

4. annotation by comparison to a database of standards;

5. definition of unknowns;

6. output.

Input files:

1.       Multiple GC-MS raw data files in netCDF, mzXML or mzML format.

Parameters:

1.      RT range: part of the chromatograms that is to be analyzed. If given, it should be a vector of two numbers indicating minimal and maximal retention time (in minutes). For example 5, 25.

2.      FWHM: a number specifying the full width at half maximum of the Gaussian model peak used for matched filtration. It is only used to calculate the actual sigma of that model peak.

3.      RT_Diff: the allowed RT shift in minutes between different samples.

4.      Min_Features: the minimum number of ions in a mass spectrum.

5.      similarity_threshold: the minimum similarity required for mass spectra to be considered the same compound.

6.      min. class. fract: the minimum fraction of samples in which a pseudospectrum must be present before it is regarded as an unknown.

7.      min. class. size: the minimum absolute number of samples in which a pseudospectrum must be present before it is regarded as an unknown.
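
To make the FWHM and RT range parameters more concrete, here is a minimal Python sketch. It is an illustration only and not part of the tool. The FWHM-to-sigma relation is the standard one for a Gaussian peak; the helper names 'fwhm_to_sigma' and 'in_rt_range' are hypothetical, and treating the RT range as inclusive at both ends is an assumption.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert the full width at half maximum of a Gaussian to its sigma.

    For a Gaussian, FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.3548 * sigma).
    """
    if fwhm <= 0:
        raise ValueError("FWHM must be positive")
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def in_rt_range(rt_minutes, rt_range):
    """Return True if a retention time (minutes) lies inside the RT range.

    Assumption: both ends of the range are inclusive.
    """
    low, high = rt_range
    return low <= rt_minutes <= high


print(round(fwhm_to_sigma(5.0), 3))    # 2.123
print(in_rt_range(10.277, (5, 25)))    # True
```

The result uses the same unit as the FWHM value that is passed in.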

Output files:

1.      'gcms_raw_pkTable.txt', the raw peak table, with one row per "compound" and one column per sample.

For example:

Name    alg7        alg8        alg9        alg11
MC1     0           0           13856154    16243519
MC2     0           0           31899968    38644500
MC3     0           0           3492761     3258346
MC4     12240788    10174549    0           0
MC5     0           12862868    10526781    0
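
A peak table like this can be loaded for downstream checks. The following Python sketch is an illustration only: it assumes the file is tab-delimited with a header row, and it assumes that an intensity of 0 means the compound was not detected in that sample. The function names 'read_peak_table' and 'detection_counts' are hypothetical.

```python
import csv
import io

# The example rows shown above, written as tab-delimited text (assumed layout).
SAMPLE = (
    "Name\talg7\talg8\talg9\talg11\n"
    "MC1\t0\t0\t13856154\t16243519\n"
    "MC2\t0\t0\t31899968\t38644500\n"
    "MC3\t0\t0\t3492761\t3258346\n"
    "MC4\t12240788\t10174549\t0\t0\n"
    "MC5\t0\t12862868\t10526781\t0\n"
)


def read_peak_table(handle):
    """Read a peak table into {compound: {sample: intensity}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]
    table = {}
    for row in reader:
        if not row:
            continue  # skip empty lines
        values = [float(v) for v in row[1:]]
        table[row[0]] = dict(zip(samples, values))
    return samples, table


def detection_counts(table):
    """Count, per compound, the samples with a non-zero intensity."""
    return {
        name: sum(1 for value in intensities.values() if value > 0)
        for name, intensities in table.items()
    }


samples, table = read_peak_table(io.StringIO(SAMPLE))
counts = detection_counts(table)
print(samples)
print(counts)
```

To read the real output file, pass an open file handle instead, for example 'open("gcms_raw_pkTable.txt", newline="")'.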

 

2.      'gcms_mass_spectra.msp', the mass spectrum of each pseudospectrum (compound) in MSP format; the identifiers are the same as in the peak table file.

For example:

Name: MC1
DB.idx: -17
rt: 10.277
Class: Unknown
rt.sd: 0.0032
Num Peaks: 21
53 171375; 54 77970; 61 503846; 62 64082; 67 248848;
68 1989572; 69 1357050; 70 31899968; 71 1527535; 75 3753457;
76 301941; 96 67588; 103 1462811; 104 139832; 105 78760;
126 80663; 144 303486; 170 82419; 172 1232879; 173 172461;
187 77275;

Name: MC2
DB.idx: -18
rt: 10.103
Class: Unknown
rt.sd: 0.0039
Num Peaks: 20
53 135076; 54 142370; 55 440504; 65 93944; 67 253352;
68 155873; 80 137896; 81 74296; 82 1013110; 83 388082;
84 13856154; 85 911621; 92 227705; 110 46007; 138 19458;
186 440870; 187 71383; 199 17498; 201 66141; 436 2903;
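
The MSP layout shown above is simple enough to parse by hand. The Python sketch below is an illustration only and not part of the tool. It assumes that each record starts with a 'Name:' line, that metadata lines contain a colon, and that peak lines hold 'm/z intensity' pairs separated by semicolons, as in the example. The function name 'parse_msp' is hypothetical.

```python
EXAMPLE = """\
Name: MC1
DB.idx: -17
rt: 10.277
Class: Unknown
rt.sd: 0.0032
Num Peaks: 21
53 171375; 54 77970; 61 503846; 62 64082; 67 248848;
68 1989572; 69 1357050; 70 31899968; 71 1527535; 75 3753457;
76 301941; 96 67588; 103 1462811; 104 139832; 105 78760;
126 80663; 144 303486; 170 82419; 172 1232879; 173 172461;
187 77275;
"""


def parse_msp(text):
    """Parse MSP text into a list of {"meta": dict, "peaks": list} records."""
    records = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # blank lines carry no information
        if line.startswith("Name:"):
            # A "Name:" line opens a new record.
            current = {"meta": {}, "peaks": []}
            records.append(current)
        if current is None:
            continue  # ignore anything before the first record
        if ":" in line:
            # Metadata line such as "rt: 10.277".
            key, value = line.split(":", 1)
            current["meta"][key.strip()] = value.strip()
        else:
            # Peak line: "mz intensity; mz intensity; ...".
            for pair in line.split(";"):
                if pair.strip():
                    mz, intensity = pair.split()
                    current["peaks"].append((float(mz), float(intensity)))
    return records


records = parse_msp(EXAMPLE)
first = records[0]
base_peak = max(first["peaks"], key=lambda peak: peak[1])
print(first["meta"]["Name"], len(first["peaks"]), base_peak)
```

Comparing 'len(record["peaks"])' with the 'Num Peaks' field is a quick consistency check on each parsed record.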

 

3.      'gcms_mass_spectra_999norm.msp', the same mass spectra in MSP format, with intensities normalized so that they sum to 999.

4.      'TICs.pdf', Total Ion Chromatograms.

5.      'BPCs.pdf', Base Peak Chromatograms.

6.      'EICs', Extracted Ion Chromatograms.
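
The normalization behind 'gcms_mass_spectra_999norm.msp' is described above as scaling intensities so that they sum to 999. A minimal Python sketch of that scaling follows. It is an illustration only: the function name 'normalize_to_sum' is hypothetical, and whether the tool rounds the scaled values is not stated here, so no rounding is applied.

```python
def normalize_to_sum(peaks, target=999.0):
    """Scale (mz, intensity) pairs so that the intensities sum to `target`."""
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        raise ValueError("spectrum has no positive intensity to normalize")
    scale = target / total
    return [(mz, intensity * scale) for mz, intensity in peaks]


# First three peaks of the MC1 example, used here as a toy spectrum.
peaks = [(53, 171375), (54, 77970), (61, 503846)]
normalized = normalize_to_sum(peaks)
print(round(sum(intensity for _, intensity in normalized), 6))  # 999.0
```

Relative intensities are unchanged by this scaling, so the ranking of peaks within a spectrum stays the same.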

Note:

ProteoWizard (http://proteowizard.sourceforge.net/doc_users.html) is recommended for converting raw data files to one of the supported input formats. It reads and writes the following open formats on all platforms (note: vendor formats require Windows with vendor libraries):

mzML 1.1

mzML 1.0

mzXML

MGF

MS2/CMS2/BMS2

mzIdentML

 

Please read the protocol of this software carefully. It cannot be used for any commercial purposes.

Reference:

[1]     R. Wehrens, G. Weingart and F. Mattivi, metaMS: An open-source pipeline for GC-MS-based untargeted metabolomics, J. Chrom. B (2014), v966, 109-116.

[2]     M. C. Chambers, B. Maclean, R. Burke, et al., A cross-platform toolkit for mass spectrometry and proteomics, Nature Biotechnology (2012), 30(10), 918-920. http://proteowizard.sourceforge.net/doc_users.html