MRMPROBS has been released as a universal program for targeted metabolomics that handles not only multiple reaction monitoring (MRM) or selected reaction monitoring (SRM) data but also SCAN and data-independent MS/MS acquisition (DIA) data. The previous MRMPROBS program was originally developed to deal with large-scale MRM assay data sets that monitor 500-1000 small molecules simultaneously in a single run. It provided 1) a user-friendly graphical user interface (GUI) for data curation and 2) an objective evaluation system for small-molecule identifications. It has now been expanded to DIA-MS data (such as SWATH-MS) and SCAN data (such as GC/MS and LC/MS).
The entire data-processing workflow, from data import to statistical analysis, is supported. This tutorial introduces the workflow for 1) MRM data, 2) SWATH-MS (DIA) data, and 3) GC/MS data for targeted metabolomics. Your feedback on the MRMPROBS project is appreciated and will help improve the identification and quantification systems as well as the user interface.
Section 1: Software environments
Section 2: Required software programs and files
Section 3: Project type and condition
Section 4: ABF file conversion
Section 4-1: Downloading the ABF converter
Section 4-2: Check the conditions for file conversion
Section 4-3: File conversion
Section 5: Reference file format
Section 5-1: Reference library for Project type 1: MRMPROBS key index = metabolite name (abf)
Section 5-2: Reference library for Project type 2: MRMPROBS key index = Function (mzML)
Section 5-3: Reference library for Project type 3: MRMPROBS key index = SCAN or DIA-MS (abf)
Section 5-3-1: Reference format for DIA-MS data
Section 5-3-2: Dictionary file for DIA-MS data processing
Section 5-3-3: Reference format for GC/MS and LC/MS data
Section 6: Starting MRMPROBS
Section 6-1: Summary for MRM demonstration data sets
Section 6-2: Starting up your project
Section 6-3: Importing Abf files
Section 6-4: Parameter
Section 7: MRMPROBS viewer
Section 7-1: Mouse operation in the chromatogram viewer
Section 7-2: Library editor (optional)
Section 7-3: Tool button
Section 7-4: Tab
Section 7-5: Button
Section 7-6: List Box
Section 7-7: Details on the MRMPROBS function
Section 7-7-1: File menu
Section 7-7-2: Data reprocessing
Section 7-7-3: Statistical analysis
Section 7-7-4: Missing value methods
Section 7-7-5: Normalization
Section 7-7-6: Window menu
Section 7-7-7: View menu
Section 7-7-8: Option menu
Section 7-7-9: Export menu
Appendix A: How to obtain appropriate file conversion of the Shimadzu .lcd file
Appendix B: Third option of MRMPROBS: via mzML file
Reifycs Analysis Base File Converter (ABF file converter)
Download link: http://www.reifycs.com/AbfConverter/index.html
Download link: http://prime.psc.riken.jp/Metabolomics_Software/MRMPROBS/index.html
Demo files and the reference library (tab-delimited text file)
MRMPROBS can import Analysis Base Framework (ABF) format data. MRMPROBS extracts chromatogram data in combination with the reference library, which includes the name of each target metabolite, its retention time and amplitude information, and its precursor m/z and product m/z. The supported formats for ABF conversion are Shimadzu Inc. (.LCD), Agilent Technologies (.D), AB Sciex (.WIFF), Waters (.RAW), and Thermo Fisher Scientific (.RAW). MRMPROBS also accepts the common data format mzML, converted by the open-source file translator ProteoWizard. Details are described in Appendix B.
1. MRMPROBS key index = metabolite name (abf)
2. MRMPROBS key index = Function (mzML)
* The above two projects are for MRM data sets.
3. MRMPROBS key index = SCAN or DIA-MS (abf)
4. MRM-DIFF (abf, mzML)
* File converter is freely available.
To convert files from some MS vendors, including Bruker, LECO, Shimadzu, Thermo, and Waters, the vendor-specific data access library must be installed on your PC.
Also see FAQ for ABF converter
Summary of PC conditions required for file conversion
|Vendor||File format||Required software or preprocessing|
|Agilent||.D||None, but the files from Chemstation should be converted to netCDF|
|LECO||.PEG||All PEG files should be first converted to netCDF (AIA).|
|Shimadzu for GC/MS||.QGD||GCMS solution|
|Shimadzu for LC/MS||.LCD||LCMS solutions|
|Waters||.RAW||MassLynx Raw Data Reader Interface Library|
|netCDF||.CDF||Microsoft Visual J# 2.0|
Five items are required in tab-delimited format. The header names are flexible, but the column order must be kept.
Column 1. Compound name (a)
Column 2. Precursor m/z (accurate m/z information is rounded to nominal m/z)
Column 3. Product m/z
Column 4. Retention time [min]
Column 5. Amplitude ratios [%] (b)
(a) When you choose project type 1 of MRMPROBS, the name must be identical to the compound name in the instrument setting window. The compound name MUST be written in half-width alphanumeric characters.
(b) About the amplitude ratio format
✓ Example: only one transition for one metabolite
Thymine 125 42.05 5.58 100
✓ Example: multiple transitions for one metabolite
G6P 258.9 97.05 9.21 100
G6P 258.9 79.05 9.21 30.1
G6P 258.9 199.15 9.21 5.5
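The five-column layout above can be checked before import with a short script. The following is a minimal sketch in Python, not part of MRMPROBS: the function name `read_reference_library` and the dictionary keys are hypothetical, and the example rows are the Thymine and G6P entries shown above, written with tab separators.

```python
import csv
import io
from collections import defaultdict


def read_reference_library(text, has_header=False):
    """Group transitions by compound name from a five-column tab-delimited library."""
    library = defaultdict(list)
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        if has_header and row_number == 1:
            continue  # header names are flexible, so the header is simply skipped
        if not row:
            continue  # ignore blank lines
        # Empty values are rejected, mirroring the import rule of the library.
        if len(row) != 5 or any(cell.strip() == "" for cell in row):
            raise ValueError(f"row {row_number}: expected 5 non-empty columns, got {row!r}")
        name = row[0].strip()
        precursor, product, rt, ratio = (float(cell) for cell in row[1:])
        library[name].append(
            {
                "precursor_mz": precursor,
                "product_mz": product,
                "rt_min": rt,
                "ratio_percent": ratio,
            }
        )
    return dict(library)


example = (
    "Thymine\t125\t42.05\t5.58\t100\n"
    "G6P\t258.9\t97.05\t9.21\t100\n"
    "G6P\t258.9\t79.05\t9.21\t30.1\n"
    "G6P\t258.9\t199.15\t9.21\t5.5\n"
)
library = read_reference_library(example)
for name, transitions in library.items():
    print(name, len(transitions))
```

Grouping by compound name makes it easy to see how many transitions each metabolite has in the library.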
Note 1: You can edit the reference library and update its information in MRMPROBS. However, empty values are not accepted when the library is imported. If you do not know suitable retention time and amplitude information for a metabolite, enter arbitrary values.
Note 2: The library does not have to include all of the metabolites that you entered in the MS instrument.
Note 3: A tab-delimited file exported from Microsoft Excel sometimes includes unexpected hidden trailing columns. MRMPROBS cannot handle such columns after the ‘Ratio’ column. You can inspect the exported file by selecting a few rows (see below). If characters are selected after the last column (Ratio), delete these columns in Excel and export the file again.
Good example (no unexpected column)
Bad example (there are unexpected columns)
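The hidden trailing columns described in Note 3 can also be detected without opening the file in an editor. This is a minimal Python sketch under the assumption that the extra columns show up as additional tab characters at the end of a row; the function names are hypothetical and are not part of MRMPROBS.

```python
def find_extra_columns(text, expected_columns=5):
    """Return (line_number, column_count) for rows with more fields than expected."""
    problems = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if line == "":
            continue
        count = len(line.split("\t"))
        if count > expected_columns:
            problems.append((line_number, count))
    return problems


def strip_extra_columns(text, expected_columns=5):
    """Drop fields beyond expected_columns, but only when those fields are empty."""
    cleaned = []
    for line in text.splitlines():
        fields = line.split("\t")
        extra = fields[expected_columns:]
        if any(cell.strip() for cell in extra):
            # Refuse to discard real data silently.
            raise ValueError(f"non-empty extra column in line: {line!r}")
        cleaned.append("\t".join(fields[:expected_columns]))
    return "\n".join(cleaned) + "\n"


good = "Thymine\t125\t42.05\t5.58\t100\n"
bad = "Thymine\t125\t42.05\t5.58\t100\t\t\n"  # two hidden trailing columns
print(find_extra_columns(good))
print(find_extra_columns(bad))
print(strip_extra_columns(bad) == good)
```

For project type 2 (six columns) or project type 3 (eleven columns), pass the corresponding value as `expected_columns`.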
Six items are required in tab-delimited text format. The header names are flexible, but the column order must be followed. (Here, the reference is shown in Microsoft Excel so that the library is easy to read.)
Column 1. Compound name (a)
Column 2. Function ID (b)
Column 3. Precursor m/z (accurate m/z information is rounded to nominal m/z)
Column 4. Product m/z
Column 5. Retention time [min]
Column 6. Amplitude ratios [%] (c)
(a) When you use project type 2 of MRMPROBS, the name does not have to be identical to the compound name in the instrument setting. The compound name MUST be written in half-width alphanumeric characters.
(b) The function ID is the most important ID for this option. The mzML data contain a markup indicating a ‘Function ID’, which is an unambiguous key to a specific MRM chromatogram defined by its retention time range, precursor ion, and product ion. To see the relationship between the function ID and the MRM information easily, use the SeeMS program, which can be downloaded from the ProteoWizard webpage: http://proteowizard.sourceforge.net/.
To find the matching function ID in your data, use the Microsoft Excel sorting function and your experiment condition file. In most cases, ProteoWizard sorts the functions in the order of 1. precursor ion, 2. product ion, 3. retention time starting point.
(c) About the amplitude ratio format
See the section of Reference library for Project type 1: MRMPROBS key index = metabolite name (abf).
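The sorting step described in footnote (b) can be reproduced outside Excel. The sketch below, in Python, orders transitions by precursor m/z, product m/z, and retention time window start, and attaches a rank. The rank is only a hint for locating the function ID: the real ID should still be confirmed in SeeMS, since the text above states that this order holds in most cases, not always. The `rt_start_min` values are hypothetical; the m/z values are taken from the Thymine and G6P examples.

```python
def rank_transitions(transitions):
    """Sort by (precursor m/z, product m/z, RT window start) and add a 1-based rank."""
    ordered = sorted(
        transitions,
        key=lambda t: (t["precursor_mz"], t["product_mz"], t["rt_start_min"]),
    )
    # dict(t, rank=i) copies each entry so the input list is left unchanged.
    return [dict(t, rank=i) for i, t in enumerate(ordered, start=1)]


# rt_start_min values below are hypothetical placeholders.
transitions = [
    {"name": "G6P", "precursor_mz": 258.9, "product_mz": 97.05, "rt_start_min": 8.0},
    {"name": "G6P", "precursor_mz": 258.9, "product_mz": 79.05, "rt_start_min": 8.0},
    {"name": "G6P", "precursor_mz": 258.9, "product_mz": 199.15, "rt_start_min": 8.0},
    {"name": "Thymine", "precursor_mz": 125.0, "product_mz": 42.05, "rt_start_min": 4.5},
]
ranked = rank_transitions(transitions)
for t in ranked:
    print(t["rank"], t["name"], t["precursor_mz"], t["product_mz"])
```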
Users can use MRMPROBS for scan-type data such as GC/MS, LC/MS, and LC-data independent MS/MS (DIA-MS). The figure below shows the reference library for DIA-MS data. Here, our objective is to utilize DIA-MS data as MRM (what we call DIA-MRM; for example, SWATH-MRM for SCIEX instruments). This library can be easily exported by the MS-DIAL software: http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/.
Column 1. Compound name
Column 2. Precursor m/z
Column 3. Product m/z
Column 4. Retention time [min]
Column 5. Amplitude ratios [%]
Column 6. RT begin: start time to draw the chromatogram
Column 7. RT end: end time to draw the chromatogram
Column 8. MS1 tolerance: mass accuracy for survey scan MS data
Column 9. MS2 tolerance: mass accuracy for MS/MS spectra
Column 10. MS level: put 1 for survey scan MS data (MS1), and put 2 for MS/MS.
Column 11. Class: used by the MRMPROBS viewer to filter the chromatograms. Set ‘NA’ or any other value if you do not need it.
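A row of this eleven-column library can be parsed and sanity-checked as follows. This is a minimal Python sketch with hypothetical field names. The check that the retention time lies between RT begin and RT end is an assumption made for illustration, not a documented MRMPROBS rule, and the RT window and tolerance values in the example row are made up.

```python
FIELDS = [
    "name", "precursor_mz", "product_mz", "rt_min", "ratio_percent",
    "rt_begin", "rt_end", "ms1_tolerance", "ms2_tolerance", "ms_level",
    "compound_class",
]


def parse_dia_row(line):
    """Parse one tab-delimited row of the 11-column SCAN/DIA-MS library."""
    cells = line.rstrip("\r\n").split("\t")
    if len(cells) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(cells)}")
    row = dict(zip(FIELDS, cells))
    for key in FIELDS[1:9]:  # the eight numeric columns (2 to 9)
        row[key] = float(row[key])
    row["ms_level"] = int(row["ms_level"])
    if row["ms_level"] not in (1, 2):
        raise ValueError("MS level must be 1 (survey scan) or 2 (MS/MS)")
    # Assumed sanity check: the expected RT should fall inside the drawing window.
    if not row["rt_begin"] <= row["rt_min"] <= row["rt_end"]:
        raise ValueError("retention time lies outside RT begin / RT end")
    return row


# RT window (8.71 to 9.71) and tolerances (0.01, 0.05) are hypothetical values.
row = parse_dia_row("G6P\t258.9\t97.05\t9.21\t100\t8.71\t9.71\t0.01\t0.05\t2\tNA")
print(row["name"], row["ms_level"], row["compound_class"])
```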
Below is the description of the ‘bridge’ from MS-DIAL to MRMPROBS
The dictionary file should contain the MS1 scan range and the precursor windows together with their experiment IDs.
In the case of SWATH data-independent analysis, the experiment file can be created in PeakView (Show->sample information). Do not change the column order. The word “SCAN” must be kept.
MRMPROBS has been improved to handle single MS data such as GC/MS and LC/MS; the figure below shows the reference library for GC/MS data. The trick to importing single MS data sets is 1) to assign the same values to product m/z and MS2 tolerance as to precursor m/z and MS1 tolerance, respectively, and 2) to assign ‘1’ as the MS level for all queries.
This library can be easily exported by MS-DIAL software: http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/.
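The two-part trick for single MS data can be written as a small helper that builds one library row. This is a Python sketch with a hypothetical function name; the compound name and numeric values in the example are placeholders, not entries from a real library.

```python
def single_ms_row(name, mz, rt_min, ratio_percent, rt_begin, rt_end,
                  tolerance, compound_class="NA"):
    """Build an 11-column library row for single MS (GC/MS or LC/MS) data."""
    fields = [
        name,
        mz,          # column 2: precursor m/z
        mz,          # column 3: product m/z, set equal to the precursor m/z
        rt_min,
        ratio_percent,
        rt_begin,
        rt_end,
        tolerance,   # column 8: MS1 tolerance
        tolerance,   # column 9: MS2 tolerance, set equal to the MS1 tolerance
        1,           # column 10: MS level is 1 for all queries
        compound_class,
    ]
    return "\t".join(str(field) for field in fields)


# Placeholder values for illustration only.
line = single_ms_row("ExampleCompound", 116.0, 6.5, 100, 6.0, 7.0, 0.5)
print(line)
```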
* The tutorial uses 40 demonstration files and the reference library which are downloadable from the above link. The common measurement conditions of the demonstration files were as follows.
Liquid chromatography: total 25 min run per sample with CERI L-column2 ODS (150 mm×2.1 mm, 3 μm).
Mass spectrometer: MRM method in negative ion mode.
Target metabolite number: 60
Total transitions: 166
Details of the experimental conditions can be downloaded from the MRM Database section (Ion-pair LC-QqQ/MS).
File → New project.
Choose a project type (select the top one for this demonstration).
Select ‘ExampleLibrary.txt’ and set the above parameters for this demonstration.
Smoothing method: linear weighted moving average.
Smoothing level: 1-2
Minimum peak width: 3-5
Minimum peak height: 50-100
Retention time tolerance: 0.1-0.2 min is recommended as long as reversed-phase or hydrophilic interaction LC is used.
Amplitude tolerance: 15
Minimum posterior: Decide the minimum probability for peak identification. MRMPROBS calculates a probability for each peak, i.e. the “probability of true target metabolite given the calculated scores”. A detected peak with a probability below this criterion is regarded as a false peak. The recommended value is 50-70.
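To give a feel for the smoothing parameters, the following Python sketch shows one common form of a linear weighted moving average. It assumes that the smoothing level is the number of points used on each side of the center, with weights that fall off linearly from the center (for level 1 the weights are 1, 2, 1). Both the meaning of the smoothing level and the edge handling are assumptions for illustration; the actual MRMPROBS implementation may differ.

```python
def linear_weighted_moving_average(intensities, smoothing_level=1):
    """Smooth a chromatogram with weights that decrease linearly from the center."""
    n = len(intensities)
    smoothed = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0
        for offset in range(-smoothing_level, smoothing_level + 1):
            j = i + offset
            if 0 <= j < n:  # at the edges, use only the points that exist
                weight = smoothing_level + 1 - abs(offset)
                weighted_sum += weight * intensities[j]
                weight_total += weight
        smoothed.append(weighted_sum / weight_total)
    return smoothed


spike = [0, 0, 4, 0, 0]
print(linear_weighted_moving_average(spike, smoothing_level=1))
```

A larger smoothing level widens the window and flattens narrow peaks more strongly, which is one reason small values such as 1-2 are a cautious choice.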
Note: The first data processing including file import, peak detection, and peak identification requires 5-20 seconds (depending on machine specifications) per file.
Note: The details and the operation method for chromatogram viewer are described later.
Raw data matrix
If you double-click a metabolite name or a file name, the chromatograms are generated in the chromatogram viewer.
In this option, data can be re-processed with newly optimized parameters. Re-processing can be performed per metabolite or per file, and the parameters are set per metabolite and per file. The target MRM can also be changed. Re-processing takes very little time because the files have already been imported.
The current program offers two missing value approaches and can normalize quantification values by the internal standard and by loess/cubic spline using the analytical order information. To use the internal standard, you must first configure it in the “Option menu”. The current program can also perform principal component analysis.
After clicking the “Done” button, the “Statistical analysis setting” button is activated.
You can perform principal component analysis. Enter the number of principal components to calculate and choose the scaling and transformation methods.
Zooming in and out can be done with the mouse wheel. Select the principal component to display in the X-axis or Y-axis combo box.
The tile layout can be set to suit your screen resolution. Select the layout you prefer.
In this menu, the chromatograms in the chromatogram viewer can be sorted by file ID, analytical order, class ID, and file type.
Here it is possible to set the properties of metabolites and files. In particular, this option menu is used to create a data matrix for statistical analysis.
In the file properties, you can reset the file type, class ID, and analytical order, but not the file name. If you clear the “included” check box of a file, the file is no longer included in the processed data matrix.
In the metabolite properties, you can set the internal standard. It can be set independently for each metabolite. However, make sure that the name entered in the “internal standard” column is completely identical to the metabolite name of the internal standard. We therefore recommend using copy and paste for the internal standard setting. In this window, copy and paste is available only from the keyboard, but you can paste into multiple rows at once. For example, copy a metabolite name by pressing Ctrl + C, select by dragging the rows of the internal standard column that you want to fill, and paste the clipboard contents by pressing Ctrl + V.
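The reason the names must match exactly becomes clear when the normalization is written out. The Python sketch below assumes that internal standard normalization divides each metabolite value by the value of its assigned internal standard in the same file. This is a common convention, not a confirmed description of the MRMPROBS calculation, and all names and values are hypothetical.

```python
def normalize_by_internal_standard(matrix, internal_standard):
    """Divide each metabolite value by its internal standard value, per file.

    matrix: {metabolite name: {file name: value}}
    internal_standard: {metabolite name: name of its internal standard}
    """
    normalized = {}
    for metabolite, values in matrix.items():
        is_name = internal_standard.get(metabolite)
        if is_name is None:
            normalized[metabolite] = dict(values)  # no internal standard assigned
            continue
        if is_name not in matrix:
            # A name that does not match exactly cannot be resolved.
            raise KeyError(f"internal standard {is_name!r} not found for {metabolite!r}")
        normalized[metabolite] = {
            file_name: value / matrix[is_name][file_name]
            for file_name, value in values.items()
        }
    return normalized


# Hypothetical peak values for two files.
matrix = {
    "G6P": {"file1": 200.0, "file2": 50.0},
    "IS1": {"file1": 100.0, "file2": 25.0},
}
result = normalize_by_internal_standard(matrix, {"G6P": "IS1"})
print(result["G6P"])
```

In this sketch, a misspelled internal standard name such as "is1" raises an error instead of being silently ignored, which is the failure that copy and paste helps avoid.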
A tab-delimited text file can be exported for the raw data matrix, the processed data matrix, the updated library, the detailed information on detected peaks, and the PCA results. The PCA results can also be exported in several image formats.
Although you can modify the contents of an .lcd file after LC-QqQ/MS (MRM) analysis, it is very useful to construct a suitable method file (.lcm format) beforehand so that the files convert successfully for MRMPROBS.
1. Event name and channel (MRM transitions) rule.
2. Update compound table
After constructing the MRM transitions in the method, update the m/z values of the compound table from the MRM events. If you analyze the samples with the updated method file, no further steps are needed for stable file conversion.
You can check the updated table in Method->Data Processing Parameters->Compound tab.
3. If your data (.lcd) were not acquired with a suitable method as described above, you can correct the .lcd files by using a method file modified as described above. After constructing the modified method file, open “Postrun Analysis” in LabSolutions.
After selecting the analysis files (.lcd), push the “Apply to Method” button.
Select the modified method file to update your .lcd file, including the compound table m/z. Once this is done, the .lcd file can be converted successfully by the Reifycs Inc. software.
4. File conversion
Conditions: You can convert .lcd files to .abf files on a computer on which the LabSolutions software is installed. “TTFLDataExportVer5.dll” of LabSolutions ver. 5.53 SP4 or later is required for the file conversion. Check the file properties of “TTFLDataExportVer5.dll” (Program Files (or Program Files (x86))>LabSolutions). If the file size is less than 577,536 bytes, contact Shimadzu Inc. to obtain a replacement file.
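The file size condition can also be checked with a few lines of Python instead of the file property dialog. This is a minimal sketch: the function name is hypothetical, and the path to the DLL must be supplied by the user because the installation folder depends on the PC.

```python
import os

# Threshold stated in the conditions above.
MINIMUM_DLL_SIZE_BYTES = 577_536


def dll_is_large_enough(dll_path, minimum_size=MINIMUM_DLL_SIZE_BYTES):
    """Return True when the file at dll_path is at least minimum_size bytes."""
    return os.path.getsize(dll_path) >= minimum_size


# Usage (replace the path with the location of the DLL on your own PC):
# print(dll_is_large_enough(r"<LabSolutions folder>\TTFLDataExportVer5.dll"))
```

If the function returns False, the file is smaller than the stated threshold and the advice above applies.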
After “AnalysisBaseFileConverter.exe” is opened, drag and drop the .lcd files to this converter.
Push the “Convert” button. The ABF format files will be generated in the same folder as the .lcd files.
Required software and file
MRMPROBS can import mzML format files. In the third option of MRMPROBS, the “function id” is used to extract the chromatogram data. Users should add the “function id” information to the reference library in addition to the normal library format.
Convert the vendor’s MS file to mzML via ProteoWizard
Note: ProteoWizard does not support the Shimadzu MS format. To use Shimadzu data, please use the ABF converter.