Help with Data Processing and Data analysis

Basic questions from students; resources for projects and reports.

5 posts Page 1 of 1
I'm looking for help using software such as the Multivariate Mass Spectra Reconstruction (MMSR) approach developed by Tikunov et al. and MetAlign from Wageningen. I pretty much need a tutorial for dummies (I am finding the manuals included with the downloads very hard to follow; OK, I really don't know what I'm doing). The gap is that the manuals feel like they are written in "programmer" speak, and I am still stuck at ChemStation (I have MS.D data files). For example, what number do I enter for "Mass Bin Parameter for Conversion to Nominal: ??"? I can create my own bins, but first, what's a bin? Has anyone had success using these and wants to throw me a bone? Any replies are appreciated, thank you in advance!
I haven't been looking at these programs, because I'm an R-and-XCMS person, but mass binning, to me, is about taking nearby mass values and saying "actually, these are the same mass". The usual context in which it happens in multivariate-world is when finding peaks in the 3d data set of intensity versus retention time and mass. Most systems slice the run into loads of extracted ion chromatograms, and all systems have to deal with mass varying slightly from one spectrum to the next.

The answer is binning: you decide that you will sort all the data into mass-bins 0.2 units wide (or whatever suits your instrument and application), so 563.11 in one spectrum lands up in the same bin as 563.13 in the next. Then you can see a chromatogram by looking at the binned values for "approximately 563.1" over the entire chromatography run. At its crudest, you might take all the spectra and sort them into bins of 99.5-100.5; 100.5-101.5; 101.5-102.5.... etc., and then look for an extracted ion chromatogram in the bin for "102 +/-0.5" from 0-20 minutes. Does that make sense?
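
To make the crude unit-wide case concrete, here is a minimal Python sketch. It is not how MetAlign, MMSR or XCMS do it internally; the function names and the scan layout (a list of retention times, each with its list of mass/intensity pairs) are made up purely for illustration.

```python
import math

def bin_index(mz, width=1.0):
    """Index of the bin containing mz; bins are centred on multiples of width.

    With width=1.0, everything from 101.5 up to (but not including) 102.5
    gets index 102.
    """
    return math.floor(mz / width + 0.5)

def extracted_ion_chromatogram(scans, target_mz, width=1.0):
    """Sum the intensity that lands in the target bin, scan by scan.

    scans: list of (retention_time, [(mz, intensity), ...]) -- a hypothetical
    layout chosen for this example, not a real file format.
    """
    target = bin_index(target_mz, width)
    eic = []
    for rt, peaks in scans:
        total = sum(inten for mz, inten in peaks if bin_index(mz, width) == target)
        eic.append((rt, total))
    return eic

# Two scans: slightly different measured masses still end up in the "102" bin.
scans = [
    (0.0, [(101.7, 10.0), (102.3, 5.0), (103.1, 7.0)]),
    (0.5, [(102.1, 20.0)]),
]
eic_102 = extracted_ion_chromatogram(scans, 102.0)
```

The `width` argument plays the role of the "mass bin" setting: 1.0 gives nominal (unit) masses, and something like 0.2 would suit the narrower bins mentioned above.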

The problem is that there is always a boundary between two bins. If your bin-boundary happens to be at 563.12, then two very similar masses (as above) in successive spectra still get sorted wrongly into different bins. For this reason, some software (e.g. XCMS) uses overlapping bins.
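
A small Python sketch of the boundary problem and of one simple way overlapping bins can get around it. This is an illustration under my own assumptions (a second set of bins shifted by half a bin width), not XCMS's actual code, and the function names are invented.

```python
import math

def bin_index(mz, width, offset=0.0):
    """Index of the bin containing mz, for bins whose edges sit at offset + k*width."""
    return math.floor((mz - offset) / width)

def overlapping_bins(mz, width, offset=0.0):
    """Return the two bins containing mz: one on the primary grid and one on a
    grid shifted by half a bin width. Two masses closer together than width/2
    always share at least one of these bins."""
    return {
        ("primary", bin_index(mz, width, offset)),
        ("shifted", bin_index(mz, width, offset + width / 2)),
    }

# Bins 0.2 wide with an edge at 563.12 (offset 0.12 puts an edge exactly there).
width, offset = 0.2, 0.12
split = bin_index(563.11, width, offset) != bin_index(563.13, width, offset)

# With the half-shifted grid added, the two masses share a bin again.
shared = overlapping_bins(563.11, width, offset) & overlapping_bins(563.13, width, offset)
```

Here `split` is true because the two masses sit on opposite sides of the 563.12 edge, while `shared` is non-empty because the shifted grid has no edge between them.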

You have my sympathy with dealing with this sort of software. The software is often great, but the help-files tend to be written by people who don't need help, for others who don't need help, and are often as useful as a chocolate teapot. It's a bit like learning maths from Wikipedia, where the maths pages only make sense if you already know the material thoroughly (which is fairly pointless).
Thanks for the tips. I basically know what the software does, I have data I want to fit, and I know what questions I'm asking. I just need a walkthrough of how to connect the dots using this software. I read papers in which the authors use this amazing metabolomics data mining software, but nowhere does it say HOW in user-friendly terms (dum-dum level for me!). I chose these because they are open source, but I am considering a purchase if it comes with support. Any suggestions? Is XCMS only for LCMS? I'm using GCMS.
I use AnalyzerPro from Spectral Works for GC-MS data. It does a good job for what I need, which is finding little peaks hidden in a forest of big ones, and batch processing of large sets of complicated chromatograms, but that may not be what you have in mind. You can get a 30-day trial for free.

Peter
Peter Apps
...and to answer your other question, No, XCMS works on GC-MS data too, but it is just as fraught with parameters that need to be chosen, and I find it hard to imagine that the help/information on any other piece of software could be worse.

In particular, you'd need to make XCMS aware of the narrower peak-widths that are typical in half-way decent GC-MS. If you are using a low-res quadrupole-based GC-MS, also ignore anything the XCMS-experts say about their latest peak-finding algorithms (because they are all playing with high-res data now), and make sure you tell the original peak-finding algorithm that it should work with fairly wide mass bins (faster, and gives better peak-lists because the whole peak is guaranteed to be in one bin).

XCMS will also make an individual entry in its peak-list for every different fragment from the same chemical. This gives you large peak-lists, which is a bit of a pain. On the good side, though, you'll find that all the peaks from the same chemical will not only have nearly identical retention times, but will also have almost identical loadings in PCA plots; you can almost "deconvolute" a spectrum from the peak-list data!
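
As a rough illustration of that "almost deconvolute" idea, here is a Python sketch that groups peak-list entries by retention time only. It is a crude stand-in under stated assumptions: the peak-list layout (mass, retention time, intensity) and the tolerance value are hypothetical, it does not use XCMS or PCA loadings, and it will lump genuinely co-eluting compounds together.

```python
def group_by_retention_time(peaks, rt_tol=0.02):
    """Group peak-list entries whose retention times chain together within rt_tol.

    peaks: list of (mz, retention_time, intensity) tuples (hypothetical layout).
    Returns a list of groups; each group is a rough "spectrum" for one compound.
    """
    groups = []
    for peak in sorted(peaks, key=lambda p: p[1]):
        # Start a new group unless this peak elutes within rt_tol of the previous one.
        if groups and peak[1] - groups[-1][-1][1] <= rt_tol:
            groups[-1].append(peak)
        else:
            groups.append([peak])
    return groups

# Made-up peak list: three fragments eluting together, one on its own.
peaks = [
    (73, 5.01, 100.0),
    (147, 5.02, 40.0),
    (91, 7.50, 80.0),
    (207, 5.00, 10.0),
]
groups = group_by_retention_time(peaks)
spectra = [sorted(mz for mz, rt, inten in group) for group in groups]
```

Each entry in `spectra` is then the list of fragment masses that eluted together, which is the starting point for comparing against a library spectrum.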