
Natural product fragmentation libraries

Posted: Sun Feb 01, 2015 7:27 pm
by dkotes
We just got a new Q-TOF and will be running a lot of crude extracts. There are many peaks and lots of data, and I was wondering whether anyone has experience with dereplication of fungal and bacterial fermentation broths, or could recommend any books on the subject.

Seeing a whole TIC full of compounds you can't identify is overwhelming.

Thanks

Re: Natural product fragmentation libraries

Posted: Mon Feb 02, 2015 9:13 am
by Peter Apps
What are you interested in finding out from the MS results?

"Dereplication" is a new one to me - does it mean that you want to know which compounds occur in more than one sample?

Peter

Re: Natural product fragmentation libraries

Posted: Mon Feb 02, 2015 1:48 pm
by lmh
Are you teetering on the brink of metabolomics? Are you looking for differences between treatments, where each treatment is available as a sensible number of replicates? If so, first talk to your Waters rep (you've just bought a Waters instrument, I believe) about their metabolomics software - I'm sure they have some sort of peak-finding package. If you don't like what they're offering, or you can't afford it, look at open-source software such as XCMS. This will reduce all your runs to a huge table of "peaks" (rows) versus "samples" (columns). It's then relatively easy to investigate the table in all sorts of useful ways.

For example, you can do repeated t-tests or ANOVA on rows of the table, and sort the table by the significance of the result. This isn't statistically super-resilient, because if you carry out 500 ANOVA tests and look at P-values, on average 5 (500 x 0.01) will have a "significance" better than 0.01 just by chance, even if the treatment has no effect at all. It will, however, immediately highlight the peaks that are most likely to have been affected by your treatment. You can also use techniques like PCA.
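
As a rough illustration of the row-by-row testing described above, here is a minimal Python sketch, assuming two treatment groups with the same number of replicates in every row. The peak labels and areas are invented for illustration, and `t_statistic` is a hypothetical helper, not part of XCMS or any vendor package. It uses a pooled-variance Student's t statistic; because every row then has the same degrees of freedom, sorting by |t| gives the same order as sorting by P-value. The last lines simply reproduce the 500 x 0.01 arithmetic.

```python
from math import sqrt
from statistics import mean, variance


def t_statistic(a, b):
    """Student's two-sample t statistic using a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical peak table: peak label -> (control areas, treated areas).
# All values are made up for this example.
peak_table = {
    "m/z 301.1 @ 4.2 min": ([100.0, 110.0, 90.0], [300.0, 320.0, 280.0]),
    "m/z 455.3 @ 7.8 min": ([50.0, 60.0, 40.0], [52.0, 58.0, 45.0]),
    "m/z 212.0 @ 2.1 min": ([200.0, 210.0, 190.0], [150.0, 160.0, 140.0]),
}

# Largest |t| first, i.e. the rows most likely to differ between treatments.
ranked = sorted(
    peak_table,
    key=lambda label: abs(t_statistic(*peak_table[label])),
    reverse=True,
)

for label in ranked:
    print(f"{label}: t = {t_statistic(*peak_table[label]):.2f}")

# Expected number of rows passing the threshold purely by chance
# when no row is truly affected by the treatment.
n_tests, alpha = 500, 0.01
expected_false_positives = n_tests * alpha
print(f"Expected chance hits: {expected_false_positives:.0f}")
```

In a real workflow you would probably get actual P-values from a statistics package rather than ranking on |t|, but the idea of sorting rows so the most interesting peaks float to the top is the same.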

You can also filter your data. If you include repeated injections of one sample that definitely contains all peaks, you can check the relative standard deviation of each peak in this sample, and reject all rows in the table where the peak in question has a poor RSD.
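
The RSD filter could look something like the following sketch. The peak names, areas, and the 20% cut-off are all assumptions chosen for illustration; pick a threshold that suits your own method. `rsd_percent` is a hypothetical helper.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) as a percentage."""
    m = mean(values)
    if m == 0:
        return float("inf")  # treat an all-zero peak as unusable
    return 100.0 * stdev(values) / m


# Hypothetical areas from repeated injections of the same reference sample.
repeat_injections = {
    "peak A": [1000.0, 1020.0, 980.0, 1010.0],
    "peak B": [200.0, 350.0, 120.0, 280.0],
}

MAX_RSD = 20.0  # assumed cut-off in percent, not a recommended value

# Keep only the rows whose repeat injections are reproducible enough.
kept = [
    peak for peak, areas in repeat_injections.items()
    if rsd_percent(areas) <= MAX_RSD
]

for peak, areas in repeat_injections.items():
    print(f"{peak}: RSD = {rsd_percent(areas):.1f}%")
print("Kept:", kept)
```

Rows that fail the cut-off would then be dropped from the main peak table before any statistics are run on it.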