Agilent LCMS Data storage and transfer issues

Discussions about chromatography data systems, LIMS, controllers, computer issues and related topics.

5 posts Page 1 of 1
Long-time viewer, first-time poster here. We are a lab that deals with lots of different projects - both LC-MS and LC-MS/MS, both qualitative and quantitative data. We are having some issues managing our data transfer and storage, and are wondering if anyone has had similar issues and has come up with a better system.

We have an Agilent LC-QTOF using MassHunter Data Acquisition on a dedicated instrument PC (for sample running and quick analysis/monitoring) that is networked to our network drive for instrument data. Once acquired, raw data is copied from the PC to the network drive.
We then need to process the data. When we bought the instrument, we also got a range of Agilent-provided compound libraries and processing software, including Mass Profiler Professional, Profinder, and the MassHunter Qual. and Quant. software. However, Agilent licensing agreements limit our use of the libraries and some software (MPP, Profinder) to one PC, so we have a shared data PC (Agilent-provided) for this purpose. Multiple users can work on the data PC directly or remote into it, so that part works OK.
However, MassHunter saves in a large data format, and this system is generating large data sets. Transferring data between instrument PCs, network drives, and the data PC (and back again) is duplicating data and filling up drives fast. For example: one set of samples (~20 samples) is 41 GB unprocessed, and larger once processed with library searches etc. We are filling up a 500 GB hard drive within 2-3 months with LC-MS data.

We are trying to get a better data management system in place, but it is complicated by the Agilent licensing rules, along with our own IT data and security policies. Ideally we need one that avoids using so much storage space but still retains all the data we need, in a format that can be used by our software (MH Qual. & Quant. particularly) and can be backed up.

My questions are:
1. Has anyone else had similar issues, and did you come up with any better solutions/processes?
2. Is there a better format to store/transfer MassHunter LCMS and MSMS data that is smaller but still compatible with MH, MPP, Profinder etc?
3. Is there an obvious solution that I'm missing?
4. or: are the data sizes what they are and unavoidable, and I just need to plead with IT for more space? ;)

Thanks in advance for any advice. :)
HLNewson wrote:
Is there a better format to store/transfer MassHunter LCMS and MSMS data that is smaller but still compatible with MH, MPP, Profinder etc?
You can easily check that. If the big files within the sample directories are binary (you open one in Notepad and it's all gibberish), then it's unlikely there's another format that is more compact and at the same time compatible. If the big files are text, then there's probably an alternative; at the very least you can compress the samples with a general-purpose compression utility like zip. But then you'll have to uncompress them each time you need to use them.
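If you'd rather run that binary-vs-text check programmatically than eyeball files in Notepad, a minimal Python sketch follows. The heuristic (NUL bytes or a low proportion of printable characters means binary) is a common rule of thumb, not anything MassHunter-specific, and the example path in the comment is a made-up illustration of a sample directory layout, not a guaranteed one:

```python
from pathlib import Path

def looks_binary(path, sample_size=8192):
    """Heuristic check: treat a file as binary if its first chunk
    contains NUL bytes or is mostly non-printable characters."""
    chunk = Path(path).read_bytes()[:sample_size]
    if not chunk:
        return False  # empty file: treat as text
    if b"\x00" in chunk:
        return True
    printable = sum(1 for b in chunk if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(chunk) < 0.7

# Point it at the large files inside a sample's acquisition directory, e.g.
# looks_binary(r"D:\MassHunter\Data\sample01.d\AcqData\MSScan.bin")
```

If the big files come back as binary, they are almost certainly already a packed vendor format and a different storage format won't save much while staying MassHunter-compatible.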
HLNewson wrote:
3. Is there an obvious solution that I'm missing?
I doubt it. If you have more data than you can fit, you can only extend your hard drives. 500 GB is a modest disk, found in many personal laptops. For serious lab work you shouldn't be shy about asking for 2/4/however-many TB disks.
HLNewson wrote:
or: are the data sizes what they are and un-avoidable and I just need to plead with IT for more space?
It's funny that you have to plead, because this is IT's problem in the first place - they should be responsible for making all of this work. :)
Software Engineer at elsci.io
Hi there
Compressing your data files with a zip program will definitely help with data transfer, especially across a network. MH generates a lot of files for each sample acquired, and Windows is very inefficient at transferring lots of small files. Transferring a single zip file across a network will fly in comparison.
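As a sketch of that zip-before-transfer idea, the Python stdlib can bundle a whole acquisition directory into a single archive; the `sample01.d` name below is a hypothetical example of a MassHunter sample directory, and whether the archive round-trips cleanly through your software is something to verify on your own data:

```python
import shutil
from pathlib import Path

def pack_sample(sample_dir):
    """Bundle an acquisition directory (e.g. sample01.d) into one zip
    so a network transfer moves a single large file instead of
    thousands of small ones. Returns the path to the archive."""
    sample = Path(sample_dir)
    # Writes e.g. sample01.d.zip next to the original directory.
    return shutil.make_archive(
        str(sample), "zip",
        root_dir=sample.parent, base_dir=sample.name,
    )

# On the receiving end:
# shutil.unpack_archive("sample01.d.zip", r"D:\MassHunter\Data")
```

The same effect is available from any zip tool (7-Zip, the built-in Windows "Send to > Compressed folder"); the point is simply one big file on the wire instead of many small ones.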

If you are not comfortable using zips to transfer the data, have a look at a specific copy tool that is tailored better to do the copying operation than the Windows copy command.

Having said that, I think many of those small files aren't needed when processing data. You should check this, but I think data.ms is the important one.

Also check what is going on when you process the data. MH likes to save results generated in MH Quant and MH Qual, for example. I guess it's so things load faster the next time the file is opened, but this can add to the data directory size and may not be needed.

If data storage is a real issue, have a look at the acquisition settings you are using. You might be able to minimize file size with some changes to those settings.

Hope some of these ideas help
Kevin
It is normal for LC-MS datasets to be large. Your organisation needs to grit its teeth and come up with a data-storage strategy for big data. Be happy at least that you don't work with images. A disk that takes you several months to fill will just about last a crystallographer until tea-time today... (and microscopists are quite frightening when it comes to disk space too).

You could experiment with collecting data in centroid mode rather than profile, if your system will do it? Also, if you have a PDA, take a look at how rapidly it's collecting data. Rapid acquisition of PDA spectral data can really bulk up file sizes too.
Agilent has Data Store and ECM for centralized data retention. Both will compress data files into zip files automatically. My guess is your company doesn't have the budget?

It is your IT department's job to manage the data-retention strategy, or at least provide the infrastructure. But if you really have to DIY because your IT dept is useless, buy two enterprise-grade NAS units. Use one as the primary storage and the other to back it up. Write a Windows backup script and use Task Scheduler to run it automatically.
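A rough sketch of what such a nightly mirror script might do, written here in Python for illustration (on Windows, `robocopy /MIR` does the same one-way mirroring natively, and Task Scheduler can run either). The directory names are placeholders, and a NUL-byte of caution: a mirror deletes files on the backup side that vanish from the primary, so it protects against disk failure, not accidental deletion:

```python
import shutil
from pathlib import Path

def mirror(src_dir, dst_dir):
    """One-way mirror: make dst match src by copying new or newer
    files and deleting files that no longer exist in src."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    src_files = {p.relative_to(src) for p in src.rglob("*") if p.is_file()}
    dst_files = {p.relative_to(dst) for p in dst.rglob("*") if p.is_file()}
    for rel in src_files:
        source, target = src / rel, dst / rel
        # Copy if missing on the backup, or if the source is newer.
        if not target.exists() or source.stat().st_mtime > target.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, target)  # copy2 preserves timestamps
    for rel in dst_files - src_files:
        (dst / rel).unlink()  # prune files removed from the primary

# e.g. mirror(r"\\nas-primary\lcms-data", r"\\nas-backup\lcms-data")
```

Because `copy2` preserves modification times, repeat runs skip unchanged files, so a nightly scheduled run stays cheap once the initial copy is done.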
... just an addendum, also consider whether you need to retain in perpetuity the processed data. It might be enough to be able to recreate, if necessary, the processed results, knowing that you can recreate exactly the same result. For this, it might be enough to retain the unprocessed raw data (as is already happening) and the processing methods, together with any information about how you used them, and sequence/batch files as processed.