Savitzky-Golay smoothing: pros and cons?

I run a gradient method with a 45-minute run time, and a few sections of the baseline could benefit from smoothing. Even when I use the maximum number of points for the smoothing, the baseline looks no different when I apply it to the run during integration. Is this because 45 minutes is too long a span over which to fit a polynomial curve to the points, so that no discernible difference to the baseline is observed?

Is it possible to restrict Savitzky-Golay smoothing to a time range, or does it always apply to the entire run?
My layperson's understanding of the Savitzky-Golay filter is that it is essentially a polynomial-weighted moving average. If you're programming it yourself, you can play with both the number of points and the weighting coefficients (if you haven't already done so, check out Norman Dyson's "Chromatographic Integration Methods"; here's a link to it on Amazon: http://tinyurl.com/ho8425n ).

The implementation in commercial data systems is somewhat obfuscated by terminology; things like "sampling rate", "bunching", and "time constant" tweak the parameters, but the vendors tend to be very close-mouthed about exactly how they are applied.
-- Tom Jupille
LC Resources / Separation Science Associates
tjupille@lcresources.com
+ 1 (925) 297-5374
Most systems smooth the entire chromatogram. Savitzky-Golay smoothing works as Tom described, which means it's very good at bending itself round curves. It basically fits a polynomial curve through a set of n successive points and makes a new point in the middle of the curve; then it moves one point further along and repeats. It does this very economically by a simple weighting method - it really is a thing of beauty. As a side-effect, it can even calculate the slope, rate of change of slope (etc.) as it goes along. I am rather in love with it, and think its creators must have been mega-brains to come up with it.
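
For anyone who wants to see the "simple weighting method" in numbers, here is a minimal Python sketch using only NumPy. The names `sg_weights` and `sg_apply` are made up for this post and are not taken from any data system; the sketch assumes evenly spaced points and an odd number of points per window. It does the least-squares polynomial fit once, reads off the weights for the centre point, and slides those weights along the data. The same fit also yields the weights for the slope.

```python
from math import factorial
import numpy as np

def sg_weights(n_points, poly_order, deriv=0):
    """Weights for the centre point of an odd-length window (hypothetical helper)."""
    if n_points % 2 == 0 or not (deriv <= poly_order < n_points):
        raise ValueError("need an odd window and deriv <= poly_order < n_points")
    half = n_points // 2
    x = np.arange(-half, half + 1)                      # positions relative to the centre
    A = np.vander(x, poly_order + 1, increasing=True)  # columns: 1, x, x^2, ...
    # Row k of the pseudo-inverse turns the window's samples into the fitted
    # polynomial coefficient c_k; at x = 0 the k-th derivative equals k! * c_k.
    return factorial(deriv) * np.linalg.pinv(A)[deriv]

def sg_apply(y, n_points, poly_order, deriv=0):
    """Slide the weights along y; half a window is dropped at each end."""
    w = sg_weights(n_points, poly_order, deriv)
    return np.correlate(np.asarray(y, dtype=float), w, mode="valid")

# The classic 5-point quadratic smoothing weights are (-3, 12, 17, 12, -3) / 35.
print(np.round(sg_weights(5, 2) * 35, 6))

# A true quadratic passes through unchanged, and deriv=1 returns its slope.
t = np.arange(11.0)
y = 2.0 + 3.0 * t + 0.5 * t**2
print(sg_apply(y, 5, 2))            # matches y[2:-2]
print(sg_apply(y, 5, 2, deriv=1))   # matches 3 + t[2:-2], in units per point
```

If SciPy is available, `scipy.signal.savgol_filter` and `scipy.signal.savgol_coeffs` do the same job and also handle the ends of the trace, so the hand-rolled version above is only there to show where the weights come from.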

Its ability to bend round curves means that it doesn't mess up peak-widths unless driven to extremes, but it also means that it really only removes high-frequency noise. If your baseline problem is a tendency to wander up and down over a few points, then the smoothing will probably curve itself round the baseline and go up and down too!

For interest, there are two parameters that SG smoothing actually needs, and which affect its smoothing: the number of data-points to use per operation, and the power of polynomial curve that it aims to fit. Most software allows you to select the number of data-points, but has a fixed polynomial power that the manufacturer may or may not disclose. I'd guess they're just quadratic, but if anyone knows better, I'd love to know!
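
To put rough numbers on how those two parameters trade off, here is a small Python sketch (NumPy only). Everything in it is an assumption for illustration: the helper names are invented, the noise is taken to be white, and the test peak is a Gaussian with a standard deviation of 5 data points. It reports how much random noise survives the smoothing and how much of the peak's apex height is retained.

```python
import numpy as np

def sg_weights(n_points, poly_order):
    """Smoothing weights for the centre of an odd-length window (hypothetical helper)."""
    half = n_points // 2
    x = np.arange(-half, half + 1)
    A = np.vander(x, poly_order + 1, increasing=True)
    return np.linalg.pinv(A)[0]

def summarise(n_points, poly_order, peak_sigma=5.0):
    """Return (noise remaining, apex height retained) for one setting."""
    w = sg_weights(n_points, poly_order)
    half = n_points // 2
    k = np.arange(-half, half + 1)
    peak = np.exp(-k**2 / (2 * peak_sigma**2))   # unit-height Gaussian centred in the window
    noise_left = float(np.sqrt(np.sum(w**2)))    # std of white noise after / before
    height_left = float(np.sum(w * peak))        # smoothed value at the apex
    return noise_left, height_left

for order in (2, 4):
    for n in (5, 11, 21, 31):
        noise_left, height_left = summarise(n, order)
        print(f"order {order}, {n:2d} points: noise x{noise_left:.2f}, apex x{height_left:.3f}")
```

In this toy case, more points means less noise but a lower apex, and a higher polynomial order at the same number of points keeps more of the apex while letting more noise through. A vendor's fixed order therefore sets where on that trade-off you start.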
Are you looking for a cosmetic improvement in the appearance of the chromatogram, or is the lack of a smooth baseline affecting the integration of peaks?

Peter
Peter Apps
Hi Peter, it's a 45-minute run time and there are sections of the baseline which are quite noisy and could do with some smoothing to aid integration. Can you advise whether Savitzky-Golay is appropriate?
I guess the question becomes: are you looking for analyte peaks that are near the limit of detection and trying to improve signal to noise, or are there severe perturbations in the baseline, caused by instrument problems, that you are trying to smooth out?

With the first, the smoothing might help, but with the second it would be prudent to troubleshoot the instrument problem instead of trying to use the smoothing to cover it up.
The past is there to guide us into the future, not to dwell in.
With all the caveats that have been mentioned already about what kind of noise you can smooth away I would say try it and see. Smoothing can be applied to raw data after it is acquired (or at least it can on my GC-MS software), so you can try different types of smoothing, with different settings until you find what meets your requirements.

But if only some sections of the baseline are noisy I suspect that you are not looking at the high frequency noise that smoothing works with - if the baseline is wobbling rather than buzzing smoothing is not going to help, and then you need to follow James' advice on troubleshooting.

Peter
Peter Apps
I am trying to improve signal to noise, as there are sections of the baseline which contain several important peaks and there is quite a lot of noise in these sections. Because it is a gradient run, the baseline rises and then dips quite sharply when the water is reintroduced into the mobile phase mix.

How do you use Savitzky-Golay post-run? I use Empower 2, and it gives me the option of how many points to use, from 2 to 30. But this applies to the whole 45-minute run time, yes? If I picked 20, it would fit a moving polynomial curve through 20 points at a time and smooth the data across the whole run, inevitably including the sections that are already smooth and yielding an unhelpful smoothed look to the entire chromatogram.

Can I specify sections (i.e. the troublesome section!) to which Savitzky-Golay applies, or is it better to increase my filter time constant from normal to approx. 0.15 seconds (I find 0.2 skews peak symmetry)?
The Savitzky-Golay filter is essentially a "low-pass" filter. It damps out rapid changes (short-term noise) but has a negligible effect on slower changes (peaks, long-term noise, and drift). In other words, if the baseline is already smooth, then the filter should not change it. If you have to set the parameters such that you distort the "smooth" parts of the chromatogram, then you will also distort the "noisy" parts of the chromatogram.
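
The low-pass behaviour is easy to check on made-up data. The Python sketch below (NumPy only) is an illustration under stated assumptions, not a model of any data system: `sg_smooth` is a hypothetical helper using a 21-point quadratic fit, the "buzz" is simulated white noise, and the "wobble" is a sine wave that takes 400 points per cycle. Because the filter is linear, the two components can be smoothed separately and compared.

```python
import numpy as np

def sg_smooth(y, n_points=21, poly_order=2):
    """Savitzky-Golay smoothing via a least-squares fit (hypothetical helper)."""
    half = n_points // 2
    x = np.arange(-half, half + 1)
    A = np.vander(x, poly_order + 1, increasing=True)
    w = np.linalg.pinv(A)[0]              # weights for the fitted value at the centre
    return np.correlate(y, w, mode="valid")

rng = np.random.default_rng(0)            # fixed seed so the numbers repeat
t = np.arange(2000)
buzz = rng.normal(0.0, 0.2, t.size)       # point-to-point random noise
wobble = np.sin(2 * np.pi * t / 400)      # slow baseline wander, 400 points per cycle

half = 21 // 2                            # points lost at each end of the trace
buzz_ratio = sg_smooth(buzz).std() / buzz.std()
wobble_change = np.max(np.abs(sg_smooth(wobble) - wobble[half:-half]))

print(f"buzz remaining after smoothing: {buzz_ratio:.0%}")
print(f"largest change to the wobble:   {wobble_change:.1e}")
```

In this simulation the rapid noise is cut substantially while the slow wander comes through essentially untouched, which is the behaviour described above.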

From your description, it sounds like you are not seeing noise (which, by definition, is random) but rather perturbations of the baseline. If it's a gradient, I suspect that the steepness (rate of change) of those perturbations is comparable to that of your peaks, in which case if you filter out the perturbations, you will also filter out the peaks.
-- Tom Jupille
LC Resources / Separation Science Associates
tjupille@lcresources.com
+ 1 (925) 297-5374
If your baseline goes up and down with the gradient, don't worry. That's normal. It should be possible to set the integrator to cope with this sort of change. Integrators are sometimes fooled by rapid negative movements at the injection, and set a "baseline" at the bottom of a negative injection peak, in which case the best thing to do is set up integrator events turning integration off until safely after the injection (you won't want to be quantifying unretained peaks anyway). Thereafter, movements in the baseline should be much broader than real peaks, so the integrator can tell the difference.

SG smoothing can be applied safely over the whole run. There is no conceivable reason for smoothing only part of a run (as Tom said, it just won't do anything if the chromatogram is already smooth).

My attitude to smoothing is that if you have peaks which are being integrated badly because the integrator is being fooled by noise (it's dividing peaks that have a noise blip in the middle, making the peak look like two nearly-coeluting peaks; it's picking bad start and end points because it can't see baseline for the noise) then it is better to smooth than to resort to manual integration. It's very hard to be sure, with manual integration, that you are unbiased in your choices (and even harder to prove it to someone else). At least with smoothing you've treated all data the same way. Of course, if you need to do this, you are probably quantifying peaks whose errors will be big. Nevertheless, if you're doing so in a biological system where the variability between replicate samples is +/- 40%, and you're looking for 5-fold changes, it doesn't really matter if you have big quantitative errors. (But if you need to quantify +/- 5% and you're having to smooth, you've got big, big problems!).
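
As a rough illustration of the "noise blip splits a peak" problem, here is a Python sketch on simulated data (NumPy only). It is deliberately crude and rests on assumptions: counting local maxima above half height is only a stand-in for real peak detection, which no commercial integrator does this simply, and the peak shape, noise level, and helper names are all invented for the example.

```python
import numpy as np

def sg_smooth(y, n_points=21, poly_order=2):
    """Savitzky-Golay smoothing via a least-squares fit (hypothetical helper)."""
    half = n_points // 2
    x = np.arange(-half, half + 1)
    A = np.vander(x, poly_order + 1, increasing=True)
    w = np.linalg.pinv(A)[0]              # weights for the fitted value at the centre
    return np.correlate(y, w, mode="valid")

def count_apexes(y, threshold):
    """Count local maxima above threshold, a crude stand-in for peak detection."""
    mid = y[1:-1]
    is_max = (mid > y[:-2]) & (mid >= y[2:]) & (mid > threshold)
    return int(np.count_nonzero(is_max))

rng = np.random.default_rng(1)            # fixed seed so the numbers repeat
t = np.arange(400)
peak = np.exp(-((t - 200) ** 2) / (2 * 20.0**2))   # one real peak, sigma = 20 points
raw = peak + rng.normal(0.0, 0.05, t.size)         # same peak with random noise added
smoothed = sg_smooth(raw)

print("apexes above half height, raw:     ", count_apexes(raw, 0.5))
print("apexes above half height, smoothed:", count_apexes(smoothed, 0.5))
```

The raw trace shows many spurious apexes on what is really one peak, and the smoothed trace shows far fewer. Every point is treated by the same rule, which is the argument for smoothing over manual integration.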