Problems Solving Quadratic Fit

I have a very nice quadratic fit for a calibration curve in Excel when I have mg/mL on the x-axis. To solve for x, though, I wanted to switch the axes (to put mg/mL on y and area counts on x), but then the fit is awful and the resulting values aren't even logical. Any tips on how to handle quadratic calibration curves? I have never done non-linear fits before. I tried to solve by hand with the quadratic formula, but it only worked for the higher x values.

I have also tried a "power" fit, but it's not nearly as good.

Any ideas?
Try using a "polynomial fit", second order.

But indeed, Excel seems to have a "problem" when interchanging the axes.
The 2nd order polynomial fit is what I used. We also tried plugging the polynomial fit/data into the Igor graphing program and got the same results as Excel. It might be some weird math thing. I was hoping that someone had a good trick to pull off what I described above.
I'm sorry, I'm not totally clear what you're doing. Can I just check this is the situation:

You have a calibration curve of Peak Area (Y) versus Amount (X) which works well as a quadratic fit.
You therefore have an equation Area = a*Amt^2 + b*Amt + c where you know a, b, c
You are now trying to find Amt in unknown samples from Area.
This should be fairly straightforward in that you can rearrange the equation to
0 = a*Amt^2 + b*Amt + (c - Area)
and solve this using the well-known quadratic formula, with (c - Area) taking the place of the usual constant term:
Amt = (-b +/- square-root[b^2 - 4a(c - Area)])/(2a)
99% of the time this ought to give you the right answer. If it isn't doing so, there are two things that can be going wrong.
Firstly, obviously, your low-concentration points may actually be a long way from the calibration curve in relative terms, but you don't see it when you look at the curve because all the values are small. This will give big percentage errors in any measured samples at low concentrations. This is a real problem in the analysis side of things, and not a mere matter of maths.
Secondly, if you're really unlucky (but it's never happened to me) the standard equation for solving quadratics as above isn't the most numerically stable. Textbooks recommend you calculate:
q = -0.5*[b + sign(b)*square-root(b^2 - 4a(c - Area))]
after which the solutions are q/a and (c - Area)/q (of which of course only one is relevant in your case; the other will probably be negative or well outside the calibrated range).

In my hands, the most likely thing to go wrong is me mistyping the equations in Excel.
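
For anyone who would rather check the arithmetic outside Excel, here is a minimal Python sketch of the rearranged equation and the more stable q form described above. The coefficients, the calibrated range and the function names are made up for illustration; they are not from the original poster's data.

```python
import math

def amounts_from_area(a, b, c, area):
    """Return the real roots of a*amt^2 + b*amt + (c - area) = 0, smallest first."""
    c0 = c - area  # constant term after moving the measured area across
    if a == 0:
        return [-c0 / b]  # the "curve" is really a straight line
    disc = b * b - 4.0 * a * c0
    if disc < 0:
        raise ValueError("area lies beyond the turning point of the curve")
    # The q form avoids subtracting two nearly equal numbers.
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    roots = [q / a]
    if q != 0:
        roots.append(c0 / q)
    return sorted(roots)

def amount_in_range(a, b, c, area, low, high):
    """Pick the single root that falls inside the calibrated range [low, high]."""
    hits = [r for r in amounts_from_area(a, b, c, area) if low <= r <= high]
    if len(hits) != 1:
        raise ValueError("expected exactly one root inside the calibrated range")
    return hits[0]

# Hypothetical curve: Area = -50*Amt^2 + 1000*Amt + 5, calibrated from 0 to 5 mg/mL
a, b, c = -50.0, 1000.0, 5.0
area = a * 2.0**2 + b * 2.0 + c  # the area a 2 mg/mL sample would give
roots = amounts_from_area(a, b, c, area)
amount = amount_in_range(a, b, c, area, 0.0, 5.0)
print(roots, amount)
```

With these made-up numbers the two roots are 2 and 18; only the first lies inside the calibrated range, and the second sits past the turning point of the curve. Choosing the root by range rather than by sign is a design choice of this sketch.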
What we have found to be a good solution is to take the natural log of both the area counts and the concentration. We plot this, the result is linear and we can do our calculations this way.

Thanks for the help, those of you who responded!
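
A rough Python sketch of the log-log approach just described, assuming ordinary unweighted least squares on the logged values. The standards are invented (they follow an exact power law so the result is easy to check) and the function names are hypothetical. Note that a straight line in ln(area) versus ln(conc) is the same model as area = k*conc^n, and that it needs strictly positive areas and concentrations.

```python
import math

def fit_loglog(conc, area):
    """Least-squares line ln(area) = slope*ln(conc) + intercept."""
    lx = [math.log(v) for v in conc]
    ly = [math.log(v) for v in area]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def conc_from_area(area, slope, intercept):
    """Invert the fitted line to get a concentration from a measured area."""
    return math.exp((math.log(area) - intercept) / slope)

# Invented standards that follow area = 200 * conc^0.9 exactly
conc = [0.1, 0.5, 1.0, 2.0, 5.0]
area = [200.0 * c ** 0.9 for c in conc]

slope, intercept = fit_loglog(conc, area)
unknown = conc_from_area(200.0 * 3.0 ** 0.9, slope, intercept)
print(slope, math.exp(intercept), unknown)
```

Real standards will scatter around the line, so it is still worth looking at the residuals of the logged fit before trusting it.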
Which textbooks do you recommend to get a detailed understanding of this (using the quadratic equation to calculate an unknown concentration from a non-linear standard curve)?

This is a very old thread; there has been a more informative version more recently, but I can't find it.

Short answer: I personally recommend that people stick to using chromatography software packages to do their calibrations, and avoid setting up calibration curves in Excel if at all possible. It is very rare for it to be genuinely necessary. Chromatography software is designed for the purpose and does it better: it will allow you to combine appropriate curve shapes with weighting of points, and all at the tick of a few boxes.

I'm not sure what to recommend for those who want to understand the numerical side of quadratic fits. They're really just another least-squares method. I'd guess most textbooks on numerical analysis will mention the subject. Numerical Recipes (Press et al.) gives the improved equations for solving quadratics, but books like this are grossly excessive for a simple calibration curve - though they're a good read if you're so minded!
If you really want to get into it, check out the series of 52 articles from American Laboratory: https://www.americanlaboratory.com/1403 ... Chemistry/
You have to download each article separately, but collectively this is my "go to" text on statistics, if for no other reason than all the examples are from chromatography (as opposed to, say, widget manufacturing).
-- Tom Jupille
LC Resources / Separation Science Associates
tjupille@lcresources.com
+ 1 (925) 297-5374
Hollow wrote:
Excel seems to have a "problem" when interchanging the axis.

This is not a problem of Excel or any software. If one has even a simple function y = x^2, inverting the axes results in a function x = y^0.5. Of course, the latter square-root function cannot be fitted with a quadratic equation.

Generally, a function y = a0 + a1*x + a2*x^2 cannot be converted to a function x = b0 + b1*y + b2*y^2, or to a function x = k*y^n, or to a function log(x) = c0 + c1*log(y).
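
To make the point concrete, here is a small Python sketch (standard library only) that fits a quadratic both ways round to points lying exactly on y = x^2. The helper names are hypothetical and the fit is a plain unweighted least-squares solve of the normal equations, written out by hand so the block needs no extra packages.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    n = 3
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for k in range(col, n + 1):
                a[r][k] -= f * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / a[i][i]
    return x

def quad_fit(u, v):
    """Least-squares coefficients (c0, c1, c2) of v = c0 + c1*u + c2*u^2."""
    s = [sum(ui ** p for ui in u) for p in range(5)]
    t = [sum(vi * ui ** p for ui, vi in zip(u, v)) for p in range(3)]
    m = [[s[i + j] for j in range(3)] for i in range(3)]
    return solve3(m, t)

def rss(u, v, coef):
    """Residual sum of squares of the fitted quadratic."""
    return sum((vi - (coef[0] + coef[1] * ui + coef[2] * ui * ui)) ** 2
               for ui, vi in zip(u, v))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [xi ** 2 for xi in x]

forward = quad_fit(x, y)  # y as a quadratic in x: exact
swapped = quad_fit(y, x)  # x as a quadratic in y: only an approximation
rss_forward = rss(x, y, forward)
rss_swapped = rss(y, x, swapped)
print(forward, rss_forward)
print(swapped, rss_swapped)
```

The forward fit recovers y = x^2 with essentially zero residual, while the swapped fit leaves a clearly non-zero residual even on noise-free data, because a square-root curve is not a quadratic.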
... see the other thread. The quadratic fit, in any case, minimises the error in the y-direction, so it should be used on data where the expected error is in that direction; swapping the axes to make life arithmetically simpler is statistically dubious.

Yes, if you've got a curve of defined shape and it's a polynomial one way round, if you swap the axes it won't be a polynomial. It may, however, give quite a decent approximation to a polynomial over a limited span. If you have a curve of unknown shape, then it can be hard to know whether you're more justified in treating y versus x as polynomial or x versus y as polynomial. That's rather the point of fitting a polynomial: you're not necessarily saying that the underlying function is polynomial, you're saying that a polynomial fit gives a decent approximation to the data and allows you to interpolate.

Basically: make sure y is where the errors are, look at the residuals, and if they deviate unacceptably from random, then consider a different fit.
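
As a rough illustration of "look at the residuals", the Python sketch below puts a straight line through invented data that actually follow a gentle curve and prints the sign of each residual. The data, the names and the choice of an unweighted fit are all assumptions made for the example.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y, slope, intercept):
    """Observed minus predicted response at each calibration point."""
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]

# Invented standards whose response flattens slightly at high concentration
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
area = [1000.0 * c - 50.0 * c * c for c in conc]

slope, intercept = linear_fit(conc, area)
res = residuals(conc, area, slope, intercept)
signs = [1 if r > 0 else -1 for r in res]
print(res)
print(signs)
```

Here the residuals come out negative at both ends and positive in the middle. That kind of ordered run, rather than a random scatter of signs, is the pattern that suggests a straight line is the wrong shape for the data.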
Math is not my thing, but I want to understand this. If my reading of Wikipedia is right, a "residual" refers to the difference between the observed signal (or ratio of analyte signal to internal standard signal) and the signal (or ratio) for a given concentration that the calibration model predicts. So it sounds a lot like the readback check done on calibration points in methods like 8270E as part of verifying the initial calibration. In that method, your low point is allowed to be +/- 50% of the expected concentration. That seems like a huge range.

What can residuals tell us? Say I'm using the average of response factors for my calculations. All of the residuals on the low end are positive (i.e. the calculated concentration is greater than my cal standard's expected value), and the opposite is true for the high standards. I think that would mean that my analyte does not respond in a linear fashion over the calibration range, and I need to limit my calibration range... maybe lop off standards at the high end.

What would the same pattern mean if observed for a quadratic curve? I'm really not sure. And I'm not clear on what errors in the y direction vs. the x direction signify here.

Am I at all on the right track here? :)
The residuals tell you the total error in your regression line and can be used to compare different mathematical models. Here is an easy example.

Say you have this data set
X: 1, 2, 3, 4, 5
Y: 1, 4, 9, 16, 25

Clearly this equation is Y=X^2. But if we didn't know that, we could try different models by applying a linearizing transformation. I start off with a square root transformation, a log transformation, and an inverse transformation. All you do is take the square root, logarithm (base 10 here), and inverse of the y-values and plot them against the x-values.

X: 1, 2, 3, 4, 5
Square root(y): 1, 2, 3, 4, 5
Log (y): 0, 0.60206, 0.954243, 1.20412, 1.39794
1/y: 1, 0.25, 0.111, 0.0625, 0.04

If you plot Sqrt(y) vs x and log(y) vs x and so on, the model that fits closest to a linear plot can be considered the function. You can make this determination using two different numbers, the coefficient of determination (R^2) or the residual sum of squares. The residual sum of squares takes the equation of best fit and produces a "theoretical" number based upon the slope, the intercept and the X input. Then a "residual" is calculated as what was ACTUALLY observed minus what the model predicts it should be. This number is squared, and finally all of the squared numbers are summed, which gives the total error in the model vs what was ACTUALLY observed. The model with the lowest residual sum of squares is the model that best fits your data and in theory should be the most predictive.
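
The comparison above can be sketched in a few lines of Python using the same five points. The function and variable names are made up for the example, and the fit is a plain unweighted straight line through each transformed data set.

```python
import math

def line_rss(x, t):
    """Fit t = slope*x + intercept by least squares; return the residual sum of squares."""
    n = len(x)
    mx = sum(x) / n
    mt = sum(t) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxt = sum((xi - mx) * (ti - mt) for xi, ti in zip(x, t))
    slope = sxt / sxx
    intercept = mt - slope * mx
    return sum((ti - (slope * xi + intercept)) ** 2 for xi, ti in zip(x, t))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 4.0, 9.0, 16.0, 25.0]

# The three transformations of y tried in the post
transforms = {
    "sqrt(y)": math.sqrt,
    "log10(y)": math.log10,
    "1/y": lambda v: 1.0 / v,
}

results = {name: line_rss(x, [f(v) for v in y]) for name, f in transforms.items()}
best = min(results, key=results.get)
print(results)
print(best)
```

The square-root transformation wins with a residual sum of squares of zero, as expected for Y=X^2. One caution: each transformation puts y on a different scale, so the raw sums of squares are only a rough guide when comparing them; R^2 is scale-free and may be the fairer number for that purpose.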

This sounds kinda trivial until you have to do multivariate linear regression, where there may be two or more variables that affect a result, or where there are combined effects between two variables. For example, say your equation is something like X^2 + XY + Y = Z.
"What can residuals tell us? Say I'm using the average of response factors for my calculations. All of the residuals on the low end are positive (i.e. the calculated concentration is greater than my cal standard's expected value), and the opposite is true for the high standards. I think that would mean that my analyte does not respond in a linear fashion over the calibration range, and I need to limit my calibration range... maybe lop off standards at the high end. "

In short, yes, this is also telling you that the range you're observing isn't linear and a smaller dynamic range should be modeled. Mathematically speaking though, removing data points will appear to give you a better fit, but you will always have some positive and some negative values in the residual analysis (unless it's a perfect fit) because the "line of best fit" is designed to even out the error across all data points. This means that if you observe that your data isn't displaying a linear relationship across all data points and you remove some data points, this will improve the correlation coefficient and residual sum of squares, but now you shouldn't use that calibration curve to extrapolate back out to the data you removed, as this represents a different model.
Yes. Non-random residuals mean your fit is systematically wrong (e.g. you've put a straight line through a curve). The size of the residual estimates the likely systematic error at the concentration of that individual residual. If your residuals are non-random but all rather small, you may not be worried. Lopping off the calibration points that are a long way outside the range won't necessarily turn a curve into a straight line, but it might now be so close a fit that the residuals become irrelevantly small.