
y=mx or y=mx+b

Discussions about HPLC, CE, TLC, SFC, and other "liquid phase" separation techniques.

28 posts Page 2 of 2

I agree completely with Daren. It is exactly the approach we also use in validations. Note that ICH asks for a minimum of 5 calibration points.

HW Mueller: I don't understand your argument. Since all your calibration points contain some error, your regression line will NEVER go exactly through y=0. But that's not a problem: zero just has to lie within the 95% confidence interval of the intercept.
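A minimal sketch of the check described above, assuming an ordinary unweighted least-squares fit of y = mx + b. The function name, the small t-table, and the 5-point data set are all made up for illustration; they are not from anyone's actual validation.

```python
import math

# Two-sided 95% Student t critical values, keyed by degrees of freedom (1..10).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def intercept_confidence_interval(x, y):
    """Fit y = m*x + b by least squares; return (m, b, b_low, b_high)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    df = n - 2                                  # two fitted parameters
    s_y = math.sqrt(ss_resid / df)              # standard error of the y estimate
    se_b = s_y * math.sqrt(1.0 / n + x_mean ** 2 / sxx)
    half_width = T_CRIT_95[df] * se_b
    return m, b, b - half_width, b + half_width

# Hypothetical calibration: concentration (% of nominal) vs. peak area.
conc = [50.0, 75.0, 100.0, 125.0, 150.0]
area = [1012.0, 1498.0, 2005.0, 2489.0, 3010.0]

m, b, b_low, b_high = intercept_confidence_interval(conc, area)
zero_in_interval = b_low <= 0.0 <= b_high
print(f"slope={m:.3f} intercept={b:.2f} 95% CI=({b_low:.1f}, {b_high:.1f})")
print("zero inside interval:", zero_in_interval)
```

If zero falls inside the interval, the intercept is not statistically distinguishable from zero at that confidence level.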

Mtnshawn: If you wish, I could provide a small Excel sheet that does all the calculations.

At the risk of beating this to death, I think it's important to point out that the approach of evaluating whether zero falls within the 95% confidence interval of the intercept is not always the best. The problem with it is that the better your linearity is, the smaller the confidence interval becomes.

The extent to which you do, or do not, have a significant intercept is not always related to the linearity of the curve. You may have an R-squared value of 0.9999999 but a finite intercept due to some systematic issue (spectrometer background subtraction, for example). I have seen several cases where the curve appears perfect when plotted, but the statistics tell us we've failed.

A better approach, in my opinion, is simply to evaluate the magnitude of your intercept against the magnitude of the response of the standard at the 100% level. Generally it should be not more than (NMT) 2 or 3%.
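As a rough illustration of that criterion, here is a sketch that expresses the intercept as a percentage of the fitted response at the 100% level. The function name and the numbers are hypothetical, and using the fitted response (rather than a measured standard) is an assumption.

```python
def intercept_percent_of_nominal(slope, intercept, nominal_conc):
    """Intercept as a percentage of the fitted response at the 100% level."""
    response_at_nominal = slope * nominal_conc + intercept
    return 100.0 * intercept / response_at_nominal

# Hypothetical fit results: slope and intercept in area units, nominal = 100%.
pct = intercept_percent_of_nominal(slope=19.948, intercept=8.0, nominal_conc=100.0)

LIMIT_PERCENT = 2.0            # the "NMT 2%" end of the suggested range
acceptable = abs(pct) <= LIMIT_PERCENT
print(f"intercept is {pct:.2f}% of the 100% response; acceptable: {acceptable}")
```

Unlike the confidence-interval test, this criterion does not tighten as the scatter around the line gets smaller.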

Hi Adam,

Yes, your C.I. window is going to reflect the linearity of your curve. So one would have to demonstrate linearity (R2 > 0.99, or whatever criterion you use) before calculating the 95% confidence interval. In my experience, once you have established satisfactory linearity, your C.I. window should not change significantly (e.g. 0.998 vs. 0.9999). The goal is really to demonstrate the acceptability of using y=mx; it is possible to have a perfectly linear 5-point curve that will not provide accurate quantitation if forced through zero.

Daren
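To make the last point concrete, here is a small sketch with invented data that lie exactly on y = 20x + 100, i.e. a perfectly linear curve with a real intercept. It back-calculates each standard with both models; the function names and values are assumptions for illustration only.

```python
def fit_through_zero(x, y):
    """Least-squares slope for y = m*x (line forced through the origin)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def fit_with_intercept(x, y):
    """Least-squares slope and intercept for y = m*x + b."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, y_mean - m * x_mean

# Hypothetical, perfectly linear 5-point curve with a non-zero intercept.
conc = [50.0, 75.0, 100.0, 125.0, 150.0]
area = [20.0 * c + 100.0 for c in conc]

m0 = fit_through_zero(conc, area)
m, b = fit_with_intercept(conc, area)

# Percent error of the back-calculated concentration at each level.
errors_forced = [100.0 * (a / m0 - c) / c for c, a in zip(conc, area)]
errors_full = [100.0 * ((a - b) / m - c) / c for c, a in zip(conc, area)]

for c, e0, e1 in zip(conc, errors_forced, errors_full):
    print(f"level {c:5.0f}: y=mx error {e0:+.2f}%   y=mx+b error {e1:+.2f}%")
```

With these numbers the forced-zero model is biased most at the low end of the range, while the full model recovers every level.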

Had a couple of vacation days, sorry it has been awhile getting back.

To all: Thanks for beating up this subject. Between all of the replies posted to this topic, I believe that I can justify the need to use whichever model works best.

So a MASSIVE THANK YOU for your input.

Klaus - I hate to sound like a nit/freeloader, but math was never my strongest suit, so a spreadsheet would be great.

scook@rxkinetix.com

Thanks again
Shawn

Klaus,
you apparently hit on a language barrier. My "within the error" may now be called the "95% confidence interval". We just didn't want to be that quantitative, on purpose, as we believed it's the researcher's duty, as well as that of the persons using this research, to evaluate the worth of the findings. We did not believe that a government official had the foresight, etc., to do this for us. Thus also Rutherford's statement, mentioned before: "If your experiment needs statistics, then you ought to have done a better experiment".
Now please don't misunderstand me again: I see some value, for instance in publications, in using statistics to give the statement "significant" a more standardized meaning. It's just totally overblown, or even misused, in my opinion.
Incidentally, a small "never" might have been better; the chance of going through zero is no worse than that of going through any other point.

Does anyone know of any free software that can do weighted regression?
Excel has no such function.

Thanks

Just convert your values to the "weighted" form, do the usual regression, then convert back.
-- Tom Jupille
LC Resources / Separation Science Associates
tjupille@lcresources.com
+ 1 (925) 297-5374
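One possible reading of "convert, regress, convert back", sketched for the common 1/x^2 weighting (the choice of weighting, the function names, and the data are assumptions, not something stated in the post). Dividing y = mx + b by x gives y/x = m + b(1/x), so an ordinary regression of y/x against 1/x returns b as its slope and m as its intercept. The sketch also checks the result against the direct weighted formulas.

```python
def ols(x, y):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def weighted_fit_by_transform(x, y):
    """1/x^2-weighted fit of y = m*x + b via transformed ordinary regression."""
    u = [1.0 / xi for xi in x]                 # transformed x (requires x != 0)
    v = [yi / xi for xi, yi in zip(x, y)]      # transformed y
    slope_t, intercept_t = ols(u, v)
    return intercept_t, slope_t                # convert back: (m, b)

def weighted_fit_direct(x, y, w):
    """Weighted least squares from the normal equations; returns (m, b)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    m = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    return m, (swy - m * swx) / sw

# Hypothetical wide-range calibration data.
conc = [1.0, 5.0, 10.0, 50.0, 100.0]
area = [22.1, 103.0, 198.5, 1010.0, 1985.0]

m_t, b_t = weighted_fit_by_transform(conc, area)
m_w, b_w = weighted_fit_direct(conc, area, [1.0 / c ** 2 for c in conc])
print(f"transform: m={m_t:.4f} b={b_t:.4f}   direct: m={m_w:.4f} b={b_w:.4f}")
```

The two routes minimize the same sum of squares, so they should agree to rounding error; other weightings (1/x, 1/y^2, and so on) need a different transform.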

Does anyone know of any free software that can do weighted regression?
Excel has no such function.

Thanks
Excel does have that function (it may require Analysis ToolPak activation from Tools, Add-Ins). To enter the LINEST function (an array function), select a range 2 columns wide by 5 rows (for the typical case of one set of Ys against one set of X calibration values), type =LINEST(...) per the syntax below, then press Ctrl+Shift+Enter together.

Linest - from Help:

Calculates the statistics for a line by using the "least squares" method to calculate a straight line that best fits your data, and returns an array that describes the line. Because this function returns an array of values, it must be entered as an array formula.

The equation for the line is:

y = mx + b or y = m1x1 + m2x2 + ... + b (if there are multiple ranges of x-values)

where the dependent y-value is a function of the independent x-values. The m-values are coefficients corresponding to each x-value, and b is a constant value. Note that y, x, and m can be vectors. The array that LINEST returns is {mn,mn-1,...,m1,b}. LINEST can also return additional regression statistics.

Syntax

LINEST(known_y's,known_x's,const,stats)

Known_y's is the set of y-values you already know in the relationship y = mx + b.
If the array known_y's is in a single column, then each column of known_x's is interpreted as a separate variable.
If the array known_y's is in a single row, then each row of known_x's is interpreted as a separate variable.

Known_x's is an optional set of x-values that you may already know in the relationship y = mx + b.
The array known_x's can include one or more sets of variables. If only one variable is used, known_y's and known_x's can be ranges of any shape, as long as they have equal dimensions. If more than one variable is used, known_y's must be a vector (that is, a range with a height of one row or a width of one column).
If known_x's is omitted, it is assumed to be the array {1,2,3,...} that is the same size as known_y's.

Const is a logical value specifying whether to force the constant b to equal 0.
If const is TRUE or omitted, b is calculated normally.
If const is FALSE, b is set equal to 0 and the m-values are adjusted to fit y = mx.

Stats is a logical value specifying whether to return additional regression statistics.
If stats is TRUE, LINEST returns the additional regression statistics, so the returned array is {mn, mn-1, ..., m1, b; sen, sen-1, ..., se1, seb; r2, sey; F, df; ssreg, ssresid}.
If stats is FALSE or omitted, LINEST returns only the m-coefficients and the constant b.

The additional regression statistics are as follows.

Statistic: Description
se1, se2, ..., sen: The standard error values for the coefficients m1, m2, ..., mn.
seb: The standard error value for the constant b (seb = #N/A when const is FALSE).
r2: The coefficient of determination. Compares estimated and actual y-values, and ranges in value from 0 to 1. If it is 1, there is a perfect correlation in the sample; there is no difference between the estimated y-value and the actual y-value. At the other extreme, if the coefficient of determination is 0, the regression equation is not helpful in predicting a y-value. For information about how r2 is calculated, see "Remarks" later in this topic.
sey: The standard error for the y estimate.
F: The F statistic, or the F-observed value. Use the F statistic to determine whether the observed relationship between the dependent and independent variables occurs by chance.
df: The degrees of freedom. Use the degrees of freedom to help you find F-critical values in a statistical table. Compare the values you find in the table to the F statistic returned by LINEST to determine a confidence level for the model.
ssreg: The regression sum of squares.
ssresid: The residual sum of squares.
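For anyone without Excel, the single-x case of these statistics can be reproduced by hand. The following is a rough sketch, not Excel's implementation: the function name and data are invented, and the conventions used when const is False (uncentered total sum of squares, n - 1 degrees of freedom) are an assumption that may not match LINEST exactly.

```python
import math

def linest_like(y, x, const=True):
    """Single-variable least squares returning LINEST-style statistics."""
    n = len(x)
    if const:
        x_mean = sum(x) / n
        y_mean = sum(y) / n
        sxx = sum((xi - x_mean) ** 2 for xi in x)
        m = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
        b = y_mean - m * x_mean
        df = n - 2
        ss_total = sum((yi - y_mean) ** 2 for yi in y)
    else:
        # Assumed convention for a fit forced through zero (y = m*x).
        sxx = sum(xi * xi for xi in x)
        m = sum(xi * yi for xi, yi in zip(x, y)) / sxx
        b = 0.0
        df = n - 1
        ss_total = sum(yi * yi for yi in y)
    ss_resid = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_reg = ss_total - ss_resid
    se_y = math.sqrt(ss_resid / df)
    se_m = se_y / math.sqrt(sxx)
    se_b = se_y * math.sqrt(1.0 / n + x_mean ** 2 / sxx) if const else None
    return {
        "m": m, "b": b, "se_m": se_m, "se_b": se_b,
        "r2": ss_reg / ss_total, "sey": se_y,
        "F": ss_reg / (ss_resid / df),   # undefined if the fit is exact
        "df": df, "ssreg": ss_reg, "ssresid": ss_resid,
    }

# Hypothetical 5-point calibration data.
conc = [50.0, 75.0, 100.0, 125.0, 150.0]
area = [1012.0, 1498.0, 2005.0, 2489.0, 3010.0]

stats = linest_like(area, conc, const=True)
stats_zero = linest_like(area, conc, const=False)
for name, value in stats.items():
    print(name, value)
```

Comparing `stats` with `stats_zero` shows how forcing the line through zero changes the slope and the residual sum of squares for the same data.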
Thanks,
DR

Let us look back at the original problem: calculating recoveries by y=mx or y=mx+b.
The experimental part includes the analysis of a control and the analysis of a spiked sample (at any level).
If the control sample contains the impurity at some level, that has to be considered in calculating the recovery. Thus (amount present) = b.
Analysis of the spiked sample gives (amount observed) = y.
A known amount of degradant is added for spiking: (amount added) = x.
Thus % amount recovered = (y - b) / x × 100 = (amount observed - amount present) / amount added × 100.

Thus the only correct way of doing it is y=mx+b. y=mx is true only when the amount present in the control sample is zero or not detected.

Yikes! That's so far off base, I'm not even sure where to start.

b is the intercept of the curve, not the amount in an unspiked sample. And x is the slope of the response curve, not the amount spiked.

As we've already established, which form of the equation you use depends on whether the intercept of the response curve is - or is not - significant.

Once you determine the amount in a spike and a control, then you can use this formula to get recovery (assuming a control is used at all):

(Amt observed in spiked sample - Amt in control)/ Amt spiked x 100.
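As a quick worked example of that recovery formula (the function name and the amounts are made up; all three amounts are assumed to be in the same units and already quantitated from the calibration curve):

```python
def percent_recovery(observed_spiked, observed_control, amount_spiked):
    """(amount in spiked sample - amount in control) / amount spiked x 100."""
    return 100.0 * (observed_spiked - observed_control) / amount_spiked

# Hypothetical amounts, e.g. in micrograms.
recovery = percent_recovery(observed_spiked=1.48,
                            observed_control=0.50,
                            amount_spiked=1.00)
print(f"recovery = {recovery:.1f}%")
```

The control amount here is a measured quantity subtracted after quantitation; it is separate from the intercept b of the calibration curve.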

adam,
did you hit the wrong key? The slope is y/(x-b) which is m.
What gives? Anal. good, interpretation (statistics) bodged?

Unbelievable! There must be a gremlin in action: I goofed the simplest of algebra. The slope, when the curve goes through zero, is y/x (for b = 0). When this is not the case (b > or < 0) the slope is, of course, (y+b)/x.

Gee whiz. If you make this + a -, it's finally there: m = (y-b)/x