Curve fit - r versus r-squared
Posted: Tue Dec 01, 2009 2:53 am
by bisnettrj2
Can anyone tell me what the difference is between r and r-squared in terms of curve suitability? I read EPA Method 8000C (the guideline document for 8000-series methods), and it defines r and r-squared (coefficient of determination) with completely different equations, but Agilent's Chemstation seems to compute r-squared differently than 8000C does (seemingly as, literally, r*r = r-squared). Or maybe I'm wrong?
pg 43-44
http://www.epa.gov/osw/hazard/testmetho ... 00c_v3.pdf
pg 16
http://www.chem.agilent.com/Library/Sup ... a10424.pdf
Is there any significant difference in terms of defining curve suitability? Is one better than the other? Are there any times when there would be large differences between the two? I usually only use %RSD in evaluating my curves, and I don't have a lot of experience in using linear regression.
Any input will be appreciated.
Posted: Tue Dec 01, 2009 6:38 am
by aceto_81
r-squared is just what it says: r*r.
As far as I can see, the EPA method's calculation for r-squared gives the same result as taking the Agilent calculation for r and then squaring it.
There are some differences between r and r-squared. Take some time to search the internet and I'm sure you will find which one best suits your purposes. Here is one to start with:
http://en.wikipedia.org/wiki/Coefficien ... ermination
Ace
Posted: Tue Dec 01, 2009 4:42 pm
by bisnettrj2
I agree that the Agilent calculation of r-squared is simply the 8000C r-calculation, but 'squared'. My problem lies in the equation for r-squared (COD) in 8000C, and the following statement:
"Most instrument data systems calculate an r2 term as a coefficient describing correlation. This statistic should not be confused with the correlation coefficient (r); they are NOT related. The r2 term is more closely related to the COD as described above. As with the COD, a r2 value of 1.00 indicates a perfect fit." p. 44 of the previously referenced method
Posted: Wed Dec 02, 2009 10:03 am
by aceto_81
I have no clue about how to interpret that statement.
Maybe: "Don't trust your CDS calculation for R^2" ?
Ace
Posted: Wed Dec 02, 2009 5:02 pm
by bisnettrj2
I don't think it means you can't trust the CDS. I think it's saying that r-squared isn't the square of r, so when you have a requirement for a curve fit greater than X for r (but no specified requirement for r-squared), you can't infer the value of r from the r-squared your CDS computes. Of course, that's just my interpretation.
Posted: Sat Dec 05, 2009 6:51 pm
by mbicking
Both r and r-squared (r^2) describe the same thing: how closely the response follows the factors (change in concentration in our case). Strictly, it is r^2 that gives the proportion of the variation in response accounted for by the fit. The remainder is due to random errors and uncertainty from various sources.
r varies from -1 (perfect inverse relationship) to +1 (perfect direct relationship). r^2 ranges from 0 to 1 and carries no information about direction.
Mathematically, the magnitude of r is never smaller than r^2 (the two are equal only at 0, +1, and -1), and I have heard of people preferring r because it gives the bigger number. However, what I have gathered from my limited statistical training is that r^2 is the proper parameter to use.
However, as noted in many other threads, this does not always mean you have a good curve. "It's a good place to start your evaluation, but a bad place to stop."
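To make the size comparison concrete, here is a minimal Python sketch. It assumes r^2 is defined literally as r*r; the r values (including 0.995 as a stand-in acceptance limit) and the helper name `r_squared_from_r` are made up for illustration and do not come from any method or data system.

```python
# Illustration only: the r values below are arbitrary examples, not thread data.
def r_squared_from_r(r):
    """Square r. This equals 'r-squared' only when r^2 is defined as r*r."""
    return r * r

for r in (-1.0, -0.9, -0.5, 0.0, 0.5, 0.9, 0.995, 1.0):
    r2 = r_squared_from_r(r)
    # abs(r) >= r2 holds everywhere on [-1, 1]; equality only at 0, +1, and -1.
    print(f"r = {r:+.3f}   r^2 = {r2:.6f}   |r| >= r^2: {abs(r) >= r2}")
```

Under that definition, a limit stated on r corresponds to a numerically lower limit on r^2, so the two numbers should not be compared against the same threshold.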
Posted: Sat Dec 05, 2009 6:58 pm
by bisnettrj2
Thank you, but I think my question is more along the lines of 'Can you use one to derive the other?', and my thinking is, from the equations laid out in EPA Method 8000C, that the answer is 'no'. However, in Agilent's documentation on how Chemstation calculates r-squared, it seems like they are simply squaring the 'r' equation in EPA Method 8000C and calling it 'r-squared', when EPA Method 8000C seems to define r-squared/Coefficient of Determination as something completely different. Unless I'm reading it wrong? BTW, it's been a very long time since I took stats, and even then it wasn't my favorite subject.
Posted: Sat Dec 05, 2009 7:10 pm
by mbicking
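The two values are indeed related mathematically: for an ordinary straight-line least-squares fit, r^2 = r*r. There are some different ways to do the calculations (weighting, forcing the line through the origin, non-linear models), and that may be where you are getting confused.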
Unfortunately, the terminology is not standardized.
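A rough sketch of how the calculation route can change the answer, in plain Python. The calibration data are invented, and `cod` uses the common textbook form 1 - SS_res/SS_tot; that form is an assumption here and is not copied from EPA Method 8000C or from Agilent's documentation. Some definitions also apply a degrees-of-freedom adjustment, which is not shown.

```python
import math

# Hypothetical calibration data (concentration, response); not from the thread.
conc = [1, 2, 5, 10, 20, 50]
resp = [1250, 2300, 5400, 10900, 20500, 52000]

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ols_line(x, y):
    """Unweighted least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def origin_slope(x, y):
    """Least-squares slope for a line forced through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def cod(y, y_fit):
    """Textbook coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(y) / len(y)
    ss_res = sum((b - f) ** 2 for b, f in zip(y, y_fit))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

r = pearson_r(conc, resp)
slope, intercept = ols_line(conc, resp)
cod_linear = cod(resp, [slope * c + intercept for c in conc])
k = origin_slope(conc, resp)
cod_origin = cod(resp, [k * c for c in conc])

print(f"r*r                      = {r * r:.6f}")
print(f"COD, line with intercept = {cod_linear:.6f}")
print(f"COD, line through origin = {cod_origin:.6f}")
```

For the unweighted line with an intercept, the COD matches r*r to within rounding. Once the fit is constrained (here, forced through the origin), the COD drops below r*r, so r can no longer be recovered by taking the square root of the reported value.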
Posted: Sat Dec 05, 2009 7:22 pm
by bisnettrj2
Perhaps that's where the problem lies. I guess I shouldn't be surprised that there is ambiguity in the guidelines set out by regulatory agencies. Thanks for the help.
Posted: Sat Dec 05, 2009 7:31 pm
by mbicking
Historically, EPA usually has been pretty sophisticated about how they do their statistics, often to the point of absurdity. They originally employed some very good, but very theoretical, statisticians, and the results were rigorous, but not necessarily practical for a practicing analyst.