Peter,
Thank you again for taking the time to go through all of this. I know it is a lot and I really appreciate it. I don't believe I have done a good job of explaining myself clearly, so I will try to be more concise in this post.
Concern"with only two points you do not know if the calibration is linear, and the negative result for the blank with its very low rsd suggests strongly that it is not. If you and the regulators are happy with a two-point curve then you can empirically estimate uncertainty of calibration + repeatability uncertainty by running multiple replicates at 5 and 10 ppm, but this will not account for line curvature. More points along the line would allow you to estimate uncertainty die to curvature or to fit a curved calibration whose residuals would be lower than for a straight line"
Response: I am not using only two points, as can be seen in the imgur post I shared. For the low-range curve I have two certified standards plus the "100%" blanks. Each certified standard is then used to simulate many different concentrations by varying the pressure in my gas cell. This means I have only two certified sources, but over 50 points covering a range of concentrations.
Example: I collect a scan of my 5 ppm standard at 0 PSIg; this point is taken as 5 ppm. I then collect a scan of the same 5 ppm standard at 2 PSIg and calculate a new simulated concentration: C_s = 5 ppm * (14.7 PSI + 2 PSI) / 14.7 PSI = 5.68 ppm. I realize this also propagates any error in the standards, but the fact that I can vary the pressures like this from two different sources and the result remains linear when including the blanks shows that the standards are what they claim to be, and that varying the pressure to simulate concentrations doesn't add much uncertainty to the curve. My linearity (R^2) for the curve is still 0.999.
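To make the pressure arithmetic concrete, here is a quick Python sketch (my own illustration, assuming ideal-gas behavior and 14.7 PSI atmospheric pressure):

P_ATM = 14.7  # atmospheric pressure in PSI (absolute)

def simulated_concentration(standard_ppm, gauge_psi):
    # Scale the certified concentration by total cell pressure / atmospheric.
    return standard_ppm * (P_ATM + gauge_psi) / P_ATM

print(simulated_concentration(5.0, 2.0))  # -> ~5.68 ppm, the example above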
So far everything in this post relates to a single curve used to measure CO2 in the range from 0 to 11 ppm, using an integration region that is not interfered with by water. I used two different methods to check the uncertainty of this curve. One was compiling 42 of the "100%" blank scans together, which gave me an uncertainty of +/- 0.07 ppm with a blank average of 0.046 ppm.
Separately, I did the S0 and S1 method. I took the points used in the creation of this calibration curve, and for each point I plotted the square of the concentration calculated from the curve on the x-axis against the squared difference between that calculated concentration and the concentration the point was supposed to be on the y-axis. The result is the top image in my second imgur link, so every point on that graph corresponds to a point on my calibration curve: it plots the calculated values for my calibration points against their deviation from the nominal values. Using this method and solving for my uncertainty at 0 ppm (to compare with my other method for determining uncertainty), as described on page 115 of the Eurachem guide, gave me an uncertainty at 0 ppm of +/- 0.088 ppm. This is very similar to the +/- 0.07 ppm found using a completely different method and different scans.
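For anyone following along, here is a minimal Python sketch of the S0/S1 step as I understand it from the guide; the arrays are illustrative placeholders, not my real data:

import numpy as np

calc    = np.array([0.05, 4.90, 5.30, 5.70, 9.80, 10.40])   # from the curve
nominal = np.array([0.00, 5.00, 5.35, 5.68, 10.00, 10.30])  # expected values

x = calc**2              # squared calculated concentration (x-axis)
y = (calc - nominal)**2  # squared deviation from nominal (y-axis)

slope, intercept = np.polyfit(x, y, 1)    # fit the variance model
u_at_zero = np.sqrt(max(intercept, 0.0))  # uncertainty at 0 ppm
print(f"u(0 ppm) ~ +/- {u_at_zero:.3f} ppm")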
Concern"
it is also going to cause extra signal in standards and samples, giving signals that are biased high. The impact on the results for CO2 depends on whether the water content in blanks, samples and standards is equal and always the same. If for e.g. you calibrate using a standard with more moisture than is in your sample the result for CO2 in the sample will be biased low. In the far more likely scenario of variable water content you will have increased variability in signal and results, as well as bias"
Response: Here we are switching gears to a completely different curve, still used to quantify CO2, but at much higher concentrations. The levels of CO2 I am interested in saturate the region used in the low-range CO2 curve too quickly. Therefore I am using this region, which has water interference, because it gave me the best results experimentally. I determined this by compiling all my CAPT proficiency test samples and several SRM samples into a file; I would then change the curve and recalculate values for all the CAPT samples and SRM concentrations. I used the region that gave me the best combination of %RDs for all the samples and the best R^2 value for linearity.
The "far more likely scenario" you mention is of course what is actually happening here. Here is a very quick and generic image as an example.
http://imgur.com/a/0nM30
Now, if this theoretical curve had some interference, such as water, the area for the 1000 ppm point would actually be greater than it should be. This shifts the point to the right, but not up. If this is the case for every point used to make the curve, the curve ends up with a y-intercept lower than expected. This is exactly the case with my high-range CO2 curve, which has a y-intercept of -100. This number is very repeatable, as seen from my extremely low standard deviation of under 3 ppm. You seemed really concerned about this in a previous post, but I just don't see how it is a problem. This curve and the region associated with it are not meant to quantify values below 350 ppm, so the fact that the bias at 0 ppm is -100 ppm shouldn't matter, considering it is a repeatable phenomenon and follows what was expected. I mean, isn't that exactly what bias is supposed to represent? The two CO2 curves, low range and high range, act more as a piecewise function. A quick numerical check of the intercept argument is sketched below.
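Here is that check in Python (my own toy numbers, not the real data): a constant extra area from an interferent shifts every point right, and the fitted concentration-vs-area line then crosses below zero:

import numpy as np

true_ppm   = np.array([400.0, 700.0, 1000.0, 1300.0])
true_area  = true_ppm / 1.0  # pretend sensitivity of 1 area unit per ppm
water_area = 100.0           # hypothetical constant water contribution

slope, intercept = np.polyfit(true_area + water_area, true_ppm, 1)
print(slope, intercept)      # slope ~1.0, intercept ~ -100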
This high-range CO2 curve has been very accurate at quantifying CO2 in CAPT PT samples, which are meant to simulate real-world samples near these concentrations. Additionally, all of my methods have maximum allowable water concentrations reported as dew points. Every sample will come with a dew point; if the dew point exceeds the allowed values then I don't even need to run the CO2, because the sample has already failed and I already know the high levels of water vapor will interfere with the high-range CO2 readings.
Concern"
is x the signal (IR absorption), or has the computer already done some calculations on it ?. From what you say in the next paragraph about the computer subtracting values it sounds is if you are processing the data twice - once to get x and again to get Y from x."
And: "How you do the calculation is not important - what is important is how you get your x value in your previous post, and how you calculated the calibration equation that you use to get from x to Y."
Response: This is a link to images of the actual calibration curves and the actual equations used to calculate concentrations.
http://imgur.com/a/kB4Jo
For the low-range CO2 curve, the equation I use to quantify CO2 is Y = 1.135x + 0.012, where x is the integrated area from the scan and Y is the calculated concentration. This equation was constructed by the software: I put in my 50 or so calibration points, tell the computer what concentration each of those areas is supposed to represent, and it spits out this equation. The x in this case is literally just the area. The only calculations involved up to this point are done to the interferogram to convert it into a spectrum; these include the selected apodization (Happ-Genzel), zero fill (8), and other spectral parameters. These obviously affect the resulting spectra, but these spectral settings are always applied the same way. I don't consider them data-processing steps on the x variable so much as spectral settings.
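Written out as a function (my own wrapper, not the instrument software), the whole x-to-Y step is just:

def co2_low_range_ppm(area):
    # Y = 1.135x + 0.012, x = integrated area, Y = concentration in ppm
    return 1.135 * area + 0.012

print(co2_low_range_ppm(4.40))  # an area near 4.40 -> ~5.0 ppm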
Concern"
now I am confused. In the previous paragraph x is the signal from the instrument. You should calculate concentrations from x by using the calibration equation that you got by plotting signal against standard concentration (you must know the concentration of the standard before you measure it). So what value of what are you telling the computer to use, and what are you subtracting it from ?"
Response: I can see how this was confusing. We are switching gears again completely here. At this point the calibration curves are complete. They were indeed created from standards of known concentration by plotting the instrument signal on the x-axis and these known concentrations on the y-axis. I THEN created completely separate uncertainty graphs, seen here:
http://imgur.com/a/3qhJ3
These are separate graphs from my calibration curves and are meant to tell me different things. Every point on these uncertainty curves corresponds to a specific point on the respective calibration curve, so the x-values on the uncertainty curves are actually the Y-value outputs of the calibration curves. In other words, I take a calibration point, say 5 ppm, and run it back through my calibration curve. So for my low-range CO2 curve, Y = 1.135x + 0.012, I take the area that I associated with 5 ppm and calculate a concentration for it - say 5.3 ppm, for instance. All of these "calculated" concentrations are then squared and used as my x-values on the uncertainty charts. The y-value for that point would be (5.3 ppm - 5 ppm)^2.
So for the uncertainty graphs, every single point corresponds to a point on the respective calibration curve. The x-values for the calibration curves are areas, and the y-values are input concentrations from known standards. For the uncertainty curves, the x-values are the (squared) calculated concentrations for these known-concentration points, and the y-values are simply the squared difference between each calculated value and what was expected.
Concern"
your calibration points are scattered on both axes - you have variability of both signal (as expected ) and CO2 content of the standards. Why are the standards varying ?, and how do you know (without measuring) what their CO2 content is ?"
Response: I assume you are referring to the uncertainty graphs here, and not the actual calibration curves themselves. Under that assumption, here are the uncertainty graphs again, with an explanation.
http://imgur.com/a/3qhJ3
Looking again at the low-range CO2 graph, the y-values aren't measured concentrations; they are differences between calculated and expected values. Recall that I had two SRM standards for my low-range CO2 curve, one at 5 ppm and one at 10 ppm. The variation on the x-axis is due to the slight pressure differences these scans were run at: you can see clumps around 25 and around 100 on the x-axis because these points were made using the 5 and 10 ppm standards (5^2 = 25, 10^2 = 100) with slight pressure variations. The values close to 0 on the y-axis aren't points with little or no signal; they are points with little or no difference between what I calculated and what I input into the computer. Again, every point was made from an SRM of known concentration, so by knowing the pressure I can calculate expected values of these SRMs at different pressures to simulate more points. Of course, doing this with only one standard would be linear by construction, but here I have two standards plus blanks, all of which fall on a line; I then just fill in between these three points with slight pressure variations.
You asked how I know the CO2 content without measuring, and for these uncertainty charts (again, not the calibration curves) I don't, really. My x-values (on the uncertainty charts) for each point are concentration values obtained by plugging the area associated with that point into the Y = mX + B equation from the calibration image - for low-range CO2, the Y = 1.135x + 0.012 equation I keep coming back to. The Eurachem guide calls this Xi, where Xi is the individual measurement, correct? I had a similar bit of confusion with notation when looking through this, because my calibration curves are Conc = M * (signal) + B, while in the Eurachem guide it is signal = M * (Conc) + B.
They then mention the uncertainty of a predicted x due to the variability in Y, which is exactly how I have it set up now, right? When looking at the uncertainty graphs, I have some uncertainty on my concentration (x) due to variability in y (my uncertainty). A quick check of how the two notations relate is sketched below.
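Here is the sketch (my own algebra, just to double-check the notation): if my curve is Conc = m*(signal) + b, then signal = (Conc - b)/m, so the Eurachem-style fit signal = M*(Conc) + B has M = 1/m and B = -b/m:

m, b = 1.135, 0.012     # my curve: Conc = m*signal + b
M, B = 1.0 / m, -b / m  # Eurachem-style: signal = M*Conc + B
print(M, B)             # -> ~0.881, ~-0.0106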
Concern"
No, not correct, if your method has a systematic relative bias of 1.3% there is something wrong with it, either to do with the measurements themselves, or with the calibration"
Response: I am not quite following here. Shouldn't any acceptable bias be determined by the method or certification being sought? And why is a 1.3% bias not acceptable if I can see it and correct for it? Isn't that exactly what a bias is? Would it be acceptable at 0.5%? And if I am correcting for it either way, why is one acceptable and the other not? If I literally only care whether a CO2 level is above or below 1000 ppm with an uncertainty of +/- 50 ppm, and I know that 1000 ppm actually shows up as 987 ppm with an uncertainty of +/- 24 ppm, I don't see how that isn't enough to certify concentrations above or below 1000 ppm. A sketch of that correction is below.
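Here is the sketch, using the numbers from this paragraph (I am assuming the +/- 24 ppm applies to the corrected result, and treating the bias as a fixed offset at the 1000 ppm level):

BIAS_PPM = -13.0  # a true 1000 ppm reads as 987 ppm (1.3% low at this level)
U_PPM = 24.0      # uncertainty of the corrected result

def corrected(reading_ppm):
    # Subtract the known (negative) bias to recover the unbiased value.
    return reading_ppm - BIAS_PPM

c = corrected(987.0)  # -> 1000 ppm
print(f"corrected: {c:.0f} +/- {U_PPM:.0f} ppm")
print("certify above 1000 ppm?", c - U_PPM > 1000.0)  # False: borderline
print("certify below 1000 ppm?", c + U_PPM < 1000.0)  # False: borderline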
I very much appreciate your help. It has been useful for me to sit down and try to explain everything I am working on to an outsider; it really forces me to think it through more clearly. Again, I am confident in these curves' ability to meet the requirements of the certifications I am seeking, but I am just trying to pick a way to report bias and stick with it. I wanted to report bias by concentration, like the uncertainty, but it seems I am going to have to report a bias only when it would cause a sample to fail.