Fraud in data? (File numbering and date/time)

Posted: Wed Nov 17, 2010 8:48 pm
by lgouveia
Dear forum fellows,
I'm currently reviewing/auditing some analytical validation data obtained from an Agilent HPLC system (I don't know the hardware model or the software version), and I have a couple of questions on which I was hoping some of you could give feedback. The main issue is that I found several inconsistencies in the data, as well as a poorly designed validation plan (with gross errors).
I feel that some of the results might have been subjected to some kind of "fixing" (runs out of order, stating that some data belong to sample X when in fact they belong to sample Y, etc.).
One thing I noticed is that the file numbering is not sequential: I've found files "xyz000100.d" up to "xyz000125.d" (date on report: 02-Jan-2010 08:00...) and later an "xyz000050.d" (date on report: 05-Jan-2010 08:00), that is, a lower-numbered file with a later date.
Does the file numbering in the ChemStation software follow a sequential order?
If you delete a file, does the software later assign that "free" file number to a new data file, or does it continue numbering the files sequentially, ignoring any previous "blank" numbers?
Is there a way to check whether the computer system date/time has been altered so that the report (as well as the data) appears to have been generated on a different date/time?
Any feedback and/or references on this issue are welcome.
Regards
Luis

Posted: Wed Nov 17, 2010 9:18 pm
by sphereman
It has been a couple of years since I used ChemStation but, if I remember correctly, it is possible to have a lower number on a later day. It all depends on how the sequence was set up and data files stored.

It is possible to have identical file names (xyz000100.d), but the file paths should be different, for example:

C:\Chem\data\02JAN10\xyz000100.d was acquired on 02-Jan-2010 and
C:\Chem\data\05Jan10\xyz000100.d on 05-Jan-2010.

The full file path should be listed near the top of the ChemStation printout.

I hope this helps

Posted: Thu Nov 18, 2010 4:08 am
by Consumer Products Guy
We start a new Sequence with each use, and we use the default numbering, into a different subdirectory, each time. So if our first vial is from position #1, that file will be 001-0101.D. The first three digits are the location of the vial, the two after the dash are the line number of the Sequence Table, and the last two are the injection number on that Sequence Table line. But I think you're talking about automatic numbering.
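
The default naming scheme described above can be decoded mechanically. Below is a minimal Python sketch, assuming the pattern is exactly three digits, a dash, two digits, two digits and a `.D` extension, as in the example 001-0101.D. The names `SequenceFileName` and `parse_sequence_name` are hypothetical helpers for illustration, not part of ChemStation.

```python
import re
from typing import NamedTuple, Optional


class SequenceFileName(NamedTuple):
    vial: int           # vial location (first three digits)
    sequence_line: int  # line number in the Sequence Table (two digits after the dash)
    injection: int      # injection number on that line (last two digits)


# Assumed pattern, taken from the single example "001-0101.D" in the post.
_PATTERN = re.compile(r"^(\d{3})-(\d{2})(\d{2})\.D$", re.IGNORECASE)


def parse_sequence_name(name: str) -> Optional[SequenceFileName]:
    """Split a default sequence file name into its parts, or return None if it does not match."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    vial, line, injection = (int(group) for group in match.groups())
    return SequenceFileName(vial, line, injection)


print(parse_sequence_name("001-0101.D"))
print(parse_sequence_name("xyz000100.d"))  # not a default sequence name
```

A name that returns `None` here simply does not follow this default pattern; it may come from automatic numbering or have been typed in by hand.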

If you use Explorer and list the files in the xyz0000.D or whatever directory, you will also see the time that the half-dozen or so files were recorded. So you might be able to tell if data was fudged if the times aren't in chronological order.
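
The Explorer check described above can also be scripted when there are many data files. Below is a rough Python sketch under several assumptions: each data file is a directory ending in `.d` or `.D`, the number is the run of digits just before the extension (as in xyz000100.d), and the earliest modification time of the files inside stands in for the acquisition time. The function names and the example path are hypothetical.

```python
import re
from pathlib import Path


def numeric_suffix(name):
    """Return the digits just before the .d extension, e.g. 100 for 'xyz000100.d'."""
    match = re.search(r"(\d+)\.d$", name, re.IGNORECASE)
    return int(match.group(1)) if match else None


def out_of_order(entries):
    """Given (name, timestamp) pairs, return (lower_numbered, higher_numbered)
    pairs of neighbouring names where the lower number has the later timestamp."""
    numbered = sorted(
        (numeric_suffix(name), name, stamp)
        for name, stamp in entries
        if numeric_suffix(name) is not None
    )
    flagged = []
    for (_, name_a, stamp_a), (_, name_b, stamp_b) in zip(numbered, numbered[1:]):
        if stamp_a > stamp_b:
            flagged.append((name_a, name_b))
    return flagged


def scan(data_dir):
    """Collect the .d directories under data_dir and report out-of-order neighbours."""
    entries = []
    for path in Path(data_dir).iterdir():
        if path.is_dir() and path.suffix.lower() == ".d":
            # Assumption: earliest file time inside the directory ~ acquisition time.
            stamps = [f.stat().st_mtime for f in path.iterdir() if f.is_file()]
            if stamps:
                entries.append((path.name, min(stamps)))
    return out_of_order(entries)


# Example with made-up timestamps mirroring the case in the first post:
sample = [("xyz000100.d", 1.0), ("xyz000125.d", 2.0), ("xyz000050.d", 3.0)]
print(out_of_order(sample))
# For a real folder (hypothetical path): scan(r"C:\Chem\data")
```

A flagged pair is only a prompt for a closer look. File system times follow the PC clock, so they are no stronger evidence than the clock itself.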
Is there ChemStore electronic data security? That's a requirement for 21 CFR Part 11 in the U.S., and you mentioned a validated method and an audit. The original data as recorded during the analysis should be retrievable using that or something similar.

Posted: Thu Nov 18, 2010 8:10 am
by Peter Apps
Since it is possible to enter data file names manually, an out-of-sequence name is not, in itself, evidence of crookery. If there is an SOP that says data file names have to be sequential, then there might be a violation.

Peter

Posted: Thu Nov 18, 2010 8:57 am
by Csaba
Hi,
Some comments concerning ChemStation WITHOUT ChemStore.

“Does the file numbering in the Chemstation software follows a sequential order?” Comment: You can change it at will.

“If you delete a file, does the software assigns later on that "free filenumber" to a new data file, or it continues numbering the files sequentially ignoring any previous "blank" files?”

Posted: Thu Nov 18, 2010 5:47 pm
by lmh
You refer to "the date on the report".

What do you mean by this? The report has a date at the bottom, which is the date on which the report was produced. It may also have an injection date, which should be the date on which the run was carried out. Obviously the date on which the report was produced may have absolutely nothing to do with the order of runs in a sequence, and illogical dates here are no evidence of fraud. There are circumstances in which I might quite innocently generate a new report for a sample even months after doing the others.

Further to Csaba's point, even on networked instruments, quite often the clock isn't coordinated all that regularly; I can change my networked PC's clock and it stays wrong until I turn off and on again (**). If anyone wanted to cheat on a time-stamp, in many labs it wouldn't be all that hard.

On the matter of file-names, I've noticed that inexperienced users will occasionally (surprisingly often!) copy what they see. If they see logical names, they will type in names in a similar logic, but not quite right! It's the same situation as the person who draws black lines on autoclave tape before getting their bottles autoclaved... file names that are almost logical but not quite might indicate someone who doesn't quite understand where they come from, rather than someone trying to cheat.

(** note for the curious: the only way I found this out was I used to have an invoicing system that could only invoice for a job in the same month as the job was carried out. If I carried out a job on the last day of the month, it was almost impossible to invoice. Temporary Solution while database designer made changes: set the clock back to an earlier date to generate the invoice!)

Posted: Sat Nov 20, 2010 11:39 am
by lgouveia
Sorry for not being accurate. I meant the date regarding the data file (the date when the data file was created) not the report generation/printing which, as you stated, can have any date after the data file date.