Forensic Auditing and Data Integrity

 

In recent years, regulatory agencies in Europe and the US have increased their focus on data integrity. So far, most warning letters resulting from data tampering have gone to off-shore facilities. However, regulatory agencies are improving their ability to detect data tampering, and scrutiny of QC laboratories and manufacturing organizations within Europe and the US will increase as well.

 

Responsibility for data integrity cannot be delegated to CMOs. According to regulations, the client remains responsible for the quality of its products. Moreover, the FDA can halt all regulatory review of a sponsor's applications if it alleges data integrity issues at just one CMO.

What is data integrity?

Several regulatory authorities (FDA, EMA, etc.) have defined data integrity and outlined their expectations for demonstrating it. Let's give it a try here:

According to Davenport et al., data is the record of a transaction. Thus, a data point contains at least one observable (e.g., a temperature), the value of which is documented in one form or another. Other attributes of the data point can be documented as well, such as the air pressure or the time the transaction was performed; these attributes are called metadata. Data must also be authenticated, i.e., it must be possible to discern who (or what instrument, under what governance) made the observation and performed the documentation.
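As a minimal illustration of these terms (a sketch only; the field names are ours, not taken from any regulation or guidance), such a data point could be modeled like this:

    from dataclasses import dataclass, field
    from datetime import datetime

    @dataclass
    class DataPoint:
        observable: str        # what was observed, e.g. a temperature
        value: float           # the documented value
        unit: str              # e.g. "degC"
        recorded_by: str       # who, or what instrument under what governance, made the record
        recorded_at: datetime  # when the documentation was made
        metadata: dict = field(default_factory=dict)  # other attributes, e.g. air pressure

    # Hypothetical example record
    reading = DataPoint(
        observable="temperature",
        value=21.3,
        unit="degC",
        recorded_by="analyst J. Doe via probe T-07",
        recorded_at=datetime(2016, 3, 2, 10, 15, 5),
        metadata={"air_pressure_hPa": 1013},
    )

Data integrity then amounts to being able to reconstruct each such record, its metadata, and its authentication after the fact.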

In short, data integrity is the ability to completely retrace the provenance of data. Regulatory expectations therefore focus on:

  • The management structures and governance of obtaining and documenting data, including its authentication

  • The systems (EDC software, computers, networks, but also log books, etc.) used to capture and document data

  • Retention of data (including backup)

The need for forensic data integrity analysis

Given the above, it is logical that data integrity audits have traditionally focused on management structures, (computer) systems, and retention systems. Today, however, this is no longer enough. Computer systems in particular have become very complex, and experience has shown that people with the necessary wherewithal and determination can undermine even the best systems.

And what if your systems analysis has uncovered an opening for data manipulation? Does that automatically mean that (all) data has been manipulated? How would you show to what extent data has been manipulated? These are questions that cannot be answered by analyzing a sample of the data: the volume of data is simply too large for a conclusion drawn from a small sample to carry any statistical weight.
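A rough, purely hypothetical calculation makes the point: if 500 of 100,000 records were manipulated (0.5%), a random sample of 200 records would contain none of them roughly 37% of the time.

    # Hypothetical illustration of why sampling cannot establish data integrity.
    # Probability that a simple random sample contains zero manipulated records
    # (approximated as sampling with replacement).
    def miss_probability(total_records, manipulated, sample_size):
        p_clean = 1 - manipulated / total_records
        return p_clean ** sample_size

    # 100,000 records, 500 manipulated (0.5%), sample of 200 records:
    print(round(miss_probability(100_000, 500, 200), 2))  # ~0.37, i.e. a 37% chance of finding nothing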

The challenge:

The challenge in demonstrating data integrity is the vast amount of data that is typically in play, even when it concerns only a single ANDA filing. The situation is akin to finding a needle in a haystack. Unless thorough methodologies are applied, the absence of any finding of anomalies does not show that no anomalies exist, and someone else, e.g., an investigator from a regulatory agency, may "just get lucky". QORM's deep roots in data management allow us to facilitate processes and develop tools to uncover data tampering, or to show that data tampering in all likelihood has or has not occurred.

Below we describe two examples with vastly different results. In the first, QORM consultants were able to show that, despite fears to the contrary, the company's QC data had integrity, allowing an ANDA submission. In the second, we showed that QC personnel had systematically abused super-user privileges to tamper with unfavorable data.

Case 1: Verification of data integrity of ANDA filing

The client, a multinational pharmaceutical company with production capabilities in Asia, planned to submit data for an ANDA. Prior to submission, the client wanted to ensure the data integrity of the ANDA submission package. A recent FDA inspection of the QC facility in question had raised concerns about data integrity, focusing on one particular example of HPLC data.

For brevity's sake, we describe only one set of data analyses for this client here. We set out to ascertain whether unfavorable HPLC data had been omitted, or whether integration parameters had been used that were not aligned with the approved methods.

The HPLC data was collected in an Empower 2 data system. With the vendor's help, we obtained a mirror of the data from the HPLC instruments in this laboratory and fed it into an Access database. We also obtained a dump of the relevant data from the LIMS. The total data set contained over 100,000 records from more than 25,000 injections. By combining the LIMS and Empower 2 data we were able to show that:

  • All Empower 2 data could be traced to a specific Analytical Request that was dated prior to the execution of the HPLC experiment

  • All HPLC experiments were executed in accordance with their method

  • No final result (e.g., after manual integration) was "fitted" so as to bring an OOS result into specification

  • A very small number of experiments resulted in OOS results, which we were able to trace to either analyst error or equipment malfunction through the resulting deviations. This included the data the FDA had previously flagged as potentially tampered with.

  • There was a sizeable number of differences between the results recorded in LIMS and in Empower 2. None of these differences had any material effect on the end result, and they were most likely transcription errors
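The reconciliation itself was performed in the Access database; the sketch below, in Python with hypothetical file and column names, illustrates the kind of cross-checks behind the first and last bullets: every injection must trace to an Analytical Request dated before the run, and results reported in Empower 2 and LIMS must agree.

    import pandas as pd

    # Hypothetical extracts; the column names are illustrative, not actual Empower 2 or LIMS fields.
    # empower_injections.csv: request_id, injection_time, result
    # lims_requests.csv:      request_id, request_date, result
    empower = pd.read_csv("empower_injections.csv", parse_dates=["injection_time"])
    lims = pd.read_csv("lims_requests.csv", parse_dates=["request_date"])

    merged = empower.merge(lims, on="request_id", how="left", suffixes=("_emp", "_lims"))

    # Injections without a matching Analytical Request, or whose request post-dates the run
    orphans = merged[merged["request_date"].isna()]
    backdated = merged[merged["request_date"] > merged["injection_time"]]

    # Results that differ materially between the two systems (tolerance chosen arbitrarily here)
    mismatches = merged[(merged["result_emp"] - merged["result_lims"]).abs() > 0.05]

    print(f"{len(orphans)} injections without a matching request")
    print(f"{len(backdated)} injections whose request post-dates the run")
    print(f"{len(mismatches)} result discrepancies between Empower 2 and LIMS")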

 

This information, together with the results from the paper audit, allowed our client to use the ANDA data set for its successful submission.

Case 2: Forensic data analysis to uncover data tampering

The client, a multinational biopharmaceutical company, was planning to file an NDA. One of its development partners, a CMO in India, had received a warning letter after a whistle-blower alleged that analysts in the QC lab routinely used super-user privileges on the computers to delete data and insert new data sets after manipulating the system clock. The FDA had discovered one such instance. The client wondered whether it should continue working with the CMO.

 

The following discussion focuses on the analysis of gas chromatography (GC) data only. QORM consultants quickly confirmed that analysts had administrator access to the stand-alone PCs that each controlled one GC machine running EZChrome software.

Due to a power outage some weeks prior to our analysis, all data had been wiped from the PCs. The CMO's IT group maintained daily backups of all data: the data was first copied to a central hard drive and then backed up to tape. Because of this copy step, the date/time stamps of the original data had been lost.

Our goal became to ascertain whether analysts had routinely manipulated the computer clocks and deleted/recreated data on the hard drives of these computers. At our request, the CMO's IT group restored the backups of one instrument for each of 180 days (half a year).

We developed a PowerShell script to extract specific data from all injections in every one of the backups. The fields extracted included:

  • Sequence Number

  • Injection Number

  • Date/Time of Data Acquisition

Our reasoning was that, if data was deleted on the PC and a new data point with the same name was created after manipulating the PC's clock, the day and the hour of the data acquisition would most likely be matched, but the minutes and/or seconds of the acquisition timestamp would not. Thus, the date/time of data acquisition for a specific data point would differ between one backup and another, unless the data manipulation happened within one and the same day.
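The extraction itself was done with the PowerShell script mentioned above; the sketch below, in Python with an assumed folder layout and field names, illustrates the comparison logic: collect the acquisition timestamp of each (sequence, injection) pair from every restored backup and flag pairs whose timestamps do not agree across backups.

    import csv
    from collections import defaultdict
    from pathlib import Path

    # Assumed layout: one folder per restored daily backup, each holding an
    # "injections.csv" produced by the extraction step with the columns
    # sequence, injection, acquired_at.
    timestamps = defaultdict(set)  # (sequence, injection) -> acquisition timestamps seen

    for backup_dir in sorted(Path("restored_backups").iterdir()):
        csv_path = backup_dir / "injections.csv"
        if not csv_path.exists():
            continue
        with open(csv_path, newline="") as fh:
            for row in csv.DictReader(fh):
                key = (row["sequence"], row["injection"])
                timestamps[key].add(row["acquired_at"])

    # A data point that was deleted and re-acquired after resetting the clock will
    # typically match the original day and hour but not the minutes/seconds, so its
    # timestamp differs between backups taken before and after the manipulation.
    suspect = {key: stamps for key, stamps in timestamps.items() if len(stamps) > 1}

    for (sequence, injection), stamps in sorted(suspect.items()):
        print(f"sequence {sequence}, injection {injection}: {sorted(stamps)}")
    print(f"{len(suspect)} injections with inconsistent acquisition timestamps across backups")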

Indeed, for this one instrument alone we were able to show several dozen cases where data appeared to have been manipulated.

[Figure: ETL for Data Integrity]