Challenges to data integrity in the pharma lab

June 24, 2024
To prioritize data integrity in the lab, companies must implement good data practices throughout the R&D process

Data integrity is an ongoing concern across all R&D organizations, no matter what part of the pharmaceutical research life cycle they’re navigating. These concerns extend beyond the potential for delayed timelines or cost overruns. They are about something bigger: establishing a culture of quality, ensuring product efficacy and patient safety, and being a trusted brand, partner or provider.

Prioritizing data integrity in the lab 

To prioritize data integrity in the lab, companies must implement good data practices throughout the R&D process. This involves protecting the fidelity and confidentiality of all records and data generated, including raw data, metadata and transformed data. Key considerations include data integrity, governance and security. Each presents unique challenges, but together they are crucial for upholding good data practices in the modern pharma lab.

A shifting data management landscape 

As organizations digitize their data for scalable analytics, they must evolve their data management strategies to mitigate threats to data integrity, including technological, managerial and external risks. This is no small task. The complexity of R&D data, processes and technologies increases the risk of data integrity violations, as highlighted by the FDA's reports of increasing violations in the pharma industry. A misstep at any point in the R&D process can undermine the validity of the research as a whole.

Data integrity and security breaches could lead to incorrect or irreproducible research results, have implications for patient safety and product efficacy, or generate violations that might cause a drug to be rejected at submission or pulled from the market later.

Multimodal R&D 

Companies hoping to drive innovation are diversifying their R&D efforts, working not only across different areas of science with novel modalities but also across different locations globally, each with its own compliance standards and regulations. As a result, data are pouring in from wide-ranging sources via different means and in different formats. This incredible volume and diversity of multimodal R&D data creates lab integration and data management challenges that can compromise data integrity and security. Many companies are struggling to keep pace with the vast volume of diverse data and metadata needed to inform decision making throughout the R&D process.


Ensuring the success of R&D at scale means collaboration between research groups is becoming essential, and that collaboration depends on efficient data sharing that preserves data integrity and security. However, efforts to improve collaboration face challenges due to differences in data handling approaches and the lack of standardized data management practices. The importance of data sharing in advancing science was recently underscored by the United States National Institutes of Health (NIH), whose new data management and sharing policy took effect in 2023 with the aim of confirming findings, encouraging reuse and spurring innovation. Establishing better data management standards, such as the FAIR guiding principles, can help address these challenges by promoting data findability, accessibility, interoperability and reusability.

Artificial intelligence 

Artificial intelligence (AI) is increasingly integrated into R&D, requiring organizations to adapt their data infrastructures to support AI-driven research in an AI-everywhere world. For many universities and health companies, becoming AI-ready means first adopting technology and process changes to support exponential growth in data volumes, eliminate data silos, integrate bespoke software and systems, and normalize data.

The ultimate goal is for any data created and captured throughout the R&D process to be trustworthy, well-structured, correlated, shareable and model-ready. Global compliance regulations are currently being updated to guide the use of AI and machine learning (ML) in medical and general research.

Achieving trustworthy, well-structured data aligned with regulatory and ethical standards is crucial, particularly with the EU's recent passing of the Artificial Intelligence Act. This landmark law aims to protect human health, safety and fundamental rights as AI is increasingly relied upon for innovation across a broad spectrum of industries and organizations. Now is the time for companies to ensure that their existing systems and processes can meet the regulatory and ethical challenges of using AI in research, including assurance of data integrity, security, traceability and bias limitation.

Good data practices  

Aligning data management with data integrity is vital to long-term research success and to preparing for the automated, connected and collaborative future of research. Fortunately, today's scientists have a wide range of tools to easily manage, search and visualize their R&D data, with the future being led by solutions that can unite all the applications that produce and analyze data within one secure data-management platform.

About the Author

Dan Ayala | Chief Security and Trust Officer, Dotmatics

Throughout his nearly 30-year career, Daniel has led security, privacy and compliance groups in banking and financial services, pharmaceutical, information, higher education, research and library organizations around the world. Daniel writes and speaks regularly on these topics and co-hosts The Great Security Debate Podcast.