5 Questions to Ask When Evaluating Data Analytics Offerings

Correctly answering these five questions will help life sciences companies select the right solution to get more value from their data, leading to improved outcomes

Michael Risse

Aug. 3, 2017


The data generation and collection strategies at the center of life sciences organizations and their manufacturing processes have evolved dramatically, especially in recent years. These organizations now collect and store huge volumes of data across their operations, both on premises and off, across multiple geographic locations, in an increasing number of separate data silos.

These advances have coincided with the proliferation of connected sensors and increasingly inexpensive storage, leading to an Industrial Internet of Things (IIoT) ecosystem projected to generate more than 4 trillion gigabytes of data per year by 2020, according to IDC Research.

It is critical that your life sciences data analytics solution is designed to interact with and display time-series data.

New, advanced data analytics are extracting substantial value from the growing volumes of data in many sectors, from retail to finance. So why aren’t all these new analytics widely leveraged in the life sciences industry? With so much data and the promise of so many new technologies, why is it so difficult to gain the same benefits as other sectors? Why do so many life sciences organizations still feel like they have too much data and not enough insight?

We believe this gap — between the data these organizations have and the insight they want — exists because most data analytics solutions fail to address the unique challenges and opportunities presented by life sciences applications.

When we talk about data analytics, we mean any software enabling process engineers or scientists to:

• Create a cleansed, focused data set for analysis through assembling, aggregating, or “wrangling” data from various sources, including data historians, offline data, manufacturing systems and relational databases

• Investigate operations data using “self-service” tools to rapidly analyze alarm, process or asset data for ad hoc or regular reporting

• Publish or share insights and reports across the organization to enable data-driven action, or enable predictive analytics on incoming data

Many data analytics solutions claim to offer some or all of these things, with the goal of finally closing the gap between data and insight. But are they successful? And how do you evaluate success?

We propose five questions every process manufacturing buyer should ask when evaluating an advanced analytics solution.

1. IS THE DATA ANALYTICS SOLUTION DESIGNED SPECIFICALLY FOR PROCESS MANUFACTURING, AND CAN IT HANDLE TIME-SERIES DATA AND SOLVE PROCESS MANUFACTURING PROBLEMS?

Many data analytics solutions are general-purpose tools designed for IT, and as such aren’t a good fit for life sciences. Hence the first question.

Anyone who works with industrial time-series data knows it isn’t like other data. No matter the industry — from pharma to mining to oil & gas — the data produced and the assets involved present a tangle of convoluted relationships and contextual challenges.

Focusing on technologies rather than solving process manufacturing problems leads to confusion and dissatisfaction.

Whether you’re looking at a pharmaceutical facility, a process development lab or a refinery — there are historians collecting data across many different protocols used by multiple vendors across a disparate array of equipment, of varying ages and implementations.

These systems are typically producing data at speeds and volumes that other industries find dizzying, and at uneven intervals that can confound conventional relational databases. All this data also needs to be cleansed to be useful.

To make matters worse, all these events and signals lack the associated context to make them meaningful on their own — a problem compounded further when assembling data from multiple sources, which requires the addition of these key relationships.

Finally, time-series data is hard to navigate. Sensor readings carry timestamps that need to be aligned and aggregated across specific ranges in time, an obstacle not found with transactional data.
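To make the alignment problem concrete, here is a minimal Python sketch of one common approach: averaging unevenly spaced readings into fixed-width time buckets, then placing several signals on a shared grid. The signal names, values and the one-minute bucket width are illustrative assumptions, not taken from any particular historian or product.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

EPOCH = datetime(1970, 1, 1)

def bucket_start(ts, width):
    """Floor a timestamp to the start of its fixed-width bucket."""
    return EPOCH + ((ts - EPOCH) // width) * width

def aggregate(readings, width):
    """Average irregularly spaced (timestamp, value) readings per bucket."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[bucket_start(ts, width)].append(value)
    return {start: mean(values) for start, values in buckets.items()}

def align(signals, width):
    """Put several signals on one shared time grid; gaps become None."""
    per_signal = {name: aggregate(r, width) for name, r in signals.items()}
    grid = sorted({start for agg in per_signal.values() for start in agg})
    return [
        {"time": start,
         **{name: agg.get(start) for name, agg in per_signal.items()}}
        for start in grid
    ]

# Hypothetical readings, given as seconds after t0, at uneven intervals.
t0 = datetime(2017, 8, 3, 9, 0, 0)
signals = {
    "temperature": [(t0 + timedelta(seconds=s), v)
                    for s, v in [(2, 36.9), (31, 37.1), (65, 37.4), (125, 37.6)]],
    "ph": [(t0 + timedelta(seconds=s), v)
           for s, v in [(10, 7.0), (70, 7.2), (110, 7.4)]],
}
rows = align(signals, timedelta(minutes=1))
for row in rows:
    print(row)
```

Each row covers one minute. A signal with no reading in that minute shows up as None rather than being silently dropped, which is one reason this kind of data does not fit neatly into a transactional table.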

The right data analytics solution will work exclusively with industrial time-series data. This will enable the solution to go far beyond spreadsheets or general purpose data analytics software designed for relational or IT-based applications.

This means correctly handling, displaying and navigating time-series data. This enables users to capture the right data to solve real life sciences problems.

Collecting sensor data isn’t a trivial task. It’s also often the start of a longer process that involves cleansing, adding context and performing calculations — a process that needs to leverage the hard-won insights and institutional knowledge of engineers.

2. DOES THE ANALYTICS SOLUTION RELY ON YOUR EXPERTS OR THEIR EXPERTS?

Beware of vendor experts bearing correlations. Many data analytics vendors know their own technology extremely well, but don’t know much about process manufacturing. This can lead to a focus on the analytics themselves rather than the implications of any findings — and, in turn, an emphasis on correlations over outcomes.

The key to positive business outcomes for process manufacturing is empowering internal experts. A typical process manufacturing organization has a great deal of expertise at its disposal, spread out across a skilled front line of process engineers, scientists, team leads and other technical specialists.

These experienced front-line users often have decades of experience; detailed knowledge of the company’s processes and history; situational awareness of its operations; and fluency in plant assets, sensors and tags. They have the advanced technical education and experience to ask smart, productive questions.

Unfortunately, these employees are often limited by an aging suite of data analytics tools, most of which were originally created in the mid- to late-1990s. They know the right questions — but asking them using existing analytics tools can be difficult and time-consuming, almost impossible in many cases.

The right data analytics solution puts the power into the hands of the people who can most efficiently create the most positive outcomes. The solution must be designed for use by front-line employees — specifically engineers with the experience, expertise and education to investigate alarms, generate reports and optimize production outcomes.

The solution must have the productivity tools and features to help plant personnel assemble, cleanse, search, visualize, contextualize, investigate and share insights from process data — all without involving IT experts.

The selected solution must also help life sciences organizations get the most out of a quickly retiring cohort of engineers before they leave. It needs to help capture knowledge, and then preserve and present it in a modern, browser-based application that appeals to the incoming wave of new engineers.

You’ve already got the people and the expertise. Just give them the right tools, get out of the way and enjoy the results.

3. IS THE VENDOR FOCUSED ON YOUR PROBLEMS?

Too many data analytics vendors are more focused on the technologies behind their products than on your problems.

We’re in the midst of an overwhelming wave of new data analytics technologies including software innovation with big data, hardware innovation with scale-out computing architectures and cognitive computing innovation with advances in machine learning and deep learning.

It’s no longer even clear what the term “analytics” means in industrial environments. These rapid advancements have led to two big problems for anyone trying to compare analytics solutions for a process manufacturing organization:

First, the technologies — big data, predictive analytics, machine learning, cloud computing, etc. — have eclipsed the narrative of benefit and impact. Rather than discussing why we should adopt a particular innovation, the conversation focuses too often on what technology to use, often with more enthusiasm for the technology than the actual benefits.

Second, the sheer pace of recent innovation means there has been too little focus on fitting new offerings into existing environments. Technology generations used to last decades; now it feels like months. The result is that many customers get lost in the fog of technology discussions instead of focusing on end results.

The answer to “Which technology?” should always be “All of the above, but only as required.”

Presumably, the goal of any new offering is to improve outcomes in yields, margins, quality and safety. So any modern solution should be drawing from all these recent technology advancements to accomplish those outcomes — without organizations having to enlist expert assistance or know exactly how these underlying technologies work.

The world of big data, predictive analytics, machine learning and cloud computing needs to be turned inside out — from a technology-centric and revolutionary approach to a user-focused and problem-solving evolutionary approach.

The right data analytics solution won’t ask you to do a lot of work to adopt specific innovations. It will instead harness innovation on your behalf, in a manner leveraging many technology advancements to deliver concrete benefits specific to process manufacturing via a modern, cloud-enabled and browser-friendly application experience.

4. DOES THE SOLUTION REQUIRE YOU TO MOVE YOUR DATA?

The data analytics solution should not require you to move, duplicate or transform your data. Contextualization has always been difficult with process data, often requiring manual effort and painstaking work in Microsoft Excel to define relationships among relevant data. Historians have come a long way in terms of trend viewing and investigation, but “Export to Excel” is still every historian’s most important feature for doing the “real work” of data aggregation, context and modeling.

Collaboration tools are a required feature for any data analytics solution in a life sciences setting.

For example, a pharmaceutical engineer might have several hypotheses to explain a bad batch outcome, ranging from an error by an operator to a bioreactor maintenance event to a specific raw-material variation. The data exists to validate these hypotheses, but it requires bringing together disparate databases, often across multiple data silos, and then creating context to evaluate the data.

This contextualization process goes by many names — including data wrangling, data harmonization and data blending — but for many analytics solutions, these tasks still require manual data transformation and duplication, and often the creation of entirely new databases or data lakes.
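As a rough illustration of what adding context means in practice, the Python sketch below labels each sensor reading with the batch that was running when it was recorded. The batch IDs, time windows and readings are hypothetical, and the sketch assumes batch windows do not overlap.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical batch records: (start, end, batch_id), non-overlapping.
batches = sorted([
    (datetime(2017, 8, 1, 6, 0), datetime(2017, 8, 1, 18, 0), "B-101"),
    (datetime(2017, 8, 2, 6, 0), datetime(2017, 8, 2, 18, 0), "B-102"),
])
starts = [start for start, _end, _batch_id in batches]

def batch_for(ts):
    """Return the batch whose [start, end) window contains ts, else None."""
    i = bisect_right(starts, ts) - 1   # last batch starting at or before ts
    if i >= 0 and ts < batches[i][1]:
        return batches[i][2]
    return None

# Hypothetical bioreactor temperature readings: (timestamp, value).
readings = [
    (datetime(2017, 8, 1, 7, 30), 37.1),
    (datetime(2017, 8, 1, 20, 0), 25.0),   # falls between batches
    (datetime(2017, 8, 2, 6, 0), 36.8),
]
labeled = [(ts, value, batch_for(ts)) for ts, value in readings]
for row in labeled:
    print(row)
```

Done by hand in a spreadsheet, this lookup is repeated for every tag and every investigation, which is the manual effort described above.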

A better approach is to create an index on top of your data sources so front-line users can look at the data in a structured way, while leaving the data itself in place. Engineers and scientists can search the data like they would with Google, and quickly and dynamically add context within historian data and across data sets.
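The idea of indexing data in place can be sketched in a few lines of Python. In this sketch the index holds only tag names and a pointer to the system that owns each tag, and every read is delegated to that system. The class names, tag names and the in-memory stand-in for a historian are assumptions made for illustration; they do not describe any specific vendor's API.

```python
class InMemorySource:
    """Stand-in for a historian or database; a real one would query remotely."""

    def __init__(self, name, series):
        self.name = name
        self._series = series          # {tag: [(timestamp, value), ...]}

    def tags(self):
        return list(self._series)

    def read(self, tag, start, end):
        return [(t, v) for t, v in self._series[tag] if start <= t < end]


class TagIndex:
    """Stores tag names and where they live, never the readings themselves."""

    def __init__(self):
        self._where = {}

    def register(self, source):
        for tag in source.tags():
            self._where[tag] = source

    def search(self, text):
        """Case-insensitive substring search over known tag names."""
        text = text.lower()
        return sorted(tag for tag in self._where if text in tag.lower())

    def read(self, tag, start, end):
        """Fetch readings from the owning source at query time."""
        return self._where[tag].read(tag, start, end)


# Hypothetical sources; timestamps are plain seconds to keep the sketch short.
index = TagIndex()
index.register(InMemorySource("historian_a", {
    "BR1.Temp": [(0, 37.0), (60, 37.2)],
    "BR1.pH": [(0, 7.0)],
}))
index.register(InMemorySource("lab_db", {"BR1.Titer": [(30, 1.8)]}))

print(index.search("temp"))
print(index.read("BR1.Titer", 0, 60))
```

Because the index never copies readings, adding another source means registering its tags rather than loading its history into a new database.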

When the data analytics solution connects directly to historians, engineers can contextualize without getting IT or other experts involved, without duplicating or transforming the data and without creating data lakes.

The data analytics solution needs to connect to your live data — as it is and where it is, of any size and any type — and let the engineer interact with it. Your engineers are free to traverse the system, ask questions on the fly and layer multiple data sources on top of each other in a single view, even when many historians are involved. All work is captured for future use and reference, so your engineers don’t have to start all over again if their original question doesn’t prove out.

5. IS THE SOLUTION FAST ENOUGH?

The right solution should allow your engineers and scientists to work as fast as they can think.

Engineers typically look at data for a specific reason: an alarm went off in a system, someone has asked a question or they need to generate a report. Traditional analysis tools often require specialized skills or syntax, so these tasks can be difficult and time-consuming, and the tools are typically mastered by only a few people within an organization.

Beyond the struggles of individual users, few tools are built around collaboration and organizational knowledge capture. When one user cleanses data for a project or creates context and relationships among data sources, that analysis and information is often lost, with no way for other users to discover or leverage it.

Advanced data analytics solutions should be flexible enough to support both real-time collaboration and existing workflows. Engineers should be able to interact with tools spontaneously, as quickly as they can create tasks or devise hypotheses. Their work should also be iterative and distributed so it can inform others, making the whole greater than the sum of its parts.

The data analytics solution’s interface for front-line engineers needs to provide a fluid, visual and pleasant user experience, similar to other modern web-based applications. Searching, saving and sharing must be easy and intuitive, even from browsers and mobile devices, and it should be possible to generate reports in minutes, not days.

Collaboration tools need to provide a central place where all your front-line employees can work together to leverage each other’s expertise and work. For example, one person might know a certain set of process data really well, and they know how to clean and transform that data, while another person might know the ERP system really well, and another might be an expert with your maintenance system. With the right data analytics solution, they can work together, easily and confidently, simultaneously or asynchronously, across teams and geographic locations. All work is captured and saved automatically, and this leads to a powerful, multiplicative effect on productivity and user empowerment as more expertise, knowledge and context are injected into the data analytics system over time.

CLOSING THE GAP

The large and growing gap between data and insight in life sciences organizations will only start closing when data analytics vendors start putting the process engineer and analyst, by whatever title, at the center of the picture. These engineers have the expertise, ability and incentive to ask the right questions and take advantage of insights generated by the answers.

With the right data analytics solution, these engineers, analysts and experts can quickly bridge the gap between data and insight. This will make difficult problems easy, and impossible problems solvable. The result is faster insights leading to better yields, margins, quality and safety outcomes.