Smarter Access to Analytics for Process Engineers

Dec. 16, 2016
Pattern recognition software leverages historian data to automate information gathering for plant engineers and managers

In this information age, data is everywhere. Plant and operations managers receive vast amounts of both structured and unstructured data every day. How can that data be distilled into usable information, and how can it be accessed quickly and affordably to improve performance?

Historians serve as a repository for data from many systems, making them a good source for advanced analytics. However, process historian tools are not ideal for automating data analysis or search queries: they are ‘write’ optimized, not ‘read/analytics’ optimized. Finding the relevant historical event and building the process context is usually a time-consuming, laborious task.

A level of operational intelligence and an understanding of the data are required to improve process performance and overall efficiency. Process engineers and other personnel must be able to search time series data over a specific timeline and visualize all related plant events quickly and efficiently. This data includes the time series generated by process control and automation systems, lab systems and other plant systems, as well as the annotations and observations made by operators and engineers.


Limitations of Data Modeling Software

• Requires significant engineering: Data cleaning, filtering, modeling, validating, iterating on results/models

• Sensitive to change: Users need continual training

• Requires a data scientist: Plants must hire additional staff, or engineers spend too much time trying to be data scientists

• Not plug and play: Installation and deployment require significant time and money

• Black box engineering: Users cannot see how results are determined

In order to run a plant smoothly, process engineers and operators need to be able to accurately predict process performance or the outcome of a batch process, while eliminating false positives. Accurately predicting process events that will likely happen in a plant or facility requires accurate process historian or time-series search tools and the ability to apply meaning to the patterns identified within the process data.

While a variety of process analytics solutions exist in the industrial software market, these largely historian-based software tools often require a great deal of interpretation and manipulation and are far from automated. They produce backward-looking trends or export raw data to Microsoft Excel. The tools used to visualize and interpret process data are typically trending applications, reports and dashboards. These can be helpful, but they are not particularly good at predicting outcomes.

Predictive analytics, a relatively new dimension to analytics tools, can provide valuable insights about what will happen in the future based on historical data, both structured and unstructured. Many predictive analytics tools take an enterprise-wide approach and require sophisticated distributed computing platforms such as Hadoop or SAP HANA. These are powerful and useful for many analytics applications, but they represent a more complex approach to managing both plant and enterprise data. Companies that use this enterprise data management approach often must employ specialized data scientists to help organize and cleanse the data. In addition, data scientists are not as intimately familiar with the process as engineers and operators are, which limits their ability to achieve the best results.

Furthermore, many of these advanced tools are perceived as engineering-intensive “black boxes” in which the user only knows the inputs and the expected outcome, with no insight into how the result was determined. Understandably, for many operational and asset-related issues, this approach is too expensive and time consuming and requires highly skilled data scientists. This is why many vendors target only the 1 percent of critical assets, ignoring many other opportunities for process improvement.

There are just a handful of solution suppliers that are taking a different approach to providing industrial process data analytics and also leveraging unique multi-dimensional search capabilities for stakeholders. This approach combines the ability to visualize process historian time series data, overlay similar matched historical patterns and provide context from data captured by engineers and operators.

The ideal pattern recognition solution provides an on-premise, packaged virtual server deployment that integrates easily with a local copy of the plant historian’s database archives and evolves over time toward a scalable architecture that communicates with available enterprise distributed computing platforms.

This newer technology uses “pattern search-based discovery and predictive-style process analytics” targeting the average user. It is typically deployed in less than two hours, delivering immediate value, with no data modeling or data scientist required. Often called “self-service analytics,” this software puts the power of extensive search and analytics into the hands of the process experts, engineers and operators, who can best identify areas for improvement.

Another problem typically presented by historian time series data is the lack of a robust search mechanism along with the ability to annotate effectively. By combining both the search capabilities on structured time series process data and data captured by operators and other subject matter experts, users can predict more precisely what is occurring or what likely will occur within their continuous and batch industrial processes.

According to Peter Reynolds, Senior Consultant at ARC Advisory Group, “The new platform is built to make operator shift logs searchable in the context of historian data and process information. In a time when the process industries may face as much as a 30 percent decline in the skilled workforce through retiring workers, knowledge capture is a key imperative for many industrial organizations.”

Self-service analytics delivers:
• Cost-efficient virtualized deployment (“plug and play”) within the available infrastructure
• Built-in knowledge of both process operations and data analytics techniques, avoiding the need for specialized data scientists
• Easy scalability for corporate big data initiatives and environments
• A model-free predictive process analytics tool (discovery, diagnostic and predictive) that complements and augments, rather than replaces, existing historian information architectures


What to Look for in a Self-service Analytics Solution

• Column store with in-memory indexing of historian data

• Search technology based on pattern matching and machine learning algorithms empowering users to find historical trends that define process events and conditions

• Diagnostic capabilities to quickly find the cause of detected anomalies and process situations

• Knowledge and event management and process data contextualization

• Identification, capture and sharing of important process analyses among billions of process data points

• Capture capabilities that support event frames or bookmarks created manually by users or generated automatically by third-party applications. These annotations are visible within the context of specific trends.

• Monitoring capabilities that integrate predictive analytics and early warning detection of abnormal process events based on saved historical patterns or searches, leveraging live process data. These provide operators a live view to determine whether recent process changes match expected process behavior, so they can proactively adjust settings when they do not.
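The event-frame and bookmark capability above can be pictured as a small annotation log attached to time ranges on specific tags. The following is a minimal, illustrative sketch, not any vendor's actual data model; all class, tag and field names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class EventFrame:
    """A bookmark over a time range, created manually by a user or
    generated automatically by a third-party application.
    (All names here are illustrative assumptions.)"""
    tag: str
    t_start: int
    t_end: int
    note: str
    source: str = "user"  # or "app" for auto-generated frames

@dataclass
class AnnotationLog:
    frames: list = field(default_factory=list)

    def add(self, frame):
        self.frames.append(frame)

    def visible_in_trend(self, tag, t_from, t_to):
        """Return the frames that overlap a specific trend window,
        so annotations appear in the context of that trend."""
        return [f for f in self.frames
                if f.tag == tag and f.t_end >= t_from and f.t_start <= t_to]

log = AnnotationLog()
log.add(EventFrame("reactor_temp", 100, 140, "foaming observed"))
log.add(EventFrame("reactor_temp", 900, 950, "valve sticking", source="app"))
log.add(EventFrame("flow_rate", 120, 130, "pump swap"))

hits = log.visible_in_trend("reactor_temp", 90, 200)
print([f.note for f in hits])  # ['foaming observed']
```

Keeping annotations keyed to (tag, time range) pairs is what lets a trend view surface only the notes relevant to the window an engineer is looking at.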

Unlike traditional historian desktop tools, pattern recognition and machine learning algorithms let users search process trends for specific events or detect process anomalies. Much like the music app Shazam, self-service analytics works by identifying significant patterns in the data (its “high energy content”) and matching them to similar patterns in its database, rather than trying to match every note of a song. This technique lets Shazam identify songs quickly and accurately; if a search took too long, the user would simply abandon it.
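The core of such a pattern search can be sketched with a standard technique: slide the query shape over the time series and rank windows by z-normalized Euclidean distance, so matches are found regardless of absolute level or scale. This is a generic illustration of similarity search, not any product's proprietary algorithm:

```python
import numpy as np

def znorm(x):
    """Z-normalize a window so matching ignores offset and scale."""
    s = x.std()
    return (x - x.mean()) / s if s > 0 else x - x.mean()

def find_similar_patterns(series, query, top_k=3):
    """Slide the query over the series and return the start indices
    of the top_k most similar, non-overlapping windows."""
    m = len(query)
    q = znorm(np.asarray(query, dtype=float))
    s = np.asarray(series, dtype=float)
    dists = np.array([np.linalg.norm(znorm(s[i:i + m]) - q)
                      for i in range(len(s) - m + 1)])
    hits = []
    for i in np.argsort(dists):          # best matches first
        if all(abs(int(i) - j) >= m for j in hits):  # skip overlaps
            hits.append(int(i))
        if len(hits) == top_k:
            break
    return hits

# Toy historian trace: the same "event" shape buried in noise 3 times
rng = np.random.default_rng(0)
event = np.sin(np.linspace(0, np.pi, 20))
series = rng.normal(0, 0.1, 200)
for start in (30, 110, 160):
    series[start:start + 20] += event

matches = find_similar_patterns(series, event, top_k=3)
print(sorted(matches))  # start indices near the three injected events
```

Real self-service analytics tools add indexing so this search runs over years of data interactively, but the matching principle is the same: compare shapes, not raw values.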

These technologies form the critical base layer of the new systems technology stack because they make use of existing historian databases and create a data layer that indexes the time series data in a column store. These next-generation systems also work well with leading process historian suppliers, including OSIsoft, AspenTech and Yokogawa. Typically, they are designed to be simple to install and deploy via a virtual machine (VM) without impacting the existing historian infrastructure.
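The column-store idea behind this data layer can be shown in a few lines: keep one value column per tag plus a shared, sorted timestamp index, so ingestion stays a cheap append while range reads become a binary search and a slice. This is a deliberately minimal sketch (real products add compression, caching and persistence); tag names are invented for the example:

```python
import bisect
from collections import defaultdict

class ColumnStore:
    """Minimal in-memory, column-oriented index over historian data:
    one value column per tag, plus a shared sorted timestamp column
    that supports fast range queries via bisection."""

    def __init__(self):
        self.timestamps = []              # sorted ingest times
        self.columns = defaultdict(list)  # tag name -> value column

    def append(self, ts, row):
        """Write path: a cheap append per sample, matching how
        historians ingest data."""
        self.timestamps.append(ts)
        for tag, value in row.items():
            self.columns[tag].append(value)

    def range_query(self, tag, t_start, t_end):
        """Read path: binary-search the timestamp index, then slice
        only the requested tag's column (no full-table scan)."""
        lo = bisect.bisect_left(self.timestamps, t_start)
        hi = bisect.bisect_right(self.timestamps, t_end)
        return self.columns[tag][lo:hi]

store = ColumnStore()
for t in range(10):
    store.append(t, {"reactor_temp": 100 + t, "flow_rate": 5.0 + 0.1 * t})

print(store.range_query("reactor_temp", 3, 6))  # [103, 104, 105, 106]
```

Storing each tag in its own column is what makes analytics reads fast: a query touches only the tags it needs, instead of decoding whole rows the way a write-optimized archive would.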

The technology playing field for manufacturers and other industrial organizations has changed. To remain competitive, companies must use analytics tools to uncover areas for efficiency improvements.

“There is an immediate need to search time-series data and analyze these data in context with the annotations made by both engineers and operators to be able to make faster, higher quality process decisions. If users want to predict process degradation or an asset or equipment failure, they need to look beyond time series and historian data tools and be able to search, learn by experimentation and detect patterns in the vast pool of data that already exists in their plant,” added Reynolds.

Fortunately, this new process analytics model can support the necessary “re-tooling” of traditional process historian visualization tools at a low investment of both time and money.

About the Author

Bert Baeck | CEO of TrendMiner