From run-to-failure maintenance to predictive maintenance: A case study

Oct. 17, 2019
How one company used IIoT to go from a run-to-failure mentality to proactively preventing breakdowns

Much has been written about the Industrial Internet of Things (IIoT) and its potential to change manufacturing. The pharmaceutical industry is starting to recognize the benefits of digital technologies, in particular the value of the real-time asset and process intelligence obtained from these solutions. A single critical asset failure in a pharmaceutical manufacturing plant can lead to scrapped batches costing hundreds of thousands of dollars. However, technology alone is rarely successful, and companies that are looking to implement IIoT technology need to have a reliability mindset in order to get value from their program.

C&W Services, a provider of integrated facility services, has been working with pharmaceutical manufacturers to improve asset reliability within their facilities. One site in particular is a North Carolina facility that runs 24/7, producing oral solid dose and injectable drugs. The site’s facility management team is responsible for critical environments and office space across 1.5 million square feet. The team consists of maintenance technicians, a large custodial staff, a reliability engineer, an operations manager, and predictive maintenance (PdM) technicians. Over the past four years, this team has introduced new processes, technology and culture changes. The results have been dramatic, including:

  • Improved equipment reliability
  • Improved employee engagement
  • Reduced unplanned downtime
  • Improved processes
  • Increased preventive maintenance (PM) intervals
  • Energy waste reduction

Understanding needs before making changes

Before C&W Services was brought in, the site was in run-to-failure mode. Run-to-failure maintenance means assets are allowed to operate until they break down, with no preventive maintenance (e.g., re-lubricating bearings on a set schedule) or predictive maintenance (e.g., trending vibration to understand whether a failure is likely to happen).

The main challenges at the site were ineffective preventive maintenance, a computerized maintenance management system (CMMS) that wasn’t used properly, a lack of predictive maintenance and ineffective cost controls. Preventive maintenance tasks were written, but they were not specific, didn’t match the equipment and weren’t based on criticality. In some cases, tasks were duplicated, or assets with no production impact (e.g., a bathroom exhaust fan) were given the same importance as production-critical equipment. The true impact of unreliability and actual downtime was not clear because the maintenance team didn’t capture the necessary information within the CMMS.

The CMMS was also missing key information like failure codes and equipment criticality rankings. Even though equipment reliability had a direct impact on operations, the teams lacked the tools and training to predict downtime and were entirely reactive. There were no approvals for spares or contractor work — if a work order was entered, parts were automatically ordered, regardless of cost, leading to higher maintenance costs and unmanaged parts inventories.

Recognizing the areas that needed improvement, the site operations manager put several changes in place to address them:

  1. Scheduling of tasks around critical equipment
  2. Lean maintenance processes and operator-driven reliability to increase employee ownership
  3. Improved preventive maintenance tasks
  4. Digital record keeping and mobile work order management for non-validated equipment (in process)
  5. Wireless predictive maintenance that provides real-time notification of problems
  6. Root cause analysis of all failures

By first understanding the site, the team was able to identify quick wins as well as longer term practices that would lead to a culture change where IIoT technology could contribute.
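
To make the idea of criticality-based prioritization concrete, here is a minimal sketch of how assets might be ranked so that PM effort goes to production-critical equipment first. The 1-to-5 scales, the consequence-times-likelihood score, and the example assets and values are all assumptions for illustration; they are not the ranking method or data used at the site.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    consequence: int  # hypothetical scale: 1 (negligible) to 5 (stops production)
    likelihood: int   # hypothetical scale: 1 (rare) to 5 (frequent)

    @property
    def criticality(self) -> int:
        # Simple risk-style score: consequence of failure times likelihood.
        return self.consequence * self.likelihood


def rank_by_criticality(assets):
    """Return assets sorted from most to least critical."""
    return sorted(assets, key=lambda a: a.criticality, reverse=True)


# Illustrative assets and scores only; not data from the site.
assets = [
    Asset("Bathroom exhaust fan", consequence=1, likelihood=2),
    Asset("Cleanroom air handling unit", consequence=5, likelihood=3),
    Asset("Chilled water pump", consequence=4, likelihood=2),
]

for asset in rank_by_criticality(assets):
    print(f"{asset.name}: criticality {asset.criticality}")
```

With a ranking like this stored in the CMMS, a bathroom exhaust fan would no longer compete for technician time with an air handling unit that serves a critical environment.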

Don’t delay predictive maintenance

Recognizing the value that PdM brings in terms of labor efficiency and reduction in downtime is essential. Many of the preventive maintenance tasks that were being completed on a calendar-based schedule were found to be unnecessary (e.g., filters were clean and didn’t need replacing, but they were replaced anyway). Shifting from reactive to predictive maintenance required a system that was easy to use and that could automatically collect and analyze asset condition data. The maintenance team didn’t have the resources to analyze all the data coming in from multiple sources and at different times of day; they needed a system that could flag only the important issues.

The team chose an asset reliability and optimization system that could monitor a wide range of rotating and non-rotating equipment. The system, from Petasense, uses a machine learning-based Asset Health Score to provide real-time notification of developing problems, which reduces the time a reliability engineer has to spend on analysis. The system also lets the facility monitor almost any asset through multiple sensor input options, and detailed analysis tools enable investigation of problems once they are detected.
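
The idea of flagging only the important issues can be illustrated with a very simple rule: learn a baseline from an asset's early readings and raise a flag only when a later reading rises well above it. The sketch below is not the Petasense Asset Health Score, which is a proprietary machine learning model; the function name, the baseline size, the three-standard-deviation threshold, and the sample vibration values are all assumptions for illustration.

```python
from statistics import mean, stdev


def flag_readings(readings, baseline_size=10, threshold=3.0):
    """Return indices of readings that exceed the baseline mean by more
    than `threshold` standard deviations of the baseline."""
    if baseline_size < 2 or len(readings) < baseline_size:
        raise ValueError("need at least two baseline readings")
    baseline = readings[:baseline_size]
    limit = mean(baseline) + threshold * stdev(baseline)
    return [
        i for i in range(baseline_size, len(readings))
        if readings[i] > limit
    ]


# Hypothetical vibration velocity trend in mm/s; not data from the site.
vibration = [2.0, 2.1, 1.9, 2.0, 2.2, 2.1, 2.0, 1.9, 2.1, 2.0,  # baseline
             2.1, 2.0, 3.5, 4.2]                                # later readings

print(flag_readings(vibration))
```

Only the last two readings stand out from the baseline, so only those would reach an analyst. A production system would combine many signals and account for operating conditions, but the principle of filtering routine data from actionable data is the same.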

Recognizing the need to quickly demonstrate the benefits of continuous vibration monitoring, the operations manager commissioned the PdM program by setting up sensors on some of the critical air handling units (AHUs). Despite already improved PM practices, he was able to find additional improvement opportunities. For example, immediately after lubrication, the sensors showed over-greasing and soft foot from legs that weren’t tightened. It was discovered that lubrication technicians weren’t using grease guns properly, which provided an opportunity to improve training. Because the pilot demonstrated that PdM could pick up problems that were not solved by improved PM alone, money was allocated to expand the program.

Moving beyond rotating equipment

Most people think of predictive maintenance as the application of vibration, ultrasound, thermography, oil analysis, or motor current analysis to rotating machines. The site team looked at PdM as a way to get asset information on rotating and non-rotating equipment. They installed vibration sensors on pumps, motors, and fans, and a range of sensors on non-rotating assets such as steam traps, filters, and HVAC ducts. Some of the less common applications that are being monitored include:

  • Measuring filter differential pressure using Magnehelic gauges: This measures the pressure difference between the air returning to the unit and the air after the filter, which indicates how loaded the filter is. Sensors are also monitoring the supply air temperature, which gives an indication of both the asset condition and the ability to meet the process demands.
  • Monitoring steam traps using ultrasound and temperature monitoring: Steam traps remove condensate from steam. When a steam trap fails, it causes energy and monetary losses. Failed steam traps can lead to production stops, which was happening in some units.
  • Monitoring current (amp draw) on each leg of the motors: This allows the team to extend the intervals between infrared inspections. This is especially effective for variable frequency drives and motors that are not on active control systems.
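
As an illustration of what per-leg current monitoring makes possible, the sketch below computes phase current unbalance using a commonly used definition: the largest deviation of any leg from the three-leg average, divided by that average. The function name and the sample currents are assumptions for illustration, and the article does not state which calculation or alarm limits the site uses.

```python
def current_unbalance_percent(phase_a, phase_b, phase_c):
    """Percent unbalance: max deviation from the average leg current,
    divided by the average, times 100."""
    currents = (phase_a, phase_b, phase_c)
    average = sum(currents) / 3
    if average <= 0:
        raise ValueError("average current must be positive")
    max_deviation = max(abs(c - average) for c in currents)
    return 100 * max_deviation / average


# Hypothetical amp readings on the three legs of one motor.
unbalance = current_unbalance_percent(10.0, 10.0, 11.5)
print(f"Current unbalance: {unbalance:.1f}%")
```

Trending a value like this over time, rather than checking it once during an infrared inspection, is what allows a developing electrical problem to be caught between inspections.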

Get your wireless network right

One of the biggest challenges was getting reliable WiFi in the places where it was needed. There was WiFi throughout the facility, but some gateways were added in order to support the predictive maintenance system. C&W Services was able to use the guest network for some areas and set up a separate Petasense network system for others.

The assets being monitored were in separate mechanical rooms and on the roofs. The site was monitoring air handling units, and it is difficult to get reliable WiFi into the metal enclosures. Power was not available throughout the facilities, which limited where gateways could be installed. Cooling towers and other structures were a distance from the main building with limited power and network options.

Setting up a wireless network, or making use of an existing one, is often overlooked and can be a challenging part of getting the program going. Without reliable infrastructure, the sensors can’t collect and transmit data, and analysts can’t see what is going on. Connectivity is particularly difficult in areas with lots of concrete, metal enclosures (e.g., AHUs) and water. A site survey can help to determine where there are connectivity issues and which assets can be monitored with the available wireless network.

Monitor and adjust goals as needed

In order to show the effectiveness of the program, the site operations team set clear goals for maintenance and operators. The goals are visible and tracked daily, and results are presented regularly to the pharmaceutical company. Failure avoidance, energy and cost savings are captured and documented. The team now showcases a “Reliability Board” in a main hallway to drive awareness.

The change in the maintenance culture has led to a more engaged and proactive workforce. By encouraging ownership from the operators, understanding asset criticality, improving PM tasks, and setting up wireless PdM systems to predict downtime, the site has moved from reactive to predictive maintenance. Benefits have been:

  • Improved equipment reliability: 31 problems were found within the first two months of PdM
  • Reduced downtime: The share of work done in response to failures dropped from 29 percent to 9 percent
  • Improved training: Learning the sensor operations also reinforces PdM strategy
  • Increased PM intervals: 2X extension of PM intervals on monitored systems
  • Energy reduction: Less steam and condensate loss and more efficient equipment operation

Before the maintenance transformation, the facility experienced a bearing or belt failure on average every 48 days. This has now been extended to an average of once every six months. With the system, the team can now detect loose belts, soft foot on motors, bearings and motors needing grease, and failed steam traps. The program has had a huge impact in terms of reliability.
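
A quick calculation puts the change in failure interval in perspective. The sketch below assumes a 365-day year and treats six months as half of that; the variable names are illustrative.

```python
DAYS_PER_YEAR = 365

before_interval_days = 48                # average days between failures, before
after_interval_days = DAYS_PER_YEAR / 2  # assumption: six months = 182.5 days

# How many times longer the average interval between failures has become.
improvement = after_interval_days / before_interval_days

# Expected bearing or belt failures per year at each interval.
failures_per_year_before = DAYS_PER_YEAR / before_interval_days
failures_per_year_after = DAYS_PER_YEAR / after_interval_days

print(f"Interval between failures is {improvement:.1f}x longer")
print(f"Failures per year: {failures_per_year_before:.1f} before, "
      f"{failures_per_year_after:.1f} after")
```

Under these assumptions, the interval between failures is roughly 3.8 times longer, which corresponds to going from about 7.6 expected failures per year to about 2.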

This major pharmaceutical site has achieved zero downtime in the facilities for the first half of 2019. In the previous year, there were four downtime events, one of which was in the facilities and caused significant production downtime.

What’s possible

Companies in industries such as oil and gas or power generation have used PdM for 30 years, but these technologies were expensive and reserved for critical assets. With the advent of low-cost wireless sensors and increased computing power, companies across industries can monitor more assets, provide greater insights and generate more value from their program.

IIoT technology can be an enabler of improved maintenance practices, but it cannot be expected to solve all maintenance challenges. Companies looking to implement IIoT should consider the dimensions of culture, process and technology in order to adopt a true reliability mindset.

Lessons learned

Companies considering implementing new technologies should consider the following five lessons learned:

  • Understand the site needs before making changes
  • Don’t wait too long to put predictive maintenance in place
  • Think beyond rotating machinery
  • Think through the WiFi and connectivity
  • Set your goals, monitor, and adjust as needed

About the Author

Gary Stevens | CRL Operations Manager of Life Sciences Group