Is your ‘failure data’ failing you?
A Computerised Maintenance Management System (CMMS) is a tool that not only helps an organisation meet its maintenance work management requirements effectively and efficiently, but is also a vital source of equipment Reliability and Maintenance (RM) data.
ISO 14224 provides a comprehensive basis for the collection of RM data in a standard format for equipment in all facilities and operations within the petroleum, natural gas and petrochemical industries during the operational lifecycle of equipment. It describes data-collection principles and associated terms and definitions that constitute a “reliability language” that can be useful for communicating operational experience.
The following three broad categories of data are identified in ISO 14224:
Equipment Data:
Asset structure hierarchy, Equipment Identification (e.g. Asset Taxonomy, Equipment Attributes/Specification)
Failure Data:
e.g. Failure cause, Failure consequence, Failure detection method
Maintenance Data:
Identification of Maintenance Transactions, Maintenance Activities, Resources (e.g. Maintenance Record No, Tasks, Procedures, Man-power, tools)
While the Equipment Data & Maintenance Data form a significant part of the overall RM database, they are not the subject of this article.
It is a well-known fact that improving and maintaining equipment reliability requires a deep understanding of not only how the equipment operates but most importantly how it can fail to operate. By definition, reliability is the “ability of an item to perform a required function under given conditions for a given time interval” (Ref: ISO 14224).
One of the pillars of any reliability program is continuous improvement, which is not possible without relevant and up-to-date data that can be analysed from a reliability improvement perspective. This is where Failure Data comes into play.
Failure Data should at least have the following information in order to make it effective:
Failure Data Information:
Failure Date / Time:
Date & time of failure detection (year/month/day)
Failure Mode:
Effect by which a failure is observed on the failed item (usually considered at equipment-unit level)
Failure Impact on Plant Safety:
Impact on personnel, environment & assets (usually scaled as per the organisation's risk philosophy).
Failure Impact on Plant Operations:
Production loss, downtime (usually scaled as per the organisation's risk philosophy).
Failure Impact on Function:
Consequence of failure on equipment functionality, e.g. critical, quality, production quantity (usually scaled as per the organisation's risk philosophy).
Failure Mechanism:
Physical, chemical or other processes which led to the failure (e.g. mechanical failure, material failure, electrical / instrument failure, external influences)
Immediate Failure Cause:
Circumstances during design, manufacture or operation which led to the failure (should be supplemented by detailed Root Cause Analysis).
Sub-unit / Component / Maintainable item(s) failed:
Identification of Sub-unit / Component / Maintainable item(s) failed
Operating Condition at Failure Time:
Running, start-up, maintenance/testing, idle, standby.
Failure Detection Method:
Scheduled activity (e.g. inspection, testing) / Continuous Monitoring / Casual Occurrence (e.g. sensory observation, corrective maintenance)
Remedial Action:
Immediate remedial action taken (can also include recommendations to avoid reoccurrence, but this usually comes after detailed Root Cause Analysis)
Annex B (Interpretation & notation of failure & maintenance parameters) of ISO 14224 provides a detailed description of Failure Data and its structuring.
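As a rough illustration of how the information listed above might be held as one structured record, here is a minimal Python sketch. The class name, field names, integer impact scales and the example tag are assumptions made for this illustration; they are not taken from ISO 14224 or from any particular CMMS product.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import List


class OperatingCondition(Enum):
    """Operating condition at failure time, as listed in the article."""
    RUNNING = "running"
    START_UP = "start-up"
    MAINTENANCE_TESTING = "maintenance/testing"
    IDLE = "idle"
    STANDBY = "standby"


@dataclass
class FailureRecord:
    """One failure event (hypothetical layout, for illustration only)."""
    equipment_tag: str            # assumed identifier of the failed equipment
    detected_at: datetime         # failure date / time
    failure_mode: str
    safety_impact: int            # assumed scale from the organisation's risk matrix
    operations_impact: int        # assumed scale from the organisation's risk matrix
    function_impact: str          # e.g. "critical"
    failure_mechanism: str
    failure_cause: str
    failed_item: str              # sub-unit / component / maintainable item
    operating_condition: OperatingCondition
    detection_method: str
    remedial_action: str

    def missing_fields(self) -> List[str]:
        """Return the names of fields left empty, in declaration order."""
        return [name for name, value in vars(self).items()
                if value is None or value == ""]


record = FailureRecord(
    equipment_tag="P-101A",
    detected_at=datetime(2020, 1, 15, 8, 30),
    failure_mode="fail to start",
    safety_impact=1,
    operations_impact=2,
    function_impact="critical",
    failure_mechanism="electrical failure",
    failure_cause="",
    failed_item="motor",
    operating_condition=OperatingCondition.START_UP,
    detection_method="casual occurrence",
    remedial_action="",
)
print(record.missing_fields())
```

A completeness check of this kind is one simple way to see, record by record, which parts of the failure information were never filled in.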
CMMS provides the platform for collection, presentation and analysis of the Failure Data. However, it is ironic that many organisations miss out on this important functionality of CMMS.
‘Failure Codes’ in a CMMS represent what has been described above as Failure Data. A Failure Code is a combination of equipment type/class identification, Failure Mode, Failure Cause & remedial action taken.
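The four-part combination described above can be sketched as a small Python type. The hyphen separator and the abbreviations used in the example are placeholders chosen for this illustration; a real CMMS will have its own coding scheme.

```python
from typing import NamedTuple


class FailureCode(NamedTuple):
    """Four-part failure code (hypothetical layout, for illustration only)."""
    equipment_class: str
    failure_mode: str
    failure_cause: str
    remedial_action: str

    def as_string(self) -> str:
        """Join the four parts with hyphens."""
        return "-".join(self)

    @classmethod
    def parse(cls, text: str) -> "FailureCode":
        """Split a hyphen-separated code back into its four parts."""
        parts = text.split("-")
        if len(parts) != 4 or not all(parts):
            raise ValueError(f"expected four non-empty parts, got {text!r}")
        return cls(*parts)


# Placeholder abbreviations, not taken from any standard code list.
code = FailureCode("PU", "FTS", "ELEC", "REPL")
print(code.as_string())
```

Keeping the parts as separate fields, rather than storing only the joined string, makes it easier to filter later by failure mode or cause alone. Note that this sketch assumes no individual part contains a hyphen.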
The best way to develop Failure Codes is to carry out an in-depth Failure Modes & Effects Analysis (FMEA) through a cross-functional team of experts & practitioners. Through this exercise relevant Failure Modes can be identified and mapped for each Asset/Equipment Class & Sub-class. The quality of this analysis will largely impact the outcome of CMMS Failure Data during operations. Most organisations do not carry out proper FMEA during the initial stages and therefore fail in capturing asset history in a structured manner.
Proper design of Failure Codes also ensures that similar types of assets are treated in similar manner, thus providing for a uniform approach in handling Failure Modes for these assets.
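One possible way to hold the result of such a mapping is a lookup from equipment class to its permitted Failure Modes, so that similar assets always draw from the same list. The classes and modes below are invented examples for this sketch, not the output of an actual FMEA.

```python
# Hypothetical mapping of equipment class to permitted failure modes.
ALLOWED_FAILURE_MODES = {
    "pump": {"fail to start", "external leakage", "vibration"},
    "valve": {"fail to open", "fail to close", "internal leakage"},
}


def is_valid_failure_mode(equipment_class: str, failure_mode: str) -> bool:
    """True if the failure mode has been mapped to the equipment class."""
    return failure_mode in ALLOWED_FAILURE_MODES.get(equipment_class, set())


print(is_valid_failure_mode("pump", "vibration"))
print(is_valid_failure_mode("pump", "fail to open"))
```

An unknown equipment class simply has no permitted modes in this sketch, which would surface unmapped asset classes early.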
Effort should be made to keep the number of Failure Codes in the CMMS optimised. Too many Failure Codes make it confusing for the maintenance craftsmen to identify the real problems & causes, while too few may not be sufficient to capture the true history of the equipment.
The significance of Failure Data should be communicated well to the maintenance craftsmen, and they should be given adequate training in the proper use of this feature in the CMMS. Unless they realise its significance, it is not possible to develop reliable RM data.
CMMS Work Order closure should be subject to careful review by relevant organisational levels in order to ensure that proper Failure Data is being entered in the system. It is often a good practice for any maintenance manager to pay special attention to the ‘Corrective Work Orders’ in the system which are generated against any equipment failure or malfunction. This ensures that failure data is given its due importance by the maintenance craftsmen.
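Part of this review could be supported by a simple automated check before a Work Order is closed. The sketch below assumes a Work Order is available as a plain dictionary with a "type" key and the three failure fields shown; these names are hypothetical and do not correspond to any specific CMMS.

```python
from typing import List

# Hypothetical field names; a real CMMS will use its own schema.
REQUIRED_FAILURE_FIELDS = ("failure_mode", "failure_cause", "remedial_action")


def closure_issues(work_order: dict) -> List[str]:
    """List the failure fields still empty on a corrective work order.

    Non-corrective work orders are not checked in this sketch.
    """
    if work_order.get("type") != "corrective":
        return []
    return [field for field in REQUIRED_FAILURE_FIELDS
            if not str(work_order.get(field) or "").strip()]


order = {
    "type": "corrective",
    "failure_mode": "external leakage",
    "failure_cause": "  ",
}
print(closure_issues(order))
```

A check like this only confirms that the fields were filled in; whether the entries are correct still needs the review described above.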
Use of free text fields should be discouraged as much as possible, since free text is difficult to analyse for failure reporting and results in wasted effort. Instead, standardised text should be used to ensure uniformity and clarity in failure reporting.
Periodic review of Failure Data not only identifies any gaps in existing maintenance programs but also helps in eliminating unnecessary tasks thus leading to optimised maintenance practices. It is also important to note that recurring or critical failures should always be subjected to proper Root Cause Analysis (RCA) & CMMS Failure Codes should never be treated as its substitute.
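A periodic review of this kind can start with something as simple as counting repeated failures. The following sketch assumes failure records are available as dictionaries with "equipment_tag" and "failure_mode" keys, and uses an arbitrary threshold of three occurrences; both are assumptions for illustration. The result is only a list of candidates for RCA, not a replacement for it.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def recurring_failures(records: Iterable[dict],
                       threshold: int = 3) -> Dict[Tuple[str, str], int]:
    """Count failures per (equipment tag, failure mode) pair and keep
    those that occurred at least `threshold` times."""
    counts = Counter((r["equipment_tag"], r["failure_mode"]) for r in records)
    return {key: n for key, n in counts.items() if n >= threshold}


# Invented example history.
history = [
    {"equipment_tag": "P-101A", "failure_mode": "external leakage"},
    {"equipment_tag": "P-101A", "failure_mode": "external leakage"},
    {"equipment_tag": "V-205", "failure_mode": "fail to close"},
    {"equipment_tag": "P-101A", "failure_mode": "external leakage"},
]
print(recurring_failures(history))
```

This kind of grouping only works when Failure Modes are entered from a standard list, which is one practical reason to avoid free text.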
CMMS Failure Reporting provides leverage for any Maintenance and Reliability function to improve its analytics and contribute to the organisational bottom-line by implementing appropriate reliability techniques.
Usman Mustafa Syed is a Maintenance & Reliability Consultant with 12 years of experience within the oil and energy sector. His areas of expertise include CMMS, Reliability Centred Maintenance (RCM) and Safety Critical Elements (SCE). He is currently based in Kuala Lumpur, Malaysia & can be reached at firstname.lastname@example.org