Chapter 4. Focused Improvement. Part 2

6. Why-Why Analysis

Why-Why Analysis is a technique for tracking down the root causes of undesirable phenomena logically and methodically, based on the facts, instead of merely brainstorming. It is conducted using one of the following two basic approaches:

  • Approach starting from the viewpoint of ‘what should have happened (but didn’t)’
    Think what the necessary conditions must be (the conditions that must be satisfied in order for everything to work as it ought to) and list them as items to investigate. Then investigate each condition on the spot to see whether it is actually being satisfied.
  • Approach starting from first principles
    Focusing on the point at which the problem occurred, go back to first principles, and do the first step of the analysis by asking why these principles failed to operate.

8 important considerations when implementing Why-Why Analysis

(1) Clearly identify the phenomenon by closely observing the physical objects and materials involved in situ.
(2) Use simple phrases like ‘A did/became B’ when defining the phenomenon and answering the Why?s.
(3) Check the logical structure of the analysis by reading back from the last Why? to the phenomenon.
(4) Reading the analysis backward, check that all the possible causes have been identified for each event.
(5) Continue asking Why? until you identify the actions that will prevent the problem from recurring.
(6) Only note items that deviate from normal.
(7) Avoid trying to identify causes lying in human psychology.
(8) Do not use negative words such as ‘bad’.
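The read-back check in consideration (3) can be illustrated with a short sketch. This is a hypothetical illustration only: the function name `read_back` and the seized-bearing example are invented for demonstration and are not taken from the analysis sheet in the source.

```python
def read_back(phenomenon, whys):
    """Return the Why-Why chain read backward, from the last Why? to the phenomenon.

    Each statement should follow the simple 'A did/became B' pattern so that
    every 'therefore' link can be checked for logical soundness.
    """
    statements = [phenomenon] + list(whys)
    lines = []
    # Pair each cause with the statement it is supposed to explain,
    # starting from the deepest cause and working back up.
    for cause, effect in zip(reversed(statements), reversed(statements[:-1])):
        lines.append(f"{cause}, therefore {effect}")
    return lines


# Hypothetical example chain (invented for illustration).
example = read_back(
    "the bearing seized",
    [
        "the bearing ran without grease",
        "the grease nipple was blocked",
        "the nipple was not cleaned during lubrication",
    ],
)
for line in example:
    print(line)
```

If any printed line does not read as a sound ‘therefore’ statement, that link in the analysis probably skips a step or names the wrong cause.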

7. Industrial Engineering (IE) Techniques

Products are manufactured on the shop floor through a combination of people, equipment and raw materials. This means that the efficiency of what people do must be raised at the same time as the efficiency of the equipment they operate, and IE techniques are an important weapon for doing this. Figure 4.22 shows the procedure and key points for effecting improvements by using IE techniques to highlight current issues.

IE techniques include measurement techniques and analytical methods (see Figure 4.23). It is important to use these effectively to come up with improvement ideas and examine improvement proposals thoroughly.

Figure 4.22 Improvement Procedure and Key Points

Figure 4.23 IE Measurement and Analysis Techniques

Human behaviour and motion take place for a purpose, and the key issue is their efficiency relative to their purpose. It is important to be ‘motion-minded’, that is, to adopt an analytical, businesslike approach. Being ‘motion-minded’ means:

• Always noticing and being concerned about wasteful working methods
• Always being aware of the principles for improving such methods, as a matter of common sense
• Mastering the procedure for devising more efficient working methods, and practising it habitually

8. Eliminating Breakdowns
8.1 Some Common Issues Associated with Breakdowns

This section describes the typical problems that many companies face in attempting to reduce and eventually eliminate breakdowns, and what they need to do about solving them.

(1) Production department apathy

Production departments often pay little attention to breakdowns, regarding them as the responsibility of the maintenance department. They have an ‘I make, you fix’ mentality. While very concerned about producing the required volume of product, they show little interest in the equipment and its maintenance, even though these are critical to product quality.

Why are production personnel typically so unconcerned with maintenance when they are the ones who face the biggest problem when the equipment fails? There are several possible reasons. They may only have enough labour resource to cope with production, and no time to do anything else. They may be prohibited from touching the equipment for safety reasons. Perhaps their company has taken the principle of division of labour too far. Or there may be no system in place that would allow them to maintain their own equipment. All of these are mistaken policies on the part of senior management. They are mistaken because, no matter how hard the maintenance department tries, there is no way it can eliminate failures or even reduce them significantly on its own. It can only do so with the help of the operating department. The two departments must work closely together. Operators must perform routine maintenance tasks such as cleaning, checking, lubricating, retightening, and simple parts replacements (looking out for problems as they do so), while maintenance personnel must perform all the other, more specialised, maintenance work required. To achieve this, it is essential to involve the operators in the maintenance effort and ensure that they play a full part.

Breakdowns often show warning signs before they actually happen. Operators must be taught to recognise these signs so that failures can be discovered at an early stage and nipped in the bud. They must also learn how to keep their equipment clean, tight and properly lubricated to ensure that problems occur as infrequently as possible. The only reason why operators do not do this anyway is that they have not been taught how to do it or why it is necessary. Given the necessary instruction, they will happily perform the task. Training operators to recognise when something is wrong and report to the maintenance department, provided that the maintenance department then takes prompt corrective action, is an extremely effective way of reducing breakdowns. Making the best possible use of the operators’ abilities is the key to reducing failure rates.

(2) Weak approach to failure analysis

Most companies do not analyse failures sufficiently rigorously. Despite complaining about high failure rates, they neglect to make effective use of the information that failures can furnish. It is important to learn as much as possible from each individual breakdown as well as to collect and analyse failure data over time in order to identify where weaknesses lie. Failures are an extremely valuable source of information. Some breakdowns cannot be anticipated with existing levels of knowledge and experience. The purpose of analysing them once they happen is to find out what needs to be done in order to prevent them, or ones like them, from happening again. The learnings from the analysis should then be communicated to everyone in the relevant work area, as well as being rolled out to other areas where similar problems could arise.

Many companies attempt to analyse failures but are unsuccessful because they do not investigate the root causes thoroughly enough. Some common problems are listed below:

  • Not observing the phenomenon closely enough.
  • Not sketching broken parts, relevant aspects of the surroundings, etc. in sufficient detail.
  • Not collecting and dismantling broken parts.
  • Fixing the symptoms rather than the causes (e.g. getting a machine going again by replacing a part, without trying hard enough to find out why the part broke in the first place).
  • Not doing anything to prevent the same problem from recurring.
  • Not analysing failures on the spot, because no system exists for doing so.

Problems like this occur for the following sorts of reasons:

  • Inadequate understanding of equipment mechanisms, components, and their functions.
  • Ignorance of commonly-used techniques for identifying root causes (e.g. Why-Why Analysis).

It is impossible to analyse failures effectively without a thorough knowledge of the equipment’s individual mechanisms, the internal structure of its parts, how those parts are assembled, and the workings of the system as a whole. These must all be carefully studied and understood at the outset of any failure analysis, not just by maintenance staff but by line teams as well. Failures can be prevented by analysing them accurately, identifying their true causes, eliminating those causes, and making sure the causes stay eliminated by checking at appropriate intervals. Most companies need to work much harder in this area.

Each failure that has been analysed and dealt with should be monitored to find out how successful the solution has been. If the failure recurs, it should be analysed again in order to determine the true cause, and further corrective action taken. In short, all failures should be rigorously analysed, their true causes identified, and appropriate action taken to prevent them from recurring. Meanwhile, failure data should be collated to identify plant weaknesses and prioritise the maintenance effort.

(3) Unsatisfactory maintenance system

Maintenance systems can be categorised as time-based (TBM) or condition-based (CBM). Time-based maintenance is the foundation of condition-based maintenance, which can become effective only when the former is properly implemented.

The following systems, which serve as the basis for TBM, are generally not well enough established:

  • Inspection standards (specifying intervals, locations, methods and criteria).
  • A maintenance calendar clearly showing parts replacements and overhaul intervals, and lubrication top-up and replacement timings, together with a system for implementing it.
  • A system for keeping historical records of failures.

Some factories possess documented inspection standards that include details of inspection points, intervals, methods and criteria but have not been revised for many years. Others have not clearly defined the mutual responsibilities of the operating and maintenance departments for checking and lubricating. This indicates a low level of plant management expertise. Written standards should be continually updated in line with technological progress.

In some cases, up-to-date written standards are provided, but either the inspections are not done properly or only some of them are done. This indicates a problem with the implementation of the standards. There may also be problems with the use of computers or the operation of systems for recording inspection and failure data.

When production volumes increase, scheduled parts replacements and local overhauls are sometimes postponed or abandoned, leading to breakdowns. If parts replacement or local overhaul intervals have been appropriately selected, they should be implemented as scheduled; otherwise, they should be re-examined to see if they can be extended.

Whatever the cause, factories need to review their systems in order to identify weaknesses, and then eliminate those weaknesses to get the systems operating reliably. Generally speaking, the maintenance budget is the first in the firing line when costs have to be cut because of a worsening business environment or declining profits. This is probably because either the importance of maintenance is not understood or the existing maintenance system does not work well.

(4) Weak implementation of predictive maintenance (CBM)

Predictive maintenance is a method in which equipment or its components are monitored continuously or at regular intervals in order to track changes in certain parameters, and are judged to be in satisfactory condition or not on the basis of any tendency for those parameters to change. Predictive maintenance diagnostic techniques can be classified as ‘simple’ or ‘precise’. Simple techniques are used to decide whether or not something is wrong, while precise techniques indicate which component is going wrong and how long it is likely to last for (this is possible in many cases, although not in all).

Generally speaking, however, most companies’ approach to predictive maintenance is woefully inadequate. A wide range of inexpensive, easy-to-use diagnostic equipment is now available, with better software than ever before, but hardly any companies make good use of it. For example, the following problems with the use of simple vibrometers are often seen:

  • Measurements are unreliable because they vary so widely.
    This problem happens when measurements are taken at different points or under different loading conditions each time.
  • The diagnostic techniques are not trusted because the measurements remain unchanged for a long time (i.e. no results are seen).
    If the measurements are accurate and there is nothing wrong with the equipment, a long time may pass before any changes occur. In such cases, the usual method of verifying the diagnostic technique is to take measurements on a model in the normal condition (e.g. just after repair or parts replacement) and in an abnormal condition (e.g. with a faulty bearing installed) in order to see the difference.
  • Measurements are not taken regularly, and trends are not monitored.
    It is essential to take measurements at regular intervals and monitor the trends.

Simple condition monitoring is not difficult and yet is highly effective if the correct techniques are acquired. Team leaders and operators should learn to use it. Precision monitoring should only be introduced once simple monitoring has been mastered, which is why relatively few companies have adopted it so far. Implementing it requires a thorough understanding of vibration theory and vibration analysis techniques.

There is a general tendency to equate equipment diagnosis with vibration analysis, but monitoring such parameters as load currents, timings, run times, temperatures and sound levels can also be an effective way of assessing equipment condition. It is important to conduct thorough investigations to determine whether any parameters exist that will allow a degradation in equipment or component performance to be detected, find out how the change in those parameters relates to the deterioration, and understand their cause-and-effect relationships with the phenomena.
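Trend monitoring of readings taken at regular intervals, at the same point and under the same load, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the alarm ratio of 2.0 and the sample values are hypothetical and do not come from the source or from any vibration standard. The baseline is assumed to be a reading taken in the normal condition, for example just after repair.

```python
from statistics import mean


def trend_slope(readings):
    """Least-squares slope of readings taken at equal intervals (units per interval)."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to estimate a trend")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(readings)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, readings))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def assess(readings, baseline, alarm_ratio=2.0):
    """Classify a measuring point as 'abnormal', 'rising' or 'stable'.

    alarm_ratio is an assumed threshold chosen for illustration only.
    """
    if readings[-1] >= alarm_ratio * baseline:
        return "abnormal"
    if trend_slope(readings) > 0:
        return "rising"
    return "stable"


# Hypothetical readings for one bearing housing, one value per inspection round.
print(assess([1.0, 1.0, 1.0], baseline=1.0))
print(assess([1.0, 1.2, 1.5], baseline=1.0))
print(assess([1.0, 1.5, 2.4], baseline=1.0))
```

The same structure applies to other parameters mentioned above, such as load current or temperature, provided each is always measured under comparable conditions.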

8.2 Strategies for Reducing Failures

The approach to zero failures incorporates seven concepts, which are applied across the following four phases. Each is described below:

Phase 1 – Eliminate forced deterioration

Phase 2 – Extend lifetimes through corrective maintenance

Phase 3 – Monitor and control deterioration

Phase 4 – Carry out predictive maintenance

(1) Eliminate forced deterioration

1 Classify failures

When tackling failures, it is important to reduce the overall failure rate by eliminating the easy problems first. It is sometimes difficult to know where to start when faced with a mixture of different types of failure, such as simple ones (e.g. out-of-position sensors or broken wires), complex ones (e.g. broken gears or breakdown of control systems from unknown causes) and repetitive ones (occurring mainly in hydraulic systems, drive systems and other vital equipment systems).

According to a survey on the nature of breakdowns covering a large number of factories, however, about 70% of the total are simple failures while only 30% are complex ones, as Figure 4.24 illustrates. It is important to start by reducing the level of these simple failures in order to create enough time for maintenance personnel to reduce complex and recurring failures through corrective maintenance.

Figure 4.24 Causes and the Equipment Elements where they Arise

By simple failures, we mean the type of failure that can be prevented through good Autonomous Maintenance. Some examples are:

  • Bearings seizing up because of insufficient lubrication.
  • Broken wires due to contact with equipment or excessive bending.
  • Malfunctions due to misalignment resulting from incorrect fixing of sensors.
  • Malfunctions due to ingress of coolant or other liquid into limit switches.
  • Bearings seizing up in hydraulic pumps.
  • Damaged V-belts.

Failures like this can be prevented if operators are trained in how to lubricate their equipment and check it using their five senses, and are able to detect problems such as loose and broken V-belts, overheating and abnormal noise in bearings, etc. It is essential to make maximum use of operators’ abilities in order to reduce failures.

As mentioned earlier, it is difficult to identify failure trends on a single machine, because successive failures tend to occur randomly, but it is sometimes possible to identify trends if failures are classified by equipment group and location of occurrence. The aims of classifying failures are:

  • To identify equipment weaknesses.
  • To highlight deficiencies in equipment management.
  • To clarify priorities for implementing solutions.
  • To identify the support that operators require for carrying out Autonomous Maintenance (e.g. training in inspection and lubrication) and the kinds of things they should be asked to do (e.g. checking and lubricating their equipment, and detecting problems at an early stage).

Failures can be classified in various ways. As explained, doing so helps to identify weaknesses in groups of similar equipment, locations of related causes, and weaknesses in equipment management. Six useful bases for classifying failures are described below.

(a) By production line or equipment group

If the output and OEE of particular lines or equipment groups have declined as a result of frequent breakdowns, it is first necessary to identify which types of equipment are failing most often. The Autonomous Maintenance activities of the operators and the specialised maintenance activities of the maintenance department should then be concentrated on these.

(b) By location

Failures can also be classified according to which part of the equipment they occur in. The categories specified for this might include fasteners (nuts, bolts and other fixing devices), lubrication systems, drive systems, pneumatic systems, hydraulic systems, electrical systems, control systems, sensors, and jigs/tools. This system of classification will reveal weaknesses in particular equipment groups and lines and enable the maintenance activities to be prioritised.

(c) By mode of occurrence

Classifying failures according to their mode of occurrence (cracking, breaking, deformation, wear, corrosion, leaking, loosening, etc.) is a useful way of sorting out what type of corrective maintenance (extending service life, preventing recurrence, etc.) needs to be undertaken for each.

(d) By cause

Failures may also be classified according to their cause (inadequate basic conditions, improper use, failure to reverse deterioration, faulty design, lack of skill, etc.) in order to pinpoint weaknesses in equipment management and decide what needs to be done next.

(e) By history

Another method is to classify failures according to whether or not they have happened before. Classifying them in this way will shed light on the following points:

  • The MTBF of recurring failures.
  • The effectiveness of any steps taken to prevent failures recurring.
  • If there are no plans to extend the lifetimes of particular components at present, has the checking of these components been incorporated into the periodic maintenance calendar?
  • The way in which similar equipment should be inspected.
  • The most important issues that need to be addressed through Autonomous Maintenance.
  • The most important issues that need to be addressed through specialised maintenance.
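The first two points in this list, the MTBF of a recurring failure and the effectiveness of a countermeasure, can be estimated from the failure history. The sketch below is a simplification under stated assumptions: it uses calendar days between successive occurrences rather than operating hours, and the function name and dates are hypothetical.

```python
from datetime import date


def mean_days_between_failures(failure_dates):
    """Mean calendar days between successive occurrences of the same failure.

    Returns None when the failure has not yet recurred, since no interval exists.
    """
    ordered = sorted(failure_dates)
    if len(ordered) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical history of one recurring failure, split at the date of a countermeasure.
before_fix = [date(2024, 1, 1), date(2024, 1, 11), date(2024, 1, 31)]
after_fix = [date(2024, 3, 1), date(2024, 5, 30)]

print(mean_days_between_failures(before_fix))
print(mean_days_between_failures(after_fix))
```

A longer mean interval after the countermeasure suggests that it is working; an unchanged interval suggests that the true cause has not yet been found.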
(f) By relation to Autonomous Maintenance

Failures can also be classified according to whether they could be prevented through Autonomous Maintenance (‘simple failures’), or whether they could only be prevented by the maintenance department (‘complex failures’). The distinction between simple and complex failures is not clear-cut, but experience tells us that operators are capable of preventing the following types of failure through good Autonomous Maintenance:

(i) Failures whose warning signs can easily be detected through the five senses:

  • those associated with faults that can be noticed by looking at, touching or moving the equipment (e.g. limit switches, sensors, wires and bearings).
  • those associated with parts of the equipment that can be seen from the outside, without having to take it apart.

(ii) Failures whose warning signs can be detected through partial disassembly:

  • those associated with faults that can be noticed by looking at, touching or moving the equipment after partially disassembling it (faults such as deformation, wear, slackness, play, etc.), e.g. wear of speed reducer gears, backlash in key slots.

(iii) Failures whose warning signs can easily be detected through the use of simple measuring instruments:

  • those associated with faults that can be detected through the use of dial gauges, feeler gauges, spirit levels, etc. (e.g. misalignment, eccentricity, tilting, etc.).

The situation is bound to change as the standard of Autonomous Maintenance rises, but it is important to draw a line at a certain level and classify the failures in this way, because doing so will clarify the following:

  • The priorities in carrying the Autonomous Maintenance programme forward.
  • The order in which the Autonomous Maintenance tasks (inspection techniques, etc.) should be taught.

Table 4.11 indicates the aspects of equipment management that fall within the remit of Autonomous Maintenance.
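The classification bases described in this section lend themselves to a simple tally of failure records. The sketch below is illustrative only: the record fields (`group`, `location`, `simple`), the sample records and the function names are assumptions, and only the category names are borrowed from the text.

```python
from collections import Counter

# Hypothetical failure records; a real system would hold one record per failure report.
FAILURES = [
    {"group": "press line", "location": "sensors", "simple": True},
    {"group": "press line", "location": "lubrication systems", "simple": True},
    {"group": "conveyor", "location": "drive systems", "simple": False},
    {"group": "press line", "location": "sensors", "simple": True},
]


def tally(records, basis):
    """Count failures by one classification basis, most frequent first."""
    return Counter(record[basis] for record in records).most_common()


def simple_share(records):
    """Fraction of failures judged preventable through Autonomous Maintenance."""
    return sum(record["simple"] for record in records) / len(records)


print(tally(FAILURES, "group"))
print(tally(FAILURES, "location"))
print(simple_share(FAILURES))
```

Ranking the same records by several bases in turn shows where to concentrate effort: the top equipment group, the top location within it, and the share of failures that operators could prevent themselves.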

2 Analyse failures
(a) Why do failures happen?

All production equipment is subjected to stress of one type or another. It may be operational (mechanical or electrical) stress applied in order to make the equipment work, or environmental stress (temperature, humidity, vibration, dust, etc.) coming from its surroundings. When the stress applied exceeds the equipment’s strength, it breaks down (see Figure 4.25).

In other words, equipment breaks down when excessive stress combines with insufficient strength (the equipment may have been too weak to begin with, or it may have been allowed to weaken through deterioration). This situation can be caused by one or more of the following five factors:

  • Failure to maintain basic conditions
  • Failure to observe correct operating conditions
  • Failure to reverse deterioration
  • Failure to correct design weaknesses
  • Lack of skill

All of these are human responsibilities. Equipment does not want to break down; people make it break down. The corollary of this is that it is only people who can prevent it from breaking down, and they can only do so if they change their attitudes and behaviours. Figure 4.26 summarises the causes of different types of failure.
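The stress-versus-strength idea described above can be illustrated with a toy model. Everything in this sketch is an assumption made for illustration: strength is taken to fall by a fixed amount per period, the numbers are arbitrary, and the function name is hypothetical. Real deterioration is rarely this regular.

```python
def periods_until_failure(strength, stress, loss_per_period, max_periods=10_000):
    """Count whole periods until applied stress exceeds remaining strength.

    Returns None if no failure occurs within max_periods.
    """
    if loss_per_period < 0:
        raise ValueError("loss_per_period must not be negative")
    for period in range(max_periods + 1):
        if stress > strength:
            return period
        # Deterioration reduces the remaining strength each period.
        strength -= loss_per_period
    return None


# Natural deterioration (slow loss) versus forced deterioration (fast loss),
# with the same initial strength and the same applied stress.
print(periods_until_failure(strength=100, stress=60, loss_per_period=10))
print(periods_until_failure(strength=100, stress=60, loss_per_period=20))
# With deterioration fully prevented, stress never exceeds strength.
print(periods_until_failure(strength=100, stress=60, loss_per_period=0))
```

In this model, maintaining basic conditions corresponds to lowering the loss per period, observing correct operating conditions corresponds to keeping the stress down, and correcting design weaknesses corresponds to raising the initial strength.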

(b) The need for failure analysis

In most factories, it is the maintenance personnel assigned to a particular line or piece of equipment who repair it and try to find the cause of the problem when it breaks down. However, as explained earlier, about 70% of failures can be prevented through good Autonomous Maintenance, and, in any case, failure rates will never go down to zero if all the maintenance work is left up to the maintenance personnel. It is essential for every operator to take personal responsibility for failures of their equipment and learn as much as possible from them. Although they cannot undo a failure once it has happened, they should resolve never to allow the same failure to happen twice.

It is possible to prevent failures from recurring if each breakdown is treated as a real-life case study and a careful analysis is made of the causes of the problem, whether or not there were any warning signs, the quality of the inspections carried out, and what kind of remedial action was taken. Operators must conduct this kind of failure analysis in conjunction with their Autonomous Maintenance activities in order to identify the causes of the routine failures that they see almost daily; to highlight the areas where there are weaknesses or where something is lacking; to improve their knowledge and skills; and to develop themselves into experts on their own equipment.

(c) How to carry out a good failure analysis

(i) Pinpoint the phenomenon
Start by visiting the scene of the crime immediately after the breakdown has happened; examine the equipment and materials there and interview the operators in order to find out how the equipment stopped, which components are broken and in what way they are broken, when the last similar breakdown occurred, and whether or not there were any warning signs. All this information should be recorded on a form provided for the purpose.

(ii) Take interim action
If a part is broken, it should of course be replaced with a new one in order to restart production as soon as possible, but it hardly needs to be said that this in itself is not a true countermeasure. Many people, however, do mistakenly treat this kind of stopgap action as a complete solution.

(iii) Prepare to investigate the causes
Owing to lack of knowledge about how the equipment works, how it is constructed and how it should be correctly used, problems are often dealt with in a haphazard sort of way. The upshot is that the problem repeats itself or other problems occur close to the part that was repaired. This happens when the people involved replace parts or strengthen the equipment without really understanding the causes of the problem. In order to determine the causes of the problem correctly and take effective action, it is essential to understand the equipment’s functions, internal structure and correct method of operation using systems diagrams and sketches of the equipment and the failed parts.

The next step is to work out what condition the functional parts should ideally be in, based on engineering principles and parameters, and draw up a list of items that need to be checked to ensure that both the minimum necessary and the optimal conditions are satisfied. The equipment must then be checked thoroughly using this list in order to identify and correct all deficiencies.

(iv) Track down the causes
The reasons why the deficiencies identified in the previous step have occurred should then be tracked down using Why-Why Analysis (see Figure 4.27). The human aspect should be investigated particularly carefully because, as mentioned earlier, failures are due to inadequate human behaviours. When multiple independent causes or complex combinations of interacting causes are at work, P-M Analysis should be used.

Figure 4.27 Example of Why-Why Analysis Sheet

(v) Take corrective action
The causes identified in the previous step should be dealt with promptly through restoration or improvement. If left untreated, minor equipment defects will become worse and begin to affect other parts of the equipment, as well as being forgotten if too much time is allowed to pass before any action is taken. If the problem is too difficult to rectify right away because of technical, financial or time constraints, a plan should be drawn up to ensure that it is dealt with at an appropriate time. It is also important to roll out any action taken or checks implemented to similar equipment and to other equipment with similar mechanisms.

(vi) Ensure that the problem cannot recur
Find out why the minor equipment defects that eventually led to the failure were not spotted in time, and work out what needs to be done in order to improve everyone’s ability to spot them in the future. Check whether the necessary inspection schedules are in place and whether the standards are adequate. It is also important to find some suitable predictive maintenance techniques for detecting deterioration.

Any failure analysis should follow the steps described above, paying particular attention to the following points:

  • Be sure to report each failure on a separate sheet.
  • Give people repeated practice in analysing failures while teaching them about their equipment.
  • Work closely with the maintenance department, exchanging information and conducting analyses together.
  • Supervisors and managers must coach and advise operators painstakingly.
  • Promote understanding by using broken parts, analysis sheets and one-point lessons to explain equipment mechanisms, potential problems and the correct inspection methods.

(2) Extend lifetimes through corrective maintenance

1 Establish basic conditions

The three basic conditions for reliable equipment operation are cleanliness, tightness and correct lubrication. Establishing and maintaining basic conditions means preventing the equipment from deteriorating and is the most important way of avoiding the causes of breakdowns.

(a) Cleaning

As the word implies, cleaning means keeping the equipment free of dust, dirt, oil stains, spilt product, and other forms of contamination. Production equipment is extremely sensitive to contamination of this sort, and many sporadic failures or product defects are due to it getting into sliding parts, hydraulic systems, electrical control systems and so on, causing problems such as wear, clogging, leaks, defective operation, short-circuits, and inaccuracy. Thorough, regular cleaning is essential in order to eliminate this kind of forced deterioration.

Cleaning does not mean simply getting the equipment looking nice. When operators clean their machines, they naturally have to look at and touch every part of the equipment including all the little nooks and crannies that they never normally see. This makes it much more likely that they will spot potential problems in machinery, dies, jigs and tools; not only dust and dirt, but also wear, play, scratches, slackness, deformation, leaks, cracks, overheating, excessive vibration and noise. Cleaning should not be done for its own sake; it should be cleaning with meaning. It is usually possible to find from two to five hundred potential defects when cleaning a single machine that has been neglected for a long time. This is why the slogan ‘cleaning is inspection’ is so common in Autonomous Maintenance circles.

(b) Lubrication

It goes without saying that equipment cannot perform satisfactorily unless it is properly lubricated. Despite this, however, empty, dirty, blocked or leaking reservoirs, grease nipples, lubricators, oil tubes and other lubrication devices are a common sight in many production areas.

Neglecting to lubricate can lead directly to sporadic failures such as bearing seizure. It can also accelerate equipment deterioration by causing wear or overheating, and the effects can spread out to all of the equipment’s units, giving rise to a huge range of different types of failure. Inadequate lubrication can be cited as a typical example of what might be called a psychological latent defect, because it arises from insufficient attention and interest on the part of the people responsible for doing the job.

(c) Tightening

Many failures are due to nuts, bolts and other fasteners breaking, working loose or falling off. Even a single loose bolt can be a source of failure if it is used to attach an important part such as a bearing unit, die, jig, cutting tool, limit switch, coupling, or flange.

Fastener problems, however, do not usually lead directly to failure but start a chain reaction, which eventually results in a breakdown. When one bolt works loose, for example, the part it is supposed to hold may begin to vibrate, causing another bolt to work loose and create further vibration. Vibration breeds vibration, backlash breeds backlash, and the upshot is a serious breakdown. When one company investigated the causes of its breakdowns, it discovered that 60% of them were due to some form of nut or bolt problem. These kinds of problems account for a surprisingly high proportion of latent defects.

(3) Monitor and control deterioration

1 Observe correct operating conditions

If equipment is to perform its required functions, it must be operated under the correct conditions. In hydraulic systems, for example, the hydraulic fluid must be kept at the correct temperature, volume, pressure, acidity and level of cleanliness, while electrical control systems and measuring instruments must be operated under certain conditions of ambient temperature, humidity, dust level and vibration level. Switches and other devices must be fitted correctly in the right position and satisfy certain parameters (limit switches, for example, must have a dog of the correct shape, together with roller lever and dog contacts of the correct angle and strength). It is essential to set and observe the correct operating, handling and loading conditions for each piece of equipment in use.

Attempting improvements is pointless if the correct operating conditions are not being followed, because the equipment’s accuracy of movement and processing conditions will be unstable and any problems will simply repeat themselves. To eliminate these problems, it is essential to specify the correct operating conditions for each equipment unit and component, and ensure that they are followed.
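Specifying operating conditions per unit and checking that they are followed can be pictured as a simple comparison of readings against limits. The following is only a minimal sketch: the parameter names, the limit values and the `find_deviations` helper are all hypothetical and are not taken from any equipment standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    """Allowed range for one operating parameter (hypothetical values)."""
    low: float
    high: float
    unit: str


# Hypothetical limits for one hydraulic unit. Real values must come from
# the equipment's own operating standard, not from this sketch.
HYDRAULIC_LIMITS = {
    "fluid_temperature": Limit(30.0, 55.0, "degC"),
    "pressure": Limit(6.5, 7.5, "MPa"),
    "tank_level": Limit(70.0, 100.0, "%"),
}


def find_deviations(readings, limits):
    """Return (parameter, message) pairs for readings outside their limits.

    A parameter that has a limit but no reading is also reported, because
    an unmeasured condition cannot be confirmed as correct.
    """
    deviations = []
    for name, limit in limits.items():
        if name not in readings:
            deviations.append((name, "no reading recorded"))
            continue
        value = readings[name]
        if value < limit.low or value > limit.high:
            message = (
                f"{value} {limit.unit} outside "
                f"{limit.low}-{limit.high} {limit.unit}"
            )
            deviations.append((name, message))
    return deviations
```

Calling `find_deviations({"fluid_temperature": 61.0, "pressure": 7.0}, HYDRAULIC_LIMITS)` would flag the fluid temperature as out of range and the tank level as unrecorded. Treating a missing reading as a deviation is a design choice of this sketch: it keeps unchecked conditions visible instead of silently assuming they are correct.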

2 Reverse deterioration

When dealing with failures, attempts are often made to introduce improvements while neglecting to restore deteriorated machines, jigs and tools, or only partially restoring them simply by replacing the broken parts. This will not work. Machines, dies, jigs and tools can only function effectively when the strength and accuracy of their components are properly balanced. If it is clear that a machine’s strength and accuracy are unbalanced from the start because of poor design or fabrication, it may be necessary to remodel it. In other cases, however, if only the broken parts of the machine are remodelled or restored, while other relevant parts are ignored, the problems will merely repeat themselves. In fact, they will go on repeating themselves as long as the deteriorated parts that ultimately cause the failures remain undetected.

For example, if a drive shaft has broken off at a notched section, we should make sure that any defects such as play due to a worn or badly-fitting bearing, or backlash due to worn gears, are eliminated before replacing the shaft or remodelling it to increase the notched section’s radius of curvature.

Equipment deteriorates slowly over time and its parts eventually begin to fail, starting with the weakest. Simply restoring or remodelling a broken part will not be very successful, because the next weakest part will fail soon after. The quickest way to achieve zero breakdowns is to go back to the drawings, identify the deteriorated parts by checking and testing, and restore the overall balance of the equipment’s strength and precision before thinking about changing its design.

To correct deterioration properly in this way, methods of accurately discovering, predicting and correcting deterioration must be found. Deterioration is detected and predicted by periodic checking and inspecting and through the use of diagnostic techniques, and is corrected by overhauling based on standards. This of course requires a high level of skill on the part of those responsible for maintenance. It also requires the implementation of a preventive maintenance system.

3 Correct design weaknesses

To eliminate breakdowns, it is sometimes necessary to redesign the equipment, changing the materials, dimensions and shapes of its components. If a machine frequently breaks down despite being looked after carefully, and it is impossible to keep it going for long even with regular checks, inspections and overhauls, the maintenance costs become too great, and it may be necessary to eliminate the weaknesses by redesigning. However, it is better not to remodel equipment unless absolutely necessary. There are countless examples of serious mistakes being committed through making hasty decisions, inappropriately copying improvements done on other equipment, or being seduced by attractive new technologies presented in catalogues.

If a machine’s parts are not considered durable enough, the first thing to do is to decide whether it is a design fault. If it is, the weakness should be identified accurately and a plan for remodelling the equipment should be put in place. To do this, the following procedure should be adopted:

(a) Find out exactly what happened before and after the breakdown, and identify the phenomenon precisely.
(b) Check the equipment’s structure and functions.
(c) Check to see whether basic conditions are being maintained, correct operating conditions are being followed, and the equipment has been properly restored.
(d) Identify the mechanism by which the phenomenon occurs.
(e) Find the causes (design weakness, some other reason, or both).
(f) Plan an improvement.
(g) Implement the improvement.
(h) Follow up the improvement to see whether it worked or not.

(4) Carry out predictive maintenance
1 Improve operating and maintenance skills

When thinking about how to eliminate breakdowns, we often make the mistake of focusing our attention exclusively on the machines, jigs, tools, materials being processed and other hardware while forgetting about the operating and maintenance skills involved. If the cause of the problem is in fact lack of skill, looking for the causes in the hardware can lead us to repeatedly change the design of a machine or the specifications of the materials used while still failing to reduce the number of breakdowns. If a problem is known to be due to operating or maintenance error, at least we can do something about it; but people are often convinced that the methods they are using are correct when in fact they are not. In such cases, a solution is not easily found. Problems like this can only be solved by working out exactly what skills the operators and maintenance people need to look after their particular equipment, and ensuring that they acquire those skills through comprehensive education and training.

8.3 The 5 Main Factors that Cause Equipment to Fail, and the Priority Issues

The seven concepts behind achieving zero failures have been described. The fact that actions (3) through (7) are required indicates that neglecting to carry them out is a common cause of failure. These actions are so important that they are identified as the 5 Zero-Breakdown Countermeasures. Neglecting a single one of them can lead directly to a failure, but more often than not, failure is due to neglecting several in combination, as shown in Figure 4.28. This means that it is not always possible to eliminate a particular kind of failure by implementing only one or two of the countermeasures. Sometimes, even after numerous improvements have been made, failures continue to happen. The fastest way to achieve zero breakdowns is to implement all five countermeasures in order to identify and deal with every latent problem. Figure 4.29 shows the usual priorities within the countermeasures. This figure has been generalised to make it applicable to any kind of equipment and can be used for ensuring that all aspects have been covered.

Figure 4.28 Overlapping Causes of Breakdowns

Figure 4.29 The 5 Zero-Breakdown Countermeasures

8.4 The 4 Phases to Zero Breakdowns

The 5 Zero-Breakdown Countermeasures will not work well if they are introduced in a rush or if more than one is implemented at a time. The best approach is to introduce them progressively in four phases. The main thrusts of the four phases, outlined in Table 4.12, are explained below. If this approach is followed, breakdowns are certain to get closer and closer to zero.

Table 4.12 The 4 Phases to Zero Breakdowns

Phase 1: Eliminate forced deterioration

1. Correct neglected deterioration. The first thing that needs to be done is to reverse any obvious deterioration that has been left untreated.

2. Eliminate forced deterioration. Forced deterioration, which results from omitting to sustain basic equipment conditions and observe correct operating standards, is the biggest cause of variation in equipment failure intervals. Sustaining basic equipment conditions and adhering to operating standards will eliminate this variation.
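One way to see whether the variation in failure intervals is actually shrinking during Phase 1 is to compute the spread of the intervals from the failure log. The sketch below assumes failures are recorded as cumulative operating hours and uses the coefficient of variation as the measure of spread. Both are assumptions made for illustration, and `interval_stats` is a hypothetical helper.

```python
from statistics import mean, stdev


def interval_stats(failure_times):
    """Return (mean interval, coefficient of variation) for a failure log.

    failure_times: operating hours at which each failure occurred,
    in ascending order.
    """
    if len(failure_times) < 3:
        raise ValueError("need at least three failures to get two intervals")
    # Interval between each failure and the next one.
    intervals = [b - a for a, b in zip(failure_times, failure_times[1:])]
    average = mean(intervals)
    # Sample standard deviation relative to the mean: 0 means perfectly
    # regular intervals, larger values mean more scattered intervals.
    return average, stdev(intervals) / average
```

Comparing the second value before and after basic conditions have been restored gives a rough indication of whether failure intervals are becoming more regular.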

Phase 2: Extend lifetimes through corrective maintenance

1. Correct design weaknesses. Eliminating forced deterioration will bring the service life of equipment closer and closer to its natural lifespan (i.e. the length of time it would last for if subjected only to natural deterioration). If it still does not last long enough, this indicates a problem with the original design. In such cases, the lifetime of the equipment must be extended by correcting its inherent weaknesses. This approach to improving equipment reliability is called ‘corrective maintenance’.

2. Eliminate sporadic breakdowns. Although most sporadic breakdowns result from misoperation, many are also due to faulty repair work. They are the most difficult kind of failure to deal with, since they are unpredictable and therefore cannot be prevented by checking or inspecting; moreover, they are often caused by excessive stress acting on a part which, under normal conditions, would never be expected to be subjected to it. They must be eliminated by reducing human error, which can be achieved by raising operating and maintenance skills, installing error-proofing devices, improving maintenance tools and procedures, and introducing fail-safe designs.

3. Reverse visible deterioration. In Phase 2, all relatively obvious external deterioration must be reversed, and the equipment restored to its original, pristine condition. Generally, more than 50% of all breakdowns can be avoided by eliminating visible deterioration.

Phase 3: Monitor and control deterioration

1. Periodically reverse deterioration. Deterioration must be corrected regularly to maintain and further reduce the level of breakdowns attained in Phase 2. To accomplish this, parts life must be estimated, standards for periodically checking, inspecting and replacing them must be established, and equipment must be restored carefully on the basis of these standards. The most important thing when doing this is first to improve the maintainability of the equipment through corrective maintenance. If standards are set while the maintainability of the equipment is still poor, dismantling the equipment for checking or replacing parts will be excessively time-consuming and costly and in the end will not get done.

2. Learn how to detect signs of internal deterioration. It is impossible to prevent every type of breakdown just by periodically reversing external, visible deterioration; operators must learn to use their five senses to detect warning signs of internal deterioration as well. Not every type of internal deterioration gives out such signs, but many do; and trained operators are often able to detect it by noticing unusual temperature, vibration, noise, light, colours, smells and movements. They should do all they can to sharpen their ability to detect the warning signs of abnormality.
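The periodic standards described in item 1 of Phase 3 depend on an estimate of parts life. As a rough sketch only, a conservative replacement interval could be derived from observed lifetimes as the mean minus a multiple of the standard deviation. This rule, the default `safety_factor` and the `replacement_interval` helper are illustrative assumptions, not a method given in the text.

```python
from statistics import mean, stdev


def replacement_interval(lifetimes, safety_factor=2.0):
    """Estimate a conservative replacement interval from part lifetimes.

    Uses mean - safety_factor * sample standard deviation. Both the rule
    and the default factor are illustrative assumptions.
    Returns None when the spread is so wide that the rule gives no
    positive interval, which suggests the life is too variable for a
    purely time-based standard.
    """
    if len(lifetimes) < 2:
        raise ValueError("need at least two observed lifetimes")
    interval = mean(lifetimes) - safety_factor * stdev(lifetimes)
    return interval if interval > 0 else None
```

Returning `None` instead of a very short interval is deliberate: when lifetimes are widely scattered, a fixed replacement period would either waste good parts or miss early failures, so the part is flagged for a different approach.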

Phase 4: Carry out predictive maintenance

1. Predict equipment life using diagnostic techniques. When equipment life remains highly variable, or when the warning signs of breakdowns are not detectable by the five senses, cannot be detected reliably, or cannot be detected early enough, the only remaining approach is to use diagnostic techniques to predict equipment lifetimes – that is, to analyse deterioration parameters quantitatively using diagnostic equipment. Even with parts of the equipment that are on a time-based maintenance regime, it is often possible to reduce maintenance costs by the application of condition-based maintenance using diagnostic technology; in fact, it is essential to consider moving from time-based to condition-based maintenance whenever possible.

2. Analyse catastrophic failures. Catastrophic failures are totally unpredictable, sudden failures that result in a total loss of function. Once the level of failures has been driven down by the application of the measures described above, almost all of the remaining failures will be of this type, or close to it. To prevent such failures from recurring, the causes of each should be subjected to a technical analysis (investigating the physics of fracture surfaces, materials fatigue, gear-tooth faces, stress concentrations, etc.), steps should be taken to increase the lifetimes of the relevant parts, and deterioration should be periodically reversed based on an estimate of those lifetimes. Table 4.13 shows a programme for achieving zero failures that summarises the points described in this section.

Table 4.13 A Programme for Achieving Zero Breakdowns
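The quantitative analysis of deterioration parameters described under Phase 4 can be illustrated with a simple trend extrapolation. The sketch below fits a straight line to periodic measurements of one parameter and estimates the operating hours left before it reaches an alarm level. The linear model, the `limit` value and the `hours_until_limit` helper are assumptions for illustration; real diagnostic techniques are usually more elaborate.

```python
def hours_until_limit(hours, values, limit):
    """Extrapolate a rising deterioration parameter linearly to its limit.

    hours:  operating hours at which each measurement was taken
    values: measured parameter (for example vibration level)
    limit:  alarm level for the parameter (hypothetical, set per machine)
    Returns the estimated operating hours remaining after the last
    measurement, 0.0 if the limit is already reached, or None if the
    fitted trend is flat or falling.
    """
    n = len(hours)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired measurements")
    mean_h = sum(hours) / n
    mean_v = sum(values) / n
    sxx = sum((h - mean_h) ** 2 for h in hours)
    if sxx == 0:
        raise ValueError("measurements must be taken at different times")
    # Ordinary least-squares slope of value against operating hours.
    slope = sum(
        (h - mean_h) * (v - mean_v) for h, v in zip(hours, values)
    ) / sxx
    if values[-1] >= limit:
        return 0.0
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_h
    hours_at_limit = (limit - intercept) / slope
    return max(hours_at_limit - hours[-1], 0.0)
```

With condition-based maintenance of this kind, the part is scheduled for replacement when the estimate falls below a chosen lead time, instead of at a fixed period.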
