This section contains technical notes about the STEP-MES-based Investigation System. The notes are inspired by published comments about the system which, for various reasons, misstate the nature or use of the STEP-MES investigation system. These misrepresentations and misuses mislead anyone who might be interested in experimenting with this system, or evaluating it for potential use.
1. UNWARRANTED ASSUMPTIONS
Assumption: STEP-MES is a Root Cause Analysis method.
False: it is a comprehensive integrated investigation and analysis system for understanding, predicting and improving processes. It shows the coupled interactions that produced outcomes, making attributed characterizations and abstractions like unsafe, human error, factors, cause or root causes unsuitable in descriptions of what happened.
Assumption: STEP-MES reflects an "accident causation model."
False: STEP-MES is based on a process perception of the accident phenomenon, and a behavioral adaptation of the General Systems input/output model, unrelated to accident causation models. An accident is viewed as the process by which a stable activity is transformed by a process of successive interactions among people, objects and energies over time to produce unwanted outcomes.
Assumption: STEP-MES is limited by the requirement to use only concrete actions; it doesn't handle "precursors."
False: the requirement compels specificity to fully explain a process, including remote inputs that "program" people, object and energy behaviors, overcoming ambiguities and abstractions tolerated by other methodologies. Investigators have to identify all specific behaviors necessary to reproduce an outcome of interest, so changes can be addressed and monitored for the full range of behaviors involved, from operators, equipment, energies, supervisors, designers, safety or risk analysts, procedures writers, trainers, regulators, managers, executives, media personnel, standards organizations, legal advisors, legislators, criminals, or anyone else whose actions influenced what happened and why it happened.
Assumption: STEP-MES is very complicated.
False: it is the occurrence that is complicated - when it is fully understood. STEP-MES only helps to document occurrences fully, thereby helping produce outputs that improve an investigation's efficacy and value.
2. STEP-MES PARADIGM SHIFTS
The STEP-MES investigation system introduces a new way to think about accidental phenomena and their investigation, differing in several ways from other practices.
2.a Investigation purpose
The STEP-MES investigation system views accidental occurrences as processes consisting of interactions among people, objects and energies over time that produce unwanted outcomes. The purpose of the STEP-MES investigation system is to understand, document and communicate the behaviors involved to users of investigators' reports, and respond to those processes by changing risk-raising behaviors, as contrasted with traditional purposes of preventing recurrence, determining fault, causes, root causes, or causal factors; or gathering data for statistical or other analyses. The focus is on behaviors that might be changed to improve activities during which something of interest occurred, and the inputs and outputs for those behaviors. This paradigm shift creates a positive collaborative investigation setting, rather than a negative investigative environment and defensive posture, affecting how investigators approach their tasks, what participants are willing to contribute to the investigation, and their interactions with others.
2.b No causes or causal factors
A major difference between the STEP-MES investigation system
and current investigation paradigms is that STEP-MES uses a "process model" of phenomena, requiring concrete descriptions of people, object and energy behaviors, e.g., coupled interactions and "event block" pairs or sets explaining what happened in a matrix structure, in lieu of traditional causes, factors, human error and similar kinds of traditional classification terms reflecting various "causation models."
The main rationale for this change is to encourage objectivity during investigations. Traditional causes and causal factor labels represent conclusions by investigators, based on
demonstrably inconsistent and often ambiguous or unstated criteria, to explain
what happened. In contrast, STEP-MES uses data produced by the occurrence, harmonized and arrayed
in a manner which identifies specific behaviors and relationships to explain what happened and to address
for process improvement purposes. This is essential
for minimizing subjectivity in investigation work products.
A second major reason is that the STEP-MES
approach facilitates identification of process improvement actions from
episodic occurrences, by describing all the behaviors by people and objects required to produce the observed outcomes. This makes it
possible to seek similar interaction pairs and sets elsewhere in the process investigated or
in similar processes everywhere, thus leading to identification of changes with potentially
broad ranges of effects – a goal shared with cause and factor classification
schemes.
A third major reason for this paradigm shift is the desire to
support a change from the inherent adversarial nature of investigations with
reliance on conflicting opinions and judgments, also inherited from the legal community, to
a collaborative endeavor and environment for investigators, conducive to
rigorous application of logic tests for data relevance, validity and
completeness as the investigation progresses, and enhanced replicability of the
results. To achieve this, the language of investigations must shift from
inconsistent terms with pejorative connotations like cause, causal factors,
blame, fault and error, inherited from the legal lexicon, to concrete terms
more compatible with scientific inquiry.
2.c. Prohibiting "DID NOTS"
A major shift from current investigation paradigms is illustrated by the prohibition of DID NOTs and similar negative terms like fault, error, failed or failed to in matrixes describing what happened. STEP-MES software flags DID NOT entries. DID NOT entries
mask what the person or object WAS doing at that time by raising the level of abstraction or ambiguity, thus making it impossible to define the specific demonstrated behavior that has to be changed.
For example, consider the sequence:
"driver turned ignition key ->car did not start."
We cannot picture what happened in our minds as a MENTAL MOVIE from this description - because "did not start" is ambiguous about why the car did not start. STEP-MES requires the investigator to describe who or what did what after the driver turned the key. For example, a more complete and precise description might be: the operator placed his foot on the gas pedal, the operator rotated the key in the ignition switch, the key turned the ignition lock cylinder, the turning cylinder completed a circuit between an energized wire and the wire to the starter relay, the open brake pedal circuit blocked current flow to the relay solenoid, the relay blocked the electrical current flow to the starter. (The brake pedal circuit, which must be closed by depressing the brake pedal to activate the relay, remained open because the operator placed his right foot on the gas pedal instead of the brake pedal.)
Thus, in this example, the corrective action required was to depress the brake pedal before turning the ignition key. Simply asserting that the car did not start would not disclose this action without further investigation. Prohibiting "DID NOTs" forces the investigator to understand specifically who or what did what, in order to define what needed to be done so the car would start in the future. While this is a simple example, it illustrates how prohibiting "DID NOTs" forces the investigator to develop more complete descriptions. (The author was frustrated by this problem for almost an hour in a newly rented and unfamiliar car one dark night far from home the first year the brake pedal safety interlock was introduced in US automobiles.)
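The flagging of DID NOT entries can be sketched in a few lines of code. This is a minimal illustration only, not the STEP-MES software itself: the term list, the name flag_negative_terms and the sample entries are assumptions made for this example.

```python
import re

# Hypothetical list of negative terms, taken from the examples in this section.
NEGATIVE_TERMS = [r"did not", r"didn't", r"failed to", r"failed", r"fault", r"error"]
_PATTERN = re.compile(r"\b(" + "|".join(NEGATIVE_TERMS) + r")\b", re.IGNORECASE)


def flag_negative_terms(entry: str) -> list[str]:
    """Return the negative terms found in a matrix entry, lowercased."""
    return [match.group(1).lower() for match in _PATTERN.finditer(entry)]


# Illustrative entries; only the second one should be flagged.
entries = [
    "driver turned ignition key",
    "car did not start",
    "relay blocked the electrical current flow to the starter",
]
flagged = {e: flag_negative_terms(e) for e in entries if flag_negative_terms(e)}
```

A flagged entry is a prompt for the investigator to replace it with the concrete actions that actually occurred, as in the relay example above; the check itself cannot supply those actions.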
Theoretically, the problem is that DID NOT START describes an
aggregation of several other actions, rather than the specific individual actions
needed to fully describe and explain the outcome. This occurs in almost all instances when DID NOT is used. When actions are aggregated, important interactions are highly likely to be overlooked, and the benefits of understanding them lost or obscured in abstractions. This may reflect missing data or, alternatively, lazy investigators. Either diminishes learning opportunities.
DID NOT may involve another inherent investigation problem: it implies that the investigator
describing what did not happen knows what should have happened.
That knowledge typically is based on knowledge of the outcome or intermediate
outcomes developed during the investigation, or manuals or procedures, or
personal experience, which may or may not be valid in the present case. However, as the occurrence
was happening, the people and objects involved did not know or were unaware of
what the outcome would be, and so their actions must be understood within the
context of the then current course of what was occurring, and how and why they
might best influence that course of events. (Historians recognize this context problem.) To gain this understanding, investigators
must try to determine what the person or object was doing (observing,
recalling, anticipating, pondering, concluding, hearing, concerned about, saying, etc.) or
occupied with at that time. See TIP below.
This challenge is at the heart of
"human factors" investigation needs, exemplified by the "what did he know and
when did he know it" kind of pursuit. Paraphrased, what did they do and why did
they do it. Until these questions are pursued, in lieu of the DID NOTs,
investigators cannot lay legitimate claim to understanding why something
happened. While perhaps adequate for legal proceedings, DID NOTs are
unsatisfactory for investigation purposes if specific behavior changes to
produce improved future performance are to be identified validly and
reproducibly.
TIP: to pursue the difference between expected behavior and actual behavior, start by recording the action of the individual establishing the expected actions, then record what the individual or object actually did. Then, bridge the gap until the succession of actors and actions that changed the expected behavior to the actual behavior over time is understood. Tracing the development of the expected actions and the analysis of criteria to support those selected -- or not selected -- is often illuminating. Tracing an individual's decision process may also be illuminating. Tracing the normalization of deviation, for example, may also be much more fruitful than simply saying "DID NOT", and can lead to identification of concrete counteractions far from the scene to control such changes in the future.
2.d. Data Integrating Matrix
The immediate organization, integration and attempted coupling of new investigation input data on STEP-MES time/actor Matrixes as the data are acquired during investigations is another departure from dominant investigation paradigms. STEP-MES transforms, documents, integrates and tests data on a Matrix as acquired, providing a "progressive analysis" approach that displays a continually updated summary of actions identified by the investigation. This is a sharp contrast with the "get all the facts, then analyze them" or tree-building approaches of some current practices which do not require data disciplining, or raise the level of abstraction or delay the analyses to compensate for data disciplining shortcomings.
The timing of all actions during a process is demanded by STEP-MES so they can be arrayed on a matrix in their temporal and spatial sequence with confidence, and so the effects of the timing of interactions can be identified. The timing requirement also facilitates the identification of gaps in knowledge of the behaviors if the sequential actions of the person, object or energy cannot be visualized in the form of a mental movie. The lack of timing requirements in tree arrays -- a major shortcoming of that data handling structure -- is overcome by STEP-MES matrixes.
The efficiency of investigations is enhanced by this Matrix-driven progressive analysis approach. It quickly exposes gaps in information about what an actor did that need to be filled by acquiring additional data about that actor. This is especially useful in team investigations, where the method provides for rapid and frequent distribution of the current status of the investigation among team members.
Collaboration is encouraged, and differences or irrelevant theories or actions are quickly resolved with logic or additional data when data have to be harmonized and displayed on a matrix with properly reasoned links. Focusing on the matrix and its completion typically has the effect of subordinating perceived self interests to the broader group interest of understanding, explaining and displaying the process. Hypothesized actions must fit into the flow of the known actions already displayed and be validated by one or more sources. Irrelevant data are readily recognized by all when the actions are added to a matrix but cannot be linked to any displayed actions.
STEP-MES requires the source on which each matrix entry is based to be shown with the entry, to ensure that investigators do not "invent" actions for the Matrix (or if they do, they report that) and to provide reviewers with a source to which they can refer if questions arise. Appropriate software can archive and provide printouts of the sources listed for investigation management and source management purposes.
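The matrix organization described in this subsection can be illustrated with a small data structure. The sketch below is not the STEP-MES software; it assumes a simplified event block with four fields (actor, action, time, source), and the names EventBlock, build_matrix and missing_sources, the time unit and the sample sources are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EventBlock:
    actor: str    # person, object or energy that acted
    action: str   # what the actor did, in concrete terms
    time: float   # seconds from an arbitrary reference point (assumed unit)
    source: str   # where the investigator obtained this entry


def build_matrix(blocks):
    """Group event blocks into one row per actor, each row ordered by time."""
    rows = defaultdict(list)
    for block in blocks:
        rows[block.actor].append(block)
    return {actor: sorted(row, key=lambda b: b.time) for actor, row in rows.items()}


def missing_sources(blocks):
    """Return blocks with no recorded source, so they can be reviewed."""
    return [b for b in blocks if not b.source.strip()]


# Illustrative data based on the ignition example; sources are invented.
blocks = [
    EventBlock("key", "turned the ignition lock cylinder", 2.0, "examination of lock"),
    EventBlock("operator", "rotated the key in the ignition switch", 1.0, "operator interview"),
    EventBlock("operator", "placed foot on the gas pedal", 0.0, ""),
]
matrix = build_matrix(blocks)
unsourced = missing_sources(blocks)
```

Because each new block is placed in its actor's row as soon as it is acquired, the matrix can be rebuilt and redistributed to team members at any point in the investigation, and entries without a source stand out for follow-up.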
2.e. Use of Conditions
Another paradigm shift is that STEP-MES uses only actions to document and explain how a process produced an outcome. The underlying rationale is that in nature both static and dynamic states remain unchanged until acted on by someone or something.
A state does not initiate the next action or change; action by someone or something does. Therefore STEP-MES focuses on actions or behaviors during investigations, and prescribes rules for linking coupled actions. Current investigation practices have no such constraint, and will mix actions, conditions, characterizations, allegations, etc., in their descriptions of the phenomenon. This is one of the most difficult habits for experienced investigators to change. Unfortunately, by tolerating conditions in descriptions, investigators can gloss over actions that produced those conditions. For example, it is easy to allege that the corporate culture contributed to an occurrence, but what actions produced that culture, and what specific behaviors does one change to change a corporate culture? Corporate cultures, as with other states, evolve as a result of actions by individuals in supervisory positions over time, and the interpretation or reactions to those accumulated actions by supervised individuals. Tracing those actions to identify what could be changed can be seen to be necessary if effective, efficient action to remedy an undesired corporate culture is to be taken.
This constraint should not be misunderstood: STEP-MES uses states to infer the actions that produced them when observations or data defining the actions are not available. For example, the examination of conditions to infer actions is essential to forensic investigations in laboratories. The reading of event recorders to develop a description of what an aircraft did during a crash sequence is another example of deriving data about actions from the condition of the recording medium. However, this task should not be confused with the task of documenting the description and explanation of what happened.
An area where this becomes a difficult challenge for investigators is in deriving actions that led to states in individuals, and how they affected the decision actions by the individual. The only possibility for doing this with any confidence is to be able to interview individuals skillfully, sensitive to such actions; fatally injured witnesses can offer no insights into mental actions like decisions, conclusions, inferences and the like. This of course is another persuasive reason for emphasis on investigating survivable incidents or "near misses" in a program designed to learn from episodic experiences.
An additional argument is that once identified, actions by individuals often are much easier to observe or monitor objectively than conditions - which are frequently transient or ambiguous - during future activities.
2.f. Replicability and Investigation "Forcing Functions"
Another STEP-MES paradigm shift is to elevate, during investigations, the replicability concept that disciplines the scientific method. In science, replicability is demonstrated by experimentally reproducing predicted results to validate the understanding and explanation of a phenomenon. For investigators, an inherent background objective should be to achieve the understanding required to reproduce the phenomenon or process being investigated. Replicability can be demonstrated with new designs or changes that are flow charted with Investigation Catalyst, by operating the system to determine if each of the predicted interaction pairs or sets occur. Most new systems are tested for that purpose, but often interactions are not documented unambiguously so their occurrence, sequence and timing can be validated.
At first glance, this approach does not seem practical for dealing retrospectively with accidental occurrences involving losses, because one does not want to reproduce the losses. Therefore, to demonstrate replicability, investigators of loss occurrences must rely on demonstrating the robust logic of their description and explanation of what happened. STEP-MES helps accomplish this in two ways. The two interact.
First, STEP-MES links and rules force investigators to identify necessary and sufficient relationships among all the coupled actions required to produce the process outcome(s). When properly understood and implemented, this necessary and sufficient testing compels investigators to ask and answer the questions needed to define the actions that would reproduce each successive interaction leading to the process outcome(s). This constitutes a "forcing function" imposed on investigators during investigations, to describe and explain each action during the process logically and completely. The requirement forces investigators to suppress assumptions about the current system or influencing conditions, and pursue why an actor did each action during the process - in terms of other actions. This leads investigators backward in time to prior actions that shaped the next action. Prior actions might include previous successful task performance experience, establishment of the work environment, training, procedures, task validation, task design and design philosophy, analysis methods, policies, personnel selection and other "human factors" actions or decisions. STEP-MES applies this forcing function to actions by people, objects and energies that played a role in the outcome.
Secondly, STEP-MES matrixes make visible the reasoning on which the investigators base their claim that the process can be replicated. Links are coded to indicate whether the relationship among actions is tentative, confirmed or fully tested for completeness. This provides investigators with a valuable communication tool to either help them demonstrate their understanding of the process, or enable others to point out difficulties they might have with the process description and explanation.
Coded links between coupled actions show the logical flow of the actions required to reproduce the outcome. Necessary and sufficient tests, using the basic question "Are all these actions necessary and sufficient to produce the linked result every time they occur?" force investigators to satisfy themselves they have all the data they need. Links showing the degree of certainty serve as aids during investigations to show what work remains. Gaps where one or more of the necessary actions are unknown or not retrievable expose uncoupled flow problems quickly, and become the focus for data acquisition efforts or hypothesis development.
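The way coded links show what work remains can be sketched as follows. This is an illustration under stated assumptions, not the STEP-MES coding scheme itself: the three status names follow the tentative, confirmed and fully tested categories described above, while LinkStatus, remaining_work, unexplained_actions and the action labels are hypothetical.

```python
from enum import Enum


class LinkStatus(Enum):
    TENTATIVE = "tentative"
    CONFIRMED = "confirmed"
    TESTED = "tested"  # necessary and sufficient testing completed


# Each link is (earlier action, later action, status); labels are illustrative.
links = [
    ("operator rotated key", "key turned lock cylinder", LinkStatus.TESTED),
    ("key turned lock cylinder", "cylinder completed circuit", LinkStatus.CONFIRMED),
    ("open brake circuit blocked current", "relay blocked current to starter",
     LinkStatus.TENTATIVE),
]


def remaining_work(links):
    """Return links that have not yet passed necessary and sufficient testing."""
    return [(a, b, s.value) for a, b, s in links if s is not LinkStatus.TESTED]


def unexplained_actions(actions, links):
    """Return actions with no incoming link: candidates for further data."""
    explained = {later for _, later, _ in links}
    return [a for a in actions if a not in explained]


todo = remaining_work(links)
```

An action with no incoming link is not necessarily a gap, since every description has to begin somewhere; the list is only a prompt for the investigator to decide whether earlier actions need to be pursued.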
The visibility of the logic flow is the key to verifying replicability of the predicted effects of changes that are introduced. Subsequent observations of linked action pairs or coupled sets during future activities, and comparison of those observed actions with the actions displayed on the matrixes, quickly disclose deviations if they exist or occur. Once disclosed, they can be acted on by changing behaviors.
Future task design or modification and training, and operational procedures, should be based on replicable behaviors, demonstrated by the investigation, rather than ambiguous factors or causes. The consequences of ambiguity or abstraction for these functions become readily discernible when the rules, tools and reasoning required to achieve replicability are imposed on investigators, and enforced during reviews of investigation work products. They also are quite apparent during subsequent observations of changes made following investigations.
Thus, replicability is an imperative consideration for quality assurance of investigation work products for all these reasons.
2.g. STEP-MES Exposes Uncertainties
Another paradigm shift is that STEP-MES accommodates uncertain data, estimates or speculations, and requires that they be conspicuously flagged in its displays to ensure that everyone knows they exist and are unresolved. Users of the investigation outputs do not have to discover such data on their own, thus minimizing their review time and effort, and reducing the potential for challenges and disputes.
Uncertainties on STEP-MES matrixes are flagged with a question mark (?) on a link or element of an Event Block. Estimates are flagged with an "e" when they are used for times or other estimated quantitative data. Speculations, assumed data or incomplete data are flagged with color coded Event Blocks. Tentative or uncertain links are flagged with dashed links, and incomplete sets are flagged with empty arrowheads on links. Fully completed and explained descriptions carry none of these flags.
During investigations, the question marks serve as "placeholders" for data that are being acquired but which are not yet available. For example, the investigator becomes aware of an action, but does not yet know who did it, so the EB is created with a ? for the actor. Alternatively, the investigator learns that an actor's time is not yet fully accounted for, so a ? is shown as the action at that time. The purpose, of course, is to remind the investigator of the open data item.
The flags serve another purpose during investigations: they tell investigation managers where investigation efforts are still needed, or what they should be prepared to explain if challenged. When viewed in a matrix, they also give managers a basis for assessing the value of resolving the uncertainties, confirming estimates or validating speculations, in terms of what it would add to the needed understanding to support future action.
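The "?" placeholder and "e" estimate flags can be illustrated with a short sketch. The text layout produced here is an assumption made for this example and does not reproduce the actual STEP-MES display; Entry, render and open_items are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    actor: Optional[str]    # None means the actor is not yet known
    action: Optional[str]   # None means the action is not yet known
    time: float
    time_estimated: bool = False


def render(entry: Entry) -> str:
    """Format an entry, showing '?' for unknowns and 'e' for an estimated time."""
    actor = entry.actor or "?"
    action = entry.action or "?"
    flag = " e" if entry.time_estimated else ""
    return f"[t={entry.time:g}{flag}] {actor}: {action}"


def open_items(entries):
    """Return entries that still carry a '?' placeholder or an estimate flag."""
    return [e for e in entries
            if e.actor is None or e.action is None or e.time_estimated]


# Illustrative entries: one complete, one with an unknown actor and estimated time.
entries = [
    Entry("operator", "rotated the key in the ignition switch", 1.0),
    Entry(None, "opened valve", 12.5, time_estimated=True),
]
pending = open_items(entries)
```

A list such as pending gives an investigation manager the open data items in one place, which is the reminder role the flags play on the matrix.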
3. STEP-MES REASONING REQUIREMENTS
Logical reasoning skills are essential to most data-related tasks during investigations. STEP-MES requires four explicit kinds of reasoning competence: sequential, input/output (if-then), deductive, and necessary and sufficient. Each contributes to the successful development of complete, valid descriptions and explanations of a process.
3.a. Sequential reasoning
Sequential reasoning is used to order and integrate actions in their temporal and spatial order. Another use is to pinpoint gaps in a sequence of actions by visualizing them in sequence in a Mental Movie, where missing frames expose gaps in the interaction flow. STEP-MES documents this reasoning by positioning events in sequence on a time/actor matrix.
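The search for "missing frames" in one actor's sequence can be sketched as a simple interval check. The sketch assumes, for illustration only, that each action is recorded with a start and an end time; the function name timeline_gaps, the tolerance parameter and the sample times are hypothetical.

```python
def timeline_gaps(intervals, tolerance=0.0):
    """Given (start, end, action) tuples for one actor, return unaccounted spans."""
    ordered = sorted(intervals, key=lambda iv: iv[0])
    gaps = []
    latest_end = ordered[0][1] if ordered else None
    for start, end, _ in ordered[1:]:
        if start - latest_end > tolerance:
            gaps.append((latest_end, start))
        latest_end = max(latest_end, end)
    return gaps


# Illustrative timeline for one actor; times and the last action are invented.
operator = [
    (0.0, 2.0, "placed foot on the gas pedal"),
    (2.0, 3.0, "rotated the key in the ignition switch"),
    (10.0, 12.0, "opened the hood"),
]
gaps = timeline_gaps(operator)
```

Here the operator's time between 3.0 and 10.0 is unaccounted for, so the investigator would enter a ? for the action in that span and seek data on what the operator was doing.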
3.b. Input/output reasoning
This reasoning skill is required to determine if an action affects any other subsequent actions; e.g., if this happened, then does it follow that one or more subsequent actions had to occur. I/O reasoning is used to identify inputs which influenced an action, and provides the basis for the coupling between actions during a process. STEP-MES documents this input/output reasoning with tentative or confirmed links.
Backward I/O reasoning is used to hypothesize what actions might have produced observed or reported changes: e.g., if you observe this state or action, then what prior actions were required to produce this state or action? This backward use is sometimes called why-because reasoning or if-then reasoning.
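Backward I/O reasoning over links that have already been recorded can be pictured as a walk backward through the link structure. The sketch below assumes the links are stored as a mapping from each action to the prior actions that were inputs to it; prior_actions, inputs_of and the action labels are hypothetical, and the walk only retrieves what investigators have already documented.

```python
from collections import deque


def prior_actions(outcome, inputs_of):
    """Breadth-first walk backward from an outcome through recorded input links."""
    seen, order = {outcome}, []
    queue = deque([outcome])
    while queue:
        current = queue.popleft()
        for earlier in inputs_of.get(current, []):
            if earlier not in seen:
                seen.add(earlier)
                order.append(earlier)
                queue.append(earlier)
    return order


# Illustrative links based on the ignition example earlier in these notes.
inputs_of = {
    "relay blocked current to starter": ["open brake circuit blocked current to solenoid"],
    "open brake circuit blocked current to solenoid": [
        "cylinder completed circuit",
        "operator placed foot on gas pedal",
    ],
    "cylinder completed circuit": ["key turned lock cylinder"],
    "key turned lock cylinder": ["operator rotated key"],
}
chain = prior_actions("relay blocked current to starter", inputs_of)
```

Actions that end the walk with no recorded inputs, such as "operator placed foot on gas pedal" here, are where the question "what prior actions were required to produce this action?" still has to be asked.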
3.c Deductive reasoning
Deductive reasoning is used to infer actions, using proven scientific laws or principles, that would explain the actions needed to produce an observed action or condition. Another use is to hypothesize such actions to fill gaps in a sequence of actions, based on application of natural laws and scientific principles, prior observations or perceptions of how processes should function.
3.d. Necessary and Sufficient reasoning
Necessary and sufficient reasoning is used to test the completeness of each set of interactions during the process described by the matrix, during an investigation and during quality assurance tasks. The necessary reasoning process asks if each of the actions linked to subsequent action(s) must always occur for the subsequent action to occur; if not, unnecessary actions should be unlinked, and discarded at the conclusion of the investigation. This task can utilize counterfactual reasoning: if an action had not occurred, would the subsequent action still have occurred?
Sufficient reasoning determines if all the actions shown and linked to the subsequent coupled action(s) are sufficient for the subsequent action(s) to result every time they all occur; if not, the missing actions should be identified by additional investigation and added to the matrix. This task requires knowledge of the process or subsystem and its operation, how interactions advance the process toward its outcome, and skill at tracking actions needed to reproduce each successive step in the process. When complete, users should be able to reproduce the entire process and outcome - in documented form.
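The two tests can be illustrated with a toy model. The sketch below is not a STEP-MES procedure: it assumes the investigator's understanding of the subsystem can be written as a function that says whether the outcome occurs for a given set of actions, and it uses an invented, deterministic model of a car with a brake pedal interlock. All names and action labels are hypothetical.

```python
def car_starts(actions: frozenset) -> bool:
    """Toy model (an assumption): the starter runs only if all three occur."""
    required = {"key rotated", "brake pedal depressed", "battery energized circuit"}
    return required <= actions


def sufficient(candidate_set, outcome_model) -> bool:
    """Does the outcome occur when all the listed actions occur?"""
    return outcome_model(frozenset(candidate_set))


def necessary(candidate_set, outcome_model) -> set:
    """Counterfactual test: which actions, if removed, stop the outcome?"""
    full = frozenset(candidate_set)
    if not outcome_model(full):
        raise ValueError("set does not produce the outcome; test sufficiency first")
    return {a for a in full if not outcome_model(full - {a})}


# Illustrative linked set containing one action that the model does not need.
linked = {"key rotated", "brake pedal depressed",
          "battery energized circuit", "radio switched on"}
needed = necessary(linked, car_starts)
to_unlink = linked - needed
```

In this toy case to_unlink contains only "radio switched on", and a set lacking "brake pedal depressed" fails the sufficiency test, which points to a missing action. In a real investigation the model is the investigator's documented knowledge of how the process operates, so the tests are only as good as that knowledge.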
Particularly challenging during this task is finding all the actions that influenced what people did, such as communications, assumptions, perceptions, decisions, workloads or imposed requirements, orders, training, and past experiences with interactions with other people or objects. Finding prior actions that influenced what objects did is similarly challenging, often requiring understanding of actions like assuming design criteria, selecting design standards, selecting safety analysis or investigation methods, establishing operating procedures, conducting operational tests, prescribing or performing maintenance procedures, or assuring quality, for example.
PRECAUTIONARY NOTE: Tracking the actions which influenced what deceased individuals did is an almost impossible task, because data from the primary original source are unavailable. Investigators often try to overcome this constraint, regrettably, by introducing into accident descriptions inadequately supported inferences or speculations, ambiguities, characterizations, allegations, judgments, and raised levels of abstraction of the description to mask the problem. A preferable option is to focus safety investigation programs on occurrences with surviving witnesses who can describe what they did and why they did it.