Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
18th World Conference on Nondestructive Testing, 16-20 April 2012, Durban, South Africa
Holistically Evaluating the Reliability of NDE
Systems – Paradigm Shift
Christina MUELLER 1, Marija BERTOVIC 1, Mato PAVLOVIC 1, Daniel KANZLER 1, Uwe EWERT 1, Jorma PITKÄNEN 2, Ulf RONNETEG 3
1 Federal Institute for Materials Research and Testing, Unter den Eichen 87, 12205 Berlin, Germany,
Phone: +49 30 81041833, Fax +49 30 81041836; e-mail: [email protected]) 2 Posiva Oy, Eurajoki, Finland 3 SKB, Oskarshamn, Sweden
Abstract New methodologies for evaluating the reliability of NDE systems are discussed in accordance with the specific requirements of industrial application. After a review of the substantive issues from the previous decades, the go forward guidance is concluded.
For high safety demands a quantitative probability of detection (POD) created from hit miss experiments or signal response analysis and ROC (Receiver Operating Characteristics) are typically created. The modular model distinguishes between the influence of pure physics and technique, industrial application factors and the human factor and helps to learn what factors are covered by modelling, open or blind trials. A new paradigm is offered to consider the POD or reliability of the system as a function of the configuration of input variables and use it for optimisation rather than for a final judgement. New approaches are considered dealing with real defects in a realistic environment, affordable but precisely like the Bayesian approach or model assisted methods.
Among the influencing parameters, the human factor is of high importance. A systematic psychological approach helps to find out where the bottlenecks are and shows possibilities for improvement. Keywords: Reliability of NDE, POD, ROC, Modular Model, Multiparameter-POD, Bayesian Approach, Human Factor
1. Review of Approaches to NDE Reliability
1.1. State of the Art as described by the European-American Workshops on NDE
Reliability 1997 – 2002
Before presenting the new paradigm in the field of NDE reliability the authors would like to summarize the insights from the first three workshops from the series of European-American Workshops on NDE reliability. The main achievement from the first workshop 1997 in Berlin was the conceptual model in terms of the reliability formula which simply says that the total reliability of an NDE system is composed of the intrinsic capability, IC (which stands for the physical principle behind the defect indication and its technical realization as an upper bound), the application factors, AP (describing the realistic circumstances like UT-coupling, limited access, noise of the surrounding …) and the human factor, HF, present in each application. While imperfect – e.g. the mutual interactions between the factors were not considered – the conceptual model helped to properly define potential for performance optimization, and also as an assessment tool for the adequacy of open and blind trials. A main
benefit from the Second European American Workshop on NDE Reliability, September 1999, Boulder, Colorado, USA [1], was the clear definitions of i) the NDE System as the procedure, equipment and personnel that are used in performing NDE inspection and ii) the
NDE reliability as the degree that an NDT system is capable of achieving its purpose regarding detection, characterization and false calls. The main conclusion 2002 from the third European American Workshop was: We need to quantify the risk in NDE and demining!
Figure 1: Status 2002: The need to deal with NDE Reliability
From the beginning, it was clear that we are asking the NDE process to describe the real status of a component. Later, limits were discovered for NDE systems e.g. for small flaws, not all flaws will be detected. The second task then is to provide quantitative reliability information about inspection results at least for defined risk critical applications. In Figure 1 this means that there is not always a 1:1 correspondence between the “truth” (the real defect situation) in a weld (left side) and the defects indicated by the NDE system which are recorded in the inspection record (right side).
Figure 2: Status 2002: Each NDE-system has a certain performance capability which can be defined by parameters
The traditional European approach relies on standards i.e. to describe value ranges and allowed uncertainties of the physical and technical parameters in a standard and test according to the standard as illustrated in Figure 2.
TruthTruth
NDENDE--SystemSystem
InspectionInspection ReportReport
sect
ion
No.
sect
ion
No.
1 acceptable flaw2 indifferent / additional radiography3 unacceptable flaw
Section No.
Defect Type
Importance
1 2 3 4 ...
Eb Aa - Ab
3 1 - 1
Section No.
Defect Type
Importance
1 2 3 4 ...
Eb Aa - Ab
3 1 - 1
environmentalenvironmental
influenceinfluenceenvironmentalenvironmental
influenceinfluence
socialsocial andand
psychologicalpsychological
influenceinfluence
socialsocial andand
psychologicalpsychological
influenceinfluence
inspection reportinspection report
imagingitem
imagingitem
digitisation
andimprovement
inspectorinspector
NNNeuronal
Networks
NNNeuronal
Networks
x - ray tube
undercut
filmcrack
pore
x - ray tube
undercut
filmcrack
poreof the source
and interaction
of rays / waveswith material
and defects
physicsof the source
and interaction
of rays / waveswith material
and defects
physics
or
Figure 3: Status 2002: Each NDE system has a certain performance capability which can be measured by Performance Demonstration - US Nuclear Approach (EPRI PDI)
An alternative approach – the performance demonstration strategy as required by ASME section XI appendix VIII – employed by ISI NDE for the US nuclear industry represented by EPRI since the 1980s – is illustrated in Figure 3. The NDE system – composed of the NDE equipment, the procedure and the human personnel is handled as a black box. Only the “input” in terms of the flawed test samples is compared to the “output” the test results. It is the responsibility of the NDE provider how to reach the goal and document the practice in a procedure/specification. The results are then statistically evaluated and acceptance limits are set.
Figure 4 illustrates the ROC (Receiver Operating Characteristics)-curves as a performance measure which has been used for a number of decades in the operation of radar, and also in medical diagnostics [2, 3]. Within the ROC-scheme for the characterization of the performance of an NDE system the proportion of hits (true positive indications) are evaluated against false calls (false positive indications) in a diagram. Along one ROC-curve the sensitivity is rising as the decision threshold is lowered, i.e. since the threshold aims to distinguish between noise and wanted signals, more and more noise signals are accumulated. In the upper right corner all detected signals are gathered. The quality of NDE-systems raises from the straight line – the “pure chance” curve where noise and wanted signals are counted with the same frequency - up to the left edge step curve where only valid signals are gathered (ideal system). The truth of real NDE systems might occur on all ranges between theses extremes (see e.g. Figure 15). For NDE-engineers it is now important to learn what physical parameters do influence the probability of detection to be limited or growing. For a fixed false call rate – as indicated in Figure 5 – we look for the value of the POD and how it is distributed. Originating in the US-aerospace industry [4] another data sampling scheme is utilized for this purpose – the “Probability of Detection” (POD) as a function of defect size (or another appropriate defect parameter) with Figure 6 showing the result for one applied threshold.
ReportReportReportReportNonNon--DestructiveDestructive TestTestUnknownUnknown DefectDefect Situation in Situation in
Testsample Testsample oror Real Real ComponentComponent
Black BoxBlack Box
Blind TestBlind Test
ComparisonComparison
TrueTrue DataDataTest Test ResultResult
StatisticsStatistics of Hit (TP), Miss (FN)of Hit (TP), Miss (FN)
StatisticsStatistics of of FalseFalse Alarm (FP), Alarm (FP), TrueTrue "no "no DefectDefect" (TN)" (TN)PODPOD
PFAPFA
Figure 4: Status 2002: Comparison of Different ND – Systems using ROC – Receiver Operating Characteristics – Curves
Figure 5: Status 2002: ROC - POD Relation; POD - Probability of Detection
Figure 6: "Probability of Detection„ POD as a function of defect size (for one threshold), Typical POD Curve, Hit-Miss: Percentage of defects found
For automated thresholding systems – e.g. eddy current testing of gas turbine engines – the method has been developed further for a signal response POD as illustrated in Figure 7.
Standard Deviations for all Curves
Noise : 1.0Signal : 1.0
Meanvalue of Noise : 0.0
Difference of the Mean ValuesSignal - Noise :
1 - 0.0
2 - 0.53 - 1.0
4 - 1.55 - 2.0
6 - 2.57 - 3.0
IncreasingReliability
pro
ba
bil
ity
of
co
rre
ct
de
tec
tio
ns
probability of false call
p(TP)
p(FP)
reliability
1
2
3
45
67
pro
ba
bil
ity
of
co
rre
ct
de
tec
tio
ns
probability of false call
p(TP)
p(FP)
reliability
1
2
3
45
67
pro
ba
bil
ity
of
co
rre
ct
de
tec
tio
ns
probability of false call
p(TP)
p(FP)
pro
ba
bil
ity
of
co
rre
ct
de
tec
tio
ns
probability of false call
p(TP)
p(FP)
reliability
1
2
3
45
67
pro
ba
bil
ity
of
co
rre
ct
de
tec
tio
ns
PODROC
sen
sitivi
tyra
ises
0.0 0.5 1.0
0.5
1.0
probability of false call
p(TP)
p(FP)
pro
ba
bil
ity
of
co
rre
ct
de
tec
tio
ns
PODROC
sen
sitivi
tyra
ises
0.0 0.5 1.0
0.5
1.0
probability of false call
p(TP)
p(FP)
defect size
PO
D
100 %
defect size
PO
D
100 %
Figure 7: Status 2002: „â versus a“ Signal
A defect of size “a” causes a signal of height “â”. When the signal is above the selected threshold it is counted as defect indication and otherwise as noise. The “â versus a” curve almost all cases on a logarithmic scale is indicated on the lower left hand side of Using an appropriate statistical model for the data distribution (see transformed into a typical POD curve with a corresponding 95% lower confidence bound.
1.2. New Views and Approaches 2009
reliability of NDE systems unerring, reliable and efficiently according to the specific
requirements of industrial application
In the above chapter we discussed the issuAmerican Reliability workshop in 2002.
: Status 2002: „â versus a“ Signal-Response-POD
A defect of size “a” causes a signal of height “â”. When the signal is above the selected counted as defect indication and otherwise as noise. The “â versus a” curve
almost all cases on a logarithmic scale is indicated on the lower left hand side of Using an appropriate statistical model for the data distribution (see [4]) this diagram can be
to a typical POD curve with a corresponding 95% lower confidence bound.
and Approaches 2009-2012: New methodologies in evaluating the
y of NDE systems unerring, reliable and efficiently according to the specific
requirements of industrial application
In the above chapter we discussed the issues already known during the 3workshop in 2002. What has developed since?
A defect of size “a” causes a signal of height “â”. When the signal is above the selected counted as defect indication and otherwise as noise. The “â versus a” curve – in
almost all cases on a logarithmic scale is indicated on the lower left hand side of Figure 7. this diagram can be
to a typical POD curve with a corresponding 95% lower confidence bound.
: New methodologies in evaluating the
y of NDE systems unerring, reliable and efficiently according to the specific
es already known during the 3rd European-
Figure
With the “NDE Reliability Flower Garden” shown in considerable and versatile progress all over the world between the corner points of industrial demands and scientific progress on the one hand and between sand economic conditions on the other hand. What is the blooming item in each of the flowers? As indicated in the work from EPRI [established in the US nuclear industry and with itthe ENIQ (European Network of Inspection Qualification) approach, which was from the beginning composed of parameter assessment and open and blind trials. The National adaptations of the ENIQ methodology are also stoday [6]. The POD-qualification of NDE methods applied to airplane components is mandatory worldwide. For general NDE applied by accredited labsvalidation of quantitative and qualitative testinstandard exists. Also the new branch of SHMevaluate the reliability via POD curves. Life Time Management) is more and morfacilities and the reliability of NDE is an important bridge between testing and the assessment of the remaining risk and life time. The youngest most promising flower is the modelling assisted POD (MAPOD see Bruhigh number of empirical experiments by substituting missing information by modelled results. Last not least for dealing with the Human Factor in an appropriate scientific way working psychological methods came on board firstly touched in investigations within the PANI program and investigations in demining and NDE by BAM. But even more insights from this field are to be taken into account when completing the modular reliability model: the organizational context within which all the modules are embedded and also the interaction between different organizations as we can learn from Babette Fahlbruchmodular model as depicted in the different “Flowers” of Figure
Figure 8: Today: The Reliability Flower Garden 2012
With the “NDE Reliability Flower Garden” shown in Figure 8 the authors like to illustrate the considerable and versatile progress all over the world between the corner points of industrial demands and scientific progress on the one hand and between safety and security demands and economic conditions on the other hand. What is the blooming item in each of the flowers?
d in the work from EPRI [5] the PD (Performance Demonstration) is now well established in the US nuclear industry and with its procedure examination it drew nearer to the ENIQ (European Network of Inspection Qualification) approach, which was from the beginning composed of parameter assessment and open and blind trials. The National adaptations of the ENIQ methodology are also spread out over Western and Eastern Europe
qualification of NDE methods applied to airplane components is mandatory worldwide. For general NDE applied by accredited labs (ENvalidation of quantitative and qualitative testing methods is required when no established
Also the new branch of SHM (Structural Health Monitoring) is going toevaluate the reliability via POD curves. RBI (Risk Based Inspection) and RBLM (Risk Based Life Time Management) is more and more applied to pressure equipment and nuclear facilities and the reliability of NDE is an important bridge between testing and the assessment of the remaining risk and life time. The youngest most promising flower is the modelling
Bruce Thompson, et. al. [7]) which helps to reduce costs for a high number of empirical experiments by substituting missing information by modelled results. Last not least for dealing with the Human Factor in an appropriate scientific way
l methods came on board firstly touched in investigations within the PANI program and investigations in demining and NDE by BAM. But even more insights from this field are to be taken into account when completing the modular reliability model:
tional context within which all the modules are embedded and also the interaction ions as we can learn from Babette Fahlbruch [12
modular model as depicted in Figure 9 can also help to understand what issues are covered by Figure 8.
the authors like to illustrate the considerable and versatile progress all over the world between the corner points of industrial
afety and security demands and economic conditions on the other hand. What is the blooming item in each of the flowers?
(Performance Demonstration) is now well s procedure examination it drew nearer to
the ENIQ (European Network of Inspection Qualification) approach, which was from the beginning composed of parameter assessment and open and blind trials. The National
pread out over Western and Eastern Europe qualification of NDE methods applied to airplane components is
(EN-ISO 17025) the g methods is required when no established
Health Monitoring) is going to RBI (Risk Based Inspection) and RBLM (Risk Based
e applied to pressure equipment and nuclear facilities and the reliability of NDE is an important bridge between testing and the assessment of the remaining risk and life time. The youngest most promising flower is the modelling
) which helps to reduce costs for a high number of empirical experiments by substituting missing information by modelled results. Last not least for dealing with the Human Factor in an appropriate scientific way
l methods came on board firstly touched in investigations within the PANI program and investigations in demining and NDE by BAM. But even more insights from this field are to be taken into account when completing the modular reliability model:
tional context within which all the modules are embedded and also the interaction 12]. The generalized
can also help to understand what issues are covered by
Generally, when asked for an assessment of the reliability of an NDE-system the following approach is helpful to consider: first, the actual level of safety demands have to be defined in order to adapt the thoroughness of investigation to the level of risk when the tested component would fail. Then, all the essential influencing parameters need to be listed and transferred to an appropriate design of experiments to determine the reliability in terms of a qualitative assessment for lowering the risk, or in terms of quantitative probability of detection (POD) or ROC (Receiver Operating Characteristics) curves.
We recommend here a new paradigm which considers the POD or reliability of the system as a matrix of input variables utilized for process optimization, rather than for a final judgment. A subsequent element, and we believe advantage of this approach for the end user, is also to sample all single PODs to an integral “Volume POD” of a part including data fusion [9].
Among the influencing parameters, the human factor is the most important one but of course after the basics via IC are established. A systematic psychological approach should help to find out where the bottlenecks are, but as first priority, to provide the best possible working conditions for the human inspectors.
Figure 9: The modular reliability model helps to understand and weight the different influences as realized by
MAPOD and ENIQ, PANI and BAM
2. Application Examples for the Progress in Methods for Determining Intrinsic
Capability, Application Parameters and Human Factors
2.1. Examples for new handling of Intrinsic Capability and Application Parameters
Multiparameter POD
Physics
Intrinsic Capability
e.g. Environment
IC
Optimized
Diagnostic
System
AP HF
Human FactorsApplication Parameters
e.g. experienced orunexperienced
inspectors
Signal
Organisational Context
Signal
Physics
Intrinsic Capability
e.g. Environment
IC
Optimized
Diagnostic
System
AP HF
IC
Optimized
Diagnostic
System
AP HF
Human FactorsApplication Parameters
e.g. experienced orunexperienced
inspectors
e.g. experienced orunexperienced
inspectors
SignalSignal
Organisational Context
SignalSignal
Figure 10 shows a typical result using the “â versus a” scheme as explained above. When applying this approach to industrial applications like copper canister components for radioactive waste (see e.g. [8], [9], [10], [14]) we have to expand this approach to real industrial conditions.
Figure 10: Probability of Detection (POD) – â vs. a approach
Specifically, we see a need for a multi-parameter „a“ (depth, size, orientation, roughness …) and a data-field „â“ (more than a maximum signal) for industrial applications with thick parts and complex defect shapes.
To investigate this a modelling-assisted multi-parameter methodology (Pavlovic [8], [9]) was developed and applied to the lid of the copper canister seen in Figure 11. The POD as a function of defect size (FBH diameter), defect depth and defect angle is presented in Figure 12(a), (b) and (c), respectively. The sharp decline of POD with increasing angle, for example, shows how important a comprehensive multi-parameter consideration is.
Figure 11: Copper Canisters for deep deposit SKB/Posiva, Canister Parts planned for the NDT inspection:
Copper Lid
(a) (b) (c)
Figure 12: POD as function of FBH-diameter (a), depth (b), and angle (c).
The investigation examples presented so far include the consideration of the intrinsic capability (IC) and also parts of the application factors (AP). Bayesian Approach for dealing with limited data from real defects
Applying the POD evaluation to real industrial processes, the need to deal with the real occurring flaws under production or in service conditions arises as a natural application factor. To create those in a statistically sufficient amount is too expensive or even unfeasible. The Bayesian Approach provides a good solution to compute POD-curves in case of small amount of real defects without losing the necessary information by incorporating data from artificial defects via a prior function. The posterior function contains the needed information for the computation of POD-curves for real defects with an acceptable and sufficient amount of information, even for sparse amount of data as shown in the work of Kanzler, et. al. [10].
Figure 13: Data of artificial defects and real defects for RT
Figure 14: The POD made of only real defects and the improvement with the Bayesian approach with combining real and artificial data
Summary result of a round robin test in radiographic weld inspection
The next example – as contrast to the almost “pure” examples covering intrinsic probability and application factors - shows ROC curve results in Figure 15 for the same technique (X-ray inspection) applied to weld specimens evaluated by 20 different inspectors. So, the difference in the results is only caused by the difference in human performance (see [11]).
Figure 15: ROC of 20 human inspectors for radiographic film evaluation from [5]
• Radiographic film images of 40 weld sections were evaluated by 20 inspectors • grey: the ROC-curves of the individual inspectors • red: the overall mean-value-curve with the actual operating points and error bars in the
maximum point The high diversity in this and other similar investigations suggests that we need to understand human factors in a more systematic way making advantage of the facilities of working psychology.
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
0 0,2 0,4 0,6 0,8 1
p(FP)
p(T
P)
0
0,1
0,2
0,3
0,4
0,5
0,6
0,7
0,8
0,9
1
0 0,2 0,4 0,6 0,8 1
p(FP)
p(T
P)
2.2. Human Factors
According to the periods of safety research (see Figure 16), developed by Reason (1993; in [12] and expanded by Wilpert& Fahlbruch (1998; in [12]), the different influencing factors on the reliability of safety relevant work results – from technology over individual human errors to socio-technical and inter-organizational factors - are well known for a while in the human factors field. Within the human factor project SR2514, sponsored by the German’s Federal Office for Radiation Protection (Bundesamt für Strahlenschutz, BfS), on human factors’ influences on manual ultrasonic inspection performance during ISI in nuclear power plants , a uman factor model had been set up and tested which contained the three first areas of factors of Figure 16(see [13]for detailed results).
Figure 16: Periods of Safety Research
The specific goal of the project was to determine if and how time pressure influences the manual ultrasonic inspection performance. Based on a DoE protocol, ten experienced inspectors carried out manual ultrasonic inspection. Additionally two teams carried out mechanised/ automated inspection. All inspections were accomplished on a mock-up reactor vessel and on various test pieces with programmed flaws (notches and cracks). The selected flaws were typical for tasks performed during ultrasonic in-service inspection in nuclear power plants. The task was to find the flaws in given areas and to determine the corresponding amplitudes, coordinates and length dimensions. The evaluation criteria for the quality of inspection and for the human factors’ influence, was the measurement precision for detected indications.
To substantiate the human factors element, the main focus was put on parameterization of the manual inspections. The time available for inspection, evaluation and writing the protocols was varied in three levels (A-no time limitation, B-middle time pressure, C-high time pressure). Psychological factors, i.e. mental workload, stress resistance and stress reaction were additionally assessed in the course of the experiment.
The results show a high influence of human factors on the inspection results. The main hypothesis was confirmed, i.e. that the time pressure has an effect on the measurement precision. The effect of individually perceived time pressure (see Figure 17) (rather than the so called “objective” setting from outside) and mental workload were found to be significant having a negative impact on the performance. A good preparation before the inspection (e.g.
Inter-organizational factors
Dysfunctional
relations
between
organizations as
source of
factors
Inter-organizational factors
Dysfunctional
relations
between
organizations as
source of
factors
Socio-technical factors
Interaction of
subsystems as
source of
factors
1995
Socio-technical factors
Interaction of
subsystems as
source of
factors
1995
Human error factors
Individuals
as source of
factors
1990
Human error factors
Individuals
as source of
factors
1990
Technical factors
Technology
as source of
factors
Technical factors
Technology
as source of
factors
Com
ple
xity o
f t
echn
olo
gy
Time
1950
Expanded from Reason (1990)
Com
ple
xity o
f t
echn
olo
gy
Time
1950
Com
ple
xity o
f t
echn
olo
gy
Time
1950
Time
1950
Expanded from Reason (1990)
thorough briefing and training on test pieces) and the communication between the "organisation" (e.g. the supervision of the utilities, inspection companies and expert organisations) and the inspectors were considered as highly important. The influence of the organisation has shown to be relevant for the practice. When the inspectors consider the inspection tasks as not precise enough defined, the documentation as ambiguous and not manageable, the working conditions as inadequate, and above all when they feel they have not been sufficiently prepared for the tasks, there is a clear increase in the mental workload of the task and consequently a decrease in the measurement precision. Figure 18 shows differences between different operators in the amplitude estimation through all experimental conditions (time pressure A, B and C).
Figure 17: Results: the scatter of amplitude, extension and depth measurement as a function of the perceived
temporal demand (A – Low temporal demand, B – Middle temporal demand, C – high temporal demand)
Figure 18: Measured Amplitude-mean values - Differences between the inspectors
Conclusions from the project: • It was confirmed time pressure has an effect on the quality of UT inspections • The organizational context determines the way inspections are performed and therefore
highly influences on the inspection quality. For an improvement of the human performance attention needs to be drawn on: • Demonstration task for training and confirmation
0
5
10
15
20
25
30
35
A B C
Zeitliche Anforderungen
Str
eu
un
gdes y
_m
ax
[mm
2]
exte
nsi
on
Temporal demand
0
5
10
15
20
25
30
35
A B C
Zeitliche Anforderungen
Str
eu
un
gdes y
_m
ax
[mm
2]
exte
nsi
on
exte
nsi
on
Temporal demand
0
20
40
60
80
100
120
A B C
Zeitliche Anforderungen
Str
euu
ng
des z
_m
ax
[mm
2]
dep
th
Temporal demand
0
20
40
60
80
100
120
A B C
Zeitliche Anforderungen
Str
euu
ng
des z
_m
ax
[mm
2]
dep
thd
epth
Temporal demand
0
5
10
15
20
25
A B C
Temporal demand
ampli
tude
0
5
10
15
20
25
A B C
Temporal demand
ampli
tude
Vertical bars denote 0.95 confidence intervals
r g i k m s p d e z
Pruefer
-5
-4
-3
-2
-1
0
1
2
3
4
5
6
7
Am
pA
mp
litud
e
Operators
Vertical bars denote 0.95 confidence intervals
r g i k m s p d e z
Pruefer
-5
-4
-3
-2
-1
0
1
2
3
4
5
6
7
Am
pA
mp
litud
e
Operators
• Organization – good preparation • Written procedures and protocols • Supervision Further attempts to investigate human factors in the NDT-field have focused on the mechanized inspection of the canister, to be used for the final disposal of nuclear waste in Sweden and Finland. A substantial number of factors that could affect the reliability of NDT methods have been identified and analyzed. Proposals for the compensation of varying human performance, according to the experts, include the implementation of human redundancy (known also as the 4-eyes principle) or the semi-automation of the defect detection process.. However, implementing human redundancy in critical tasks, such as defect identification, as well as using an automated aid (software) to help operators in decision making about the existence and size of defects, could lead to other kinds of problems, namely social loafing (excerpting less effort when working on tasks collectively as compared to working alone) and automation bias (uncritical reliance on the proper function of an automated system without recognizing its limitations and the possibilities of automation failure) that might affect the reliability of NDT in an undesired manner, when not taken into account adequately (as elaborated by Bertovic, et. al. [14]). Deeper understanding of the defect detection and sizing process, as well as an evaluation of the existing procedures, are a part of the continuing effort to understand the varying human performance, as well as to optimize current practices, if needed.
3. Conclusion and Outlook
Dealing with the NDE reliability measurement effort the safety and economic requirements have to be taken into account and the qualitative and quantitative importance of each subtask (detection, false calls, characterisation, sizing) needs to be assessed. Multi-parameter POD and methods employing modelling assisted POD or data combination by Bayesian Approach appear promising to simultaneously fulfil safety and economic demands. The modular model helps to understand the impact of different sources of influences from parameters connected to physics, application or human factors – individual or organizational, respectively. Human factors need to be more deeply investigated by means of working psychology as ways to optimize the working conditions and preparative actions for the operators.
4. Acknowledgement
The authors would like to thank Lloyd Schaefer as well as Babette Fahlbruch for helpful discussions.
5. REFERENCES
1. ASNT Topical Conference Paper Summaries Book of the American-European Workshop
on Non-destructive Inspection Reliability, September 21 24, 1999, NIST, Boulder, CO, USA, ISBN: 1 57117 041 3.
2. Nockemann C, Heidt H and Thomsen N, “Reliability in NDT: ROC study of radiographic weld inspections”, NDT&E International, 1991 24 (5) 235-245.
3. Metz C E, “Basic Principles of ROC analysis”, Seminars in Nuclear Medicine, 1978 8 (4).
4. Berens A P, “NDE Reliability Data Analysis – Metals Handbook” Volume 17, 9th
Edition: Nondestructive Evaluation and Quality Control, ASM International, OH, 1989 5. M. Dennis, C. Latiolais, F. Ammirato, P. Heasler, “Development of POD and Flaw
Sizing Uncertainty Distributions from NDE Qualification Data to Support Probabilistic Structural Integrity Assessment for Dissimilar Metal Welds, Proceedings of the Eights International Conference on NDE in Relation to Structural Integrity for Nuclear and Pressurised; Components, 29 September – 1 October 2010 – Berlin, Germany
6. B. Neuendorf, T. Seldis, L. Gandossi, „Trends in Europe’s NDE for the Nuclear Industry”,Proceedings of the Eights International Conference on NDE in Relation to Structural Integrity for Nuclear and Pressurised, Components, 29 September – 1 October 2010 – Berlin, Germany
7. R. Bruce Thompson, L. J. Brasche, D. Forsyth, E. Lindgren, P. Swindell, W. Winfree, „Recent Advances in Model-Assisted Probability of Detection, Paper presented on the 4th European-American Workshop on Reliability of NDE, Berlin, Germany, June 24-26, 209
8. Pavlovic M, Takahashi K, Müller C, Boehm R and Ronneteg U, “Reliability in Non-Destructive Testing (NDT) of the Canister Components. NDT Reliability – Final Report”, SKB Technical report R-08-129, ISSN 1402-3091, Swedish Nuclear Fuel and Waste Management Co., 2008.
9. Mato Pavlovic et. al., Th.3.A.2; Multi-Parameter Influence on the Response of the Flaw to the Phased Array Ultrasonic NDT System. The Volume POD; CD of Conference Proceedings of the 4th European-American Workshop on Reliability of NDE, June 24-26, 2009, DGZfP
10. Kanzler, D., Müller, C., Ewert U., Pitkänen, J. „BAYESIAN APPROACH FOR THE EVALUATION OF THE RELIABILITY OF NON-DESTRUCTIVE TESTING METHODS: COMBINATION OF DATA FROM ARTIFICIAL DEFECTS AND REAL DEFECTS”, paper presented on the 18th World Conference on Nondestructive Testing, Durban, South Africa, April 16-20, 2012
11. Fücsök F, Müller C and Scharmach M, “Reliability of Routine Radiographic Film Evaluation – an Extended ROC Study of the Human Factor”, 8th ECNDT Barcelona, 2002, CD-ROM.
12. Fahlbruch B, “Integrating Human Factors in Safety and Reliability Approaches”, Paper presented at 4th European-American Workshop on Reliability of NDE, Berlin, June 24-26, 2009.
13. Bertovic, M., Gaal, M., Müller, C. & Fahlbruch, B. (2011). Investigating Human Factors in Manual Ultrasonic Testing: Testing the Human Factor Model. Insight, 53(12), 673-676.
14. Marija Bertovic, Babette Fahlbruch, Christina Müller, Jorma Pitkänen, Ulf Ronneteg, Mate Gaal, Daniel Kanzler, Uwe Ewert, Detlef Schombach, “Human Factors Approach to the Acquisition and Evaluation of NDT Data”, paper presented on the 18th World Conference on Nondestructive Testing, Durban, South Africa, April 16-20, 2012