WO2022156873A1

WO2022156873A1 - A computer-implemented method for training an artificial neural network, method for inspecting a component, test system for examining components, and computer program

Info

Publication number: WO2022156873A1
Application number: PCT/EP2021/025165
Authority: WO
Inventors: Joachim Bamberg; Frederik ELISCHBERGER; Katrin TAUBENBERGER; Ernst Rau
Original assignee: MTU Aero Engines AG
Priority date: 2021-01-22
Filing date: 2021-04-30
Publication date: 2022-07-28

Abstract

The invention relates to a computer-implemented method for training an artificial neural network, comprising the steps of creating the artificial neural network, producing at least one test body with at least one real microstructure defect, ultrasonically measuring the test body at least region-wise while obtaining first ultrasonic measurement data, ultrasonically measuring at least part of a preferably microstructure defect-free component while obtaining second ultrasonic measurement data, determining microstructure defect data which characterize real and/or artificial microstructure defects in the first and/or second ultrasonic measurement data, and training the artificial neural network by means of a computer system using the first ultrasonic measurement data, the second ultrasonic measurement data and the microstructure defect data, wherein the microstructure defect data are used as labels. The invention further relates to a method for inspecting a component (10), a test system (8) for examining components (10), a computer program, and a computer-readable storage medium.

Description

A computer-implemented method for training an artificial neural network, method for inspecting a component, test system for examining components, and computer program

The invention relates to a computer-implemented method for training an artificial neural network, a method for inspecting a component, a test system for examining components, a computer program, and a computer-readable storage medium.

As part of quality assurance, components such as turbine disks are inspected non-destructively for production-related defects such as pores, cracks, or inclusions. Usually, ultrasonic testing is used to detect these defects in the component volume. If the ultrasound hits such a defect, there is a characteristic signal (echo) that can be evaluated in terms of amplitude and transit time. In addition, there are always noise signals generated by the microstructure of the component during ultrasonic testing. The defect detection limit results from the signal-to-noise ratio.

However, the manufacture of components such as turbine parts with a fine-grained forged structure may produce melt defects characterized only by a localized, sharply defined, coarse-grained structure. This type of defect is called segregation (discrete white spot segregation) and can be critical for the strength and service life of turbine disks and the like. It is not detectable in conventional ultrasonic testing. Only segregations that extend to the component surface can be detected with the so-called etch test; concealed segregations are not accessible to this type of testing either. In addition, the production of artificial segregation-like defects is costly and time-consuming.

The task of the present invention is to provide a means for improved testing of components which can also be used to test for internal microstructure defects such as segregations.

This task is solved by a computer-implemented method according to claim 1 or 2 for training an artificial neural network, a method according to claim 8 for inspecting a component, a test system according to claim 9 for examining components, a computer program according to claim 10, and a computer-readable storage medium according to claim 11. Advantageous embodiments of each aspect of the invention are to be regarded as advantageous embodiments of the respective other aspects of the invention.

A first aspect of the invention relates to a computer-implemented method for training an artificial neural network. The computer-implemented method comprises the steps of creating the artificial neural network, producing or providing a multitude of test components, each with at least one respective artificial microstructure defect that is inserted at a known location in the test component, ultrasonically measuring the test components, preferably in the form of a complete Full-A-Scan of each test component, manually labeling the microstructure defects at the known locations in the respective ultrasonic measurement data sets of the test components, and training the artificial neural network by means of a computer system using the ultrasonic measurement data with the labeled microstructure defect locations. The invention is based on the approach that a segregation or similar microstructure defect generates, due to its coarse-grained structure, a hidden, characteristic noise signature with slight frequency dependence and local micro-reflections, and can thus be detected. However, the analysis of the signature cannot be performed simply by temporally and spatially discrete amplitude and delay evaluations. Instead, according to the invention, a neural network is trained with ultrasound data from one or many test bodies with one or more purposefully produced microstructure defects (segregations) in order to then detect the hidden characteristic noise signature of real microstructure defects. In general, "a/an" and "one" are to be read as indefinite articles in the context of this disclosure, i.e., in the absence of any explicit indication to the contrary, always also as "at least one". Conversely, "a/an" and "one" can also be understood as "only one".

A second aspect of the invention relates to a computer-implemented method for training an artificial neural network. The computer-implemented method comprises the steps of creating the artificial neural network, providing at least first ultrasonic measurement data characterizing a regional microstructure defect, preferably in the form of a local Full-A-Scan, providing at least second ultrasonic measurement data characterizing a preferably microstructure defect-free component, preferably in the form of a complete Full-A-Scan of the defect-free component, and training the artificial neural network by means of a computer system using the first ultrasonic measurement data and the second ultrasonic measurement data, wherein the first ultrasonic measurement data characterizing the microstructure defect are used as labels. Preferably, the first ultrasonic measurement data can be provided by producing at least one test body with at least one real microstructure defect and by ultrasonically measuring the test body at least region-wise. Likewise, the second ultrasonic measurement data can be provided by ultrasonically measuring at least part of a preferably microstructure defect-free component. Afterwards, microstructure defect data which characterize real and/or artificial microstructure defects in the first and/or second ultrasonic measurement data may be determined (manually). The artificial neural network may then be trained by means of a computer system using the first ultrasonic measurement data, the second ultrasonic measurement data and the microstructure defect data, wherein the microstructure defect data are used as labels. This second aspect of the invention is similar to the first aspect but takes a software-based approach instead of a hardware-based approach, i.e. instead of producing a physical test specimen with microstructural defects in known locations, the data of a defect-free test component or test specimen are used as a basis for producing virtual test components with microstructural defects in known locations, by inserting known local data signatures of defects into the known data signatures of defect-free parts and labeling the defects for training the artificial neural network.

In an advantageous embodiment of the invention it is provided that the second ultrasonic measurement data are modified by inserting at least a portion of the first ultrasonic measurement data, preferably by numerical integration, at least once as an artificial microstructural defect location into the second ultrasonic measurement data. In this way, one or many virtual defects can be generated in addition to the costly and time-consuming production of real segregation-like defects. Thus, a much higher defect variation is possible, with which fundamental obstacles in machine learning such as class imbalance can be reduced.

In a further advantageous embodiment of the invention it is provided that at least a part of the first ultrasonic measurement data is inserted into the second ultrasonic measurement data in such a way that artificial microstructural defect locations are characterized at at least two different positions and/or depths of the component. This can be used to insert one or many artificial defect locations at different virtual space positions or depths of the component into the data set.
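The insertion of artificial defect locations at several positions and depths can be illustrated with a minimal sketch. It assumes that a Full-A-Scan is stored as a NumPy array indexed by (x, y, t) and that insertion simply overwrites a sub-volume; the disclosure does not specify the exact insertion operation, and the function name `insert_virtual_defects` as well as the random stand-in data are hypothetical.

```python
import numpy as np


def insert_virtual_defects(clean_scan, defect_patch, corners):
    """Copy `defect_patch` into `clean_scan` at each (x, y, t) corner.

    Returns the augmented scan and a boolean label volume of the same shape.
    Overwriting the sub-volume is an assumption made for this sketch.
    """
    augmented = clean_scan.copy()
    labels = np.zeros(clean_scan.shape, dtype=bool)
    px, py, pt = defect_patch.shape
    for x0, y0, t0 in corners:
        region = (slice(x0, x0 + px), slice(y0, y0 + py), slice(t0, t0 + pt))
        if augmented[region].shape != defect_patch.shape:
            raise ValueError(f"patch at {(x0, y0, t0)} does not fit into the scan")
        augmented[region] = defect_patch
        labels[region] = True
    return augmented, labels


rng = np.random.default_rng(0)
# Stand-in for the second (defect-free) ultrasonic measurement data.
clean_scan = rng.normal(0.0, 1.0, size=(40, 40, 256))
# Stand-in for a local portion of the first (defect) measurement data.
defect_patch = rng.normal(0.0, 2.0, size=(5, 5, 32))
# Two virtual defect locations at different lateral positions and depths.
corners = [(3, 4, 50), (20, 25, 150)]
augmented, labels = insert_virtual_defects(clean_scan, defect_patch, corners)
```

The label volume marks exactly the inserted samples, so it can serve directly as the microstructure defect data used as labels during training.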

In a further advantageous embodiment of the invention it is provided that the first and/or second ultrasonic measurement data are transformed, preferably by means of a short-time FFT. This transformation yields a representation of the signal in which micro-defects appear at characteristic frequencies.
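A short-time FFT of a single A-Scan can be sketched as follows. The window length, hop size, Hann window and the 100 MHz sampling rate are illustrative assumptions and are not taken from the disclosure; the synthetic 10 MHz tone merely stands in for measured data.

```python
import numpy as np


def short_time_fft(a_scan, window_length=64, hop=16):
    """Return the magnitude spectrogram, shape (n_frames, window_length // 2 + 1)."""
    a_scan = np.asarray(a_scan, dtype=float)
    if a_scan.size < window_length:
        raise ValueError("A-scan is shorter than one window")
    window = np.hanning(window_length)
    n_frames = 1 + (a_scan.size - window_length) // hop
    # Cut the signal into overlapping, windowed frames.
    frames = np.stack([
        a_scan[i * hop:i * hop + window_length] * window for i in range(n_frames)
    ])
    # One real-input FFT per frame gives the time-frequency representation.
    return np.abs(np.fft.rfft(frames, axis=1))


sampling_rate_hz = 100e6                      # assumed receiver sampling rate
t = np.arange(1024) / sampling_rate_hz
a_scan = np.sin(2 * np.pi * 10e6 * t)         # synthetic 10 MHz tone
spectrogram = short_time_fft(a_scan)
frequencies_hz = np.fft.rfftfreq(64, d=1 / sampling_rate_hz)
peak_frequency_hz = frequencies_hz[spectrogram.mean(axis=0).argmax()]
```

Each row of `spectrogram` corresponds to one time window, that is, one depth range of the A-Scan, so frequency content can be compared across depths.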

In a further embodiment, a better signal-to-noise ratio and thus an additionally improved identification of defect locations can be achieved in that the first and/or second ultrasound measurement data are weighted and/or denoised.

An additionally improved identification of defect locations can also be achieved in that the artificial neural network is trained with a supervised learning algorithm and/or with first ultrasound measurement data obtained on a plurality of test bodies and/or with second ultrasound measurement data obtained on a plurality of components, and the respective microstructure defect data as labels.

In order to assess or improve the quality of the trained artificial neural network, it has proven advantageous to test and/or optimize the quality of the artificial neural network by means of a receiver operating characteristic (ROC) using ultrasonic measurement data. The ultrasonic measurement data can, for example, be determined on other real, defect-free or defective components.
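How a receiver operating characteristic can be computed from network outputs is sketched below. The sketch assumes that the network produces one score per measurement point and that binary ground-truth labels are available; the function name `roc_curve` and the four example values are hypothetical.

```python
import numpy as np


def roc_curve(labels, scores):
    """Return false-positive rates, true-positive rates and the area under the curve."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    if labels.all() or not labels.any():
        raise ValueError("both classes must be present")
    order = np.argsort(-scores, kind="stable")
    labels, scores = labels[order], scores[order]
    # Evaluate only where the score changes, so tied scores share one threshold.
    last_of_group = np.r_[np.where(np.diff(scores))[0], labels.size - 1]
    tps = np.cumsum(labels)[last_of_group]
    fps = np.cumsum(~labels)[last_of_group]
    tpr = np.concatenate([[0.0], tps / labels.sum()])
    fpr = np.concatenate([[0.0], fps / (~labels).sum()])
    # Trapezoidal rule for the area under the ROC curve.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    return fpr, tpr, auc


# Hypothetical labels (1 = defect) and network scores for four points.
fpr, tpr, auc = roc_curve([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An operating threshold can then be chosen from the curve, for example the point with the highest true-positive rate at an acceptable false-positive rate.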

A second aspect of the invention relates to a method for inspecting a component, in particular a component of an aircraft engine, in which the component is ultrasonically measured at least in regions while obtaining ultrasonic measurement data and is inspected for the presence of structural defects by means of an artificial neural network which is trained according to the first aspect of the invention using the ultrasonic measurement data. This allows micro-defects occurring in the volume of the component to be detected during an ultrasound examination. Further advantages can be found in the descriptions of the first aspect of the invention.

A third aspect of the invention relates to a test system for examining components, in particular components for aircraft engines, for the presence of structural defects, comprising at least one ultrasonic examination device, by means of which the component can be measured at least in regions in order to obtain ultrasonic measurement data, and a computer system which is designed to test for the presence of structural defects in the component by means of an artificial neural network which is trained according to the first aspect of the invention, using the ultrasonic measurement data. This allows micro-defects occurring in the volume of the component to be detected by means of the trained artificial neural network using an ultrasound examination via the at least one ultrasonic examination device. Further advantages can be found in the descriptions of the first aspect of the invention.

A fourth aspect of the invention relates to a computer program that includes instructions that, when the computer program is executed by a computer system, cause the computer system to perform a method according to the first aspect of the invention and/or a method according to the second aspect of the invention. For related features and advantages thereof, see the descriptions of the first aspect of the invention.

A fifth aspect of the invention relates to a computer-readable storage medium that stores a computer program according to the fourth aspect of the invention.

Further features of the invention can be gathered from the claims, the figures and the figure description. The features and combinations of features mentioned above in the description, as well as the features and combinations of features mentioned below in the figure description and/or shown alone in the figures, can be used not only in the combination indicated in each case, but also in other combinations without leaving the scope of the invention. Thus, embodiments are also to be regarded as encompassed and disclosed by the invention which are not explicitly shown and explained in the figures, but which arise from the explained embodiments and can be generated by separate combinations of features. Embodiments and combinations of features are also to be regarded as disclosed which thus do not have all the features of an originally formulated independent claim. Moreover, embodiments and combinations of features are to be regarded as disclosed, in particular by the embodiments set forth above, which go beyond or deviate from the combinations of features set forth in the recitations of the claims. The figures show:

Fig. 1 a schematic diagram of an ultrasonic examination of a turbine disk;

Fig. 2 a matrix microstructure of IN718 before heat treatment;

Fig. 3 a matrix microstructure of IN718 after heat treatment;

Fig. 4 a matrix microstructure of IN718 after HIP in its final state on the surface of a test body;

Fig. 5 a typical A-Scan measured in a pulse-echo setup using a 10 MHz focused single transducer;

Fig. 6 an artificial intelligence model for classifying grain noise signatures;

Fig. 7 an overview of the complete preprocessing pipeline for training the artificial neural network;

Fig. 8 an overview of the RCAS architecture;

Fig. 9 UT C-Scan measurements with a calibrated UT System on 0.4 mm FBH and a focussed UT probe with 10 MHz center frequency;

Fig. 10 visualization of the prediction for Surface 1 P2 with RCAS and its attention weights by taking two slices in scan and index direction;

Fig. 11 evaluation of a conventional ultrasonic inspection; and

Fig. 12 evaluation of an AI-assisted ultrasonic inspection.

Fig. 1 shows a schematic diagram of a test system 8 for examining a component 10, for example a turbine disk 10, for the presence of structural defects, comprising at least one ultrasonic examination device 12, by means of which the component 10 can be measured at least in regions in order to obtain ultrasonic measurement data, and a computer system 13 which is designed to test for the presence of structural defects in the component 10 by means of an artificial neural network, using the ultrasonic measurement data gathered by the ultrasonic examination device 12. The turbine disk 10 has an ultrasonic shape S. The turbine disk 10 includes a segregation 14 and is examined using the ultrasonic examination device 12. The ultrasonic waves generated by the ultrasonic examination device 12 have different travel times for the frontwall echo FW, the backwall echo BW, and the segregation 14. Magnification I shows an enlargement of the microstructure around the segregation 14, with the grains G and grain boundaries B that lead to grain noise. The gathered ultrasonic measurement data are then transferred to the computer system 13 for processing using the trained artificial neural network.

Fig. 6 shows the principle of use of an artificial intelligence AI for classifying grain noise signatures from ultrasonic amplitude scans (A-scans), resulting in a classification as normal grain noise NGN or defect grain noise DGN. The stochastic scattering in multiple A-scans induced by the coarser grain of a segregation can be used in this model.

Ultrasonic testing has been used in industry for many years to successfully detect internal defects in bulk material. This disclosure focuses on the inspection of material made from IN718, which is often used for the manufacturing of turbine components. An accident in 2016 involving a turbine engine failure led to the incorporation of a new type of defect into the portfolio of defect types that ultrasonic testing might be able to detect. This defect poses new challenges to conventional ultrasonic testing because its material characteristics differ strongly from those of traditional defects like cracks or voids. Its reliable detection in an industrial setup remains unsolved and requires new non-destructive techniques. To our best knowledge, this is the first disclosure that utilizes deep learning techniques in combination with conventional ultrasonic testing for this kind of defect. For the new approach presented, artificial defects with material characteristics similar to those of real ones are first defined and successfully manufactured. Secondly, a Recurrent Convolutional Neural Network with Attention and Spectral representations (RCAS) having only 93,000 parameters can be trained. In the executed experiments, RCAS proves its superior detection capability in comparison to conventional ultrasonic testing over the course of six measurements with three different types of ultrasonic probes, resulting in roughly 176,000 measurement points. Lastly, the use of an attention mechanism for detecting regions of interest within sequential ultrasonic data provides meaningful visualizations for estimating the depth of a defect.

In October 2016, the Boeing 767-323 operating American Airlines Flight 383 experienced an uncontained failure of its right engine, resulting in a fire in which more than a dozen people were injured [19]. The high-pressure turbine stage 2 disk ruptured due to a "gray subsurface material discontinuity" [19, p. 1] in the forward bore region. This discontinuity enabled low-cycle fatigue (LCF) cracks and was classified in a subsequent examination as a discrete dirty white spot segregation (DWS). The investigation by the National Transportation Safety Board (NTSB) stated that this defect was "most likely not detectable during production inspections and subsequent in-service inspections" [19, p. 1]. Thus the Federal Aviation Administration (FAA) and the NTSB issued recommendations to address this incident. In particular, recommendation A-18-4 stated the requirement of establishing a "subsurface in-service inspection technique, such as ultrasonic inspections, for [...] rotating parts for all engines" [19, p. 77].

From a materials science perspective, segregation is a broadly used term, but it can in general be understood as a local change in material properties [14]. This change can range from the atomic level up to grains or even larger scales. In this work, a segregation is mainly defined as a local change in grain size, where the matrix in which the segregation is fully embedded has a finer grain size. The material used in this disclosure is the Ni-based superalloy IN718, used for the manufacturing of gas turbine components like disks or blades.

There exists a wide variety of Non-Destructive Evaluation (NDE) techniques for the detection of different types of defects; they function as a mandatory quality control criterion heavily used in the aeronautic industry. Each NDE technique has its own strengths and weaknesses and does not, or only negligibly, change the physical condition of the inspected specimen. Besides ultrasonic testing (UT), which is the focus of this disclosure, the main methods employed in industry are visual-based methods, dye penetrant inspection, radiography, etch detection as well as eddy current inspection and thermography [41]. The detection capabilities depend on several factors that may affect the outcome, such as differences in subjective interpretation by the auditor, or the material geometry and surface roughness [42]. Each method is also fundamentally limited by its penetration depth, which can vary from surface-only up to depths of half a meter.

The subject of this invention is whether artificial defects with material characteristics similar to those of discrete clean white spot segregations (CWS) can be detected using conventional ultrasonic inspection hardware in combination with deep-learning-based techniques. Artificial defects in this setup can therefore be seen as reference defects for CWS. Typically, grain-induced noise during ultrasonic testing is considered to decrease the signal-to-noise ratio (SNR) and thereby lower the sensitivity of the whole inspection. In the case of segregations, it has been shown that, given a broadband ultrasonic probe whose wavelength at the center frequency is close to the average grain size, so that stochastic scattering occurs, the induced grain noise contains a distinct pattern that can be detected via a trained deep learning model.

Deep Learning (DL) technology has become an important part of daily life due to its deep integration into many consumer products and services, especially smartphones and internet services. In recent years, groundbreaking results have been achieved in various disciplines, including promising results in natural language understanding tasks such as topic classification, sentiment analysis and language translation [22].

We show that the detection of artificial defects is challenging for conventional UT and that effective training of a deep learning model is possible even with one such defect. Moreover, by using such models the detection capabilities can be enhanced by a significant margin compared to conventional UT, which supports the hypothesis that grain scattering from coarse grain can be distinguished consistently from grain scattering of a fine-grained microstructure. In particular, by integrating an attention mechanism into the Recurrent Convolutional Neural Network with Attention and Spectral representations (RCAS) architecture, useful information about the depth of the artificial defect can be retrieved without explicitly providing information about the depth itself.

This work is structured as follows: Section II gives a short introduction to what defines a segregation and which techniques already exist in industry to detect this type of material defect. Section III gives a short introduction to UT as an industry-wide standard NDE technique, defines the necessary formulas and discusses the current state of the art in ultrasonic scattering by means of the Figure of Merit (FOM). Section IV deals with DL and the classification of this problem into the DL domain of Sound Event Detection (SED); it also discusses the preprocessing pipeline and the attention-based RCAS architecture in detail. Section V explains the overall manufacturing procedure for the specimen containing an artificial defect with material characteristics similar to those of a segregation and how the UT measurements are used for training the DL models, followed by the training results and a short discussion of them. Lastly, section VI sums up this invention.

II. Segregations

This section describes the aforementioned defect, the segregation, as well as its other manifestations in more detail. Section II-A covers the formation during melting of the raw material; section II-B then briefly covers etch detection as the industry-wide standard for reliably and non-destructively detecting segregations.

A. The formation during melting

The metallurgical examination of the DWS in the ruptured disk of the incident revealed a grain structure similar to those found during forging [19]. This observation and further studies [13], [15] indicate that segregations are already present in the ingot. Segregations as described above are believed to form during the melting process of the ingot, in particular during vacuum arc remelting (VAR) [16], [17], [19]. According to [17] and the Gas Turbine Superalloy Committee of the Aerospace Division of ASM International, white spot segregations can be subdivided into discrete, dendritic and solidification white spot segregations [12]. They are called white spots due to their white, light appearance when etchant is applied. In the following, only discrete white spot segregations are discussed in more detail.

Discrete white spots differ from the other types mainly by having the highest depletion in alloying elements, by their location from the center to the mid-radius of the ingot and, most importantly and hence the name discrete, by the sharp and distinct interface between the white spot and the matrix [17, p. 126]. Discrete white spots can also occur in various shapes with a diameter greater than 10 mm while having an internal microstructure different from that of the surrounding matrix [17, p. 126]. There is evidence that white spots are leftovers of shelf, crown or torus material which fell into the molten pool without fully dissolving due to a lack of sufficient temperature and time, as shown in Fig. 1 [17]. These white spots are referred to as CWS. Given that the molten pool surface is coated with oxide and/or nitride debris, remnants of the electrode can become surrounded by these impurities when falling into the molten pool; such white spots are then called DWS [11]. The impurities surrounding the CWS are called stringers [19]. Discrete white spots are considered to be the most deleterious microstructural anomaly arising during VAR and can significantly reduce the yield strength and fatigue life of Alloy 718 [13], [17]. In particular, stringers can act as early initiation sites for cracks, which leads to the reduced LCF life. In contrast, CWS are considered less critical than DWS, although it is believed that their decrease in life expectancy can mainly be attributed to the sufficiently large difference in ASTM grain size number of 3 to 4 according to ASTM E112 [17].

B. Etch detection

One way of reliably detecting segregations is etch detection. Since in this method etchant is applied to the surface, only surface-connected segregations are detectable [17]. The general notion is that, depending on local differences in chemical composition, phases or grains are ablated differently, which leads to a distinct reflectivity pattern. In this way, heavily precipitated areas like grain boundaries can be made visible, whereas solute-lean areas like segregations are left as light areas [17]. Etch detection is often the last non-destructive inspection step of a turbine disk, when it is already in its final geometry. It therefore has to be ensured that only a small amount of material is removed from the surface during etching so that the part stays within the allowed specifications. For the visual inspection of an etched part, imprints of any anomalies are made for documentation purposes and for later decision-making by experts on how to classify the anomaly. Any type of critical anomaly like a DWS typically results in scrapping the complete part. Since the part has already been manufactured to its final geometry, the value added up to that point is mostly lost; moreover, the likelihood that turbine disks from the same raw material also contain segregations increases.

III. Ultrasonic Testing

This section deals in more detail with the conventional ultrasonic testing procedures that are currently employed in the industry for the detection of defects like voids or cracks. Therefore first requirements for the detection of such defects are mentioned in section III-A. This section is then followed by the technical introduction into ultrasonic testing in III-B and the theoretical approach of quantifying grain noise by the FOM in section III-C.

A. The industry standard of Ultrasonic Testing

When it comes to the detection of defects in the volume of gas turbine disks, one can use either high-energy radiography or ultrasound. The de facto standard in industry is ultrasonic testing with a single-probe transducer at frequencies from 5 MHz up to 20 MHz or even higher. Much higher frequencies would theoretically enable higher resolution and thereby the detection of smaller defects, but at the same time higher attenuation at greater depths in the material drastically reduces the resolution. Conversely, lower frequencies enable the insonification of thicker parts but, with their longer wavelength, cannot reliably detect small defects. To tackle this trade-off one can, for example, switch to multizone inspection using multiple focussed ultrasonic probes or deploy a phased array system. Often in industry the standard inspection for turbine disks utilizes a single transducer which has a broadband sender and enables the detection of reference defects such as a flat-bottom hole (FBH) with a diameter of 0.4 mm down to a depth of 50.0 mm.

Theoretically, the reflectivity of a discontinuity depends on its acoustical impedance Z in comparison to the acoustical impedance of the surrounding material. The acoustical impedance Z is defined as the product of the material density ρ and the sound velocity c of the material, i.e. $Z = \rho \cdot c$. It should also be noted that for a reliable Probability of Detection (POD) many other factors, such as defect size, orientation, surface roughness, material attenuation, scattering behaviour, calibration sensitivity and, lastly, human factors, can significantly influence the detection capabilities.
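The role of the impedance mismatch can be illustrated with the textbook pressure reflection coefficient for normal incidence on a plane interface, R = (Z2 - Z1) / (Z2 + Z1). This relation is not stated in the disclosure, and all material values in the sketch are rough illustrative assumptions rather than measured data; in particular, the 1 % velocity difference assigned to a coarse-grained region is hypothetical.

```python
def acoustic_impedance(density_kg_m3, velocity_m_s):
    """Acoustical impedance Z = rho * c in Pa*s/m."""
    return density_kg_m3 * velocity_m_s


def reflection_coefficient(z_matrix, z_discontinuity):
    """Pressure reflection coefficient at normal incidence on a plane interface."""
    return (z_discontinuity - z_matrix) / (z_discontinuity + z_matrix)


# Illustrative values only (assumptions, not taken from the disclosure).
z_matrix = acoustic_impedance(8200.0, 5800.0)   # nickel-alloy-like matrix
z_void = acoustic_impedance(1.2, 343.0)         # air-filled void
z_coarse = acoustic_impedance(8200.0, 5858.0)   # hypothetical +1 % velocity

r_void = reflection_coefficient(z_matrix, z_void)
r_coarse = reflection_coefficient(z_matrix, z_coarse)
```

Under these assumptions the void reflects almost the entire incident pressure amplitude, while the coarse-grained region reflects well below one percent, which is consistent with the statement that segregations produce no usable echo in conventional testing.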

B. Technical description

Given an ultrasonic system that scans the surface of a test specimen with a surface sampling rate $S_x = S_y = S$, which in this scenario is equal for the x and y directions, and a surface of size $X \cdot Y = A$, a total of $N_x \cdot N_y = \frac{X}{S_x} \cdot \frac{Y}{S_y} = \frac{A}{S^2}$ scan positions is generated. In all the following experiments a surface sampling rate of S = 0.2 mm/pixel was chosen. Given the unit of the sampling rate, a smaller value results in a finer surface inspection grid, whereas industrial ultrasonic inspection favors a higher throughput of inspected parts. We decided to use a relatively fine sampling rate because the resulting data are always a superset of the data generated with a coarser (numerically higher) sampling rate, provided the sampling rates are integer multiples of each other, i.e. all points on the surface covered by the coarser inspection have also been scanned with the finer sampling rate.
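The grid size and the superset property can be checked with a short sketch. The 50 mm x 30 mm surface is a hypothetical example, and the sketch assumes that the grid holds X/S by Y/S positions, as in the count of scan positions given in the paragraph.

```python
import numpy as np


def scan_grid_shape(x_mm, y_mm, s_mm_per_pixel):
    """Number of scan positions per axis for a surface sampled at s mm/pixel."""
    n_x = int(round(x_mm / s_mm_per_pixel))
    n_y = int(round(y_mm / s_mm_per_pixel))
    return n_x, n_y


# Hypothetical 50 mm x 30 mm surface at the 0.2 mm/pixel used in the text.
n_x, n_y = scan_grid_shape(50.0, 30.0, 0.2)

# Give every fine-grid position a unique index, then keep every second one.
fine = np.arange(n_x * n_y).reshape(n_x, n_y)
coarse = fine[::2, ::2]   # equivalent to scanning at 0.4 mm/pixel
```

Because 0.4 mm/pixel is an integer multiple of 0.2 mm/pixel, every position of the coarse grid is already contained in the fine grid, so coarser inspections can be emulated by subsampling.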

For each position (x, y) on the surface one time signal, called an A-Scan, is generated. Depending on the depth of the specimen d in mm, its sound velocity c in mm/ns and the sampling rate of the ultrasonic receiver $S_z$ in Hz, the number of individual amplitudes received in a pulse-echo setup can be defined by $N_z = \frac{2 \cdot d}{c} \cdot S_z$, with c and $S_z$ expressed in consistent time units. Therefore the inspection of a surface results in $N_x \cdot N_y$ A-Scans, which together form a Full-A-Scan. An A-Scan at position (x, y) describes a number of amplitudes $A_{x,y,t}$ received at time t. Note that time and depth are in a direct relation, assuming a constant sound velocity over the depth. A calibrated ultrasonic system is able to resolve reference defects of the same size at different depths with the same amplitude. One typical A-Scan with selected regions, measured in a pulse-echo setup using a 10 MHz focussed single transducer, is shown in Fig. 5. Region A is the offset region before the FW. Region B is the region of the FW, whereas region C is called the dead zone, being a non-inspectable depth. This phenomenon is caused by the ultrasonic equipment and the reverberations of the FW. The next region D is the inspectable region and therefore the ROI for the ultrasonic inspection. The last region E defines the physical end of the specimen and the start of the BW. Everything shortly after the BW is the back reflection of the BW itself.
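The relation between specimen depth, sound velocity and the number of samples per A-Scan can be made concrete as follows. The sketch converts nanoseconds to seconds explicitly, since the velocity is given in mm/ns and the sampling rate in Hz; the numerical values (50 mm depth, 0.0058 mm/ns, 100 MHz) are illustrative assumptions.

```python
def samples_per_a_scan(depth_mm, velocity_mm_per_ns, sampling_rate_hz):
    """Number of samples covering the round trip through the specimen."""
    round_trip_ns = 2.0 * depth_mm / velocity_mm_per_ns
    # Convert ns to s before multiplying by a rate given in Hz.
    return int(round(round_trip_ns * 1e-9 * sampling_rate_hz))


def depth_of_sample(sample_index, velocity_mm_per_ns, sampling_rate_hz):
    """Depth in mm for a sample index, assuming constant sound velocity."""
    time_ns = sample_index / sampling_rate_hz * 1e9
    # Half of the travelled distance, because the pulse goes down and back.
    return 0.5 * velocity_mm_per_ns * time_ns


# Illustrative values only (assumptions, not taken from the disclosure).
n_z = samples_per_a_scan(50.0, 0.0058, 100e6)
back_wall_depth_mm = depth_of_sample(n_z, 0.0058, 100e6)
```

The second function expresses the direct relation between time and depth that holds for a constant sound velocity.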

An A-Scan region

$$R_{x,y} = \{ f(A_{x,y,t}) \mid t_S \le t \le t_E \}$$

defines a set of amplitudes $A_{x,y,t}$ that are within a certain time frame $[t_S, t_E]$ after applying an arbitrary function f to them. In addition, amplitudes are stored in a high frequency representation having positive and negative fluctuations as seen in Fig. 5. Depending on the digitization rate B (the number of bits per sample), the zero amplitude can be set at $2^{B-1}$. Since for the detection of defects in conventional UT it does not matter whether complex signal patterns are present in the signal but only that a high amplitude exists, the absolute difference to the zero amplitude can be used for inspection purposes. By the aforementioned depth calibration it can be ensured that e.g. cracks oriented perpendicular to the incident direction lead to the same maximal absolute amplitude at all depths the system was calibrated on. Given a Full-A-Scan it can be sufficient for the conventional ultrasonic inspection to first define the ROI for all A-Scans and then take the maximal absolute amplitude

at each position. This representation, in which each amplitude can be associated with a color, similar to a heat map, is called C-Scan and can be defined by

$$C_{x,y} = \max_{t_S \le t \le t_E} \left| A_{x,y,t} - 2^{B-1} \right|$$

An A-Scan can then be represented by the single value $C_{x,y}$, which results in an image for a Full-A-Scan. This C-Scan representation can be understood as the inspection result and conveys information about lateral maximal amplitude distributions while not considering the depth. Whether an ultrasonic indication at position (x, y) is critical depends on $C_{x,y}$ and the set threshold $T_{critical}$. If any $C_{x,y}$ within the calibrated depth is greater than or equal to $T_{critical}$, a sufficiently large acoustical impedance mismatch seems to be present in the material to be critical for its lifetime. The so-called Time of Flight (TOF) representation is similar to the C-Scan but computes the depth of the maximal absolute amplitude

$$TOF_{x,y} = \underset{t_S \le t \le t_E}{\arg\max} \left| A_{x,y,t} - 2^{B-1} \right|$$

The TOF visualization can be used to highlight reflectors that would not be detectable with a threshold, e.g. when the maximal absolute amplitude is only minimally higher than the noise but occurs at the same depth. Another application could be the tracking of the FW to evaluate whether the scan axes are plane-parallel to the surface of the specimen.

Another figure used for estimating the existence of a defect can be the SNR. Unfortunately there are numerous definitions of the SNR, but most often they share the idea of taking the ratio of the signal $P_s$ and the peak noise amplitude $P_n$ of some noise region within the C-Scan [4]. One commonly used formulation is

$$SNR = \frac{P_s - \bar{P}_n}{P_n - \bar{P}_n}$$

where $\bar{P}_n$ denotes the mean noise amplitude. It enables a more robust ratio of the defect signal to the maximum noise by subtracting the mean noise on which all signals are superimposed. Thus the maximum absolute amplitude can be far below the defect threshold yet result in an unacceptable SNR, which can be another indication for a defect.
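The conventional evaluation described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the evaluation software of the disclosure: the function names, the array layout (N_x, N_y, N_z), the synthetic data and the choice of the noise region (every position outside a hand-picked signal mask) are all hypothetical, and the SNR uses the mean-subtracted peak ratio, which is one common formulation among several.

```python
import numpy as np


def c_scan_and_tof(full_a_scan, bits, t_start, t_end):
    """Return (C-Scan, TOF) for the gate [t_start, t_end) of a Full-A-Scan."""
    zero = 2 ** (bits - 1)  # zero amplitude of the digitized signal
    gated = np.abs(full_a_scan[:, :, t_start:t_end].astype(np.float64) - zero)
    c_scan = gated.max(axis=2)              # maximal absolute amplitude
    tof = gated.argmax(axis=2) + t_start    # time index of that amplitude
    return c_scan, tof


def snr(c_scan, signal_mask):
    """(peak signal - mean noise) / (peak noise - mean noise) on a C-Scan."""
    noise = c_scan[~signal_mask]
    peak_signal = c_scan[signal_mask].max()
    return (peak_signal - noise.mean()) / (noise.max() - noise.mean())


# Synthetic 8-bit example: weak noise around the zero level 128, one reflector.
rng = np.random.default_rng(0)
scan = 128.0 + rng.normal(0.0, 3.0, size=(8, 8, 200))
scan[3, 4, 90] = 250.0
c, tof = c_scan_and_tof(scan, bits=8, t_start=50, t_end=150)
mask = np.zeros((8, 8), dtype=bool)
mask[3, 4] = True
print(c[3, 4], tof[3, 4], snr(c, mask))
```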

C. Ultrasonic Scattering

To ensure the safety of an engine during its service life, in an environment of high competition and increasing requirements on each part of the engine, NDE and the enhancement of its capabilities to accurately detect smaller defects as well as other types of defects will become more important in the future. As described earlier, the DWS leading to the catastrophic failure was not detectable during production inspection and to our knowledge would also not be reliably detectable today. The impedance mismatch that is fundamentally required for detection by ultrasonic inspection does not exist in segregations that are fully bonded with the host material. Superalloys like IN718 used in modern aero engine turbine disks are typically polycrystalline materials with a heterogeneous internal grain morphology. An ultrasonic wave propagating in such a medium experiences random changes in amplitude and phase, which can be referred to as grain noise. In normal inspection circumstances the inspection is fundamentally limited by the attenuation of the inspected material, because with increasing attenuation and noise the SNR decreases and thus limits the detection capabilities. The attenuation is composed of the diffraction due to the beam spreading of the ultrasonic probe, the internal absorption due to friction and the scattering, where scattering can be the most dominant part of the attenuation [18]. The magnitude of scattering mainly depends on the ratio of the wavenumber k and the scatterer diameter D, and it has been shown that it can be subdivided into three scattering regimes: Rayleigh scattering when $kD \ll 1$, stochastic scattering when $kD \approx 1$ and geometric scattering when $kD \gg 1$ [5], [8]-[10]. The scattering event happening at grain boundaries is due to the difference in phase velocity from one microstructural grain to another [18]. This noise remains coherent over time, meaning that it is fully determined by the grain morphology and can be exactly reproduced when repeatedly measuring at the same position.

The modeling and theory of grain noise in ultrasonic inspection has been studied in depth over the course of 80 years and is still considered an active research area today [18]. Major improvements and the foundation of the Independent Scattering Model (ISM) were established by the Engine Titanium Consortium (ETC) [21]. One contribution is the FOM

$$FOM = \sqrt{n} \cdot A_{rms}$$

which defines a measure that quantifies the noise severity of a sample without dependency on the inspection hardware, with the unit $cm^{-1/2}$, where n is the density of grains and $A_{rms}$ is the average single-grain backscatter amplitude [21]. This measure is directly proportional to the normalized root mean squared (RMS) grain noise $N_{rms}$ as shown by equation 6 and can be derived directly from an A-Scan. To do so, one first has to calculate the instrumentation background noise

$$b_t = \frac{1}{N_x N_y} \sum_{x=1}^{N_x} \sum_{y=1}^{N_y} A_{x,y,t}$$

which is the constant amplitude at time t induced by the inspection hardware. Next the RMS grain noise is

$$\tilde{N}_{rms}(t) = \sqrt{\frac{1}{N_x N_y} \sum_{x=1}^{N_x} \sum_{y=1}^{N_y} \left( A_{x,y,t} - b_t \right)^2}$$

so that only the amplitudes outside of the static instrumentation noise are taken into account for the noise severity. In the last step a reference signal $E_{max}$, most often obtained from the FW, can be estimated, which leads to the ratio

$$N_{rms}(t) = \frac{\tilde{N}_{rms}(t)}{E_{max}}$$
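The three steps (background noise, RMS grain noise, normalization by a reference signal) can be sketched as follows. This is a minimal sketch that assumes the background noise is the mean over all scan positions at each time sample and that the RMS is taken over positions as well; the function name and the synthetic checkerboard data are purely illustrative.

```python
import numpy as np


def normalized_grain_noise(full_a_scan, e_max):
    """Return (b_t, rms_t, normalized rms_t) for a Full-A-Scan (N_x, N_y, N_z)."""
    # Background: amplitude that is constant over all positions at time t.
    b_t = full_a_scan.mean(axis=(0, 1))
    # RMS of what remains after removing the static background.
    rms_t = np.sqrt(((full_a_scan - b_t) ** 2).mean(axis=(0, 1)))
    return b_t, rms_t, rms_t / e_max


# Synthetic check: a static offset plus a +/-2 checkerboard "grain noise".
offset = np.linspace(0.0, 5.0, 50)
checker = np.indices((4, 4)).sum(axis=0) % 2 * 2 - 1   # values -1 or +1
full = offset + 2.0 * checker[:, :, None]               # shape (4, 4, 50)
b_t, rms_t, n_rms = normalized_grain_noise(full, e_max=100.0)
print(rms_t[:3], n_rms[:3])
```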

IV. Deep Learning Based Approach

In the following section IV-A reasons are given why the detection of segregations in ultrasonic A-Scans can be seen as a subfield of SED, an already existing field in DL. Next, in IV-B the complete preprocessing pipeline is elaborated on. Similarly to other SED input pipelines, the A-Scan, seen as a time series, can be transformed into its spectral representation with the necessary normalization and standardization steps by using the Short-time Fourier Transform (STFT). The last section IV-C introduces the RCAS architecture as a combination of a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN) and the Bahdanau attention mechanism [32].

A. Similarities to SED

Despite major advances towards the quantification of grain noise using the FOM, it certainly has its limitations. First of all it assumes that scattering events are created by single scattering processes, whereas multiple scattering events are neglected [21]. Moreover the FOM computation squeezes a Full-A-Scan into a one-dimensional plot that defines the noise generating severity at different depths for a selection of gate windows. Depending on the frequency spectrum of the used ultrasonic probe in relation to the grain diameters, problems occur due to the single scattering assumption [7].

Assuming that the inspected specimen contains a CWS, the FOM of the specimen will presumably not be able to capture the small change towards a coarser grain size in the CWS, because one can assume that the volume of the defect in comparison to the overall volume of the material is negligible. Also, the FOM describes the magnitude of the noise generating capacity and is not intended to be used as a detection methodology.

When the ultrasonic wave passes a CWS, assuming a mean grain diameter of 50 microns and a probe center frequency of 10 MHz in Inconel 718 with an approximate wavelength $\lambda$ = 0.5 mm, the scattering will be in the stochastic scattering regime ($kD \approx 0.6$). Given that the matrix grain size can be 10 microns or smaller, the scattering effects there will be in the Rayleigh scattering regime ($kD \approx 0.12$). The scattering behaviour is therefore highly dependent on the frequency and is created by a complex interaction of multiple scattering events that can interfere constructively and destructively. We assume that the coarse grain region, having a specific grain size distribution, interacts with an ultrasonic wave in a distinct pattern that can be detected with a sophisticated logic. This hypothetical noise pattern is in the following referred to as grain signature.
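The two kD values quoted above follow from the wavenumber k = 2π/λ. A short worked check, using only the wavelength and grain diameters stated in the text (the function name is illustrative):

```python
import math


def k_times_d(wavelength_mm, grain_diameter_mm):
    """Product of the wavenumber k = 2*pi/lambda and the scatterer diameter D."""
    return 2.0 * math.pi / wavelength_mm * grain_diameter_mm


kd_seed = k_times_d(0.5, 0.050)    # coarse grains, 50 microns
kd_matrix = k_times_d(0.5, 0.010)  # matrix grains, 10 microns
print(round(kd_seed, 2), round(kd_matrix, 2))
```

The coarse grains sit roughly a factor of five closer to kD = 1 than the matrix grains, which is the basis for expecting a different scattering behaviour inside the seed.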

More generally an A-Scan can be understood as a time series similar to an audio sequence, only at much higher frequencies. A well-known challenge in DL is the detection of sound events within an audio signal, which is commonly referred to as Sound Event Classification (SEC) or sound event tagging [6]. In contrast, SED predicts not only the existence of a sound event in general but also its exact start and stop time. Yet by subdividing the time series into equal-length chunks and tagging sound events in each chunk, SEC systems can be used to solve similar problems as SED systems. The temporal resolution of the SEC system is thus limited by the length of each chunk.

For example, if one wants to detect whether the word segregation has been said within an audio signal of a certain duration, it is intuitively easy to comprehend that this cannot be solved with a maximal threshold criterion. Moreover the discrete amplitudes of this sound event look vastly different depending on the person saying the word. Analogously, the sound event of the multiple scattering events of a segregation to our knowledge cannot yet be hard modeled mathematically given a sequence of amplitudes, but has to be soft modeled by approximation with a DL model.

Formally the model F is given some form of transformation of an A-Scan

$$AS_{x,y} \in \mathbb{R}^{M \times T}$$

where T is the time depth and M the feature depth, and outputs

$$F(AS_{x,y}) \in \mathbb{R}^{C}$$

where C is the number of classes, which in this case are either defect or no defect. It is assumed that the time frame of the defect within $AS_{x,y}$, if present, is not known directly and thus not provided to the model F during training.

B. Preprocessing pipeline

For the transformation, first the background noise $b_t$ for all time positions can be computed and subtracted from all A-Scans $A_{x,y}$ as shown by step two in Fig. 7.

Fig. 7 gives an overview of the complete preprocessing pipeline. Data representations are shown by boxes with their corresponding dimensions in brackets. Nx+, Ny+, Nz+ define the dimensions before applying operations that lead to the final dimensions Nx, Ny, Nz. Similarly, M+ and T+ define the dimensions of the STFT before padding and filtering out frequencies outside of two to 25 MHz.

By that, any amplitude which is constant over all positions will not contribute to the grain noise and can therefore be omitted. Next the mean grain noise locally around the position (x, y) can be approximated with a mean convolutional kernel K

$$\bar{A}_{x,y,t} = \sum_{i=1}^{K_N} \sum_{j=1}^{K_M} K_{i,j} \cdot A_{x+i-m,\, y+j-m,\, t} \qquad (7)$$

where

$$K_{i,j} = \frac{1}{K_N \cdot K_M}$$

as shown by step four. This computation step basically corresponds to a 7x7 depthwise convolution where each weight is fixed and cannot be trained. The kernel size will always be odd with $K_N = K_M$ so that a middle position

$$m = \frac{K_N + 1}{2}$$

can be determined, and the sum over all kernel weights will be one, resulting in a constant average amplitude. The A-Scan $\overline{AS}_{x,y}$ after applying the kernel K as described with equation 7 results in a new Full-A-Scan representation.
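A fixed mean kernel of this kind can be written as a sliding-window average over the two surface axes, applied identically at every time sample. The following is a sketch assuming the Full-A-Scan is stored as an array of shape (N_x, N_y, N_z) and that border positions where the kernel does not fit are dropped; the function name is hypothetical.

```python
import numpy as np


def mean_kernel(full_a_scan, k=7):
    """Average each amplitude over a k x k surface neighbourhood (valid region only)."""
    if k % 2 == 0:
        raise ValueError("kernel size must be odd so that a middle position exists")
    # Windows over axes 0 and 1; result shape (N_x-k+1, N_y-k+1, N_z, k, k).
    windows = np.lib.stride_tricks.sliding_window_view(
        full_a_scan, (k, k), axis=(0, 1)
    )
    # Equal weights 1/(k*k) summing to one, i.e. a plain mean over the window.
    return windows.mean(axis=(-2, -1))


rng = np.random.default_rng(1)
data = rng.normal(size=(10, 12, 5))
smoothed = mean_kernel(data, k=7)
print(data.shape, "->", smoothed.shape)
```

Both surface dimensions shrink by k - 1, here from (10, 12) to (4, 6), while the time axis is left untouched.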

Note that $N_x$ and $N_y$ will be reduced by $K_N - 1 = K_M - 1$ because all positions at the border of the Full-A-Scan where the kernel would extend beyond the surface are ignored. Also note that the kernel is constant over time t. Although the scattering reflections of a CWS are not globally coherent over the surface of the Full-A-Scan, it can be assumed that they are coherent over a small fraction of the surface, i.e. over the surface of the defect at time t. The real size of the kernel depends on the sampling rates $S_x$ and $S_y$ and was for the following experiments chosen to be 1.4 mm * 1.4 mm, resulting in $K_N = K_M = 7$. For normalizing the data after applying the mean kernel, first the maximal absolute amplitude $C_{x,y}$ can be calculated and then transformed by

with

In order for the kernel to work as intended it has to be ensured that neighbouring amplitudes in fact originate from the same depth position within the volume of the specimen. The idea is to detect the FW position for each A-Scan and then shift all amplitudes in time so that the position of the FW is at a predefined reference position in time as shown by step one in Fig. 3. Because a high amplification is necessary for the grain noise to be present in the A-Scan, a second measurement with much lower amplification can be conducted so that the FW is not over-amplified and thus its position can be detected fairly accurately. The kernel operation as described above can be applied after this shifting operation. Due to surface imperfections and turning grooves from machining, the FW signal typically jitters. This jitter can disturb the accurate determination of the position when only taking the first maximal amplitude according to the TOF computation described by eq. 2.

First we define the approximate FW position $t_{FW}$ and a window size w in which the FW position can shift over the inspection surface

where I is the identity function I(x) = x. The next step can be to define the time positions of the maximal left-most $t^{lm}_{max}$ and maximal right-most $t^{rm}_{max}$ amplitude above a threshold amplitude $A_{max}$ as well as of the minimal left-most $t^{lm}_{min}$ and minimal right-most $t^{rm}_{min}$ amplitude below a threshold amplitude $A_{min}$. The estimated zero-crossing position of the FW can be defined by the average of all four time positions

$$t_{zero} = \frac{t^{lm}_{max} + t^{rm}_{max} + t^{lm}_{min} + t^{rm}_{min}}{4}$$

For each position (x, y) the positional difference of $t_{zero}$ to a reference position $t_{ref}$ can be computed and each A-Scan can be shifted in time according to this difference, so that we can assume that for each A-Scan the first grain noise amplitude starts at $t_{ref} + 1$. Theoretically the last grain noise amplitude will be at position $t_{zero} + N_z$, although we assume a dead zone $W_{FW}$ after the FW and $W_{BW}$ before the BW, which defines the maximal usable number of grain noise amplitudes

$$N_z - W_{FW} - W_{BW}$$

as shown by step four in Fig. 3. Usually $W_{BW}$ can be assumed to be much smaller than $W_{FW}$ and was assumed to be zero as shown in 2 with region D. Note that if $t_{zero}$ is computed to be before $t_{ref}$, thus being in zone A, the amplitudes that are missing after the right shift are filled with the zero amplitude $2^{B-1}$. This is unproblematic because later only the region D will be used for detection purposes. Also, during ultrasonic inspection it can be ensured that zone A is sufficiently large so that a left shift of the FW is possible.
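The front-wall alignment can be sketched as below. The interpretation is an assumption: inside a search window around the approximate FW position, the first and last samples at or above the upper threshold and the first and last samples at or below the lower threshold are located, and their mean is taken as the zero crossing. Function names, thresholds and the synthetic 8-bit signal are illustrative only.

```python
import numpy as np


def fw_zero_crossing(a_scan, t_fw, w, a_max, a_min):
    """Estimate the FW zero crossing inside the window [t_fw - w, t_fw + w]."""
    lo = max(t_fw - w, 0)
    window = a_scan[lo:t_fw + w + 1]
    above = np.flatnonzero(window >= a_max)  # positive lobe of the echo
    below = np.flatnonzero(window <= a_min)  # negative lobe of the echo
    if above.size == 0 or below.size == 0:
        raise ValueError("no front-wall echo found in the window")
    positions = [above[0], above[-1], below[0], below[-1]]
    return lo + int(round(float(np.mean(positions))))


def shift_to_reference(a_scan, t_zero, t_ref, fill):
    """Shift the A-Scan so that t_zero lands on t_ref; vacated samples get `fill`."""
    shift = t_ref - t_zero
    out = np.full_like(a_scan, fill)
    if shift >= 0:
        out[shift:] = a_scan[:a_scan.size - shift]   # right shift
    else:
        out[:shift] = a_scan[-shift:]                # left shift
    return out


# Synthetic 8-bit A-Scan with zero amplitude 128 and an echo crossing at t = 42.
a_scan = np.full(100, 128, dtype=np.int64)
a_scan[40:42] = 250
a_scan[43:45] = 10
t_zero = fw_zero_crossing(a_scan, t_fw=45, w=10, a_max=200, a_min=50)
aligned = shift_to_reference(a_scan, t_zero, t_ref=30, fill=128)
print(t_zero, aligned[28:33])
```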

The next step can be the acoustic feature extraction as shown by step five in Fig. 3 to transform the low level representation of an A-Scan into a higher level representation in the frequency domain. In this disclosure the feature extraction can be achieved by four main stages: frame blocking, windowing, frequency spectrum calculation and spectral amplitude normalization [6]. To obtain the frequency spectrum the STFT can be used by first dividing the A-Scan into short time frames, then applying a window function and lastly computing the Fast Fourier Transform (FFT) over each time frame. The spectrum of one A-Scan is the concatenation of all short-time Fourier transformed time frames. Due to the natural attenuation of the grain noise, its spectral amplitudes also become weaker over time. Since the model F should be able to detect defect grain noise at any depth, which is of course limited by the physical requirements of the inspection system, the spectral amplitude of grain noise at greater depths should on average be at the same level as at smaller depths. This can be achieved by computing the STFT of all A-Scans in the Full-A-Scan, then taking the median of the spectral amplitudes over all positions for each point in time separately and finally dividing each A-Scan spectrum by this median spectrum as shown by step six. To set the upper bound of the maximal spectral amplitude to one, each A-Scan spectrum can be divided by its maximal spectral amplitude. Given an STFT window size $W_{stft}$ and an overlap of $w_{ol}$ the number of time frames

with the embedding depth of

Since a certain frequency bandwidth is already filtered out by the ultrasonic receiver, the same frequencies are filtered out of the STFT as well. Here it can be assumed that for ultrasonic probes with a center frequency of 10 MHz there are no significant frequencies in the ultrasonic beam below 2 MHz or above 15 MHz. As shown in step seven in Fig. 3 the final number of time frames T can be reached by padding the missing frequency spectra with zeros. Therefore each A-Scan will be represented by an M*T spectrum independent of its actual number of amplitudes.
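The feature extraction steps can be sketched with plain NumPy as follows. This is a minimal sketch, not the pipeline of the disclosure: the Hann window, the window size of 64 samples, the overlap of 32 samples, the padded length and the function name are assumptions chosen for illustration, and the median is taken per frequency bin and time frame over all scan positions.

```python
import numpy as np


def spectral_features(full_a_scan, fs_hz, w_stft, w_ol, f_lo_hz, f_hi_hz, t_pad):
    """STFT magnitudes of shape (N_x, N_y, M, t_pad) from a Full-A-Scan (N_x, N_y, N_z)."""
    hop = w_stft - w_ol
    n_frames = 1 + (full_a_scan.shape[-1] - w_stft) // hop
    # Frame blocking: sample indices of every short time frame, shape (T, W).
    idx = np.arange(w_stft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = full_a_scan[..., idx] * np.hanning(w_stft)   # windowing
    spec = np.abs(np.fft.rfft(frames, axis=-1))           # spectrum per frame
    spec = np.swapaxes(spec, -1, -2)                      # (N_x, N_y, M+, T+)
    # Compensate attenuation: divide by the median over all scan positions.
    median = np.median(spec, axis=(0, 1), keepdims=True)
    spec = spec / (median + 1e-12)
    # Bound each A-Scan spectrum by one.
    spec = spec / spec.max(axis=(-2, -1), keepdims=True)
    # Keep only the frequency band passed by the receiver.
    freqs = np.fft.rfftfreq(w_stft, d=1.0 / fs_hz)
    spec = spec[:, :, (freqs >= f_lo_hz) & (freqs <= f_hi_hz), :]
    # Right-pad the time axis with zeros to the common length t_pad.
    pad = t_pad - spec.shape[-1]
    return np.pad(spec, ((0, 0), (0, 0), (0, 0), (0, pad)))


rng = np.random.default_rng(2)
data = rng.normal(size=(3, 3, 300))
features = spectral_features(data, 100e6, 64, 32, 2e6, 15e6, t_pad=12)
print(features.shape)
```

With a 100 MHz sampling rate and 64-sample frames the bins are 1.5625 MHz apart, so eight bins fall into the 2 MHz to 15 MHz band in this example.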

C. RCAS Architecture

A multitude of DL architectures have been used in the past for solving general audio tagging related problems. The key architectural concept and current state of the art (SOTA) for the audio pattern recognition task AudioSet [24] are CNNs [23]-[25]. A combination of CNN and RNN, often simply referred to as Recurrent-Convolutional Neural Network (RCNN) [6], [26], can be found frequently, as well as attention-based architectures [27]-[29]. This attention computation can be understood as the importance of data instances in a bag of instances for the class prediction of the complete bag. This type of attention is vastly different from the attention mechanism used in Neural Machine Translation (NMT) [30]-[32]. The latter type of time specific attention, as well as the self-attention used in Transformer architectures [33], has not yet been used as frequently in the research of audio tagging related problems as in NMT. Especially for the scenario of audio tagging with two classes, the additional visualization capabilities of the attention weights make these architectures suitable for this detection problem.

There are several limitations to this problem that make the usage of convolutions more complicated in comparison to sequential models like RNN and Transformer architectures. The major obstacle is the high variance in usable information over the time axis. The number of valid positions is directly proportional to the inspected depth of the specimen. For example a turbine disc can have ten or more inspection surfaces that all vary in depth, even over one surface, from depths of 20 mm up to 50 mm or even higher. The architecture should be able to handle sequential data where a large portion can be padded with zeros. Another limitation is the requirement of depth or time related information, i.e. knowing at which position in time or in depth the defect occurred.

Convolutions are primarily suitable for image-like data that enable the learning of positional feature maps and do not directly work on sequential data. Given these limitations we chose to focus on an architecture called RCAS that combines an RNN with attention and a residually connected CNN, as shown in Fig. 8.

Fig. 8 shows a full overview of the architecture called RCAS as a combination of a small CNN for learning more abstract spectral representations, residually connected with layer normalization [36], a one-layer one-directional RNN with GRU [34] and a masked Bahdanau [32] attention module. Boxes [1,M,T] symbolize data, boxes 2, 4, 6 trainable neural network structures and box 3 the residual connection and normalization.

The input representation $AS_{x,y} \in \mathbb{R}^{1 \times M \times T}$ shown at step one is equal to the last spectral representation of the input pipeline as shown in Fig. 3. For illustration purposes the inference is only visualized for one data point, which is why there is always a leading one in the dimensions. The residual connection [35] is loosely based on the residual building blocks in the Transformer architecture [33]. Therefore the output after step three is

$$\mathrm{LayerNorm}\left( AS_{x,y} + \mathrm{CNN}(AS_{x,y}) \right)$$

which enables the skip of the CNN at step two during training and inference. This building block can create a more abstract spectral representation than the normalized STFT representation after the input pipeline, which might be more suitable for the correct defect prediction. Next the data can be fed into a one-layer RNN using GRU which aligns the positions to computation steps and thus generates a sequence of hidden states $h_t$ that take as input the previous hidden state $h_{t-1}$ and the input for position t. The embedding for each time step is the frequency spectrum, always consisting of M amplitudes, and as mentioned earlier depends on the window size $W_{stft}$ of the STFT operation. The embedding itself is only learnable via the CNN, in comparison to e.g. fully learnable word embeddings in NMT. The output $(h_1, \dots, h_T)$ of the RNN has the same dimensions as the input, shown with the bottom data representation at step five. The top box at step five represents the last hidden state $h_T$.

Now, for a defect A-Scan $AS_{x,y}$ a small time window can be ultimately responsible for classifying the complete time signal into one class. To not only enable the model to focus on this time frame in particular but also later visualize the learned attention weights $a_t$, the decision was made to use the Bahdanau attention mechanism [32]. Therefore an additional so-called alignment model a, implemented as a feed forward neural network, can be jointly trained with the RNN and CNN for the binary classification problem as shown by equation 9.

$$e_t = a(h_T, h_t) = v_a^{\top} \tanh\left( W_a h_T + U_a h_t \right) \qquad (9)$$

where $v_a$, $W_a$ and $U_a$ are the weight matrices of the feed forward neural network that are being learned. Lastly the context vector $c$ can be calculated by

$$c = \sum_{t=1}^{T} a_t h_t \qquad (10)$$

where $a_t$ is the attention weight at time t and can be computed by

$$a_t = \frac{\exp(e_t + m_t)}{\sum_{k=1}^{T} \exp(e_k + m_k)} \qquad (11)$$

When computing the attention weights according to equation 11 the mask $m_t$ is set to $-\infty$ if the spectral frame of $AS_{x,y}$ at time t is equal to zero, i.e. has been padded with zeros to a length of T, or is set to zero if this frame is unequal to zero, i.e. there actually is valid data in the spectral representation. This way the attention weight $a_t$ becomes zero at the padded positions, so that these positions do not contribute to the context vector. The architecture itself is an encoder-only model since the output for each A-Scan should be a class label and not a sequence.
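The effect of the mask can be illustrated with a small NumPy sketch of additive (Bahdanau-style) attention. It assumes that the last hidden state serves as the query and the hidden states of all time steps as keys and values; the weights are random placeholders rather than trained parameters, and all names and sizes are hypothetical.

```python
import numpy as np


def masked_additive_attention(h, h_last, valid, w_a, u_a, v_a):
    """Return (context vector, attention weights).

    h: (T, H) hidden states, h_last: (H,) query, valid: (T,) boolean mask
    that is False at zero-padded time steps.
    """
    scores = np.tanh(h_last @ w_a + h @ u_a) @ v_a   # alignment scores, shape (T,)
    scores = np.where(valid, scores, -np.inf)        # mask: -inf at padded steps
    scores = scores - scores[valid].max()            # numerically stable softmax
    weights = np.exp(scores)                         # exp(-inf) is exactly 0
    weights = weights / weights.sum()
    return weights @ h, weights


rng = np.random.default_rng(3)
T, H, A = 6, 4, 3
valid = np.array([True, True, True, True, False, False])
h = rng.normal(size=(T, H))
w_a, u_a, v_a = rng.normal(size=(H, A)), rng.normal(size=(H, A)), rng.normal(size=A)
context, weights = masked_additive_attention(h, h[3], valid, w_a, u_a, v_a)
print(weights.round(3))
```

The weights of the valid positions can also be plotted over time to see which part of the A-Scan the model attended to.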

Lastly a fully connected layer, again a feed forward neural network, can be used to squeeze the context vector to a vector of size C, where C is the number of classes, corresponding to the probabilities of $AS_{x,y}$ belonging to each class.

V. Experimental results

In this section the overall experimental setup for acquiring the measurement data, the training of two DL architectures and their comparison to conventional ultrasonic thresholding are described. Section V-A starts with the manufacturing of a specimen containing fully embedded seeds as well as the ultrasonic setup. This is followed by V-B with the supervised learning setup, the problem of class imbalance, concrete countermeasures like the focal loss, and the steps taken to enlarge the overall dataset by introducing depth variation. Lastly, in V-C the results are briefly discussed and the performance of both DL models is compared to the conventional ultrasonic testing methodology.

A. Experimental Setup

For the generation of sufficiently many data points for the training of the aforementioned model it was necessary to produce test samples with artificial defects similar to CWS. We focused on the production of CWS and completely omitted DWS in this disclosure because of the negligible effects of very thin oxide clusters on the ultrasonic pulse. Moreover it was also assumed that compositional differences of CWS and DWS to the IN718 matrix do not contribute to a sufficiently large change in acoustical impedance. All following explanations and ideas are influenced by the report of the ETC Phase II and its described production procedure for synthetic inclusion samples [20].

The main focus here was at first to generate the appropriate average grain size in the seed inclusions. Therefore a small volume of fine-grain IN718 with an average grain size of 8 µm or ASTM 11 has been heat treated (annealed) to coarsen the average grain size to 27 µm or ASTM 7.5, as can be seen in the microstructures of Fig. 2, Fig. 3, and Fig. 4.

The final geometry of the inserted seeds was a cylinder with 5 mm height and 4 mm diameter. In total three types of seeds have been manufactured and inserted into the host material. The first consists of the same material as the host material to control and assess the manufacturing procedure and later enable the system to selectively ignore any UT effects that may occur. The second seed is a section of the volume that has been heat treated, similar to that shown in Fig. 5. Lastly, additively manufactured material was inserted as well but was later omitted completely due to its very textured grain structure. In total one seed at a depth of approximately 17.5 mm has been inserted that could later be used for data processing.

After cleaning all surfaces the three seeds were tightly inserted into blind holes and closed with a cap as shown in the left image of Fig. 6. To enable the bonding of all surfaces the cap has been electron beam (EB)-welded to the host material under near vacuum. Then everything was HIPped, which resulted in a full bonding of the touching surfaces. The aforementioned HIP parameters have been loosely adopted from the final design block of [20]. Lastly, from all surfaces perpendicular to the bonding surfaces a large portion was milled off to ensure that the welding seam of the EB-weld is fully removed. The final dimensions of the specimen were 75 mm x 28 mm x 35 mm as shown by the right image in Fig. 6. Also, one additional metallographic inspection of the surface of the final specimen indicated no grain growth as shown by the right image in Fig. 5, thus it can be assumed that the average grain size of the heat treated and non heat treated seed within the volume has not changed either. Unfortunately there are no metallographic images of the internal microstructure because at this point only one specimen was available. The assumption that the bonding was successful was later confirmed by the ultrasonic measurements.

The specimen has been inspected with a conventional ultrasonic system at 100 MHz sampling rate and a surface resolution of 0.2 mm/pixel for the index and scan direction. For all measurements the Full-A-Scan representation can be stored for later analysis. The inspection surfaces in this case were chosen to be the top and bottom (not visible in the image) surfaces of the specimen as shown by the right image in Fig. 6. The main motivation was the much lower ultrasonic reflectivity of the internal heat treated seed when inspecting from this direction, because the sharp transition of the grain size was not directly perpendicular to the propagation direction of the sound wave due to the cylindrical outer surface. In total three ultrasonic probes were used, resulting in twelve measurements of which six are used for TOF correction whereas the other six contain the grain noise information due to much higher amplification. For comparison with the calibrated ultrasonic testing, two additional measurements with a certified ultrasonic probe have been conducted as well, which are shown in the UT C-Scan in Fig. 9. The indications at the top right and top left corner are SDH for orientation purposes, right next to the indication of the additively manufactured seed that will be omitted in this study. The heat treated and non heat treated seeds are undetectable and had a response of about one to two percent FSH on average. For the final measurements the gain for all three ultrasonic probes was set to a level so that the deepest reference FBH at a depth of 1.5 inch results in a response of 80% FSH.

This way the average grain noise increased to a level that was measurable with the UT system. Also the distance from the probe to the surface of the specimen was reduced so that the focal point was at around 3 mm below the surface. The angle of incidence was perpendicular to the surface at all measured positions.

Roughly 176000 A-Scans could be generated that are used for the following training as shown in the overview Table I.

Table I: Overview of all data points representing the base training dataset for this disclosure. The first column, resolution, defines the number of A-Scans per inspected surface. The second column describes the parameters of the UT probe with center frequency, focal length in inches and transducer diameter, also in inches. The last column displays the number of positions manually labelled as defect.

By explicitly ignoring the non heat treated seed during labeling and thereby setting its class to the background class, the trained model can be forced to ignore any ultrasonic effects introduced by the manufacturing procedure.

B. RCAS Training Setup

For the training each individual A-Scan can be fed into the DL model, which predicts whether the A-Scan is classified as a defect or not. One of the first problems arose due to the high class imbalance of defects to non-defects. To tackle this problem the focal loss has been successfully deployed [38]. Instead of using class weighted cross entropy (CE) an additional modulating factor can be incorporated

$$FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$$

with

$$p_t = \begin{cases} p & \text{if } y = 1 \\ 1 - p & \text{otherwise} \end{cases}$$

Here $y$ describes the ground truth label and $p \in [0, 1]$ the model's predicted probability for the class with label y = 1, which corresponds to the defect class. $\alpha_t$ describes the linear weighting factor of the CE for each class and $\gamma > 0$ is the tunable focusing parameter [38]. The general idea is the exponential down-weighting of examples that are easy to classify correctly, which in the case of imbalanced data most often belong to the overrepresented class. When minimizing the loss during training, what can be observed at first is the classification of all data points into the overrepresented class. Therefore $p_t$ for the data points with that class label converges to one, which scales the modulating factor close to zero and thereby reduces the weighting of the CE for those data points in the batch. Underrepresented data points, on the other hand, are thus weighted drastically higher in comparison. Without extensive hyperparameter tuning, $\alpha_t = \{0.25, 0.75\}$ and $\gamma = 2$ were used, which correspond to the recommended parameter settings by [38].
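The down-weighting effect of the focal loss can be checked numerically for single predictions. This is a sketch with the standard binary focal loss; which of the two weights 0.25 and 0.75 belongs to the defect class is not stated in the text, so the default `alpha_defect=0.75` below is an assumption, and the function name is illustrative.

```python
import math


def focal_loss(p, y, alpha_defect=0.75, gamma=2.0):
    """Binary focal loss for one prediction.

    p: predicted probability of the defect class (label y = 1).
    alpha_defect: class weight of the defect class (assumed value).
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha_defect if y == 1 else 1.0 - alpha_defect
    # (1 - p_t)**gamma is the modulating factor; gamma = 0 gives weighted CE.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy_background = focal_loss(0.05, 0)   # confident, correct "no defect"
hard_defect = focal_loss(0.20, 1)       # defect predicted with low probability
print(easy_background, hard_defect)
```

A correctly classified background sample with p_t = 0.95 keeps only 0.25 percent of its cross-entropy contribution at gamma = 2, while the poorly classified defect sample keeps 64 percent.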

Next the model should be able to learn and detect which time frame was mainly responsible for the predicted class. This study was limited by the number of artificial defects in the specimen which, unfortunately, were only inserted at one depth position. To not only significantly increase the number of data points but also force each model to effectively distinguish normal grain noise from grain noise induced by the seed, the length of each A-Scan can be systematically varied. This depth variation should in theory correspond to the inspection of specimens of various thicknesses.

As a first step a minimal length can be defined that has to be fulfilled by all depth varied A-Scans. This minimal length corresponds to the diameter of the seed given a constant ultrasonic velocity. Next it had to be ensured that the depth zone containing grain noise from the heat treated seed will always be completely within the depth varied A-Scans. In other words, the zones for the depth variation are defined relative to the seed zone for defect A-Scans, compared to absolute zones for defect free A-Scans. All permutations are labelled with the same label as their base A-Scan and are always right padded to a predefined length. By applying this method the number of trainable A-Scans could be increased to 2.3 million. The split into train set and test set was realised by sampling each surface in a grid-wise manner and sorting out every tenth data point into the validation set. Thus roughly 10 % of all data points are within the test set, which was never used during training, and at the same time also roughly 10 % of the data points with a defect label are sorted out as well. Adam [40] with default settings was used as optimizer with a constant learning rate of $10^{-5}$. Training of the DL models can be done by Stochastic Gradient Descent (SGD) with the maximal possible batch size of 4096 for the CNN and 8096 for RCAS. The experiments are implemented with the TensorFlow [39] framework utilizing an Nvidia Quadro RTX6000 Graphics Processing Unit (GPU).
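The depth variation and the grid-wise hold-out can be sketched as below. This is one possible reading of the description, not the procedure actually used: crops are generated on a regular step grid, every crop is required to contain the full seed zone, and the hold-out takes every tenth position in row-major order. All names, the step size and the example lengths are hypothetical.

```python
import numpy as np


def depth_variations(a_scan, seed_start, seed_end, min_len, step, pad_len, fill=0.0):
    """Crops that always contain [seed_start, seed_end), right-padded to pad_len."""
    variants = []
    for start in range(0, seed_start + 1, step):
        for end in range(seed_end, a_scan.size + 1, step):
            if end - start < min_len:
                continue
            crop = a_scan[start:end]
            variants.append(
                np.pad(crop, (0, pad_len - crop.size), constant_values=fill)
            )
    return np.stack(variants)


def grid_split(n_x, n_y, every=10):
    """Boolean (train, test) masks holding out every `every`-th scan position."""
    idx = np.arange(n_x * n_y).reshape(n_x, n_y)
    test_mask = idx % every == 0
    return ~test_mask, test_mask


a_scan = np.arange(100.0)                 # sample values encode their own index
variants = depth_variations(a_scan, 40, 60, min_len=20, step=20, pad_len=100)
train_mask, test_mask = grid_split(20, 30)
print(variants.shape, test_mask.mean())
```

Each variant inherits the label of its base A-Scan, so a single labelled position yields several training examples of different effective depth.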

C. Results and Discussion

To show the effectiveness of the proposed RCAS architecture, it is compared with the conventional UT methodology and with a CNN with a comparable number of parameters, as shown in Table III. The baseline model applies a maximum threshold to each A-Scan and classifies the region R_x,y as defect if any absolute amplitude C_x,y is equal to or above a certain threshold. This was applied once to all A-Scans in raw form and once after applying the mean kernel described by equation 7. The average gain calibration for the data used in this experiment is constant and not depth dependent as it is under the industry standard, which leads to a much higher amplification of all amplitudes below 1.5 inch depth. Thus the detection capabilities of the thresholding models will most likely be worse in reality.
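The thresholding baseline can be illustrated by the following sketch. It assumes that the measurement is available as an array of shape (scan, index, time) and that the mean kernel averages each time sample over a square neighbourhood in the scan/index plane; equation 7 may define the kernel differently. The names `mean_kernel` and `threshold_baseline` are hypothetical.

```python
import numpy as np

def mean_kernel(volume, size=7):
    """Average each time sample over a size x size neighbourhood in the
    scan/index plane (edge padded). Assumed form of the mean kernel."""
    pad = size // 2
    padded = np.pad(volume, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(volume, dtype=float)
    for dx in range(size):
        for dy in range(size):
            out += padded[dx:dx + volume.shape[0], dy:dy + volume.shape[1], :]
    return out / size**2

def threshold_baseline(volume, threshold):
    """Flag position (x, y) as defect if any absolute amplitude of its
    A-Scan is equal to or above the threshold."""
    return (np.abs(volume) >= threshold).any(axis=-1)

# Toy volume: 5 x 5 positions, 8 time samples, one strong negative echo.
volume = np.zeros((5, 5, 8))
volume[2, 2, 3] = -0.9
raw_map = threshold_baseline(volume, threshold=0.8)
smoothed_map = threshold_baseline(mean_kernel(volume, size=3), threshold=0.05)
```

In the toy example the raw threshold flags only the single position with the echo, while the smoothed variant flags its whole 3 x 3 neighbourhood at a correspondingly lower threshold.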

The RCAS model, as described in more detail in the left RCAS column of Table II, first employs a very small residually built encoder module, where the CNN can be a two-layer plain CNN with a 3 x 3 kernel size and a depth of 32, followed by a convolutional layer with a 1 x 1 kernel. The output of this residual layer can be fed into a dropout layer with a 40 % dropout rate. Next, a one-layer RNN with GRU cells and 128 hidden units learns a time-dependent feature representation that can be evaluated and trained on by a masked Attention module with 128 attention units.
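The role of the mask in the Attention module can be illustrated with a small NumPy sketch. It assumes an additive scoring function with hypothetical parameters `W` and `v`; the exact scoring function of RCAS is not specified in this passage, so this should be read as one possible form only. Padded time steps receive a score of minus infinity and therefore a weight of exactly zero.

```python
import numpy as np

def masked_attention(h, mask, W, v):
    """Pool a sequence of RNN outputs with masked attention.

    h    : (T, H) outputs of the recurrent layer, one row per time step.
    mask : (T,) boolean, True for real samples and False for right padding
           (at least one entry must be True).
    W, v : (H, A) and (A,) hypothetical attention parameters.
    """
    scores = np.tanh(h @ W) @ v                 # one score per time step
    scores = np.where(mask, scores, -np.inf)    # padded steps cannot be attended
    scores = scores - scores.max()              # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()           # softmax over valid steps
    context = weights @ h                       # weighted sum, shape (H,)
    return context, weights

rng = np.random.default_rng(0)
h = rng.normal(size=(6, 4))                     # 6 time steps, 4 hidden units
mask = np.array([True, True, True, True, False, False])
W, v = rng.normal(size=(4, 3)), rng.normal(size=3)
context, weights = masked_attention(h, mask, W, v)
```

The returned weights form a distribution over the unpadded time steps, which is what allows them to be read as an indication of the depth range that drove the prediction.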

The CNN architecture, as described in more detail in the right CNN column of Table II, can be based on the CNN10 architecture of [25], only with far fewer parameters. Here a two-layer CNN can be used, where each layer can consist of a convolutional layer with a kernel size of 3 x 3 and depths of 64 and 128 respectively. After that, a batch normalization layer [3], a 2 x 2 average pooling layer and a dropout layer can be applied. The second average pooling layer can be substituted with a global average pooling layer followed by a dense layer with depth 64. Finally, a second dropout layer followed by the last dense layer generates the logit predictions. The dropout probability for both layers can be set to 40 %. All activation functions are nonlinear Rectified Linear Units (ReLU) [2], which are employed after each convolution as well as after the second-to-last dense layer.

Table II: Overview of the parameter setup of both DL architectures used. The abbreviations in the layer column are k for kernel, b for bias and rk for recurrent kernel.

Table III shows the experimental results and compares the four models against each other. The model No Defect at the top is the most naive approach to this problem, classifying all positions as non-defect. This model illustrates the overall problem of class imbalance, which is why the accuracy metric in the last column should be assessed critically: the number of True Negative (TN) cases always outweighs the number of True Positive (TP) ones. In particular, it becomes apparent that both threshold models are unable to reliably discriminate non-defect positions from defect positions. Especially for rather low thresholds there are at least 40 times as many False Positive (FP) cases as TP cases. This results in very low precision values for all threshold models. Therefore, over all measurements, it is not possible to simply decrease the threshold in order to be more sensitive and detect more defect positions. In fact, as shown by Table III, decreasing the threshold leads to roughly the same number of TP and FP cases as would be obtained by randomly guessing whether an A-Scan is a defect or not.
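Why accuracy is misleading under class imbalance can be shown with a short worked example. The counts below are invented purely for illustration and are not taken from Table III; the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, recall and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "accuracy": accuracy}

# Invented toy counts: 100 defect positions among 10 000 positions.
# "No Defect" model: everything is classified as non-defect.
no_defect = classification_metrics(tp=0, fp=0, tn=9900, fn=100)
# Low-threshold model: half of the defects found, but 40 times as many FP as TP.
low_threshold = classification_metrics(tp=50, fp=2000, tn=7900, fn=50)
```

With these toy counts the naive model reaches 99 % accuracy without finding a single defect, whereas the low-threshold model finds half of the defects at a precision of only about 2.4 %.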

Table III: Results of comparing conventional UT, with and without the application of a 7x7 mean kernel and with various absolute amplitude thresholds, to DL-based techniques. Both models, RCAS and CNN, are trained and validated on the same dataset, and all performance indicators for all models are computed for the same dataset without depth variation as presented in Table I. The threshold for the DL models can be based on the predicted probability for the defect class, whereas for the threshold models it can be based on the relative absolute amplitude.

This exact effect can be seen in the Receiver Operating Characteristic (ROC) curve, where a diagonal line constitutes absolute random guessing. In the Precision-Recall curve, this closeness to random guessing corresponds to a completely flat horizontal line. Each of the two diagnostic tools can be condensed into one measure by computing the Area under Curve (AUC) between the random guess and the curve of interest. For the ROC this area can be multiplied by two in order to lie between zero and one. This implies that the closer the kink of the Precision-Recall curve is to the top right, or that of the ROC curve to the top left, the better the discriminating power of the model. A Precision vs. Recall curve and a ROC curve can be computed for all four models over all six measurements with a threshold stepsize of 1/50. In both cases the AUC can be approximated with the same stepsize as for thresholding using the composite trapezoidal rule.
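The normalised ROC measure can be sketched as follows, assuming scores in the range zero to one and at least one sample of each class. The helper names are hypothetical; the threshold grid uses the stepsize of 1/50 mentioned above, and the area is computed with the composite trapezoidal rule.

```python
import numpy as np

def roc_curve_points(scores, labels, n_steps=50):
    """False and true positive rates on a threshold grid with stepsize 1/n_steps."""
    thresholds = np.linspace(0.0, 1.0, n_steps + 1)
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = labels.sum(), (~labels).sum()
    tpr = np.array([(scores[labels] >= t).sum() / pos for t in thresholds])
    fpr = np.array([(scores[~labels] >= t).sum() / neg for t in thresholds])
    return fpr, tpr

def trapezoid_area(x, y):
    """Composite trapezoidal rule for points ordered by non-decreasing x."""
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

def normalised_roc_auc(scores, labels, n_steps=50):
    """Area between the ROC curve and the random-guess diagonal, times two."""
    fpr, tpr = roc_curve_points(scores, labels, n_steps)
    # Both rates fall as the threshold rises; append the (0, 0) end point and
    # reverse so that the curve runs from (0, 0) to (1, 1).
    x = np.append(fpr, 0.0)[::-1]
    y = np.append(tpr, 0.0)[::-1]
    return 2.0 * (trapezoid_area(x, y) - 0.5)

perfect = normalised_roc_auc([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
uninformative = normalised_roc_auc([0.5, 0.5, 0.5, 0.5], [0, 1, 0, 1])
```

A perfectly separating model yields a value of one, while a model whose scores carry no information about the class yields zero.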

The mean Intersection-over-Union (mIoU) has been computed for all DL models on the same test dataset, which in this case was roughly 10 % of all scan positions with depth variation, at a threshold of 50 %. For the validation this threshold has always been set to 50 % because at this point the predicted probability of the model for the defect class was higher than for the non-defect class. The column right next to it, in comparison, states the mIoU over all positions without depth variation under different thresholds. There, in particular, RCAS provides a higher discriminative power compared to the CNN.
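A minimal sketch of the mIoU computation is given below. It assumes that the mean is taken over the two classes (defect and non-defect), which is not stated explicitly above; the function name `mean_iou` and the toy values are hypothetical.

```python
import numpy as np

def mean_iou(prob_defect, labels, threshold=0.5):
    """Mean Intersection-over-Union over the defect and non-defect class."""
    # A position is predicted as defect if its probability exceeds the threshold.
    pred = np.asarray(prob_defect, dtype=float) > threshold
    truth = np.asarray(labels, dtype=bool)
    ious = []
    for cls in (False, True):
        p, t = pred == cls, truth == cls
        union = np.logical_or(p, t).sum()
        if union:  # skip a class that is absent from prediction and truth
            ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy example: one true defect found, one false alarm.
miou = mean_iou([0.9, 0.2, 0.6, 0.1], [1, 0, 0, 0])
```

In the toy example the defect class has an IoU of 1/2 and the non-defect class an IoU of 2/3, so the mean is 7/12.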

As can be seen in Fig. 10, which visualizes the prediction for Surface1 P2 with RCAS and its attention weights by taking two slices in scan and index direction, the learned attention weights can be used as reasonable estimates of the depth of the seed. For both directions the predicted seed depth, approximated by taking the maximal average attention weight, coincides with its actual position. Interesting, although not as visible as for the heat-treated seed, are the higher attention weights in the lower part of the top right attention map, which serve to correctly predict the region around the non-heat-treated seed as non-defect class. Also noteworthy is the depth of the on average maximal attention weight for the non-defect class, which was, for all measurements, in the left half of the material depth. In other words, the grain noise in the first half of the specimen is more important for the prediction of the non-defect class than that in the second half. The top left and right parts of Fig. 10 show the scan path (y) versus the index path (top left) or the material depth (top right), while the bottom left part shows the index path (x) versus the material depth (y). The crosshatched areas are padded regions, the dashed line shows the selected slices of the Full-A-Scan, and the dotted line shows the depth of maximal average Attention.
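The depth estimate from the attention weights can be sketched as follows. The conversion from sample index to depth assumes pulse-echo operation (two-way travel), a constant ultrasonic velocity and a time axis that starts at the entry surface; the function name, the velocity and the sampling rate are hypothetical placeholder values.

```python
import numpy as np

def attention_depth(att, mask, velocity, sample_rate):
    """Depth of the maximal average attention weight.

    att         : (N, T) attention weights of N scan positions.
    mask        : (N, T) boolean, False for right-padded samples.
    velocity    : assumed constant ultrasonic velocity in m/s.
    sample_rate : assumed sampling rate of the A-Scans in Hz.
    """
    att = np.where(mask, att, 0.0)
    counts = mask.sum(axis=0)
    # Average over positions, ignoring padded samples.
    mean_att = np.divide(att.sum(axis=0), counts,
                         out=np.zeros(att.shape[1]), where=counts > 0)
    idx = int(np.argmax(mean_att))
    depth = idx / sample_rate * velocity / 2.0  # two-way travel in pulse-echo
    return idx, depth

att = np.array([[0.1, 0.2, 0.6, 0.1, 0.0],
                [0.1, 0.1, 0.7, 0.1, 0.0]])
mask = np.array([[True, True, True, True, False]] * 2)
idx, depth = attention_depth(att, mask, velocity=6000.0, sample_rate=100e6)
```

In the toy example the average attention peaks at the third sample, which corresponds to a depth of 0.06 mm for the placeholder velocity and sampling rate.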

Fig. 11 shows the evaluation of a conventional ultrasonic inspection in comparison to Fig. 12, which shows the evaluation of an AI-assisted ultrasonic inspection. One can see that the segregation 14 cannot be identified in the conventional inspection result, while it is clearly visible in the AI-assisted inspection according to the present invention.

We showed the application of DL models for the detection of artificial coarse-grain regions inserted into a fine-grained IN718 volume and why the industry UT inspection is not suitable. In particular, the proposed RCAS architecture, specifically designed to cope with sequential data with high variation in length by utilizing attention, was strictly better than a CNN of comparable size. Attention in combination with the data augmentation step showed especially promising results and enabled the prediction of the depth of the seed for all inspected surfaces with all UT probes used in this experiment. However, the study also revealed that there can be a high variation due to the UT equipment used, primarily dependent on the UT probe, and that only one seed at one depth has been tested thoroughly. Although a high number of training data points could be generated, a reasonably larger DL model could easily be overfitted on the dataset, and the test dataset was not able to effectively reveal this overfitting problem. There can be a need for a more dedicated test dataset based on, e.g., completely new test specimens.

In conclusion, this is the first time a DL-based technique has been used successfully for the detection of coarser grain regions in fine-grain IN718 superalloy. Moreover, the reliable detection of artificially introduced seeds was not possible with conventional UT. This shows that the usage of thresholds, which assumes a sufficient change in acoustic impedance, is insufficient for this defect type and calls for other techniques that utilize information like grain noise in the ultrasonic signal to enhance the overall ultrasonic testing capabilities.

The parameter values given in the documents for the definition of process and measuring conditions for the characterization of specific properties of the subject matter of the invention are also to be regarded as included in the scope of the invention in the context of deviations due to, for example, measuring errors, system errors, calculation errors, DIN tolerances and the like.

References

[1] Annis, C., "MIL-HDBK-1823A," Nondestructive Evaluation System Reliability Assessment, Department of Defense Handbook, Wright-Patterson AFB, USA, 2009.

[2] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in International Conference on Machine Learning (ICML), pp. 807-814, 2010.

[3] Ioffe, S., and Szegedy, C., "Batch normalization: Accelerating deep network training by reducing internal covariate shift," In Proceedings of the 32nd International Conference on Machine Learning, vol. 37, pp. 448-456, 2015.

[4] Howard, P. J., Copley, D. C. and Gilmore, R. S., "A Signal-to-noise ratio comparison of ultrasonic transducers for C-scan imaging in titanium," In Review of Progress in Quantitative Nondestructive Evaluation. Springer, Boston, MA, pp. 2113-2120, 1995.

[5] Thompson, R. B., Margetan, F. J., Haldipur, P., Yu, L., Li, A., Panetta, P. and Wasan, H., "Scattering of elastic waves in simple and complex polycrystals," Wave Motion, vol. 45, no. 5, pp. 655-674, 2008.

[6] Cakir, E., "Deep neural networks for sound event detection," Ph.D. dissertation, Tampere University, 2019.

[7] Margetan, F. J., Haldipur, P., Yu, L. and Thompson, R. B., "Looking for Multiple Scattering Effects in Backscattered Ultrasonic Grain Noise from Jet-Engine Nickel Alloys," In AIP Conference Proceedings, vol. 760, no. 1, pp. 75-82, American Institute of Physics, April 2005.

[8] Huntington, H. B., "On ultrasonic scattering by polycrystals," The Journal of the Acoustical Society of America, vol. 22, no. 3, pp. 362-364, 1950.

[9] Roth, W., "Scattering of ultrasonic radiation in polycrystalline metals," Journal of Applied Physics, 19(10), pp. 901-910, 1948.

[10] Mason, W. P. and McSkimin, H. J., "Energy losses of sound waves in metals due to scattering and diffusion," Journal of Applied Physics, 19(10), pp. 940-946, 1948.

[11] Mitchell, A., "White spot defects in VAR superalloy," In Proceedings of the 1986 Vacuum Metallurgy Conference on Specialty Metals Melting and Processing, pp. 55-61, June 1986.

[12] Jackman, L. A., Maurer, G. E., and Widge, S., "New knowledge about 'white spots' in superalloys," Advanced Materials and Processes, USA, 143(5), 1993.

[13] Viosca, A. L., "A method for the characterization of white spots in vacuum-arc remelted superalloys," Ph.D. dissertation, Dept. Mech. Eng., The University of Texas at Austin, 2011.

[14] Lejcek, P., "Grain boundary segregation in metals," Springer Science and Business Media, vol. 136, 2010.

[15] Evans, D. G. and Fahrmann, M., "A Study of the Effect of Electro-Slag Re-melting Parameters on the Structural Integrity of Large Diameter Alloy 718 ESR Ingot," The Minerals, Metals and Materials Society, pp. 507-515, 2004.

[16] Damkroger, B. K., Kelley, J. B., Schlienger, M. E., Van Den Avyle, J. A., Williamson, R. L., and Zanner, F. J., "The influence of VAR processes and parameters on white spot formation in Alloy 718 (No. SAND-94-1267C; CONF-940663-1)," Sandia National Labs., Albuquerque, NM, USA, 1994.

[17] Jackman, L., "White spots in superalloys," in Superalloys 718, 625, 706 and Various Derivatives, pp. 153-166, 1994.

[18] Van Pamel, A., "Ultrasonic inspection of highly scattering materials," Ph.D. dissertation, Dept. Mech. Eng., Imperial College London, 2015.

[19] National Transportation Safety Board, "Uncontained Engine Failure and Subsequent Fire American Airlines Flight 383 Boeing 767-323, N345AN," Jan. 2018. [Online]. Available: www.ntsb.gov/investigations/AccidentReports/Pages/AAR1801.aspx, Accessed on: May 7, 2020.

[20] Margetan, F. J., Nieters, E., Haldipur, P., et al., "Fundamental studies of nickel billet materials - Engine Titanium Consortium II," FAA Technical Center, Atlantic City, NJ, 2005.

[21] Margetan, F. J., Thompson, R. B., Yalda-Mooshabad, I., and Han, Y. K., "Detectability of small flaws in advanced engine alloys," 1993.

[22] LeCun, Y., Bengio, Y., and Hinton, G., "Deep learning," Nature, 521(7553), pp. 436-444, 2015.

[23] Ford, L., Tang, H., Grondin, F., and Glass, J. R., "A Deep Residual Network for Large-Scale Acoustic Scene Analysis," in INTERSPEECH, pp. 2568-2572, 2019.

[24] J. F. Gemmeke et al., "Audio Set: An ontology and human-labeled dataset for audio events," 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, pp. 776-780, 2017.

[25] Q. Kong, Y. Cao, T. Iqbal, Y. Wang, W. Wang and M. D. Plumbley, "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2880-2894, 2020.

[26] Y. Wang, J. Li and F. Metze, "A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling," ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, United Kingdom, pp. 31-35, 2019.

[27] Q. Kong, Y. Xu, W. Wang and M. D. Plumbley, "Audio Set Classification with Attention Model: A Probabilistic Perspective," 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, pp. 316-320, 2018.

[28] Yu, C., Barsim, K. S., Kong, Q., and Yang, B., "Multi-level attention model for weakly supervised audio classification," arXiv preprint arXiv:1803.02353, 2018.

[29] Q. Kong, C. Yu, Y. Xu, T. Iqbal, W. Wang and M. D. Plumbley, "Weakly Labelled AudioSet Tagging With Attention Neural Networks," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 11, pp. 1791-1802, Nov. 2019.

[30] Luong, M. T., Pham, H., and Manning, C. D., "Effective approaches to attention-based neural machine translation," In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1412-1421, Sep. 2015.

[31] Jean, S., Cho, K., Memisevic, R., and Bengio, Y., "On using very large target vocabulary for neural machine translation," In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, vol. 1, pp. 1-10, July 2015.

[32] Bahdanau, D., Cho, K., and Bengio, Y., "Neural machine translation by jointly learning to align and translate," 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, May 2015.

[33] Vaswani, A. et al., "Attention is all you need," in Advances in Neural Information Processing Systems, pp. 5998-6008, 2017.

[34] Cho, K., Van Merrienboer, B., Bahdanau, D., and Bengio, Y., "On the properties of neural machine translation: Encoder-decoder approaches," In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, pp. 103-111, Oct. 2014.

[35] K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, pp. 770-778, 2016.

[36] Ba, J. L., Kiros, J. R., and Hinton, G. E., "Layer normalization," arXiv:1607.06450, 2016.

[37] American Society for Testing and Materials (Philadelphia, Pennsylvania), "ASTM E112-96 (2004) e2: Standard Test Methods for Determining Average Grain Size," ASTM, 2004.

[38] T. Lin, P. Goyal, R. Girshick, K. He and P. Dollar, "Focal Loss for Dense Object Detection," In IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 2, pp. 318-327, 1 Feb. 2020.

[39] Abadi, M. et al., "Tensorflow: A system for large-scale machine learning," In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265-283, 2016.

[40] Kingma, D. P., and Ba, J., "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.

[41] Curtis, G. J., "Acoustic emission energy relates to bond strength," Non-Destructive Testing, vol. 8, no. 5, pp. 249-257, 1975.

[42] Czimmermann, T. et al, "Visual-Based Defect Detection and Classification Approaches for Industrial Applications — A SURVEY," In Sensors, vol. 20, no. 5, p. 1459, 2020.

Claims

1. A computer-implemented method for training an artificial neural network, comprising the steps of:

- creating the artificial neural network;

- producing or providing a multitude of test components with each at least one respective artificial microstructure defect that is inserted at a known location in the test component;

- ultrasonically measuring the test components, preferably in form of a complete Full-A-Scan of the test component;

- manually labeling the microstructure defects at the known locations in the respective ultrasonic measurement data-sets of the test components; and

- training the artificial neural network by means of a computer system using the ultrasound measurement data with the labelled microstructure defect locations.

2. A computer-implemented method for training an artificial neural network, comprising the steps of:

- creating the artificial neural network;

- producing at least one test body with at least one real microstructure defect;

- providing at least a first ultrasonic measuring data characterizing a regional microstructure defect, preferably in form of a local Full-A-Scan;

- providing at least one second ultrasonic measurement data characterizing a preferably microstructure defect-free component, preferably in the form of a complete Full-A-Scan of the defect-free component; and

- training the artificial neural network by means of a computer system using the first ultrasound measurement data and the second ultrasound measurement data, wherein the first ultrasonic measuring data characterizing the microstructure defect data are used as labels.

3. The computer-implemented method of claim 2, wherein the second ultrasound measurement data is modified by inserting at least a portion of the first ultrasound measurement data, preferably by numerical integration, at least once as an artificial microstructural defect location into the second ultrasound measurement data.

4. The computer-implemented method according to claim 3, wherein at least a part of the first ultrasonic measurement data is inserted into the second ultrasonic measurement data in such a way that artificial microstructural defect locations are characterized in at least two different positions and/or depths of the defect-free component.

5. The computer-implemented method according to any one of claims 3 to 4, in which at least the first and the second ultrasound measurement data are used to create multiple artificial defect bearing components by inserting artificial microstructural defects at different positions and/or depths of the first signal, which artificial defect bearing components are then used for training the artificial neural network.

6. The computer-implemented method according to any one of claims 2 to 5, in which the first and/or second ultrasonic measurement data are transformed, preferably by means of short-time FFT.

7. The computer-implemented method according to any one of claims 2 to 6, in which the first and/or second ultrasound measurement data are weighted and/or denoised.

8. The computer-implemented method according to any one of claims 1 to 7, in which the artificial neural network is trained with a supervised learning algorithm and/or with first ultrasound measurement data obtained on a plurality of test bodies and/or with second ultrasound measurement data obtained on a plurality of components, and the respective microstructure defect data as labels.

9. The computer-implemented method according to any one of claims 1 to 8, in which a quality of the artificial neural network is tested and/or optimized by means of a receiver operation characteristic using ultrasound measurement data.

10. A method for inspecting a component, in particular a component of an aircraft engine, in which the component is ultrasonically measured at least in regions while obtaining ultrasonic measurement data and is inspected for the presence of structural defects by means of an artificial neural network which is trained according to one of claims 1 to 8 using the ultrasonic measurement data.

11. A test system (8) for examining components (10) or blanks, in particular components for aircraft engines, for the presence of structural defects, comprising

- at least one ultrasonic examination device (12), by means of which the component (10) can be measured at least in regions in order to obtain ultrasonic measurement data, and

- a computer system (13) which is designed to test for the presence of structural defects (14) in the component (10) by means of an artificial neural network which is trained according to one of claims 1 to 8, using the ultrasonic measurement data.

12. A computer program that includes instructions that, when the computer program is executed by a computer system (13), cause the computer system (13) to perform a method according to any one of claims 1 to 9 and/or a method according to claim 10.

13. A computer-readable storage medium that stores a computer program according to claim 12.