WO2023095250A1

WO2023095250A1 - Abnormality detection system

Info

Publication number: WO2023095250A1
Application number: PCT/JP2021/043223
Authority: WO
Inventors: 友輔生内; 圭吾長谷川; 海斗笹尾
Original assignee: 株式会社日立国際電気
Priority date: 2021-11-25
Filing date: 2021-11-25
Publication date: 2023-06-01
Also published as: JPWO2023095250A1

Abstract

The purpose of the present invention is to provide an abnormality detection system that can detect an abnormal portion with higher accuracy using a DL model. The abnormality detection system comprises: a signal acquisition unit (201) for acquiring an input signal from a sensor; a mask generation unit (202) for generating a mask to be superimposed on the input signal; a mask superimposition unit (203) for generating a masked signal by superimposing, on the input signal, the mask generated by the mask generation unit (202); a signal reconstruction unit (204) for generating a reconstructed signal by reconstructing the masked signal generated by the mask superimposition unit (203); and an abnormality determination unit (205) for determining whether or not the input signal includes an abnormal portion on the basis of an error, in a mask region, between the input signal and the reconstructed signal. The signal reconstruction unit (204) reconstructs a region that is masked by using a deep learning model obtained by performing training using normal input signals.

Description

Anomaly detection system

The present invention relates to an anomaly detection system, and more particularly to an anomaly detection system using AI technology such as deep learning.

The application of AI (artificial intelligence) such as deep learning (DL) is progressing for tasks such as anomaly detection, prediction, and attribute identification that use time-series signals and images as input. By incorporating such functions into products and systems, it is possible to reduce labor costs and improve the added value of products by substituting human work, contributing to sales promotion of products and systems. For this reason, many companies have a high interest in AI.

An autoencoder is one known anomaly detection method using DL. An autoencoder uses a neural network to extract features of an input signal and uses the extracted features to restore (reconstruct) the signal so that the error with respect to the input signal becomes zero. Training is unsupervised: parameters are learned so that the error between an arbitrary normal input signal and the signal reconstructed by the autoencoder becomes zero. During operation, whether the input signal contains an abnormal portion is determined based on the error between the input signal and the signal reconstructed from the features extracted by the trained DL model (the neural network and its set of parameters). When the input signal contains an abnormal portion, that portion is not reconstructed correctly, the error between the input signal and the reconstructed signal becomes large, and the input signal is determined to contain an abnormal portion.

In practice, generalization performance is required to reconstruct a signal with zero error even for an unknown input signal that contains only normal parts. For example, in the case of a system that uses cameras to check outdoor public infrastructure facilities by detecting anomalies, the shooting environment may differ due to differences in weather and time of day. In this case, there is a possibility that the shooting environment during acquisition of input signals (learning data) during learning and the shooting environment during actual shooting may differ greatly. As a result, the possibility of erroneous detection of a non-abnormal state as abnormal increases, causing a decrease in accuracy. Therefore, it is generally desirable that learning data be acquired in various environments and have diversity.

On the other hand, Patent Document 1 describes an obstacle detection system that extracts the region and pixel values of an object appearing in a monitoring area from the image captured by the photographing means, divides the object region and pixel values into blocks based on determination criteria set for each angle of view and position in the image, identifies the type of the object from local feature amounts, and detects the presence or absence of an obstacle in the image from the information on the type of the object.

Patent Document 1: JP 2019-124986 A

However, if the DL model has generalization performance after learning with diverse learning data, the abnormal part can be reconstructed without error for the input signal containing the abnormal part during operation. As a result, the error between the input signal and the reconstructed signal becomes small, causing a problem that the abnormality cannot be detected. This problem is considered to occur because the learning data with diversity has features similar to the features of the abnormal part, so that the abnormal part can be reconstructed as a normal part.

In addition, in Patent Document 1, a local feature amount is used, and since this local feature amount needs to be extracted by prior learning, etc., there is a possibility that it will be affected by the environment during learning.

In view of the above problems, it is an object of the present invention to provide an anomaly detection system that can detect anomalous parts with higher accuracy using a DL model.

In order to achieve the above object, one typical anomaly detection system of the present invention includes: a signal acquisition unit that acquires an input signal from a sensor; a mask generation unit that generates a mask to be superimposed on the input signal; a mask superimposition unit that superimposes the mask generated by the mask generation unit on the input signal to generate a masked signal; a signal reconstruction unit that reconstructs the masked signal generated by the mask superimposition unit to generate a reconstructed signal; and an abnormality determination unit that determines whether the input signal includes an abnormal portion based on the error (for example, squared error, SNR, PSNR, SSIM, or another known feature amount) between the input signal and the reconstructed signal within the mask region. The signal reconstruction unit reconstructs the masked region using a deep learning model trained using normal input signals.

According to the present invention, in an anomaly detection system, an anomaly can be detected with higher accuracy using a DL model.
Problems, configurations, and effects other than those described above will be clarified by the following embodiments.

FIG. 1 is a block diagram of a computer system for implementing aspects according to embodiments of the invention.
FIG. 2 is a block diagram showing one embodiment of the anomaly detection system of the present invention.
FIG. 3 is a functional block diagram showing an example of an analysis server of the anomaly detection system of the present invention.
FIG. 4 is a diagram for explaining an example of the learning of the anomaly detection method of the anomaly detection system of the present invention.
FIG. 5 is a diagram for explaining an example of operation of the anomaly detection method of the anomaly detection system of the present invention.
FIG. 6 is a diagram for explaining an example of processing of the anomaly detection system of the present invention.
FIG. 7 is a diagram illustrating an example of masks for reducing the number of masked signals in the anomaly detection system of the present invention.
FIG. 8 is a diagram showing a model for explaining an example of a mask selection method using the background subtraction method in the anomaly detection system of the present invention.
FIG. 9 is a diagram illustrating an example of a mask selection method using the background subtraction method in the anomaly detection system of the present invention.
FIG. 10 is a diagram illustrating an example of a mask adjustment method using object detection results in the anomaly detection system of the present invention.
FIG. 11 is a diagram showing a model for explaining a first specific example of the anomaly detection system of the present invention.
FIG. 12 is a diagram for explaining the first specific example of the anomaly detection system of the present invention.
FIG. 13 is a diagram showing a model for explaining a second specific example of the anomaly detection system of the present invention.
FIG. 14 is a diagram for explaining the second specific example of the anomaly detection system of the present invention.
FIG. 15 is a diagram illustrating an example of application of the anomaly detection system of the present invention to time-series data.
FIG. 16 is an example of a processing flowchart of the anomaly detection system of the present invention.

A mode for carrying out the present invention will be described with reference to the drawings.

<Example of hardware for implementing the embodiment>
FIG. 1 is a block diagram of a computer system 1 for implementing aspects according to embodiments of the invention. The mechanisms and apparatus of various embodiments disclosed herein may be applied to any suitable computing system.

The main components of the computer system 1 include one or more processors 2, memory 4, terminal interface units 12, storage interface units 14, I/O (input/output) device interface units 16, and network interfaces 18. These components may be interconnected via memory bus 6, I/O bus 8, bus interface unit 9, and I/O bus interface unit 10.

The computer system 1 may include one or more processing units 2A and 2B, collectively referred to as processors 2. Each processor 2 executes instructions stored in memory 4 and may include an on-board cache. In some embodiments, computer system 1 may include multiple processors, and in other embodiments, computer system 1 may be a single processing unit system. As the processing device, a CPU (Central Processing Unit), FPGA (Field-Programmable Gate Array), GPU (Graphics Processing Unit), DSP (Digital Signal Processor), or the like can be applied.

In some embodiments, memory 4 may include random access semiconductor memory, storage devices, or storage media (either volatile or non-volatile) for storing data and programs. In some embodiments, memory 4 represents the entire virtual memory of computer system 1 and may include the virtual memory of other computer systems connected to computer system 1 via a network. Although memory 4 may conceptually be considered a single entity, in other embodiments memory 4 may be a more complex arrangement, such as a hierarchy of caches and other memory devices. For example, memory may exist as multiple levels of caches, and these caches may be partitioned by function. As a result, one cache may hold instructions and another cache may hold non-instruction data used by the processor. The memory may be distributed and associated with various different processing units, such as in the so-called NUMA (Non-Uniform Memory Access) computer architecture.

Memory 4 may store all or part of the programs, modules, and data structures that implement the functions described herein. For example, memory 4 may store latent factor identification application 50. In some embodiments, latent factor identification application 50 may include instructions or descriptions that perform the functions described below on processor 2, or may include instructions or descriptions that are interpreted by other instructions or descriptions. In some embodiments, latent factor identification application 50 may be implemented in hardware via semiconductor devices, chips, logic gates, circuits, circuit cards, and/or other physical hardware devices instead of, or in addition to, a processor-based system. In some embodiments, latent factor identification application 50 may include data other than instructions or descriptions. In some embodiments, a camera, sensor, or other data input device (not shown) may be provided in direct communication with bus interface unit 9, processor 2, or other hardware of computer system 1. Such a configuration may reduce the need for processor 2 to access memory 4 and the latent factor identification application.

The computer system 1 may include a bus interface unit 9 that provides communication between the processor 2, memory 4, display system 24, and I/O bus interface unit 10. I/O bus interface unit 10 may be coupled to I/O bus 8 for transferring data to and from various I/O units. I/O bus interface unit 10 connects via I/O bus 8 to a plurality of I/O interface units 12, 14, 16, and 18, also known as I/O processors (IOPs) or I/O adapters (IOAs). Display system 24 may include a display controller, a display memory, or both. The display controller can provide video, audio, or both data to display device 26. Computer system 1 may also include devices such as one or more sensors configured to collect data and provide such data to processor 2. For example, computer system 1 may include environmental sensors that collect humidity data, temperature data, pressure data, etc., and motion sensors that collect acceleration data, motion data, etc. Other types of sensors can also be used. The functions provided by bus interface unit 9 may be implemented by an integrated circuit including processor 2.

The I/O interface units have the function of communicating with various storage or I/O devices. For example, the terminal interface unit 12 allows attachment of user I/O devices 20, including user output devices such as video displays, speakers, and televisions, and user input devices such as keyboards, mice, keypads, touch pads, trackballs, buttons, light pens, or other pointing devices. A user may operate the user input devices through the user interface to enter input data and instructions into the user I/O device 20 and the computer system 1, and may receive output data from the computer system 1. The user interface may, for example, be displayed on a display device, played through a speaker, or printed via a printer by way of the user I/O device 20.

The storage interface unit 14 allows attachment of one or more disk drives or direct access storage devices 22 (typically magnetic disk drive storage devices, although they may be arrays of disk drives or other storage devices configured to appear as a single disk drive). In some embodiments, storage device 22 may be implemented as any secondary storage device. The contents of the memory 4 may be stored in the storage device 22 and read from the storage device 22 as needed. I/O device interface unit 16 may provide an interface to other I/O devices such as printers, fax machines, and the like. Network interface 18 may provide a communication path so that computer system 1 and other devices may communicate with each other. This communication path may be, for example, network 30.

The computer system 1 shown in FIG. 1 includes a bus structure that provides a direct communication path between processor 2, memory 4, bus interface unit 9, display system 24, and I/O bus interface unit 10. In other embodiments, computer system 1 may include point-to-point links in hierarchical, star, or web configurations, multiple hierarchical buses, or parallel or redundant communication paths. Further, although I/O bus interface unit 10 and I/O bus 8 are each shown as a single unit, in practice computer system 1 may include multiple I/O bus interface units 10 or multiple I/O buses 8. Also, although multiple I/O interface units are shown that separate the I/O bus 8 from the various communication paths leading to the various I/O devices, in other embodiments some or all of the I/O devices may be directly connected to one system I/O bus.

In some embodiments, computer system 1 may be a multi-user mainframe computer system, a single-user system, or a device such as a server computer that has no direct user interface and receives requests from other computer systems (clients). In other embodiments, computer system 1 may be a desktop computer, handheld computer, laptop, tablet computer, pocket computer, phone, smart phone, or any other suitable electronic device.

<Overall configuration example>
FIG. 2 is a block diagram showing one embodiment of the anomaly detection system of the present invention.

The anomaly detection system shown in FIG. 2 includes an analysis server 101, a sensor 102, and a database server 103. Here, analysis server 101 , sensor 102 , and database server 103 are each connected via network 104 .

The analysis server 101 is composed of an electronic computer system equipped with a processor such as a CPU. The processor may include a DSP, FPGA, GPU, etc. in addition to the CPU. The analysis server 101 performs processing to be described later.

The sensor 102 is a device that continuously acquires signal data, such as a camera, acceleration sensor, temperature sensor, and the like. A signal may be acquired by combining a plurality of these. In the case of a camera, for example, a configuration of a camera that obtains information by forming an image of incident light on an imaging device via a lens or a diaphragm can be applied. Examples of the imaging device here include a CCD (Charge-Coupled Device) image sensor and a CMOS (Complementary Metal Oxide Semiconductor) image sensor. The camera can capture images and videos, for example, at 3 frames per second (3 fps) or more, and the information is sent to the analysis server 101 and the database server 103 . A plurality of cameras can be installed depending on the situation.

The database server 103 is a database server equipped with a storage device. Information required for analysis by the analysis server 101 and information acquired by the sensor 102 can be recorded. Also, the results of analysis performed by the analysis server 101 can be recorded.

The network 104 is a line capable of data communication that connects the analysis server 101, the sensor 102, and the database server 103. Any type of line may be used, such as a dedicated line, an intranet, or an IP network such as the Internet.

The signal data acquired by the sensor 102 is analyzed by the analysis server 101, and the result of abnormality detection is stored in the database server 103. Note that the configuration in FIG. 2 is an example, and other configurations are also applicable. For example, the function of the analysis server 101 may be integrated with the sensor 102 so that the processing of the anomaly detection system is performed there. Also, the storage device of the database server 103 may be integrated with the sensor 102 or with the analysis server 101.

<Example of analysis server>
FIG. 3 is a functional block diagram showing an example of an analysis server of the anomaly detection system of the present invention. Functional blocks of the analysis server 101 will be described with reference to FIG. 3.

The analysis server 101 includes a signal acquisition unit 201, a mask generation unit 202, a mask superimposition unit 203, a signal reconstruction unit 204, an abnormality determination unit 205, an output control unit 206, and an auxiliary storage unit 207.

The auxiliary storage unit 207 stores the signal input from the sensor 102. Further, the auxiliary storage unit 207 holds necessary information such as setting parameters. The auxiliary storage unit 207 is usually composed of a non-volatile memory such as an HDD (Hard Disk Drive) or flash memory, and stores programs executed by the analysis server 101 and data to be processed by the programs. A signal from the auxiliary storage unit 207 is output to the signal acquisition unit 201, the mask generation unit 202, and the output control unit 206.

The signal acquisition unit 201 acquires signals from the auxiliary storage unit 207. The acquired signal is the signal from the sensor 102 and may be preprocessed to reduce the effects of noise, flicker, and the like. Examples of preprocessing here include processing using a smoothing filter, an edge enhancement filter, and the like. In the case of image data, a data format such as RGB color, YUV, or monochrome may be selected depending on the application. Furthermore, the signal may be reduced to a predetermined size in order to lower the processing cost. The signal subjected to these processes is output to the mask superimposition unit 203.

The mask generation unit 202 acquires mask setting parameters from the auxiliary storage unit 207 and generates a plurality of masks. At this time, the mask to be used may be determined by acquiring a signal from the auxiliary storage unit 207 and performing preprocessing such as a background subtraction method described later. The generated mask signal is output to the mask superimposition unit 203.

The mask superimposition unit 203 superimposes each of the plurality of masks generated by the mask generation unit 202 on the input signal obtained by the signal acquisition unit 201 to generate masked signals. Here, the masked portion is a portion that does not retain the information of the original input signal. The masked signal is output to the signal reconstruction unit 204.
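The superimposition step can be sketched as follows. This is a minimal illustration, not the implementation described in the publication: the function name `superimpose_mask`, the boolean-array representation of the mask, and the black fill value are all assumptions made for the example.

```python
import numpy as np

def superimpose_mask(image, mask, fill_value=0):
    """Return a copy of `image` with masked pixels replaced by `fill_value`.

    image: 2-D (grayscale) or 3-D (H, W, C) array.
    mask:  boolean (H, W) array, True where the mask covers the image.
    """
    masked = image.copy()
    masked[mask] = fill_value  # masked pixels keep no information from the input
    return masked

# Toy 4x4 grayscale image with a 2x2 mask in the centre (hypothetical data).
image = np.arange(16, dtype=np.uint8).reshape(4, 4)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
masked_image = superimpose_mask(image, mask)
```

Copying the input before masking keeps the unmasked signal available for the later comparison in the abnormality determination step.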

The signal reconstruction unit 204 reconstructs the masked signal generated by the mask superimposition unit 203. This reconstruction produces a reconstructed signal in which the masked portion of the signal is restored. Reconstruction is performed by inputting the masked signal to a DL (deep learning) model capable of signal restoration (interpolation), such as an inpainting model. For deep learning, a method using a neural network and a parameter set can be applied. Here, inpainting refers to a technique of masking an image and restoring it, and a model for that purpose is called an inpainting model. A specific example of inpainting is image inpainting. For example, the technique of Non-Patent Document 1 may be used as image-in-painting. The reconstructed signal is sent to the abnormality determination unit 205.

The abnormality determination unit 205 calculates the error between the signal obtained by the signal acquisition unit 201 and the signal reconstructed by the signal reconstruction unit 204, and determines the presence or absence of an abnormal portion using a preset threshold value. In particular, if the error in the masked area is greater than or equal to the threshold, the signal can be determined to be abnormal. The threshold here may be one that specifies the extent of the range in which the signal difference is greater than or equal to a predetermined value. The error can be evaluated using, for example, squared error, SNR (Signal-to-Noise Ratio), PSNR (Peak Signal-to-Noise Ratio), SSIM (Structural Similarity), or another known feature amount. The determination result is output to the output control unit 206.
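As a rough sketch of the error evaluation restricted to the mask region, the following computes a mean squared error and a PSNR over the masked pixels only and compares the squared error with a threshold. The function names, the 8-bit peak value of 255, and the threshold of 100 are illustrative assumptions; the publication does not fix any of them.

```python
import numpy as np

def masked_mse(original, reconstructed, mask):
    """Mean squared error over the pixels where `mask` is True."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.mean(diff[mask] ** 2))

def masked_psnr(original, reconstructed, mask, peak=255.0):
    """PSNR in dB over the masked pixels (infinite when they match exactly)."""
    mse = masked_mse(original, reconstructed, mask)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

def is_abnormal(original, reconstructed, mask, threshold):
    """True when the mask-region squared error reaches the threshold."""
    return masked_mse(original, reconstructed, mask) >= threshold

# Hypothetical data: the input has one bright pixel that the
# reconstruction (all zeros) does not reproduce.
original = np.zeros((4, 4), dtype=np.uint8)
original[1, 1] = 100
reconstructed = np.zeros((4, 4), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

error = masked_mse(original, reconstructed, mask)
abnormal = is_abnormal(original, reconstructed, mask, threshold=100.0)
```

Casting to floating point before subtracting avoids wrap-around when the inputs are unsigned 8-bit images.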

The output control unit 206 outputs to the database server 103 the output result of the abnormality determination obtained from the abnormality determination unit 205, the signal information stored in the auxiliary storage unit 207, and the like.

In the following, an example based on a camera as the sensor 102, image data as the signal acquired from the sensor 102, and image-in-painting as the DL model capable of restoring (interpolating) the signal will be described.

<Outline of anomaly detection>
FIG. 4 is a diagram for explaining an example of the learning of the anomaly detection method of the anomaly detection system of the present invention. FIG. 5 is a diagram for explaining an example of operation of the anomaly detection method of the anomaly detection system of the present invention. An outline of an anomaly detection method using image-in-painting will be described with reference to FIGS. 4 and 5.

As shown in FIG. 4, during learning, an input image 310 that is a normal input signal (an image showing a normal inspection target 301) is used. A mask 305 is superimposed on this input image 310 to generate a masked image 320, which is a masked signal. Here, the mask 305 is superimposed at a random position on the masked image 320. As the mask 305, a monochrome mask such as black or white can be applied so that no information of the original image remains in this portion. Then, using image-in-painting 330, the masked image 320 is reconstructed (restored) to an unmasked state to create a reconstructed image 340. This reconstructed image 340 is compared with the unmasked input image 310. In particular, the two images are compared within the area of the mask 305, and the model learns to reconstruct a normal input image. At this time, a loss function is calculated, and the feature extraction and signal reconstruction parameters for error-free reconstruction are learned. In this way, machine learning is performed in advance to generate an inpainting model, which is a DL model.
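The preparation of one training example (a normal image, a randomly placed monochrome mask, and the resulting masked image) might look like the sketch below. It covers only the data side of training; the inpainting network and its loss function are not shown. The helper names, the square mask shape, and the fixed random seed are assumptions for illustration.

```python
import numpy as np

def random_mask(height, width, mask_size, rng):
    """Boolean mask with one square of side `mask_size` at a random position."""
    top = int(rng.integers(0, height - mask_size + 1))
    left = int(rng.integers(0, width - mask_size + 1))
    mask = np.zeros((height, width), dtype=bool)
    mask[top:top + mask_size, left:left + mask_size] = True
    return mask

def make_training_pair(normal_image, mask_size, rng, fill_value=0):
    """Return (masked_image, target, mask) for one normal image."""
    height, width = normal_image.shape[:2]
    mask = random_mask(height, width, mask_size, rng)
    masked_image = normal_image.copy()
    masked_image[mask] = fill_value  # monochrome (black) mask
    return masked_image, normal_image, mask

rng = np.random.default_rng(0)  # fixed seed only to make the sketch repeatable
normal_image = np.full((16, 16), 128, dtype=np.uint8)  # hypothetical normal image
masked_image, target, mask = make_training_pair(normal_image, mask_size=4, rng=rng)
```

During training, the model would receive `masked_image` and be penalized for differences from `target`, in particular inside `mask`.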

FIG. 5 illustrates anomaly detection during operation. The upper side shows the case where the inspection object 401 has no abnormality, and the lower side shows the case where the inspection object 501 has an abnormality 502. During operation, a mask is superimposed on the signal acquired from the camera (sensor 102), and the masked signal is input to the inpainting model. Based on the error in the masked area between the signal reconstructed by the inpainting model and the unmasked input signal, it is determined whether the input signal contains abnormal portions. When the mask overlaps the abnormal portion, the error between the input signal and the reconstructed signal increases because the abnormal portion is reconstructed as if it were a normal portion.

A case where there is no abnormality in the inspection target 401 on the upper side of FIG. 5 will be described. A masked image 420 is generated by superimposing a mask 405 on an input image 410, image-in-painting 430 processing is performed, and a reconstructed image 440 is generated. The input image 410 and the reconstructed image 440 are then compared to calculate the error in the area of the mask 405. Here, since there is no abnormality in the area of the mask 405, the error in the area of the mask 405 is small.

A case where there is an abnormality 502 in the inspection target 501 on the lower side of FIG. 5 will be described. A masked image 520 is generated by superimposing a mask 505 on an input image 510, image-in-painting 530 processing is performed, and a reconstructed image 540 is generated. The input image 510 and the reconstructed image 540 are then compared to calculate the error in the area of the mask 505. Here, since the region of the mask 505 contains the abnormality 502, this abnormality is not reflected in the reconstructed image 540. Therefore, the error in the area of the mask 505 is large.
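The contrast between the two cases can be reproduced on toy data. In the sketch below the trained inpainting model is replaced by a deliberately crude stand-in that fills the masked region with the mean of the visible pixels; this stand-in is purely an assumption for illustration and is not the DL model of the publication. The image contents and sizes are likewise hypothetical.

```python
import numpy as np

def standin_inpaint(masked_image, mask):
    """Hypothetical stand-in for the trained inpainting model:
    fills the masked region with the mean of the visible pixels."""
    out = masked_image.astype(np.float64).copy()
    out[mask] = out[~mask].mean()
    return out

def mask_region_error(image, mask):
    """Mask the image, reconstruct it, and return the MSE inside the mask."""
    masked = image.copy()
    masked[mask] = 0
    reconstructed = standin_inpaint(masked, mask)
    diff = image.astype(np.float64) - reconstructed
    return float(np.mean(diff[mask] ** 2))

normal = np.full((8, 8), 50, dtype=np.uint8)   # uniform "normal" image
abnormal = normal.copy()
abnormal[3:5, 3:5] = 200                       # simulated abnormality

mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True                          # mask covering the abnormality

err_normal = mask_region_error(normal, mask)
err_abnormal = mask_region_error(abnormal, mask)
```

Because the stand-in only ever sees normal-looking pixels outside the mask, it restores the masked region as normal, so the error stays at zero for the normal image and becomes large for the image with the simulated abnormality.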

<Example of processing of anomaly detection system>
FIG. 6 is a diagram for explaining an example of processing of the anomaly detection system of the present invention. With reference to FIG. 6, the processing of the sensor 102, the signal acquisition unit 201, the mask generation unit 202, the mask superimposition unit 203, the signal reconstruction unit 204, the abnormality determination unit 205, and the auxiliary storage unit 207 shown in FIG. 3 when the input signal is an image will be described.

The signal acquisition unit 201 acquires an image transmitted from the camera that is the sensor 102 and stored in the auxiliary storage unit 207 and outputs it as an input image 610 . Here, an example in which an inspection object 601 with an abnormality 602 is shown in an input image 610 is shown.

The mask generation unit 202 reads parameters such as the size, shape, number, and slide amount of the masks stored in the auxiliary storage unit 207 and generates a mask 605 . The mask 605 is positioned differently for each applied image. In FIG. 6, one square mask is formed by changing the position in order for each image to be applied. As a result, n patterns of mask patterns 620-1 to 620-n are generated. Preferably, these patterns together can cover all locations in the image. The shape of the mask 605 is rectangular, which is suitable for sequentially masking the entire area, but the shape is not limited to this, and a specific shape can also be applied.
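One way to generate such a set of single-square mask patterns from a size and a slide amount is sketched below. The function name and the parameter values are assumptions; with these values the patterns together cover the whole image, which holds whenever the image size minus the mask size is a multiple of the slide amount.

```python
import numpy as np

def sliding_masks(height, width, mask_size, slide):
    """Generate one boolean mask per position of a square mask that is
    slid over the image in steps of `slide` pixels."""
    masks = []
    for top in range(0, height - mask_size + 1, slide):
        for left in range(0, width - mask_size + 1, slide):
            m = np.zeros((height, width), dtype=bool)
            m[top:top + mask_size, left:left + mask_size] = True
            masks.append(m)
    return masks

# Hypothetical parameters: 8x8 image, 4x4 mask, slide amount of 2 pixels.
masks = sliding_masks(height=8, width=8, mask_size=4, slide=2)
coverage = np.logical_or.reduce(masks)  # True where at least one mask applies
```

A slide amount smaller than the mask size makes neighbouring masks overlap, which increases the number of patterns n but lets an abnormality near a mask boundary fall well inside at least one mask.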

Next, the mask superimposing unit 203 superimposes the mask generated by the mask generating unit 202 on the input image 610 to create masked images 630-1 to 630-n. That is, n images are created by superimposing masks 605 at different positions on the same input image 610 . In FIG. 6, the position is changed while sliding the mask area.

Next, the signal reconstruction unit 204 inputs the masked image generated by the mask superimposition unit 203 to the inpainting model (image inpainting 635). As an output result, n reconstructed images 640-1 to 640-n are respectively reconstructed. At this time, the masked images 630-1 to 630-n may be input to the inpainting model one by one, or a plurality of images may be subjected to batch processing (parallel processing). Reconstructed images 640-1 through 640-n are reconstructions of mask 605 portions of masked images 630-1 through 630-n, respectively.

Next, the abnormality determination unit 205 compares the input image 610 with each of the reconstructed images 640-1 to 640-n reconstructed from the masked images 630-1 to 630-n. This comparison calculates the error in each region of the mask 605; if any error is equal to or greater than a certain threshold, it is determined that there is an abnormality, and if all the errors are less than the threshold, it is determined that there is no abnormality. As the threshold here, the difference can be binarized with a certain reference value, and a threshold on the extent (number of pixels) of the resulting difference region can be used: if that extent is greater than or equal to a predetermined range, the region can be determined to be an abnormal portion. The reference value for binarization can be, for example, a pixel value difference of a predetermined value or more, and a suitable value can be chosen. FIG. 6 shows the binarized error images 650-1 to 650-n; the error image 650-m shows a difference at the abnormality 602.
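The binarization-based decision can be sketched as follows: the absolute difference is binarized with a reference value inside each mask, and an abnormality is reported when the number of differing pixels in any error image reaches a minimum count. The reference value of 30, the minimum count of 3, and the function names are assumptions for illustration.

```python
import numpy as np

def binarized_error(original, reconstructed, mask, reference_value):
    """Boolean error image: True where the absolute pixel difference inside
    the mask is at least `reference_value`."""
    diff = np.abs(original.astype(np.float64) - reconstructed.astype(np.float64))
    if diff.ndim == 3:            # colour image: take the largest channel difference
        diff = diff.max(axis=2)
    return (diff >= reference_value) & mask

def has_abnormality(error_images, min_pixels):
    """True when any binarized error image contains at least `min_pixels` pixels."""
    return any(int(e.sum()) >= min_pixels for e in error_images)

# Hypothetical data: a 2x2 abnormality that the reconstruction does not reproduce.
original = np.full((8, 8), 50, dtype=np.uint8)
original[3:5, 3:5] = 200
reconstructed = np.full((8, 8), 50, dtype=np.uint8)

mask_hit = np.zeros((8, 8), dtype=bool)
mask_hit[2:6, 2:6] = True         # mask overlapping the abnormality
mask_miss = np.zeros((8, 8), dtype=bool)
mask_miss[0:2, 0:2] = True        # mask over a normal area

error_images = [
    binarized_error(original, reconstructed, mask_miss, reference_value=30),
    binarized_error(original, reconstructed, mask_hit, reference_value=30),
]
abnormal = has_abnormality(error_images, min_pixels=3)
```

The index of the error image that triggers the decision also indicates which mask position, and therefore which part of the image, contains the abnormality.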

It should be noted that the size, shape, number, and slide amount of the mask 605 can be arbitrarily set.

<Example of reducing the number of masked signals>
FIG. 7 is a diagram illustrating an example of masks for reducing the number of masked signals in the anomaly detection system of the present invention. Here, an example in which an inspection target 701 with an abnormality 702 is shown in an input image 710 is shown.

If many masked signals are input to the inpainting model, the processing time increases and real-time performance is impaired. For this reason, the mesh pattern mask shown in FIG. 7 can be used as a means of reducing the number of masked signals input to the inpainting model. Unlike FIG. 6 described above, a mesh pattern mask that covers a plurality of locations arranged in a predetermined pattern is used instead of a mask that covers a single location.

The mask generation unit 202 forms a mesh pattern mask 705 by sequentially changing the position for each image to be applied. As a result, four mask patterns 720-1 to 720-4 are generated. Preferably, these patterns together can cover all locations in the image. It should be noted that the shape of one mask of the mesh pattern is a rectangle, which is suitable for covering the entire area while shifting the position, but it is not limited to this, and a specific shape can also be applied.

Next, the mask superimposing unit 203 superimposes the mask patterns 720-1 to 720-4 on the input image 710 to create masked images 730-1 to 730-4. That is, four images are created by superimposing masks 705 at different positions on the same input image 710 .

Also, the location of the abnormality can be specified by the abnormality determination unit 205, as in FIG.

In this way, in the example of FIG. 7, using the mesh pattern makes it possible to reduce the number of masked signals and the amount of processing. Specifically, the 48 masks can be analyzed as 4 mask patterns. This can greatly reduce the time required for analysis. Note that if the area hidden at one time is increased, the image may be difficult to restore at the time of reconstruction. In this case, the technique of applying the masks one location at a time, as shown in FIG. 6, is effective.
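One way to group 48 rectangular masks into four mesh patterns is sketched below. The layout is an assumption made for illustration: the image is divided into a 6 x 8 grid of square cells, and each pattern takes the cells whose row and column indices share the same parity. The function name `mesh_patterns` and the image size are hypothetical; the actual arrangement in FIG. 7 may differ.

```python
import numpy as np

def mesh_patterns(height, width, cell):
    """Build four boolean mesh masks whose union covers the whole image."""
    rows = np.arange(height)[:, None] // cell  # cell row index of each pixel
    cols = np.arange(width)[None, :] // cell   # cell column index of each pixel
    patterns = []
    for dr in (0, 1):
        for dc in (0, 1):
            # Select every other cell in both directions, offset by (dr, dc).
            patterns.append((rows % 2 == dr) & (cols % 2 == dc))
    return patterns

patterns = mesh_patterns(height=60, width=80, cell=10)  # 6 x 8 = 48 cells
coverage = np.sum(patterns, axis=0)  # how many patterns hide each pixel
```

With this layout every pixel is hidden by exactly one of the four patterns, so four passes through the inpainting model inspect the whole image.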

<Example of mask selection method using background subtraction method>
FIG. 8 is a diagram showing a model for explaining an example of a mask selection method using the background subtraction method in the anomaly detection system of the present invention. FIG. 9 is a diagram illustrating an example of a mask selection method using the background subtraction method in the anomaly detection system of the present invention.

A mask pattern is generated by the mask generation unit 202 shown in FIGS. 3 and 6, but in order to minimize the number of masks used at this time, preprocessing for selecting a mask may be added. This preprocessing can reduce the range in which the mask is used, for example, by limiting in advance the area for which the mask is selected. As a specific example, a mask selection method using the background subtraction method will be described with reference to FIGS.

Here, as shown in FIG. 8, an example is taken in which an inspection object 802 flowing on a conveyor 803 is imaged using a camera 801 fixed above it. The camera 801 corresponds to the sensor 102 shown in FIGS. 2, 3, and 6, and photographs the inspection object 802 from above. The inspection object 802 moves on the conveyor 803 in the direction of the arrow. At this time, it is assumed that the angle of view of the camera 801 is fixed, and the inspection object 802 passes through the angle of view of the camera 801 in the horizontal direction.

As shown in FIG. 9, a background image 920 in which the object to be inspected is not shown is prepared. Specifically, this is the image obtained in FIG. 8 when the inspection object 802 is not present. An input image 910 is an image in which the inspection object 802 is shown. Then, the difference between the background image 920 and the input image 910 is obtained and binarized with a threshold to generate a difference image 930. A value suitable for the difference is selected as this threshold. In the example of FIG. 9, the white portion of the difference image 930 is determined to be the area of the inspection object 802.

The mask 905 is selected within the white area of the generated difference image 930 (the area corresponding to the inspection object 802). The upper-left and lower-right coordinates of the white portion of the difference image 930 are identified, and the masks are selected based on the size of the mask 905 and the slide amount. Specifically, a pattern such as the mask slide image 940 is determined. In the mask slide image 940, the mask slides horizontally from the upper left to the right end, and then moves down one level and again slides from the left end to the right end. This selects masks sufficient to cover the white portion of the difference image 930.

FIG. 9 shows an example in which the mask generation unit 202 selects masks from the difference image 930. Here, it is determined that the selected region can be covered with one level of the mask 905 in the vertical direction. Therefore, four mask patterns 950-1 to 950-4 are generated by sequentially changing the position of the mask 905 from left to right.

This can greatly reduce the number of masked signals input to the inpainting model.
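The selection procedure can be sketched as follows, assuming grayscale images and a square mask. The function names `slide_starts` and `select_masks`, the reference value `diff_ref`, and the toy image sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def slide_starts(lo, hi, size, step):
    """Start positions so that masks of `size` cover [lo, hi); needs 0 < step <= size."""
    pos = [lo]
    while pos[-1] + size < hi:
        pos.append(pos[-1] + step)
    return pos

def select_masks(input_img, background, diff_ref, size, step):
    """Return (top, left, bottom, right) mask boxes covering the foreground region."""
    diff = np.abs(input_img.astype(np.int32) - background.astype(np.int32))
    fg = diff >= diff_ref                   # binarized difference image
    ys, xs = np.nonzero(fg)
    if ys.size == 0:
        return []                           # no inspection object in view
    # Upper-left and lower-right corners of the white (foreground) portion.
    top, left = int(ys.min()), int(xs.min())
    bottom, right = int(ys.max()) + 1, int(xs.max()) + 1
    h, w = fg.shape
    # Slide left to right along each level, then move down to the next level.
    return [(y, x, min(y + size, h), min(x + size, w))
            for y in slide_starts(top, bottom, size, step)
            for x in slide_starts(left, right, size, step)]

background = np.zeros((40, 120), dtype=np.uint8)
frame = background.copy()
frame[10:28, 20:90] = 180                   # toy inspection object
boxes = select_masks(frame, background, diff_ref=50, size=20, step=20)
```

In this toy case the object is 18 pixels tall and 70 pixels wide, so one vertical level and four horizontal positions are enough, which mirrors the four mask patterns described for FIG. 9.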

<Example of mask selection method using methods other than background subtraction>
In addition to the background subtraction method described with reference to FIGS. 8 and 9, there are other methods of selecting a mask. For example, template matching with an image in which the object to be inspected does not appear, difference between frames when the object to be inspected is moving, motion detection by optical flow, and the like may be used.

In the case of template matching, a method can be used in which the input is compared with a template image prepared in advance and regions close to the template are excluded from the mask selection area. In the case of optical flow, by detecting the range of a moving object such as a person and masking that portion, the other portions can be excluded from the mask selection region.

<Example of mask adjustment method using object detection results>
FIG. 10 is a diagram illustrating an example of a mask adjustment method using object detection results in the anomaly detection system of the present invention.

If the size of the mask is larger than the size of the abnormal part, the information of the abnormal part is not input to the inpainting model. As a result, the image is reconstructed without the influence of the abnormal portion, so the detection accuracy of the abnormal portion can be improved. On the other hand, if the mask is superimposed so that most of the inspection target is hidden, the area to be reconstructed becomes large. As a result, the accuracy of reconstructing the object to be inspected is lowered, which degrades the detection accuracy. Therefore, detection accuracy can be further improved by adding mask size adjustment processing.

FIG. 10 shows an example of adjusting the size of the mask using the object detection result of a DL model that can detect the object to be inspected. So that the inspection target can be detected, the DL model is trained in advance using normal inspection targets.

First, the input image 1010 is input to the trained object detection DL model. The object detection DL model predicts the area of each inspection object 1001 and outputs an object detection result 1020, in which the range of the inspection object 1001 is specified. In FIG. 10, the ranges of three inspection objects 1001 are identified. Then, masks 1005 each covering a part of the identified area of an inspection object 1001 are created, so that a plurality of masks are used to cover one inspection object 1001. In FIG. 10, each mask is created to cover one quarter of the area.

The mask superimposing unit 203 superimposes the created masks on the input image 1010. In FIG. 10, a total of 12 masked images 1030-1 to 1030-12 are created in order to cover 1/4 of the area at a time for each of the three inspection objects 1001. Note that the size covered by the mask may be adjusted as appropriate to 1/n of the inspection target (n is a natural number; an integer of 2 or more, or an integer of 4 or more, can be applied).

With such a configuration, the signal is restored without hiding most of the inspection object, so it is possible to improve the accuracy of the inpainting model and reduce the processing load.
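The creation of quarter-area masks from detection results can be sketched as follows. The bounding boxes stand in for the output of the object detection DL model and are made-up values; the function name `split_box` and the 2 x 2 split are assumptions, since the source only states that each mask covers one quarter of the area.

```python
def split_box(box, n_rows=2, n_cols=2):
    """Split a (top, left, bottom, right) box into n_rows x n_cols mask boxes."""
    top, left, bottom, right = box
    # Integer grid lines; the last line always lands exactly on the box edge.
    ys = [top + (bottom - top) * i // n_rows for i in range(n_rows + 1)]
    xs = [left + (right - left) * j // n_cols for j in range(n_cols + 1)]
    return [(ys[i], xs[j], ys[i + 1], xs[j + 1])
            for i in range(n_rows) for j in range(n_cols)]

# Hypothetical detection results for three inspection objects.
detections = [(10, 10, 50, 50), (10, 70, 50, 110), (10, 130, 50, 170)]
masks = [m for box in detections for m in split_box(box)]
```

Three objects with four masks each give 12 masks, matching the 12 masked images described for FIG. 10. Changing `n_rows` and `n_cols` gives other 1/n splits.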

Alternatively, the background subtraction method of FIGS. 8 and 9 or the like may be used to specify the size of the inspection target including the abnormal portion, and then the size of the mask may be adjusted using the mask adjustment method described above.

<First specific example>
FIG. 11 is a diagram showing a model for explaining the first specific example of the anomaly detection system of the present invention. FIG. 12 is a diagram for explaining a first specific example of the anomaly detection system of the present invention.

In the first specific example, anomaly detection is performed with the mask fixed in place for a moving inspection object. As shown in FIG. 11, an inspection object 1102 flowing on a belt conveyor 1103 is imaged using a camera 1101 fixed above it. The camera 1101 corresponds to the sensor 102 shown in FIGS. 2, 3, and 6, and photographs the inspection object 1102 from above. A plurality of inspection objects 1102 move on the belt conveyor 1103 at intervals in the direction of the arrow (from right to left). At this time, the angle of view of the camera 1101 is fixed, and the inspection object 1102 passes through the angle of view of the camera 1101 in the horizontal direction.

As shown in FIG. 12, the inspection target 1102 moves from the right side to the left side of the image captured by the camera. FIG. 12 shows consecutive input image frames n−1, n, n+1, and n+2. The mask 1205 area in the input image 1210 is fixed to a width A (the width in the direction perpendicular to the traveling direction) through which the inspection target 1102 passes, or wider. That is, the position and size of the mask 1205 area in the input image 1210 are always constant. Also, the width of the mask 1205 in the traveling direction of the inspection target 1102 is set to be equal to or greater than the distance the inspection target 1102 moves in one frame. This makes it possible to inspect the entire area to be inspected.

The inspection object 1102 may be any of various objects that are assumed to move, such as industrial products, food, and transported items such as cardboard boxes.

Thus, in the first specific example, fixing the position of the mask 1205 reduces the number of masked images, so a system that enables real-time analysis can be realized.
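The condition that the mask width in the traveling direction be at least the per-frame movement can be checked with a small simulation. This is an illustrative sketch only: the function name `inspected_columns` and the pixel values are hypothetical, and the object is assumed to move left by a constant `shift` pixels per frame past a mask occupying image columns 0 to `mask_w` - 1.

```python
def inspected_columns(obj_len, shift, mask_w):
    """Object columns (0..obj_len-1) that fall under the fixed mask in some frame."""
    seen = set()
    pos = mask_w  # image column of the object's left edge, starting just right of the mask
    while pos + obj_len > 0:  # until the object has completely passed the mask
        # Object column c sits at image column pos + c and is masked when
        # 0 <= pos + c < mask_w.
        lo, hi = max(0, -pos), min(obj_len, mask_w - pos)
        seen.update(range(lo, hi))
        pos -= shift  # movement of the object in one frame
    return seen

full = inspected_columns(obj_len=50, shift=8, mask_w=8)     # mask width == movement
partial = inspected_columns(obj_len=50, shift=8, mask_w=5)  # mask narrower than movement
```

When the mask is narrower than the per-frame movement, some columns of the object never pass under the mask in any frame and therefore go uninspected.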

<Second specific example>
FIG. 13 is a diagram showing a model for explaining a second specific example of the anomaly detection system of the present invention. FIG. 14 is a diagram for explaining a second specific example of the anomaly detection system of the present invention. The second specific example differs from the first specific example in that the camera moves, but is common in that the inspection object moves relative to the camera.

In the second specific example, as shown in FIG. 13, a fixed inspection target 1302 is photographed using a camera 1301 that moves in the direction of the arrow (from left to right). A camera 1301 corresponds to the sensor 102 shown in FIGS. 2, 3, and 6, and shoots an object 1302 to be inspected from above. The camera 1301 has a moving mechanism capable of moving in parallel with the inspection object 1302 . At this time, the angle of view of the camera 1301 is fixed, and the inspection object 1302 passes through the angle of view of the camera 1301 in the horizontal direction.

As shown in FIG. 14, the mask 1405 area in the input image 1410 is fixed to the width B (the width in the direction perpendicular to the direction of travel) of the inspection object 1302 or wider. That is, the position and size of the mask 1405 area in the input image 1410 are constant. Also, the width of the mask 1405 in the traveling direction of the camera 1301 is set to be equal to or greater than the length of movement of the camera 1301 in one frame. This makes it possible to inspect the entire area of the inspection target by photographing while moving the camera 1301 .

Thus, in the second specific example, it is possible to continuously inspect the inspection object 1302 while moving the camera 1301. This example is particularly effective when the inspection object 1302 is a fixed object, such as a long object like an electric wire or a large object. Also, when it is desired to inspect the entire circumference of a cross section perpendicular to the direction of travel, it is possible to use a plurality of cameras arranged in the circumferential direction of the cross section or a mirror that captures the circumferential direction.

<Example of application to time-series data>
FIG. 15 is a diagram illustrating an example of application of the anomaly detection system of the present invention to time-series data.

The input signal used for anomaly detection is not limited to one signal. For example, in the case of a time-series continuous signal, a portion of the time-series continuous signal may be input to the inpainting model as a mask region to perform anomaly detection. FIG. 15 illustrates an example of application to time-series data. Here, moving image data is taken as an example of time-series data.

First, several consecutive frames of the video data are extracted. In FIG. 15, n frames 1510-1 to 1510-n are extracted. Next, a mask is superimposed on some of the frames. In FIG. 15, masked data 1520-2 is created by superimposing a mask on the entire area of frame 1510-2. Since no mask is superimposed on the other frames 1510-1 and 1510-3 to 1510-n, the masked data 1520-1 and 1520-3 to 1520-n are the same images as the original frames 1510-1 and 1510-3 to 1510-n.

Note that the mask may be superimposed on the entire area of the relevant frame or on a part of the detection area. The frame on which the mask is superimposed may be fixed to a single frame, namely the m-th frame counted from the first of the extracted frames, or may be a plurality of frames.

Next, the image group in which a mask is superimposed on some frames is input to the inpainting model 1530. The inpainting model 1530 performs reconstruction according to the inpainting method and generates an image group with the same number of frames as the input. In FIG. 15, reconstructed frames 1540-1 to 1540-n are generated. In particular, the reconstructed frame 1540-2 is the reconstruction of the masked frame.

After that, the error between the reconstructed frame 1540-2 corresponding to the mask-superimposed frame and the original frame 1510-2 is calculated to determine whether the relevant frame contains an abnormal portion. A threshold value is set for determination, and if there is an error of a certain value or more, it is determined that there is an abnormal portion. The threshold in this case can be applied in the same manner as in FIG. FIG. 15 shows a binarized error image 1550 with a difference range.

As another modification, only the frames on which the mask is superimposed may be reconstructed. As a further modification, instead of superimposing masks as described above, arbitrary frames may be deleted, the remaining frames input to the inpainting model, and the deleted frames reconstructed.
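The frame-level procedure can be sketched as follows. Note the substitution: the trained inpainting model 1530 is replaced here by a trivial placeholder that averages the two neighbouring frames, purely so that the example runs on its own. The function names, `diff_ref`, and `min_pixels` are hypothetical, and the masked frame index `m` is assumed to have a neighbour on each side.

```python
import numpy as np

def reconstruct_masked_frame(frames, m):
    """Placeholder for the inpainting model: average the neighbours of frame m."""
    return (frames[m - 1].astype(np.float64) + frames[m + 1].astype(np.float64)) / 2.0

def frame_is_abnormal(frames, m, diff_ref, min_pixels):
    """Mask the whole m-th frame, reconstruct it, and threshold the error."""
    masked = frames.copy()
    masked[m] = 0                                  # masked data for frame m
    recon = reconstruct_masked_frame(masked, m)    # reconstructed frame m
    err = np.abs(frames[m].astype(np.float64) - recon)
    # Binarize the error and compare the differing pixel count with the threshold.
    return int((err >= diff_ref).sum()) >= min_pixels

normal_clip = np.full((5, 16, 16), 50, dtype=np.uint8)   # five identical frames
odd_clip = normal_clip.copy()
odd_clip[2, 4:8, 4:8] = 250                               # sudden change in frame 2

normal_result = frame_is_abnormal(normal_clip, m=2, diff_ref=30, min_pixels=10)
odd_result = frame_is_abnormal(odd_clip, m=2, diff_ref=30, min_pixels=10)
```

Because the masked frame is never shown to the reconstructor, a change that appears only in that frame cannot be reproduced and shows up as a large error.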

Examples of applications to time-series data include the detection of people and vehicles that move unsteadily (detection of abnormal behavior), the detection of stagnation of people working on production lines, and the detection of deviant behavior.

Skeleton detection may also be used as an application example. For example, the skeletal coordinates of a person are estimated for each frame of a moving image. A portion of the result is then masked to reconstruct the skeletal coordinates for the frame at the masked time. In this way, abnormal behavior may be detected from the error between the estimated skeletal coordinates and the reconstructed skeletal coordinates.

In FIG. 15, moving images are used as time-series data, but sensor data such as vibration, voltage, and sound may also be used. In this case, since it is waveform data, part of the waveform data at a certain time is masked and reconstructed. Then, an abnormality can be detected from the error between the reconstructed signal and the input signal. For example, when there is a peak, an abnormality can be detected.
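For waveform data, the same idea can be sketched on a one-dimensional signal. As a labeled substitution, linear interpolation across the masked interval (`numpy.interp`) stands in for the trained reconstruction model; the function name `masked_segment_error`, the sine test signal, and the threshold of 0.5 are hypothetical.

```python
import numpy as np

def masked_segment_error(signal, start, stop):
    """Mask samples [start, stop), rebuild them, and return the largest error."""
    t = np.arange(signal.size)
    keep = np.ones(signal.size, dtype=bool)
    keep[start:stop] = False                       # masked time interval
    # Placeholder reconstruction from the unmasked samples only.
    recon = np.interp(t[start:stop], t[keep], signal[keep])
    return float(np.max(np.abs(signal[start:stop] - recon)))

normal_wave = np.sin(2 * np.pi * np.arange(400) / 200)
peaky_wave = normal_wave.copy()
peaky_wave[105] += 2.0                              # isolated peak inside the mask

normal_err = masked_segment_error(normal_wave, 100, 110)
peaky_err = masked_segment_error(peaky_wave, 100, 110)
is_peak_abnormal = peaky_err >= 0.5
```

The smooth waveform is reconstructed almost exactly, whereas the peak cannot be predicted from its surroundings and yields a large error.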

Also, the signal acquired from the sensor 102 may optionally be converted into a power spectrum, a spectrogram, or the like before anomaly detection is performed. For example, an arbitrary frequency component of the power spectrum, obtained by transformation into the frequency domain using an FFT (Fast Fourier Transform) or the like, is masked and input. Alternatively, the spectrogram may be regarded as an image, and a part of it may be masked and input to perform anomaly detection.

<Flowchart>
FIG. 16 is an example of a processing flowchart of the anomaly detection system of the present invention.

First, the processor unit of the analysis server 101 shown in FIG. 3 executes the program loaded from the auxiliary storage unit 207 to the main storage unit to activate the anomaly detection system. The anomaly detection system may allow the user to check the results through a GUI (Graphical User Interface), or may notify and confirm only the presence or absence of the judged anomaly.

After the anomaly detection system is started, in step 1601, parameters are determined, such as the sensor 102 that acquires the input signal, the frequency of acquiring the input signal, the size of the input signal, the size and shape of the mask, the number of masks, the slide amount, and the error threshold. The masks may be analyzed one at a time, or a plurality of masks may be analyzed at the same time, as with a mesh pattern. These parameters may be set by reading a setting file prepared in advance, or may be selected by the user using a GUI. Note that the number of sensors 102, such as cameras, that acquire input signals may be one or more. Hereinafter, an example will be described in which the device that acquires the input signal is a single camera serving as the sensor 102, and images are sequentially acquired in real time.

Next, in step 1602, the signal acquisition unit 201 reads the input signal acquired from the camera (sensor 102).

Next, in step 1603, it is determined whether or not an end command has been executed. If the end command has been executed, the anomaly detection system is terminated. If not, the process proceeds to step 1604. Here, the end command may be a keyboard operation or a GUI operation.

In step 1604, the mask generation unit 202 adjusts and selects the masks. Note that the mask size may be fixed as set in step 1601 so that all masks are used. The mask may be adjusted so as to cover a part of the inspection target area. Alternatively, masks sufficient to cover the portion where the object to be inspected is assumed to be may be selected by background subtraction, template matching, optical flow, or the like.

Next, in step 1605, the mask superimposing unit 203 superimposes one of the masks determined in step 1604 on the input signal.

Next, in step 1606, the signal reconstructing unit 204 inputs the mask-superimposed signal to the inpainting model to reconstruct the signal.

Next, in step 1607, the error between the input signal and the signal reconstructed in step 1606 is calculated by the abnormality determination unit 205.

In step 1608, the abnormality determination unit 205 determines whether the error calculated in step 1607 is larger than the error threshold determined in step 1601. If the condition is satisfied, proceed to step 1609; otherwise, proceed to step 1610.

In step 1609, the abnormality determination unit 205 determines that the area superimposed with the mask is an abnormal portion.

In step 1610, the abnormality determination unit 205 determines whether or not all the masks selected in step 1604 have been determined. If the condition is satisfied, the process proceeds to step 1612; otherwise, the process proceeds to step 1611.

At step 1611, the mask used in the mask superimposing unit 203 is changed. Based on the results of the setting in step 1601 and the adjustment/selection in step 1604, the next mask that has not been used is determined, and the process returns to step 1605.

In step 1612, the output control unit 206 notifies the user of the abnormal portion. Here, the notification may be displayed on the GUI, or an event occurrence notification may be delivered to a small terminal. After the event occurrence notification is completed, the process returns to step 1602 to read the next input signal.
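The loop over masks in steps 1605 to 1611 can be sketched for a single input signal as follows. This is a simplified illustration under stated assumptions: the signal is a one-dimensional list, each mask is a (start, stop) index range, and `reconstruct` is a stand-in that fills masked samples with a fixed "normal" value instead of calling a trained inpainting model. All function names are hypothetical.

```python
def superimpose(signal, mask):
    """Step 1605: hide samples in [start, stop) by replacing them with None."""
    start, stop = mask
    return [None if start <= i < stop else v for i, v in enumerate(signal)]

def reconstruct(masked, normal_value=1.0):
    """Step 1606 stand-in: a real system would call the trained inpainting model."""
    return [normal_value if v is None else v for v in masked]

def error_in_mask(signal, recon, mask):
    """Step 1607: largest absolute error inside the mask region."""
    start, stop = mask
    return max(abs(signal[i] - recon[i]) for i in range(start, stop))

def find_abnormal_masks(signal, masks, threshold):
    abnormal = []
    for mask in masks:                           # steps 1610 and 1611: next unused mask
        recon = reconstruct(superimpose(signal, mask))
        if error_in_mask(signal, recon, mask) > threshold:  # step 1608
            abnormal.append(mask)                # step 1609: region is abnormal
    return abnormal                              # reported to the user in step 1612

signal = [1.0] * 12
signal[7] = 5.0                                  # abnormal sample
masks = [(0, 4), (4, 8), (8, 12)]
abnormal = find_abnormal_masks(signal, masks, threshold=0.5)
```

Only the mask that hides the abnormal sample produces an error above the threshold, so the location of the abnormality is obtained together with its presence.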

As described above, by masking the abnormal portion, only the normal signal portion is input to the model, and the masked region is reconstructed as a normal signal. As a result, highly accurate anomaly detection can be realized by comparing the reconstructed signal with the original signal including the abnormal portion. In addition, by optimizing the analysis through selection of the masks to be used and adjustment of their size, it is possible to realize high-speed and high-precision anomaly detection.

As described above, the embodiments of the present invention have been described, but the present invention is not limited to the above examples, and includes various modifications. For example, it is not limited to those having all the configurations provided in the above-described embodiments. It is also possible to delete part of the configuration of one embodiment, replace it with the configuration of another embodiment, or add the configuration of another embodiment to the configuration of one embodiment.

DESCRIPTION OF SYMBOLS 1... Computer system, 2... Processor, 2A, 2B... Processing unit, 4... Memory, 6... Memory bus, 8... I/O bus, 9... Bus interface unit, 10... I/O bus interface unit, 12... Terminal interface unit, 14... Storage interface unit, 16... Device interface unit, 18... Network interface, 20... User I/O device, 22... Storage device, 24... Display system, 26... Display device, 30... Network, 50... Latency factor identification application, 101... Analysis server, 102... Sensor, 103... Database server, 104... Network, 201... Signal acquisition unit, 202... Mask generation unit, 203... Mask superimposition unit, 204... Signal reconstruction unit, 205... Abnormality determination unit, 206... Output control unit, 207... Auxiliary storage unit, 301... Inspection object, 305... Mask, 310... Input image, 320... Masked image, 330... Image inpainting, 340... Reconstructed image, 401... Inspection target, 405... Mask, 410... Input image, 420... Masked image, 430... Image inpainting, 440... Reconstructed image, 501... Inspection target, 502... Abnormality, 505... Mask, 510... Input image, 520... Masked image, 530... Image inpainting, 540... Reconstructed image, 601... Inspection object, 602... Abnormality, 605... Mask, 610... Input image, 620... Mask pattern, 630... Masked image, 640... Reconstructed image, 650... Error image, 705... Mask, 720... Mask pattern, 730... Masked image, 801... Camera, 802... Inspection object, 803... Conveyor, 905... Mask, 910... Input image, 920... Background image, 930... Difference image, 940... Mask slide image, 950... Mask pattern, 1001... Inspection object, 1005... Mask, 1010... Input image, 1020... Object detection result, 1030... Masked image, 1101... Camera, 1102... Inspection object, 1103... Belt conveyor, 1205... Mask, 1210... Input image, 1301... Camera, 1302... Inspection object, 1405... Mask, 1410... Input image, 1510... Frame, 1520... Masked data, 1530... Inpainting model, 1540... Reconstructed frame, 1550... Error image

Claims

a signal acquisition unit that acquires an input signal from the sensor;
a mask generator that generates a mask to be superimposed on the input signal;
a mask superimposing unit that superimposes the mask generated by the mask generating unit on the input signal to generate a masked signal;
a signal reconstructing unit that reconstructs the masked signal generated by the mask superimposing unit to generate a reconstructed signal;
an abnormality determination unit that determines whether the input signal includes an abnormal portion based on the error in the mask region between the input signal and the reconstructed signal;
The anomaly detection system, wherein the signal reconstruction unit reconstructs the masked region using a deep learning model trained using a normal input signal.
In the anomaly detection system according to claim 1,
The anomaly detection system, wherein the abnormality determination unit determines that there is an abnormality when an error between the input signal and the reconstructed signal within the mask region is equal to or greater than a predetermined threshold, and determines that there is no abnormality when the error is less than the threshold.
In the anomaly detection system according to claim 2,
the sensor is a camera, the input signal is an image,
The anomaly detection system, wherein in the determination using the threshold, the input signal and the reconstructed signal are binarized with a predetermined reference value, and the threshold is used to determine whether the range of difference due to binarization is equal to or greater than a predetermined range.
In the anomaly detection system according to claim 1,
The mask generation unit generates a mask by setting parameters including mask size, shape, number, and slide amount during operation,
The mask superimposing unit generates a masked signal by superimposing the mask on the input signal by changing the position while sliding the mask region, and
The anomaly detection system, wherein the signal reconstructing unit reconstructs the masked area for each masked signal whose masked area has been changed.
In the anomaly detection system according to claim 1,
The anomaly detection system, wherein the mask generation unit selects a region of the mask to be superimposed using preprocessing.
In the anomaly detection system according to claim 5,
The anomaly detection system, wherein the preprocessing uses a background subtraction method to limit an area of the mask to be superimposed.
In the anomaly detection system according to claim 5,
The anomaly detection system, wherein the preprocessing limits the area of the mask to be superimposed to the area of an object detected by object detection using a deep learning model, and adjusts the size of the mask according to the area of the detected object.
In the anomaly detection system according to claim 1,
the sensor is a camera, the input signal is an image,
The camera captures an image of an inspection object that moves relative to the camera,
The anomaly detection system, wherein the mask superimposing unit superimposes the mask on the input image from the camera while keeping the position and size of the mask constant.
In the anomaly detection system according to claim 1,
The anomaly detection system, wherein the mask uses a mesh pattern mask in which a plurality of locations are arranged in a predetermined pattern.