WO2024157719A1 - Anomaly detection device, anomaly detection method, and program - Google Patents

Anomaly detection device, anomaly detection method, and program

Info

Publication number
WO2024157719A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
captured image
situation
captured
calculated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2023/046725
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
哲夫 井下
裕一 中谷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to JP2024572919A
Publication of WO2024157719A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast

Definitions

  • This disclosure relates to an anomaly detection device, an anomaly detection method, and a program.
  • Patent Document 1 discloses technology to detect defective products through image analysis.
  • The system in Patent Document 1 masks part of an image of an item such as a product, restores the masked image, and detects defects in the item by comparing the original image with the restored image.
  • Patent Document 1 does not assume that the situation of the subject of anomaly detection will change. This disclosure has been made in consideration of the above-mentioned problems, and one of its objectives is to provide a new technology that uses images to detect anomalies.
  • the abnormality detection device disclosed herein includes a generating means for generating a plurality of processed images by processing each of a plurality of captured images in time series, a calculating means for calculating, for each of the plurality of captured images, difference data representing the difference between that captured image and the processed image generated from that captured image, and a determining means for determining whether or not the captured image represents an abnormal situation based on the plurality of calculated difference data.
  • the processing performed on the captured image includes a masking process for masking one or more partial regions included in the captured image, and a restoration process for restoring the masked partial region using data other than the partial region.
  • the abnormality detection method disclosed herein is executed by a computer.
  • the abnormality detection method includes a generating step of generating a plurality of processed images by processing each of a plurality of captured images in a time series, a calculating step of calculating, for each of the plurality of captured images, difference data representing the difference between the captured image and the processed image generated from the captured image, and a determining step of determining whether or not the captured image represents an abnormal situation based on the calculated plurality of difference data.
  • the processing performed on the captured image includes a masking process that masks one or more partial regions included in the captured image, and a restoration process that restores the masked partial region using data other than the partial region.
  • the program disclosed herein causes a computer to execute the anomaly detection method disclosed herein.
  • This disclosure provides a new technology that uses images to detect anomalies.
  • FIG. 1 is a diagram illustrating an example of an overview of an abnormality detection device.
  • FIG. 2 is a diagram illustrating an example of an overview of the processing performed on a captured image.
  • FIG. 3 is a block diagram illustrating a functional configuration of the abnormality detection device.
  • FIG. 4 is a block diagram illustrating a hardware configuration of a computer that realizes the abnormality detection device.
  • FIG. 5 is a flowchart illustrating a flow of processing executed by the abnormality detection device.
  • FIG. 6 is a diagram illustrating an example of a mask process using a mask image.
  • FIG. 7 is a diagram illustrating a case where two mask images are applied alternately.
  • FIG. 8 is a diagram illustrating a case in which a normal region is excluded from a region to be masked.
  • A further figure illustrates an example of a functional configuration of an abnormality detection device having an output unit.
  • A further figure illustrates an example of an observation screen including a display indicating that the observation range is in an abnormal state.
  • Predetermined values, such as threshold values, are stored in advance in a storage device accessible from the device that uses them.
  • the storage unit is composed of one or any number of storage devices.
  • Fig. 1 is a diagram illustrating an overview of an abnormality detection device 2000 according to an embodiment. Note that Fig. 1 is a diagram for facilitating understanding of the overview of the abnormality detection device 2000, and the operation of the abnormality detection device 2000 is not limited to that shown in Fig. 1.
  • the abnormality detection device 2000 acquires a plurality of captured images 10 and determines whether the captured images 10 represent an abnormal situation.
  • the plurality of captured images 10 acquired by the abnormality detection device 2000 are time-series image data generated by the camera 20.
  • the camera 20 is configured to generate video data by repeatedly capturing images.
  • each captured image 10 is a video frame that constitutes the video data.
  • the abnormality detection device 2000 performs processing on each captured image 10 to generate a processed image 30 from the captured image 10.
  • FIG. 2 is a diagram illustrating an example of an overview of the processing performed on the captured image 10.
  • the processing performed on the captured image 10 includes a masking process and a restoration process.
  • the masking process is a process that generates an intermediate image 50 by masking at least one partial area included in the captured image 10.
  • the masked image area is represented by a dot pattern.
  • the restoration process is a process that generates a processed image 30 by restoring the masked partial area for the intermediate image 50 (i.e., the captured image 10 with the partial area masked).
  • the process of restoring a masked partial area in an intermediate image 50 is a process of inferring the contents of the partial area by using data of image areas other than the partial area contained in the intermediate image 50. Therefore, some difference may occur between the processed image 30 and the captured image 10 from which the processed image 30 was derived.
  • In this example, captured image 10 is an image of a road.
  • A fallen object 60 is captured in captured image 10.
  • In intermediate image 50, the image area representing the fallen object 60 is masked.
  • The fallen object 60 is not restored by a restoration process based on the unmasked image area. Therefore, the fallen object 60 is not included in processed image 30 obtained by the restoration process. As a result, a difference occurs between captured image 10 and processed image 30.
  • the abnormality detection device 2000 calculates, for each captured image 10, difference data 40 that indicates the difference between the captured image 10 and the processed image 30 generated from that captured image 10. The abnormality detection device 2000 then uses the calculated difference data 40 to perform a process of determining whether or not the captured image 10 represents an abnormal situation (hereinafter, an abnormality determination process).
  • an abnormal situation represented by the captured image 10 is, for example, "a situation in which an object that is not normally present is present within the imaging range of the camera 20."
  • An object that is not normally present is, for example, a foreign object such as a fallen object, or an object that has been left behind (hereinafter, “left behind object”).
  • the imaging range of the camera 20 is also referred to as the "observation range.”
  • camera 20 is a camera used to monitor road conditions.
  • a foreign object on the road is an object that does not normally exist in the road, which is the observation range. Therefore, if such a foreign object is captured in captured image 10, captured image 10 represents an abnormal situation.
  • camera 20 is a camera used for monitoring facilities such as an airport.
  • A left-behind object is an object that does not normally exist within the observation range. Therefore, if such a left-behind object is captured in captured image 10, captured image 10 represents an abnormal situation.
  • In the abnormality detection device 2000, processing including a mask process and a restoration process is performed on each of a plurality of captured images 10 in a time series, thereby obtaining a processed image 30 for each of them. Furthermore, for each of the plurality of captured images 10, difference data 40 is generated that represents the difference between the captured image 10 and the processed image 30 generated from that captured image 10. Then, by using the plurality of difference data 40, it is determined whether or not the captured image 10 represents an abnormal situation.
  • As a comparison, consider a method that determines whether or not a captured image 10 represents an abnormal situation based only on the difference between one captured image 10 and the processed image 30 generated from that captured image 10.
  • With this method, even if the difference between the captured image 10 and the processed image 30 is caused by temporary noise, it may be determined that the captured image 10 represents an abnormal situation.
  • In contrast, the abnormality detection device 2000 obtains multiple difference data 40 from multiple captured images 10 and detects abnormalities using these multiple difference data 40, which makes it possible to prevent erroneous detection of an abnormal situation due to temporary noise.
  • the anomaly detection device 2000 of this embodiment will be described in more detail below.
  • <Example of functional configuration> FIG. 3 is a block diagram illustrating a functional configuration of the abnormality detection device 2000 of this embodiment.
  • the abnormality detection device 2000 includes a generation unit 2020, a calculation unit 2040, and a determination unit 2060.
  • the generation unit 2020 performs processing on each of the multiple captured images 10 to generate multiple processed images 30.
  • the calculation unit 2040 calculates difference data 40 for each of the multiple captured images 10, the difference data 40 representing the difference between the captured image 10 and the processed image 30 generated from the captured image 10.
  • the determination unit 2060 determines whether or not the scene represented by the captured image 10 represents a predetermined situation based on the multiple calculated difference data 40.
  • Each functional component of the abnormality detection device 2000 may be realized by hardware that realizes each functional component (e.g., a hardwired electronic circuit, etc.), or may be realized by a combination of hardware and software (e.g., a combination of an electronic circuit and a program that controls it, etc.).
  • a further explanation will be given of the case where each functional component of the abnormality detection device 2000 is realized by a combination of hardware and software.
  • FIG. 4 is a block diagram illustrating an example of the hardware configuration of a computer 1000 that realizes the anomaly detection device 2000.
  • The computer 1000 may be any computer.
  • For example, the computer 1000 is a stationary computer such as a PC (Personal Computer) or a server machine.
  • Alternatively, the computer 1000 is a portable computer such as a smartphone or a tablet terminal.
  • The computer 1000 may be a dedicated computer designed to realize the anomaly detection device 2000, or may be a general-purpose computer.
  • For example, each function of the anomaly detection device 2000 is realized on the computer 1000 by installing a predetermined application on the computer 1000.
  • The application is composed of a program for realizing each functional component of the anomaly detection device 2000.
  • the method of acquiring the program is arbitrary.
  • the program can be acquired from a storage medium (such as a DVD disk or USB memory) on which the program is stored.
  • the program can be acquired by downloading the program from a server device that manages the storage device on which the program is stored.
  • Computer 1000 has bus 1020, processor 1040, memory 1060, storage device 1080, input/output interface 1100, and network interface 1120.
  • Bus 1020 is a data transmission path for processor 1040, memory 1060, storage device 1080, input/output interface 1100, and network interface 1120 to transmit and receive data to and from each other.
  • the method of connecting processor 1040 and the like to each other is not limited to bus connection.
  • The processor 1040 is any of various processors, such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or an FPGA (Field-Programmable Gate Array).
  • the memory 1060 is a primary storage device realized using RAM (Random Access Memory) or the like.
  • the storage device 1080 is an auxiliary storage device realized using a hard disk, SSD (Solid State Drive), memory card, ROM (Read Only Memory), or the like.
  • the input/output interface 1100 is an interface for connecting the computer 1000 to an input/output device.
  • an input device such as a keyboard and an output device such as a display device are connected to the input/output interface 1100.
  • the network interface 1120 is an interface for connecting the computer 1000 to a network.
  • This network may be a LAN (Local Area Network) or a WAN (Wide Area Network).
  • the storage device 1080 stores a program that realizes each functional component of the anomaly detection device 2000 (a program that realizes the application described above).
  • the processor 1040 reads this program into the memory 1060 and executes it to realize each functional component of the anomaly detection device 2000.
  • the anomaly detection device 2000 may be realized by one computer 1000, or may be realized by multiple computers 1000. In the latter case, the configuration of each computer 1000 does not need to be the same, and can be different from each other.
  • <Processing flow> FIG. 5 is a flowchart illustrating the flow of processing executed by the anomaly detection device 2000 of this embodiment.
  • The generating unit 2020 acquires a plurality of captured images 10 (S102).
  • The generating unit 2020 generates a processed image 30 from each captured image 10 (S104).
  • the calculating unit 2040 calculates difference data 40 representing the difference between each captured image 10 and the corresponding processed image 30 (S106).
  • the determining unit 2060 uses the calculated plurality of difference data 40 to determine whether or not the scene represented by the captured image 10 represents a predetermined situation (S108).
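  • The flow of S104 to S108 can be sketched as follows. This is only an illustrative outline, not the implementation of this disclosure: the function name `run_detection` and the three callables `process`, `difference`, and `decide` are hypothetical placeholders for the processing, the difference calculation, and the determination, respectively.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]  # hypothetical representation: rows of pixel values


def run_detection(
    captured: Sequence[Image],
    process: Callable[[Image], Image],
    difference: Callable[[Image, Image], float],
    decide: Callable[[List[float]], bool],
) -> bool:
    """Run S104 to S108 on captured images that were already acquired (S102)."""
    # S104: generate one processed image per captured image.
    processed = [process(image) for image in captured]
    # S106: calculate one difference value per (captured, processed) pair.
    diffs = [difference(c, p) for c, p in zip(captured, processed)]
    # S108: the determination uses all difference values together.
    return decide(diffs)
```

  • Passing the steps as callables keeps the sketch independent of any particular masking scheme, restoration model, or decision rule.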
  • the generating unit 2020 acquires the captured image 10 (S102).
  • Various methods can be adopted for acquiring the captured image 10 generated by the camera 20.
  • the camera 20 stores each captured image 10 in a storage unit accessible from the abnormality detection device 2000.
  • the generating unit 2020 acquires each captured image 10 from this storage unit.
  • the camera 20 may be configured to transmit the captured image 10 to the abnormality detection device 2000.
  • the generating unit 2020 acquires the captured image 10 by receiving the captured image 10 transmitted from the camera 20.
  • the generation unit 2020 may acquire the captured images 10 one by one, or may acquire multiple captured images 10 at once. In the latter case, for example, the generation unit 2020 periodically accesses the storage unit in which the captured images 10 are stored, and acquires all unacquired captured images 10 at once.
  • the generating unit 2020 generates the processed image 30 from each captured image 10 (S104).
  • the processing for generating the processed image 30 from the captured image 10 includes masking and restoration. The masking and restoration processes will be described below.
  • the generating unit 2020 masks one or more partial regions of the captured image 10.
  • the region to be masked is also referred to as a mask target region.
  • various methods can be adopted for masking a specific region on an image.
  • the generating unit 2020 changes the value of each pixel in the mask target region in the captured image 10 to a predetermined value (for example, 0 or 1) to generate an intermediate image 50 in which each mask target region is masked.
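  • As a minimal sketch of this pixel-overwriting approach, the function below sets every pixel inside each mask target region to a fill value. The `(top, left, height, width)` region layout and the name `mask_regions` are assumptions made for illustration only.

```python
from typing import List, Tuple

Image = List[List[int]]
Region = Tuple[int, int, int, int]  # (top, left, height, width); assumed layout


def mask_regions(image: Image, regions: List[Region], fill: int = 0) -> Image:
    """Return a copy of `image` with all pixels inside `regions` set to `fill`."""
    height, width = len(image), len(image[0])
    intermediate = [row[:] for row in image]  # leave the captured image intact
    for top, left, h, w in regions:
        # Clip each region to the image bounds before overwriting.
        for y in range(max(top, 0), min(top + h, height)):
            for x in range(max(left, 0), min(left + w, width)):
                intermediate[y][x] = fill
    return intermediate
```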
  • a method for identifying the mask target region will be described later.
  • the generating unit 2020 may generate the intermediate image 50 by performing a process of superimposing the captured image 10 and a mask image in which the arrangement of the mask target region is determined.
  • an existing technology can be used as a technology for performing a mask process on a specific image using an image for masking. Note that the mask image will be described later.
  • the generation unit 2020 randomly identifies one or more partial regions from the captured image 10, and treats each identified partial region as a mask target region.
  • the number of mask target regions to be identified from one captured image 10, the shape of the mask target region, and the size of the mask target region are, for example, predetermined.
  • the number of mask target regions is set to Nm.
  • the generation unit 2020 randomly identifies Nm positions from the captured image 10. Furthermore, for each of the identified Nm positions, the generation unit 2020 identifies a partial region of a predetermined shape and a predetermined size that uses that position as a reference position (for example, the center or the position of the upper left corner). Then, the generation unit 2020 treats each of the identified Nm partial regions as a mask target region.
  • the multiple mask target regions may be identified so as to allow mutual overlap, or so as not to overlap with each other.
  • the generation unit 2020 randomly identifies Nm mask target regions in order. Then, if a newly identified mask target region overlaps with an already identified mask target region, the generation unit 2020 performs the process of randomly identifying the new mask target region again, so that the new mask target region does not overlap with the already identified mask target region.
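  • The retry-on-overlap selection described above might look like the following sketch. It assumes rectangular regions of a fixed size whose reference position is the upper left corner; the `max_tries` guard is an addition for safety and is not part of the described procedure.

```python
import random
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # (top, left, height, width); assumed layout


def overlaps(a: Region, b: Region) -> bool:
    """True if the two axis-aligned rectangles share at least one pixel."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2]
            and a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def pick_mask_regions(img_h: int, img_w: int, n_m: int, reg_h: int, reg_w: int,
                      rng: random.Random, max_tries: int = 1000) -> List[Region]:
    """Randomly pick n_m non-overlapping regions, redrawing on overlap."""
    regions: List[Region] = []
    tries = 0
    while len(regions) < n_m:
        if tries >= max_tries:
            raise RuntimeError("could not place all mask target regions")
        tries += 1
        # Random reference position (upper left corner) that keeps the region inside the image.
        top = rng.randrange(img_h - reg_h + 1)
        left = rng.randrange(img_w - reg_w + 1)
        candidate = (top, left, reg_h, reg_w)
        if any(overlaps(candidate, r) for r in regions):
            continue  # identify the new region again
        regions.append(candidate)
    return regions
```

  • To allow mutual overlap instead, the overlap check can simply be dropped.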
  • one or more of the number of regions to be masked, the shape of the regions to be masked, and the size of the regions to be masked may be determined randomly.
  • the area to be masked may be identified using a mask image that shows the layout of the area to be masked. For example, in the mask image, the value of each pixel in the image area treated as the area to be masked is 0, and the value of each pixel in other image areas is 1.
  • FIG. 6 is a diagram illustrating an example of mask processing using a mask image.
  • In this example, the area to be masked is drawn as a dot pattern.
  • The area to be masked is arranged in a checkered pattern.
  • the sizes of the mask image and the captured image 10 may be the same or different. In the latter case, the generation unit 2020 enlarges or reduces the mask image so that the size matches the size of the captured image 10.
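  • A minimal sketch of applying such a 0/1 mask image is shown below. The text only says the mask image is enlarged or reduced; nearest-neighbor resizing is used here as one assumed choice, and the function names are hypothetical.

```python
from typing import List

Image = List[List[int]]


def resize_nearest(mask: Image, out_h: int, out_w: int) -> Image:
    """Enlarge or reduce a mask image by nearest-neighbor sampling (assumed method)."""
    in_h, in_w = len(mask), len(mask[0])
    return [[mask[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def apply_mask_image(image: Image, mask: Image) -> Image:
    """Multiply each pixel by the mask value: 0 masks the pixel, 1 keeps it."""
    h, w = len(image), len(image[0])
    if (len(mask), len(mask[0])) != (h, w):
        mask = resize_nearest(mask, h, w)
    return [[image[y][x] * mask[y][x] for x in range(w)] for y in range(h)]
```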
  • the mask target area may be identified based on a predetermined rule (hereinafter, mask rule).
  • As the mask rule, a rule such as "divide the captured image 10 vertically into Mv areas and horizontally into Mh areas, and in odd rows, the partial areas of odd columns are mask target areas, and in even rows, the partial areas of even columns are mask target areas" may be used.
  • the arrangement of the mask target areas is represented by a checkerboard pattern.
  • the number of divisions in the vertical and horizontal directions may be predetermined or may be determined randomly. Alternatively, for example, the number of divisions in the vertical and horizontal directions may be determined based on the size of a predetermined partial area. Specifically, if the horizontal size of the captured image 10 is Wc and the horizontal size of the partial area is Wp, then the number of divisions in the horizontal direction will be the smallest integer greater than or equal to Wc/Wp. The number of divisions in the vertical direction can also be calculated in a similar manner.
  • Instead of the above-mentioned rule of "treating the partial areas of odd rows and odd columns and the partial areas of even rows and even columns as areas to be masked," a rule of "treating the partial areas of odd rows and even columns and the partial areas of even rows and odd columns as areas to be masked" may be used.
  • the rule for determining the areas to be masked is not limited to a rule representing a check pattern, and can be any rule.
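  • The check-pattern rule and the division count can be sketched as follows. The function names are hypothetical; rows and columns are counted from 1 in the text, so a 0-based row index and column index with the same parity correspond to "odd row and odd column" or "even row and even column".

```python
from typing import List


def division_count(image_size: int, part_size: int) -> int:
    """Smallest integer greater than or equal to image_size / part_size."""
    return -(-image_size // part_size)  # integer ceiling division


def checkerboard_mask(img_h: int, img_w: int, part_h: int, part_w: int,
                      invert: bool = False) -> List[List[int]]:
    """Build a 0/1 mask image (0 = masked) following the check-pattern rule."""
    mask = []
    for y in range(img_h):
        row = []
        for x in range(img_w):
            r, c = y // part_h, x // part_w  # 0-based partial-area indices
            same_parity = (r % 2) == (c % 2)
            masked = same_parity != invert   # invert=True gives the alternative rule
            row.append(0 if masked else 1)
        mask.append(row)
    return mask
```

  • For example, with Wc = 10 and Wp = 4, `division_count(10, 4)` gives 3 horizontal divisions, the last of which is narrower than the others.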
  • the generating unit 2020 may or may not make the mask target area common to all captured images 10. In the former case, for example, the generating unit 2020 performs mask processing using the same mask image or mask rule on all captured images 10.
  • In the latter case, for example, the generation unit 2020 randomly identifies the mask target area for each captured image 10. As another example, the generation unit 2020 alternately applies two mask images or mask rules to the captured images 10 in a time series.
  • FIG. 7 is a diagram illustrating a case where two mask images are applied alternately.
  • The reference sign for the i-th captured image 10 in chronological order is written as "10-i."
  • mask processing is performed using mask image 70-1 for captured images 10 that are even-numbered in chronological order.
  • mask processing is performed using mask image 70-2 for captured images 10 that are odd-numbered in chronological order.
  • the two mask images 70-1 and 70-2 satisfy the relationship that "partial regions treated as regions to be masked in mask image 70-1 are not treated as regions to be masked in mask image 70-2, and partial regions not treated as regions to be masked in mask image 70-1 are treated as regions to be masked in mask image 70-2.”
  • mask image 70-2 is obtained by performing a process on mask image 70-1 that inverts 0s and 1s.
  • the number of mask images and mask rules is not limited to two, and may be three or more.
  • For example, a mask image M1 to be applied to the (3k-2)th captured image 10 in chronological order, a mask image M2 to be applied to the (3k-1)th captured image 10 in chronological order, and a mask image M3 to be applied to the 3kth captured image 10 in chronological order are prepared in advance.
  • k is a natural number.
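  • Cycling through a fixed set of mask images can be sketched as below. The helper names are hypothetical; `i` is the 1-based chronological index of the captured image, and `invert_mask` shows how a complementary mask image can be derived by swapping 0s and 1s.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def invert_mask(mask: List[List[int]]) -> List[List[int]]:
    """Swap 0s and 1s so that masked and unmasked areas trade places."""
    return [[1 - value for value in row] for row in mask]


def mask_for_frame(i: int, masks: Sequence[T]) -> T:
    """Select the mask for the i-th captured image (1-based) by cycling through `masks`."""
    return masks[(i - 1) % len(masks)]
```

  • With `masks = [M1, M2, M3]`, the (3k-2)th, (3k-1)th, and 3kth captured images receive M1, M2, and M3 respectively. For two alternating mask images, passing `[mask_70_2, mask_70_1]` (hypothetical variable names) assigns mask image 70-2 to odd-numbered and mask image 70-1 to even-numbered captured images.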
  • the generating unit 2020 may detect from the captured image 10 an object that is normally included in the captured image 10 (in other words, captured by the camera 20), and may exclude an image area representing the object from the mask target area.
  • an object that is normally included in the captured image 10 is also referred to as a "normal object.”
  • an image area representing a normal object is also referred to as a "normal area.”
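  • For example, when the observation range is a road, a normal object would be a vehicle or the like.
  • When the observation range is a facility such as an airport, a normal object would be a person or the like.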
  • existing technology can be used to detect a specific type of object from an image. Note that the type of object to be detected as a normal object is assumed to be predetermined.
  • FIG. 8 is a diagram illustrating a case in which a normal region is excluded from a region to be masked.
  • the normal object is a vehicle. Therefore, a normal region 82 representing a vehicle is detected from the captured image 10.
  • the generating unit 2020 generates a new mask image 90 by superimposing a mask image 70 prepared in advance and an image 80 representing the arrangement of normal regions 82.
  • the region represented by the mask image 90 as a region to be masked is a region that is a region to be masked in the mask image 70 and is not included in the normal region 82.
  • the generating unit 2020 uses the mask image 90 to perform mask processing on the captured image 10.
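  • One way to picture this superimposition is the sketch below. It assumes that the mask image uses 0 for masked pixels and 1 otherwise, and that the image representing the normal regions uses 1 inside a normal region and 0 elsewhere; the latter encoding and the function name are assumptions for illustration.

```python
from typing import List

Image = List[List[int]]


def exclude_normal_regions(mask: Image, normal: Image) -> Image:
    """Return a new mask in which pixels inside normal regions are never masked."""
    return [
        [1 if n == 1 else m for m, n in zip(mask_row, normal_row)]
        for mask_row, normal_row in zip(mask, normal)
    ]
```

  • A pixel stays masked in the result only if it is masked in the prepared mask image and lies outside every normal region.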
  • the method of excluding normal regions from the region to be masked is not limited to the method of using a mask image.
  • the generating unit 2020 may identify the region to be masked by randomly identifying a partial region from the image region excluding the normal region.
  • Excluding the normal region from the region to be masked has the effect of reducing erroneous determinations by the determination unit 2060.
  • If a restoration process is performed on the intermediate image 50 in which the normal region has been masked, there is a possibility that the normal region cannot be accurately restored. If the normal region cannot be accurately restored, there is a possibility that the captured image 10 will be erroneously determined to represent an abnormal situation due to the difference in the normal region between the captured image 10 and the processed image 30.
  • By excluding the normal region from the region to be masked, it is possible to prevent such erroneous determinations resulting from the inability to correctly restore the masked normal region.
  • the generating unit 2020 generates the processed image 30 by performing a restoration process on the intermediate image 50.
  • the restoration process is performed using a machine learning model such as a neural network.
  • the machine learning model used in the restoration process is also referred to as a restoration model.
  • the restoration model is trained in advance so that, in response to an input of an image including a masked image region, the restoration model outputs an image in which the image region is restored.
  • the generating unit 2020 inputs the intermediate image 50 to the restoration model, and uses the image output from the restoration model as the processed image 30.
  • the restoration model is trained using multiple training data.
  • the training data includes a ground-truth image before masking and a training input image obtained by masking one or more subregions in the ground-truth image.
  • a device that trains the restoration model (hereinafter, a training device) inputs the training input image to the restoration model, for example, and calculates a loss based on the image output from the restoration model and the ground-truth image.
  • the training device then trains the restoration model by updating the trainable parameters of the restoration model based on the loss.
  • The training device does not need to include the normal regions (regions representing normal objects such as vehicles) mentioned above in the loss calculation. Specifically, when calculating the loss using the values of each pixel in the ground-truth image and the image output from the restoration model, the training device excludes the values of each pixel included in the normal region from the calculation.
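  • The exclusion of normal regions from the loss can be sketched as follows. The text does not specify the loss function, so mean squared error is used here purely as an assumption; the normal-region map (1 inside a normal region, 0 elsewhere) and the function name are likewise hypothetical.

```python
from typing import List


def masked_mse(output: List[List[float]], truth: List[List[float]],
               normal: List[List[int]]) -> float:
    """Mean squared error over pixels outside normal regions (assumed loss)."""
    total = 0.0
    count = 0
    for out_row, truth_row, normal_row in zip(output, truth, normal):
        for o, t, n in zip(out_row, truth_row, normal_row):
            if n == 1:
                continue  # pixels in a normal region do not contribute to the loss
            total += (o - t) ** 2
            count += 1
    return total / count if count else 0.0
```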
  • the training input images used to train the restoration model may be images generated by camera 20 or may be images generated by another camera.
  • For example, SimMIM, which is disclosed in Non-Patent Document 1, can be used as the restoration model.
  • the restoration model is not limited to SimMIM, and various machine learning models can be used.
  • the calculation unit 2040 calculates, for each of the multiple captured images 10, difference data 40 that represents the difference between that captured image 10 and the processed image 30 generated from that captured image 10 (S106). There are various methods for calculating the difference data 40 that represents the difference between the two images. For example, the calculation unit 2040 calculates the difference value of pixel values between corresponding pixels between the captured image 10 and the processed image 30. Then, the calculation unit 2040 calculates the sum of the difference values calculated for each pixel as the difference data 40.
  • If the captured image 10 and the processed image 30 are single-channel images (e.g., grayscale images), the pixel value of each pixel is a scalar value. Therefore, the difference value between corresponding pixels can be obtained by calculating the difference between the two scalar values.
  • If the captured image 10 and the processed image 30 are multi-channel images (e.g., RGB images), the value of each pixel is a vector composed of values for each channel. Therefore, the difference value between corresponding pixels can be obtained by calculating the norm of the difference between the two vectors. For example, suppose that the pixel value at coordinates (x,y) in the captured image 10 is (r1,g1,b1), and the pixel value at coordinates (x,y) in the processed image 30 is (r2,g2,b2). In this case, the difference value calculated for the pixel at coordinates (x,y) is the norm of the vector (r1-r2, g1-g2, b1-b2).
  • The difference data 40 between the captured image 10 and the processed image 30 is not limited to the sum of the per-pixel difference values. For example, the difference data 40 may be the average of the per-pixel difference values.
  • The calculation unit 2040 may exclude the normal area described above from the calculation of the difference data 40. In this way, the difference between the captured image 10 and the processed image 30 in the normal area is not taken into consideration in the determination by the determination unit 2060. Therefore, even if the masked normal area cannot be accurately restored, it is possible to accurately determine whether or not the captured image 10 represents an abnormal situation.
  • When the normal region is excluded from the region to be masked, the difference between the captured image 10 and the processed image 30 in the normal region is likewise prevented from being taken into consideration in the determination by the determination unit 2060. Therefore, in that case, it is not necessary to exclude the normal region from the calculation target of the difference data 40.
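  • The calculation of the difference data 40 can be sketched as below. Two points are assumptions rather than statements of the text: the scalar difference is taken as an absolute value (so that positive and negative differences do not cancel in the sum), and the Euclidean norm is used for multi-channel pixels. The optional `normal` map (1 inside a normal region) and the function names are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Union

Pixel = Union[float, Sequence[float]]


def pixel_difference(p: Pixel, q: Pixel) -> float:
    """Absolute difference for scalars, Euclidean norm of the difference for vectors."""
    if isinstance(p, (int, float)):
        return abs(p - q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def difference_data(captured: List[List[Pixel]], processed: List[List[Pixel]],
                    normal: Optional[List[List[int]]] = None,
                    average: bool = False) -> float:
    """Sum (or average) of per-pixel differences, optionally skipping normal regions."""
    total = 0.0
    count = 0
    for y, (c_row, p_row) in enumerate(zip(captured, processed)):
        for x, (c, p) in enumerate(zip(c_row, p_row)):
            if normal is not None and normal[y][x] == 1:
                continue  # excluded from the calculation target
            total += pixel_difference(c, p)
            count += 1
    if average:
        return total / count if count else 0.0
    return total
```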
  • The determination unit 2060 uses the difference data 40 calculated for each of the multiple captured images 10 to determine whether the captured images 10 represent an abnormal situation (S108). To this end, for example, the determination unit 2060 calculates, for each of multiple time points, an index value that represents the degree of change in the situation before and after that time point. Hereinafter, this index value is also referred to as the "degree of change in situation." Furthermore, when the degree of change in situation is calculated for a certain time point, that time point is also referred to as the reference time point for that degree of change in situation.
  • the determination unit 2060 determines whether the degree of change in the situation calculated for each reference time point is equal to or greater than a threshold value. If the degree of change in the situation calculated for a certain reference time point is equal to or greater than the threshold value, the determination unit 2060 determines that the captured image 10 at that reference time point represents an abnormal situation.
  • the degree of change in the situation at a certain reference time point is calculated, for example, based on difference data 40 calculated for a first period of a predetermined length that includes a time point before the reference time point, and difference data 40 calculated for a second period of a predetermined length that includes a time point after the reference time point.
  • the reference time point may be included in either the first period or the second period, or in both.
  • the degree of change in the situation is calculated by the following formula (1).
  • D(r) represents the degree of change in the situation at the reference time r.
  • the time is represented by a discrete value (e.g., a frame number) assigned to each captured image 10 in ascending order of the generation time.
  • V1(r) represents the average value of the difference data 40 calculated for each of the (L1+1) captured images 10 included in the period of length L1 ending at time r.
  • V2(r) represents the average value of the difference data 40 calculated for each of the (L2+1) captured images 10 included in the period of length L2 starting at time r.
  • Si represents the difference data 40 calculated for the captured image 10 at time i.
  • L1 and L2 may be the same as each other, or may be different from each other.
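  • formula (1) itself is not reproduced in this text. One form consistent with the definitions of D(r), V1(r), V2(r), and Si given above, and with the statement elsewhere in this document that the degree of change is a ratio of a statistic for the period after the reference time point to a statistic for the period before it, would be the following. The ratio form and the summation bounds are a reconstruction under those assumptions, not the published formula.

```latex
D(r) = \frac{V_2(r)}{V_1(r)}, \qquad
V_1(r) = \frac{1}{L_1 + 1} \sum_{i = r - L_1}^{r} S_i, \qquad
V_2(r) = \frac{1}{L_2 + 1} \sum_{i = r}^{r + L_2} S_i
```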
  • V1(r) represents the magnitude of the difference between the captured image 10 and the processed image 30 in the near past based on time point r.
  • V2(r) represents the magnitude of the difference between the captured image 10 and the processed image 30 in the near future based on time point r. Therefore, when V2(r) is sufficiently larger than V1(r), it indicates that the difference between the captured image 10 and the processed image 30 has become larger at or around time point r, and that this large difference continues. In other words, it indicates that the situation represented by the captured image 10 (i.e., the situation of the place imaged by the camera 20) is changing at or around time point r.
  • the difference between the captured image 10 and the processed image 30 is small in the period before time point r, but the difference between the captured image 10 and the processed image 30 is large in the period after time point r.
  • abnormal situations can be detected by comparing the degree of change in the situation with a threshold value.
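  • the threshold comparison can be sketched in Python as follows, assuming the degree of change is the ratio V2(r) / V1(r) (matching the ratio described in the appendices of this document). The function names, the `eps` guard against division by zero, and the restriction of reference time points to those whose windows fit inside the available frames are assumptions made for illustration.

```python
def situation_change_degree(scores, r, l1, l2, eps=1e-9):
    """Return V2(r) / V1(r) for the per-frame difference data in `scores`."""
    if r - l1 < 0 or r + l2 >= len(scores):
        raise ValueError("both windows must lie inside the available frames")
    v1 = sum(scores[r - l1 : r + 1]) / (l1 + 1)  # near past, ending at r
    v2 = sum(scores[r : r + l2 + 1]) / (l2 + 1)  # near future, starting at r
    return v2 / max(v1, eps)  # eps (an assumption) avoids division by zero


def abnormal_time_points(scores, l1, l2, threshold):
    """Reference time points whose degree of change is at or above threshold."""
    return [
        r
        for r in range(l1, len(scores) - l2)
        if situation_change_degree(scores, r, l1, l2) >= threshold
    ]


# A sustained rise in the difference data versus a one-frame blip.
sustained = [1, 1, 1, 1, 5, 5, 5, 5]
blip = [1, 1, 1, 1, 5, 1, 1, 1, 1]
print(abnormal_time_points(sustained, 2, 2, 2.5))
print(abnormal_time_points(blip, 2, 4, 2.5))
```

  • in this sketch, a longer future window averages a one-frame blip away, so only the sustained rise crosses the threshold.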
  • the longer L2 (the length of the near future from the reference time point r), the further into the future the situation is taken into consideration. Therefore, if the observation range should not be judged to be in an abnormal situation when the situation changes only for a short period of time, it is preferable to make L2 sufficiently large.
  • for example, suppose that the abnormal situation in the observation range is one in which "there is an abandoned object."
  • in this case, it is preferable to set the length L2 so that an object temporarily placed on the ground can be distinguished from an abandoned object.
  • FIG. 9 is a diagram illustrating an example of the functional configuration of the anomaly detection device 2000 having the output unit 2080.
  • the output unit 2080 causes an arbitrary display device to display a screen on which the captured images 10 generated by the camera 20 are displayed in chronological order.
  • this screen will be referred to as the observation screen.
  • the observation screen displays the video data generated by the camera 20.
  • various information is displayed on the observation screen. For example, if the determination unit 2060 determines that the captured image 10 represents an abnormal situation (in other words, the observation range is in an abnormal situation), some kind of display is added to the observation screen so that the user of the anomaly detection device 2000 can perceive the result of the determination.
  • the output unit 2080 includes a predetermined message or mark indicating that the observation range is in an abnormal situation on the observation screen.
  • the user of the anomaly detection device 2000 will also be referred to simply as the user.
  • FIG. 10 is a diagram illustrating an example of an observation screen including a display indicating that the observation range is in an abnormal condition.
  • the observation screen 100 includes an image display area 110.
  • captured images 10 generated by the camera 20 are displayed in chronological order.
  • the observation range is a road.
  • the abnormal situation is a situation where a foreign object (e.g., a fallen object) is present on the road.
  • the determination unit 2060 determines that "the road being observed is in a situation where a foreign object is present." In response to this determination, the output unit 2080 displays the message 120 "Foreign object present" on the observation screen 100.
  • the output unit 2080 may include on the observation screen a display showing the position of an object (hereinafter, abnormal object) that contributes to the observation range being in an abnormal state, such as the foreign object in the example of FIG. 10.
  • a mark 130 showing the position of the detected foreign object is displayed on the captured image 10.
  • an image region that represents an abnormal object is an image region in which there is a large difference between the captured image 10 and the processed image 30.
  • the difference value between corresponding pixels is calculated between the captured image 10 and the processed image 30 generated from that captured image 10.
  • the output unit 2080 then identifies difference values whose magnitude is equal to or greater than the threshold value from among the difference values between pixels calculated for the captured image 10 in which the degree of change in situation is equal to or greater than the threshold value and the processed image 30 generated from the captured image 10.
  • the output unit 2080 then identifies an image region made up of pixels whose difference values are equal to or greater than the threshold value as an image region representing an abnormal object (hereinafter, an abnormal region).
  • a lower limit for the size of the abnormal region may be set to prevent noise from being recognized as an abnormal region.
  • the output unit 2080 identifies, as an abnormal region, an image region that is composed of pixels whose difference value between the captured image 10 and the processed image 30 is equal to or greater than a threshold value, and whose size is equal to or greater than a predetermined lower limit.
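  • the identification of abnormal regions described above can be sketched in Python as follows: pixels whose difference magnitude is at or above a threshold are grouped into connected regions, and regions smaller than a lower limit are discarded as noise. The function name, the 4-connected grouping rule, the measurement of region size as a pixel count, and the list-of-rows image representation are assumptions; this description does not specify how pixels are grouped into regions.

```python
from collections import deque


def abnormal_regions(captured, processed, pixel_threshold, min_size):
    """Return one list of (y, x) pixels per sufficiently large region."""
    height, width = len(captured), len(captured[0])
    # Pixels whose difference magnitude is at or above the threshold.
    hot = [
        [abs(captured[y][x] - processed[y][x]) >= pixel_threshold
         for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if not hot[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first search over 4-connected neighbours (an assumption).
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and hot[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_size:  # lower limit suppresses noise
                regions.append(region)
    return regions
```

  • the bounding box of each returned region could then serve as the position at which a mark such as the mark 130 is drawn.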
  • the output unit 2080 continues to display the message 120 or the mark 130 (hereinafter, message 120, etc.) on the observation screen 100 for a certain period of time. For example, the output unit 2080 displays the message 120, etc. on the observation screen 100 for a predetermined period of time in response to determining that the observation range is in an abnormal situation.
  • the output unit 2080 may determine whether to continue displaying the message 120 etc. on the observation screen 100 based on the degree of change in the situation. Specifically, when the output unit 2080 determines that the degree of change in the situation at a certain reference point in time is equal to or greater than a threshold value, it identifies an abnormal area and records the correspondence between the abnormal area and the degree of change in the situation. Furthermore, the output unit 2080 attenuates the degree of change in the situation corresponding to the abnormal area over time. The degree of change in the situation attenuated over time represents the degree to which it is necessary to continue to notify the message 120 etc. Thus, the degree of change in the situation attenuated over time is also expressed as the degree of notification necessity.
  • the output unit 2080 determines whether or not to include a message 120 or the like corresponding to an abnormal region on the observation screen 100, depending on the notification necessity level corresponding to the abnormal region. For example, the output unit 2080 displays the message 120 or the like corresponding to the abnormal region on the observation screen 100 while the notification necessity level corresponding to the abnormal region is equal to or greater than a threshold value.
  • the output unit 2080 may compare the degree of notification necessity corresponding to the abnormal region with the newly calculated degree of change in the situation. If the degree of notification necessity corresponding to the abnormal region is greater than the newly calculated degree of change in the situation, the output unit 2080 displays the message 120, etc. corresponding to the abnormal region on the observation screen 100. On the other hand, if the degree of notification necessity corresponding to the abnormal region is equal to or less than the newly calculated degree of change in the situation, the output unit 2080 ends the display of the message 120, etc. corresponding to the abnormal region.
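  • the two display rules above can be sketched in Python as follows. The exponential attenuation with a per-step `decay` factor is an assumption (this description only states that the degree of change in the situation is attenuated over time), as are the function names and default values.

```python
def notification_necessity(initial_degree, elapsed, decay=0.9):
    """Degree of change in the situation attenuated over `elapsed` time steps."""
    return initial_degree * (decay ** elapsed)


def keep_displaying(initial_degree, elapsed, threshold, decay=0.9):
    """First rule: display while the attenuated value is at or above threshold."""
    return notification_necessity(initial_degree, elapsed, decay) >= threshold


def keep_displaying_vs_new(initial_degree, elapsed, new_degree, decay=0.9):
    """Second rule: display only while the attenuated value exceeds the
    newly calculated degree of change in the situation."""
    return notification_necessity(initial_degree, elapsed, decay) > new_degree
```

  • under either rule, the display for an abnormal region ends on its own once enough time has passed without the region being detected again with a high degree of change.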
  • the program includes instructions (or software code) that, when loaded into a computer, cause the computer to perform one or more functions described in the embodiments.
  • the program may be stored on a non-transitory computer-readable medium or a tangible storage medium.
  • computer-readable medium or tangible storage medium may include random-access memory (RAM), read-only memory (ROM), flash memory, solid-state drive (SSD) or other memory technology, CD-ROM, digital versatile disc (DVD), Blu-ray (registered trademark) disc or other optical disk storage, magnetic cassette, magnetic tape, magnetic disk storage or other magnetic storage device.
  • the program may be transmitted on a transitory computer-readable medium or a communication medium.
  • the transitory computer-readable medium or communication medium may include electrical, optical, acoustic, or other forms of propagated signals.
  • the processing performed on the captured image includes a masking process that masks one or more partial regions contained in the captured image, and a restoration process that restores the masked partial regions using data other than those partial regions.
  • the calculation means detects, from the captured image, an image area representing a predetermined type of object whose presence in the captured image is not abnormal, and
  • calculates the difference between the captured image and the processed image generated from the captured image for the image area other than the detected image area.
  • (Appendix 3)
  • the anomaly detection device detects an image area representing a predetermined type of object from the captured image, and excludes the detected image area from being masked.
  • a restoration model is trained to output an image in which the masked image region is restored in response to an input of an image including the masked image region, and the calculation means performs the restoration process by inputting, to the restoration model, the captured image in which the partial region is masked by the mask process.
  • the anomaly detection device according to claim 1, wherein, in training the restoration model, an image region representing a predetermined type of object that is not abnormal to be included in the captured image is not included in the calculation of loss.
  • the determination means calculates, for each of a plurality of reference time points, a situation change degree that represents a ratio of a statistical value of the difference data calculated for each of a plurality of the captured images generated in a period after the reference time point to a statistical value of the difference data calculated for each of the plurality of the captured images generated in a period before the reference time point;
  • the anomaly detection device according to any one of appendices 1 to 3, wherein if the situation change degree calculated for the reference time point is equal to or greater than a threshold value, it is determined that the captured image generated at the reference time point represents an abnormal situation.
  • (Appendix 6) an output means for outputting a screen including the captured image;
  • the processing performed on the captured image includes a masking process that masks one or more partial regions contained in the captured image, and a restoration process that restores the masked partial regions using data other than those partial regions.
  • the computer has a restoration model trained to output an image in which the masked image region is restored in response to an input of an image including the masked image region,
  • the restoration process is performed by inputting the captured image in which the partial region is masked by the mask process to the restoration model.
  • the anomaly detection method according to any one of appendices 7 to 9, wherein, in training the restoration model, image regions representing a predetermined type of object that is not abnormal to be included in the captured image are not included in the calculation of loss.
  • (Appendix 12) an output step of outputting a screen including the captured image;
  • the processing performed on the captured image includes a masking process that masks one or more partial areas contained in the captured image, and a restoration process that restores the masked partial areas using data other than those partial areas.
  • a restoration model is trained to output an image in which the masked image region is restored in response to an input of an image including the masked image region,
  • the restoration process is performed by inputting the captured image in which the partial region is masked by the mask process to the restoration model.
  • the program according to any one of appendices 13 to 15, wherein, in training the restoration model, image regions representing a predetermined type of object that is not abnormal to be included in the captured image are not included in the calculation of loss.
  • (Appendix 18) causing a computer to execute an output step of outputting a screen including the captured image;

PCT/JP2023/046725 2023-01-25 2023-12-26 Anomaly detection device, anomaly detection method, and program Ceased WO2024157719A1 (ja)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2024572919A JPWO2024157719A1 2023-01-25 2023-12-26

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2023009536 2023-01-25
JP2023-009536 2023-01-25

Publications (1)

Publication Number Publication Date
WO2024157719A1 (ja) 2024-08-02

Family

ID=91970499

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/046725 Ceased WO2024157719A1 (ja) 2023-01-25 2023-12-26 Anomaly detection device, anomaly detection method, and program

Country Status (2)

Country Link
JP (1) JPWO2024157719A1
WO (1) WO2024157719A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN121033036A (zh) * 2025-10-29 2025-11-28 重庆长安汽车股份有限公司 Display screen anomaly detection method, apparatus, electronic device, medium, and product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
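JP2009273005A (ja) * 2008-05-09 2009-11-19 Omron Corp Method for storing and method for restoring images for appearance inspection of component-mounted boards, and image storage processing device
JP2012207948A (ja) * 2011-03-29 2012-10-25 Hitachi Ltd Equipment abnormality temporal change determination device, equipment abnormality change determination method, and program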
WO2022119870A1 (en) * 2020-12-02 2022-06-09 Amgen Inc. Image augmentation techniques for automated visual inspection

Also Published As

Publication number Publication date
JPWO2024157719A1 2024-08-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23918673

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2024572919

Country of ref document: JP

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 2024572919

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 23918673

Country of ref document: EP

Kind code of ref document: A1