WO2020149242A1

WO2020149242A1 - Work assistance device, work assistance method, program, and object detection model

Info

Publication number: WO2020149242A1
Application number: PCT/JP2020/000731
Authority: WO
Inventors: 清水　秀樹; 亮介田嶋
Original assignee: Ａｒｉｔｈｍｅｒ株式会社
Priority date: 2019-01-17
Filing date: 2020-01-10
Publication date: 2020-07-23

Abstract

The present invention can efficiently set whether an image is one in which an object appears. A work assistance device 20 is provided with an extraction unit 24A and an output unit 23. The extraction unit 24A uses an object detection model 21M, which is constructed using a teaching image Gt in which an object O appears, to extract as a candidate image Gk from an arbitrary moving image GD an image containing a region for which the probability that the object appears is equal to or greater than a first threshold value. The output unit 23 outputs an image in which a bounding box B corresponding to the region for which the probability that the object appears is equal to or greater than the first threshold value has been combined with the relevant candidate image.

Description

Work support device, work support method, program, and object detection model.

The present disclosure relates to a work support device, a work support method, a program, and an object detection model.

Methods of generating teacher images for machine learning have been studied in the past. For example, Patent Document 1 (Japanese Patent Laid-Open No. 2018-169672) discloses a technique that identifies an insufficient pattern, that is, a pattern for which only a small number of teacher images exist, and generates new teacher images belonging to that pattern by spatially inverting an existing teacher image or changing its color tone.

As described in Patent Document 1, it is desirable to collect a large number of teacher images for effective learning.

The work support device according to the first aspect of the present disclosure is a work support device that supports the work of setting whether or not an object appears in a region of an image. The work support device includes an extraction unit and a generation unit. Using an object detection model constructed from teacher images in which the object appears, the extraction unit extracts, from an arbitrary moving image, an image that includes a region in which the probability that the object appears is equal to or greater than a first threshold value, as a candidate image. The generation unit generates coordinate information of the region in the candidate image in which the probability that the object appears is equal to or greater than the first threshold value. With this configuration, whether or not an image is a target image in which the object appears can be set efficiently by designating or canceling the coordinate region in the candidate image. As a result, a large number of teacher images can be collected efficiently.

The work support device according to the second aspect of the present disclosure includes an extraction unit and an output unit. Using an object detection model constructed from teacher images in which the object appears, the extraction unit extracts, from an arbitrary moving image, an image that includes a region in which the probability that the object appears is equal to or greater than a first threshold value, as a candidate image. The output unit outputs an image in which a bounding box, corresponding to the region in which the probability that the object appears is equal to or greater than the first threshold value, is combined with the candidate image. With this configuration, whether or not an image is a target image in which the object appears can be set efficiently by keeping or deleting the displayed bounding box. As a result, a large number of teacher images can be collected efficiently.

FIG. 1 is a schematic diagram showing the configuration of the work support device 20.
FIG. 2 is a diagram showing an example of the candidate image Gk.
FIG. 3 is a partially enlarged view of FIG. 2.
FIG. 4 is a schematic diagram for explaining the concept of information processing by the processing unit 24.
FIG. 5 is a flowchart for explaining the operation of the work support device 20.
FIG. 6 is a flowchart for explaining the operation of the work support device 20 according to Modification A.
FIG. 7 is a schematic diagram for explaining end-to-end learning according to Modification B.
FIG. 8 is a schematic diagram showing an example of the display screen of the work support device 20 according to Modifications C and D.

(1) Configuration of Work Support Device
Hereinafter, the configuration of the work support device according to an embodiment of the present disclosure will be described with reference to the drawings.
FIG. 1 is a schematic diagram showing the configuration of the work support device 20 according to the present embodiment. FIG. 2 is a diagram showing an example of the candidate image Gk output by the work support device 20. FIG. 3 is a partially enlarged view of the broken-line portion of FIG. 2. FIG. 4 is a schematic diagram for explaining the concept of information processing by the processing unit 24 described later.

The work support device 20 is a device that supports the work of generating the teacher image Gt used for the object detection model 21M. In the following description, Gt is used to collectively describe a plurality of teacher images, and Gt1 is used with a subscript when the individual teacher images are described separately.

The “object detection model 21M” is constructed by a neural network whose weights are adjusted based on the teacher images Gt in which the object O appears, and it performs both region extraction of the object O in an image and object recognition of the object O. Specifically, when an image is input, the object detection model 21M calculates the probability that the object O appears in the image and, if the calculated probability is equal to or greater than a predetermined value, outputs the region in which the object O appears.
The object detection model 21M can detect a plurality of preset objects.

As shown in FIGS. 2 and 3, the area of the object O is defined by coordinate information b1 to b4 corresponding to the four vertices of the bounding box B combined in the image Gk. Therefore, in the teacher image Gt of the object detection model 21M, the object O is imaged in the area corresponding to the coordinate information b1 to b4 of the bounding box B.

In the examples in FIGS. 2 and 3, a “traffic light” is shown as the object O, but the object O is not limited to this. Any object can be adopted as the object O. The object O can be defined not only by the type of object but also by its state or the like. For example, the object O may be defined not simply as a traffic light but separately as a traffic light showing a red light and a traffic light showing a green light.

The work support device 20 can be realized by any computer, and includes a storage unit 21, an input unit 22, an output unit 23, and a processing unit 24. The work support device 20 may be realized as hardware using an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or the like.

The storage unit 21 stores various kinds of information, and is realized by an arbitrary storage device such as a memory and a hard disk. Here, the storage unit 21 stores information such as the weight of the neural network that constructs the object detection model 21M. The storage unit 21 stores a plurality of teacher images Gt, and stores a plurality of teacher images Gt1 to Gtp (p is a natural number other than 1) in the initial state. The storage unit 21 also stores a moving image GD for generating a new teacher image Gtq (q is a value other than 1 to p). The moving image GD is captured by an arbitrary image capturing device. The moving image GD is composed of a plurality of still images Gdi (i=1 to j, j is a natural number).

The input unit 22 is realized by an arbitrary input device such as a keyboard, a mouse, a touch panel, etc., and inputs various information to a computer.

The output unit 23 is realized by any output device such as a display, a touch panel, a speaker, etc., and outputs various information from the computer.

The processing unit 24 executes various types of information processing and is realized by a processor such as a CPU or GPU together with a memory. By reading one or more programs stored in the storage unit 21 into the CPU, GPU, or the like of the computer, the processing unit 24 functions as the extraction unit 24A, the generation unit 24B, the combining unit 24C, the setting unit 24D, and the updating unit 24E. Hereinafter, each function of the processing unit 24 will be described with reference to FIG. 4.

Using the object detection model 21M, the extraction unit 24A extracts, from the images Gd1 to Gdj of the frames of an arbitrary moving image GD, each image that includes a region in which the probability that the object O appears is equal to or greater than the first threshold P1 and equal to or less than the second threshold P2, as a candidate image Gk (denoted as Gk1 to Gk3 in FIG. 4). Here, the first threshold P1 is set to about 10%, and the second threshold P2 is set to about 60%.
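The threshold band used by the extraction unit 24A can be illustrated with a minimal sketch. It assumes that the object detection model has already produced, for each frame, a list of detections with a label, a probability, and a box; the names `Detection`, `regions_in_band`, and `extract_candidates` are hypothetical and do not appear in the source, and the values 0.10 and 0.60 stand in for "about 10%" and "about 60%".

```python
from dataclasses import dataclass
from typing import List, Tuple

P1 = 0.10  # first threshold (about 10% in the description)
P2 = 0.60  # second threshold (about 60% in the description)


@dataclass
class Detection:
    """One detected region (hypothetical structure, not from the source)."""
    label: str
    prob: float                      # probability that the object appears, in [0, 1]
    box: Tuple[int, int, int, int]   # (x, y, width, height)


def regions_in_band(detections: List[Detection],
                    p1: float = P1, p2: float = P2) -> List[Detection]:
    """Keep only the regions whose probability satisfies p1 <= prob <= p2."""
    return [d for d in detections if p1 <= d.prob <= p2]


def extract_candidates(frames: List[List[Detection]],
                       p1: float = P1, p2: float = P2):
    """Return (frame_index, regions) for every frame that has at least one
    region inside the threshold band; such frames are the candidate images."""
    candidates = []
    for i, detections in enumerate(frames):
        hits = regions_in_band(detections, p1, p2)
        if hits:
            candidates.append((i, hits))
    return candidates


frames = [
    [Detection("traffic_light", 0.05, (0, 0, 10, 10))],    # below P1: treated as noise
    [Detection("traffic_light", 0.43, (5, 5, 20, 40))],    # inside the band: candidate
    [Detection("traffic_light", 0.90, (8, 8, 20, 40))],    # above P2: already well detected
]
candidates = extract_candidates(frames)
print([i for i, _ in candidates])
```

In this sketch only the middle frame is returned, since it is the only one with a region between the two thresholds.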

The generation unit 24B generates coordinate information of the region in the candidate image Gk in which the probability that the object O appears is equal to or greater than the first threshold P1. The generation unit 24B also generates a file describing the coordinate information b1 to b4 of the vertices of the bounding box B (denoted as B1 to B3 in FIG. 4) corresponding to the region in which the object O appears.
The coordinate information b1 to b4 can be defined by the two-dimensional coordinates of each vertex. Alternatively, assuming that the bounding box B is a square or a rectangle, the coordinate information b1 to b4 can be defined by the two-dimensional coordinates of one vertex together with the width and height measured from that vertex. In the former case, a file describing eight values, corresponding to the two-dimensional coordinates of the four vertices, is generated. In the latter case, a file describing a total of four values is generated: two values for the two-dimensional coordinates of one vertex and two values for the width and height from that vertex.
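The two representations of the bounding box B (eight values versus four values) can be converted into each other when the box is an axis-aligned rectangle. The following is a sketch under stated assumptions: the origin is at the top-left of the image, the single reference vertex is the top-left corner, and the values are written as one space-separated line. None of these conventions, nor the function names, are specified in the source.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def corners_to_xywh(corners: List[Point]) -> Tuple[int, int, int, int]:
    """Four vertices (eight values) -> one vertex plus width and height (four values).
    Assumes an axis-aligned rectangle; the reference vertex is the top-left one."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x, y = min(xs), min(ys)
    return (x, y, max(xs) - x, max(ys) - y)


def xywh_to_corners(x: int, y: int, w: int, h: int) -> List[Point]:
    """One vertex plus width and height -> four vertices b1..b4,
    listed clockwise starting from the top-left (an assumed ordering)."""
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


def to_line(values) -> str:
    """Serialize the coordinate values as one space-separated line of a file."""
    return " ".join(str(v) for v in values)


b = xywh_to_corners(120, 40, 30, 60)
eight_values = [v for point in b for v in point]
four_values = corners_to_xywh(b)
print(to_line(eight_values))
print(to_line(four_values))
```

The four-value form is more compact, while the eight-value form does not depend on which vertex is treated as the reference.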

The combining unit 24C combines the bounding box B with the candidate image Gk and displays it on the display of the output unit 23.

The setting unit 24D sets, via the input unit 22, whether or not the object O appears in the bounding box B combined with the candidate image Gk. In the example shown in FIG. 4, the object O appears in the bounding boxes B1 and B3 of the candidate images Gk1 and Gk3, but an object P other than the object O appears in the bounding box B2 of the candidate image Gk2. In such a case, the setting unit 24D sets, via the input unit 22, that "candidate images Gk1 and Gk3 are target images in which the object O appears" (U1, U3). Further, the setting unit 24D sets, via the input unit 22, that "the candidate image Gk2 is not a target image in which the object O appears" (U2).

The setting unit 24D can also generate an arbitrary bounding box B in an image by designating coordinates, and use that bounding box B to set the image as a target image in which the object O appears.

The updating unit 24E updates the object detection model 21M by adding, to the current teacher images Gt, images for which it has been set whether or not the object O appears, and readjusting the weights of the neural network.

(1-2) Operation of Work Support Device
FIG. 5 is a flowchart for explaining the operation of the work support device 20 according to the present embodiment.
First, a moving image GD of the surroundings of an arbitrary image capturing device is captured by that device. The moving image GD is then stored in the storage unit 21 of the work support device 20 as appropriate (S1).

Next, using the object detection model 21M constructed from the initial teacher image group Gt1 to Gtp, the work support device 20 determines whether each one-frame image Gdi (i=1 to j, j is a natural number) of the moving image GD stored in the storage unit 21 includes a region in which the probability that the object O appears is equal to or greater than the first threshold P1 and equal to or less than the second threshold P2 (S2 to S4).

When the probability that the object O appears is within the above range, the work support device 20 outputs the image of that frame as a “candidate image Gk” (S4-Yes, S5). Specifically, as shown in FIGS. 2 and 3, an image in which the bounding box B is combined with the region where the object O appears is displayed on the display constituting the output unit 23.

Subsequently, the worker determines whether the candidate image Gk is a target image in which the object O appears (S6). At this time, the work support device 20 accepts the setting as to whether the candidate image Gk is a target image via the input unit 22 and the setting unit 24D. For example, when images such as those shown in FIGS. 2 and 3 are displayed as the candidate image Gk, the “traffic light” that is the object O appears in the bounding box B, so the setting that the image is a target image can be accepted without any additional work by the worker (S6-Yes, S7). On the other hand, when the object O does not appear in the candidate image Gk, an object other than the object O has been erroneously recognized as a “traffic light” and extracted. In that case, the worker is made to delete the bounding box B of the erroneously recognized object, and the setting that the image is not a target image is then accepted (S6-No, S8).

After that, the above processes are sequentially executed until the image Gdj of the last frame of the moving image GD is reached (S9, S10).
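The review steps S6 to S8 can be sketched as a simple loop over candidate images. This is only an illustration under assumptions: each candidate is represented as a dictionary with a frame index and a list of boxes, and the worker's judgment is modeled as a callback `object_appears`. These names and the data layout are hypothetical and are not taken from the source.

```python
from typing import Callable, Dict, List


def review_candidates(candidates: List[Dict],
                      object_appears: Callable[[Dict], bool]) -> List[Dict]:
    """Apply the worker's decision to each candidate image.

    S6-Yes / S7: the bounding boxes are kept and the image is set as a target image.
    S6-No  / S8: the bounding boxes are deleted and the image is set as not a target image.
    """
    reviewed = []
    for cand in candidates:
        if object_appears(cand):
            reviewed.append({"frame": cand["frame"],
                             "boxes": list(cand["boxes"]),
                             "is_target": True})
        else:
            reviewed.append({"frame": cand["frame"],
                             "boxes": [],
                             "is_target": False})
    return reviewed


candidates = [
    {"frame": 3, "boxes": [(10, 20, 30, 40)]},  # a traffic light really appears
    {"frame": 7, "boxes": [(5, 5, 8, 8)]},      # something else was misrecognized
]
# Stand-in for the worker: only frame 3 is confirmed.
result = review_candidates(candidates, lambda c: c["frame"] == 3)
print(result)
```

The point of the design is that a confirmation needs no extra input beyond the decision itself, because the box coordinates have already been generated.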

(3) Features of work support device
(3-1)
As described above, in the work support device 20 according to the present embodiment, the extraction unit 24A uses the object detection model 21M to extract, from an arbitrary moving image GD, an image that includes a region in which the probability that the object O appears is equal to or greater than the first threshold P1 and equal to or less than the second threshold P2, as a candidate image Gk. The generation unit 24B generates coordinate information b1 to b4 of the region in the candidate image Gk in which the probability that the object O appears is equal to or greater than the first threshold P1 and equal to or less than the second threshold P2. The combining unit 24C generates the bounding box B corresponding to the coordinate information b1 to b4 and combines it with the candidate image Gk. The candidate image Gk combined with the bounding box B is then displayed on the display constituting the output unit 23.
The work support device 20 also includes a setting unit 24D. Through the worker's operation of the input unit 22, the setting unit 24D accepts a setting that the object O appears in the bounding box B or that the object O does not appear in the bounding box B.

Therefore, by using the work support device 20, the worker can efficiently perform the work of setting whether or not the object O appears in the image Gdi of each frame of the moving image GD. Specifically, for a candidate image Gk in which the object O appears with a certain probability, the worker only needs to confirm whether or not the object O appears in the region corresponding to the bounding box B. An image in which the object O appears can then be used as a new teacher image Gtq (see FIG. 4). In short, by using the work support device 20, a large number of teacher images Gt can be collected efficiently.

(3-2)
In particular, the extraction unit 24A extracts, as the candidate image Gk, an image including a region in which the probability that the object O appears is equal to or greater than the first threshold P1. Images in which the object O does not appear are therefore excluded. In other words, the extraction unit 24A does not extract noise images as candidate images Gk. As a result, the work of setting whether or not an image is a target image in which the object O appears becomes more efficient.

(3-3)
Further, the extraction unit 24A extracts, as a candidate image Gk, an image in which the probability that the object O is captured is equal to or less than the second threshold value P2. As a result, it is possible to efficiently collect a new teacher image that contributes to the improvement of the detection accuracy of the object detection model 21M.

Supplementally, suppose a target image that can already be detected with high probability using the current teacher image group Gt1 to Gtp is added to that group as a new teacher image and the weights of the object detection model 21M are updated. In many cases, no significant change then occurs in the feature amount of the object O extracted from the current teacher image group Gt1 to Gtp. That is, such a teacher image often does not contribute to improving the detection accuracy of the object detection model 21M. On the other hand, when a target image that cannot be detected with high probability using the current teacher image group Gt1 to Gtp is added to that group as a new teacher image Gtq and the weights of the object detection model 21M are updated, a significant change occurs in the feature amount of the object O extracted from the teacher image group, and the detection accuracy of the object detection model 21M improves.
As described above, the work support device 20 according to the present embodiment extracts, as candidate images Gk, images that are not detected with high probability under the current teacher image group Gt1 to Gtp, and can thereby efficiently collect new teacher images that contribute to improving the detection accuracy of the object detection model 21M.

(3-4)
The work support device 20 according to the present embodiment further includes an updating unit 24E. The updating unit 24E adds images for which it has been set whether or not the object O appears to the current teacher images Gt, adjusts the weights of the neural network, and updates the object detection model 21M. With this configuration, the accuracy with which the object detection model 21M detects the object O improves as the work support device 20 is used. As a result, an object detection model with high detection accuracy can be provided.

(3-5)
Further, the object detection model 21M according to the present embodiment can detect a plurality of objects. The setting unit 24D can also accept a change in the setting of the object corresponding to the bounding box B. Specifically, for a candidate image that was output because the probability that a first object appears is equal to or greater than the first threshold value, it can be set that a second object, not the first object, appears.
For example, suppose a bounding box B is displayed indicating that the probability that a traffic light showing a green light appears is equal to or greater than the first threshold value and equal to or less than the second threshold value. If a traffic light showing a red light actually appears in the candidate image Gk, the user can set, via the input unit 22 and the setting unit 24D, that the candidate image Gk contains a traffic light showing a red light.

(4) Modified Example
(4-1) Modified Example A
In the work support device 20 according to the present embodiment, the extraction unit 24A may stop extracting a candidate image Gk when the amount of change from the previously extracted image is equal to or less than a predetermined amount. Specifically, the extraction unit 24A according to Modification A stores the previously extracted image as a reference image Gc. When the amount of change from the reference image Gc is equal to or less than the predetermined amount, the extraction unit 24A stops extracting images that include a region in which the probability that the object O appears is equal to or greater than the first threshold P1 and equal to or less than the second threshold P2. In other words, when the extraction unit 24A extracts a candidate image Gk, it sets that candidate image as the reference image Gc. Thereafter, when the amount of change of a one-frame image Gdi of the moving image GD from the reference image Gc is equal to or less than the predetermined amount, the extraction unit 24A does not extract that image as a candidate image Gk.

The work support device 20 according to this modification A executes the operation shown in the flowchart of FIG. In the work support device 20 according to the modification A, steps T1 to T4, T6 to T8, and T10 to T12 execute the same processing as steps S1 to S9 described above, respectively. On the other hand, in the work support device 20 according to the modified example A, the processes of steps T5 and T9 are added. In step T5, the candidate image Gk is extracted only when the amount of change from the reference image Gc is larger than the predetermined amount. Further, in step T9, when the object image is newly set, the object image is set as a new reference image Gc.

With such a configuration, the work support device 20 according to Modification A does not collect candidate images Gk that do not contribute to improving the detection accuracy of the object detection model 21M. Supplementally, an image whose amount of change from the reference image Gc is equal to or less than the predetermined amount is similar to the reference image Gc. Therefore, even if such an image is added as a new teacher image to the current teacher image group Gt1 to Gtp and the weights of the object detection model 21M are updated, there is often no significant change in the feature amount of the object O extracted from the current teacher image group Gt1 to Gtp. That is, such a teacher image often does not contribute to improving the detection accuracy of the object detection model 21M. Therefore, by ignoring such images, the object detection model 21M can be constructed quickly while the calculation load is reduced.
In other words, the work support device 20 according to the modification A can efficiently collect the candidate images Gk that contribute to the improvement of the detection accuracy of the object detection model 21M.
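The reference-image check of Modification A can be sketched as follows. The source does not define how the "amount of change" is measured, so this sketch assumes the mean absolute difference between grayscale pixel values, with frames given as flat lists of equal length. It also follows the variant in which the reference image Gc is replaced whenever a candidate is extracted. The function names and the `in_band` flags (whether a frame has a region between P1 and P2) are hypothetical.

```python
from typing import List, Optional, Sequence


def change_amount(a: Sequence[int], b: Sequence[int]) -> float:
    """Assumed metric: mean absolute difference of grayscale pixel values.
    Both frames must have the same number of pixels."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def select_with_reference(frames: List[Sequence[int]],
                          in_band: List[bool],
                          min_change: float) -> List[int]:
    """Return indices of frames extracted as candidate images.

    A frame is extracted only if it has a region inside the probability band
    and differs from the reference image Gc by more than min_change.
    """
    selected: List[int] = []
    reference: Optional[Sequence[int]] = None
    for i, frame in enumerate(frames):
        if not in_band[i]:
            continue  # no region with P1 <= probability <= P2
        if reference is not None and change_amount(frame, reference) <= min_change:
            continue  # too similar to Gc: skip this frame
        selected.append(i)
        reference = frame  # the extracted candidate becomes the new Gc
    return selected


frames = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],          # nearly identical to frame 0
    [100, 100, 100, 100],  # clearly different scene
    [100, 100, 100, 101],  # nearly identical to frame 2
]
picked = select_with_reference(frames, [True, True, True, True], min_change=5.0)
print(picked)
```

With these inputs, frames 1 and 3 are skipped because each differs from the current reference by an average of only 0.25 per pixel.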

(4-2) Modification B
Further, in the work support device 20 according to the present embodiment, the object detection model 21M may be constructed by a neural network that performs region extraction of the object O and object recognition of the object O end-to-end. With such a configuration, the object O can be detected at high speed and in real time.

The term “end-to-end” as used here means, as shown conceptually in FIG. 7A, that a neural network with an appropriate structure directly learns the input/output relationship for the processing of region extraction of the object O and object recognition of the object O. For example, such an object detection model 21M can be realized by using an algorithm such as YOLO (You Only Look Once) or SSD (Single Shot MultiBox Detector).

However, the object detection model 21M is not limited to this and, as shown conceptually in FIG. 7B, may be constructed by combining an algorithm and a neural network that perform region extraction of the object O and object recognition of the object O separately.

(4-3) Modification C
Further, as shown in FIG. 8, the work support device 20 according to the present embodiment may combine, with the candidate image Gk, an image showing the type of the object and the value of the probability that the object O appears, and output the result. This allows the worker to easily recognize what the object O appearing in the candidate image Gk is. For example, the area indicated by symbol M in FIG. 8 shows that the image corresponding to the bounding box B is a traffic light showing a red light (denoted as Red_light in FIG. 8) and that the probability that it is a traffic light showing a red light is 43.21%. The area indicated by symbol M is displayed near the corresponding bounding box B.
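The text shown in the area M can be produced from the label and probability of a detection. A minimal sketch follows; the function name `format_label` and the exact layout of the string (label, a space, then a percentage with two decimals) are assumptions, since the source only gives the example values Red_light and 43.21%.

```python
def format_label(label: str, prob: float) -> str:
    """Build the text displayed near a bounding box from the object type
    and the probability (given in [0, 1]) that the object appears."""
    return f"{label} {prob * 100:.2f}%"


print(format_label("Red_light", 0.4321))
```

Showing the probability next to the type lets the worker see at a glance how uncertain the model was about each box.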

(4-4) Modification D
Further, using the object detection model 21M, the work support device 20 according to the present embodiment may store images from an arbitrary moving image GD that include a region in which the probability that the object O appears is equal to or greater than the first threshold value and equal to or less than the second threshold value, as candidate images Gk in folders divided by object. The work support device 20 may then output the candidate images Gk stored in each folder together with the bounding box B.

This makes it possible to judge efficiently whether or not the object appears in the candidate image Gk. Supplementally, since the images accumulated in each folder are associated with a predetermined object, the worker, when displaying the images in a folder one after another, only has to confirm whether or not that object appears.

For example, the worker opens a folder in which a plurality of candidate images for a traffic light showing a green light have been accumulated and outputs the images in the folder one after another, and can thereby efficiently determine whether or not a traffic light showing a green light appears in those candidate images. When confirming the images in the folder one after another, the worker can display the next image by clicking the icon indicated by symbol I2 in FIG. 8. When the icon I2 is clicked, the next image is displayed and, at the same time, the candidate image Gk being displayed is set as containing a traffic light showing a green light. In short, the worker can perform the annotation work for generating the teacher images Gt used for the object detection model 21M simply by clicking the icon I2 while checking the images one after another. Note that symbol I1 in FIG. 8 is an icon meaning “back”; when the icon I1 is clicked, the previously displayed candidate image is displayed.

If a plurality of objects of the same type are shown in the candidate image Gk, they are stored in the folder corresponding to the objects as they are. On the other hand, when a plurality of objects of different types are shown in the candidate image Gk, they are stored in a folder indicating an exception.
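The folder rule described above can be sketched in a few lines. The folder name `exception` and the label `Green_light` are hypothetical (the source names only Red_light), and the function takes the list of object types detected in one candidate image.

```python
from typing import List

EXCEPTION_FOLDER = "exception"  # hypothetical name for the folder indicating an exception


def folder_for(labels: List[str]) -> str:
    """Choose the folder for a candidate image from the types of objects in it.

    One type (even if several objects of that type appear): folder for that type.
    Several different types: the exception folder.
    """
    kinds = set(labels)
    if not kinds:
        raise ValueError("a candidate image must contain at least one detected object")
    if len(kinds) == 1:
        return next(iter(kinds))
    return EXCEPTION_FOLDER


print(folder_for(["Green_light", "Green_light"]))
print(folder_for(["Green_light", "Red_light"]))
```

Keeping mixed-type images in a separate folder preserves the property that one click in a per-object folder corresponds to one unambiguous label.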

The present disclosure is not limited to the above embodiments as they are. The present disclosure can be embodied by modifying the constituent elements within the scope not departing from the gist of the present invention at the implementation stage. In addition, the present disclosure can form various disclosures by appropriately combining a plurality of constituent elements disclosed in each of the above-described embodiments. For example, some components may be deleted from all the components shown in the embodiment. Further, the constituent elements may be appropriately combined with different embodiments.

Further, the configurations of the following aspects are also disclosed in the above-described embodiments.
The work support device of the first aspect is a work support device that supports the work of setting whether or not an object appears in a region of an image. The work support device includes an extraction unit and a generation unit. Using an object detection model constructed from teacher images in which the object appears, the extraction unit extracts, from an arbitrary moving image, an image that includes a region in which the probability that the object appears is equal to or greater than a first threshold value, as a candidate image. The generation unit generates coordinate information of the region in the candidate image in which the probability that the object appears is equal to or greater than the first threshold value. With this configuration, whether or not an image is a target image in which the object appears can be set efficiently by designating or canceling the coordinate region in the candidate image. As a result, a large number of teacher images can be collected efficiently.

The work support device of the second aspect is the work support device of the first aspect, in which the object detection model is constructed by a neural network that performs region extraction of the object and object recognition of the object end-to-end. With this configuration, detection of the object can be sped up.

The work support device of the third aspect is the work support device of the first or second aspect, in which the extraction unit extracts, as a candidate image, an image including a region in which the probability that the object appears is equal to or less than a second threshold value. With this configuration, candidate images that contribute to improving the detection accuracy of the object detection model can be collected efficiently.

The work support device of the fourth aspect is the work support device of any of the first to third aspects, in which the extraction unit stops extracting a candidate image when the amount of change from the previously extracted candidate image is equal to or less than a predetermined amount. This stops the collection of candidate images that do not contribute to improving the detection accuracy of the object detection model. As a result, candidate images that contribute to improving the detection accuracy of the object detection model can be collected efficiently.

The work support device of the fifth aspect is the work support device of any of the first to fourth aspects and further includes an updating unit. The updating unit updates the object detection model by adding, to the teacher images, images for which it has been set whether or not the object appears. With this configuration, candidate images that contribute to improving the detection accuracy of the object can be collected efficiently as the device is used.

The object detection model of the sixth aspect is an object detection model updated by the work support device of the fifth aspect. Therefore, an object detection model with high detection accuracy can be provided.

The program according to the seventh aspect causes a computer to function as a work support device that supports a work for setting whether or not an object is shown in an area in an image. This program causes a computer to function as an extraction unit and a generation unit. The extraction unit includes an area in which the probability that the object is imaged is equal to or greater than a first threshold value from an arbitrary moving image by using the object detection model constructed by using the teacher image in which the object is imaged. Extract the image as a candidate image. The generation unit generates coordinate information of a region in the candidate image in which the probability that the object is captured is equal to or higher than the first threshold value. With such a configuration, by designating or canceling the coordinates in the candidate image, it is possible to efficiently set whether or not the target object is a captured target object image. As a result, a large number of teacher images can be efficiently collected.

The work support method according to the eighth aspect is a method of using a computer to support work for setting whether or not an object is captured in a region within an image. In this work support method, an object detection model constructed using a teacher image in which the object is captured is used to extract from an arbitrary moving image, as a candidate image, an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value. Then, coordinate information of the region in the candidate image in which the probability that the object is captured is equal to or greater than the first threshold value is generated. Therefore, according to this work support method, it can be efficiently set whether or not an image is one in which the object is captured. As a result, a large number of teacher images can be collected efficiently.
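The two steps of this method, extraction of candidate images and generation of coordinate information, can be sketched as follows. This is an illustration only: the per-frame detector output is supplied as ready-made data, and the `Detection` type, the `(x_min, y_min, x_max, y_max)` box convention, and the function name `extract_candidates` are assumptions that do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Assumed convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Detection:
    label: str
    probability: float  # probability that the object is captured in the box
    box: Box


def extract_candidates(
    detections_per_frame: Dict[int, List[Detection]],
    first_threshold: float,
) -> Dict[int, List[Box]]:
    """Map each candidate frame index to the coordinates of its qualifying regions."""
    candidates: Dict[int, List[Box]] = {}
    for frame_index, detections in detections_per_frame.items():
        boxes = [d.box for d in detections if d.probability >= first_threshold]
        if boxes:  # a frame becomes a candidate only if some region qualifies
            candidates[frame_index] = boxes
    return candidates


# Hypothetical detector output for three frames of a moving image.
detections_per_frame = {
    0: [Detection("cup", 0.92, (10, 20, 110, 220))],
    1: [Detection("cup", 0.30, (12, 22, 108, 215))],
    2: [Detection("cup", 0.75, (5, 5, 50, 50)), Detection("cup", 0.40, (60, 60, 90, 90))],
}
candidates = extract_candidates(detections_per_frame, first_threshold=0.7)
print(candidates)
```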

The work support device according to the ninth aspect includes an extraction unit and an output unit. The extraction unit uses an object detection model, constructed using a teacher image in which the object is captured, to extract from an arbitrary moving image, as a candidate image, an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value. The output unit outputs an image in which a bounding box, corresponding to the region in which the probability that the object is captured is equal to or greater than the first threshold value, is combined with the candidate image. With such a configuration, by keeping or deleting the displayed bounding box, it can be efficiently set whether or not an image is one in which the object is captured. As a result, a large number of teacher images can be collected efficiently.

The work support device according to the tenth aspect is the work support device according to the ninth aspect, and further includes a setting unit that accepts a setting that the object is captured in the bounding box or a setting that the object is not captured in the bounding box. Through this setting unit, it can be efficiently set whether or not an image is one in which the object is captured.

The work support device according to the eleventh aspect is the work support device according to the ninth or tenth aspect, wherein the object detection model detects a plurality of objects. In addition, the work support device further includes a setting unit that accepts, for an image combined with a bounding box corresponding to a region in which the probability that a first object is captured is equal to or greater than the first threshold value, a setting that a second object is captured instead of the first object. With such a configuration, an erroneous detection of the object can be corrected easily.
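The settings accepted in the tenth and eleventh aspects (object captured, object not captured, or a different object captured) can be sketched as a single function applied to one bounding box. The action names `"confirm"`, `"reject"`, and `"relabel"`, along with `BoxAnnotation` and `apply_setting`, are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class BoxAnnotation:
    label: str                       # object the model believes is captured
    box: Tuple[int, int, int, int]   # assumed (x_min, y_min, x_max, y_max)


def apply_setting(
    annotation: BoxAnnotation,
    action: str,
    new_label: Optional[str] = None,
) -> Optional[BoxAnnotation]:
    """Apply an operator setting to one bounding box.

    "confirm": the object is captured, keep the box as is.
    "reject":  the object is not captured, drop the box.
    "relabel": a second object is captured instead of the first.
    """
    if action == "confirm":
        return annotation
    if action == "reject":
        return None
    if action == "relabel":
        if new_label is None:
            raise ValueError("relabel requires new_label")
        return replace(annotation, label=new_label)
    raise ValueError(f"unknown action: {action}")


annotation = BoxAnnotation("cup", (10, 20, 110, 220))
relabeled = apply_setting(annotation, "relabel", new_label="bottle")
print(relabeled)
```

Keeping the box coordinates unchanged on relabeling reflects the case described above, where the region was detected correctly but assigned to the wrong object.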

The work support device according to the twelfth aspect is the work support device according to any one of the ninth to eleventh aspects, wherein the object detection model detects a plurality of objects. In addition, the work support device uses the object detection model to extract from an arbitrary moving image, as candidate images, images including a region in which the probability that an object is captured is equal to or greater than the first threshold value, and stores them in folders divided by object. As a result, the operator only needs to display the images accumulated in each folder in succession and confirm whether or not the object is captured, which reduces the burden on the operator.
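A per-object folder layout of the kind described in the twelfth aspect might be planned as in the sketch below. The root folder name `candidates`, the file naming pattern, and the function `plan_folders` are assumptions for illustration; the example only computes destination paths and performs no file I/O.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List, Tuple

# Hypothetical detector output: (frame index, object label, probability).
DetectionRow = Tuple[int, str, float]


def plan_folders(
    detections: List[DetectionRow],
    first_threshold: float,
    root: str = "candidates",
) -> Dict[str, List[str]]:
    """Group candidate image paths into one folder per detected object."""
    folders: Dict[str, List[str]] = defaultdict(list)
    for frame_index, label, probability in detections:
        if probability >= first_threshold:
            path = PurePosixPath(root) / label / f"frame_{frame_index:06d}.png"
            folders[label].append(str(path))
    return dict(folders)


detections = [(0, "cup", 0.9), (1, "bottle", 0.8), (2, "cup", 0.4), (3, "cup", 0.95)]
folders = plan_folders(detections, first_threshold=0.7)
print(folders)
```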

The work support device according to the thirteenth aspect is the work support device according to any one of the ninth to twelfth aspects, and combines the value of the probability that the object is captured with the candidate image and outputs the result. This allows the operator to easily recognize what the object in the candidate image is.

The work support device according to the fourteenth aspect is the work support device according to any one of the ninth to thirteenth aspects, wherein the extraction unit extracts, as a candidate image, an image including a region in which the probability that the object is captured is equal to or less than a second threshold value. With such a configuration, candidate images that contribute to improving the detection accuracy of the object detection model can be collected efficiently.
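One way to read the first and second threshold values together is as a band: regions whose probability is at least the first threshold but no more than the second are the ones worth reviewing, since the model already handles very confident detections well. The sketch below follows that interpretation, which is an assumption rather than something the text states explicitly; the function name `select_uncertain` and the example values are hypothetical.

```python
from typing import List


def select_uncertain(
    probabilities: List[float],
    first_threshold: float,
    second_threshold: float,
) -> List[float]:
    """Keep probabilities inside [first_threshold, second_threshold]."""
    return [p for p in probabilities if first_threshold <= p <= second_threshold]


selected = select_uncertain([0.2, 0.55, 0.8, 0.97], first_threshold=0.5, second_threshold=0.9)
print(selected)
```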

The work support device according to the fifteenth aspect is the work support device according to any one of the ninth to fourteenth aspects, wherein the extraction unit stops extracting candidate images when the amount of change from the previously extracted candidate image is less than or equal to a predetermined amount. As a result, candidate images that contribute to improving the detection accuracy of the object detection model can be collected efficiently.

The work support device according to the sixteenth aspect is the work support device according to any one of the ninth to fifteenth aspects, and further includes an updating unit that updates the object detection model by adding, to the teacher images, an image for which it has been set whether or not the object is captured. With such a configuration, candidate images that contribute to improving the detection accuracy for the object can be collected efficiently according to the intended use.

The program according to the seventeenth aspect causes a computer to function as an extraction unit and an output unit. The extraction unit uses an object detection model, constructed using a teacher image in which the object is captured, to extract from an arbitrary moving image, as a candidate image, an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value. The output unit outputs an image in which a bounding box, corresponding to the region in which the probability that the object is captured is equal to or greater than the first threshold value, is combined with the candidate image. With such a configuration, by keeping or deleting the displayed bounding box, it can be efficiently set whether or not an image is one in which the object is captured. As a result, a large number of teacher images can be collected efficiently.

The work support method according to the eighteenth aspect is a method of using a computer to support work for setting whether or not an object (O) is captured in a region within an image. In this work support method, an object detection model constructed using a teacher image in which the object is captured is used to extract from an arbitrary moving image, as a candidate image, an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value. Further, an image in which a bounding box, corresponding to the region in which the probability that the object is captured is equal to or greater than the first threshold value, is combined with the candidate image is output. Therefore, according to this work support method, by keeping or deleting the displayed bounding box, it can be efficiently set whether or not an image is one in which the object is captured. As a result, a large number of teacher images can be collected efficiently.

20 work support device
21 storage unit
21M object detection model
22 input unit
23 output unit
24 processing unit
24A extracting unit
24B generating unit
24C combining unit
24D setting unit
24E updating unit
GD moving image
Gd frame image
Gk candidate image
Gt teacher image
Gc reference image
M area in which the value of the probability that the object is captured is displayed
I1 icon (Back)
I2 icon (Next)

JP, 2008-169672, A

Claims

A work support device (20) that supports work for setting whether or not an object (O) is captured in a region within an image, the work support device comprising:
an extraction unit (24A) that uses an object detection model (21M), constructed using a teacher image (Gt) in which the object is captured, to extract from an arbitrary moving image (GD), as a candidate image (Gk), an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value; and
a generation unit (24B) that generates coordinate information of the region in the candidate image in which the probability that the object is captured is equal to or greater than the first threshold value.
The work support device according to claim 1, wherein the object detection model is constructed by a neural network that performs extraction of the object and detection of the region of the object end to end.
The work support device according to claim 1, wherein the extraction unit extracts, as the candidate image, an image including a region in which the probability that the object is captured is equal to or less than a second threshold value.
The work support device according to any one of claims 1 to 3, wherein the extraction unit stops extracting the candidate image when the amount of change from the previously extracted candidate image is less than or equal to a predetermined amount.
The work support device according to any one of claims 1 to 4, further comprising an updating unit (24E) that updates the object detection model by adding, to the teacher image, an image for which it has been set whether or not the object is captured.
An object detection model updated by the work support device according to claim 5.
A program that causes a computer to function as a work support device (20) that supports work for setting whether or not an object (O) is captured in a region within an image, the program causing the computer to function as:
an extraction unit (24A) that uses an object detection model (21M), constructed using a teacher image (Gt) in which the object is captured, to extract from an arbitrary moving image (GD), as a candidate image (Gk), an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value; and
a generation unit (24B) that generates coordinate information of the region in the candidate image in which the probability that the object is captured is equal to or greater than the first threshold value.
A work support method for supporting, using a computer, work for setting whether or not an object (O) is captured in a region within an image, the method comprising:
extracting from an arbitrary moving image (GD), as a candidate image (Gk), an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value, using an object detection model (21M) constructed using a teacher image (Gt) in which the object is captured; and
generating coordinate information of the region in the candidate image in which the probability that the object is captured is equal to or greater than the first threshold value.
A work support device (20) comprising:
an extraction unit (24A) that uses an object detection model (21M), constructed using a teacher image (Gt) in which an object (O) is captured, to extract from an arbitrary moving image (GD), as a candidate image (Gk), an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value; and
an output unit (23) that outputs an image in which a bounding box (B), corresponding to the region in which the probability that the object is captured is equal to or greater than the first threshold value, is combined with the candidate image.
The work support device according to claim 9, further comprising a setting unit (22, 24D) that accepts a setting that the object is captured in the bounding box or a setting that the object is not captured in the bounding box.
The work support device according to claim 9, wherein the object detection model detects a plurality of objects, and
the work support device further comprises a setting unit (22, 24D) that accepts, for an image combined with a bounding box corresponding to a region in which the probability that a first object is captured is equal to or greater than the first threshold value, a setting that a second object is captured instead of the first object.
The work support device according to any one of claims 9 to 11, wherein the object detection model detects a plurality of objects, and
the work support device uses the object detection model (21M) to extract from an arbitrary moving image (GD), as candidate images (Gk), images including a region in which the probability that an object is captured is equal to or greater than the first threshold value, and stores them in folders divided by object.
The work support device according to any one of claims 9 to 12, wherein the value of the probability that the object is captured is combined with the candidate image and output.
The work support device according to any one of claims 9 to 13, wherein the extraction unit extracts, as the candidate image, an image including a region in which the probability that the object is captured is equal to or less than a second threshold value.
The work support device according to any one of claims 9 to 14, wherein the extraction unit stops extracting the candidate image when the amount of change from the previously extracted candidate image is less than or equal to a predetermined amount.
The work support device according to any one of claims 9 to 15, further comprising an updating unit (24E) that updates the object detection model by adding, to the teacher image, an image for which it has been set whether or not the object is captured.
A program that causes a computer to function as:
an extraction unit (24A) that uses an object detection model (21M), constructed using a teacher image (Gt) in which an object (O) is captured, to extract from an arbitrary moving image (GD), as a candidate image (Gk), an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value; and
an output unit (23) that outputs an image in which a bounding box (B), corresponding to the region in which the probability that the object is captured is equal to or greater than the first threshold value, is combined with the candidate image.
A work support method for supporting, using a computer, work for setting whether or not an object (O) is captured in a region within an image, the method comprising:
extracting from an arbitrary moving image (GD), as a candidate image (Gk), an image including a region in which the probability that the object is captured is equal to or greater than a first threshold value, using an object detection model (21M) constructed using a teacher image (Gt) in which the object (O) is captured; and
outputting an image in which a bounding box (B), corresponding to the region in which the probability that the object is captured is equal to or greater than the first threshold value, is combined with the candidate image.