WO2022153599A1 - Information processing device and information processing method - Google Patents
Information processing device and information processing method
- Publication number
- WO2022153599A1 (application PCT/JP2021/033706)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame
- detection
- moving body
- reliability
- moving
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/254—Analysis of motion involving subtraction of images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
- G06T7/62—Analysis of geometric attributes of area, perimeter, diameter or volume
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20221—Image fusion; Image merging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20212—Image combination
- G06T2207/20224—Image subtraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Definitions
- The present invention relates to an information processing device and an information processing method.
- Patent Document 1 discloses a technique for distinguishing a detection target from other moving objects based on physical quantity information, such as the detection position, of each moving object.
- The difference region extracted by the moving-object difference method may vary depending on differences in moving speed or manner of movement.
- The moving-object difference method can output a detection rectangle (detection frame) that follows the most recent change, but the extracted moving object region may be unstable because it depends on the accuracy of the inter-frame difference or the background subtraction. For example, for a person who works without changing position, the part of the body that moves changes over time, so it is difficult to output a moving object rectangle of stable size.
- One aspect of the present invention is to provide a technique for improving the detection accuracy of a moving object in a moving image and stably outputting a detection frame.
- The present invention adopts the following configuration in order to achieve the above object.
- The first aspect of the present disclosure is an information processing device including a detection unit that detects a moving object from each frame image of a moving image, a calculation unit that calculates the reliability that the detected moving object is a predetermined subject, and a determination unit that determines the detection frame of a first moving object detected in a first frame.
- For the moving object detected in the current frame (the first moving object in the first frame), the information processing device also calculates the reliability using the detection frame of the moving object detected in the previous frame (the second moving object in the second frame).
- The detection frame of the first moving object is determined based on these reliabilities. By adopting the detection frame with the higher reliability, the information processing device can improve the detection accuracy of moving objects and can stably output the detection frame.
- a "predetermined subject" is a moving object to be detected, such as a human body.
- the information processing device may further include a determination unit that determines a second moving object that is the same subject as the first moving object among the plurality of moving objects detected in the second frame. By more accurately determining the moving object of the same subject as the first moving object among the moving objects detected in the second frame, the information processing apparatus can stably output the detection frame for the same subject.
- The determination unit may determine the second moving object, which is the same subject as the first moving object, based on the distance between the centers of the frame circumscribing the first moving object and the detection frame of each moving object detected in the second frame.
- The information processing device can reduce the processing load by determining the second moving object, which is the same subject as the first moving object, by a simple method.
- The determination unit may determine the second moving object, which is the same subject as the first moving object, based on the ratio of the overlapping area to the area occupied by the frame circumscribing the first moving object and the detection frame of each moving object detected in the second frame.
- The information processing device can reduce the processing load by determining the second moving object, which is the same subject as the first moving object, by a simple method.
- The determination unit may determine the second moving object, which is the same subject as the first moving object, by collating the first moving object with each moving object detected in the second frame using a machine learning matching algorithm.
- The information processing device can accurately determine the second moving object, which is the same subject as the first moving object.
- The determination unit may determine, for each of a plurality of frames before the first frame, which of the moving objects detected in that frame is the same subject as the first moving object. If the highest of the reliabilities of the first moving object obtained with the detection frames of the moving objects determined to be the same subject is larger than the reliability obtained with the frame circumscribing the first moving object, the detection frame for which the highest reliability was calculated may be determined as the detection frame of the first moving object.
- When the reliability of the first moving object obtained with the circumscribing frame is larger than a first threshold value, the determination unit may determine the frame circumscribing the first moving object as the detection frame of the first moving object.
- In this case, the information processing device determines the detection frame without comparing it with the reliability obtained with the detection frame of the previous frame, so the processing load can be reduced.
- When the reliability of the first moving object obtained with the detection frame of the second moving object is larger than the reliability obtained with the circumscribing frame, the determination unit may determine the detection frame of the second moving object as the detection frame of the first moving object.
- The information processing apparatus can improve the detection accuracy of moving objects by adopting the detection frame with the higher reliability.
- the determination unit may record the detection frame of the first moving object in the recording unit when the reliability of the determined detection frame of the first moving object is larger than the second threshold value. Since the frame whose reliability is equal to or less than the second threshold value is not recorded in the recording unit, the information processing apparatus can output a stable detection frame.
- When the reliability of the first moving object obtained with the detection frame of the second moving object is larger than the reliability obtained with the frame circumscribing the first moving object, so that the detection frame of the second moving object is determined as the detection frame of the first moving object, the determination unit may record the detection frame of the first moving object in the recording unit unless frames in which the difference between the frame circumscribing the first moving object and the detection frame of the second moving object is larger than a third threshold value occur in succession.
- the difference may be, for example, an area change from the detection frame of the second moving object to the frame circumscribing the first moving body, or the ratio of the area change to the area of the detecting frame of the second moving body.
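The two difference measures named above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `(x, y, width, height)` rectangle layout, the function names, and the use of an absolute value for the area change are all assumptions.

```python
def area(rect):
    """Area of a rectangle given as (x, y, width, height) (assumed layout)."""
    _, _, w, h = rect
    return w * h


def area_change(prev_rect, circ_rect):
    """Area change from the previous detection frame to the circumscribed frame.

    Taking the absolute value is an assumption; the text only says "area change".
    """
    return abs(area(circ_rect) - area(prev_rect))


def area_change_ratio(prev_rect, circ_rect):
    """Area change relative to the area of the previous detection frame."""
    return area_change(prev_rect, circ_rect) / area(prev_rect)


prev_rect = (0, 0, 10, 20)   # hypothetical detection frame of the second moving object
circ_rect = (0, 0, 10, 10)   # hypothetical frame circumscribing the first moving object
print(area_change(prev_rect, circ_rect))        # 100
print(area_change_ratio(prev_rect, circ_rect))  # 0.5
```

Either value could then be compared against the third threshold value.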
- By not recording the detection frame for the first moving object when the difference between the circumscribed frame in the current frame and the detection frame in the previous frame is larger than the third threshold value in successive frames, the information processing device can reduce the output of detection frames caused by erroneous detection.
- the information processing device may further include an output unit that superimposes and outputs the detection frame of the first moving object recorded in the recording unit on the first frame.
- The detection accuracy of the moving object in the moving image is improved, and the information processing apparatus can output a stable detection frame.
- the output unit may output the first moving object detection frame when the reliability of the first moving object detection frame recorded in the recording unit is greater than the second threshold value.
- the information processing device can stably output a detection frame having a reliability higher than the second threshold value.
- When the reliability of the first moving object obtained with the detection frame of the second moving object is larger than the reliability obtained with the frame circumscribing the first moving object, the output unit may output the detection frame of the first moving object recorded in the recording unit unless frames in which the difference between the frame circumscribing the first moving object and the detection frame of the second moving object is larger than the third threshold value occur in succession.
- By controlling output so that the detection frame for the first moving object is not output when frames in which the difference between the circumscribed frame in the current frame and the detection frame in the previous frame is larger than the third threshold value occur in succession, the information processing device can reduce the output of detection frames caused by erroneous detection.
- The output unit may output the detection frame of the first moving object when the number of consecutive frames in which the reliability obtained with the determined detection frame of the first moving object is larger than the first threshold value exceeds a predetermined number.
- By controlling output so that the detection frame for the first moving object is output when frames with a reliability higher than the first threshold value continue, the information processing device can continuously output highly reliable detection frames.
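The consecutive-frame condition for output can be illustrated with a simple run counter. This is a sketch under stated assumptions: the function name, the per-frame list of reliabilities, and the sample threshold values are hypothetical and are not taken from the patent.

```python
def should_output(reliabilities, th1, min_consecutive):
    """For each frame, decide whether the detection frame would be output.

    A frame is output only when the number of consecutive frames (ending at
    that frame) whose reliability is larger than th1 exceeds min_consecutive.
    """
    decisions = []
    run = 0  # length of the current run of frames with reliability > th1
    for reliability in reliabilities:
        run = run + 1 if reliability > th1 else 0
        decisions.append(run > min_consecutive)
    return decisions


# Hypothetical reliabilities for five successive frames.
print(should_output([600, 700, 800, 400, 900], th1=500, min_consecutive=2))
# [False, False, True, False, False]
```

Resetting the counter on a single low-reliability frame is the simplest reading of "consecutive"; a more tolerant policy would be a separate design choice.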
- The information processing apparatus may further include a correction unit that corrects the detection frame of the second moving object based on the change in position and size between the detection frame of the second moving object and the detection frame of the moving object determined to be the same subject as the first moving object in a frame before the second frame. By correcting the detection frame of the moving object detected in the previous frame, the correction unit 125 can improve the reliability of the moving object when the corrected frame is applied to the current frame.
- the detection unit may detect a moving object by at least one of the inter-frame difference method and the background subtraction method.
- the calculation unit may calculate the reliability that the detected moving object is a predetermined subject by a classifier based on at least one of a neural network, boosting, and a support vector machine.
- The second aspect of the present invention is an information processing method in which a computer executes: a detection step of detecting a first moving object from a first frame included in a moving image; a calculation step of calculating the reliability that the first moving object is a predetermined subject, using a frame circumscribing the first moving object and using the detection frame, recorded in a recording unit, of a second moving object detected in a second frame before the first frame; and a determination step of determining the detection frame of the first moving object based on the reliability of the first moving object obtained with the frame circumscribing the first moving object and the reliability of the first moving object in the first frame obtained with the detection frame of the second moving object, and recording it in the recording unit.
- The present invention can also be regarded as a program for realizing such a method by a computer, or as a recording medium in which the program is recorded non-transitorily. Each of the above means and processes can be combined with one another as far as possible to form the present invention.
- According to the present invention, it is possible to improve the detection accuracy of a moving object in a moving image and to stably output the detection frame.
- FIG. 1 is a diagram illustrating an application example of the information processing apparatus according to the embodiment.
- FIG. 2 is a diagram illustrating a hardware configuration of an information processing device.
- FIG. 3 is a diagram illustrating the functional configuration of the information processing apparatus.
- FIG. 4 is a flowchart showing an example of the detection rectangle output process.
- FIGS. 5A to 5C are diagrams illustrating a method for determining the same subject.
- FIG. 6 is a flowchart showing an example of the detection rectangle output process according to the second embodiment.
- FIG. 7 is a flowchart showing an example of the detection rectangle output process according to the third embodiment.
- FIG. 8 is a flowchart showing another example of the detection rectangle output process according to the third embodiment.
- FIGS. 9A and 9B are diagrams illustrating a situation in which the fourth embodiment is applied.
- FIG. 10 is a flowchart showing an example of the detection rectangle output process according to the fourth embodiment.
- FIG. 11 is a flowchart showing another example of the detection rectangle output process according to the fourth embodiment.
- FIG. 12 is a flowchart showing an example of the detection rectangle output process according to the fifth embodiment.
- FIG. 13 is a flowchart showing an example of the detection rectangle output process according to the sixth embodiment.
- FIG. 14 is a diagram illustrating a functional configuration of the information processing apparatus according to the seventh embodiment.
- FIG. 15 is a diagram illustrating correction of the detection rectangle according to the seventh embodiment.
- FIG. 16 is a flowchart showing an example of the detection rectangle output process according to the seventh embodiment.
- FIG. 1 is a diagram illustrating an application example of the information processing apparatus according to the embodiment.
- The information processing device acquires a moving image input from the camera, and detects a moving object (hereinafter also referred to as a moving body) from each image frame of the acquired moving image.
- The camera is assumed to be, for example, a fixed camera such as a surveillance camera.
- The information processing device can extract a moving object using, for example, a background subtraction method that extracts the region that changed between a frame image and a background image prepared in advance, an inter-frame difference method that extracts the region that changed between frames, or both.
- the example of FIG. 1 shows an example in which the moving body A1 was extracted at time T.
- the information processing device generates a rectangle A2 that circumscribes the extracted moving object A1.
- the shape of the frame indicating the moving body region is described as being a rectangle, but the shape is not limited to a rectangle, and may be an ellipse, a polygon, a curve circumscribing the moving body, or the like. Anything that surrounds the moving body area may be used.
- the information processing device acquires the reliability of the moving object by inputting the detected moving object into the machine learning classifier, for example. It is assumed that the reliability in the example of FIG. 1 is the reliability of the human body.
- the circumscribed rectangle A2 includes a region in which a portion excluding the head of the human body is extracted as a moving body region. Therefore, the reliability when the image of the area surrounded by the circumscribed rectangle A2 is input to the classifier is 500.
- the information processing device calculates the reliability of the image cut from the current frame using the detection rectangle of the same subject detected in the previous frame. The information processing device compares the calculated reliability with the reliability of the circumscribed rectangle of the moving object detected in the current frame.
- In the example of FIG. 1, the information processing apparatus calculates the reliability of the image cut out of the current frame at time T using the detection rectangle A3, which was determined at time T-1 (previous frame) for the same subject as the moving body A1.
- The calculated reliability of 1000 is higher than the reliability of 500 obtained at time T with the circumscribed rectangle A2 of the moving body A1.
- When the reliability obtained with the detection rectangle of the previous frame is higher, the information processing device determines the detection rectangle of the previous frame as the detection rectangle of the moving object detected in the current frame.
- In the example of FIG. 1, since the reliability of 1000 obtained with the detection rectangle A3 of time T-1 is higher than the reliability of 500 obtained at time T, the information processing apparatus determines the detection rectangle A3 as the detection rectangle of the moving body A1 in the current frame at time T.
- the information processing apparatus determines the detection rectangle of the moving body based on the reliability of the circumscribed rectangle of the moving body detected in the current frame and the reliability of the detection rectangle of the same moving body detected in the previous frame.
- the information processing device can improve the accuracy of moving object detection by adopting a rectangle having higher reliability as the detection rectangle. Further, the information processing apparatus can output a stable detection rectangle by adopting the detection rectangle of the previous frame even when the moving object stops moving in the moving image or when the moving object hardly moves. Therefore, even when a moving object is detected by the inter-frame difference method, the detection accuracy of a stationary object is improved.
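As a worked illustration of the FIG. 1 example, the comparison can be written in a few lines. The reliabilities 500 and 1000 come from the text; the rectangle coordinates, the `(rect, reliability)` pairing, and the function name are hypothetical.

```python
def choose_detection_rect(circumscribed, previous):
    """Pick the detection rectangle for the current frame.

    Each argument is a (rect, reliability) pair, where the reliability was
    computed on the current frame. The previous-frame rectangle is adopted
    only when its reliability is strictly larger.
    """
    return previous if previous[1] > circumscribed[1] else circumscribed


a2 = ((40, 60, 30, 50), 500)    # circumscribed rectangle A2 at time T (coordinates assumed)
a3 = ((40, 30, 30, 80), 1000)   # detection rectangle A3 from time T-1 (coordinates assumed)
rect, reliability = choose_detection_rect(a2, a3)
print(rect, reliability)  # (40, 30, 30, 80) 1000
```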
- FIG. 2 is a diagram illustrating a hardware configuration of the information processing device 1.
- The information processing device 1 includes a processor 101, a main storage device 102, an auxiliary storage device 103, a communication interface (I/F) 104, and an output device 105.
- the processor 101 realizes the functions as each functional configuration described with reference to FIG. 3 by reading the program stored in the auxiliary storage device 103 into the main storage device 102 and executing the program.
- the communication interface 104 is an interface for performing wired or wireless communication.
- The output device 105 is, for example, a device for output, such as a display.
- the information processing device 1 may be a general-purpose computer such as a personal computer, a server computer, a tablet terminal, or a smartphone, or an embedded computer such as an onboard computer.
- the information processing device 1 may be realized by, for example, distributed computing by a plurality of computer devices, or a part of each functional unit may be realized by a cloud server. Further, a part of each functional unit of the information processing device 1 may be realized by a dedicated hardware device such as FPGA or ASIC.
- the information processing device 1 is connected to the camera 2 by wire (USB cable, LAN cable, etc.) or wirelessly (WiFi, etc.), and receives image data captured by the camera 2.
- the camera 2 is an image pickup device having an optical system including a lens and an image pickup element (an image sensor such as a CCD or CMOS).
- the information processing device 1 may be integrally configured with the camera 2. Further, a part of the processing of the information processing apparatus 1 such as motion detection and human body determination processing for the captured image may be executed by the camera 2. Further, the result of the human body detection by the information processing device 1 may be transmitted to an external device and presented to the user.
- FIG. 3 is a diagram illustrating the functional configuration of the information processing device 1.
- The information processing device 1 includes an image acquisition unit 11, a processing unit 12, a detection rectangle database (DB) 13, and an output unit 14.
- The processing unit 12 includes a detection unit 121, a calculation unit 122, a determination unit 123 that judges whether moving objects are the same subject, and a determination unit 124 that decides the detection rectangle.
- the image acquisition unit 11 transmits the moving image data acquired from the camera 2 to the processing unit 12.
- the detection unit 121 of the processing unit 12 detects a moving object for each frame of the moving image received from the image acquisition unit 11.
- the detection unit 121 can detect a moving object by, for example, a background subtraction method or an inter-frame difference method.
- the calculation unit 122 calculates the reliability that the detected moving object is a predetermined subject (for example, a human body).
- The calculation unit 122 can calculate the reliability by, for example, a neural network algorithm such as a CNN (Convolutional Neural Network). The calculation unit 122 may also calculate the reliability using a machine learning discriminator such as boosting or a support vector machine (SVM).
- the determination unit 123 determines which of the moving objects detected in the previous frame is the same subject as the moving object detected in the current frame.
- Information on the moving object detected in the previous frame and the corresponding detection rectangle is stored in the detection rectangle database 13.
- The determination unit 123 determines whether the moving object detected in the current frame and a moving object detected in the previous frame are the same subject based on, for example, the distance between the centers of the rectangle circumscribing the moving object detected in the current frame and the detection rectangle of the moving object detected in the previous frame.
- The determination unit 124 determines the detection rectangle for the moving object detected in the current frame based on the reliability calculated by the calculation unit 122, and registers it in the detection rectangle database 13. For example, when the reliability of the circumscribed rectangle of the moving object detected in the current frame is larger than a predetermined threshold value, the determination unit 124 determines the circumscribed rectangle as the detection rectangle of the moving object and registers it in the detection rectangle database 13.
- When the reliability obtained with the circumscribed rectangle is equal to or less than the threshold value, the determination unit 124 applies the detection rectangle of the same subject detected in the previous frame to the current frame, and the reliability is calculated for it.
- The determination unit 124 then determines whichever of the circumscribed rectangle in the current frame and the detection rectangle of the same subject in the previous frame has the higher reliability as the detection rectangle of the moving object detected in the current frame, and registers it in the detection rectangle database 13.
- the detection rectangle database 13 stores the moving objects detected in each frame of the moving image in association with the respective detection rectangles determined by the determination unit 124.
- the detection rectangle database 13 stores, for example, information on the position and size of the detection rectangle in the frame as the information on the detection rectangle. Further, the detection rectangle database 13 may store the reliability of the moving object according to the detection rectangle calculated by the calculation unit 122 as the information of the detection rectangle.
- the detection rectangle database 13 is an example of a recording unit.
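One possible in-memory shape for such a recording unit is sketched below. The text only states that position, size, and optionally reliability are stored per moving object and frame; the class names, field names, and per-frame dictionary are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DetectionRecord:
    """One detection rectangle for one moving object in one frame (hypothetical layout)."""
    frame_index: int
    object_id: int
    x: int
    y: int
    width: int
    height: int
    reliability: Optional[float] = None  # storing the reliability is optional per the text


class DetectionRectDB:
    """Minimal stand-in for the detection rectangle database 13."""

    def __init__(self) -> None:
        self._by_frame: Dict[int, List[DetectionRecord]] = {}

    def register(self, record: DetectionRecord) -> None:
        """Store a determined detection rectangle under its frame index."""
        self._by_frame.setdefault(record.frame_index, []).append(record)

    def rects_in_frame(self, frame_index: int) -> List[DetectionRecord]:
        """Return the detection rectangles recorded for a frame (empty if none)."""
        return list(self._by_frame.get(frame_index, []))


db = DetectionRectDB()
db.register(DetectionRecord(frame_index=0, object_id=1, x=40, y=30, width=30, height=80, reliability=1000))
print(len(db.rects_in_frame(0)))  # 1
```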
- Based on the information on the moving objects and the corresponding detection rectangles stored in the detection rectangle database 13, the output unit 14 superimposes the detection rectangle of each detected moving object on the image of each frame and outputs the result to the output device 105 such as a display.
- FIG. 4 is a flowchart showing an example of the detection rectangle output process.
- The detection rectangle output process is started, for example, when each frame of the moving image acquired by the image acquisition unit 11 is transmitted to the processing unit 12.
- the detection rectangle output process shown in FIG. 4 is a process executed for each frame of the moving image.
- the detection unit 121 detects a moving object from the image of the frame to be processed (hereinafter referred to as the current frame) received from the image acquisition unit 11.
- the detection unit 121 can detect a moving object by a background subtraction method for extracting a region changed between a frame image and a background image prepared in advance, and an interframe difference method for extracting a region changed between frames.
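A toy version of difference-based detection and of generating the circumscribed rectangle is sketched below. It is not the patent's implementation: frames are assumed to be grayscale images stored as lists of rows, the pixel threshold is hypothetical, and all changed pixels are treated as one moving object, whereas a practical detector would first split the mask into connected regions.

```python
def changed_mask(frame, reference, threshold):
    """Mark pixels whose absolute difference from the reference exceeds the threshold.

    `reference` is the previous frame for the inter-frame difference method,
    or a prepared background image for the background subtraction method.
    """
    return [[abs(p - q) > threshold for p, q in zip(row, ref_row)]
            for row, ref_row in zip(frame, reference)]


def circumscribed_rect(mask):
    """Return (x, y, width, height) of the rectangle enclosing all marked pixels, or None."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, marked in enumerate(row) if marked]
    if not ys:
        return None  # nothing changed, so no moving object was detected
    x0, y0 = min(xs), min(ys)
    return (x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1)


background = [[0] * 5 for _ in range(4)]
frame = [[0, 0, 0, 0, 0],
         [0, 200, 200, 0, 0],
         [0, 0, 200, 0, 0],
         [0, 0, 0, 0, 0]]
print(circumscribed_rect(changed_mask(frame, background, threshold=50)))  # (1, 1, 2, 2)
```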
- the calculation unit 122 calculates the reliability of the image cut from the current frame by the circumscribed rectangle generated in S102.
- the reliability is the reliability that the moving object i in the clipped image is a predetermined subject, for example, a person.
- the calculation unit 122 can calculate the reliability by using, for example, a neural network algorithm such as CNN, or a machine learning discriminator such as boosting or SVM.
- The determination unit 124 determines whether or not the reliability obtained with the circumscribed rectangle in S103 is greater than the predetermined threshold value TH1 (first threshold value). If it is greater than TH1 (S104: Yes), the process proceeds to S109. If it is TH1 or less (S104: No), the process proceeds to loop process L2, which covers S105 to S108.
- In loop process L2, the reliability of the moving body i is calculated using the detection rectangle of the moving body jm, that is, the moving body detected in the previous frame that is determined to be the same subject as the moving body i.
- The determination unit 124 determines the detection rectangle of the moving body i in the current frame based on the calculated reliability and the reliability of the moving body i obtained with the circumscribed rectangle.
- the processing of each step will be specifically described.
- the determination unit 123 determines whether or not the subject of the moving object j detected in the previous frame is the same as the subject of the moving object i in the current frame. When it is determined that the subject of the moving body j detected in the previous frame is the same as the subject of the moving body i in the current frame (S106: Yes), the process proceeds to S107. If it is determined that they are not the same (S106: No), the process proceeds to the loop process L2 for the detection rectangle of the next moving object j + 1.
- FIG. 5A shows an example of the first same determination method.
- The determination unit 123 determines whether or not the subject of the moving body j detected in the previous frame is the same as the subject of the moving body i in the current frame based on the distance d between the centers of the rectangle A512 circumscribing the moving body i of the current frame and the detection rectangle A511 of the moving body j of the previous frame.
- When the center-to-center distance d is smaller than a predetermined threshold value, the determination unit 123 determines that the subject of the moving body j detected in the previous frame is the same as the subject of the moving body i in the current frame.
- the predetermined threshold value for the center-to-center distance d can be, for example, 1/2 the width of the rectangle A512 circumscribing the moving body i of the current frame.
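The center-distance check of FIG. 5A could look like the following sketch. The `(x, y, width, height)` rectangle layout, the function names, and the strict "smaller than" comparison at the boundary are assumptions; the half-width threshold is the example given in the text.

```python
import math


def center(rect):
    """Center point of a rectangle given as (x, y, width, height) (assumed layout)."""
    x, y, w, h = rect
    return (x + w / 2, y + h / 2)


def same_subject_by_center(circ_rect, prev_rect):
    """Judge same-subject by center distance d against half the circumscribed width."""
    (cx1, cy1), (cx2, cy2) = center(circ_rect), center(prev_rect)
    d = math.hypot(cx1 - cx2, cy1 - cy2)
    return d < circ_rect[2] / 2  # boundary handling (strict <) is an assumption


circ = (100, 100, 40, 80)                               # hypothetical rectangle A512
print(same_subject_by_center(circ, (105, 95, 40, 90)))  # True  (d = 5, threshold = 20)
print(same_subject_by_center(circ, (200, 100, 40, 80))) # False (d = 100)
```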
- FIG. 5B shows an example of the second same determination method.
- The determination unit 123 determines whether or not the subject of the moving body j detected in the previous frame is the same as the subject of the moving body i of the current frame based on the IoU (Intersection over Union) of the rectangle A522 circumscribing the moving body i of the current frame and the detection rectangle A521 of the moving body j of the previous frame.
- IoU is the ratio of the overlapping region of the two rectangles to the region (union) occupied by the rectangle A522 circumscribing the moving body i of the current frame and the detection rectangle A521 of the moving body j of the previous frame.
- When the IoU is larger than a predetermined threshold value, the determination unit 123 determines that the subject of the moving body j detected in the previous frame is the same as the subject of the moving body i in the current frame.
- the predetermined threshold value for IoU can be, for example, 80%.
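The IoU of two axis-aligned rectangles can be computed as in the sketch below. The `(x, y, width, height)` layout and the function names are assumptions; the 0.8 default mirrors the 80% example in the text, and whether the comparison is strict at the boundary is also an assumption.

```python
def iou(a, b):
    """Intersection over Union of two rectangles given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def same_subject_by_iou(circ_rect, prev_rect, threshold=0.8):
    """Judge same-subject when the IoU is larger than the threshold (strictness assumed)."""
    return iou(circ_rect, prev_rect) > threshold


print(iou((0, 0, 10, 10), (0, 0, 10, 10)))                  # 1.0
print(same_subject_by_iou((0, 0, 10, 10), (1, 0, 10, 10)))  # True  (IoU = 90 / 110)
print(same_subject_by_iou((0, 0, 10, 10), (5, 0, 10, 10)))  # False (IoU = 50 / 150)
```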
- FIG. 5C shows an example of the third same-subject determination method.
- The determination unit 123 collates the moving body i of the current frame with the moving body j of the previous frame using a matching algorithm based on machine learning (Re-Id), and thereby determines whether or not the subject of the moving body j detected in the previous frame is the same as the subject of the moving body i of the current frame.
- the determination unit 123 can accurately determine that the subject is the same subject by using a collation algorithm based on machine learning.
- the determination unit 123 acquires, for example, the degree of similarity between the moving body detected in the current frame and the plurality of moving bodies detected in the previous frame.
- Among the moving bodies whose similarity is equal to or higher than a threshold value (for example, 0.5 when 1 is the maximum value), the determination unit 123 can determine that the moving body having the highest similarity is the same subject as the moving body detected in the current frame.
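The selection rule for the similarity-based method can be sketched as below. The similarity scores themselves would come from a Re-Id model, which is not shown; the dictionary input and the function name are hypothetical.

```python
def pick_same_subject(similarities, threshold=0.5):
    """Pick the previous-frame moving body judged to be the same subject.

    `similarities` maps a previous-frame moving-body id to its similarity
    (0 to 1) with the moving body of the current frame. Returns the id with the
    highest similarity among those at or above `threshold`, or None if no
    candidate qualifies.
    """
    candidates = {j: s for j, s in similarities.items() if s >= threshold}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```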
- the calculation unit 122 calculates the reliability of the moving body i cut out from the current frame by the detection rectangle of the moving body jm determined to be the same as the subject of the moving body i of the current frame.
- The determination unit 124 compares the reliability calculated in S107 using the detection rectangle of the moving body jm of the previous frame with the reliability calculated in S103 using the circumscribed rectangle.
- When the reliability by the circumscribed rectangle is higher, the determination unit 124 determines the circumscribed rectangle as the detection rectangle of the moving body i of the current frame.
- On the other hand, when the reliability by the detection rectangle of the moving body jm of the previous frame is higher, the determination unit 124 determines the detection rectangle of the moving body jm of the previous frame as the detection rectangle of the moving body i of the current frame.
- The detection rectangle with the highest reliability calculated in S107 may be adopted and compared with the reliability of the circumscribed rectangle calculated in S103.
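The comparison in S108 can be sketched as a simple maximum over candidates. The function name and the data layout are hypothetical, and the tie-breaking rule (the circumscribed rectangle wins on equal reliability) is an assumption, since the text does not specify it.

```python
def choose_detection_rectangle(circ_rect, circ_rel, prev_candidates):
    """Choose between the circumscribed rectangle and previous-frame rectangles.

    `prev_candidates` is a list of (rect, reliability) pairs, where each
    reliability is the score of the current-frame moving body cropped with that
    previous-frame detection rectangle (as in S107).
    Returns the (rect, reliability) pair that is adopted.
    """
    if not prev_candidates:
        return circ_rect, circ_rel
    best_rect, best_rel = max(prev_candidates, key=lambda c: c[1])
    # Assumed tie-break: keep the circumscribed rectangle unless strictly beaten.
    if best_rel > circ_rel:
        return best_rect, best_rel
    return circ_rect, circ_rel
```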
- the determination unit 124 records the information of the detection rectangle determined as the detection rectangle of the moving object i of the current frame in S108 in the detection rectangle database 13.
- the information of the detection rectangle includes the image information of the moving body i, the position and size of the determined detection rectangle, and the reliability value of the moving body i by the determined detection rectangle.
- the detection rectangle of the moving object i of the current frame recorded in the detection rectangle database 13 in S109 is used to calculate the reliability of the moving object detected in the next frame.
- the processing proceeds to S110.
- the output unit 14 superimposes the detection rectangle determined in S108 on the image of the current frame and outputs it.
- the detection rectangle output process for the current frame ends.
- The information processing apparatus 1 compares the reliability of the moving object by the rectangle circumscribing the moving object of the current frame with the reliability of the moving object in the current frame by the detection rectangle of the same subject detected in the previous frame.
- The information processing apparatus 1 determines the rectangle with the higher reliability among the compared rectangles as the detection rectangle of the moving object in the current frame. Since the detection rectangle with the higher reliability is adopted, the information processing apparatus 1 can improve the detection accuracy of the moving object and stably output the detection rectangle.
- When the reliability by the circumscribed rectangle is larger than a predetermined threshold value (first threshold value), the information processing apparatus 1 records the circumscribed rectangle as the detection rectangle of the moving object.
- In this case, the information processing apparatus 1 can reduce the processing load because the comparison with the reliability by the detection rectangle of the previous frame is not performed.
- the information processing apparatus 1 determines whether or not the moving object detected in the current frame and the moving object in the previous frame are the same subject in the processing of S105 and S106 of the detection rectangle output processing shown in FIG.
- The same-subject determination method using the center-to-center distance described with reference to FIG. 5A and the one using IoU described with reference to FIG. 5B can determine whether or not the subjects are the same with a lower load than the determination method by machine learning described with reference to FIG. 5C. Conversely, the determination method by machine learning described with reference to FIG. 5C can determine whether or not the subjects are the same more accurately than the methods using the center-to-center distance or IoU.
- When the reliability of the moving object detected in the current frame by the circumscribed rectangle is larger than the predetermined threshold value, the information processing apparatus 1 does not compare it with the reliability by the detection rectangle in the previous frame, and the circumscribed rectangle in the current frame is determined as the detection rectangle of the detected moving object.
- The information processing apparatus 1 compares the reliability by the circumscribed rectangle of the moving object detected in the current frame with the reliability by the detection rectangle of the moving object detected in the previous frame, and determines the rectangle with the higher reliability as the detection rectangle of the moving object detected in the current frame.
- FIG. 6 is a flowchart showing an example of the detection rectangle output process according to the second embodiment.
- the detection rectangle output process according to the second embodiment is different from the detection rectangle output process of the first embodiment shown in FIG. 4 in that there is no determination process of S104.
- the same processing as that of the detection rectangle output processing of the first embodiment shown in FIG. 4 is designated by the same reference numerals and the description thereof will be omitted.
- the detection rectangle output process according to the second embodiment of FIG. 6 can also be realized by setting the threshold value TH1 of S104 to the maximum value of the reliability in the detection rectangle output process of FIG.
- Regardless of whether or not the reliability of the moving body i by the circumscribed rectangle is larger than the threshold value TH1, the determination unit 123 compares the reliability by the circumscribed rectangle with the reliability by the detection rectangle of the moving body j detected in the previous frame. Because the rectangle with the higher reliability is adopted from among the candidates, including the detection rectangle in the previous frame, regardless of the reliability of the circumscribed rectangle of the moving body i, the accuracy of the output detection rectangle is improved.
- The third embodiment is an embodiment in which the detection rectangle is not output when the reliability of the detection rectangle determined by the determination unit 124 is equal to or less than a predetermined threshold value, and the determined detection rectangle is output when the reliability is larger than the predetermined threshold value. By not outputting the detection rectangle when the reliability is equal to or less than the predetermined threshold value, the information processing apparatus 1 can continuously output the detection rectangle with stable reliability.
- FIGS. 7 and 8 are flowcharts showing examples of the detection rectangle output process according to the third embodiment.
- Determination processing (S701, S801) for determining whether or not the reliability of the detection rectangle is greater than a predetermined threshold value has been added to the detection rectangle output process of the first embodiment shown in FIG.
- the same processing as that of the detection rectangle output processing of the first embodiment shown in FIG. 4 is designated by the same reference numerals and the description thereof will be omitted.
- the detection rectangle output process of FIG. 7 and the detection rectangle output process of FIG. 8 differ in the timing of determining whether or not the reliability of the detection rectangle is larger than the predetermined threshold value TH2 (second threshold value).
- In FIG. 7, whether or not the reliability of the detection rectangle is larger than the predetermined threshold value TH2 is determined before the information of the detection rectangle is stored in the detection rectangle database 13 in S109. That is, when the reliability of the detection rectangle is equal to or less than the predetermined threshold value TH2, the detection rectangle is not stored in the detection rectangle database 13 and is not output.
- In FIG. 8, whether or not the reliability of the detection rectangle is larger than the predetermined threshold value TH2 is determined before the detection rectangle is output in S110. That is, when the reliability of the detection rectangle is equal to or less than the predetermined threshold value TH2, the detection rectangle is stored in the detection rectangle database 13, but is not output.
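The difference between the two timings can be sketched as follows. The function name, the `variant` labels, and the returned dictionary are hypothetical; the sketch covers only rectangles that reach the TH2 check.

```python
def gate_by_th2(reliability, th2, variant):
    """Decide whether a determined detection rectangle is stored and/or output.

    variant "fig7": the TH2 check runs before storing (S109), so a failing
                    rectangle is neither stored nor output.
    variant "fig8": the TH2 check runs before output (S110), so a failing
                    rectangle is stored but not output.
    """
    passed = reliability > th2
    if variant == "fig7":
        return {"store": passed, "output": passed}
    if variant == "fig8":
        return {"store": True, "output": passed}
    raise ValueError("variant must be 'fig7' or 'fig8'")
```

Storing a low-reliability rectangle (the FIG. 8 timing) keeps it available as a candidate when the next frame is processed, whereas the FIG. 7 timing drops it entirely.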
- the processing proceeds to S701.
- the determination unit 124 determines whether or not the reliability of the determined detection rectangle is greater than the predetermined threshold value TH2.
- the predetermined threshold value TH2 can be set to, for example, a value equal to or less than the threshold value TH1.
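- When the reliability of the determined detection rectangle is larger than the predetermined threshold value TH2 (S701: Yes), the process proceeds to S109.
- When the reliability of the determined detection rectangle is equal to or less than the predetermined threshold value TH2 (S701: No), the process proceeds to the loop process L1 for the next moving object i + 1.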
- the information of the detection rectangle whose reliability by the determined detection rectangle is larger than the predetermined threshold value TH2 is stored in the detection rectangle database 13.
- the output unit 14 outputs the detection rectangle stored in the detection rectangle database 13 to the moving object detected in the current frame. That is, the output unit 14 outputs the circumscribed rectangle of the moving body i whose reliability is greater than the predetermined threshold value TH1 in S104, and the detection rectangle whose reliability is determined to be greater than the predetermined threshold value TH2 in S701.
- the information processing apparatus 1 can continuously output a detection rectangle having a stable reliability.
- the process proceeds to S801.
- the output unit 14 determines whether or not the reliability of each moving object recorded in the detection rectangle database 13 based on the detection rectangle is greater than the predetermined threshold value TH2.
- the process proceeds to S110.
- the detection rectangle is not output, and the detection rectangle output process of the current frame shown in FIG. 8 ends.
- the output unit 14 outputs the detection rectangle whose reliability is determined to be greater than the predetermined threshold value TH2 in S801 among the detection rectangles stored in the detection rectangle database 13. By not outputting a rectangle whose reliability is equal to or less than a predetermined threshold value, the information processing apparatus 1 can continuously output a detection rectangle having a stable reliability.
- the fourth embodiment is an embodiment for eliminating the situation where the detection rectangle for a stationary object is adopted and remains as the detection rectangle of the moving object because the reliability is higher than the circumscribed rectangle of the moving object of the current frame. Since the hardware configuration and the functional configuration of the information processing apparatus 1 according to the fourth embodiment are the same as those of the first embodiment, the description thereof will be omitted.
- When the number of consecutive frames in which the difference between the circumscribed rectangle of the moving object detected in the current frame and the detection rectangle of the moving object determined to be the same subject in the previous frame is larger than a predetermined threshold value exceeds a predetermined number, the detection rectangle is not output.
- The difference may be, for example, the area change from the detection rectangle of the previous frame to the circumscribed rectangle of the current frame, or may be the ratio of the area change to the area of the detection rectangle of the previous frame. That is, when the number of consecutive frames in which the difference between the circumscribed rectangle of the moving object in the current frame and the detection rectangle in the previous frame is larger than the predetermined threshold value is less than or equal to the predetermined number, the information processing apparatus 1 records the detection rectangle as the detection rectangle of the moving object.
- the information processing apparatus 1 can control the detection rectangle of the stationary object that is erroneously adopted as the detection rectangle of the moving object so as not to be adopted in the subsequent frames.
- FIG. 9A shows an example of detecting a human body as a detection target from a frame image.
- the object 902 is an object such as an electric fan that can be detected as a moving body.
- the object 903 is an object that is imaged by overlapping with the object 902 and may be erroneously detected as a human body.
- the object 903 is assumed to be a robot, a poster showing a human body, a coat rack, a wall pattern, or the like, which may overlap with the object 902 and be determined to be a human body.
- FIG. 9A illustrates a scene in which the present embodiment is applied.
- the human body 901 passes by so as to overlap the object 902 when viewed from the position of the camera 2.
- FIG. 9B illustrates the detection result of a moving object from time T-1 to time T + 1 in the scene of FIG. 9A. It is assumed that the time T is the time immediately after the human body 901 passes the position where the human body 901 overlaps the object 902 when viewed from the position of the camera 2.
- At time T-1, the human body 901 is detected around the object 902, and the detection rectangle A91 is recorded in the detection rectangle database 13 as the detection rectangle of the human body 901.
- At time T, it is conceivable that the determination unit 123 determines, from the distance between the centers of the circumscribing rectangle A92 of the object 902 and the detection rectangle A91 of the human body 901, that the human body 901 at time T-1 and the object 902 are the same subject. In this case, the calculation unit 122 calculates the reliability of the object 902 using the detection rectangle A91 of the human body 901.
- The reliability of the object 902 by the detection rectangle A91 (the reliability of being a human body) is higher than the reliability by the circumscribing rectangle A92 of the object 902 due to the existence of the object 903. Therefore, the determination unit 124 determines the detection rectangle A91 at time T-1 as the detection rectangle of the object 902.
- At time T + 1, as in the case of time T, the determination unit 124 determines the detection rectangle A91 used at time T-1 and time T as the detection rectangle of the object 902. Similarly, after time T + 1, the detection rectangle A91, which is an erroneous detection, is recorded in the detection rectangle database 13 as the detection rectangle of the object 902.
- When a predetermined number of consecutive frames occur in which the difference between the circumscribed rectangle in the current frame and the detection rectangle in the previous frame is larger than the predetermined threshold value TH3, the information processing apparatus 1 performs control so that the detection rectangle A91 is not recorded in the detection rectangle database 13.
- the difference in the example of FIG. 9B can be the ratio of the area change from the detection rectangle A91 to the circumscribed rectangle A92 with respect to the area of the detection rectangle A91 in the previous frame.
- For example, the information processing apparatus 1 can perform control so that the detection rectangle A91 is not recorded in the detection rectangle database 13 when more than 5 consecutive frames have a difference greater than 50%, the predetermined threshold value TH3. That is, the information processing apparatus 1 records the detection rectangle A91 when the number of consecutive frames whose difference is larger than the predetermined threshold value TH3 of 50% is 5 frames or less.
- By controlling whether or not to record the detection rectangle based on the difference between the detection rectangle of the previous frame and the circumscribed rectangle of the current frame, the information processing apparatus 1 can avoid outputting falsely detected detection rectangles for more than a predetermined number of consecutive frames.
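The rule in this example (TH3 = 50%, at most 5 consecutive frames) can be sketched as below. The function names and the `(x, y, w, h)` layout are hypothetical, and taking the absolute value of the area change is an assumption; the text only speaks of the ratio of the area change to the area of the previous detection rectangle.

```python
def area(rect):
    """Area of a rectangle given as (x, y, w, h)."""
    return rect[2] * rect[3]

def area_change_ratio(prev_detection, circumscribed):
    """Area change from the previous detection rectangle to the current
    circumscribed rectangle, relative to the previous rectangle's area."""
    return abs(area(circumscribed) - area(prev_detection)) / area(prev_detection)

def update_run(f1, diff, th3=0.5):
    """Increment the consecutive-frame counter F1 when the difference exceeds
    TH3; otherwise reset it to 0."""
    return f1 + 1 if diff > th3 else 0

def should_record(f1, th4=5):
    """Record the detection rectangle only while the run length is TH4 or less."""
    return f1 <= th4
```

With a previous rectangle of area 200 and a circumscribed rectangle of area 50, the ratio is 0.75, so each such frame extends the run; on the sixth consecutive frame the rectangle stops being recorded.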
- In the detection rectangle output process according to the fourth embodiment, determination processes (S1001 to S1004, S1101 to S1104) regarding the number of consecutive frames in which the difference between the rectangles is larger than a predetermined threshold value have been added to the detection rectangle output process of the first embodiment shown in FIG.
- the same processing as that of the detection rectangle output processing of the first embodiment shown in FIG. 4 is designated by the same reference numerals and the description thereof will be omitted.
- The detection rectangle output process of FIG. 10 and the detection rectangle output process of FIG. 11 differ in the timing of determining whether or not the number of consecutive frames in which the difference between the rectangles is larger than the predetermined threshold value TH3 (third threshold value) is larger than the predetermined number TH4.
- In FIG. 10, whether or not the continuous number is larger than the predetermined number TH4 is determined before the information of the detection rectangle is stored in the detection rectangle database 13 in S109. That is, when the continuous number is larger than the predetermined number TH4, the detection rectangle is not stored in the detection rectangle database 13 and is not output.
- In FIG. 11, whether or not the continuous number is larger than the predetermined number TH4 is determined before the detection rectangle is output in S110. That is, when the continuous number is larger than the predetermined number TH4, the detection rectangle is stored in the detection rectangle database 13, but is not output.
- the processing proceeds to S1001.
- In S1001, the difference between the detection rectangle of the previous frame and the circumscribed rectangle of the current frame is calculated.
- the difference between the rectangles can be calculated, for example, as the amount of change in the area between the circumscribed rectangle of the moving body i and the detection rectangle of the moving body i determined in S108.
- the difference between the rectangles is recorded in the detection rectangle database 13 together with the information on the detection rectangle.
- the determination unit 124 determines whether or not the difference between the rectangles of the moving body i is larger than the predetermined threshold value TH3.
- When the difference between the rectangles of the moving body i is larger than the predetermined threshold value TH3 (S1001: Yes), the process proceeds to S1002.
- When the difference between the rectangles of the moving body i is equal to or less than the predetermined threshold value TH3 (S1001: No), the process proceeds to S1003.
- The determination unit 124 initializes the continuous number F1 of frames in which the difference between the rectangles of the moving body i is larger than the predetermined threshold value TH3.
- the process proceeds to S109, and the detection rectangle corresponding to the moving object i determined in S108 is recorded in the detection rectangle database 13.
- the determination unit 124 increments the number of consecutive frames F1 in which the difference between the rectangles of the moving body i is larger than the predetermined threshold value TH3.
- the continuous number F1 of frames in which the difference between the rectangles of the moving body i is larger than the predetermined threshold value TH3 is recorded in the detection rectangle database 13 and referred to in the processing of each frame.
- the determination unit 124 determines whether or not the continuous number F1 exceeds the predetermined number TH4.
- When the continuous number F1 exceeds the predetermined number TH4 (S1004: Yes), the detection rectangle corresponding to the moving object i is not recorded in the detection rectangle database 13, and the process proceeds to the next loop process L1.
- When the continuous number F1 is the predetermined number TH4 or less (S1004: No), the process proceeds to S109, and the detection rectangle corresponding to the moving object i is recorded in the detection rectangle database 13.
- The information processing apparatus 1 can reduce the output of detection rectangles due to erroneous detection by not outputting the detection rectangle when the number of consecutive frames in which the difference between the rectangles is larger than the predetermined threshold value exceeds the predetermined number.
- The processes of S1101 to S1103 are the same as those of S1001 to S1003 of FIG. 10, respectively.
- The determination unit 124 records the continuous number F1 in the detection rectangle database 13 after increasing the continuous number F1 by 1 in S1102 or initializing the continuous number F1 to 0 in S1103.
- the determination unit 124 records the information of the moving object i and the corresponding detection rectangle in the detection rectangle database 13 regardless of the value of the continuous number F1.
- the process proceeds to S1104.
- the output unit 14 determines whether or not the continuous number F1 exceeds the predetermined number TH4.
- the detection rectangle is not output, and the detection rectangle output process of the current frame shown in FIG. 11 ends.
- the output unit 14 initializes the continuous number F1 for the moving object i recorded in the detection rectangle database 13 to 0.
- the process proceeds to S110.
- the output unit 14 outputs the detection rectangle in which the continuous number F1 is determined to be a predetermined number TH4 or less in S1104 among the detection rectangles stored in the detection rectangle database 13.
- The information processing apparatus 1 can reduce the output of detection rectangles due to erroneous detection by not outputting the detection rectangle when the number of consecutive frames in which the difference between the rectangles is larger than the predetermined threshold value exceeds the predetermined number.
- the fifth embodiment is an embodiment in which a detection rectangle is output when a predetermined number of frames having a reliability greater than a predetermined threshold value are consecutive. By not outputting the detection rectangle when the reliability is equal to or less than a predetermined threshold value, the information processing apparatus 1 can continuously output the detection rectangle with stable reliability.
- FIG. 12 is a flowchart showing an example of the detection rectangle output process according to the fifth embodiment.
- A determination process (S1201 to S1204) for the number of consecutive frames whose reliability is greater than a predetermined threshold value is added to the detection rectangle output process of the first embodiment shown in FIG.
- the same processing as that of the detection rectangle output processing of the first embodiment shown in FIG. 4 is designated by the same reference numerals and the description thereof will be omitted.
- the determination unit 124 increases the number of consecutive frames F2 whose reliability is greater than a predetermined threshold value by 1.
- the continuous number F2 of frames whose reliability is greater than a predetermined threshold value is recorded in the detection rectangle database 13 and is referred to in the processing of each frame.
- the processing proceeds to S1201.
- the determination unit 124 determines whether or not the reliability of the moving body i based on the detection rectangle determined in the loop process L2 is greater than the predetermined threshold value TH1.
- the process proceeds to S1202.
- the reliability of the determined detection rectangle is TH1 or less (S1201: No)
- the process proceeds to S109.
- the determination unit 124 increases the number of consecutive frames F2 whose reliability is greater than a predetermined threshold value by 1.
- information on the moving object i and the corresponding detection rectangle is recorded in the detection rectangle database 13 regardless of the value of the continuous number F2.
- Since the reliability was determined to be TH1 or less in S1201, and the run of consecutive frames whose reliability is larger than the predetermined threshold value has therefore been broken, the determination unit 124 initializes the continuous number F2 for the moving body i to 0.
- the process proceeds to S1204.
- the output unit 14 determines whether or not the continuous number F2 exceeds the predetermined number TH5.
- the process proceeds to S110.
- the detection rectangle is not output, and the detection rectangle output process of the current frame shown in FIG. 12 ends.
- the output unit 14 outputs the detection rectangle in which the continuous number F2 is determined to exceed the predetermined number TH5 in S1204 among the detection rectangles stored in the detection rectangle database 13.
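- By not outputting the detection rectangle until frames whose reliability is greater than the predetermined threshold value have continued for more than the predetermined number TH5, the information processing apparatus 1 can continuously output detection rectangles with high reliability.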
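The counting logic of the fifth embodiment can be sketched as below. The function names and the sample values for TH1 and TH5 are illustrative assumptions; the sketch tracks a single moving body across frames.

```python
def update_f2(f2, reliability, th1):
    """Increment the consecutive-frame counter F2 when the reliability exceeds
    TH1; otherwise reset it to 0."""
    return f2 + 1 if reliability > th1 else 0

def should_output(f2, th5):
    """Output the detection rectangle only when the run length exceeds TH5."""
    return f2 > th5

def simulate(reliabilities, th1, th5):
    """Return, per frame, whether the detection rectangle would be output."""
    f2, outputs = 0, []
    for r in reliabilities:
        f2 = update_f2(f2, r, th1)
        outputs.append(should_output(f2, th5))
    return outputs
```

In this sketch a single low-reliability frame resets the run, so output resumes only after the reliability has stayed above TH1 for more than TH5 frames again.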
- The reliability of the moving body of the current frame by the circumscribed rectangle is compared with the reliability by the detection rectangle of the same subject detected in the previous frame.
- The sixth embodiment is an embodiment in which the reliability by the circumscribed rectangle of the moving body of the current frame is compared with the reliability by each detection rectangle of the same subject detected in a plurality of preceding frames.
- The information processing apparatus 1 outputs the rectangle with the highest reliability among the circumscribed rectangle of the current frame and the detection rectangles from up to a plurality of frames before as the detection rectangle of the moving object of the current frame.
- FIG. 13 is a flowchart showing an example of the detection rectangle output process according to the sixth embodiment.
- a loop process L3 that goes back in the frame is added to the detection rectangle output process of the first embodiment shown in FIG.
- the same processing as that of the detection rectangle output processing of the first embodiment shown in FIG. 4 is designated by the same reference numerals and the description thereof will be omitted.
- the number of frames L to be traced back may be, for example, about 5 frames, and may be determined according to the processing time and the processing load.
- The calculation unit 122 calculates the reliability of the moving body i cut out from the current frame by the detection rectangle of the moving body jm determined to be the same as the subject of the moving body i in the current frame, similarly to S107 in FIG.
- The determination unit 124 compares the reliability calculated for each preceding frame with the reliability calculated by the circumscribed rectangle in S103.
- The determination unit 124 determines the rectangle having the highest reliability among the compared reliabilities as the detection rectangle of the moving body i.
- the process of comparing the reliability of S1302 may be performed after the reliability of S1301 is calculated.
- the information processing apparatus 1 compares the reliability of the detection rectangle obtained retroactively to a plurality of frames before with the reliability of the circumscribed rectangle in the current frame. By going back not only to the previous frame but also to a plurality of frames before, the reliability of the output detection rectangle is improved, and the information processing apparatus 1 can output a stable detection rectangle.
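The multi-frame comparison can be sketched as a maximum over the circumscribed rectangle and up to L past detection rectangles. Here `score` is a hypothetical callable standing in for the reliability calculation of the calculation unit 122 (the reliability of the current-frame moving body cropped with a given rectangle), and `past_rects` is assumed to be ordered from most recent to oldest.

```python
def choose_over_lookback(circ_rect, score, past_rects, lookback=5):
    """Pick the most reliable rectangle among the current circumscribed
    rectangle and the detection rectangles of up to `lookback` past frames.

    Returns the adopted (rect, reliability) pair. The circumscribed rectangle
    is kept on ties (an assumption, since the text does not specify a tie-break).
    """
    best_rect, best_rel = circ_rect, score(circ_rect)
    for rect in past_rects[:lookback]:
        rel = score(rect)
        if rel > best_rel:
            best_rect, best_rel = rect, rel
    return best_rect, best_rel
```

A larger `lookback` scores more candidates per moving body, which is why the text suggests choosing the number of frames according to the processing time and load.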
- the seventh embodiment is an embodiment in which the position and size of the detection rectangle of the previous frame are corrected, and the reliability of the moving object detected in the current frame is calculated using the corrected detection rectangle.
- Because the moving object has moved since the previous frame, the desired reliability may not be obtained for the moving object detected in the current frame even if the detection rectangle of the previous frame is applied to the current frame. Therefore, the information processing apparatus 1 improves the reliability obtained with the detection rectangle of the previous frame by correcting the position or size of that detection rectangle.
- FIG. 14 is a diagram illustrating a functional configuration of the information processing apparatus according to the seventh embodiment.
- the information processing apparatus 1 according to the seventh embodiment includes a correction unit 125 in addition to the functional configuration according to the first embodiment shown in FIG.
- the same functional configurations as those in FIG. 3 are designated by the same reference numerals and the description thereof will be omitted.
- the correction unit 125 corrects the detection rectangle in the previous frame for the same subject as the moving object detected in the current frame.
- The rectangle A151 is a detection rectangle of a moving object detected in the frame two frames before.
- the rectangle A152 is a detection rectangle of a moving object detected in the previous frame.
- the information of the rectangle A151 and the information of the rectangle A152 is stored in the detection rectangle database 13.
- the rectangle A153 is a circumscribed rectangle of the moving body detected in the current frame. In the example of FIG. 15, the head of the human body is not recognized as a moving body, and the rectangle A153 is a rectangle surrounding a portion other than the head.
- the reliability of the rectangle A152 may be lower than the reliability of the rectangle A153 whose head is not recognized as a moving object.
- the correction unit 125 corrects the position and size of the previous rectangle A152 according to the position of the moving object in the current frame.
- The correction unit 125 can, for example, calculate estimated values of the width, height, and center coordinates of the rectangle in the current frame based on the amount of change in the width, height, center coordinates, etc. between the rectangle A151 of the frame two frames before and the rectangle A152 of the previous frame.
- The correction unit 125 can estimate the direction and moving distance of the moving object from the center coordinates of the detection rectangles of the frame two frames before and the previous frame, and calculate the center coordinates for the current frame. Further, the correction unit 125 can calculate the average value of the width and height of the detection rectangles of the frame two frames before and the previous frame as the width and height for the current frame. The correction unit 125 can generate the correction rectangle A154 based on the calculated estimated values.
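One way to realize this estimate is a constant-velocity extrapolation of the center combined with averaged width and height, sketched below. The linear extrapolation, the function name, and the `(x, y, w, h)` layout are assumptions for illustration; the text does not fix the exact estimation formula.

```python
def correct_rectangle(rect_two_before, rect_prev):
    """Estimate a correction rectangle for the current frame.

    Center: previous center plus the displacement observed between the frame
    two frames before and the previous frame (assumed constant velocity).
    Size: average of the widths and heights of the two past rectangles.
    Rectangles are (x, y, w, h) with a top-left origin.
    """
    x2, y2, w2, h2 = rect_two_before
    x1, y1, w1, h1 = rect_prev
    c2x, c2y = x2 + w2 / 2.0, y2 + h2 / 2.0
    c1x, c1y = x1 + w1 / 2.0, y1 + h1 / 2.0
    # Extrapolate the center by one more frame of the same displacement.
    cx = c1x + (c1x - c2x)
    cy = c1y + (c1y - c2y)
    w = (w1 + w2) / 2.0
    h = (h1 + h2) / 2.0
    return (cx - w / 2.0, cy - h / 2.0, w, h)
```

For a 10x20 rectangle that moved 10 pixels to the right between the two past frames, the sketch places the correction rectangle a further 10 pixels to the right with the same size.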
- the information processing apparatus 1 can output a detection rectangle with higher reliability.
- the correction rectangle is not limited to the previous frame and the previous frame, and may be generated based on the information of the circumscribed rectangle of the current frame and the detection rectangle obtained retroactively to a plurality of frames.
- FIG. 16 is a flowchart showing an example of the detection rectangle output process according to the seventh embodiment.
- In the detection rectangle output process according to the seventh embodiment, S107 and S108 of the detection rectangle output process of the first embodiment shown in FIG. 4 are replaced with processes (S1601 to S1603) that correct the detection rectangle of the previous frame and calculate the reliability with the corrected rectangle. Processes identical to those of the detection rectangle output process of the first embodiment shown in FIG. 4 are designated by the same reference numerals, and their description is omitted.
- The process proceeds to S1601 for the moving body j_m determined in S106 to be the same subject as the moving body i in the current frame.
- The correction unit 125 corrects the detection rectangle of the moving body j_m based on the amount of change in position and size between the detection rectangle of the moving body j_m and the detection rectangle of the moving body determined to be the same subject as the moving body i in the frame before the previous frame.
- The calculation unit 122 calculates the reliability of the moving body i cut out from the current frame by the correction rectangle corrected in S1601.
- The determination unit 124 compares the reliability calculated in S1602 with the reliability calculated with the circumscribed rectangle in S103. When the reliability of the moving body i in the current frame obtained with the circumscribed rectangle is higher than that obtained with the correction rectangle, the determination unit 124 determines the circumscribed rectangle as the detection rectangle of the moving body i in the current frame. On the other hand, when the reliability of the correction rectangle is higher than the reliability of the circumscribed rectangle, the determination unit 124 determines the correction rectangle as the detection rectangle of the moving body i in the current frame.
- The correction unit 125 corrects the detection rectangle of the moving object detected in the previous frame based on the detection rectangle in the frame before the previous frame. By correcting the detection rectangle of the previous frame, the information processing device 1 can improve the reliability of the moving object when the correction rectangle is applied to the current frame.
- The reliability of the human body has been described as the reliability of being an unspecified human body, but the reliability is not limited to this.
- The reliability may be the reliability of being a specific person to be detected.
- The information processing device 1 may go back at intervals of a plurality of frames, such as every 2 or 3 frames, and output the rectangle with the higher reliability as the detection rectangle of the current frame.
- The reliability of the moving object in the current frame has been described as being calculated using the detection rectangle of the moving object detected in a frame before the current frame, but the calculation is not limited to this.
- The information processing device 1 may calculate the reliability of the moving object in the current frame by using the circumscribed rectangle of the moving object detected in a frame after the current frame in the captured moving image. In this case, if the reliability calculated using the circumscribed rectangle of the moving object detected in the later frame is higher than the reliability of the circumscribed rectangle of the moving object in the current frame, the information processing device 1 can determine the circumscribed rectangle of the later frame as the detection rectangle for the current frame.
- An information processing device (1) comprising: a detection unit (121) that detects a moving object from each frame image of a moving image; a calculation unit (122) that calculates the reliability that the detected moving object is a predetermined subject; and a determination unit (124) that determines the detection frame of a first moving body, and records it in a recording unit, based on the reliability of the first moving body obtained with the frame circumscribing the first moving body detected in a first frame and the reliability of the first moving body in the first frame obtained with the detection frame of a second moving body detected in a second frame before the first frame.
- An information processing method in which a computer executes: a detection step (S101) of detecting a first moving object from a first frame included in a moving image; a calculation step (S103, S107) of calculating the reliability that the first moving body is a predetermined subject, using the frame circumscribing the first moving body and a detection frame that is recorded in the recording unit and is the detection frame of a second moving body detected in a second frame before the first frame; and a determination step (S108, S109) of determining the detection frame of the first moving body, and recording it in the recording unit, based on the reliability of the first moving body obtained with the frame circumscribing the first moving body and the reliability of the first moving body in the first frame obtained with the detection frame of the second moving body.
- 1: Information processing device, 2: Camera, 11: Image acquisition unit, 12: Processing unit, 121: Detection unit, 122: Calculation unit, 123: Judgment unit, 124: Determination unit, 125: Correction unit, 13: Detection rectangle database, 14: Output unit
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Geometry (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
Abstract
Description
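FIG. 1 is a diagram illustrating an application example of the information processing device according to the embodiment. The information processing device acquires a moving image input from a camera and detects moving objects (hereinafter also referred to as moving bodies) from each image frame of the acquired moving image. The camera is assumed to be, for example, a fixed camera such as a surveillance camera.
(Hardware configuration)
An example of the hardware configuration of the information processing device 1 will be described with reference to FIG. 2. FIG. 2 is a diagram illustrating the hardware configuration of the information processing device 1. The information processing device 1 includes a processor 101, a main storage device 102, an auxiliary storage device 103, a communication interface (I/F) 104, and an output device 105. The processor 101 reads a program stored in the auxiliary storage device 103 into the main storage device 102 and executes it, thereby realizing the functions of each functional configuration described with reference to FIG. 3. The communication interface 104 is an interface for wired or wireless communication. The output device 105 is a device for output, such as a display.
FIG. 3 is a diagram illustrating the functional configuration of the information processing device 1. The information processing device 1 includes an image acquisition unit 11, a processing unit 12, a detection rectangle database (DB) 13, and an output unit 14. The processing unit 12 includes a detection unit 121, a calculation unit 122, a judgment unit 123, and a determination unit 124.
The overall flow of the detection rectangle output process will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the detection rectangle output process. The detection rectangle output process is started, for example, when each frame of the moving image acquired by the image acquisition unit 11 is sent to the processing unit. The detection rectangle output process shown in FIG. 4 is executed for each frame of the moving image.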
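In the first embodiment described above, the information processing device 1 compares the reliability of a moving body in the current frame obtained with the rectangle circumscribing that moving body against the reliability of that moving body in the current frame obtained with the detection rectangle of the same subject detected in the previous frame. Of the rectangles compared, the information processing device 1 determines the one with the higher reliability as the detection rectangle of the moving body in the current frame. Because the detection rectangle with the higher reliability is adopted, the information processing device 1 can improve the detection accuracy of moving bodies and output detection rectangles stably.
In the first embodiment, when the reliability obtained with the circumscribed rectangle of the moving body detected in the current frame is greater than a predetermined threshold, the information processing device 1 determines the circumscribed rectangle in the current frame as the detection rectangle of the detected moving body without comparing it with the reliability obtained with the detection rectangle of the previous frame. In contrast, in the second embodiment, the information processing device 1 compares the reliability obtained with the circumscribed rectangle of the moving body detected in the current frame, regardless of its value, with the reliability obtained with the detection rectangle of the moving body detected in the previous frame, and determines the rectangle with the higher reliability as the detection rectangle of the moving body detected in the current frame.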
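The comparison in the first and second embodiments can be sketched roughly as follows. This is a minimal illustration, not the implementation in the publication: the `Rect` layout, the `score` callback (standing in for the calculation unit 122), and the `shortcut_threshold` parameter are hypothetical names, and keeping the circumscribed rectangle on a tie is an assumption.

```python
from typing import Callable, Optional, Tuple

# Hypothetical rectangle layout: (x, y, width, height).
Rect = Tuple[float, float, float, float]


def choose_detection_rect(
    circumscribed: Rect,
    previous: Optional[Rect],
    score: Callable[[Rect], float],
    shortcut_threshold: Optional[float] = None,
) -> Tuple[Rect, float]:
    """Return the rectangle with the higher reliability, plus that reliability.

    `score(rect)` is assumed to return the reliability of the current-frame
    moving body when cropped with `rect`.
    """
    r_current = score(circumscribed)

    # First embodiment: skip the comparison when the circumscribed rectangle
    # already exceeds the threshold. Passing None gives the second embodiment,
    # which always compares.
    if shortcut_threshold is not None and r_current > shortcut_threshold:
        return circumscribed, r_current

    # No rectangle of the same subject recorded for the previous frame.
    if previous is None:
        return circumscribed, r_current

    r_previous = score(previous)
    # Assumption: on a tie, keep the current circumscribed rectangle.
    if r_previous > r_current:
        return previous, r_previous
    return circumscribed, r_current
```

With `shortcut_threshold=None` the function always evaluates both rectangles, which corresponds to the behavior described for the second embodiment.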
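The third embodiment is an embodiment in which the detection rectangle is not output when the reliability obtained with the detection rectangle determined by the determination unit 124 is less than or equal to a predetermined threshold, and the determined detection rectangle is output when the reliability is greater than the predetermined threshold. By not outputting the detection rectangle when the reliability is less than or equal to the predetermined threshold, the information processing device 1 can continuously output detection rectangles of stable reliability.
The fourth embodiment is an embodiment for resolving the situation in which a detection rectangle for a stationary object continues to be adopted as the detection rectangle of a moving body because its reliability is higher than that of the circumscribed rectangle of the moving body in the current frame. The hardware configuration and functional configuration of the information processing device 1 according to the fourth embodiment are the same as those of the first embodiment, and their description is omitted.
The fifth embodiment is an embodiment in which the detection rectangle is output when a predetermined number of consecutive frames have a reliability greater than a predetermined threshold. By not outputting the detection rectangle when the reliability is less than or equal to the predetermined threshold, the information processing device 1 can continuously output detection rectangles of stable reliability.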
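The output conditions of the third and fifth embodiments can be illustrated with a small per-subject gate. This is only a sketch: `OutputGate`, `threshold`, and `required_streak` are hypothetical names, and treating "a predetermined number of consecutive frames" as an inclusive bound is an assumption (claim 14 words the condition as exceeding the predetermined number).

```python
class OutputGate:
    """Decide, frame by frame, whether a detection rectangle should be output."""

    def __init__(self, threshold: float, required_streak: int) -> None:
        self.threshold = threshold              # reliability threshold (hypothetical value)
        self.required_streak = required_streak  # consecutive frames needed before output
        self.streak = 0                         # current run of frames above the threshold

    def update(self, reliability: float) -> bool:
        """Feed the reliability of the determined rectangle for one frame."""
        if reliability > self.threshold:
            self.streak += 1
        else:
            # A frame at or below the threshold breaks the run.
            self.streak = 0
        return self.streak >= self.required_streak
```

Setting `required_streak=1` reduces the gate to the third embodiment, where a rectangle is output whenever its reliability exceeds the threshold.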
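In each of the embodiments described above, the reliability obtained with the circumscribed rectangle of the moving body in the current frame is compared with the reliability obtained with the detection rectangle of the same subject detected in the previous frame. In contrast, the sixth embodiment compares the reliability obtained with the circumscribed rectangle of the moving body in the current frame with the reliability obtained with each detection rectangle of the same subject detected by going back a plurality of frames. In the sixth embodiment, the information processing device 1 outputs, from among the circumscribed rectangle of the current frame and the detection rectangles going back a plurality of frames, the rectangle with the higher reliability as the detection rectangle of the moving body in the current frame.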
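A rough sketch of the sixth embodiment's selection over several earlier frames follows. The `history` list, the rectangle layout, and the `score` callback (standing in for the calculation unit 122) are hypothetical; how many frames to go back is not fixed here.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical rectangle layout: (x, y, width, height).
Rect = Tuple[float, float, float, float]


def choose_from_history(
    circumscribed: Rect,
    history: Sequence[Rect],
    score: Callable[[Rect], float],
) -> Tuple[Rect, float]:
    """Pick the most reliable rectangle among the current and earlier frames.

    `history` holds the detection rectangles of the same subject from the
    preceding frames; `score(rect)` returns the reliability of the
    current-frame moving body when cropped with `rect`.
    """
    best_rect, best_score = circumscribed, score(circumscribed)
    for rect in history:
        candidate = score(rect)
        # An earlier rectangle replaces the current choice only if strictly better.
        if candidate > best_score:
            best_rect, best_score = rect, candidate
    return best_rect, best_score
```

With an empty `history` the function simply returns the circumscribed rectangle of the current frame.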
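The seventh embodiment is an embodiment in which the position and size of the detection rectangle of the previous frame are corrected, and the reliability of the moving body detected in the current frame is calculated using the corrected detection rectangle. Because the moving body detected in the current frame has moved since the previous frame, applying the detection rectangle of the previous frame to the current frame may not yield the desired reliability. The information processing device 1 therefore improves the reliability obtained with the detection rectangle of the previous frame by correcting its position or size.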
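The correction described for the correction unit 125 (estimating the center for the current frame from the two preceding detection rectangles and averaging their width and height) might look like the sketch below. The `Rect` class and `correct_rect` function are hypothetical names, and the constant-velocity extrapolation of the center is an assumption about how the direction and moving distance are applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Center-based rectangle (hypothetical representation)."""
    cx: float
    cy: float
    w: float
    h: float


def correct_rect(two_before: Rect, previous: Rect) -> Rect:
    """Estimate a correction rectangle for the current frame.

    `two_before` plays the role of rectangle A151 and `previous` that of
    rectangle A152; the result corresponds to the correction rectangle A154.
    """
    # Assumption: the subject keeps moving by the same displacement per frame.
    cx = previous.cx + (previous.cx - two_before.cx)
    cy = previous.cy + (previous.cy - two_before.cy)
    # Width and height are the averages of the two earlier rectangles.
    w = (two_before.w + previous.w) / 2
    h = (two_before.h + previous.h) / 2
    return Rect(cx, cy, w, h)
```

For example, with centers (10, 20) and (14, 22) in the two earlier frames, the estimated center for the current frame is (18, 24).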
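The above embodiments merely illustrate configuration examples of the present invention. The configurations of the embodiments are not limited to the specific forms described above and can be combined as appropriate within the scope of the technical idea of the present invention. The present invention can also be modified in various ways without departing from its technical idea.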
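(1) An information processing device (1) comprising:
a detection unit (121) that detects a moving body from each frame image of a moving image;
a calculation unit (122) that calculates a reliability that the detected moving body is a predetermined subject; and
a determination unit (124) that determines a detection frame of a first moving body detected in a first frame, based on the reliability of the first moving body obtained with a frame circumscribing the first moving body and the reliability of the first moving body in the first frame obtained with a detection frame of a second moving body detected in a second frame preceding the first frame, and records the detection frame in a recording unit.
An information processing method comprising:
a detection step (S101) of detecting a first moving body from a first frame included in a moving image;
a calculation step (S103, S107) of calculating a reliability that the first moving body is a predetermined subject, using a frame circumscribing the first moving body and a detection frame that is recorded in a recording unit and is the detection frame of a second moving body detected in a second frame preceding the first frame; and
a determination step (S108, S109) of determining a detection frame of the first moving body based on the reliability of the first moving body obtained with the frame circumscribing the first moving body and the reliability of the first moving body in the first frame obtained with the detection frame of the second moving body, and recording the detection frame in the recording unit.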
Claims (19)
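- An information processing device comprising: a detection unit that detects a moving body from each frame image of a moving image; a calculation unit that calculates a reliability that the detected moving body is a predetermined subject; and a determination unit that determines a detection frame of a first moving body detected in a first frame, based on the reliability of the first moving body obtained with a frame circumscribing the first moving body and the reliability of the first moving body in the first frame obtained with a detection frame of a second moving body detected in a second frame preceding the first frame, and records the detection frame in a recording unit.
- The information processing device according to claim 1, further comprising a judgment unit that judges, from among a plurality of moving bodies detected in the second frame, the second moving body that is the same subject as the first moving body.
- The information processing device according to claim 2, wherein the judgment unit judges the second moving body that is the same subject as the first moving body based on the center-to-center distance between the frame circumscribing the first moving body and the detection frame of each moving body detected in the second frame.
- The information processing device according to claim 2 or 3, wherein the judgment unit judges the second moving body that is the same subject as the first moving body based on the ratio of the overlapping region to the region occupied by the frame circumscribing the first moving body and the detection frame of each moving body detected in the second frame.
- The information processing device according to any one of claims 2 to 4, wherein the judgment unit judges the second moving body that is the same subject as the first moving body by matching the first moving body against each moving body detected in the second frame using a machine-learning-based matching algorithm.
- The information processing device according to any one of claims 2 to 5, wherein the judgment unit judges, for each of a plurality of frames preceding the first frame, which of the moving bodies detected in that frame is the same subject as the first moving body, and the determination unit, when the largest of the reliabilities of the first moving body obtained with the respective detection frames of the moving bodies judged in each frame to be the same subject as the first moving body is greater than the reliability of the first moving body obtained with the frame circumscribing the first moving body, determines the detection frame for which the largest reliability was calculated as the detection frame of the first moving body.
- The information processing device according to any one of claims 1 to 6, wherein the determination unit determines the frame circumscribing the first moving body as the detection frame of the first moving body when the reliability of the first moving body obtained with the frame circumscribing the first moving body is greater than a first threshold.
- The information processing device according to any one of claims 1 to 7, wherein the determination unit determines the detection frame of the second moving body as the detection frame of the first moving body when the reliability of the first moving body obtained with the detection frame of the second moving body is greater than the reliability of the first moving body obtained with the frame circumscribing the first moving body.
- The information processing device according to any one of claims 1 to 8, wherein the determination unit records the detection frame of the first moving body in the recording unit when the reliability obtained with the determined detection frame of the first moving body is greater than a second threshold.
- The information processing device according to any one of claims 1 to 9, wherein, when the reliability of the first moving body obtained with the detection frame of the second moving body is greater than the reliability of the first moving body obtained with the frame circumscribing the first moving body, and the number of consecutive frames in which the difference between the frame circumscribing the first moving body and the detection frame of the second moving body is greater than a third threshold is less than or equal to a predetermined number, the determination unit determines the detection frame of the second moving body as the detection frame of the first moving body and records the detection frame of the first moving body in the recording unit.
- The information processing device according to any one of claims 1 to 10, further comprising an output unit that outputs the detection frame of the first moving body recorded in the recording unit superimposed on the first frame.
- The information processing device according to claim 11, wherein the output unit outputs the detection frame of the first moving body when the reliability obtained with the detection frame of the first moving body recorded in the recording unit is greater than a second threshold.
- The information processing device according to claim 11 or 12, wherein, when the reliability of the first moving body obtained with the detection frame of the second moving body is greater than the reliability of the first moving body obtained with the frame circumscribing the first moving body, and the number of consecutive frames in which the difference between the frame circumscribing the first moving body and the detection frame of the second moving body is greater than a third threshold is less than or equal to a predetermined number, the output unit outputs the detection frame of the first moving body recorded in the recording unit.
- The information processing device according to any one of claims 11 to 13, wherein the output unit outputs the detection frame of the first moving body when the number of consecutive frames in which the reliability obtained with the determined detection frame of the first moving body is greater than a first threshold exceeds a predetermined number.
- The information processing device according to any one of claims 1 to 14, further comprising a correction unit that corrects the detection frame of the second moving body based on the amount of change in position and size between the detection frame of the second moving body and the detection frame of a moving body judged, in a frame preceding the second frame, to be the same subject as the first moving body.
- The information processing device according to any one of claims 1 to 15, wherein the detection unit detects moving bodies by at least one of an inter-frame difference method and a background subtraction method.
- The information processing device according to any one of claims 1 to 16, wherein the calculation unit calculates the reliability that the detected moving body is a predetermined subject with a classifier based on at least one of a neural network, boosting, and a support vector machine.
- An information processing method in which a computer executes: a detection step of detecting a first moving body from a first frame included in a moving image; a calculation step of calculating a reliability that the first moving body is a predetermined subject, using a frame circumscribing the first moving body and a detection frame that is recorded in a recording unit and is the detection frame of a second moving body detected in a second frame preceding the first frame; and a determination step of determining a detection frame of the first moving body based on the reliability of the first moving body obtained with the frame circumscribing the first moving body and the reliability of the first moving body in the first frame obtained with the detection frame of the second moving body, and recording the detection frame in the recording unit.
- A program for causing a computer to execute each step of the method according to claim 18.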
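As an illustration of the same-subject judgment recited in claims 3 and 4, the sketch below computes the center-to-center distance and the overlap ratio (intersection over union) between two frames and picks the best-overlapping candidate. The `Box` corner layout, the function names, and the `min_iou` cutoff are assumptions made for this example; the claims do not specify how the two measures are thresholded or combined.

```python
import math
from typing import Optional, Sequence, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def center_distance(a: Box, b: Box) -> float:
    """Euclidean distance between the centers of two boxes."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return math.hypot(ax - bx, ay - by)


def iou(a: Box, b: Box) -> float:
    """Ratio of the overlapping area to the area covered by both boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_same_subject(
    current: Box, previous_boxes: Sequence[Box], min_iou: float = 0.3
) -> Optional[int]:
    """Return the index of the previous-frame box judged to be the same subject.

    Returns None when no candidate reaches `min_iou` (a hypothetical cutoff).
    """
    best_index, best_value = None, 0.0
    for index, box in enumerate(previous_boxes):
        value = iou(current, box)
        if value >= min_iou and value > best_value:
            best_index, best_value = index, value
    return best_index
```

A variant could rank candidates by `center_distance` instead, or use it to break ties between candidates with similar overlap.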
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE112021006829.6T DE112021006829T5 (de) | 2021-01-18 | 2021-09-14 | Informationsverarbeitungsvorrichtung und informationsverarbeitungsverfahren |
CN202180088783.1A CN116802679A (zh) | 2021-01-18 | 2021-09-14 | 信息处理装置以及信息处理方法 |
US18/259,639 US20240071028A1 (en) | 2021-01-18 | 2021-09-14 | Information processing device and information processing method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021-005855 | 2021-01-18 | ||
JP2021005855A JP2022110441A (ja) | 2021-01-18 | 2021-01-18 | 情報処理装置および情報処理方法 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022153599A1 true WO2022153599A1 (ja) | 2022-07-21 |
Family
ID=82447070
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2021/033706 WO2022153599A1 (ja) | 2021-01-18 | 2021-09-14 | 情報処理装置および情報処理方法 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20240071028A1 (ja) |
JP (1) | JP2022110441A (ja) |
CN (1) | CN116802679A (ja) |
DE (1) | DE112021006829T5 (ja) |
WO (1) | WO2022153599A1 (ja) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007180933A (ja) * | 2005-12-28 | 2007-07-12 | Secom Co Ltd | 画像センサ |
WO2016021411A1 (ja) * | 2014-08-06 | 2016-02-11 | ソニー株式会社 | 画像処理装置、画像処理方法、およびプログラム |
JP2016085675A (ja) * | 2014-10-28 | 2016-05-19 | セコム株式会社 | 移動物体追跡装置 |
JP2020107349A (ja) * | 2014-09-26 | 2020-07-09 | 日本電気株式会社 | 物体追跡システム、物体追跡方法、プログラム |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000105835A (ja) | 1998-07-28 | 2000-04-11 | Hitachi Denshi Ltd | 物体認識方法及び物体追跡監視装置 |
-
2021
- 2021-01-18 JP JP2021005855A patent/JP2022110441A/ja active Pending
- 2021-09-14 CN CN202180088783.1A patent/CN116802679A/zh active Pending
- 2021-09-14 DE DE112021006829.6T patent/DE112021006829T5/de active Pending
- 2021-09-14 WO PCT/JP2021/033706 patent/WO2022153599A1/ja active Application Filing
- 2021-09-14 US US18/259,639 patent/US20240071028A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
DE112021006829T5 (de) | 2023-11-16 |
CN116802679A (zh) | 2023-09-22 |
JP2022110441A (ja) | 2022-07-29 |
US20240071028A1 (en) | 2024-02-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4579191B2 (ja) | 移動体の衝突回避システム、プログラムおよび方法 | |
US9734404B2 (en) | Motion stabilization and detection of articulated objects | |
JP5484184B2 (ja) | 画像処理装置、画像処理方法及びプログラム | |
US10896495B2 (en) | Method for detecting and tracking target object, target object tracking apparatus, and computer-program product | |
JP7447302B2 (ja) | デバイスのハンドジェスチャベースの制御のための方法及びシステム | |
KR20150067680A (ko) | 차량용 제스처 인식 시스템 및 그 방법 | |
CN113508420A (zh) | 物体追踪装置以及物体追踪方法 | |
US20110069155A1 (en) | Apparatus and method for detecting motion | |
WO2022014252A1 (ja) | 情報処理装置および情報処理方法 | |
JP5839796B2 (ja) | 情報処理装置、情報処理システム、情報処理方法及びプログラム | |
US9256945B2 (en) | System for tracking a moving object, and a method and a non-transitory computer readable medium thereof | |
CN110458861B (zh) | 对象检测与跟踪方法和设备 | |
JP7255173B2 (ja) | 人検出装置および人検出方法 | |
JP7354767B2 (ja) | 物体追跡装置および物体追跡方法 | |
WO2022153599A1 (ja) | 情報処理装置および情報処理方法 | |
TW202001783A (zh) | 影像分析方法、電子系統以及非暫態電腦可讀取記錄媒體 | |
JP2019185556A (ja) | 画像解析装置、方法およびプログラム | |
JP2021149687A (ja) | 物体認識装置、物体認識方法及び物体認識プログラム | |
JP7197011B2 (ja) | 身長推定装置、身長推定方法及びプログラム | |
JP5539565B2 (ja) | 撮像装置及び被写体追跡方法 | |
US20200257888A1 (en) | Three-dimensional facial shape estimating device, three-dimensional facial shape estimating method, and non-transitory computer-readable medium | |
JP7338174B2 (ja) | 物体検出装置および物体検出方法 | |
WO2021140844A1 (ja) | 人体検出装置および人体検出方法 | |
JP5247419B2 (ja) | 撮像装置および被写体追跡方法 | |
JP2019192155A (ja) | 画像処理装置、撮影装置、画像処理方法およびプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21919490 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 18259639 Country of ref document: US |
|
WWE | Wipo information: entry into national phase |
Ref document number: 202180088783.1 Country of ref document: CN |
|
WWE | Wipo information: entry into national phase |
Ref document number: 112021006829 Country of ref document: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21919490 Country of ref document: EP Kind code of ref document: A1 |