WO2018058573A1

WO2018058573A1 - Object detection method, object detection apparatus and electronic device

Info

Publication number: WO2018058573A1
Application number: PCT/CN2016/101204
Authority: WO
Inventors: 伍健荣; 刘晓青; 白向晖; 谭志明; 东明浩
Original assignee: 富士通株式会社; 伍健荣; 刘晓青; 白向晖; 谭志明
Priority date: 2016-09-30
Filing date: 2016-09-30
Publication date: 2018-04-05
Also published as: CN109479118A

Abstract

Provided in the embodiments of the present application are an object detection method, an object detection apparatus and an electronic device, for detecting an object from a video image frame. The object detection apparatus comprises: an extraction unit for extracting, on the basis of movement information of a video image frame, a region of interest from the video image frame; and a detection unit for performing, according to the region of interest extracted by the extraction unit, object detection on the video image frame. According to the present application, the accuracy and speed of object detection are improved.

Description

Object detection method, object detection device, and electronic device

Technical field

The present application relates to the field of information technology, and in particular, to a video image based object detecting method, an object detecting device, and an electronic device.

Background technique

With the development of information technology, image-based object detection technology is more and more widely used. For example, in the field of traffic monitoring, object detection can be performed on a video surveillance image, thereby identifying an object such as a specific vehicle, and further implementing functions such as object recognition, tracking, and control.

In the existing video image-based object detection technology, object detection can be performed on the entire image range of the video image frame, so that the blind area of detection can be avoided, but the range of detection needs to be large, and the data processing amount when detecting is compared. Big.

In the prior art, a Region of Interest (ROI) may be preset in a video image frame, and object detection is performed for a region of interest of each video image frame, thereby reducing detection time. The amount of data processing increases the detection speed.

It should be noted that the above description of the technical background is only for the purpose of facilitating a clear and complete description of the technical solutions of the present application, and is convenient for understanding by those skilled in the art. The above technical solutions are not considered to be well known to those skilled in the art simply because these aspects are set forth in the background section of this application.

Application content

The inventors of the present application found that in the prior art, the region of interest is preset, and the locations of the regions of interest in each video image frame are the same unless a new region of interest is re-set. However, in the scene in which the object is detected, the object to be detected usually moves, and when it moves outside the region of interest, it is difficult to be detected, thereby causing a missed detection.

An embodiment of the present application provides an object detecting method, an object detecting apparatus, and an electronic device, which can extract a region of interest based on motion information of a video image frame, and perform object detection according to the extracted region of interest, Therefore, the accuracy of object detection can be improved and the detection speed can be improved.

According to a first aspect of embodiments of the present application, an object detection apparatus is provided, For detecting a target object from a video image frame, the apparatus includes:

An extracting unit that extracts a region of interest from the video image frame based on motion information of a video image frame;

And a detecting unit that performs object detection in the video image frame according to the region of interest extracted by the extracting unit.

According to a second aspect of the embodiments of the present application, an object detecting method is provided for detecting a target object from a video image frame, the method comprising:

Extracting a region of interest from the video image frame based on motion information of the video image frame;

Object detection is performed in the video image frame based on the extracted region of interest.

According to a third aspect of the embodiments of the present application, there is provided an electronic device comprising the object detecting device of the first aspect of the above embodiment.

The beneficial effects of the embodiments of the present application are that, according to the implementation of the present application, the accuracy of object detection can be improved, and the detection speed can be improved.

Specific embodiments of the present application are disclosed in detail with reference to the following description and accompanying drawings, in which <RTIgt; It should be understood that the embodiments of the present application are not limited in scope. The embodiments of the present application include many variations, modifications, and equivalents within the scope of the appended claims.

Features described and/or illustrated with respect to one embodiment may be used in one or more other embodiments in the same or similar manner, in combination with, or in place of, features in other embodiments. .

It should be emphasized that the term "comprising" or "comprises" or "comprising" or "comprising" or "comprising" or "comprising" or "comprises"

DRAWINGS

The drawings are included to provide a further understanding of the embodiments of the present application, and are intended to illustrate the embodiments of the present application Obviously, the drawings in the following description are only some of the embodiments of the present application, and those skilled in the art can obtain other drawings according to the drawings without any inventive labor. In the drawing:

1 is a schematic diagram of an object detecting device according to Embodiment 1 of the present application;

2 is a schematic diagram of an extracting unit of Embodiment 1 of the present application;

3 is a schematic diagram of a video image frame according to Embodiment 1 of the present application;

4 is a schematic diagram of a binarized moving image corresponding to the video image frame of FIG. 3;

5 is a schematic diagram of performing a connected domain segmentation process on the binarized moving image of FIG. 4 and generating a circumscribed rectangle;

6 is a schematic diagram of merging connected domains according to Embodiment 1 of the present application;

FIG. 7 is another schematic diagram of merging connected domains according to Embodiment 1 of the present application; FIG.

8 is a schematic diagram of a detecting unit of Embodiment 1 of the present application;

9 is a schematic diagram of combining detection results according to Embodiment 1 of the present application;

10 is a working flow chart of the detecting unit of Embodiment 1 of the present application;

11 is a schematic flow chart of an object detecting method according to Embodiment 2 of the present application;

FIG. 12 is a schematic diagram of a method for extracting a region of interest according to Embodiment 2 of the present application; FIG.

FIG. 13 is a schematic diagram of a method for performing object detection according to Embodiment 2 of the present application; FIG.

FIG. 14 is a schematic diagram showing the configuration of an electronic device according to Embodiment 3 of the present application.

detailed description

The foregoing and other features of the present application will be apparent from the description, The specific embodiments of the present application are specifically disclosed in the specification and the drawings, which illustrate a part of the embodiments in which the principles of the present application may be employed, it being understood that the present application is not limited to the described embodiments, but instead The application includes all modifications, variations and equivalents falling within the scope of the appended claims. Various embodiments of the present application will be described below with reference to the accompanying drawings. These embodiments are merely exemplary and are not limiting of the application.

Example 1

Embodiment 1 of the present application provides an object detection device for detecting a target object from a video image frame.

1 is a schematic diagram of an object detecting device of the first embodiment. As shown in FIG. 1, the detecting device 100 includes an extracting unit 101 and a detecting unit 102.

The extracting unit 101 extracts a region of interest from the video image frame based on the motion information of the video image frame; the detecting unit 102 is in the video image frame according to the region of interest extracted by the extracting unit 101. Perform object detection.

According to the present embodiment, the object detecting apparatus can extract the region of interest based on the motion information of the video image frame, and perform object detection based on the extracted region of interest, thereby being able to extract more accurately for each video image frame Corresponding regions of interest, thereby improving the accuracy of object detection and increasing the speed of detection.

In this embodiment, the video image frame may be, for example, an image frame in a video captured by the surveillance camera. Of course, the video image frame may also be from other devices, which is not limited in this embodiment.

2 is a schematic diagram of the extracting unit 101 of the present embodiment. As shown in FIG. 2, the extracting unit 101 includes a motion detecting unit 201, a region dividing unit 202, and a generating unit 203.

In this embodiment, the motion detecting unit 201 is configured to detect motion information in a video image frame; the region dividing unit 202 is configured to divide each moving object in the video image frame according to the motion information detected by the motion detecting unit 201. The occupied area; the generating unit 203 generates at least one region of interest according to an area occupied by each moving object in the video image frame, the at least one region of interest covering an area where each moving object in the video image frame is located .

In this embodiment, the motion detecting unit 201 may perform foreground detection on the video image frame to generate a binarized motion image of the video image frame, and according to the binarized moving image, the video image frame may be obtained. The motion information, for example, may reflect motion information of the video image frame according to the first pixel in the binarized motion image, wherein the first pixel may be, for example, a white pixel.

3 is a schematic diagram of a video image frame, FIG. 4 is a schematic diagram of a binarized moving image corresponding to the video image frame of FIG. 3, and the white pixels in the binarized moving image 400 of FIG. 4 can reflect the video image frame 300. Sports information.

In this embodiment, the region dividing unit 202 may perform a connected domain segmentation process on the binarized moving image to obtain at least one connected domain of the pixel, where the at least one connected domain may correspond to each moving object in the video image frame. Area. For example, in the binarized moving image, each connected domain includes a plurality of first pixels, and within each connected domain, the first pixel is connected, and between the different connected domains, the first pixel is not connected, and thus different The connected areas are isolated from each other.

In this embodiment, the region dividing unit 202 may further generate a circumscribed polygon of the connected domain for each connected domain in the binarized moving image, and the circumscribed polygon may be used to represent a contour of each connected domain, and the circumscribed polygon may be, for example, It is a rectangle or the like. 5 is a schematic diagram of the connected domain segmentation process of the binarized moving image of FIG. 4 and the generation of the circumscribed rectangle. As shown in FIG. 5, each circumscribed rectangle 501 represents the contour of each connected domain, and The area enclosed by each circumscribed rectangle 501 corresponds to the area occupied by each moving object in the video image frame 300.

In this embodiment, the area dividing unit 202 may also merge the connected domains whose distances are less than or equal to the first threshold as a new connected domain.

In this embodiment, the first threshold may be a value greater than 0, and the distance between the connected domains may refer to the distance between the boundaries of the connected domains, or may refer to the geometric center or the centroid of each connected domain. The distance and the like; and, in the case where each connected domain has an outer polygon, the distance between the connected domains may refer to the distance between the boundaries of the contiguous polygons, or may refer to the geometric center or centroid of each circumscribed polygon. The distance between the two, etc., wherein if the two connected domains circumscribed polygons partially overlap each other, the distance between the two connected domains can be considered to be a negative value, which is smaller than the first threshold.

In this embodiment, the area dividing unit 202 may also generate a circumscribed polygon by a new connected domain formed by combining at least two connected domains.

FIG. 6 is a schematic diagram of merging the connected domains. As shown in FIG. 6, in the pre-merge diagram 601, the circumscribed rectangles of the two connected domains are 6011 and 6012, respectively, and the circumscribed

rectangles

6011 and 6012 partially overlap. In the merged diagram 602, the two connected domains are merged into the connected domain 6020, and the circumscribed rectangle of the connected domain 6020 is 6021, wherein the circumscribed rectangle is 6021, which may be a circumscribed rectangle of the circumscribed

rectangles

6011 and 6012.

Figure 7 is another schematic diagram of the merging of connected domains. As shown in FIG. 7, in the pre-merge diagram 701, the circumscribed rectangles of the four connected domains are 7011, 7012, 7013, and 7014, respectively, and the distance between the four circumscribed rectangles and the boundary of the adjacent circumscribed rectangle is smaller than the first Threshold. In the diagram 702 after the merging process is performed by the zoning unit 202, the four connected domains are merged into the connected domain 7020, and the circumscribed rectangle of the connected domain 7020 is 7021. The circumscribed rectangle 7021 may be a circumscribed

rectangle

7011, 7012, 7013. And the circumscribed rectangle of 7014.

In addition, as shown in FIG. 7 , the distance between the circumscribed rectangle 7016 and the circumscribed rectangles 7011 7070 is far, for example, the distance is greater than the first threshold. Therefore, the connected domain corresponding to the circumscribed rectangle 7016 is not connected to the circumscribed rectangle 7011 ～ The connected domains corresponding to 7014 are merged.

In the present embodiment, the generating unit 203 is capable of generating at least one region of interest according to the distance between the regions occupied by the moving objects in the video image frame, whereby the regions closer to each other can be in the same interest. Within the scope covered by the area.

In this embodiment, since the distance between the regions occupied by the moving objects in the video image frame can correspond to the distance between the connected domains in the binarized moving image, the generating unit 203 can be binarized according to the binarization. The distance between the connected domains in the moving image to generate a region of interest, for example, the generating unit 203 can make the distance A connected domain that is less than or equal to the second threshold is covered by the same region of interest.

As shown in FIG. 7, the distance between the connected domain corresponding to the circumscribed rectangle 7016 and the connected domain 7020 is less than or equal to the second threshold. Therefore, the connected domain corresponding to the circumscribed rectangle 7016 and the connected domain 7020 are the same region of interest 703. Covered, wherein the boundary 7031 of the region of interest 703 is identified by a rectangular frame. Of course, the embodiment is not limited thereto, and the region of interest may be identified in other manners. For example, the boundary 7031 may be other polygonal frames.

In FIG. 7, the size of the boundary of the region of interest 703 may be larger than the size of the circumscribed polygon of each connected domain covered by it. For example, the size of the boundary 7031 of the region of interest 703 may be larger than the size of the circumscribed rectangle 7016 and the circumscribed rectangle 7021. The size of the circumscribed rectangle, for example, the former can be 10% larger than the latter.

In the present embodiment, the generating unit 203 may use a corresponding region of the region of interest generated in the binarized moving image in the video image frame as the region of interest of the video image frame, whereby the extracting unit 101 can obtain the video image from the video image. The region of interest is extracted from the frame.

Block 301 of FIG. 3 illustrates the boundaries of the region of interest extracted from the video image frame 300 by the extraction unit 101 in accordance with the present application.

In the present embodiment, the detecting unit 102 can perform object detection in the video image frame based on the region of interest extracted by the extracting unit 101.

FIG. 8 is a schematic diagram of the detecting unit 102. As shown in FIG. 8, the detecting unit 102 may include a determining unit 801 and an object detecting unit 802.

In this embodiment, the determining unit 801 is configured to determine whether the number of regions of interest in the video image frame is less than or equal to a third threshold, and whether the area of the region of interest is less than or equal to a fourth threshold; the object detecting unit 802 determines The result of the determination by unit 801 is object detection in the region of interest of the video image frame or the entire image range of the video image frame.

In this embodiment, if the determining unit 801 determines that the number of regions of interest in the video image frame is less than or equal to a third threshold, and the sum of the areas of the region of interest in the video image frame is less than or equal to a fourth threshold, Then, the object detecting unit 802 performs object detection in each region of interest of the video image frame, whereby fast object detection can be performed.

Furthermore, if the judging unit 801 determines that the number of regions of interest in the video image frame is 0, the object detecting unit 802 does not perform object detection on the video image frame.

In this embodiment, if the determining unit 801 determines that the number of regions of interest in the video image frame is large At a third threshold, or the sum of the areas of the region of interest in the video image frame is greater than a fourth threshold, then in the video image frame, the object detecting unit 802 performs object detection in the entire image range of the video image frame, thereby Can prevent missed inspections.

In this embodiment, the specific method for the object detection unit 802 to perform the object detection may refer to the prior art, and is not described in this embodiment.

In addition, in this embodiment, a specific video image frame in the video may be used as a key frame, and other video image frames in the video may be used as a normal frame, wherein a video separated by a predetermined time may be used. The image frame or the video image frame of a predetermined number of frames is used as a key frame. In addition, other methods may be used to set the key frame. The determining unit 801 can determine whether the video image frame is a normal frame, and for the normal frame, further determining, according to the determination result of the determining unit 801, performing object detection in the region of interest of the normal frame or the entire image range of the normal frame. For the key frame, the determination unit 801 can be used for further determination, and the object detection can be performed directly in the entire image range of the key frame. Thereby, it is possible to prevent missed detection by performing object detection in the entire image range on the key frame.

In this embodiment, the detecting unit 102 may further have a merging unit 803. In the case where the object detecting unit 802 performs object detection on the region of interest of the current video image frame, the merging unit 803 may detect the detection result in the region of interest of the current video image frame and the video image frame before the current video image frame. The detection result is merged, and the combined detection result may include, for example, a detection result of the region of interest of the current video image frame, and detection of the previous video image frame outside the region of interest of the current video image frame. result.

FIG. 9 is a schematic diagram of merging detection results, 901 is a video image frame before the current

video image frame

902, and 9011, 9012 are object objects detected in the video image frame 901, and the current video image frame 902 is sensed. The region of interest is 9021, the target object 9022 is detected in the region of interest 9021, and the detection result of the current video image frame 902 is combined with the detection result of the previous video image frame 901 to obtain a combined detection result 903, in the merged The detection result 903 includes: the object object 9022 detected in the region of interest 9021 in the current video image frame 902, and the region other than the region of interest 9021 of the current video image frame 902, detected in the previous video image frame 901 Object object 9012.

Next, the workflow of the detecting unit 102 will be described with reference to FIG.

Step 1001: The determining unit 801 determines whether the current video image frame is a normal frame, and if yes, proceeds to step 1002, and if no, proceeds to step 1005.

Step 1002: The determining unit 801 determines whether the number of regions of interest in the current video image frame is less than or equal to a third threshold, and if yes, proceeds to step 1003, and if no, proceeds to step 1005.

Step 1003: The determining unit 801 determines whether the total area of the region of interest in the current video image frame is less than or equal to the fourth threshold. If yes, proceed to step 1004. If no, proceed to step 1005.

Step 1004: The object detecting unit 802 performs object detection in the region of interest of the video image frame.

Step 1005: The object detecting unit 802 performs object detection in the entire image range of the video image frame.

Step 1006: The merging unit 803 combines the detection result of the region of interest of the current video image frame with the detection result of the previous video image frame.

Example 2

The embodiment of the present application further provides an object detecting method for detecting a target object from a video image frame, corresponding to the object detecting device of Embodiment 1.

FIG. 11 is a schematic flowchart of the object detecting method in the second embodiment. As shown in FIG. 11, the detecting method may include:

Step 1101, extracting a region of interest from the video image frame based on motion information of a video image frame;

Step 1102: Perform object detection in the video image frame according to the extracted region of interest.

FIG. 12 is a schematic diagram of a method for extracting a region of interest according to the second embodiment. As shown in FIG. 12, the method includes:

Step 1201: Detect motion information in the video image frame.

Step 1202: According to the detected motion information, divide an area occupied by each moving object in the video image frame;

Step 1203: Generate at least one region of interest according to an area occupied by each moving object in the video image frame, where the at least one region of interest covers an area where each moving object in the video image frame is located.

In step 1201 of this embodiment, binarization of the video image frame may be generated based on foreground detection. A motion image to obtain the motion information of the video image frame.

In step 1202 of the embodiment, the connected domain segmentation process may be performed on the binarized moving image to obtain at least one connected domain of the pixel, where the at least one connected domain corresponds to each moving object in the video image frame. Occupied area.

In step 1202 of this embodiment, an circumscribed polygon of each of the connected domains may also be generated.

In step 1202 of this embodiment, the connected domains whose distances from each other are less than or equal to the first threshold may also be merged into one new connected domain.

In step 1203 of the embodiment, the at least one region of interest may be generated according to the distance of the region.

FIG. 13 is a schematic diagram of a method for performing object detection in the video image frame according to the extracted region of interest according to the second embodiment. As shown in FIG. 13, the method includes:

Step 1301: Determine whether the number of the regions of interest in the video image frame is less than or equal to a third threshold, and whether an area of the region of interest is less than or equal to a fourth threshold;

Step 1302: Perform object detection in the region of interest of the video image frame or the entire image range of the video image frame according to the result of the determining.

As shown in FIG. 13, the method further includes:

Step 1303: Combine the detection result in the region of interest of the current video image frame and the detection result of the video image frame before the current video image frame in the case of performing object detection on the region of interest of the current video image frame. .

For a detailed description of the above steps, reference may be made to the description of the corresponding unit in Embodiment 1, and the repeated description is not repeated here.

According to the present embodiment, the object detecting method can extract the region of interest based on the motion information of the video image frame, and perform object detection based on the extracted region of interest, thereby being able to extract more accurately for each video image frame. Corresponding regions of interest, thereby improving the accuracy of object detection and increasing the speed of detection.

Example 3

Embodiment 3 of the present application provides an electronic device including the object detecting device as described in Embodiment 1.

FIG. 14 is a schematic diagram showing the configuration of an electronic device according to Embodiment 3 of the present application. As shown in FIG. 14, the electronic device 1400 can include a central processing unit (CPU) 1401 and a memory 1402; the memory 1402 is coupled to the center. The processor 1401. The memory 1402 can store various data; in addition, a program for performing object detection is stored, and the program is executed under the control of the central processing unit 1401.

In one embodiment, the functionality in the object detection device can be integrated into the central processor 1401.

The central processing unit 1401 can be configured to:

The central processor 1401 can also be configured to:

Detecting motion information in the video image frame;

Defining an area occupied by each moving object in the video image frame according to the detected motion information;

And generating at least one region of interest according to an area occupied by each moving object in the video image frame, the at least one region of interest covering an area in which each moving object in the video image frame is located.

The central processor 1401 can also be configured to:

A binarized motion image of the video image frame is generated based on foreground detection, thereby obtaining the motion information of the video image frame.

The central processor 1401 can also be configured to:

Connected domain segmentation processing is performed on the binarized moving image to obtain at least one connected domain of the pixel, the at least one connected domain corresponding to an area occupied by each moving object in the video image frame.

The central processor 1401 can also be configured to:

An circumscribed polygon of each of the connected domains is generated.

The central processor 1401 can also be configured to:

The connected domains that are less than or equal to the first threshold are merged into a new connected domain.

The central processor 1401 can also be configured to:

The at least one region of interest is generated based on the distance of the region.

The central processor 1401 can also be configured to:

Determining whether the number of the regions of interest in the video image frame is less than or equal to a third threshold, and whether an area of the region of interest is less than or equal to a fourth threshold;

Based on the result of the determination, object detection is performed in the region of interest of the video image frame or in the entire image range of the video image frame.

The central processor 1401 can also be configured to:

In the case of performing object detection on the region of interest of the current video image frame, the detection results in the region of interest of the current video image frame and the detection results of the video image frames preceding the current video image frame are combined.

In addition, as shown in FIG. 14, the electronic device 1400 may further include: an input and output unit 1403, a display unit 1404, and the like; wherein the functions of the above components are similar to those of the prior art, and details are not described herein again. It should be noted that the electronic device 1400 does not necessarily have to include all the components shown in FIG. 14; in addition, the electronic device 1400 may further include components not shown in FIG. 14, and reference may be made to the prior art.

The embodiment of the present application further provides a computer readable program, wherein the program causes the object detecting device or the electronic device to perform the object detection described in Embodiment 2 when the program is executed in an object detecting device or an electronic device method.

The embodiment of the present application further provides a storage medium storing a computer readable program, wherein the storage medium stores the computer readable program, wherein the computer readable program causes the object detecting device or the electronic device to perform the embodiment 2 Object detection method.

The object detecting apparatus described in connection with the embodiments of the present invention may be directly embodied as hardware, a software module executed by a processor, or a combination of both. For example, one or more of the functional blocks shown in Figures 1, 2, and 8 and/or one or more combinations of functional blocks may correspond to individual software modules of a computer program flow, or to individual hardware. Module. These software modules may correspond to the respective steps shown in Embodiment 2, respectively. These hardware modules can be implemented, for example, by curing these software modules using a Field Programmable Gate Array (FPGA).

The software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, removable disk, CD-ROM, or any other form of storage medium known in the art. A storage medium can be coupled to the processor to enable the processor to read information from, and write information to, the storage medium; or the storage medium can be an integral part of the processor. The processor and the storage medium can be located in an ASIC. The software module can be stored in the memory of the mobile terminal or in a memory card that can be inserted into the mobile terminal. For example, if a device (such as a mobile terminal) uses a larger capacity MEGA-SIM card or a large-capacity flash memory device, the software module can be stored in the MEGA-SIM card or a large-capacity flash memory device.

One or more of the functional block diagrams described with respect to Figures 1, 2, 8 and/or one or more groups of functional block diagrams A general purpose processor, digital signal processor (DSP), application specific integrated circuit (ASIC), field programmable gate array (FPGA) or other programmable logic device, discrete door for performing the functions described herein can be implemented. Or transistor logic device, discrete hardware component, or any suitable combination thereof. One or more of the functional blocks described with respect to Figures 1-3 and/or one or more combinations of functional blocks may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, multiple microprocessors One or more microprocessors in conjunction with DSP communication or any other such configuration.

The present invention has been described in connection with the specific embodiments thereof, but it is to be understood that the description is intended to be illustrative and not restrictive. Various modifications and alterations of this application will be apparent to those skilled in the art in the light of the invention.

Claims

An object detection device for detecting a target object from a video image frame, the device comprising:

An extracting unit that extracts a region of interest from the video image frame based on motion information of a video image frame;

And a detecting unit that performs object detection in the video image frame according to the region of interest extracted by the extracting unit.
The object detecting device according to claim 1, wherein said extracting unit comprises:

a motion detecting unit, configured to detect motion information in the video image frame;

a region dividing unit configured to divide, according to motion information detected by the motion detecting unit, an area occupied by each moving object in the video image frame;

And a generating unit that generates at least one region of interest according to an area occupied by each moving object in the video image frame, the at least one region of interest covering an area in which each moving object in the video image frame is located.
The object detecting device according to claim 2, wherein

The motion detecting unit generates a binarized motion image of the video image frame based on foreground detection, thereby obtaining the motion information of the video image frame.
The object detecting device according to claim 3, wherein

The area dividing unit performs a connected domain segmentation process on the binarized moving image to obtain at least one connected domain of the pixel, where the at least one connected domain corresponds to an area occupied by each moving object in the video image frame .
The object detecting device according to claim 4, wherein

The area dividing unit generates a circumscribed polygon of each of the connected domains.
The object detecting device according to claim 4, wherein

The area dividing unit merges the connected domains whose distances from each other by less than or equal to the first threshold into a new connected domain.
The object detecting device according to claim 2, wherein

The generating unit generates the at least one region of interest according to the distance of the region.
The object detecting device according to claim 1, wherein said detecting unit comprises:

a determining unit, configured to determine whether the number of the regions of interest in the video image frame is less than or equal to a third threshold, and whether an area of the region of interest is less than or equal to a fourth threshold;

An object detecting unit that performs object detection in the region of interest of the video image frame or the entire image range of the video image frame according to a determination result of the determining unit.
The object detecting device according to claim 8, wherein the detecting unit further comprises:

a merging unit, where the object detecting unit performs object detection on the region of interest of the current video image frame, the detection result in the region of interest of the current video image frame, and the video image frame before the current video image frame The test results are combined.
An electronic device having the object detecting device according to any one of claims 1-9.
An object detecting method for detecting a target object from a video image frame, the method comprising:

Extracting a region of interest from the video image frame based on motion information of the video image frame;

Object detection is performed in the video image frame based on the extracted region of interest.
The object detecting method according to claim 11, wherein extracting the region of interest from the video image frame comprises:

Detecting motion information in the video image frame;

Defining an area occupied by each moving object in the video image frame according to the detected motion information;

And generating at least one region of interest according to an area occupied by each moving object in the video image frame, the at least one region of interest covering an area in which each moving object in the video image frame is located.
The object detecting method according to claim 12, wherein detecting the motion information in the video image frame comprises:

A binarized motion image of the video image frame is generated based on foreground detection, thereby obtaining the motion information of the video image frame.
The object detecting method according to claim 13, wherein the area occupied by each moving object in the video image frame is divided according to the detected motion information, including:

Connected domain segmentation processing is performed on the binarized moving image to obtain at least one connected domain of the pixel, the at least one connected domain corresponding to an area occupied by each moving object in the video image frame.
The object detecting method according to claim 14, wherein, based on the detected motion information, The area occupied by each moving object in the video image frame is further included:

An circumscribed polygon of each of the connected domains is generated.
The object detecting method according to claim 14, wherein the dividing the area occupied by each moving object in the video image frame according to the detected motion information further comprises:

The connected domains that are less than or equal to the first threshold are merged into a new connected domain.
The object detecting method according to claim 12, wherein extracting the region of interest from the video image frame comprises:

The at least one region of interest is generated based on the distance of the region.
The object detecting method according to claim 11, wherein the detecting the object in the video image frame according to the extracted region of interest comprises:

Determining whether the number of the regions of interest in the video image frame is less than or equal to a third threshold, and whether an area of the region of interest is less than or equal to a fourth threshold;

Based on the result of the determination, object detection is performed in the region of interest of the video image frame or in the entire image range of the video image frame.
The object detecting method according to claim 18, wherein performing object detection in the video image frame according to the extracted region of interest further comprises:

In the case of performing object detection on the region of interest of the current video image frame, the detection results in the region of interest of the current video image frame and the detection results of the video image frames preceding the current video image frame are combined.