CN112528763A - Target detection method, electronic device and computer storage medium - Google Patents

Target detection method, electronic device and computer storage medium

Info

Publication number
CN112528763A
CN112528763A (application CN202011334487.9A)
Authority
CN
China
Prior art keywords
target
data
detection result
image data
radar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011334487.9A
Other languages
Chinese (zh)
Inventor
金智
缪其恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Automobile Technology Co ltd
Original Assignee
Zhejiang Dahua Automobile Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Automobile Technology Co., Ltd.
Priority to CN202011334487.9A
Publication of CN112528763A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Abstract

The application discloses a target detection method, an electronic device and a computer storage medium. The target detection method includes: acquiring image data captured by a camera device and preprocessing the image data; acquiring radar data detected by a millimeter wave radar and preprocessing the radar data, so that the radar data and the image data are each input into a convolutional neural network at the same scale; cascading the radar data and the image data at multiple levels of the convolutional neural network to obtain fusion data, and outputting a category detection result and a key point detection result for targets in the fusion data; and post-processing the category detection result and the key point detection result to output the category and motion state of the target. In this way, the radar data and the image data are cascade-fused at multiple levels and the features of the two kinds of data are fully fused, which improves the accuracy and robustness of target detection.

Description

Target detection method, electronic device and computer storage medium
Technical Field
The present application relates to the field of intelligent recognition technologies, and in particular, to a target detection method, an electronic device, and a computer storage medium.
Background
With the development of fields such as artificial intelligence and automatic driving, intelligent recognition has become an important research direction. Target detection is a core research area of automatic driving, and the accuracy of target detection results has a great influence on driving safety. Existing target detection methods struggle to make effective use of the characteristics of the data collected by sensing devices, so the target recognition effect is poor. In view of the above, how to improve the accuracy of target detection is an urgent problem to be solved.
Disclosure of Invention
The technical problem mainly solved by the present application is to provide a target detection method, an electronic device and a computer storage medium that cascade-fuse radar data and image data at multiple levels and fully fuse the features of the two kinds of data, so as to improve the accuracy and robustness of target detection.
In order to solve the above technical problem, a first aspect of the present application provides a target detection method, including: acquiring image data captured by a camera device and preprocessing the image data; acquiring radar data detected by a millimeter wave radar and preprocessing the radar data, so that the radar data and the image data are each input into a convolutional neural network at the same scale; cascading the radar data and the image data at multiple levels of the convolutional neural network to obtain fusion data, and outputting a category detection result and a key point detection result of a target in the fusion data; and post-processing the category detection result and the key point detection result to output the category and motion state of the target.
Wherein the step of obtaining radar data detected by the millimeter wave radar and preprocessing the radar data includes: obtaining target data including the target detected by the millimeter wave radar at a current time point; compensating the target at the current time point by using historical data containing the target detected by the millimeter wave radar before the current time point, and generating radar data of the current time point; obtaining position information of the target in the radar data of the current time point, and projecting the target onto pixel points of the image data of the current time point according to the position information, to obtain target enhancement data of the radar data on the image data; and normalizing the target data.
Wherein the target data includes the relative distance, relative speed and scattering cross section of the target relative to the millimeter wave radar, and after the step of normalizing the target data, the method further includes: generating three-channel data from the distance, relative speed and scattering cross section corresponding to the target.
Wherein the step of obtaining image data captured by the camera device and preprocessing the image data includes: adjusting image processing parameters of the camera device; obtaining the image data captured by the adjusted camera device; and normalizing the image data.
Wherein the step of cascading the radar data and the image data at multiple levels of the convolutional neural network to obtain fusion data, and outputting a category detection result and a key point detection result of a target in the fusion data, includes: cascading, multiple times, the radar data maps of different scales obtained after the radar data pass through different levels of the convolutional neural network with the image data maps of the same scales obtained after the image data pass through different levels of the convolutional neural network, to obtain a fused feature map; and outputting the category detection result and the key point detection result of the target according to the fused feature map.
Wherein, after the step of concatenating the radar data and the image data in multiple hierarchies of the convolutional neural network to obtain fused data and outputting a category detection result and a keypoint detection result of a target in the fused data, the method further includes: and acquiring the key point detection result corresponding to the adjacent time point before the current time point, further acquiring the coincidence coefficient of the key point detection results corresponding to the current time point and the adjacent time point, and determining the key points with the coincidence coefficient larger than a first threshold value as the same target so as to acquire a target time sequence matching result.
Wherein the step of post-processing the category detection result and the key point detection result to output the category and the motion state of the target includes: acquiring the accumulated detection times of the target at the current time point according to the target time sequence matching result; acquiring the accumulated identification times of the category of the target at the current time point according to the category detection result; when the ratio of the accumulated identification times to the accumulated detection times of the same target reaches a second threshold, determining the target as a category reaching the second threshold; and obtaining the motion state of the target by using a preset filtering method according to the key point detection result.
Wherein, the step of obtaining the motion state of the target by using a preset filtering method according to the detection result of the key point comprises: obtaining a world coordinate system of the camera device, and obtaining the position of the target in the world coordinate system according to the detection result of the key point; and acquiring the motion state of the target in the world coordinate system by using an interactive multi-model filtering method.
In order to solve the above technical problem, a second aspect of the present application provides an electronic device, which includes a memory and a processor coupled to each other, wherein the memory stores program data, and the processor calls the program data to execute the object detection method of the first aspect.
In order to solve the above technical problem, a third aspect of the present application provides a computer storage medium having stored thereon program data that, when executed by a processor, implements the object detection method of the first aspect.
The beneficial effects of the present application are as follows: the radar data and the image data are preprocessed separately so that they are input into the convolutional neural network at the same scale, and the radar data and the image data are then cascade-fused at multiple levels; after the features of the two kinds of data have been fully fused, the category detection result and the key point detection result of the target are output from the feature-fused data. This improves the accuracy of the category and motion state of the target output from the category detection result and the key point detection result, and thus improves the accuracy and robustness of target detection.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is obvious that the drawings described below are only some embodiments of the present application, and that other drawings can be obtained from them by those skilled in the art without creative effort. Wherein:
fig. 1 is a schematic flowchart illustrating a target detection method according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of another embodiment of a target detection method provided in the present application;
FIG. 3 is a flowchart illustrating an embodiment corresponding to step S202 in FIG. 2;
FIG. 4 is a schematic flowchart of an embodiment corresponding to step S203 in FIG. 2;
FIG. 5 is a diagram of a topology corresponding to a convolutional neural network;
FIG. 6 is a flowchart illustrating an embodiment corresponding to step S205 in FIG. 2;
FIG. 7 is a flowchart illustrating an embodiment corresponding to step S604 in FIG. 6;
FIG. 8 is a schematic structural diagram of an embodiment of an electronic device provided in the present application;
FIG. 9 is a schematic structural diagram of an embodiment of a computer storage medium provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "system" and "network" are often used interchangeably herein. The term "and/or" herein is merely an association describing an associated object, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship. Further, the term "plurality" herein means two or more than two.
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating a target detection method according to an embodiment of the present disclosure, the method including:
step S101: and acquiring image data shot by the camera device, preprocessing the image data, acquiring radar data detected by the millimeter wave radar, and preprocessing the radar data so that the radar data and the image data are respectively input into the convolutional neural network in the same scale.
Specifically, a camera device and a millimeter wave radar are provided on the unmanned vehicle for image capture and obstacle detection, respectively. The millimeter wave radar has strong penetration capability, is unaffected by weather, is small and compact, offers high recognition precision and a long detection range, and is relatively low in cost; it is therefore suitable for harsh natural environments and, when applied in the field of unmanned driving, can improve the detection range and detection precision of the unmanned vehicle.
Specifically, image data captured by the camera are acquired and preprocessed: linear correction, noise removal, dead-pixel removal and white balancing are performed on the original image data to enhance the quality of the effective data in the image. Radar data detected by the millimeter wave radar, which contain scanning points of several targets, are acquired; the scanning points of the targets in the radar data are projected onto the target pixel points of temporally and spatially synchronized image data, so that the scanning points in the radar data are compensated by the pixel points around the target pixel points, which increases the radar data density and yields target enhancement data.
Further, the scales of the processed radar data and the processed image data are adjusted to be the same; if the radar data and the image data are regarded as a set of matrices, adjusting them to the same scale means adjusting the matrices corresponding to the radar data and the image data to the same width and height.
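The following is a minimal sketch of this scale-alignment step, assuming the preprocessed radar channels are stored as a NumPy array; the choice of nearest-neighbour interpolation is an assumption for illustration and is not specified by the patent.

```python
import cv2
import numpy as np

def align_scales(radar_map: np.ndarray, image: np.ndarray) -> np.ndarray:
    """Resize the radar map (H_r x W_r x C) to the image's height and width."""
    h, w = image.shape[:2]
    return cv2.resize(radar_map, (w, h), interpolation=cv2.INTER_NEAREST)
```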
Step S102: and cascading the radar data and the image data in multiple hierarchies of the convolutional neural network to obtain fusion data, and outputting a target class detection result and a key point detection result in the fusion data.
Specifically, the preprocessed radar data and image data, now at the same scale, are each fed into the convolutional neural network. The first radar feature output after the radar data pass through the first level of the network is cascaded with the first image feature output after the image data pass through the first level, giving a first fused map; the first fused map is fed into the next level of the network, and the second radar feature output after the radar data pass through the second level is cascaded with the second image feature output after the first fused map passes through the second level, giving a second fused map. Proceeding in this way, radar features of multiple scales are fused with the image data several times, and the fusion data are obtained.
Further, the convolutional neural network outputs the category detection result and the key point detection result for targets in the fusion data obtained through the multiple cascades. A target in the fusion data is a target in the radar data; the category detection result indicates which class the target belongs to, such as a person, an animal, a building, a plant or a vehicle, and the key point detection result gives the position information of the target in the fusion data.
Step S103: and performing post-processing on the category detection result and the key point detection result to output the category and the motion state of the target.
Specifically, the number of times N that a target in the radar data is detected over a period spanning multiple time points is obtained, together with the maximum number of times M that any single category is recognized for that target in the category detection results; when the M/N ratio exceeds 90%, the category of the target is determined to be the category corresponding to the maximum recognition count.
Further, the position information of the target in the key point detection result is obtained; considering the various motion states the target may currently have, the probability of each motion state is output, the state with the highest probability is taken as the current state, and the motion state of the target relative to the unmanned vehicle is thereby obtained.
The target detection method provided by this embodiment preprocesses the radar data and the image data separately so that they are input into the convolutional neural network at the same scale, and cascade-fuses the radar data and the image data at multiple levels. After the features of the two kinds of data have been fully fused, the category detection result and the key point detection result of the target are output from the feature-fused data, which improves the accuracy of the category and motion state output from these results and thus improves the accuracy and robustness of target detection.
Referring to fig. 2, fig. 2 is a schematic flow chart illustrating a target detection method according to another embodiment of the present application, the method including:
step S201: and acquiring image data shot by the camera device and preprocessing the image data.
Specifically, the image data preprocessing mainly involves adjusting exposure parameters, gain parameters, white balance parameters, 3D noise reduction and digital wide-dynamic-range parameters; the image signal processing (ISP) module of the camera device is adjusted to obtain image processing parameters adapted to the current environment.
Optionally, the step of obtaining and pre-processing image data captured by the image capturing device comprises: adjusting image processing parameters of the camera device; obtaining the adjusted image data shot by the camera device; and carrying out normalization processing on the image data.
Specifically, the parameters of the image processing module are adjusted to suit the current environment, improving the match between the camera device's parameters and the environment and thus the quality of the image data. After the image processing parameters have been adjusted, the image data captured by the camera device with the adjusted parameters are obtained, the image is cropped and/or zoomed, and the preliminarily preprocessed image data are normalized, converting them into a standard form through a series of transformations.
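A hedged sketch of the image-side crop/scale and normalization described above follows; the network input size and the mean/std values are assumptions for illustration, not values taken from the patent.

```python
import cv2
import numpy as np

def preprocess_image(img_bgr: np.ndarray, size=(512, 512)) -> np.ndarray:
    img = cv2.resize(img_bgr, size)                       # crop and/or zoom to the input size
    img = img.astype(np.float32) / 255.0                  # scale to [0, 1]
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    return (img - mean) / std                             # normalized (standard) form
```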
Step S202: and radar data detected by the millimeter wave radar is obtained and preprocessed.
Specifically, referring to fig. 3, fig. 3 is a schematic flowchart illustrating an embodiment corresponding to step S202 in fig. 2, where the step S202 includes:
step S301: target data including a target detected by the millimeter wave radar at the current time point is obtained.
Specifically, the millimeter wave radar scans the surrounding environment and identifies and outputs a group of targets corresponding to the current time point. When scanning, the radar emits electromagnetic waves with a wavelength of 1-10 mm; the distance to a target object is obtained from the round-trip flight time of the electromagnetic wave from transmission to reception; the speed of the target relative to the radar is obtained from the frequency change of the returned electromagnetic wave according to the Doppler effect; and the azimuth angle of the target is calculated from the phase difference of the electromagnetic waves reflected by the same target and received by parallel receiving antennas. The millimeter wave radar therefore also generates target data for each scanned target, including the relative distance, relative speed and scattering cross section of the target with respect to the radar.
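The snippet below shows illustrative versions of the standard radar relations described above (range from round-trip time, radial speed from the Doppler shift, azimuth from the inter-antenna phase difference); they are textbook formulas, not expressions taken from the patent.

```python
import math

C = 299_792_458.0          # speed of light, m/s

def target_range(round_trip_time_s: float) -> float:
    return C * round_trip_time_s / 2.0

def radial_speed(doppler_shift_hz: float, wavelength_m: float) -> float:
    # for a monostatic radar the Doppler shift is f_d = 2 * v / lambda
    return doppler_shift_hz * wavelength_m / 2.0

def azimuth(phase_diff_rad: float, antenna_spacing_m: float, wavelength_m: float) -> float:
    # phase difference between two parallel receive antennas: dphi = 2*pi*d*sin(theta)/lambda
    return math.asin(phase_diff_rad * wavelength_m / (2.0 * math.pi * antenna_spacing_m))
```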
Step S302: and compensating the target at the current time point by using the historical data containing the target detected by the millimeter wave radar before the current time point, and generating radar data at the current time point.
Specifically, the targets scanned by the millimeter wave radar and their corresponding target data are cached for a predetermined time, which may be the scan cycle time of the radar. Historical data containing the target from before the current time point are acquired and combined with the target data of the current time point to generate the radar data of the current time point, so that the targets in the radar data of the current time point are more complete.
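A minimal sketch of this buffering follows, with assumed data structures and field names; it simply merges cached detections with the current ones and omits the kinematic compensation that the uniform-acceleration model of the next step would apply to the historical positions.

```python
from collections import deque

class RadarBuffer:
    def __init__(self, max_frames=3):
        # keep the detections of the last few scan cycles
        self.history = deque(maxlen=max_frames)

    def update(self, current_targets):
        """current_targets: list of dicts with assumed keys 'id', 'distance', 'speed', 'rcs'."""
        merged = list(current_targets)
        seen = {t['id'] for t in current_targets}
        for frame in self.history:
            for t in frame:
                if t['id'] not in seen:      # compensate targets missed in this cycle
                    merged.append(t)
                    seen.add(t['id'])
        self.history.append(current_targets)
        return merged                        # radar data for the current time point
```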
Step S303: and acquiring the position information of the target in the radar data at the current time point, and projecting the target to the pixel point of the image data at the current time point according to the position information to acquire target enhancement data of the radar data on the image data.
Specifically, combining radar kinematics with the relative motion of the target and the radar, the position of the target in the radar data of the current time point is calculated with a uniform-acceleration model, and the target position is projected onto the image data of the current time point. After a target is projected onto a pixel point of the image data, the target is also compensated by the pixel points around that pixel point, which increases the density of the radar data and yields the target enhancement data, enriching the features of the target at the corresponding position and improving the accuracy of target position recognition.
Specifically, the projection of the target in the radar data onto the image data can be represented by the following formula:
X_cam = P_cam * T_cam_rad * X_rad (1)
where P_cam is the projection matrix of the camera device, determined by the intrinsic and extrinsic parameters of the camera device, T_cam_rad is the extrinsic parameter matrix from the millimeter wave radar to the camera device, and X_rad is the homogeneous coordinate vector of the target.
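A short sketch of equation (1) in homogeneous coordinates follows; the matrix shapes assumed here (P_cam as 3x4, T_cam_rad as 4x4) are the conventional pinhole-camera convention and would come from the camera calibration, which the patent does not reproduce.

```python
import numpy as np

def project_radar_to_image(x_rad_xyz, P_cam, T_cam_rad):
    """x_rad_xyz: 3D target position in the radar frame; returns pixel coordinates (u, v)."""
    x_rad = np.append(np.asarray(x_rad_xyz, dtype=float), 1.0)   # homogeneous vector X_rad
    x_cam = P_cam @ T_cam_rad @ x_rad                            # X_cam = P_cam * T_cam_rad * X_rad
    u, v = x_cam[0] / x_cam[2], x_cam[1] / x_cam[2]              # perspective division
    return u, v
```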
Step S304: and carrying out normalization processing on the target data.
Specifically, the relative distance, the relative speed and the scattering cross section of the target with respect to the millimeter wave radar are normalized to output data of the same scale, so that the target data is converted into normalized data.
Further, the target data includes a relative distance, a relative speed and a scattering cross section of the target with respect to the millimeter wave radar, and after the step of normalizing the target data, the method further includes: and generating three-channel data according to the distance, the relative speed and the scattering cross section corresponding to the target.
Specifically, three channels of data, namely distance, relative speed and scattering cross section, are generated for the targets on the target enhancement data, and all of the data are normalized, so that when the data are input into the convolutional neural network the different types of data can be analysed separately and the effect of mixing the data on the analysis is avoided.
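The sketch below illustrates generating such a three-channel radar map on the image plane. The normalization constants and the single-pixel fill are assumptions; per the text above, neighbouring pixels around each projected target could also be filled to densify the map.

```python
import numpy as np

def radar_three_channel_map(targets, height, width,
                            max_dist=200.0, max_speed=50.0, max_rcs=100.0):
    """targets: iterable of (u, v, distance, speed, rcs) with pixel coordinates (u, v)."""
    chan = np.zeros((height, width, 3), dtype=np.float32)
    for u, v, dist, speed, rcs in targets:
        ui, vi = int(u), int(v)
        if 0 <= vi < height and 0 <= ui < width:
            chan[vi, ui, 0] = dist / max_dist      # normalized relative distance
            chan[vi, ui, 1] = speed / max_speed    # normalized relative speed
            chan[vi, ui, 2] = rcs / max_rcs        # normalized scattering cross section
    return chan
```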
Step S203: and cascading the radar data and the image data in multiple hierarchies of the convolutional neural network to obtain fusion data, and outputting a target class detection result and a key point detection result in the fusion data.
Specifically, referring to fig. 4, fig. 4 is a schematic flowchart illustrating an embodiment corresponding to step S203 in fig. 2, where step S203 includes:
step S401: and performing cascade connection on the radar data maps with different scales obtained by the radar data after passing through different levels of the convolutional neural network and the image data maps with the same scale obtained by the image data after passing through different levels of the neural network for multiple times to obtain a fusion characteristic map.
Specifically, after normalization the radar data and the image data have the same scale. The radar data maps of different scales output after the radar data pass through several levels of the convolutional neural network (convolution, pooling and activation) are cascaded with image data maps of the same scale produced by the convolutional neural network, and each image data map fused with a radar data map is then input into the next level. After multiple cascades, a fused feature map is obtained.
In one application, please refer to fig. 5, which is a topology diagram of the convolutional neural network. The side lengths in the diagram represent the scales of the data: equal side lengths indicate equal scales, and a change in side length indicates a scaling of the data. The radar data are convolved, pooled and activated to generate radar data maps rf1, rf2, rf3 and rf4 of different scales. The radar data map rf1 is cascaded with the image data map pf1 of the same scale to generate a first fused feature map f1; f1 is sent to the next level, which outputs the image data map pf2, and the radar data map rf2 of the same scale is cascaded with pf2 to generate a second fused feature map f2; f2 is input into the next level to produce the image data map pf3, and the radar data map rf3 of the same scale is cascaded with pf3 to generate a third fused feature map f3; f3 is input into the next level to produce the image data map pf4, and the radar data map rf4 of the same scale is cascaded with pf4 to generate the final fused feature map f4. In this application, the radar data and the image data are cascade-fused at multiple levels, and the features of the two kinds of data are fully fused to obtain the fused feature map f4.
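The following PyTorch sketch mimics the cascade pattern of fig. 5 (radar branch rf_k concatenated with image branch pf_k at each level, with the fused map feeding the next image level). It is a minimal illustration, not the patented network; the layer widths, kernel sizes and input resolution are assumptions.

```python
import torch
import torch.nn as nn

class CascadeFusionBackbone(nn.Module):
    def __init__(self, img_ch=3, radar_ch=3, widths=(16, 32, 64, 128)):
        super().__init__()
        def block(cin, cout):
            # one "level": convolution + activation + pooling, halving the spatial scale
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                                 nn.ReLU(inplace=True),
                                 nn.MaxPool2d(2))
        self.img_levels = nn.ModuleList()
        self.rad_levels = nn.ModuleList()
        cin_img, cin_rad = img_ch, radar_ch
        for w in widths:
            self.img_levels.append(block(cin_img, w))   # image branch consumes the previous fused map
            self.rad_levels.append(block(cin_rad, w))   # radar branch consumes its own features
            cin_img, cin_rad = 2 * w, w                 # fused map has 2*w channels

    def forward(self, image, radar):
        fused = None
        x_img, x_rad = image, radar
        for img_lvl, rad_lvl in zip(self.img_levels, self.rad_levels):
            x_img = img_lvl(x_img if fused is None else fused)  # pf_k
            x_rad = rad_lvl(x_rad)                              # rf_k
            fused = torch.cat([x_img, x_rad], dim=1)            # f_k = concat(pf_k, rf_k)
        return fused                                            # final fused feature map (f4)

# usage: same-scale inputs, e.g. a 3-channel image and a 3-channel radar map of 256 x 256
net = CascadeFusionBackbone()
f4 = net(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))
```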
Step S402: and outputting a target category detection result and a key point detection result according to the fusion feature map.
Specifically, the convolutional neural network identifies the target on the fused feature map, analyzes the category of the target and detects the position of the target, and then outputs a category detection result and a key point detection result of the target.
Further, in order to improve the ability of the convolutional neural network to identify targets, the network needs to be trained. The loss function in the training process includes a target class loss and a key point regression loss, and is calculated as follows:
[Equation (2), reproduced only as an image in the original publication: the training loss, combining the target class loss and the key point regression loss over the training samples.]
where i denotes the sample number and m denotes the keypoint location.
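Since equation (2) is not reproduced in the text, the following is only a hedged sketch of a loss of the kind described (class loss plus key point regression loss); the equal weighting and the smooth-L1 choice are assumptions.

```python
import torch.nn.functional as F

def detection_loss(cls_logits, cls_targets, kp_pred, kp_targets, kp_weight=1.0):
    # cls_logits: (N, num_classes); cls_targets: (N,) integer class labels
    cls_loss = F.cross_entropy(cls_logits, cls_targets)
    # kp_pred / kp_targets: (N, M, 2) predicted and ground-truth key point positions
    kp_loss = F.smooth_l1_loss(kp_pred, kp_targets)
    return cls_loss + kp_weight * kp_loss
```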
Step S204: and obtaining a key point detection result corresponding to an adjacent time point before the current time point, further obtaining a coincidence coefficient of the key point detection results corresponding to the current time point and the adjacent time point, and determining the key points with the coincidence coefficient larger than a first threshold value as the same target so as to obtain a target time sequence matching result.
Specifically, based on the key point detection results of at least one adjacent time point before the current time point, the coincidence coefficient between a target at the current time point and a target at the adjacent time point is calculated, and time-sequence targets are matched in the image corresponding to the current time point. If the coincidence coefficient of the targets is greater than the first threshold, they are matched as the same target. The coincidence coefficient is calculated as follows:
[Equation (3), reproduced only as an image in the original publication: the coincidence (overlap) coefficient computed from area_i and area_j.]
where area_i denotes the position information of the target at the current time point and area_j denotes the position information of the target at the adjacent time point.
In a specific application scenario, the camera device collects 25 frames of image data per second, i.e. one frame every 40 milliseconds. The coincidence coefficient between targets in the key point detection result of the current time point and those of the previous time point is calculated; if the coincidence coefficient is greater than 90%, the targets are judged to be the same target. Matching the same target across different time points reduces the probability that a target is counted repeatedly and improves the accuracy of target recognition over a period of time.
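Because equation (3) is not reproduced in the text, the sketch below uses one plausible form of the coincidence coefficient, the intersection-over-union of two axis-aligned boxes; treating area_i and area_j as such boxes is an assumption.

```python
def coincidence_coefficient(area_i, area_j):
    # each area is (x1, y1, x2, y2) in image coordinates
    ix1, iy1 = max(area_i[0], area_j[0]), max(area_i[1], area_j[1])
    ix2, iy2 = min(area_i[2], area_j[2]), min(area_i[3], area_j[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a_i = (area_i[2] - area_i[0]) * (area_i[3] - area_i[1])
    a_j = (area_j[2] - area_j[0]) * (area_j[3] - area_j[1])
    union = a_i + a_j - inter
    return inter / union if union > 0 else 0.0

# targets whose coefficient exceeds the first threshold (e.g. 0.9) are matched as the same target
same_target = coincidence_coefficient((10, 10, 60, 80), (12, 11, 62, 82)) > 0.9
```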
Step S205: and performing post-processing on the category detection result and the key point detection result to output the category and the motion state of the target.
Specifically, referring to fig. 6, fig. 6 is a flowchart illustrating an embodiment corresponding to step S205 in fig. 2, where step S205 specifically includes:
step S601: and acquiring the accumulated detection times of the target at the current time point according to the target time sequence matching result.
Specifically, according to the target time-sequence matching result: if the image at the current time point contains a target that does not match any historical target from before the current time point, a new sequence number is generated for this new target and the corresponding category information is stored. If a target in the image at the current time point matches a historical target, the life cycle of the successfully matched target is increased by 1 and its loss cycle is set to 0, where the life cycle is the accumulated number of times the target has been detected. If a historical target is not matched by any time-sequence target at the current time point, its loss cycle is increased by 1, and if the loss cycle exceeds a certain threshold the target is deleted from the historical targets.
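The snippet below is a minimal sketch of this bookkeeping, with assumed field names: "life" is the accumulated detection count, "lost" the consecutive unmatched frames, and "class_votes" the per-category recognition counts used later for the category decision.

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count()

@dataclass
class Track:
    track_id: int = field(default_factory=lambda: next(_next_id))
    life: int = 1                                    # accumulated detection count
    lost: int = 0                                    # consecutive unmatched frames
    class_votes: dict = field(default_factory=dict)  # category -> recognition count

def update_tracks(tracks, matches, new_classes, lost_threshold=5):
    """matches: {track_id: detected_class}; new_classes: classes of unmatched detections."""
    for t in tracks:
        if t.track_id in matches:
            t.life += 1
            t.lost = 0
            cls = matches[t.track_id]
            t.class_votes[cls] = t.class_votes.get(cls, 0) + 1
        else:
            t.lost += 1                              # historical target not matched this frame
    for cls in new_classes:                          # unmatched detections start new tracks
        t = Track()
        t.class_votes[cls] = 1
        tracks.append(t)
    # delete targets whose loss period exceeds the threshold
    tracks[:] = [t for t in tracks if t.lost <= lost_threshold]
```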
Step S602: and acquiring the accumulated identification times of the category of the target at the current time point according to the category detection result.
Specifically, from the target time-sequence matching result and the category detection results over a period of time, the accumulated detection count N of the same target is obtained, together with the category C_i recognized the largest number of times for that target and its accumulated recognition count M_i.
Step S603: and when the ratio of the accumulated identification times to the accumulated detection times of the same target reaches a second threshold, determining the target as the class reaching the second threshold.
Specifically, when M_i/N is greater than the second threshold, the target class is set to C_i. In other words, for a target whose accumulated detection count has reached N, the category C_i recognized the largest number of times among those N detections is compared against N, and the target is judged to be of class C_i only when the ratio reaches the second threshold. This reduces the influence of outliers in the category detection results on the category recognition result: the most likely value among the multiple recognition results is taken as the category of the target, which improves the accuracy of the recognition result.
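A sketch of this decision rule follows, reusing the hypothetical Track bookkeeping from the earlier snippet; the 0.9 default threshold echoes the 90% example given above.

```python
def decide_category(track, second_threshold=0.9):
    if not track.class_votes:
        return None
    best_class, m_i = max(track.class_votes.items(), key=lambda kv: kv[1])
    n = track.life                       # accumulated detection count N
    return best_class if n > 0 and m_i / n >= second_threshold else None
```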
Step S604: and obtaining the motion state of the target by using a preset filtering method according to the key point detection result.
Specifically, referring to fig. 7, fig. 7 is a flowchart illustrating an embodiment corresponding to step S604 in fig. 6, where step S604 includes:
step S701: and acquiring a world coordinate system of the camera device, and acquiring the position of the target in the world coordinate system according to the detection result of the key point.
Specifically, a world coordinate system corresponding to the image capturing device is obtained, and the position information in the detection result of the key point of the target is converted into the world coordinate system to obtain the position of the target relative to the image capturing device.
Step S702: and acquiring the motion state of the target in the world coordinate system by using an interactive multi-model filtering method.
Specifically, an interactive multi-model (IMM) filtering method is used: four possible motion models, namely constant velocity, constant acceleration, left turn and right turn, are considered, and the motion state of the target in the world coordinate system of the camera device is updated based on adaptive Kalman filtering.
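For illustration, the following is a minimal sketch of one of these motion models (constant velocity) with a plain Kalman filter step; the adaptive and interacting (model-mixing) parts of the filter described in the patent are not shown, and the noise parameters are assumptions. The state is [x, y, vx, vy] and the measurement is the key point position [x, y] projected into the world frame.

```python
import numpy as np

def cv_kalman_step(x, P, z, dt=0.04, q=1.0, r=0.5):
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)
    Q = q * np.eye(4)
    R = r * np.eye(2)
    # predict with the constant-velocity model
    x = F @ x
    P = F @ P @ F.T + Q
    # update with the measured world-frame position z
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P
```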
Furthermore, the interactive multi-model filtering method provides the prediction results and corresponding probabilities of the target under each motion model, so that targets in different states output data consistent with their actual motion states, making the method suitable for road driving environments in different scenes.
Referring to fig. 8, fig. 8 is a schematic structural diagram of an embodiment of an electronic device 80 provided in the present application, where the electronic device 80 includes a memory 801 and a processor 802 coupled to each other, where the memory 801 stores program data (not shown), and the processor 802 invokes the program data to implement the target detection method in any of the embodiments described above, and the description of relevant contents refers to the detailed description of the embodiments of the methods described above, which is not repeated herein.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an embodiment of a computer storage medium provided in the present application, the computer storage medium 90 stores program data 900, and the program data 900 is executed by a processor to implement the object detection method in any of the above embodiments, and the description of the related contents refers to the detailed description of the above method embodiments, which is not repeated herein.
It should be noted that, units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed to by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the purpose of illustrating embodiments of the present application and is not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings of the present application or are directly or indirectly applied to other related technical fields, are also included in the scope of the present application.

Claims (10)

1. A method of object detection, the method comprising:
acquiring image data captured by a camera device and preprocessing the image data; acquiring radar data detected by a millimeter wave radar and preprocessing the radar data, so that the radar data and the image data are each input into a convolutional neural network at the same scale;
cascading the radar data and the image data in multiple levels of the convolutional neural network to obtain fusion data, and outputting a category detection result and a key point detection result of a target in the fusion data;
and performing post-processing on the category detection result and the key point detection result to output the category and the motion state of the target.
2. The method of claim 1, wherein the step of obtaining and pre-processing radar data for millimeter wave radar detection comprises:
obtaining target data including the target detected by the millimeter wave radar at a current time point;
compensating the target at the current time point by using historical data which is detected by the millimeter wave radar before the current time point and contains the target, and generating radar data at the current time point;
obtaining position information of the target in the radar data at the current time point, and projecting the target to a pixel point of the image data at the current time point according to the position information to obtain target enhancement data of the radar data on the image data;
and carrying out normalization processing on the target data.
3. The method of claim 2, wherein the target data includes a relative distance, a relative velocity, and a scattering cross-section of the target relative to the millimeter wave radar, and wherein the step of normalizing the target data further comprises:
and generating three-channel data according to the distance, the relative speed and the scattering cross section corresponding to the target.
4. The method of claim 1, wherein the step of obtaining and pre-processing image data captured by an imaging device comprises:
adjusting image processing parameters of the camera device;
obtaining the adjusted image data shot by the camera device;
and carrying out normalization processing on the image data.
5. The method of claim 1, wherein the step of concatenating the radar data and the image data in multiple levels of the convolutional neural network to obtain fused data, and outputting a class detection result and a keypoint detection result of a target in the fused data comprises:
cascading the radar data maps with different scales obtained after the radar data passes through different levels of the convolutional neural network and the image data maps with the same scale obtained after the image data passes through different levels of the neural network for multiple times to obtain a fusion characteristic map;
and outputting the category detection result and the key point detection result of the target according to the fusion feature map.
6. The method of claim 1, wherein the step of concatenating the radar data and the image data in multiple levels of the convolutional neural network to obtain fused data, and outputting a class detection result and a keypoint detection result of a target in the fused data further comprises:
and acquiring the key point detection result corresponding to the adjacent time point before the current time point, further acquiring the coincidence coefficient of the key point detection results corresponding to the current time point and the adjacent time point, and determining the key points with the coincidence coefficient larger than a first threshold value as the same target so as to acquire a target time sequence matching result.
7. The method according to claim 6, wherein the step of post-processing the class detection result and the keypoint detection result to output the class and the motion state of the target comprises:
acquiring the accumulated detection times of the target at the current time point according to the target time sequence matching result;
acquiring the accumulated identification times of the category of the target at the current time point according to the category detection result;
when the ratio of the accumulated identification times to the accumulated detection times of the same target reaches a second threshold, determining the target as a category reaching the second threshold;
and obtaining the motion state of the target by using a preset filtering method according to the key point detection result.
8. The method according to claim 7, wherein the step of obtaining the motion state of the target by using a preset filtering method according to the key point detection result comprises:
obtaining a world coordinate system of the camera device, and obtaining the position of the target in the world coordinate system according to the detection result of the key point;
and acquiring the motion state of the target in the world coordinate system by using an interactive multi-model filtering method.
9. An electronic device, comprising: a memory and a processor coupled to each other, wherein the memory stores program data that the processor calls to perform the method of any of claims 1-8.
10. A computer storage medium having program data stored thereon, which program data, when executed by a processor, implements the method according to any one of claims 1-8.
CN202011334487.9A 2020-11-24 2020-11-24 Target detection method, electronic device and computer storage medium Pending CN112528763A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011334487.9A CN112528763A (en) 2020-11-24 2020-11-24 Target detection method, electronic device and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011334487.9A CN112528763A (en) 2020-11-24 2020-11-24 Target detection method, electronic device and computer storage medium

Publications (1)

Publication Number Publication Date
CN112528763A true CN112528763A (en) 2021-03-19

Family

ID=74993216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011334487.9A Pending CN112528763A (en) 2020-11-24 2020-11-24 Target detection method, electronic device and computer storage medium

Country Status (1)

Country Link
CN (1) CN112528763A (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102084408A (en) * 2008-09-05 2011-06-01 丰田自动车株式会社 Object detecting device
US20110050482A1 (en) * 2008-09-05 2011-03-03 Toyota Jidosha Kabushiki Kaisha Object detecting device
CN107330920A (en) * 2017-06-28 2017-11-07 华中科技大学 A kind of monitor video multi-target tracking method based on deep learning
CN108663677A (en) * 2018-03-29 2018-10-16 上海智瞳通科技有限公司 A kind of method that multisensor depth integration improves target detection capabilities
CN111257866A (en) * 2018-11-30 2020-06-09 杭州海康威视数字技术股份有限公司 Target detection method, device and system for linkage of vehicle-mounted camera and vehicle-mounted radar
CN109655826A (en) * 2018-12-16 2019-04-19 成都汇蓉国科微系统技术有限公司 The low slow Small object track filtering method of one kind and device
CN109738884A (en) * 2018-12-29 2019-05-10 百度在线网络技术(北京)有限公司 Method for checking object, device and computer equipment
CN111382768A (en) * 2018-12-29 2020-07-07 华为技术有限公司 Multi-sensor data fusion method and device
CN111027401A (en) * 2019-11-15 2020-04-17 电子科技大学 End-to-end target detection method with integration of camera and laser radar
CN111160248A (en) * 2019-12-30 2020-05-15 北京每日优鲜电子商务有限公司 Method and device for tracking articles, computer equipment and storage medium
CN111366919A (en) * 2020-03-24 2020-07-03 南京矽典微系统有限公司 Target detection method and device based on millimeter wave radar, electronic equipment and storage medium
CN111462237A (en) * 2020-04-03 2020-07-28 清华大学 Target distance detection method for constructing four-channel virtual image by using multi-source information
CN111652097A (en) * 2020-05-25 2020-09-11 南京莱斯电子设备有限公司 Image millimeter wave radar fusion target detection method
CN111856441A (en) * 2020-06-09 2020-10-30 北京航空航天大学 Train positioning method based on fusion of vision and millimeter wave radar
CN111860589A (en) * 2020-06-12 2020-10-30 中山大学 Multi-sensor multi-target cooperative detection information fusion method and system
CN111967498A (en) * 2020-07-20 2020-11-20 重庆大学 Night target detection and tracking method based on millimeter wave radar and vision fusion

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113111808A (en) * 2021-04-20 2021-07-13 山东大学 Abnormal behavior detection method and system based on machine vision
CN113111808B (en) * 2021-04-20 2022-03-29 山东大学 Abnormal behavior detection method and system based on machine vision
CN115014366A (en) * 2022-05-31 2022-09-06 中国第一汽车股份有限公司 Target fusion method and device, vehicle and storage medium

Similar Documents

Publication Publication Date Title
CN109753903B (en) Unmanned aerial vehicle detection method based on deep learning
Wang et al. RODNet: A real-time radar object detection network cross-supervised by camera-radar fused object 3D localization
CN110879994A (en) Three-dimensional visual inspection detection method, system and device based on shape attention mechanism
CN112367474B (en) Self-adaptive light field imaging method, device and equipment
CN110298281B (en) Video structuring method and device, electronic equipment and storage medium
CN114677554A (en) Statistical filtering infrared small target detection tracking method based on YOLOv5 and Deepsort
CN112528763A (en) Target detection method, electronic device and computer storage medium
CN111738071B (en) Inverse perspective transformation method based on motion change of monocular camera
CN112507849A (en) Dynamic-to-static scene conversion method for generating countermeasure network based on conditions
CN113297959A (en) Target tracking method and system based on corner attention twin network
CN116222577A (en) Closed loop detection method, training method, system, electronic equipment and storage medium
CN115546705A (en) Target identification method, terminal device and storage medium
CN115995042A (en) Video SAR moving target detection method and device
CN113935379B (en) Human body activity segmentation method and system based on millimeter wave radar signals
CN114169425A (en) Training target tracking model and target tracking method and device
CN113191427A (en) Multi-target vehicle tracking method and related device
CN113850783B (en) Sea surface ship detection method and system
CN115586506A (en) Anti-interference target classification method and device
WO2018143277A1 (en) Image feature value output device, image recognition device, image feature value output program, and image recognition program
CN113269808B (en) Video small target tracking method and device
CN112487984B (en) Point cloud data lightweight rapid generation method
CN114820705A (en) Method, apparatus, device and medium for tracking moving object
CN110766005B (en) Target feature extraction method and device and terminal equipment
Zhu et al. Robust target detection of intelligent integrated optical camera and mmWave radar system
CN115830079B (en) Traffic participant trajectory tracking method, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310056 Room 301, building 3, No. 2930, South Ring Road, Puyan street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Zhejiang huaruijie Technology Co.,Ltd.

Address before: 310056 Room 301, building 3, No. 2930, South Ring Road, Puyan street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: Zhejiang Dahua Automobile Technology Co.,Ltd.