CN113297881A - Target detection method and related device - Google Patents
Target detection method and related device
- Publication number
- CN113297881A (application CN202010113268.1A)
- Authority
- CN
- China
- Prior art keywords
- detection frame
- target image
- point
- distance
- area
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06V20/584—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Bioinformatics & Computational Biology (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
- Traffic Control Systems (AREA)
Abstract
The embodiment of the application provides a target detection method applied to the field of automatic driving, comprising the following steps: acquiring a target image; acquiring a first area and a second area included in the target image, where a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles; if the first area and the second area have a common boundary line, determining a first detection frame corresponding to the first object and a second detection frame corresponding to the second object, where the first detection frame includes the first area and the second detection frame includes the second area; and outputting the first detection frame and the second detection frame. The detection accuracy of the detection frames is thereby improved when objects in the target image are occluded.
Description
Technical Field
The present application relates to the field of automatic driving, and in particular, to a target detection method and related apparatus.
Background
With the number of automobiles on the road increasing year by year and vehicle intelligence developing continuously, safe-driving technology has become a hotly contested research topic in the automotive field. During automatic driving or unmanned driving, obstacle objects around the vehicle are detected so that the vehicle's automatic driving planning and control module can make corresponding behavior decisions (such as path planning and obstacle avoidance) according to the obstacle information.
The vehicle can detect obstacle objects in video frame images acquired by a sensor and determine the position of the detection frame corresponding to each surrounding obstacle object in the image, where a detection frame is a region that encloses an obstacle object and represents its position. When obstacle objects in the target image occlude each other, the occluded target may not be detected, or the occluding object and the occluded target may be treated as the same obstacle object, so that the positions of the two obstacle objects are represented by a single detection frame.
In the prior art, information of the current frame image is predicted from information of historical frame images of the target by a tracking algorithm, so as to avoid failing to detect an object while it is occluded. Taking a video containing a first object and a second object as an example: in the first frames of the video, the first object and the second object are staggered and no occlusion occurs; in some consecutive later frames, the first object occludes the second object, and the target image contains only part of the second object. However, when the motion state of the second object changes abruptly (for example, it suddenly accelerates or decelerates), the detection frame position of the second object predicted by the host vehicle is inaccurate.
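For context, the tracking-based prediction described above can be pictured as a simple constant-velocity extrapolation of the most recent detection frames; the sketch below is illustrative only (function names and values are assumptions, not taken from the patent) and shows why an abrupt change of motion makes the predicted frame inaccurate.

```python
def predict_box_constant_velocity(prev_boxes):
    """Predict the current-frame detection box (x, y, w, h) from the two most
    recent historical boxes by assuming the object keeps its previous velocity.
    If the object suddenly accelerates or decelerates, the prediction drifts
    away from the true position, which is the limitation noted above."""
    (x1, y1, w1, h1), (x2, y2, w2, h2) = prev_boxes[-2], prev_boxes[-1]
    dx, dy = x2 - x1, y2 - y1          # apparent velocity between frames
    return (x2 + dx, y2 + dy, w2, h2)  # extrapolate one frame ahead

# Example: the occluded object was moving steadily, then brakes hard.
history = [(100, 50, 40, 30), (110, 50, 40, 30)]
print(predict_box_constant_velocity(history))  # (120, 50, 40, 30) even if it stopped
```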
Disclosure of Invention
The embodiment of the application provides a target detection method and a related device, which are used for improving the detection accuracy of the detection frame of an occluded object.
In a first aspect, an embodiment of the present application provides a target detection method, including:
acquiring a target image; acquiring a first area and a second area which are included in the target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles; if the first area and the second area have a common boundary line, determining a first detection frame corresponding to the first object and determining a second detection frame corresponding to the second object, wherein the first detection frame comprises the first area, and the second detection frame comprises the second area; and outputting the first detection frame and the second detection frame.
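To make the listed steps concrete, the following Python sketch acquires per-object regions from an assumed instance-segmentation routine, checks whether the two regions share a common boundary, and outputs one detection frame per region. The function names, the NumPy-based helpers, and the assumption that segmentation yields exactly two boolean masks are illustrative only and are not part of the claimed method.

```python
import numpy as np

def circumscribed_rect(mask):
    """Circumscribed rectangle (x_min, y_min, x_max, y_max) of a boolean region mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def share_boundary(mask_a, mask_b):
    """True if the two regions touch, i.e. have a common boundary line."""
    grown = np.zeros_like(mask_a)          # 4-neighbour dilation of mask_a
    grown[1:, :] |= mask_a[:-1, :]
    grown[:-1, :] |= mask_a[1:, :]
    grown[:, 1:] |= mask_a[:, :-1]
    grown[:, :-1] |= mask_a[:, 1:]
    return bool(np.logical_and(grown, mask_b).any())

def detect(target_image, segment_fn):
    """segment_fn is an assumed instance-segmentation routine returning two
    boolean masks: the first area (first object) and the second area (second object)."""
    first_area, second_area = segment_fn(target_image)
    if not share_boundary(first_area, second_area):
        return None  # no mutual occlusion; outside the scope of this sketch
    # The regions share a common boundary line (mutual occlusion): output one
    # detection frame per region, each frame enclosing its own area.
    return circumscribed_rect(first_area), circumscribed_rect(second_area)
```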
In the embodiment of the application, when the first object and the second object occlude each other, the positions of the detection frames can still be determined from the first area and the second area corresponding to the first object and the second object, which avoids the problem that the detection accuracy of the detection frames is low when occlusion occurs.
Optionally, in an optional design of the first aspect, the first detection frame is a circumscribed rectangle frame of the first region, and the second detection frame is a circumscribed rectangle frame of the second region.
In this embodiment, the circumscribed rectangle frame of the recognized first region can be used directly as the detection frame of the first object, and the circumscribed rectangle frame of the recognized second region as the detection frame of the second object, so that computational cost is reduced while missed detections are avoided.
Optionally, in an optional design of the first aspect, end points of a common boundary line between the first region and the second region are a first boundary point and a second boundary point, a connecting line between the first boundary point and the second boundary point is a target boundary line, and determining the first detection frame corresponding to the first object includes: determining a first initial detection box, the first initial detection box corresponding to the first object and/or the second object; determining a first central point according to the first initial detection frame and the target boundary line, wherein the distance between the longitudinal coordinate of the first central point in the target image and the longitudinal coordinate of the midpoint of the target boundary line in the target image is within a preset range, and the distance between the transverse coordinate of the first central point in the target image and the transverse coordinate of the central point of the first initial detection frame in the target image is within a preset range; and determining a first detection frame corresponding to the first object according to the first central point and the first area, wherein the distance between the central point of the first detection frame and the first central point is within a preset range, and the first detection frame comprises the first area.
In this embodiment, when the initial detection frame is inaccurate, the detection frame of the first object is obtained by dividing the initial detection frame along the boundary between the first area and the second area, which further improves the detection accuracy of the detection frame while avoiding missed detections.
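A minimal sketch of this center-point construction follows, assuming image coordinates with x as the transverse coordinate and y as the longitudinal coordinate; taking the boundary midpoint and the box center exactly (zero tolerance) is a simplification, since the design above only requires the distances to be within preset ranges, and all names are illustrative assumptions.

```python
def first_center_point(initial_box, boundary_p1, boundary_p2):
    """initial_box = (x_min, y_min, x_max, y_max); boundary_p1/p2 are the two
    end points of the common boundary line between the first and second areas.
    The first central point takes its y from the boundary-line midpoint and
    its x from the center of the first initial detection frame."""
    x_min, y_min, x_max, y_max = initial_box
    box_center_x = (x_min + x_max) / 2.0
    boundary_mid_y = (boundary_p1[1] + boundary_p2[1]) / 2.0
    return box_center_x, boundary_mid_y

def frame_around(center, area_rect):
    """Smallest frame centered on the given point that still contains the
    area's circumscribed rectangle (x_min, y_min, x_max, y_max)."""
    cx, cy = center
    x_min, y_min, x_max, y_max = area_rect
    half_w = max(cx - x_min, x_max - cx)
    half_h = max(cy - y_min, y_max - cy)
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h
```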
Optionally, in an optional design of the first aspect, the determining a second detection frame corresponding to the second object includes: determining a second central point according to the first central point and the target boundary line, wherein the distance between the second central point and a symmetrical point of the first central point relative to the target boundary line is smaller than a preset value; and determining a second detection frame corresponding to the second object according to the second center point and the second area, wherein the distance between the center point of the second detection frame and the second center point is within a preset range, and the second detection frame comprises the second area.
In this embodiment, when the initial detection frame is inaccurate, the detection frame of the second object is obtained by dividing the initial detection frame along the boundary between the first region and the second region, which further improves the detection accuracy of the detection frame while avoiding missed detections.
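The symmetric point used here is an ordinary reflection of the first central point across the target boundary line through the two boundary points; the sketch below computes it with basic vector algebra (function and variable names are assumptions for illustration).

```python
def reflect_across_line(point, line_p1, line_p2):
    """Reflect `point` across the target boundary line defined by the two
    boundary points line_p1 and line_p2 (all 2-D image coordinates)."""
    px, py = point
    x1, y1 = line_p1
    x2, y2 = line_p2
    dx, dy = x2 - x1, y2 - y1
    # Project the point onto the line, then mirror it through the projection.
    t = ((px - x1) * dx + (py - y1) * dy) / float(dx * dx + dy * dy)
    foot_x, foot_y = x1 + t * dx, y1 + t * dy
    return 2 * foot_x - px, 2 * foot_y - py

# Example: the second central point is (approximately) this reflection of the
# first central point, and the second detection frame is then built around it
# so that it still contains the second area.
second_center = reflect_across_line((120.0, 80.0), (100.0, 60.0), (100.0, 110.0))
print(second_center)  # (80.0, 80.0): mirrored across the vertical line x = 100
```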
Optionally, in an optional design of the first aspect, when the target image is captured, a distance between a capture point of the target image and the first object is smaller than a distance between the capture point and the second object. In this embodiment, the first object is an occluding vehicle, and the second object is an occluded vehicle.
Optionally, in an optional design of the first aspect, when the target image is captured, a distance between a capture point of the target image and the first object is greater than a preset distance, and a distance between the capture point of the target image and the second object is greater than the preset distance.
In this embodiment, when the first object and the second object are far from the host vehicle, they appear small in the target image; the detection frames therefore need to be determined from the image segmentation result, which has a higher computational cost but ensures the detection accuracy of the detection frames.
Optionally, in an optional design of the first aspect, the method further includes: outputting occlusion information indicating that the first object is an occluding object and the second object is an occluded object.
In this embodiment, after the determination of the detection frame is completed, an occlusion condition between the first object and the second object may be further output.
Optionally, in an optional design of the first aspect, the method further includes: outputting object association information indicating an association relationship between the first detection frame and the first object and an association relationship between the second detection frame and the second object.
In this embodiment, after the determination of the detection frame is completed, the first detection frame, the second detection frame, and the association relationship between the detection frame and the object may be output.
Optionally, in an optional design of the first aspect, the target image further includes: a first shadow area and a second shadow area, the method further comprising: acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles; determining a second initial detection box, the second initial detection box corresponding to the third object and the fourth object; if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object; and outputting the third detection frame and the fourth detection frame, wherein the preset direction is consistent with the driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with the direction of the lane line in which the third vehicle is currently driving.
In this embodiment, the detection frames of the third object and the fourth object can be obtained from the vehicle-bottom shadow lines identified in the target image by dividing the initial detection frame along those shadow lines, which further improves the detection accuracy of the detection frames while avoiding missed detections.
Optionally, in an optional design of the first aspect, the determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object includes: dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point; and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
In this embodiment, the detection frame is divided by the longitudinal straight line passing through the intersection point to obtain the detection frame of the third object and the detection frame of the fourth object, which further improves the detection accuracy of the detection frames while avoiding missed detections.
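A sketch of this division step is given below, assuming, purely for illustration, that the preset direction maps to the image x axis, so that both the longitudinal dividing line and the target line through the intersection point are vertical lines in the image; the helper names and the choice to realize the adjustment as a pure translation (rather than a size adjustment) are assumptions.

```python
def split_by_vertical_line(initial_box, x_split):
    """Divide the second initial detection frame (x_min, y_min, x_max, y_max)
    by the longitudinal (vertical) straight line x = x_split that passes
    through the intersection point of the two shadow edge lines."""
    x_min, y_min, x_max, y_max = initial_box
    third_box = (x_min, y_min, x_split, y_max)       # third detection frame
    second_sub_box = (x_split, y_min, x_max, y_max)  # remainder of the frame
    return third_box, second_sub_box

def adjust_fourth_box(second_sub_box, third_box, intersection):
    """Translate the second sub-frame so that its center approaches the mirror
    of the third central point across the target line through the intersection
    point (with the preset direction assumed along x, the target line is the
    vertical line x = intersection[0])."""
    tx = (third_box[0] + third_box[2]) / 2.0
    ty = (third_box[1] + third_box[3]) / 2.0
    mirror_x, mirror_y = 2 * intersection[0] - tx, ty   # symmetric point
    x_min, y_min, x_max, y_max = second_sub_box
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    dx, dy = mirror_x - cx, mirror_y - cy
    return (x_min + dx, y_min + dy, x_max + dx, y_max + dy)  # fourth frame
```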
Optionally, in an optional design of the first aspect, when the target image is captured, a distance between a capture point of the target image and the third object is smaller than a distance between the capture point of the target image and the fourth object.
In this embodiment, the third object is an occluding vehicle, and the fourth object is an occluded vehicle.
Optionally, in an optional design of the first aspect, when the target image is captured, a distance between the capture point of the target image and the third object is less than a preset distance, and a distance between the capture point of the target image and the fourth object is less than a preset distance.
In this embodiment, when the third object and the fourth object are within a certain distance of the host vehicle, they appear large in the target image; the detection frames can therefore be determined from the vehicle-bottom shadow line detection result, which has a lower computational cost, so computational cost is reduced while the detection accuracy of the detection frames is maintained.
Optionally, in an optional design of the first aspect, the preset distance is related to a driving state of a host vehicle, the host vehicle being the vehicle carrying the sensor that acquires the target image, and the driving state includes at least one of: the driving direction, and the uphill/downhill state of the road surface on which the vehicle travels, wherein the driving direction includes turning left, turning right, or driving straight, and the method further includes: when the vehicle turns left or right, determining the preset distance to be a first distance value; when the vehicle drives straight, determining the preset distance to be a second distance value, wherein the first distance value is smaller than the second distance value. The uphill/downhill states include an uphill state, a downhill state, and a flat state, and the method further includes: when the vehicle is in an uphill state, determining the preset distance to be a third distance value; when the vehicle is in a downhill state, determining the preset distance to be a fourth distance value; and when the vehicle is in a flat state, determining the preset distance to be a fifth distance value, wherein the third distance value is smaller than or equal to the fifth distance value, and the fifth distance value is smaller than the fourth distance value.
In this embodiment, the preset distance can be dynamically adjusted based on the driving state of the vehicle, so that the choice between the long-range detection algorithm and the short-range detection algorithm is more reasonable; that is, computational cost is reduced while the detection accuracy of the detection frames is preserved as far as possible.
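The dynamic adjustment can be expressed as a small lookup over the driving state, followed by a dispatch between the two detection methods; the concrete distance values below are placeholder assumptions, only the ordering constraints stated above (first < second, third <= fifth < fourth) are respected, and how the two criteria combine is an assumption of this sketch.

```python
def preset_distance(direction, slope, values=None):
    """Return the preset distance (placeholder values, in meters) used to
    decide between the segmentation-based far-range method and the
    shadow-line-based near-range method.
    direction: 'left', 'right' or 'straight'; slope: 'uphill', 'downhill' or 'flat'."""
    v = values or {
        "first": 20.0,   # turning left or right
        "second": 40.0,  # driving straight (first < second)
        "third": 25.0,   # uphill
        "fourth": 45.0,  # downhill
        "fifth": 30.0,   # flat road (third <= fifth < fourth)
    }
    by_direction = v["first"] if direction in ("left", "right") else v["second"]
    by_slope = {"uphill": v["third"], "downhill": v["fourth"], "flat": v["fifth"]}[slope]
    # Combining the two criteria is unspecified in the text; taking the minimum
    # (the more conservative threshold) is an assumption for this sketch.
    return min(by_direction, by_slope)

def choose_method(distance_to_object, threshold):
    """Near objects use the cheaper shadow-line division; far objects use the
    more accurate segmentation-based division."""
    return "shadow_line" if distance_to_object < threshold else "segmentation"
```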
Optionally, in an optional design of the first aspect, the method further includes: outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
In this embodiment, after the determination of the detection frame is completed, an occlusion condition between the third object and the fourth object may be further output.
Optionally, in an optional design of the first aspect, the method further includes: outputting object association information indicating an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object.
In this embodiment, after the determination of the detection frame is completed, the third detection frame, the fourth detection frame, and the association relationship between the detection frame and the object may be output.
In a second aspect, the present application provides an object detection apparatus, comprising:
the acquisition module is used for acquiring a target image; acquiring a first area and a second area which are included in the target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles;
a determining module, configured to determine a first detection frame corresponding to the first object and a second detection frame corresponding to the second object if a common boundary line exists between the first area and the second area, where the first detection frame includes the first area and the second detection frame includes the second area;
and the output module is used for outputting the first detection frame and the second detection frame.
Optionally, in an optional design of the second aspect, the first detection frame is a circumscribed rectangle frame of the first region, and the second detection frame is a circumscribed rectangle frame of the second region.
Optionally, in an optional design of the second aspect, end points of a common boundary line between the first region and the second region are a first boundary point and a second boundary point, and a connecting line between the first boundary point and the second boundary point is a target boundary line, where the determining module is specifically configured to:
determining a first initial detection box, the first initial detection box corresponding to the first object and/or the second object;
determining a first central point according to the first initial detection frame and the target boundary line, wherein the distance between the longitudinal coordinate of the first central point in the target image and the longitudinal coordinate of the midpoint of the target boundary line in the target image is within a preset range, and the distance between the transverse coordinate of the first central point in the target image and the transverse coordinate of the central point of the first initial detection frame in the target image is within a preset range;
and determining a first detection frame corresponding to the first object according to the first central point and the first area, wherein the distance between the central point of the first detection frame and the first central point is within a preset range, and the first detection frame comprises the first area.
Optionally, in an optional design of the second aspect, the determining module is specifically configured to:
determining a second central point according to the first central point and the target boundary line, wherein the distance between the second central point and a symmetrical point of the first central point relative to the target boundary line is smaller than a preset value;
and determining a second detection frame corresponding to the second object according to the second center point and the second area, wherein the distance between the center point of the second detection frame and the second center point is within a preset range, and the second detection frame comprises the second area.
Optionally, in an optional design of the second aspect, when the target image is captured, a distance between a capture point of the target image and the first object is smaller than a distance between the capture point and the second object.
Optionally, in an optional design of the second aspect, when the target image is captured, a distance between a capture point of the target image and the first object is greater than a preset distance, and a distance between the capture point of the target image and the second object is greater than the preset distance.
Optionally, in an optional design of the second aspect, the output module is further configured to:
outputting occlusion information indicating that the first object is an occluding object and the second object is an occluded object.
Optionally, in an optional design of the second aspect, the output module is further configured to:
outputting object association information indicating an association relationship between the first detection frame and the first object and an association relationship between the second detection frame and the second object.
Optionally, in an optional design of the second aspect, the target image further includes a first shadow area and a second shadow area, and the obtaining module is further configured to:
acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles;
the determining module is further configured to:
determining a second initial detection box, the second initial detection box corresponding to the third object and the fourth object;
if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object;
the output module is further configured to:
and outputting the third detection frame and the fourth detection frame.
Optionally, in an optional design of the second aspect, the preset direction is consistent with a driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with a lane line direction of a current driving of the third vehicle.
Optionally, in an optional design of the second aspect, the determining module is specifically configured to:
dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point;
and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
Optionally, in an optional design of the second aspect, when the target image is captured, a distance between a capture point of the target image and the third object is smaller than a distance between the capture point and the fourth object.
Optionally, in an optional design of the second aspect, when the target image is captured, a distance between the capture point of the target image and the third object is less than a preset distance, and a distance between the capture point of the target image and the fourth object is less than a preset distance.
Optionally, in an optional design of the second aspect, the preset distance is related to a driving state of a host vehicle, the host vehicle is a vehicle where a sensor for acquiring the target image is located, and the driving state at least includes one of:
the direction of travel, and the state of the uphill or downhill slope of the road surface on which the vehicle is traveling.
Optionally, in an optional design of the second aspect, the driving direction includes a left turn, a right turn, or a straight line, and the determining module is further configured to:
when the vehicle turns left or right, determining the preset distance as a first distance value;
and when the vehicle is in a straight line, determining the preset distance as a second distance value, wherein the first distance value is smaller than the second distance value.
Optionally, in an optional design of the second aspect, the uphill and downhill states include an uphill state, a downhill state, and a flat state;
the determining module is further configured to:
when the vehicle is in an uphill state, determining the preset distance as a third distance value;
when the vehicle is in a downhill state, determining the preset distance as a fourth distance value;
and when the vehicle is in a flat slope state, determining that the preset distance is a fifth distance value, wherein the third distance value is smaller than or equal to the fifth distance value, and the fifth distance value is smaller than the fourth distance value.
Optionally, in an optional design of the second aspect, the output module is further configured to:
outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
Optionally, in an optional design of the second aspect, the output module is further configured to:
outputting object association information indicating an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object.
In a third aspect, the present application provides a target detection method, including:
acquiring a target image, wherein the target image comprises a first shadow area and a second shadow area;
acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles;
determining a second initial detection box, the second initial detection box corresponding to the third object and the fourth object;
if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object;
and outputting the third detection frame and the fourth detection frame.
Optionally, in an optional design of the third aspect, the preset direction is consistent with a driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with a lane line direction in which the third vehicle is currently driving.
Optionally, in an optional design of the third aspect, the determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object includes:
dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point;
and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
Optionally, in an optional design of the third aspect, when the target image is captured, a distance between a capture point of the target image and the third object is smaller than a distance between the capture point and the fourth object.
Optionally, in an optional design of the third aspect, when the target image is captured, a distance between the capture point of the target image and the third object is less than a preset distance, and a distance between the capture point of the target image and the fourth object is less than a preset distance.
Optionally, in an optional design of the third aspect, the preset distance is related to a driving state of a host vehicle, the host vehicle is a vehicle where a sensor for acquiring the target image is located, and the driving state at least includes one of:
the direction of travel, and the state of the uphill or downhill slope of the road surface on which the vehicle is traveling.
Optionally, in an optional design of the third aspect, the driving direction includes a left turn, a right turn, or a straight line, and the method further includes:
when the vehicle turns left or right, determining the preset distance as a first distance value;
and when the vehicle is in a straight line, determining the preset distance as a second distance value, wherein the first distance value is smaller than the second distance value.
Optionally, in an optional design of the third aspect, the uphill and downhill states include an uphill state, a downhill state, and a flat state;
the method further comprises the following steps:
when the vehicle is in an uphill state, determining the preset distance as a third distance value;
when the vehicle is in a downhill state, determining the preset distance as a fourth distance value;
and when the vehicle is in a flat slope state, determining that the preset distance is a fifth distance value, wherein the third distance value is smaller than or equal to the fifth distance value, and the fifth distance value is smaller than the fourth distance value.
Optionally, in an optional design of the third aspect, the method further includes:
outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
Optionally, in an optional design of the third aspect, the method further includes:
outputting object association information indicating an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object.
In a fourth aspect, the present application provides an object detection apparatus, comprising:
the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a target image, and the target image comprises a first shadow area and a second shadow area;
acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles;
a determining module to determine a second initial detection box, the second initial detection box corresponding to the third object and the fourth object;
if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object;
and the output module is used for outputting the third detection frame and the fourth detection frame.
Optionally, in an optional design of the fourth aspect, the preset direction is consistent with a driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with a lane line direction of a current driving of the third vehicle.
Optionally, in an optional design of the fourth aspect, the determining module is specifically configured to:
dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point;
and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
Optionally, in an optional design of the fourth aspect, when the target image is captured, a distance between a capture point of the target image and the third object is smaller than a distance between the capture point and the fourth object.
Optionally, in an optional design of the fourth aspect, when the target image is captured, a distance between the capture point of the target image and the third object is less than a preset distance, and a distance between the capture point of the target image and the fourth object is less than a preset distance.
Optionally, in an optional design of the fourth aspect, the preset distance is related to a driving state of a host vehicle, the host vehicle is a vehicle where a sensor for acquiring the target image is located, and the driving state at least includes one of:
the direction of travel, and the state of the uphill or downhill slope of the road surface on which the vehicle is traveling.
Optionally, in an optional design of the fourth aspect, the driving direction includes a left turn, a right turn, or a straight line, and the determining module is further configured to:
when the vehicle turns left or right, determining the preset distance as a first distance value;
and when the vehicle is in a straight line, determining the preset distance as a second distance value, wherein the first distance value is smaller than the second distance value.
Optionally, in an optional design of the fourth aspect, the uphill and downhill states include an uphill state, a downhill state, and a flat state;
the determining module is further configured to:
when the vehicle is in an uphill state, determining the preset distance as a third distance value;
when the vehicle is in a downhill state, determining the preset distance as a fourth distance value;
and when the vehicle is in a flat slope state, determining that the preset distance is a fifth distance value, wherein the third distance value is smaller than or equal to the fifth distance value, and the fifth distance value is smaller than the fourth distance value.
Optionally, in an optional design of the fourth aspect, the output module is further configured to:
outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
Optionally, in an optional design of the fourth aspect, the output module is further configured to:
outputting object association information indicating an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object.
In a fifth aspect, the present application provides an object detection apparatus, comprising: one or more processors; one or more memories; a plurality of application programs; and one or more programs, wherein the one or more programs are stored in the memory, which when executed by the processor, cause the object detection apparatus to perform the steps of any of the possible implementations of the first or third aspects and any of the above.
In a sixth aspect, the present application provides a vehicle comprising the object detection apparatus of any one of the second aspect or the second aspect and any possible implementation manner of the fourth aspect or the fourth aspect and any possible implementation manner of the fourth aspect.
In a seventh aspect, the present application provides an apparatus, which is included in an object detection apparatus, and which has a function of implementing the behavior of the terminal device in any one of the first aspect, the third aspect and possible implementation manners of any one of the above aspects. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above-described functions.
In an eighth aspect, the present application provides a computer storage medium comprising computer instructions that, when executed on an electronic device or a server, cause the electronic device to perform the possible implementations of the first aspect or the third aspect and any one of the above aspects.
In a ninth aspect, the present application provides a computer program product, which, when run on an electronic device or a server, causes the electronic device to perform the possible implementations of the first or third aspect and any one of the above aspects.
In a tenth aspect, the present application provides a chip system, which includes a processor configured to support an executing device or a training device in implementing the functions recited in the above aspects, for example, transmitting or processing the data and/or information recited in the above methods. In one possible design, the chip system further includes a memory for storing the program instructions and data necessary for the executing device or the training device. The chip system may be formed by a chip, or may include a chip and other discrete devices.
The embodiment of the application provides a target detection method, which comprises the following steps: acquiring a target image; acquiring a first area and a second area which are included in the target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles; if the first area and the second area have a common boundary line, determining a first detection frame corresponding to the first object and determining a second detection frame corresponding to the second object, wherein the first detection frame comprises the first area, and the second detection frame comprises the second area; and outputting the first detection frame and the second detection frame. By the mode, after the first area and the second area corresponding to the first object and the second object are identified, the position of the detection frame can be directly detected according to the first area and the second area, and the detection precision is high.
Drawings
Fig. 1 is a functional block diagram of an automatic driving apparatus having an automatic driving function according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of an automatic driving system according to an embodiment of the present application;
fig. 3 is a schematic diagram of an embodiment of a target detection method provided in an embodiment of the present application;
FIG. 4a is a schematic diagram of ROI extraction provided in an embodiment of the present application;
fig. 4b is a schematic diagram of an image segmentation method provided in an embodiment of the present application;
fig. 5 is a schematic diagram of a first region and a second region acquired in a target image in an embodiment of the present application;
fig. 6a is a schematic diagram of detection of a detection frame of an obstacle object provided in an embodiment of the present application;
fig. 6b is a schematic diagram of detection of a detection frame of an obstacle object provided in an embodiment of the present application;
fig. 7a is a schematic diagram of an image recognition process provided in an embodiment of the present application;
fig. 7b is a schematic diagram of an image recognition process provided in an embodiment of the present application;
fig. 8a, 8b and 8c are schematic diagrams of object detection provided by an embodiment of the present application;
fig. 8d is a schematic diagram of object detection provided in an embodiment of the present application;
fig. 9 is a schematic flowchart of another target detection method provided in the embodiments of the present application;
FIG. 10 is a schematic structural diagram of an object detection apparatus according to an embodiment of the present disclosure;
fig. 11 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
The embodiments of the present invention will be described below with reference to the drawings. The terminology used in the description of the embodiments of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and are merely descriptive of the various embodiments of the application and how objects of the same nature can be distinguished. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of elements is not necessarily limited to those elements, but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Before explaining the embodiments of the present invention in detail, an application scenario related to the embodiments of the present invention will be described.
With the number of automobiles on the road increasing year by year and vehicle intelligence developing continuously, safe-driving technology has become a hotly contested research topic in the automotive field. During automatic driving or unmanned driving, obstacle objects around the vehicle are detected so that the vehicle's automatic driving planning and control module can make corresponding behavior decisions (such as path planning and obstacle avoidance) according to the obstacle information. The obstacle object in this embodiment may be, but is not limited to, a pedestrian, a vehicle, or the like, where the vehicle may be an internal combustion engine vehicle using an engine as a power source, a hybrid vehicle using an engine and an electric motor as power sources, an electric vehicle using an electric motor as a power source, a motorcycle, or the like; the present application is not limited thereto.
In one scenario, when the vehicle is cruising at high speed, in order to ensure driving safety the vehicle may automatically detect surrounding obstacle objects and lane information and send them to the automatic driving planning module and control module of the vehicle, so that these modules can plan the driving path of the vehicle according to the received obstacle and lane information and control the vehicle to change lanes, change speed, and so on. For another example, when the vehicle automatically follows a preceding vehicle on a congested road, road conditions are complex and the vehicle needs to detect surrounding obstacle objects and road condition information accurately, so as to provide accurate obstacle and road condition information to the automatic driving planning and control module; that module can then make an accurate path plan and control the vehicle accordingly, avoiding traffic accidents.
In the prior art, the vehicle may perform target (obstacle object) detection on a target image (e.g., a video frame) acquired by a sensor, and determine a corresponding detection frame position of a surrounding obstacle object in the target image, where the detection frame is an area including the obstacle object, and may be, for example, a rectangular frame, and the detection frame may indicate the position of the obstacle object. In a scene, when mutually-occluded obstacle objects exist in a target image, the occluded object may not be detected in the prior art, or the occluding object and the occluded object are considered as the same obstacle object, and the positions of the two obstacle objects are represented by the same detection frame. One solution is: by means of a tracking algorithm, information of a next frame of a target is predicted by using information of a current frame of the target, and it is avoided that an occluded object cannot be detected when the occluded object is occluded. However, when the motion state of the second object changes abruptly (for example, suddenly accelerates or suddenly decelerates, etc.), the detected frame position of the second object predicted by the host vehicle is not accurate. In the present embodiment, the "own vehicle" may be understood as a vehicle that performs object detection.
In order to solve the technical problem, the present application provides a target detection method, which can acquire a detection frame position of each object in a target image under the condition that occlusion exists among a plurality of objects in the target image.
Next, a system architecture according to an embodiment of the present invention will be described first.
The vehicle (for example, the first object, the second object, the third object, the fourth object, or the host vehicle hereinafter) described in this specification may be an internal combustion engine vehicle having an engine as a power source, a hybrid vehicle having an engine and an electric motor as power sources, an electric car having an electric motor as a power source, a motorcycle, or the like.
In the embodiment of the present application, the vehicle may include an automatic driving apparatus 100 having an automatic driving function.
Referring to fig. 1, fig. 1 is a functional block diagram of an automatic driving apparatus 100 having an automatic driving function according to an embodiment of the present application. In one embodiment, the autopilot device 100 is configured in a fully or partially autopilot mode.
The autopilot device 100 may include various subsystems such as a travel system 102, a sensor system 104, a control system 106, one or more peripheral devices 108, as well as a power supply 110, a computer system 112, and a user interface 116.
The sensor system 104 may include a radar 126, a camera 130, a laser range finder 128, and other subsystems, of which only the radar 126 and the camera 130 are described in the embodiments of the present application.
The radar 126 may utilize radio signals to sense objects within the surrounding environment of the autopilot device 100. In some embodiments, in addition to sensing objects, radar 126 may also be used to sense the speed and/or heading of an object.
The laser rangefinder 128 may utilize laser light to sense objects in the environment in which the autopilot device 100 is located.
The camera 130 may be used to capture multiple images of the surrounding environment of the autonomous device 100. The camera 130 may be a still camera or a video camera. Alternatively, the camera 130 may be located at a suitable position outside the vehicle in order to acquire images of the outside of the vehicle. For example, the camera 130 may be disposed in the vehicle interior near the front windshield in order to capture an image in front of the vehicle. Alternatively, the camera 130 may be disposed around the front bumper or the radiator grille. For example, the camera 130 may be disposed close to a rear window in the vehicle interior in order to capture an image behind the vehicle. Alternatively, the camera 130 may be disposed around a rear bumper, trunk, or tailgate. For example, the camera 130 may be disposed in the vehicle interior in close proximity to at least one of the side windows in order to capture an image of the side of the vehicle. Alternatively, the camera 130 may be disposed around a side mirror, fender, or door.
The computer vision system 140 may be operable to process and analyze images captured by the camera 130 to identify objects and/or features in the environment surrounding the autonomous device 100. The objects and/or features may include traffic signals, road boundaries, and obstacles. The computer vision system 140 may use object recognition algorithms, Structure From Motion (SFM) algorithms, video tracking, and other computer vision techniques. In some embodiments, the computer vision system 140 may be used to map an environment, track objects, estimate the speed of objects, and so forth.
The processor 113 may be any conventional processor, such as a commercially available Central Processing Unit (CPU). Alternatively, the processor may be a dedicated device such as an Application Specific Integrated Circuit (ASIC) or other hardware-based processor. Although fig. 1 functionally illustrates a processor, memory, and other elements of a computer in the same block, those skilled in the art will appreciate that the processor, computer, or memory may actually comprise multiple processors, computers, or memories that may or may not be stored within the same physical housing. For example, the memory may be a hard drive or other storage medium located in a different enclosure than the computer. Thus, references to a processor or computer are to be understood as including references to a collection of processors or computers or memories which may or may not operate in parallel. Rather than using a single processor to perform the steps described herein, some components, such as the steering component and the retarding component, may each have their own processor that performs only computations related to the component-specific functions. In various aspects described herein, the processor may be located remotely from the autonomous device and in wireless communication with the autonomous device. In other aspects, some of the processes described herein are executed on a processor disposed within the autopilot device while others are executed by a remote processor, including taking the steps necessary to execute a single maneuver.
In some embodiments, the memory 114 may include instructions 115 (e.g., program logic), and the instructions 115 may be executable by the processor 113 to perform various functions of the autopilot device 100, including those described above. The memory 114 may also contain additional instructions, including instructions to send data to, receive data from, interact with, and/or control one or more of the travel system 102, the sensor system 104, the control system 106, and the peripheral devices 108. Such information may be used by the autonomous device 100 and the computer system 112 during operation of the autonomous device 100 in autonomous, semi-autonomous, and/or manual modes.
The autopilot device 100 may be a car, a truck, a motorcycle, a bus, a boat, an airplane, a helicopter, a lawn mower, an amusement car, a playground autopilot device, construction equipment, a trolley, a golf cart, a train, a cart, or the like, and the embodiment of the present application is not particularly limited.
Fig. 1 illustrates a functional block diagram of the automatic driving apparatus 100, and an automatic driving system in the automatic driving apparatus 100 will be described below. Fig. 2 is a schematic structural diagram of an automatic driving system according to an embodiment of the present application. Fig. 1 and 2 illustrate the autopilot device 100 from different perspectives, for example, the computer system 101 of fig. 2 is the computer system 112 of fig. 1.
As shown in FIG. 2, the computer system 101 comprises a processor 103 coupled to a system bus 105. The processor 103 may be one or more processors, where each processor may include one or more processor cores. The system bus 105 is coupled to an input/output (I/O) bus 113 through a bus bridge 111. An I/O interface 115 is coupled to the I/O bus. The I/O interface 115 communicates with various I/O devices, such as an input device 117 (e.g., keyboard, mouse, touch screen, etc.), a media tray 121 (e.g., CD-ROM, multimedia interface, etc.), a transceiver 123 (which can send and/or receive radio communication signals), a camera 155 (which can capture still and motion digital video images), and an external USB port 125.
For convenience of understanding, in the following embodiments of the present application, an automatic driving device having a structure shown in fig. 1 and fig. 2 is taken as an example, and a target detection method provided by the embodiments of the present application is specifically described in conjunction with the accompanying drawings and application scenarios.
Referring to fig. 3, fig. 3 is a schematic diagram of an embodiment of an object detection method provided in an embodiment of the present application, and as shown in fig. 3, the object detection method provided in the present application includes:
301. and acquiring a target image.
In the embodiment of the application, the host vehicle may acquire a target image, where the target image may be one frame of a video. Specifically, the host vehicle may acquire the target image based on an image acquisition device, where the image acquisition device may be mounted on the outside of the vehicle body of the host vehicle. Note that a plurality of image acquisition devices may be mounted around the vehicle body of the host vehicle. For example, and without limitation, 4 image acquisition devices may be installed around the body of the host vehicle: a forward-view image acquisition device, a rear-view image acquisition device, a left-view image acquisition device, and a right-view image acquisition device. The forward-view image acquisition device is arranged at the central position of the head of the vehicle, the rear-view image acquisition device is arranged at the central position of the tail of the vehicle, the left-view image acquisition device is arranged at the midpoint of the left side of the vehicle along the length direction, and the right-view image acquisition device is arranged at the midpoint of the right side of the vehicle along the length direction. It should be noted that the above is described by taking 4 image acquisition devices as an example only; in practical applications, more or fewer image acquisition devices may be installed around the vehicle body of the host vehicle. The image acquisition devices can acquire images of the road conditions around the vehicle during driving.
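Purely as an illustration (not part of the claimed method), the following sketch shows one way a processor could read one frame per mounted image acquisition device using OpenCV; the view-to-device-index mapping is a hypothetical assumption.

```python
# Illustrative sketch only: reading one frame per image acquisition device.
# The view-to-device-index mapping below is a hypothetical assumption.
import cv2

CAMERA_INDICES = {"front": 0, "rear": 1, "left": 2, "right": 3}

def grab_target_images():
    """Return one target image (video frame) per mounted camera."""
    frames = {}
    for view, idx in CAMERA_INDICES.items():
        cap = cv2.VideoCapture(idx)
        ok, frame = cap.read()          # one frame of the video stream
        if ok:
            frames[view] = frame
        cap.release()
    return frames
```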
In the embodiment of the application, after acquiring the target image, the image acquisition device of the automatic driving device can transmit the target image to the processor, and the processor can perform image processing on the target image.
302. The method comprises the steps of obtaining a first area and a second area which are included in a target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles.
In this embodiment of the application, after the processor receives the target image and before acquiring the first region and the second region included in the target image, a region of interest (ROI) in the target image may be extracted first. Specifically, the ROI may include the obstacle objects (vehicles or pedestrians) in the target image whose distance from the host vehicle is beyond a preset distance. The ROI is a region capable of representing a specific feature in the target image and does not exceed the range of the target image. The ROI may be set on the target image through an open source computer vision library (OpenCV) so as to segment the target image, so that only the ROI is subjected to the relevant operations when a function is used; when the target image is processed by using the ROI, the region of concern in the target image can be defined in a targeted manner. In the embodiment of the present application, the region of concern may be the region including the obstacle objects that are beyond the preset distance from the host vehicle.
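As a minimal sketch of setting an ROI as described above (the coordinates are placeholders, not values prescribed by the method), the ROI can be taken as a sub-array of the target image so that subsequent functions operate on the ROI only:

```python
# Minimal sketch: restrict processing to a rectangular ROI of the target image.
# The ROI coordinates used in the usage comment are placeholder assumptions.
import numpy as np

def set_roi(target_image: np.ndarray, x: int, y: int, w: int, h: int) -> np.ndarray:
    """Return the sub-image covering the region of interest."""
    return target_image[y:y + h, x:x + w]

# Usage: run an OpenCV function such as cv2.Canny on set_roi(img, 100, 80, 640, 360)
# instead of on the whole target image.
```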
In this embodiment of the application, a first ROI may be extracted first, where the first ROI includes the obstacle objects beyond the preset distance from the host vehicle. Then, the size, shape, and position of the first ROI may be further adaptively adjusted according to the position of each obstacle object, so that the first ROI fits the obstacle objects beyond the preset distance from the host vehicle as closely as possible, to obtain a second ROI. In an alternative expression, the size, shape, and position of the first ROI may be further adaptively adjusted according to the position of each obstacle object in the target image, so that the adjusted ROI includes only the obstacle objects beyond the preset distance from the host vehicle as far as possible, and includes as little of the non-obstacle region as possible, thereby obtaining the second ROI. In the embodiment of the application, the obstacle objects included in the first ROI and the second ROI are not changed; only regions of non-obstacle objects are removed.
It should be noted that the shapes of the first ROI and the second ROI may be, but not limited to, rectangles, trapezoids, triangles, irregular figures, etc., and may be changed according to the shape of at least one obstacle object out of the preset distance from the host vehicle, for example, when the number of obstacle objects out of the preset distance from the host vehicle is one, the second ROI may be identical or approximately identical to the outer contour shape of the only obstacle object. For example, when the number of the obstacle objects out of the preset distance from the host vehicle is plural, the plural obstacle objects form a new obstacle shape, and the second ROI may be matched or approximately matched with the outer contour line shape of the new obstacle shape.
Specifically, referring to fig. 4a, fig. 4a is a schematic diagram of ROI extraction provided in an embodiment of the present application. As shown in fig. 4a, a first ROI 401 may be extracted, where the first ROI 401 includes a plurality of vehicles beyond the preset distance from the host vehicle, but a large part of the first ROI 401 consists of non-obstacle objects, such as the road, trees, and the sky. A second ROI 402 may then be obtained by adaptively adjusting (adjusting the size, shape, position, and the like of) the first ROI 401, where the second ROI 402 includes the plurality of vehicles beyond the preset distance from the host vehicle and excludes most of the non-obstacle objects. In this way, the required computing power is reduced, real-time performance is improved, and detection accuracy is improved.
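A minimal sketch of the adaptive adjustment described above is given below, assuming that each distant obstacle object is available as a binary mask and that the second ROI is taken as the smallest rectangle enclosing those masks (the embodiment also allows non-rectangular ROIs, so this is only one possible realisation):

```python
import numpy as np

def adjust_roi(first_roi_mask: np.ndarray, obstacle_masks: list) -> np.ndarray:
    """Shrink the first ROI so that it still contains every distant obstacle
    object while excluding as much non-obstacle area (road, trees, sky) as possible."""
    union = np.zeros_like(first_roi_mask, dtype=bool)
    for m in obstacle_masks:
        union |= m.astype(bool) & first_roi_mask.astype(bool)
    ys, xs = np.nonzero(union)
    if ys.size == 0:
        return first_roi_mask                 # nothing to fit; keep the first ROI
    second_roi = np.zeros_like(union)
    second_roi[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = True
    return second_roi
```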
It should be noted that the embodiment of the present application may be implemented based on any method in the prior art that can extract an ROI including an object beyond a preset distance, and the present application does not limit the manner of extracting the ROI.
In an embodiment, the preset distance in the foregoing embodiment may be specifically related to a driving state of a host vehicle, where the host vehicle is a vehicle in which a sensor for acquiring the target image is located, and the driving state may include at least one of: the direction of travel, and the state of the uphill or downhill slope of the road surface on which the vehicle is traveling.
In this application embodiment, the driving direction may include left turning, right turning, or straight traveling, and when the host vehicle is left turning or right turning, it is determined that the preset distance is a first distance value, and when the host vehicle is straight traveling, it is determined that the preset distance is a second distance value, where the first distance value is smaller than the second distance value.
For example, when the host vehicle is traveling straight, or its traveling direction deviates from the straight traveling direction within a preset angle range, the preset distance may be determined to be the second distance value, for example, 50 meters; and when the host vehicle is turning left or right, the preset distance may be determined to be the first distance value, for example, 40 meters.
In this embodiment of the application, the uphill/downhill state may include an uphill state, a downhill state, and a flat state. When the host vehicle is in the uphill state, the preset distance is determined to be a third distance value; when the host vehicle is in the downhill state, the preset distance is determined to be a fourth distance value; and when the host vehicle is in the flat state, the preset distance is determined to be a fifth distance value, where the third distance value is less than or equal to the fifth distance value, and the fifth distance value is less than the fourth distance value. In the embodiment of the present application, which of the uphill state, the downhill state, and the flat state the host vehicle is in may be determined according to the inclination angle between the road surface on which the host vehicle is currently traveling and the horizontal plane, which is not limited herein.
For example, the preset distance may be determined to be the third distance value, for example, 40 meters, when the host vehicle is in the uphill state; the fourth distance value, for example, 55 meters, when the host vehicle is in the downhill state; and the fifth distance value, for example, 50 meters, when the host vehicle is in the flat state.
In the embodiment of the present application, the preset distance may differ when the host vehicle is in different driving directions and different uphill/downhill states of the road surface on which it is traveling. Exemplarily, when the host vehicle is traveling straight and is in a downhill state, the preset distance may be determined to be A meters; when the host vehicle is traveling straight and is in a flat state, B meters; when the host vehicle is traveling straight and is in an uphill state, C meters; when the host vehicle is turning left or right and is in a downhill state, D meters; when the host vehicle is turning left or right and is in a flat state, E meters; and when the host vehicle is turning left or right and is in an uphill state, F meters; where A is greater than or equal to B, B is greater than or equal to C, C is greater than or equal to D, D is greater than or equal to E, and E is greater than or equal to F.
In the embodiment of the present application, the preset distance may also be ordered with the uphill/downhill state taking priority. Exemplarily, when the host vehicle is in a downhill state and traveling straight, the preset distance may be determined to be A meters; when the host vehicle is in a downhill state and turning left or right, B meters; when the host vehicle is in a flat state and traveling straight, C meters; when the host vehicle is in a flat state and turning left or right, D meters; when the host vehicle is in an uphill state and traveling straight, E meters; and when the host vehicle is in an uphill state and turning left or right, F meters; where A is greater than or equal to B, B is greater than or equal to C, C is greater than or equal to D, D is greater than or equal to E, and E is greater than or equal to F.
The specific magnitude of the preset distance value may be determined according to the specific angle between the traveling direction of the host vehicle and the straight traveling direction and the specific angle between the road surface on which the host vehicle is traveling and the horizontal plane, which is not limited herein.
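The selection of the preset distance can be summarised, purely as an illustrative sketch, by a lookup table over the driving state; the distance values below (in meters) and the state encoding are assumptions consistent with the examples above, not values fixed by the method:

```python
def preset_distance(direction: str, slope: str) -> float:
    """direction: 'straight' or 'turn'; slope: 'downhill', 'flat' or 'uphill'."""
    table = {
        ("straight", "downhill"): 55.0,   # A
        ("straight", "flat"):     50.0,   # B
        ("straight", "uphill"):   45.0,   # C
        ("turn",     "downhill"): 44.0,   # D
        ("turn",     "flat"):     40.0,   # E
        ("turn",     "uphill"):   35.0,   # F
    }
    return table[(direction, slope)]      # A >= B >= C >= D >= E >= F
```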
In the embodiment of the application, when the distance between an obstacle object and the host vehicle is short, the obstacle object appears clearly and at a large size in the target image, so the detection accuracy of its detection frame is inherently high. Different target detection methods can therefore be adopted for obstacle objects at different distances: a method with high detection accuracy but high computational cost can be adopted for distant obstacle objects, and a method with lower detection accuracy but low computational cost can be adopted for nearby obstacle objects, so as to achieve a balance between detection accuracy and computational cost. The two target detection methods are described separately below.
First, how to detect the detection frame for the obstacle object whose distance from the host vehicle exceeds the preset distance value will be described.
In the embodiment of the present application, an example is described in which a target image includes a first object and a second object, where in the target image, a distance between a shooting point (or a host vehicle) of the target image and the first object is greater than a preset distance, and a distance between the shooting point (or the host vehicle) of the target image and the second object is greater than the preset distance.
At this time, the processor may acquire a first region and a second region included in the target image, where a boundary line of the first region corresponds to an outline of a first object in the target image, and a boundary line of the second region corresponds to an outline of a second object in the target image.
Specifically, the processor may obtain, but is not limited to, a first region and a second region included in the target image based on semantic segmentation of the image. The semantic segmentation of the image can accurately segment each object included in the target image. In the embodiment of the present application, the first region and the second region included in the target image may be obtained according to any image semantic segmentation technique in the prior art. Referring to fig. 4b, fig. 4b is a schematic diagram of an image segmentation method provided in the embodiment of the present application, as shown in fig. 4b, when the obstacle objects in the target image include a first object, a second object, and other obstacle objects (a vehicle and a pedestrian), based on semantic segmentation of the image, a first region 403 corresponding to the first object, a second region 404 corresponding to the second object, a region 405 corresponding to another vehicle, and a region 406 corresponding to a pedestrian can be acquired. It should be noted that the first area may be represented by using a pixel point corresponding to an area of the first object in the target image as a coordinate set, or by using a pixel coordinate set corresponding to a contour line of the first object in the target image, and the second area may be represented by using a pixel point coordinate set corresponding to an area of the second object in the target image, or by using a pixel point coordinate set corresponding to a contour line of the second object in the target image, which is not limited herein. Specifically, after the image semantic segmentation is performed, the pixel point coordinate set of the first region, the association relationship between the pixel point coordinate set of the first region and the first object, the pixel point coordinate set of the second region, and the association relationship between the pixel point coordinate set of the second region and the second object may be stored.
It should be noted that the boundary line of the first region corresponds to the contour line of the first object in the target image, and the boundary line of the second region corresponds to the contour line of the second object in the target image. Here, "correspond" should be understood to mean that the first boundary line may indicate the shape of the contour line of the first object in the target image and the position of the first object in the target image to some extent, however, the first boundary line does not exactly coincide with the contour line and the position of the first object in the target image, but may describe the approximate shape and position.
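A minimal sketch of storing the per-object pixel coordinate sets after semantic segmentation is given below; the label-map format (one integer identifier per obstacle object, 0 for background) is an assumption about how the segmentation result is delivered:

```python
import numpy as np

def region_pixel_sets(label_map: np.ndarray) -> dict:
    """Return, for each obstacle identifier, the set of (row, col) pixel
    coordinates representing the region associated with that object."""
    regions = {}
    for obj_id in np.unique(label_map):
        if obj_id == 0:                       # 0 assumed to denote background
            continue
        ys, xs = np.nonzero(label_map == obj_id)
        regions[int(obj_id)] = set(zip(ys.tolist(), xs.tolist()))
    return regions
```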
303. If the first area and the second area have a common boundary line, determining a first detection frame corresponding to the first object and determining a second detection frame corresponding to the second object, wherein the first detection frame includes the first area, and the second detection frame includes the second area.
In an embodiment of the present application, if a common boundary line exists between the first area and the second area, a first detection frame corresponding to the first object and a second detection frame corresponding to the second object are determined, where the first detection frame includes the first area, and the second detection frame includes the second area.
In this embodiment of the application, after the first region and the second region are detected, it may be determined whether a common boundary line exists between the first region and the second region first, and in a case that the common boundary line exists between the first region and the second region, it may be considered that an occlusion exists between the first object and the second object.
Specifically, whether or not a common boundary line exists between the first area and the second area may be determined in one of the following ways:
1. Judge whether the first area and the second area comprise a plurality of common and continuous pixel point coordinates; if so, determine that a common boundary line exists between the first area and the second area (a code sketch of this judgment is given after this discussion).
Referring to fig. 5, fig. 5 is a schematic diagram of a first region and a second region acquired in a target image in the embodiment of the present application, as shown in fig. 5, a common boundary line 501 exists between the first region 403 and the second region 404, where a pixel point corresponding to the boundary line 501 is a common pixel point included in a pixel coordinate point set corresponding to a first object and a pixel coordinate point set corresponding to a second object, and at this time, it may be determined that a common boundary line exists between the first region and the second region.
2. Whether two intersection points exist between the edge line of the first area and the edge line of the second area is judged (hereinafter, the intersection point can also be described as a concave point, the concave point can be understood as the intersection point of the edge line of the obstacle object in the target image, and the two intersection points can also be described as a first boundary point and a second boundary point), and if so, a common boundary line exists between the first area and the second area.
Referring to fig. 5, fig. 5 is a schematic diagram of a first region and a second region acquired in a target image in the embodiment of the present application, and as shown in fig. 5, two intersections (a concave point 501 and a concave point 502) exist between an edge line of the first region and an edge line of the second region, and at this time, it may be determined that a common boundary line exists between the first region and the second region.
Specifically, when a plurality of pits are detected in the target image, matching between the pits may be performed first. The purpose of matching is to determine the pits between objects that are in an occluded relationship with each other; the pits may be used to determine the boundary line between the mutually occluded obstacle objects, so as to adjust the position of the detection frame by the boundary line. Specifically, the matching between the concave points can be performed by corner matching, color matching, nine-square-grid matching, and the like.
Corner matching may refer to finding the correspondence of characteristic pixel points between the occluded vehicles, so as to determine the positional relationship between the traveling vehicles, and obtaining the pits between the mutually occluded obstacle objects according to the characteristic pixel points. Color matching may refer to extracting color information of the image according to the result of semantic segmentation of the image, and then obtaining the pits between the mutually occluded obstacle objects. In nine-square-grid matching, the image is divided into a plurality of grids according to the size of the obstacle objects and the feature points, statistics are performed on the number of feature points in the grids as they move, and the distance between each feature point and the central feature point is checked; if the distance exceeds a threshold value, the match is considered correct, otherwise the match is screened out. After the paired pits are obtained, the boundary line between the mutually occluded obstacle objects can be obtained from the paired pits. For example, the pits that are paired with each other may be the pit 501 and the pit 502 shown in fig. 5.
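A minimal sketch of the first judgment above (common and continuous pixel point coordinates) is given below; the continuity requirement is approximated by requiring a minimum number of touching pixels after a one-pixel expansion, which is an implementation assumption:

```python
import numpy as np

def dilate_once(mask: np.ndarray) -> np.ndarray:
    """Expand a binary region mask by one pixel in the four axis directions."""
    m = mask.astype(bool)
    out = m.copy()
    out[1:, :] |= m[:-1, :]
    out[:-1, :] |= m[1:, :]
    out[:, 1:] |= m[:, :-1]
    out[:, :-1] |= m[:, 1:]
    return out

def have_common_boundary(region_a: np.ndarray, region_b: np.ndarray,
                         min_pixels: int = 2) -> bool:
    """True if the two regions touch along at least min_pixels pixels."""
    shared = dilate_once(region_a) & region_b.astype(bool)
    return int(shared.sum()) >= min_pixels
```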
In one embodiment, a first initial detection box may be determined, where the first initial detection box corresponds to the first object and the second object.
In this embodiment of the application, the first initial detection frame may be a result obtained after detection-frame detection based on the prior art. At this time, the accuracy of the detection frame may be evaluated: if it is determined that the first initial detection frame corresponds to the first object and/or the second object, that is, the same detection frame (the first initial detection frame) is used as the detection frame of the first object and/or the second object, it is determined that the detection result of the first initial detection frame is incorrect. In this case, the detection frame position of the first object and the detection frame position of the second object may be determined separately based on the first initial detection frame.
It should be noted that the first initial detection box corresponds to the first object and/or the second object, where "correspond" may be understood as: the first object and the second object are taken together as one obstacle object and the first initial detection frame corresponds to that obstacle object, or the detection frame of the first object is the first initial detection frame and the detection frame of the second object is also the first initial detection frame; at this time, the first initial detection frame has an association relationship with the first object and the second object.
It should be noted that the first initial detection box corresponds to the first object and/or the second object, where "correspond" may also be understood as: the processor does not detect one of the first object and the second object, and at this time, detects only a detection frame (first initial detection frame) of the first object or the second object, and at this time, it may be considered that the first initial detection frame corresponds to the first object or the second object.
It should be noted that the above "correspondence" does not mean that the first initial detection frame includes all the pixel points of the first object and/or the second object, but is used to indicate the position of the first object and/or the second object in the target image.
It should be noted that, in an embodiment, it may be determined that the first initial detection frame corresponds to the first object and/or the second object based on the number of detection frames or the corresponding relationship between the first object and the second object.
Specifically, if the first object and the second object correspond to one detection frame as a whole, it may be considered that the first initial detection frame corresponds to the first object and the second object.
Specifically, if a first object corresponds to one detection frame and a second object does not correspond to one detection frame, it may be considered that the first initial detection frame corresponds to the first object.
Specifically, if the second object corresponds to one detection frame and the first object does not correspond to one detection frame, it may be considered that the first initial detection frame corresponds to the second object.
In this embodiment, it may be determined that the first initial detection frame corresponds to the first object and the second object according to the result of semantic segmentation of the images; for example, if only one first initial detection frame is identified for the first object and the second object, and the first initial detection frame includes most or all of the pixel points of the first object and most or all of the pixel points of the second object, the first initial detection frame may be considered to correspond to the first object and the second object.
It should be noted that, in an embodiment, it may be determined that the first initial detection frame corresponds to the first object and the second object based on the result of semantic segmentation of the images; for example, if only one first initial detection frame is identified near the boundary line between the first object and the second object, and the first initial detection frame includes most or all of the pixel points of the first object and most or all of the pixel points of the second object, the first initial detection frame may be considered to correspond to the first object and the second object.
In this embodiment, it may be determined that the first initial detection frame corresponds to the first object or the second object according to the result of semantic segmentation of the images; for example, if only one first initial detection frame is identified for the first object and the second object, and the first initial detection frame includes most or all of the pixel points of the first object and most or all of the pixel points of the second object, the first initial detection frame may be considered to correspond to the first object and the second object.
In this embodiment, it may be determined that the first initial detection frame corresponds to the first object and/or the second object according to the result of semantic segmentation of the images; for example, if only one first initial detection frame is identified for the first object and the second object, and the first initial detection frame includes most or all of the pixel points of the first object and does not include, or only includes a small number of, the pixel points of the second object, the first initial detection frame may be considered to correspond to the first object.
In the embodiment of the application, when the target image is shot, the distance between the shooting point of the target image and the first object is smaller than the distance between the shooting point of the target image and the second object, that is, in the target image, the first object is an occlusion party, and the second object is an occluded party.
In the embodiment of the present application, after determining the first initial detection frame, where the initial detection frame corresponds to the first object and the second object, if it is determined that the detection result of the initial detection frame is incorrect, the number and the position of the detection frames need to be adjusted.
In this embodiment, a first center point may be determined according to the first initial detection frame and the target boundary line, where the distance between the longitudinal coordinate of the first center point in the target image and the longitudinal coordinate of the midpoint of the target boundary line in the target image is within a preset range, and the distance between the horizontal coordinate of the first center point in the target image and the horizontal coordinate of the center point of the first initial detection frame in the target image is within a preset range.
In an embodiment of the application, a first center point may be determined based on a first initial detection frame and the target boundary, where a horizontal coordinate position of the first center point in the target image is related to the first initial detection frame, and a vertical coordinate position of the first center point in the target image is related to the target boundary.
Specifically, a distance between a longitudinal coordinate of the first center point in the target image and a longitudinal coordinate of a midpoint of the target boundary in the target image is within a preset range, which may be determined according to requirements, for example, may be within a distance of 4 pixels, and the like, and is not limited herein.
Specifically, the distance between the horizontal coordinate of the first center point in the target image and the horizontal coordinate of the center point of the first initial detection frame in the target image is within a preset range, which may be determined according to requirements, for example, within a distance of 4 pixels, and is not limited herein.
Referring to fig. 6a, fig. 6a is a schematic diagram of detection of a detection frame of an obstacle object provided in an embodiment of the present application, and as shown in fig. 6a, a target image includes a first area 601 and a second area 602. A connecting line between a first boundary point and a second boundary point between the first region 601 and the second region 602 is a target boundary line 603.
Initially, based on some target detection method, a first initial detection box 604 corresponding to a first object and a second object may be obtained. Next, the position of the first center point 605 may be determined based on the first initial detection box 604 and the target boundary 603.
The distance between the lateral coordinate of the center point of the first initial detection frame 604 and the lateral coordinate of the first center point 605 is within a preset range, which may be determined according to actual requirements, for example, within a distance of 5 pixels, and is not limited herein. It should be noted that the lateral coordinate of the center point of the first initial detection frame 604 may coincide with the lateral coordinate of the first center point 605. It should also be noted that the "certain target detection method" may be any detection frame detection method in the prior art whose accuracy causes the detection frames of the first object and the second object to be identified as the same detection frame (the first initial detection frame).
The distance between the longitudinal coordinate of the midpoint of the target boundary line 603 and the longitudinal coordinate of the first center point 605 is within a preset range, which may be determined according to actual requirements, for example, within a distance of 5 pixels, and is not limited herein. It should be noted here that the midpoint of the target boundary line 603 may coincide with the longitudinal coordinate position of the first center point 605.
In this embodiment of the application, after the position of the first center point 605 is determined, the position of the second center point 606 may be determined according to the position of the first center point 605 and the target boundary line 603, where the distance between the second center point 606 and the symmetric point of the first center point 605 with respect to the target boundary line 603 is smaller than a preset value. The preset value may be determined according to actual requirements, for example, within a distance of 5 pixels, and is not limited herein. It should be noted here that the second center point 606 may coincide with the symmetric point of the first center point 605 with respect to the target boundary line 603, i.e., the symmetric point of the first center point 605 with respect to the target boundary line 603 is the second center point 606.
In this embodiment of the application, after the first center point 605 is determined, the first detection frame 607 corresponding to the first object may be determined according to the first center point 605 and the first region 601, where the distance between the center point of the first detection frame 607 and the first center point 605 is within a preset range, and the first detection frame 607 includes the first region 601. The preset range may be determined according to actual requirements, for example, within a distance of 5 pixels, and is not limited herein. It should be noted here that the first center point 605 may coincide with the center point of the first detection frame 607.
In this embodiment, the first detection frame 607 includes the first region 601, that is, the pixel points of the first region 601 are within the image region covered by the first detection frame 607. Specifically, the position and size of the first detection frame 607 can be adjusted so that the first detection frame 607 includes the first region 601 and the distance between the center point of the first detection frame 607 and the first center point 605 is within a preset range.
In this embodiment of the application, after the second center point 606 is determined, a second detection frame 608 corresponding to the second object may be determined according to the second center point 606 and the second area 602, a distance between a center point of the second detection frame 608 and the second center point 606 is within a preset range, and the second detection frame 608 includes the second area 602. The preset range may be determined according to actual requirements, and may be, for example, within a distance of 5 pixels, and the like, which is not limited herein. It should be noted that the second center point 606 may coincide with a center point of the second detection frame 608.
In this embodiment, the second detection frame 608 includes the second region 602, that is, the pixel point of the second region 602 is within the image region included in the second detection frame 608. Specifically, the position and size of the second detection frame 608 may be adjusted, so that the second detection frame 608 includes the second area 602, and the distance between the central point of the second detection frame 608 and the second central point 606 is within a preset range, which may be specifically referred to fig. 6 b.
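The construction of the first and second center points and detection frames described above can be sketched as follows; the reflection across the target boundary line and the grow-until-the-region-is-covered step are assumptions about one straightforward way to satisfy the stated constraints (center distances within a preset range, frame contains the region), not the only admissible implementation:

```python
import numpy as np

def reflect_point(p, a, b):
    """Reflect point p across the line through boundary points a and b."""
    p, a, b = (np.asarray(v, dtype=float) for v in (p, a, b))
    d = b - a
    t = np.dot(p - a, d) / np.dot(d, d)
    foot = a + t * d                          # foot of the perpendicular
    return 2.0 * foot - p

def box_around(center, region_mask):
    """Smallest axis-aligned box containing the region and centered at 'center'."""
    ys, xs = np.nonzero(region_mask)
    half_w = max(abs(xs.max() - center[0]), abs(xs.min() - center[0]))
    half_h = max(abs(ys.max() - center[1]), abs(ys.min() - center[1]))
    return (center[0] - half_w, center[1] - half_h,
            center[0] + half_w, center[1] + half_h)

def split_initial_box(initial_box, boundary_pts, first_region, second_region):
    x1, y1, x2, y2 = initial_box
    (ax, ay), (bx, by) = boundary_pts
    # First center point: lateral coordinate from the initial frame center,
    # longitudinal coordinate from the midpoint of the target boundary line.
    c1 = np.array([(x1 + x2) / 2.0, (ay + by) / 2.0])
    # Second center point: symmetric point of c1 with respect to the boundary line.
    c2 = reflect_point(c1, (ax, ay), (bx, by))
    first_box = box_around(c1, first_region)    # contains the first region
    second_box = box_around(c2, second_region)  # contains the second region
    return first_box, second_box
```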
In one embodiment, the first detection frame is a circumscribed rectangle frame of the first region, and the second detection frame is a circumscribed rectangle frame of the second region.
In this embodiment of the application, after the first region and the second region are obtained, the circumscribed rectangle frame of the first region may be directly used as a first detection frame corresponding to the first object, and the circumscribed rectangle frame of the second region may be used as a second detection frame corresponding to the second object.
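A minimal sketch of this alternative, assuming each region is given as a binary mask, simply takes the circumscribed rectangle of the region's pixel coordinates as the detection frame:

```python
import numpy as np

def circumscribed_box(region_mask: np.ndarray):
    """Circumscribed rectangle of a region, as the endpoints of one diagonal."""
    ys, xs = np.nonzero(region_mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```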
In the prior art, the position of the detection frame of an occluded obstacle object is obtained in an object-tracking-based manner, and when the motion state of the occluded obstacle object changes suddenly, the position detection accuracy of the detection frame is low. In contrast, in this embodiment the detection frames are determined directly from the first region and the second region, so the detection accuracy does not depend on tracking the motion of the occluded object.
304. And outputting the first detection frame and the second detection frame.
In this embodiment of the present application, after determining a first detection frame corresponding to the first object and determining a second detection frame corresponding to the second object, the first detection frame and the second detection frame may be output, where the first detection frame and the second detection frame may be rectangular frames and may be represented by end positions of a diagonal line. That is, in the embodiment of the present application, the end point position of the diagonal line of the first detection frame and the end point position of the diagonal line of the second detection frame may be output. Specifically, the information of the first detection frame and the second detection frame may be provided as an interface to a module such as a route planning control module of the automatic driving apparatus.
In addition, in this embodiment of the present application, occlusion information may be further output, where the occlusion information is used to indicate that the first object is an occluding object and the second object is an occluded object. Object association information indicating an association relationship between the first detection frame and the first object and an association relationship between the second detection frame and the second object may also be output.
In this embodiment of the application, occlusion information indicating that the first object is an occluding object and the second object is an occluded object may be output, for example, occlusion or occlusion may be represented by a preset character string, and exemplarily, a first object (an occluding object) may be represented by "1" and a second object (an occluded object) may be represented by "0". In this embodiment, object association information indicating an association relationship between the first detection frame and the first object and an association relationship between the second detection frame and the second object may be output.
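One possible output record is sketched below; the field names are illustrative assumptions rather than an interface defined by the method. Each detection frame is reported by the endpoints of one of its diagonals, and, as in the example above, "1" marks the occluding object and "0" the occluded object:

```python
# Illustrative output record; coordinate values are placeholders.
detection_output = {
    "frames": {
        "first_object":  {"diagonal": ((120, 200), (260, 330)), "occlusion": "1"},
        "second_object": {"diagonal": ((240, 180), (370, 300)), "occlusion": "0"},
    },
    # object association information: which detection frame belongs to which object
    "associations": [("first_detection_frame", "first_object"),
                     ("second_detection_frame", "second_object")],
}
```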
The embodiment of the application provides a target detection method, which comprises the following steps: acquiring a target image; acquiring a first area and a second area which are included in the target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles; if the first area and the second area have a common boundary line, determining a first detection frame corresponding to the first object and determining a second detection frame corresponding to the second object, wherein the first detection frame comprises the first area, and the second detection frame comprises the second area; and outputting the first detection frame and the second detection frame. By the mode, after the first area and the second area corresponding to the first object and the second object are identified, the position of the detection frame can be directly detected according to the first area and the second area, and the detection precision is high.
In the embodiment of the present application, when the distance between an obstacle object and the host vehicle is short, the obstacle object appears clearly and at a large size in the target image, and the detection accuracy of its detection frame is inherently high, so a method with lower detection accuracy but low computational cost can be adopted for nearby obstacle objects to achieve a balance between detection accuracy and computational cost. The following describes how to detect the detection frame for an obstacle object whose distance from the host vehicle is smaller than the preset distance value.
In an embodiment of the present application, the target image further includes a first shadow region and a second shadow region. The first shadow area corresponds to a third object, the second shadow area corresponds to a fourth object, and when the target image is shot, the distance between the shooting point of the target image and the third object is smaller than a preset distance, and the distance between the shooting point of the target image and the fourth object is smaller than the preset distance.
Optionally, when the target image is captured, a distance between the capture point of the target image and the third object is smaller than a distance between the capture point of the target image and the fourth object.
In an embodiment, the preset distance here may likewise be related to the driving state of the host vehicle (the driving direction, and the uphill/downhill state of the road surface on which the host vehicle is traveling), where the host vehicle is the vehicle in which the sensor for acquiring the target image is located, in the same manner as described above for the preset distance, and the details are not repeated here.
In the embodiment of the present application, the third object and the fourth object may refer to an obstacle object that is not within the ROI area, and the third object and the fourth object are pedestrians or vehicles.
In this embodiment of the present application, a first edge line of the first shadow area and a second edge line of the second shadow area in the target image may be obtained, where the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, and the second shadow area is a shadow area of a fourth object.
In the embodiment of the application, the edge line of the shadow area of the close-distance obstacle object can be detected through a certain computer vision algorithm. Specifically, a shadow area of the obstacle object may be detected first, and then a position of a lowermost edge line in the target image in the longitudinal direction in the shadow area may be acquired, so as to obtain the first edge line and the second edge line.
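A minimal sketch of obtaining the longitudinal bottom edge line of an already detected shadow region is given below; taking the lowest shadow pixel in each image column as the bottom edge is one plausible reading of the embodiment, not a prescribed implementation:

```python
import numpy as np

def bottom_edge_line(shadow_mask: np.ndarray):
    """Return the (x, y) pixels forming the lowest edge line of a shadow region."""
    edge = []
    for x in range(shadow_mask.shape[1]):
        ys = np.nonzero(shadow_mask[:, x])[0]
        if ys.size:
            edge.append((x, int(ys.max())))   # largest row index = lowest pixel
    return edge
```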
In an embodiment of the present application, a second initial detection box may be determined, where the second initial detection box corresponds to the third object and the fourth object.
In this embodiment of the application, the second initial detection frame may be a result obtained after detection-frame detection based on the prior art. At this time, the accuracy of the detection frame may be evaluated: if it is determined that the second initial detection frame corresponds to the third object and/or the fourth object, that is, the same detection frame (the second initial detection frame) is used as the detection frame of the third object and/or the fourth object, it is determined that the detection result of the second initial detection frame is incorrect. In this case, the detection frame position of the third object and the detection frame position of the fourth object may be determined separately based on the second initial detection frame.
It should be noted that the second initial detection box corresponds to the third object and/or the fourth object, where "correspond" may be understood as: the third object and the fourth object are used as one obstacle object, and the second initial detection frame corresponds to the obstacle object, or the detection frame of the third object is the second initial detection frame, and the detection frame of the fourth object is the second initial detection frame, at this time, the second initial detection frame has an association relationship with the third object and the fourth object.
It should be noted that the second initial detection box corresponds to the third object and/or the fourth object, where "correspond" may also be understood as: the processor does not detect one of the third object and the fourth object, and at this time, only the detection frame (second initial detection frame) of the third object or the fourth object is detected, and at this time, it may be considered that the second initial detection frame corresponds to the third object or the fourth object.
The "correspondence" does not mean that the second initial detection frame includes all the pixel points of the third object and/or the fourth object, but is used to indicate the position of the third object and/or the fourth object in the target image.
It should be noted that, in an embodiment, it may be determined that the second initial detection frame corresponds to the third object and/or the fourth object based on the number of detection frames or the corresponding relationship between the third object and the fourth object.
Specifically, if the third object and the fourth object correspond to one detection frame as a whole, it may be considered that the second initial detection frame corresponds to the third object and the fourth object.
Specifically, if the third object corresponds to one detection frame and the fourth object does not correspond to one detection frame, it may be considered that the second initial detection frame corresponds to the third object.
Specifically, if the fourth object corresponds to one detection frame and the third object does not correspond to one detection frame, it may be considered that the second initial detection frame corresponds to the fourth object.
In this embodiment, it may be determined that the second initial detection frame corresponds to the third object and the fourth object according to the result of semantic segmentation of the images; for example, if only one second initial detection frame is identified for the third object and the fourth object, and the second initial detection frame includes most or all of the pixel points of the third object and most or all of the pixel points of the fourth object, the second initial detection frame may be considered to correspond to the third object and the fourth object.
It should be noted that, in an embodiment, it may be determined that the second initial detection frame corresponds to the third object and the fourth object based on the result of semantic segmentation of the images; for example, if only one second initial detection frame is identified near the boundary line between the third object and the fourth object, and the second initial detection frame includes most or all of the pixel points of the third object and most or all of the pixel points of the fourth object, the second initial detection frame may be considered to correspond to the third object and the fourth object.
In this embodiment, it may be determined that the second initial detection frame corresponds to the third object or the fourth object according to the result of semantic segmentation of the images; for example, if only one second initial detection frame is identified for the third object and the fourth object, and the second initial detection frame includes most or all of the pixel points of the third object and most or all of the pixel points of the fourth object, the second initial detection frame may be considered to correspond to the third object and the fourth object.
In this embodiment, it may be determined that the second initial detection frame corresponds to the third object and/or the fourth object according to the result of semantic segmentation of the images; for example, if only one second initial detection frame is identified for the third object and the fourth object, and the second initial detection frame includes most or all of the pixel points of the third object and does not include, or only includes a small number of, the pixel points of the fourth object, the second initial detection frame may be considered to correspond to the third object.
In the embodiment of the application, after a first edge line and a second edge line are obtained, if an intersection point exists between a boundary point of the first edge line and the second edge line along a preset direction, the second initial detection frame is divided by a longitudinal straight line where the intersection point is located, and a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object are determined. Optionally, in an embodiment, the preset direction may be the same as the driving direction of the third vehicle or the fourth vehicle, or the preset direction may be the same as the lane line direction in which the third vehicle is currently driving.
In this embodiment of the application, the first edge line may include two boundary points; in an alternative expression, the boundary points may also be understood as the end points of a line segment (the first edge line). When at least one of the two boundary points has an intersection point with the second edge line along the preset direction, the second initial detection frame may be divided by the longitudinal straight line passing through that intersection point.
In this embodiment of the application, the preset direction may be the same as the traveling direction of the third vehicle or the fourth vehicle. Specifically, the host vehicle may, but is not limited to, perform image analysis on the target image or perform analysis on data acquired by other sensors to obtain a driving direction of the third vehicle or the fourth vehicle, and further determine whether the boundary point of the first edge line and the second edge line have an intersection point along the driving direction according to the driving direction of the third vehicle or the fourth vehicle.
It should be noted that the preset direction indicates the actual driving direction of the third vehicle or the fourth vehicle. In this embodiment, a direction corresponding to the preset direction may be determined in the target image, and determining whether a boundary point of the first edge line has an intersection point with the second edge line along the preset direction may be understood as mapping the actual driving direction of the third vehicle or the fourth vehicle into the target image and determining whether the boundary point, extended along that mapped direction, intersects the second edge line. The mapping between the two directions may be performed in combination with the parameters used by the camera when capturing the target image, and details are not described here.
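For illustration only, the mapping of a real-world direction into the image plane can be sketched with a standard pinhole projection; the function and parameter names below, as well as the availability of the camera intrinsics K and extrinsics R, t, are assumptions rather than part of the disclosed method.

```python
import numpy as np

def image_direction(anchor_world, direction_world, K, R, t):
    """Project a real-world direction (e.g. the third vehicle's driving direction)
    into a 2D direction in the target image using a pinhole camera model.

    anchor_world    : 3D point on the object where the direction is anchored
    direction_world : 3D vector of the driving / lane-line direction
    K               : 3x3 camera intrinsic matrix
    R, t            : rotation (3x3) and translation (3,) from world to camera
    """
    def project(p):
        pc = R @ p + t            # world -> camera coordinates
        uvw = K @ pc              # camera -> homogeneous image coordinates
        return uvw[:2] / uvw[2]   # perspective division

    p0 = project(np.asarray(anchor_world, dtype=float))
    p1 = project(np.asarray(anchor_world, dtype=float) + np.asarray(direction_world, dtype=float))
    d = p1 - p0
    return d / np.linalg.norm(d)  # unit preset direction in the image plane
```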
In this embodiment of the application, the preset direction may be consistent with the direction of the lane line in which the third vehicle is currently driving. Specifically, the host vehicle may, but is not limited to, perform image analysis on the target image or analyze data acquired by other sensors to obtain the direction of the lane line in which the third vehicle is currently driving, and may then determine, according to that lane line direction, whether an intersection point exists between a boundary point of the first edge line and the second edge line along the lane line direction.
It should be noted that the preset direction indicates the actual lane line direction. In this embodiment, a direction corresponding to the preset direction may be determined in the target image, and determining whether a boundary point of the first edge line has an intersection point with the second edge line along the preset direction may be understood as mapping the lane line direction into the target image and determining whether the boundary point, extended along that mapped direction, intersects the second edge line. The mapping between the two directions may likewise be performed in combination with the parameters used by the camera when capturing the target image, and details are not described here.
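As a hedged sketch of the intersection test itself, assuming the preset direction has already been mapped into the image and the second edge line is available as a polyline of 2D points (the helper names are hypothetical):

```python
def ray_segment_intersection(origin, direction, p1, p2, eps=1e-9):
    """Intersect a ray (origin + t * direction, t >= 0) with the segment p1-p2.

    All arguments are 2D points/vectors in image coordinates.
    Returns the intersection point, or None if there is no intersection.
    """
    ox, oy = origin
    dx, dy = direction
    x1, y1 = p1
    x2, y2 = p2
    sx, sy = x2 - x1, y2 - y1
    denom = dx * sy - dy * sx
    if abs(denom) < eps:                              # parallel: no single intersection
        return None
    t = ((x1 - ox) * sy - (y1 - oy) * sx) / denom     # parameter along the ray
    u = ((x1 - ox) * dy - (y1 - oy) * dx) / denom     # parameter along the segment
    if t >= 0 and 0 <= u <= 1:
        return (ox + t * dx, oy + t * dy)
    return None

def edge_line_intersection(boundary_point, preset_direction, second_edge_polyline):
    """Walk the segments of the second edge line and return the first hit, if any."""
    for a, b in zip(second_edge_polyline, second_edge_polyline[1:]):
        hit = ray_segment_intersection(boundary_point, preset_direction, a, b)
        if hit is not None:
            return hit
    return None
```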
Referring to fig. 7a, fig. 7a is a schematic diagram of an image recognition process provided in this embodiment. As shown in fig. 7a, a first shadow area 701 and a second shadow area 702 are obtained by recognizing the target image, and fig. 7a further includes a lane line 706. The edge line of the first shadow area 701 is a first edge line 703, and the edge line of the second shadow area 702 is a second edge line 704. A boundary point of the first edge line 703 has an intersection point 705 with the second edge line 704, where the preset direction may be consistent with the driving direction 707 of the third vehicle, or with the driving direction 708 of the fourth vehicle, or with the direction of the lane line in which the third vehicle is currently driving, which is not limited here.
For better understanding, referring to fig. 7b, fig. 7b is a schematic diagram of an image recognition process provided in an embodiment of the present application and shows an actual top view of the scene. As shown in fig. 7b, viewed from above the ground, the edge line of the first shadow area 701 is the first edge line 703 and the edge line of the second shadow area 702 is the second edge line 704, and an intersection point 705 exists between the first edge line 703 and the second edge line 704 along the preset direction.
It should be noted that the lane line direction may be understood as a direction of a straight line in which the lane line is located.
The term "match" does not mean a mathematically exact match, but means a match in the direction indicated, and the predetermined direction may have a certain angle with the direction of the lane line due to other reasons such as recognition accuracy, the printing condition of the lane line on the road surface, and the like, or the lane line may not be a straight line in a strict sense, and in this case, the direction in which the lane line is located may represent the approximate direction represented by the lane line.
In this embodiment of the application, if there is an intersection point between a boundary point of the first edge line and the second edge line along the preset direction, it may be determined that the fourth object is occluded by the third object in the target image, and since the single second initial detection frame corresponds to both the third object and the fourth object, the detection result represented by that detection frame is inaccurate.
In this embodiment of the application, the second initial detection frame may be divided by a longitudinal straight line where the intersection point is located, and a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object are determined.
Optionally, in an embodiment, the second initial detection frame may be divided into the third detection frame and a second sub-detection frame by a longitudinal straight line where the intersection point is located, a central point of the third detection frame is a third central point, and the fourth detection frame is determined by performing translation adjustment and/or size adjustment on the second sub-detection frame according to the third central point, where a distance between a fourth central point of the fourth detection frame and a symmetric point of the third central point with respect to a target line is smaller than a preset value, the target line passes through the intersection point, and a direction of the target line is perpendicular to the preset direction.
Referring to fig. 8a, 8b, and 8c, which are schematic diagrams of object detection provided in an embodiment of the present application: as shown in fig. 8a, the target image is identified to obtain a first shadow region 701 and a second shadow region 702, and fig. 8a further includes a lane line 706. The edge line of the first shadow region 701 is a first edge line 703, and the edge line of the second shadow region 702 is a second edge line 704; a boundary point of the first edge line 703 has an intersection point 705 with the second edge line 704 along the preset direction, and a second initial detection frame 801 is obtained. The longitudinal straight line passing through the intersection point 705 is a straight line 804. As shown in fig. 8b, the straight line 804 divides the second initial detection frame 801 into a third detection frame 802 and a second sub-detection frame 803, where the third detection frame 802 may be regarded as the detection frame corresponding to the third vehicle; the position and size of the second sub-detection frame 803 may then be adjusted according to the position of the third detection frame 802, so as to obtain the detection frame (fourth detection frame) corresponding to the fourth object.
As shown in fig. 8b, the central point of the third detection frame 802 is a third central point 805, and the target line 806 is a straight line that passes through the intersection point 705 and is perpendicular to the preset direction, where the preset direction may be consistent with the direction of the lane line in which the third object is located or with the driving direction of the third object or the fourth object. The distance between the symmetric point of the third central point 805 with respect to the target line 806 and the fourth central point 807 is less than the preset distance, whose size may be determined according to actual requirements, for example a distance of 5 pixels or less. It should be noted that the symmetric point of the third central point 805 with respect to the target line 806 may coincide with the fourth central point 807.
In the embodiment of the present application, the fourth center point 807 may be regarded as the center point position of the fourth detection frame corresponding to the fourth object. After the center point position (fourth center point 807) of the fourth detection frame is obtained, the fourth detection frame may be determined by performing translation adjustment and/or size adjustment on the second sub-detection frame 803; as shown in fig. 8c, the center point of the fourth detection frame 808 is the fourth center point 807. Optionally, two adjacent edges of the second sub-detection frame 803 may be fixed while the positions of the remaining two adjacent edges are adjusted, which amounts to translation adjustment and/or size adjustment of the second sub-detection frame 803. Referring to fig. 8d, which is a schematic diagram of object detection provided in this embodiment of the present application, fig. 8d shows the positions of the second sub-detection frame 803 and the fourth central point 807. In fig. 8d, the positions of the upper edge and the right edge of the second sub-detection frame 803 are fixed, and the positions of the lower edge and the left edge are adjusted; specifically, in order to place the fourth central point at the center of the detection frame, the lower edge is moved upward and the left edge is moved leftward until the fourth central point is located at the central position of the detection frame, so as to obtain the fourth detection frame 808.
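The frame splitting and the symmetric placement of the fourth center point can be illustrated by the following sketch. The box representation, the assumption that the third (occluding) object lies to the left of the splitting line, and the choice of keeping the top and right edges of the sub-frame fixed (as in the fig. 8d illustration) are illustrative assumptions, not limitations of the method.

```python
def split_and_adjust(initial_box, intersection, preset_direction):
    """Split the second initial detection frame at the longitudinal line through the
    intersection point, then place the fourth frame so that its centre approximates
    the reflection of the third centre across the target line.

    initial_box      : (x1, y1, x2, y2), y increasing downward (image coordinates)
    intersection     : (ix, iy) intersection point in the image
    preset_direction : (dx, dy) unit vector of the preset direction in the image
    """
    x1, y1, x2, y2 = initial_box
    ix, iy = intersection

    third_box = (x1, y1, ix, y2)                 # left part: third detection frame
    sx1, sy1, sx2, sy2 = ix, y1, x2, y2          # right part: second sub-detection frame

    # Reflect the third centre across the target line, i.e. the line through the
    # intersection point perpendicular to the preset direction.
    cx, cy = (x1 + ix) / 2.0, (y1 + y2) / 2.0
    dx, dy = preset_direction
    proj = (cx - ix) * dx + (cy - iy) * dy       # signed offset along the preset direction
    fourth_cx, fourth_cy = cx - 2.0 * proj * dx, cy - 2.0 * proj * dy

    # Translate/resize the sub-frame so its centre lands on the reflected point,
    # keeping the top and right edges fixed.
    new_left = 2.0 * fourth_cx - sx2             # (new_left + sx2) / 2 == fourth_cx
    new_bottom = 2.0 * fourth_cy - sy1           # (sy1 + new_bottom) / 2 == fourth_cy
    fourth_box = (new_left, sy1, sx2, new_bottom)
    return third_box, fourth_box
```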
The above is only one illustrative way to obtain the fourth detection frame, and the present application is not limited thereto.
After obtaining a third detection frame corresponding to a third object and a fourth detection frame corresponding to a fourth object, the third detection frame and the fourth detection frame may be output. The third detection frame and the fourth detection frame may be rectangular frames, and may be represented by end points of a diagonal line. That is, in the embodiment of the present application, the end point position of the diagonal line of the third detection frame and the end point position of the diagonal line of the fourth detection frame may be output. Specifically, the information of the third detection frame and the fourth detection frame may be provided as an interface to a module such as a route planning control module of the automatic driving apparatus.
In addition, in this embodiment of the application, occlusion information may be further output, where the occlusion information is used to indicate that the third object is an occluding object and the fourth object is an occluded object. Object association information indicating an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object may also be output.
In this embodiment of the application, occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object may be output. For example, the occluding state and the occluded state may be represented by preset character strings; exemplarily, "1" may mark the third object (the occluding object) and "0" may mark the fourth object (the occluded object). In this embodiment of the application, object association information indicating the association relationship between the third detection frame and the third object and between the fourth detection frame and the fourth object may also be output.
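Purely as an illustration of what such an output interface could look like, a record combining the diagonal end points, the object association, and the occlusion flag might be organized as below; the record layout, field names, and coordinate values are assumptions, not a format defined by this application.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class DetectionOutput:
    """One record handed to, e.g., a route-planning control module."""
    box: Tuple[float, float, float, float]  # diagonal end points (x1, y1, x2, y2)
    object_id: str                          # association: which object this frame belongs to
    occlusion_flag: str                     # "1" = occluding object, "0" = occluded object

# Illustrative values only.
third_output = DetectionOutput(box=(120, 80, 260, 200), object_id="third_object", occlusion_flag="1")
fourth_output = DetectionOutput(box=(250, 60, 380, 180), object_id="fourth_object", occlusion_flag="0")
```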
Referring to fig. 9, fig. 9 is a schematic flowchart of another object detection method provided in the embodiment of the present application, and as shown in fig. 9, the object detection method provided in the embodiment includes:
901. a target image is acquired, wherein the target image comprises a first shadow area and a second shadow area.
The specific description of step 901 may refer to the description of the foregoing embodiments, and is not repeated here.
902. Acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles.
The specific description of step 902 may refer to the description of the foregoing embodiments, and is not repeated here.
903. Determining a second initial detection box, the second initial detection box corresponding to the third object and the fourth object.
The detailed description of step 903 may refer to the description of the foregoing embodiments, and is not repeated here.
904. If the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame by a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object.
The detailed description of step 904 may refer to the description of the above embodiments, and is not repeated here.
905. And outputting the third detection frame and the fourth detection frame.
The specific description of step 905 may refer to the description of the above embodiments, and is not repeated here.
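The flow of steps 901 to 905 can be summarized, for illustration only, by the following sketch, which reuses the edge_line_intersection and split_and_adjust sketches above; the injected callables (segmenter, detector, direction_estimator) and the choice of the rightmost polyline point as the boundary point are assumptions.

```python
import numpy as np

def bottom_edge_polyline(mask):
    """Lowest (largest-y) shadow pixel in each occupied column, left to right."""
    ys, xs = np.nonzero(mask)
    return [(c, int(ys[xs == c].max())) for c in sorted(set(xs.tolist()))]

def detect_occluded_vehicles(target_image, segmenter, detector, direction_estimator):
    """Skeleton of steps 901-905.

    segmenter(image)           -> (first_shadow_mask, second_shadow_mask)
    detector(image)            -> second initial detection frame (x1, y1, x2, y2)
    direction_estimator(image) -> 2D preset direction in image coordinates
    """
    first_shadow, second_shadow = segmenter(target_image)                   # step 901
    first_edge = bottom_edge_polyline(first_shadow)                         # step 902
    second_edge = bottom_edge_polyline(second_shadow)
    initial_box = detector(target_image)                                    # step 903
    direction = direction_estimator(target_image)
    boundary_point = first_edge[-1]               # assumed: rightmost end point
    hit = edge_line_intersection(boundary_point, direction, second_edge)    # step 904
    if hit is None:
        return initial_box, None                  # no occlusion detected; keep the initial frame
    return split_and_adjust(initial_box, hit, direction)                    # frames for step 905
```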
Optionally, the preset direction is consistent with the driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with the current driving lane direction of the third vehicle.
Optionally, in this embodiment, the second initial detection frame may be further divided into the third detection frame and a second sub-detection frame by a longitudinal straight line where the intersection point is located, where a center point of the third detection frame is a third center point; and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
Optionally, when the target image is captured, a distance between the capture point of the target image and the third object is smaller than a distance between the capture point of the target image and the fourth object.
Optionally, when the target image is shot, a distance between the shot point of the target image and the third object is smaller than a preset distance, and a distance between the shot point of the target image and the fourth object is smaller than the preset distance.
Optionally, the preset distance is related to a driving state of a vehicle, the vehicle is a vehicle where a sensor for acquiring the target image is located, and the driving state at least includes one of the following states:
the direction of travel, and the state of the uphill or downhill slope of the road surface on which the vehicle is traveling.
Optionally, the driving direction includes turning left, turning right, or traveling straight. In this embodiment, the preset distance may be determined as a first distance value when the host vehicle is turning left or turning right, and as a second distance value when the host vehicle is traveling straight, where the first distance value is smaller than the second distance value.
Optionally, the uphill and downhill states include an uphill state, a downhill state, and a flat state;
in this embodiment, when the vehicle is in an uphill state, the preset distance may be determined to be a third distance value; when the vehicle is in a downhill state, determining the preset distance as a fourth distance value; and when the vehicle is in a flat slope state, determining that the preset distance is a fifth distance value, wherein the third distance value is smaller than or equal to the fifth distance value, and the fifth distance value is smaller than the fourth distance value.
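A minimal sketch of this selection logic is given below; the numeric defaults are placeholders that merely respect the stated orderings (first value smaller than second value; third value at most the fifth value, which is smaller than the fourth value) and are not values disclosed by this application.

```python
def distance_from_driving_direction(direction, first_value=20.0, second_value=40.0):
    """Left or right turn -> first distance value; traveling straight -> second value.
    The embodiment only requires first_value < second_value."""
    return first_value if direction in ("left", "right") else second_value

def distance_from_slope(slope, third_value=25.0, fourth_value=45.0, fifth_value=30.0):
    """Uphill -> third value, downhill -> fourth value, flat -> fifth value.
    The embodiment only requires third_value <= fifth_value < fourth_value."""
    if slope == "uphill":
        return third_value
    if slope == "downhill":
        return fourth_value
    return fifth_value
```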
Optionally, this embodiment may further output occlusion information, where the occlusion information is used to indicate that the third object is an occluding object and the fourth object is an occluded object.
Optionally, in this embodiment, object association information may also be output, where the object association information is used to indicate an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object.
The embodiment of the application provides a target detection method, which comprises the following steps: acquiring a target image, wherein the target image comprises a first shadow area and a second shadow area; acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles; determining a second initial detection box, the second initial detection box corresponding to the third object and the fourth object; if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object; and outputting the third detection frame and the fourth detection frame. Through the above manner, after the first edge line of the first shadow region and the second edge line of the second shadow region are identified, the third detection frame of the third object and the fourth detection frame of the fourth object can be obtained directly according to the intersection points of the first edge line and the second edge line along the preset direction.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a target detection apparatus provided in an embodiment of the present application, where the target detection apparatus includes:
an obtaining module 1001 configured to obtain a target image; acquiring a first area and a second area which are included in the target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles;
a determining module 1002, configured to determine, if a common boundary line exists between the first area and the second area, a first detection frame corresponding to the first object and a second detection frame corresponding to the second object, where the first detection frame includes the first area, and the second detection frame includes the second area;
an output module 1003, configured to output the first detection frame and the second detection frame.
Optionally, the first detection frame is a circumscribed rectangle frame of the first region, and the second detection frame is a circumscribed rectangle frame of the second region.
Optionally, the end points of the common boundary line between the first area and the second area are a first boundary point and a second boundary point, a connecting line between the first boundary point and the second boundary point is a target boundary line, and the determining module is specifically configured to:
determining a first initial detection box, the first initial detection box corresponding to the first object and/or the second object;
determining a first central point according to the first initial detection frame and the target boundary line, wherein the distance between the longitudinal coordinate of the first central point in the target image and the longitudinal coordinate of the midpoint of the target boundary line in the target image is within a preset range, and the distance between the transverse coordinate of the first central point in the target image and the transverse coordinate of the central point of the first initial detection frame in the target image is within a preset range;
and determining a first detection frame corresponding to the first object according to the first central point and the first area, wherein the distance between the central point of the first detection frame and the first central point is within a preset range, and the first detection frame comprises the first area.
Optionally, the determining module is specifically configured to:
determining a second central point according to the first central point and the target boundary line, wherein the distance between the second central point and a symmetrical point of the first central point relative to the target boundary line is smaller than a preset value;
and determining a second detection frame corresponding to the second object according to the second center point and the second area, wherein the distance between the center point of the second detection frame and the second center point is within a preset range, and the second detection frame comprises the second area.
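For illustration only, the construction of the first and second central points for the common-boundary case can be sketched as follows, assuming image pixel coordinates, frames given as (x1, y1, x2, y2), the target boundary line given by its two end points, and numpy being available; none of this is a limitation of the apparatus.

```python
import numpy as np

def first_and_second_centers(initial_box, boundary_p1, boundary_p2):
    """Sketch of the centre-point construction for the common-boundary case.

    initial_box              : (x1, y1, x2, y2) first initial detection frame
    boundary_p1, boundary_p2 : end points of the target boundary line (image coordinates)
    """
    x1, y1, x2, y2 = initial_box
    a = np.asarray(boundary_p1, dtype=float)
    b = np.asarray(boundary_p2, dtype=float)
    mid = (a + b) / 2.0

    # First centre: ordinate taken from the boundary-line midpoint,
    # abscissa taken from the centre of the first initial detection frame.
    first_center = np.array([(x1 + x2) / 2.0, mid[1]])

    # Second centre: the symmetric point of the first centre
    # with respect to the target boundary line.
    d = (b - a) / np.linalg.norm(b - a)          # unit direction of the boundary line
    v = first_center - a
    second_center = a + 2.0 * (v @ d) * d - v    # reflection across the line
    return first_center, second_center
```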
Optionally, when the target image is captured, a distance between a capture point of the target image and the first object is smaller than a distance between the capture point of the target image and the second object.
Optionally, when the target image is shot, a distance between the shot point of the target image and the first object is greater than a preset distance, and a distance between the shot point of the target image and the second object is greater than the preset distance.
Optionally, the output module is further configured to:
outputting occlusion information indicating that the first object is an occluding object and the second object is an occluded object.
Optionally, the output module is further configured to:
outputting object association information indicating an association relationship between the first detection frame and the first object and an association relationship between the second detection frame and the second object.
Optionally, the target image further includes a first shadow area and a second shadow area, and the obtaining module is further configured to:
acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles;
the determining module is further configured to:
determining a second initial detection box, the second initial detection box corresponding to the third object and the fourth object;
if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object;
the output module is further configured to:
and outputting the third detection frame and the fourth detection frame.
Optionally, the preset direction is consistent with the driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with the current driving lane direction of the third vehicle.
Optionally, the determining module is specifically configured to:
dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point;
and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
Optionally, when the target image is captured, a distance between the capture point of the target image and the third object is smaller than a distance between the capture point of the target image and the fourth object.
Optionally, when the target image is shot, a distance between the shot point of the target image and the third object is smaller than a preset distance, and a distance between the shot point of the target image and the fourth object is smaller than the preset distance.
Optionally, the preset distance is related to a driving state of a vehicle, the vehicle is a vehicle where a sensor for acquiring the target image is located, and the driving state at least includes one of the following states:
the direction of travel, and the state of the uphill or downhill slope of the road surface on which the vehicle is traveling.
Optionally, the driving direction includes left turning, right turning, or straight traveling, and the determining module is further configured to:
when the vehicle turns left or right, determining the preset distance as a first distance value;
and when the vehicle is traveling straight, determining the preset distance as a second distance value, wherein the first distance value is smaller than the second distance value.
Optionally, the uphill and downhill states include an uphill state, a downhill state, and a flat state;
the determining module is further configured to:
when the vehicle is in an uphill state, determining the preset distance as a third distance value;
when the vehicle is in a downhill state, determining the preset distance as a fourth distance value;
and when the vehicle is in a flat slope state, determining that the preset distance is a fifth distance value, wherein the third distance value is smaller than or equal to the fifth distance value, and the fifth distance value is smaller than the fourth distance value.
Optionally, the output module is further configured to:
outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
Optionally, the output module is further configured to:
outputting object association information indicating an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object.
The present application further provides an object detection apparatus, referring to fig. 10, fig. 10 is a schematic structural diagram of an object detection apparatus provided in an embodiment of the present application, and as shown in fig. 10, the object detection apparatus includes:
an obtaining module 1001 configured to obtain a target image, where the target image includes a first shadow region and a second shadow region;
acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles;
a determining module 1002, configured to determine a second initial detection box, where the second initial detection box corresponds to the third object and the fourth object;
if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object;
an output module 1003, configured to output the third detection frame and the fourth detection frame.
Optionally, the preset direction is consistent with the driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with the current driving lane direction of the third vehicle.
Optionally, the determining module is specifically configured to:
dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point;
and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
Optionally, when the target image is captured, a distance between the capture point of the target image and the third object is smaller than a distance between the capture point of the target image and the fourth object.
Optionally, when the target image is shot, a distance between the shot point of the target image and the third object is smaller than a preset distance, and a distance between the shot point of the target image and the fourth object is smaller than the preset distance.
Optionally, the preset distance is related to a driving state of a vehicle, the vehicle is a vehicle where a sensor for acquiring the target image is located, and the driving state at least includes one of the following states:
the direction of travel, and the state of the uphill or downhill slope of the road surface on which the vehicle is traveling.
Optionally, the driving direction includes left turning, right turning, or straight traveling, and the determining module is further configured to:
when the vehicle turns left or right, determining the preset distance as a first distance value;
and when the vehicle is traveling straight, determining the preset distance as a second distance value, wherein the first distance value is smaller than the second distance value.
Optionally, the uphill and downhill states include an uphill state, a downhill state, and a flat state;
the determining module is further configured to:
when the vehicle is in an uphill state, determining the preset distance as a third distance value;
when the vehicle is in a downhill state, determining the preset distance as a fourth distance value;
and when the vehicle is in a flat slope state, determining that the preset distance is a fifth distance value, wherein the third distance value is smaller than or equal to the fifth distance value, and the fifth distance value is smaller than the fourth distance value.
Optionally, the output module is further configured to:
outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
Optionally, the output module is further configured to:
outputting object association information indicating an association relationship between the third detection frame and the third object and an association relationship between the fourth detection frame and the fourth object.
Referring to fig. 11, fig. 11 is a schematic structural diagram of a terminal device according to an embodiment of the present disclosure, and a terminal device 1100 may be embodied as an automatic driving apparatus in fig. 1 or fig. 2, which is not limited herein. Specifically, the terminal device 1100 includes: a receiver 1101, a transmitter 1102, a processor 1103 and a memory 1104 (wherein the number of processors 1103 in the terminal device 1100 may be one or more, one processor is taken as an example in fig. 11). In some embodiments of the present application, the receiver 1101, the transmitter 1102, the processor 1103, and the memory 1104 may be connected by a bus or other means.
The memory 1104, which may include both read-only memory and random-access memory, provides instructions and data to the processor 1103. A portion of the memory 1104 may also include non-volatile random access memory (NVRAM). The memory 1104 stores operating instructions, executable modules, or data structures, or a subset or an expanded set thereof, where the operating instructions may include various operating instructions for performing various operations.
The processor 1103 controls the operation of the terminal device. In a specific application, the various components of the terminal device are coupled together by a bus system, wherein the bus system may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus. For clarity of illustration, the various buses are referred to in the figures as a bus system.
The method disclosed in the embodiments of the present application can be applied to the processor 1103 or implemented by the processor 1103. The processor 1103 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in software form in the processor 1103. The processor 1103 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components. The processor 1103 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 1104, and the processor 1103 reads the information in the memory 1104 and performs the steps of the above method in combination with its hardware.
The receiver 1101 may be used to receive input numeric or character information and generate signal inputs related to the relevant settings and function control of the terminal device. The transmitter 1102 may be configured to output numeric or character information via the first interface; the transmitter 1102 is also operable to send instructions to the disk groups via the first interface to modify data in the disk groups; the transmitter 1102 may also include a display device such as a display screen.
In the embodiment of the present application, in one case, the processor 1103 is configured to execute the steps related to the processing in the target detection method in the foregoing embodiment.
There is also provided in an embodiment of the present application a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of the object detection method.
Also provided in an embodiment of the present application is a computer-readable storage medium in which a program for signal processing is stored, which, when run on a computer, causes the computer to perform the steps of the object detection method in the method described in the foregoing embodiment.
The present application further provides a vehicle comprising a processor and a memory, wherein the processor is configured to retrieve and execute code in the memory to perform the object detection method of any of the above embodiments.
Alternatively, the vehicle may be a smart vehicle that supports an autonomous driving function.
It should be noted that the above-described embodiments of the apparatus are merely schematic, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiments of the apparatus provided in the present application, the connection relationship between the modules indicates that there is a communication connection therebetween, and may be implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus necessary general-purpose hardware, and certainly can also be implemented by special-purpose hardware including special-purpose integrated circuits, special-purpose CPUs, special-purpose memories, special-purpose components, and the like. Generally, functions performed by computer programs can be easily implemented by corresponding hardware, and the specific hardware structures implementing the same function may be various, such as analog circuits, digital circuits, or dedicated circuits. However, for the present application, a software implementation is preferable in more cases. Based on such understanding, the technical solutions of the present application may be substantially embodied in the form of a software product, which is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a training device, or a network device) to execute the methods according to the embodiments of the present application.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the procedures or functions described in accordance with the embodiments of the application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, training device, or data center to another website, computer, training device, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a training device or a data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), among others.
Claims (29)
1. A method of object detection, comprising:
acquiring a target image;
acquiring a first area and a second area which are included in the target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles;
if the first area and the second area have a common boundary line, determining a first detection frame corresponding to the first object and determining a second detection frame corresponding to the second object, wherein the first detection frame comprises the first area, and the second detection frame comprises the second area;
and outputting the first detection frame and the second detection frame.
2. The method of claim 1, wherein the first detection box is a bounding rectangle of the first region and the second detection box is a bounding rectangle of the second region.
3. The method according to claim 1, wherein end points of a common boundary line between the first region and the second region are a first boundary point and a second boundary point, and a connecting line between the first boundary point and the second boundary point is a target boundary line, and the determining the first detection frame corresponding to the first object includes:
determining a first initial detection box, the first initial detection box corresponding to the first object and the second object;
determining a first central point according to the first initial detection frame and the target boundary line, wherein the distance between the longitudinal coordinate of the first central point in the target image and the longitudinal coordinate of the midpoint of the target boundary line in the target image is within a preset range, and the distance between the transverse coordinate of the first central point in the target image and the transverse coordinate of the central point of the first initial detection frame in the target image is within a preset range;
and determining a first detection frame corresponding to the first object according to the first central point and the first area, wherein the distance between the central point of the first detection frame and the first central point is within a preset range, and the first detection frame comprises the first area.
4. The method of claim 3, wherein the determining the second detection frame corresponding to the second object comprises:
determining a second central point according to the first central point and the target boundary line, wherein the distance between the second central point and a symmetrical point of the first central point relative to the target boundary line is smaller than a preset value;
and determining a second detection frame corresponding to the second object according to the second center point and the second area, wherein the distance between the center point of the second detection frame and the second center point is within a preset range, and the second detection frame comprises the second area.
5. The method according to claim 3 or 4, wherein, when the target image is captured, a distance between a capture point of the target image and the first object is smaller than a distance between the capture point and the second object.
6. The method according to any one of claims 1 to 5, wherein, when the target image is captured, a distance between a capture point of the target image and the first object is greater than a preset distance, and a distance between the capture point of the target image and the second object is greater than the preset distance.
7. The method of claim 6, further comprising:
outputting occlusion information indicating that the first object is an occluding object and the second object is an occluded object.
8. A method of object detection, the method comprising:
acquiring a target image, wherein the target image comprises a first shadow area and a second shadow area;
acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles;
determining a second initial detection box, the second initial detection box corresponding to the third object and the fourth object;
if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object;
and outputting the third detection frame and the fourth detection frame.
9. The method according to claim 8, wherein the preset direction is consistent with a driving direction of the third vehicle or the fourth vehicle, or the preset direction is consistent with a lane line direction currently driven by the third vehicle.
10. The method according to claim 8 or 9, wherein the determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object comprises:
dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point;
and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
11. The method according to any one of claims 8 to 10, wherein, when the target image is captured, a distance between a capture point of the target image and the third object is smaller than a distance between the capture point and the fourth object.
12. The method according to any one of claims 8 to 11, wherein, when the target image is captured, a distance between the capture point of the target image and the third object is less than a preset distance, and a distance between the capture point of the target image and the fourth object is less than a preset distance.
13. The method according to any one of claims 8 to 12, further comprising:
outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
14. An object detection device, comprising:
the acquisition module is used for acquiring a target image; acquiring a first area and a second area which are included in the target image, wherein a boundary line of the first area corresponds to a contour line of a first object in the target image, a boundary line of the second area corresponds to a contour line of a second object in the target image, the target image includes the first object and the second object, and the first object and the second object are pedestrians or vehicles;
a determining module, configured to determine a first detection frame corresponding to the first object and a second detection frame corresponding to the second object if a common boundary line exists between the first area and the second area, where the first detection frame includes the first area and the second detection frame includes the second area;
and the output module is used for outputting the first detection frame and the second detection frame.
15. The apparatus of claim 14, wherein the first detection box is a bounding rectangle of the first region, and wherein the second detection box is a bounding rectangle of the second region.
16. The apparatus according to claim 14 or 15, wherein the end points of the common boundary line between the first area and the second area are a first boundary point and a second boundary point, and a connecting line between the first boundary point and the second boundary point is a target boundary line, and the determining module is specifically configured to:
determining a first initial detection box, the first initial detection box corresponding to the first object and the second object;
determining a first central point according to the first initial detection frame and the target boundary line, wherein the distance between the longitudinal coordinate of the first central point in the target image and the longitudinal coordinate of the midpoint of the target boundary line in the target image is within a preset range, and the distance between the transverse coordinate of the first central point in the target image and the transverse coordinate of the central point of the first initial detection frame in the target image is within a preset range;
and determining a first detection frame corresponding to the first object according to the first central point and the first area, wherein the distance between the central point of the first detection frame and the first central point is within a preset range, and the first detection frame comprises the first area.
17. The apparatus of claim 16, wherein the determining module is specifically configured to:
determining a second central point according to the first central point and the target boundary line, wherein the distance between the second central point and a symmetrical point of the first central point relative to the target boundary line is smaller than a preset value;
and determining a second detection frame corresponding to the second object according to the second center point and the second area, wherein the distance between the center point of the second detection frame and the second center point is within a preset range, and the second detection frame comprises the second area.
18. The apparatus according to claim 16 or 17, wherein, when the target image is captured, a distance between a capture point of the target image and the first object is smaller than a distance between the capture point and the second object.
19. The apparatus according to any one of claims 14 to 18, wherein, when the target image is captured, a distance between a capture point of the target image and the first object is greater than a preset distance, and a distance between the capture point of the target image and the second object is greater than the preset distance.
20. The apparatus of claim 19, wherein the output module is further configured to:
outputting occlusion information indicating that the first object is an occluding object and the second object is an occluded object.
21. An object detection device, comprising:
the system comprises an acquisition module, a processing module and a display module, wherein the acquisition module is used for acquiring a target image, and the target image comprises a first shadow area and a second shadow area;
acquiring a first edge line of the first shadow area and a second edge line of the second shadow area, wherein the first edge line is an edge line of a longitudinal bottom of the first shadow area, the second edge line is an edge line of a longitudinal bottom of the second shadow area, the first shadow area is a shadow area of a third object, the second shadow area is a shadow area of a fourth object, and the third object and the fourth object are pedestrians or vehicles;
a determining module to determine a second initial detection box, the second initial detection box corresponding to the third object and the fourth object;
if the boundary point of the first edge line and the second edge line have an intersection point along a preset direction, dividing the second initial detection frame through a longitudinal straight line where the intersection point is located, and determining a third detection frame corresponding to the third object and a fourth detection frame corresponding to the fourth object;
and the output module is used for outputting the third detection frame and the fourth detection frame.
22. The apparatus of claim 21, wherein the predetermined direction is the same as a driving direction of the third vehicle or the fourth vehicle, or the predetermined direction is the same as a lane line direction in which the third vehicle is currently driving.
23. The apparatus according to claim 21 or 22, wherein the determining module is specifically configured to:
dividing the second initial detection frame into a third detection frame and a second sub-detection frame through a longitudinal straight line where the intersection point is located, wherein the center point of the third detection frame is a third center point;
and according to the third central point, determining the fourth detection frame by performing translation adjustment and/or size adjustment on the second sub-detection frame, wherein the distance between the fourth central point of the fourth detection frame and a symmetrical point of the third central point relative to a target line is smaller than a preset value, the target line passes through the intersection point, and the direction of the target line is perpendicular to the preset direction.
24. The apparatus according to any one of claims 21 to 23, wherein a distance between a point of photographing of the target image and the third object is smaller than a distance between the point of photographing and the fourth object when the target image is photographed.
25. The apparatus according to any one of claims 21 to 24, wherein when the target image is captured, a distance between the capture point of the target image and the third object is less than a preset distance, and a distance between the capture point of the target image and the fourth object is less than a preset distance.
26. The apparatus of any one of claims 21 to 25, wherein the output module is further configured to:
outputting occlusion information indicating that the third object is an occluding object and the fourth object is an occluded object.
27. A computer-readable storage medium comprising a program which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 13.
28. An object detection apparatus comprising a processor and a memory, the processor coupled with the memory,
the memory is used for storing programs;
the processor configured to execute the program in the memory to cause the terminal device to perform the method of any one of claims 1 to 13.
29. A vehicle, comprising the object detection apparatus according to any one of claims 14 to 26.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010113268.1A CN113297881B (en) | 2020-02-24 | 2020-02-24 | Target detection method and related device |
PCT/CN2021/077537 WO2021169964A1 (en) | 2020-02-24 | 2021-02-24 | Target detection method and related device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010113268.1A CN113297881B (en) | 2020-02-24 | 2020-02-24 | Target detection method and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113297881A (en) | 2021-08-24 |
CN113297881B CN113297881B (en) | 2024-05-14 |
Family
ID=77317866
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010113268.1A Active CN113297881B (en) | 2020-02-24 | 2020-02-24 | Target detection method and related device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113297881B (en) |
WO (1) | WO2021169964A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115616560A (en) * | 2022-12-02 | 2023-01-17 | 广汽埃安新能源汽车股份有限公司 | Vehicle obstacle avoidance method and device, electronic equipment and computer readable medium |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114332708A (en) * | 2021-12-29 | 2022-04-12 | 深圳市商汤科技有限公司 | Traffic behavior detection method and device, electronic equipment and storage medium |
CN113987667B (en) * | 2021-12-29 | 2022-05-03 | 深圳小库科技有限公司 | Building layout grade determining method and device, electronic equipment and storage medium |
CN115249355B (en) * | 2022-09-22 | 2022-12-27 | 杭州枕石智能科技有限公司 | Object association method, device and computer-readable storage medium |
CN115690767B (en) * | 2022-10-26 | 2023-08-22 | 北京远度互联科技有限公司 | License plate recognition method, license plate recognition device, unmanned aerial vehicle and storage medium |
CN118429372A (en) * | 2024-04-19 | 2024-08-02 | 山东龙佰钛业科技有限公司 | Visual-aided zircon granularity refinement detection method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110489202A (en) * | 2018-05-14 | 2019-11-22 | 帝斯贝思数字信号处理和控制工程有限公司 | The method quickly evaluated blocked for the perspectivity in the emulation of imaging sensor |
US20190392242A1 (en) * | 2018-06-20 | 2019-12-26 | Zoox, Inc. | Instance segmentation inferred from machine-learning model output |
CN110807385A (en) * | 2019-10-24 | 2020-02-18 | 腾讯科技(深圳)有限公司 | Target detection method and device, electronic equipment and storage medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107145816A (en) * | 2017-02-24 | 2017-09-08 | 北京悉见科技有限公司 | Object identifying tracking and device |
WO2019127227A1 (en) * | 2017-12-28 | 2019-07-04 | Intel Corporation | Vehicle sensor fusion |
CN108376235A (en) * | 2018-01-15 | 2018-08-07 | 深圳市易成自动驾驶技术有限公司 | Image detecting method, device and computer readable storage medium |
CN110059547B (en) * | 2019-03-08 | 2021-06-25 | 北京旷视科技有限公司 | Target detection method and device |
2020
- 2020-02-24: CN application CN202010113268.1A (patent CN113297881B), status: Active
2021
- 2021-02-24: WO application PCT/CN2021/077537 (publication WO2021169964A1), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2021169964A1 (en) | 2021-09-02 |
CN113297881B (en) | 2024-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113297881A (en) | Target detection method and related device | |
US10513269B2 (en) | Road profile along a predicted path | |
US9846812B2 (en) | Image recognition system for a vehicle and corresponding method | |
US10081308B2 (en) | Image-based vehicle detection and distance measuring method and apparatus | |
EP2575078B1 (en) | Front vehicle detecting method and front vehicle detecting apparatus | |
CN111091037B (en) | Method and device for determining driving information | |
US11577748B1 (en) | Real-time perception system for small objects at long range for autonomous vehicles | |
CN113492851A (en) | Vehicle control device, vehicle control method, and computer program for vehicle control | |
US20170259814A1 (en) | Method of switching vehicle drive mode from automatic drive mode to manual drive mode depending on accuracy of detecting object | |
US11829153B2 (en) | Apparatus, method, and computer program for identifying state of object, and controller | |
CN111213153A (en) | Target object motion state detection method, device and storage medium | |
CN108475471B (en) | Vehicle determination device, vehicle determination method, and computer-readable recording medium | |
US10846546B2 (en) | Traffic signal recognition device | |
WO2019065970A1 (en) | Vehicle exterior recognition device | |
CN113435237A (en) | Object state recognition device, recognition method, recognition program, and control device | |
JP7323356B2 (en) | PARKING ASSIST DEVICE AND PARKING ASSIST METHOD | |
EP3410345B1 (en) | Information processing apparatus and non-transitory recording medium storing thereon a computer program | |
US20230316539A1 (en) | Feature detection device, feature detection method, and computer program for detecting feature | |
US20230177844A1 (en) | Apparatus, method, and computer program for identifying state of lighting | |
JP2022123153A (en) | Stereo image processing device and stereo image processing method | |
EP4336466A2 (en) | Method and apparatus for modeling object, storage medium, and vehicle control method | |
US20240017748A1 (en) | Device, method, and computer program for lane determination | |
US20230260294A1 (en) | Apparatus, method, and computer program for estimating road edge | |
KR102119678B1 (en) | Lane detection method and electronic device performing the method | |
CN114084153A (en) | Object detection device, object detection method, and computer program for object detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||