WO2022166606A1 - Target detection method and device - Google Patents

Target detection method and device

Info

Publication number
WO2022166606A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
pixel
processed
marker
pixels
Prior art date
Application number
PCT/CN2022/072994
Other languages
English (en)
French (fr)
Inventor
马志贤
池清华
云一宵
郑迪威
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022166606A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/136 Segmentation; Edge detection involving thresholding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/40 Analysis of texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle

Definitions

  • The present application relates to the field of sensor technology, and in particular to a target detection method and device.
  • Intelligent terminals such as intelligent transportation equipment, smart home equipment, and robots are gradually entering people's daily lives.
  • Sensors play a very important role in intelligent terminals.
  • Various sensors installed on an intelligent terminal, such as millimeter-wave radar, lidar, cameras, and ultrasonic radar, perceive the surrounding environment while the terminal moves, collect data, identify and track moving objects, and identify static scene elements such as lane lines, signs, and construction areas; this information is combined with navigator and map data for path planning.
  • ADAS Advanced Driver Assistance Systems
  • ADS Automated Driving System
  • AEB Advanced Emergency Braking
  • The sensors for vehicle scene perception mainly include millimeter-wave radar, lidar, and cameras. Because a construction area carries high-level semantic information, millimeter-wave radar often cannot perceive complex construction areas.
  • The sensors adopted by current road construction area detection schemes are mainly lidars and cameras; cameras have the advantages of low cost, small size, easy deployment, and easy maintenance, and are widely used.
  • Camera-based methods for detecting construction area markers fall mainly into two categories: image processing-based methods, such as using template matching to search for construction area markers in images; and machine learning-based methods, such as using the color or texture features of construction area markers and designing a classifier to classify pixels or areas in the image to determine construction area markers.
  • The above methods for detecting construction area markers carry a risk of false detection, and the risk is especially high when detecting non-columnar façade markers such as road rails, water horses (water-filled barriers), and cement stone piers. Therefore, how to reduce the risk of false detection and improve the detection accuracy has become an urgent problem to be solved.
  • the present application proposes a target detection method and device.
  • An embodiment of the present application provides a target detection method. The method includes: acquiring an image to be processed; binarizing the image to be processed according to a feature of a marker, and determining a candidate area in the image to be processed,
  • where the candidate area includes at least one pixel, the candidate area includes N columns of pixels, and N is a positive integer; determining the number of pixels in each of the N columns of pixels; determining the feature value of the pixel with the largest row number in each of the N columns of pixels, where the feature value includes the number of pixels in the corresponding column; determining an image to be compared, where the image to be compared includes at least one pixel; determining the theoretical feature value corresponding to each pixel, where the theoretical feature value is determined based on the reference height information of the marker; matching the feature values of the pixels of the candidate area in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared; when the feature value of the first pixel in
  • A candidate region in the image to be processed is determined according to the feature of the marker, and the feature value of the pixel with the largest row number in each column of the candidate region, together with the theoretical feature value corresponding to each pixel in the image to be compared, is determined. The feature value includes the number of pixels in the corresponding column, and the theoretical feature value is determined based on the reference height information of the marker. The feature values of the pixels in the candidate region are then matched with the theoretical feature values of the corresponding pixels in the image to be compared.
  • In this way, the reference height information of the marker is further used to detect the markers in the construction area. Through this double screening, other objects with the same or similar features as the markers can be effectively filtered out, reducing the risk of false detection and improving the detection accuracy. In addition, determining candidate regions in the image to be processed by binarization improves the accuracy of the preliminary screening of markers and greatly reduces the amount of image data involved in subsequent processing, improving detection efficiency.
  • Determining the theoretical feature value corresponding to each pixel includes: determining, according to the reference height information of the marker and the calibration parameters of the image acquisition device that collects the image to be processed, the theoretical feature value corresponding to each pixel in the image to be compared.
  • The theoretical feature value corresponding to each pixel in the image to be compared is determined based on the reference height information of the marker and the calibration parameters of the image acquisition device that collects the image to be processed. The theoretical feature value corresponding to a pixel represents the theoretical height information of the construction area marker when the marker is located at that pixel of the candidate area.
  • The theoretical feature values are determined using the calibration parameters of the image acquisition device that collects the image to be processed, so that the determined theoretical feature values accurately match the characteristics of the image acquisition device.
  • The pixels of the marker can then be accurately determined using the determined theoretical feature values.
  • The theoretical feature value corresponding to each pixel in the image to be compared can be determined in advance from the calibration parameters of the image acquisition device and the reference height information of the marker, so it does not need to be refreshed at high frequency between frames, which effectively reduces the processing delay and improves the detection efficiency.
  • The reference height information of the marker includes the height of the marker in the world coordinate system.
  • Determining the theoretical feature value corresponding to each pixel in the image to be compared according to the reference height information of the marker and the calibration parameters of the image acquisition device that collects the image to be processed includes: determining, according to the calibration parameters of the image acquisition device, the transformation matrix from the world coordinate system to the coordinate system of the image acquisition device; and projecting, according to the transformation matrix, the marker at at least one position in the world coordinate system into the coordinate system of the image acquisition device.
  • Determining the feature value of the pixel with the largest row number in each of the N columns of pixels includes: determining the maximum row number and the minimum row number among all the pixels in each of the N columns; and determining the difference between the maximum row number and the minimum row number as the feature value of the pixel with the largest row number in that column.
  • The pixel with the largest row number represents the possible junction point between a construction area marker contained in the candidate area and the road surface, and its feature value can represent the approximate height, in the binarized image, of the construction area marker that the candidate area may contain. In this way, the approximate height information of the object represented by each column in the candidate region can be represented by the feature value of the pixel with the largest row number in that column.
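  • The per-column feature value described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `column_feature_values` and the representation of the binarized candidate region as a list of rows of 0/1 values are assumptions, and row numbers are assumed to increase downward in the image.

```python
def column_feature_values(mask):
    """Map each non-empty column to (bottom_row, feature_value).

    mask: binarized candidate region as a list of rows of 0/1 values
    (an assumed representation). The feature value is the difference
    between the largest and smallest foreground row numbers in a column.
    """
    features = {}
    if not mask:
        return features
    n_cols = len(mask[0])
    for col in range(n_cols):
        # Row numbers of all foreground pixels in this column.
        rows = [r for r, line in enumerate(mask) if line[col]]
        if not rows:
            continue  # column holds no candidate pixels
        bottom, top = max(rows), min(rows)
        # The bottom pixel (largest row number) carries the feature value.
        features[col] = (bottom, bottom - top)
    return features
```

  • The result is keyed by column so that each bottom pixel, the assumed marker/road junction point, can later be looked up against a theoretical value at the same image position.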
  • Matching the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared includes: determining the pixels in the candidate area in the image to be processed that satisfy a range condition, where a pixel satisfies the range condition if it falls within the range of the road surface in the image to be processed; and matching the feature values of the pixels that satisfy the range condition with the theoretical feature values of the corresponding pixels in the image to be compared.
  • Feature value matching can thus be performed only on the pixels within the range of the road surface in the image to be processed, that is, only construction area markers placed on the road surface are detected, which effectively improves the detection efficiency.
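  • A minimal sketch of the range condition and the matching step is given below. The acceptance criterion is an assumption: the text here does not state how closely a feature value must agree with the theoretical value, so a relative tolerance `rel_tol` is used purely for illustration. The function name, the `features` dictionary layout (column mapped to bottom row and feature value), and the 2D-list layouts of `theoretical` and `road_mask` are likewise hypothetical.

```python
def match_features(features, theoretical, road_mask, rel_tol=0.2):
    """Return (row, col) of pixels whose feature value matches the theory.

    features:    {col: (row, value)} for the bottom pixel of each column.
    theoretical: 2D list of theoretical feature values per pixel.
    road_mask:   2D list of booleans, True where the pixel is road surface.
    rel_tol:     assumed relative tolerance; not specified by the source.
    """
    matched = []
    for col, (row, value) in sorted(features.items()):
        if not road_mask[row][col]:
            continue  # range condition: pixel must lie on the road surface
        expected = theoretical[row][col]
        # Assumed criterion: accept when the deviation is within rel_tol.
        if expected > 0 and abs(value - expected) <= rel_tol * expected:
            matched.append((row, col))
    return matched
```

  • Applying the road-surface check before the comparison means pixels outside the road never reach the matching step.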
  • Binarizing the image to be processed according to the feature of the marker and determining a candidate region in the image to be processed includes: binarizing the image to be processed according to the preset color feature and/or texture feature of the marker to obtain a binarized image; and determining an area in the binarized image that satisfies the second condition as the candidate area.
  • Binarization is performed according to the preset color features and/or texture features of the markers, which improves the accuracy of the preliminary screening of construction area markers and greatly reduces the amount of image data, improving detection efficiency. It also allows other objects in the image to be processed that have the same or similar color and/or texture features as the markers to enter the subsequent screening process, effectively improving the recall rate of construction area marker detection.
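  • Color-based binarization can be sketched as below. This is an illustration under stated assumptions: the hue, saturation, and value thresholds are hypothetical and not taken from the application, the image is assumed to be a list of rows of 8-bit (R, G, B) tuples, and texture features and the candidate-area condition are not modeled.

```python
import colorsys


def binarize_by_color(image, hue_range=(0.0, 0.17), min_sat=0.5, min_val=0.4):
    """Return a 0/1 mask marking pixels whose color resembles a marker.

    image: list of rows of (r, g, b) tuples with 8-bit channels.
    hue_range, min_sat, min_val: assumed thresholds covering roughly
    red through yellow; reds whose hue wraps near 1.0 are ignored here.
    """
    lo, hi = hue_range
    mask = []
    for line in image:
        out = []
        for r, g, b in line:
            # colorsys expects channel values scaled to [0, 1].
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            keep = lo <= h <= hi and s >= min_sat and v >= min_val
            out.append(1 if keep else 0)
        mask.append(out)
    return mask
```

  • Loose thresholds are deliberate in this sketch: objects with similar colors are allowed through so that a later screening stage can reject them.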
  • The method further includes: determining a lane line in the image to be processed; determining the lane where the first pixel is located according to the position of the lane line and the position of the first pixel; and, when the number of first pixels in the lane exceeds a third threshold, determining the horizontal and vertical boundaries of the candidate area where the first pixels are located.
  • The lateral and vertical boundaries of the lanes occupied by the construction area are determined, which refines the description of how the construction area occupies lanes and provides richer information for further vehicle control.
  • The method further includes: determining, according to the first pixels in multiple images to be processed, the lane where the first pixel in the current image to be processed is located. For example, using multiple images to be processed that are consecutive with the current image, the first pixels in those images can be fused along the time axis to determine the lane where the first pixel in the current image to be processed is located.
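  • The lane assignment and threshold test can be sketched as follows. This is a strong simplification for illustration only: lane lines are assumed to be straight and described by a single lateral offset each (for example in a vehicle body coordinate system), and the names `occupied_lanes`, `points_x`, and `lane_lines_x` are hypothetical. Boundary extraction and temporal fusion are not modeled.

```python
from bisect import bisect_right
from collections import Counter


def occupied_lanes(points_x, lane_lines_x, third_threshold):
    """Return indices of lanes holding more than third_threshold points.

    points_x:     lateral positions of matched (first) pixels.
    lane_lines_x: lateral offsets of lane lines (assumed straight lines).
    Lane i lies between sorted lane lines i and i + 1.
    """
    lines = sorted(lane_lines_x)
    counts = Counter()
    for x in points_x:
        idx = bisect_right(lines, x) - 1
        # Ignore points left of the first line or right of the last one.
        if 0 <= idx < len(lines) - 1:
            counts[idx] += 1
    return sorted(lane for lane, n in counts.items() if n > third_threshold)
```

  • Requiring more than a threshold number of matched pixels per lane keeps isolated detections from marking a lane as occupied.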
  • the marker includes a marker for dividing a construction area.
  • the marker may include a marker for dividing the construction area, so that the construction area can be sensed by detecting the marker, and support for vehicle travel path planning and the like can be provided.
  • An embodiment of the present application provides a target detection device. The device includes: an acquisition module, configured to acquire an image to be processed; and a processing module, configured to: binarize the image to be processed according to the feature of the marker, and determine a candidate area in the image to be processed, where the candidate area includes at least one pixel, the candidate area includes N columns of pixels, and N is a positive integer; determine the number of pixels in each of the N columns of pixels; determine the feature value of the pixel with the largest row number in each of the N columns of pixels, where the feature value includes the number of pixels in the corresponding column; determine an image to be compared, where the image to be compared includes at least one pixel; determine the theoretical feature value corresponding to each pixel, where the theoretical feature value is determined based on the reference height information of the marker; and match the feature values of the pixels of the candidate area in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared
  • The processing module is further configured to: determine, based on the reference height information of the marker and the calibration parameters of the image acquisition device that acquires the image to be processed, the theoretical feature value corresponding to each pixel in the image to be compared.
  • the reference height information of the marker includes the height of the marker in the world coordinate system;
  • The processing module is further configured to: determine a transformation matrix from the world coordinate system to the coordinate system of the image acquisition device according to the calibration parameters of the image acquisition device; project, according to the transformation matrix, the marker at at least one position in the world coordinate system into the coordinate system of the image acquisition device, to obtain the pixel in the image to be compared corresponding to the position and the number of pixels that the marker occupies in the image to be compared; and take the number of pixels that the marker occupies in the image to be compared as the theoretical feature value corresponding to that pixel.
  • The processing module is further configured to: determine the maximum row number and the minimum row number among all the pixels in each of the N columns of pixels; and determine the difference between the maximum row number and the minimum row number as the feature value of the pixel with the largest row number in that column.
  • The processing module is further configured to: determine the pixels in the candidate region in the image to be processed that satisfy the range condition, where a pixel satisfies the range condition if it falls within the range of the road surface in the image to be processed; and match the feature values of the pixels that satisfy the range condition with the theoretical feature values of the corresponding pixels in the image to be compared.
  • The processing module is further configured to: binarize the image to be processed according to the preset color feature and/or texture feature of the marker to obtain a binarized image; and determine an area in the binarized image that satisfies the second condition as the candidate area.
  • The processing module is further configured to: determine a lane line in the image to be processed; determine the lane where the first pixel is located according to the position of the lane line and the position of the first pixel; and, when the number of first pixels in the lane exceeds a third threshold, determine the horizontal and vertical boundaries of the candidate area where the first pixels are located.
  • The processing module is further configured to: determine, according to the first pixels in multiple images to be processed, the lane where the first pixel in the current image to be processed is located. For example, using multiple images to be processed that are consecutive with the current image, the first pixels in those images can be fused along the time axis to determine the lane where the first pixel in the current image to be processed is located.
  • the marker includes a marker for dividing a construction area.
  • An embodiment of the present application provides a target detection device, including: at least one sensor, configured to collect an image to be processed; a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the target detection method of the first aspect or of one or more of the possible implementations of the first aspect.
  • Embodiments of the present application provide a target detection apparatus, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the target detection method of the first aspect or of one or more of the possible implementations of the first aspect.
  • Embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored; when executed by a processor, the computer program instructions implement the target detection method of the first aspect or of one or more of the possible implementations of the first aspect.
  • Embodiments of the present application provide a computer program product containing instructions which, when run on a computer, cause the computer to execute the target detection method of the first aspect or of one or more of the possible implementations of the first aspect.
  • An embodiment of the present application further provides an automatic driving assistance system. The system includes the target detection apparatus of the second aspect or of one or more of the possible implementations of the second aspect, and at least one sensor configured to collect the above image to be processed.
  • An embodiment of the present application further provides a vehicle, where the vehicle includes the target detection apparatus of the second aspect or of one or more of the possible implementations of the second aspect.
  • FIG. 1 shows a schematic diagram of a camera projection according to an embodiment of the present application.
  • FIG. 2 shows a schematic diagram of a marker in a construction area according to an embodiment of the present application.
  • FIG. 3 shows a schematic diagram of a construction area detection according to an embodiment of the present application.
  • FIG. 4 shows a schematic diagram of another construction area detection according to an embodiment of the present application.
  • FIG. 5 shows a schematic diagram of another construction area detection according to an embodiment of the present application.
  • FIG. 6 shows a schematic diagram of markers and other traffic objects according to an embodiment of the present application.
  • FIG. 7 shows a schematic diagram of a possible application scenario to which the target detection method according to an embodiment of the present application is applicable.
  • FIG. 8 shows a flowchart of a target detection method according to an embodiment of the present application.
  • FIG. 9 shows a schematic diagram of an image to be processed according to an embodiment of the present application.
  • FIG. 10 shows a schematic diagram of another image to be processed according to an embodiment of the present application.
  • FIG. 11 shows a schematic diagram of a binarized image according to an embodiment of the present application.
  • FIG. 12 shows a schematic diagram of determining the number of pixels in each column of a candidate region according to an embodiment of the present application.
  • FIG. 13 shows a schematic diagram of pixels with feature values in a candidate region according to an embodiment of the present application.
  • FIG. 14 shows a schematic diagram of a preset area according to an embodiment of the present application.
  • FIG. 15 shows an inverse three-dimensional projected height heat map according to an embodiment of the present application.
  • FIG. 16 shows a schematic diagram of pixels that satisfy the range condition in the candidate area according to an embodiment of the present application.
  • FIG. 17 shows a height heat map corresponding to a candidate region according to an embodiment of the present application.
  • FIG. 18 shows a schematic diagram of a target detection result according to an embodiment of the present application.
  • FIG. 19 shows a schematic diagram of the lane where the first pixel is located (in the vehicle body coordinate system) according to an embodiment of the present application.
  • FIG. 20 shows a schematic structural diagram of a target detection apparatus according to an embodiment of the present application.
  • FIG. 21 shows a schematic structural diagram of another target detection apparatus according to an embodiment of the present application.
  • FIG. 1 shows a schematic diagram of a camera projection according to an embodiment of the present application. As shown in FIG. 1, when the camera 101 observes a point 112 (A) in three-dimensional space, the point can be mapped through a camera model (e.g., pinhole imaging) to the position of point 114 (A′) in the two-dimensional image plane 105.
  • Ideal ground plane assumption (flat-earth assumption): It is assumed that the road on which the vehicle travels is an ideal plane (such as plane 108 in FIG. 1). Based on this assumption, a back-projection from the two-dimensional image plane to the ideal ground plane can be realized even without height information of objects in three-dimensional space. For example, a pixel in the two-dimensional image plane 105 in FIG. 1, such as point 115 (B′), maps to a corresponding point 113 (B) on the ideal ground plane 108 in world space.
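  • Under the flat-ground assumption, the back-projection of a pixel onto the ground plane can be sketched as below. The camera model is a simplifying assumption, not the calibration model of the application: an ideal pinhole camera with a horizontal optical axis, mounted `cam_height` above the ground, with intrinsics `fx`, `fy`, `cx`, `cy`. All names are hypothetical.

```python
def backproject_to_ground(row, col, fx, fy, cx, cy, cam_height):
    """Map an image pixel to (lateral, forward) on the ideal ground plane.

    Assumes a pinhole camera whose optical axis is parallel to the
    ground and which sits cam_height above it (a simplification).
    Returns None for pixels at or above the horizon, whose viewing
    rays never reach the ground.
    """
    # Downward slope of the viewing ray per unit forward distance.
    dy = (row - cy) / fy
    if dy <= 0:
        return None
    forward = cam_height / dy               # depth at which the ray hits the ground
    lateral = forward * (col - cx) / fx     # sideways offset at that depth
    return lateral, forward
```

  • For instance, with `fx = fy = 1000`, `cx = 640`, `cy = 360`, and a camera 1.5 m above the road, pixel (row 510, col 740) lands about 10 m ahead and 1 m to the side.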
  • Inverse three-dimensional projected height heat map: a two-dimensional image matrix. Let R and C be the numbers of rows and columns of the matrix; (u, v, h) represents the element in the u-th row and v-th column of the matrix, whose value is h, the inverse three-dimensional projected height heat value. Taking FIG. 1 as an example, the heat value is calculated as follows:
  • The point 115 (B′) in the two-dimensional image plane 105 in FIG. 1 (the coordinates of point B′ in the image coordinate system are denoted (u_B′, v_B′)) is back-projected onto the plane 108, where it corresponds to the point 113 (B); the coordinates of point 113 (B) in the world coordinate system are denoted (x, y, 0). An object 107 with a physical height of H in world space is placed vertically at point 113 (B): one end of the object 107 is at point 113 (B), the other end is at point 112 (A), and the position of point 112 (A) in the world coordinate system is denoted (x, y, H). Based on the calibration parameters of the camera 101, the three-dimensional projection transformation matrix M_VI between the image coordinate system and the world coordinate system is calculated; based on M_VI, point 112 (A) (x, y, H) is projected
  • This h is the heat value of point 115 (B′), recorded as (u_B′, v_B′, h), and (u_B′, v_B′, h) is filled into the inverse three-dimensional projected height heat map.
  • Look-up table: a data structure (for example, an array) that replaces repeated online computation with simple indexing operations. The inverse three-dimensional projected height heat map, as a two-dimensional matrix, can also be regarded as a kind of look-up table, and the look-up table in this embodiment of the present application can be constructed based on the above inverse three-dimensional projected height heat map.
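  • The construction of such a look-up table can be sketched as follows. Two assumptions are made and should be read as such: the heat value h is taken to be the row distance between the projected base and the projected top of a marker of height `marker_height` (the passage above is cut off before it states how h is computed), and the camera is the same simplified pinhole model with a horizontal optical axis rather than a general transformation matrix. All names are hypothetical.

```python
def build_height_heatmap(n_rows, n_cols, fy, cy, cam_height, marker_height):
    """Precompute, per pixel, the expected pixel height of a marker.

    For each pixel treated as the marker's ground contact point, the
    pixel is back-projected to the ground, a marker of marker_height is
    stood there, and its top is projected back into the image.
    Pixels at or above the horizon get 0.0.
    """
    table = []
    for row in range(n_rows):
        dy = (row - cy) / fy
        if dy <= 0:
            h = 0.0  # ray never reaches the ground plane
        else:
            depth = cam_height / dy
            # Image row of the marker top standing at this depth.
            top_row = cy + fy * (cam_height - marker_height) / depth
            h = row - top_row
        # With no camera roll, h depends only on the row.
        table.append([h] * n_cols)
    return table
```

  • Because the table depends only on the camera parameters and the reference height, it can be built once and then indexed for every frame, which is the benefit the look-up table is meant to provide.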
  • FIG. 2 shows a schematic diagram of a marker in a construction area according to an embodiment of the present application.
  • the construction area markers may include: traffic cones 201, road rails 202, water horses 203, cement stone piers 204, construction vehicles 205, construction personnel 206 and warning signs 207 and other markers.
  • Non-columnar façade object: a class of objects whose basic shape is a rectangle or a convex polygon and which are placed perpendicular to the ground. For example, markers such as the road rail 202, the water horse 203, and the cement stone pier 204 in FIG. 2 are such objects.
  • FIG. 3 shows a schematic diagram of a construction area detection according to an embodiment of the present application.
  • The image in FIG. 3(a) contains construction area markers in non-columnar façade form, and the image in FIG. 3(b) contains construction area markers in columnar façade form.
  • Cones, cylinders, poles, and other construction area markers with columnar features in or near the road are detected, and the number of construction area markers in the area (the rectangular frame in FIG. 3(b)) is counted to determine whether there is a construction area in the current image or video frame captured by the vehicle-mounted camera.
  • This approach has a weak ability to detect construction area markers in non-columnar façade form; at the same time, determining the construction area by counting markers carries a risk of false detection (the area in the rectangular frame in FIG. 3(b) is not the actual construction area); in addition, the lane where the construction area is located is not clearly identified.
  • FIG. 4 shows a schematic diagram of another construction area detection according to an embodiment of the present application.
  • A convolutional neural network (CNN) is trained on a set of training images that include various construction-related objects, including construction area markers, in order to classify and identify such markers and to mark them with two-dimensional labeling frames. Whether there are construction areas in the environment that affect driving is determined based on the number and location of construction-related objects; distance judgment and ray projection are then combined to locate the objects in three-dimensional space, and the various settings of temporary traffic jams are determined based on the properties of the construction objects relative to the road lanes.
  • The two-dimensional annotation frame may introduce potential positioning errors: detections are output in the form of the dotted box 402 in FIG. 4(a) and the dotted box 405 in FIG. 4(b); when the actual road rail is placed as shown by 404 in FIG. 4, there is an error relative to the actual center point 408, resulting in an error in the positioning of the road rail.
  • methods based on CNN or machine learning have a strong dependence on training data, and have high requirements on the computing power and hardware storage space of sensing devices, which are not suitable for deployment on platforms with low power consumption and low computing power.
  • FIG. 5 shows a schematic diagram of another construction area detection according to an embodiment of the present application.
  • Construction area markers colored yellow, orange, or red (including traffic cones, traffic columns, road rails, etc.) are detected.
  • The three-channel red, green, and blue (RGB) image acquired by the camera is processed to obtain two images called the luminance map and the orange map.
  • A sliding window method based on template matching is used to detect areas with bright orange texture; at the same time, a classifier is used to judge whether the area contains stripes, to determine whether the area is a construction area.
  • FIG. 6 shows a schematic diagram of a marker and other traffic objects according to an embodiment of the present application, wherein FIG. 6(a) is an example of a marker having the color and texture characteristics of construction area markers, and FIG. 6(b) shows examples of traffic objects other than markers.
  • the color and texture features of the construction area markers such as road rails, cement stone piers, etc. in Figure 6(a)
  • Other traffic objects such as road edges, lane lines, container vehicles, etc.
  • A target detection method is provided that improves the detection of construction area markers, especially non-columnar facade markers such as road rails, water-filled barriers and cement piers, while reducing detection risk and improving detection accuracy.
  • The target detection method provided in the embodiments of the present application can be applied to a target detection system that includes at least one image acquisition device and a target detection device. The target detection device may be set up independently, integrated in a control device, or implemented in software or in a combination of software and hardware, which is not specifically limited.
  • The target detection system can be applied to advanced driver assistance systems (ADAS) and automated driving systems (ADS), and to various driving functions such as adaptive cruise control (ACC) and automatic emergency braking (AEB). It can also be applied to scenarios such as device-to-device (D2D) communication, vehicle-to-everything (V2X) communication, vehicle-to-vehicle (V2V) communication, long term evolution-vehicle (LTE-V) and long term evolution-machine (LTE-M).
  • The image acquisition device can be one or more cameras installed on the vehicle body or on another mobile smart terminal, for example a surround-view camera, a monocular camera or a binocular camera. It captures the surroundings of the vehicle or other mobile smart terminal, generates images and/or videos, and transmits them to the target detection device. The target detection device performs target detection on the images and/or videos captured by the image acquisition device; the targets can include parking spaces, people, obstacles, lane lines, construction areas, etc., so that the surroundings of the vehicle or other mobile smart terminal can be understood.
  • the object detection device can be a vehicle with object detection function, or other components with object detection function.
  • The target detection device includes but is not limited to a vehicle-mounted terminal, vehicle-mounted controller, vehicle-mounted module, vehicle-mounted component, vehicle-mounted chip, vehicle-mounted unit, or a sensor such as a vehicle-mounted radar or vehicle-mounted camera; the vehicle can implement the method provided in this application through any of these.
  • The target detection device may also be a smart terminal other than a vehicle that has a target detection function, or may be set in such a smart terminal or in a component of such a smart terminal.
  • the intelligent terminal may be other terminal equipment such as intelligent transportation equipment, smart home equipment, robots, and drones.
  • the target detection device includes, but is not limited to, a smart terminal or a controller, a chip, other sensors such as radar or a camera, and other components in the smart terminal.
  • the target detection apparatus may be a general-purpose device or a special-purpose device.
  • The apparatus can also be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or another device with processing functions.
  • the embodiment of the present application does not limit the type of the target detection device.
  • the target detection apparatus may also be a chip or processor with a processing function, and the target detection apparatus may include a plurality of processors.
  • the processor can be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the chip or processor with processing function may be arranged in the sensor, or may not be arranged in the sensor, but arranged at the receiving end of the output signal of the sensor.
  • FIG. 7 shows a schematic diagram of a possible application scenario to which the target detection method according to an embodiment of the present application is applicable.
  • The application scenario may be an automatic driving scenario that includes at least one vehicle 701. At least one image acquisition device 702 is installed in the vehicle 701, and the vehicle 701 also includes a target detection device (not shown in the figure). The image acquisition device 702 captures the road environment in front of the vehicle and generates a corresponding image; there may be a target area 703 (for example, a construction area) on the road in front of the vehicle. The image acquisition device 702 transmits the captured image to the target detection device, which is configured to execute the target detection method in the embodiments of the present application.
  • Only one vehicle, one image acquisition device and one construction area are shown in FIG. 7. It should be understood that this does not limit the number of vehicles, image acquisition devices or construction areas in the application scenario.
  • The application scenario may include more vehicles, more image acquisition devices and more target areas, which are not shown here; the road in front of the vehicle may be a structured road or an unstructured road. The embodiments of the present application do not limit this.
  • the target detection method provided by the present application will be described below with reference to FIG. 7 .
  • FIG. 8 shows a flowchart of a target detection method according to an embodiment of the present application.
  • the execution body of the method may be the target detection device in the vehicle in the above-mentioned FIG. 7 , and the method may include the following steps:
  • Step 801: The target detection device acquires the image to be processed.
  • the image to be processed may be an image or video frame of the surrounding environment of the vehicle collected in real time by the image collecting device installed on the vehicle in FIG. 7 . It can be understood that, during the driving process of the vehicle, the to-be-processed image collected at each moment may include construction area markers or may not include construction area markers.
  • The image acquisition device may be a camera installed above the middle of the vehicle logo, or at the rear-view mirror behind the front windshield of the vehicle, etc. The installation roll angle and pitch angle of the camera relative to the vehicle body can be small (e.g., approximately zero). The camera can capture real-time images of the road ahead of the vehicle and its surroundings and transmit the images to the target detection device in real time.
  • FIG. 9 shows a schematic diagram of an image to be processed according to an embodiment of the present application; as shown in FIG. 9, the image to be processed includes lanes, lane lines, other vehicles, construction area markers and environmental information on both sides of the road.
  • The target detection device can crop the image to be processed to the part where the road is located, thereby reducing the amount of data to be processed and improving target detection efficiency.
  • The target detection device can identify the vanishing point of the road in the image to be processed (that is, the point where the end of the road meets the sky), use the horizontal line through the vanishing point as the boundary to divide the image to be processed into two parts, and take the part containing the road as the new image to be processed.
  • FIG. 10 shows a schematic diagram of another image to be processed according to an embodiment of the present application; as shown in FIG. 10, after FIG. 9 is divided into two parts at the vanishing point of the road, the part containing the road is kept. Compared with FIG. 9, the image to be processed in FIG. 10 omits parts that are unrelated to target detection, such as the sky.
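  • The cropping step can be sketched as follows. This is a minimal illustration only: it assumes the row of the vanishing point has already been found by some other means, and the function name `crop_below_vanishing_row` and the toy image are hypothetical, not taken from the application.

```python
def crop_below_vanishing_row(image, vanishing_row):
    """Keep only the rows at or below the horizontal line through the vanishing point.

    `image` is a list of rows (image[v][u]), with row 0 at the top.
    """
    if not 0 <= vanishing_row < len(image):
        raise ValueError("vanishing_row must be a valid row index of the image")
    return image[vanishing_row:]


# Toy 4-row image: rows 0-1 stand for sky, rows 2-3 stand for road.
toy_image = [["sky"] * 3, ["sky"] * 3, ["road"] * 3, ["road"] * 3]
road_part = crop_below_vanishing_row(toy_image, vanishing_row=2)
```

  • Row indices in the cropped image are shifted by `vanishing_row`, so any later comparison against an uncropped image has to add that offset back.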
  • Step 802: The target detection device performs binarization processing on the image to be processed according to the features of the marker and determines a candidate region in the image to be processed, wherein the candidate region includes at least one pixel and includes N columns of pixels, where N is a positive integer.
  • Binarizing the image to be processed according to the features of the marker greatly reduces the amount of image data and improves detection efficiency. It also screens out the candidate region of the marker and completes the preliminary screening of construction area markers, which allows other objects in the image to be processed that match the features of the marker to enter the subsequent screening process and effectively improves the recall rate of construction area marker detection.
  • the marker may include a marker for dividing the construction area, so that by detecting the marker, the construction area can be sensed, and support for the planning of the driving path of the vehicle, etc. can be provided.
  • The marker can be any one of the traffic cones, road rails, water-filled barriers, concrete piers, construction vehicles, construction personnel and warning signs shown in FIG. 2 above. It can be understood that different markers may have different features, and the features of the marker may be preset. In the embodiments of the present application, the road rail 202 in FIG. 2 is taken as an example of the marker to illustrate the detection capability of the target detection method for construction area markers, especially non-columnar facade markers such as road rails, water-filled barriers and cement piers.
  • Performing binarization processing on the image to be processed according to the features of the marker to determine the candidate region in the image to be processed may include: performing binarization processing on the image to be processed according to the preset color features and/or texture features of the marker to obtain a binarized image; and determining a region in the binarized image that satisfies the second condition as the candidate region.
  • binarization processing can be performed according to the preset color features and/or texture features of markers, so as to improve the accuracy of preliminary screening of markers in the construction area, and the amount of image data is greatly reduced to improve detection efficiency; Allow other objects in the image to be processed that have the same or similar color features and/or texture features as the markers to enter the subsequent screening process, effectively improving the recall rate of marker detection in the construction area.
  • The color space of the image to be processed may be the RGB color space (also called the blue green red (BGR) color space), the YUV color space (wherein Y represents the luminance (luma), U represents the hue, and V represents the color saturation), etc.
  • Taking an image to be processed in the RGB color space as an example, the image is a three-channel image of W × H pixels, where W represents the width and H represents the height of the image in pixels, and the three channels correspond to the three primary colors red (R), green (G) and blue (B) respectively. The image is converted into a single-channel image by formula (1):
$$I_{1c}(u, v) = a_R R + a_G G + a_B B \tag{1}$$
  • where $I_{1c}(u, v)$ represents the pixel value of the pixel (u, v) in the single-channel image; R, G and B represent the values of the red, green and blue channels of the pixel (u, v) respectively; and $a_R$, $a_G$ and $a_B$ represent the weighting coefficients of the red, green and blue channels respectively. $a_R$, $a_G$ and $a_B$ can be determined according to the color feature and/or texture feature of the marker.
  • The single-channel image is then binarized by formula (2):
$$I_{bin}(u, v) = \begin{cases} 1, & T_B \le I_{1c}(u, v) \le T_U \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
  • where $I_{bin}(u, v)$ represents the pixel value of the pixel (u, v) in the binarized image; $I_{1c}(u, v)$ represents the pixel value of the pixel (u, v) in the single-channel image; $T_B$ represents the lower threshold; and $T_U$ represents the upper threshold. $T_B$ and $T_U$ can be determined according to the color feature and/or texture feature of the marker.
  • In this way, the pixel value of each pixel of the image to be processed in the original color space is converted into the corresponding pixel value in the binarized image, that is, 1 or 0. The second condition can be set as: the values of all pixels in the region are 1. The region of the binarized image that satisfies the second condition is screened out, and this region is the candidate region.
  • For example, formula (1) and formula (2) are used to binarize the image to be processed shown in FIG. 10. With the weighting coefficients $a_R$, $a_G$ and $a_B$ set to 0.5, -0.25 and -0.25 respectively, formula (1) becomes $I_{1c}(u, v) = 0.5R - 0.25G - 0.25B$. Each pixel of the image to be processed in FIG. 10 is processed by formula (1) to obtain a single-channel image, and the single-channel image is then binarized by formula (2) to obtain a binarized image.
  • FIG. 11 shows a schematic diagram of a binarized image according to an embodiment of the present application. As shown in FIG. 11, pixels whose value in the single-channel image lies between the lower threshold $T_B$ and the upper threshold $T_U$ are set to 1 in the binarized image (the white area in FIG. 11), and the remaining pixels are set to 0 (the black area in FIG. 11). In the binarized image, a pixel with a value of 1 may belong to the road rail, and a pixel with a value of 0 does not belong to the road rail.
  • the second condition is set as the value of all pixel points in the region is 1, that is, the white region in the binarized image in FIG. 11 is determined as the candidate region.
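  • The two-stage binarization described above (a weighted sum of the color channels followed by a band threshold) can be sketched as below. The weights 0.5, -0.25 and -0.25 come from the example in the text; the threshold values 40 and 128, the function name `binarize` and the toy pixels are assumptions made purely for illustration, since the text does not give concrete thresholds.

```python
def binarize(image, weights=(0.5, -0.25, -0.25), t_lower=40.0, t_upper=128.0):
    """Binarize an RGB image given as rows of (R, G, B) tuples.

    Each pixel is first reduced to one channel by a weighted sum of R, G, B,
    then set to 1 if that value lies in [t_lower, t_upper], else 0.
    The default thresholds are illustrative assumptions.
    """
    a_r, a_g, a_b = weights
    binary = []
    for row in image:
        out_row = []
        for r, g, b in row:
            single_channel = a_r * r + a_g * g + a_b * b
            out_row.append(1 if t_lower <= single_channel <= t_upper else 0)
        binary.append(out_row)
    return binary


# Toy 2x2 image: orange-like pixels in column 0, grey and green in column 1.
toy_rgb = [
    [(255, 120, 0), (128, 128, 128)],
    [(250, 100, 20), (30, 200, 30)],
]
mask = binarize(toy_rgb)
```

  • With these weights a neutral grey pixel maps to 0 and a strongly red or orange pixel maps to a large positive value, which is why a band threshold on the single channel separates marker-colored pixels from the road surface.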
  • Step 803: The target detection device determines the number of pixels in each of the N columns of pixels.
  • The target detection device may determine the number of pixels in each of the N columns of pixels of the candidate region in the binarized image determined in step 802.
  • The target detection device can search for the maximum row number and the minimum row number among all pixels of each column in the candidate region and calculate the difference between them; this difference is taken as the number of pixels in the column.
  • FIG. 12 shows a schematic diagram of determining the number of pixels in each column of a candidate region according to an embodiment of the present application. As shown in FIG. 12, the coordinates of each pixel in the binarized image are expressed as (u, v), where u represents the column and v represents the row in which the pixel is located. For each column u of the white area in the binarized image, the maximum row value (e.g., point P1 in FIG. 12) and the minimum row value (e.g., point P2 in FIG. 12) are searched for, and the difference between the two is calculated; the difference is the number of pixels in the column, denoted as $h_D$.
  • Step 804: The target detection device determines the feature value of the pixel with the largest row number in each of the N columns of pixels, wherein the feature value includes the number of pixels in the corresponding column.
  • On the basis of the number of pixels in each of the N columns of the candidate region obtained in step 803, the target detection device may determine the number of pixels in each column as the feature value of the pixel with the largest row number in that column.
  • Determining the feature value of the pixel with the largest row number in each of the N columns of pixels includes: determining the maximum row number and the minimum row number among all pixels in each of the N columns; and determining the difference between the maximum row number and the minimum row number as the feature value of the pixel with the largest row number in that column.
  • In each column, the pixel with the largest row number represents the junction point between the road surface and a construction area marker that the candidate region may contain, and the feature value of that pixel represents the approximate height of that marker in the image. In this way, the approximate height information of the object in each column of the candidate region is represented by the feature value of the pixel with the largest row number in that column.
  • For example, the number of pixels $h_D$ in the column where the points P1 and P2 are located is determined as the feature value of the point P1. If the white area where the point P1 is located is a road rail enclosing a construction area, the point P1 is the junction point between the road rail and the road surface, and the feature value of the point P1 represents the approximate height, in the binarized image, of the road rail whose junction point with the road surface is P1. By traversing all the columns of the candidate region, the feature value of the pixel with the largest row number in each column is obtained.
  • FIG. 13 shows a schematic diagram of pixels with eigenvalues in a candidate region according to an embodiment of the present application; as shown in FIG. 13 , the pixels shown in the figure are all pixels with eigenvalues in the candidate region. It can be seen that the outline of the bottom edge of the marker in the construction area can be clearly displayed in the binarized image shown in Figure 13.
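  • Steps 803 and 804 can be sketched together as a single pass over the columns of the binarized image. The sketch below follows the text in taking the feature value as the difference between the largest and smallest row index of the candidate pixels in a column; the function name `column_feature_values` and the toy mask are hypothetical.

```python
def column_feature_values(binary):
    """Return {(u, v_max): feature_value} for every column that has candidate pixels.

    `binary` is a list of rows of 0/1 values (binary[v][u]). For each column u,
    v_max and v_min are the largest and smallest row index holding a 1, and the
    feature value attached to the bottom pixel (u, v_max) is v_max - v_min.
    """
    features = {}
    if not binary:
        return features
    for u in range(len(binary[0])):
        rows = [v for v, row in enumerate(binary) if row[u] == 1]
        if not rows:
            continue  # this column holds no candidate pixel
        v_max, v_min = max(rows), min(rows)
        features[(u, v_max)] = v_max - v_min
    return features


toy_mask = [
    [0, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
]
feature_map = column_feature_values(toy_mask)
```

  • Only the bottom pixel of each column carries a feature value, which matches the observation that the resulting pixels trace the lower outline of the marker.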
  • Step 805: The target detection device determines the image to be compared, wherein the image to be compared includes at least one pixel, and determines the theoretical feature value corresponding to each pixel, wherein the theoretical feature value is determined based on the reference height information of the marker.
  • The image to be compared may be a pre-built image or an image constructed in real time, and may have the same size as the image to be processed (that is, contain the same number of pixels). For example, the image to be compared can be an inverse three-dimensional projection height heat map, in which the heat value of each pixel is the theoretical feature value corresponding to that pixel.
  • Determining the theoretical feature value corresponding to each pixel may include: determining the theoretical feature value corresponding to each pixel in the image to be compared according to the reference height information of the marker and the calibration parameters of the image acquisition device that collects the image to be processed. In this way, the theoretical feature value of a pixel in the image to be compared represents the theoretical height information of a construction area marker located at that pixel (that is, the number of pixels that the marker occupies in the column of that pixel in the image to be compared).
  • Because the theoretical feature values are determined using the calibration parameters of the image acquisition device that collects the image to be processed, they accurately match the characteristics of the image acquisition device, and the pixels of the marker can therefore be accurately determined using them.
  • The theoretical feature value corresponding to each pixel in the image to be compared can be determined in advance according to the calibration parameters of the image acquisition device and the reference height information of the marker, so that no high-frequency refresh between frames is needed, which effectively reduces processing delay and improves detection efficiency.
  • the calibration parameters of the image acquisition device may include: the internal parameter matrix K of the image acquisition device, the rotation matrix R and translation vector T of the image acquisition device relative to the vehicle body, etc.
  • The reference height information of the marker may include the height of the marker in the world coordinate system. For example, if the marker is a road rail placed perpendicular to the road surface, its height in the world coordinate system can be 1 meter.
  • The theoretical feature value corresponding to each pixel in the image to be compared can be determined in advance according to the reference height information of the marker and the calibration parameters of the image acquisition device that collects the image to be processed, so as to obtain a pre-built inverse three-dimensional projection height heat map (or look-up table) of the marker. The theoretical feature values can also be determined in real time according to the same information, so as to obtain an inverse three-dimensional projection height heat map (or look-up table) of the marker constructed in real time.
  • Determining in advance the theoretical feature value corresponding to each pixel in the image to be compared may include: determining the transformation matrix from the world coordinate system to the coordinate system of the image acquisition device according to the calibration parameters of the image acquisition device; projecting, according to the transformation matrix, the marker at at least one position in the world coordinate system into the coordinate system of the image acquisition device to obtain the pixel in the image to be compared that corresponds to the position and the number of pixels that the marker occupies in the image to be compared; and taking that number of pixels as the theoretical feature value corresponding to the pixel.
  • the world coordinate system may be the coordinate system where the ground plane (that is, the actual road surface) in the above-mentioned FIG. 1 is located, and the coordinate system of the image acquisition device may be the coordinate system where the two-dimensional image (that is, the image to be compared) in FIG. 1 is located;
  • 103 and 104 respectively represent the u direction and v direction of the coordinate system of the image acquisition device;
  • 109, 110, and 111 respectively represent the x, y, and z directions in the world coordinate system.
  • The transformation matrix from the world coordinate system to the coordinate system of the image acquisition device is determined by the following formula (3):
$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} R & T \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = M \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \tag{3}$$
  • where u and v represent the coordinates of a point in the u and v directions of the coordinate system of the image acquisition device; x, y and z represent the coordinates of the point in the x, y and z directions of the world coordinate system; K represents the internal parameter matrix of the image acquisition device; R and T represent the rotation matrix and translation vector of the image acquisition device relative to the vehicle body, respectively; s represents the distance of the image acquisition device relative to each point in the world coordinate system; and M represents the transformation matrix from the world coordinate system to the coordinate system of the image acquisition device.
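  • The projection of formula (3) can be sketched in plain Python, assuming the usual pinhole form M = K [R | T]. Every numeric value below is a hypothetical calibration chosen for illustration (focal length 1000 pixels, principal point (640, 360), camera 1.5 m above the ground with no roll, pitch or yaw), and the axis convention (world x forward, y left, z up; camera x right, y down, z forward) is likewise an assumption rather than something stated in the application.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def projection_matrix(K, R, T):
    """Compose the 3 x 4 matrix M = K [R | T]."""
    rt = [list(R[i]) + [T[i]] for i in range(3)]
    return matmul(K, rt)


def project(M, x, y, z):
    """Map a world point (x, y, z) to pixel coordinates (u, v)."""
    world = (x, y, z, 1.0)
    su, sv, s = [sum(M[i][j] * world[j] for j in range(4)) for i in range(3)]
    if s <= 0:
        raise ValueError("point is not in front of the camera")
    return su / s, sv / s


# Hypothetical calibration (all values are illustrative assumptions).
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
R = [[0.0, -1.0, 0.0], [0.0, 0.0, -1.0], [1.0, 0.0, 0.0]]
T = [0.0, 1.5, 0.0]
M = projection_matrix(K, R, T)

u_b, v_b = project(M, 20.0, 0.0, 0.0)  # base of a marker 20 m ahead, on the ground
u_a, v_a = project(M, 20.0, 0.0, 1.0)  # top of the same marker, 1 m above the ground
pixel_height = v_b - v_a               # height of the marker in the image, in pixels
```

  • Under this convention the scale factor s equals the depth of the point along the optical axis, and the 1 m marker placed 20 m ahead spans 50 pixel rows.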
  • According to the transformation matrix, markers placed vertically on the ideal ground plane in the world coordinate system are projected into the coordinate system of the image acquisition device to obtain the coordinates, in the image to be compared, that correspond to the above positions.
  • For example, the line segment between the point A(x, y, 1) and the point B(x, y, 0) can represent a road rail with a height of 1 meter in the world coordinate system. Projecting the points A and B into the coordinate system of the image acquisition device gives the points A' and B'; the line segment between A' and B' represents the theoretical height information of the road rail in the image to be compared, which can be expressed as the number of pixels h between the points A' and B'. The theoretical height information of the road rail in the image to be compared is then the theoretical feature value corresponding to the point B'.
  • By placing a road rail with a height of 1 meter vertically at each position of the ideal ground plane, the theoretical feature value corresponding to each pixel in the image to be compared can be obtained; the image to be compared is then the inverse three-dimensional projection height heat map for the marker.
  • A road rail with a height of 1 meter can be placed vertically in a preset area of the ideal ground plane, and the preset area can be the area where a section of road ahead of the vehicle is located.
  • FIG. 14 shows a schematic diagram of a preset area. As shown by the dashed box in FIG. 14, the preset area can be a rectangular area centered on the vehicle on which the image acquisition device is installed, extending 5 meters to each of the left and right sides of the vehicle and 55 meters ahead of it.
  • FIG. 15 shows an inverse three-dimensional projection height heat map according to an embodiment of the present application. As shown in FIG. 15, it is the inverse three-dimensional projection height heat map for the preset area in FIG. 14.
  • The pixels inside the trapezoidal area correspond to positions within the preset area of the ideal ground plane, and the pixels outside the trapezoidal area correspond to positions outside the preset area. Positions corresponding to pixels outside the trapezoidal area are not within the detection range, so only construction area markers placed on the road surface and within the preset area are detected, which effectively improves detection efficiency.
  • Because the theoretical feature value corresponding to each pixel in the image to be compared is determined in advance, the inverse three-dimensional projection height heat map of the marker is pre-built. Compared with searching for construction area markers pixel by pixel over the whole image, no high-frequency refresh between frames is needed, which effectively reduces processing delay and improves detection efficiency.
  • Alternatively, the theoretical feature value corresponding to each pixel in the image to be compared can be determined in real time based on the reference height information of the marker and the calibration parameters of the image acquisition device that collects the image to be processed, so as to obtain an inverse three-dimensional projection height heat map of the marker constructed in real time.
  • For the pixel with the largest row number in each column of the candidate region determined above, the heat value calculation method described above with reference to FIG. 1 is applied to an object with a physical height H on the ideal ground plane (for example, a road rail with a height of 1 meter), so that the theoretical feature value corresponding to each of these pixels is determined. The image to be compared is then obtained from the coordinates of these pixels and their theoretical feature values; this image to be compared is the inverse three-dimensional projection height heat map for the road rail.
  • In this case, the image to be compared may include only the theoretical feature values corresponding to the pixel with the largest row number in each column of the candidate region, which reduces the amount of data to be processed and improves detection efficiency.
  • Step 806: The target detection device matches the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared; when the feature value of a first pixel of the candidate region in the image to be processed and the theoretical feature value of the corresponding second pixel in the image to be compared satisfy the first condition, the first pixel is a pixel of the marker.
  • The first condition may be determined through experimental statistics based on factors such as the resolution of the image to be processed and the size of the marker.
  • For example, if the marker is a road rail, when the feature value of the first pixel differs from the theoretical feature value of the corresponding second pixel in the image to be compared by no more than a preset pixel value (e.g., 100 pixels, 200 pixels, etc.), the first pixel is determined to be a pixel of the road rail.
  • In this way, construction area markers can be effectively detected, especially markers with non-columnar facade morphology, which improves the generalization ability of the target detection method; at the same time, other objects with similar features, such as container vehicles and lane lines, can be filtered out.
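  • The matching of step 806 can be sketched as a tolerance check between measured and theoretical feature values. Reading the first condition as "the absolute difference does not exceed a preset number of pixels" is an interpretation of the text, and the function name `match_marker_pixels`, the tolerance of 5 pixels and the sample values are all hypothetical.

```python
def match_marker_pixels(features, theoretical, tolerance):
    """Return the pixels whose measured feature value is close to the theoretical one.

    `features` maps (u, v) to the measured feature value from the candidate region;
    `theoretical` maps (u, v) to the theoretical feature value from the image to be
    compared. A pixel is kept when the absolute difference is at most `tolerance`.
    """
    matched = []
    for pixel, measured in features.items():
        expected = theoretical.get(pixel)
        if expected is None:
            continue  # no theoretical value here, e.g. outside the preset area
        if abs(measured - expected) <= tolerance:
            matched.append(pixel)
    return sorted(matched)


measured_features = {(10, 435): 48, (11, 435): 52, (12, 300): 200}
theoretical_features = {(10, 435): 50.0, (11, 435): 50.0, (12, 300): 20.0}
marker_pixels = match_marker_pixels(measured_features, theoretical_features, tolerance=5)
```

  • In this toy data the third pixel is rejected because its column is far taller than a marker standing at that image position should be, which is how tall objects of similar color, such as a container vehicle, would be filtered out.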
  • the matching the eigenvalues of the pixel points in the candidate region in the image to be processed with the theoretical eigenvalues of the corresponding pixel points in the image to be compared includes: determining the eigenvalues in the image to be processed. Pixels that satisfy the range condition in the candidate area; among them, if the pixel is included in the range of the road surface in the image to be processed, the pixel meets the range condition; compare the feature value of the pixel that meets the range condition with the to-be-processed pixel The theoretical eigenvalues of the corresponding pixels in the image are matched.
  • The range of the road surface in the image to be processed can be determined in advance from the calibration parameters of the image acquisition device and the preset area shown in FIG. 14; for example, the trapezoidal area shown in FIG. 15 can be determined as the range of the road surface in the image to be processed. Alternatively, the road edge in the image to be processed can be perceived in real time to determine the range of the road surface in the image to be processed; for example, the road edges and the vanishing line at the end of the road can be identified in the image to be processed, and the area enclosed by them is determined as the range of the road surface in the image to be processed.
  • It is determined whether the pixel with the largest row number in each column of the candidate region obtained in step 804 lies within the range of the road surface in the image to be processed. If it does, the feature value of the pixel is matched with the theoretical feature value of the corresponding pixel in the image to be compared. Feature value matching is thus performed only on pixels within the range of the road surface, that is, only construction area markers placed on the road surface are detected, which effectively improves detection efficiency.
  • FIG. 16 shows a schematic diagram of the pixels in the candidate region that satisfy the range condition according to an embodiment of the present application. As shown in FIG. 16, the solid line represents the road edge in the image to be processed obtained by real-time perception, and the trapezoidal area enclosed by the solid line and the image edge below it is the area where the road surface is located in the image to be processed. Otherwise, FIG. 16 is the same as FIG. 13 above: all pixels of the candidate region that have feature values are within the range of the road surface in the image to be processed, that is, the pixel with the largest row number in each column of the candidate region in FIG. 13 satisfies the range condition, so the feature value of each of these pixels is matched with the theoretical feature value of the corresponding pixel in the image to be compared.
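The range condition can be illustrated with a minimal sketch. It assumes the road surface is given as a polygon in image coordinates; the function name `point_in_polygon`, the trapezoid vertices, and the sample pixels are hypothetical and do not come from the application.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices in image coordinates
    (x = column index, y = row index).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # Column at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical road-surface trapezoid for a 640 x 480 image.
road_polygon = [(200, 300), (440, 300), (640, 480), (0, 480)]

# Hypothetical bottom pixels (column, row) of a candidate region.
bottom_pixels = [(320, 400), (50, 320), (600, 470)]

# Keep only the pixels that satisfy the range condition.
in_range = [p for p in bottom_pixels if point_in_polygon(p[0], p[1], road_polygon)]
```

Only the pixels kept in `in_range` would go on to feature value matching, which is where the efficiency gain described above would come from.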
  • Matching the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared may include: constructing a height heat map for the candidate region based on the feature values of the pixels of the candidate region in the image to be processed; performing a pixel-by-pixel residual operation between this height heat map and the inverse three-dimensional projection height heat map of the marker constructed in step 805 to obtain a height deviation map; and searching the height deviation map for points whose height deviation is less than the threshold given by the first condition to obtain the first pixels of the candidate region in the image to be processed.
  • By matching the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared in this manner, the first pixels can be screened out quickly.
  • For example, a height heat map corresponding to the candidate region is constructed, as shown in FIG. 17. This height heat map has the same size as the inverse three-dimensional projection height heat map shown in FIG. 15, and the box in it contains the candidate region. A pixel-by-pixel residual operation is performed on FIG. 15 and FIG. 17 to obtain a height deviation map; the points in this map whose height deviation is less than 100 pixels are the first pixels of the candidate region in the image to be processed.
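The residual-based matching can be sketched as follows. This is an illustration under stated assumptions rather than the application's implementation: both maps are plain nested lists of equal size, a value of 0 in the observed map means "no feature value at this pixel" (an assumption made here so that empty pixels are not matched), and all names and numbers are hypothetical.

```python
def height_deviation_map(observed, theoretical):
    """Pixel-by-pixel absolute residual between two equally sized 2-D maps."""
    return [[abs(o - t) for o, t in zip(row_o, row_t)]
            for row_o, row_t in zip(observed, theoretical)]


def find_first_pixels(observed, theoretical, max_deviation):
    """Return (row, col) positions whose height deviation is below max_deviation.

    Pixels without a feature value (observed == 0) are skipped; this is an
    assumption of the sketch, not something the source states.
    """
    deviation = height_deviation_map(observed, theoretical)
    return [(r, c)
            for r, row in enumerate(deviation)
            for c, d in enumerate(row)
            if observed[r][c] > 0 and d < max_deviation]


# Hypothetical 3 x 3 height heat map of the candidate region.
observed_map = [[0, 0, 0],
                [0, 50, 0],
                [0, 0, 200]]
# Hypothetical inverse three-dimensional projection height heat map.
theoretical_map = [[10, 10, 10],
                   [48, 48, 48],
                   [60, 60, 60]]

first_pixels = find_first_pixels(observed_map, theoretical_map, max_deviation=20)
```

In this toy data the pixel at (1, 1) is kept because its height is close to the expected one, while the pixel at (2, 2) is rejected as too tall for the marker.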
  • Matching the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared may alternatively include: using the position of the pixel with the largest row number in each column to look up the theoretical feature value of the corresponding pixel in the inverse three-dimensional projection height heat lookup table of the marker constructed in step 805; and obtaining the difference between the feature value of the pixel in the image to be processed and the theoretical feature value of the corresponding pixel. When the difference is less than the threshold given by the first condition, the pixel in the image to be processed is a first pixel.
  • Matching the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared by table lookup requires processing no more points than the number of columns of the image to be processed, which effectively improves image processing efficiency. This approach is well suited to scenes in which construction areas are sparse (that is, the image contains significantly fewer construction area pixels than non-construction-area pixels).
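A minimal sketch of the table-lookup variant, assuming the lookup table is a nested list indexed as `lookup_table[row][col]` and that the bottom pixels are supplied as a dictionary from position to feature value. These data layouts, the function name, and the numbers are illustrative assumptions.

```python
def match_by_lookup(bottom_pixels, lookup_table, max_deviation):
    """Return the bottom pixels whose feature value is close to the table value.

    bottom_pixels: dict mapping (row, col) -> observed feature value.
    lookup_table:  nested list, lookup_table[row][col] -> theoretical value.
    """
    matched = []
    for (row, col), observed_value in bottom_pixels.items():
        theoretical_value = lookup_table[row][col]
        # First condition, modeled here as a simple deviation threshold.
        if abs(observed_value - theoretical_value) < max_deviation:
            matched.append((row, col))
    return matched


# Hypothetical lookup table for a 3 x 3 image.
lookup_table = [[5, 5, 5],
                [15, 15, 15],
                [30, 30, 30]]

# At most one bottom pixel per image column needs to be checked.
bottom_pixels = {(2, 0): 30, (2, 1): 31, (1, 2): 90}

matched_pixels = match_by_lookup(bottom_pixels, lookup_table, max_deviation=10)
```

Because only one pixel per column is looked up, the work is bounded by the image width rather than by the number of pixels in the image.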
  • For example, the feature value of the point P1 is h_D; according to the coordinates of P1, the corresponding theoretical feature value h_R is looked up in the inverse three-dimensional projection height heat lookup table for the road barrier, and the difference Diff between h_D and h_R is obtained.
  • FIG. 18 shows a schematic diagram of a target detection result according to an embodiment of the present application. The first pixels in the figure are pixels of the road barrier, located at the junction between the road barrier and the road surface, so the boundary between the construction area and the road can be located accurately.
  • Step 807: The target detection device determines the lane occupancy of the construction area.
  • The detected first pixels of the marker are the pixels at the junction between the road surface and the construction area enclosed by the markers. Using the lane lines as a constraint, the lane in which each first pixel lies can be counted to determine which lanes the construction area occupies.
  • Determining the lane occupancy of the construction area may include: determining the lane lines (or virtual lane lines) in the image to be processed; determining the lane where each first pixel is located according to the positions of the lane lines and of the first pixel; and, when the number of first pixels in a lane exceeds a third threshold, determining the lateral and longitudinal boundaries of the candidate region where the first pixels are located.
  • the candidate area where the first pixel is located is the construction area surrounded by markers, and the third threshold can be set according to actual needs, which is not limited here.
  • The lane where a first pixel is located can be determined in the coordinate system of the image acquisition device; for example, the lane lines in the image to be processed are identified, and the lane is determined from the positions of the lane lines and of the first pixel in the image to be processed. The lane can also be determined in the vehicle body coordinate system or the world coordinate system; for example, based on the calibration parameters of the image acquisition device and the projection theorem, the first pixels and the lane lines in the image to be processed are projected into the vehicle body coordinate system (or world coordinate system), and the lane is determined from the positions of the lane lines and of the first pixels in that coordinate system.
  • The number of first pixels in each lane can be determined by voting. When the number of first pixels in a lane exceeds the third threshold, the nearest and farthest distances between the first pixels in that lane and the vehicle in the lateral and longitudinal directions are computed to determine the lateral and longitudinal boundaries of the lane area occupied by the construction area.
  • FIG. 19 shows a schematic diagram of the lane (vehicle body coordinate system) where the first pixels are located according to an embodiment of the present application. As shown in FIG. 19, the minimum X coordinate, maximum X coordinate, minimum Y coordinate, and maximum Y coordinate can be determined among the coordinates of all first pixels in the right lane; the difference between the minimum and maximum Y coordinates is taken as the lateral boundary of the lane occupied by the construction area, and the difference between the minimum and maximum X coordinates is taken as the longitudinal boundary.
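The voting and boundary computation can be sketched as follows. The sketch assumes points already projected into the vehicle body coordinate system, with X longitudinal and Y lateral, and it models each lane as a fixed lateral interval, which is a simplification of real lane lines. The lane names, interval values, and threshold are hypothetical.

```python
from collections import defaultdict


def lane_of(y, lane_bounds):
    """Return the name of the lane whose lateral interval contains y, or None."""
    for name, (y_min, y_max) in lane_bounds.items():
        if y_min <= y < y_max:
            return name
    return None


def occupied_lanes(points, lane_bounds, third_threshold):
    """Vote first pixels into lanes and bound the lanes with enough votes.

    points: list of (x, y) positions of first pixels, in metres.
    Returns a dict: lane name -> min/max X (longitudinal) and Y (lateral).
    """
    votes = defaultdict(list)
    for x, y in points:
        lane = lane_of(y, lane_bounds)
        if lane is not None:
            votes[lane].append((x, y))

    result = {}
    for lane, pts in votes.items():
        if len(pts) > third_threshold:
            xs = [p[0] for p in pts]
            ys = [p[1] for p in pts]
            result[lane] = {"x_min": min(xs), "x_max": max(xs),
                            "y_min": min(ys), "y_max": max(ys)}
    return result


# Hypothetical straight lanes described by lateral intervals in metres.
lane_bounds = {"ego": (-1.75, 1.75), "right": (1.75, 5.25)}
# Hypothetical first pixels after projection to the vehicle body frame.
points = [(3.0, 2.0), (10.0, 2.5), (20.0, 4.0), (50.0, 4.5), (5.0, 0.0)]

occupancy = occupied_lanes(points, lane_bounds, third_threshold=3)
```

The single point that falls in the ego lane does not reach the threshold, so only the right lane is reported as occupied.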
  • In this way, the lanes occupied by the construction area can be determined in the coordinate system of the image acquisition device, the vehicle body coordinate system, or the world coordinate system, which makes the method more flexible and more widely applicable. At the same time, the distances from the construction area to the vehicle can be calculated; the boundary distances in the lateral and longitudinal directions refine the description of the lane occupied by the construction area and provide richer information for subsequent vehicle control.
  • The method further includes: determining the lane where the first pixels in the current image to be processed are located according to the first pixels in multiple images to be processed. For example, the first pixels in multiple images to be processed that are consecutive with the current image can be fused along the time axis to determine the lane where the first pixels in the current image are located. This improves the accuracy and stability of the determined lane occupancy of the construction area.
  • For example, a quadratic polynomial can be fitted to all the first pixels obtained in step 806 to obtain a fitted curve; a motion equation for the fitted curve is established based on a Kalman filter, and the fitted curve is tracked; the tracking result is fused with the first pixels detected in the current image to be processed to obtain a smooth and stable first pixel detection result, so that the lane where the first pixels in the current image are located can be determined more quickly and accurately.
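A rough sketch of the fit-then-track idea follows. The application does not specify the motion equation, so this sketch assumes the simplest possible model: the three curve coefficients are treated as a slowly varying state with independent scalar Kalman filters, and the per-frame least-squares fit is the measurement. The class name, noise values, and sample points are hypothetical.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c; returns [a, b, c]."""
    # Normal equations (A^T A) w = A^T y with rows A_i = [x_i^2, x_i, 1].
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # sum of x^k
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # sum of y*x^k
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(3):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [u - factor * v for u, v in zip(m[r], m[i])]
    return [m[i][3] / m[i][i] for i in range(3)]


class CoefficientKalman:
    """Independent scalar Kalman filters, one per curve coefficient.

    Assumed model: coefficients stay constant between frames up to
    process noise q; each frame's fit is observed with noise r.
    """

    def __init__(self, initial, p0=1.0, q=0.01, r=0.1):
        self.x = list(initial)
        self.p = [p0] * len(initial)
        self.q = q
        self.r = r

    def update(self, measured):
        for i, z in enumerate(measured):
            p_pred = self.p[i] + self.q        # predict: state unchanged
            gain = p_pred / (p_pred + self.r)  # Kalman gain
            self.x[i] += gain * (z - self.x[i])
            self.p[i] = (1.0 - gain) * p_pred
        return list(self.x)


# Hypothetical first pixels lying on y = 0.5*x^2 - x + 2.
frame_points = [(float(x), 0.5 * x * x - x + 2.0) for x in range(5)]
coeffs = fit_quadratic(frame_points)

tracker = CoefficientKalman(coeffs)
smoothed = tracker.update(coeffs)  # same measurement, so the state is unchanged
```

A fuller tracker would model how the curve moves as the vehicle advances; the constant-coefficient model is used here only to keep the example short.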
  • The target detection device can report the lane occupancy of the construction area, in the form of a lane occupancy message, to a module with a vehicle control function, so that the module can plan an effective vehicle travel path based on the message. The lane occupancy message gives the control module richer information, which improves vehicle control and user experience.
  • The lane occupancy message may include the lane occupied by the construction area and the boundaries of the occupied area, for example: lane occupancy flag OccLane: the second lane on the right (2(right)), that is, the right lane; minimum longitudinal distance Xmin: 2.819 m; maximum longitudinal distance Xmax: 52.073 m; minimum lateral distance Ymin: 1.528 m; maximum lateral distance Ymax: 4.701 m.
  • the lane occupancy message contains the detailed condition of the lane boundary occupied by the construction area, which can better help the control module to control the vehicle.
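One way to represent such a message is sketched below. The field names follow the OccLane, Xmin, Xmax, Ymin, and Ymax items of the example above, but the class itself, its name, and the derived extent properties are assumptions made for illustration; the application does not define a message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LaneOccupancyMessage:
    """Hypothetical container for the lane occupancy report."""
    occ_lane: str   # lane occupancy flag, e.g. "2(right)"
    x_min_m: float  # minimum longitudinal distance to the vehicle
    x_max_m: float  # maximum longitudinal distance to the vehicle
    y_min_m: float  # minimum lateral distance to the vehicle
    y_max_m: float  # maximum lateral distance to the vehicle

    @property
    def longitudinal_extent_m(self):
        """Length of the occupied area along the driving direction."""
        return self.x_max_m - self.x_min_m

    @property
    def lateral_extent_m(self):
        """Width of the occupied area across the driving direction."""
        return self.y_max_m - self.y_min_m


# Values taken from the example message in the text.
msg = LaneOccupancyMessage("2(right)", 2.819, 52.073, 1.528, 4.701)
```

A planning module could read `msg.longitudinal_extent_m` and `msg.lateral_extent_m` to judge how much of the lane is blocked.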
  • this step 807 is an optional step.
  • The embodiment of the present application is simple and efficient; at the same time, it effectively addresses the inaccurate evaluation of lane occupancy caused by construction area positioning errors, as well as the problem that lane occupancy messages carry too little information.
  • GPS Global Positioning System
  • In the embodiments of the present application, a candidate region in the image to be processed is determined according to the features of the marker; the feature value of the pixel with the largest row number in each column of the candidate region and the theoretical feature value corresponding to each pixel in the image to be compared are determined, where the feature value includes the number of pixels in the corresponding column and the theoretical feature value is determined based on the reference height information of the marker; and the feature values of the pixels of the candidate region are matched with the theoretical feature values of the corresponding pixels in the image to be compared to determine the pixels of the marker. Thus, on top of determining the candidate region from the features of the marker, the reference height information of the marker is further used to detect construction area markers. This double screening effectively filters out other objects whose features are the same as or similar to those of the marker, which reduces the risk of false detection and improves detection accuracy. In addition, determining the candidate region by binarization improves the accuracy of the preliminary screening of markers and greatly reduces the amount of image data involved in subsequent processing, which improves detection efficiency.
  • the embodiments of the present application further provide a target detection device, and the target detection device is used to execute the technical solutions described in the above method embodiments.
  • FIG. 20 shows a schematic structural diagram of a target detection apparatus according to an embodiment of the present application. As shown in FIG. 20, the target detection apparatus may include: an acquisition module 901, configured to acquire an image to be processed; and a processing module 902, configured to: perform binarization on the image to be processed according to the features of the marker to determine a candidate region in the image to be processed, where the candidate region includes at least one pixel and includes N columns of pixels, N being a positive integer; determine the number of pixels in each of the N columns; determine the feature value of the pixel with the largest row number in each of the N columns, where the feature value includes the number of pixels in the corresponding column; determine an image to be compared, where the image to be compared includes at least one pixel, and determine the theoretical feature value corresponding to each pixel, where the theoretical feature value is determined based on the reference height information of the marker; match the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared; and, when the feature value of a first pixel of the candidate region in the image to be processed and the theoretical feature value of the corresponding second pixel in the image to be compared satisfy the first condition, determine that the first pixel is a pixel of the marker.
  • The processing module 902 is further configured to: determine the theoretical feature value corresponding to each pixel in the image to be compared according to the reference height information of the marker and the calibration parameters of the image acquisition device that captures the image to be processed.
  • The reference height information of the marker includes the height of the marker in the world coordinate system; the processing module 902 is further configured to: determine a transformation matrix from the world coordinate system to the coordinate system of the image acquisition device according to the calibration parameters of the image acquisition device; project, according to the transformation matrix, a marker at at least one position in the world coordinate system into the coordinate system of the image acquisition device to obtain the pixel in the image to be compared corresponding to that position and the number of pixels in the image to be compared corresponding to the marker; and take the number of pixels in the image to be compared corresponding to the marker as the theoretical feature value of that pixel.
  • The processing module 902 is further configured to: determine the maximum row number and the minimum row number among all the pixels in each of the N columns of pixels; and determine the difference between the maximum row number and the minimum row number as the feature value of the pixel with the largest row number in that column.
  • The processing module 902 is further configured to: determine the pixels in the candidate region of the image to be processed that satisfy the range condition, where a pixel satisfies the range condition if it lies within the range of the road surface in the image to be processed; and match the feature values of the pixels that satisfy the range condition with the theoretical feature values of the corresponding pixels in the image to be compared.
  • The processing module 902 is further configured to: perform binarization on the image to be processed according to the preset color feature and/or texture feature of the marker to obtain a binarized image; and determine an area of the binarized image that satisfies the second condition as the candidate region.
  • The processing module 902 is further configured to: determine the lane lines in the image to be processed; determine the lane where the first pixel is located according to the positions of the lane lines and of the first pixel; and, when the number of first pixels in a lane exceeds a third threshold, determine the lateral and longitudinal boundaries of the candidate region where the first pixels are located.
  • the processing module 902 is further configured to: determine the lane where the first pixel point in the current image to be processed is located according to the first pixel point in the multiple images to be processed.
  • the marker includes a marker for dividing the construction area.
  • An embodiment of the present application provides a target detection apparatus, including: a processor and a memory for storing instructions executable by the processor; wherein the processor is configured to implement the above target detection method when executing the instructions.
  • the target detection device may further include: at least one sensor, the sensor is used for collecting the image to be processed.
  • FIG. 21 shows a schematic structural diagram of another target detection apparatus according to an embodiment of the present application.
  • The target detection apparatus may include: at least one processor 1001, a communication line 1002, a memory 1003, and at least one communication interface 1004.
  • The processor 1001 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the present application.
  • Communication line 1002 may include a path to communicate information between the components described above.
  • The communication interface 1004 uses any transceiver-like device to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
  • The memory 1003 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, without limitation.
  • the memory may exist independently and be connected to the processor through communication line 1002 .
  • the memory can also be integrated with the processor.
  • the memory provided by the embodiments of the present application may generally be non-volatile.
  • the memory 1003 is used for storing computer-executed instructions for executing the solutions of the present application, and the execution is controlled by the processor 1001 .
  • the processor 1001 is configured to execute the computer-executed instructions stored in the memory 1003, thereby implementing the methods provided in the foregoing embodiments of the present application.
  • the computer-executed instructions in the embodiments of the present application may also be referred to as application code, which is not specifically limited in the embodiments of the present application.
  • the processor 1001 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 21 .
  • the target detection apparatus may include multiple processors, for example, the processor 1001 and the processor 1007 in FIG. 21 .
  • processors can be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • a processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (eg, computer program instructions).
  • the target detection apparatus may further include an output device 1005 and an input device 1006.
  • the output device 1005 is in communication with the processor 1001 and can display information in a variety of ways.
  • The output device 1005 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or the like.
  • the input device 1006 is in communication with the processor 1001 and can receive user input in a variety of ways.
  • the input device 1006 may be a mouse, a keyboard, a touch screen device, a sensor device, or the like.
  • the embodiments of the present application further provide an automatic driving assistance system, which is applied in unmanned driving or intelligent driving, which includes at least one target detection device mentioned in the above-mentioned embodiments of the present application, and at least one of other sensors such as cameras or radars.
  • the sensor is used to collect images to be processed, and at least one device in the system can be integrated into a complete machine or equipment, or at least one device in the system can also be independently set as a component or device.
  • any of the above systems may interact with the vehicle's central controller to provide detection and/or fusion information for decision-making or control of the vehicle's driving.
  • An embodiment of the present application further provides a vehicle, and the vehicle includes at least one target detection device or any of the above-mentioned systems mentioned in the above-mentioned embodiments of the present application.
  • Embodiments of the present application provide a non-volatile computer-readable storage medium on which computer program instructions are stored, and when the computer program instructions are executed by a processor, implement the above method.
  • Embodiments of the present application provide a computer program product, including computer-readable code or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the above method.
  • a computer-readable storage medium may be a tangible device that can hold and store instructions for use by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • Computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital video discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the foregoing.
  • The computer-readable program instructions or code described herein may be downloaded from a computer-readable storage medium to the respective computing/processing devices, or to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • A network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium in that computing/processing device.

Abstract

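A target detection method and apparatus, belonging to the field of sensor technology, which can be used for assisted driving and autonomous driving. The method includes: acquiring an image to be processed captured by a sensor; determining feature values of pixels in a candidate region, where a feature value includes the number of pixels in the corresponding column; determining theoretical feature values corresponding to pixels in an image to be compared, where the theoretical feature values are determined based on reference height information of a marker; and matching the feature values of the pixels of the candidate region with the theoretical feature values of the corresponding pixels in the image to be compared, thereby perceiving the marker. The method can perceive road construction area markers based on a sensor, thereby improving the advanced driver assistance system (ADAS) capability of a terminal in autonomous or assisted driving, and can be applied to the Internet of Vehicles.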

Description

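Target detection method and apparatus
This application claims priority to Chinese patent application No. 202110168662.X, entitled "Target detection method and apparatus" and filed with the China Patent Office on February 7, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of sensor technology, and in particular to a target detection method and apparatus.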
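Background
With the development of society, intelligent terminals such as intelligent transportation equipment, smart home devices, and robots are gradually entering people's daily lives. Sensors play a very important role in intelligent terminals. The various sensors installed on an intelligent terminal, such as millimeter-wave radar, lidar, cameras, and ultrasonic radar, perceive the surrounding environment while the terminal is moving, collect data, identify and track moving objects, recognize static scene elements such as lane lines, signs, and construction areas, and perform path planning in combination with navigation and map data.
In advanced driver assistance systems (Advanced Driver Assistance Systems, ADAS) and automated driving systems (Automated Driving System, ADS), many driving functions, such as adaptive cruise control (Adaptive Cruise Control, ACC) and automatic emergency braking (Advanced Emergency Braking, AEB), require a basic ability to perceive construction areas and the lanes they occupy. The sensors used for in-vehicle scene perception are mainly millimeter-wave radar, lidar, and cameras. Because a construction area carries high-level semantic information, millimeter-wave radar often cannot perceive complex construction areas, so current road construction area detection solutions mainly use lidar and cameras; cameras are low-cost, small, and easy to deploy and maintain, and are therefore widely used. In the related art, camera-based methods for detecting construction area markers fall into two main categories: image-processing-based methods, such as searching for construction area markers in the image by template matching; and machine-learning-based methods, such as designing a classifier that uses the color or texture features of construction area markers to classify pixels or regions in the image and thereby determine the markers.
However, the above methods for detecting construction area markers carry a risk of false detection, which is especially high when detecting markers that are non-columnar upright bodies, such as road barriers, water-filled barriers, and concrete blocks. How to reduce the risk of false detection and improve detection accuracy has therefore become an urgent problem.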
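Summary
In view of this, the present application proposes a target detection method and apparatus.
In a first aspect, an embodiment of the present application provides a target detection method, the method including: acquiring an image to be processed; performing binarization on the image to be processed according to features of a marker to determine a candidate region in the image to be processed, where the candidate region includes at least one pixel and includes N columns of pixels, N being a positive integer; determining the number of pixels in each of the N columns; determining the feature value of the pixel with the largest row number in each of the N columns, where the feature value includes the number of pixels in the corresponding column; determining an image to be compared, where the image to be compared includes at least one pixel, and determining the theoretical feature value corresponding to each pixel, where the theoretical feature value is determined based on reference height information of the marker; matching the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared; and, when the feature value of a first pixel of the candidate region in the image to be processed and the theoretical feature value of the corresponding second pixel in the image to be compared satisfy a first condition, the first pixel is a pixel of the marker.
Based on the above technical solution, a candidate region in the image to be processed is determined according to the features of the marker; the feature value of the pixel with the largest row number in each column of the candidate region and the theoretical feature value corresponding to each pixel in the image to be compared are determined, where the feature value includes the number of pixels in the corresponding column and the theoretical feature value is determined based on the reference height information of the marker; and the feature values of the pixels of the candidate region are matched with the theoretical feature values of the corresponding pixels in the image to be compared to determine the pixels of the marker. Thus, on top of determining the candidate region from the features of the marker, the reference height information of the marker is further used to detect construction area markers; this double screening effectively filters out other objects whose features are the same as or similar to those of the marker, which reduces the risk of false detection and improves detection accuracy. In addition, determining the candidate region by binarization improves the accuracy of the preliminary screening of markers and greatly reduces the amount of image data involved in subsequent processing, which improves detection efficiency.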
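According to the first aspect, in a first possible implementation of the first aspect, determining the theoretical feature value corresponding to each pixel includes: determining the theoretical feature value corresponding to each pixel in the image to be compared according to the reference height information of the marker and the calibration parameters of the image acquisition device that captures the image to be processed.
Based on the above technical solution, the theoretical feature value corresponding to each pixel in the image to be compared is determined from the reference height information of the marker and the calibration parameters of the image acquisition device that captures the image to be processed. The theoretical feature value of a pixel in the image to be compared can thus represent the theoretical height of a construction area marker when the marker is located at that pixel of the candidate region. Because the calibration parameters of the image acquisition device are used, the theoretical feature values precisely match the characteristics of that device, so the pixels of the marker can be determined precisely in images to be processed captured by the same device. For example, the theoretical feature value of each pixel in the image to be compared can be determined in advance from the calibration parameters of the image acquisition device and the reference height information of the marker; compared with searching the whole image pixel by pixel for construction area markers, no high-frequency refresh between frames is needed, which effectively reduces processing latency and improves detection efficiency.
According to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the reference height information of the marker includes the height of the marker in the world coordinate system; and determining the theoretical feature value corresponding to each pixel in the image to be compared according to the reference height information of the marker and the calibration parameters of the image acquisition device that captures the image to be processed includes: determining a transformation matrix from the world coordinate system to the coordinate system of the image acquisition device according to the calibration parameters of the image acquisition device; projecting, according to the transformation matrix, a marker at at least one position in the world coordinate system into the coordinate system of the image acquisition device to obtain the pixel in the image to be compared corresponding to that position and the number of pixels in the image to be compared corresponding to the marker; and taking the number of pixels in the image to be compared corresponding to the marker as the theoretical feature value of that pixel.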
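To make the idea of precomputed theoretical feature values concrete, the following sketch uses a deliberately simplified camera model instead of a full transformation matrix: a distortion-free pinhole camera mounted at a known height above a flat road, with a horizontal optical axis. Under these assumptions a marker standing on the ground at image row v has an expected pixel height that depends only on v. The function name and all parameter values are hypothetical.

```python
def theoretical_heights(image_rows, f_y, c_y, camera_height_m, marker_height_m):
    """Expected pixel height of a marker whose base appears at each image row.

    Simplified model (assumption of this sketch): pinhole camera, horizontal
    optical axis, flat road. A ground point at distance d projects to row
    c_y + f_y * camera_height / d, and a marker of height h at that distance
    spans f_y * h / d pixels.
    """
    table = [0.0] * image_rows
    for v in range(image_rows):
        if v <= c_y:
            continue  # rows at or above the horizon cannot contain ground points
        distance_m = f_y * camera_height_m / (v - c_y)
        table[v] = f_y * marker_height_m / distance_m
    return table


# Hypothetical calibration: 480-row image, focal length 800 px, principal
# point row 240, camera 1.5 m above the road, marker 1.0 m tall.
height_table = theoretical_heights(480, f_y=800.0, c_y=240.0,
                                   camera_height_m=1.5, marker_height_m=1.0)
```

Because the table depends only on calibration and the marker height, it could be computed once and reused for every frame, which matches the point above about avoiding per-frame refresh.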
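According to the first aspect, in a third possible implementation of the first aspect, determining the feature value of the pixel with the largest row number in each of the N columns of pixels includes: determining the maximum row number and the minimum row number among all the pixels in each of the N columns of pixels; and determining the difference between the maximum row number and the minimum row number as the feature value of the pixel with the largest row number in that column.
Based on the above technical solution, the pixel with the largest row number represents the junction between the road surface and a construction area marker that the candidate region may contain, and its feature value represents the approximate height of that marker in the binarized image; the feature values of the pixels with the largest row number in each column can thus represent the approximate height information of the objects represented in the candidate region.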
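The per-column feature value can be sketched as follows, assuming the candidate region is given as a binary mask stored as a nested list (1 for candidate pixels, 0 otherwise). The function name and the sample mask are hypothetical.

```python
def column_feature_values(binary):
    """Map (max_row, col) -> max_row - min_row for every non-empty column.

    binary is a nested list where truthy entries mark candidate pixels.
    """
    features = {}
    if not binary:
        return features
    for col in range(len(binary[0])):
        rows = [r for r in range(len(binary)) if binary[r][col]]
        if rows:
            # The bottom-most pixel of the column carries the feature value.
            features[(max(rows), col)] = max(rows) - min(rows)
    return features


# Hypothetical 4 x 3 candidate mask.
mask = [[0, 1, 0],
        [0, 1, 1],
        [0, 1, 1],
        [0, 0, 1]]

features = column_feature_values(mask)
```

Each key is the position of the bottom pixel of a column, which is the pixel later compared against the theoretical feature value.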
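According to the first aspect, in a fourth possible implementation of the first aspect, matching the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared includes: determining the pixels in the candidate region of the image to be processed that satisfy a range condition, where a pixel satisfies the range condition if it lies within the range of the road surface in the image to be processed; and matching the feature values of the pixels that satisfy the range condition with the theoretical feature values of the corresponding pixels in the image to be compared.
Based on the above technical solution, feature value matching can be performed only on pixels within the range of the road surface in the image to be processed, that is, only construction area markers placed on the road surface are detected, which effectively improves detection efficiency.
According to the first aspect, in a fifth possible implementation of the first aspect, performing binarization on the image to be processed according to the features of the marker to determine the candidate region in the image to be processed includes: performing binarization on the image to be processed according to a preset color feature and/or texture feature of the marker to obtain a binarized image; and determining an area of the binarized image that satisfies a second condition as the candidate region.
Based on the above technical solution, binarization according to the preset color feature and/or texture feature of the marker improves the accuracy of the preliminary screening of construction area markers and greatly reduces the amount of image data, which improves detection efficiency; other objects in the image to be processed whose color and/or texture features are the same as or similar to those of the marker are allowed to enter the subsequent screening, which effectively improves the recall of construction area marker detection.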
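A color-based binarization can be sketched with a simple per-channel range test. The RGB bounds below are hypothetical values standing in for a preset color feature of the marker; the application does not specify them, and a texture feature would need a different test.

```python
def binarize_by_color(image, lower, upper):
    """Return a 0/1 mask: 1 where every channel lies within [lower, upper].

    image is a nested list of (R, G, B) tuples; lower and upper are
    per-channel bounds.
    """
    return [[1 if all(lo <= ch <= hi for ch, lo, hi in zip(pixel, lower, upper))
             else 0
             for pixel in row]
            for row in image]


# Hypothetical 2 x 2 image: two orange-like pixels, one gray, one white.
image = [[(250, 120, 20), (90, 90, 90)],
         [(240, 100, 10), (255, 255, 255)]]

# Hypothetical color range for an orange marker.
mask = binarize_by_color(image, lower=(200, 60, 0), upper=(255, 160, 80))
```

A loose color range of this kind lets similar-looking objects through on purpose; they are expected to be rejected later by the height-based matching.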
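According to the first aspect, in a sixth possible implementation of the first aspect, the method further includes: determining the lane lines in the image to be processed; determining the lane where the first pixel is located according to the positions of the lane lines and of the first pixel; and, when the number of first pixels in a lane exceeds a third threshold, determining the lateral and longitudinal boundaries of the candidate region where the first pixels are located.
Based on the above technical solution, the lateral and longitudinal boundaries of the lane occupied by the construction area are determined, which refines the description of the lane occupancy of the construction area and provides richer information for subsequent vehicle control.
According to the first aspect, in a seventh possible implementation of the first aspect, the method further includes: determining the lane where the first pixels in the current image to be processed are located according to the first pixels in multiple images to be processed. For example, the first pixels in multiple images to be processed that are consecutive with the current image can be fused along the time axis to determine the lane where the first pixels in the current image are located.
Based on the above technical solution, the accuracy and stability of the determined lane occupancy of the construction area can be improved.
According to the first aspect and the various possible implementations of the first aspect, in an eighth possible implementation of the first aspect, the marker includes a marker used to delimit a construction area.
Based on the above technical solution, the marker may include a marker used to delimit a construction area, so that the construction area can be perceived by detecting the marker, which supports vehicle path planning and the like.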
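In a second aspect, an embodiment of the present application provides a target detection apparatus, the apparatus including: an acquisition module, configured to acquire an image to be processed; and a processing module, configured to: perform binarization on the image to be processed according to features of a marker to determine a candidate region in the image to be processed, where the candidate region includes at least one pixel and includes N columns of pixels, N being a positive integer; determine the number of pixels in each of the N columns; determine the feature value of the pixel with the largest row number in each of the N columns, where the feature value includes the number of pixels in the corresponding column; determine an image to be compared, where the image to be compared includes at least one pixel, and determine the theoretical feature value corresponding to each pixel, where the theoretical feature value is determined based on reference height information of the marker; match the feature values of the pixels of the candidate region in the image to be processed with the theoretical feature values of the corresponding pixels in the image to be compared; and, when the feature value of a first pixel of the candidate region in the image to be processed and the theoretical feature value of the corresponding second pixel in the image to be compared satisfy a first condition, determine that the first pixel is a pixel of the marker.
According to the second aspect, in a first possible implementation of the second aspect, the processing module is further configured to: determine the theoretical feature value corresponding to each pixel in the image to be compared according to the reference height information of the marker and the calibration parameters of the image acquisition device that captures the image to be processed.
According to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the reference height information of the marker includes the height of the marker in the world coordinate system; the processing module is further configured to: determine a transformation matrix from the world coordinate system to the coordinate system of the image acquisition device according to the calibration parameters of the image acquisition device; project, according to the transformation matrix, a marker at at least one position in the world coordinate system into the coordinate system of the image acquisition device to obtain the pixel in the image to be compared corresponding to that position and the number of pixels in the image to be compared corresponding to the marker; and take the number of pixels in the image to be compared corresponding to the marker as the theoretical feature value of that pixel.
According to the second aspect, in a third possible implementation of the second aspect, the processing module is further configured to: determine the maximum row number and the minimum row number among all the pixels in each of the N columns of pixels; and determine the difference between the maximum row number and the minimum row number as the feature value of the pixel with the largest row number in that column.
According to the second aspect, in a fourth possible implementation of the second aspect, the processing module is further configured to: determine the pixels in the candidate region of the image to be processed that satisfy a range condition, where a pixel satisfies the range condition if it lies within the range of the road surface in the image to be processed; and match the feature values of the pixels that satisfy the range condition with the theoretical feature values of the corresponding pixels in the image to be compared.
According to the second aspect, in a fifth possible implementation of the second aspect, the processing module is further configured to: perform binarization on the image to be processed according to a preset color feature and/or texture feature of the marker to obtain a binarized image; and determine an area of the binarized image that satisfies a second condition as the candidate region.
According to the second aspect, in a sixth possible implementation of the second aspect, the processing module is further configured to: determine the lane lines in the image to be processed; determine the lane where the first pixel is located according to the positions of the lane lines and of the first pixel; and, when the number of first pixels in a lane exceeds a third threshold, determine the lateral and longitudinal boundaries of the candidate region where the first pixels are located.
According to the second aspect, in a seventh possible implementation of the second aspect, the processing module is further configured to: determine the lane where the first pixels in the current image to be processed are located according to the first pixels in multiple images to be processed. For example, the first pixels in multiple images to be processed that are consecutive with the current image can be fused along the time axis to determine the lane where the first pixels in the current image are located.
According to the second aspect and the various possible implementations of the second aspect, in an eighth possible implementation of the second aspect, the marker includes a marker used to delimit a construction area.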
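In a third aspect, an embodiment of this application provides a target detection apparatus, including: at least one sensor, the sensor being configured to capture a to-be-processed image; a processor; and a memory configured to store processor-executable instructions; wherein the processor is configured to, when executing the instructions, perform the target detection method according to the first aspect or one or more of the possible implementations of the first aspect.
In a fourth aspect, an embodiment of this application provides a target detection apparatus, including: a processor; and a memory configured to store processor-executable instructions; wherein the processor is configured to, when executing the instructions, perform the target detection method according to the first aspect or one or more of the possible implementations of the first aspect.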
第五方面,本申请的实施例提供了一种非易失性计算机可读存储介质,其上存储有计算机程序指令,所述计算机程序指令被处理器执行时实现上述第一方面或者第一方面的多种可能的实现方式中的一种或几种的目标检测方法。
第六方面,本申请的实施例提供了一种包含指令的计算机程序产品,当其在计算机上运行时,使得计算机执行上述第一方面或者第一方面的多种可能的实现方式中的一种或几种的目标检测方法。
第七方面,本申请实施例还提供一种自动驾驶辅助系统,所述系统包括上述第二方面或者第二方面的多种可能的实现方式中的一种或几种的目标检测装置,以及,至少一个传感器,所述传感器用于采集上述待处理图像。
第八方面,本申请实施例还提供一种车辆,所述车辆包括上述第二方面或者第二方面的多种可能的实现方式中的一种或几种的目标检测装置。
上述第二方面至第八方面的各方面,及第二方面的各种可能的实现方式的技术效果,参见上述第一方面。
本申请的这些和其他方面在以下(多个)实施例的描述中会更加简明易懂。
附图说明
包含在说明书中并且构成说明书的一部分的附图与说明书一起示出了本申请的示例性实施例、特征和方面,并且用于解释本申请的原理。
图1示出根据本申请一实施例的摄像机投影示意图。
图2示出根据本申请一实施例的施工区的标志物示意图。
图3示出根据本申请一实施例的一种施工区检测的示意图。
图4示出根据本申请一实施例的另一种施工区检测的示意图。
图5示出根据本申请一实施例的另一种施工区检测的示意图。
图6示出根据本申请一实施例的标志物及其他交通目标的示意图。
图7示出根据本申请一实施例的目标检测方法适用的一种可能的应用场景示意图。
图8示出根据本申请一实施例的目标检测方法的流程图。
图9示出根据本申请一实施例的一种待处理图像的示意图。
图10示出根据本申请一实施例的另一种待处理图像的示意图。
图11示出根据本申请一实施例的一种二值化图像的示意图。
图12示出根据本申请一实施例的一种确定候选区域每一列的像素点数量的示意图。
图13示出根据本申请一实施例的一种候选区域中具有特征值的像素点的示意图。
图14示出根据本申请一实施例的一种预设区域的示意图。
图15示出根据本申请一实施例的一种逆三维投影高度热值图。
图16示出了根据本申请一实施例的候选区域中满足范围条件的像素点示意图。
图17示出根据本申请一实施例的一种候选区域对应的高度热值图。
图18示出根据本申请一实施例的目标检测结果示意图。
图19示出根据本申请一实施例的第一像素点所在车道(车体坐标系)的示意图。
图20示出根据本申请一实施例的一种目标检测装置的结构示意图。
图21示出根据本申请一实施例的另一种目标检测装置的结构示意图。
具体实施方式
以下将参考附图详细说明本申请的各种示例性实施例、特征和方面。附图中相同的附图标记表示功能相同或相似的元件。尽管在附图中示出了实施例的各种方面,但是除非特别指出,不必按比例绘制附图。
在这里专用的词“示例性”意为“用作例子、实施例或说明性”。这里作为“示例性”所说明的任何实施例不必解释为优于或好于其它实施例。
另外,为了更好的说明本申请,在下文的具体实施方式中给出了众多的具体细节。本领域技术人员应当理解,没有某些具体细节,本申请同样可以实施。在一些实例中,对于本领域技术人员熟知的方法、手段、元件和电路未作详细描述,以便于凸显本申请的主旨。
下面对本申请实施例涉及的一些概念进行简单介绍。
三维投影(3D projection):将三维空间中的点映射到二维图像平面上的过程。图1示出根据本申请一实施例的摄像机投影示意图;如图1所示,摄像机101观测三维空间中的点112(A)时,可以通过该摄像机模型(例如小孔成像),映射到二维图像平面105中点114(A′)所在的位置。
理想地平面假设(flat-earth assumption):假设车辆所行驶的路面是一个理想的平面(如上述图1中平面108)。基于这个假设,可以在缺少物体在三维空间的高度信息的情况下实现一种二维图像平面到理想地平面的逆向投影,例如,可以将上述图1中二维图像平面105中对应于路面的像素点(如点115(B′))映射到世界空间中理想地平面108上对应的点113(B)。
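The back-projection enabled by the flat-earth assumption can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the pinhole intrinsics `K`, the rotation `R`, the translation `T`, and the 1.5 m camera height are all hypothetical values chosen for the example. For points with z = 0, the projection K[R | T] reduces to a 3×3 homography, which can be inverted to map a road pixel back to the ground plane.

```python
import numpy as np

def ground_homography(K, R, T):
    """Homography that maps ground-plane points (x, y, z=0) to pixels."""
    # For z = 0 the third column of R drops out of K [R | T].
    return K @ np.column_stack((R[:, 0], R[:, 1], T))

def pixel_to_ground(u, v, K, R, T):
    """Back-project pixel (u, v) onto the ideal ground plane z = 0."""
    H = ground_homography(K, R, T)
    x, y, w = np.linalg.solve(H, np.array([u, v, 1.0]))
    return x / w, y / w

# Hypothetical camera (assumption): focal length 1000 px, principal point
# (640, 360), mounted 1.5 m above the road and looking straight ahead.
# World frame: x forward, y left, z up. Camera frame: x right, y down, z forward.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.array([[0.0, -1.0, 0.0],
              [0.0, 0.0, -1.0],
              [1.0, 0.0, 0.0]])
T = np.array([0.0, 1.5, 0.0])
```

With these assumed parameters, the pixel (640, 510) lies on the road centerline and maps to a ground point about 10 m ahead of the camera.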
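Inverse 3D projection height heatmap: a two-dimensional image matrix. Let R and C be the numbers of rows and columns of the matrix; $(u, v, h)$ denotes the element at column $u$ and row $v$ (consistent with the image coordinates $(u, v)$ used in the rest of this description), whose value is $h$, and $h$ is the inverse 3D projection height heat value. Taking Figure 1 as an example, the heat value is computed as follows:
Based on the flat-earth assumption above, point 115 (B′) in the two-dimensional image plane 105 of Figure 1, whose coordinates in the image coordinate system are denoted $(u_{B'}, v_{B'})$, is back-projected onto plane 108, where it corresponds to point 113 (B), whose coordinates in the world coordinate system are denoted $(x, y, 0)$. An object 107 of physical height $H$ in world space is placed vertically at point 113 (B), so that one end of object 107 is at point 113 (B) and the other end is at point 112 (A), whose world coordinates are denoted $(x, y, H)$. Based on the calibration parameters of camera 101, the 3D projection transformation matrix $M_{VI}$ between the image coordinate system and the world coordinate system is computed. Using $M_{VI}$, point 112 (A) $(x, y, H)$ is projected onto the two-dimensional image plane 105 to obtain point 114 (A′), whose image coordinates are denoted $(u_{A'}, v_{A'})$. The distance $h$ from 114 (A′) to 115 (B′) is then computed; this $h$ is the heat value of point 115 (B′), giving the triple $(u_{B'}, v_{B'}, h)$, which is filled into the inverse 3D projection height heatmap.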
查找表(look-up table,LUT):一种数据结构(例如,可以为数组),采用简单的索引操作来替换在线的重复运算;逆三维投影高度热值图作为一种二维矩阵,也可以视为一种查找表,本申请实施例中的查找表可以基于上述逆三维投影高度热值图构建。
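Construction area (CA): a region enclosed by one or more types of construction-area markers. It is a common traffic scenario on urban roads and an important factor affecting driving safety. Construction-area markers are stationary objects that are placed perpendicular to the road surface and have a fixed height specified by national standards. Figure 2 shows a schematic diagram of construction-area markers according to an embodiment of this application. As shown in Figure 2, construction-area markers may include traffic cones 201, road barricades 202, water-filled barriers 203, concrete blocks 204, construction vehicles 205, construction workers 206, warning signs 207, and other markers.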
非柱状立面体(non-cylindrical vertical planar object):一类基本形状为矩形或凸多边形的垂直于地面放置的物体。例如,上述图2中的路栏202、水马203、水泥石墩204等标志物即为此类物体。
下面对检测施工区标志物的一些示例进行简单介绍。
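In some examples, Figure 3 shows a schematic diagram of one kind of construction-area detection according to an embodiment of this application. As shown in Figure 3, the image in Figure 3(a) contains construction-area markers with a non-cylindrical vertical planar shape, and the image in Figure 3(b) contains construction-area markers with a columnar shape. For construction-area markers with columnar features that are present in or near the road, such as cones, cylinders, and poles (for example, the traffic cones and traffic posts in Figure 3(b)), whether a construction area is present in the current image or video frame captured by the on-board camera is determined by counting the construction-area markers within a detection region (for example, inside the rectangular box in Figure 3(b)). This example focuses on columnar construction-area markers and is weak at detecting markers with a non-cylindrical vertical planar shape (such as the road barricade in Figure 3(a)). In addition, deciding whether a construction area is present by counting markers carries a risk of false detection (for example, the region inside the rectangular box in Figure 3(b) is not an actual construction area), and the lane in which the construction area lies is not explicitly labeled.
In some examples, Figure 4 shows a schematic diagram of another kind of construction-area detection according to an embodiment of this application. As shown in Figure 4, a convolutional neural network (CNN) is trained with a set of training images containing various construction-related objects (including construction-area markers) so as to classify and recognize such markers, which are labeled with two-dimensional bounding boxes. Whether a construction area that affects driving exists in the environment is judged from the number and position of the construction-related objects, among other factors; the objects are then located in three-dimensional space by combining distance estimation with ray projection, and the various settings of the temporary traffic obstruction are determined from the nature of the construction objects relative to the road lanes. In this example, when a road barricade is placed at different angles, the two-dimensional bounding box may introduce a localization error. For example, the results of detecting construction-area markers in the image with a CNN or another machine-learning algorithm are output as dashed box 402 in Figure 4(a) and dashed box 405 in Figure 4(b); when the actual barricade is placed as shown by 404 in Figure 4(b), there is an offset between the midpoint 406 of the bottom edge of bounding box 405 and the actual center point 408 of the barricade, which causes a barricade localization error. Moreover, methods based on CNNs or machine learning depend heavily on training data and place high demands on the computing power and storage of the perception device, so they are not suited to deployment on low-power, low-compute platforms.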
在一些示例中,图5示出根据本申请一实施例的另一种施工区检测的示意图,如图5所示,针对具有黄色、橙色或红色的施工区标志物(包括交通锥、交通柱、路栏等)进行检测,首先对摄像头获取的红绿蓝(Red Green Blue,RGB)三通道图像进行处理,得到称为亮度图和橙色图的两种图像。针对这两种图像,采用基于模板匹配的滑窗法检测具有亮橙色纹理的区域;同时,基于分类器判断该区域是否含有条纹,以确定该区域是否为施工区。图6示出根据本申请一实施例的标志物及其他交通目标的示意图,其中,图6(a)中为具有施工区标志物颜色纹理特征的标志物示例,图6(b)中为除标志物外其他交通目标的示例,该示例中,除了施工区标志物,该检测方式所基于的施工区标志物(如图6(a)中路栏、水泥石墩等)的颜色和纹理特征,在其他交通目标(如图6(b)中所示的路沿、车道线、集装箱车辆等)上也具备相近颜色和纹理特征,区分难度大,从而存在误检的风险;同时,该检测方式仅检测具有条状纹理的标志物,对单一颜色的施工区标志物不具备检测能力(如图6(a)水马);另外,采用的滑窗模板匹配方式对于分辨率较高的图像,存在处理时延高的风险,且模板尺度大小的选择影响检测的准确性;此外,没有统计施工区针对其所在交通车道的占用情况。
上述检测施工区标志物的方式中存在误检风险,尤其检测路栏、水马和水泥石墩等非柱状立面体的施工区标志物时,误检风险较高,为此,本申请实施例提供了一种目标检测方法,该目标检测方法能够提高针对施工区标志物,尤其是路栏、水马、水泥石墩等非柱状立面体标志物的检测能力;降低施工区标志物误检风险,提高检测准确性。
本申请实施例提供的目标检测方法可以应用于包括至少一个图像采集装置及目标检测装置的目标检测系统;其中,目标检测装置可以是独立设置,也可以集成在控制装置中,还可以是通过软件或者软件与硬件结合实现,对此不做具体限定。
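For example, the target detection system may be applied in advanced driver assistance systems (ADAS) and automated driving systems (ADS), may be applied to various driving functions (such as adaptive cruise control (ACC) and advanced emergency braking (AEB)), and may also be applied in scenarios such as device-to-device communication (D2D), vehicle-to-everything communication (V2X), vehicle-to-vehicle communication (V2V), long term evolution-vehicle communication (LTE-V), and long term evolution-machine communication (LTE-M).
For example, the image capture device may be one or more cameras mounted on a vehicle body, or one or more cameras mounted on another mobile intelligent terminal, for example a surround-view camera, a monocular camera, or a binocular camera. It is used to photograph the surroundings of the vehicle or other mobile intelligent terminal, generate images and/or video, and transmit the images and/or video to the target detection apparatus. The target detection apparatus may perform target detection based on the images and/or video captured by the image capture device; the targets may include parking spaces, people, obstacles, lane lines, construction areas, and so on, so as to understand the surroundings of the vehicle or other mobile intelligent terminal.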
示例性地,该目标检测装置可为具有目标检测功能的车辆,或者为具有目标检测功能的其他部件。该目标检测装置包括但不限于:车载终端、车载控制器、车载模块、车载模组、车载部件、车载芯片、车载单元、车载雷达或车载摄像头等其他传感器,车辆可通过该车载终端、车载控制器、车载模块、车载模组、车载部件、车载芯片、车载单元、车载雷达或摄像头,实施本申请提供的方法。
示例性地,该目标检测装置还可以为除了车辆之外的其他具有目标检测功能的智能终端,或设置在除了车辆之外的其他具有目标检测功能的智能终端中,或设置于该智能终端的部件中。该智能终端可以为智能运输设备、智能家居设备、机器人、无人机等其他终端设备。该目标检测装置包括但不限于智能终端或智能终端内的控制器、芯片、雷达或摄像头等其他传感器、以及其他部件等。
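For example, the target detection apparatus may be a general-purpose device or a dedicated device. In a specific implementation, the apparatus may also be a desktop computer, a portable computer, a network server, a personal digital assistant (PDA), a mobile phone, a tablet computer, a wireless terminal device, an embedded device, or another device with processing capability. The embodiments of this application do not limit the type of the target detection apparatus.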
示例性地,该目标检测装置还可以是具有处理功能的芯片或处理器,该目标检测装置可以包括多个处理器。处理器可以是一个单核(single-CPU)处理器,也可以是一个多核(multi-CPU)处理器。该具有处理功能的芯片或处理器可以设置在传感器中,也可以不设置在传感器中,而设置在传感器输出信号的接收端。
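Figure 7 shows a schematic diagram of a possible application scenario to which the target detection method according to an embodiment of this application applies. As shown in Figure 7, the application scenario may be an autonomous driving scenario that includes at least one vehicle 701. At least one image capture device 702 is installed in the vehicle 701, and the vehicle 701 further includes a target detection apparatus (not shown in the figure). The image capture device 702 is used to photograph the road environment ahead of the vehicle and generate corresponding images; a target region 703 (for example, a construction area) may be present on the road ahead. The image capture device 702 transmits the captured images to the target detection apparatus, which executes the target detection method of the embodiments of this application.
It should be noted that Figure 7 shows only one vehicle, one image capture device, and one construction area. It should be understood that this does not limit the number of vehicles, image capture devices, or construction areas in the application scenario, which may include more vehicles, more image capture devices, and more target regions, not shown here. The road ahead of the vehicle may be a structured road or an unstructured road, which is not limited in the embodiments of this application.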
此外,本申请实施例描述的应用场景是为了更加清楚的说明本申请实施例的技术方案,并不构成对于本申请实施例提供的技术方案的限定,本领域普通技术人员可知,针对其他相似的或新的应用场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
以下结合图7对本申请提供的目标检测方法进行说明。
图8示出根据本申请一实施例的目标检测方法的流程图。如图8所示,该方法的执行主体可以为上述图7中的车辆中的目标检测装置,该方法可以包括以下步骤:
步骤801、目标检测装置获取待处理图像。
其中,待处理图像可以为图7中车辆所安装的图像采集装置实时采集的车辆周围环境的图像或视频帧。可以理解的是,在车辆行驶过程中,每一时刻采集的待处理图像可以包含施工区标志物,也可以不包含施工区标志物。
示例性地,该图像采集装置可以为安装在车辆车标中部靠上的位置或安装在车辆前风挡内后视镜处等位置的摄像头,其中,该摄像头相对于车体的安装滚转角和俯仰角可以较小(例如,近似为零),该摄像头可以实时拍摄车辆行驶前方的道路及道路周围的图像,并将该图像实时传输到目标检测装置。例如,图9示出根据本申请一实施例的一种待处理图像的示意图;如图9所示,待处理图像中包含车辆所行驶道路的车道、车道线、其他车辆、施工区标志物及道路两旁的环境信息。
进一步地,目标检测装置可以截取上述待处理图像中道路所在的部分,从而减少数据处理量,提高目标检测效率。例如,目标检测装置可以识别待处理图像道路在图像中的消失点(即道路尽头与天空的交界点),进而以该消失点所在的直线为边界,将待处理图像横向分割为两部分,并将包含道路的部分作为新的待处理图像。图10示出根据本申请一实施例的另一种待处理图像的示意图;如图10所示,该图10为根据图9中道路的消失点将图9分割为两部分后,包含道路的部分的图像;图10中的待处理图像与图9相比,缺少了天空等与目标检测无关的部分。
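The cropping step can be sketched as below. This is only an illustration under stated assumptions: the vanishing-point row is taken as already known (how it is detected is not specified here), and the function name is hypothetical. The row offset is returned so that coordinates in the cropped image can be mapped back to the original image.

```python
import numpy as np

def crop_below_vanishing_line(image, vanishing_row):
    """Keep only the rows at or below the horizontal line through the vanishing point."""
    # Clamp the row so that at least one image row always remains.
    vanishing_row = int(np.clip(vanishing_row, 0, image.shape[0] - 1))
    # The slice is a view of the original array; no pixel data is copied.
    return image[vanishing_row:], vanishing_row
```

A row index `v` in the cropped image corresponds to row `v + vanishing_row` in the original image.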
步骤802、目标检测装置根据标志物的特征对待处理图像进行二值化处理,确定待处理图像中的候选区域,其中,候选区域包括至少一个像素点,候选区域包括N列像素点,N为正整数。
该步骤中,目标检测装置根据标志物的特征对待处理图像进行二值化处理,经过二值化处理,图像的数据量大为减少,提高检测效率;同时,可以将待处理图像中的可能存在该标志物的候选区域筛选出来,完成施工区标志物的初步筛选,可以允许待处理图像中符合标志物特征的其他物体进入后续筛选流程,有效提升施工区标志物检测的召回率。
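The marker may include a marker used to delimit a construction area, so that the construction area can be perceived by detecting the marker, providing support for vehicle path planning and the like. For example, the marker may be any one of the markers shown in Figure 2, such as traffic cones, road barricades, water-filled barriers, concrete blocks, construction vehicles, construction workers, and warning signs. It can be understood that different markers may have different features, and the features of a marker may be preset. In the embodiments of this application, the road barricade 202 in Figure 2 is used as the example marker, in order to illustrate the ability of the target detection method of the embodiments of this application to detect construction-area markers, in particular non-cylindrical vertical planar markers such as road barricades, water-filled barriers, and concrete blocks.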
在一种可能的实现方式中,所述根据标志物的特征对待处理图像进行二值化处理,确定待处理图像中的候选区域,可以包括:根据预设的标志物的颜色特征和/或纹理特征,对待处理图像进行二值化处理,得到二值化图像;将二值化图像中满足第二条件的区域,确定为候选区域。这样,可以根据预设的标志物的颜色特征和/或纹理特征,进行二值化处理,提高施工区标志物的初步筛选的准确性,且图像的数据量大为减少,提高检测效率;可以允许待处理图像中与标志物的颜色特征和/或纹理特征相同、相似的其他物体进入后续筛选流程,有效提升施工区标志物检测的召回率。
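The color space of the to-be-processed image may be the RGB color space (also known as the blue-green-red (BGR) color space), the YUV color space (where Y denotes luminance (luma), and U and V denote the two chrominance components), or another color space. Taking the RGB color space as an example, the to-be-processed image is a three-channel image with a resolution of W×H pixels (where W denotes the width and H denotes the height), and the three channels correspond to the three primary colors red (R), green (G), and blue (B). Each pixel of the to-be-processed image is processed with formula (1) below to obtain a single-channel image $I_{1c}$:
$$I_{1c}(u,v)=a_R\cdot R(u,v)+a_G\cdot G(u,v)+a_B\cdot B(u,v) \tag{1}$$
In formula (1), $I_{1c}(u,v)$ denotes the value of pixel $(u,v)$ in the single-channel image; $R$, $G$, and $B$ denote the values of the red, green, and blue channels of pixel $(u,v)$; and $a_R$, $a_G$, $a_B$ denote the weighting coefficients of the red, green, and blue channels, respectively. The coefficients $a_R$, $a_G$, $a_B$ may be determined according to the color feature and/or texture feature of the marker.
Further, the single-channel image $I_{1c}$ is binarized according to formula (2) below to obtain a binarized image $I_{bin}$:
$$I_{bin}(u,v)=\begin{cases}1, & T_B \le I_{1c}(u,v) \le T_U\\ 0, & \text{otherwise}\end{cases} \tag{2}$$
In formula (2), $I_{bin}(u,v)$ denotes the value of pixel $(u,v)$ in the binarized image, $I_{1c}(u,v)$ denotes the value of pixel $(u,v)$ in the single-channel image, $T_B$ denotes the lower threshold, and $T_U$ denotes the upper threshold. $T_B$ and $T_U$ may be determined according to the color feature and/or texture feature of the marker.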
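After processing with formulas (1) and (2), the value of each pixel of the to-be-processed image in the original color space is converted into the corresponding value in the binarized image, namely 1 or 0. The second condition may be set as all pixels in a region having the value 1; regions of the binarized image that satisfy this second condition are then selected, and these regions are the candidate regions.
For example, the to-be-processed image shown in Figure 10 is binarized using formulas (1) and (2). According to the preset color feature of the road barricade (yellow), the weighting coefficients $a_R$, $a_G$, $a_B$ in formula (1) may be set to 0.5, -0.25, and -0.25, respectively. Each pixel of the to-be-processed image in Figure 10 is processed with formula (1) to obtain a single-channel image, which is then binarized with formula (2) to obtain a binarized image. Figure 11 shows a schematic diagram of a binarized image according to an embodiment of this application. As shown in Figure 11, pixels whose single-channel value lies between the lower threshold $T_B$ and the upper threshold $T_U$ are set to 1 in the binarized image (the white regions in Figure 11), and all other pixels are set to 0 (the black regions in Figure 11). A pixel whose value is 1 in the binarized image may belong to a road barricade, whereas a pixel whose value is 0 does not. With the second condition set as all pixels in a region having the value 1, the white regions of the binarized image in Figure 11 are determined to be the candidate regions.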
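The weighted single-channel conversion and the thresholding can be sketched together as follows. The weights 0.5, -0.25, -0.25 come from the example above; the threshold values 40 and 255 and the function name are assumptions made purely for illustration, since the text only says that the thresholds depend on the marker's color and/or texture features.

```python
import numpy as np

def binarize_by_color(image_rgb, weights=(0.5, -0.25, -0.25),
                      t_lower=40.0, t_upper=255.0):
    """Weighted channel sum (formula (1)) followed by band thresholding (formula (2))."""
    rgb = image_rgb.astype(np.float32)
    # Formula (1): per-pixel weighted sum of the R, G, B channels.
    single = rgb @ np.asarray(weights, dtype=np.float32)
    # Formula (2): 1 inside [t_lower, t_upper], 0 elsewhere (thresholds are assumed values).
    return ((single >= t_lower) & (single <= t_upper)).astype(np.uint8)
```

For an input of shape (H, W, 3) in RGB channel order, the result has shape (H, W) with values 0 or 1; the pixels equal to 1 form the candidate regions.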
步骤803、目标检测装置确定N列像素点中每一列的像素点数量。
该步骤,目标检测装置可以确定上述步骤802中所确定的二值化图像中候选区域中N列像素点中每一列的像素点数量。
示例性地,目标检测装置可以搜索候选区域中每一列中所有像素点对应的最大行数及最小行数,计算该最大行数与该最小行数的差值,该差值即为该列中的像素点数量。
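For example, Figure 12 shows a schematic diagram of determining the number of pixels in each column of a candidate region according to an embodiment of this application. The binarized image shown in Figure 12 is the binarized image of Figure 11. The coordinates of each pixel in the binarized image are written $(u, v)$, where $u$ is the column of the pixel and $v$ is its row. For each column $u$ of the white region of the binarized image, the maximum row value $v^{u}_{\max}$ (for example, point P1 in Figure 12) and the minimum row value $v^{u}_{\min}$ (for example, point P2 in Figure 12) are found, and the difference between $v^{u}_{\max}$ and $v^{u}_{\min}$ is computed; this difference is the number of pixels in the column, denoted $h_D$.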
步骤804、目标检测装置确定N列像素点中每一列中行数最大的像素点的特征值,其中,特征值包括对应的每一列的像素点的数量。
该步骤中,目标检测装置在上述步骤803得到的候选区域中N列像素点中每一列的像素点数量的基础上,可以将每一列的像素点数量,确定为该列中行数最大的像素点的特征值。
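In a possible implementation, determining the feature value of the pixel with the largest row number in each of the N columns of pixels includes: determining the maximum row number and the minimum row number of all pixels in each of the N columns; and taking the difference between the maximum row number and the minimum row number as the feature value of the pixel with the largest row number in that column. The pixel with the largest row number represents the junction between the road surface and a construction-area marker that the candidate region may contain, and its feature value characterizes the approximate height of that marker in the binarized image. In this way, the feature value of the pixel with the largest row number in each column characterizes the approximate height of the object represented by that column of the candidate region.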
例如,上述图12中的二值化图像中,将点P1及点P2所在列的像素点数量h D,确定为该点P1的特征值,若点P1所在的白色区域为路栏围成的施工区,则该点P1为路栏与路面的交界点,点P1的特征值即可表征与路面的交界点为P1的路栏在二值化图像中所呈现的近似高度。这样,遍历候选区域的所有列,即可得到每一列中行数最大的像素点的特征值。图13示出根据本申请一实施例的一种候选区域中具有特征值的像素点的示意图;如图13所示,该图中所示出的像素点即为候选区域中具有特征值的所有像素点,可以看出施工区标志物的底边轮廓能够清晰地显示在图13所示的二值化图像中。
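Steps 803 and 804 can be sketched as a single pass over the columns of the binarized image. This is a simplified illustration, not the applicant's implementation: it treats all foreground pixels in a column as belonging to one candidate region, which is an assumption (an implementation might first separate connected regions). The function name is hypothetical.

```python
import numpy as np

def column_feature_values(binary):
    """Map each non-empty column u to (v_max, h_D), with h_D = v_max - v_min."""
    features = {}
    rows, cols = np.nonzero(binary)
    for u in np.unique(cols):
        v = rows[cols == u]
        v_max, v_min = int(v.max()), int(v.min())
        # The bottom pixel (largest row number) carries the feature value.
        features[int(u)] = (v_max, v_max - v_min)
    return features
```

Each entry gives the bottom pixel of a column, corresponding to point P1 in Figure 12, together with its feature value.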
步骤805、目标检测装置确定待比对图像,其中,待比对图像包括至少一个像素点,确定每个像素点对应的理论特征值,其中,该理论特征值基于标志物的参考高度信息确定。
该步骤中,该待比对图像可以为预先构建的图像,也可以为实时构建的图像;该待比对图像与待处理图像的大小可以相同(即包含的像素点数量相同);示例性地,该待比对图像可以为逆三维投影高度热值图,在逆三维投影高度热值图中各像素点的热值即为每个像素点对应的理论特征值。
在一种可能的实现方式中,所述确定每个像素点对应的理论特征值,可以包括:根据标志物的参考高度信息及采集待处理图像的图像采集装置的标定参数,确定待比对图像中每个 像素点对应的理论特征值。这样,基于标志物的参考高度信息及采集待处理图像的图像采集装置的标定参数,确定待比对图像中每个像素点对应的理论特征值,从而可以利用待比对图像中的像素点对应的理论特征值,表征施工区标志物位于候选区域的该像素点时该施工区标志物的理论高度信息(即该标志物在待比对图像中所占用的该像素点所在列的像素点数量)。利用采集所述待处理图像的图像采集装置的标定参数进行理论特征值的确定,使得所确定的理论特征值精确地匹配该图像采集装置的特性,这样,在以同样的图像采集装置采集的待处理图像中,利用所确定的理论特征值即可精确地确定标志物的像素点。例如,可以根据图像采集装置的标定参数和标志物的参考高度信息,预先确定待比对图像中每个像素点对应的理论特征值,相对于全图逐像素搜索施工区标志物的方式,无需在帧间做高频率刷新,有效降低处理时延,提高了检测效率。
其中,图像采集装置的标定参数可以包括:图像采集装置内参矩阵K,图像采集装置相对于车体的旋转矩阵R及平移向量T等,标志物的参考高度信息可以包括该标志物在世界坐标系中的高度,例如,标志物为路栏,路栏垂直于路面放置时,路栏在世界坐标系中的高度可以为1米。
示例性地,可以根据标志物的参考高度信息及采集待处理图像的图像采集装置的标定参数,预先确定待比对图像中每个像素点对应的理论特征值,从而得到预先构建的该标志物的逆三维投影高度热值图(或查找表);还可以根据标志物的参考高度信息及采集待处理图像的图像采集装置的标定参数,实时确定待比对图像中每个像素点对应的理论特征值,从而得到实时构建的该标志物的逆三维投影高度热值图(或查找表)。
在一种可能的实现方式中,根据标志物的参考高度信息及采集待处理图像的图像采集装置的标定参数,预先确定待比对图像中每个像素点对应的理论特征值,可以包括:根据图像采集装置的标定参数,确定世界坐标系到图像采集装置坐标系的变换矩阵;根据变换矩阵,将世界坐标系中至少一个位置上的标志物投影到图像采集装置坐标系中,得到该位置对应的待比对图像中像素点及该标志物对应的待比对图像中像素点的数量;将该标志物对应的待比对图像中像素点的数量作为该像素点对应的理论特征值。
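The world coordinate system may be the coordinate system of the ground plane (the actual road surface) in Figure 1, and the image capture device coordinate system may be the coordinate system of the two-dimensional image (the comparison image) in Figure 1. In Figure 1, 103 and 104 denote the u direction and the v direction of the image capture device coordinate system, and 109, 110, and 111 denote the x, y, and z directions of the world coordinate system, respectively.
Based on the calibration parameters of the image capture device, the transformation matrix from the world coordinate system to the image capture device coordinate system is determined by formula (3) below:
$$s\begin{bmatrix}u\\ v\\ 1\end{bmatrix}=K\,[R\;|\;T]\begin{bmatrix}x\\ y\\ z\\ 1\end{bmatrix}=M\begin{bmatrix}x\\ y\\ z\\ 1\end{bmatrix} \tag{3}$$
In formula (3), $u$ and $v$ denote the coordinates of a point in the image capture device coordinate system in the u and v directions, and $x$, $y$, $z$ denote the coordinates of a point in the world coordinate system in the x, y, and z directions. $K$ denotes the intrinsic matrix of the image capture device, $R$ and $T$ denote the rotation matrix and the translation vector of the image capture device relative to the vehicle body, $s$ denotes the scale factor, corresponding to the distance between the image capture device and the point in the world coordinate system, and $M$ denotes the transformation matrix from the world coordinate system to the image capture device coordinate system.
Using the transformation matrix $M$, markers placed vertically at positions on the ideal ground plane in the world coordinate system are projected into the image capture device coordinate system, yielding, in the comparison image, the pixel corresponding to each position and the number of pixels corresponding to the marker placed at that position. For example, a road barricade 1 meter high is placed vertically at point $B(x, y, 0)$ on the ideal ground plane; the line segment between point $A(x, y, 1)$ and point $B(x, y, 0)$ then represents the 1-meter height of the barricade in the world coordinate system. Using the transformation matrix $M$ and formula (3), points A and B are projected into the comparison image to obtain points $A'(u_{A'}, v_{A'})$ and $B'(u_{B'}, v_{B'})$. The line segment between A′ and B′ represents the theoretical height of the barricade in the comparison image, which can be expressed as the number of pixels $h$ between A′ and B′; this theoretical height is the theoretical feature value of point B′.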
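The projection of the two end points A and B can be sketched as follows, assuming a pinhole model in which the transformation matrix is M = K[R | T]. The function names and the camera parameters used in the usage note are hypothetical; only the 1-meter marker height comes from the example above.

```python
import numpy as np

def projection_matrix(K, R, T):
    """3x4 matrix M = K [R | T] mapping homogeneous world points to pixels."""
    return K @ np.column_stack((R, T))

def project(M, point_xyz):
    """Project a world point (x, y, z) to pixel coordinates (u, v)."""
    su, sv, s = M @ np.append(point_xyz, 1.0)
    return su / s, sv / s

def theoretical_feature(M, x, y, marker_height=1.0):
    """Return the base pixel B' and the pixel height h of a marker standing at (x, y, 0)."""
    u_b, v_b = project(M, (x, y, 0.0))
    u_a, v_a = project(M, (x, y, marker_height))
    # Distance between A' and B' in pixels.
    return (u_b, v_b), float(np.hypot(u_a - u_b, v_a - v_b))
```

As a usage illustration with an assumed forward-looking camera (focal length 1000 px, principal point (640, 360), mounted 1.5 m above the road), a 1-meter marker standing 10 m straight ahead has its base at about pixel (640, 510) and spans about 100 pixels.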
这样,将高度为1米的路栏,垂直放置在理想地平面的各位置处,即可得到待比对图像中各像素点对应的理论特征值,此时待比对图像即为针对该标志物的逆三维投影高度热值图。示例性地,可以将高度为1米的路栏,垂直放置在理想地平面的预设区域,该预设区域可以为车辆前方一段道路所在的区域,图14示出根据本申请一实施例的一种预设区域的示意图,如图14中的虚线方框所示的预设区域,该预设区域可以为以安装该图像采集装置的车辆为中心,在该车辆的左右各5米为宽,前方55米为长,所构成的矩形区域。图15示出根据本申请一实施例的一种逆三维投影高度热值图,如图15所示,为针对图14中预设区域的逆三维投影高度热值图,图14中的矩形区域投影到逆三维投影高度热值图中,对应于图15中的梯形区域,在该逆三维投影高度热值图中,梯形区域中的像素点(即热度值不为0的像素点)对应于理想地平面中该预设区域之内的位置,梯形区域之外的像素点(即热度值为0的像素点)对应理想地平面中该预设区域之外的位置,该梯形区域之外的位置不在检测范围内,即可以仅检测放置在路面上的在预设区域内的施工区标志物,从而有效提高检测效率。
该实现方式中,根据图像采集装置的标定参数和标志物的参考高度信息,预先确定待比对图像中每个像素点对应的理论特征值,从而预先构建该标志物的逆三维投影高度热值图;相对于全图逐像素搜索施工区标志物的方式,无需在帧间做高频率刷新,有效降低处理时延,提高了检测效率。
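Pre-building the heatmap over the preset region can be sketched as below. The region of 5 m to each side and 55 m ahead comes from the example above; the sampling step, the 1 m near limit (used to avoid projecting points at the camera itself), and the function name are assumptions for illustration. Because the ground is sampled on a grid, some pixels of the trapezoid may stay at 0, so an implementation might interpolate or sample more densely.

```python
import numpy as np

def build_height_heatmap(K, R, T, image_shape, marker_height=1.0,
                         x_range=(1.0, 55.0), y_range=(-5.0, 5.0), step=0.05):
    """Fill heatmap[v, u] with the pixel height of a marker standing at each ground sample."""
    rows, cols = image_shape
    heatmap = np.zeros((rows, cols), dtype=np.float32)
    M = K @ np.column_stack((R, T))
    xs = np.arange(x_range[0], x_range[1] + step, step)
    ys = np.arange(y_range[0], y_range[1] + step, step)
    gx, gy = np.meshgrid(xs, ys)
    n = gx.size
    base = np.stack((gx.ravel(), gy.ravel(), np.zeros(n), np.ones(n)))  # 4 x n
    top = base.copy()
    top[2] = marker_height
    pb, pt = M @ base, M @ top
    ub, vb = pb[0] / pb[2], pb[1] / pb[2]
    ut, vt = pt[0] / pt[2], pt[1] / pt[2]
    h = np.hypot(ut - ub, vt - vb)
    ui, vi = np.rint(ub).astype(int), np.rint(vb).astype(int)
    # Keep only base points that fall inside the image.
    ok = (ui >= 0) & (ui < cols) & (vi >= 0) & (vi < rows)
    heatmap[vi[ok], ui[ok]] = h[ok]
    return heatmap
```

Since the camera calibration and the marker height do not change between frames, such a table can be computed once and reused as a look-up table at run time.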
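In a possible implementation, the theoretical feature value of each pixel in the comparison image may be determined in real time based on the reference height information of the marker and the calibration parameters of the image capture device that captures the to-be-processed image, thereby obtaining an inverse 3D projection height heatmap of the marker that is built in real time. For example, starting from the pixel with the largest row number in each column of the candidate region determined above, and following the heat-value computation illustrated with Figure 1, the object of physical height H on the ideal ground plane in Figure 1 is set to a road barricade 1 meter high, so that the theoretical feature value of each of these pixels can be determined. The comparison image is then obtained from the coordinates of these pixels and their theoretical feature values, and it is the inverse 3D projection height heatmap for the road barricade. In this implementation, the comparison image may contain theoretical feature values only for the pixel with the largest row number in each column of the candidate region, which reduces the amount of data processed and improves detection efficiency.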
步骤806、目标检测装置将待处理图像中的候选区域的像素点的特征值与待比对图像中对应的像素点的理论特征值进行匹配;当待处理图像中的候选区域的第一像素点的特征值与待比对图像中对应的第二像素点的理论特征值满足第一条件时,第一像素点为标志物的像素点。
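The first condition may be determined by experimental statistics based on factors such as the resolution of the to-be-processed image and the size of the marker. For example, with a road barricade as the marker, when the feature value of a first pixel in the candidate region of the to-be-processed image differs from the theoretical feature value of the corresponding second pixel in the comparison image by less than a preset number of pixels (such as 100 pixels or 200 pixels), the first pixel is determined to be a pixel of the road barricade.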
这样,可以有效检测施工区标志物,尤其针对非柱状立面体形态特征的施工区标志物,提升了目标检测方法的泛化能力;同时,能够滤除其他与施工区标志物有相似特征的物体(如集装箱车辆、车道线等),有效降低误检风险,提高检测准确性。
在一种可能的实现方式中,所述将待处理图像中的候选区域的像素点的特征值与待比对图像中对应的像素点的理论特征值进行匹配,包括:确定待处理图像中的候选区域中满足范围条件的像素点;其中,若像素点包含于路面在待处理图像中所在的范围之内,则像素点满足范围条件;将满足范围条件的像素点的特征值与待比对图像中对应的像素点的理论特征值进行匹配。
其中,路面在待处理图像中所在的范围可以结合图像采集装置的标定参数及上述图14中所示的预设区域预先确定,例如,当车辆所行驶的道路情况及图像采集装置的标定参数与上述图14所示的情况相同时,则可以将图15所示的梯形区域,确定为路面在待处理图像中所在的范围;也可以通过实时感知待处理图像中的道路边沿,确定路面在待处理图像中所在的范围,例如,可以识别待处理图像中的道路边沿及道路的尽头的消失线,将道路边沿及道路尽头的消失线所围成的区域确定为路面在待处理图像中所在的范围。基于所确定的路面在待处理图像中所在的范围,判断上述步骤804中待处理图像中的候选区域中每一列中行数最大的像素点是否在路面在待处理图像中所在的范围之内,若像素点在路面在待处理图像中所在的范围,则将该像素点的特征值与待比对图像中对应的像素点的理论特征值进行匹配,这样,可以仅对包含于路面在待处理图像中所在的范围之内的像素点进行特征值匹配,即检测放置在路面上的施工区标志物,从而有效提高检测效率。
例如,图16示出了根据本申请一实施例的候选区域中满足范围条件的像素点示意图;如图16所示,图中实线表示通过实时感知所得到的待处理图像中的道路边沿,图16中实线以下部分及图像边缘围成的梯形区域即为路面在待处理图像中所在的范围,除此之外,图16与上述图13相同,可以看出,候选区域中具有特征值的所有像素点均在路面在待处理图像中所在的范围之内,即图13中候选区域中每一列中行数最大的像素点均满足范围条件,则将候选区域中每一列中行数最大的像素点与待比对图像中对应的像素点的理论特征值进行匹配。
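In a possible implementation, matching the feature values of the pixels in the candidate region of the to-be-processed image against the theoretical feature values of the corresponding pixels in the comparison image may include: building a height heatmap of the candidate region from the feature values of its pixels; computing the pixel-wise residual between this height heatmap and the inverse 3D projection height heatmap of the marker built in step 805 to obtain a height deviation map; and searching the height deviation map for points whose height deviation is smaller than the threshold given by the first condition, which yields the first pixels of the candidate region of the to-be-processed image. In this way, by matching the feature values of the pixels in the candidate region against the theoretical feature values of the corresponding pixels in the comparison image on the basis of the inverse 3D projection height heatmap, the first pixels can be selected quickly.
For example, a height heatmap of the candidate region is built from the feature value and the position of the pixel with the largest row number in each column of the candidate region. Figure 17 shows a height heatmap of a candidate region according to an embodiment of this application. As shown in Figure 17, the height heatmap of the candidate region has the same size as the inverse 3D projection height heatmap shown in Figure 15, and the box in Figure 17 contains the candidate region. The pixel-wise residual between Figure 15 and Figure 17 is computed to obtain a height deviation map (not shown in the figures); the points in this map whose height deviation is less than 100 pixels are the first pixels of the candidate region of the to-be-processed image.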
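The residual-based matching can be sketched as follows. The 100-pixel threshold comes from the example above. Skipping pixels whose value is 0 in either map is an assumption added here, since otherwise every empty pixel would trivially match; the function name is hypothetical.

```python
import numpy as np

def match_by_residual(candidate_heatmap, reference_heatmap, max_diff=100.0):
    """Return (u, v) of pixels whose measured and theoretical heights differ by less than max_diff."""
    # Only compare pixels that carry a value in both maps (assumption).
    valid = (candidate_heatmap > 0) & (reference_heatmap > 0)
    deviation = np.abs(candidate_heatmap - reference_heatmap)
    rows, cols = np.nonzero(valid & (deviation < max_diff))
    return list(zip(cols.tolist(), rows.tolist()))
```

Both inputs are arrays of the same shape indexed as `[v, u]`, that is, row first and column second.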
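In a possible implementation, matching the feature values of the pixels in the candidate region of the to-be-processed image against the theoretical feature values of the corresponding pixels in the comparison image may include: based on the position of the pixel with the largest row number in each column, looking up the theoretical feature value of the corresponding pixel in the inverse 3D projection height look-up table of the marker built in step 805; computing the difference between the feature value of the pixel in the to-be-processed image and the theoretical feature value of the corresponding pixel; and, when this difference is smaller than the threshold given by the first condition, taking the pixel of the to-be-processed image as a first pixel. Matching with a look-up table in this way processes no more point pairs than the number of columns of the to-be-processed image. Compared with approaches that process the whole image, it effectively improves image processing efficiency and is well suited to scenes in which construction areas are sparse (that is, the pixels belonging to construction areas are significantly fewer than the pixels that do not).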
示例性地,针对上述图12所示的点P1,点P1的特征值为h D,根据点P1的坐标,在针对路栏的逆三维投影高度热值查找表中,查找该坐标的对应的理论特征值h R,计算差值Diff=|h D-h R|,若Diff小于100像素点,则将点P1确定为第一像素点。
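The look-up-table variant can be sketched as below, using the difference Diff = |h_D - h_R| and the 100-pixel threshold from the example above. The dictionary layout of the feature values (column mapped to bottom row and feature value) and the function name are assumptions; treating a table entry of 0 as outside the detection range is also an assumption.

```python
import numpy as np

def match_with_lut(features, lut, max_diff=100.0):
    """features maps column u to (bottom row v, measured height h_D); lut[v, u] holds h_R."""
    first_pixels = []
    for u, (v, h_d) in features.items():
        h_r = lut[v, u]
        # A table entry of 0 is treated as outside the detection range (assumption).
        if h_r > 0 and abs(h_d - h_r) < max_diff:
            first_pixels.append((u, v))
    return first_pixels
```

At most one table access is made per image column, which is why this approach handles no more point pairs than the image has columns.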
图18示出根据本申请一实施例的目标检测结果示意图,如图18所示,该图中所包含的第一像素点即为路栏的像素点,同时该第一像素点为路栏与路面交界处的像素点,从而实现了施工区域与道路边界的准确定位。通过对比图18与上述图16可以看出,筛除了图16中左半区域的像素点,这些被筛除的像素点实为车道线的像素点,从而有效滤除与标志物特征相同或相似的其他物体。
步骤807、目标检测装置确定施工区占用车道情况。
在上述步骤806中,检测得到标志物的第一像素点,该第一像素点即为该标志物所围成的施工区与路面交界处的像素点,在此基础上,可以基于车道线约束,统计各第一像素点所在车道的情况,从而确定施工区占用车道情况。
在一种可能的实现方式中,所述确定施工区占用车道情况可以包括:确定待处理图像中的车道线(或虚拟车道线);根据车道线的位置及第一像素点的位置,确定第一像素点所在的车道;在车道内的第一像素点数量超过第三阈值时,确定第一像素点所在候选区域的横向边界及纵向边界。其中,第一像素点所在候选区域即为标志物围成的施工区,第三阈值可以根据实际需求进行设定,在此不作限定。
示例性地,可以在图像采集装置坐标系下确定第一像素点所在的车道,例如,可以通过识别待处理图像中的车道线,根据车道线在待处理图像中的位置及第一像素点在待处理图像中的位置,确定第一像素点所在的车道;也可以在车体坐标系或世界坐标系下确定第一像素点所在的车道;例如,可以基于图像采集装置的标定参数和射影定理,将第一像素点及待处理图像中的车道线投影到车体坐标系(或世界坐标系)下,从而在根据车体坐标系(或世界坐标系)中车道线位置及第一像素点的位置,确定第一像素点所在的车道。
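For example, the number of first pixels in each lane may be determined by voting. When the number of first pixels in a lane exceeds the third threshold, the nearest and farthest distances of the first pixels in that lane from the vehicle, in the lateral and longitudinal directions, may be collected to determine the lateral boundary and the longitudinal boundary of the part of the lane occupied by the construction area. Figure 19 shows a schematic diagram of the lanes (in the vehicle body coordinate system) in which first pixels lie according to an embodiment of this application. As shown in Figure 19, the road has three lanes: a left lane, a middle lane, and a right lane (relative to the direction of travel). The right lane contains the most first pixels, and their number exceeds the third threshold. Among the coordinates of all first pixels in the right lane, the minimum X coordinate, maximum X coordinate, minimum Y coordinate, and maximum Y coordinate are determined; the difference between the minimum and maximum Y coordinates is taken as the lateral boundary of the construction area's occupation of the lane, and the difference between the minimum and maximum X coordinates is taken as its longitudinal boundary.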
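The voting and boundary statistics can be sketched as follows for points already projected into the vehicle body coordinate system. For simplicity the lanes are assumed to be straight and described by the lateral positions of their lane lines; real lane lines would generally be curves. The function name, the lane-edge representation, and the default threshold of 20 are all assumptions for illustration.

```python
import numpy as np

def lane_occupancy(points_xy, lane_edges_y, third_threshold=20):
    """Count first pixels per lane and report extents for lanes above the threshold."""
    pts = np.asarray(points_xy, dtype=float).reshape(-1, 2)
    # lane_edges_y must be increasing; lane i spans [edges[i], edges[i + 1]).
    lane_index = np.digitize(pts[:, 1], lane_edges_y) - 1
    occupied = {}
    for lane in range(len(lane_edges_y) - 1):
        in_lane = pts[lane_index == lane]
        if len(in_lane) > third_threshold:
            occupied[lane] = {
                "x_min": float(in_lane[:, 0].min()),
                "x_max": float(in_lane[:, 0].max()),
                "y_min": float(in_lane[:, 1].min()),
                "y_max": float(in_lane[:, 1].max()),
            }
    return occupied
```

Points that fall outside all lanes receive an index outside the loop range and are ignored.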
这样,可以在图像采集装置坐标系下,也可以在车体坐标系或世界坐标系下确定施工区占用车道情况,方法灵活性更强,适用范围更广;同时,能够统计施工区距离自车在横向和纵向上的边界距离,细化了施工区占据车道边界情况,从而可以为进一步地车辆控制提供更加丰富的信息。
在一种可能的实现方式中,所述方法还包括:根据多张待处理图像中的第一像素点,确定当前待处理图像中第一像素点所在车道。例如,可以通过与当前待处理图像相连的多张待处理图像,对多张待处理图像中的第一像素点进行时间轴方向上的融合处理,确定当前待处理图像中第一像素点所在车道。这样,可提升所确定的施工区占用车道情况的准确性和稳定性。
示例性地,可以采用二次多项式对上述步骤806中得到的所有第一像素点进行拟合,得到拟合曲线;基于卡尔曼滤波器建立针对拟合曲线的运动方程,对该拟合曲线进行跟踪;将跟踪结果与当前待处理图像检测到的第一像素点进行融合,得到平滑且稳定的第一像素点检测结果,从而可以更快速且准确地确定当前待处理图像中第一像素点所在车道。
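The temporal fusion can be sketched as a quadratic fit whose coefficients are tracked by a Kalman filter. The text does not specify the motion equation, so this sketch assumes the simplest possible model, in which the curve coefficients stay constant between frames (identity transition). The class name and the noise variances are hypothetical tuning values.

```python
import numpy as np

class CurveTracker:
    """Tracks the coefficients [c2, c1, c0] of y = c2*x^2 + c1*x + c0 across frames."""

    def __init__(self, process_var=1e-3, meas_var=1e-1):
        self.x = None                      # state estimate (curve coefficients)
        self.P = np.eye(3)                 # state covariance
        self.Q = process_var * np.eye(3)   # process noise (assumed value)
        self.R = meas_var * np.eye(3)      # measurement noise (assumed value)

    def update(self, points):
        pts = np.asarray(points, dtype=float)
        # Measurement: quadratic fit to the first pixels of the current frame.
        z = np.polyfit(pts[:, 0], pts[:, 1], 2)
        if self.x is None:
            self.x = z
            return self.x
        # Predict with an identity transition, then correct with the new fit.
        P_pred = self.P + self.Q
        gain = P_pred @ np.linalg.inv(P_pred + self.R)
        self.x = self.x + gain @ (z - self.x)
        self.P = (np.eye(3) - gain) @ P_pred
        return self.x
```

The smoothed curve for the current frame can then be evaluated with `np.polyval(tracker.x, x_values)`.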
进一步地,目标检测装置可以将上述施工区占用车道情况,以车道占据报文形式上报到具有车辆控制功能的相关模块,例如,可以上报到自动驾驶辅助系统的控制模块,控制模块根据上报的车道占据报文,做出有效的车辆行驶路径规划。该车道占据报文提供了更加丰富的信息给控制模块,提升了车辆控制效果和用户体验。
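For example, the lane occupancy message may include the lanes occupied by the construction area and the boundaries of the region it occupies, for example: lane occupancy identifier OccLane: second lane on the right (2(right)), that is, the right lane; minimum longitudinal distance Xmin: 2.819 m; maximum longitudinal distance Xmax: 52.073 m; minimum lateral distance Ymin: 1.528 m; maximum lateral distance Ymax: 4.701 m. The lane occupancy message contains the refined boundaries of the construction area within the lane and thus better helps the control module control the vehicle.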
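One possible in-memory representation of such a message is sketched below. The field meanings follow OccLane, Xmin, Xmax, Ymin, and Ymax from the example above; the class itself, its attribute names, and the derived length and width properties are hypothetical and do not describe an actual message format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LaneOccupancyMessage:
    occ_lane: str   # lane occupancy identifier, e.g. "2(right)"
    x_min: float    # minimum longitudinal distance, in meters
    x_max: float    # maximum longitudinal distance, in meters
    y_min: float    # minimum lateral distance, in meters
    y_max: float    # maximum lateral distance, in meters

    @property
    def longitudinal_extent(self) -> float:
        """Length of the occupied region along the driving direction."""
        return self.x_max - self.x_min

    @property
    def lateral_extent(self) -> float:
        """Width of the occupied region across the lane."""
        return self.y_max - self.y_min
```

With the values from the example, the occupied region is about 49.25 m long and about 3.17 m wide.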
需要说明的是,该步骤807为可选步骤,可以在上述步骤801-806的实现施工区标志物检测的基础上,进一步基于检测到的第一像素点,确定施工区占用车道情况,相对基于全球定位系统(Global Positioning System,GPS)和高精地图的复杂方案,本申请实施例简单而高效;同时,可以有效解决施工区定位误差导致对施工区所占用车道情况的评估不准确,以及车道占据情况的报文不够充分的问题。
基于上述技术方案,根据标志物的特征确定待处理图像中的候选区域,并确定待处理图像中的候选区域的每一列中行数最大的像素点的特征值及待比对图像中每个像素点对应的理论特征值,其中,特征值包括对应的每一列的像素点的数量,理论特征值基于标志物的参考高度信息确定,并将待处理图像中的候选区域的像素点的特征值与待比对图像中对应的像素点的理论特征值进行匹配,确定标志物的像素点;这样,在根据标志物的特征确定待处理图像中的候选区域的基础上,进一步利用标志物的参考高度信息对施工区标志物进行检测,通过双重筛选,有效滤除与标志物特征相同或相似的其他物体,降低误检风险,提高检测准确性。并且通过进行二值化处理,确定所述待处理图像中的候选区域,提升了标志物的初步筛 选的准确性,且参与后续处理的图像的数据量大为减少,提高检测效率。
基于上述方法实施例的同一发明构思,本申请的实施例还提供了一种目标检测装置,该目标检测装置用于执行上述方法实施例所描述的技术方案。
图20示出根据本申请一实施例的一种目标检测装置的结构示意图;如图20所示,该目标检测装置可以包括:获取模块901,用于获取待处理图像;处理模块902,用于根据标志物的特征对所述待处理图像进行二值化处理,确定所述待处理图像中的候选区域,所述候选区域包括至少一个像素点,所述候选区域包括N列像素点,所述N为正整数;确定所述N列像素点中每一列的像素点数量;确定所述N列像素点中每一列中行数最大的像素点的特征值,所述特征值包括对应的每一列的像素点的数量;确定待比对图像,所述待比对图像包括至少一个像素点,确定每个像素点对应的理论特征值,所述理论特征值基于所述标志物的参考高度信息确定;将所述待处理图像中的候选区域的像素点的特征值与所述待比对图像中对应的像素点的理论特征值进行匹配;当所述待处理图像中的候选区域的第一像素点的特征值与所述待比对图像中对应的第二像素点的理论特征值满足第一条件时,所述第一像素点为所述标志物的像素点。
在一种可能的实现方式中,所述处理模块902,还用于:根据所述标志物的参考高度信息及采集所述待处理图像的图像采集装置的标定参数,确定所述待比对图像中每个像素点对应的理论特征值。
在一种可能的实现方式中,所述标志物的参考高度信息包括所述标志物在世界坐标系中的高度;所述处理模块902,还用于:根据所述图像采集装置的标定参数,确定世界坐标系到图像采集装置坐标系的变换矩阵;根据所述变换矩阵,将所述世界坐标系中至少一个位置上的标志物投影到所述图像采集装置坐标系中,得到该位置对应的所述待比对图像中像素点及该标志物对应的所述待比对图像中像素点的数量;将该标志物对应的所述待比对图像中像素点的数量作为该像素点对应的理论特征值。
在一种可能的实现方式中,所述处理模块902,还用于:确定所述N列像素点中每一列中所有像素点对应的最大行数及最小行数;将所述最大行数与最小行数的差值,确定为该列中行数最大的像素点的特征值。
在一种可能的实现方式中,所述处理模块902,还用于:确定所述待处理图像中的候选区域中满足范围条件的像素点;其中,若所述像素点包含于路面在所述待处理图像中所在的范围之内,则所述像素点满足所述范围条件;将所述满足范围条件的像素点的特征值与所述待比对图像中对应的像素点的理论特征值进行匹配。
在一种可能的实现方式中,所述处理模块902,还用于:根据预设的所述标志物的颜色特征和/或纹理特征,对所述待处理图像进行二值化处理,得到二值化图像;将所述二值化图像中满足第二条件的区域,确定为所述候选区域。
在一种可能的实现方式中,所述处理模块902,还用于:确定所述待处理图像中的车道线;根据所述车道线的位置及所述第一像素点的位置,确定所述第一像素点所在的车道;在车道内的第一像素点数量超过第三阈值时,确定所述第一像素点所在候选区域的横向边界及纵向边界。
在一种可能的实现方式中,所述处理模块902,还用于:根据多张待处理图像中的第一像素点,确定当前待处理图像中第一像素点所在车道。
在一种可能的实现方式中,所述标志物包括用于划分施工区域的标志物。
本申请实施例中,目标检测装置及其各种可能的实现方式的技术效果可参见上述目标检测方法。
上述实施例的各种可能的实现方式或说明参见上文,此处不再赘述。
本申请的实施例提供了一种目标检测装置,包括:处理器以及用于存储处理器可执行指令的存储器;其中,所述处理器被配置为执行所述指令时实现上述目标检测方法。
进一步,该目标检测装置还可以包括:至少一个传感器,所述传感器用于采集待处理图像。
图21示出根据本申请一实施例的另一种目标检测装置的结构示意图,如图21所示,该目标检测装置可以包括:至少一个处理器1001,通信线路1002,存储器1003以及至少一个通信接口1004。
处理器1001可以是一个通用中央处理器(central processing unit,CPU),微处理器,特定应用集成电路(application-specific integrated circuit,ASIC),或一个或多个用于控制本申请方案程序执行的集成电路。
通信线路1002可包括一通路,在上述组件之间传送信息。
通信接口1004,使用任何收发器一类的装置,用于与其他设备或通信网络通信,如以太网,无线接入网(Radio Access Network,RAN),无线局域网(wireless local area networks,WLAN)等。
存储器1003可以是只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)或者可存储信息和指令的其他类型的动态存储设备,也可以是电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、只读光盘(compact disc read-only memory,CD-ROM)或其他光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其他磁存储设备、或者能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。存储器可以是独立存在,通过通信线路1002与处理器相连接。存储器也可以和处理器集成在一起。本申请实施例提供的存储器通常可以具有非易失性。其中,存储器1003用于存储执行本申请方案的计算机执行指令,并由处理器1001来控制执行。处理器1001用于执行存储器1003中存储的计算机执行指令,从而实现本申请上述实施例中提供的方法。
可选的,本申请实施例中的计算机执行指令也可以称之为应用程序代码,本申请实施例对此不作具体限定。
在具体实现中,作为一种实施例,处理器1001可以包括一个或多个CPU,例如图21中的CPU0和CPU1。
在具体实现中,作为一种实施例,目标检测装置可以包括多个处理器,例如图21中的处理器1001和处理器1007。这些处理器中的每一个可以是一个单核(single-CPU)处理器,也可以是一个多核(multi-CPU)处理器。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(例如计算机程序指令)的处理核。
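In a specific implementation, as an embodiment, the target detection apparatus may further include an output device 1005 and an input device 1006. The output device 1005 communicates with the processor 1001 and can display information in a variety of ways. For example, the output device 1005 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device 1006 communicates with the processor 1001 and can receive user input in a variety of ways. For example, the input device 1006 may be a mouse, a keyboard, a touchscreen device, or a sensing device.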
本申请实施例还提供一种自动驾驶辅助系统,应用于无人驾驶或智能驾驶中,其包含至少一个本申请上述实施例提到的目标检测装置,以及,摄像头或雷达等其他传感器中的至少一个,该传感器用于采集待处理图像,该系统内的至少一个装置可以集成为一个整机或设备,或者该系统内的至少一个装置也可以独立设置为元件或装置。
进一步,上述任一系统可以与车辆的中央控制器进行交互,为所述车辆驾驶的决策或控制提供探测和/或融合信息。
本申请实施例还提供一种车辆,所述车辆包括至少一个本申请上述实施例提到的目标检测装置或上述任一系统。
本申请的实施例提供了一种非易失性计算机可读存储介质,其上存储有计算机程序指令,所述计算机程序指令被处理器执行时实现上述方法。
本申请的实施例提供了一种计算机程序产品,包括计算机可读代码,或者承载有计算机可读代码的非易失性计算机可读存储介质,当所述计算机可读代码在电子设备的处理器中运行时,所述电子设备中的处理器执行上述方法。
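A computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random-access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing.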
这里所描述的计算机可读程序指令或代码可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。
以上已经描述了本申请的各实施例,上述说明是示例性的,并非穷尽性的,并且也不限于所披露的各实施例。在不偏离所说明的各实施例的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。本文中所用术语的选择,旨在最好地解释各实施例的原理、实际应用或对市场中的技术的改进,或者使本技术领域的其它普通技术人员能理解本文披露的各实施例。

Claims (21)

  1. 一种目标检测方法,其特征在于,所述方法包括:
    获取待处理图像;
    根据标志物的特征对所述待处理图像进行二值化处理,确定所述待处理图像中的候选区域,所述候选区域包括至少一个像素点,所述候选区域包括N列像素点,所述N为正整数;
    确定所述N列像素点中每一列的像素点数量;
    确定所述N列像素点中每一列中行数最大的像素点的特征值,所述特征值包括对应的每一列的像素点的数量;
    确定待比对图像,所述待比对图像包括至少一个像素点,确定每个像素点对应的理论特征值,所述理论特征值基于所述标志物的参考高度信息确定;
    将所述待处理图像中的候选区域的像素点的特征值与所述待比对图像中对应的像素点的理论特征值进行匹配;
    当所述待处理图像中的候选区域的第一像素点的特征值与所述待比对图像中对应的第二像素点的理论特征值满足第一条件时,所述第一像素点为所述标志物的像素点。
  2. 根据权利要求1所述的方法,其特征在于,所述确定每个像素点对应的理论特征值,包括:
    根据所述标志物的参考高度信息及采集所述待处理图像的图像采集装置的标定参数,确定所述待比对图像中每个像素点对应的理论特征值。
  3. 根据权利要求2所述的方法,其特征在于,所述标志物的参考高度信息包括所述标志物在世界坐标系中的高度;
    所述根据所述标志物的参考高度信息及采集所述待处理图像的图像采集装置的标定参数,确定所述待比对图像中每个像素点对应的理论特征值,包括:
    根据所述图像采集装置的标定参数,确定世界坐标系到图像采集装置坐标系的变换矩阵;
    根据所述变换矩阵,将所述世界坐标系中至少一个位置上的标志物投影到所述图像采集装置坐标系中,得到该位置对应的所述待比对图像中像素点及该标志物对应的所述待比对图像中像素点的数量;
    将该标志物对应的所述待比对图像中像素点的数量作为该像素点对应的理论特征值。
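  4. The method according to claim 1, wherein determining the feature value of the pixel with the largest row number in each of the N columns of pixels comprises: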
    确定所述N列像素点中每一列中所有像素点对应的最大行数及最小行数;
    将所述最大行数与最小行数的差值,确定为该列中行数最大的像素点的特征值。
  5. 根据权利要求1所述的方法,其特征在于,所述将所述待处理图像中的候选区域的像素点的特征值与所述待比对图像中对应的像素点的理论特征值进行匹配,包括:
    确定所述待处理图像中的候选区域中满足范围条件的像素点;其中,若所述像素点包含于路面在所述待处理图像中所在的范围之内,则所述像素点满足所述范围条件;
    将所述满足范围条件的像素点的特征值与所述待比对图像中对应的像素点的理论特征值进行匹配。
  6. 根据权利要求1所述的方法,其特征在于,所述根据标志物的特征对所述待处理图像进行二值化处理,确定所述待处理图像中的候选区域,包括:
    根据预设的所述标志物的颜色特征和/或纹理特征,对所述待处理图像进行二值化处理,得到二值化图像;
    将所述二值化图像中满足第二条件的区域,确定为所述候选区域。
  7. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    确定所述待处理图像中的车道线;
    根据所述车道线的位置及所述第一像素点的位置,确定所述第一像素点所在的车道;
    在车道内的第一像素点数量超过第三阈值时,确定所述第一像素点所在候选区域的横向边界及纵向边界。
  8. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    根据多张待处理图像中的第一像素点,确定当前待处理图像中第一像素点所在车道。
  9. 根据权利要求1-8中任一所述的方法,其特征在于,所述标志物包括用于划分施工区域的标志物。
  10. 一种目标检测装置,其特征在于,所述装置包括:
    获取模块,用于获取待处理图像;
    处理模块,用于根据标志物的特征对所述待处理图像进行二值化处理,确定所述待处理图像中的候选区域,所述候选区域包括至少一个像素点,所述候选区域包括N列像素点,所 述N为正整数;确定所述N列像素点中每一列的像素点数量;确定所述N列像素点中每一列中行数最大的像素点的特征值,所述特征值包括对应的每一列的像素点的数量;确定待比对图像,所述待比对图像包括至少一个像素点,确定每个像素点对应的理论特征值,所述理论特征值基于所述标志物的参考高度信息确定;将所述待处理图像中的候选区域的像素点的特征值与所述待比对图像中对应的像素点的理论特征值进行匹配;当所述待处理图像中的候选区域的第一像素点的特征值与所述待比对图像中对应的第二像素点的理论特征值满足第一条件时,所述第一像素点为所述标志物的像素点。
  11. 根据权利要求10所述的装置,其特征在于,所述处理模块,还用于:
    根据所述标志物的参考高度信息及采集所述待处理图像的图像采集装置的标定参数,确定所述待比对图像中每个像素点对应的理论特征值。
  12. 根据权利要求11所述的装置,其特征在于,所述标志物的参考高度信息包括所述标志物在世界坐标系中的高度;所述处理模块,还用于:根据所述图像采集装置的标定参数,确定世界坐标系到图像采集装置坐标系的变换矩阵;根据所述变换矩阵,将所述世界坐标系中至少一个位置上的标志物投影到所述图像采集装置坐标系中,得到该位置对应的所述待比对图像中像素点及该标志物对应的所述待比对图像中像素点的数量;将该标志物对应的所述待比对图像中像素点的数量作为该像素点对应的理论特征值。
  13. 根据权利要求10所述的装置,其特征在于,所述处理模块,还用于:确定所述N列像素点中每一列中所有像素点对应的最大行数及最小行数;将所述最大行数与最小行数的差值,确定为该列中行数最大的像素点的特征值。
  14. 根据权利要求10所述的装置,其特征在于,所述处理模块,还用于:确定所述待处理图像中的候选区域中满足范围条件的像素点;其中,若所述像素点包含于路面在所述待处理图像中所在的范围之内,则所述像素点满足所述范围条件;将所述满足范围条件的像素点的特征值与所述待比对图像中对应的像素点的理论特征值进行匹配。
  15. 根据权利要求10所述的装置,其特征在于,所述处理模块,还用于:根据预设的所述标志物的颜色特征和/或纹理特征,对所述待处理图像进行二值化处理,得到二值化图像;将所述二值化图像中满足第二条件的区域,确定为所述候选区域。
  16. 根据权利要求10所述的装置,其特征在于,所述处理模块,还用于:确定所述待处理图像中的车道线;根据所述车道线的位置及所述第一像素点的位置,确定所述第一像素点所在的车道;在车道内的第一像素点数量超过第三阈值时,确定所述第一像素点所在候选区域的横向边界及纵向边界。
  17. 根据权利要求10所述的装置,其特征在于,所述处理模块,还用于:根据多张待处理图像中的第一像素点,确定当前待处理图像中第一像素点所在车道。
  18. The apparatus according to any one of claims 10 to 17, wherein the marker comprises a marker used to delimit a construction area.
  19. 一种目标检测装置,其特征在于,包括:
    处理器;
    用于存储处理器可执行指令的存储器;
    其中,所述处理器被配置为执行所述指令时实现权利要求1-9中任意一项所述的方法。
  20. 一种非易失性计算机可读存储介质,其上存储有计算机程序指令,其特征在于,所述计算机程序指令被处理器执行时实现权利要求1-9中任意一项所述的方法。
  21. 一种车辆,其特征在于,包括如权利要求10-19中任意一项所述的目标检测装置。
PCT/CN2022/072994 2021-02-07 2022-01-20 一种目标检测方法及装置 WO2022166606A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110168662.X 2021-02-07
CN202110168662.XA CN114943941A (zh) 2021-02-07 2021-02-07 一种目标检测方法及装置

Publications (1)

Publication Number Publication Date
WO2022166606A1 true WO2022166606A1 (zh) 2022-08-11

Family

ID=82741923

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/072994 WO2022166606A1 (zh) 2021-02-07 2022-01-20 一种目标检测方法及装置

Country Status (2)

Country Link
CN (1) CN114943941A (zh)
WO (1) WO2022166606A1 (zh)


Also Published As

Publication number Publication date
CN114943941A (zh) 2022-08-26


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22748885

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22748885

Country of ref document: EP

Kind code of ref document: A1