WO2020248910A1 - Target detection method and device - Google Patents

Target detection method and device

Info

Publication number
WO2020248910A1
WO2020248910A1 · PCT/CN2020/094612 · CN2020094612W
Authority
WO
WIPO (PCT)
Prior art keywords
confidence
target
image
threshold
degree
Prior art date
Application number
PCT/CN2020/094612
Other languages
English (en)
French (fr)
Inventor
孙旭彤
李俊超
张伟
凌立
陈曦
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP20823164.7A (published as EP3965005A4)
Publication of WO2020248910A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition
    • G06V10/12Details of acquisition arrangements; Constructional details thereof
    • G06V10/14Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V10/147Details of sensors, e.g. sensor lenses
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06T3/047
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition
    • G06V10/16Image acquisition using multiple overlapping images; Image stitching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/809Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data
    • G06V10/811Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of classification results, e.g. where the classifiers operate on the same input data the classifiers operating on different input data, e.g. multi-modal recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/586Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of parking space
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Definitions

  • This application relates to automobile intelligent assisted driving technology, in particular to a target detection method and device.
  • Sensors play a very important role in the assisted driving and autonomous driving of smart cars.
  • A variety of sensors installed on the car, such as millimeter-wave radar, lidar, cameras, and ultrasonic radar, can sense the surrounding environment at any time while the car is driving, collect data, identify and track moving objects, recognize stationary scene elements such as lane lines and signs, and perform path planning in combination with navigator and map data. Sensors can perceive possible dangers in advance, promptly help the driver, and even autonomously take necessary evasive measures, effectively increasing the safety and comfort of driving.
  • In the related art, multiple cameras installed at different positions on the vehicle body capture images of the front, rear, left, and right of the car; these images are then stitched together to obtain a panoramic surround view image, which the driver can view on the vehicle's central control screen to understand the car's surroundings.
  • By passing the surround view image through a trained neural network for target detection, the positions of targets (for example, parking spaces, people, obstacles, and lane lines) are determined; the detection result has high recall and precision, thereby realizing intelligent assisted driving. However, when target detection is performed on the surround view image alone, the confidence of a detected target in the result is often too low to judge whether it is a real target, and target detection errors easily occur.
  • To that end, the present application provides a target detection method and device that fuse the confidence of a target detected from a single image with the confidence of a target detected from the surround view image, so as to ensure the credibility of the detected target and thereby improve the accuracy of target detection.
  • In a first aspect, this application provides a target detection method, including: when a first target and a second target are the same target, obtaining a fusion confidence according to a first confidence and a second confidence, where the first confidence is the credibility of the first target in a first image, the second confidence is the credibility of the second target in a second image, the first image is an image from a first camera device, the second image is a surround view image obtained from images from multiple camera devices, and the multiple camera devices include the first camera device; and outputting a target detection result according to the fusion confidence.
  • This application fuses the confidence of the target detected from a single fisheye image and the confidence of the target detected from the surround view image to ensure the credibility of the detected target, thereby improving the accuracy of target detection .
  • In a possible implementation, obtaining the fusion confidence according to the first confidence and the second confidence includes: when the first confidence is greater than a first threshold and the second confidence is greater than a second threshold, determining the fusion confidence according to preset weight values; and/or, when the first confidence is less than or equal to the first threshold and the second confidence is greater than the second threshold, determining that the fusion confidence is the second confidence.
  • In a possible implementation, outputting the target detection result according to the fusion confidence includes: determining that the fusion confidence is greater than a third threshold; and outputting the positioning information of the second target detected in the second image.
  • Based on the fusion confidence, this application outputs the positioning information of the second target detected in the second image, improving the accuracy of target detection.
  • In a possible implementation, obtaining the fusion confidence according to the first confidence and the second confidence includes: when the first confidence is greater than the first threshold and the second confidence is less than or equal to the second threshold, determining that the fusion confidence is the first confidence.
  • In a possible implementation, outputting the target detection result according to the fusion confidence includes: determining that the fusion confidence is greater than a third threshold; and outputting the positioning information of the first target detected in the first image.
  • Based on the fusion confidence, this application outputs the positioning information of the first target detected in the first image, improving the accuracy of target detection.
  • In a possible implementation, the method further includes: determining whether the first target and the second target are the same target according to the degree of image coincidence between the first target and the second target; and when the degree of image coincidence between the first target and the second target is greater than a fourth threshold, determining that the first target and the second target are the same target.
  • By determining whether the first target and the second target are the same target based on the degree of coincidence between their images, this application improves the accuracy of the same-target judgment.
  • the method further includes: acquiring the first image from the first camera device; and stitching the images from the multiple camera devices to obtain the second image.
  • In a second aspect, the present application provides a target detection device, including: a processing module configured to obtain the fusion confidence according to the first confidence and the second confidence when the first target and the second target are the same target, where the first confidence is the credibility of the first target in the first image, the second confidence is the credibility of the second target in the second image, the first image is an image from a first camera device, the second image is a surround view image obtained from images from multiple camera devices, and the multiple camera devices include the first camera device; and an interface module configured to output a target detection result according to the fusion confidence.
  • In a possible implementation, the processing module is configured to: when the first confidence is greater than a first threshold and the second confidence is greater than a second threshold, determine the fusion confidence according to preset weight values; and/or, when the first confidence is less than or equal to the first threshold and the second confidence is greater than the second threshold, determine that the fusion confidence is the second confidence.
  • In a possible implementation, the processing module is configured to determine that the fusion confidence is greater than a third threshold, and the interface module is configured to output the positioning information of the second target detected in the second image.
  • In a possible implementation, the processing module is configured to: when the first confidence is greater than the first threshold and the second confidence is less than or equal to the second threshold, determine that the fusion confidence is the first confidence.
  • In a possible implementation, the processing module is configured to determine that the fusion confidence is greater than a third threshold, and the interface module is configured to output the positioning information of the first target detected in the first image.
  • In a possible implementation, the processing module is configured to determine whether the first target and the second target are the same target according to the degree of image coincidence between the first target and the second target, and to determine that they are the same target when the degree of image coincidence between the first target and the second target is greater than a fourth threshold.
  • the processing module is configured to obtain the first image from the first camera; and, stitch the images from the multiple camera devices to obtain the second image.
  • In a third aspect, this application provides a device, including:
  • one or more processors; and
  • a memory configured to store one or more programs;
  • where, when the one or more programs are executed by the one or more processors, the device implements the method according to any one of the implementations of the first aspect.
  • the present application provides a computer-readable storage medium including a computer program, which when executed on a computer, causes the computer to execute the method described in any one of the above-mentioned first aspects.
  • the present application provides a computer program that, when executed by a computer, implements the method described in any one of the above-mentioned first aspects.
  • the present application provides a device including a processor and a memory, the memory is used to store a computer program, and the processor is used to call and run the computer program stored in the memory, so as to implement the above-mentioned first aspect The method of any one of.
  • Figure 1 is a schematic diagram of a target detection system applicable to the target detection method of this application;
  • Figure 2 is a schematic diagram of the line of sight of the camera device of this application.
  • FIG. 3 is a flowchart of an embodiment of the target detection method of this application.
  • FIG. 4A and FIG. 4B show schematic diagrams of a fisheye image and a corrected image;
  • FIG. 5 is a schematic diagram of the detection result of the first image of the application.
  • FIG. 6 is a schematic diagram of the detection result of the second image of the application.
  • FIG. 7 is a schematic diagram of another detection result of the first image of the application.
  • FIG. 8 is a schematic diagram of the detection results of the first image and the second image of the application.
  • Figure 9 is a schematic diagram of the overlapping area of two BBs in this application.
  • Figure 10 is a schematic diagram of the joint area of two BBs in this application.
  • FIG. 11 is a flow chart of the application for fusing the confidence of the first image and the second image
  • FIG. 12 is a schematic structural diagram of an embodiment of a target detection device according to this application.
  • FIG. 13 is a schematic block diagram of the target detection device provided by this application.
  • At least one (item) refers to one or more, and “multiple” refers to two or more.
  • "And/or" is used to describe the association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" can mean: only A, only B, or both A and B, where A and B can be singular or plural.
  • the character “/” generally indicates that the associated objects are in an “or” relationship.
  • "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or plural items.
  • For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.
  • Fisheye image: a fisheye image is a three-dimensional picture that is less affected by the environment and can effectively capture spatial scenes and ground scenes. A fisheye image is an image captured by a fisheye lens.
  • A fisheye lens is a lens with a focal length of 16 mm or less and an angle of view close to or equal to 180°; it is an extreme wide-angle lens.
  • To achieve the maximum photographic angle of view, the front element of a fisheye lens has a short diameter and bulges parabolically toward the front of the lens, quite similar to the eye of a fish, hence the name "fisheye lens".
  • Confidence: a description of the credibility of the detection result during target detection, indicating how credible the detected target is.
  • For example, its value ranges from 0 to 1.
  • Fig. 1 is a schematic diagram of a target detection system to which the target detection method of this application is applicable.
  • the target detection system may include multiple camera devices and target detection devices. Among them, multiple camera devices can be distributed and installed around the vehicle body. Two adjacent camera devices have their respective camera coverage areas partially overlapped, so as to facilitate the stitching of the surround view images described below.
  • the target detection device may be set independently, or integrated in the control device, or it may be implemented through software or a combination of software and hardware, which is not specifically limited.
  • Optionally, the aforementioned target detection system may be applied to an advanced driver assistance system (ADAS).
  • For example, the above-mentioned multiple camera devices may be four fisheye camera devices installed on the vehicle body: the front fisheye camera device can be installed above the middle of the car logo, and the rear fisheye camera device can be installed above the middle of the license plate, ensuring that the front and rear camera devices lie on the central axis of the car (with an error within plus or minus 5 mm).
  • The left and right fisheye camera devices can be installed under the left and right side mirrors of the car respectively; it is necessary to ensure that the left and right camera devices are symmetrical about the central axis of the car and that their distances to the front of the vehicle and to the baseline are consistent.
  • Based on the requirements for view acquisition, the four fisheye camera devices can be mounted more than 50 cm above the ground, and in particular the left and right camera devices need to be at the same height.
  • As shown in Figure 2, the angle between the line of sight of each of the four fisheye camera devices and the vertical (ground-normal) direction is about 40-45 degrees, and at most 45 degrees, ensuring that the car body is visible in the images captured by the four fisheye camera devices.
  • the above-mentioned target detection device can perform target detection based on images captured by multiple camera devices.
  • the target includes parking spaces, people, obstacles, lane lines, etc., to understand the surrounding environment of the car.
  • target detection is performed by techniques such as neural networks or computer vision.
  • It should be noted here that the system to which the target detection method of the present application is applicable is not limited to four fisheye camera devices; it can include any number of camera devices of any type, as long as the technical solution of the present application can be implemented, and no restriction is placed on the composition of the system.
  • In the following embodiments of this application, for ease of description, multiple fisheye camera devices are taken as an example.
  • Fig. 3 is a flowchart of an embodiment of a target detection method of the present application. As shown in Fig. 3, the method of this embodiment may be executed by the chip in the ADAS mentioned above, and the target detection method may include:
  • Step 301 When the first target and the second target are the same target, obtain the fusion confidence according to the first confidence and the second confidence.
  • The first confidence is the credibility of the first target in the first image, and the second confidence is the credibility of the second target in the second image. The first image is an image from the first camera device, and the second image is a surround view image obtained from images from multiple camera devices.
  • Optionally, the multiple camera devices include the first camera device.
  • In another option, the multiple camera devices do not include the first camera device.
  • In this application, the first image is an image captured by the first camera device. Further, the multiple camera devices may be arranged on the vehicle body, but their specific positions are not limited.
  • Specifically, multiple camera devices are installed around the vehicle body, and the coverage areas of two or more adjacent camera devices may partially overlap. For example, for the four fisheye camera devices on the vehicle body described in the above embodiment, each camera device can be considered as the first camera device, so the first image can come from one of the camera devices or from several of them.
  • If the vehicle-mounted camera device is a fisheye camera device, the captured image is a fisheye image. Because a fisheye camera device has a large field of view and can capture many image features, the captured three-dimensional fisheye image can effectively show the characteristics of objects in space. However, a fisheye camera device suffers from distortion: the farther from the imaging center, the greater the image distortion, which appears in the fisheye image as geometric deformation of objects, such as stretching and bending.
  • Therefore, the embodiment of the present application also provides a second image. The second image is a surround view image obtained by stitching together images taken by multiple camera devices, the multiple camera devices including the above-mentioned first camera device.
  • Based on the correspondence between the overlapping regions of the images captured from multiple angles by the multiple camera devices, an image fusion and stitching algorithm is used to stitch the top views derived from the four fisheye camera devices into the surround view image.
  • The image stitching may include three main steps: fisheye image distortion correction, top-view conversion, and image stitching. Each step yields a coordinate transformation between corresponding pixels of its input image and output image, and finally the pixel correspondence from a single image to the surround view image is obtained. It should be noted that, for the stitching technique and the generation of the surround view image, reference may be made to the prior art; this application does not specifically limit the stitching method or the generation of the surround view image.
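  • For illustration only, the following is a minimal sketch of the three stitching steps described above (fisheye distortion correction, top-view conversion, and pasting into a surround view canvas). It is not the claimed implementation; the camera parameters K and D, the ground control points, and the canvas layout are hypothetical placeholders that a real system would obtain from calibration.

```python
import cv2
import numpy as np

def undistort_fisheye(img, K, D):
    """Step 1: correct fisheye distortion using OpenCV's fisheye model."""
    h, w = img.shape[:2]
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), K, (w, h), cv2.CV_16SC2)
    return cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR)

def to_top_view(img, image_pts, ground_pts, out_size):
    """Step 2: warp the corrected image to a bird's-eye (top) view using four
    control points whose positions are known in both coordinate systems."""
    H = cv2.getPerspectiveTransform(image_pts.astype(np.float32),
                                    ground_pts.astype(np.float32))
    return cv2.warpPerspective(img, H, out_size)

def paste_into_surround(canvas, top_view, x_offset, y_offset):
    """Step 3: place one camera's top view into the surround view canvas.
    A full stitcher would also blend the overlap of adjacent cameras."""
    h, w = top_view.shape[:2]
    canvas[y_offset:y_offset + h, x_offset:x_offset + w] = top_view
    return canvas
```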
  • It should be further noted that, since there are multiple camera devices, among the multiple images they capture there may be multiple first images containing the above-mentioned first target. For ease of description, the solution of the present application is described in terms of obtaining detection results based on one first image and the second image.
  • However, those skilled in the art will understand that the first image may be multiple images, that is, at least two of the multiple images captured by the multiple camera devices.
  • Further, the embodiments of the present application may also perform target detection based on multiple first images and the second image. In that scenario, the processing of the multiple first images follows the processing of the first image in the embodiments of this application.
  • For example, multiple first images correspond to multiple first confidences, and the processing involving the first confidence below can be extended to the processing of at least one of the multiple first confidences.
  • Optionally, the first image is a fisheye image.
  • Further, the first image may be obtained by processing a fisheye image obtained from a camera device. For example, in order to correct the geometric deformation of objects in the image, the intrinsic parameters and distortion coefficients of the fisheye camera device can be obtained by calibrating it; combined with the extrinsic parameters relative to the calibration object, the fisheye image can then be corrected according to the obtained parameters and a fisheye distortion model to obtain a corrected image, and the corrected image is used as the first image.
  • For specific processing, refer to image processing methods in the prior art. The above description of obtaining the first image is only one possible image processing approach, and the embodiment of the present application does not specifically limit it.
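  • As a minimal illustrative sketch of one such calibration approach (assuming a checkerboard calibration target and OpenCV's fisheye model; the board size and file names are hypothetical), the intrinsic matrix K and distortion coefficients D obtained below could feed the distortion-correction step sketched earlier:

```python
import cv2
import glob
import numpy as np

BOARD = (9, 6)  # inner-corner count of the hypothetical checkerboard
objp = np.zeros((1, BOARD[0] * BOARD[1], 3), np.float32)
objp[0, :, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2)

obj_pts, img_pts = [], []
for path in glob.glob("calib_*.png"):            # hypothetical calibration shots
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners.reshape(1, -1, 2))

# Calibration yields the intrinsic matrix K and fisheye distortion coefficients D.
K, D = np.zeros((3, 3)), np.zeros((4, 1))
rms, K, D, _, _ = cv2.fisheye.calibrate(obj_pts, img_pts, gray.shape[::-1], K, D)
print("RMS reprojection error:", rms)
```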
  • In this application, the first confidence is the credibility of the first target detected from the first image, and the second confidence is the credibility of the second target detected from the second image.
  • Exemplarily, target detection can be performed by techniques such as neural networks or computer vision.
  • Among them, the main processing steps in computer-vision-based target detection include:
  • Image grayscale conversion: images are usually stored in red-green-blue (RGB) format. In the RGB model of an image, when the values of the red, green, and blue channels are equal, that value is called the gray value; therefore each pixel of a grayscale image needs only one byte to store its gray value (also known as intensity or brightness), with a gray range of 0-255. Grayscale conversion can serve as a preprocessing step of image processing, preparing for subsequent operations such as image segmentation, image recognition, and image analysis.
  • Gaussian filtering: a Gaussian filter is a linear smoothing filter suitable for removing Gaussian noise and widely used for denoising in image processing. Roughly speaking, Gaussian filtering is a weighted-averaging process over the whole image: the value of each pixel is obtained as a weighted average of itself and the other pixel values in its neighborhood. The specific operation is to scan every pixel in the image with a template (also called a convolution kernel or mask) and replace the value of the pixel at the template center with the weighted average gray value of the pixels in the neighborhood determined by the template.
  • Edge detection: an edge is the set of pixels around which the gray level changes sharply; it is the most basic feature of an image. Edge detection extracts the features of discontinuous parts of the image and determines regions according to closed edges.
  • Hough transform: the basic principle of the Hough transform is to use the duality of points and lines to map a given curve in the original image space, via its curve expression, to a point in a parameter space. In this way, the problem of detecting a given curve in the original image is transformed into the problem of finding a peak in the parameter space; that is, the detection of a global characteristic is transformed into the detection of local characteristics such as straight lines, ellipses, circles, and arcs.
  • Computer-vision-based target detection first converts the input image to grayscale, then applies Gaussian filtering to reduce image noise, then uses an edge detection model to extract the discontinuous features in the image, and finally uses the Hough transform to convert global features into local features.
  • This target detection method can obtain detection results similar to those of a neural network, for example, drawing a box around a detected target and obtaining the confidence of the target.
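  • For illustration only, the following is a minimal sketch of that pipeline (grayscale conversion, Gaussian filtering, edge detection, Hough transform); the kernel size, Canny thresholds, and Hough parameters are illustrative values, not values specified by this application.

```python
import cv2
import numpy as np

def detect_line_features(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)      # grayscale conversion
    blurred = cv2.GaussianBlur(gray, (5, 5), sigmaX=1.5)    # Gaussian denoising
    edges = cv2.Canny(blurred, 50, 150)                     # edge detection
    # Hough transform: turn the global edge map into local line segments,
    # e.g. candidate parking-space line markings.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=50, minLineLength=40, maxLineGap=10)
    return [] if lines is None else lines.reshape(-1, 4)
```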
  • FIGS. 4A and 4B show schematic diagrams of a fisheye image and a corrected image.
  • Target detection on an image can be achieved by, for example, a neural network, and the detection result can be presented as an image. For example, to detect whether a parking space exists in an image captured by a vehicle-mounted camera device, the output image obtained through the neural network marks the regions detected as parking spaces with boxes, highlighting, or the like.
  • Exemplarily, FIG. 5 shows the detection result of parking spaces detected in the first image (a single fisheye image) by a neural network, and FIG. 6 shows the detection result of parking spaces detected in the second image (the surround view image) by the neural network; the boxes in the figures are the detected parking spaces.
  • In addition, target detection can also give a description of the credibility of the detection result, that is, the confidence of the target, whose value ranges from 0 to 1.
  • Exemplarily, target (parking space) detection is performed on the first image by, for example, neural network technology, and two parking spaces are detected, one with a confidence of 0.3 and the other with a confidence of 0.6; this indicates that the latter is more likely than the former to be a real parking space, while the former may be another object misdetected as a parking space.
  • In this application, multiple methods can be used to determine whether the first target and the second target are the same target.
  • For example, when the degree of image coincidence between the first target and the second target is greater than the fourth threshold, it is determined that the first target and the second target are the same target; the degree of image coincidence between the first target and the second target can be obtained according to the pixel correspondence between the first image and the second image.
  • For another example, the judgment is made based on the pixel distance between the center point of the first target and the center point of the second target: the first target detected in the first image is mapped into the second image according to the pixel correspondence to obtain its center point, the pixel distance between that center point and the center point of the second target is calculated, and the pixel distance is compared with a set pixel distance threshold to determine whether the first target and the second target are the same target.
  • Exemplarily, the resolution of the second image is 600 × 600, the pixel distance threshold is 30, the center point of the first target is (400, 500), and the center point of the second target is (420, 520).
  • The pixel distance between the two center points is calculated to be about 28; since 28 is less than 30, it can be judged that the first target and the second target are the same target.
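  • A short sketch mirroring the example above is given below; the function mapping the first target's center into the second image is assumed to come from the stitching pixel correspondence, and the threshold of 30 follows the example.

```python
import math

def same_target_by_center(center1_in_surround, center2, pixel_threshold=30):
    """Compare the Euclidean pixel distance of the two centers with the threshold."""
    dx = center1_in_surround[0] - center2[0]
    dy = center1_in_surround[1] - center2[1]
    return math.hypot(dx, dy) < pixel_threshold

# (400, 500) vs (420, 520): distance = sqrt(20**2 + 20**2) ≈ 28.3 < 30 -> same target
print(same_target_by_center((400, 500), (420, 520)))  # True
```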
  • Exemplarily, FIG. 7 shows the detection result of a parking space detected in the first image (a single fisheye image), where the dotted frame is the parking-space line marking that delimits the parking space; when the line marking inside the dotted frame is detected, it can be considered that a parking space exists there, and the solid frame is a bounding box (BB) drawn according to the detected parking-space line marking to indicate the parking space.
  • FIG. 8 shows the parking space detected in the second image together with the parking space detected in the first image (the single fisheye image) of FIG. 7 mapped into the second image according to the pixel correspondence.
  • This application sets a threshold (the fourth threshold, for example 0.7) for the degree of coincidence of the two BBs, and the degree of coincidence can be represented by the Intersection over Union (IOU). Figure 9 shows the overlap area of the two BBs, and Figure 10 shows their union area.
  • When the IOU is greater than the fourth threshold, it means that the two BBs correspond to the same parking space; in this case, the fusion confidence can be obtained according to the first confidence and the second confidence.
  • When the IOU is less than or equal to the fourth threshold, it means that the two BBs may not correspond to the same parking space, and there is no need to fuse the confidences of the two BBs; in this case, the fusion confidence is not obtained and target detection is terminated.
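  • For illustration only, a minimal IOU sketch for two axis-aligned bounding boxes given as (x_min, y_min, x_max, y_max) is shown below; the threshold of 0.7 follows the example above.

```python
def iou(bb1, bb2):
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(bb1[0], bb2[0]), max(bb1[1], bb2[1])
    ix2, iy2 = min(bb1[2], bb2[2]), min(bb1[3], bb2[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # overlap area (Figure 9)
    area1 = (bb1[2] - bb1[0]) * (bb1[3] - bb1[1])
    area2 = (bb2[2] - bb2[0]) * (bb2[3] - bb2[1])
    union = area1 + area2 - inter                    # union area (Figure 10)
    return inter / union if union > 0 else 0.0

def same_parking_space(bb1, bb2, fourth_threshold=0.7):
    return iou(bb1, bb2) > fourth_threshold
```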
  • Step 302 Output the target detection result according to the fusion confidence.
  • In this application, the fusion confidence can be obtained according to the result of comparing the first confidence with the first threshold and/or the result of comparing the second confidence with the second threshold. The first threshold, the second threshold, the fourth threshold, and the third threshold mentioned later can be determined according to the target detection precision, accuracy, and/or recall rate, can be learned through training, or can be pre-configured or predefined values. This application does not specifically limit how the various thresholds are obtained.
  • This application outputs the target detection result according to the fusion confidence, covering at least one of the following cases:
  • (1) When the first confidence is greater than the first threshold and the second confidence is greater than the second threshold, the fusion confidence is determined according to the preset weight values; if the fusion confidence is determined to be greater than the third threshold, the positioning information of the second target detected in the second image is output.
  • It should be noted that the multiple first images in the optional implementations below are the same as the multiple first images mentioned above; they all refer to multiple first images containing the same first target.
  • For example, a first weight and a second weight are provided, where the first weight corresponds to the first confidence and the second weight corresponds to the second confidence; further, the first weight and the second weight are preset. Specifically, the first confidence is weighted by the first weight, the second confidence is weighted by the second weight, and the fusion confidence is obtained by summing the two weighted values, as expressed below.
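  • Expressed as a formula, with $c_1$ the first confidence, $c_2$ the second confidence, and $w_1$, $w_2$ the preset weights (the normalization $w_1 + w_2 = 1$ is a common assumption for illustration, not a requirement stated here):

$$c_{\text{fusion}} = w_1 \cdot c_1 + w_2 \cdot c_2, \qquad w_1 + w_2 = 1 \ \text{(assumed)}$$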
  • In this case, the first confidence of the first target detected in the first image and the second confidence of the second target detected in the second image are both higher than their respective thresholds, indicating that the first target detected in the first image and the second target detected in the second image are both likely to be real targets.
  • Combining the detection result of the first image with that of the second image to obtain the fusion confidence can raise the confidence of the detection result of the second image and ultimately improve the credibility of the target. If the fusion confidence is greater than the third threshold, the second target detected in the second image can be directly used as the finally detected target, and the positioning information of that target is output; the positioning information is used to present the surrounding environment of the vehicle.
  • (2) When the first confidence is less than or equal to the first threshold and the second confidence is greater than the second threshold, the fusion confidence is determined to be the second confidence; if the fusion confidence is determined to be greater than the third threshold, the positioning information of the second target detected in the second image is output.
  • Optionally, when the first image includes multiple first images, there are multiple first confidences corresponding to the multiple first images, and the above condition can also be replaced with: when the multiple first confidences are all less than or equal to the first threshold and the second confidence is greater than the second threshold.
  • In this case, the first confidence of the first target detected in the first image is less than or equal to the first threshold, while the second confidence of the second target detected in the second image is greater than the second threshold, indicating that the first target detected in the first image is more likely to be a false detection while the second target detected in the second image is more likely to be a real target; therefore, the second confidence of the second target is used as the fusion confidence.
  • If the fusion confidence is greater than the third threshold, the second target detected in the second image can be used as the finally detected target, and the positioning information of that target is output; the positioning information is used to present the surrounding environment of the vehicle.
  • (3) When the first confidence is greater than the first threshold and the second confidence is less than or equal to the second threshold, the fusion confidence is determined to be the first confidence; if the fusion confidence is determined to be greater than the third threshold, the positioning information of the first target detected in the first image is output.
  • Optionally, when the first image includes multiple first images, there are multiple first confidences corresponding to the multiple first images, and the above condition can also be replaced with: when at least one of the multiple first confidences is greater than the first threshold, or all of them are greater than the first threshold, and the second confidence is less than or equal to the second threshold, it is determined that the fusion confidence is one of the multiple first confidences; for example, the fusion confidence may be the largest of the multiple first confidences.
  • In this case, the first confidence of the first target detected in the first image is greater than the first threshold, while the second confidence of the second target detected in the second image is less than or equal to the second threshold, indicating that the first target detected in the first image is more likely to be a real target while the second target detected in the second image is more likely to be a false detection; therefore, the first confidence of the first target is used as the fusion confidence. If the fusion confidence is greater than the third threshold, the first target detected in the first image can be used as the finally detected target, and the positioning information of that target is output; the positioning information is used to present the surrounding environment of the vehicle.
  • (4) When the first confidence is less than or equal to the first threshold and the second confidence is less than or equal to the second threshold, it is determined that the acquisition of the fusion confidence fails, and a prompt message indicating that target detection has failed is output.
  • Optionally, when the first image includes multiple first images, the above condition can also be replaced with: when the multiple first confidences are all less than or equal to the first threshold and the second confidence is less than or equal to the second threshold.
  • In this case, the first confidence of the first target detected in the first image is less than or equal to the first threshold and the second confidence of the second target detected in the second image is less than or equal to the second threshold, indicating that both the first target detected in the first image and the second target detected in the second image are likely to be false detections.
  • It should be noted that the condition of each of the above cases is an exemplary description; for the special cases in which the first confidence equals the first threshold or the second confidence equals the second threshold, the case into which they fall can be set according to the specific settings of the first threshold and the second threshold.
  • For example, the condition of case (1) above can also be that the first confidence is greater than or equal to the first threshold and the second confidence is greater than or equal to the second threshold; the condition of case (2) can also be that the first confidence is less than the first threshold and the second confidence is greater than or equal to the second threshold; the condition of case (3) can also be that the first confidence is greater than or equal to the first threshold and the second confidence is less than the second threshold; and the condition of case (4) can also be that the first confidence is less than the first threshold and the second confidence is less than the second threshold.
  • Alternatively, the condition of case (1) above can also be that the first confidence is greater than the first threshold and the second confidence is greater than or equal to the second threshold; the condition of case (2) can also be that the first confidence is less than or equal to the first threshold and the second confidence is greater than or equal to the second threshold; the condition of case (3) can also be that the first confidence is greater than the first threshold and the second confidence is less than the second threshold; and the condition of case (4) can also be that the first confidence is less than or equal to the first threshold and the second confidence is less than the second threshold.
  • Alternatively, the condition of case (1) above can also be that the first confidence is greater than or equal to the first threshold and the second confidence is greater than the second threshold; the condition of case (2) can also be that the first confidence is less than the first threshold and the second confidence is greater than the second threshold; the condition of case (3) can also be that the first confidence is greater than or equal to the first threshold and the second confidence is less than or equal to the second threshold; and the condition of case (4) can also be that the first confidence is less than the first threshold and the second confidence is less than or equal to the second threshold.
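  • For illustration only, the four cases (1)-(4) above can be sketched as follows; the threshold values and weights are hypothetical placeholders, since the text leaves their concrete values to configuration or training.

```python
THRESH_1 = 0.5     # first threshold (hypothetical value)
THRESH_2 = 0.5     # second threshold (hypothetical value)
W1, W2 = 0.4, 0.6  # preset weights for the weighted fusion (hypothetical values)

def fuse_confidence(c1, c2):
    """Return (fusion_confidence, source) or (None, None) if acquisition fails.
    `source` records which image's detection should be output later."""
    if c1 > THRESH_1 and c2 > THRESH_2:      # case (1): weighted fusion
        return W1 * c1 + W2 * c2, "second_image"
    if c1 <= THRESH_1 and c2 > THRESH_2:     # case (2): keep the second confidence
        return c2, "second_image"
    if c1 > THRESH_1 and c2 <= THRESH_2:     # case (3): keep the first confidence
        return c1, "first_image"
    return None, None                        # case (4): fusion acquisition fails
```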
  • This application fuses the confidence of the target detected from a single image and the confidence of the target detected from the surround view image to ensure the credibility of the detected target, thereby improving the accuracy of target detection.
  • Figure 11 shows the process of fusing the confidence of the first image and the second image in the target detection method, including:
  • Step 1101: Determine whether the IOU is greater than 0.7.
  • The IOU here is the degree of image coincidence between the target detected in the first image and the target detected in the second image, obtained according to the pixel correspondence between the first image and the second image.
  • If IOU > 0.7, perform step 1102; if IOU ≤ 0.7, perform step 1107 to terminate the confidence fusion process.
  • Step 1102: Compare x with A and compare y with B, where x denotes the first confidence, A the first threshold, y the second confidence, and B the second threshold.
  • When z > 0.5, output the positioning information of the target detected in the second image, where z denotes the fusion confidence.
  • Step 1106: When x ≤ A and y ≤ B, the acquisition of z fails.
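  • For illustration only, the following sketch mirrors the Figure 11 flow, reusing the iou() and fuse_confidence() sketches above; 0.7 and 0.5 are the example thresholds from the flowchart, and the bounding-box inputs are hypothetical.

```python
def detect_target(bb_first_mapped, c_first, bb_second, c_second):
    """bb_first_mapped: first target's BB mapped into the surround view image."""
    # Step 1101: coincidence check between the two detections.
    if iou(bb_first_mapped, bb_second) <= 0.7:
        return None                                 # step 1107: terminate fusion
    z, source = fuse_confidence(c_first, c_second)  # compare x with A, y with B
    if z is None:
        return None                                 # step 1106: z acquisition fails
    if z > 0.5:
        # Output the positioning information of the corresponding detection
        # (the second image's target in cases (1)/(2); in case (3) a full
        # implementation would output the first target as located in the first image).
        return bb_second if source == "second_image" else bb_first_mapped
    return None
```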
  • FIG. 12 is a schematic structural diagram of an embodiment of a target detection device of this application.
  • The target detection device of this embodiment can be set up independently, integrated in a control device, or implemented through software or a combination of software and hardware.
  • For example, it can be applied to an ADAS and be specifically implemented as a chip or integrated circuit in the ADAS, or implemented independently of the ADAS; for another example, it can be applied to a vehicle control entity and be integrated into the control entity or be independent of it.
  • As shown in FIG. 12, the device includes a processing module 1201 and an interface module 1202. The processing module 1201 is configured to obtain the fusion confidence according to the first confidence and the second confidence when the first target and the second target are the same target, where the first confidence is the credibility of the first target in the first image, the second confidence is the credibility of the second target in the second image, the first image is an image from a first camera device, and the second image is a surround view image obtained from images from multiple camera devices, the multiple camera devices including the first camera device. The interface module 1202 is configured to output the target detection result according to the fusion confidence.
  • In a possible implementation, the processing module 1201 is configured to: when the first confidence is greater than a first threshold and the second confidence is greater than a second threshold, determine the fusion confidence according to preset weight values; and/or, when the first confidence is less than or equal to the first threshold and the second confidence is greater than the second threshold, determine that the fusion confidence is the second confidence.
  • In a possible implementation, the processing module 1201 is configured to determine that the fusion confidence is greater than a third threshold, and the interface module 1202 is configured to output the positioning information of the second target detected in the second image.
  • In a possible implementation, the processing module 1201 is configured to: when the first confidence is greater than the first threshold and the second confidence is less than or equal to the second threshold, determine that the fusion confidence is the first confidence.
  • In a possible implementation, the processing module 1201 is configured to determine that the fusion confidence is greater than a third threshold, and the interface module 1202 is configured to output the positioning information of the first target detected in the first image.
  • In a possible implementation, the processing module 1201 is configured to determine whether the first target and the second target are the same target according to the degree of image coincidence between the first target and the second target, and to determine that they are the same target when the degree of image coincidence between the first target and the second target is greater than a fourth threshold.
  • In a possible implementation, the processing module 1201 is configured to obtain the first image from the first camera device, and to stitch the images from the multiple camera devices to obtain the second image.
  • the device in this embodiment may be used to implement the technical solution of the method embodiment shown in FIG. 3 or 11, and its implementation principles and technical effects are similar, and will not be repeated here.
  • In addition, the present application provides a computer-readable storage medium including a computer program which, when executed on a computer, causes the computer to execute the method of any one of the method embodiments shown in FIGS. 3-11.
  • the present application provides a computer program, when the computer program is executed by a computer, the method of any one of the method embodiments shown in FIGS. 3-11 is implemented.
  • In addition, the present application provides a device, which may be an independent device, may be integrated in a control device, or may be implemented through software or a combination of software and hardware.
  • For example, it can be applied to an ADAS and be specifically implemented as a chip or integrated circuit in the ADAS, or implemented independently of the ADAS; for another example, it can be applied to a vehicle control entity and be integrated into the control entity or be independent of it.
  • The device includes a processor and a memory; the memory is configured to store a computer program, and the processor is configured to call and run the computer program stored in the memory to implement the method of any one of the method embodiments shown in FIGS. 3-11.
  • the target detection device 1300 may be applied to ADAS or a control entity of a vehicle, and it includes a processor 1301 and a transceiver 1302.
  • the target detection device 1300 further includes a memory 1303.
  • the processor 1301, the transceiver 1302, and the memory 1303 can communicate with each other through an internal connection path to transfer control signals and/or data signals.
  • the memory 1303 is used to store computer programs.
  • the processor 1301 is configured to execute a computer program stored in the memory 1303, so as to realize each function of the target detection device 1300 in the foregoing apparatus embodiment.
  • For example, the processor 1301 may be configured to perform the operations and/or processing performed by the processing module 1201 described in the apparatus embodiment (for example, FIG. 12), and the transceiver 1302 is configured to perform the operations and/or processing performed by the interface module 1202.
  • the memory 1303 may also be integrated in the processor 1301 or independent of the processor 1301.
  • the target detection device 1300 may further include a power supply 1304 for providing power to various devices or circuits.
  • the target detection device 1300 may also include one or more of an input unit 1305, a display unit 1306 (also can be regarded as an output unit), an audio circuit 1307, a camera 1308, a sensor 1309, etc. .
  • the audio circuit may also include a speaker 13071, a microphone 13072, etc., which will not be repeated.
  • the processor and memory mentioned in the above embodiments may be located on an integrated circuit or chip, and the processor has image processing capabilities.
  • the steps of the foregoing method embodiments can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
  • The processor may be a general-purpose processor or a digital signal processor (DSP), and the integrated circuit or chip may be an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware encoding processor, or executed and completed by a combination of hardware and software modules in the encoding processor.
  • the software module can be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
  • The volatile memory may be a random access memory (RAM), which is used as an external cache. Many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), serial link DRAM (SLDRAM), and direct rambus RAM (DR RAM).
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • For example, the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer device (a personal computer, a server, a network device, or the like) to execute all or part of the steps of the method described in each embodiment of the present application.
  • The aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.

Abstract

This application discloses a target detection method and device, which ensure the credibility of a detected target and thereby improve the accuracy of target detection, and which belong to the field of sensor technology. The target detection method includes: when a first target and a second target are the same target, obtaining a fusion confidence according to a first confidence and a second confidence, where the first confidence is the credibility of the first target in a first image, the second confidence is the credibility of the second target in a second image, the first image is an image from a first camera device, and the second image is a surround view image obtained from images from multiple camera devices, the multiple camera devices including the first camera device; and outputting a target detection result according to the fusion confidence. The method and device can be used for target detection and tracking in assisted driving and autonomous driving.

Description

Target detection method and device
This application claims priority to Chinese Patent Application No. 201910498217.2, filed with the Chinese Patent Office on June 10, 2019 and entitled "Target detection method and device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to automotive intelligent assisted driving technology, and in particular to a target detection method and device.
Background
With the development of society, smart cars are gradually entering people's daily lives. Sensors play a very important role in the assisted driving and autonomous driving of smart cars. Various sensors installed on the car, such as millimeter-wave radar, lidar, cameras, and ultrasonic radar, sense the surrounding environment at any time while the car is driving, collect data, identify and track moving objects, recognize stationary scene elements such as lane lines and signs, and perform path planning in combination with navigator and map data. Sensors can perceive possible dangers in advance and promptly help the driver, or even autonomously take necessary evasive measures, effectively increasing the safety and comfort of driving.
In the related art, multiple cameras installed at different positions on the vehicle body capture images of the front, rear, left, and right of the car, and these images are then stitched into a panoramic surround view image, which the driver can view on the vehicle's central control screen to understand the car's surroundings. By passing the surround view image through a trained neural network for target detection, the positions of targets (for example, parking spaces, people, obstacles, and lane lines) are determined; the detection result has high recall and precision, thereby realizing intelligent assisted driving.
However, when target detection is performed based on the surround view image alone, the confidence of a target in the detection result is too low to judge whether it is a real target, and target detection errors easily occur.
Summary
This application provides a target detection method and device, which fuse the confidence of a target detected from a single image with the confidence of a target detected from a surround view image, so as to ensure the credibility of the detected target and thereby improve the accuracy of target detection.
In a first aspect, this application provides a target detection method, including:
when a first target and a second target are the same target, obtaining a fusion confidence according to a first confidence and a second confidence, where the first confidence is the credibility of the first target in a first image, the second confidence is the credibility of the second target in a second image, the first image is an image from a first camera device, the second image is a surround view image obtained from images from multiple camera devices, and the multiple camera devices include the first camera device; and outputting a target detection result according to the fusion confidence.
By fusing the confidence of a target detected from a single fisheye image with the confidence of a target detected from the surround view image, this application ensures the credibility of the detected target and thereby improves the accuracy of target detection.
In a possible implementation, obtaining the fusion confidence according to the first confidence and the second confidence includes: when the first confidence is greater than a first threshold and the second confidence is greater than a second threshold, determining the fusion confidence according to preset weight values; and/or, when the first confidence is less than or equal to the first threshold and the second confidence is greater than the second threshold, determining that the fusion confidence is the second confidence.
In a possible implementation, outputting the target detection result according to the fusion confidence includes: determining that the fusion confidence is greater than a third threshold; and outputting the positioning information of the second target detected in the second image.
Based on the fusion confidence, this application outputs the positioning information of the second target detected in the second image, improving the accuracy of target detection.
In a possible implementation, obtaining the fusion confidence according to the first confidence and the second confidence includes: when the first confidence is greater than the first threshold and the second confidence is less than or equal to the second threshold, determining that the fusion confidence is the first confidence.
In a possible implementation, outputting the target detection result according to the fusion confidence includes: determining that the fusion confidence is greater than a third threshold; and outputting the positioning information of the first target detected in the first image.
Based on the fusion confidence, this application outputs the positioning information of the first target detected in the first image, improving the accuracy of target detection.
In a possible implementation, the method further includes: determining whether the first target and the second target are the same target according to the degree of image coincidence between the first target and the second target; and when the degree of image coincidence between the first target and the second target is greater than a fourth threshold, determining that the first target and the second target are the same target.
By determining whether the first target and the second target are the same target based on their degree of image coincidence, this application improves the accuracy of the same-target judgment.
In a possible implementation, the method further includes: acquiring the first image from the first camera device; and stitching the images from the multiple camera devices to obtain the second image.
In a second aspect, this application provides a target detection device, including:
a processing module configured to obtain a fusion confidence according to a first confidence and a second confidence when a first target and a second target are the same target, where the first confidence is the credibility of the first target in a first image, the second confidence is the credibility of the second target in a second image, the first image is an image from a first camera device, the second image is a surround view image obtained from images from multiple camera devices, and the multiple camera devices include the first camera device; and an interface module configured to output a target detection result according to the fusion confidence.
In a possible implementation, the processing module is configured to: when the first confidence is greater than a first threshold and the second confidence is greater than a second threshold, determine the fusion confidence according to preset weight values; and/or, when the first confidence is less than or equal to the first threshold and the second confidence is greater than the second threshold, determine that the fusion confidence is the second confidence.
In a possible implementation, the processing module is configured to determine that the fusion confidence is greater than a third threshold, and the interface module is configured to output the positioning information of the second target detected in the second image.
In a possible implementation, the processing module is configured to: when the first confidence is greater than the first threshold and the second confidence is less than or equal to the second threshold, determine that the fusion confidence is the first confidence.
In a possible implementation, the processing module is configured to determine that the fusion confidence is greater than a third threshold, and the interface module is configured to output the positioning information of the first target detected in the first image.
In a possible implementation, the processing module is configured to determine whether the first target and the second target are the same target according to the degree of image coincidence between the first target and the second target, and to determine that they are the same target when the degree of image coincidence between the first target and the second target is greater than a fourth threshold.
In a possible implementation, the processing module is configured to acquire the first image from the first camera device, and to stitch the images from the multiple camera devices to obtain the second image.
In a third aspect, this application provides a device, including:
one or more processors; and
a memory configured to store one or more programs;
where, when the one or more programs are executed by the one or more processors, the device implements the method according to any one of the implementations of the first aspect.
In a fourth aspect, this application provides a computer-readable storage medium including a computer program which, when executed on a computer, causes the computer to execute the method according to any one of the implementations of the first aspect.
In a fifth aspect, this application provides a computer program which, when executed by a computer, implements the method according to any one of the implementations of the first aspect.
In a sixth aspect, this application provides an apparatus including a processor and a memory, where the memory is configured to store a computer program and the processor is configured to call and run the computer program stored in the memory, so as to implement the method according to any one of the implementations of the first aspect.
Brief Description of the Drawings
Figure 1 is a schematic diagram of a target detection system to which the target detection method of this application is applicable;
Figure 2 is a schematic diagram of the line-of-sight direction of the camera devices of this application;
Figure 3 is a flowchart of an embodiment of the target detection method of this application;
Figures 4A and 4B are schematic diagrams of a fisheye image and a corrected image;
Figure 5 is a schematic diagram of a detection result of the first image of this application;
Figure 6 is a schematic diagram of a detection result of the second image of this application;
Figure 7 is a schematic diagram of another detection result of the first image of this application;
Figure 8 is a schematic diagram of detection results of the first image and the second image of this application;
Figure 9 is a schematic diagram of the overlap area of two BBs in this application;
Figure 10 is a schematic diagram of the union area of two BBs in this application;
Figure 11 is a flowchart of fusing the confidences of the first image and the second image in this application;
Figure 12 is a schematic structural diagram of an embodiment of the target detection device of this application;
Figure 13 is a schematic block diagram of the target detection device provided by this application.
Detailed Description
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in this application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and the like in the specification, claims, and drawings of this application are used only to distinguish the described objects and should not be understood as indicating or implying relative importance or order. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion, for example, the inclusion of a series of steps or units. A method, system, product, or device is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
It should be understood that in this application, "at least one (item)" refers to one or more, and "multiple" refers to two or more. "And/or" describes the association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: only A, only B, or both A and B, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship. "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may be single or multiple.
以下是本申请涉及到的一些名词说明,即本领域公知术语的含义:
鱼眼图像:鱼眼图像为三维图片,受环境影响较小,能够有效地拍摄到空间景物及地面景物。鱼眼图像是由鱼眼镜头拍摄的图像,鱼眼镜头是一种焦距小于或等于16mm、视角接近或等于180°的镜头,是一种极端的广角镜头。为使镜头达到最大的摄影视角,鱼眼镜头的前镜片直径很大且呈抛物状向镜头前部凸出,与鱼的眼睛颇为相似,因此而得名“鱼眼镜头”。
拼接:通过来自多个摄像装置的图像中彼此重合区的对应图像关系,依靠图像的融合拼接算法将来自多个摄像装置的图像进行图像拼接。
置信度:目标检测时对检测结果可信程度的描述,表明检测到的目标为真实目标的可能性,例如取值范围在0-1之间。
图1为本申请目标检测方法适用的目标检测系统示意图,如图1所示,该目标检测系统可以包括多个摄像装置和目标检测装置,其中,多个摄像装置可以分布安装于车身一周,相邻两个摄像装置,其各自的摄像覆盖区有部分重合,以便于下文中描述的环视图像的拼接。目标检测装置可以是独立设置,也可以集成在控制装置中,还可以是通过软件或者软件与硬件结合实现,对此不做具体限定。可选的,前述目标检测系统可以应用于高级驾驶辅助系统(Advanced Driver Assistance System,ADAS)。
示例性的,上述多个摄像装置可以采用安装在车身上的四路鱼眼摄像装置,其中前方鱼眼摄像装置可以安装在汽车车标中部靠上的位置,后方鱼眼摄像装置可以安装在汽车车牌中部靠上的位置,需要保证前后摄像装置位于汽车的中轴线上(误差可以在正负5mm以内)。左侧鱼眼摄像装置和右侧鱼眼摄像装置可以分别安装在汽车的左侧和右侧后视镜的下方,需要保证左右摄像装置相对于汽车的中轴线对称,并且这两个鱼眼摄像装置和车前方及基线的距离保持一致。基于视角获取的要求,四路鱼眼摄像装置可以高于地面50cm,尤其是左右摄像装置需要保持同一高度。如图2所示,四路鱼眼摄像装置的视线方向与垂直地面方向之间的夹角大概在40-45度之间,最多不超过45度,确保四路鱼眼摄像装置拍摄的图像中可以看到汽车车身。
上述目标检测装置可以基于多个摄像装置拍摄得到的图像进行目标检测,该目标包括停车位、人、障碍物、车道线等,以了解汽车周边环境。例如通过神经网络或计算机视觉等技术进行目标检测。
这里需要说明的是,本申请目标检测方法所适用的系统不限于包括四路鱼眼摄像装置,可以包含任意数量的任意类型的摄像装置,以能实现本申请的技术方案为准,不对系统的组成做任何限制。本申请以下实施例中,为阐述方便,仅以多路鱼眼摄像装置为例进行说明。
以下对本申请提供的目标检测方法进行说明。
图3为本申请目标检测方法实施例的流程图,如图3所示,本实施例的方法可以由上述ADAS中的芯片执行,该目标检测方法可以包括:
步骤301、当第一目标和第二目标为同一目标时,根据第一置信度和第二置信度获取融合置信度。
第一置信度为第一目标在第一图像中的可信程度,第二置信度为第二目标在第二图像中的可信程度,第一图像为来自第一摄像装置的图像,第二图像为根据来自多个摄像装置的图像得到的环视图像。可选的,多个摄像装置包括第一摄像装置。又一可选的,所述多个摄像装置不包括第一摄像装置。
本申请中第一图像为第一摄像装置拍摄得到的图像。进一步,所述多个摄像装置可以设置于车身,但是具体位置不做具体限定。
具体的,车身的周围安装了多个摄像装置。其中相邻的两个或者多个摄像装置的摄像覆盖区可能存在部分重合。例如,如上述实施例中描述的车身上的四路鱼眼摄像装置,每个摄像装置均可认为是第一摄像装置,因此第一图像可以来自于其中一路摄像装置,也可以来自于其中多路摄像装置。如果车载摄像装置是鱼眼摄像装置,拍摄得到的图像为鱼眼图像,由于鱼眼摄像装置的视场角较大,拍摄到的图像中包含的特征较多,因此拍摄得到的三维的鱼眼图像能够有效地显示空间物体的特征。但是鱼眼摄像装置存在畸变现象,距离成像中心点越远图像畸变越大,在鱼眼图像中体现为物体的几何形变,例如拉伸和弯曲。
因此,本申请实施例还提供了第二图像。第二图像为根据多个摄像装置拍摄的图像拼接后得到的环视图像,该多个摄像装置包括上述第一摄像装置。根据多个摄像装置拍摄得到的多个角度的图像中的重合区域的对应图像关系,依靠图像的融合拼接算法将四路鱼眼摄像装置拍摄的俯视图进行图像拼接得到环视图像,该图像拼接可以包括鱼眼图像畸变校正,俯视转换和图像拼接三个主要步骤,每个步骤都可以得到一个输入图像和输出图像中对应像素之间的坐标转换关系,进而最终得到单张图像到环视图像的像素对应关系。需要说明的是,关于拼接技术以及环视图像的生成可以参考现有技术,本申请不对具体的拼接方式和环视图像的生成进行具体限定。
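作为上述拼接过程中像素对应关系的一个简化示意(并非对本申请实现方式的限定),下面的Python/OpenCV代码假设每路摄像装置经畸变校正后的图像到环视画布的单应矩阵H已通过标定获得,画布尺寸600×600、函数名等均为假设的示例:
```python
import cv2
import numpy as np

def build_surround_view(corrected_imgs, homographies, canvas_size=(600, 600)):
    """将各路已校正图像按标定得到的单应矩阵变换到俯视环视画布并简单叠加。"""
    canvas = np.zeros((canvas_size[1], canvas_size[0], 3), dtype=np.uint8)
    for img, H in zip(corrected_imgs, homographies):
        warped = cv2.warpPerspective(img, H, canvas_size)   # 俯视转换
        mask = warped.sum(axis=2) > 0                        # 非黑像素视为有效
        canvas[mask] = warped[mask]                          # 简单覆盖示意, 重合区实际可采用融合算法处理
    return canvas

def map_point_to_surround(pt, H):
    """利用同一单应矩阵把单张图像中的像素坐标映射到环视图像坐标。"""
    src = np.array([[pt]], dtype=np.float32)                 # 形状(1,1,2)
    dst = cv2.perspectiveTransform(src, H)
    return tuple(dst[0, 0])
```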
进一步需要说明的是,由于存在多个摄像装置,多个摄像装置所拍摄得到的多个图像中,可以存在多个包括上述第一目标的第一图像。为阐述方便,本申请方案涉及的是根据一个第一图像以及第二图像获取检测结果。但是本领域技术人员可知,所述第一图像可以为多个图像,即可以为多个摄像装置所拍摄的多个图像中的至少两个第一图像。进一步,本申请的实施例也可以基于多个第一图像以及第二图像执行目标检测。在这种场景下,针对多个第一图像的处理参考本申请实施例中关于第一图像的处理方式。例如,多个第一图像对应多个第一置信度。下文中涉及第一置信度的处理,可以扩展到多个第一置信度中的至少一个第一置信度的处理。
可选的,第一图像是鱼眼图像。进一步,所述第一图像可以是基于从摄像装置得到的鱼眼图像进行处理后得到的。例如,为了矫正图像中物体的几何形变,可以通过对鱼眼摄像装置进行标定得到鱼眼摄像装置的内参和畸变系数,再结合相对于标定物的外参,可以根据得到的参数和鱼眼畸变模型对鱼眼图像进行校正得到校正后图像,将校正后图像作为第一图像。具体的处理方式可以参见现有技术中的图像处理方式,上述针对第一图像的获得的阐述仅仅是一种可能的图像处理手段,本申请实施例不做具体限定。
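下面给出一种可能的鱼眼畸变校正写法的示意,基于OpenCV的鱼眼相机模型,假设内参矩阵K和四参数畸变系数D已由标定得到;函数与参数的选择仅为示例,具体校正方式不受此限:
```python
import cv2
import numpy as np

def undistort_fisheye(fisheye_img, K, D, balance=0.0):
    """根据标定得到的内参K和畸变系数D对鱼眼图像做畸变校正的示意。"""
    h, w = fisheye_img.shape[:2]
    new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
        K, D, (w, h), np.eye(3), balance=balance)             # 估计校正后的相机矩阵
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        K, D, np.eye(3), new_K, (w, h), cv2.CV_16SC2)          # 生成重映射表
    return cv2.remap(fisheye_img, map1, map2,
                     interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)           # 得到校正后图像
```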
本申请中第一置信度为从第一图像中检测到的第一目标的可信程度,第二置信度为从第二图像中检测到的第二目标的可信程度。
示例性的,可以通过神经网络或计算机视觉等技术进行目标检测。其中,基于计算机视觉进行目标检测的过程中主要的处理方式包括:
图像灰度化处理:图像通常以红绿蓝(red green blue,RGB)格式进行存储。在图像的RGB模型中,当红绿蓝三个颜色通道的值相同时,该值叫灰度值,因此灰度图像中每个像素只需一个字节存放灰度值(又称强度值、亮度值),灰度范围为0-255。图像灰度化处理可以作为图像处理的预处理步骤,为之后的图像分割、图像识别和图像分析等操作做准备。
高斯滤波:高斯滤波是一种线性平滑滤波,适用于消除高斯噪声,广泛应用于图像处理的减噪过程。通俗的讲,高斯滤波就是对整幅图像进行加权平均的过程,每一个像素点的值,都由其本身和邻域内的其他像素值经过加权平均后得到。其具体操作是用一个模板(或称卷积、掩模)扫描图像中的每一个像素,用模板确定的邻域内像素的加权平均灰度值去替代模板中心像素点的值。
边缘检测:边缘是指其周围像素灰度急剧变化的那些象素的集合,它是图像最基本的特征。边缘检测是指提取图像中不连续部分的特征,根据闭合的边缘确定区域。
霍夫变换:霍夫变换的基本原理在于利用点与线的对偶性,将原始图像空间给定的曲线通过曲线表达形式变为参数空间的一个点。这样就把原始图像中给定曲线的检测问题转化为寻找参数空间中的峰值问题。亦即把检测整体特性转化为检测局部特性,例如直线、椭圆、圆、弧线等。
基于计算机视觉进行目标检测包括首先对输入图像进行灰度化处理,使图像变为灰度图,接着通过高斯滤波对图像进行减噪,再通过边缘检测模型提取图像中不连续的特征,最后通过霍夫变换将整体特征转换为局部特性。该目标检测方法可以得到与神经网络类似的检测结果,例如,在检测到的目标周围画框,得到该目标的置信度等。
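作为上述基于计算机视觉的检测流程的一个简化示意,可以用OpenCV按灰度化、高斯滤波、边缘检测、霍夫变换的顺序实现;其中滤波核大小、Canny阈值和霍夫变换参数均为假设的示例取值:
```python
import cv2
import numpy as np

def detect_lines_cv(image_bgr):
    """灰度化 -> 高斯滤波 -> 边缘检测 -> 霍夫变换提取直线的简化流程。"""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)          # 灰度化, 灰度范围0-255
    blurred = cv2.GaussianBlur(gray, (5, 5), sigmaX=1.5)        # 高斯滤波减噪
    edges = cv2.Canny(blurred, 50, 150)                         # 边缘检测
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,    # 霍夫变换检测线段
                            threshold=50, minLineLength=40, maxLineGap=10)
    return lines  # 形如(N,1,4)的线段端点数组, 可据此进一步确定车位线等目标
```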
图4A和4B示出了一种鱼眼图像和矫正后的图像的示意图。通过例如神经网络可以实现对图像的目标检测,其检测结果可以以图像的方式呈现,例如,要在车载摄像装置拍摄的图像中检测是否存在停车位,通过神经网络得到的输出图像中以画框、高亮等方式圈出被检测为停车位的区域。示例性的,图5示出了第一图像(单张鱼眼图像)通过神经网络检测到停车位的检测结果示意图,图6示出了第二图像(环视图像)通过神经网络检测到停车位的检测结果示意图,图中的方框内即为检测到的停车位。此外目标检测还可以给出对于检测结果的可信程度的描述,即目标的置信度,其取值范围在0-1之间。示例性的,通过例如神经网络技术对第一图像进行目标(车位)检测,检测出2个停车位,其中一个停车位的置信度为0.3,另一个停车位的置信度为0.6,说明后者相较于前者在真实情况下是停车位的可能性高,前者存在把其他物体误检为停车位的可能性。
需要说明的是,除了上述基于神经网络的目标检测方法和基于计算机视觉的目标检测方法外,本申请对第一目标和第二目标的检测还可以采用其他的目标检测方法,对此不做具体限定。
本申请中判断第一目标和第二目标是否为同一目标可以采用多种方法。
一种可能的实现中,根据第一目标和第二目标的图像重合度确定第一目标和第二目标是否为同一目标,当第一目标和第二目标的图像重合度大于第四阈值时,确定第一目标和第二目标为同一目标,可以根据第一图像和第二图像的像素对应关系获取第一目标和第二目标的图像重合度。
另一种可能的实现中,根据第一目标的中心点和第二目标的中心点之间的像素距离进行判断。具体的,根据像素对应关系将第一目标转换到第二图像后获取其中心点,再计算该中心点和第二目标的中心点之间的像素距离,将该像素距离和设定的像素距离阈值进行比较,得到第一目标和第二目标是否为同一目标的判断结果。例如,第二图像的分辨率为600×600,像素距离阈值为30,第一目标的中心点为(400,500),第二目标的中心点为(420,520),计算可得两个中心点之间的像素距离约为28,小于阈值30,因此可判断第一目标和第二目标为同一目标。
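对应上述基于中心点像素距离的判断方式,下面给出一个简化示意,中心点坐标与阈值30沿用上文示例,均为示例取值:
```python
import math

def is_same_target_by_center(center1, center2, dist_thresh=30):
    """根据两个目标中心点的像素距离判断是否为同一目标的示意。"""
    dx = center1[0] - center2[0]
    dy = center1[1] - center2[1]
    return math.hypot(dx, dy) < dist_thresh

# (400,500)与(420,520)的像素距离约为28, 小于阈值30, 判定为同一目标
print(is_same_target_by_center((400, 500), (420, 520)))  # True
```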
需要说明的是,上述两种方法是示例性说明,本申请对确定第一目标和第二目标是否为同一目标的方法不做具体限定。
示例性的,图7示出了第一图像(单张鱼眼图像)中检测到停车位的检测结果示意图,其中虚线框中为标定停车位的车位线标志,当检测到虚线框内的车位线标志可以认为该处存在停车位,实线框为根据检测到的车位线标志画出来的示意停车位的边框(Bounding Box,BB)。图8示出了在第二图像中检测到停车位,并将图7中的第一图像(单张鱼眼图像)中检测到的停车位根据像素对应关系转换到第二图像(环视图像)中的检测结果示意图,其中虚线框表示根据第一图像中检测到的停车位转换后得到的停车位的BB,实线框表示第二图像中检测到的停车位的BB,从图中可以看到实线框完全包含在虚线框中,表示这两个BB对应的是同一个停车位。
本申请设定了两个BB的重合度的阈值(第四阈值,例如0.7),该重合度可以由交并比(Intersection Over Union,IOU)表示,图9示出了两个BB的重叠面积,图10示出了两个BB的联合面积,IOU=重叠面积/联合面积。当IOU大于第四阈值,表示两个BB对应的为同一停车位,此时可以根据第一置信度和第二置信度获取融合置信度。但如果IOU小于或等于第四阈值,则表示两个BB对应的可能不是同一停车位,也就没必要对这两个BB的置信度进行融合,此时不需要获取融合置信度,并终止目标检测。
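作为IOU计算以及同一目标判断的一个简化示意,边框以(x1, y1, x2, y2)像素坐标表示,第四阈值取上文示例中的0.7:
```python
def iou(bb1, bb2):
    """计算两个边框(x1, y1, x2, y2)的交并比: IOU = 重叠面积 / 联合面积。"""
    ix1, iy1 = max(bb1[0], bb2[0]), max(bb1[1], bb2[1])
    ix2, iy2 = min(bb1[2], bb2[2]), min(bb1[3], bb2[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)         # 重叠面积
    area1 = (bb1[2] - bb1[0]) * (bb1[3] - bb1[1])
    area2 = (bb2[2] - bb2[0]) * (bb2[3] - bb2[1])
    union = area1 + area2 - inter                          # 联合面积
    return inter / union if union > 0 else 0.0

def is_same_target(bb1, bb2, iou_thresh=0.7):
    """IOU大于第四阈值(示例取0.7)时认为两个边框对应同一目标。"""
    return iou(bb1, bb2) > iou_thresh
```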
步骤302、根据融合置信度输出目标检测结果。
本申请可以根据第一置信度和第一阈值的比较结果,和/或,第二置信度和第二阈值的比较结果获取融合置信度,其中第一阈值、第二阈值、第四阈值,以及之后涉及到的第三阈值可以根据目标检测精度、准确率和/或召回率等确定得到,也可以通过训练学习得到,还可以是预先配置或定义的数值。本申请对各种阈值的获取方式不作具体限定。
本申请根据融合置信度输出目标检测结果包括以下几种情况中的至少一个:
(1)当第一置信度大于第一阈值,且第二置信度大于第二阈值时,根据预设的权重值确定融合置信度;确定融合置信度大于第三阈值,输出在第二图像中检测到的第二目标的定位信息。一种可选的实现中,对于存在多个第一图像的场景(即所述多个第一图像中均包含所述第一目标),对应多个第一图像存在多个第一置信度,上述条件还可以替换为“当所述多个第一置信度中的至少一个大于第一阈值或者全部大于第一阈值,且第二置信度大于第二阈值时”。这里需要说明的是,下文中所涉及的可选的实现中的多个第一图像与本处提到的多个第一图像的解释相同,均涉及多个第一图像包含了相同第一目标。
可选的,提供第一权重和第二权重,所述第一权重对应所述第一置信度,所述第二权重对应所述第二置信度。进一步,所述第一权重和所述第二权重是预设的。具体的,通过第一权重对所述第一置信度进行加权,通过第二权重对所述第二置信度进行加权,两个加权求和后得到融合置信度。
例如,可以根据分别对第一图像和第二图像进行目标检测的结果,对应于第一置信度设定第一权重,对应于第二置信度设定第二权重,通过第一权重对第一置信度进行加权,通过第二权重对第二置信度进行加权,两个加权求和后得到融合置信度。该情况下第一图像中检测到的第一目标的第一置信度和第二图像中检测到的第二目标的第二置信度均高于各自的阈值,说明第一图像中检测到的第一目标和第二图像中检测到的第二目标均为真实目标的可能性较高,为了提高最终检测结果的置信度,在第二图像的检测结果的基础上结合第一图像的检测结果获取融合置信度,可以提高第二图像的检测结果的置信度,最终提高目标的可信程度。若融合置信度大于第三阈值,则可以直接采用第二图像中检测到的第二目标作为最终检测到的目标,并输出该目标的定位信息,该定位信息用于呈现车辆的周边环境。
(2)当第一置信度小于或等于第一阈值,且第二置信度大于第二阈值时,确定融合置信度为第二置信度;确定融合置信度大于第三阈值,输出在第二图像中检测到的第二目标的定位信息。一种可选的实现中,对于存在多个第一图像的场景,对应多个第一图像存在多个第一置信度,上述条件还可以替换为“当所述多个第一置信度全部小于或等于第一阈值,且第二置信度大于第二阈值时”。
该情况下第一图像中检测到的第一目标的第一置信度小于或等于第一阈值,第二图像中检测到的第二目标的第二置信度大于第二阈值,说明第一图像中检测到的第一目标可能是误检的可能性较高,而第二图像中检测到的第二目标为真实目标的可能性较高,因此将第二目标的第二置信度作为融合置信度。若融合置信度大于第三阈值,则可以采用第二图像中检测到的第二目标作为最终检测到的目标,并输出该目标的定位信息,该定位信息用于呈现车辆的周边环境。
(3)当第一置信度大于第一阈值,且第二置信度小于或等于第二阈值时,确定融合置信度为第一置信度;确定融合置信度大于第三阈值,输出在第一图像中检测到的第一目标的定位信息。一种可选的实现中,对于存在多个第一图像的场景,对应多个第一图像存在多个第一置信度,上述特征还可以替换为“当所述多个第一置信度中的至少一个大于第一阈值或者全部大于第一阈值,且第二置信度小于或等于第二阈值时,确定融合置信度为所述多个第一置信度中的一个第一置信度。例如所述融合置信度可以为所述多个第一置信度中数值最大的第一置信度”。
该情况下第一图像中检测到的第一目标的第一置信度大于第一阈值,第二图像中检测到的第二目标的第二置信度小于或等于第二阈值,说明第一图像中检测到的第一目标为真实目标的可能性较高,而第二图像中检测到的第二目标可能是误检的可能性较高,因此将第一目标的第一置信度作为融合置信度。若融合置信度大于第三阈值,则可以采用第一图像中检测到的第一目标作为最终检测到的目标,并输出该目标的定位信息,该定位信息用于呈现车辆的周边环境。
(4)当第一置信度小于或等于第一阈值,且第二置信度小于或等于第二阈值时,确定融合置信度获取失败;输出目标检测失败的提示信息。可选的,对于存在多个第一图像的场景,对应多个第一图像存在多个第一置信度,上述条件还可以替换为“当所述多个第一置信度全部小于或等于第一阈值,且第二置信度小于或等于第二阈值时”。
该情况下第一图像中检测到的第一目标的第一置信度小于或等于第一阈值,第二图像中检测到的第二目标的第二置信度小于或等于第二阈值,说明第一图像中检测到的第一目标和第二图像中检测到的第二目标可能是误检的可能性均比较高,此时可以确定融合置信度获取失败,直接输出目标检测失败的提示信息。
需要说明的是,上述根据融合置信度输出目标检测结果的四种情况,每种情况的条件是一种示例性的描述,对于第一置信度等于第一阈值和第二置信度等于第二阈值的特例,可以根据第一阈值和第二阈值的具体设定确定其要归入的情况。
例如,上述情况(1)的条件也可以是第一置信度大于或等于第一阈值,且第二置信度大于或等于第二阈值,情况(2)的条件也可以是第一置信度小于第一阈值,且第二置信度大于或等于第二阈值,情况(3)的条件也可以是第一置信度大于或等于第一阈值,且第二置信度小于第二阈值,情况(4)的条件也可以是第一置信度小于第一阈值,且第二置信度小于第二阈值。
又例如,上述情况(1)的条件也可以是第一置信度大于第一阈值,且第二置信度大于或等于第二阈值,情况(2)的条件也可以是第一置信度小于或等于第一阈值,且第二置信度大于或等于第二阈值,情况(3)的条件也可以是第一置信度大于第一阈值,且第二置信度小于第二阈值,情况(4)的条件也可以是第一置信度小于或等于第一阈值,且第二置信度小于第二阈值。
又例如,上述情况(1)的条件也可以是第一置信度大于或等于第一阈值,且第二置信度大于第二阈值,情况(2)的条件也可以是第一置信度小于第一阈值,且第二置信度大于第二阈值,情况(3)的条件也可以是第一置信度大于或等于第一阈值,且第二置信度小于或等于第二阈值,情况(4)的条件也可以是第一置信度小于第一阈值,且第二置信度小于或等于第二阈值。
本申请通过对从单张图像中检测到的目标的置信度和从环视图像中检测到的目标的置信度进行融合,确保检测到的目标的可信程度,进而提高目标检测的准确性。
示例性的,假设第一置信度为x,第二置信度为y,第一阈值A=0.6,第二阈值B=0.4,x的权重a=0.2,y的权重b=0.8,第三阈值为0.5,第四阈值为0.7。图11示出了目标检测方法中对第一图像和第二图像的置信度进行融合的过程,包括:
步骤1101、判断IOU是否大于0.7。
IOU是根据第一图像和第二图像的像素对应关系获取的第一图像中检测到的目标和第二图像中检测到的目标的图像重合度。
若IOU>0.7则执行步骤1102,若IOU≤0.7则执行步骤1107终止置信度的融合过程。
步骤1102、将x和A进行比较,并将y和B进行比较。
步骤1103、当x>A,且y>B时,融合置信度z=ax+by。
例如x=0.8,y=0.5,z=0.2×0.8+0.8×0.5=0.56。z>0.5,输出第二图像中检测到的目标的定位信息。
步骤1104、当x≤A,且y>B时,z=y。
例如x=0.6,y=0.5,z=0.5。输出第二图像中检测到的目标的定位信息。
步骤1105、当x>A,且y≤B时,z=x。
例如x=0.8,y=0.3,z=0.8。输出第一图像中检测到的目标的定位信息。
步骤1106、当x≤A,且y≤B时,z获取失败。
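结合上述示例参数(A=0.6、B=0.4、a=0.2、b=0.8,第三阈值0.5,第四阈值0.7),下面给出该融合流程的一个简化示意,仅用于说明判断逻辑,并非对实现方式的限定:
```python
def fuse_confidence(x, y, A=0.6, B=0.4, a=0.2, b=0.8):
    """按上述四种情况融合第一置信度x与第二置信度y, 返回(融合置信度, 采用的检测结果来源)。"""
    if x > A and y > B:
        return a * x + b * y, "第二"     # 情况(1): 加权求和, 采用第二图像的检测结果
    if x <= A and y > B:
        return y, "第二"                 # 情况(2): 取第二置信度
    if x > A and y <= B:
        return x, "第一"                 # 情况(3): 取第一置信度
    return None, None                    # 情况(4): 融合置信度获取失败

def detect(iou_value, x, y, iou_thresh=0.7, z_thresh=0.5):
    """IOU判断、置信度融合与第三阈值判断的整体流程示意。"""
    if iou_value <= iou_thresh:          # 两个BB可能不对应同一目标, 终止融合
        return "终止置信度融合"
    z, source = fuse_confidence(x, y)
    if z is None or z <= z_thresh:
        return "目标检测失败"
    return f"输出{source}图像中检测到的目标的定位信息, 融合置信度{z:.2f}"

# 对应步骤1103的例子: x=0.8, y=0.5, z=0.2*0.8+0.8*0.5=0.56>0.5
print(detect(0.8, 0.8, 0.5))
```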
图12为本申请目标检测装置实施例的结构示意图,如图12所示,本实施例的目标检测装置可以是独立设置,也可以集成在控制装置中,还可以是通过软件或者软件与硬件结合实现。例如,可以应用于ADAS,具体可以实现为所述ADAS中的芯片或者集成电路,或者是独立于所述ADAS实现;又如,可以应用于车辆的控制实体,具体可以集成于所述控制实体或者是独立于所述控制实体。
该装置包括:处理模块1201和接口模块1202,其中,处理模块1201,被配置用于当第一目标和第二目标为同一目标时,根据第一置信度和第二置信度获取融合置信度;其中,所述第一置信度为所述第一目标在第一图像中的可信程度,所述第二置信度为所述第二目标在第二图像中的可信程度,所述第一图像为来自第一摄像装置的图像,所述第二图像为根据来自多个摄像装置的图像得到的环视图像,所述多个摄像装置包括所述第一摄像装置;接口模块1202,被配置用于根据所述融合置信度输出目标检测结果。
在一种可能的实现方式中,所述处理模块1201,被配置用于当所述第一置信度大于第一阈值,且所述第二置信度大于第二阈值时,根据预设的权重值确定所述融合置信度;和/或,当所述第一置信度小于或等于所述第一阈值,且所述第二置信度大于所述第二阈值时,确定所述融合置信度为所述第二置信度。
在一种可能的实现方式中,所述处理模块1201,被配置用于确定所述融合置信度大于第三阈值;所述接口模块1202,被配置用于输出在所述第二图像中检测到的所述第二目标的定位信息。
在一种可能的实现方式中,所述处理模块1201,被配置用于当所述第一置信度大于所述第一阈值,且所述第二置信度小于或等于所述第二阈值时,确定所述融合置信度为所述第一置信度。
在一种可能的实现方式中,所述处理模块1201,被配置用于确定所述融合置信度大于第三阈值;所述接口模块1202,被配置用于输出在所述第一图像中检测到的所述第一目标的定位信息。
在一种可能的实现方式中,所述处理模块1201,被配置用于根据所述第一目标和所述第二目标的图像重合度确定所述第一目标和所述第二目标是否为同一目标;当所述第一目标和所述第二目标的图像重合度大于第四阈值时,确定所述第一目标和所述第二目标为同一目标。
在一种可能的实现方式中,所述处理模块1201,被配置用于获取来自所述第一摄像装置的所述第一图像;以及,拼接来自所述多个摄像装置的图像得到所述第二图像。
本实施例的装置,可以用于执行图3或11所示方法实施例的技术方案,其实现原理和技术效果类似,此处不再赘述。
在一种可能的实现方式中,本申请提供一种计算机可读存储介质,包括计算机程序,所述计算机程序在计算机上被执行时,使得所述计算机执行上述图3-11任一所示方法实施例中的方法。
在一种可能的实现方式中,本申请提供一种计算机程序,当所述计算机程序被计算机执行时,实现上述图3-11任一所示方法实施例的方法。
在一种可能的实现方式中,本申请提供一种装置,该装置可以是独立设置,也可以集成在控制装置中,还可以是通过软件或者软件与硬件结合实现。例如,可以应用于ADAS,具体可以实现为所述ADAS中的芯片或者集成电路,或者是独立于所述ADAS实现;又如,可以应用于车辆的控制实体,具体可以集成于所述控制实体或者是独立于所述控制实体。该装置其包括处理器和存储器,所述存储器用于存储计算机程序,所述处理器用于调用并运行所述存储器中存储的计算机程序,以实现如上述图3-11任一所示方法实施例的方法。
参见图13,图13为本申请提供的目标检测设备的示意性框图。目标检测设备1300可以应用于ADAS,也可以是车辆的控制实体,其包括处理器1301和收发器1302。
可选地,目标检测设备1300还包括存储器1303。其中,处理器1301、收发器1302和存储器1303之间可以通过内部连接通路互相通信,传递控制信号和/或数据信号。
其中,存储器1303用于存储计算机程序。处理器1301用于执行存储器1303中存储的计算机程序,从而实现上述装置实施例中目标检测设备1300的各功能。
具体地,处理器1301可以用于执行装置实施例(例如,图12)中描述的由处理模块1201执行的操作和/或处理,而收发器1302用于执行由接口模块1202执行的操作和/或处理。
可选地,存储器1303也可以集成在处理器1301中,或者独立于处理器1301。
可选地,目标检测设备1300还可以包括电源1304,用于给各种器件或电路提供电源。
除此之外,为了使功能更加完善,目标检测设备1300还可以包括输入单元1305、显示单元1306(也可以认为是输出单元)、音频电路1307、摄像头1308和传感器1309等中的一个或多个。音频电路还可以包括扬声器13071、麦克风13072等,不再赘述。
以上各实施例中提及的处理器和存储器可以位于集成电路或者芯片上,处理器具有图像处理能力。在实现过程中,上述方法实施例的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。处理器可以是通用处理器、数字信号处理器(digital signal processor,DSP),所述集成电路或者芯片可以是特定应用集成电路(application-specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。本申请实施例公开的方法的步骤可以直接体现为硬件编码处理器执行完成,或者用编码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。
上述各实施例中提及的存储器可以是易失性存储器或非易失性存储器,或可包括易失性和非易失性存储器两者。其中,非易失性存储器可以是只读存储器(read-only memory,ROM)、可编程只读存储器(programmable ROM,PROM)、可擦除可编程只读存储器(erasable PROM,EPROM)、电可擦除可编程只读存储器(electrically EPROM,EEPROM)或闪存。易失性存储器可以是随机存取存储器(random access memory,RAM),其用作外部高速缓存。通过示例性但不是限制性说明,许多形式的RAM可用,例如静态随机存取存储器(static RAM,SRAM)、动态随机存取存储器(dynamic RAM,DRAM)、同步动态随机存取存储器(synchronous DRAM,SDRAM)、双倍数据速率同步动态随机存取存储器(double data rate SDRAM,DDR SDRAM)、增强型同步动态随机存取存储器(enhanced SDRAM,ESDRAM)、同步连接动态随机存取存储器(synchlink DRAM,SLDRAM)和直接内存总线随机存取存储器(direct rambus RAM,DR RAM)。应注意,本文描述的系统和方法的存储器旨在包括但不限于这些和任意其它适合类型的存储器。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (18)

  1. 一种目标检测方法,其特征在于,包括:
    当第一目标和第二目标为同一目标时,根据第一置信度和第二置信度获取融合置信度;其中,所述第一置信度为所述第一目标在第一图像中的可信程度,所述第二置信度为所述第二目标在第二图像中的可信程度,所述第一图像为来自第一摄像装置的图像,所述第二图像为根据来自多个摄像装置的图像得到的环视图像,所述多个摄像装置包括所述第一摄像装置;
    根据所述融合置信度输出目标检测结果。
  2. 根据权利要求1所述的方法,其特征在于,所述根据第一置信度和第二置信度获取融合置信度,包括:
    当所述第一置信度大于第一阈值,且所述第二置信度大于第二阈值时,根据预设的权重值确定所述融合置信度;和/或,
    当所述第一置信度小于或等于所述第一阈值,且所述第二置信度大于所述第二阈值时,确定所述融合置信度为所述第二置信度。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述融合置信度输出目标检测结果,包括:
    确定所述融合置信度大于第三阈值;
    输出在所述第二图像中检测到的所述第二目标的定位信息。
  4. 根据权利要求1所述的方法,其特征在于,所述根据第一置信度和第二置信度获取融合置信度,包括:
    当所述第一置信度大于第一阈值,且所述第二置信度小于或等于第二阈值时,确定所述融合置信度为所述第一置信度。
  5. 根据权利要求4所述的方法,其特征在于,所述根据所述融合置信度输出目标检测结果,包括:
    确定所述融合置信度大于第三阈值;
    输出在所述第一图像中检测到的所述第一目标的定位信息。
  6. 根据权利要求1-5中任一项所述的方法,其特征在于,还包括:
    根据所述第一目标和所述第二目标的图像重合度确定所述第一目标和所述第二目标是否为同一目标;
    当所述第一目标和所述第二目标的图像重合度大于第四阈值时,确定所述第一目标和所述第二目标为同一目标。
  7. 根据权利要求1-6中任一项所述的方法,其特征在于,还包括:
    获取来自所述第一摄像装置的所述第一图像;以及
    拼接来自所述多个摄像装置的图像得到所述第二图像。
  8. 一种目标检测装置,其特征在于,包括:
    处理模块,被配置用于当第一目标和第二目标为同一目标时,根据第一置信度和第二置信度获取融合置信度;其中,所述第一置信度为所述第一目标在第一图像中的可信程度,所述第二置信度为所述第二目标在第二图像中的可信程度,所述第一图像为来自第一摄像装置的图像,所述第二图像为根据来自多个摄像装置的图像得到的环视图像,所述多个摄像装置包括所述第一摄像装置;
    接口模块,被配置用于根据所述融合置信度输出目标检测结果。
  9. 根据权利要求8所述的装置,其特征在于,所述处理模块,被配置用于当所述第一置信度大于第一阈值,且所述第二置信度大于第二阈值时,根据预设的权重值确定所述融合置信度;和/或,当所述第一置信度小于或等于所述第一阈值,且所述第二置信度大于所述第二阈值时,确定所述融合置信度为所述第二置信度。
  10. 根据权利要求9所述的装置,其特征在于,所述处理模块,被配置用于确定所述融合置信度大于第三阈值;
    所述接口模块,被配置用于输出在所述第二图像中检测到的所述第二目标的定位信息。
  11. 根据权利要求8所述的装置,其特征在于,所述处理模块,被配置用于当所述第一置信度大于第一阈值,且所述第二置信度小于或等于第二阈值时,确定所述融合置信度为所述第一置信度。
  12. 根据权利要求11所述的装置,其特征在于,所述处理模块,被配置用于确定所述融合置信度大于第三阈值;
    所述接口模块,被配置用于输出在所述第一图像中检测到的所述第一目标的定位信息。
  13. 根据权利要求8-12中任一项所述的装置,其特征在于,所述处理模块,被配置用于根据所述第一目标和所述第二目标的图像重合度确定所述第一目标和所述第二目标是否为同一目标;当所述第一目标和所述第二目标的图像重合度大于第四阈值时,确定所述第一目标和所述第二目标为同一目标。
  14. 根据权利要求8-13中任一项所述的装置,其特征在于,所述处理模块,被配置用于获取来自所述第一摄像装置的所述第一图像;以及,拼接来自所述多个摄像装置的图像得到所述第二图像。
  15. 一种设备,其特征在于,包括:
    一个或多个处理器;
    存储器,用于存储一个或多个程序;
    当所述一个或多个程序被所述一个或多个处理器执行,使得所述设备实现如权利要求1-7中任一项所述的方法。
  16. 一种计算机可读存储介质,其特征在于,包括计算机程序,所述计算机程序在计算机上被执行时,使得所述计算机执行权利要求1-7中任一项所述的方法。
  17. 一种计算机程序,其特征在于,当所述计算机程序被计算机执行时,实现权利要求1-7中任一项所述的方法。
  18. 一种装置,其特征在于,包括处理器和存储器,所述存储器用于存储计算机程序,所述处理器用于调用并运行所述存储器中存储的计算机程序,以实现如权利要求1-7中任一项所述的方法。
PCT/CN2020/094612 2019-06-10 2020-06-05 目标检测方法和装置 WO2020248910A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP20823164.7A EP3965005A4 (en) 2019-06-10 2020-06-05 Target detection method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910498217.2A CN112069862A (zh) 2019-06-10 2019-06-10 目标检测方法和装置
CN201910498217.2 2019-06-10

Publications (1)

Publication Number Publication Date
WO2020248910A1 true WO2020248910A1 (zh) 2020-12-17

Family

ID=73658735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/094612 WO2020248910A1 (zh) 2019-06-10 2020-06-05 目标检测方法和装置

Country Status (3)

Country Link
EP (1) EP3965005A4 (zh)
CN (1) CN112069862A (zh)
WO (1) WO2020248910A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112633214B (zh) * 2020-12-30 2022-09-23 潍柴动力股份有限公司 一种车辆识别方法及装置
CN115214637B (zh) * 2021-04-01 2024-02-02 广州汽车集团股份有限公司 倒车制动辅助方法、辅助控制器、驾驶辅助系统和汽车
CN113844463B (zh) * 2021-09-26 2023-06-13 国汽智控(北京)科技有限公司 基于自动驾驶系统的车辆控制方法、装置及车辆

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9892493B2 (en) * 2014-04-21 2018-02-13 Texas Instruments Incorporated Method, apparatus and system for performing geometric calibration for surround view camera solution
US9443164B2 (en) * 2014-12-02 2016-09-13 Xerox Corporation System and method for product identification
CN105205486B (zh) * 2015-09-15 2018-12-07 浙江宇视科技有限公司 一种车标识别方法及装置
CN106485736B (zh) * 2016-10-27 2022-04-12 深圳市道通智能航空技术股份有限公司 一种无人机全景视觉跟踪方法、无人机以及控制终端
KR20180068578A (ko) * 2016-12-14 2018-06-22 삼성전자주식회사 복수의 센서를 이용하여 객체를 인식하는 전자 기기 및 방법
CN106791710B (zh) * 2017-02-10 2020-12-04 北京地平线信息技术有限公司 目标检测方法、装置和电子设备
CN106952477B (zh) * 2017-04-26 2020-01-14 智慧互通科技有限公司 基于多相机图像联合处理的路侧停车管理方法
CN109740617A (zh) * 2019-01-08 2019-05-10 国信优易数据有限公司 一种图像检测方法及装置
CN109816045A (zh) * 2019-02-11 2019-05-28 青岛海信智能商用系统股份有限公司 一种商品识别方法及装置

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103914821A (zh) * 2012-12-31 2014-07-09 株式会社理光 多角度图像对象融合方法及系统
CN107306338A (zh) * 2016-04-19 2017-10-31 通用汽车环球科技运作有限责任公司 用于对象检测和跟踪的全景摄像机系统
CN106408950A (zh) * 2016-11-18 2017-02-15 北京停简单信息技术有限公司 停车场出入口车牌识别系统及方法
US10055853B1 (en) * 2017-08-07 2018-08-21 Standard Cognition, Corp Subject identification and tracking using image recognition
CN109598747A (zh) * 2017-09-30 2019-04-09 上海欧菲智能车联科技有限公司 运动目标检测系统、运动目标检测方法和车辆
CN208036106U (zh) * 2017-12-27 2018-11-02 鹰驾科技(深圳)有限公司 一种全景驾驶辅助系统和汽车

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3965005A4

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112660125A (zh) * 2020-12-26 2021-04-16 江铃汽车股份有限公司 一种车辆巡航控制方法、装置、存储介质及车辆
CN114821531A (zh) * 2022-04-25 2022-07-29 广州优创电子有限公司 基于电子外后视镜adas的车道线识别图像显示系统
CN116148801A (zh) * 2023-04-18 2023-05-23 深圳市佰誉达科技有限公司 一种基于毫米波雷达的目标检测方法及系统

Also Published As

Publication number Publication date
CN112069862A (zh) 2020-12-11
EP3965005A1 (en) 2022-03-09
EP3965005A4 (en) 2022-06-29

Similar Documents

Publication Publication Date Title
WO2020248910A1 (zh) 目标检测方法和装置
CN109614889B (zh) 对象检测方法、相关设备及计算机存储介质
CN107577988B (zh) 实现侧方车辆定位的方法、装置及存储介质、程序产品
EP3438777B1 (en) Method, apparatus and computer program for a vehicle
WO2020062856A1 (zh) 一种车辆特征获取方法及装置
CN111508260A (zh) 车辆停车位检测方法、装置和系统
EP2889641A1 (en) Image processing apparatus, image processing method, program and image processing system
JP2020501281A (ja) 障害物検出方法及び装置
JP2002359838A (ja) 運転支援装置
CN109741241B (zh) 鱼眼图像的处理方法、装置、设备和存储介质
CN113065590A (zh) 一种基于注意力机制的视觉与激光雷达多模态数据融合方法
US20220277470A1 (en) Method and system for detecting long-distance target through binocular camera, and intelligent terminal
CN113128347B (zh) 基于rgb-d融合信息的障碍物目标分类方法、系统和智能终端
KR20180092765A (ko) 탑뷰 영상을 이용한 차선인식 장치 및 방법
CN113792707A (zh) 基于双目立体相机的地形环境检测方法、系统和智能终端
KR20170001765A (ko) Avm 시스템의 공차 보정 장치 및 방법
CN111325799A (zh) 一种大范围高精度的静态环视自动标定图案及系统
US11275952B2 (en) Monitoring method, apparatus and system, electronic device, and computer readable storage medium
US20230049561A1 (en) Data processing method and apparatus
KR20140052769A (ko) 왜곡 영상 보정 장치 및 방법
WO2022267068A1 (zh) 光束调整方法及装置、发射端、计算机存储介质
CN114972470A (zh) 基于双目视觉的路面环境获取方法和系统
CN110610171A (zh) 图像处理方法和装置、电子设备、计算机可读存储介质
US11669992B2 (en) Data processing
CN111656404A (zh) 图像处理方法、系统及可移动平台

Legal Events

Code Title Description
121  Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20823164; Country of ref document: EP; Kind code of ref document: A1)
ENP  Entry into the national phase (Ref document number: 2020823164; Country of ref document: EP; Effective date: 20211202)
NENP Non-entry into the national phase (Ref country code: DE)