CN116778151A - Target detection method, device, vehicle, computer equipment and storage medium

Info

Publication number: CN116778151A
Application number: CN202310780442.1A
Authority: CN (China)
Prior art keywords: projection image, fisheye, cylindrical projection, photo, target object
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 许凯秋
Current Assignee: Beijing Jidu Technology Co Ltd
Original Assignee: Beijing Jidu Technology Co Ltd
Application filed by Beijing Jidu Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30244 Camera pose

Abstract

The present disclosure provides a target detection method, an apparatus, a vehicle, a computer device, and a storage medium. The method comprises the following steps: acquiring a fisheye photo taken by a fisheye camera; determining the size information and virtual camera parameters of the cylindrical projection image to be generated from the fisheye photo; converting the fisheye photo into the cylindrical projection image based on the size information, the virtual camera parameters, and the fisheye camera parameters of the fisheye camera; and performing target detection processing on the cylindrical projection image with a target detection model to obtain a target detection result. By converting the strongly radially distorted fisheye photo into a cylindrical projection image with much lower radial distortion, which the target detection model can process, the model can provide the best fit for the first target object and the problem of losing the first target object is avoided.

Description

Target detection method, device, vehicle, computer equipment and storage medium
Technical Field
The present disclosure relates to the technical field of target detection, and in particular to a target detection method, a target detection apparatus, a vehicle, a computer device, and a storage medium.
Background
In the related art that performs object detection with a fisheye camera, fisheye imaging tends to exhibit strong radial distortion: objects in the photo are deformed to some degree, and the closer an object is to the edge of the field of view, the greater the distortion. Detecting objects in such photos requires more complex detection algorithms, makes it difficult to provide a best fit for severely distorted first target objects, and can lose the first target objects altogether.
Disclosure of Invention
The embodiments of the present disclosure provide at least a target detection method, a target detection apparatus, a vehicle, a computer device, and a storage medium.
In a first aspect, an embodiment of the present disclosure provides a target detection method, including:
acquiring a fisheye photo shot by a fisheye camera; and
determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated and virtual camera parameters;
converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera;
and carrying out target detection processing on the cylindrical projection image by using a target detection model to obtain a target detection result.
In this way, by determining the size information of the cylindrical projection image corresponding to the fisheye photo, the virtual camera parameters, and the fisheye camera parameters, the strongly radially distorted fisheye photo is converted into a cylindrical projection image with low radial distortion that the target detection model can process, so that the model can provide the best fit for the first target object in the cylindrical projection image and the problem of losing the first target object is avoided.
In an optional embodiment, the determining size information of the cylindrical projection image corresponding to the fisheye photo to be generated, and virtual camera parameters includes:
acquiring radar sensing data synchronously acquired with the fisheye photo;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the radar sensing data.
In an alternative embodiment, the determining, based on the radar sensing data, the size information of the cylindrical projection image to be generated, and virtual camera parameters includes:
determining second target object state information of the vehicle periphery based on the radar sensing data;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the second target object state information.
In an alternative embodiment, the second target object state information includes at least one of:
a relative movement direction between the second target object and the vehicle, a relative movement speed, a distance between the second target object and the vehicle, a volume of the second target object.
In this way, the radar sensing data acquired in synchronization with the fisheye photo are used to determine the state information of the second target objects around the vehicle, and the size information of the cylindrical projection image and the virtual camera parameters matched to the fisheye photo can be determined from that state information.
In an alternative embodiment, the converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera includes:
determining a plurality of first pixel points of the cylindrical projection image based on the size information;
for each first pixel point, determining a three-dimensional coordinate value of the first pixel point in a world coordinate system based on the virtual camera parameters;
determining two-dimensional coordinate values of each first pixel point in the fisheye photo based on the fisheye camera parameters and the coordinate values of each first pixel point in the world coordinate system;
and determining a position mapping relation between a second pixel point in the fisheye photo and the first pixel point based on the two-dimensional coordinate value, and converting the fisheye photo into the cylindrical projection image based on the position mapping relation.
Therefore, the fisheye photo is converted into the cylindrical projection image using the position mapping relation, so that the converted cylindrical projection image retains all the feature information of the fisheye photo and the problem of losing the first target object is avoided.
In an alternative embodiment, the second pixel point in the fisheye photo includes pixel value information, and the method further includes:
determining pixel value information of a first pixel point in the cylindrical projection image based on the pixel value information of a second pixel point in the fisheye photo and the position mapping relation;
and obtaining a converted cylindrical projection image based on the pixel value information of the first pixel point in the cylindrical projection image.
In an alternative embodiment, the target detection result includes: the type of the first target object, and the motion state of the first target object; the method further comprises the steps of:
and controlling the vehicle to execute corresponding operation based on the type of the first target object and the motion state of the first target object.
In a second aspect, the embodiments of the present disclosure also provide a vehicle configured to perform the method of the first aspect or any possible implementation of the first aspect.
In a third aspect, an optional implementation manner of the present disclosure further provides an object detection apparatus, including:
the acquisition module is used for acquiring the fisheye photo shot by the fisheye camera; and
determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated and virtual camera parameters;
a conversion module for converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera;
and the detection module is used for carrying out target detection processing on the cylindrical projection image by utilizing a target detection model to obtain a target detection result.
In an optional implementation manner, the acquiring module is configured to, when determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated and virtual camera parameters:
acquiring radar sensing data synchronously acquired with the fisheye photo;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the radar sensing data.
In an optional embodiment, the acquiring module is configured to, when determining, based on the radar sensing data, size information of the cylindrical projection image to be generated, and virtual camera parameters:
determining second target object state information of the vehicle periphery based on the radar sensing data;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the second target object state information.
In an alternative embodiment, the conversion module is configured to, when converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameter, and a fisheye camera parameter of the fisheye camera:
determining a plurality of first pixel points of the cylindrical projection image based on the size information;
for each first pixel point, determining a three-dimensional coordinate value of the first pixel point in a world coordinate system based on the virtual camera parameters;
determining two-dimensional coordinate values of each first pixel point in the fisheye photo based on the fisheye camera parameters and the coordinate values of each first pixel point in the world coordinate system;
and determining a position mapping relation between a second pixel point in the fisheye photo and the first pixel point based on the two-dimensional coordinate value, and converting the fisheye photo into the cylindrical projection image based on the position mapping relation.
In an alternative embodiment, the conversion module is further configured to:
determining pixel value information of a first pixel point in the cylindrical projection image based on the pixel value information of a second pixel point in the fisheye photo and the position mapping relation;
and obtaining a converted cylindrical projection image based on the pixel value information of the first pixel point in the cylindrical projection image.
In an alternative embodiment, the apparatus further comprises an execution module for:
and controlling the vehicle to execute corresponding operation based on the type of the first target object and the motion state of the first target object.
In a fourth aspect, an optional implementation of the present disclosure further provides a computer device comprising a processor and a memory, where the memory stores machine-readable instructions executable by the processor and the processor is configured to execute them; when executed by the processor, the machine-readable instructions perform the steps of the first aspect or any possible implementation of the first aspect.
In a fifth aspect, an optional implementation of the present disclosure further provides a computer-readable storage medium having stored thereon a computer program which, when executed, performs the steps of the first aspect or any possible implementation of the first aspect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the aspects of the disclosure.
The foregoing objects, features and advantages of the disclosure will be more readily apparent from the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required by the embodiments are briefly described below. These drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. It is to be understood that the following drawings illustrate only certain embodiments of the present disclosure and are therefore not to be considered limiting of its scope; a person of ordinary skill in the art may derive other related drawings from them without inventive effort.
FIG. 1 illustrates a flow chart of a method of object detection provided by some embodiments of the present disclosure;
FIG. 2 illustrates a flow chart for converting a fisheye photo into a cylindrical projection image provided by some embodiments of the present disclosure;
FIG. 3 illustrates an example diagram provided by some embodiments of the present disclosure for determining three-dimensional coordinate values of a first pixel point in a world coordinate system;
FIG. 4 illustrates one example diagram of a target detection method provided by some embodiments of the present disclosure;
FIG. 5 illustrates a second exemplary diagram of a target detection method provided by some embodiments of the present disclosure;
FIG. 6 illustrates a schematic diagram of an object detection device provided by some embodiments of the present disclosure;
fig. 7 illustrates a schematic diagram of a computer device provided by some embodiments of the present disclosure.
Detailed Description
For the purposes of making the objects, technical solutions, and advantages of the embodiments of the present disclosure more apparent, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. The described embodiments are evidently only some embodiments of the present disclosure, not all of them. The components of the embodiments, as generally described and illustrated herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description is not intended to limit the scope of the disclosure as claimed, but merely represents selected embodiments of the disclosure. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the present disclosure.
It has been found that when an intelligent vehicle senses surrounding objects, a camera or radar is generally used to acquire position information, shape information, and the like of the objects around the vehicle; this information is input into an electronic control unit (ECU) of the vehicle, and the ECU detects the objects around the vehicle to realize functions such as automatic driving and emergency obstacle avoidance.
Currently, in related technical solutions that perform object detection on camera images, a fisheye camera with a focal length below 16 mm and a field of view close to 180° is generally used, so that as much information as possible can be obtained with fewer cameras. However, a photo taken by a fisheye camera often exhibits strong radial distortion, so objects in the photo are severely deformed. Detecting objects in such a photo requires a more complex detection algorithm, and it is difficult to provide a best fit for a severely distorted object. If the large radial distortion of the fisheye photo is removed by distortion correction (i.e., the severely distorted regions are cropped out), the corrected image suffers a large loss of field of view, which leads to the problem of losing the first target object.
Based on the above-described studies, the present disclosure provides a target detection method that can provide a best fit for a first target object by converting a fisheye photo having strong radial distortion into a cylindrical projection image that can be detected by a target detection model, and that can avoid the problem of losing the first target object because the region of the fisheye photo having serious distortion is not cut out during the conversion process.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
For the convenience of understanding the present embodiment, a detailed description will be first given of a target detection method disclosed in the embodiments of the present disclosure.
Referring to fig. 1, a flowchart of a target detection method according to an embodiment of the disclosure includes S101 to S103, where:
S101: acquiring a fisheye photo shot by a fisheye camera; and determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated and virtual camera parameters.
Here, the fisheye camera refers to a camera using a fisheye lens with a focal length of 16 mm or less and a field of view close to 180°. Such a camera captures photos with a large field of view that contain relatively much information, and it can photograph a first target object at close range.
For example, in an intelligent vehicle, in order to satisfy functions of driving assistance, automatic driving, automatic parking, emergency obstacle avoidance, etc., a plurality of fisheye lenses are generally disposed around the vehicle to capture information of obstacles, pedestrians, animals, other vehicles, etc. around the vehicle.
Next, the size information of the cylindrical projection image corresponding to the fisheye photo, and the virtual camera parameters, are determined. The size information may have a correspondence with the size of the fisheye photo, for example equal to it or enlarged or reduced in proportion to it; alternatively, it may be determined from user input, independently of the size of the fisheye photo.
The virtual camera parameters include external parameters and internal parameters. The external parameters include the position and rotation (orientation) of the virtual camera; the internal parameters include the focal length and pixel size of the virtual camera.
S102: the fisheye photo is converted to a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera.
Here, the fisheye camera parameters also include internal parameters and external parameters, and in a possible application scenario, the fisheye camera is mounted on the vehicle, and the fisheye camera parameters can be directly acquired in the ECU of the vehicle.
Illustratively, a cylindrical projection model is constructed from the size information of the cylindrical projection image and the virtual camera parameters. The model is a cylinder whose unrolled surface has the size of the cylindrical projection image, and it represents a projection relation. Using this projection relation, the coordinates of the cylindrical projection image can be converted and, combined with the fisheye camera parameters, expressed in the fisheye camera coordinate system. Once both images are in the same coordinate system, the fisheye photo is converted into the cylindrical projection image.
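As a non-limiting sketch of what the "size information" and "virtual camera parameters" above can look like in practice, the following Python fragment collects them in simple containers; all field names and the example values are assumptions of this sketch, not part of the disclosure:

```python
import numpy as np

# Minimal containers for the quantities discussed above; names and example
# values are illustrative assumptions only.
class VirtualCamera:
    def __init__(self, fx, fy, cx, cy, R=None, t=None):
        self.fx, self.fy = fx, fy                  # intrinsics: focal lengths (pixels)
        self.cx, self.cy = cx, cy                  # intrinsics: principal point
        self.R = np.eye(3) if R is None else R     # extrinsics: rotation (world -> camera)
        self.t = np.zeros(3) if t is None else t   # extrinsics: translation

# Size information of the cylindrical projection image to be generated.
cyl_width, cyl_height = 1280, 720                  # example size, not from the patent
virtual_cam = VirtualCamera(fx=400.0, fy=400.0, cx=cyl_width / 2, cy=cyl_height / 2)
```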
In some embodiments provided in the present disclosure, referring to the flowchart of converting a fisheye photo into a cylindrical projection image shown in fig. 2, the conversion includes, but is not limited to, S1021 to S1024 below, wherein:
S1021: A plurality of first pixel points of the cylindrical projection image are determined based on the size information.
Here, the first pixel points of the cylindrical projection image are coordinate points of the cylindrical projection image on the 2D image; at this stage they carry no 3D information and are arranged uniformly on the 2D plane. The plurality of first pixel points may be all pixel points of the cylindrical projection image, or only part of them.
For example, when the plurality of first pixel points are only part of the pixel points of the cylindrical projection image, they may be selected uniformly over the image according to the ratio between all pixel points of the cylindrical projection image and the number of first pixel points to be determined. The total number of pixel points of the cylindrical projection image can be determined from its size information in the above example and the pixel size in the virtual camera intrinsics. In the following examples, the plurality of first pixel points are taken to be all pixel points of the cylindrical projection image.
S1022: For each first pixel point, determine its three-dimensional coordinate value in the world coordinate system based on the virtual camera parameters.
For example, after the plurality of first pixel points of the cylindrical projection image are determined, each first pixel point is mapped onto the cylinder according to the cylindrical projection model, which yields its coordinate value in the world coordinate system; the first pixel points are thereby converted from coordinate values in the 2D coordinate system to coordinate values in the world coordinate system.
Referring to fig. 3, an example diagram of determining the three-dimensional coordinate value of a first pixel point in the world coordinate system: x, y, and z form the world coordinate system; x1 and y1 form the 2D plane coordinate system; I is the cylindrical projection image and H is the cylinder. Point P is a first pixel point of the cylindrical projection image in the 2D plane coordinate system, and point Q is the point on the cylindrical surface H whose projection onto the cylindrical projection image I is P. The coordinates of the first pixel point in the 2D plane coordinate system are converted into coordinates in the world coordinate system according to the projection relation between Q and P.
After the projection relation between P and Q is determined, it is generalized to all first pixel points of the cylindrical projection image, yielding the three-dimensional coordinate values of all first pixel points in the world coordinate system.
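The P-to-Q lifting can be sketched as follows, assuming a unit-radius cylinder whose axis coincides with the virtual camera's y axis and reusing the VirtualCamera container assumed earlier; the patent does not fix the exact formula, so this is one common cylindrical model:

```python
import numpy as np

def cylinder_pixel_to_world(u, v, cam):
    # Lift pixel P = (u, v) of the cylindrical image onto point Q of a
    # unit-radius cylinder, then move Q into the world frame with the
    # virtual camera extrinsics.
    theta = (u - cam.cx) / cam.fx        # azimuth around the cylinder axis
    p_cam = np.array([np.sin(theta),     # point on the cylinder surface
                      (v - cam.cy) / cam.fy,
                      np.cos(theta)])
    # invert the world -> camera transform (p_cam = R @ p_world + t)
    return cam.R.T @ (p_cam - cam.t)
```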
S1023: Determine the two-dimensional coordinate value of each first pixel point in the fisheye photo based on the fisheye camera parameters and the coordinate value of the first pixel point in the world coordinate system.
Here, the pose information of the fisheye camera is determined from the fisheye camera parameters, and the coordinate value of each first pixel point of the cylindrical projection image in the world coordinate system, obtained in S1022, is converted into a two-dimensional coordinate value in the fisheye photo according to that pose information.
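This world-to-fisheye step can be sketched with an equidistant fisheye model (r = f·θ) with OpenCV-style polynomial distortion of the angle; the disclosure only speaks of the imaging model of the fisheye camera, so the concrete model chosen here is an assumption:

```python
import numpy as np

def world_to_fisheye_pixel(p_world, R_f, t_f, fx, fy, cx, cy, k=(0.0, 0.0, 0.0, 0.0)):
    # Project a 3D world point into the fisheye photo under an equidistant
    # model with distortion coefficients k, as in OpenCV's cv2.fisheye module.
    X, Y, Z = R_f @ p_world + t_f        # world -> fisheye camera frame (extrinsics)
    r_xy = np.hypot(X, Y)
    theta = np.arctan2(r_xy, Z)          # angle from the optical axis
    theta_d = theta * (1 + k[0]*theta**2 + k[1]*theta**4
                         + k[2]*theta**6 + k[3]*theta**8)
    scale = theta_d / r_xy if r_xy > 1e-8 else 0.0
    return fx * X * scale + cx, fy * Y * scale + cy
```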
S1024: Determine the position mapping relation between the second pixel points in the fisheye photo and the first pixel points based on the two-dimensional coordinate values, and convert the fisheye photo into the cylindrical projection image based on the position mapping relation.
After the coordinate values of the first pixel points in the world coordinate system are converted into two-dimensional coordinate values in the fisheye photo, the position mapping relation between the second pixel points and the first pixel points is determined from the two-dimensional coordinate values of the second pixel points in the fisheye photo and the two-dimensional coordinate values of the first pixel points in the fisheye photo. The position mapping relation characterizes the displacement of each second pixel point of the fisheye photo when it is projected onto the cylindrical projection image, where the first pixel points on the cylindrical projection image correspond to second pixel points in the fisheye photo.
For example, the second pixel points in the fisheye photo are mapped to the first pixel points in the cylindrical projection image using the position mapping relation, thereby converting the fisheye photo into the cylindrical projection image. The middle area of the fisheye photo is less distorted, so the displacement of second pixel points in that area when projected onto the cylindrical projection image is small; the areas on both sides of the fisheye photo are more distorted, so the displacement of second pixel points there is correspondingly larger.
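Putting S1021 to S1024 together, the position mapping can be built once and applied with cv2.remap, which also interpolates when a mapped position falls between second pixel points; a sketch reusing the helper functions assumed above:

```python
import cv2
import numpy as np

def fisheye_to_cylindrical(fisheye_img, cam, project_to_fisheye, out_w, out_h):
    # Build the position mapping (first pixel -> matching second pixel),
    # then let cv2.remap pull every pixel value across.
    map_x = np.empty((out_h, out_w), np.float32)
    map_y = np.empty((out_h, out_w), np.float32)
    for v in range(out_h):
        for u in range(out_w):
            p_world = cylinder_pixel_to_world(u, v, cam)
            map_x[v, u], map_y[v, u] = project_to_fisheye(p_world)
    return cv2.remap(fisheye_img, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```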
S103: Performing target detection processing on the cylindrical projection image by using a target detection model to obtain a target detection result.
Here, the target detection model may be trained on cylindrical projection images, or on photos taken by an ordinary camera.
When the target detection model is trained on cylindrical projection images, in one possible implementation the model to be trained is deployed on a test vehicle together with at least one fisheye camera; fisheye photos of real street views are then converted into cylindrical projection images using S101 and S102, and those images are used for training. Alternatively, a data set of cylindrical projection images may be constructed and used for training.
When the target detection model is trained on photos taken by an ordinary camera, such photos are plentiful, so existing street-view photo data sets can be used directly for training.
Here, the algorithm used by the target detection model may be, for example, a bounding box (bbox) algorithm, which identifies a region of interest in the image, i.e., the region where the first target object is located, and surrounds it with a box. After the bounding box is generated, a classification algorithm such as a convolutional neural network (CNN) may be used for object recognition.
Alternatively, instead of the bbox algorithm, a region convolutional neural network (Region-CNN) may be used directly to identify the region of the cylindrical projection image where the first target object is located and to classify the first target object in that region. The specific algorithm used by the target detection model can be chosen according to the application scenario and is not limited by the present disclosure.
The trained target detection model performs target detection processing on the cylindrical projection image and marks the first target object and its position information in the image to obtain the target detection result. The target detection result is input into the ECU, so that the vehicle can judge the situation of pedestrians, obstacles, and the like around it and issue corresponding instructions to control the vehicle.
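For illustration, an off-the-shelf detector can stand in for the unspecified target detection model, since any network that outputs bounding boxes, classes, and scores fits this step; the choice of torchvision's Faster R-CNN here is an assumption of the sketch, not the patent's model:

```python
import torch
import torchvision

# Generic stand-in detector; weights and architecture are illustrative only.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect(cyl_image_rgb):
    # cyl_image_rgb: HxWx3 uint8 cylindrical projection image
    x = torch.from_numpy(cyl_image_rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([x])[0]
    return out["boxes"], out["labels"], out["scores"]   # bbox, type, probability
```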
The present disclosure provides some preferred embodiments for determining the size information of the cylindrical projection image to be generated corresponding to the fisheye photo, and the virtual camera parameters:
acquiring radar sensing data synchronously acquired with the fisheye photo; and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the radar sensing data.
Here, the radar sensing data are acquired in synchronization with the fisheye photo taken by the fisheye camera, and the state information of second target objects around the vehicle can be obtained from the radar sensing data. A second target object and a first target object indicate the same target; the difference is that the first target object is observed in the fisheye photo taken by the fisheye camera, while the second target object is observed in the radar sensing data acquired by the vehicle radar.
For example, the vehicle-mounted radar and the fisheye camera can be mounted as close together as possible, so that the number and positions of first target objects in the fisheye photo correspond to the number and positions of second target objects in the radar sensing data. When the radar sensing data and the fisheye photo are acquired, both can be stamped with a timestamp, and the timestamps are used to judge whether they were acquired simultaneously. A judgment condition can also be introduced: if the difference between the timestamp of the radar sensing data and the timestamp of the fisheye photo is smaller than a synchronous-acquisition limit, the two are treated as acquired at the same time, and the radar sensing data are used to determine the size information of the cylindrical projection image to be generated and the virtual camera parameters. Otherwise, both the radar sensing data and the fisheye photo are discarded, acquired again, and the judgment is repeated.
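The timestamp judgment condition reduces to a one-line check; the limit value below is an assumption, since the disclosure names no number:

```python
SYNC_LIMIT_S = 0.05  # synchronous-acquisition limit in seconds; assumed value

def synchronized(radar_timestamp, photo_timestamp, limit=SYNC_LIMIT_S):
    # Keep the radar frame / fisheye photo pair only if their timestamps
    # differ by less than the limit; otherwise both are discarded and
    # acquisition is repeated, as described above.
    return abs(radar_timestamp - photo_timestamp) < limit
```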
The present disclosure further provides some preferred embodiments for determining, from the radar sensing data, the size information of the cylindrical projection image to be generated and the virtual camera parameters:
determining second target object state information of the vehicle periphery based on the radar sensing data; and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the second target object state information.
Here, the second target object state information includes at least one of: a relative movement direction between the second target object and the vehicle, a relative movement speed, a distance between the second target object and the vehicle, a volume of the second target object.
For example, after the state information of the second target objects is determined from the radar sensing data, the second target objects sensed by the radar can be compared with the first target objects in the fisheye photo to establish the correspondence between them. After the correspondence is determined, the size information of the cylindrical projection image is determined from the state information of the second target objects: for example, the relative moving direction and relative moving speed between a second target object and the vehicle, together with the distance between them, decide whether the cylindrical projection image is enlarged or reduced. If a second target object is far from the vehicle but moving toward it quickly, the cylindrical projection image may be enlarged so that the object can be identified more clearly. In other words, the state information determines whether a second target object affects the vehicle, and if it does, the size of the cylindrical projection image is enlarged appropriately so that target detection is more accurate.
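A purely hypothetical version of this sizing decision follows; the dictionary keys and all thresholds are assumptions of the sketch, not values from the patent:

```python
def choose_cylindrical_size(obj_state, base_size=(1280, 720)):
    # Enlarge the cylindrical image when a radar object is far away but
    # approaching fast, so it can be identified more clearly.
    w, h = base_size
    approaching = obj_state["relative_direction"] == "toward_vehicle"
    if approaching and obj_state["relative_speed"] > 5.0 and obj_state["distance"] > 20.0:
        return int(w * 1.5), int(h * 1.5)   # assumed enlargement factor
    return w, h
```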
In addition, in some possible embodiments provided in the present disclosure, when performing object detection on the cylindrical projection image, some non-target objects in the cylindrical projection image may also be subjected to filtering processing.
For example, the correspondence between the first and second target objects and the state information of the second target objects in the above example may be used to determine non-target objects in the cylindrical projection image. The state information gives the shape and volume of a second target object as well as its motion state, for example stationary or moving in a certain direction; combined with the correspondence between second and first target objects, non-target objects can be filtered out in some scenarios. For example, when the vehicle parks automatically, target detection on the cylindrical projection image only needs to find the first target objects that affect parking, such as a vehicle parked in the parking space or an obstacle on it. If the cylindrical projection image also contains non-target objects that do not affect the parking process, such as obstacles outside the planned parking route, those non-target objects can be filtered out. The filtering consists in not applying the bbox algorithm to these non-target objects, i.e., not detecting them as targets.
In addition, the second pixel point in the fisheye photo further includes pixel value information, and the present disclosure provides further preferred embodiments when converting the fisheye photo into a cylindrical projection image:
determining pixel value information of a first pixel point in the cylindrical projection image based on the pixel value information of a second pixel point in the fisheye photo and the position mapping relation; and obtaining a converted cylindrical projection image based on the pixel value information of the first pixel point in the cylindrical projection image.
The pixel value information comprises the gray value or RGB value of a pixel point. If target detection is not interested in the color of the first target object, the RGB values of the second pixel points in the color fisheye photo can be converted into gray values during conversion, and the converted cylindrical projection image is obtained from those gray values; if target detection is interested in the color of the target object, the converted cylindrical projection image is obtained directly from the RGB values of the second pixel points in the color fisheye photo.
For example, in one possible implementation, target detection on the cylindrical projection image is not interested in color: only the shape information, position information, and classification of the first target object need to be detected. In scenarios such as obstacle detection and pedestrian detection, the emphasis is on whether a first target object appearing in the cylindrical projection image is an obstacle or a pedestrian, and once its type is determined, the vehicle ECU acts on the detection result. Therefore, to reduce computation and conversion time, the cylindrical projection image is usually a grayscale image.
In another possible embodiment, if target detection on the cylindrical projection image needs the color information of the image, such as identifying traffic lights or signboard colors, RGB values are assigned to the cylindrical projection image according to the RGB values of the second pixel points in the fisheye photo, so that the converted cylindrical projection image has the same colors as the fisheye photo.
In addition, in some possible examples, when the fisheye photo is converted into the cylindrical projection image, every second pixel point of the fisheye photo can find a corresponding first pixel point in the cylindrical projection image, which ensures that the cylindrical projection image retains as many features of the fisheye photo as possible; however, some first pixel points of the cylindrical projection image may find no corresponding second pixel point. The pixel values of first pixel points without a correspondence in the fisheye photo may then be maximized or minimized, that is, those pixels are turned white or black.
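With cv2.remap, this maximization or minimization of unmatched first pixel points corresponds to a constant border value; a minimal sketch:

```python
import cv2

def remap_with_constant_border(fisheye_img, map_x, map_y, value=0):
    # First pixels whose mapped position falls outside the fisheye photo have
    # no matching second pixel; cv2.remap paints them a constant value, i.e.
    # turns them black (value=0) or white (value=255).
    return cv2.remap(fisheye_img, map_x, map_y,
                     interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT,
                     borderValue=value)
```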
Pixel value maximization or minimization can likewise be applied to regions of the cylindrical projection image that are of no interest to target detection, such as the sky, to save computing resources during target detection.
The target detection result comprises: the type of the first target object, and the motion state of the first target object; the present disclosure also provides other preferred embodiments:
and controlling the vehicle to execute corresponding operation based on the type of the first target object and the motion state of the first target object.
Here, the type of the first target object may be, for example: pedestrians, obstacles, other vehicles, animals, etc.; the motion state of the first target object may include: stationary, moving toward the vehicle, moving in other directions, etc.
For example, when the vehicle is running and the target detection result shows a pedestrian moving toward the vehicle, a control instruction may be issued according to the distance between the pedestrian and the vehicle, the pedestrian's moving speed, and the vehicle's running speed, to make the vehicle whistle, flash its lights, brake, or perform another corresponding operation.
If the target detection result shows a stationary obstacle in the running direction of the vehicle, a control instruction can be issued according to the distance between the obstacle and the vehicle, to make the vehicle give a voice prompt, give a vibration prompt, avoid the obstacle, brake, or perform another corresponding operation.
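A toy version of this decision logic follows, with the caveat that the real thresholds and actions live in the vehicle ECU and are not given by the patent; every value below is an assumption:

```python
def react(object_type, motion_state, distance_m, closing_speed_mps):
    # Map a detection result to a vehicle operation; purely illustrative.
    if object_type == "pedestrian" and motion_state == "toward_vehicle":
        time_to_contact = distance_m / max(closing_speed_mps, 0.1)
        return "brake" if time_to_contact < 2.0 else "whistle_and_flash"
    if object_type == "obstacle" and motion_state == "stationary":
        return "voice_and_vibration_prompt" if distance_m > 10.0 else "avoid_or_brake"
    return "no_action"
```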
In addition, the present disclosure provides a specific example of the target detection method applied to a vehicle on which at least one fisheye camera is disposed. Referring to fig. 4, the first example diagram of the target detection method, the target objects S1 and S2 located at the two sides of the photo are so severely distorted that the target detection model does not detect them. The target objects S3 and S4 are located in the middle area, so the model detects them and outputs classification probabilities: the probability that S3 is a car is 0.76, and the probability that S4 is a car is 0.82. To detect the severely distorted S1 and S2, S101 to S103 of the present disclosure can be applied: the fisheye photo is converted into a cylindrical projection image, and target detection is then performed on that image. Referring to fig. 5, the second example diagram of the target detection method, the converted cylindrical projection image corrects the severely distorted S1 and S2, so both can be detected and classified: the probability that S1 is a pedestrian (Person) is 0.91, and the probability that S2 is a vehicle (car) is 0.84. If the recognition probability exceeds the judgment threshold, the classification results that S1 is a pedestrian and S2 is a vehicle are output.
The conversion of the fisheye photo into the cylindrical projection image comprises the following steps S1 to S7:
S1: Determine the size information of the cylindrical projection image and the internal and external parameters of the virtual camera.
S2: Construct a cylindrical projection model from the size information of the cylindrical projection image and the internal and external parameters of the virtual camera.
S3: Convert the 2D coordinates of the pixel points of the cylindrical projection image into 3D coordinates according to the cylindrical projection model.
S4: Calculate the world coordinates of the cylindrical projection image using the external parameters of the virtual camera.
S5: Convert the world coordinates into camera coordinates in the fisheye camera coordinate system using the fisheye camera external parameters.
S6: Calculate the pixel coordinates in the fisheye photo according to the imaging model of the fisheye camera.
S7: Assign the pixel values at those fisheye photo coordinates to the cylindrical projection image, completing the conversion of the fisheye photo into the cylindrical projection image.
Based on the same inventive concept, the embodiments of the present disclosure further provide a target detection apparatus corresponding to the target detection method. Since the principle by which the apparatus solves the problem is similar to that of the target detection method above, the implementation of the apparatus may refer to the implementation of the method, and repeated description is omitted.
Referring to fig. 6, a schematic diagram of an object detection apparatus according to an embodiment of the present disclosure, the apparatus includes an acquisition module 61, a conversion module 62, and a detection module 63, wherein:
an acquisition module 61 for acquiring a fisheye photo taken by the fisheye camera; and
determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated and virtual camera parameters;
a conversion module 62 for converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera;
and the detection module 63 is configured to perform target detection processing on the cylindrical projection image by using a target detection model, so as to obtain a target detection result.
In an alternative embodiment, the obtaining module 61 is configured to, when determining the size information of the cylindrical projection image to be generated corresponding to the fisheye photo and the virtual camera parameter:
acquiring radar sensing data synchronously acquired with the fisheye photo;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the radar sensing data.
In an alternative embodiment, the acquiring module 61 is configured to, when determining, based on the radar sensing data, size information of the cylindrical projection image to be generated, and virtual camera parameters:
determining second target object state information of the vehicle periphery based on the radar sensing data;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the second target object state information.
In an alternative embodiment, the conversion module 62 is configured to, when converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera:
determining a plurality of first pixel points of the cylindrical projection image based on the size information;
for each first pixel point, determining a three-dimensional coordinate value of the first pixel point in a world coordinate system based on the virtual camera parameters;
determining two-dimensional coordinate values of each first pixel point in the fisheye photo based on the fisheye camera parameters and the coordinate values of each first pixel point in the world coordinate system;
and determining a position mapping relation between a second pixel point in the fisheye photo and the first pixel point based on the two-dimensional coordinate value, and converting the fisheye photo into the cylindrical projection image based on the position mapping relation.
In an alternative embodiment, the conversion module 62 is further configured to:
determining pixel value information of a first pixel point in the cylindrical projection image based on the pixel value information of a second pixel point in the fisheye photo and the position mapping relation;
and obtaining a converted cylindrical projection image based on the pixel value information of the first pixel point in the cylindrical projection image.
In an alternative embodiment, the apparatus further comprises an execution module 64 for:
and controlling the vehicle to execute corresponding operation based on the type of the first target object and the motion state of the first target object.
The embodiments of the present disclosure also provide a vehicle configured to perform the target detection method of any of the embodiments of the present disclosure.
The embodiment of the disclosure further provides a computer device, as shown in fig. 7, which is a schematic structural diagram of the computer device provided by the embodiment of the disclosure, including:
a processor 71 and a memory 72, where the memory 72 stores machine-readable instructions executable by the processor 71 and the processor 71 is configured to execute them; when the machine-readable instructions are executed by the processor 71, the processor 71 performs the following steps:
acquiring a fisheye photo shot by a fisheye camera; and
determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated and virtual camera parameters;
converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera;
and carrying out target detection processing on the cylindrical projection image by using a target detection model to obtain a target detection result.
The memory 72 includes an internal memory 721 and an external memory 722. The internal memory 721 temporarily stores operation data of the processor 71 and data exchanged with the external memory 722, such as a hard disk; the processor 71 exchanges data with the external memory 722 through the internal memory 721.
For the specific execution process of the above instructions, reference may be made to the steps of the target detection method described in the embodiments of the present disclosure, which are not repeated here.
The embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the target detection method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The methods in the embodiments of the present disclosure may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions of the present application are performed in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user device, a core network device, an OAM, or other programmable apparatus.
The computer program or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another by wired or wireless means. The computer-readable storage medium may be any available medium accessible by a computer, or a data storage device such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or tape), an optical medium (e.g., a digital video disc), or a semiconductor medium (e.g., a solid state disk), and may be a volatile or non-volatile storage medium, or include both.
Finally, it should be noted that the foregoing examples are merely specific embodiments of the present disclosure, intended to illustrate rather than limit its technical solutions, and the scope of protection of the present disclosure is not limited to them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art will appreciate that, within the technical scope disclosed herein, the technical solutions described in the foregoing embodiments can still be modified or readily varied, and some of their technical features can be replaced by equivalents; such modifications, changes, or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure and are intended to be covered by its scope of protection. Therefore, the scope of protection of the present disclosure shall be subject to the scope of protection of the claims.

Claims (11)

1. A method of detecting an object, comprising:
acquiring a fisheye photo shot by a fisheye camera; and
determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated and virtual camera parameters;
converting the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera;
and performing target detection processing on the cylindrical projection image by using a target detection model to obtain a target detection result.
2. The method of claim 1, wherein determining size information of a cylindrical projection image corresponding to the fisheye photo to be generated, and virtual camera parameters comprises:
acquiring radar sensing data synchronously acquired with the fisheye photo;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the radar sensing data.
3. The method of claim 2, wherein the determining size information of the cylindrical projection image to be generated, and virtual camera parameters based on the radar sensing data, comprises:
determining second target object state information of the periphery of the vehicle based on the radar sensing data;
and determining size information of the cylindrical projection image to be generated and virtual camera parameters based on the second target object state information.
4. The method according to claim 3, wherein the second target object state information comprises at least one of:
a relative movement direction between the second target object and the vehicle, a relative movement speed between the second target object and the vehicle, a distance between the second target object and the vehicle, or a volume of the second target object.
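The claims do not fix a concrete rule for deriving the size information and the virtual camera parameters from the radar data. Purely as an assumed heuristic, with invented thresholds and output values, one could imagine:

```python
import numpy as np

# Assumed heuristic for claims 2-4; thresholds and output values are invented.
def choose_size_and_fov(second_target_objects):
    """Derive the cylindrical image size and the virtual-camera field of view
    from second target object state information (e.g., distance)."""
    if second_target_objects:
        nearest = min(obj["distance"] for obj in second_target_objects)
        if nearest < 5.0:
            # A nearby object: widen the view and raise the resolution.
            return (1920, 640), np.deg2rad(190.0)
    return (1280, 480), np.deg2rad(180.0)
```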
5. The method of any one of claims 1-4, wherein the converting of the fisheye photo into the cylindrical projection image based on the size information, the virtual camera parameters, and the fisheye camera parameters of the fisheye camera comprises:
determining a plurality of first pixel points of the cylindrical projection image based on the size information;
determining, for each first pixel point, three-dimensional coordinate values of the first pixel point in a world coordinate system based on the virtual camera parameters;
determining two-dimensional coordinate values of each first pixel point in the fisheye photo based on the fisheye camera parameters and the three-dimensional coordinate values of each first pixel point in the world coordinate system; and
determining a position mapping relation between second pixel points in the fisheye photo and the first pixel points based on the two-dimensional coordinate values, and converting the fisheye photo into the cylindrical projection image based on the position mapping relation.
6. The method of claim 5, wherein each second pixel point in the fisheye photo includes pixel value information, and the method further comprises:
determining pixel value information of each first pixel point in the cylindrical projection image based on the pixel value information of the second pixel points in the fisheye photo and the position mapping relation; and
obtaining the converted cylindrical projection image based on the pixel value information of the first pixel points in the cylindrical projection image.
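Continuing the same hypothetical sketch, the pixel value transfer of claim 6 reduces to one bilinear resampling pass over the position map:

```python
import cv2

# Hypothetical continuation of the sketch above: fill each first pixel point
# of the cylindrical image with the pixel value information of its matching
# second pixel point in the fisheye photo.
def fisheye_to_cylindrical(photo, size, h_fov, fisheye_K, fisheye_D):
    map_x, map_y = build_position_map(size, h_fov, fisheye_K, fisheye_D)
    # Bilinear sampling transfers the pixel values; points that fall outside
    # the fisheye photo take a constant border value.
    return cv2.remap(photo, map_x, map_y, interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)
```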
7. The method of claim 1, wherein the target detection result comprises: a type of a first target object, and a motion state of the first target object; and the method further comprises:
controlling the vehicle to execute a corresponding operation based on the type of the first target object and the motion state of the first target object.
8. A vehicle, characterized in that the vehicle is configured to perform the object detection method according to any one of claims 1 to 7.
9. An object detection apparatus, comprising:
an acquisition module, configured to acquire a fisheye photo shot by a fisheye camera, and to determine size information of a cylindrical projection image to be generated corresponding to the fisheye photo, and virtual camera parameters;
a conversion module, configured to convert the fisheye photo into a cylindrical projection image based on the size information, the virtual camera parameters, and fisheye camera parameters of the fisheye camera; and
a detection module, configured to carry out target detection processing on the cylindrical projection image by using a target detection model to obtain a target detection result.
10. A computer device, comprising a processor and a memory storing machine-readable instructions executable by the processor, wherein the processor is configured to execute the machine-readable instructions stored in the memory, and the machine-readable instructions, when executed by the processor, perform the steps of the object detection method according to any one of claims 1 to 7.
11. A computer-readable storage medium, characterized in that a computer program is stored thereon which, when run by a computer device, performs the steps of the object detection method according to any one of claims 1 to 7.
CN202310780442.1A 2023-06-28 2023-06-28 Target detection method, device, vehicle, computer equipment and storage medium Pending CN116778151A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310780442.1A CN116778151A (en) 2023-06-28 2023-06-28 Target detection method, device, vehicle, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116778151A true CN116778151A (en) 2023-09-19

Family

ID=88006085

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310780442.1A Pending CN116778151A (en) 2023-06-28 2023-06-28 Target detection method, device, vehicle, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116778151A (en)

Similar Documents

Publication Publication Date Title
CN110163904B (en) Object labeling method, movement control method, device, equipment and storage medium
CN107577988B (en) Method, device, storage medium and program product for realizing side vehicle positioning
US20230110116A1 (en) Advanced driver assist system, method of calibrating the same, and method of detecting object in the same
JP4692371B2 (en) Image processing apparatus, image processing method, image processing program, recording medium recording image processing program, and moving object detection system
CN110119679B (en) Object three-dimensional information estimation method and device, computer equipment and storage medium
US20200177867A1 (en) Camera-parameter-set calculation apparatus, camera-parameter-set calculation method, and recording medium
KR102229220B1 (en) Method and device for merging object detection information detected by each of object detectors corresponding to each camera nearby for the purpose of collaborative driving by using v2x-enabled applications, sensor fusion via multiple vehicles
CN113269163B (en) Stereo parking space detection method and device based on fisheye image
CN111462503A (en) Vehicle speed measuring method and device and computer readable storage medium
CN112069862A (en) Target detection method and device
CN112926461B (en) Neural network training and driving control method and device
CN111598065A (en) Depth image acquisition method, living body identification method, apparatus, circuit, and medium
CN111768332A (en) Splicing method of vehicle-mounted all-around real-time 3D panoramic image and image acquisition device
CN114913506A (en) 3D target detection method and device based on multi-view fusion
CN112172797B (en) Parking control method, device, equipment and storage medium
CN114708583A (en) Target object detection method, device, equipment and storage medium
CN111105351B (en) Video sequence image splicing method and device
JP7384158B2 (en) Image processing device, moving device, method, and program
JP7360520B1 (en) Object tracking integration method and integration device
CN111259709B (en) Elastic polygon-based parking space structure detection model training method
CN116778151A (en) Target detection method, device, vehicle, computer equipment and storage medium
CN115565155A (en) Training method of neural network model, generation method of vehicle view and vehicle
CN112686155A (en) Image recognition method, image recognition device, computer-readable storage medium and processor
CN115063594B (en) Feature extraction method and device based on automatic driving
CN112183413B (en) Parking space detection method and device, storage medium and vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination