WO2022033066A1 - Target detection method and device - Google Patents

Target detection method and device

Info

Publication number
WO2022033066A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
information
coordinate
dividing line
parameter matrix
Prior art date
Application number
PCT/CN2021/087917
Other languages
English (en)
French (fr)
Inventor
赵昕海
杨臻
张维
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022033066A1

Classifications

    • G06T 3/60: Geometric image transformations in the plane of the image; rotation of whole images or parts thereof
    • G06T 3/604: Rotation of whole images or parts thereof using coordinate rotation digital computer [CORDIC] devices
    • G06T 7/60: Image analysis; analysis of geometric attributes
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • G06V 20/64: Scenes; scene-specific elements; three-dimensional objects

Definitions

  • the present application relates to the field of artificial intelligence, and in particular, to a target detection method and device in the field of automatic driving or intelligent transportation.
  • in an advanced driver assistance system (advanced driver assistance system, ADAS), a perception module perceives three-dimensional (3-dimensions, 3D) information of external vehicles, such as the position and size of a vehicle, and path planning and driving strategies are adjusted based on the perceived information.
  • the more accurate the 3D information of the vehicle perceived by the perception module, the better the performance of the ADAS and the higher the safety of the vehicle.
  • because the cost of a monocular camera is low, a monocular camera is usually configured in the perception module. However, the monocular camera perceives information with a single camera and cannot directly perceive the 3D information of a vehicle.
  • the present application provides a target detection method and device, which can obtain 3D information of a vehicle according to an image collected by a monocular camera.
  • an embodiment of the present application provides a target detection method and device, the method comprising: acquiring two-dimensional (2 dimensions, 2D) information of a second object in an image obtained by a first object, where the 2D information of the second object includes the coordinate information of the end points of a first dividing line of the second object and the type information of the second object, the first dividing line is the dividing line between a first surface and a second surface of the second object, at least one of the first surface and the second surface is included in a first 2D frame of the second object, and the first 2D frame is a polygon that is included in the image and surrounds the second object; and obtaining 3D information of the second object according to the 2D information of the second object, where the 3D information of the second object includes the coordinate information of the end points of a second dividing line, the second dividing line is the dividing line between the first surface and the second surface in a 3D frame of the second object, the 3D frame is a 3D model of the second object, and the length of the boundary of the 3D frame corresponds to the type information of the second object.
  • based on the method provided in the first aspect above, a 3D frame can be constructed according to the image collected by the monocular camera, and the distance between the second object and the first object can be obtained by using the transformation relationship between the three-dimensional coordinate system corresponding to the 3D frame and the three-dimensional coordinate system of the first object when the distance between the second object and the first object is K, with a low operation cost.
  • in addition, the distance between the second object and the first object can be obtained without a multi-eye camera or a depth camera, so the cost of the target detection device can be reduced.
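As a concrete illustration of the data that the method above passes between its steps, the following minimal Python sketch groups the 2D and 3D information into two containers; the field names, the coordinate conventions and the example values are assumptions made for this sketch, not definitions from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Object2DInfo:
    # first 2D frame: (x_min, y_min, x_max, y_max) in the image coordinate system
    box_2d: Tuple[float, float, float, float]
    # end points (lower, upper) of the first dividing line in the image coordinate system
    dividing_line: Tuple[Tuple[float, float], Tuple[float, float]]
    # type information of the second object (e.g. an identifier of a vehicle class)
    type_id: int
    # optional surface information: identifiers of the first and second surfaces
    faces: Optional[Tuple[str, str]] = None

@dataclass
class Object3DInfo:
    # coordinates of the corners of the 3D frame in its own (second object) coordinate system
    box_3d_corners: Tuple[Tuple[float, float, float], ...]
    # end points (lower, upper) of the second dividing line in the same coordinate system
    dividing_line_3d: Tuple[Tuple[float, float, float], Tuple[float, float, float]]

# Example: a detection whose rear/left dividing line runs from (420, 330) up to (420, 180).
det = Object2DInfo(box_2d=(300.0, 150.0, 520.0, 340.0),
                   dividing_line=((420.0, 330.0), (420.0, 180.0)),
                   type_id=2,
                   faces=("rear", "left"))
```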
  • the 2D information of the second object further includes: coordinate information of the first 2D frame of the second object; and obtaining the 3D information of the second object according to the 2D information of the second object includes: constructing the 3D frame according to the type information of the second object, and obtaining the coordinate information of the 3D frame; determining the first surface and the second surface according to the coordinate information of the first 2D frame and the coordinate information of the end points of the first dividing line; and obtaining the coordinate information of the end points of the second dividing line according to the first surface, the second surface and the coordinate information of the 3D frame.
  • based on the above method, when the 2D information of the second object includes the coordinate information of the end points of the first dividing line of the second object, the coordinate information of the first 2D frame of the second object and the type information of the second object, the 3D frame can be constructed and the coordinate information of the end points of the second dividing line can be determined.
  • the 2D information of the second object further includes: surface information of the second object, where the surface information of the second object is used to indicate the first surface and the second surface; and obtaining the 3D information of the second object according to the 2D information of the second object includes: constructing the 3D frame according to the type information of the second object, and obtaining the coordinate information of the 3D frame; and obtaining the coordinate information of the end points of the second dividing line according to the surface information of the second object and the coordinate information of the 3D frame.
  • based on the above method, a 3D frame can be constructed and the coordinate information of the end points of the second dividing line in the 3D frame can be determined according to the surface information of the second object, so that the correspondence between a point on the second dividing line of the 3D frame and its mapping point on the first dividing line can subsequently be obtained.
  • obtaining the distance between the second object and the first object according to the coordinate information of the end points of the first dividing line, the coordinate information of the end points of the second dividing line and the first transformation relationship includes: performing coordinate transformation on a first coordinate through the first transformation relationship, the internal parameter matrix and the external parameter matrix to obtain a second coordinate, where the first coordinate is on the second dividing line, the distance between the second coordinate and the first object is K, the second coordinate, a third coordinate and the first object lie on a straight line, the third coordinate is the coordinate on the first dividing line corresponding to the first coordinate, the internal parameter matrix is the internal parameter matrix of the device that captures the image, and the external parameter matrix is the external parameter matrix of that device; and performing a composite operation on the second coordinate, the third coordinate and K to obtain the distance between the second object and the first object.
  • the distance between the second object and the first object can be obtained according to the transformation relationship between the three-dimensional coordinate system corresponding to the 3D frame and the three-dimensional coordinate system of the first object, and the properties of similar triangles.
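The composite operation is described in more detail at step 404 below; one consistent reading of the similar-triangle property, valid for a pinhole camera when the dividing line is roughly parallel to the image plane, is the relation

$$d = K \cdot \frac{h_K}{h_{\mathrm{obs}}},$$

where $d$ is the distance to be estimated, $h_K$ is the image length of the second dividing line when the 3D frame is placed at distance $K$ (the projection obtained through the first transformation relationship), and $h_{\mathrm{obs}}$ is the observed image length of the first dividing line; the symbols $d$, $h_K$ and $h_{\mathrm{obs}}$ are introduced here only for illustration.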
  • the method further includes: obtaining a second transformation relationship according to the coordinate information of the end points of the first dividing line, the coordinate information of the end points of the second dividing line, and the distance between the second object and the first object; the second transformation relationship is the transformation relationship between the three-dimensional coordinate system corresponding to the 3D frame and the three-dimensional coordinate system of the first object. Based on the above method, the transformation relationship between the three-dimensional coordinate system corresponding to the 3D frame and the three-dimensional coordinate system of the first object can be obtained, so as to determine the orientation angle of the second object or calibrate the length and/or width of the 3D frame according to the transformation relationship.
  • the method further includes: rotating the 3D frame N times with the second dividing line as the center, and obtaining a second 2D frame corresponding to the rotated 3D frame after each rotation, where the second 2D frame is obtained by performing coordinate transformation on the rotated 3D frame and N is a positive integer; obtaining, from the N second 2D frames corresponding to the rotated 3D frames and the first 2D frame, the second 2D frame whose boundary has the shortest distance to the boundary of the first 2D frame; and determining the orientation angle of the second object according to the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance.
  • based on the above method, the second 2D frames corresponding to the rotated 3D frame can be compared with the first 2D frame to obtain the second 2D frame closest to the first 2D frame, thereby obtaining the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance, and the orientation angle of the second object is determined according to that rotation angle.
  • in this way, the orientation angle of the second object can also be taken into account when planning a path, so that the planned path has higher reference value.
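A minimal sketch of this orientation search follows, under assumed conventions: the object frame's z axis is vertical and passes through the second dividing line, the pose of the 3D frame in camera coordinates is already available (from the second transformation relationship and the camera extrinsics), and the boundary distance is taken as the sum of absolute differences between the two frames' borders. None of the names or these choices come from the patent text.

```python
import numpy as np

def project(points_obj, R_obj2cam, t_obj2cam, A):
    """Project Nx3 object-frame points into the image with a pinhole model."""
    cam = points_obj @ R_obj2cam.T + t_obj2cam      # object frame -> camera frame
    uv = cam @ A.T                                  # apply the internal parameter matrix
    return uv[:, :2] / uv[:, 2:3]                   # perspective division

def bounding_box(uv):
    """Axis-aligned 2D frame (x_min, y_min, x_max, y_max) of the projected points."""
    return np.array([uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()])

def best_orientation(corners_obj, pivot_obj, R_obj2cam, t_obj2cam, A, box_2d, n=72):
    """Rotate the 3D frame n times about the vertical axis through pivot_obj (a point on
    the second dividing line), project each candidate, and return the rotation angle whose
    projected 2D frame is closest to the detected first 2D frame box_2d."""
    best_angle, best_cost = 0.0, np.inf
    for angle in np.linspace(0.0, 2.0 * np.pi, n, endpoint=False):
        c, s = np.cos(angle), np.sin(angle)
        Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])   # z axis assumed vertical
        rotated = (corners_obj - pivot_obj) @ Rz.T + pivot_obj
        cost = np.abs(bounding_box(project(rotated, R_obj2cam, t_obj2cam, A)) - np.asarray(box_2d)).sum()
        if cost < best_cost:
            best_angle, best_cost = angle, cost
    return best_angle
```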
  • the second 2D frame being obtained by performing coordinate transformation on the rotated 3D frame includes: the second 2D frame is obtained by performing coordinate transformation on the rotated 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix, where the internal parameter matrix is the internal parameter matrix of the device that captures the image, and the external parameter matrix is the external parameter matrix of that device.
  • a second 2D frame can be obtained by performing coordinate transformation on the rotated 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix, so that the orientation angle of the second object can be subsequently obtained according to the second 2D frame.
  • the method further includes: adjusting the length of the boundary of the 3D frame corresponding to the second 2D frame with the shortest distance M times, and obtaining a third 2D frame corresponding to the adjusted 3D frame after each adjustment, where the third 2D frame is obtained by performing coordinate transformation on the adjusted 3D frame and M is a positive integer; obtaining, from the M third 2D frames corresponding to the adjusted 3D frames and the first 2D frame, the third 2D frame whose boundary has the shortest distance to the boundary of the first 2D frame; and determining the length of the boundary of the second object according to the length of the boundary of the 3D frame corresponding to the third 2D frame with the shortest distance.
  • the third 2D frame corresponding to the adjusted 3D frame can be compared with the first 2D frame to obtain a third 2D frame closest to the first 2D frame, thereby obtaining the length of the boundary of the second object. In this way, a more accurate size of the second object can be obtained.
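The length search can be sketched in the same way as the orientation search above: scale one extent of the 3D frame around the template value, project each candidate, and keep the length whose projected 2D frame is closest to the first 2D frame. Treating the x axis of the object frame as the length direction, the search range and the cost function are assumptions made for this sketch.

```python
import numpy as np

def project(points_obj, R_obj2cam, t_obj2cam, A):
    cam = points_obj @ R_obj2cam.T + t_obj2cam       # object frame -> camera frame
    uv = cam @ A.T                                   # apply the internal parameter matrix
    return uv[:, :2] / uv[:, 2:3]                    # perspective division

def best_length(corners_obj, R_obj2cam, t_obj2cam, A, box_2d, base_len, m=20):
    """Scale the x extent of the 3D frame m times around the template length base_len and
    return the length whose projected 2D frame is closest to the first 2D frame box_2d."""
    anchor = corners_obj[:, 0].min()                 # keep the rear face fixed (assumption)
    best_len, best_cost = base_len, np.inf
    for scale in np.linspace(0.8, 1.2, m):           # +/- 20 % around the template length
        candidate = corners_obj.copy()
        candidate[:, 0] = anchor + (candidate[:, 0] - anchor) * scale
        uv = project(candidate, R_obj2cam, t_obj2cam, A)
        bbox = np.array([uv[:, 0].min(), uv[:, 1].min(), uv[:, 0].max(), uv[:, 1].max()])
        cost = np.abs(bbox - np.asarray(box_2d)).sum()
        if cost < best_cost:
            best_len, best_cost = base_len * scale, cost
    return best_len
```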
  • the third 2D frame being obtained by performing coordinate transformation on the adjusted 3D frame includes: the third 2D frame is obtained by performing coordinate transformation on the adjusted 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix, where the internal parameter matrix is the internal parameter matrix of the device that captures the image, and the external parameter matrix is the external parameter matrix of that device.
  • a third 2D frame can be obtained by performing coordinate transformation on the adjusted 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix, so as to obtain a more accurate size of the second object according to the third 2D frame subsequently.
  • the acquiring the 2D information of the second object in the image acquired by the first object includes: inputting the image into a neural network model to obtain the 2D information of the second object. Based on the above method, the 2D information of the second object can be obtained through the neural network model, so as to obtain the distance between the second object and the first object according to the 2D information of the second object.
  • an embodiment of the present application provides a target detection apparatus, which can implement the method in the first aspect or any possible implementation manner of the first aspect.
  • the apparatus comprises corresponding units or components for carrying out the above-described method.
  • the units included in the apparatus may be implemented by software and/or hardware.
  • the device may be, for example, an ADAS, or a chip, a system-on-chip, or a processor that can support the ADAS to implement the above method.
  • an embodiment of the present application provides a target detection apparatus, including: a processor, the processor is coupled to a memory, the memory is used to store a program or an instruction, when the program or the instruction is executed by the processor , the device is made to implement the method described in the first aspect or any possible implementation manner of the first aspect.
  • an embodiment of the present application provides a target detection apparatus, which is used to implement the method described in the first aspect or any possible implementation manner of the first aspect.
  • an embodiment of the present application provides a computer-readable medium on which a computer program or instruction is stored, and when the computer program or instruction is executed, the computer is enabled to execute the method described in the first aspect or any possible implementation manner of the first aspect.
  • an embodiment of the present application provides a computer program product, which includes computer program code, and when the computer program code runs on a computer, the computer is enabled to execute the method described in the first aspect or any possible implementation manner of the first aspect.
  • an embodiment of the present application provides a chip, including: a processor, where the processor is coupled to a memory, and the memory is used to store a program or an instruction, and when the program or instruction is executed by the processor, The chip is made to implement the method described in the first aspect or any possible implementation manner of the first aspect.
  • any target detection device, chip, computer-readable medium or computer program product provided above is used to execute the corresponding method provided above; therefore, for the beneficial effects that it can achieve, reference may be made to the beneficial effects of the corresponding method, and details are not repeated here.
  • FIG. 1 is a schematic diagram of a 3D frame provided by an embodiment of the present application.
  • FIG. 2A is a schematic diagram 1 of a system architecture provided by an embodiment of the present application.
  • FIG. 2B is a second schematic diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 2C is a schematic diagram 3 of a system architecture provided by an embodiment of the present application.
  • FIG. 2D is a schematic diagram 4 of a system architecture provided by an embodiment of the present application.
  • FIG. 2E is a schematic diagram 5 of a system architecture provided by an embodiment of the present application.
  • FIG. 2F is a schematic diagram 6 of a system architecture provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart 1 of a target detection method provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of an image captured by a sensing module provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a second coordinate and a third coordinate provided by an embodiment of the present application.
  • FIG. 7 is a second schematic flowchart of a target detection method provided by an embodiment of the present application.
  • FIG. 9 is a schematic flowchart three of a target detection method provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a target detection apparatus provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a chip provided by an embodiment of the present application.
  • the coordinate system of the first object is a coordinate system that takes the first object as a reference, and is a three-dimensional coordinate system.
  • for example, when the first object is a vehicle, the coordinate system of the first object may be a coordinate system with the center of mass of the vehicle as the origin, where the center of mass of the vehicle is the center of the vehicle's mass, and the coordinates of the vehicle may be represented by the coordinates of its center of mass.
  • the coordinate system of the second object is the coordinate system corresponding to the 3D frame, which is a three-dimensional coordinate system.
  • the 3D frame is a 3D model of the second object established by the target detection device according to the type information of the second object.
  • the origin of the coordinate system of the second object may be any point on the 3D frame.
  • the origin of the coordinate system of the second object may be the lower boundary point 102 on the boundary line between the left and the rear of the 3D frame 101 .
  • the points in the coordinate system of the first object and the points in the coordinate system of the second object have a mapping relationship.
  • the point (x, y, z) in the coordinate system of the first object and the point (X, Y, Z) in the coordinate system of the second object satisfy the following formula:
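The formula itself did not survive this extraction. A rigid-body transform consistent with the surrounding description would take the following form, where $R_{21}$ and $T_{21}$ are reconstruction symbols for the rotation and the translation from the coordinate system of the second object to the coordinate system of the first object (the next item simply writes the translation as T):

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = R_{21}\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + T_{21}$$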
  • T is the translation vector from the coordinate system of the second object to the coordinate system of the first object.
  • the coordinate system of the camera is a coordinate system with the optical center of the camera as the origin, which is a three-dimensional coordinate system.
  • the camera may be a module in the target detection device, or may not be included in the target detection device.
  • the points in the coordinate system of the camera and the points in the coordinate system of the second object have a mapping relationship.
  • the point in the coordinate system of the camera and the point in the coordinate system of the first object have a mapping relationship.
  • the point (B, C, D) in the coordinate system of the camera and the point (X, Y, Z) in the coordinate system of the second object satisfy the following formula:
  • the point (B, C, D) in the coordinate system of the camera and the point (x, y, z) in the coordinate system of the first object satisfy the following formula:
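Both formulas are missing from this extraction. Under the same conventions, relations of the following form would be consistent with the surrounding definitions, where $[R_c \mid T_c]$ stands for the extrinsic parameter matrix of the camera described in the next item, and $R_{21}$, $T_{21}$ are the reconstruction symbols introduced above:

$$\begin{bmatrix} B \\ C \\ D \end{bmatrix} = R_c\left(R_{21}\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + T_{21}\right) + T_c, \qquad \begin{bmatrix} B \\ C \\ D \end{bmatrix} = R_c\begin{bmatrix} x \\ y \\ z \end{bmatrix} + T_c$$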
  • [R|T] is the extrinsic parameter matrix of the camera.
  • for the extrinsic parameter matrix of the camera and the method of obtaining it, please refer to the explanation and description in the conventional technology.
  • for T, reference may be made to the introduction in the above description of the coordinate system of the second object.
  • the image coordinate system is the coordinate system corresponding to the image captured by the camera, which is a two-dimensional coordinate system.
  • exemplarily, the image coordinate system is a coordinate system established with the center of the image captured by the camera as the origin.
  • the point in the image coordinate system and the point in the coordinate system of the second object have a mapping relationship.
  • the points in the image coordinate system have a mapping relationship with the points in the coordinate system of the first object, and the points in the image coordinate system also have a mapping relationship with the points in the coordinate system of the camera.
  • the point (a, b) in the image coordinate system and the point (X, Y, Z) in the coordinate system of the second object satisfy the following formula:
  • the point (a, b) in the image coordinate system and the point (x, y, z) in the coordinate system of the first object satisfy the following formula:
  • the point (a, b) in the image coordinate system and the point (B, C, D) in the camera's coordinate system satisfy the following formula:
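The three formulas are missing from this extraction. Standard pinhole-projection relations consistent with the symbols explained below (s, A, the extrinsic matrix $[R_c \mid T_c]$, and the reconstruction symbols $R_{21}$, $T_{21}$ introduced above) would read, for the second object, the first object and the camera respectively:

$$s\begin{bmatrix} a \\ b \\ 1 \end{bmatrix} = A\left(R_c\left(R_{21}\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} + T_{21}\right) + T_c\right), \qquad s\begin{bmatrix} a \\ b \\ 1 \end{bmatrix} = A\,[\,R_c \mid T_c\,]\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}, \qquad s\begin{bmatrix} a \\ b \\ 1 \end{bmatrix} = A\begin{bmatrix} B \\ C \\ D \end{bmatrix}$$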
  • s is the scale factor.
  • A is the internal parameter matrix of the camera.
  • [R|T] is the extrinsic parameter matrix of the camera.
  • for T, reference may be made to the introduction in the above description of the coordinate system of the second object.
  • the target detection method and device provided by the embodiments of the present application can be applied to any scene in which 3D information of a target object needs to be detected.
  • the target detection method and device can be applied to ADAS of vehicles or drones.
  • 3D information of the target object can be obtained, the calculation result is accurate, and the calculation cost is low.
  • the system architecture applicable to the embodiments of the present application includes a target detection apparatus.
  • a sensing module is deployed in the target detection device.
  • the perception module may include a camera.
  • the system architecture may be as shown in FIG. 2A .
  • the system architecture shown in FIG. 2A includes a target detection device 201 .
  • a sensing module 2011 is deployed in the target detection apparatus 201 .
  • the above-mentioned target detection apparatus may capture an image including the second object through the sensing module.
  • the target detection device can also obtain two-dimensional (2 dimensions, 2D) information of the second object in the image; obtain the 3D information of the second object according to the 2D information of the second object; obtain the first transformation relationship according to the coordinate information of the end points of the first dividing line and the coordinate information of the end points of the second dividing line; and obtain the distance between the second object and the first object according to the coordinate information of the end points of the first dividing line, the coordinate information of the end points of the second dividing line and the first transformation relationship.
  • for details, refer to the method shown in FIG. 4 below.
  • the above system architecture further includes a detection module.
  • a neural network model is deployed in the detection module, and the 2D information of the second object can be obtained by inputting the above image into the neural network model.
  • the neural network model includes an image preprocessing module and a network inference module.
  • the image preprocessing module is used to standardize the collected images, so that the model has better generalization ability.
  • the network inference module is used to obtain the 2D information of the second object in the image according to the standardized image, for example, the coordinate information of the first 2D frame, the coordinate information of the end points of the first dividing line, the type information of the second object, and the like.
  • the above detection module may be included in the target detection device, or may be independent of the target detection device. When the detection module is independent of the target detection device, it can communicate with the target detection device by wire or wirelessly.
  • the system architecture may be as shown in FIG. 2B or FIG. 2C .
  • the system architecture shown in FIG. 2B includes a target detection device 201 .
  • a sensing module 2011 and a detection module 2012 are deployed in the target detection apparatus 201 .
  • the system architecture shown in FIG. 2C includes a target detection device 201 and a detection module 202 .
  • a sensing module 2011 is deployed in the target detection apparatus 201 .
  • if the detection module is included in the target detection device, the target detection device can capture an image including the second object through the sensing module, and detect the 2D information of the second object in the image through the detection module.
  • if the detection module is independent of the target detection device, the target detection device can capture an image including the second object through the sensing module, and send the image to the detection module. After receiving the image, the detection module detects the 2D information of the second object in the image, and sends the 2D information of the second object to the target detection device.
  • a system architecture applicable to the embodiments of the present application includes a target detection device and a perception module.
  • the perception module and the target detection device are independent of each other.
  • the perception module can communicate with the target detection device in a wired or wireless manner.
  • the perception module may include a camera.
  • the system architecture may be as shown in FIG. 2D .
  • the system architecture shown in FIG. 2D includes a target detection device 203 and a perception module 204 .
  • the above-mentioned sensing module may capture an image including the second object, and send the image to the target detection device.
  • the target detection device can obtain the 2D information of the second object in the image; obtain the 3D information of the second object according to the 2D information of the second object; obtain the first transformation relationship according to the coordinate information of the end points of the first dividing line and the coordinate information of the end points of the second dividing line; and obtain the distance between the second object and the first object according to the coordinate information of the end points of the first dividing line, the coordinate information of the end points of the second dividing line and the first transformation relationship.
  • for details, refer to the method shown in FIG. 4 below.
  • the above system architecture further includes a detection module.
  • a neural network model is deployed in the detection module, and the 2D information of the second object can be obtained by inputting the above image into the neural network model.
  • the above detection module may be included in the target detection device, or may be independent of the target detection device. When the detection module is independent of the target detection device, it can communicate with the target detection device by wire or wirelessly.
  • the system architecture may be as shown in FIG. 2E or FIG. 2F .
  • the system architecture shown in FIG. 2E includes a target detection device 203 and a perception module 204 .
  • a detection module 2031 is deployed in the target detection device 203 .
  • the system architecture shown in FIG. 2F includes a target detection device 203 , a perception module 204 and a detection module 205 .
  • the perception module can capture an image including the second object, and send the image to the target detection device. After receiving the image, the target detection device can detect the 2D information of the second object in the image through the detection module.
  • the perception module may capture an image including the second object and send the image to the detection module. After receiving the image, the detection module can detect the 2D information of the second object in the image, and send the 2D information of the second object to the target detection device.
  • the first object in this application may be a vehicle, an unmanned aerial vehicle, or an intelligent device (for example, a robot in various application scenarios, such as a domestic robot, a robot in an industrial scenario, etc.).
  • a perception module, and/or a target detection device, and/or a detection module may be deployed on the first object.
  • a perception module, and/or a target detection device, and/or a detection module are deployed in the ADAS of the first object.
  • the second object in the present application may be a vehicle, a guardrail, a road post or a building, and the like.
  • system architectures shown in FIG. 2A-FIG. 2F are only used for examples, and are not used to limit the technical solutions of the present application.
  • system architecture may also include other devices, and the number of sensing modules, target detection devices or detection modules may also be determined according to specific needs.
  • the target detection apparatus in FIG. 2A to FIG. 2F in the embodiment of the present application may be a functional module in a device.
  • the above-mentioned functional module may be an electronic component in a hardware device, such as an ADAS chip, a software function running on dedicated hardware, or a virtualized function instantiated on a platform (e.g., a cloud platform).
  • FIG. 3 is a schematic diagram of a hardware structure of an electronic device applicable to an embodiment of the present application.
  • the electronic device 300 includes at least a processor 301 , a communication line 302 , and a memory 303 .
  • the processor 301 can be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application-specific integrated circuit (application-specific integrated circuit, ASIC), or one or more integrated circuits for controlling the execution of the programs of the present application.
  • Communication line 302 may include a path, such as a bus, to transfer information between the components described above.
  • Memory 303 may be a read-only memory (read-only memory, ROM) or other type of static storage device that can store static information and instructions, a random access memory (random access memory, RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory, EEPROM), a compact disc read-only memory (compact disc read-only memory, CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without limitation.
  • the memory may exist independently and be connected to the processor through communication line 302 .
  • the memory can also be integrated with the processor.
  • the memory provided by the embodiments of the present application may generally be non-volatile.
  • the memory 303 is used for storing the computer-executed instructions involved in executing the solution of the present application, and the execution is controlled by the processor 301 .
  • the processor 301 is configured to execute the computer-executed instructions stored in the memory 303, so as to implement the method provided by the embodiments of the present application.
  • the computer-executed instructions in the embodiment of the present application may also be referred to as application code, which is not specifically limited in the embodiment of the present application.
  • the electronic device 300 further includes a communication interface 304 .
  • Communication interface 304 may use any transceiver-like device for communicating with other devices or communication networks, such as Ethernet interfaces, radio access network (RAN), wireless local area networks (wireless local area networks, WLAN), etc.
  • the electronic device 300 further includes a sensing module (not shown in FIG. 3 ).
  • the perception module may include a monocular camera, a binocular camera, a trinocular camera, or a multi-eye camera.
  • the perception module may be used to capture an image including the second object.
  • the electronic device 300 further includes a detection module (not shown in FIG. 3 ).
  • a neural network model is deployed in the detection module, and the collected images are input into the neural network model to obtain 2D information of the second object.
  • the processor 301 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 3 .
  • the electronic device 300 may include multiple processors, such as the processor 301 and the processor 307 in FIG. 3 .
  • processors can be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor.
  • a processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (eg, computer program instructions).
  • the electronic device 300 may further include an output device 305 and an input device 306 .
  • the output device 305 is in communication with the processor 301 and can display information in a variety of ways.
  • the output device 305 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector (projector), or the like.
  • Input device 306 is in communication with processor 301 and can receive user input in a variety of ways.
  • the input device 306 may be a mouse, a keyboard, a touch screen device, a sensor device, or the like.
  • the hardware structure shown in FIG. 3 does not constitute a limitation on the target detection device, and the target detection device may include more or fewer components than shown, or combine some components, or have a different component arrangement.
  • the target detection method provided by the embodiment of the present application will be described in detail below with reference to FIGS. 1 to 3 , taking the second object as a vehicle as an example.
  • the target detection method provided in this embodiment of the present application can be applied to multiple fields, for example, the field of unmanned driving, the field of automatic driving, the field of assisted driving, the field of intelligent driving, the field of connected driving, and the field of intelligent connected driving , car sharing, etc.
  • the first 2D frame, the second 2D frame and the like in this application are 2D frames distinguished by different ordinal numbers; the numbers are only used for convenience of description and writing, and the ordinal numbers themselves have no specific technical meaning, for example, the first 2D frame, the second 2D frame, etc. can each be understood as one of a series of 2D frames.
  • the target detection apparatus may perform some or all of the steps in the embodiments of the present application; these steps are only examples, and the embodiments of the present application may also perform other steps or variations of various steps. In addition, various steps may be performed in orders different from those presented in the embodiments of the present application, and it may not be necessary to perform all the steps in the embodiments of the present application.
  • the specific structure of the execution body of the target detection method is not particularly limited in the embodiments of the present application, as long as it can run a program that records the code of the target detection method of the embodiments of the present application and thereby perform the corresponding operations.
  • the execution body of the target detection method provided by the embodiment of the present application may be a target detection device, or a component applied in the target detection device, such as a chip. This is not limited.
  • a target detection method provided by an embodiment of the present application includes steps 401 to 404 .
  • Step 401 The target detection apparatus acquires 2D information of a second object in the image acquired by the first object.
  • the target detection device may be any target detection device shown in FIGS. 2A-2F .
  • the target detection device may be deployed in the first object. It should be understood that if the target detection device is the target detection device in FIG. 2D , FIG. 2E or FIG. 2F , a perception module, such as a camera, is also deployed in the first object.
  • the image is acquired by the perception module.
  • the perception module may include a monocular camera, a binocular camera, a trinocular camera, or a multi-eye camera.
  • the image in step 401 may be captured by a sensing module in the first object.
  • the perception module may be the perception module in FIGS. 2A-2F .
  • the sensing module is the sensing module 2011 in FIG. 2A .
  • the sensing module is the sensing module 204 in FIG. 2E .
  • the image acquired by the first object includes one or more second objects.
  • the image may be as shown in FIG. 5, the image including a plurality of second objects.
  • the 2D information of the second object includes coordinate information of the end point of the first boundary line of the second object and type information of the second object. Further, the 2D information of the second object further includes coordinate information of the first 2D frame of the second object and surface information of the second object.
  • the coordinate information of the first 2D frame is used to indicate the coordinates of the second object in the image coordinate system corresponding to the image.
  • the image coordinate system is a two-dimensional coordinate system. Therefore, the first 2D frame is a plane figure, such as a rectangle or a polygon.
  • the first 2D frame is a polygon that is included in the image and encloses the second object.
  • the first 2D frame may be the 2D frame 501 in FIG. 5 .
  • the coordinate information of the first 2D frame may include coordinates of each corner of the first 2D frame. For example, if the first 2D frame is a rectangle, the coordinate information of the first 2D frame includes the coordinates of the four corners of the rectangle. It should be understood that, in addition to including the coordinates of each corner of the first 2D frame, the coordinate information of the first 2D frame can also indicate the coordinates of the second object in the image coordinate system by other means, which is not limited.
  • the first boundary line may be a boundary line between the first surface and the second surface of the second object. At least one of the first face and the second face is included in the first 2D frame.
  • for example, the first dividing line is the dividing line between the back and the left of the second object, or the dividing line between the back and the right of the second object, or the dividing line between the front and the left of the second object, or the dividing line between the front and the right of the second object.
  • the first dividing line may be the dividing line 502 in FIG. 5 .
  • the dividing line 502 is the dividing line between the rear and the left of the second object.
  • the coordinate information of the end point of the first dividing line is used to indicate the coordinates of the first dividing line in the image coordinate system corresponding to the image.
  • the coordinate information of the end point of the first dividing line includes: the coordinates of the lower dividing point of the first dividing line, and/or the coordinates of the upper dividing point of the first dividing line.
  • the lower boundary point of the first boundary line is the intersection of the first boundary line and the lower boundary of the first 2D frame.
  • the upper boundary point of the first boundary line is the intersection of the first boundary line and the upper boundary of the first 2D frame.
  • the coordinates of the first dividing line in the image coordinate system corresponding to the image can also be indicated by the coordinates of other points on the first dividing line.
  • the midpoint of the first dividing line is the midpoint of the line connecting the upper dividing point and the lower dividing point of the first dividing line.
  • the type information of the second object is used to indicate the type to which the second object belongs.
  • the type of the second object includes one or more of the following types: hatchback, sedan, subcompact, sport utility vehicle (SUV), pickup, minivan, minivan (open-top) , minivan (enclosed), light truck, heavy truck, engineering vehicle, medium bus, large bus or double decker bus. It should be understood that the above types are only examples of the type of the second object, and in practical applications, the type of the second object also includes other types, which are not limited.
  • the type information of the second object may include an identification of the type to which the second object belongs. Exemplarily, taking the second object as a hatchback car and the identification of the hatchback car as ID 2 as an example, the type information of the second object includes ID 2.
  • the surface information of the second object is used to indicate the first surface and/or the second surface.
  • the surface information of the second object may include the identification of the first surface and/or the identification of the second surface.
  • the surface information of the second object includes ID 1 and ID 2.
  • if the included angle between the target detection device and the traveling direction of the second object is less than or equal to a first threshold, only one surface of the second object can be seen in the image.
  • for example, in the image shown in FIG. 5, the back of the second object 503 can be seen, but its sides cannot.
  • in this case, the surface information of the second object is used to indicate that one surface, and the first dividing line is the left boundary or the right boundary of the first 2D frame.
  • the target detection apparatus may acquire 2D information of one or more of the second objects in the image. Further, when the target detection device acquires the 2D information of multiple second objects in the image, the target detection device may acquire the 2D information of the multiple second objects simultaneously, or may acquire the 2D information of the multiple second objects one by one.
  • the target detection apparatus may acquire the 2D information of the second object in the image in the following two ways.
  • the target detection device acquires the 2D information of the second object in the image according to the user's input.
  • the target detection apparatus 201 captures an image including the second object through the sensing module 2011 .
  • the target detection device 201 displays the image for the user through the human-computer interaction interface, and receives the 2D information of the second object input by the user.
  • the sensing module 204 captures an image including the second object, and sends the image to the target detection device 203 .
  • the target detection device 203 displays the image for the user through the human-computer interaction interface, and receives the 2D information of the second object input by the user.
  • the target detection device inputs the image into the neural network model to obtain the 2D information of the second object.
  • the target detection device inputs the image into the neural network model in the detection module to obtain 2D information of the second object.
  • the detection module may be the detection module in FIGS. 2A-2F .
  • the detection module is the detection module 2012 in FIG. 2B .
  • the detection module is the detection module 205 in FIG. 2F .
  • the introduction of the neural network model please refer to the corresponding description in the above-mentioned introduction to the system architecture.
  • the target detection apparatus 201 captures an image including the second object through the sensing module 2011 .
  • the target detection device 201 inputs the image into the neural network model in the detection module 2012 to obtain 2D information of the second object.
  • the target detection apparatus 201 captures an image including the second object through the sensing module 2011 , and sends the image to the detection module 202 .
  • the detection module 202 inputs the image into the neural network model to obtain the 2D information of the second object, and sends the 2D information of the second object to the target detection device 201 .
  • the sensing module 204 captures an image including the second object, and sends the image to the target detection device 203 .
  • after receiving the image from the sensing module 204, the target detection device 203 inputs the image into the neural network model to obtain the 2D information of the second object.
  • the sensing module 204 captures an image including the second object, and sends the image to the detection module 205 .
  • the detection module 205 inputs the image into the neural network model to obtain the 2D information of the second object, and sends the 2D information of the second object to the target detection device 203.
  • Step 402 The target detection apparatus acquires 3D information of the second object according to the 2D information of the second object.
  • the 3D information of the second object includes coordinate information of the end point of the second boundary line. Further, the 3D information of the second object further includes coordinate information of the 3D frame of the second object.
  • the 3D frame is a 3D model of the second object established according to the type information of the second object.
  • the 3D frame may be a solid figure, such as a rectangular parallelepiped.
  • a 3D box can be as shown in Figure 1.
  • the coordinate information of the 3D frame is used to indicate the coordinates of the 3D frame in the coordinate system corresponding to the 3D frame.
  • the coordinate system corresponding to the 3D frame may also be referred to as the coordinate system of the second object.
  • the coordinate information of the 3D frame may include coordinates of each corner of the 3D frame.
  • the coordinate information of the 3D frame includes the coordinates of the eight corners of the rectangular parallelepiped. It should be understood that, in addition to including the coordinates of each corner of the 3D frame, the coordinate information of the 3D frame may also indicate the coordinates of the 3D frame in the coordinate system corresponding to the 3D frame by other means, which is not limited.
  • the second dividing line corresponds to the first dividing line. That is, the second boundary line is the boundary line between the first surface and the second surface in the 3D frame of the second object.
  • for example, if the first dividing line is the dividing line between the back and the left of the second object in the image, the second dividing line is the dividing line between the back and the left of the 3D frame; if the first dividing line is the dividing line between the back and the right of the second object in the image, the second dividing line is the dividing line between the back and the right of the 3D frame; if the first dividing line is the dividing line between the front and the left of the second object in the image, the second dividing line is the dividing line between the front and the left of the 3D frame; and if the first dividing line is the dividing line between the front and the right of the second object in the image, the second dividing line is the dividing line between the front and the right of the 3D frame.
  • the second dividing line may be the dividing line 103 in FIG. 1 .
  • the dividing line 103 is the dividing line between the rear and the left of the 3D frame.
  • the front or back of the 3D box is a surface composed of the height of the 3D box and the width of the 3D box, and the left or right side of the 3D box is a surface composed of the height of the 3D box and the length of the 3D box.
  • the coordinate information of the end point of the second boundary line is used to indicate the coordinates of the second boundary line in the coordinate system corresponding to the 3D frame.
  • the coordinate information of the end point of the second dividing line includes: the coordinates of the lower dividing point of the second dividing line, and/or the coordinates of the upper dividing point of the second dividing line.
  • the lower boundary point of the second boundary line is the intersection point of the second boundary line and the lower plane of the 3D frame.
  • the upper boundary point of the second boundary line is the intersection of the second boundary line and the upper plane of the 3D frame.
  • the coordinates of the second dividing line in the coordinate system corresponding to the 3D frame can also be indicated by the coordinates of other points on the second dividing line.
  • the midpoint of the second dividing line is the midpoint of the line connecting the upper dividing point and the lower dividing point of the second dividing line.
  • the target detection apparatus may acquire the 3D information of the second object in the following two exemplary manners.
  • Manner 1: the 2D information of the second object includes the coordinate information of the end points of the first dividing line, the coordinate information of the first 2D frame and the type information of the second object. The target detection device constructs the 3D frame of the second object according to the type information of the second object and obtains the coordinate information of the 3D frame; the target detection device determines the first surface and the second surface according to the coordinate information of the first 2D frame and the coordinate information of the end points of the first dividing line; and the target detection device obtains the coordinate information of the end points of the second dividing line according to the first surface, the second surface and the coordinate information of the 3D frame.
  • the length of the boundary of the 3D frame corresponds to the type information of the second object. Further, the length of the 3D frame, the width of the 3D frame and the height of the 3D frame correspond to the type information of the second object.
  • exemplarily, the correspondence between the length, the width and the height of the 3D frame and the type information of the second object may be as shown in Table 1.
  • Table 1: if the type of the second object is a hatchback, the length of the 3D frame is L1, the width of the 3D frame is W1, and the height of the 3D frame is H1; if the type of the second object is a sedan, the length of the 3D frame is L2, the width of the 3D frame is W2, and the height of the 3D frame is H2; if the type of the second object is an SUV, the length of the 3D frame is L3, the width of the 3D frame is W3, and the height of the 3D frame is H3.
  • it should be understood that Table 1 is only an example of the correspondence between the length, the width and the height of the 3D frame and the type information of the second object, and the correspondence may also take other forms, which is not limited.
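A minimal sketch of constructing the 3D frame from the type information: look up class-dependent dimensions and lay out a cuboid in the second object's coordinate system, with the origin at the lower end point of the dividing line between the left side and the rear, as in FIG. 1. The dimension values and the axis convention below are placeholders, not the patent's Table 1.

```python
import numpy as np

# Hypothetical dimension table: type id -> (length, width, height) in metres.
FRAME_DIMENSIONS = {
    1: (4.0, 1.8, 1.5),   # hatchback  (stand-ins for L1, W1, H1)
    2: (4.8, 1.9, 1.5),   # sedan      (stand-ins for L2, W2, H2)
    3: (4.7, 1.9, 1.8),   # SUV        (stand-ins for L3, W3, H3)
}

def build_3d_frame(type_id):
    """Return the 8 corners of the 3D frame and the end points of the second dividing line."""
    length, width, height = FRAME_DIMENSIONS[type_id]
    # Convention for this sketch: x forward (rear -> front), y to the right (left -> right), z up.
    corners = np.array([[x, y, z]
                        for x in (0.0, length)
                        for y in (0.0, width)
                        for z in (0.0, height)])
    dividing_line = np.array([[0.0, 0.0, 0.0],       # lower end point (origin, rear-left edge)
                              [0.0, 0.0, height]])   # upper end point
    return corners, dividing_line
```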
  • if the distance between the first dividing line and the left or right border of the first 2D frame is greater than or equal to a second threshold, at least two faces of the second object can be seen in the image; if the distance between the first dividing line and the left or right border of the first 2D frame is less than the second threshold, only one face of the second object can be seen in the image.
  • for example, if the target detection device determines that the first surface is the left side and the second surface is the rear, or that the first surface is the rear and the second surface is the left side, the target detection apparatus determines the dividing line between the left side and the rear of the 3D frame as the second dividing line, and acquires the coordinate information of the end points of the second dividing line.
  • if the target detection device determines that the first surface is the right side and the second surface is the front, or that the first surface is the front and the second surface is the right side, the target detection apparatus determines the dividing line between the right side and the front of the 3D frame as the second dividing line, and acquires the coordinate information of the end points of the second dividing line.
  • if the distance between the first dividing line and the left or right border of the first 2D frame is less than the second threshold, the target detection device determines that the displayed face is, for example, the back of the second object.
  • in this case, if the distance between the first dividing line and the left border of the first 2D frame is less than the second threshold, the target detection device determines the dividing line between the back and the left side of the 3D frame as the second dividing line, and obtains the coordinate information of the end points of the second dividing line; if the distance between the first dividing line and the right border of the first 2D frame is less than the second threshold, the target detection device determines the dividing line between the back and the right side of the 3D frame as the second dividing line, and obtains the coordinate information of the end points of the second dividing line.
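The threshold test above can be sketched as a simple comparison; the function name, argument layout and pixel threshold below are assumptions made for illustration.

```python
def visible_faces_and_edge(box_2d, dividing_line_x, threshold_px=8.0):
    """box_2d = (x_min, y_min, x_max, y_max) of the first 2D frame;
    dividing_line_x = image x-coordinate of the first dividing line."""
    x_min, _, x_max, _ = box_2d
    dist_left = abs(dividing_line_x - x_min)       # distance to the left border
    dist_right = abs(x_max - dividing_line_x)      # distance to the right border
    if min(dist_left, dist_right) >= threshold_px:
        # At least two faces of the second object are visible in the image.
        return "two_faces", None
    if dist_left < threshold_px:
        # One face visible; second dividing line: edge between the back and the left of the 3D frame.
        return "one_face", "back-left edge"
    # One face visible; second dividing line: edge between the back and the right of the 3D frame.
    return "one_face", "back-right edge"
```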
  • Manner 2: the 2D information of the second object includes the coordinate information of the end points of the first dividing line, the type information of the second object, and the surface information of the second object. The target detection device constructs the 3D frame of the second object according to the type information of the second object and obtains the coordinate information of the 3D frame; the target detection device then obtains the coordinate information of the end points of the second dividing line according to the surface information of the second object and the coordinate information of the 3D frame.
  • for the specific process in which the target detection device constructs the 3D frame of the second object according to the type information of the second object and obtains the coordinate information of the 3D frame, reference may be made to Manner 1 above, and details are not repeated.
  • specifically, the target detection device determines the second dividing line according to the first surface and/or the second surface indicated by the surface information of the second object, and obtains the coordinate information of the end points of the second dividing line according to the coordinate information of the 3D frame.
  • the target detection device determines the dividing line between the left and the back of the 3D frame as the second dividing line, and according to The coordinate information of the 3D frame acquires the coordinate information of the end point of the second dividing line.
  • For another example, if the surface information of the second object includes only the identifier of the rear of the second object and the first dividing line is the left border of the first 2D frame, the target detection device determines the dividing line between the left face and the rear of the 3D frame as the second dividing line, and obtains the coordinate information of the endpoints of the second dividing line according to the coordinate information of the 3D frame.
  • If the first dividing line is the right border of the first 2D frame, the target detection device determines the dividing line between the right face and the rear of the 3D frame as the second dividing line, and obtains the coordinate information of the endpoints of the second dividing line according to the coordinate information of the 3D frame.
  • Step 403 The target detection device obtains a first transformation relationship according to the coordinate information of the end points of the first dividing line and the coordinate information of the end points of the second dividing line.
  • the first transformation relationship is a transformation relationship between the coordinate system corresponding to the 3D frame and the coordinate system of the first object when the distance between the second object and the first object is K.
  • K is greater than 0.
  • the first transformation relationship is the transformation relationship between the coordinate system corresponding to the 3D frame and the coordinate system of the first object when the distance between the second object and the optical center of the camera of the first object is K.
  • Any point on the second dividing line and the mapping point of that point in the image coordinate system satisfy Formula 1. When the distance between the second object and the camera is K, any point on the second dividing line and the mapping point of that point in the camera's coordinate system satisfy Formula 2.
  • the target detection device can obtain s and the first transformation relationship, that is, s and T ⁇ , by solving the above formula 1 and formula 2.
  • (X, Y, Z) is any point on the second boundary line, for example, the lower boundary point of the second boundary line.
  • (a, b) is the mapping point of that point in the image coordinate system. If (X, Y, Z) is the lower boundary point of the second dividing line, then (a, b) is the lower boundary point of the first dividing line. A and [R|T] are known quantities. The symbol "-" indicates a value that the target detection device need not pay attention to when solving for s and the first transformation relationship.
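  • The sketch below (Python/NumPy) reconstructs Formula 1 and Formula 2 from the standard pinhole model implied by the surrounding definitions, i.e. s·[a, b, 1]ᵀ = A·[R|T]·[X+δx, Y+δy, Z+δz, 1]ᵀ for Formula 1 and the constraint that the mapped point lies at camera-frame depth K for Formula 2; the zero-skew intrinsic matrix, the reading of "distance K" as camera-frame depth, and all names are assumptions made for illustration, not quoted from the patent.

```python
import numpy as np

def solve_first_transformation(A, RT, p_frame, p_image, K):
    """Solve for the scale factor s and the translation T delta that place the
    chosen point of the second dividing line (p_frame, in the 3D frame's own
    coordinate system) at depth K while projecting onto p_image, the matching
    point of the first dividing line in the image."""
    R, T = RT[:, :3], RT[:, 3]
    a, b = p_image
    fx, fy, cx, cy = A[0, 0], A[1, 1], A[0, 2], A[1, 2]
    # The third row of Formula 1 gives s = camera-frame depth; Formula 2 fixes it to K.
    s = float(K)
    # Back-project the image point to the camera frame at depth K.
    p_cam = np.array([(a - cx) * K / fx, (b - cy) * K / fy, K])
    # Formula 2: p_cam = R (p_frame + T_delta) + T, solved for T_delta.
    T_delta = np.linalg.solve(R, p_cam - T) - np.asarray(p_frame, dtype=float)
    return s, T_delta
```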
  • Step 404 The target detection device obtains the distance between the second object and the first object according to the coordinate information of the end point of the first dividing line, the coordinate information of the end point of the second dividing line and the first transformation relationship.
  • In one possible implementation, the target detection device performs coordinate transformation on the first coordinate through the first transformation relationship, the internal parameter matrix and the external parameter matrix to obtain the second coordinate; the target detection device then performs a compound operation on the second coordinate, the third coordinate and K to obtain the distance between the second object and the first object.
  • the internal parameter matrix is the internal parameter matrix of the camera, that is, the internal parameter matrix is A.
  • the extrinsic parameter matrix is the extrinsic parameter matrix of the camera, that is, the extrinsic parameter matrix is [R|T].
  • the first coordinate is on the second dividing line. Because the first transformation relationship is the transformation relationship between the coordinate system corresponding to the 3D frame and the coordinate system of the first object when the distance between the second object and the first object is K, the distance between the second coordinate and the camera is K.
  • the second coordinate, the third coordinate and the camera are in a straight line.
  • the third coordinate is a coordinate corresponding to the first coordinate on the first dividing line.
  • When the distance between the second object and the first object is K, the first coordinate (X, Y, Z) and the second coordinate (x, y) satisfy the same projection relationship as Formula 1, where s and T δ are the values calculated in step 403 above.
  • the second coordinate (x, y) can be obtained by the above formula.
  • the target detection device performs a compound operation on the second coordinate, the third coordinate and K, so as to obtain the distance between the second object and the first object.
  • the target detection device may use the P value as the distance between the second object and the first object. Further, the target detection device can also obtain the distance between the center of mass of the second object and the camera according to the P value and the coordinate information of the 3D frame, and use the distance as the distance between the second object and the first object.
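  • A sketch of step 404 under the same assumptions as the previous block: the first coordinate (the lower endpoint of the second dividing line) is projected through the first transformation relationship to obtain the second coordinate, and the similar-triangle construction around FIG. 6 is then applied. The final ratio P = K·|y1 − y3| / |y1 − y2| is an inference from that construction, not a formula quoted from the patent.

```python
import numpy as np

def object_distance(A, RT, T_delta, p_frame, line_top, line_bottom, K):
    """Estimate the distance P between the second object and the first object.
    line_top = (x1, y1) and line_bottom = (x1, y2) are the upper and lower
    endpoints of the first dividing line in the image; line_bottom plays the
    role of the third coordinate."""
    R, T = RT[:, :3], RT[:, 3]
    # Second coordinate: projection of the first coordinate at distance K.
    uvw = A @ (R @ (np.asarray(p_frame, dtype=float) + T_delta) + T)
    x2, y3 = uvw[0] / uvw[2], uvw[1] / uvw[2]  # uvw[2] equals the s of step 403
    (_, y1), (_, y2) = line_top, line_bottom
    # Similar triangles ABC ~ ADE (FIG. 6): assumed ratio of the two vertical segments.
    return K * abs(y1 - y3) / abs(y1 - y2)
```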
  • Optionally, after step 404, the target detection device may perform path planning according to the distance between the second object and the first object, and control the first object to travel along the planned path, thereby effectively avoiding obstacles and increasing the comfort and safety of automatic driving.
  • the target detection apparatus may acquire 2D information of the second object in the image obtained by the first object, for example, coordinate information of the end point of the first boundary line of the second object and type information of the second object.
  • the target detection apparatus may also acquire 3D information of the second object, for example, coordinate information of the end point of the second boundary line of the second object.
  • The target detection device can also obtain, according to the coordinate information of the endpoints of the first dividing line and the coordinate information of the endpoints of the second dividing line, the first transformation relationship between the coordinate system corresponding to the 3D frame and the coordinate system of the first object when the distance between the second object and the first object is K, and then obtain the distance between the second object and the first object according to the coordinate information of the endpoints of the first dividing line, the coordinate information of the endpoints of the second dividing line, and the first transformation relationship.
  • In this way, the target detection device constructs the 3D frame from the image collected by the monocular camera and, using the transformation relationship between the coordinate system corresponding to the 3D frame and the coordinate system of the first object when the distance between the second object and the first object is K, obtains the distance between the second object and the first object at a low computational cost.
  • the distance between the second object and the first object can be obtained without a multi-eye camera or a depth camera, so the cost of the target detection device can be reduced.
  • Optionally, the target detection device may also acquire the orientation angle of the second object, that is, the included angle between the traveling directions of the second object and the first object.
  • the method shown in FIG. 4 further includes steps 701 to 703 .
  • Step 701 The target detection device rotates the 3D frame N times with the second boundary as the center, and obtains a second 2D frame corresponding to the 3D frame after each rotation.
  • N is a positive integer.
  • In one possible implementation, the target detection device rotates the 3D frame around the second dividing line and acquires the second 2D frame corresponding to the rotated 3D frame once for every rotation angle α, where 0 ≤ α ≤ 360°.
  • For example, taking α as 60°, the target detection device obtains the second 2D frame corresponding to the rotated 3D frame when the 3D frame has been rotated by 60°, 120°, 180°, 240°, 300° and 360°, respectively.
  • the N is predefined, or randomly determined by the target detection device.
  • the difference between two adjacent rotation angles may be the same or different.
  • For example, taking N as 7 and the seven rotation angles as 0°, 30°, 90°, 150°, 200°, 260° and 310°, the target detection device obtains the second 2D frame corresponding to the rotated 3D frame at each of these rotation angles.
  • the second 2D frame is obtained by performing coordinate transformation on the rotated 3D frame. Further, the second 2D frame is obtained by performing coordinate transformation on the rotated 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix.
  • the second transformation relationship is a transformation relationship between the coordinate system corresponding to the 3D frame and the coordinate system of the first object.
  • the coordinate information of the second 2D frame is used to indicate the coordinates of the second 2D frame in the image coordinate system.
  • Any point (x, y) on the second 2D frame and the mapping point (X, Y, Z) of that point on the rotated 3D frame satisfy the same projection formula, with the second transformation relationship T δ' in place of T δ; (X, Y, Z), s, A, [R|T] and T δ' are known quantities, so (x, y) can be obtained.
  • the process of acquiring the second transformation relationship by the target detection device is as follows:
  • In one possible implementation, the second transformation relationship is obtained according to the coordinate information of the endpoints of the first dividing line, the coordinate information of the endpoints of the second dividing line, and the distance between the second object and the first object. Further, according to these quantities, the target detection device obtains a system of equations with the second transformation relationship as the unknown, and solves the system to obtain the second transformation relationship.
  • Any point on the second dividing line and the mapping point of that point in the image coordinate system satisfy Formula 3, and any point on the second dividing line and the mapping point of that point in the camera's coordinate system satisfy Formula 4; these parallel Formula 1 and Formula 2, with the distance P obtained in step 404 in place of K.
  • the target detection device can obtain s and the second transformation relationship, that is, s and T ⁇ ', by solving the above formula 3 and formula 4.
  • (X, Y, Z) is any point on the second boundary line, for example, the lower boundary point of the second boundary line.
  • (a, b) is the mapping point of that point in the image coordinate system; if (X, Y, Z) is the lower boundary point of the second dividing line, then (a, b) is the lower boundary point of the first dividing line. A and [R|T] are known quantities, and P is the distance between the second object and the first object obtained in step 404. The symbol "-" indicates a value that the target detection device need not pay attention to when solving for s and the second transformation relationship.
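  • Under the assumptions of the sketch given after step 403, Formula 3 and Formula 4 can be solved with the same illustrative routine by substituting the distance P obtained in step 404 for the hypothesised distance K:

```python
# Reusing the illustrative solver from the step-403 sketch, with P from step 404.
s2, T_delta_prime = solve_first_transformation(A, RT, p_frame, p_image, P)
```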
  • Step 702 According to the N second 2D frames corresponding to the 3D frame after the N rotations and the first 2D frame, the target detection device obtains, from the N second 2D frames, the second 2D frame whose border is at the shortest distance from the border of the first 2D frame.
  • the sum of the first distance and the second distance corresponding to the second 2D frame with the shortest distance is the smallest.
  • the first distance is the distance between the left border of the second 2D frame with the shortest distance and the left border of the first 2D frame.
  • the second distance is the distance between the right border of the second 2D frame with the shortest distance and the right border of the first 2D frame.
  • FIG. 8 is a schematic diagram of any second 2D frame and the first 2D frame.
  • the distance between the left border of the second 2D frame 801 and the left border of the first 2D frame 802 is ⁇ a
  • the distance between the right border of the second 2D frame 801 and the right border of the first 2D frame 802 is ⁇ b.
  • That is, the rotation angle α of the 3D frame corresponding to the second 2D frame with the shortest distance satisfies g = argmin[Δa(α) + Δb(α)], where argmin denotes the value of α at which [Δa(α) + Δb(α)] is minimized.
  • For example, taking the 3D frame corresponding to the second 2D frame 801 in FIG. 8 as the 3D frame 101 in FIG. 1: if the coordinates of P1 in the coordinate system corresponding to the 3D frame are (X1, Y1, Z1) and the coordinates of P2 are (X2, Y2, Z2); in the second 2D frame 801, the point p1 corresponding to P1 has image coordinates (x1, y1) and the point p2 corresponding to P2 has image coordinates (x2, y2); and in the first 2D frame 802, Q1 has image coordinates (x3, y3) and Q2 has image coordinates (x4, y4), then (X1, Y1, Z1) and (x1, y1), and likewise (X2, Y2, Z2) and (x2, y2), satisfy the projection formula above. Since s, A, [R|T], T δ' and the coordinates in the 3D frame's coordinate system are known quantities, (x1, y1) and (x2, y2) can be obtained, and then Δa = |x1 − x3| and Δb = |x2 − x4|.
  • Step 703 The target detection device determines the orientation angle of the second object according to the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance.
  • the target detection apparatus determines the sum of the orientation angle corresponding to the 3D frame in step 402 and the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance as the orientation angle of the second object.
  • For example, if the orientation angle corresponding to the 3D frame in step 402 is 0°, the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance is itself the orientation angle of the second object. If the orientation angle corresponding to the 3D frame in step 402 is 30° and the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance is 30°, the orientation angle of the second object is 60°.
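  • A sketch of steps 701 to 703 under the same assumptions as the earlier blocks: the corners of the 3D frame are rotated about the second dividing line (taken here to be the vertical axis of the frame's own coordinate system), each rotation is projected to a second 2D frame, and the rotation whose left and right borders best match the first 2D frame is kept. The names and the sampling of rotation angles are illustrative.

```python
import numpy as np

def project_lr(A, RT, T_delta_prime, corners):
    """Project the eight corners of a 3D frame and return the left and right
    borders of the corresponding 2D frame in the image."""
    R, T = RT[:, :3], RT[:, 3]
    pts = (A @ (R @ (corners + T_delta_prime).T + T[:, None])).T
    u = pts[:, 0] / pts[:, 2]
    return u.min(), u.max()

def orientation_angle(A, RT, T_delta_prime, corners, axis_point,
                      box_left, box_right, base_angle_deg=0.0, n=360):
    """Steps 701-703: rotate the 3D frame about the second dividing line, keep
    the rotation minimising delta_a + delta_b, and add it to the orientation
    angle of the frame built in step 402."""
    best = None
    for alpha in np.linspace(0.0, 2 * np.pi, n, endpoint=False):
        c, s = np.cos(alpha), np.sin(alpha)
        Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        rotated = (corners - axis_point) @ Rz.T + axis_point
        left, right = project_lr(A, RT, T_delta_prime, rotated)
        cost = abs(left - box_left) + abs(right - box_right)  # delta_a + delta_b
        if best is None or cost < best[0]:
            best = (cost, alpha)
    return base_angle_deg + np.degrees(best[1])
```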
  • Optionally, after step 703, the target detection device may perform path planning according to the distance between the second object and the first object and the orientation angle of the second object, and control the first object to travel along the planned path, thereby effectively avoiding obstacles and increasing the comfort and safety of automatic driving.
  • Based on the method shown in FIG. 7, the target detection apparatus can compare the second 2D frames corresponding to the rotated 3D frame with the first 2D frame to obtain the second 2D frame closest to the first 2D frame, thereby obtaining the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance, and determine the orientation angle of the second object according to that rotation angle.
  • In this way, when planning a path, the target detection device may refer not only to the distance between the second object and the first object but also to the orientation angle of the second object, so that the planned path has a higher reference value.
  • the length of the 3D frame, the width of the 3D frame and the height of the 3D frame in the above step 402 are obtained according to the type of the second object.
  • In practical applications, vehicles of the same type but of different brands may have different dimensions. Therefore, the length, width or height of the 3D frame obtained according to the type of the second object may be inaccurate.
  • the target detection device may further calibrate the length of the 3D frame and/or the width of the 3D frame.
  • the method shown in FIG. 7 further includes steps 901 to 903 .
  • Step 901 The target detection apparatus adjusts the length of the boundary of the 3D frame corresponding to the second 2D frame with the shortest distance M times, and obtains the third 2D frame corresponding to the adjusted 3D frame each time.
  • M is a positive integer. M is predefined or determined by the target detection device.
  • the length of the boundary of the 3D box includes the length of the 3D box, and/or the width of the 3D box.
  • The purpose of adjusting the length of the boundary of the 3D frame corresponding to the second 2D frame with the shortest distance M times is to minimize the difference between the boundary of the third 2D frame corresponding to the adjusted 3D frame and the boundary of the first 2D frame, so that a more accurate length of the boundary of the 3D frame can be obtained, and thus a more accurate length and width of the second object.
  • the target detection apparatus adjusts the length and/or the width of the 3D frame corresponding to the second 2D frame with the shortest distance M times. For example, each time the target detection apparatus increases the length and/or width of the 3D frame corresponding to the second 2D frame with the shortest distance by ⁇ j, the increased ⁇ j each time may be the same or different. For another example, the target detection apparatus reduces the length and/or width of the 3D frame corresponding to the second 2D frame with the shortest distance by ⁇ j each time, and the reduced ⁇ j each time may be the same or different.
  • ⁇ j is predefined or determined by the object detection device.
  • Each time the target detection device adjusts the length and width of the 3D frame corresponding to the second 2D frame with the shortest distance, the adjustment values for the length and for the width may be the same or different.
  • the target detection apparatus may increase the length of the 3D frame corresponding to the second 2D frame with the shortest distance by ⁇ j, and decrease the width of the 3D frame corresponding to the second 2D frame with the shortest distance by ⁇ r.
  • the third 2D frame is obtained by performing coordinate transformation on the adjusted 3D frame. Further, the third 2D frame is obtained by performing coordinate transformation on the adjusted 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix.
  • Any point (x, y) on the third 2D frame and the mapping point (X, Y, Z) of that point on the adjusted 3D frame satisfy the same projection formula, where (X, Y, Z), s, A, [R|T] and T δ' are known quantities, so (x, y) can be obtained.
  • Step 902 According to the M third 2D frames corresponding to the M adjusted 3D frames and the first 2D frame, the target detection device obtains, from the M third 2D frames, the third 2D frame whose border is at the shortest distance from the border of the first 2D frame.
  • the sum of the third distance and the fourth distance corresponding to the third 2D box with the shortest distance is the smallest.
  • the third distance is the distance between the left border of the third 2D frame with the shortest distance and the left border of the first 2D frame.
  • the fourth distance is the distance between the right border of the third 2D frame with the shortest distance and the right border of the first 2D frame.
  • the sum of the third distance, the fourth distance and the fifth distance corresponding to the third 2D frame with the shortest distance is the smallest.
  • the fifth distance is the distance between the dividing line on the third 2D frame with the shortest distance and the first dividing line.
  • ⁇ c is the distance between the left border of any third 2D frame and the left border of the first 2D frame.
  • ⁇ d is the distance between the right boundary of any third 2D box and the right boundary of the first 2D box.
  • ⁇ e is the distance between the boundary line of any third 2D frame and the first boundary line.
  • Step 903 The target detection apparatus determines the length of the boundary of the second object according to the length of the boundary of the 3D frame corresponding to the third 2D frame with the shortest distance.
  • the target detection apparatus determines the length of the boundary of the 3D frame corresponding to the third 2D frame with the shortest distance as the length of the boundary of the second object.
  • The target detection device determines the length of the 3D frame corresponding to the third 2D frame with the shortest distance as the length of the second object, determines the width of the 3D frame corresponding to the third 2D frame with the shortest distance as the width of the second object, and determines the height of the 3D frame in step 402 as the height of the second object.
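  • A sketch of steps 901 to 903 under the same assumptions as the earlier blocks: candidate (length, width) adjustments are applied to the 3D frame, each adjusted frame is projected to a third 2D frame, and the pair minimising Δc + Δd + Δe is kept. `make_corners` is an assumed helper that rebuilds the eight corners and a point on the dividing line for given dimensions; the adjustment grid is illustrative.

```python
import numpy as np

def refine_dimensions(A, RT, T_delta_prime, make_corners, length0, width0,
                      box_left, box_right, first_line_x,
                      deltas=np.linspace(-0.5, 0.5, 11)):
    """Steps 901-903: return the (length, width) whose projected third 2D frame
    is closest to the first 2D frame."""
    R, T = RT[:, :3], RT[:, 3]
    best = None
    for dl in deltas:                 # adjustment applied to the length (delta j)
        for dw in deltas:             # adjustment applied to the width (delta r)
            corners, line_point = make_corners(length0 + dl, width0 + dw)
            pts = (A @ (R @ (corners + T_delta_prime).T + T[:, None])).T
            u = pts[:, 0] / pts[:, 2]
            lp = A @ (R @ (line_point + T_delta_prime) + T)
            cost = (abs(u.min() - box_left)               # delta c
                    + abs(u.max() - box_right)            # delta d
                    + abs(lp[0] / lp[2] - first_line_x))  # delta e
            if best is None or cost < best[0]:
                best = (cost, length0 + dl, width0 + dw)
    return best[1], best[2]
```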
  • the target detection apparatus may further calculate the centroid of the second object according to the length of the boundary of the second object.
  • Based on the method shown in FIG. 9, the target detection apparatus can compare the third 2D frames corresponding to the adjusted 3D frame with the first 2D frame to obtain the third 2D frame closest to the first 2D frame, and thereby obtain the length of the boundary of the second object. In this way, the target detection device can acquire a more accurate size of the second object.
  • the above-mentioned target detection apparatus and the like include corresponding hardware structures and/or software modules for performing each function.
  • Those skilled in the art should easily realize that the unit and algorithm operations of each example described in conjunction with the embodiments disclosed herein can be implemented in hardware or in the form of a combination of hardware and computer software. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.
  • the target detection apparatus may be divided into functional modules according to the above method examples.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above-mentioned integrated modules can be implemented in the form of hardware, and can also be implemented in the form of software function modules. It should be noted that, the division of modules in the embodiments of the present application is schematic, and is only a logical function division, and there may be other division manners in actual implementation.
  • FIG. 10 shows a schematic structural diagram of a target detection apparatus.
  • the target detection device can be used to perform the functions of the target detection device involved in the above embodiments.
  • the target detection apparatus shown in FIG. 10 includes: an acquisition unit 1001 and a determination unit 1002 .
  • the acquiring unit 1001 is configured to acquire two-dimensional 2D information of a second object in the image acquired by the first object; the 2D information of the second object includes: coordinate information of the endpoint of the first dividing line of the second object and the second object type information, the first dividing line is the dividing line between the first surface and the second surface of the second object, and at least one of the first surface and the second surface is included in the first surface of the second object.
  • a 2D box, the first 2D box is a polygon that includes a second object in the image.
  • the obtaining unit 1001 is configured to perform step 401 .
  • the obtaining unit 1001 is further configured to obtain the three-dimensional 3D information of the second object according to the 2D information of the second object; the 3D information of the second object includes the coordinate information of the end points of the second dividing line, and the second dividing line is the first In the 3D frame of the two objects, the boundary between the first surface and the second surface, the 3D frame is the 3D model of the second object, and the length of the boundary of the 3D frame corresponds to the type information of the second object.
  • the obtaining unit 1001 is further configured to perform step 402 .
  • The determining unit 1002 is configured to obtain a first transformation relationship according to the coordinate information of the endpoints of the first dividing line and the coordinate information of the endpoints of the second dividing line; the first transformation relationship is the transformation relationship between the three-dimensional coordinate system corresponding to the 3D frame and the three-dimensional coordinate system of the first object when the distance between the second object and the first object is K, where K is greater than 0.
  • the determining unit 1002 is configured to perform step 403 .
  • the determining unit 1002 is further configured to obtain the distance between the second object and the first object according to the coordinate information of the end point of the first dividing line, the coordinate information of the end point of the second dividing line and the first transformation relationship. For example, in conjunction with FIG. 4 , the determining unit 1002 is further configured to perform step 404 .
  • the 2D information of the second object further includes: coordinate information of the first 2D frame; the obtaining unit 1001 is specifically configured to construct the 3D frame according to the type information of the second object, and obtain the 3D frame. Coordinate information; the obtaining unit 1001 is also specifically configured to determine the first surface and the second surface according to the coordinate information of the first 2D frame and the coordinate information of the endpoints of the first dividing line; the obtaining unit 1001 is also specifically used to obtain the coordinate information of the end point of the second boundary line according to the coordinate information of the first surface, the second surface and the 3D frame.
  • the 2D information of the second object further includes: surface information of the second object, where the surface information of the second object is used to indicate the first surface and the second surface; the acquiring unit 1001, specifically It is used to construct the 3D frame according to the type information of the second object, and obtain the coordinate information of the 3D frame; the obtaining unit 1001 is also specifically configured to obtain the second score according to the surface information of the second object and the coordinate information of the 3D frame. Coordinate information of the endpoint of the boundary.
  • The determining unit 1002 is specifically configured to perform coordinate transformation on the first coordinate through the first transformation relationship, the internal parameter matrix and the external parameter matrix to obtain a second coordinate, where the first coordinate is on the second dividing line, the distance between the second coordinate and the first object is K, the second coordinate, the third coordinate and the first object are on a straight line, the third coordinate is the coordinate on the first dividing line corresponding to the first coordinate, the internal parameter matrix is the internal parameter matrix of the device that captures the image, and the external parameter matrix is the external parameter matrix of that device. The determining unit 1002 is further specifically configured to perform a compound operation on the second coordinate, the third coordinate and K to obtain the distance between the second object and the first object.
  • the determining unit 1002 is further specifically configured to, according to the coordinate information of the end point of the first dividing line, the coordinate information of the end point of the second dividing line, and the distance between the second object and the first object, A second transformation relationship is obtained; the second transformation relationship is a transformation relationship between the three-dimensional coordinate system corresponding to the 3D frame and the three-dimensional coordinate system of the first object.
  • The obtaining unit 1001 is further configured to rotate the 3D frame N times with the second dividing line as the center and obtain the second 2D frame corresponding to the 3D frame after each rotation, where the second 2D frame is obtained by performing coordinate transformation on the rotated 3D frame and N is a positive integer. The determining unit 1002 is further configured to obtain, according to the N second 2D frames corresponding to the 3D frame after the N rotations and the first 2D frame, the second 2D frame, among the N second 2D frames, whose border is at the shortest distance from the border of the first 2D frame. The determining unit 1002 is further configured to determine the orientation angle of the second object according to the rotation angle of the 3D frame corresponding to the second 2D frame with the shortest distance.
  • That the second 2D frame is obtained by performing coordinate transformation on the rotated 3D frame includes: the second 2D frame is obtained by performing coordinate transformation on the rotated 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix, where the internal parameter matrix is the internal parameter matrix of the device that captures the image and the external parameter matrix is the external parameter matrix of that device.
  • The obtaining unit 1001 is further configured to adjust the length of the boundary of the 3D frame corresponding to the second 2D frame with the shortest distance M times and obtain the third 2D frame corresponding to the adjusted 3D frame each time, where the third 2D frame is obtained by performing coordinate transformation on the adjusted 3D frame and M is a positive integer. The determining unit 1002 is further configured to obtain, according to the M third 2D frames corresponding to the M adjusted 3D frames and the first 2D frame, the third 2D frame, among the M third 2D frames, whose border is at the shortest distance from the border of the first 2D frame. The determining unit 1002 is further configured to determine the length of the boundary of the second object according to the length of the boundary of the 3D frame corresponding to the third 2D frame with the shortest distance.
  • That the third 2D frame is obtained by performing coordinate transformation on the adjusted 3D frame includes: the third 2D frame is obtained by performing coordinate transformation on the adjusted 3D frame through the second transformation relationship, the internal parameter matrix and the external parameter matrix, where the internal parameter matrix is the internal parameter matrix of the device that captures the image and the external parameter matrix is the external parameter matrix of that device.
  • the acquiring unit 1001 is specifically configured to input the image into the neural network model to obtain the 2D information of the second object.
  • the target detection apparatus is presented in the form of dividing each functional module in an integrated manner.
  • Module herein may refer to a specific ASIC, circuit, processor and memory executing one or more software or firmware programs, integrated logic circuit, and/or other device that may provide the functions described above.
  • the target detection device can take the form shown in FIG. 3 .
  • the processor 301 in FIG. 3 may invoke the computer execution instructions stored in the memory 303 to cause the target detection apparatus to execute the target detection method in the above method embodiments.
  • the functions/implementation process of the acquiring unit 1001 and the determining unit 1002 in FIG. 10 may be implemented by the processor 301 in FIG. 3 calling the computer-executed instructions stored in the memory 303 .
  • the target detection apparatus provided in this embodiment can perform the above-mentioned target detection method, reference can be made to the above-mentioned method embodiments for the technical effects that can be obtained, and details are not repeated here.
  • FIG. 11 is a schematic structural diagram of a chip according to an embodiment of the present application.
  • the chip 110 includes one or more processors 1101 and an interface circuit 1102 .
  • The chip 110 may further include a bus 1103.
  • the processor 1101 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the processor 1101 or an instruction in the form of software.
  • The above-mentioned processor 1101 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the interface circuit 1102 is used for sending or receiving data, instructions or information.
  • the processor 1101 can use the data, instructions or other information received by the interface circuit 1102 to perform processing, and can send the processing completion information through the interface circuit 1102 .
  • the chip 110 further includes a memory, which may include a read-only memory and a random access memory, and provides operation instructions and data to the processor.
  • a portion of the memory may also include non-volatile random access memory (NVRAM).
  • the memory stores executable software modules or data structures
  • the processor may execute corresponding operations by calling operation instructions stored in the memory (the operation instructions may be stored in the operating system).
  • the chip 110 may be used in the target detection apparatus involved in the embodiments of the present application.
  • the interface circuit 1102 may be used to output the execution result of the processor 1101 .
  • processor 1101 and the interface circuit 1102 can be implemented by hardware design, software design, or a combination of software and hardware, which is not limited here.
  • the disclosed apparatus and method may be implemented in other manners.
  • the device embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division. In actual implementation, there may be other division methods.
  • For example, multiple units or components may be combined or may be integrated into another device, or some features may be omitted or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and components shown as units may be one physical unit or multiple physical units, that is, they may be located in one place, or may be distributed to multiple different places . Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium.
  • Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product that is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disc, or other media that can store program code.


Abstract

The present application discloses a target detection method and device, relates to the field of artificial intelligence, and can obtain 3D information of a vehicle from an image collected by a monocular camera. The method includes: acquiring coordinate information of the endpoints of a first dividing line of a second object in an image obtained by a first object and type information of the second object; acquiring coordinate information of the endpoints of a second dividing line of the second object according to the coordinate information of the endpoints of the first dividing line and the type information of the second object; obtaining, according to the coordinate information of the endpoints of the first dividing line and the coordinate information of the endpoints of the second dividing line, a first transformation relationship between the three-dimensional coordinate system corresponding to the 3D frame and the three-dimensional coordinate system of the first object when the distance between the second object and the first object is K; and obtaining the distance between the second object and the first object according to the coordinate information of the endpoints of the first dividing line, the coordinate information of the endpoints of the second dividing line, and the first transformation relationship.

Description

目标检测方法及装置
“本申请要求于2020年8月12日提交国家知识产权局、申请号为202010806077.3、发明名称为“目标检测方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中”。
技术领域
本申请涉及人工智能领域,尤其涉及自动驾驶领域或智能交通领域的目标检测方法及装置。
背景技术
随着经济的快速发展,汽车市场占有量逐年增加,交通事故发生的频率也急剧增加。为了降低交通事故发生的频率,高级驾驶辅助系统(advanced driver assistant system,ADAS)应运而生。ADAS可以通过感知模块感知外部车辆的三维(3 dimensions,3D)信息,例如,车辆的位置和尺寸等信息,提醒驾驶员车辆行驶中可能存在的危险,并依据感知的信息进行路径规划和驾驶策略调整。感知模块感知的车辆的3D信息越精确,ADAS的性能越好,车辆的安全性越高。
目前,单目摄像机的成本较低,感知模块中通常配置单目摄像机。但是单目摄像机是用一个摄像机来感知信息的,不能感知车辆的3D信息。
发明内容
本申请提供目标检测方法及装置,可以根据单目摄像机采集的图像得到车辆的3D信息。
为达到上述目的,本申请的实施例采用如下技术方案:
第一方面,本申请实施例提供一种目标检测方法及装置,该方法包括:获取第一物体获取的图像中第二物体的二维(2 dimensions,2D)信息;该第二物体的2D信息包括:该第二物体的第一分界线的端点的坐标信息和该第二物体的类型信息,该第一分界线为该第二物体的第一面与第二面之间的分界线,该第一面和该第二面中至少一个面包括在该第二物体的第一2D框中,该第一2D框为包括在该图像中,包围第二物体的多边形;根据第二物体的2D信息获取该第二物体的3D信息;该第二物体的3D信息包括第二分界线的端点的坐标信息,该第二分界线为该第二物体的3D框中,该第一面和该第二面之间的分界线,该3D框为第二物体的3D模型,3D框的边界的长度与该第二物体的类型信息对应;根据该第一分界线的端点的坐标信息和该第二分界线的端点的坐标信息,得到第一变换关系;该第一变换关系为该第二物体与该第一物体的距离为K时,该3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系,该K大于0;根据该第一分界线的端点的坐标信息、该第二分界线的端点的坐标信息和该第一变换关系,得到该第二物体与该第一物体的距离。
上述第一方面提供的方法,可以根据单目摄像机采集的图像构建3D框,并利用第二物体与第一物体的距离为K时,3D框对应的三维坐标系与第一物体的三维坐标系之间的变换关系得到第二物体与第一物体的距离,运算成本较低。另外,通过第一方面提供的方法,不需要多目摄像机或深度摄像机,就可以得到第二物体与第一物体 的距离,因此可以降低目标检测装置的成本。
在一种可能的实现方式中,该第二物体的2D信息还包括:第二物体的第一2D框的坐标信息;根据第二物体的2D信息获取该第二物体的3D信息,包括:根据该第二物体的类型信息构建该3D框,得到该3D框的坐标信息;根据该第一2D框的坐标信息和该第一分界线的端点的坐标信息,确定该第一面和该第二面;根据该第一面、该第二面和3D框的坐标信息,获取该第二分界线的端点的坐标信息。基于上述方法,当第二物体的2D信息包括第二物体的第一分界线的端点的坐标信息、第二物体的第一2D框的坐标信息和该第二物体的类型信息的情况下,可以构建3D框,并根据第一分界线的端点的坐标信息和第一2D框的坐标信息确定3D框中的第二分界线的端点的坐标信息,以便后续根据第二分界线上的点以及该点在第一分界线上的映射点,得到3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系。
在一种可能的实现方式中,该第二物体的2D信息还包括:该第二物体的面信息,该第二物体的面信息用于指示该第一面和该第二面;根据第二物体的2D信息获取该第二物体的3D信息,包括:根据该第二物体的类型信息构建该3D框,得到该3D框的坐标信息;根据该第二物体的面信息和该3D框的坐标信息,获取该第二分界线的端点的坐标信息。基于上述方法,当第二物体的2D信息包括第二物体的第一分界线的端点的坐标信息、第二物体的面信息和该第二物体的类型信息的情况下,可以构建3D框,并根据第二物体的面信息确定3D框中的第二分界线的端点的坐标信息,以便后续根据第二分界线上的点以及该点在第一分界线上的映射点,得到3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系。
在一种可能的实现方式中,根据该第一分界线的端点的坐标信息、该第二分界线的端点的坐标信息和该第一变换关系,得到该第二物体与第一物体的距离,包括:将第一坐标通过该第一变换关系、内参矩阵和外参矩阵进行坐标变换,得到第二坐标,该第一坐标在该第二分界线上,该第二坐标与该第一物体的距离为K,该第二坐标、第三坐标与该第一物体在一条直线上,该第三坐标为该第一分界线上与该第一坐标对应的坐标,该内参矩阵为拍摄该图像的设备的内参矩阵,该外参矩阵为该设备的外参矩阵;对该第二坐标、该第三坐标和该K进行复合运算,得到第二物体与所述第一物体的距离。基于上述方法,可以根据3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系,以及相似三角形的性质得到第二物体与所述第一物体的距离。
在一种可能的实现方式中,该方法还包括:根据该第一分界线的端点的坐标信息、该第二分界线的端点的坐标信息和第二物体与所述第一物体的距离,得到第二变换关系;该第二变换关系为该3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系。基于上述方法,可以得到3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系,以便根据该变换关系确定第二物体的朝向角或校准3D框的长度和/或宽度。
在一种可能的实现方式中,该方法还包括:以该第二分界线为中心对该3D框进行N次旋转,获取每次旋转后的3D框所对应的第二2D框,该第二2D框是对该旋转后的3D框进行坐标变换得到的,该N为正整数;根据N次旋转后的3D框对应的N个第二2D框和该第一2D框,得到该N个第二2D框中,边界与该第一2D框的边界 之间的距离最短的第二2D框;根据该距离最短的第二2D框对应的3D框的旋转角,确定该第二物体的朝向角。基于上述方法,可以将经过旋转后的3D框对应的第二2D框与第一2D框进行比较,得到与第一2D框最接近的第二2D框,从而获取距离最短的第二2D框对应的3D框的旋转角,并根据距离最短的第二2D框对应的3D框的旋转角,确定第二物体的朝向角。如此,在规划路径时除了参考第二物体与第一物体的距离之外,还可以参考该第二物体的朝向角,使得规划的路径的参考性更高。
在一种可能的实现方式中,该第二2D框是对该旋转后的3D框进行坐标变换得到的,包括:该第二2D框是通过该第二变换关系、内参矩阵和外参矩阵对该旋转后的3D框进行坐标变换得到的,该内参矩阵为拍摄该图像的设备的内参矩阵,该外参矩阵为该设备的外参矩阵。基于上述方法,可以通过第二变换关系、内参矩阵和外参矩阵对该旋转后的3D框进行坐标变换得到第二2D框,以便后续根据第二2D框获取第二物体的朝向角。
在一种可能的实现方式中,该方法还包括:对该距离最短的第二2D框对应的3D框的边界的长度进行M次调整,获取每次调整后的3D框所对应的第三2D框,该第三2D框是对该调整后的3D框进行坐标变换得到的,该M为正整数;根据M次调整后的3D框对应的M个第三2D框和该第一2D框,得到该M个第三2D框中,边界与该第一2D框的边界之间的距离最短的第三2D框;根据该距离最短的第三2D框对应的3D框的边界的长度,确定该第二物体的边界的长度。基于上述方法,可以将调整后的3D框对应的第三2D框与第一2D框进行比较,得到与第一2D框最接近的第三2D框,从而获取第二物体的边界的长度。如此,可以获取到第二物体的更为准确的尺寸。
在一种可能的实现方式中,该第三2D框是对该调整后的3D框进行坐标变换得到的,包括:该第三2D框是通过该第二变换关系、内参矩阵和外参矩阵对该调整后的3D框进行坐标变换得到的,该内参矩阵为拍摄该图像的设备的内参矩阵,该外参矩阵为该设备的外参矩阵。基于上述方法,可以通过第二变换关系、内参矩阵和外参矩阵对该调整后的3D框进行坐标变换得到第三2D框,以便后续根据第三2D框获取第二物体的更为准确的尺寸。
在一种可能的实现方式中,该获取第一物体获取的图像中第二物体的2D信息,包括:将该图像输入到神经网络模型,得到该第二物体的2D信息。基于上述方法,可以通过神经网络模型得到第二物体的2D信息,以便根据第二物体的2D信息得到第二物体与第一物体的距离。
第二方面,本申请实施例提供一种目标检测装置,可以实现上述第一方面、或第一方面任一种可能的实现方式中的方法。该装置包括用于执行上述方法的相应的单元或部件。该装置包括的单元可以通过软件和/或硬件方式实现。该装置例如可以为ADAS、或者为可支持ADAS实现上述方法的芯片、芯片系统、或处理器等。
第三方面,本申请实施例提供一种目标检测装置,包括:处理器,所述处理器与存储器耦合,所述存储器用于存储程序或指令,当所述程序或指令被所述处理器执行时,使得该装置实现上述第一方面、或第一方面任一种可能的实施方式中所述的方法。
第四方面,本申请实施例提供一种目标检测装置,该装置用于实现上述第一方面、 或第一方面任一种可能的实施方式中所述的方法。
第五方面,本申请实施例提供一种计算机可读介质,其上存储有计算机程序或指令,所述计算机程序或指令被执行时使得计算机执行上述第一方面、或第一方面任一种可能的实施方式中所述的方法。
第六方面,本申请实施例提供一种计算机程序产品,其包括计算机程序代码,所述计算机程序代码在计算机上运行时,使得计算机执行上述第一方面、或第一方面任一种可能的实施方式中所述的方法。
第七方面,本申请实施例提供一种芯片,包括:处理器,所述处理器与存储器耦合,所述存储器用于存储程序或指令,当所述程序或指令被所述处理器执行时,使得该芯片实现上述第一方面、或第一方面任一种可能的实施方式中所述的方法。
可以理解的,上述提供的任一种目标检测装置、芯片、计算机可读介质或计算机程序产品等均用于执行上文所提供的对应的方法,因此,其所能达到的有益效果可参考对应的方法中的有益效果,此处不再赘述。
附图说明
图1为本申请实施例提供的3D框的示意图;
图2A为本申请实施例提供的系统架构的示意图一;
图2B为本申请实施例提供的系统架构的示意图二;
图2C为本申请实施例提供的系统架构的示意图三;
图2D为本申请实施例提供的系统架构的示意图四;
图2E为本申请实施例提供的系统架构的示意图五;
图2F为本申请实施例提供的系统架构的示意图六;
图3为本申请实施例提供的电子设备的硬件结构示意图;
图4为本申请实施例提供的目标检测方法的流程示意图一;
图5为本申请实施例提供的感知模块拍摄的图像的示意图;
图6为本申请实施例提供的第二坐标和第三坐标的示意图;
图7为本申请实施例提供的目标检测方法的流程示意图二;
图8为本申请实施例提供的任一第二2D框与第一2D框的示意图;
图9为本申请实施例提供的目标检测方法的流程示意图三;
图10为本申请实施例提供的目标检测装置的结构示意图;
图11为本申请实施例提供的芯片的结构示意图。
具体实施方式
为方便理解本申请实施例的方案,首先对本申请实施例中涉及的各种坐标系进行介绍:
1、第一物体的坐标系
第一物体的坐标系是将第一物体作为参考的坐标系,是一个三维坐标系。示例性的,以第一物体为车辆为例,第一物体的坐标系是以车辆的质心为原点的坐标系。其中,车辆的质心为车辆的质量的中心。进一步的,在第一物体的坐标系中,可以以车辆的质心的坐标表示该车辆的坐标。
2、第二物体的坐标系
第二物体的坐标系是3D框对应的坐标系,是一个三维坐标系。该3D框是目标检测装置根据第二物体的类型信息建立的第二物体的3D模型。第二物体的坐标系的原点可以是3D框上的任一点。例如,如图1所示,第二物体的坐标系的原点可以是3D框101的左面和后面的分界线上的下分界点102。
在一些实施例中,第一物体的坐标系中的点和第二物体的坐标系中的点有映射关系。示例性的,第一物体的坐标系中的点(x,y,z)和第二物体的坐标系中的点(X,Y,Z)满足如下公式:
Figure PCTCN2021087917-appb-000001
其中,T δ为从第二物体的坐标系到第一物体的坐标系下的平移向量。T δ可以为三维向量,例如,T δ=(δ xyz)。
3、摄像机的坐标系
摄像机的坐标系是以摄像机的光心为原点的坐标系,是一个三维坐标系。该摄像机可以是目标检测装置中的模块,也可以不包括在目标检测装置中。
在一些实施例中,摄像机的坐标系中的点和第二物体的坐标系中的点有映射关系。摄像机的坐标系中的点和第一物体的坐标系中的点有映射关系。
示例性的,摄像机的坐标系中的点(B,C,D)和第二物体的坐标系中的点(X,Y,Z)满足如下公式:
Figure PCTCN2021087917-appb-000002
摄像机的坐标系中的点(B,C,D)和第一物体的坐标系中的点(x,y,z)满足如下公式:
Figure PCTCN2021087917-appb-000003
其中,[R|T]为摄像机的外参矩阵。摄像机的外参矩阵的具体介绍,以及获取摄像机外参矩阵的方法的具体介绍可以参考常规技术中的解释和说明。T δ的介绍可以参考上述第二物体的坐标系中的介绍。
4、图像坐标系
图像坐标系是摄像机拍摄的图像对应的坐标系,是一个二维坐标系。例如,图像坐标系是以摄像机拍摄的图像的中心为原点,在该图像上建立的坐标系。
在一些实施例中,图像坐标系中的点和第二物体的坐标系中的点有映射关系。图像坐标系中的点和第一物体的坐标系中的点有映射关系。图像坐标系中的点与摄像机的坐标系中的点有映射关系。
示例性的,图像坐标系中的点(a,b)和第二物体的坐标系中的点(X,Y,Z)满足如下公式:
Figure PCTCN2021087917-appb-000004
图像坐标系中的点(a,b)和第一物体的坐标系中的点(x,y,z)满足如下公式:
Figure PCTCN2021087917-appb-000005
图像坐标系中的点(a,b) 和摄像机的坐标系中的点(B,C,D)满足如下公式:
Figure PCTCN2021087917-appb-000006
其中,s为尺度比例因子。A为摄像机的内参矩阵。[R|T]为摄像机的外参矩阵。尺度比例因子、摄像机的内参矩阵、摄像机的外参矩阵、摄像机的内参矩阵和外参矩阵的获取方法的具体介绍可以参考常规技术中的解释和说明。T δ的介绍可以参考上述第二物体的坐标系中的介绍。
下面结合附图对本申请实施例的实施方式进行详细描述。
本申请实施例提供的目标检测方法及装置,能够应用于任一需要检测目标物体的3D信息的场景中。例如,该目标检测方法及装置能够应用于车辆或无人机等的ADAS中。通过本申请实施例提供的目标检测方法及装置,能够得到目标物体的3D信息,计算结果准确并且运算成本较低。
首先,对本申请实施例可应用的系统架构进行说明。
在一种可能的实现方式中,本申请实施例可应用的系统架构包括目标检测装置。其中,该目标检测装置中部署有感知模块。该感知模块可以包括摄像机。例如,单目摄像机、双目摄像机、三目摄像机或多目摄像机等等。其中,单目摄像机、双目摄像机、三目摄像机或多目摄像机的具体介绍可以参考常规技术中的解释说明,本申请实施例不做赘述。示例性的,该系统架构可以如图2A所示。图2A所示的系统架构包括目标检测装置201。目标检测装置201中部署有感知模块2011。
上述目标检测装置可以通过感知模块拍摄包括第二物体的图像。目标检测装置还可以获取图像中第二物体的二维(2 dimensions,2D)信息;根据第二物体的2D信息获取第二物体的3D信息;根据第一分界线的端点的坐标信息和第二分界线的端点的坐标信息,得到第一变换关系;根据第一分界线的端点的坐标信息、第二分界线的端点的坐标信息和第一变换关系,得到第二物体与第一物体的距离。具体的,可以参考下述图4所示的方法。
可选的,上述系统架构还包括检测模块。该检测模块中部署有神经网络模型,将上述图像输入到神经网络模型,可以得到第二物体的2D信息。例如,神经网络模型包括图片预处理模块和网络推理模块等。其中,图片预处理模块,用于对采集的图像进行标准化操作,以得到更具泛化能力的模型。网络推理模块用于根据经过标准化操作的图像,得到图像中第二物体的2D信息,例如,第一2D框的坐标信息、第一分界线的端点的坐标信息、第二物体的类型信息和第二物体的面信息等。其中,第一2D框的坐标信息、第一分界线的端点的坐标信息、第二物体的类型信息和第二物体的面信息的介绍可以参考下述图4所示方法中所述。
可以理解的,上述检测模块可以包括在目标检测装置中,也可以独立于目标检测装置。检测模块独立于目标检测装置时,可以通过有线或无线与目标检测装置通信。示例性的,该系统架构可以如图2B或图2C所示。图2B所示的系统架构包括目标检测装置201。目标检测装置201中部署有感知模块2011和检测模块2012。图2C所示的系统架构包括目标检测装置201和检测模块202。目标检测装置201中部署有感知模块2011。
可以理解的,当检测模块包括在目标检测装置中时,目标检测装置可以通过感知模块拍摄包括第二物体的图像,并通过检测模块检测该图像中第二物体的2D信息。当检测模块独立于目标检测装置时,目标检测装置可以通过感知模块拍摄包括第二物体的图像,并向检测模块发送该图像。检测模块接收到该图像后,检测该图像中第二物体的2D信息,并向目标检测装置发送该第二物体的2D信息。
在另一种可能的实现方式中,本申请实施例可应用的系统架构包括目标检测装置和感知模块。其中,感知模块和目标检测装置相互独立。感知模块可以通过有线或无线的方式与目标检测装置通信。该感知模块可以包括摄像机。示例性的,该系统架构可以如图2D所示。图2D所示的系统架构包括目标检测装置203和感知模块204。
上述感知模块可以拍摄包括第二物体的图像,并向目标检测装置发送该图像。目标检测装置接收到该图像后,可以获取图像中第二物体的2D信息;根据第二物体的2D信息获取第二物体的3D信息;根据第一分界线的端点的坐标信息和第二分界线的端点的坐标信息,得到第一变换关系;根据第一分界线的端点的坐标信息、第二分界线的端点的坐标信息和第一变换关系,得到第二物体与第一物体的距离。具体的,可以参考下述图4所示的方法。
可选的,上述系统架构还包括检测模块。该检测模块中部署有神经网络模型,将上述图像输入到神经网络模型,可以得到第二物体的2D信息。
可以理解的,上述检测模块可以包括在目标检测装置中,也可以独立于目标检测装置。检测模块独立于目标检测装置时,可以通过有线或无线与目标检测装置通信。示例性的,该系统架构可以如图2E或图2F所示。图2E所示的系统架构包括目标检测装置203和感知模块204。目标检测装置203中部署有检测模块2031。图2F所示的系统架构包括目标检测装置203、感知模块204和检测模块205。
可以理解的,当检测模块包括在目标检测装置中时,感知模块可以拍摄包括第二物体的图像,并向目标检测装置发送该图像。目标检测装置接收到该图像后,可以通过检测模块检测该图像中第二物体的2D信息。当检测模块独立于目标检测装置时,感知模块可以拍摄包括第二物体的图像,并向检测模块发送该图像。检测模块接收到该图像后,可以检测该图像中第二物体的2D信息,并向目标检测装置发送第二物体的2D信息。
可以理解的,本申请中的第一物体可以为车辆、无人机或智能体设备(例如,各种应用场景的机器人,如家用机器人,工业场景机器人等)等。第一物体上可以部署感知模块,和/或,目标检测装置,和/或,检测模块。例如,第一物体的ADAS中部署有感知模块,和/或,目标检测装置,和/或,检测模块。
可以理解的,本申请中的第二物体可以为车辆、护栏、路桩或建筑物等。
应注意,图2A-图2F所示的系统架构仅用于举例,并非用于限制本申请的技术方案。本领域的技术人员应当明白,在具体实现过程中,系统架构还可以包括其他设备,同时也可根据具体需要来确定感知模块、目标检测装置或检测模块的数量。
可选的,本申请实施例图2A-图2F中的目标检测装置,可以是一个设备内的一个功能模块。可以理解的是,上述功能既可以是硬件设备中的电子元件,例如ADAS的芯片,也可以是在专用硬件上运行的软件功能,或者是平台(例如,云平台)上实例化 的虚拟化功能。
例如,图2A-图2F中的目标检测装置均可以通过图3中的电子设备300来实现。图3所示为可适用于本申请实施例的电子设备的硬件结构示意图。该电子设备300包括至少一个处理器301,通信线路302,存储器303。
处理器301可以是一个通用中央处理器(central processing unit,CPU),微处理器,特定应用集成电路(application-specific integrated circuit,ASIC),或一个或多个用于控制本申请方案程序执行的集成电路。
通信线路302可包括一通路,在上述组件之间传送信息,例如总线。
存储器303可以是只读存储器(read-only memory,ROM)或可存储静态信息和指令的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)或者可存储信息和指令的其他类型的动态存储设备,也可以是电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、只读光盘(compact disc read-only memory,CD-ROM)或其他光盘存储、光碟存储(包括压缩光碟、激光碟、光碟、数字通用光碟、蓝光光碟等)、磁盘存储介质或者其他磁存储设备、或者能够用于携带或存储具有指令或数据结构形式的期望的程序代码并能够由计算机存取的任何其他介质,但不限于此。存储器可以是独立存在,通过通信线路302与处理器相连接。存储器也可以和处理器集成在一起。本申请实施例提供的存储器通常可以具有非易失性。其中,存储器303用于存储执行本申请方案所涉及的计算机执行指令,并由处理器301来控制执行。处理器301用于执行存储器303中存储的计算机执行指令,从而实现本申请实施例提供的方法。
可选的,本申请实施例中的计算机执行指令也可以称之为应用程序代码,本申请实施例对此不作具体限定。
可选的,电子设备300还包括通信接口304。通信接口304可以使用任何收发器一类的装置,用于与其他设备或通信网络通信,如以太网接口,无线接入网接口(radio access network,RAN),无线局域网接口(wireless local area networks,WLAN)等。
可选的,电子设备300还包括感知模块(图3中未示出)。该感知模块可以包括单目摄像机、双目摄像机、三目摄像机或多目摄像机。该感知模块可以用于拍摄包括第二物体的图像。
可选的,电子设备300还包括检测模块(图3中未示出)。该检测模块中部署有神经网络模型,将采集到的图像输入到神经网络模型,可以得到第二物体的2D信息。
在具体实现中,作为一种实施例,处理器301可以包括一个或多个CPU,例如图3中的CPU0和CPU1。
在具体实现中,作为一种实施例,电子设备300可以包括多个处理器,例如图3中的处理器301和处理器307。这些处理器中的每一个可以是一个单核(single-CPU)处理器,也可以是一个多核(multi-CPU)处理器。这里的处理器可以指一个或多个设备、电路、和/或用于处理数据(例如计算机程序指令)的处理核。
在具体实现中,作为一种实施例,电子设备300还可以包括输出设备305和输入设备306。输出设备305和处理器301通信,可以以多种方式来显示信息。例如,输出设备305可以是液晶显示器(liquid crystal display,LCD),发光二级管(light emitting  diode,LED)显示设备,阴极射线管(cathode ray tube,CRT)显示设备,或投影仪(projector)等。输入设备306和处理器301通信,可以以多种方式接收用户的输入。例如,输入设备306可以是鼠标、键盘、触摸屏设备或传感设备等。
本领域技术人员可以理解,图3中示出的硬件结构并不构成对目标检测装置的限定,目标检测装置可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图1-图3,以第二物体为车辆为例对本申请实施例提供的目标检测方法进行具体阐述。
需要说明的是,本申请实施例提供的目标检测方法可以应用于多个领域,例如:无人驾驶领域、自动驾驶领域、辅助驾驶领域、智能驾驶领域、网联驾驶领域、智能网联驾驶领域、汽车共享领域等。
需要说明的是,本申请下述实施例中的信息名字或信息中各参数的名字等只是一个示例,具体实现中也可以是其他的名字,本申请实施例对此不作具体限定。
需要说明的是,在本申请的描述中,“第一”、或“第二”等词汇,仅用于区分描述的目的,而不能理解为指示或暗示相对重要性,也不能理解为指示或暗示顺序。本申请中的“第一2D框”等具有不同编号的2D框,该编号仅为用于上下文行文方便,不同的次序编号本身不具有特定技术含义,比如,第一2D框,第二2D框等,可以理解为是一系列2D框中的一个或者任一个。
需要说明的是,本申请下述实施例中,“示例性的”或者“例如”等词用于表示作例子、例证或说明。本申请下述实施例中被描述为“示例性的”或者“例如”的任何实施例或设计方案不应被解释为比其他实施例或设计方案更优选或更具优势。确切而言,使用“示例性的”或者“例如”等词旨在以具体方式呈现相关概念。
可以理解的,本申请实施例中同一个步骤或者具有相同功能的步骤或者消息在不同实施例之间可以互相参考借鉴。
可以理解的,本申请实施例中,目标检测装置可以执行本申请实施例中的部分或全部步骤,这些步骤仅是示例,本申请实施例还可以执行其它步骤或者各种步骤的变形。此外,各个步骤可以按照本申请实施例呈现的不同的顺序来执行,并且有可能并非要执行本申请实施例中的全部步骤。
在本申请实施例中,目标检测方法的执行主体的具体结构,本申请实施例并未特别限定,只要可以通过运行记录有本申请实施例的目标检测方法的代码的程序,以根据本申请实施例的目标检测方法执行对应的操作即可,例如,本申请实施例提供的目标检测方法的执行主体可以是目标检测装置,或者为应用于目标检测装置中的部件,例如,芯片,本申请对此不进行限定。
如图4所示,为本申请实施例提供的一种目标检测方法,该目标检测方法包括步骤401-步骤404。
步骤401:目标检测装置获取第一物体获取的图像中第二物体的2D信息。
其中,该目标检测装置可以是图2A-图2F中所示的任一目标检测装置。目标检测装置可以部署在第一物体中。应理解,若目标检测装置为图2D、图2E或图2F中的目标检测装置,第一物体中还部署有感知模块,例如,摄像机。
其中,该图像为感知模块获取的。感知模块可以包括单目摄像机、双目摄像机、三目摄像机或多目摄像机。步骤401中的图像可以是第一物体中的感知模块拍摄的。感知模块可以是图2A-图2F中的感知模块。例如,若目标检测装置为图2A中的目标检测装置201,则该感知模块为图2A中的感知模块2011。若目标检测装置为图2E中的目标检测装置203,则该感知模块为图2E中的感知模块204。
其中,第一物体获取的图像包括一个或多个第二物体。例如,该图像可以如图5所示,该图像包括多个第二物体。
其中,第二物体的2D信息包括第二物体的第一分界线的端点的坐标信息和第二物体的类型信息。进一步的,第二物体的2D信息还包括第二物体的第一2D框的坐标信息和第二物体的面信息。
其中,第一2D框的坐标信息用于指示第二物体在该图像对应的图像坐标系中的坐标。如前所述,图像坐标系为二维坐标系,因此,第一2D框为一平面图形,例如矩形或多边形等。第一2D框为包括在图像中包围第二物体的多边形。示例性的,第一2D框可以为图5中的2D框501。第一2D框的坐标信息可以包括第一2D框中各个角的坐标。例如,若第一2D框为矩形,第一2D框的坐标信息包括该矩形的四个角的坐标。应理解,除了包括第一2D框中各个角的坐标之外,第一2D框的坐标信息还可以通过其他方式指示第二物体在该图像坐标系中的坐标,不予限制。
其中,第一分界线可以为第二物体的第一面与第二面之间的分界线。第一面和第二面中的至少一个包括在第一2D框中。示例性的,第一分界线为第二物体的后面和左面的分界线,或者第二物体的后面和右面的分界线,或者第二物体的前面和左面的分界线,或者第二物体的前面和右面的分界线等。例如,第一分界线可以为图5中的分界线502。分界线502为第二物体的后面和左面的分界线。
其中,第一分界线的端点的坐标信息用于指示第一分界线在该图像对应的图像坐标系中的坐标。示例性的,第一分界线的端点的坐标信息包括:第一分界线的下分界点的坐标,和/或,第一分界线的上分界点的坐标。其中,第一分界线的下分界点为第一分界线与第一2D框的下边界的交点。第一分界线的上分界点为第一分界线与第一2D框的上边界的交点。
可以理解的,在实际应用中,还可以通过第一分界线上的其他点的坐标指示第一分界线在图像对应的图像坐标系中的坐标。例如,第一分界线的中点的坐标。第一分界线的中点为第一分界线的上分界点和下分界点连线的中点。
其中,第二物体的类型信息用于指示该第二物体所属的类型。第二物体的类型包括以下类型中的一种或多种:两厢汽车、三厢汽车、微型车、运动型实用汽车(sport utility vehicle,SUV)、皮卡、小面包车、小型货车(敞口式)、小型货车(封闭式)、轻型卡车、重型卡车、工程车、中型客车、大型客车或双层客车。应理解,上述类型仅是第二物体的类型的示例,在实际应用中,第二物体的类型还包括其他类型,不予限制。
第二物体的类型信息可以包括第二物体所属的类型的标识。示例性的,以第二物体为两厢汽车,两厢汽车的标识为ID 2为例,第二物体的类型信息包括ID 2。
其中,第二物体的面信息用于指示上述第一面和/或第二面。示例性的,第二物体 的面信息可以包括第一面的标识和/或第二面的标识。例如,以第一分界线为图5中的分界线502,第二物体的左面的标识为ID 1,第二物体的后面的标识为ID 2为例,第二物体的面信息包括ID 1和ID 2。
需要说明的是,若目标检测装置与第二物体行驶方向之间的夹角小于或等于第一阈值时,在图像中,能看到的第二物体的一个面。如图5所示,第二物体503在图5所示的图像中能看到后面,但是看不到侧面。这种情况下,第二物体的面信息用于指示该一个面,第一分界线为第一2D框的左边界或右边界。
可以理解的,当图像中包括多个第二物体时,目标检测装置可以获取图像中多个第二物体中,一个或多个第二物体的2D信息。进一步的,当目标检测装置获取图像中多个第二物体的2D信息时,目标检测装置可以同时获取该多个第二物体的2D信息,也可以逐一获取多个第二物体中,每个第二物体的2D信息。
示例性的,目标检测装置可以通过如下两种方式获取图像中第二物体的2D信息。
一种可能的实现方式,目标检测装置根据用户的输入获取图像中第二物体的2D信息。
示例性的,以图2A所示的系统架构为例,目标检测装置201通过感知模块2011拍摄包括第二物体的图像。目标检测装置201通过人机交互界面为用户显示该图像,并接收用户输入的第二物体的2D信息。
示例性的,以图2D所示的系统架构为例,感知模块204拍摄包括第二物体的图像,并向目标检测装置203发送该图像。目标检测装置203接收到来自感知模块204的图像后,通过人机交互界面为用户显示该图像,并接收用户输入的第二物体的2D信息。
另一种可能的实现方式,目标检测装置将图像输入到神经网络模型,得到第二物体的2D信息。例如,目标检测装置将图像输入到检测模块中的神经网络模型,得到第二物体的2D信息。其中,检测模块可以是图2A-图2F中的检测模块。例如,若目标检测装置为图2B中的目标检测装置201,则该检测模块为图2B中的检测模块2012。若目标检测装置为图2F中的目标检测装置203,则该检测模块为图2F中的检测模块205。神经网络模型的介绍可以参考上述对系统架构的介绍中对应的描述。
示例性的,以图2B所示的系统架构为例,目标检测装置201通过感知模块2011拍摄包括第二物体的图像。目标检测装置201将图像输入到检测模块2012中的神经网络模型,得到第二物体的2D信息。
示例性的,以图2C所示的系统架构为例,目标检测装置201通过感知模块2011拍摄包括第二物体的图像,并向检测模块202发送该图像。检测模块202接收到该图像后,将该图像输入到神经网络模型,得到第二物体的2D信息,并向目标检测装置201发送该第二物体的2D信息。
示例性的,以图2E所示的系统架构为例,感知模块204拍摄包括第二物体的图像,并向目标检测装置203发送该图像。目标检测装置203接收到来自感知模块204的图像后,将该图像输入到神经网络模型,得到第二物体的2D信息。
示例性的,以图2F所示的系统架构为例,感知模块204拍摄包括第二物体的图像,并向检测模块205发送该图像。检测模块205接收到该图像后,将该图像输入到神经 网络模型,得到第二物体的2D信息,并向目标检测装置203发送该第二物体的2D信息。
步骤402:目标检测装置根据第二物体的2D信息获取第二物体的3D信息。
其中,第二物体的3D信息包括第二分界线的端点的坐标信息。进一步的,第二物体的3D信息还包括第二物体的3D框的坐标信息。
其中,3D框是根据第二物体的类型信息建立的第二物体的3D模型。示例性的,3D框可以是一个立体图形,如长方体。例如,3D框可以如图1所示。3D框的坐标信息用于指示3D框在3D框对应的坐标系中的坐标。3D框对应的坐标系也可以称为第二物体的坐标系。3D框的坐标信息可以包括3D框中各个角的坐标。例如,若3D框为长方体,3D框的坐标信息包括该长方体的八个角的坐标。应理解,除了包括3D框中各个角的坐标之外,3D框的坐标信息还可以通过其他方式指示3D框在3D框对应的坐标系中的坐标,不予限制。
其中,第二分界线与第一分界线对应。也就是说,第二分界线为第二物体的3D框中,第一面和第二面之间的分界线。例如,若第一分界线为该图像中,第二物体的后面和左面的分界线,则第二分界线为3D框的后面和左面的分界线;若第一分界线为该图像中,第二物体的后面和右面的分界线,则第二分界线为3D框的后面和右面的分界线;若第一分界线为该图像中,第二物体的前面和左面的分界线,则第二分界线为3D框的前面和左面的分界线;若第一分界线为该图像中,第二物体的前面和右面的分界线,则第二分界线为3D框的前面和右面的分界线。例如,第二分界线可以为图1中的分界线103。分界线103为3D框的后面和左面的分界线。其中,3D框的前面或后面为3D框的高度和3D框的宽度组成的面,3D框的左面或右面为3D框的高度和3D框的长度度组成的面。
其中,第二分界线的端点的坐标信息用于指示第二分界线在3D框对应的坐标系中的坐标。示例性的,第二分界线的端点的坐标信息包括:第二分界线的下分界点的坐标,和/或,第二分界线的上分界点的坐标。其中,第二分界线的下分界点为第二分界线与3D框的下平面的交点。第二分界线的上分界点为第二分界线与3D框的上平面的交点。
可以理解的,在实际应用中,还可以通过第二分界线上的其他点的坐标指示第二分界线在3D框对应的坐标系中的坐标。例如,第二分界线的中点的坐标。第二分界线的中点为第二分界线的上分界点和下分界点连线的中点。
目标检测装置可以通过下述示例性的两种方式获取第二物体的3D信息。
方式1:第二物体的2D信息包括第一分界线的端点的坐标信息、第一2D框的坐标信息和第二物体的类型信息,目标检测装置根据第二物体的类型信息,构建第二物体的3D框,得到3D框的坐标信息;目标检测装置根据第一2D框的坐标信息和第一分界线的端点的坐标信息,确定第一面和第二面;目标检测装置根据第一面、第二面和3D框的坐标信息,获取第二分界线的端点的坐标信息。
一种可能的实现方式,3D框的边界的长度与第二物体的类型信息对应。进一步的,3D框的长度、3D框的宽度和3D框的高度与第二物体的类型信息对应。
示例性的,以第二物体的类型包括两厢汽车、三厢汽车和SUV为例,3D框的长 度、3D框的宽度和3D框的高度与第二物体的类型信息的对应关系可以如表1所示。如表1所示,若第二物体的类型为两厢汽车,则3D框的长度为L 1,3D框的宽度为W 1,3D框的高度为H 1。若第二物体的类型为三厢汽车,则3D框的长度为L 2,3D框的宽度为W 2,3D框的高度为H 2。若第二物体的类型为SUV,则3D框的长度为L 3,3D框的宽度为W 3,3D框的高度为H 3
表1
Figure PCTCN2021087917-appb-000007
可以理解的,上述表1仅是3D框的长度、3D框的宽度和3D框的高度与第二物体的类型信息的对应关系的示例,在实际应用中,3D框的长度、3D框的宽度和3D框的高度与第二物体的类型信息的对应关系还可以是其他形式,不予限制。
可以理解的,若第一分界线与第一2D框的左边界或右边界之间的距离大于或等于第二阈值,在图像中,能看到的第二物体的至少两个面,若第一分界线与第一2D框的左边界或右边界之间的距离小于第二阈值,在图像中,能看到的第二物体的一个面。
示例性的,以第一分界线与第一2D框的左边界或右边界之间的距离大于或等于第二阈值为例,若第二物体与目标检测装置的行驶方向相同,且第一2D框在图像的右边,目标检测装置确定第一面为左面,第二面为后面,或者,第一面为后面,第二面为左面。后续,目标检测装置将3D框的左面和后面之间的分界线确定为第二分界线,并获取第二分界线的端点的坐标信息。若第二物体与目标检测装置的行驶方向相反,且第一2D框在图像的右边,目标检测装置确定第一面为右面,第二面为前面,或者,第一面为前面,第二面为右面。后续,目标检测装置将3D框的右面和前面之间的分界线确定为第二分界线,并获取第二分界线的端点的坐标信息。
示例性的,以第一分界线与第一2D框的左边界或右边界之间的距离小于第二阈值为例,若第二物体与目标检测装置的行驶方向相同,目标检测装置确定图像中显示的面为第二物体的后面。若第一分界线与第一2D框的左边界之间的距离小于第二阈值,目标检测装置将3D框的后面与左面之间的分界线确定为第二分界线,并获取第二分界线的端点的坐标信息;若第一分界线与第一2D框的右边界之间的距离小于第二阈值,目标检测装置将3D框的后面与右面之间的分界线确定为第二分界线,并获取第二分界线的端点的坐标信息。
方式2:第二物体的2D信息包括第一分界线的端点的坐标信息、第二物体的类型信息和第二物体的面信息,目标检测装置根据第二物体的类型信息,构建第二物体的3D框,得到3D框的坐标信息;目标检测装置根据第二物体的面信息和3D框的坐标信息,获取第二分界线的端点的坐标信息。
其中,目标检测装置根据第二物体的类型信息,构建第二物体的3D框,得到3D框的坐标信息的具体过程可以参考上述方式1中所述,不予赘述。
一种可能的实现方式,目标检测装置根据第二物体的面信息中指示的第一面和/或第二面确定第二分界线,并根据3D框的坐标信息获取第二分界线的端点的坐标信息。
示例性的,以第二物体的面信息包括第二物体的左面的标识和后面的标识为例,目标检测装置将3D框的左面和后面之间的分界线确定为第二分界线,并根据3D框的坐标信息获取第二分界线的端点的坐标信息。
示例性的,以第二物体的面信息包括第二物体的后面的标识为例,若第一分界线为第一2D框的左边界,目标检测装置将3D框的左面和后面之间的分界线确定为第二分界线,并根据3D框的坐标信息获取第二分界线的端点的坐标信息。若第一分界线为第一2D框的右边界,目标检测装置将3D框的右面和后面之间的分界线确定为第二分界线,并根据3D框的坐标信息获取第二分界线的端点的坐标信息。
步骤403:目标检测装置根据第一分界线的端点的坐标信息和第二分界线的端点的坐标信息,得到第一变换关系。
其中,第一变换关系为第二物体与第一物体的距离为K时,3D框对应的坐标系与第一物体的坐标系之间的变换关系。K大于0。例如,第一变换关系为第二物体与第一物体的摄像机的光心的距离为K时,3D框对应的坐标系与第一物体的坐标系之间的变换关系。
一种可能的实现方式,第二分界线上的任一点与该点在图像坐标系下的映射点满足公式1:
Figure PCTCN2021087917-appb-000008
第二物体与摄像机的距离为K的情况下,上述第二分界线上的任一点与该点在摄像机的坐标系下的映射点满足公式2:
Figure PCTCN2021087917-appb-000009
目标检测装置结合上述公式1和公式2求解可以得到s和第一变换关系,即s和T δ
其中,(X,Y,Z)为第二分界线上的任一点,例如,第二分界线的下分界点。(a,b)为第二分界线上的任一点在图像坐标系下的映射点。若(X,Y,Z)为第二分界线的下分界点,则(a,b)为第一分界线的下分界点。A和[R|T]为已知量。-表示目标检测装置在求解s和第一变换关系时,可以不关注这个值。
步骤404:目标检测装置根据第一分界线的端点的坐标信息、第二分界线的端点的坐标信息和第一变换关系,得到第二物体与第一物体的距离。
一种可能的实现方式,目标检测装置将第一坐标通过第一变换关系、内参矩阵和外参矩阵进行坐标变换,得到第二坐标;目标检测装置对第二坐标、第三坐标和K进行复核运算,得到第二物体与所述第一物体的距离。其中,内参矩阵为摄像机的内参矩阵,即内参矩阵为A。外参矩阵为摄像机的外参矩阵,即外参矩阵为[R|T]。
其中,第一坐标在第二分界线上。因为第一变换关系为第二物体与第一物体的距离为K时,3D框对应的坐标系与第一物体的坐标系之间的变换关系,所以第二坐标与摄像机的距离为K。第二坐标、第三坐标和摄像机在一条直线上。其中,第三坐标为第一分界线上与第一坐标对应的坐标。
进一步的,第二物体与第一物体的距离为K时,第一坐标(X,Y,Z)与第二坐标(x,y)满足如下公式:
Figure PCTCN2021087917-appb-000010
其中,s和T δ为上述步骤403中计算得到的值。通过上述公式可以得到第二坐标(x,y)。
示例性的,以第二物体与第一物体的光心的距离为K,第一坐标为第二分界线的下分界点的坐标为例,如图6所示,第一物体601上的A点为摄像机的光心的位置,C点的坐标为第二坐标,E点的坐标为第三坐标,也是第一分界线的下分界点的坐标。A点、C点和E点在一条直线上。D点为第一分界线的上分界点。根据点A、点C、点E和点D构建三角形ABC和三角形ADE。三角形ABC和三角形ADE为相似三角形。因此,目标检测装置根据相似三角形的性质,对第二坐标、第三坐标和K进行复合运算,可以得到第二物体与所述第一物体的距离。
例如,若点D的坐标为(x 1,y 1),点E的坐标为(x 1,y 2),点C的坐标为(x 2,y 3),可以推出点B的坐标为(x 2,y 1)。由相似三角形的性质可知:
Figure PCTCN2021087917-appb-000011
因此,
Figure PCTCN2021087917-appb-000012
由此可以得出P值。后续,目标检测装置可以将P值作为第二物体与第一物体的距离。进一步的,目标检测装置还可以根据P值以及3D框的坐标信息,得到第二物体的质心与摄像机的距离,并将该距离作为第二物体与第一物体的距离。
可选的,步骤404之后,目标检测装置可以根据第二物体与第一物体的距离进行路径规划,控制第一物体按照规划的路径行驶,从而有效规避障碍物,增加自动驾驶的舒适性和安全性。
基于图4所示的方法,目标检测装置可以获取第一物体获取的图像中第二物体的2D信息,例如,第二物体的第一分界线的端点的坐标信息和第二物体的类型信息。目标检测装置还可以获取第二物体的3D信息,例如,第二物体的第二分界线的端点的坐标信息。目标检测装置还可以根据第一分界线的端点的坐标信息和第二分界线的端点的坐标信息,得到第二物体与第一物体的距离为K时,3D框对应的坐标系与第一物体的坐标系之间的第一变换关系,并根据第一分界线的端点的坐标信息、第二分界线的端点的坐标信息和第一变换关系,得到第二物体与第一物体的距离。如此,目标检测装置根据单目摄像机采集的图像构建3D框,并利用第二物体与第一物体的距离为K时,3D框对应的坐标系与第一物体的坐标系之间的变换关系可以得到第二物体与第一物体的距离,运算成本较低。另外,通过图4所示的方法,不需要多目摄像机或深度摄像机,就可以得到第二物体与第一物体的距离,因此可以降低目标检测装置的成本。
可选的,目标检测装置还可以获取第二物体的朝向角,即第二物体与第一物体的行驶方向的夹角。具体的,如图7所示,图4所示的方法还包括步骤701-步骤703。
步骤701:目标检测装置以第二分界线为中心对3D框进行N次旋转,获取每次旋转后的3D框所对应的第二2D框。
其中,N为正整数。
一种可能的实现方式,目标检测装置以第二分界线为中心旋转3D框,每隔旋转 角α获取一次旋转后的3D框所对应的第二2D框。其中,0≤α≤360°。
示例性的,以α为60°为例,目标检测装置要分别获取对3D框旋转60°、120°、180°、240°、300°和360°时,旋转后的3D框对应的第二2D框。
另一种可能的实现方式,该N为预定义的,或目标检测装置随机确定的。对3D框进行N次旋转得到的N个旋转角中,相邻两个旋转角之差可以相同也可以不同。
示例性的,以N为7,该7个旋转角分别为0°、30°、90°、150°、200°、260°和310°为例,目标检测装置要分别获取对3D框旋转0°、30°、90°、150°、200°、260°和310°时,旋转后的3D框对应的第二2D框。
一种可能的实现方式,第二2D框是对旋转后的3D框进行坐标变换得到的。进一步的,第二2D框是通过第二变换关系、内参矩阵和外参矩阵对旋转后的3D框进行坐标变换得到的。
其中,第二变换关系为3D框对应的坐标系与第一物体的坐标系之间的变换关系。第二2D框的坐标信息用于指示第二2D框在图像坐标系中的坐标。
进一步的,第二2D框上的任一点(x,y)与该点在旋转后的3D框上的映射点(X,Y,Z)满足如下公式:
Figure PCTCN2021087917-appb-000013
其中,(X,Y,Z)、s、A、[R|T]和T δ'为已知量,由此可以得到(x,y)。
上述T δ'为第二变换关系,目标检测装置获取第二变换关系的过程如下:
一种可能的实现方式,第二变换关系是根据第一分界线的端点的坐标信息、第二分界线的端点的坐标信息和第二物体与所述第一物体的距离得到的。进一步的,目标检测装置根据第一分界线的端点的坐标信息、第二分界线的端点的坐标信息和第二物体与所述第一物体的距离,得到以第二变换关系为未知数的方程组,对该方程组求解得到第二变换关系。
示例性的,第二分界线上的任一点与该点在图像坐标系下的映射点满足公式3:
Figure PCTCN2021087917-appb-000014
上述第二分界线上的任一点与该点在摄像机的坐标系下的映射点满足公式4:
Figure PCTCN2021087917-appb-000015
目标检测装置结合上述公式3和公式4求解可以得到s和第二变换关系,即s和T δ'。
其中,(X,Y,Z)为第二分界线上的任一点,例如,第二分界线的下分界点。(a,b)为第二分界线上的任一点在图像坐标系下的映射点。若(X,Y,Z)为第二分界线的下分界点,则(a,b)为第一分界线的下分界点。A和[R|T]为已知量。P为步骤404中得到的第二物体与第一物体的距离。-表示目标检测装置在求解s和第二变换关系时,可以不关注这个值。
步骤702:目标检测装置根据N次旋转后的3D框对应的N个第二2D框和第一2D框,得到N个第二2D框中,边界与第一2D框的边界之间的距离最短的第二2D框。
一种可能的实现方式,N个第二2D框中,距离最短的第二2D框对应的第一距离与第二距离之和最小。其中,第一距离为距离最短的第二2D框的左边界与第一2D框的左边界之间的距离。第二距离为距离最短的第二2D框的右边界与第一2D框的右边界之间的距离。
请参考图8,图8为任一第二2D框与第一2D框的示意图。图8中,第二2D框801的左边界与第一2D框802的左边界之间的距离为Δa,第二2D框801的右边界与第一2D框802的右边界之间的距离为Δb。其中,N个第二2D框中,距离最短的第二2D框对应的Δa+Δb最小。即距离最短的第二2D框对应的3D框的旋转角α满足如下公式:α=argmin[Δa(α)+Δb(α)]。其中,argmin表示使得[Δa(α)+Δb(α)]达到最小时α的取值。
示例性的,以图8所示的第二2D框801对应的3D框为图1所示的3D框101为例,若3D框101中P1在3D框对应的坐标系下的坐标为(X1, Y1, Z1),P2在3D框对应的坐标系下的坐标为(X2, Y2, Z2),第二2D框801中,与P1对应的点p1在图像坐标系下的坐标为(x1, y1),与P2对应的点p2在图像坐标系下的坐标为(x2, y2),第一2D框802中,Q1在图像坐标系下的坐标为(x3, y3),Q2在图像坐标系下的坐标为(x4, y4),则(X1, Y1, Z1)和(x1, y1)满足如下公式:
$s\,[x_1,\ y_1,\ 1]^T = A\,[R|T]\,T_{\delta'}\,[X_1,\ Y_1,\ Z_1,\ 1]^T$
其中,s、A、[R|T]、Tδ'和(X1, Y1, Z1)为已知量,可以得到(x1, y1)。同理,(X2, Y2, Z2)和(x2, y2)满足如下公式:
$s\,[x_2,\ y_2,\ 1]^T = A\,[R|T]\,T_{\delta'}\,[X_2,\ Y_2,\ Z_2,\ 1]^T$
其中,s、A、[R|T]、Tδ'和(X2, Y2, Z2)为已知量,可以得到(x2, y2)。则Δa = |x1 - x3|,Δb = |x2 - x4|。
可以理解的,N值越大,目标检测装置计算距离最短的第二2D框对应的3D框的旋转角时的误差越小,得到的距离最短的第二2D框对应的3D框的旋转角越精确。
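在上述草图的基础上,步骤702中的搜索可以示意如下(first_box为第一2D框的(x_min, y_min, x_max, y_max),Δa、Δb分别取左、右边界横坐标之差的绝对值,候选角度集合为自拟示例):

```python
def best_rotation(corners, line_point, first_box, A, Rt, T_delta2, angles):
    """在候选旋转角angles中, 选出使 Δa(α)+Δb(α) 最小的旋转角,
    即其第二2D框的边界与第一2D框的边界距离最短的那次旋转。"""
    best_alpha, best_cost = None, float("inf")
    for alpha in angles:
        x_min, _, x_max, _ = second_2d_box(corners, line_point, alpha, A, Rt, T_delta2)
        cost = abs(x_min - first_box[0]) + abs(x_max - first_box[2])   # Δa + Δb
        if cost < best_cost:
            best_alpha, best_cost = alpha, cost
    return best_alpha

# 例如每隔60°取一个候选角: best_rotation(corners, line_point, first_box, A, Rt, T_delta2, range(0, 360, 60))
```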
步骤703:目标检测装置根据距离最短的第二2D框对应的3D框的旋转角,确定第二物体的朝向角。
一种可能的实现方式,目标检测装置将步骤402中的3D框对应的朝向角与距离最短的第二2D框对应的3D框的旋转角之和确定为第二物体的朝向角。
例如,若步骤402中的3D框对应的朝向角为0°,则距离最短的第二2D框对应的3D框的旋转角即为第二物体的朝向角。若步骤402中的3D框对应的朝向角为30°,距离最短的第二2D框对应的3D框的旋转角为30°,则第二物体的朝向角为60°。
可选的,步骤703之后,目标检测装置可以根据第二物体与第一物体的距离,以及第二物体的朝向角进行路径规划,控制第一物体按照规划的路径行驶,从而有效规避障碍物,增加自动驾驶的舒适性和安全性。
基于图7所示的方法,目标检测装置可以将经过旋转后的3D框对应的第二2D框与第一2D框进行比较,得到与第一2D框最接近的第二2D框,从而获取距离最短的第二2D框对应的3D框的旋转角,并根据距离最短的第二2D框对应的3D框的旋转角,确定第二物体的朝向角。如此,目标检测装置在规划路径时除了参考第二物体与第一物体的距离之外,还可以参考该第二物体的朝向角,使得规划的路径的参考性更高。
可以理解的,上述步骤402中的3D框的长度,3D框的宽度和3D框的高度是根据第二物体的类型得到的。在实际应用中,对于同一类型的车辆,不同牌子的车辆的尺寸有可能不同。因此,根据第二物体的类型得到的3D框的长度,3D框的宽度或3D框的高度有可能不准确。
可选的,目标检测装置还可以校准3D框的长度和/或3D框的宽度。具体的,如图9所示,图7所示的方法还包括步骤901-步骤903。
步骤901:目标检测装置对距离最短的第二2D框对应的3D框的边界的长度进行M次调整,获取每次调整后的3D框所对应的第三2D框。
其中,M为正整数。M为预定义的或者目标检测装置确定的。3D框的边界的长度包括3D框的长度,和/或,3D框的宽度。
可以理解的,目标检测装置对距离最短的第二2D框对应的3D框的边界的长度进行M次调整的目的是,使得调整后的3D框对应的第三2D框的边界与第一2D框的边界之差最小,如此,可以得到较为精确的3D框的边界的长度,也就可以得到较为精确的第二物体的长度和宽度。
示例性的,目标检测装置对距离最短的第二2D框对应的3D框的长度,和/或,宽度进行M次调整。例如,目标检测装置每次对距离最短的第二2D框对应的3D框的长度,和/或,宽度增加Δj,每次增加的Δj可以相同也可以不同。又例如,目标检测装置每次对距离最短的第二2D框对应的3D框的长度,和/或,宽度减少Δj,每次减少的Δj可以相同也可以不同。Δj为预定义的或者目标检测装置确定的。
可以理解的,目标检测装置每次对距离最短的第二2D框对应的3D框的长度和宽度进行调整时,对距离最短的第二2D框对应的3D框的长度和宽度的调整值可以相同也可以不同。例如,目标检测装置在一次调整中,可以将距离最短的第二2D框对应的3D框的长度增加Δj,将距离最短的第二2D框对应的3D框的宽度减少Δr。
一种可能的实现方式,第三2D框是对调整后的3D框进行坐标变换得到的。进一步的,第三2D框是通过第二变换关系、内参矩阵和外参矩阵对调整后的3D框进行坐标变换得到的。
进一步的,第三2D框上的任一点(x,y)与该点在第三2D框对应的3D框上的映射点(X,Y,Z)满足如下公式:
$s\,[x,\ y,\ 1]^T = A\,[R|T]\,T_{\delta'}\,[X,\ Y,\ Z,\ 1]^T$
其中,(X,Y,Z)、s、A、[R|T]和T δ'为已知量,由此可以得到(x,y)。
步骤902:目标检测装置根据M次调整后的3D框对应的第三2D框和第一2D框,得到M个第三2D框中,边界与第一2D框的边界之间的距离最短的第三2D框。
例如,M个第三2D框中,距离最短的第三2D框对应的第三距离与第四距离之和最小。第三距离为距离最短的第三2D框的左边界与第一2D框的左边界之间的距离。第四距离为距离最短的第三2D框的右边界与第一2D框的右边界之间的距离。
又例如,M个第三2D框中,距离最短的第三2D框对应的第三距离、第四距离和第五距离之和最小。第五距离为距离最短的第三2D框上的分界线与第一分界线之间的距离。
示例性的,以3D框的边界的长度包括3D框的长度和3D框的宽度为例,对于上述M个第三2D框,距离最短的第三2D框对应的3D框的长度l和宽度w满足如下公式:(l,w)=argmin[Δc(l,w)+Δd(l,w)+Δe(l,w)]。其中,Δc为任一个第三2D框的左边界与第一2D框的左边界之间的距离。Δd为任一个第三2D框的右边界与第一2D框的右边界之间的距离。Δe为任一个第三2D框的分界线与第一分界线之间的距离。argmin表示使得[Δc(l,w)+Δd(l,w)+Δe(l,w)]达到最小时l和w的取值。
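步骤901至步骤902中对长度l和宽度w的搜索可以用如下草图示意(third_2d_box(l, w)为自拟的假设函数,表示按给定长宽重新生成3D框并投影后,第三2D框的左边界横坐标、右边界横坐标及其分界线横坐标;first_box、first_line_x分别为第一2D框和第一分界线的对应量):

```python
def best_size(l_candidates, w_candidates, first_box, first_line_x, third_2d_box):
    """在候选长、宽组合中选出使 Δc+Δd+Δe 最小的 (l, w)。"""
    best_lw, best_cost = None, float("inf")
    for l in l_candidates:
        for w in w_candidates:
            x_min, x_max, line_x = third_2d_box(l, w)
            cost = (abs(x_min - first_box[0])       # Δc: 左边界之差
                    + abs(x_max - first_box[2])     # Δd: 右边界之差
                    + abs(line_x - first_line_x))   # Δe: 分界线之差
            if cost < best_cost:
                best_lw, best_cost = (l, w), cost
    return best_lw

# 候选值可按步骤901中的Δj生成, 例如 l_candidates = [l0 + k * dj for k in range(-3, 4)]
```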
步骤903:目标检测装置根据距离最短的第三2D框对应的3D框的边界的长度,确定第二物体的边界的长度。
一种可能的实现方式,目标检测装置将距离最短的第三2D框对应的3D框的边界的长度,确定为第二物体的边界的长度。
示例性的,目标检测装置将距离最短的第三2D框对应的3D框的长度确定为第二物体的长度;将距离最短的第三2D框对应的3D框的宽度确定为第二物体的宽度;将步骤402中的3D框的高度确定为第二物体的高度。
可以理解的,M值越大,目标检测装置确定第二物体的边界的长度的误差越小,得到的第二物体的边界的长度越精确。
可选的,步骤903后,目标检测装置还可以根据第二物体的边界的长度计算第二物体的质心。
基于图9所示的方法,目标检测装置可以将调整后的3D框对应的第三2D框与第一2D框进行比较,得到与第一2D框最接近的第三2D框,从而获取第二物体的边界的长度。如此,目标检测装置可以获取到第二物体的更为准确的尺寸。
可以理解的是,上述目标检测装置等为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的单元及算法操作,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对目标检测装置进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
比如,以采用集成的方式划分各个功能模块的情况下,图10示出了一种目标检测装置的结构示意图。该目标检测装置可以用于执行上述实施例中涉及的目标检测装置的功能。
作为一种可能的实现方式,图10所示的目标检测装置包括:获取单元1001和确定单元1002。
获取单元1001,用于获取第一物体获取的图像中第二物体的二维2D信息;该第二物体的2D信息包括:该第二物体的第一分界线的端点的坐标信息和第二物体的类型信息,该第一分界线为该第二物体的第一面与第二面之间的分界线,该第一面和该第二面中至少一个面包括在该第二物体的第一2D框中,该第一2D框为包括在该图像中包围第二物体的多边形。例如,结合图4,获取单元1001用于执行步骤401。
获取单元1001,还用于根据第二物体的2D信息获取该第二物体的三维3D信息;该第二物体的3D信息包括第二分界线的端点的坐标信息,该第二分界线为该第二物体的3D框中,该第一面和该第二面之间的分界线,3D框为第二物体的3D模型,3D框的边界的长度与第二物体的类型信息对应。例如,结合图4,获取单元1001还用于执行步骤402。
确定单元1002,用于根据该第一分界线的端点的坐标信息和该第二分界线的端点的坐标信息,得到第一变换关系;该第一变换关系为该第二物体与该第一物体的距离为K时,该3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系,该K大于0。例如,结合图4,确定单元1002用于执行步骤403。
确定单元1002,还用于根据该第一分界线的端点的坐标信息、该第二分界线的端点的坐标信息和该第一变换关系,得到该第二物体与第一物体的距离。例如,结合图4,确定单元1002还用于执行步骤404。
一种可能的实现方式,该第二物体的2D信息还包括:该第一2D框的坐标信息;获取单元1001,具体用于根据该第二物体的类型信息构建该3D框,得到3D框的坐标信息;获取单元1001,还具体用于根据该第一2D框的坐标信息和该第一分界线的端点的坐标信息,确定该第一面和该第二面;获取单元1001,还具体用于根据该第一面、该第二面和3D框的坐标信息,获取该第二分界线的端点的坐标信息。
一种可能的实现方式,该第二物体的2D信息还包括:该第二物体的面信息,该第二物体的面信息用于指示该第一面和该第二面;获取单元1001,具体用于根据该第二物体的类型信息构建该3D框,得到3D框的坐标信息;获取单元1001,还具体用于根据该第二物体的面信息和3D框的坐标信息,获取该第二分界线的端点的坐标信息。
一种可能的实现方式,确定单元1002,具体用于将第一坐标通过该第一变换关系、内参矩阵和外参矩阵进行坐标变换,得到第二坐标,该第一坐标在该第二分界线上,该第二坐标与该第一物体的距离为K,该第二坐标、第三坐标与该第一物体在一条直线上,该第三坐标为该第一分界线上与该第一坐标对应的坐标,该内参矩阵为拍摄该图像的设备的内参矩阵,该外参矩阵为该设备的外参矩阵;确定单元1002,还具体用于对该第二坐标、该第三坐标和该K进行复合运算,得到第二物体与所述第一物体的距离。
一种可能的实现方式,确定单元1002,还具体用于根据该第一分界线的端点的坐标信息、该第二分界线的端点的坐标信息和第二物体与所述第一物体的距离,得到第二变换关系;该第二变换关系为该3D框对应的三维坐标系与该第一物体的三维坐标系之间的变换关系。
一种可能的实现方式,获取单元1001,还用于以该第二分界线为中心对该3D框进行N次旋转,获取每次旋转后的3D框所对应的第二2D框,该第二2D框是对该旋转后的3D框进行坐标变换得到的,该N为正整数;确定单元1002,还用于根据N次旋转后的3D框对应的N个第二2D框和该第一2D框,得到该N个第二2D框中,边界与该第一2D框的边界之间的距离最短的第二2D框;确定单元1002,还用于根据该距离最短的第二2D框对应的3D框的旋转角,确定该第二物体的朝向角。
一种可能的实现方式,该第二2D框是对该旋转后的3D框进行坐标变换得到的,包括:该第二2D框是通过该第二变换关系、内参矩阵和外参矩阵对该旋转后的3D框进行坐标变换得到的,该内参矩阵为拍摄该图像的设备的内参矩阵,该外参矩阵为该设备的外参矩阵。
一种可能的实现方式,获取单元1001,还用于对该距离最短的第二2D框对应的3D框的边界的长度进行M次调整,获取每次调整后的3D框所对应的第三2D框,该第三2D框是对该调整后的3D框进行坐标变换得到的,该M为正整数;确定单元1002,还用于根据M次调整后的3D框对应的M个第三2D框和该第一2D框,得到该M个第三2D框中,边界与该第一2D框的边界之间的距离最短的第三2D框;确定单元1002,还用于根据该距离最短的第三2D框对应的3D框的边界的长度,确定该第二物体的边界的长度。
一种可能的实现方式,该第三2D框是对该调整后的3D框进行坐标变换得到的,包括:该第三2D框是通过该第二变换关系、内参矩阵和外参矩阵对该调整后的3D框进行坐标变换得到的,该内参矩阵为拍摄该图像的设备的内参矩阵,该外参矩阵为该设备的外参矩阵。
一种可能的实现方式,获取单元1001,具体用于将该图像输入到神经网络模型,得到该第二物体的2D信息。
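若以软件形式组织上述功能模块,获取单元1001与确定单元1002的职责划分可以用如下骨架示意(类名、方法名均为自拟,仅为说明性示例):

```python
class TargetDetector:
    """图10所示目标检测装置的一个示意性软件骨架, 方法与图4中的步骤对应。"""

    def acquire_2d_info(self, image):
        """获取单元1001: 将图像输入神经网络模型, 得到第二物体的2D信息(步骤401)。"""
        raise NotImplementedError

    def acquire_3d_info(self, info_2d):
        """获取单元1001: 根据2D信息构建3D框, 得到第二分界线端点的坐标信息(步骤402)。"""
        raise NotImplementedError

    def first_transform(self, info_2d, info_3d, K):
        """确定单元1002: 根据两条分界线端点的坐标信息求解第一变换关系(步骤403)。"""
        raise NotImplementedError

    def distance(self, info_2d, info_3d, transform):
        """确定单元1002: 根据第一变换关系得到第二物体与第一物体的距离(步骤404)。"""
        raise NotImplementedError
```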
其中,上述方法实施例涉及的各操作的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。
在本实施例中,该目标检测装置以采用集成的方式划分各个功能模块的形式来呈现。这里的“模块”可以指特定ASIC,电路,执行一个或多个软件或固件程序的处理器和存储器,集成逻辑电路,和/或其他可以提供上述功能的器件。在一个简单的实施例中,本领域的技术人员可以想到该目标检测装置可以采用图3所示的形式。
比如,图3中的处理器301可以通过调用存储器303中存储的计算机执行指令,使得目标检测装置执行上述方法实施例中的目标检测方法。
示例性的,图10中的获取单元1001和确定单元1002的功能/实现过程可以通过图3中的处理器301调用存储器303中存储的计算机执行指令来实现。
由于本实施例提供的目标检测装置可执行上述的目标检测方法,因此其所能获得的技术效果可参考上述方法实施例,在此不再赘述。
图11为本申请实施例提供的一种芯片的结构示意图。芯片110包括一个或多个处理器1101以及接口电路1102。可选的,所述芯片110还可以包含总线1103。其中:
处理器1101可能是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器1101中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1101可以是通用处理器、数字信号处理器(DSP)、专用集成电路(ASIC)、现场可编程门阵列(FPGA)或者其它可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件,可以实现或者执行本申请实施例中公开的各方法、步骤。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
接口电路1102用于数据、指令或者信息的发送或者接收。处理器1101可以利用接口电路1102接收的数据、指令或者其它信息,进行加工,可以将加工完成信息通过接口电路1102发送出去。
可选的,芯片110还包括存储器,存储器可以包括只读存储器和随机存取存储器,并向处理器提供操作指令和数据。存储器的一部分还可以包括非易失性随机存取存储器(NVRAM)。
可选的,存储器存储了可执行软件模块或者数据结构,处理器可以通过调用存储器存储的操作指令(该操作指令可存储在操作系统中),执行相应的操作。
可选的,芯片110可以使用在本申请实施例涉及的目标检测装置中。可选的,接口电路1102可用于输出处理器1101的执行结果。关于本申请的一个或多个实施例提供的目标检测方法可参考前述各个实施例,这里不再赘述。
需要说明的,处理器1101、接口电路1102各自对应的功能既可以通过硬件设计实现,也可以通过软件设计来实现,还可以通过软硬件结合的方式来实现,这里不作限制。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将装置的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块或单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个装置,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是一个物理单元或多个物理单元,即可以位于一个地方,或者也可以分布到多个不同地方。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该软件产品存储在一个存储介质中,包括若干指令用以使得一个设备(可以是单片机,芯片等)或处理器(processor)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (23)

  1. 一种目标检测方法,其特征在于,所述方法包括:
    获取第一物体获取的图像中第二物体的2D信息;所述第二物体的2D信息包括:所述第二物体的第一分界线的端点的坐标信息和所述第二物体的类型信息,所述第一分界线为所述第二物体的第一面与第二面之间的分界线,所述第一面和所述第二面中至少一个面包括在所述第二物体的第一2D框中,所述第一2D框为包括在所述图像中包围所述第二物体的多边形;
    根据所述第二物体的2D信息获取所述第二物体的3D信息;所述第二物体的3D信息包括第二分界线的端点的坐标信息,所述第二分界线为所述第二物体的3D框中,所述第一面和所述第二面之间的分界线,所述3D框为所述第二物体的3D模型,所述3D框的边界的长度与所述第二物体的类型信息对应;
    根据所述第一分界线的端点的坐标信息和所述第二分界线的端点的坐标信息,得到第一变换关系;所述第一变换关系为所述第二物体与所述第一物体的距离为K时,所述3D框对应的三维坐标系与所述第一物体的三维坐标系之间的变换关系,所述K大于0;
    根据所述第一分界线的端点的坐标信息、所述第二分界线的端点的坐标信息和所述第一变换关系,得到所述第二物体与所述第一物体的距离。
  2. 根据权利要求1所述的方法,其特征在于,所述第二物体的2D信息还包括:所述第一2D框的坐标信息;
    所述根据所述第二物体的2D信息获取所述第二物体的3D信息,包括:
    根据所述第二物体的类型信息构建所述3D框,得到所述3D框的坐标信息;
    根据所述第一2D框的坐标信息和所述第一分界线的端点的坐标信息,确定所述第一面和所述第二面;
    根据所述第一面、所述第二面和所述3D框的坐标信息,获取所述第二分界线的端点的坐标信息。
  3. 根据权利要求1所述的方法,其特征在于,所述第二物体的2D信息还包括:所述第二物体的面信息,所述第二物体的面信息用于指示所述第一面和所述第二面;
    所述根据所述第二物体的2D信息获取所述第二物体的3D信息,包括:
    根据所述第二物体的类型信息构建所述3D框,得到所述3D框的坐标信息;
    根据所述第二物体的面信息和所述3D框的坐标信息,获取所述第二分界线的端点的坐标信息。
  4. 根据权利要求1-3中任一项所述的方法,其特征在于,所述根据所述第一分界线的端点的坐标信息、所述第二分界线的端点的坐标信息和所述第一变换关系,得到所述第二物体与所述第一物体的距离,包括:
    将第一坐标通过所述第一变换关系、内参矩阵和外参矩阵进行坐标变换,得到第二坐标,所述第一坐标在所述第二分界线上,所述第二坐标与所述第一物体的距离为K,所述第二坐标、第三坐标与所述第一物体在一条直线上,所述第三坐标为所述第一分界线上与所述第一坐标对应的坐标,所述内参矩阵为拍摄所述图像的设备的内参矩阵,所述外参矩阵为所述设备的外参矩阵;
    对所述第二坐标、所述第三坐标和所述K进行复合运算,得到所述第二物体与所述第一物体的距离。
  5. 根据权利要求1-4中任一项所述的方法,其特征在于,所述方法还包括:
    根据所述第一分界线的端点的坐标信息、所述第二分界线的端点的坐标信息和所述第二物体与所述第一物体的距离,得到第二变换关系;所述第二变换关系为所述3D框对应的三维坐标系与所述第一物体的三维坐标系之间的变换关系。
  6. 根据权利要求5所述的方法,其特征在于,所述方法还包括:
    以所述第二分界线为中心对所述3D框进行N次旋转,获取每次旋转后的3D框所对应的第二2D框,所述第二2D框是对所述旋转后的3D框进行坐标变换得到的,所述N为正整数;
    根据N次旋转后的3D框对应的N个第二2D框和所述第一2D框,得到所述N个第二2D框中,边界与所述第一2D框的边界之间的距离最短的第二2D框;
    根据所述距离最短的第二2D框对应的3D框的旋转角,确定所述第二物体的朝向角。
  7. 根据权利要求6所述的方法,其特征在于,所述第二2D框是对所述旋转后的3D框进行坐标变换得到的,包括:
    所述第二2D框是通过所述第二变换关系、内参矩阵和外参矩阵对所述旋转后的3D框进行坐标变换得到的,所述内参矩阵为拍摄所述图像的设备的内参矩阵,所述外参矩阵为所述设备的外参矩阵。
  8. 根据权利要求6或7所述的方法,其特征在于,所述方法还包括:
    对所述距离最短的第二2D框对应的3D框的边界的长度进行M次调整,获取每次调整后的3D框所对应的第三2D框,所述第三2D框是对所述调整后的3D框进行坐标变换得到的,所述M为正整数;
    根据M次调整后的3D框对应的M个第三2D框和所述第一2D框,得到所述M个第三2D框中,边界与所述第一2D框的边界之间的距离最短的第三2D框;
    根据所述距离最短的第三2D框对应的3D框的边界的长度,确定所述第二物体的边界的长度。
  9. 根据权利要求8所述的方法,其特征在于,所述第三2D框是对所述调整后的3D框进行坐标变换得到的,包括:
    所述第三2D框是通过所述第二变换关系、内参矩阵和外参矩阵对所述调整后的3D框进行坐标变换得到的,所述内参矩阵为拍摄所述图像的设备的内参矩阵,所述外参矩阵为所述设备的外参矩阵。
  10. 根据权利要求1-9中任一项所述的方法,其特征在于,所述获取第一物体获取的图像中第二物体的2D信息,包括:
    将所述图像输入到神经网络模型,得到所述第二物体的2D信息。
  11. 一种目标检测装置,其特征在于,所述目标检测装置包括:获取单元和确定单元;
    所述获取单元,用于获取第一物体获取的图像中第二物体的2D信息;所述第二物体的2D信息包括:所述第二物体的第一分界线的端点的坐标信息和所述第二物体的类型信息,所述第一分界线为所述第二物体的第一面与第二面之间的分界线,所述第一面和所述第二面中至少一个面包括在所述第二物体的第一2D框中,所述第一2D框为包括在所述图像中包围所述第二物体的多边形;
    所述获取单元,还用于根据所述第二物体的2D信息获取所述第二物体的三维3D信息;所述第二物体的3D信息包括第二分界线的端点的坐标信息,所述第二分界线为所述第二物体的3D框中,所述第一面和所述第二面之间的分界线,所述3D框为所述第二物体的3D模型,所述3D框的边界的长度与所述第二物体的类型信息对应;
    所述确定单元,用于根据所述第一分界线的端点的坐标信息和所述第二分界线的端点的坐标信息,得到第一变换关系;所述第一变换关系为所述第二物体与所述第一物体的距离为K时,所述3D框对应的三维坐标系与所述第一物体的三维坐标系之间的变换关系,所述K大于0;
    所述确定单元,还用于根据所述第一分界线的端点的坐标信息、所述第二分界线的端点的坐标信息和所述第一变换关系,得到所述第二物体与所述第一物体的距离。
  12. 根据权利要求11所述的目标检测装置,其特征在于,所述第二物体的2D信息还包括:所述第一2D框的坐标信息;
    所述获取单元,具体用于根据所述第二物体的类型信息构建所述3D框,得到所述3D框的坐标信息;
    所述获取单元,还具体用于根据所述第一2D框的坐标信息和所述第一分界线的端点的坐标信息,确定所述第一面和所述第二面;
    所述获取单元,还具体用于根据所述第一面、所述第二面和所述3D框的坐标信息,获取所述第二分界线的端点的坐标信息。
  13. 根据权利要求11所述的目标检测装置,其特征在于,所述第二物体的2D信息还包括:所述第二物体的面信息,所述第二物体的面信息用于指示所述第一面和所述第二面;
    所述获取单元,具体用于根据所述第二物体的类型信息构建所述3D框,得到所述3D框的坐标信息;
    所述获取单元,还具体用于根据所述第二物体的面信息和所述3D框的坐标信息,获取所述第二分界线的端点的坐标信息。
  14. 根据权利要求11-13中任一项所述的目标检测装置,其特征在于,
    所述确定单元,具体用于将第一坐标通过所述第一变换关系、内参矩阵和外参矩阵进行坐标变换,得到第二坐标,所述第一坐标在所述第二分界线上,所述第二坐标与所述第一物体的距离为K,所述第二坐标、第三坐标与所述第一物体在一条直线上,所述第三坐标为所述第一分界线上与所述第一坐标对应的坐标,所述内参矩阵为拍摄所述图像的设备的内参矩阵,所述外参矩阵为所述设备的外参矩阵;
    所述确定单元,还具体用于对所述第二坐标、所述第三坐标和所述K进行复合运算,得到所述第二物体与所述第一物体的距离。
  15. 根据权利要求11-14中任一项所述的目标检测装置,其特征在于,
    所述确定单元,还用于根据所述第一分界线的端点的坐标信息、所述第二分界线的端点的坐标信息和所述第二物体与所述第一物体的距离,得到第二变换关系;所述第二变换关系为所述3D框对应的三维坐标系与所述第一物体的三维坐标系之间的变换关系。
  16. 根据权利要求15所述的目标检测装置,其特征在于,
    所述获取单元,还用于以所述第二分界线为中心对所述3D框进行N次旋转,获取每次旋转后的3D框所对应的第二2D框,所述第二2D框是对所述旋转后的3D框进行坐标变换得到的,所述N为正整数;
    所述确定单元,还用于根据N次旋转后的3D框对应的N个第二2D框和所述第一2D框,得到所述N个第二2D框中,边界与所述第一2D框的边界之间的距离最短的第二2D框;
    所述确定单元,还用于根据所述距离最短的第二2D框对应的3D框的旋转角,确定所述第二物体的朝向角。
  17. 根据权利要求16所述的目标检测装置,其特征在于,所述第二2D框是对所述旋转后的3D框进行坐标变换得到的,包括:
    所述第二2D框是通过所述第二变换关系、内参矩阵和外参矩阵对所述旋转后的3D框进行坐标变换得到的,所述内参矩阵为拍摄所述图像的设备的内参矩阵,所述外参矩阵为所述设备的外参矩阵。
  18. 根据权利要求16或17所述的目标检测装置,其特征在于,
    所述获取单元,还用于对所述距离最短的第二2D框对应的3D框的边界的长度进行M次调整,获取每次调整后的3D框所对应的第三2D框,所述第三2D框是对所述调整后的3D框进行坐标变换得到的,所述M为正整数;
    所述确定单元,还用于根据M次调整后的3D框对应的M个第三2D框和所述第一2D框,得到所述M个第三2D框中,边界与所述第一2D框的边界之间的距离最短的第三2D框;
    所述确定单元,还用于根据所述距离最短的第三2D框对应的3D框的边界的长度,确定所述第二物体的边界的长度。
  19. 根据权利要求18所述的目标检测装置,其特征在于,所述第三2D框是对所述调整后的3D框进行坐标变换得到的,包括:
    所述第三2D框是通过所述第二变换关系、内参矩阵和外参矩阵对所述调整后的3D框进行坐标变换得到的,所述内参矩阵为拍摄所述图像的设备的内参矩阵,所述外参矩阵为所述设备的外参矩阵。
  20. 根据权利要求11-19中任一项所述的目标检测装置,其特征在于,
    所述获取单元,具体用于将所述图像输入到神经网络模型,得到所述第二物体的2D信息。
  21. 一种智能驾驶车辆,其特征在于,包括:如权利要求11-20中任一项所述的目标检测装置。
  22. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有计算机程序代码,所述计算机程序代码被处理电路执行时实现如权利要求1-10中任一项所述的目标检测方法。
  23. 一种芯片,其特征在于,所述芯片包括处理器,所述处理器与存储器耦合,所述存储器用于存储程序或指令,当所述程序或指令被所述处理器执行时,使得所述芯片执行如权利要求1至10中任一项所述的目标检测方法。
PCT/CN2021/087917 2020-08-12 2021-04-16 目标检测方法及装置 WO2022033066A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010806077.3 2020-08-12
CN202010806077.3A CN114078247A (zh) 2020-08-12 2020-08-12 目标检测方法及装置

Publications (1)

Publication Number Publication Date
WO2022033066A1 true WO2022033066A1 (zh) 2022-02-17

Family

ID=80247623

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/087917 WO2022033066A1 (zh) 2020-08-12 2021-04-16 目标检测方法及装置

Country Status (2)

Country Link
CN (1) CN114078247A (zh)
WO (1) WO2022033066A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115880470A (zh) * 2023-03-08 2023-03-31 深圳佑驾创新科技有限公司 3d图像数据的生成方法、装置、设备及存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272934A (zh) * 2022-08-01 2022-11-01 京东方科技集团股份有限公司 跑动距离估算方法、装置、电子设备及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140139635A1 (en) * 2012-09-17 2014-05-22 Nec Laboratories America, Inc. Real-time monocular structure from motion
CN109544633A (zh) * 2017-09-22 2019-03-29 华为技术有限公司 目标测距方法、装置及设备
CN110969064A (zh) * 2018-09-30 2020-04-07 北京四维图新科技股份有限公司 一种基于单目视觉的图像检测方法、装置及存储设备


Also Published As

Publication number Publication date
CN114078247A (zh) 2022-02-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21855122; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21855122; Country of ref document: EP; Kind code of ref document: A1)