WO2023155580A1 - Object recognition method and apparatus - Google Patents

Object recognition method and apparatus

Info

Publication number
WO2023155580A1
WO2023155580A1 PCT/CN2022/139873 CN2022139873W
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
coordinates
image
category
cluster
Application number
PCT/CN2022/139873
Other languages
French (fr)
Chinese (zh)
Inventor
张宝丰
Original Assignee
京东鲲鹏(江苏)科技有限公司
Application filed by 京东鲲鹏(江苏)科技有限公司 filed Critical 京东鲲鹏(江苏)科技有限公司
Publication of WO2023155580A1 publication Critical patent/WO2023155580A1/en

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/89Radar or analogous systems specially adapted for specific applications for mapping or imaging

Definitions

  • the present disclosure relates to the technical field of automatic driving, and in particular to an object recognition method and device.
  • Autonomous driving combines artificial intelligence, machine vision, radar, navigation and positioning, communication, and other technologies so that a vehicle can drive automatically and safely under a computer control system without any active human operation.
  • in existing autonomous-driving target detection, in order to identify targets more accurately, the learning results of multiple machine learning models are usually fused to determine the final target; alternatively, multiple machine learning models are fused, and the final target is determined from the output of the fused model.
  • the embodiments of the present disclosure provide an object recognition method and device that can use multiple kinds of simulated feature data as the input of a detection model; because the feature data are complementary, the input of the model is more multi-dimensional, the accuracy of the recognition result determined from the model output is greatly improved, and each target object in the automatic driving scene can be accurately identified, thereby providing a reference for the simulation of automatic driving.
  • an object recognition method, including: collecting multiple frames of point clouds and multiple frames of images in the detection area, where the point cloud includes multiple point cloud coordinates and each frame of the image corresponds to one or more image instances; constructing a mapping between the point cloud and the image instances according to a preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image; determining, according to the mapping result, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determining the center coordinates of the point cloud cluster; and inputting the point cloud coordinates, category coordinates, and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
  • the determining of the point cloud cluster corresponding to the image instance includes: obtaining the current frame point cloud and the previous frame image; and, in the current frame point cloud, forming the point cloud cluster from the multiple point cloud coordinates to which any image instance included in the previous frame image is mapped.
  • forming the point cloud cluster from the multiple point cloud coordinates to which any image instance included in the previous frame image is mapped includes: projecting the current frame point cloud to the second coordinate system; and forming the point cloud cluster from the projected point cloud coordinates corresponding to the multiple pixels of any image instance included in the previous frame image.
  • the image instance indicates category information; the determining of the category coordinates of the point cloud cluster includes: determining the category coordinates of the point cloud cluster corresponding to the image instance according to the category information of the image instance included in the previous frame image.
  • the method also includes: splicing the point cloud coordinates, category coordinates, and center coordinates of the point cloud; the inputting of the point cloud coordinates, category coordinates, and center coordinates of the point cloud into the preset detection model includes: inputting the spliced result into the detection model.
  • the splicing of the point cloud coordinates, category coordinates, and center coordinates of the point cloud includes: for each point cloud coordinate of the same point cloud cluster, splicing the category coordinates and the center coordinates to that point cloud coordinate.
  • the method further includes: for the point cloud coordinates in the point cloud other than those of the point cloud clusters, determining the category coordinates and center coordinates of those point cloud coordinates to be the default category coordinates and default center coordinates, and splicing the default category coordinates and the default center coordinates to those point cloud coordinates.
  • the method further includes denoising the point cloud by clustering; the step of determining the center coordinates of the point cloud clusters is then performed on the point cloud after the noise reduction processing.
  • the point cloud and the image are respectively obtained by a radar sensor and a camera, wherein the radar sensor and the camera collect synchronously.
  • the category information is mapped to corresponding binary coordinates; determining the category coordinates of the point cloud cluster corresponding to the image instance includes: determining the binary coordinates of the category information of the image instance as the category coordinates of the point cloud cluster.
  • an object recognition device, including: a collection module configured to collect multiple frames of point clouds and multiple frames of images in the detection area, where the point cloud includes multiple point cloud coordinates and each frame of the image corresponds to one or more image instances; a mapping module configured to construct a mapping between the point cloud and the image instances according to a preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image; a data processing module configured to determine, according to the mapping result, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and to determine the center coordinates of the point cloud cluster; and a recognition module configured to input the point cloud coordinates, category coordinates, and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
  • an object recognition electronic device, including:
  • one or more processors;
  • a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the object recognition method provided in the present disclosure.
  • a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, the object recognition method provided in the present disclosure is implemented.
  • An embodiment of the above disclosure has the following advantages or beneficial effects: by exploiting the real-time availability of the current frame point cloud and the previous frame image, the current frame point cloud is converted to the camera coordinate system, thereby obtaining the category coordinates of the point cloud cluster corresponding to the image instance; the center coordinates of the point cloud cluster are calculated, the point cloud coordinates, category coordinates, and center coordinates are spliced as the input of the detection model, and the target object is identified from the output of the detection model. This overcomes the technical problem that existing target recognition results have low accuracy and cannot provide a reference for the simulation of automatic driving.
  • FIG. 1 is a schematic diagram of the main flow of an object recognition method according to an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of the main flow of a method for determining a point cloud cluster according to an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of the main flow of a method for determining category coordinates of a point cloud according to an embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of the main flow of a method for determining the center coordinates of a point cloud according to an embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of the main flow of a point cloud coordinate splicing method according to an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of main modules of an object recognition device according to an embodiment of the disclosure.
  • FIG. 7 shows an exemplary system architecture diagram of an object recognition method or an object recognition device suitable for application in an embodiment of the present disclosure
  • Fig. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present disclosure.
  • Fig. 1 is a schematic diagram of the main flow of an object recognition method according to an embodiment of the present disclosure. As shown in Fig. 1, the object recognition method of the present disclosure includes the following steps:
  • Unmanned driving integrates sensors, computers, artificial intelligence, communications, navigation and positioning, pattern recognition, machine vision, intelligent control and other cutting-edge disciplines, and can achieve environmental perception, navigation and positioning, path planning, decision-making control and other goals.
  • Driverless vehicles use sensor technology, signal processing technology, communication technology, and computer technology; by integrating various on-board sensors such as cameras, lidars, ultrasonic sensors, microwave radars, GPS, odometers, and magnetic compasses, they identify the environment and state in which the vehicle is located, analyze and make judgments according to the obtained road information, traffic signal information, vehicle location information, and obstacle information, and control the driving path of the vehicle, thereby realizing human-like driving.
  • Step S101 collecting multi-frame point clouds and multi-frame images in the detection area; wherein, the point cloud includes a plurality of point cloud coordinates, and each frame of the image corresponds to one or more image instances.
  • the point cloud and the image are respectively obtained by the radar sensor and the camera, and the radar sensor and the camera collect synchronously during the driving of the vehicle.
  • the image is the output obtained by feeding the picture captured by the camera into an image instance segmentation model; therefore, each frame of image includes one or more image instances.
  • the object recognition server of the present disclosure is equipped with an image instance segmentation model; wherein, the image instance segmentation model can adopt methods such as Mask-RCNN, RetinaMask, CenterMask, DeepMask, PANet, and YOLACT.
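  • As a non-limiting illustration, the following is a minimal sketch of producing image instances with an off-the-shelf instance segmentation model (Mask R-CNN, one of the methods named above). The torchvision model, weights, and score threshold are assumptions for illustration, not details fixed by this disclosure.

```python
import torch
import torchvision

# Pretrained Mask R-CNN as the image instance segmentation model (assumed choice).
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def segment_frame(image_chw: torch.Tensor, score_thresh: float = 0.5):
    """image_chw: float tensor (3, H, W) in [0, 1] from one camera frame.
    Returns per-instance boolean masks and category labels."""
    with torch.no_grad():
        out = model([image_chw])[0]          # dict with boxes, labels, scores, masks
    keep = out["scores"] > score_thresh      # drop low-confidence instances
    masks = out["masks"][keep, 0] > 0.5      # (N, H, W) boolean instance masks
    labels = out["labels"][keep]             # (N,) category ids
    return masks, labels
```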
  • Step S102 constructing a mapping between the point cloud and the image instance according to the preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image.
  • the first coordinate system is a point cloud coordinate system, or called a radar coordinate system;
  • the second coordinate system is a camera coordinate system, or called an image coordinate system.
  • the point cloud includes multiple point cloud coordinates, and the point cloud coordinates are usually three-dimensional coordinates, such as (x, y, z);
  • the image includes multiple pixel points, and the pixel points are usually two-dimensional coordinates.
  • one pixel can usually map to one or more point cloud coordinates.
  • the mapping between point cloud coordinates and pixels is realized through the camera's intrinsic and extrinsic parameters.
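  • A minimal sketch of this conversion is given below: point cloud coordinates in the first (radar) coordinate system are projected into pixels of the second (image) coordinate system using the camera's extrinsic matrix T and intrinsic matrix K. Both matrices are assumed to come from prior sensor calibration; the function and variable names are illustrative only.

```python
import numpy as np

def project_points(points_xyz: np.ndarray, T: np.ndarray, K: np.ndarray):
    """points_xyz: (N, 3) radar-frame coordinates; T: (4, 4) extrinsics
    (radar frame -> camera frame); K: (3, 3) intrinsics.
    Returns (N, 2) pixel coordinates and a mask of points in front of the camera."""
    pts_h = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])  # homogeneous
    cam = (T @ pts_h.T).T[:, :3]      # coordinates in the camera frame
    in_front = cam[:, 2] > 0          # only points with positive depth project
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]       # perspective division -> pixel coordinates
    return uv, in_front
```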
  • Step S103 determine the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determine the center coordinates of the point cloud cluster.
  • the method for determining the point cloud cluster of the present disclosure includes the following steps:
  • Step S201 acquiring the point cloud of the current frame and the image of the previous frame.
  • on the one hand, the method for determining the point cloud cluster in the present disclosure does not need to wait for the image instance segmentation model to finish running in order to obtain the current frame image; it directly obtains the previous frame image and the current frame point cloud for subsequent processing, which ensures the real-time performance of object recognition. On the other hand, the interval between the current frame and the previous frame is negligible; therefore, when converting between the first coordinate system of the point cloud and the second coordinate system of the image, the mapping between the current frame point cloud and the previous frame image remains good.
  • the current frame point cloud and the current frame image can also be used for subsequent processing.
  • in that case, however, the real-time performance is slightly worse, since it is necessary to wait for the processing result of the image instance segmentation model in order to obtain the current frame image.
  • the object recognition server can save the image of each frame, so that the point cloud of the current frame and the image of the previous frame can be obtained in real time when the point cloud cluster is determined.
  • Step S202 in the current frame point cloud, the plurality of point cloud coordinates to which any image instance included in the previous frame image is mapped form a point cloud cluster.
  • Step S2021 project the point cloud of the current frame to the second coordinate system.
  • the point cloud of the current frame is projected into the second coordinate system according to the conversion relationship between the first coordinate system and the second coordinate system.
  • Step S2022 form the point cloud cluster from the projected point cloud coordinates corresponding to the plurality of pixels of any image instance included in the previous frame image.
  • the point cloud cluster includes multiple points corresponding to the pixel points of the image instance.
  • the point cloud cluster corresponding to the image instance can be determined according to the transformation relationship between the point cloud and the coordinate system of the image, so as to facilitate the analysis of the point cloud.
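  • A minimal sketch of step S202, under the same assumptions and reusing project_points from the sketch above: the current frame points whose projections fall inside an instance mask of the previous frame image form that instance's point cloud cluster. The nearest-pixel rounding policy is an assumption.

```python
import numpy as np

def clusters_from_masks(points_xyz, uv, in_front, masks):
    """masks: (N_inst, H, W) boolean instance masks from the previous frame image.
    Returns, per instance, the indices of the point cloud coordinates in its cluster."""
    h, w = masks.shape[1:]
    u = np.round(uv[:, 0]).astype(int)           # pixel column of each point
    v = np.round(uv[:, 1]).astype(int)           # pixel row of each point
    valid = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    clusters = []
    for mask in masks:
        hit = np.zeros(len(points_xyz), dtype=bool)
        hit[valid] = mask[v[valid], u[valid]]    # projection lands on this instance
        clusters.append(np.flatnonzero(hit))
    return clusters
```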
  • the category coordinates and center coordinates of the point cloud can be determined.
  • the method for determining the category coordinates of the point cloud of the present disclosure includes the following steps:
  • Step S301 obtaining the determination result of the point cloud cluster.
  • one or more point cloud clusters of the current frame point cloud corresponding to one or more image instances of the previous frame image determined in step S202 are obtained.
  • Step S302 according to the determination result of the point cloud cluster, judge whether the point cloud coordinates of the current frame point cloud correspond to the image instance, if yes, go to step S303; if not, go to step S304.
  • Step S303 according to the category information of the image instance included in the previous frame image, determine the category coordinates of the point cloud cluster corresponding to the image instance.
  • the image instance indicates category information.
  • the image is the output result of the image instance segmentation model, which has details such as texture and color and can accurately determine the category information to which the image instance belongs; the category information includes any one of car (car), truck (truck), bus (bus), trailer (trailer), cart (c_vehicle), pedestrian (pedestrian), motorcycle (motorcycle), bicycle (bicycle), traffic cone (traffic_cone), and roadblock (barrier).
  • one category may correspond to one or more image instances, and one image instance may only correspond to one category.
  • the category information is mapped with corresponding binary coordinates
  • the category coordinates of the point cloud cluster are the binary coordinates of the category information of the image instance.
  • the binary coordinates corresponding to the category information are 10-bit binary coordinates.
  • for example, if the category of the image instance is a motorcycle and the corresponding binary coordinates are (0,0,0,0,0,0,1,0,0,0), the category coordinates of the point cloud cluster are (0,0,0,0,0,0,1,0,0,0); for another example, if the category of the image instance is a car and the corresponding binary coordinates are (1,0,0,0,0,0,0,0,0,0), the category coordinates of the point cloud cluster are (1,0,0,0,0,0,0,0,0,0).
  • when assigning category coordinates, either of the following two methods can be used: one point cloud cluster corresponds to one category coordinate; or each point cloud coordinate of a point cloud cluster corresponds to a category coordinate, and the category coordinates are all the same.
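  • A minimal sketch of the category-to-binary-coordinate mapping follows, with the ten categories in the order listed above mapped to 10-dimensional one-hot vectors; the exact ordering and dtype are assumptions.

```python
import numpy as np

CATEGORIES = ["car", "truck", "bus", "trailer", "c_vehicle",
              "pedestrian", "motorcycle", "bicycle", "traffic_cone", "barrier"]

def category_coordinates(category: str) -> np.ndarray:
    """Return the 10-dimensional binary (one-hot) category coordinates."""
    coords = np.zeros(len(CATEGORIES), dtype=np.float32)
    coords[CATEGORIES.index(category)] = 1.0
    return coords

# category_coordinates("motorcycle") -> [0,0,0,0,0,0,1,0,0,0]
```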
  • Step S304 for each point cloud coordinate in the point cloud except the point cloud cluster, determine the category coordinate of the point cloud coordinate as the default category coordinate.
  • the default category coordinates are (0,0,0,0,0,0,0,0,0,0), that is, not belonging to any of the above 10 categories, which can represent the background, blank space, mis-projected points, and so on.
  • each point cloud coordinate corresponds to a default category coordinate.
  • in this way, the category coordinates of the point cloud can be determined by exploiting the complementarity between the feature data of the image and the point cloud: the image has details such as color and texture, while the point cloud has depth information. This expands the data dimensionality of the point cloud from 3 dimensions to 13 dimensions, which is equivalent to providing prior knowledge of the category for the points in the point cloud, thus making the input of the model more multi-dimensional, facilitating the learning of the subsequent detection model, and improving the accuracy of the recognition results output by the model.
  • the method for determining the center coordinates of the point cloud of the present disclosure includes the following steps:
  • Step S401 obtaining the determination result of the point cloud cluster.
  • one or more point cloud clusters of the current frame point cloud corresponding to one or more image instances of the previous frame image determined in step S202 are obtained.
  • Step S402 according to the determination result of the point cloud cluster, judge whether the point cloud coordinates of the current frame point cloud correspond to the image instance, if yes, go to step S403; if not, go to step S404.
  • Step S403 according to the point cloud coordinates of the point cloud cluster, determine the center coordinates of the point cloud cluster corresponding to the image instance.
  • in some embodiments, the per-axis average of the point cloud coordinates of the point cloud cluster is calculated, and this average value is the center coordinates (x_center, y_center, z_center) of the point cloud cluster corresponding to the image instance.
  • when assigning center coordinates, either of the following two methods can be used: each point cloud cluster corresponds to one center coordinate; or each point cloud coordinate of a point cloud cluster corresponds to a center coordinate, and the center coordinates are all the same.
  • Step S404 for each point cloud coordinate in the point cloud except the point cloud cluster, determine the center coordinate of the point cloud coordinate as the default center coordinate.
  • the default center coordinate is (0,0,0), which can represent the background, blank, wrong projection point and so on.
  • each point cloud coordinate corresponds to a default center coordinate.
  • in this way, the center coordinates of the point cloud can be determined, further expanding the data dimensionality of the point cloud from 13 dimensions to 16 dimensions, so that the input of the model is more multi-dimensional, which facilitates the learning of the subsequent detection model and improves the accuracy of the recognition results output by the model.
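  • A minimal sketch of steps S403 and S404, under the assumptions of the earlier sketches: each cluster's center coordinates are the per-axis average of its point cloud coordinates, and points outside every cluster keep the default center (0, 0, 0).

```python
import numpy as np

def center_coordinates(points_xyz: np.ndarray, clusters: list) -> np.ndarray:
    """clusters: per-instance index arrays, e.g. from clusters_from_masks.
    Returns an (N, 3) array of (x_center, y_center, z_center) per point,
    defaulting to zeros for points outside every cluster."""
    centers = np.zeros_like(points_xyz)
    for idx in clusters:
        if len(idx):
            centers[idx] = points_xyz[idx].mean(axis=0)  # cluster centroid
    return centers
```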
  • Step S104 input the point cloud coordinates, category coordinates and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
  • the category coordinates and center coordinates of the point cloud obtained through data-dimension expansion are spliced with the point cloud coordinates of the point cloud; the spliced result is used as the input of the preset detection model, and the target object of the point cloud is identified according to the output of the detection model.
  • the target object includes a bounding box of each object in the detection area and a category of each object.
  • the detection model is a 3D object detection model.
  • the splicing method of the coordinates of the point cloud of the present disclosure includes the following steps:
  • Step S501 obtaining the determination result of the point cloud cluster.
  • one or more point cloud clusters of the current frame point cloud corresponding to one or more image instances of the previous frame image determined in step S202 are obtained.
  • Step S502 according to the determination result of the point cloud cluster, judge whether the point cloud coordinates of the current frame point cloud correspond to the image instance, if yes, go to step S503; if not, go to step S504.
  • Step S503 for each point cloud coordinate of the same point cloud cluster, perform: stitching the category coordinates and center coordinates into the point cloud coordinates.
  • the result of splicing the category coordinates and center coordinates to the point cloud coordinates is (point cloud coordinates (x, y, z), category coordinates (10-bit binary), center coordinates (x_center, y_center, z_center)).
  • the category coordinates and center coordinates can also be spliced into the point cloud cluster.
  • Step S504 for each point cloud coordinate in the point cloud except the point cloud cluster, the default category coordinates and default center coordinates are spliced into the point cloud coordinates.
  • the result of splicing the default category coordinates and default center coordinates to the point cloud coordinates is (x, y, z, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0).
  • the input of the detection model can be determined through the point cloud coordinate splicing method of the present disclosure; the model input is the 16-dimensional coordinates obtained through data-dimension expansion, which makes the detection model easier to train and the recognition results output by the model more accurate.
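  • A minimal sketch of the splicing step under the same assumptions: the 3-dimensional point cloud coordinates, 10-dimensional category coordinates, and 3-dimensional center coordinates are concatenated per point into the 16-dimensional model input; points outside all clusters carry all-zero category and center coordinates.

```python
import numpy as np

def splice_features(points_xyz, category_coords, center_coords):
    """points_xyz: (N, 3); category_coords: (N, 10); center_coords: (N, 3).
    Returns the (N, 16) spliced input for the detection model."""
    features = np.concatenate([points_xyz, category_coords, center_coords], axis=1)
    assert features.shape[1] == 16  # 3 + 10 + 3 dimensions per point
    return features
```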
  • the point cloud can also be denoised by clustering to eliminate the noise introduced by the data-dimension expansion; based on the denoised point cloud, the method for determining the center coordinates of the point cloud cluster in the present disclosure is executed again to update the center coordinates of the point cloud, and the point cloud coordinate splicing method of the present disclosure is then executed with the updated center coordinates.
  • the stitching result determined according to the center coordinates of the updated point cloud is input into the preset detection model, so that the recognition result of the target object is more accurate.
  • the clustering algorithm can use DBSCAN, K-medians, and other methods.
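  • A minimal sketch of the clustering-based noise reduction, using DBSCAN as named above: points labeled as noise are dropped from a cluster before its center coordinates are recomputed. The eps and min_samples values are assumptions to be tuned per sensor.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def denoise_cluster(points_xyz: np.ndarray, idx: np.ndarray) -> np.ndarray:
    """idx: indices of one point cloud cluster.
    Returns the indices kept after discarding DBSCAN noise points."""
    if len(idx) < 2:
        return idx
    labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(points_xyz[idx])
    return idx[labels != -1]  # label -1 marks noise in scikit-learn's DBSCAN
```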
  • through the steps of collecting multiple frames of point clouds and multiple frames of images in the detection area, where the point cloud includes multiple point cloud coordinates and each frame of the image corresponds to one or more image instances; constructing a mapping between the point cloud and the image instances according to a preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image; determining, according to the mapping result, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determining the center coordinates of the point cloud cluster; and inputting the point cloud coordinates, category coordinates, and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud, multiple kinds of simulated feature data can be used as the input of the detection model. Because the feature data are complementary, the input of the model is more multi-dimensional, the accuracy of the recognition result determined from the model output is greatly improved, and each target object in the autonomous driving scene can be accurately identified, thereby providing a reference for the simulation of automatic driving.
  • FIG. 6 is a schematic diagram of main modules of an object recognition device according to an embodiment of the present disclosure.
  • the object recognition device 600 of the present disclosure includes:
  • the collection module 601 is configured to collect multiple frames of point clouds and multiple frames of images within the detection area; wherein, the point cloud includes multiple point cloud coordinates, and each frame of the image corresponds to one or more image instances.
  • the collection module 601 is used to collect multi-frame point clouds and multi-frame images in the detection area; the point clouds and images are obtained through the radar sensor and the camera, respectively, which collect synchronously during the driving of the vehicle.
  • the image is the output result of inputting the picture captured by the camera into the image instance segmentation model, therefore, each frame of image includes one or more image instances.
  • the object recognition server of the present disclosure is equipped with an image instance segmentation model; wherein, the image instance segmentation model can adopt methods such as Mask-RCNN, RetinaMask, CenterMask, DeepMask, PANet, and YOLACT.
  • the mapping module 602 is configured to construct a mapping between the point cloud and the image instance according to the preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image.
  • the mapping module 602 is used to construct the point cloud and the image instance according to the preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image mapping.
  • the first coordinate system is a point cloud coordinate system, or called a radar coordinate system;
  • the second coordinate system is a camera coordinate system, or called an image coordinate system.
  • the point cloud includes multiple point cloud coordinates, and the point cloud coordinates are usually three-dimensional coordinates, such as (x, y, z);
  • the image includes multiple pixel points, and the pixel points are usually two-dimensional coordinates.
  • one pixel point can usually map one or more point cloud coordinates.
  • the point cloud realizes the mapping between point cloud coordinates and pixel points through the internal and external parameters of the camera.
  • the data processing module 603 is configured to determine the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster according to the mapping result, and determine the center coordinates of the point cloud cluster.
  • the data processing module 603 is used to determine, on the basis of the mapping result between the point cloud and the image instances, the point cloud cluster corresponding to each image instance in the current frame point cloud according to the point cloud cluster determination method of the present disclosure; a point cloud cluster usually includes multiple point cloud coordinates.
  • the data processing module 603 is further configured to determine the category coordinates of the current frame point cloud according to the method for determining the category coordinates of the point cloud in the present disclosure.
  • the data processing module 603 is further configured to determine the center coordinates of the current frame point cloud according to the method for determining the center coordinates of the point cloud in the present disclosure.
  • the identification module 604 is configured to input the point cloud coordinates, category coordinates and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
  • the identification module 604 is used to splice the category coordinates and center coordinates of the point cloud obtained through data-dimension expansion with the point cloud coordinates of the point cloud, use the spliced result as the input of the preset detection model, and identify the target object of the point cloud according to the output of the detection model.
  • the target object includes a bounding box of each object in the detection area and a category of each object.
  • through the collection module, the mapping module, the data processing module, and the identification module, multiple kinds of simulated feature data can be used as the input of the detection model; because the feature data are complementary, the input of the model is more multi-dimensional, the accuracy of the recognition results determined from the model output is greatly improved, and each target object in the autonomous driving scene can be accurately identified, thereby providing a reference for the simulation of automatic driving.
  • Fig. 7 shows an exemplary system architecture diagram of an object recognition method or an object recognition device applicable to an embodiment of the present disclosure.
  • as shown in FIG. 7, the exemplary system architecture of the object recognition method or object recognition device in an embodiment of the present disclosure includes:
  • a system architecture 700 may include detection devices 701 , 702 , and 703 , a network 704 and a server 705 .
  • the network 704 is used to provide a medium for communication links between the detection devices 701, 702, 703 and the server 705.
  • Network 704 may include various connection types, such as wires, wireless communication links, or fiber optic cables, among others.
  • the detection devices 701, 702, 703 interact with the server 705 through the network 704 to collect or send messages and the like.
  • the detection devices 701, 702, and 703 may be various electronic devices with detection functions, including but not limited to lidar sensors, cameras, and so on.
  • the server 705 may be a server that provides various services, such as a background management server that processes the point clouds and images collected and sent by the detection devices 701, 702, and 703.
  • the background management server can analyze and process the collected data such as point clouds and images, and output the processing results (such as the object to which the point cloud cluster belongs).
  • the object recognition method provided by the embodiment of the present disclosure is generally executed by the server 705 , and correspondingly, the object recognition device is generally disposed in the server 705 .
  • FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server in an embodiment of the present disclosure.
  • the computer system 800 of a terminal device or a server in an embodiment of the present disclosure includes:
  • a central processing unit (CPU) 801 that can execute various appropriate actions and processes according to programs stored in a read only memory (ROM) 802 or loaded from a storage section 808 into a random access memory (RAM) 803 .
  • in the RAM 803, various programs and data necessary for the operation of the system 800 are also stored.
  • the CPU 801 , ROM 802 , and RAM 803 are connected to each other via a bus 804 .
  • An input/output (I/O) interface 805 is also connected to the bus 804 .
  • the following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, etc.; an output section 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker; a storage section 808 including a hard disk, etc.; and a communication section 809 including a network interface card such as a LAN card or a modem.
  • the communication section 809 performs communication processing via a network such as the Internet.
  • a drive 810 is also connected to the I/O interface 805 as needed.
  • a removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 810 as necessary so that a computer program read therefrom is installed into the storage section 808 as necessary.
  • the processes described above with reference to the flowcharts can be implemented as computer software programs.
  • embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, where the computer program includes program code for executing the methods shown in the flowcharts.
  • the computer program may be downloaded and installed from a network via communication portion 809 and/or installed from removable media 811 .
  • when this computer program is executed by the central processing unit (CPU) 801, the above-mentioned functions defined in the system of the present disclosure are performed.
  • the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two.
  • a computer readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block in the block diagrams or flowchart illustrations, and combinations of blocks therein, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the modules involved in the embodiments described in the present disclosure may be implemented by software or by hardware.
  • the described modules can also be set in a processor; for example, it can be described as: a processor includes an acquisition module, a mapping module, a data processing module, and an identification module.
  • the names of these modules do not constitute a limitation of the module itself under certain circumstances.
  • for example, the recognition module can also be described as "a module for inputting the point cloud coordinates, category coordinates, and center coordinates of the point cloud into a preset detection model to identify target objects of the point cloud".
  • the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist independently without being assembled into the device.
  • the above-mentioned computer-readable medium carries one or more programs; when the one or more programs are executed by the device, the device is caused to: collect multiple frames of point clouds and multiple frames of images in the detection area, where the point cloud includes a plurality of point cloud coordinates and each frame of the image corresponds to one or more image instances; construct a mapping between the point cloud and the image instances according to a preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image; determine, according to the mapping result, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determine the center coordinates of the point cloud cluster; and input the point cloud coordinates, category coordinates, and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
  • in existing target detection methods, the output results of multiple models (such as image-based 3D target detection, point cloud-based 3D target detection, image-based semantic segmentation, image-based instance segmentation, and point cloud-based semantic segmentation) are fused to determine the final perception result; or multiple models are fused and the final perception result is determined from the fused model. Because the fusion logic over the output results is rough, or the input features of multi-model fusion cannot be aligned, the accuracy of the target recognition results is low and their value as a simulation reference for autonomous driving is small.
  • in the present disclosure, the problem of feature alignment between modalities is solved, and the complementary relationship between point clouds and images is fully utilized to optimize the input of the 3D detection model, greatly improving the accuracy of the model while keeping its operation real-time.
  • point cloud data lacks details such as color and texture but is very accurate in depth (distance); image data lacks depth information but is very accurate in texture and color. The two are therefore complementary.
  • the accuracy of the recognition result is greatly improved.
  • the 16-dimensional coordinates are used as input to improve the accuracy of the model while promoting the convergence of the model.
  • using the current frame point cloud together with the previous frame image greatly improves the operating efficiency of the model.
  • multiple kinds of simulated feature data can be used as the input of the detection model; because the feature data are complementary, the input of the model is more multi-dimensional, the accuracy of the recognition result determined from the model output is greatly improved, and each target object in the autonomous driving scene can be accurately identified, thereby providing a reference for the simulation of automatic driving.

Landscapes

  • Engineering & Computer Science (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

An object recognition method and apparatus, relating to the technical field of computers. A specific embodiment of the method comprises: collecting multiple point clouds and multiple images in a detection area; constructing a mapping between the point clouds and image instances according to a preset conversion relation between a first coordinate system of the point clouds and a second coordinate system of the images; determining, according to the mapping result, a point cloud cluster corresponding to the image instances and category coordinates of the point cloud cluster, and determining center coordinates of the point cloud cluster; and inputting point cloud coordinates, category coordinates, and center coordinates of the point clouds into a preset detection model to recognize target objects of the point clouds. According to the method, various simulated feature data can be used as the input of the detection model, and the feature data are complementary, so that the input of the model is more multi-dimensional, the accuracy of the recognition result determined from the model output is greatly improved, and each target object in an autonomous driving scenario can be accurately recognized.

Description

Object recognition method and apparatus

Cross-Reference to Related Applications

This application claims priority to Chinese patent application No. 202210147726.2, entitled "Object recognition method and apparatus" and filed on February 17, 2022, the content disclosed in which is hereby incorporated by reference in its entirety as part or all of this application.

Technical Field

The present disclosure relates to the technical field of automatic driving, and in particular to an object recognition method and device.

Background

Autonomous driving combines artificial intelligence, machine vision, radar, navigation and positioning, communication, and other technologies so that a vehicle can drive automatically and safely under a computer control system without any active human operation.

When performing target detection, existing autonomous driving systems usually fuse the learning results of multiple machine learning models to determine the final target, or fuse multiple machine learning models and determine the final target from the output of the fused model, in order to identify targets more accurately.

Among existing target recognition results, those obtained by fusing the learning results of multiple models rely on relatively rough fusion logic, which makes the final recognition result inaccurate; alternatively, when the final target is determined from the output of a fused model, the feature data of the fused models cannot be fully aligned, which likewise lowers the accuracy of the final target, sometimes below that of a single model.
Summary of the Invention

In view of this, embodiments of the present disclosure provide an object recognition method and device that can use multiple kinds of simulated feature data as the input of a detection model. Because the feature data are complementary, the input of the model is more multi-dimensional, the accuracy of the recognition result determined from the model output is greatly improved, and each target object in the automatic driving scene can be accurately identified, thereby providing a reference for the simulation of automatic driving.

To achieve the above object, according to one aspect of the embodiments of the present disclosure, an object recognition method is provided, including: collecting multiple frames of point clouds and multiple frames of images in a detection area, where the point cloud includes multiple point cloud coordinates and each frame of the image corresponds to one or more image instances; constructing a mapping between the point cloud and the image instances according to a preset conversion relationship between a first coordinate system of the point cloud and a second coordinate system of the image; determining, according to the mapping result, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determining the center coordinates of the point cloud cluster; and inputting the point cloud coordinates, category coordinates, and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.

According to one or more embodiments of the present disclosure, determining the point cloud cluster corresponding to the image instance includes: obtaining the current frame point cloud and the previous frame image; and, in the current frame point cloud, forming the point cloud cluster from the multiple point cloud coordinates to which any image instance included in the previous frame image is mapped.

According to one or more embodiments of the present disclosure, forming the point cloud cluster from the multiple point cloud coordinates to which any image instance included in the previous frame image is mapped includes: projecting the current frame point cloud to the second coordinate system; and forming the point cloud cluster from the projected point cloud coordinates corresponding to the multiple pixels of any image instance included in the previous frame image.

According to one or more embodiments of the present disclosure, the image instance indicates category information, and determining the category coordinates of the point cloud cluster includes: determining the category coordinates of the point cloud cluster corresponding to the image instance according to the category information of the image instance included in the previous frame image.

According to one or more embodiments of the present disclosure, the method further includes splicing the point cloud coordinates, category coordinates, and center coordinates of the point cloud; and inputting the point cloud coordinates, category coordinates, and center coordinates of the point cloud into the preset detection model includes inputting the spliced result into the detection model.

According to one or more embodiments of the present disclosure, splicing the point cloud coordinates, category coordinates, and center coordinates of the point cloud includes: for each point cloud coordinate of the same point cloud cluster, splicing the category coordinates and the center coordinates to that point cloud coordinate.

According to one or more embodiments of the present disclosure, the method further includes: for the point cloud coordinates in the point cloud other than those of the point cloud clusters, determining the category coordinates and center coordinates of those point cloud coordinates to be default category coordinates and default center coordinates, and splicing the default category coordinates and the default center coordinates to those point cloud coordinates.

According to one or more embodiments of the present disclosure, the method further includes denoising the point cloud by clustering, and the step of determining the center coordinates of the point cloud clusters is performed on the denoised point cloud.

According to one or more embodiments of the present disclosure, the point cloud and the image are obtained by a radar sensor and a camera, respectively, and the radar sensor and the camera collect synchronously.

According to one or more embodiments of the present disclosure, the category information is mapped to corresponding binary coordinates, and determining the category coordinates of the point cloud cluster corresponding to the image instance includes determining the binary coordinates of the category information of the image instance as the category coordinates of the point cloud cluster.

According to still another aspect of the embodiments of the present disclosure, an object recognition device is provided, including: a collection module configured to collect multiple frames of point clouds and multiple frames of images in a detection area, where the point cloud includes multiple point cloud coordinates and each frame of the image corresponds to one or more image instances; a mapping module configured to construct a mapping between the point cloud and the image instances according to a preset conversion relationship between a first coordinate system of the point cloud and a second coordinate system of the image; a data processing module configured to determine, according to the mapping result, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and to determine the center coordinates of the point cloud cluster; and a recognition module configured to input the point cloud coordinates, category coordinates, and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.

According to another aspect of the embodiments of the present disclosure, an object recognition electronic device is provided, including: one or more processors; and a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the object recognition method provided in the present disclosure.

According to yet another aspect of the embodiments of the present disclosure, a computer-readable medium is provided, on which a computer program is stored; when the program is executed by a processor, the object recognition method provided in the present disclosure is implemented.

An embodiment of the above disclosure has the following advantages or beneficial effects: by exploiting the real-time availability of the current frame point cloud and the previous frame image, the current frame point cloud is converted to the camera coordinate system, thereby obtaining the category coordinates of the point cloud cluster corresponding to the image instance; the center coordinates of the point cloud cluster are calculated, the point cloud coordinates, category coordinates, and center coordinates are spliced as the input of the detection model, and the target object is identified from the output of the detection model. This overcomes the technical problem that existing target recognition results have low accuracy and cannot provide a reference for the simulation of automatic driving, and achieves the technical effect that multiple kinds of simulated feature data can be used as the input of the detection model; because the feature data are complementary, the input of the model is more multi-dimensional, the accuracy of the recognition result determined from the model output is greatly improved, and each target object in the automatic driving scene can be accurately identified, thereby providing a reference for the simulation of automatic driving.

Further effects of the above non-conventional alternatives will be described below in conjunction with specific embodiments.
附图说明Description of drawings
附图用于更好地理解本公开,不构成对本公开的不当限定。其中:The accompanying drawings are for better understanding of the present disclosure, and do not constitute an improper limitation of the present disclosure. in:
FIG. 1 is a schematic diagram of the main flow of an object recognition method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of the main flow of a method for determining a point cloud cluster according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the main flow of a method for determining the category coordinates of a point cloud according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of the main flow of a method for determining the center coordinates of a point cloud according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the main flow of a method for concatenating the coordinates of a point cloud according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of the main modules of an object recognition apparatus according to an embodiment of the present disclosure;
FIG. 7 shows an exemplary system architecture diagram to which the object recognition method or object recognition apparatus of an embodiment of the present disclosure may be applied;
FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, including various details of the embodiments of the present disclosure to facilitate understanding, which should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is a schematic diagram of the main flow of an object recognition method according to an embodiment of the present disclosure. As shown in FIG. 1, the object recognition method of the present disclosure includes the following steps:
Unmanned driving integrates multiple frontier disciplines such as sensors, computers, artificial intelligence, communications, navigation and positioning, pattern recognition, machine vision, and intelligent control, and can achieve goals such as environment perception, navigation and positioning, path planning, and decision-making control.
Driverless vehicles use sensor technology, signal processing technology, communication technology, computer technology and the like, and identify the environment and state of the vehicle by integrating various on-board sensors such as cameras, lidars, ultrasonic sensors, microwave radars, GPS, odometers, and magnetic compasses. They analyze and make judgments based on the obtained road information, traffic signal information, vehicle position information, and obstacle information, and control the driving path of the vehicle, thereby realizing human-like driving.
Step S101: collect multiple frames of point clouds and multiple frames of images in the detection area; wherein the point cloud includes a plurality of point cloud coordinates, and each frame of the image corresponds to one or more image instances.
In the embodiments of the present disclosure, the point cloud and the image are obtained by a radar sensor and a camera, respectively, and the radar sensor and the camera collect data synchronously while the vehicle is driving.
In the embodiments of the present disclosure, the image is the result output after the picture captured by the camera is input into an image instance segmentation model; therefore, each frame of the image includes one or more image instances. The object recognition server of the present disclosure is equipped with an image instance segmentation model, which may adopt methods such as Mask-RCNN, RetinaMask, CenterMask, DeepMask, PANet, or YOLACT.
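By way of illustration only and not as part of the original disclosure, the following minimal sketch shows one way to obtain per-frame instance masks with one of the listed methods (Mask-RCNN, here via torchvision); the score and mask thresholds are assumed values:

```python
import torch
import torchvision

# A pre-trained Mask-RCNN as a stand-in for the image instance segmentation model.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def segment(image_tensor, score_thresh=0.5):
    """Return boolean instance masks and labels for one CxHxW image scaled to [0, 1]."""
    with torch.no_grad():
        out = model([image_tensor])[0]
    keep = out["scores"] > score_thresh
    masks = out["masks"][keep, 0] > 0.5   # N x H x W boolean masks, one per image instance
    return masks, out["labels"][keep]
```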
Step S102: construct a mapping between the point cloud and the image instances according to a preset conversion relationship between a first coordinate system of the point cloud and a second coordinate system of the image.
In the embodiments of the present disclosure, the first coordinate system is the point cloud coordinate system, also called the radar coordinate system; the second coordinate system is the camera coordinate system, also called the image coordinate system. The point cloud includes a plurality of point cloud coordinates, which are usually three-dimensional, for example (x, y, z); the image includes a plurality of pixels, which are usually two-dimensional.
In the embodiments of the present disclosure, because of the difference in dimensionality between point cloud coordinates and pixels, one pixel can usually be mapped to one or more point cloud coordinates. When converting between the first coordinate system and the second coordinate system, the point cloud coordinates are mapped to pixels through the intrinsic and extrinsic parameters of the camera.
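A minimal sketch of this coordinate mapping, offered for illustration only (the matrices T_cam_lidar and K are assumptions standing in for the calibrated extrinsic and intrinsic parameters):

```python
import numpy as np

def project_points(points_lidar, T_cam_lidar, K):
    """Project Nx3 lidar points into pixel coordinates via camera extrinsics/intrinsics."""
    n = points_lidar.shape[0]
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])  # homogeneous lidar coordinates
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]          # first coordinate system -> camera
    in_front = pts_cam[:, 2] > 0                        # keep points in front of the camera
    uvw = (K @ pts_cam[in_front].T).T                   # perspective projection
    uv = uvw[:, :2] / uvw[:, 2:3]                       # pixel coordinates (u, v)
    return uv, in_front
```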
Step S103: according to the result of the mapping, determine the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determine the center coordinates of the point cloud cluster.
In the embodiments of the present disclosure, as shown in FIG. 2, the method for determining a point cloud cluster of the present disclosure includes the following steps:
Step S201: acquire the current-frame point cloud and the previous-frame image.
In the embodiments of the present disclosure, on the one hand, because the image is the output of the image instance segmentation model, obtaining the image requires a certain processing time. Therefore, the method for determining a point cloud cluster of the present disclosure does not need to wait for the image instance segmentation model to run in order to obtain the current-frame image; it directly acquires the previous-frame image and the current-frame point cloud for subsequent processing, which ensures the real-time performance of object recognition. On the other hand, the time interval between the current frame and the previous frame is negligible; therefore, when converting between the first coordinate system of the point cloud and the second coordinate system of the image, the mapping relationship between the current-frame point cloud and the previous-frame image remains good.
Further, the current-frame point cloud and the current-frame image may also be used for subsequent processing. However, compared with using the current-frame point cloud and the previous-frame image, obtaining the current-frame point cloud and the current-frame image is slightly less real-time, because it is necessary to wait for the processing result of the image instance segmentation model to obtain the current-frame image.
In the embodiments of the present disclosure, the object recognition server may save the image of each frame, so that the current-frame point cloud and the previous-frame image can be obtained in real time when determining the point cloud cluster.
Step S202: in the current-frame point cloud, the plurality of point cloud coordinates to which any image instance included in the previous-frame image is mapped form a point cloud cluster.
Step S2021: project the current-frame point cloud into the second coordinate system.
In the embodiments of the present disclosure, the current-frame point cloud is projected into the second coordinate system according to the conversion relationship between the first coordinate system and the second coordinate system.
Step S2022: the projected point cloud coordinates corresponding to the plurality of pixels of any image instance included in the previous-frame image form a point cloud cluster.
In the embodiments of the present disclosure, a point cloud cluster includes a plurality of points corresponding to the pixels of an image instance.
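Continuing the illustrative sketch above (reusing project_points; the instance mask is assumed to come from the previous-frame segmentation result), the cluster formation can be expressed as:

```python
def cluster_for_instance(points_lidar, instance_mask, T_cam_lidar, K):
    """Gather the point cloud coordinates whose projections fall on an instance's pixels."""
    uv, in_front = project_points(points_lidar, T_cam_lidar, K)
    cols = np.round(uv[:, 0]).astype(int)
    rows = np.round(uv[:, 1]).astype(int)
    h, w = instance_mask.shape
    inside = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
    on_instance = np.zeros(len(uv), dtype=bool)
    on_instance[inside] = instance_mask[rows[inside], cols[inside]]
    idx = np.where(in_front)[0][on_instance]   # indices back into the original point cloud
    return points_lidar[idx], idx
```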
In the embodiments of the present disclosure, through the method for determining a point cloud cluster of the present disclosure, the point cloud cluster corresponding to an image instance can be determined according to the conversion relationship between the coordinate systems of the point cloud and the image, which facilitates analysis of the point cloud; subsequently, the category coordinates and center coordinates of the point cloud can be determined.
In the embodiments of the present disclosure, as shown in FIG. 3, the method for determining the category coordinates of a point cloud of the present disclosure includes the following steps:
Step S301: obtain the determination result of the point cloud clusters.
In the embodiments of the present disclosure, the one or more point cloud clusters of the current-frame point cloud corresponding to the one or more image instances of the previous-frame image, as determined in step S202, are obtained.
Step S302: according to the determination result of the point cloud clusters, judge whether a point cloud coordinate of the current-frame point cloud corresponds to an image instance; if yes, go to step S303; if not, go to step S304.
Step S303: determine the category coordinates of the point cloud cluster corresponding to the image instance according to the category information of the image instance included in the previous-frame image.
In the embodiments of the present disclosure, the image instance indicates category information. The image is the output of the image instance segmentation model and has details such as texture and color, so the category information to which an image instance belongs can be accurately determined. The category information includes any one of car, truck, bus, trailer, cart (c_vehicle), pedestrian, motorcycle, bicycle, traffic cone (traffic_cone), and barrier.
In the embodiments of the present disclosure, one category may correspond to one or more image instances, and one image instance corresponds to only one category.
Further, each piece of category information is mapped to corresponding binary coordinates, and the category coordinates of a point cloud cluster are the binary coordinates of the category information of the image instance. Corresponding to the above 10 categories of image instances, the binary coordinates corresponding to the category information are 10-bit binary coordinates. For example, if the category of an image instance is motorcycle, the corresponding binary coordinates are (0,0,0,0,0,0,1,0,0,0), so the category coordinates of the point cloud cluster are (0,0,0,0,0,0,1,0,0,0); as another example, if the category of an image instance is car, the corresponding binary coordinates are (1,0,0,0,0,0,0,0,0,0), so the category coordinates of the point cloud cluster are (1,0,0,0,0,0,0,0,0,0).
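A minimal sketch of this one-hot encoding, for illustration only (the category ordering is an assumption inferred from the two examples above):

```python
CATEGORIES = ["car", "truck", "bus", "trailer", "c_vehicle",
              "pedestrian", "motorcycle", "bicycle", "traffic_cone", "barrier"]

def category_coordinates(category):
    """Return the 10-bit binary category coordinates; unknown names get the all-zero default."""
    coords = [0] * len(CATEGORIES)
    if category in CATEGORIES:
        coords[CATEGORIES.index(category)] = 1
    return coords

# category_coordinates("motorcycle") -> [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
```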
In the embodiments of the present disclosure, when determining the category coordinates of the point cloud cluster corresponding to the image instance, either of the following two manners may be used: one point cloud cluster corresponds to one set of category coordinates; or each point cloud coordinate of a point cloud cluster corresponds to one set of category coordinates, and the category coordinates are all the same.
Step S304: for each point cloud coordinate in the point cloud other than the point cloud clusters, determine the category coordinates of the point cloud coordinate to be the default category coordinates.
In the embodiments of the present disclosure, the default category coordinates are (0,0,0,0,0,0,0,0,0,0), i.e. not belonging to any of the above 10 categories, which can represent background, blanks, mis-projected points, and the like. For the point cloud coordinates in the point cloud other than the point cloud clusters, each point cloud coordinate corresponds to one set of default category coordinates.
In the embodiments of the present disclosure, through the method for determining the category coordinates of a point cloud of the present disclosure, the category coordinates of the point cloud can be determined. The complementarity between the feature data of the image and the point cloud is exploited: the image has details such as color and texture, while the point cloud has depth details. The data dimensionality of the point cloud is thus expanded from 3 dimensions to 13 dimensions, which is equivalent to providing prior knowledge of the category for the points in the point cloud, making the input of the model more multi-dimensional, facilitating the learning of the subsequent detection model, and improving the accuracy of the recognition results output by the model.
In the embodiments of the present disclosure, as shown in FIG. 4, the method for determining the center coordinates of a point cloud of the present disclosure includes the following steps:
Step S401: obtain the determination result of the point cloud clusters.
In the embodiments of the present disclosure, the one or more point cloud clusters of the current-frame point cloud corresponding to the one or more image instances of the previous-frame image, as determined in step S202, are obtained.
Step S402: according to the determination result of the point cloud clusters, judge whether a point cloud coordinate of the current-frame point cloud corresponds to an image instance; if yes, go to step S403; if not, go to step S404.
Step S403: determine the center coordinates of the point cloud cluster corresponding to the image instance according to the point cloud coordinates of the point cloud cluster.
In the embodiments of the present disclosure, the average of the plurality of point cloud coordinates (x_1, y_1, z_1), (x_2, y_2, z_2), ..., (x_n, y_n, z_n) of the point cloud cluster is determined, and this average is the center coordinate (x_center, y_center, z_center) of the point cloud cluster corresponding to the image instance.
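Illustratively, the averaging can be written as follows (a sketch only; the default center for non-cluster points is included for completeness):

```python
DEFAULT_CENTER = (0.0, 0.0, 0.0)   # for background, blanks, mis-projected points

def cluster_center(cluster_points):
    """Mean of the cluster's Nx3 coordinates: (x_center, y_center, z_center)."""
    return tuple(np.asarray(cluster_points, dtype=float).mean(axis=0))
```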
In the embodiments of the present disclosure, when determining the center coordinates of the point cloud cluster corresponding to the image instance, either of the following two manners may be used: one point cloud cluster corresponds to one center coordinate; or each point cloud coordinate of a point cloud cluster corresponds to one center coordinate, and the center coordinates are all the same.
Step S404: for each point cloud coordinate in the point cloud other than the point cloud clusters, determine the center coordinate of the point cloud coordinate to be the default center coordinate.
In the embodiments of the present disclosure, the default center coordinate is (0,0,0), which can represent background, blanks, mis-projected points, and the like. For the point cloud coordinates in the point cloud other than the point cloud clusters, each point cloud coordinate corresponds to one default center coordinate.
In the embodiments of the present disclosure, through the method for determining the center coordinates of a point cloud of the present disclosure, the center coordinates of the point cloud can be determined, so that the data dimensionality of the point cloud is further expanded from 13 dimensions to 16 dimensions, which makes the input of the model more multi-dimensional, facilitates the learning of the subsequent detection model, and improves the accuracy of the recognition results output by the model.
Step S104: input the point cloud coordinates, category coordinates and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
In the embodiments of the present disclosure, the category coordinates and center coordinates of the point cloud obtained after the data dimension expansion are concatenated with the point cloud coordinates of the point cloud, the concatenated result is used as the input of the preset detection model, and the target object of the point cloud is identified according to the output of the detection model.
In the embodiments of the present disclosure, the target object includes the bounding box (box) of each object in the detection area and the category of each object.
In the embodiments of the present disclosure, the detection model is a 3D object detection model.
In the embodiments of the present disclosure, as shown in FIG. 5, the method for concatenating the coordinates of a point cloud of the present disclosure includes the following steps:
Step S501: obtain the determination result of the point cloud clusters.
In the embodiments of the present disclosure, the one or more point cloud clusters of the current-frame point cloud corresponding to the one or more image instances of the previous-frame image, as determined in step S202, are obtained.
Step S502: according to the determination result of the point cloud clusters, judge whether a point cloud coordinate of the current-frame point cloud corresponds to an image instance; if yes, go to step S503; if not, go to step S504.
Step S503: for each point cloud coordinate of the same point cloud cluster, perform: concatenating the category coordinates and the center coordinates to the point cloud coordinate.
In the embodiments of the present disclosure, the result of concatenating the category coordinates and the center coordinates to a point cloud coordinate is (point cloud coordinates (x, y, z), category coordinates (10-bit binary), center coordinates (x_center, y_center, z_center)).
In the embodiments of the present disclosure, alternatively, for the same point cloud cluster, the category coordinates and the center coordinates may be concatenated to the point cloud cluster as a whole.
Step S504: for each point cloud coordinate in the point cloud other than the point cloud clusters, concatenate the default category coordinates and the default center coordinates to the point cloud coordinate.
In the embodiments of the present disclosure, the result of concatenating the default category coordinates and the default center coordinates to a point cloud coordinate is (x, y, z, 0,0,0,0,0,0,0,0,0,0, 0,0,0).
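Putting the pieces together, a sketch of building the 16-dimensional model input, reusing the illustrative helpers above (the (indices, category) cluster representation is an assumption):

```python
def build_features(points_lidar, clusters):
    """Concatenate (x, y, z) + 10-bit category coordinates + 3-D cluster center per point."""
    n = points_lidar.shape[0]
    features = np.zeros((n, 16), dtype=np.float32)
    features[:, :3] = points_lidar                 # non-cluster points keep the all-zero defaults
    for idx, category in clusters:                 # clusters: list of (point indices, category name)
        features[idx, 3:13] = category_coordinates(category)
        features[idx, 13:16] = cluster_center(points_lidar[idx])
    return features
```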
In the embodiments of the present disclosure, through the method for concatenating the coordinates of a point cloud of the present disclosure, the input of the detection model can be determined. The model input is the 16-dimensional coordinates obtained after the data dimension expansion, which makes the learning of the detection model easier and makes the recognition results output by the model more accurate.
Further, noise reduction may also be performed on the point cloud by means of clustering, so as to remove the noise introduced by the data dimension expansion. Based on the noise-reduced point cloud, the method for determining the center coordinates of a point cloud cluster of the present disclosure is performed to update the center coordinates of the point cloud, and the method for concatenating the coordinates of a point cloud of the present disclosure is performed according to the updated center coordinates of the point cloud. The concatenated result determined according to the updated center coordinates of the point cloud is input into the preset detection model, making the recognition result of the target object more accurate. The clustering algorithm may adopt methods such as DBSCAN or K-median.
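A sketch of such clustering-based noise reduction with DBSCAN, for illustration only (eps and min_samples are assumed values that would need tuning):

```python
from sklearn.cluster import DBSCAN

def denoise_cluster(cluster_points, eps=0.5, min_samples=5):
    """Drop DBSCAN outliers (label -1) before recomputing the cluster center."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(cluster_points)
    kept = cluster_points[labels != -1]
    return kept if len(kept) else cluster_points   # fall back if everything was flagged as noise
```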
In the embodiments of the present disclosure, through the steps of collecting multiple frames of point clouds and multiple frames of images in the detection area, where the point cloud includes a plurality of point cloud coordinates and each frame of the image corresponds to one or more image instances; constructing a mapping between the point cloud and the image instances according to a preset conversion relationship between a first coordinate system of the point cloud and a second coordinate system of the image; determining, according to the result of the mapping, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determining the center coordinates of the point cloud cluster; and inputting the point cloud coordinates, category coordinates and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud, multiple modalities of feature data can be used as the input of the detection model. Because the feature data are complementary, the input of the model becomes more multi-dimensional, the accuracy of the recognition results determined from the model output is greatly improved, and each target object in the autonomous driving scene can be accurately identified, thereby providing a reference for the simulation of autonomous driving.
FIG. 6 is a schematic diagram of the main modules of an object recognition apparatus according to an embodiment of the present disclosure. As shown in FIG. 6, the object recognition apparatus 600 of the present disclosure includes:
a collection module 601, configured to collect multiple frames of point clouds and multiple frames of images in the detection area; wherein the point cloud includes a plurality of point cloud coordinates, and each frame of the image corresponds to one or more image instances.
In the embodiments of the present disclosure, the collection module 601 is configured to collect multiple frames of point clouds and multiple frames of images in the detection area; the point cloud and the image are obtained by a radar sensor and a camera, respectively, and the radar sensor and the camera collect data synchronously while the vehicle is driving.
In the embodiments of the present disclosure, the image is the result output after the picture captured by the camera is input into an image instance segmentation model; therefore, each frame of the image includes one or more image instances. The object recognition server of the present disclosure is equipped with an image instance segmentation model, which may adopt methods such as Mask-RCNN, RetinaMask, CenterMask, DeepMask, PANet, or YOLACT.
a mapping module 602, configured to construct a mapping between the point cloud and the image instances according to a preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image.
In the embodiments of the present disclosure, the mapping module 602 is configured to construct the mapping between the point cloud and the image instances according to the preset conversion relationship between the first coordinate system of the point cloud and the second coordinate system of the image. The first coordinate system is the point cloud coordinate system, also called the radar coordinate system; the second coordinate system is the camera coordinate system, also called the image coordinate system. The point cloud includes a plurality of point cloud coordinates, which are usually three-dimensional, for example (x, y, z); the image includes a plurality of pixels, which are usually two-dimensional.
In the embodiments of the present disclosure, because of the difference in dimensionality between point cloud coordinates and pixels, one pixel can usually be mapped to one or more point cloud coordinates. When converting between the first coordinate system and the second coordinate system, the point cloud coordinates are mapped to pixels through the intrinsic and extrinsic parameters of the camera.
a data processing module 603, configured to determine, according to the result of the mapping, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determine the center coordinates of the point cloud cluster.
In the embodiments of the present disclosure, the data processing module 603 is configured to determine, according to the method for determining a point cloud cluster of the present disclosure and on the basis of the mapping result between the point cloud and the image instances, the point cloud cluster in the current-frame point cloud corresponding to the image instance; a point cloud cluster usually includes a plurality of point cloud coordinates.
In the embodiments of the present disclosure, the data processing module 603 is further configured to determine the category coordinates of the current-frame point cloud according to the method for determining the category coordinates of a point cloud of the present disclosure.
In the embodiments of the present disclosure, the data processing module 603 is further configured to determine the center coordinates of the current-frame point cloud according to the method for determining the center coordinates of a point cloud of the present disclosure.
an identification module 604, configured to input the point cloud coordinates, category coordinates and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
In the embodiments of the present disclosure, the identification module 604 is configured to concatenate the category coordinates and center coordinates of the point cloud obtained after the data dimension expansion with the point cloud coordinates of the point cloud, use the concatenated result as the input of the preset detection model, and identify the target object of the point cloud according to the output of the detection model.
In the embodiments of the present disclosure, the target object includes the bounding box (box) of each object in the detection area and the category of each object.
In the embodiments of the present disclosure, through the collection module, the mapping module, the data processing module, the identification module and other modules, multiple modalities of feature data can be used as the input of the detection model. Because the feature data are complementary, the input of the model becomes more multi-dimensional, the accuracy of the recognition results determined from the model output is greatly improved, and each target object in the autonomous driving scene can be accurately identified, thereby providing a reference for the simulation of autonomous driving.
FIG. 7 shows an exemplary system architecture diagram to which the object recognition method or object recognition apparatus of an embodiment of the present disclosure may be applied. As shown in FIG. 7, the exemplary system architecture of the object recognition method or object recognition apparatus of the embodiment of the present disclosure includes:
As shown in FIG. 7, the system architecture 700 may include detection devices 701, 702 and 703, a network 704 and a server 705. The network 704 is a medium used to provide communication links between the detection devices 701, 702, 703 and the server 705. The network 704 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
The detection devices 701, 702, 703 interact with the server 705 through the network 704 to collect or send messages and the like.
The detection devices 701, 702, 703 may be various electronic devices with detection functions, including but not limited to lidar sensors, cameras, and the like.
The server 705 may be a server providing various services, for example a background management server that supports the collected point clouds and images sent by the detection devices 701, 702, 703. The background management server may analyze and otherwise process the collected data such as point clouds and images, and output the processing results (for example, the object to which a point cloud cluster belongs).
It should be noted that the object recognition method provided by the embodiments of the present disclosure is generally executed by the server 705; correspondingly, the object recognition apparatus is generally disposed in the server 705.
It should be understood that the numbers of detection devices, networks and servers in FIG. 7 are merely illustrative. There may be any number of detection devices, networks and servers according to implementation requirements.
FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present disclosure. As shown in FIG. 8, the computer system 800 of the terminal device or server of the embodiment of the present disclosure includes:
a central processing unit (CPU) 801, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage section 808 into a random access memory (RAM) 803. The RAM 803 also stores various programs and data necessary for the operation of the system 800. The CPU 801, the ROM 802 and the RAM 803 are connected to one another via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse and the like; an output section 807 including a cathode ray tube (CRT), a liquid crystal display (LCD) and the like, as well as a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card or a modem. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read therefrom is installed into the storage section 808 as needed.
In particular, according to the disclosed embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the disclosed embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809 and/or installed from the removable medium 811. When the computer program is executed by the central processing unit (CPU) 801, the above-described functions defined in the system of the present disclosure are performed.
It should be noted that the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules involved in the embodiments described in the present disclosure may be implemented by software or by hardware. The described modules may also be provided in a processor; for example, it may be described as: a processor includes a collection module, a mapping module, a data processing module and an identification module. The names of these modules do not, under certain circumstances, constitute a limitation of the modules themselves; for example, the identification module may also be described as "a module that inputs the point cloud coordinates, category coordinates and center coordinates of the point cloud into a preset detection model and identifies the target object of the point cloud".
As another aspect, the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The above computer-readable medium carries one or more programs, and when the one or more programs are executed by the device, the device is caused to: collect multiple frames of point clouds and multiple frames of images in the detection area, where the point cloud includes a plurality of point cloud coordinates and each frame of the image corresponds to one or more image instances; construct a mapping between the point cloud and the image instances according to a preset conversion relationship between a first coordinate system of the point cloud and a second coordinate system of the image; determine, according to the result of the mapping, the point cloud cluster corresponding to the image instance and the category coordinates of the point cloud cluster, and determine the center coordinates of the point cloud cluster; and input the point cloud coordinates, category coordinates and center coordinates of the point cloud into a preset detection model to identify the target object of the point cloud.
When performing target detection, existing autonomous driving systems usually fuse the output results of multiple models (such as image-based 3D object detection, point-cloud-based 3D object detection, image-based semantic segmentation, image-based instance segmentation, and point-cloud-based semantic segmentation) to determine the final perception result, or fuse multiple models to determine the final perception result. Because the fusion logic of the output results is coarse, or the input features of multi-model fusion cannot be aligned, existing target detection methods produce target recognition results of low accuracy, which are of little significance as a reference for autonomous driving simulation.
According to the technical solutions of the embodiments of the present disclosure, the multi-modal fusion based on radar point clouds and camera images solves the problem of feature alignment between modalities, makes full use of the complementary relationship between point clouds and images, optimizes the input of the 3D detection model, substantially improves the accuracy of the model, and allows the model to run in real time.
According to the technical solutions of the embodiments of the present disclosure, point cloud data has no details such as color and texture but is very accurate in depth (distance), while image data has no depth details but is very accurate in texture and color. Making full use of the data complementarity between the two modalities, the point cloud and the image instance segmentation results are fused, and the category information of the image instances is provided to the point cloud according to the coordinate correspondence between the two, which makes the learning of the 3D detection model easier. Compared with the existing 3-dimensional point cloud coordinate input, the accuracy of the recognition results is substantially improved. With the addition of the center coordinates of the point cloud cluster corresponding to the image instance, 16-dimensional coordinates are used as the input, which not only improves the accuracy of the model but also promotes model convergence; on the basis of using the current-frame point cloud and the previous-frame image, the operating efficiency of the model is greatly improved.
According to the technical solutions of the embodiments of the present disclosure, multiple modalities of feature data can be used as the input of the detection model. Because the feature data are complementary, the input of the model becomes more multi-dimensional, the accuracy of the recognition results determined from the model output is greatly improved, and each target object in the autonomous driving scene can be accurately identified, thereby providing a reference for the simulation of autonomous driving.
The specific implementations described above do not limit the protection scope of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations and substitutions may occur depending on design requirements and other factors. Any modifications, equivalent replacements, improvements and the like made within the spirit and principles of the present disclosure shall be included within the protection scope of the present disclosure.

Claims (13)

  1. An object recognition method, comprising:
    collecting multiple frames of point clouds and multiple frames of images in a detection area; wherein the point cloud comprises a plurality of point cloud coordinates, and each frame of the image corresponds to one or more image instances;
    constructing a mapping between the point cloud and the image instances according to a preset conversion relationship between a first coordinate system of the point cloud and a second coordinate system of the image;
    determining, according to a result of the mapping, a point cloud cluster corresponding to the image instance and category coordinates of the point cloud cluster, and determining center coordinates of the point cloud cluster;
    inputting the point cloud coordinates, the category coordinates and the center coordinates of the point cloud into a preset detection model to identify a target object of the point cloud.
  2. The method according to claim 1, wherein the determining the point cloud cluster corresponding to the image instance comprises:
    acquiring a current-frame point cloud and a previous-frame image;
    in the current-frame point cloud, forming the point cloud cluster from a plurality of point cloud coordinates to which any image instance included in the previous-frame image is mapped.
  3. The method according to claim 2, wherein the forming the point cloud cluster from the plurality of point cloud coordinates to which any image instance included in the previous-frame image is mapped comprises:
    projecting the current-frame point cloud into the second coordinate system;
    forming the point cloud cluster from the projected point cloud coordinates corresponding to a plurality of pixels of any image instance included in the previous-frame image.
  4. The method according to claim 2, wherein the image instance indicates category information; and the determining the category coordinates of the point cloud cluster comprises:
    determining the category coordinates of the point cloud cluster corresponding to the image instance according to the category information of the image instance included in the previous-frame image.
  5. The method according to claim 1, further comprising:
    concatenating the point cloud coordinates, the category coordinates and the center coordinates of the point cloud;
    wherein the inputting the point cloud coordinates, the category coordinates and the center coordinates of the point cloud into the preset detection model comprises:
    inputting a concatenated result into the detection model.
  6. The method according to claim 5, wherein the concatenating the point cloud coordinates, the category coordinates and the center coordinates of the point cloud comprises:
    for each point cloud coordinate of a same point cloud cluster, performing:
    concatenating the category coordinates and the center coordinates to the point cloud coordinate.
  7. The method according to claim 6, further comprising:
    for point cloud coordinates in the point cloud other than the point cloud cluster, determining the category coordinates and the center coordinates of the point cloud coordinates to be default category coordinates and default center coordinates respectively, and concatenating the default category coordinates and the default center coordinates to the point cloud coordinates.
  8. The method according to claim 1, further comprising:
    performing noise reduction processing on the point cloud by means of clustering;
    performing, based on the point cloud after the noise reduction processing, the step of determining the center coordinates of the point cloud cluster.
  9. The method according to claim 2, wherein the point cloud and the image are obtained by a radar sensor and a camera respectively, and the radar sensor and the camera collect synchronously.
  10. The method according to claim 4, wherein the category information is mapped to corresponding binary coordinates; and determining the category coordinates of the point cloud cluster corresponding to the image instance comprises:
    determining the binary coordinates of the category information of the image instance to be the category coordinates of the point cloud cluster.
  11. An object recognition apparatus, comprising:
    a collection module, configured to collect multiple frames of point clouds and multiple frames of images in a detection area; wherein the point cloud comprises a plurality of point cloud coordinates, and each frame of the image corresponds to one or more image instances;
    a mapping module, configured to construct a mapping between the point cloud and the image instances according to a preset conversion relationship between a first coordinate system of the point cloud and a second coordinate system of the image;
    a data processing module, configured to determine, according to a result of the mapping, a point cloud cluster corresponding to the image instance and category coordinates of the point cloud cluster, and determine center coordinates of the point cloud cluster;
    an identification module, configured to input the point cloud coordinates, the category coordinates and the center coordinates of the point cloud into a preset detection model to identify a target object of the point cloud.
  12. An electronic device for object recognition, comprising:
    one or more processors;
    a storage apparatus, configured to store one or more programs,
    wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1-10.
  13. A computer-readable medium, on which a computer program is stored, wherein when the program is executed by a processor, the method according to any one of claims 1-10 is implemented.
PCT/CN2022/139873 2022-02-17 2022-12-19 Object recognition method and apparatus WO2023155580A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210147726.2 2022-02-17
CN202210147726.2A CN114550116A (en) 2022-02-17 2022-02-17 Object identification method and device

Publications (1)

Publication Number Publication Date
WO2023155580A1 true WO2023155580A1 (en) 2023-08-24

Family

ID=81675563

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/139873 WO2023155580A1 (en) 2022-02-17 2022-12-19 Object recognition method and apparatus

Country Status (2)

Country Link
CN (1) CN114550116A (en)
WO (1) WO2023155580A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114550116A (en) * 2022-02-17 2022-05-27 京东鲲鹏(江苏)科技有限公司 Object identification method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190287254A1 (en) * 2018-03-16 2019-09-19 Honda Motor Co., Ltd. Lidar noise removal using image pixel clusterings
CN111340797A (en) * 2020-03-10 2020-06-26 山东大学 Laser radar and binocular camera data fusion detection method and system
CN111709343A (en) * 2020-06-09 2020-09-25 广州文远知行科技有限公司 Point cloud detection method and device, computer equipment and storage medium
CN113887376A (en) * 2021-09-27 2022-01-04 中汽创智科技有限公司 Target detection method, device, medium and equipment
CN114550116A (en) * 2022-02-17 2022-05-27 京东鲲鹏(江苏)科技有限公司 Object identification method and device

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117152199A (en) * 2023-08-30 2023-12-01 成都信息工程大学 Dynamic target motion vector estimation method, system, equipment and storage medium
CN117152199B (en) * 2023-08-30 2024-05-31 成都信息工程大学 Dynamic target motion vector estimation method, system, equipment and storage medium

Also Published As

Publication number Publication date
CN114550116A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
EP3779358B1 (en) Map element extraction method and apparatus
EP3627180B1 (en) Sensor calibration method and device, computer device, medium, and vehicle
EP3505866B1 (en) Method and apparatus for creating map and positioning moving entity
WO2023155580A1 (en) Object recognition method and apparatus
US20210302585A1 (en) Smart navigation method and system based on topological map
CN115540896B (en) Path planning method and device, electronic equipment and computer readable medium
EP3822852B1 (en) Method, apparatus, computer storage medium and program for training a trajectory planning model
JP7440005B2 (en) High-definition map creation method, apparatus, device and computer program
US20220343758A1 (en) Data Transmission Method and Apparatus
EP4155679A2 (en) Positioning method and apparatus based on lane line and feature point
WO2021017072A1 (en) Laser radar-based slam closed-loop detection method and detection system
WO2022222647A1 (en) Method and apparatus for predicting vehicle intention, device, and storage medium
EP4105600A2 (en) Method for automatically producing map data, related apparatus and computer program product
CN115879060B (en) Multi-mode-based automatic driving perception method, device, equipment and medium
WO2023155581A1 (en) Image detection method and apparatus
WO2023088486A1 (en) Lane line extraction method and apparatus, vehicle and storage medium
CN115339453A (en) Vehicle lane change decision information generation method, device, equipment and computer medium
CN114972758A (en) Instance segmentation method based on point cloud weak supervision
CN113378605A (en) Multi-source information fusion method and device, electronic equipment and storage medium
CN113932796A (en) High-precision map lane line generation method and device and electronic equipment
CN115866229B (en) Viewing angle conversion method, device, equipment and medium for multi-viewing angle image
CN115965961B (en) Local-global multi-mode fusion method, system, equipment and storage medium
CN115937449A (en) High-precision map generation method and device, electronic equipment and storage medium
CN116664498A (en) Training method of parking space detection model, parking space detection method, device and equipment
CN114724116B (en) Vehicle traffic information generation method, device, equipment and computer readable medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22926880

Country of ref document: EP

Kind code of ref document: A1