WO2023165220A1 - Method and apparatus for detecting a target object - Google Patents

Method and apparatus for detecting a target object Download PDF

Info

Publication number
WO2023165220A1
Authority
WO
WIPO (PCT)
Prior art keywords
point cloud
point
position information
local feature
feature
Prior art date
Application number
PCT/CN2022/139875
Other languages
English (en)
French (fr)
Inventor
王丹
刘浩
徐卓然
许新玉
Original Assignee
京东鲲鹏(江苏)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东鲲鹏(江苏)科技有限公司 filed Critical 京东鲲鹏(江苏)科技有限公司
Publication of WO2023165220A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Definitions

  • the present disclosure relates to the field of computer technology, in particular to a method and device for detecting a target object.
  • 3D target object detection is a very important technology in the field of autonomous driving: obstacles that hinder driving are detected and identified, and reasonable avoidance plans are made for different obstacles according to the detection results to ensure driving safety.
  • at present, the most common target detection scheme in autonomous driving is BEV (Bird's-eye View) detection of the lidar point cloud (hereinafter referred to as the point cloud), in which the 3D point cloud is compressed into image data from a bird's-eye-view perspective and then fed into a 2D target detection algorithm.
  • embodiments of the present disclosure provide a method and device for detecting a target object.
  • a method for detecting a target object is provided, including: for each point cloud point of the original point cloud data, performing a first coordinate transformation on the point cloud point and performing first feature extraction to obtain a first local feature, the first local feature including spatial position information; performing a second coordinate transformation on the point cloud point and performing second feature extraction to obtain a second local feature, the second local feature including height position information; fusing the first local feature and the second local feature to obtain a target local feature of the point cloud point; using a neural network to perform multi-layer perception learning on the target local feature to obtain a global feature of the point cloud point; and inputting the global feature into a target detection model to obtain a detection result of the target object.
  • performing the first coordinate transformation on the point cloud point and performing first feature extraction to obtain the first local feature includes: establishing a first transformed coordinate system, and dividing the point cloud space into voxel grids under the first transformed coordinate system; calculating first position information of the point cloud point in the first transformed coordinate system according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the first position information, and calculating the deviation between the first position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain first deviation information; and concatenating the first position information and the first deviation information to obtain the first local feature of the point cloud point.
  • the first local feature further includes an intensity feature of the point cloud point and a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
  • performing the second coordinate transformation on the point cloud point and performing second feature extraction to obtain the second local feature includes: establishing a second transformed coordinate system, and dividing the point cloud space into voxel grids under the second transformed coordinate system, the voxel grids being parallel to the ground; calculating second position information of the point cloud point under the second transformed coordinates according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the second position information, and calculating the deviation between the second position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain second deviation information; and concatenating the second position information and the second deviation information to obtain the second local feature.
  • the second local feature further includes a quantity feature of the point cloud points included in the voxel grid to which the point cloud points belong.
  • the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in that voxel grid.
  • the first coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into first position information in a target Cartesian coordinate system, and the second coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into second position information in a cylindrical coordinate system.
  • a target object detection apparatus is provided, including: a first feature extraction module configured to, for each point cloud point of the original point cloud data, perform a first coordinate transformation on the point cloud point and perform first feature extraction to obtain a first local feature, the first local feature including spatial position information; a second feature extraction module configured to perform a second coordinate transformation on the point cloud point and perform second feature extraction to obtain a second local feature, the second local feature including height position information; a target local feature acquisition module configured to fuse the first local feature and the second local feature to obtain a target local feature of the point cloud point; a global feature acquisition module configured to use a neural network to perform multi-layer perception learning on the target local feature to obtain a global feature of the point cloud point; and a detection module configured to input the global feature into a target detection model to obtain a detection result of the target object.
  • an electronic device for detecting a target object is provided, including:
  • one or more processors; and a storage device for storing one or more programs,
  • which, when executed by the one or more processors, cause the one or more processors to implement the method provided by the first aspect of the embodiments of the present disclosure.
  • a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, the method provided in the first aspect of the embodiments of the present disclosure is implemented.
  • An embodiment of the disclosure has the following advantages or beneficial effects: for each point cloud point of the original point cloud data, a first coordinate transformation is performed and first feature extraction yields a first local feature including spatial position information; a second coordinate transformation is performed and second feature extraction yields a second local feature including height position information; the first local feature and the second local feature are fused to obtain the target local feature of the point cloud point; a neural network performs multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point; and the global feature is input into the target detection model to obtain the detection result of the target object.
  • this technical solution realizes that, through two coordinate transformations of the point cloud points, the extracted target local features include both spatial position information and height position information; the target local features are then learned based on the neural network to obtain global features for detecting the target object. This solves the problem of low target object detection accuracy caused by the loss of point cloud height information during feature extraction in the prior art, thereby improving the accuracy of target object detection and better identifying the target object.
  • FIG. 1 is a schematic diagram of the main flow of a method for detecting a target object according to an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of the principle of an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of main modules of a detection device for a target object according to an embodiment of the present disclosure
  • FIG. 4 is an exemplary system architecture diagram to which embodiments of the present disclosure can be applied;
  • Fig. 5 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present disclosure.
  • in the current mainstream point cloud feature extraction algorithms, the point cloud usually needs to be voxelized from a top-down perspective, which causes a large loss of Z-axis height information; since, for different objects, features at different heights strongly guide the recognition task for the object, Z-axis height information is of great significance for improving target object detection performance.
  • the present disclosure proposes a method for detecting a target object: through two coordinate transformations of the point cloud points, the extracted target local features include both spatial position information and height position information; the target local features are then learned based on a neural network to obtain global features for target object detection. This solves the problem in the prior art of low target object detection accuracy caused by the loss of point cloud height information during feature extraction, thereby improving the accuracy of target object detection and better identifying target objects.
  • Point cloud: in reverse engineering, the set of point data on the product's outer surface obtained by measuring instruments is also called a point cloud;
  • BEV (Bird's-eye View): according to the principles of perspective, a three-dimensional view drawn from a high viewpoint looking down on the undulating ground;
  • Cylindrical view: a view that retains object height information;
  • Voxelization: converting the geometric representation of an object into the voxel representation closest to that object, producing a volume data set that contains not only the surface information of the model but can also describe its internal properties;
  • MLP (multi-layer perceptron): a feed-forward artificial neural network model that maps multiple input data sets onto a single output data set.
  • FIG. 1 is a schematic diagram of the main flow of a method for detecting a target object according to an embodiment of the present disclosure. As shown in FIG. 1, the method for detecting a target object according to an embodiment of the present disclosure includes the following steps S101 to S105.
  • Step S101 for each point cloud point of the original point cloud data, perform a first coordinate transformation on the point cloud point, and perform first feature extraction to obtain a first local feature, and the first local feature includes spatial position information.
  • the first coordinate transformation is to transform the position information of the point cloud points in the original Cartesian coordinate system into the first position information in the target Cartesian coordinate system.
  • the type and location information of obstacles are determined through lidar point cloud detection, so as to make reasonable avoidance plans for different obstacles according to the detection results to ensure the safety of vehicle driving.
  • the point cloud can reflect the shape and attitude information of the target object, but lacks texture information. Therefore, in order to realize the detection of the 3D target object, it is necessary to extract features from the point cloud data.
  • since the original point cloud data in the BEV Cartesian coordinate system takes the center point of the point cloud distribution as the origin, in order to facilitate target object detection, the original Cartesian coordinates of the point cloud points need to be transformed into the target Cartesian coordinate system, in which the position information of the point cloud points is a positive value.
  • performing the first coordinate transformation on the point cloud point and performing first feature extraction to obtain the first local feature includes: establishing a first transformed coordinate system, and dividing the point cloud space into voxel grids under the first transformed coordinate system; calculating first position information of the point cloud point in the first transformed coordinate system according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the first position information, and calculating the deviation between the first position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain first deviation information; and concatenating the first position information and the first deviation information to obtain the first local feature of the point cloud point.
  • the first local feature further includes an intensity feature of the point cloud point and a quantity feature of the point cloud point included in the voxel grid to which the point cloud point belongs.
  • based on the BEV Cartesian coordinate system of the original point cloud points, a target Cartesian coordinate system in which the position information of the point cloud points is positive is established; for example, for point cloud points scanned by the lidar, according to the distribution characteristics of the point cloud points, the origin of the original Cartesian coordinates is moved to the lower left, and the coordinate system in which the position information of the point cloud points is positive is the target Cartesian coordinate system.
  • based on the target Cartesian coordinate system, the point cloud space is divided into voxel grids.
  • for the distribution of the point cloud in a voxel grid, the center point of the point cloud distribution is found, and the deviation of the position information of each point in the voxel grid from the center point is calculated, which is the first deviation information; the first position information and the first deviation information are concatenated, and since the reflection intensity and the number of point cloud points in the grid are also feature information of the point cloud point, the above position information, deviation information, reflection intensity information, and quantity information are combined to obtain the first local feature of the point cloud point.
  • for example, let the information of an original point cloud point be $(x, y, z, r)$; the first position information of the current point in the target Cartesian coordinates and the intensity feature form the vector $(x_1, y_1, z, r)$; the first deviation information within voxel grid $U$ is $p(x'_u, y'_u, z'_u)$; and the number of point cloud points in the voxel grid is $n_u$. Combining the above information, the first local feature $P_{bev}$ of the current point is:
  • $P_{bev} = (x_1, y_1, z, r, x'_u, y'_u, z'_u, n_u)$, where $U \in C_{bev}$.
  • the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in that voxel grid.
  • for example, if the position information of each point cloud point in the voxel grid is $(x_i, y_i, z_i)$ and there are $N$ point cloud points in total, the center point of the voxel grid is the arithmetic mean of the point position information: $(\bar{x}, \bar{y}, \bar{z}) = \frac{1}{N}\sum_{i=1}^{N}(x_i, y_i, z_i)$.
  • Step S102 performing a second coordinate transformation on the point cloud points, and performing second feature extraction to obtain a second local feature, where the second local feature includes height position information.
  • the second coordinate transformation is to transform the position information of the point cloud point in the original Cartesian coordinate system into the second position information in the cylindrical coordinate system.
  • the position information of the point cloud point in the original Cartesian coordinate system undergoes the second coordinate transformation into the cylindrical coordinate system, yielding the second position information of the original point cloud point under the second coordinate transformation; feature extraction is performed based on the second position information to obtain the second local feature. Since the cylindrical coordinates under the cylindrical perspective retain the height information of the target object, the feature information of the target object can be enriched to improve the accuracy of target object detection.
  • performing the second coordinate transformation on the point cloud point and performing second feature extraction to obtain the second local feature includes: establishing a second transformed coordinate system, and dividing the point cloud space into voxel grids under the second transformed coordinate system, the voxel grids being parallel to the ground; calculating second position information of the point cloud point under the second transformed coordinates according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the second position information, and calculating the deviation between the second position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain second deviation information; and concatenating the second position information and the second deviation information to obtain the second local feature.
  • the second local feature further includes a quantity feature of the point cloud points included in the voxel grid to which the point cloud points belong.
  • a cylindrical coordinate system is established with the radar as the axis, and voxels are projected outward so that multiple voxel grids parallel to the ground are formed around it, realizing the voxel grid division of the point cloud space; according to the position information of the original point cloud point, the second position information in the cylindrical coordinate system is calculated from the original Cartesian coordinate system. For example, if $p_i(x_i, y_i, z_i)$ is the position information of an original point cloud point in the original Cartesian coordinate system, the corresponding coordinates $(\rho_i, \theta_i, z_i)$ of $p_i$ in the cylindrical coordinate system are: $\rho_i = \sqrt{x_i^2 + y_i^2}$, $\theta_i = \arctan(y_i / x_i)$, with the height coordinate $z_i$ unchanged.
  • according to the second position information, combined with the voxel grid division, the voxel grid to which the point cloud point belongs is determined, and the number of point cloud points in that voxel grid is obtained; for the distribution of the point cloud in the voxel grid, the center point of the point cloud distribution is found, and the deviation of the position information of each point in the voxel grid from the center point is calculated, which is the second deviation information. The second position information and the second deviation information are concatenated; since the number of point cloud points in the grid is also feature information of the point cloud point, the above position information, deviation information, and quantity information are combined to obtain the second local feature of the point cloud point.
  • for example, let the information of an original point cloud point be $(x_i, y_i, z_i)$; the second position information of the current point in cylindrical coordinates is $(\rho_i, \theta_i, z_i)$; the second deviation information within voxel grid $U$ is $(\rho'_u, \theta'_u, z'_u)$; and the number of point cloud points in the voxel grid is $n_{u\_cyl}$. Combining the above information, the second local feature $P_{cyl}$ of the current point is:
  • $P_{cyl} = (\rho_i, \theta_i, z_i, \rho'_u, \theta'_u, z'_u, n_{u\_cyl})$.
  • the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in that voxel grid.
  • the center point here is determined in the same way as the center point in the first coordinate transformation, and the description is not repeated.
  • through the above second transformation from the original Cartesian coordinate system to the cylindrical coordinate system of the cylindrical perspective and the voxelized feature extraction of the point cloud points, the height information of the object is retained in the extracted features; at the same time, this viewing angle conforms to the imaging principle of the radar and can more accurately characterize the features of radar imaging.
  • Step S103 merging the first local feature and the second local feature to obtain the target local feature of the point cloud point.
  • the first transformation in the BEV Cartesian coordinate system and the second transformation in the cylindrical coordinate system guarantee, respectively, feature extraction of the spatial position information based on the first transformation and feature extraction of the height position information based on the second transformation.
  • the feature values of the two coordinate systems are fused to realize the feature complementarity of the two perspectives, so that the obtained target local feature of the point cloud point includes both spatial position information and height position information.
  • for example, the target local feature $P_f$ of the point cloud point is the combination of the two local features: $P_f = (P_{bev}, P_{cyl})$.
  • Step S104 using a neural network to perform multi-layer perceptual learning on the local features of the target to obtain the global features of the point cloud points.
  • the target local features of the above-mentioned point cloud points are used as the input of an MLP (multi-layer perceptron), and the neural network method is used to perform multi-layer perception learning on the target local features to obtain the global features of the point cloud points for subsequent target object detection.
  • Step S105 input the global feature into the target detection model to obtain the detection result of the target object.
  • the global feature is input into the target detection model, and the category and position information of the target object is obtained through the target detection algorithm, so as to make a reasonable avoidance plan based on the obstacle information.
  • Fig. 2 is a schematic diagram of the principle of an embodiment of the present disclosure.
  • the original point cloud data undergoes spatial position feature extraction of the point cloud points through bird's-eye-view voxelization and height position feature extraction of the point cloud points through cylindrical-view voxelization, yielding the first local feature in the Cartesian coordinate system of the bird's-eye view and the second local feature in the cylindrical coordinate system of the cylindrical view; the point-level features of the bird's-eye view and the point-level features of the cylindrical view are fused to obtain the target local feature information; finally, the complete feature information of the point cloud point is obtained through multi-layer perceptron learning, and the detection result of the target object is then obtained through the detection model.
  • Fig. 3 is a schematic diagram of main modules of a detection device for a target object according to an embodiment of the present disclosure.
  • the object detection device 300 mainly includes a first feature extraction module 301 , a second feature extraction module 302 , a target local feature acquisition module 303 , a global feature acquisition module 304 and a detection module 305 .
  • the first feature extraction module 301 is configured to perform first coordinate transformation on each point cloud point of the original point cloud data, and perform first feature extraction to obtain a first local feature, the first local Features include spatial location information;
  • the second feature extraction module 302 is configured to perform a second coordinate transformation on the point cloud points, and perform second feature extraction to obtain a second local feature, and the second local feature includes height position information;
  • a target local feature acquisition module 303 configured to fuse the first local feature and the second local feature to obtain the target local feature of the point cloud point;
  • a global feature acquisition module 304 configured to use a neural network to perform multi-layer perceptual learning on the local features of the target to obtain global features of the point cloud points;
  • the detection module 305 is configured to input the global features into the target detection model to obtain a detection result of the target object.
  • the first feature extraction module 301 can also be used to: establish a first transformed coordinate system, and divide the point cloud space into voxel grids under the first transformed coordinate system; calculate first position information of the point cloud point in the first transformed coordinate system according to the position information of the point cloud point; determine the voxel grid to which the point cloud point belongs according to the first position information, and calculate the deviation between the first position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain first deviation information; and concatenate the first position information and the first deviation information to obtain the first local feature of the point cloud point.
  • the first local feature further includes an intensity feature of the point cloud point and a quantity feature of the point cloud point included in the voxel grid to which the point cloud point belongs.
  • the second feature extraction module 302 can also be used to: establish a second transformed coordinate system, and divide the point cloud space into voxel grids under the second transformed coordinate system, the voxel grids being parallel to the ground; calculate second position information of the point cloud point under the second transformed coordinates according to the position information of the point cloud point; determine the voxel grid to which the point cloud point belongs according to the second position information, and calculate the deviation between the second position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain second deviation information; and concatenate the second position information and the second deviation information to obtain the second local feature.
  • the second local feature further includes a quantity feature of the point cloud points included in the voxel grid to which the point cloud points belong.
  • the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in the voxel grid to which the point cloud point belongs.
  • the first coordinate transformation transforms the position information of the point cloud points in the original Cartesian coordinate system into first position information in the target Cartesian coordinate system, and the second coordinate transformation transforms the position information of the point cloud points in the original Cartesian coordinate system into second position information in the cylindrical coordinate system.
  • FIG. 4 is an exemplary system architecture diagram in which embodiments of the present disclosure can be applied.
  • the system architecture 400 may include terminal devices 401 , 402 , 403 , a network 404 and a server 405 .
  • the network 404 is used as a medium for providing communication links between the terminal devices 401 , 402 , 403 and the server 405 .
  • Network 404 may include various connection types, such as wires, wireless communication links, or fiber optic cables, among others.
  • users can use the terminal devices 401, 402, 403 to interact with the server 405 via the network 404 to receive or send messages and the like.
  • Various communication client applications can be installed on the terminal devices 401, 402, 403, such as applications for detecting objects, applications for identifying objects, etc. (just examples).
  • the terminal devices 401, 402, 403 may be various electronic devices with display screens and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers and the like.
  • the server 405 may be a server that provides various services, such as a background management server that provides support for detection of target objects performed by users using the terminal devices 401 , 402 , 403 (just an example).
  • the background management server can, for each point cloud point of the original point cloud data, perform the first coordinate transformation on the point cloud point and perform the first feature extraction to obtain the first local feature, the first local feature including spatial position information; perform the second coordinate transformation on the point cloud point and perform the second feature extraction to obtain the second local feature, the second local feature including height position information; fuse the first local feature and the second local feature to obtain the target local feature of the point cloud point; use the neural network to perform multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point; input the global feature into the target detection model to obtain the detection result of the target object; and feed the processing result (for example, the detection result — just an example) back to the terminal device.
  • the target object detection method provided by the embodiment of the present disclosure is generally executed by the server 405 , and correspondingly, the target object detection device is generally set in the server 405 .
  • terminal devices, networks and servers in Fig. 4 are only illustrative. According to the implementation needs, there can be any number of terminal devices, networks and servers.
  • FIG. 5 it shows a schematic structural diagram of a computer system 500 suitable for implementing a terminal device or a server according to an embodiment of the present disclosure.
  • the terminal device or server shown in FIG. 5 is only an example, and should not limit the functions and application scope of the embodiments of the present disclosure.
  • a computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random-access memory (RAM) 503.
  • in the RAM 503, various programs and data required for the operation of the system 500 are also stored.
  • the CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504.
  • An input/output (I/O) interface 505 is also connected to the bus 504 .
  • the following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, etc.; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker; a storage section 508 including a hard disk, etc.; and a communication section 509 including a network interface card such as a LAN card or a modem.
  • the communication section 509 performs communication processing via a network such as the Internet.
  • a drive 510 is also connected to the I/O interface 505 as needed.
  • a removable medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 510 as necessary so that a computer program read therefrom is installed into the storage section 508 as necessary.
  • embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, where the computer program includes program codes for executing the methods shown in the flowcharts.
  • the computer program may be downloaded and installed from a network via communication portion 509 and/or installed from removable media 511 .
  • when this computer program is executed by the central processing unit (CPU) 501, the above-described functions defined in the system of the present disclosure are performed.
  • the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the described.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the described.
  • each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block in the block diagrams or flowchart illustrations, and combinations of blocks in the block diagrams or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • a processor includes: a first feature extraction module, a second feature extraction module, a target local feature acquisition module, a global feature acquisition module and a detection module.
  • the names of these modules do not constitute a limitation of the module itself in some cases.
  • the detection module can also be described as "a module for inputting the global features into the target detection model to obtain the detection result of the target object".
  • the present disclosure also provides a computer-readable medium, which may be included in the device described in the embodiments, or may exist independently without being assembled into the device.
  • the computer-readable medium carries one or more programs, and when the one or more programs are executed by the device, the device includes: for each point cloud point of the original point cloud data, converting the point Carrying out the first coordinate transformation of the point cloud, and performing the first feature extraction to obtain the first local feature, the first local feature includes spatial position information; performing the second coordinate transformation on the point cloud point, and performing the second feature extraction to obtain The second local feature, the second local feature includes height position information; the first local feature and the second local feature are fused to obtain the target local feature of the point cloud point; the neural network is used for the Multi-layer perceptual learning is performed on the local features of the target to obtain the global features of the point cloud points; the global features are input into the target detection model to obtain the detection results of the target object.
  • the technical solution of the embodiments of the present disclosure has the following advantages or beneficial effects: for each point cloud point of the original point cloud data, a first coordinate transformation is performed and first feature extraction yields a first local feature including spatial position information; a second coordinate transformation is performed and second feature extraction yields a second local feature including height position information; the first local feature and the second local feature are fused to obtain the target local feature of the point cloud point; a neural network performs multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point; and the global feature is input into the target detection model to obtain the detection result of the target object.
  • this technical scheme realizes that, through two coordinate transformations of the point cloud points, the extracted target local features include both spatial position information and height position information; the target local features are then learned based on the neural network to obtain global features for detecting the target object. This solves the problem of low target object detection accuracy caused by the loss of point cloud height information during feature extraction in the prior art, thereby improving the accuracy of target object detection and better identifying the target object.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

A method and apparatus for detecting a target object, relating to the field of computer technology. The method includes: for each point cloud point of the original point cloud data, performing a first coordinate transformation on the point cloud point and performing first feature extraction to obtain a first local feature, the first local feature including spatial position information (S101); performing a second coordinate transformation on the point cloud point and performing second feature extraction to obtain a second local feature, the second local feature including height position information (S102); fusing the first local feature and the second local feature to obtain a target local feature of the point cloud point (S103); using a neural network to perform multi-layer perception learning on the target local feature to obtain a global feature of the point cloud point (S104); and inputting the global feature into a target detection model to obtain a detection result of the target object (S105). Through the two coordinate transformations of the point cloud points, the accuracy of target object detection is improved and target objects are better identified.

Description

Method and apparatus for detecting a target object
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202210218821.7, entitled "Method and apparatus for detecting a target object" and filed on March 4, 2022, the disclosure of which is incorporated herein by reference in its entirety as a part or the whole of this application.
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a method and apparatus for detecting a target object.
Background
3D target object detection is a very important technology in the field of autonomous driving: obstacles that hinder driving are detected and identified, and reasonable avoidance plans are made for different obstacles according to the detection results to ensure driving safety. At present, the most common target object detection scheme in autonomous driving is BEV (Bird's-eye View) detection of the lidar point cloud (hereinafter referred to as the point cloud), in which the three-dimensional point cloud is compressed into image data from a bird's-eye-view perspective and then fed into a 2D target detection algorithm for detection.
In the course of implementing the present disclosure, the inventors found the following problems in the prior art:
In the current mainstream point cloud feature extraction algorithms, the point cloud usually needs to be voxelized from a top-down perspective, which causes a large loss of Z-axis height information. For different objects, however, features at different heights strongly guide the recognition task for the object. Therefore, the loss of Z-axis height information during feature extraction leads to incomplete features and has a great impact on the accuracy of target object detection.
Summary
In view of this, embodiments of the present disclosure provide a method and apparatus for detecting a target object.
According to one aspect of the embodiments of the present disclosure, a method for detecting a target object is provided, including: for each point cloud point of the original point cloud data, performing a first coordinate transformation on the point cloud point and performing first feature extraction to obtain a first local feature, the first local feature including spatial position information; performing a second coordinate transformation on the point cloud point and performing second feature extraction to obtain a second local feature, the second local feature including height position information; fusing the first local feature and the second local feature to obtain a target local feature of the point cloud point; using a neural network to perform multi-layer perception learning on the target local feature to obtain a global feature of the point cloud point; and inputting the global feature into a target detection model to obtain a detection result of the target object.
According to one or more embodiments of the present disclosure, performing the first coordinate transformation on the point cloud point and performing first feature extraction to obtain the first local feature includes: establishing a first transformed coordinate system, and dividing the point cloud space into voxel grids under the first transformed coordinate system; calculating first position information of the point cloud point in the first transformed coordinate system according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the first position information, and calculating the deviation between the first position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain first deviation information; and concatenating the first position information and the first deviation information to obtain the first local feature of the point cloud point.
According to one or more embodiments of the present disclosure, the first local feature further includes an intensity feature of the point cloud point and a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
According to one or more embodiments of the present disclosure, performing the second coordinate transformation on the point cloud point and performing second feature extraction to obtain the second local feature includes: establishing a second transformed coordinate system, and dividing the point cloud space into voxel grids under the second transformed coordinate system, the voxel grids being parallel to the ground; calculating second position information of the point cloud point under the second transformed coordinates according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the second position information, and calculating the deviation between the second position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain second deviation information; and concatenating the second position information and the second deviation information to obtain the second local feature.
According to one or more embodiments of the present disclosure, the second local feature further includes a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
According to one or more embodiments of the present disclosure, the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in that voxel grid.
According to one or more embodiments of the present disclosure, the first coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into first position information in a target Cartesian coordinate system, and the second coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into second position information in a cylindrical coordinate system.
According to a second aspect of the embodiments of the present disclosure, an apparatus for detecting a target object is provided, including: a first feature extraction module configured to, for each point cloud point of the original point cloud data, perform a first coordinate transformation on the point cloud point and perform first feature extraction to obtain a first local feature, the first local feature including spatial position information; a second feature extraction module configured to perform a second coordinate transformation on the point cloud point and perform second feature extraction to obtain a second local feature, the second local feature including height position information; a target local feature acquisition module configured to fuse the first local feature and the second local feature to obtain a target local feature of the point cloud point; a global feature acquisition module configured to use a neural network to perform multi-layer perception learning on the target local feature to obtain a global feature of the point cloud point; and a detection module configured to input the global feature into a target detection model to obtain a detection result of the target object.
According to a third aspect of the embodiments of the present disclosure, an electronic device for detecting a target object is provided, including:
one or more processors;
a storage device for storing one or more programs,
wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method provided by the first aspect of the embodiments of the present disclosure.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable medium is provided, on which a computer program is stored; when the program is executed by a processor, the method provided by the first aspect of the embodiments of the present disclosure is implemented.
One embodiment of the present disclosure has the following advantages or beneficial effects: for each point cloud point of the original point cloud data, a first coordinate transformation is performed on the point cloud point and first feature extraction yields a first local feature including spatial position information; a second coordinate transformation is performed on the point cloud point and second feature extraction yields a second local feature including height position information; the first local feature and the second local feature are fused to obtain the target local feature of the point cloud point; a neural network performs multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point; and the global feature is input into the target detection model to obtain the detection result of the target object. This technical solution realizes that, through two coordinate transformations of the point cloud points, the extracted target local features include both spatial position information and height position information; the target local features are then learned based on the neural network to obtain global features for target object detection. This solves the problem in the prior art of low target object detection accuracy caused by the loss of point cloud height information during feature extraction, thereby improving the accuracy of target object detection and better identifying target objects.
Brief Description of the Drawings
The drawings are provided for a better understanding of the present disclosure and do not constitute an improper limitation thereof, in which:
Fig. 1 is a schematic diagram of the main flow of a method for detecting a target object according to an embodiment of the present disclosure;
Fig. 2 is a schematic diagram of the principle of an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of the main modules of an apparatus for detecting a target object according to an embodiment of the present disclosure;
Fig. 4 is a diagram of an exemplary system architecture to which embodiments of the present disclosure can be applied;
Fig. 5 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the drawings, including various details of the embodiments to aid understanding; they should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
In the current mainstream point cloud feature extraction algorithms, the point cloud usually needs to be voxelized from a top-down perspective, which causes a large loss of Z-axis height information. For different objects, features at different heights strongly guide the recognition task for the object, so Z-axis height information is of great significance for improving target object detection performance.
To solve the above problems in the prior art, the present disclosure proposes a method for detecting a target object. Through two coordinate transformations of the point cloud points, the extracted target local features include both spatial position information and height position information; the target local features are then learned based on a neural network to obtain global features for target object detection. This solves the problem in the prior art of low target object detection accuracy caused by the loss of point cloud height information during feature extraction, thereby improving the accuracy of target object detection and better identifying target objects.
In the description of the embodiments of the present disclosure, the terms involved have the following meanings:
Point cloud: in reverse engineering, the set of point data on the product's outer surface obtained by measuring instruments is also called a point cloud;
BEV (Bird's-eye View): according to the principles of perspective, a three-dimensional view drawn from a high viewpoint looking down on the undulating ground;
Cylindrical view: a view that retains object height information;
Voxelization: converting the geometric representation of an object into the voxel representation closest to that object, producing a volume data set that contains not only the surface information of the model but can also describe its internal properties;
MLP (multi-layer perceptron): a feed-forward artificial neural network model that maps multiple input data sets onto a single output data set.
Fig. 1 is a schematic diagram of the main flow of a method for detecting a target object according to an embodiment of the present disclosure. As shown in Fig. 1, the method for detecting a target object according to an embodiment of the present disclosure includes the following steps S101 to S105.
Step S101: for each point cloud point of the original point cloud data, perform a first coordinate transformation on the point cloud point, and perform first feature extraction to obtain a first local feature, the first local feature including spatial position information.
According to an embodiment of the present disclosure, the first coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into first position information in a target Cartesian coordinate system.
Specifically, in the field of autonomous driving, the category and position information of obstacles are determined through lidar point cloud detection, so that reasonable avoidance plans can be made for different obstacles according to the detection results to ensure driving safety. A point cloud can reflect the shape and attitude information of a target object but lacks texture information; therefore, to detect 3D target objects, features need to be extracted from the point cloud data. Since the original point cloud data in the BEV (Bird's-eye View) Cartesian coordinate system takes the center point of the point cloud distribution as the origin, to facilitate detection of the target object, the original Cartesian coordinates of the point cloud points need to be transformed into a target Cartesian coordinate system, so that the position information of the point cloud points is positive in the target Cartesian coordinate system.
According to another embodiment of the present disclosure, performing the first coordinate transformation on the point cloud point and performing first feature extraction to obtain the first local feature includes: establishing a first transformed coordinate system, and dividing the point cloud space into voxel grids under the first transformed coordinate system; calculating first position information of the point cloud point in the first transformed coordinate system according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the first position information, and calculating the deviation between the first position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain first deviation information; and concatenating the first position information and the first deviation information to obtain the first local feature of the point cloud point.
According to still another embodiment of the present disclosure, the first local feature further includes an intensity feature of the point cloud point and a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
Specifically, based on the BEV Cartesian coordinate system of the original point cloud points, a target Cartesian coordinate system in which the position information of the point cloud points is positive is established. For example, for point cloud points scanned by the lidar, according to the distribution characteristics of the point cloud points, the origin of the original Cartesian coordinates is moved to the lower left, and the coordinate system in which the position information of the point cloud points is positive is the target Cartesian coordinate system. Based on the target Cartesian coordinate system, the point cloud space is divided into voxel grids. For example, a voxel grid is of size $H \times W \times 1$, where $H$ and $W$ are set according to the requirements of target object detection; the division can be written as $C_{bev} = \{U_i\}^{H \times W \times 1}$, where $U$ is one voxel grid of $C_{bev}$. Since the target Cartesian coordinate system is obtained by translating the origin of the original Cartesian coordinate system, the first position information is obtained by translating the position information of the original point cloud point. For example, if the information of a point in the original point cloud is $(x, y, z, r)$, where $(x, y, z)$ is the position information and $r$ is the reflection intensity of the current point, then after the first transformation the first position information of the current point is $(x_1, y_1, z)$, where $x_1$ and $y_1$ are the position information after the leftward and downward translation. According to the position information of the point cloud point transformed into the target Cartesian coordinate system, combined with the above voxel grid division, the voxel grid in which the point cloud point lies is determined, and the number of point cloud points in that voxel grid is obtained. For the distribution of the point cloud in the voxel grid, the center point of the point cloud distribution is found, and the deviation of the position information of each point in the voxel grid from the center point is calculated, which is the first deviation information. The first position information and the first deviation information are concatenated; since the reflection intensity of the current point and the number of point cloud points in the grid are also feature information of the point cloud point, the above position information, deviation information, reflection intensity information, and quantity information are combined to obtain the first local feature of the point cloud point.
For example, let the information of an original point cloud point be $(x, y, z, r)$; the first position information of the current point in the target Cartesian coordinates and the intensity feature form the vector $(x_1, y_1, z, r)$; the first deviation information within voxel grid $U$ is $p(x'_u, y'_u, z'_u)$; and the number of point cloud points in the voxel grid is $n_u$. Combining the above information, the first local feature $P_{bev}$ of the current point is:
$P_{bev} = (x_1, y_1, z, r, x'_u, y'_u, z'_u, n_u)$, where $U \in C_{bev}$.
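To make this first (BEV) feature extraction concrete, the following is a minimal NumPy sketch; the grid resolution, the translation offset, and all function and variable names are illustrative assumptions and are not specified by the disclosure.

```python
import numpy as np

def bev_local_features(points, voxel_size=(0.2, 0.2), offset=(50.0, 50.0)):
    """Sketch of the first feature extraction: translate points so that x and y
    are positive, assign them to an H x W x 1 BEV voxel grid, and concatenate
    position, per-voxel centroid deviation, intensity, and point count."""
    xyz, r = points[:, :3], points[:, 3]           # (x, y, z) and reflection intensity
    xyz1 = xyz.copy()
    xyz1[:, :2] += offset                          # first transformation: shift the origin so x1, y1 > 0

    # Voxel index of each point in the BEV grid (a single cell along z).
    ij = np.floor(xyz1[:, :2] / np.asarray(voxel_size)).astype(np.int64)
    _, inv, counts = np.unique(ij, axis=0, return_inverse=True, return_counts=True)

    # Per-voxel center point = arithmetic mean of the member point positions.
    centroids = np.zeros((counts.size, 3))
    np.add.at(centroids, inv, xyz1)
    centroids /= counts[:, None]

    deviation = xyz1 - centroids[inv]              # first deviation information
    n_u = counts[inv].astype(np.float64)[:, None]  # number of points in the voxel

    # P_bev = (x1, y1, z, r, x'_u, y'_u, z'_u, n_u): 8 channels per point.
    return np.concatenate([xyz1, r[:, None], deviation, n_u], axis=1)

# Usage sketch: features = bev_local_features(np.random.rand(1000, 4) * 100.0)
```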
According to yet another embodiment of the present disclosure, the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in that voxel grid.
For example, let the position information of each point cloud point in the voxel grid be $(x_i, y_i, z_i)$, with $N$ point cloud points in total; then the center point of the voxel grid is the arithmetic mean of the point position information:
$(\bar{x}, \bar{y}, \bar{z}) = \frac{1}{N} \sum_{i=1}^{N} (x_i, y_i, z_i)$
Through the first coordinate transformation in the Cartesian coordinate system from the BEV perspective and the voxelized feature extraction of the point cloud points, it is guaranteed that the position information of the object in space is retained in the extracted feature information.
Step S102: perform a second coordinate transformation on the point cloud point, and perform second feature extraction to obtain a second local feature, the second local feature including height position information.
According to an embodiment of the present disclosure, the second coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into second position information in a cylindrical coordinate system.
Specifically, based on the BEV Cartesian coordinate system of the original point cloud points, the position information of the point cloud point in the original Cartesian coordinate system undergoes the second coordinate transformation into the cylindrical coordinate system, yielding the second position information of the original point cloud point under the second coordinate transformation; feature extraction is then performed based on the second position information to obtain the second local feature. Since cylindrical coordinates under the cylindrical perspective retain the height information of the target object, the feature information of the target object can be enriched to improve the accuracy of target object detection.
According to another embodiment of the present disclosure, performing the second coordinate transformation on the point cloud point and performing second feature extraction to obtain the second local feature includes: establishing a second transformed coordinate system, and dividing the point cloud space into voxel grids under the second transformed coordinate system, the voxel grids being parallel to the ground; calculating second position information of the point cloud point under the second transformed coordinates according to the position information of the point cloud point; determining the voxel grid to which the point cloud point belongs according to the second position information, and calculating the deviation between the second position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain second deviation information; and concatenating the second position information and the second deviation information to obtain the second local feature.
According to still another embodiment of the present disclosure, the second local feature further includes a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
Specifically, a cylindrical coordinate system is established with the radar as the axis, and voxels are projected outward so that multiple voxel grids parallel to the ground are formed around it, realizing the voxel grid division of the point cloud space. According to the position information of the original point cloud point, the second position information in the cylindrical coordinate system is calculated from the original Cartesian coordinate system. For example, if $p_i(x_i, y_i, z_i)$ is the position information of an original point cloud point in the original Cartesian coordinate system, then the corresponding coordinates $(\rho_i, \theta_i, z_i)$ of $p_i$ in the cylindrical coordinate system are:
$\rho_i = \sqrt{x_i^2 + y_i^2}, \qquad \theta_i = \arctan\left(\frac{y_i}{x_i}\right), \qquad z_i = z_i$
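As a worked instance of this transformation (the numbers are illustrative only): a point cloud point at $(x_i, y_i, z_i) = (3, 4, 2)$ maps to $\rho_i = \sqrt{3^2 + 4^2} = 5$ and $\theta_i = \arctan(4/3) \approx 0.927$ rad, while the height coordinate $z_i = 2$ passes through unchanged, which is exactly why the cylindrical view retains the height information.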
According to the above second position information, combined with the voxel grid division, the voxel grid to which the point cloud point belongs is determined, and the number of point cloud points in that voxel grid is obtained. For the distribution of the point cloud in the voxel grid, the center point of the point cloud distribution is found, and the deviation of the position information of each point in the voxel grid from the center point is calculated, which is the second deviation information. The second position information and the second deviation information are concatenated; since the number of point cloud points in the grid is also feature information of the point cloud point, the above position information, deviation information, and quantity information are combined to obtain the second local feature of the point cloud point.
For example, let the information of an original point cloud point be $(x_i, y_i, z_i)$; the second position information of the current point in cylindrical coordinates is $(\rho_i, \theta_i, z_i)$; the second deviation information within voxel grid $U$ is $(\rho'_u, \theta'_u, z'_u)$; and the number of point cloud points in the voxel grid is $n_{u\_cyl}$. Combining the above information, the second local feature $P_{cyl}$ of the current point is:
$P_{cyl} = (\rho_i, \theta_i, z_i, \rho'_u, \theta'_u, z'_u, n_{u\_cyl})$
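A matching NumPy sketch of this second (cylindrical) feature extraction follows, under the same caveats as before: the bin sizes and all names are assumptions made for the example, since the disclosure does not fix them.

```python
import numpy as np

def cylindrical_local_features(points, rho_bin=0.5, theta_bin=np.deg2rad(2.0), z_bin=0.5):
    """Sketch of the second feature extraction: convert points to cylindrical
    coordinates with the radar at the axis, bin them into voxel grids parallel
    to the ground, and concatenate position, centroid deviation, and count."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.sqrt(x ** 2 + y ** 2)                 # distance from the radar axis
    theta = np.arctan2(y, x)                       # azimuth angle
    cyl = np.stack([rho, theta, z], axis=1)        # second position information

    # Voxel index in (rho, theta, z); z is binned but kept, preserving height.
    idx = np.floor(cyl / np.array([rho_bin, theta_bin, z_bin])).astype(np.int64)
    _, inv, counts = np.unique(idx, axis=0, return_inverse=True, return_counts=True)

    centroids = np.zeros((counts.size, 3))         # per-voxel mean, as in the BEV branch
    np.add.at(centroids, inv, cyl)
    centroids /= counts[:, None]

    deviation = cyl - centroids[inv]               # second deviation information
    n_u_cyl = counts[inv].astype(np.float64)[:, None]

    # P_cyl = (rho, theta, z, rho'_u, theta'_u, z'_u, n_u_cyl): 7 channels per point.
    return np.concatenate([cyl, deviation, n_u_cyl], axis=1)
```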
According to yet another embodiment of the present disclosure, the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in that voxel grid.
Generally, the center point here is determined in the same way as the center point in the first coordinate transformation, and the description is not repeated.
Through the above second transformation from the original Cartesian coordinate system to the cylindrical coordinate system of the cylindrical perspective and the voxelized feature extraction of the point cloud points, it is guaranteed that the height information of the object is retained in the extracted feature information; at the same time, this viewing angle conforms to the imaging principle of the radar and can more accurately characterize the features of radar imaging.
Step S103: fuse the first local feature and the second local feature to obtain the target local feature of the point cloud point.
Specifically, the above first transformation in the BEV Cartesian coordinate system and second transformation in the cylindrical coordinate system of the cylindrical perspective guarantee, respectively, feature extraction of spatial position information based on the first transformation and feature extraction of height position information based on the second transformation. The feature values of the two coordinate systems are fused to realize the feature complementarity of the two perspectives, so that the resulting target local feature of the point cloud point includes both spatial position information and height position information. For example, the target local feature $P_f$ of the point cloud point is the combination of the two local features:
$P_f = (P_{bev}, P_{cyl})$
Step S104: use a neural network to perform multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point.
Specifically, the target local feature of the point cloud point described above is used as the input of an MLP (multi-layer perceptron), and the neural network method is used to perform multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point for subsequent target object detection.
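To illustrate the fusion and multi-layer perception steps together, here is a minimal PyTorch sketch; the layer widths, the plain concatenation used for fusion, and the max-pooling used to aggregate per-point features are choices made for the sketch rather than details fixed by the disclosure.

```python
import torch
import torch.nn as nn

class PointMLP(nn.Module):
    """Per-point MLP: fuses the two local features of each point cloud point by
    concatenation and maps the result to a higher-dimensional feature."""
    def __init__(self, in_dim=15, hidden=(64, 128), out_dim=256):
        super().__init__()
        layers, d = [], in_dim
        for h in (*hidden, out_dim):
            layers += [nn.Linear(d, h), nn.ReLU(inplace=True)]
            d = h
        self.mlp = nn.Sequential(*layers)

    def forward(self, p_bev, p_cyl):
        p_f = torch.cat([p_bev, p_cyl], dim=-1)    # target local feature P_f
        per_point = self.mlp(p_f)                  # multi-layer perception learning
        pooled = per_point.max(dim=0).values       # one aggregation choice over the point set
        return per_point, pooled

# Usage sketch: P_bev has 8 channels and P_cyl has 7, so the fused input has 15.
# model = PointMLP()
# per_point, pooled = model(torch.rand(1000, 8), torch.rand(1000, 7))
```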
Step S105: input the global feature into the target detection model to obtain the detection result of the target object.
Specifically, the global feature is input into the target detection model, and the category and position information of the target object are obtained through the target detection algorithm, making it convenient to formulate a reasonable avoidance plan according to the obstacle information.
Fig. 2 is a schematic diagram of the principle of an embodiment of the present disclosure. In the figure, the original point cloud data undergoes spatial position feature extraction of the point cloud points through bird's-eye-view voxelization and height position feature extraction of the point cloud points through cylindrical-view voxelization, yielding the first local feature in the Cartesian coordinate system of the bird's-eye view and the second local feature in the cylindrical coordinate system of the cylindrical view; the point-level features of the bird's-eye view and the point-level features of the cylindrical view are fused to obtain the target local feature information; finally, the complete feature information of the point cloud points is obtained through multi-layer perceptron learning, and the detection result of the target object is then obtained through the detection model.
Fig. 3 is a schematic diagram of the main modules of an apparatus for detecting a target object according to an embodiment of the present disclosure. As shown in Fig. 3, the apparatus 300 for detecting a target object mainly includes a first feature extraction module 301, a second feature extraction module 302, a target local feature acquisition module 303, a global feature acquisition module 304, and a detection module 305.
The first feature extraction module 301 is configured to, for each point cloud point of the original point cloud data, perform a first coordinate transformation on the point cloud point and perform first feature extraction to obtain a first local feature, the first local feature including spatial position information;
the second feature extraction module 302 is configured to perform a second coordinate transformation on the point cloud point and perform second feature extraction to obtain a second local feature, the second local feature including height position information;
the target local feature acquisition module 303 is configured to fuse the first local feature and the second local feature to obtain the target local feature of the point cloud point;
the global feature acquisition module 304 is configured to use a neural network to perform multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point;
the detection module 305 is configured to input the global feature into the target detection model to obtain the detection result of the target object.
Specifically, the first feature extraction module 301 can also be used to: establish a first transformed coordinate system, and divide the point cloud space into voxel grids under the first transformed coordinate system; calculate first position information of the point cloud point in the first transformed coordinate system according to the position information of the point cloud point; determine the voxel grid to which the point cloud point belongs according to the first position information, and calculate the deviation between the first position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain first deviation information; and concatenate the first position information and the first deviation information to obtain the first local feature of the point cloud point.
Specifically, the first local feature further includes an intensity feature of the point cloud point and a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
Specifically, the second feature extraction module 302 can also be used to: establish a second transformed coordinate system, and divide the point cloud space into voxel grids under the second transformed coordinate system, the voxel grids being parallel to the ground; calculate second position information of the point cloud point under the second transformed coordinates according to the position information of the point cloud point; determine the voxel grid to which the point cloud point belongs according to the second position information, and calculate the deviation between the second position information and the center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain second deviation information; and concatenate the second position information and the second deviation information to obtain the second local feature.
Specifically, the second local feature further includes a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
Specifically, the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to the average value of the position information of all point cloud points in that voxel grid.
Specifically, the first coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into first position information in the target Cartesian coordinate system, and the second coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into second position information in the cylindrical coordinate system.
Fig. 4 is a diagram of an exemplary system architecture to which embodiments of the present disclosure can be applied.
As shown in Fig. 4, the system architecture 400 may include terminal devices 401, 402, and 403, a network 404, and a server 405. The network 404 serves as a medium providing communication links between the terminal devices 401, 402, 403 and the server 405, and may include various connection types, such as wired or wireless communication links, or fiber optic cables.
Users can use the terminal devices 401, 402, 403 to interact with the server 405 via the network 404 to receive or send messages and the like. Various communication client applications can be installed on the terminal devices 401, 402, 403, such as target object detection applications and target object recognition applications (examples only).
The terminal devices 401, 402, 403 may be various electronic devices with a display screen that support web browsing, including but not limited to smartphones, tablet computers, laptop computers, desktop computers, and the like.
The server 405 may be a server providing various services, for example a background management server (example only) that supports the target object detection performed by users with the terminal devices 401, 402, 403. The background management server can, for each point cloud point of the original point cloud data, perform the first coordinate transformation on the point cloud point and perform the first feature extraction to obtain the first local feature, the first local feature including spatial position information; perform the second coordinate transformation on the point cloud point and perform the second feature extraction to obtain the second local feature, the second local feature including height position information; fuse the first local feature and the second local feature to obtain the target local feature of the point cloud point; use the neural network to perform multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point; input the global feature into the target detection model to obtain the detection result of the target object; and feed the processing result (for example, the detection result — example only) back to the terminal device.
It should be noted that the method for detecting a target object provided by the embodiments of the present disclosure is generally executed by the server 405; accordingly, the apparatus for detecting a target object is generally disposed in the server 405.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 4 are merely illustrative; there can be any number of terminal devices, networks, and servers according to implementation needs.
Referring now to Fig. 5, it shows a schematic structural diagram of a computer system 500 suitable for implementing a terminal device or server according to an embodiment of the present disclosure. The terminal device or server shown in Fig. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random-access memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the system 500 are also stored. The CPU 501, ROM 502, and RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, etc.; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), etc., and a speaker; a storage section 508 including a hard disk, etc.; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed so that a computer program read therefrom is installed into the storage section 508 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 509 and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above-described functions defined in the system of the present disclosure are performed.
It should be noted that the computer-readable medium shown in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two. A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including a first feature extraction module, a second feature extraction module, a target local feature acquisition module, a global feature acquisition module, and a detection module.
The names of these modules do not constitute a limitation on the modules themselves in certain cases; for example, the detection module may also be described as "a module for inputting the global feature into the target detection model to obtain the detection result of the target object".
In another aspect, the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist independently without being assembled into that device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: for each point cloud point of the original point cloud data, perform a first coordinate transformation on the point cloud point and perform first feature extraction to obtain a first local feature, the first local feature including spatial position information; perform a second coordinate transformation on the point cloud point and perform second feature extraction to obtain a second local feature, the second local feature including height position information; fuse the first local feature and the second local feature to obtain the target local feature of the point cloud point; use a neural network to perform multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point; and input the global feature into the target detection model to obtain the detection result of the target object.
The technical solution according to the embodiments of the present disclosure has the following advantages or beneficial effects: for each point cloud point of the original point cloud data, a first coordinate transformation is performed on the point cloud point and first feature extraction yields a first local feature including spatial position information; a second coordinate transformation is performed on the point cloud point and second feature extraction yields a second local feature including height position information; the first local feature and the second local feature are fused to obtain the target local feature of the point cloud point; a neural network performs multi-layer perception learning on the target local feature to obtain the global feature of the point cloud point; and the global feature is input into the target detection model to obtain the detection result of the target object. Through the two coordinate transformations of the point cloud points, the extracted target local features include both spatial position information and height position information; the target local features are then learned based on the neural network to obtain global features for target object detection. This solves the problem in the prior art of low target object detection accuracy caused by the loss of point cloud height information during feature extraction, thereby improving the accuracy of target object detection and better identifying target objects.
The specific embodiments described above do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur depending on design requirements and other factors. Any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the present disclosure shall be included within the scope of protection of the present disclosure.

Claims (10)

  1. A method for detecting a target object, comprising:
    for each point cloud point of original point cloud data, performing a first coordinate transformation on the point cloud point, and performing first feature extraction to obtain a first local feature, the first local feature comprising spatial position information;
    performing a second coordinate transformation on the point cloud point, and performing second feature extraction to obtain a second local feature, the second local feature comprising height position information;
    fusing the first local feature and the second local feature to obtain a target local feature of the point cloud point;
    using a neural network to perform multi-layer perception learning on the target local feature to obtain a global feature of the point cloud point;
    inputting the global feature into a target detection model to obtain a detection result of the target object.
  2. The method according to claim 1, wherein performing the first coordinate transformation on the point cloud point and performing first feature extraction to obtain the first local feature comprises:
    establishing a first transformed coordinate system, and dividing a point cloud space into voxel grids under the first transformed coordinate system;
    calculating first position information of the point cloud point in the first transformed coordinate system according to position information of the point cloud point;
    determining the voxel grid to which the point cloud point belongs according to the first position information, and calculating a deviation between the first position information and a center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain first deviation information;
    concatenating the first position information and the first deviation information to obtain the first local feature of the point cloud point.
  3. The method according to claim 2, wherein the first local feature further comprises an intensity feature of the point cloud point and a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
  4. The method according to claim 1, wherein performing the second coordinate transformation on the point cloud point and performing second feature extraction to obtain the second local feature comprises:
    establishing a second transformed coordinate system, and dividing the point cloud space into voxel grids under the second transformed coordinate system, the voxel grids being parallel to the ground;
    calculating second position information of the point cloud point under the second transformed coordinates according to the position information of the point cloud point;
    determining the voxel grid to which the point cloud point belongs according to the second position information, and calculating a deviation between the second position information and a center point of the point distribution in the voxel grid to which the point cloud point belongs, to obtain second deviation information;
    concatenating the second position information and the second deviation information to obtain the second local feature.
  5. The method according to claim 4, wherein the second local feature further comprises a quantity feature of the point cloud points included in the voxel grid to which the point cloud point belongs.
  6. The method according to any one of claims 2-5, wherein the center point of the point distribution in the voxel grid to which the point cloud point belongs is determined according to an average value of the position information of all point cloud points in the voxel grid to which the point cloud point belongs.
  7. The method according to claim 1, wherein the first coordinate transformation transforms the position information of the point cloud point in an original Cartesian coordinate system into the first position information in a target Cartesian coordinate system, and the second coordinate transformation transforms the position information of the point cloud point in the original Cartesian coordinate system into the second position information in a cylindrical coordinate system.
  8. An apparatus for detecting a target object, comprising:
    a first feature extraction module configured to, for each point cloud point of original point cloud data, perform a first coordinate transformation on the point cloud point and perform first feature extraction to obtain a first local feature, the first local feature comprising spatial position information;
    a second feature extraction module configured to perform a second coordinate transformation on the point cloud point and perform second feature extraction to obtain a second local feature, the second local feature comprising height position information;
    a target local feature acquisition module configured to fuse the first local feature and the second local feature to obtain a target local feature of the point cloud point;
    a global feature acquisition module configured to use a neural network to perform multi-layer perception learning on the target local feature to obtain a global feature of the point cloud point;
    a detection module configured to input the global feature into a target detection model to obtain a detection result of the target object.
  9. A mobile electronic device terminal, comprising:
    one or more processors;
    a storage device for storing one or more programs,
    wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1-7.
  10. A computer-readable medium on which a computer program is stored, wherein when the program is executed by a processor, the method according to any one of claims 1-7 is implemented.
PCT/CN2022/139875 2022-03-04 2022-12-19 Method and apparatus for detecting a target object WO2023165220A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210218821.7A 2022-03-04 2022-03-04 Method and apparatus for detecting a target object
CN202210218821.7 2022-03-04

Publications (1)

Publication Number Publication Date
WO2023165220A1 true WO2023165220A1 (zh) 2023-09-07

Family

ID=81778322

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/139875 WO2023165220A1 (zh) 2022-03-04 2022-12-19 Method and apparatus for detecting a target object

Country Status (2)

Country Link
CN (1) CN114581871A (zh)
WO (1) WO2023165220A1 (zh)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114581871A (zh) 2022-03-04 2022-06-03 京东鲲鹏(江苏)科技有限公司 Method and apparatus for detecting a target object

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10970518B1 (en) * 2017-11-14 2021-04-06 Apple Inc. Voxel-based feature learning network
CN113759338A (zh) * 2020-11-09 2021-12-07 北京京东乾石科技有限公司 Target detection method and apparatus, electronic device, and storage medium
CN113361601A (zh) * 2021-06-04 2021-09-07 北京轻舟智航科技有限公司 Method for fusing perspective and bird's-eye-view features based on unmanned vehicle lidar data
CN113361538A (zh) * 2021-06-22 2021-09-07 中国科学技术大学 Point cloud classification and segmentation method and system based on adaptively selected neighborhoods
CN114581871A (zh) * 2022-03-04 2022-06-03 京东鲲鹏(江苏)科技有限公司 Method and apparatus for detecting a target object

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117274255A (zh) * 2023-11-21 2023-12-22 法奥意威(苏州)机器人系统有限公司 Data detection method and apparatus, electronic device, and storage medium
CN117274255B (zh) * 2023-11-21 2024-01-30 法奥意威(苏州)机器人系统有限公司 Data detection method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN114581871A (zh) 2022-06-03

Similar Documents

Publication Publication Date Title
JP6745328B2 (ja) Method and apparatus for recovering point cloud data
WO2023165220A1 (zh) Method and apparatus for detecting a target object
CN110632608B (zh) Target detection method and apparatus based on laser point cloud
CN112652036B (zh) Road data processing method, apparatus, device, and storage medium
CN111815738B (zh) Method and apparatus for constructing a map
WO2023241097A1 (zh) Semantic instance reconstruction method, apparatus, device, and medium
CN115423946B (zh) Large-scene elastic semantic representation and self-supervised light field reconstruction method and apparatus
CN114792355B (zh) Virtual avatar generation method, apparatus, electronic device, and storage medium
CN114627239B (zh) Bounding box generation method, apparatus, device, and storage medium
CN113920217A (zh) Method, apparatus, device, and product for generating high-precision map lane lines
CN114399588A (zh) Three-dimensional lane line generation method, apparatus, electronic device, and computer-readable medium
JP2023064082A (ja) Method, apparatus, device, and storage medium for constructing a three-dimensional map in a high-precision map
CN114998433A (zh) Pose calculation method, apparatus, storage medium, and electronic device
US20230048643A1 (en) High-Precision Map Construction Method, Apparatus and Electronic Device
EP4086853A2 (en) Method and apparatus for generating object model, electronic device and storage medium
CN115239892B (zh) Method, apparatus, device, and storage medium for constructing a three-dimensional blood vessel model
CN110377776B (zh) Method and apparatus for generating point cloud data
CN113538555A (zh) Volume measurement method, system, device, and storage medium based on regular boxes
CN110363847B (zh) Map model construction method and apparatus based on point cloud data
KR20230098058A (ko) Three-dimensional data augmentation method, model training and detection method, device, and autonomous vehicle
CN113761090B (zh) Positioning method and apparatus based on a point cloud map
CN114581523A (zh) Method and apparatus for determining annotation data for monocular 3D target detection
KR20220063291A (ko) Collision detection method for an object, apparatus, electronic device, storage medium, and computer program
CN110389349B (zh) Positioning method and apparatus
CN110399892B (zh) Environment feature extraction method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22929647

Country of ref document: EP

Kind code of ref document: A1