CN113609985A - Object pose detection method, detection device, robot and storage medium - Google Patents
- Publication number
- CN113609985A CN113609985A CN202110895680.8A CN202110895680A CN113609985A CN 113609985 A CN113609985 A CN 113609985A CN 202110895680 A CN202110895680 A CN 202110895680A CN 113609985 A CN113609985 A CN 113609985A
- Authority
- CN
- China
- Prior art keywords
- pose
- target object
- candidate
- sensor
- data frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Abstract
The invention provides an object pose detection method, which comprises the following steps: using deep learning to train recognition of the target object to be identified, obtaining a recognition network for the target object; calibrating the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera, obtaining a calibration result; identifying the target object, the image range of the target object within the image, and a plurality of characteristic parts according to the image data frame and the recognition network; extracting, according to the calibration result, the sensor point cloud data within the image range and obtaining the local point clouds corresponding to the characteristic parts; and obtaining the central positions of the characteristic parts from the local point clouds and deriving the pose of the object from those central positions. The method thus provides accurate information for planning the robot's movement route and improves the robot's mobility and efficiency.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an object pose detection method, a detection device, a robot, and a storage medium.
Background
When an autonomously moving robot travels through its environment, objects detected by the sensors may go unrecognized: parts of moving instruments, carts, and similar objects are treated merely as obstacles, and the object as a whole is never understood. As a result, when an obstacle is partially occluded by a person or a cart, the robot may plan an unsuitable route, move poorly, or even collide. Because a robot's movement route is relatively fixed, the objects it encounters can largely be recognized by training with methods such as deep learning; however, recognizing only the target obstacle object without knowing its pose still prevents good movement performance.
Disclosure of Invention
The invention aims to provide an object pose detection method, a detection device, a robot, and a computer storage medium that solve the problem of detecting the pose of a target object while the robot is moving, provide accurate information for planning the robot's movement route, and improve the robot's mobility and efficiency.
The technical scheme provided by the invention is as follows:
an object pose detection method includes:
using deep learning to train recognition of the target object to be identified on the data frames detected by a sensor, to obtain a recognition network for the target object;
calibrating the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera, to obtain a calibration result;
identifying the target object, the image range of the target object within the image, and a plurality of characteristic parts according to the image data frame and the recognition network;
extracting, according to the calibration result, the sensor point cloud data within the image range, and obtaining the local point clouds corresponding to the characteristic parts;
and obtaining the central positions of the plurality of characteristic parts from the local point clouds, and obtaining the pose of the object from the central positions.
Preferably, the sensor is a depth sensor, including but not limited to an RGBD sensor, a lidar, or a solid-state lidar.
Further, obtaining the central positions of the plurality of characteristic parts from the local point clouds and obtaining the pose of the object from the central positions specifically includes:
calculating, from the sensor point cloud data and each local point cloud, the central position of the local point cloud;
and calculating the pose of the whole target object according to the central positions of the several local point clouds and the positions of the characteristic parts relative to the center of the target object.
Optionally, calculating the pose of the whole object according to the central positions of the several local point clouds and the positions of the characteristic parts relative to the center of the target object specifically includes:
determining candidate positions of a first characteristic part and a second characteristic part among the plurality of characteristic parts according to their respective central positions and the distance between them;
calculating candidate poses of the target object according to the candidate positions of the characteristic parts on the target object;
and obtaining the overall pose of the target object according to the candidate poses of the target object.
Optionally, obtaining the overall pose of the target object according to the candidate poses of the target object specifically includes:
the candidate poses of the target object comprising a first candidate pose and a second candidate pose, obtained respectively from a first candidate position and a second candidate position of the characteristic part on the target object;
obtaining a first detection data frame and a second detection data frame of the sensor at the first candidate pose and the second candidate pose, respectively;
and determining whether each of the first detection data frame and the second detection data frame coincides with the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the coinciding detection data frame as the pose of the target object.
In order to achieve the object of the present invention, an embodiment of the present invention further provides an object pose detection apparatus, including:
the first identification module is used for training, by deep learning on the data frames detected by a sensor, recognition of the target object to be identified, and for obtaining a recognition network for the target object;
the calibration module is used for calibrating the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera to obtain a calibration result;
the second identification module is used for identifying the target object, the image range of the target object in the image and a plurality of characteristic parts according to the image data frame and the identification network;
the point cloud data calculation module is used for extracting sensor point cloud data in an image range according to the calibration result and acquiring local point clouds corresponding to the characteristic parts;
and the pose acquisition module is used for acquiring the central positions of a plurality of characteristic parts according to the local point clouds and acquiring the pose of the object by using the central positions.
Further, the pose acquisition module specifically includes:
the central position calculating unit is used for calculating the sensor point cloud data and the local point cloud to obtain the central position of the local point cloud;
and the pose calculation unit is used for calculating the pose of the whole target object according to the central positions of the local point clouds and the positions of the characteristic parts relative to the center of the target object.
Optionally, the pose calculation unit specifically includes:
a candidate position determining subunit, configured to determine candidate positions of the characteristic part on the target object according to the respective central positions of a first characteristic part and a second characteristic part among the plurality of characteristic parts and the distance between the first characteristic part and the second characteristic part;
a candidate pose calculation subunit, configured to calculate candidate poses of the target object according to the candidate positions of the characteristic parts on the target object;
and an acquisition subunit, configured to obtain the overall pose of the target object according to the candidate poses of the target object.
In order to achieve the object of the present invention, an embodiment of the present invention further provides a robot, which includes a processor and a memory, where the processor is coupled to the memory, and the memory is used for storing a program; the processor is configured to execute the program in the memory to cause the robot to perform the method of object pose detection as described above.
To achieve the object of the present invention, an embodiment of the present invention further provides a computer-readable storage medium storing instructions which, when run on a computer, enable the computer to perform any of the object pose detection methods described above.
According to the invention, the pose of the whole target object is obtained by using the depth sensor and camera data frames to identify the target object and its characteristic parts, so that the robot can plan a reasonable movement route while moving, improving the robot's movement efficiency.
Drawings
The above features, technical features, advantages, and implementations of the object pose detection method, detection device, robot, and storage medium will be further explained in the following detailed description of preferred embodiments, in a clearly understandable manner, with reference to the accompanying drawings.
Fig. 1 is a flowchart of an object pose detection method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an object pose detection apparatus according to an embodiment of the present invention;
fig. 3 is a schematic view of another object pose detection apparatus provided by the embodiment of the present invention;
fig. 4 is a schematic view of another object pose detection apparatus provided by the embodiment of the present invention;
fig. 5 is a schematic diagram of an autonomous mobile robot according to an embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description refers to the accompanying drawings. The drawings described below are obviously only some examples of the invention; a person skilled in the art can derive other drawings and embodiments from them without inventive effort.
For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention; they do not represent the actual structure of a product. In addition, to keep the drawings concise and understandable, components having the same structure or function are in some drawings only schematically depicted, or only one of them is labeled. In this document, "one" means not only "only one" but also covers the case of "more than one".
In developing autonomous movement for intelligent devices, the inventor found that such devices must use sensors to detect their environment. The intelligent device may be an autonomously moving robot, an autonomous vehicle, or another self-navigating device; such devices commonly employ a depth sensor such as a laser radar to detect target objects and a camera to identify objects in the robot's motion environment. In the prior art, however, detection or recognition treats the target merely as an object: if only part of the object is detected, the whole object is not obtained, and the obstacle significance of the object cannot be accurately determined.
In order to accurately acquire the pose of a target object, the embodiment of the invention provides an object pose detection method.
Referring to fig. 1, an object pose detection method according to an embodiment of the present invention includes:
s1, training and identifying a target object to be identified according to a data frame detected by a sensor by utilizing deep learning to obtain an identification network of the target object;
the embodiment of the invention firstly uses a mature deep learning method to train the object to be recognized and the characteristic part of the object, such as a cart and wheels on the cart;
the deep learning of the embodiment of the invention does not limit the network used, for example, the full convolution network technology can be transplanted to the three-dimensional distance scanning data detection task. Specifically, the scene is set as a detection task based on the range data of the Velodyne64E lidar. Data is presented in a 2D point map and a single 2D end-to-end full convolution network is used to predict both target confidence and bounding box. The complete 3D bounding box can also be predicted using a 2D convolutional network by designed bounding box coding.
Alternatively, to eliminate manual feature engineering on 3D point clouds, VoxelNet can be used: a generic 3D detection network that unifies feature extraction and bounding-box prediction into a single-stage, end-to-end trainable deep network.
Alternatively, a grid-map environment representation, which is well suited to sensor fusion, free-space estimation, and machine learning methods, can be used, with a deep CNN performing the detection and classification of targets. As input to the CNN, the 3D range-sensor information is efficiently encoded in a multi-layer grid map: the range-sensor measurements are converted into a multi-layer grid map that is fed to the detection and classification network. The CNN infers rotated 3D bounding boxes and semantic categories simultaneously, and these boxes are projected into the camera image for visual verification.
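As a rough illustration of such a grid-map encoding (the two-layer definition, cell size, and function name below are assumptions for this sketch, not the encoding used by the cited work), 3D range points can be binned into a multi-layer 2D grid that a CNN can consume:

```python
import numpy as np

def points_to_grid_map(points, cell_size=0.1, grid_dim=8):
    """Bin 3D sensor points into a multi-layer 2D grid map.

    Layer 0 holds the point count per cell, layer 1 the maximum
    height (z) seen in that cell -- an illustrative two-layer encoding.
    The grid is centered on the sensor origin.
    """
    grid = np.zeros((2, grid_dim, grid_dim), dtype=np.float32)
    half = grid_dim * cell_size / 2.0
    for x, y, z in points:
        if -half <= x < half and -half <= y < half:
            i = int((x + half) / cell_size)
            j = int((y + half) / cell_size)
            grid[0, i, j] += 1.0
            grid[1, i, j] = max(grid[1, i, j], z)
    return grid
```

The resulting `(layers, H, W)` tensor plays the role of a multi-channel image, so a standard 2D CNN can be applied to it directly.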
S2, calibrating the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera to obtain a calibration result;
Calibration finds the spatial transformation from the sensor to the camera; conversion between the two coordinate systems requires a rotation matrix R and a translation vector T. This prepares for the subsequent fusion of the sensor and camera data.
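A minimal sketch of how such a calibration result is applied (the extrinsics R and T and the pinhole intrinsics fx, fy, cx, cy below are illustrative assumed values, not output of the patent's calibration step):

```python
import numpy as np

# Assumed extrinsics: R and T map a point from the sensor coordinate
# system into the camera coordinate system.
R = np.eye(3)                      # sensor and camera axes aligned (assumption)
T = np.array([0.0, -0.1, 0.05])    # illustrative offset in meters

def sensor_to_camera(p_sensor):
    """Rigid transform between coordinate systems: p_cam = R @ p_sensor + T."""
    return R @ np.asarray(p_sensor, dtype=float) + T

def project_to_pixel(p_cam, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-frame point with assumed intrinsics."""
    x, y, z = p_cam
    return fx * x / z + cx, fy * y / z + cy
```

Projecting each sensor point to a pixel in this way is what lets the later steps test whether a point falls inside a recognized image region.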
S3, identifying the target object, the image range of the target object in the image and a plurality of characteristic parts according to the image data frame and the identification network;
the target object in the embodiment of the present invention refers to an image for performing pose detection, and the target object may include a person, an instrument device, a cart, and the like. In the embodiment of the present invention, an image of a target object may be acquired first, for example, the target image may be selected from stored image data, or a transmitted target image may also be received from another device, or the target image may also be captured directly by an image capturing device, which is only an exemplary illustration of acquiring the target image, and the embodiment of the present invention is not limited thereto.
After the image of the target object is obtained, the target object in it may be identified, either by an image recognition algorithm or by a trained machine learning network model; the machine learning network model may be a neural network model, a deep learning neural network model, or the like, which the embodiment of the present invention does not limit.
For example, a patient bed and its wheels are identified by deep learning, and the point clouds of the bed and the wheels are obtained using the calibration relation between the camera and the sensor. The overall coordinates and pose of the bed can then be calculated from the relative distance relationships among its four wheels and the coordinates of the identified wheel point clouds.
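A simplified planar version of that final calculation, assuming all four wheel centers are known and given in a fixed order (the ordering convention and function name are illustrative assumptions):

```python
import numpy as np

def bed_pose_from_wheels(wheels):
    """Estimate the bed's planar pose (cx, cy, yaw) from its four wheel
    centers, assumed ordered front-left, front-right, rear-left,
    rear-right.
    """
    wheels = np.asarray(wheels, dtype=float)
    cx, cy = wheels.mean(axis=0)           # bed center = mean of the wheels
    front_mid = wheels[:2].mean(axis=0)    # midpoint of the front axle
    rear_mid = wheels[2:].mean(axis=0)     # midpoint of the rear axle
    dx, dy = front_mid - rear_mid          # long axis of the bed
    return cx, cy, np.arctan2(dy, dx)
```

The yaw angle follows from the direction of the bed's long axis, which is why knowing the relative layout of the wheels, and not just their positions, matters.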
S4, extracting, according to the calibration result, the sensor point cloud data within the image range, and obtaining the local point clouds corresponding to the characteristic parts;
and S5, obtaining the central positions of the plurality of characteristic parts from the local point clouds, and obtaining the pose of the object from the central positions.
Preferably, the sensor is a depth sensor, including but not limited to an RGBD sensor, a lidar, or a solid-state lidar.
Further, obtaining the central positions of the plurality of characteristic parts from the local point clouds and obtaining the pose of the object from the central positions specifically includes:
calculating, from the sensor point cloud data and each local point cloud, the central position of the local point cloud;
and calculating the pose of the whole target object according to the central positions of the several local point clouds and the positions of the characteristic parts relative to the center of the target object.
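The central-position step can be sketched as follows (a minimal version; using a boolean mask to stand in for "points whose projection falls inside the part's image region" is an assumption of this sketch):

```python
import numpy as np

def local_cloud_center(cloud, in_part):
    """Return the centroid of the local point cloud of one characteristic
    part.

    cloud:   (N, 3) sensor point cloud
    in_part: length-N boolean mask selecting the points belonging to the
             part (e.g. points projecting into its image region)
    """
    local = np.asarray(cloud, dtype=float)[np.asarray(in_part, dtype=bool)]
    return local.mean(axis=0)
```

The centroid is a simple, outlier-tolerant stand-in for the part's center; a real system might refine it with clustering or model fitting.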
Optionally, calculating the pose of the whole object according to the central positions of the several local point clouds and the positions of the characteristic parts relative to the center of the target object specifically includes:
determining candidate positions of a first characteristic part and a second characteristic part among the plurality of characteristic parts according to their respective central positions and the distance between them;
calculating candidate poses of the target object according to the candidate positions of the characteristic parts on the target object;
and obtaining the overall pose of the target object according to the candidate poses of the target object.
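One way to make the candidate-position step concrete for a four-wheeled object (all dimensions, the tolerance, and the 2D simplification below are illustrative assumptions): classify the observed wheel pair by its separation, then place the candidate object centers accordingly.

```python
import numpy as np

def candidate_centers(w1, w2, track=0.6, wheelbase=1.2, tol=0.05):
    """Classify two observed wheel centers by their distance and return
    the candidate 2D positions of the object's center.
    """
    w1, w2 = np.asarray(w1, dtype=float), np.asarray(w2, dtype=float)
    d = float(np.linalg.norm(w2 - w1))
    mid = (w1 + w2) / 2.0
    u = (w2 - w1) / d                  # unit vector along the wheel pair
    n = np.array([-u[1], u[0]])        # perpendicular unit vector
    if abs(d - track) < tol:           # same axle: offset by half the wheelbase
        return [mid + n * wheelbase / 2, mid - n * wheelbase / 2]
    if abs(d - wheelbase) < tol:       # same side: offset by half the track
        return [mid + n * track / 2, mid - n * track / 2]
    if abs(d - np.hypot(track, wheelbase)) < tol:
        return [mid]                   # diagonal pair: center is the midpoint
    return []                          # matches no known wheel pair
```

This is the role the distances L1, L2, L3 play in the description: the measured separation selects which wheel pair was seen, and the ambiguity that remains produces the multiple candidate poses.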
Optionally, obtaining the overall pose of the target object according to the candidate poses of the target object specifically includes:
the candidate poses of the target object comprising a first candidate pose and a second candidate pose, obtained respectively from a first candidate position and a second candidate position of the characteristic part on the target object;
obtaining a first detection data frame and a second detection data frame of the sensor at the first candidate pose and the second candidate pose, respectively;
and determining whether each of the first detection data frame and the second detection data frame coincides with the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the coinciding detection data frame as the pose of the target object.
For example, knowing the central positions W1 and W2 of two wheels on the cart, and using the fact that the cart has four wheels, the candidate poses P1 and P2 of the whole cart can be calculated (because the cart has four wheels, the distance between any two of them generally takes one of three values, L1, L2, or L3, so W1 and W2 reveal which pair of wheels may have been observed);
assuming the cart is at candidate pose P1 and P2 in turn, it is checked whether the depth data returned by the sensor conflicts with the object at that pose: for example, which parts of the object should be detected at P1 but are not actually detected. This yields two scores, S1 and S2, and the candidate pose with the highest score is selected as the pose of the object.
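The scoring step in this example can be sketched as follows (the matching radius, 2D point layout, and function names are assumptions of this sketch; the description only requires checking whether expected detections coincide with the actual sensor data):

```python
def pose_score(expected, observed, radius=0.1):
    """Fraction of the points that should be detected at a candidate pose
    that are actually matched by an observed sensor point."""
    def matched(p):
        return any(abs(p[0] - q[0]) <= radius and abs(p[1] - q[1]) <= radius
                   for q in observed)
    return sum(matched(p) for p in expected) / len(expected)

def best_pose(candidates, observed):
    """candidates: list of (pose_label, expected_points); return the pose
    whose expected detections best coincide with the observed data."""
    return max(candidates, key=lambda c: pose_score(c[1], observed))[0]
```

With two candidates this reproduces the S1-versus-S2 comparison: the pose whose predicted detections best coincide with the point cloud wins.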
In order to achieve the object of the present invention, as shown in fig. 2, an embodiment of the present invention further provides an object pose detection apparatus 100, including:
the first identification module 11 is configured to train, by deep learning on the data frames detected by a sensor, recognition of the target object to be identified, and to obtain a recognition network for the target object;
the calibration module 12 is configured to calibrate the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera, so as to obtain a calibration result;
the second identification module 13 is configured to identify the target object, an image range of the target object in the image, and a plurality of feature parts according to the image data frame and the identification network;
the point cloud data calculation module 14 is configured to extract sensor point cloud data within an image range according to the calibration result, and acquire a local point cloud corresponding to the characteristic part;
and the pose acquisition module 15 is configured to acquire central positions of a plurality of feature parts according to the local point cloud, and acquire a pose of the object by using the central positions.
Further, as shown in fig. 3, the pose acquisition module specifically includes:
a central position calculating unit 151, configured to calculate sensor point cloud data and a local point cloud to obtain a central position of the local point cloud;
and a pose calculation unit 152, configured to calculate the pose of the whole target object according to the central positions of the several local point clouds and the positions of the characteristic parts relative to the center of the target object.
Optionally, as shown in fig. 4, the pose calculation unit 152 specifically includes:
a candidate position determining subunit 1521, configured to determine candidate positions of the characteristic part on the target object according to the respective central positions of a first characteristic part and a second characteristic part among the plurality of characteristic parts and the distance between the first characteristic part and the second characteristic part;
a candidate pose calculation subunit 1522, configured to calculate candidate poses of the target object according to the candidate positions of the characteristic parts on the target object;
and an obtaining subunit 1523, configured to obtain the overall pose of the target object according to the candidate poses of the target object.
The candidate poses of the target object comprise a first candidate pose and a second candidate pose, obtained respectively from a first candidate position and a second candidate position of the characteristic part on the target object;
obtaining a first detection data frame and a second detection data frame of the sensor at the first candidate pose and the second candidate pose, respectively;
and determining whether each of the first detection data frame and the second detection data frame coincides with the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the coinciding detection data frame as the pose of the target object.
For example, knowing the central positions W1 and W2 of two wheels on the cart, and using the fact that the cart has four wheels, the candidate poses P1 and P2 of the whole cart can be calculated (because the cart has four wheels, the distance between any two of them generally takes one of three values, L1, L2, or L3, so W1 and W2 reveal which pair of wheels may have been observed);
assuming the cart is at candidate pose P1 and P2 in turn, it is checked whether the depth data returned by the sensor conflicts with the object at that pose: for example, which parts of the object should be detected at P1 but are not actually detected. This yields two scores, S1 and S2, and the candidate pose with the highest score is selected as the pose of the object.
In order to achieve the object of the present invention, an embodiment of the present invention further provides a robot, which includes a processor and a memory, where the processor is coupled to the memory, and the memory is used for storing a program; the processor is configured to execute the program in the memory to cause the robot to perform the method of object pose detection as described above.
To achieve the object of the present invention, an embodiment of the present invention further provides a computer-readable storage medium storing instructions which, when run on a computer, enable the computer to perform any of the object pose detection methods described above.
According to the invention, the pose of the whole target object is obtained by using the depth sensor and camera data frames to identify the target object and its characteristic parts, so that the robot can plan a reasonable movement route while moving, improving the robot's movement efficiency.
It should be noted that the embodiments of the pose detection apparatus and the embodiments of the pose detection method described above are based on the same inventive concept and achieve the same technical effect; other details of the apparatus embodiments can therefore be found in the description of the method embodiments above.
It should also be noted that the division of the detection device into modules or units is only a division by logical function; in an actual implementation, they may be wholly or partially integrated into one physical entity or kept physically separate. The units may all be implemented as software invoked by a processor, all as hardware, or partly as processor-invoked software and partly as hardware.
For example, the functions of the above modules or units may be stored in a memory in the form of program code that a processor schedules to implement the functions of the units; the processor may be a general-purpose processor, such as a central processing unit (CPU), or another processor capable of invoking programs. Alternatively, the units may be one or more integrated circuits configured to implement the above methods, such as one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs). The two approaches can also be combined, with some functions implemented as processor-scheduled code and others as hardware integrated circuits. When all of the above functions are integrated together, they can be implemented as a system-on-a-chip (SoC).
The detection device provided by the embodiments of the application may specifically be a chip, the chip comprising a processing unit, which may for example be a processor, and a communication unit, which may for example be an input/output interface, a pin, or a circuit. The processing unit may execute the computer-executable instructions stored by the storage unit, so that the chip within the detection device performs the steps performed by the detection device in the embodiments illustrated above, or so that the chip within the execution device performs the steps performed by the detection device in the foregoing embodiment shown in fig. 2.
Optionally, the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM).
To achieve the object of the present invention, as shown in fig. 5, an embodiment of the present invention further provides a robot 180. The robot 180 includes a processor 1803 and a memory 1804, the processor 1803 being coupled to the memory 1804, wherein:
the memory 1804 is configured to store a program;
the processor 1803 is configured to execute the program in the memory, so that the robot performs the object pose detection method described above.
Referring to fig. 5, the method disclosed in the embodiment of the present invention corresponding to fig. 1 may be applied to an autonomous mobile robot 180, where the robot 180 includes a processor 1803, and the processor 1803 may be an integrated circuit chip having signal processing capability. During implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 1803 or by instructions in the form of software. The processor 1803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and may also be an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor 1803 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments corresponding to fig. 1 of the present application.
A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of the present application may be embodied directly as being performed by a hardware decoding processor, or performed by a combination of hardware and software modules in the decoding processor. The software module may reside in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 1804; the processor 1803 reads the information in the memory 1804 and completes the steps of the above method in combination with its hardware.
The receiver 1801 may be used to receive entered numeric or character information and to generate signal inputs related to the settings and function control of the robot 180. The transmitter 1802 may be used to output numeric or character information through a first interface; the transmitter 1802 may further be used to send instructions to disk groups through the first interface to modify data in the disk groups; the transmitter 1802 may also include a display device such as a display screen.
An embodiment of the present invention also provides a computer-readable storage medium in which a program for performing signal processing is stored; when the program runs on a computer, it causes the computer to perform the steps of the object pose detection method described in the foregoing embodiments, or causes the computer to perform the steps performed by the detection apparatus described in the foregoing embodiment shown in fig. 2.
It should be noted that the above-described apparatus embodiments are merely illustrative: the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided in the present application, the connection relationships between modules indicate communication connections between them, which may be specifically implemented as one or more communication buses or signal lines.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by software plus the necessary general-purpose hardware, and certainly also by dedicated hardware including application-specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and the like. In general, any function performed by a computer program can easily be implemented by corresponding hardware, and the specific hardware structure used to realize the same function can take many forms, such as an analog circuit, a digital circuit, or a dedicated circuit. For the present application, however, a software implementation is usually preferable. Based on such understanding, the technical solutions of the present application, or the part of them contributing over the prior art, may be embodied substantially in the form of a software product. The computer software product is stored in a readable storage medium, such as a floppy disk, a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer or a network device) to execute the methods of the embodiments of the present application.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When software is used, the implementation may take the form, in whole or in part, of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium, which may be any available medium accessible to a computer, or a data storage device, such as a training device or a data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)), among others.
It should be noted that the above embodiments can be freely combined as needed. The foregoing describes only preferred embodiments of the present invention; it should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and such improvements and refinements shall also fall within the protection scope of the present invention.
Claims (10)
1. An object pose detection method, characterized by comprising:
training identification of a target object to be identified on data frames detected by a sensor by means of deep learning, to obtain an identification network of the target object;
calibrating the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera to obtain a calibration result;
identifying the target object, the image range of the target object in the image, and a plurality of characteristic parts according to the image data frame and the identification network;
extracting, according to the calibration result, sensor point cloud data within the image range, and acquiring local point clouds corresponding to the characteristic parts;
and acquiring the central positions of the plurality of characteristic parts according to the local point clouds, and acquiring the pose of the target object by using the central positions.
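The claims leave the center and pose computation unspecified. As a non-authoritative sketch: the center of each characteristic part may be taken as the centroid of its local point cloud, and the object pose recovered by rigidly aligning the known model-frame positions of the parts to the observed centers. The Kabsch/SVD alignment below is an assumed choice for illustration, not a formula stated in the patent:

```python
import numpy as np

def feature_centers(local_clouds):
    """Center of each characteristic part: the centroid of its local point cloud."""
    return np.array([np.mean(c, axis=0) for c in local_clouds])

def pose_from_centers(model_centers, observed_centers):
    """Rigid transform (R, t) mapping model-frame feature centers onto the
    observed centers, via the Kabsch/SVD method (assumed, not from the patent)."""
    mc = model_centers.mean(axis=0)
    oc = observed_centers.mean(axis=0)
    # cross-covariance of the centered point sets
    H = (model_centers - mc).T @ (observed_centers - oc)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0] * (model_centers.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T          # proper rotation (det +1)
    t = oc - R @ mc             # translation completing the pose
    return R, t
```

With at least three non-collinear feature centers, the returned (R, t) satisfies observed ≈ R · model + t, which is one way to realize "acquiring the pose of the target object by using the central positions".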
2. The object pose detection method according to claim 1, wherein the sensor is a depth sensor, including but not limited to an RGBD sensor, a lidar, or a solid-state lidar.
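The extraction of sensor point cloud data within the image range (claim 1) can be realized, for example, by projecting each depth-sensor point into the image using the calibration result and keeping the points that fall inside the detected bounding box. A minimal sketch under a pinhole-camera assumption; the intrinsic matrix `K`, extrinsics `(R, t)`, and `bbox` are hypothetical inputs standing in for the patent's calibration result:

```python
import numpy as np

def points_in_image_range(points, K, R, t, bbox):
    """Project depth-sensor points into the image and keep those inside
    the detected bounding box (illustrative, pinhole model assumed).

    points : (N, 3) cloud in the sensor frame
    K      : 3x3 camera intrinsic matrix
    R, t   : sensor-to-camera extrinsics from the calibration result
    bbox   : (u_min, v_min, u_max, v_max) image range of the target
    """
    cam = points @ R.T + t                  # sensor frame -> camera frame
    in_front = cam[:, 2] > 0                # drop points behind the camera
    uv = cam[in_front] @ K.T
    uv = uv[:, :2] / uv[:, 2:3]             # perspective division to pixels
    u_min, v_min, u_max, v_max = bbox
    mask = ((uv[:, 0] >= u_min) & (uv[:, 0] <= u_max) &
            (uv[:, 1] >= v_min) & (uv[:, 1] <= v_max))
    return points[in_front][mask]
```

The same routine, applied per characteristic part with that part's bounding box, yields the local point clouds used in the subsequent steps.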
3. The object pose detection method according to claim 1, wherein the obtaining of the center positions of the plurality of feature parts according to the local point cloud and the obtaining of the pose of the object using the center positions specifically comprise:
calculating the central position of each local point cloud from the sensor point cloud data and the local point clouds;
and calculating the pose of the whole target object according to the central positions of the local point clouds and the positions of the characteristic parts relative to the center of the target object.
4. The object pose detection method according to claim 3, wherein the calculating of the pose of the whole target object according to the central positions of the local point clouds and the positions of the characteristic parts relative to the center of the target object specifically comprises:
determining candidate positions of the characteristic parts in the target object according to the respective central positions of a first characteristic part and a second characteristic part among the plurality of characteristic parts and the distance between the first characteristic part and the second characteristic part;
calculating candidate poses of the target object according to the candidate positions of the characteristic parts in the target object;
and obtaining the whole pose of the target object according to the candidate pose of the target object.
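Claim 4 does not give the geometry behind the candidate positions. One illustrative planar construction, assuming the object stands on the ground plane: the centers of the first and second characteristic parts define a baseline, and the object center lies at a known model offset `h` on one of the baseline's two sides, so exactly two mirrored candidates arise (the function and parameter names here are hypothetical):

```python
import numpy as np

def planar_candidates(p1, p2, h):
    """Two candidate object poses from the centers of two characteristic parts.

    p1, p2 : 2-D centers of the first and second characteristic parts
    h      : assumed model distance from the p1-p2 baseline to the object
             center (a model parameter, not specified in the patent)

    Two feature centers fix the object only up to a reflection across the
    baseline, hence the two mirrored candidates.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    mid = (p1 + p2) / 2.0
    v = p2 - p1
    yaw = np.arctan2(v[1], v[0])                      # heading of the baseline
    n = np.array([-v[1], v[0]]) / np.linalg.norm(v)   # unit normal to baseline
    return [(mid + h * n, yaw), (mid - h * n, yaw)]
```

Each returned (center, yaw) pair is one candidate pose of the target object; claim 5 then selects between the two.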
5. The object pose detection method according to claim 4, wherein obtaining the entire pose of the target object based on the candidate poses of the target object specifically comprises:
the candidate poses of the target object comprise a first candidate pose and a second candidate pose, the first candidate pose and the second candidate pose being obtained respectively from a first candidate position and a second candidate position of the characteristic parts in the target object;
obtaining a first detection data frame and a second detection data frame of the sensor at the first candidate pose and the second candidate pose respectively;
and respectively judging whether the first detection data frame and the second detection data frame overlap with the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the overlapping detection data frame as the pose of the target object.
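Claim 5 resolves the two-candidate ambiguity by checking which candidate's predicted detection data frame coincides with the observed sensor point cloud. A simplified stand-in for that coincidence test: score each candidate by the fraction of model points, transformed by the candidate pose, that land near an observed point, and keep the best-scoring pose (the brute-force nearest-neighbour check below is illustrative only, not the patent's exact comparison):

```python
import numpy as np

def select_candidate(candidates, model_pts, observed_pts, tol=0.05):
    """Pick the candidate pose whose predicted sensor view best coincides
    with the observed point cloud (simplified stand-in for claim 5).

    candidates   : list of (R, t) candidate poses
    model_pts    : (M, d) points of the object model
    observed_pts : (N, d) sensor point cloud of the target object
    """
    best, best_frac = None, -1.0
    for R, t in candidates:
        pred = model_pts @ np.asarray(R).T + np.asarray(t)
        # fraction of predicted points within tol of some observed point;
        # 1.0 means the predicted and observed frames fully coincide
        d = np.linalg.norm(pred[:, None, :] - observed_pts[None, :, :], axis=2)
        frac = float(np.mean(d.min(axis=1) < tol))
        if frac > best_frac:
            best, best_frac = (R, t), frac
    return best, best_frac
```

In practice a spatial index (e.g. a k-d tree) would replace the dense distance matrix, but the selection logic is the same.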
6. An object pose detection apparatus characterized by comprising:
the first identification module is used for training identification of a target object to be identified on data frames detected by a sensor by means of deep learning, to obtain an identification network of the target object;
the calibration module is used for calibrating the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera to obtain a calibration result;
the second identification module is used for identifying the target object, the image range of the target object in the image, and a plurality of characteristic parts according to the image data frame and the identification network;
the point cloud data calculation module is used for extracting, according to the calibration result, sensor point cloud data within the image range, and acquiring local point clouds corresponding to the characteristic parts;
and the pose acquisition module is used for acquiring the central positions of the plurality of characteristic parts according to the local point clouds, and acquiring the pose of the target object by using the central positions.
7. The object pose detection apparatus according to claim 6, wherein the pose acquisition module specifically includes:
the central position calculating unit is used for calculating the central position of each local point cloud from the sensor point cloud data and the local point clouds;
and the pose calculation unit is used for calculating the pose of the whole target object according to the central positions of the local point clouds and the positions of the characteristic parts relative to the center of the target object.
8. The object pose detecting apparatus according to claim 7, wherein the pose calculating unit specifically includes:
a candidate position determining subunit, configured to determine candidate positions of the characteristic parts in the target object according to the respective central positions of a first characteristic part and a second characteristic part among the plurality of characteristic parts and the distance between the first characteristic part and the second characteristic part;
the candidate pose calculation subunit is used for calculating candidate poses of the target object according to the candidate positions of the characteristic parts in the target object;
and the acquisition subunit is used for acquiring the whole pose of the target object according to the candidate pose of the target object.
9. A robot, characterized in that the robot comprises a processor and a memory, the processor being coupled with the memory,
the memory is used for storing programs;
the processor is configured to execute the program in the memory to cause the robot to perform the method of any one of claims 1-5.
10. A computer storage medium, comprising a program which, when run on a computer, causes the computer to perform the method of any one of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110895680.8A CN113609985B (en) | 2021-08-05 | 2021-08-05 | Object pose detection method, detection device, robot and storable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110895680.8A CN113609985B (en) | 2021-08-05 | 2021-08-05 | Object pose detection method, detection device, robot and storable medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113609985A true CN113609985A (en) | 2021-11-05 |
CN113609985B CN113609985B (en) | 2024-02-23 |
Family
ID=78307027
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110895680.8A Active CN113609985B (en) | 2021-08-05 | 2021-08-05 | Object pose detection method, detection device, robot and storable medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113609985B (en) |
Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140300637A1 (en) * | 2013-04-05 | 2014-10-09 | Nokia Corporation | Method and apparatus for determining camera location information and/or camera pose information according to a global coordinate system |
US20180341836A1 (en) * | 2017-05-24 | 2018-11-29 | General Electric Company | Neural network point cloud generation system |
US20180365503A1 (en) * | 2017-06-16 | 2018-12-20 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and Apparatus of Obtaining Obstacle Information, Device and Computer Storage Medium |
CN109448034A (en) * | 2018-10-24 | 2019-03-08 | 华侨大学 | A kind of part pose acquisition methods based on geometric primitive |
CN110363816A (en) * | 2019-06-25 | 2019-10-22 | 广东工业大学 | A kind of mobile robot environment semanteme based on deep learning builds drawing method |
CN110533722A (en) * | 2019-08-30 | 2019-12-03 | 的卢技术有限公司 | A kind of the robot fast relocation method and system of view-based access control model dictionary |
CN110579215A (en) * | 2019-10-22 | 2019-12-17 | 上海木木机器人技术有限公司 | positioning method based on environmental feature description, mobile robot and storage medium |
CN111368852A (en) * | 2018-12-26 | 2020-07-03 | 沈阳新松机器人自动化股份有限公司 | Article identification and pre-sorting system and method based on deep learning and robot |
WO2020155616A1 (en) * | 2019-01-29 | 2020-08-06 | 浙江省北大信息技术高等研究院 | Digital retina-based photographing device positioning method |
CN111563442A (en) * | 2020-04-29 | 2020-08-21 | 上海交通大学 | Slam method and system for fusing point cloud and camera image data based on laser radar |
WO2020259248A1 (en) * | 2019-06-28 | 2020-12-30 | Oppo广东移动通信有限公司 | Depth information-based pose determination method and device, medium, and electronic apparatus |
US20210063577A1 (en) * | 2019-08-26 | 2021-03-04 | Ubtech Robotics Corp Ltd | Robot relocalization method and apparatus and robot using the same |
US20210116914A1 (en) * | 2019-10-16 | 2021-04-22 | Yuan Ren | Method and system for localization of an autonomous vehicle in real time |
CN112731358A (en) * | 2021-01-08 | 2021-04-30 | 奥特酷智能科技(南京)有限公司 | Multi-laser-radar external parameter online calibration method |
CN112784873A (en) * | 2020-12-25 | 2021-05-11 | 华为技术有限公司 | Semantic map construction method and equipment |
CN112967347A (en) * | 2021-03-30 | 2021-06-15 | 深圳市优必选科技股份有限公司 | Pose calibration method and device, robot and computer readable storage medium |
US20210181745A1 (en) * | 2019-12-17 | 2021-06-17 | Motional Ad Llc | Automated object annotation using fused camera/lidar data points |
CN113034575A (en) * | 2021-01-27 | 2021-06-25 | 深圳市华汉伟业科技有限公司 | Model construction method, pose estimation method and object picking device |
Non-Patent Citations (3)
Title |
---|
李秀智; 李家豪; 张祥银; 彭小彬: "Deep-learning-based detection method for optimal robot grasping pose", Chinese Journal of Scientific Instrument (仪器仪表学报), no. 05, pages 108-117 *
王宪伦; 张海洲; 安立雄: "Object pose estimation based on image semantic segmentation", Machine Building & Automation (机械制造与自动化), no. 02, pages 216-220 *
黄海卫; 孔令成; 谭治英: "Design of a real-time target recognition and localization system for an indoor service robot", Computer Engineering and Design (计算机工程与设计), no. 08, pages 2228-2232 *
Also Published As
Publication number | Publication date |
---|---|
CN113609985B (en) | 2024-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11915099B2 (en) | Information processing method, information processing apparatus, and recording medium for selecting sensing data serving as learning data | |
JP6794436B2 (en) | Systems and methods for unobstructed area detection | |
Kanwal et al. | A navigation system for the visually impaired: a fusion of vision and depth sensor | |
WO2022012158A1 (en) | Target determination method and target determination device | |
Ushani et al. | A learning approach for real-time temporal scene flow estimation from lidar data | |
CN111587437A (en) | Activity recognition method using video pipe | |
JP7042905B2 (en) | Methods and devices for generating inverse sensor models, as well as methods for detecting obstacles | |
JP6853156B2 (en) | Posture estimation system, posture estimation device, and distance image camera | |
Vidas et al. | Real-time mobile 3D temperature mapping | |
KR101628155B1 (en) | Method for detecting and tracking unidentified multiple dynamic object in real time using Connected Component Labeling | |
US20220051425A1 (en) | Scale-aware monocular localization and mapping | |
CN112200129A (en) | Three-dimensional target detection method and device based on deep learning and terminal equipment | |
JPWO2017051480A1 (en) | Image processing apparatus and image processing method | |
Liang et al. | Image-based positioning of mobile devices in indoor environments | |
CN113405557B (en) | Path planning method and related device, electronic equipment and storage medium | |
Baig et al. | A robust motion detection technique for dynamic environment monitoring: A framework for grid-based monitoring of the dynamic environment | |
JP2017526083A (en) | Positioning and mapping apparatus and method | |
Ishihara et al. | Deep radio-visual localization | |
CN114859938A (en) | Robot, dynamic obstacle state estimation method and device and computer equipment | |
Kumar Rath et al. | Real‐time moving object detection and removal from 3D pointcloud data for humanoid navigation in dense GPS‐denied environments | |
CN113158779A (en) | Walking method and device and computer storage medium | |
CN113609985B (en) | Object pose detection method, detection device, robot and storable medium | |
WO2022083529A1 (en) | Data processing method and apparatus | |
Girão et al. | Real-time multi-view grid map-based spatial representation for mixed reality applications | |
Zhang et al. | 3D car-detection based on a Mobile Deep Sensor Fusion Model and real-scene applications |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||