CN113609985B - Object pose detection method, detection device, robot and storable medium - Google Patents


Info

Publication number
CN113609985B
CN113609985B (application CN202110895680.8A)
Authority
CN
China
Prior art keywords
pose
target object
candidate
sensor
data frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110895680.8A
Other languages
Chinese (zh)
Other versions
CN113609985A (en)
Inventor
张干
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Noah Robot Technology Shanghai Co ltd
Original Assignee
Noah Robot Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Noah Robot Technology Shanghai Co ltd filed Critical Noah Robot Technology Shanghai Co ltd
Priority to CN202110895680.8A priority Critical patent/CN113609985B/en
Publication of CN113609985A publication Critical patent/CN113609985A/en
Application granted granted Critical
Publication of CN113609985B publication Critical patent/CN113609985B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an object pose detection method comprising the following steps: training, by deep learning, a recognition network for the target object to be detected and recognized; calibrating the camera and the sensor according to a detection data frame of the sensor and an image data frame of the camera to obtain a calibration result; recognizing, according to the image data frame and the recognition network, the target object, an image range of the target object in the image, and a plurality of feature parts; extracting, according to the calibration result, the sensor point cloud data within the image range to obtain the local point clouds corresponding to the feature parts; and obtaining the center positions of the plurality of feature parts from the local point clouds, and using these center positions to obtain the pose of the object. Accurate information is thereby provided for planning the robot's movement route, and the robot's movement capability and efficiency are improved.

Description

Object pose detection method, detection device, robot and storable medium
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to an object pose detection method, a detection device, a robot and a storable medium.
Background
When an autonomously moving robot travels in its motion environment, objects detected by its sensors may not be recognized: part of an obstacle such as a moving instrument, a piece of equipment or a cart is treated as the obstacle, while the extent of the whole object remains unknown. As a result, when the robot's view is blocked by a person or a cart, it may plan an unsuitable route, move poorly, or even collide. Because the robot's route is relatively fixed, the detected objects can largely be recognized by training with deep learning and similar methods; however, if only the target obstacle is recognized and its pose remains unknown, good motion performance still cannot be achieved.
Disclosure of Invention
The invention aims to provide an object pose detection method, a detection device, a robot and a computer storage medium that solve the problem of detecting the pose of a detected target object while the robot is moving, provide accurate information for planning the robot's movement route, and improve the robot's movement capability and efficiency.
The technical scheme provided by the invention is as follows:
an object pose detection method, comprising:
training, by deep learning, a recognition network for a target object to be recognized from data frames detected by a sensor, and obtaining the recognition network of the target object;
calibrating the camera and the sensor according to a detection data frame of the sensor and an image data frame of the camera to obtain a calibration result;
recognizing, according to the image data frame and the recognition network, the target object, an image range of the target object in the image, and a plurality of feature parts;
extracting, according to the calibration result, the sensor point cloud data within the image range, and obtaining the local point clouds corresponding to the feature parts;
and obtaining, according to the local point clouds, the center positions of the plurality of feature parts, and using the center positions to obtain the pose of the object.
Preferably, the sensor is a depth sensor, including but not limited to an RGBD sensor, a lidar or a solid state lidar.
Further, obtaining the center positions of the feature parts according to the local point clouds and obtaining the pose of the object by using the center positions specifically comprises:
computing the sensor point cloud data and each local point cloud to obtain the center position of the local point cloud;
and calculating the pose of the whole target object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object.
Optionally, calculating the pose of the whole object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object specifically comprises:
determining candidate positions of the feature parts on the target object according to the respective center positions of a first feature part and a second feature part among the plurality of feature parts and the distance between the first feature part and the second feature part;
calculating a candidate pose of the target object according to the candidate positions of the feature parts on the target object;
and obtaining the whole pose of the target object according to the candidate pose of the target object.
Optionally, obtaining the whole pose of the target object according to the candidate pose of the target object specifically comprises:
the candidate pose of the target object comprises a first candidate pose and a second candidate pose, which are obtained from a first candidate position and a second candidate position of the feature parts on the target object, respectively;
obtaining, respectively, a first detection data frame and a second detection data frame of the sensor for the first candidate pose and the second candidate pose;
and judging, respectively, whether the first detection data frame and the second detection data frame overlap the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the overlapping detection data frame as the pose of the target object.
In order to achieve the object of the present invention, an embodiment of the present invention further provides an apparatus for detecting a pose of an object, the apparatus including:
a first recognition module, configured to train, by deep learning, a recognition network for a target object to be recognized from data frames detected by the sensor, and to obtain the recognition network of the target object;
a calibration module, configured to calibrate the camera and the sensor according to a detection data frame of the sensor and an image data frame of the camera to obtain a calibration result;
a second recognition module, configured to recognize, according to the image data frame and the recognition network, the target object, the image range of the target object in the image, and a plurality of feature parts;
a point cloud data calculation module, configured to extract, according to the calibration result, the sensor point cloud data within the image range and to obtain the local point clouds corresponding to the feature parts;
and a pose acquisition module, configured to obtain, according to the local point clouds, the center positions of the plurality of feature parts and to obtain the pose of the object by using the center positions.
Further, the pose acquisition module specifically comprises:
a center position calculating unit, configured to compute the sensor point cloud data and each local point cloud to obtain the center position of the local point cloud;
and a pose calculating unit, configured to calculate the pose of the whole target object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object.
Optionally, the pose calculating unit specifically comprises:
a candidate position determining subunit, configured to determine candidate positions of the feature parts on the target object according to the respective center positions of a first feature part and a second feature part among the plurality of feature parts and the distance between the first feature part and the second feature part;
a candidate pose calculating subunit, configured to calculate a candidate pose of the target object according to the candidate positions of the feature parts on the target object;
and an obtaining subunit, configured to obtain the whole pose of the target object according to the candidate pose of the target object.
In order to achieve the object of the invention, an embodiment of the present invention also provides a robot including a processor and a memory, the processor being coupled to the memory, the memory being for storing a program; the processor is configured to execute the program in the memory, so that the robot performs the method for detecting the pose of the object as described above.
In order to achieve the object of the present invention, an embodiment of the present invention further provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform any of the above-described object pose detection methods.
According to the invention, the data frames of the depth sensor and the camera are used to recognize the target object and its feature parts, and the pose of the whole target object is obtained from those feature parts, so that the robot can plan a reasonable movement route during motion and its movement efficiency is improved.
Drawings
The above features, technical features, advantages and implementations of the object pose detection method and apparatus will be further described in the following description of the preferred embodiments with reference to the accompanying drawings in a clearly understandable manner.
Fig. 1 is a flowchart of an object pose detection method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of an object pose detection device according to an embodiment of the present invention;
fig. 3 is a schematic diagram of another object pose detection apparatus according to an embodiment of the present invention;
fig. 4 is a schematic diagram of another object pose detection apparatus according to an embodiment of the present invention;
fig. 5 is a schematic diagram of an autonomous mobile robot according to an embodiment of the present invention.
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will explain the specific embodiments of the present invention with reference to the accompanying drawings. It is evident that the drawings in the following description are only examples of the invention, from which other drawings and other embodiments can be obtained by a person skilled in the art without inventive effort.
For the sake of simplicity of the drawing, the parts relevant to the present invention are shown only schematically in the figures, which do not represent the actual structure thereof as a product. Additionally, in order to facilitate a concise understanding of the drawings, components having the same structure or function in some of the drawings are only schematically depicted, or only one of them is labeled. Herein, "a" means not only "only this one" but also "more than one" case.
During the development of autonomously moving intelligent devices, the inventor found that sensors must be used to detect the environment in order to achieve autonomous movement. The intelligent device may be an autonomously moving robot, an autonomously driving automobile, or another autonomously walking device. Such devices generally use depth sensors such as lidar to detect target objects and a camera to identify objects in the motion environment of the robot. In the prior art, however, detection or recognition only finds the object: if only part of the object is detected, the degree to which it obstructs the robot cannot be accurately judged, and even if the whole object is obtained, not knowing its pose (for example, the direction in which the object is heading) still affects the robot's path planning. A method is therefore needed to accurately detect the pose of the target object, so that unobstructed or obstacle-avoiding movement can be achieved.
In order to accurately acquire the pose of a target object, the embodiment of the invention provides an object pose detection method.
Referring to fig. 1, an object pose detection method according to an embodiment of the present invention includes:
S1, training, by deep learning, a recognition network for a target object to be recognized from data frames detected by a sensor, and obtaining the recognition network of the target object;
first, the object to be recognized and the feature parts on the object, such as a cart and its wheels, are trained using a mature deep learning method;
the embodiment of the invention does not limit the network used for deep learning. For example, fully convolutional network techniques can be transferred to the task of detecting objects in 3D range scan data. Specifically, the scenario is set up as a detection task on range data from a Velodyne 64E lidar: the data are presented as a 2D point map, and a single 2D end-to-end fully convolutional network predicts target confidence and bounding boxes simultaneously. With a suitably designed box encoding, a complete 3D box can also be predicted by the 2D convolutional network.
Alternatively, to eliminate manual feature engineering on the 3D point cloud, VoxelNet, a generic 3D detection network, can be used; it unifies feature extraction and box prediction in a single-stage, end-to-end trainable deep network.
The method is also suitable for sensor fusion, free-space estimation and machine-learning approaches based on a grid-map representation of the environment, which mainly use a deep CNN to detect and classify targets. As input to the CNN, the 3D range sensor information is efficiently encoded in a multi-layer grid map. The range sensor measurements are converted into this multi-layer grid map and fed to the target detection and classification network; the CNN simultaneously infers rotated 3D bounding boxes and their semantic categories, and the inference output is a list of rotated bounding boxes with associated semantic categories. These boxes are projected into the camera image for visual verification.
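As a non-limiting illustration of such a grid-map encoding, the following sketch converts a point cloud into a multi-layer 2D grid that can be stacked as CNN input channels; the layer choice (point count, maximum height, mean intensity), the ranges and the cell size are assumptions made for the example and are not features of the invention:

import numpy as np

def encode_multilayer_grid(points, intensities, x_range=(0.0, 40.0),
                           y_range=(-20.0, 20.0), cell=0.1):
    # Encode a 3D point cloud (N x 3) and per-point intensities (N,) into a
    # multi-layer 2D grid map: point count, maximum height, mean intensity.
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    count = np.zeros((nx, ny), dtype=np.float32)
    max_z = np.full((nx, ny), -np.inf, dtype=np.float32)
    sum_i = np.zeros((nx, ny), dtype=np.float32)
    ix = ((points[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points[:, 1] - y_range[0]) / cell).astype(int)
    valid = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    for i, j, z, r in zip(ix[valid], iy[valid], points[valid, 2], intensities[valid]):
        count[i, j] += 1.0
        max_z[i, j] = max(max_z[i, j], z)
        sum_i[i, j] += r
    mean_i = np.divide(sum_i, count, out=np.zeros_like(sum_i), where=count > 0)
    max_z[count == 0] = 0.0
    return np.stack([count, max_z, mean_i], axis=0)  # shape (3, nx, ny)

The resulting (3, nx, ny) array plays the role of the multi-layer grid map described above and can be fed to a 2D CNN that regresses rotated bounding boxes and semantic categories.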
S2, calibrating the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera to obtain a calibration result;
the spatial transformation from the sensor to the camera is found through calibration; conversion between the different coordinate systems requires a rotation matrix R and a translation T, which prepares for the subsequent fusion of the sensor and camera data.
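A minimal sketch of how such a calibration result is used is given below: the rotation R, the translation T and the camera intrinsic matrix K are assumed to be known from calibration, and the sensor points are transformed into the camera frame and projected onto the image plane (variable names are illustrative only):

import numpy as np

def project_points_to_image(points_sensor, R, T, K):
    # Transform sensor-frame points (N x 3) into the camera frame using the
    # calibration result (R, T), then project them with the intrinsics K.
    pts_cam = (R @ points_sensor.T).T + T        # sensor frame -> camera frame
    in_front = pts_cam[:, 2] > 0                 # keep points in front of the camera
    pts_cam = pts_cam[in_front]
    uvw = (K @ pts_cam.T).T                      # perspective projection
    uv = uvw[:, :2] / uvw[:, 2:3]                # pixel coordinates (u, v)
    return uv, pts_cam, in_front

This projection is what later allows the sensor points falling inside a recognized image range to be extracted.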
S3, recognizing, according to the image data frame and the recognition network, the target object, an image range of the target object in the image, and a plurality of feature parts;
the target object in the embodiment of the present invention refers to an image for performing pose detection, and may include a person, an instrument device, a cart, and the like. In the embodiment of the present invention, the image of the target object may be acquired first, for example, the target image may be selected from the stored image data, or the transmitted target image may be received from another device, or the target image may be directly captured by the image capturing device, which is merely illustrative of acquiring the target image, and the embodiment of the present invention is not limited thereto.
After the image of the target object is acquired, the target object in the image may be recognized. The target object in the target image may be recognized by an image recognition algorithm, or by a trained machine learning network model, where the machine learning network model may be a neural network model, a deep learning neural network model, or the like; this is not limited in the embodiment of the present invention.
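Since the embodiment does not fix a particular recognition network, the following sketch merely illustrates running an off-the-shelf, pretrained detector on a camera image data frame; the detector, the file name and the score threshold are assumptions made for the example and stand in for the trained recognition network of step S1 (torchvision 0.13 or later is assumed for the weights argument):

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = Image.open("frame.png").convert("RGB")   # camera image data frame (hypothetical file)
with torch.no_grad():
    prediction = model([to_tensor(image)])[0]    # dict with 'boxes', 'labels', 'scores'

keep = prediction["scores"] > 0.5                # keep confident detections
boxes = prediction["boxes"][keep]                # (N, 4) boxes: (u_min, v_min, u_max, v_max)
labels = prediction["labels"][keep]              # class of each box, e.g. cart body or wheel

Each retained box gives the image range of the target object or of one of its feature parts.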
For example, a hospital bed and its wheels are recognized by deep learning, and the point clouds of the bed and the wheels are obtained using the calibrated relationship between the camera and the sensor. The overall coordinates and pose of the bed can then be calculated from the relative distances among the bed's four wheels and the recognized coordinates of the wheel point clouds.
S4, extracting, according to the calibration result, the sensor point cloud data within the image range, and obtaining the local point clouds corresponding to the feature parts;
S5, obtaining, according to the local point clouds, the center positions of the plurality of feature parts, and using the center positions to obtain the pose of the object.
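A minimal sketch of steps S4 and S5 is given below. It assumes that uv contains the pixel coordinates of the projected sensor points (aligned one-to-one with points_sensor, see the projection sketch above) and that boxes maps each recognized feature-part name to its 2D bounding box; these names are illustrative rather than prescribed:

import numpy as np

def feature_part_centers(points_sensor, uv, boxes):
    # For each recognized feature part, collect the sensor points whose image
    # projection falls inside its bounding box (the local point cloud) and
    # return the centroid of that local point cloud as the part's center.
    centers = {}
    for name, (u0, v0, u1, v1) in boxes.items():
        mask = (uv[:, 0] >= u0) & (uv[:, 0] <= u1) & \
               (uv[:, 1] >= v0) & (uv[:, 1] <= v1)
        local_cloud = points_sensor[mask]
        if len(local_cloud) > 0:
            centers[name] = local_cloud.mean(axis=0)   # center position of the part
    return centers

The returned center positions are the inputs from which the pose of the whole object is computed.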
Preferably, the sensor is a depth sensor, including but not limited to an RGBD sensor, a lidar or a solid state lidar.
Further, obtaining the center positions of the feature parts according to the local point clouds and obtaining the pose of the object by using the center positions specifically comprises:
computing the sensor point cloud data and each local point cloud to obtain the center position of the local point cloud;
and calculating the pose of the whole target object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object.
Optionally, calculating the pose of the whole object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object specifically comprises:
determining candidate positions of the feature parts on the target object according to the respective center positions of a first feature part and a second feature part among the plurality of feature parts and the distance between the first feature part and the second feature part;
calculating a candidate pose of the target object according to the candidate positions of the feature parts on the target object;
and obtaining the whole pose of the target object according to the candidate pose of the target object.
Optionally, obtaining the whole pose of the target object according to the candidate pose of the target object specifically comprises:
the candidate pose of the target object comprises a first candidate pose and a second candidate pose, which are obtained from a first candidate position and a second candidate position of the feature parts on the target object, respectively;
obtaining, respectively, a first detection data frame and a second detection data frame of the sensor for the first candidate pose and the second candidate pose;
and judging, respectively, whether the first detection data frame and the second detection data frame overlap the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the overlapping detection data frame as the pose of the target object.
For example, knowing the center positions W1 and W2 of two wheels of a cart and combining this with the fact that the cart has four wheels, the candidate poses P1 and P2 of the whole cart can be calculated (since the cart has four wheels, the distance between two wheels typically takes one of three values L1, L2 and L3, so from W1 and W2 it can be deduced which pair of wheels they might be);
assuming the cart is at candidate pose P1 and P2 respectively, check whether the depth data returned by the sensor are consistent with the object being at that pose: for example, when the object is at P1, which parts of the object should be detected but are not actually detected. This yields two scores S1 and S2, and the candidate pose with the highest score is selected as the pose of the object.
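A minimal sketch of this example is given below, working in the 2D ground plane. The wheel layout of the cart in its own frame, the matching tolerance and the scoring rule (a model point counts as detected if an observed point lies close enough to it) are assumptions made for the illustration:

import numpy as np

def candidate_poses_from_wheels(w1, w2, wheel_layout, tol=0.05):
    # Given the measured centers w1, w2 of two wheels (sensor frame, x-y) and
    # the known wheel positions in the cart frame, return candidate poses
    # (x, y, yaw) of the cart whose wheel spacing matches |w1 - w2|.
    w1, w2 = np.asarray(w1, float), np.asarray(w2, float)
    d = np.linalg.norm(w2 - w1)
    candidates = []
    for i in range(len(wheel_layout)):
        for j in range(len(wheel_layout)):
            if i == j:
                continue
            a, b = np.asarray(wheel_layout[i], float), np.asarray(wheel_layout[j], float)
            if abs(np.linalg.norm(b - a) - d) > tol:
                continue                          # spacing (L1/L2/L3) does not match
            # yaw that rotates the layout segment a->b onto the measured segment w1->w2
            yaw = np.arctan2(*(w2 - w1)[::-1]) - np.arctan2(*(b - a)[::-1])
            R = np.array([[np.cos(yaw), -np.sin(yaw)],
                          [np.sin(yaw),  np.cos(yaw)]])
            center = w1 - R @ a                   # cart center in the sensor frame
            candidates.append((center[0], center[1], yaw))
    return candidates

def score_candidate(pose, model_points, observed_points, match_dist=0.10):
    # Place the cart's model points (cart frame, x-y) at the candidate pose and
    # count how many of them are actually supported by observed sensor points.
    x, y, yaw = pose
    R = np.array([[np.cos(yaw), -np.sin(yaw)],
                  [np.sin(yaw),  np.cos(yaw)]])
    expected = model_points @ R.T + np.array([x, y])
    hits = sum(np.min(np.linalg.norm(observed_points - p, axis=1)) < match_dist
               for p in expected)
    return hits / max(len(expected), 1)

For a rectangular cart the wheel_layout could, for example, be the four corners (±Lx/2, ±Ly/2); the two scores S1 and S2 of the example then correspond to score_candidate evaluated at the candidate poses P1 and P2, and the candidate with the higher score is kept.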
In order to achieve the object of the present invention, as shown in fig. 2, an embodiment of the present invention further provides an apparatus 100 for detecting a pose of an object, the apparatus including:
the first recognition module 11 is configured to train, by deep learning, a recognition network for a target object to be recognized from data frames detected by the sensor, so as to obtain the recognition network of the target object;
the calibration module 12 is configured to calibrate the camera and the sensor according to the detection data frame of the sensor and the image data frame of the camera to obtain a calibration result;
the second recognition module 13 is configured to recognize, according to the image data frame and the recognition network, the target object, the image range of the target object in the image, and a plurality of feature parts;
the point cloud data calculation module 14 is configured to extract, according to the calibration result, the sensor point cloud data within the image range, and to obtain the local point clouds corresponding to the feature parts;
and the pose acquisition module 15 is configured to obtain, according to the local point clouds, the center positions of the plurality of feature parts, and to obtain the pose of the object by using the center positions.
Further, as shown in fig. 3, the pose acquisition module specifically comprises:
a center position calculating unit 151, configured to compute the sensor point cloud data and each local point cloud to obtain the center position of the local point cloud;
and a pose calculating unit 152, configured to calculate the pose of the whole target object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object.
Optionally, as shown in fig. 4, the pose calculating unit 152 specifically comprises:
a candidate position determining subunit 1521, configured to determine candidate positions of the feature parts on the target object according to the respective center positions of a first feature part and a second feature part among the plurality of feature parts and the distance between the first feature part and the second feature part;
a candidate pose calculating subunit 1522, configured to calculate a candidate pose of the target object according to the candidate positions of the feature parts on the target object;
and an obtaining subunit 1523, configured to obtain the whole pose of the target object according to the candidate pose of the target object.
The candidate pose of the target object comprises a first candidate pose and a second candidate pose, which are obtained from a first candidate position and a second candidate position of the feature parts on the target object, respectively;
a first detection data frame and a second detection data frame of the sensor are obtained for the first candidate pose and the second candidate pose, respectively;
and it is judged, respectively, whether the first detection data frame and the second detection data frame overlap the sensor point cloud data of the target object; if so, the candidate pose corresponding to the overlapping detection data frame is taken as the pose of the target object.
For example, knowing the center positions W1 and W2 of two wheels of a cart and combining this with the fact that the cart has four wheels, the candidate poses P1 and P2 of the whole cart can be calculated (since the cart has four wheels, the distance between two wheels typically takes one of three values L1, L2 and L3, so from W1 and W2 it can be deduced which pair of wheels they might be);
assuming the cart is at candidate pose P1 and P2 respectively, check whether the depth data returned by the sensor are consistent with the object being at that pose: for example, when the object is at P1, which parts of the object should be detected but are not actually detected. This yields two scores S1 and S2, and the candidate pose with the highest score is selected as the pose of the object.
In order to achieve the object of the invention, an embodiment of the present invention also provides a robot including a processor and a memory, the processor being coupled to the memory, the memory being for storing a program; the processor is configured to execute the program in the memory, so that the robot performs the method for detecting the pose of the object as described above.
In order to achieve the object of the present invention, an embodiment of the present invention further provides a computer-readable storage medium having instructions stored therein which, when run on a computer, cause the computer to perform any of the above-described object pose detection methods.
According to the invention, the data frames of the depth sensor and the camera are used to recognize the target object and its feature parts, and the pose of the whole target object is obtained from those feature parts, so that the robot can plan a reasonable movement route during motion and its movement efficiency is improved.
It should be noted that the embodiment of the pose detection apparatus provided by the present invention and the embodiment of the pose detection method provided by the present invention are based on the same inventive concept and can achieve the same technical effects; for further details of the pose detection apparatus embodiment, reference may therefore be made to the foregoing description of the pose detection method embodiment.
It should be noted that the above division of the detection device into modules or units is merely a division by logical function; in an actual implementation they may be fully or partially integrated into one physical entity, or may be physically separate. These units may all be implemented in software invoked by a processor, all in hardware, or partly in software invoked by a processor and partly in hardware.
For example, the functions of the above modules or units may be stored in a memory in the form of program code that the processor schedules and executes to realize the functions of the above units. The processor may be a general purpose processor, such as a central processing unit (Central Processing Unit, CPU), or another processor capable of invoking a program. As another example, each of the above units may be one or more integrated circuits configured to implement the above methods, for example one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), or one or more Field Programmable Gate Arrays (FPGAs). The two approaches may also be combined, with some functions realized by code scheduled on a processor and others by hardware integrated circuits. When the above functions are integrated together, they may be implemented in the form of a system-on-a-chip (SOC).
The detection device provided in the embodiment of the application may specifically be a chip, the chip comprising a processing unit and a communication unit; the processing unit may be, for example, a processor, and the communication unit may be, for example, an input/output interface, pins or circuitry. The processing unit may execute the computer-executable instructions stored in the storage unit, so that the chip within the detection device performs the steps performed by the detection device described in the above embodiment, for example the embodiment shown in fig. 2.
Optionally, the storage unit is a storage unit in the chip, such as a register or a cache; the storage unit may also be a storage unit located outside the chip in the detection device, such as a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (random access memory, RAM).
In order to achieve the object of the invention, as shown in fig. 5, an embodiment of the present invention further provides a robot 180, the robot 180 including a processor 1803 and a memory 1804, the processor 1803 being coupled to the memory 1804, wherein,
the memory 1804 is used for storing programs;
the processor 1803 is configured to execute the program in the memory, so that the robot performs the method for detecting the pose of the object as described above.
Referring to fig. 5, the method disclosed in the embodiment of the present invention corresponding to the embodiment of fig. 1 may be applied to an autonomous mobile robot 180. The robot 180 includes a processor 1803, which may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be performed by integrated hardware logic circuitry in the processor 1803 or by instructions in the form of software. The processor 1803 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor or a microcontroller, and may further include an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic devices, or discrete hardware components. The processor 1803 may implement or perform the methods, steps and logic blocks disclosed in the embodiment corresponding to fig. 1 of the present application.
A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being performed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, or a register. The storage medium is located in the memory 1804, and the processor 1803 reads the information in the memory 1804 and, in combination with its hardware, performs the steps of the method described above.
The receiver 1801 may be used to receive input numeric or character information and to generate signal inputs related to the relevant setup and control of the robot 180. The transmitter 1802 is operable to output numeric or character information via a first interface; the transmitter 1802 is further operable to send instructions to the disk stack via the first interface to modify data in the disk stack; the transmitter 1802 may also include a display device such as a display screen.
In an embodiment of the present invention, there is also provided a computer-readable storage medium having stored therein a program for signal processing which, when executed on a computer, causes the computer to perform the steps of the object pose detection method described in the foregoing embodiments, or to perform the steps performed by the detection apparatus described in the foregoing embodiment shown in fig. 2.
It should be further noted that the above-described apparatus embodiments are merely illustrative. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiments provided by the application, the connection relationship between modules indicates that they have a communication connection, which may be specifically implemented as one or more communication buses or signal lines.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by means of software plus necessary general purpose hardware, or of course may be implemented by dedicated hardware including application specific integrated circuits, dedicated CPUs, dedicated memories, dedicated components and the like. Generally, functions performed by computer programs can be easily implemented by corresponding hardware, and specific hardware structures for implementing the same functions can be varied, such as analog circuits, digital circuits, or dedicated circuits. However, a software program implementation is a preferred embodiment in many cases for the present application. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a readable storage medium, such as a floppy disk, a usb disk, a removable hard disk, a ROM, a RAM, a magnetic disk or an optical disk of a computer, etc., including several instructions for causing a computer device (which may be a personal computer, or a network device, etc.) to execute the method described in the embodiments of the present application.
In the above embodiments, the solution may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, the procedures or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, which may be any available medium that a computer can access, or a data storage device, such as a training device or a data center, that integrates one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
It should be noted that the above embodiments can be freely combined as needed. The foregoing is merely a preferred embodiment of the present invention; modifications and adaptations may be made by those skilled in the art without departing from the principles of the present invention, and such modifications are intended to fall within the scope of the present invention.

Claims (5)

1. An object pose detection method, characterized by comprising the following steps:
training, by deep learning, a recognition network for a target object to be recognized from data frames detected by a depth sensor, and obtaining the recognition network of the target object;
calibrating a camera and the sensor according to a detection data frame of the sensor and an image data frame of the camera, the calibration finding the spatial transformation from the sensor to the camera and yielding a calibration result;
recognizing, according to the image data frame and the recognition network, the target object, an image range of the target object in an image, and a plurality of feature parts;
extracting, according to the calibration result, sensor point cloud data within the image range, and obtaining local point clouds corresponding to the feature parts;
obtaining, according to the local point clouds, center positions of the plurality of feature parts, and obtaining the pose of the object by using the center positions, which specifically comprises: computing the sensor point cloud data and each local point cloud to obtain the center position of the local point cloud;
and calculating the pose of the whole target object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object, wherein calculating the pose of the whole object according to the center positions of the local point clouds and the positions of the feature parts relative to the center of the target object specifically comprises:
determining candidate positions of the feature parts on the target object according to the respective center positions of a first feature part and a second feature part among the plurality of feature parts and the distance between the first feature part and the second feature part;
calculating a candidate pose of the target object according to the candidate positions of the feature parts on the target object;
obtaining the whole pose of the target object according to the candidate pose of the target object, wherein obtaining the whole pose of the target object according to the candidate pose of the target object specifically comprises:
the candidate pose of the target object comprising a first candidate pose and a second candidate pose, the first candidate pose and the second candidate pose being obtained from a first candidate position and a second candidate position of the feature parts on the target object, respectively;
obtaining, respectively, a first detection data frame and a second detection data frame of the sensor for the first candidate pose and the second candidate pose;
and judging, respectively, whether the first detection data frame and the second detection data frame overlap the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the overlapping detection data frame as the pose of the target object.
2. The object pose detection method according to claim 1, wherein the depth sensor includes, but is not limited to, an RGBD sensor or a lidar.
3. An apparatus for detecting a pose of an object, the apparatus comprising:
a first recognition module, configured to train, by deep learning, a recognition network for a target object to be recognized from data frames detected by a depth sensor, and to obtain the recognition network of the target object;
a calibration module, configured to calibrate the camera and the sensor according to a detection data frame of the sensor and an image data frame of the camera, the calibration finding the spatial transformation from the sensor to the camera and yielding a calibration result;
a second recognition module, configured to recognize, according to the image data frame and the recognition network, the target object, an image range of the target object in an image, and a plurality of feature parts;
a point cloud data calculation module, configured to extract, according to the calibration result, sensor point cloud data within the image range and to obtain local point clouds corresponding to the feature parts;
a pose acquisition module, configured to obtain, according to the local point clouds, center positions of the plurality of feature parts and to obtain the pose of the object by using the center positions, wherein the pose acquisition module specifically comprises:
a center position calculating unit, configured to compute the sensor point cloud data and each local point cloud to obtain the center position of the local point cloud;
a pose calculating unit, configured to calculate the pose of the whole target object according to the center positions of the plurality of local point clouds and the positions of the feature parts relative to the center of the target object, wherein the pose calculating unit specifically comprises:
a candidate position determining subunit, configured to determine candidate positions of the feature parts on the target object according to the respective center positions of a first feature part and a second feature part among the plurality of feature parts and the distance between the first feature part and the second feature part;
a candidate pose calculating subunit, configured to calculate a candidate pose of the target object according to the candidate positions of the feature parts on the target object;
an obtaining subunit, configured to obtain the whole pose of the target object according to the candidate pose of the target object, wherein obtaining the whole pose of the target object according to the candidate pose of the target object specifically comprises:
the candidate pose of the target object comprising a first candidate pose and a second candidate pose, the first candidate pose and the second candidate pose being obtained from a first candidate position and a second candidate position of the feature parts on the target object, respectively;
obtaining, respectively, a first detection data frame and a second detection data frame of the sensor for the first candidate pose and the second candidate pose;
and judging, respectively, whether the first detection data frame and the second detection data frame overlap the sensor point cloud data of the target object, and if so, taking the candidate pose corresponding to the overlapping detection data frame as the pose of the target object.
4. A robot comprising a processor and a memory, said processor being coupled to said memory, characterized in that,
the memory is used for storing programs;
the processor for executing a program in the memory, causing the robot to perform the method of any one of claims 1-2.
5. A computer storage medium comprising a program which, when run on a computer, causes the computer to perform the method of any of claims 1-2.
CN202110895680.8A 2021-08-05 2021-08-05 Object pose detection method, detection device, robot and storable medium Active CN113609985B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110895680.8A CN113609985B (en) 2021-08-05 2021-08-05 Object pose detection method, detection device, robot and storable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110895680.8A CN113609985B (en) 2021-08-05 2021-08-05 Object pose detection method, detection device, robot and storable medium

Publications (2)

Publication Number Publication Date
CN113609985A CN113609985A (en) 2021-11-05
CN113609985B true CN113609985B (en) 2024-02-23

Family

ID=78307027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110895680.8A Active CN113609985B (en) 2021-08-05 2021-08-05 Object pose detection method, detection device, robot and storable medium

Country Status (1)

Country Link
CN (1) CN113609985B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109448034A (en) * 2018-10-24 2019-03-08 华侨大学 A kind of part pose acquisition methods based on geometric primitive
CN110363816A (en) * 2019-06-25 2019-10-22 广东工业大学 A kind of mobile robot environment semanteme based on deep learning builds drawing method
CN110533722A (en) * 2019-08-30 2019-12-03 的卢技术有限公司 A kind of the robot fast relocation method and system of view-based access control model dictionary
CN110579215A (en) * 2019-10-22 2019-12-17 上海木木机器人技术有限公司 positioning method based on environmental feature description, mobile robot and storage medium
CN111368852A (en) * 2018-12-26 2020-07-03 沈阳新松机器人自动化股份有限公司 Article identification and pre-sorting system and method based on deep learning and robot
WO2020155616A1 (en) * 2019-01-29 2020-08-06 浙江省北大信息技术高等研究院 Digital retina-based photographing device positioning method
CN111563442A (en) * 2020-04-29 2020-08-21 上海交通大学 Slam method and system for fusing point cloud and camera image data based on laser radar
WO2020259248A1 (en) * 2019-06-28 2020-12-30 Oppo广东移动通信有限公司 Depth information-based pose determination method and device, medium, and electronic apparatus
CN112731358A (en) * 2021-01-08 2021-04-30 奥特酷智能科技(南京)有限公司 Multi-laser-radar external parameter online calibration method
CN112784873A (en) * 2020-12-25 2021-05-11 华为技术有限公司 Semantic map construction method and equipment
CN112967347A (en) * 2021-03-30 2021-06-15 深圳市优必选科技股份有限公司 Pose calibration method and device, robot and computer readable storage medium
CN113034575A (en) * 2021-01-27 2021-06-25 深圳市华汉伟业科技有限公司 Model construction method, pose estimation method and object picking device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9558559B2 (en) * 2013-04-05 2017-01-31 Nokia Technologies Oy Method and apparatus for determining camera location information and/or camera pose information according to a global coordinate system
US10262243B2 (en) * 2017-05-24 2019-04-16 General Electric Company Neural network point cloud generation system
CN109145680B (en) * 2017-06-16 2022-05-27 阿波罗智能技术(北京)有限公司 Method, device and equipment for acquiring obstacle information and computer storage medium
CN110307838B (en) * 2019-08-26 2019-12-10 深圳市优必选科技股份有限公司 Robot repositioning method and device, computer-readable storage medium and robot
US11262759B2 (en) * 2019-10-16 2022-03-01 Huawei Technologies Co., Ltd. Method and system for localization of an autonomous vehicle in real time
US11940804B2 (en) * 2019-12-17 2024-03-26 Motional Ad Llc Automated object annotation using fused camera/LiDAR data points

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109448034A (en) * 2018-10-24 2019-03-08 华侨大学 A kind of part pose acquisition methods based on geometric primitive
CN111368852A (en) * 2018-12-26 2020-07-03 沈阳新松机器人自动化股份有限公司 Article identification and pre-sorting system and method based on deep learning and robot
WO2020155616A1 (en) * 2019-01-29 2020-08-06 浙江省北大信息技术高等研究院 Digital retina-based photographing device positioning method
CN110363816A (en) * 2019-06-25 2019-10-22 广东工业大学 A kind of mobile robot environment semanteme based on deep learning builds drawing method
WO2020259248A1 (en) * 2019-06-28 2020-12-30 Oppo广东移动通信有限公司 Depth information-based pose determination method and device, medium, and electronic apparatus
CN110533722A (en) * 2019-08-30 2019-12-03 的卢技术有限公司 A kind of the robot fast relocation method and system of view-based access control model dictionary
CN110579215A (en) * 2019-10-22 2019-12-17 上海木木机器人技术有限公司 positioning method based on environmental feature description, mobile robot and storage medium
CN111563442A (en) * 2020-04-29 2020-08-21 上海交通大学 Slam method and system for fusing point cloud and camera image data based on laser radar
CN112784873A (en) * 2020-12-25 2021-05-11 华为技术有限公司 Semantic map construction method and equipment
CN112731358A (en) * 2021-01-08 2021-04-30 奥特酷智能科技(南京)有限公司 Multi-laser-radar external parameter online calibration method
CN113034575A (en) * 2021-01-27 2021-06-25 深圳市华汉伟业科技有限公司 Model construction method, pose estimation method and object picking device
CN112967347A (en) * 2021-03-30 2021-06-15 深圳市优必选科技股份有限公司 Pose calibration method and device, robot and computer readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Object pose estimation based on image semantic segmentation; 王宪伦; 张海洲; 安立雄; 机械制造与自动化 (02); pp. 216-220 *
Deep-learning-based optimal grasping pose detection method for robots; 李秀智; 李家豪; 张祥银; 彭小彬; 仪器仪表学报 (05); pp. 108-117 *
Design of a real-time target recognition and localization system for an indoor service robot; 黄海卫; 孔令成; 谭治英; 计算机工程与设计 (08); pp. 2228-2232 *

Also Published As

Publication number Publication date
CN113609985A (en) 2021-11-05

Similar Documents

Publication Publication Date Title
US11216971B2 (en) Three-dimensional bounding box from two-dimensional image and point cloud data
EP3627180B1 (en) Sensor calibration method and device, computer device, medium, and vehicle
JP6794436B2 (en) Systems and methods for unobstructed area detection
CN111079619B (en) Method and apparatus for detecting target object in image
WO2022012158A1 (en) Target determination method and target determination device
CN110799989A (en) Obstacle detection method, equipment, movable platform and storage medium
CN111931764A (en) Target detection method, target detection framework and related equipment
Liang et al. Image-based positioning of mobile devices in indoor environments
CN112907625B (en) Target following method and system applied to quadruped bionic robot
WO2024087962A1 (en) Truck bed orientation recognition system and method, and electronic device and storage medium
EP3703008A1 (en) Object detection and 3d box fitting
Shi et al. An improved lightweight deep neural network with knowledge distillation for local feature extraction and visual localization using images and LiDAR point clouds
CN113781519A (en) Target tracking method and target tracking device
US11308324B2 (en) Object detecting system for detecting object by using hierarchical pyramid and object detecting method thereof
Ishihara et al. Deep radio-visual localization
CN115147333A (en) Target detection method and device
CN116563376A (en) LIDAR-IMU tight coupling semantic SLAM method based on deep learning and related device
CN116310673A (en) Three-dimensional target detection method based on fusion of point cloud and image features
CN115424233A (en) Target detection method and target detection device based on information fusion
US20220164595A1 (en) Method, electronic device and storage medium for vehicle localization
Ponnaganti et al. Deep learning for lidar-based autonomous vehicles in smart cities
CN114972492A (en) Position and pose determination method and device based on aerial view and computer storage medium
CN113609985B (en) Object pose detection method, detection device, robot and storable medium
CN115131756A (en) Target detection method and device
CN114384486A (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant