CN111325795B - Image processing method, device, storage medium and robot - Google Patents


Info

Publication number
CN111325795B
CN111325795B (application CN202010117760.6A)
Authority
CN
China
Prior art keywords
grabbing
data information
points
target
image processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010117760.6A
Other languages
Chinese (zh)
Other versions
CN111325795A (en)
Inventor
周韬
王旭新
成慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Sensetime Technology Co Ltd
Original Assignee
Shenzhen Sensetime Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Sensetime Technology Co Ltd filed Critical Shenzhen Sensetime Technology Co Ltd
Priority to CN202010117760.6A priority Critical patent/CN111325795B/en
Publication of CN111325795A publication Critical patent/CN111325795A/en
Application granted granted Critical
Publication of CN111325795B publication Critical patent/CN111325795B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B25 - HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J - MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 9/00 - Programme-controlled manipulators
    • B25J 9/16 - Programme controls
    • B25J 9/1656 - Programme controls characterised by programming, planning systems for manipulators
    • B25J 9/1664 - Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • B - PERFORMING OPERATIONS; TRANSPORTING
    • B25 - HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J - MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J 9/00 - Programme-controlled manipulators
    • B25J 9/16 - Programme controls
    • B25J 9/1679 - Programme controls characterised by the tasks executed
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/70 - Determining position or orientation of objects or cameras
    • G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/75 - Determining position or orientation of objects or cameras using feature-based methods involving models
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10004 - Still image; Photographic image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20084 - Artificial neural networks [ANN]
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P 90/00 - Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P 90/30 - Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the disclosure provide an image processing method, an image processing device, a storage medium and a robot. The method comprises: determining, according to image data information of a multi-dimensional image to be processed, a plurality of grabbing surfaces, and a plurality of grabbing points in one-to-one correspondence with the grabbing surfaces, from the multi-dimensional image to be processed; determining a plurality of grabbing parameters corresponding to the grabbing surfaces; evaluating the plurality of grabbing surfaces by using the plurality of grabbing parameters, and determining a target grabbing surface from the plurality of grabbing surfaces according to the evaluation result; taking the grabbing point corresponding to the target grabbing surface as a target grabbing point; and determining, according to the target grabbing point, a grabbing point pose corresponding to the target grabbing point, so as to grab, according to the grabbing point pose, a target object corresponding to the target grabbing point from the multi-dimensional image to be processed.

Description

Image processing method, device, storage medium and robot
Technical Field
The present disclosure relates to the field of computer vision, and in particular, to an image processing method, an image processing device, a storage medium, and a robot.
Background
In recent years, pose calculation of objects has very important applications in the fields of robotics, automation and machine vision, in particular in the field of computer vision.
In the prior art, the image processing device determines the pose of the target object solely according to the height information of the target grabbing surface of the target object, which reduces the accuracy with which the image processing device determines the pose of the target object.
Disclosure of Invention
The embodiment of the disclosure provides an image processing method, an image processing device, a storage medium and a robot.
The technical scheme of the present disclosure is realized as follows:
the present embodiment provides an image processing method, including:
determining a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the grabbing surfaces from the multidimensional image to be processed according to image data information of the multidimensional image to be processed, wherein the grabbing surfaces and the grabbing points are in one-to-one correspondence;
determining a plurality of grabbing parameters corresponding to the grabbing surfaces;
evaluating the plurality of grabbing surfaces by utilizing the plurality of grabbing parameters, and determining a target grabbing surface from the plurality of grabbing surfaces according to an evaluation result;
taking the grabbing point corresponding to the target grabbing surface as a target grabbing point;
and determining a grabbing point pose corresponding to the target grabbing point according to the target grabbing point, so as to grab a target object corresponding to the target grabbing point from the multi-dimensional image to be processed according to the grabbing point pose.
The image processing device evaluates the plurality of grabbing surfaces according to the plurality of grabbing parameter values so as to determine the target grabbing surface, rather than determining the target grabbing surface from a single height parameter value. This improves the accuracy with which the image processing device determines the target grabbing surface; the grabbing point pose of the target object is then determined from this more accurately determined target grabbing surface, which in turn improves the accuracy with which the image processing device determines the pose of the target object.
In the above solution, the evaluating the plurality of grabbing surfaces by using the plurality of grabbing parameters, and determining the target grabbing surface from the plurality of grabbing surfaces according to the evaluation result, includes:
evaluating each of the plurality of grabbing surfaces by using the plurality of grabbing parameters, to obtain a plurality of grabbing surface evaluation values corresponding to the plurality of grabbing surfaces;
and determining, from the plurality of grabbing surface evaluation values, a first grabbing surface evaluation value with the highest evaluation value, and taking the grabbing surface corresponding to the first grabbing surface evaluation value as the target grabbing surface.
The image processing device evaluates each grabbing surface according to the parameter values, and determines the target grabbing surface from the grabbing surfaces according to the evaluation values, so that the accuracy of the image processing device in determining the target grabbing surface is improved.
In the above aspect, the plurality of grabbing parameters includes at least one of:
the area parameter of the grabbing surface, the height parameter of the grabbing surface, the flatness parameter of the grabbing surface and the gradient parameter of the grabbing surface.
The image processing device can evaluate the plurality of grabbing surfaces by utilizing the grabbing parameters to improve the accuracy in determining the target grabbing surface.
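As an illustration of the evaluation described above, the following sketch scores each candidate surface as a weighted combination of the four parameters named in the claim (area, height, flatness, gradient) and selects the highest-scoring one. The weight values, the assumption that the parameters are normalized, and the sign convention for the gradient term are illustrative assumptions; the disclosure does not fix a particular scoring formula.

```python
# Illustrative sketch of the surface-evaluation step. The weights and the
# candidate parameter values are assumptions for demonstration only.

def evaluate_surfaces(surfaces, weights=None):
    """Score each candidate grabbing surface from its grabbing parameters
    and return (best_index, scores)."""
    if weights is None:
        # Assumed relative importance of the four parameters named in the text.
        weights = {"area": 0.3, "height": 0.3, "flatness": 0.2, "gradient": 0.2}
    scores = []
    for s in surfaces:
        # Larger area/height/flatness are assumed better; a steeper gradient
        # (tilted surface) is assumed worse, hence the minus sign.
        score = (weights["area"] * s["area"]
                 + weights["height"] * s["height"]
                 + weights["flatness"] * s["flatness"]
                 - weights["gradient"] * s["gradient"])
        scores.append(score)
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores

# Two candidate surfaces with normalized parameter values.
candidates = [
    {"area": 0.8, "height": 0.5, "flatness": 0.9, "gradient": 0.1},
    {"area": 0.6, "height": 0.9, "flatness": 0.4, "gradient": 0.5},
]
best, scores = evaluate_surfaces(candidates)
```

A weighted sum is only one possible evaluation; a trained scoring model or a lexicographic ranking over the same parameters would fit the claim language equally well.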
In the above aspect, the determining, according to the image data information of the multi-dimensional image to be processed, a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the plurality of grabbing surfaces from the multi-dimensional image to be processed includes:
inputting image data information of the multidimensional image to be processed into a deep learning network model to obtain a plurality of pixel points and a plurality of center points corresponding to the pixel points, wherein the deep learning network model is a model obtained by training an initial deep learning network model by using sample image data information of a sample multidimensional image, and the pixel points correspond to the center points one by one;
dividing the plurality of center points into a plurality of groups of center points;
determining a grabbing point corresponding to any group of center points according to any group of center points in the plurality of groups of center points until the plurality of grabbing points are determined from the plurality of groups of center points;
and determining one grabbing surface corresponding to any group of center points according to any group of pixel points corresponding to any group of center points in the plurality of groups of center points until the plurality of grabbing surfaces are determined from the plurality of groups of pixel points corresponding to the plurality of groups of center points, wherein any group of center points corresponds to any group of pixel points one by one.
The image processing device determines a plurality of pixel points and a plurality of center points corresponding to the pixel points from the image data information of the multi-dimensional image to be processed by using the deep learning network model, and determines the plurality of grabbing surfaces and the plurality of grabbing points from these pixel points and center points, so that the image processing device can infer the grabbing points corresponding to the grabbing surfaces without labeling newly acquired data and retraining the model.
In the above solution, the determining, according to any group of center points in the plurality of groups of center points, a grabbing point corresponding to the any group of center points, until the plurality of grabbing points are determined from the plurality of groups of center points, includes:
averaging one group of position data of any group of center points to obtain average position data, until a plurality of average position data are obtained from the plurality of groups of position data information of the plurality of groups of center points;
and taking a plurality of points corresponding to the plurality of average position data as a plurality of grabbing points, wherein the plurality of average position data are in one-to-one correspondence with the plurality of grabbing points.
The image processing device averages a group of position data of any group of center points to determine the positions of a plurality of grabbing points, so that the accuracy of the position information of the plurality of grabbing points is improved.
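The averaging step above can be sketched as follows. The coordinate values are made-up examples, and three-dimensional (x, y, z) positions are assumed:

```python
# Sketch of deriving one grabbing point by averaging the position data of one
# group of predicted center points. Coordinates are illustrative examples.

def average_position(points):
    """Average a group of (x, y, z) position tuples into one grabbing point."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

# One group of center points predicted for the same surface.
center_group = [(1.0, 2.0, 0.5), (1.2, 1.8, 0.5), (0.8, 2.2, 0.5)]
grab_point = average_position(center_group)
```

Averaging damps the per-pixel noise of the individual center predictions, which is why it improves the accuracy of the grabbing point position.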
In the above aspect, before the determining, according to the image data information of the multi-dimensional image to be processed, a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the plurality of grabbing surfaces from the multi-dimensional image to be processed, the method further includes:
acquiring original image data information of the multidimensional image to be processed;
preprocessing the original image data information to obtain the image data information.
The data points of the original image data are thereby unified, so as to improve the accuracy of matching between the sample image data and the image data.
In the above scheme, the preprocessing the original image data information to obtain the image data information includes:
when the number of pieces of original data information in the original image data information does not meet a preset number, adjusting the number of pieces of original data information to the preset number;
dividing the data of the original data information adjusted to the preset number by a preset value to obtain the image data information.
The original data information is divided by a preset value so as to improve the convergence of computations on the original data information and thus the accuracy of the calculation result.
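A minimal sketch of this preprocessing follows, assuming the count adjustment is done by truncation or zero-padding and the preset value is 255 (as for 8-bit color channels); none of these specifics are fixed by the disclosure.

```python
# Sketch of the preprocessing described above: adjust the number of raw data
# points to a preset count, then divide by a preset value so the values fall
# into a smaller range. Preset count, preset value and padding rule are
# assumptions for demonstration.

def preprocess(raw, preset_count=8, preset_value=255.0, pad_value=0.0):
    # Adjust the number of data points to the preset count.
    if len(raw) > preset_count:
        adjusted = raw[:preset_count]            # truncate extras
    else:
        adjusted = raw + [pad_value] * (preset_count - len(raw))  # pad
    # Divide by the preset value (e.g. 255 for 8-bit color channels).
    return [v / preset_value for v in adjusted]

data = preprocess([0.0, 51.0, 102.0, 255.0])
```

Scaling raw 0..255 values into 0..1 is a common way to keep later numerical computation (e.g. network inference) well conditioned, which matches the stated motivation of improving convergence.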
In the above-described aspect, the image data information includes color channel data information and depth data information.
The method and the device determine the grabbing point pose of the target object according to the RGB information and the depth information of the multi-dimensional image to be processed, so that accuracy in calculating the grabbing point pose of the target object is improved.
An embodiment of the present disclosure provides an image processing apparatus including:
the device comprises a determining unit, configured to determine, according to the image data information of the multi-dimensional image to be processed, a plurality of grabbing surfaces, and a plurality of grabbing points in one-to-one correspondence with the plurality of grabbing surfaces, from the multi-dimensional image to be processed; determine a plurality of grabbing parameters corresponding to the plurality of grabbing surfaces; take the grabbing point corresponding to the target grabbing surface as a target grabbing point; and determine, according to the target grabbing point, a grabbing point pose corresponding to the target grabbing point, so as to grab, according to the grabbing point pose, a target object corresponding to the target grabbing point from the multi-dimensional image to be processed;
And the evaluation unit is used for evaluating the plurality of grabbing surfaces by utilizing the plurality of grabbing parameters and determining the target grabbing surface from the plurality of grabbing surfaces according to the evaluation result.
In the above aspect, the evaluation unit is specifically configured to evaluate each of the plurality of grabbing surfaces by using the plurality of grabbing parameters, to obtain a plurality of grabbing surface evaluation values corresponding to the plurality of grabbing surfaces;
the determining unit is specifically configured to determine, from the plurality of grabbing surface evaluation values, a first grabbing surface evaluation value with the highest evaluation value, and take the grabbing surface corresponding to the first grabbing surface evaluation value as the target grabbing surface.
In the above aspect, the plurality of grabbing parameters includes at least one of:
the area parameter of the grabbing surface, the height parameter of the grabbing surface, the flatness parameter of the grabbing surface and the gradient parameter of the grabbing surface.
In the above scheme, the determining unit is specifically configured to input image data information of the multidimensional image to be processed into a deep learning network model, to obtain a plurality of pixel points and a plurality of center points corresponding to the plurality of pixel points, where the deep learning network model is a model obtained by training an initial deep learning network model by using sample image data information of a sample multidimensional image, and the plurality of pixel points and the plurality of center points are in one-to-one correspondence; dividing the plurality of center points into a plurality of groups of center points; determining a grabbing point corresponding to any group of center points according to any group of center points in the plurality of groups of center points until the plurality of grabbing points are determined from the plurality of groups of center points; and determining one grabbing surface corresponding to any group of center points according to any group of pixel points corresponding to any group of center points in the plurality of groups of center points until the plurality of grabbing surfaces are determined from the plurality of groups of pixel points corresponding to the plurality of groups of center points, wherein any group of center points corresponds to any group of pixel points one by one.
In the above scheme, the determining unit is specifically configured to average a set of position data of any set of center points to obtain average position data, until a plurality of average position data are obtained from a plurality of sets of position data information of a plurality of sets of center points; and taking a plurality of points corresponding to the plurality of average position data as a plurality of grabbing points, wherein the plurality of average position data are in one-to-one correspondence with the plurality of grabbing points.
In the above scheme, the device further comprises an acquisition unit and a preprocessing unit;
the acquisition unit is used for acquiring the original image data information of the multidimensional image to be processed;
the preprocessing unit is used for preprocessing the original image data information to obtain the image data information.
In the above aspect, the preprocessing unit is specifically configured to, when the number of pieces of original data information in the original image data information does not meet a preset number, adjust the number of pieces of original data information to the preset number; and divide the data of the original data information adjusted to the preset number by a preset value to obtain the image data information.
In the above-described aspect, the image data information includes color channel data information and depth data information.
An embodiment of the present disclosure provides an image processing apparatus including:
a memory and a graphics processor, wherein the memory stores an image processing program executable by the graphics processor, and the graphics processor performs the method described above when the image processing program is executed.
The embodiment of the disclosure provides a storage medium having a computer program stored thereon for application to an image processing apparatus, wherein the computer program, when executed by a graphics processor, implements the method described above.
The embodiment of the disclosure provides a robot, which comprises a mechanical arm and an image processing device, wherein the image processing device is used for executing the method, and the mechanical arm is used for grabbing a target object at a grabbing point pose under the condition that the image processing device determines the grabbing point pose of the target object.
The embodiment of the disclosure provides an image processing method, an image processing device, a storage medium and a robot, wherein the image processing method comprises the following steps: determining a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the grabbing surfaces from the multi-dimensional image to be processed according to the image data information of the multi-dimensional image to be processed, wherein the grabbing surfaces and the grabbing points are in one-to-one correspondence; determining a plurality of grabbing parameters corresponding to the grabbing surfaces; evaluating the plurality of grabbing surfaces by utilizing the plurality of grabbing parameters, and determining a target grabbing surface from the plurality of grabbing surfaces according to an evaluation result; taking the grabbing point corresponding to the target grabbing surface as a target grabbing point; and determining the grabbing point pose corresponding to the target grabbing point according to the target grabbing point, so as to grab the target object corresponding to the target grabbing point from the multi-dimensional image to be processed according to the grabbing point pose. 
According to the method, the image processing device determines the plurality of grabbing parameter values corresponding to the plurality of grabbing surfaces and evaluates the plurality of grabbing surfaces according to those values, so that the target grabbing surface is determined from the plurality of grabbing surfaces rather than from a single height parameter value. This improves the accuracy with which the image processing device determines the target grabbing surface; the image processing device then determines the grabbing point pose of the target object according to this more accurately determined target grabbing surface, which improves the accuracy with which the image processing device determines the pose of the target object.
Drawings
Fig. 1 is a flowchart of an image processing method according to the present embodiment;
fig. 2 is a second flowchart of an image processing method according to the present embodiment;
fig. 3 is a schematic diagram of a composition structure of an image processing apparatus according to the present embodiment;
fig. 4 is a schematic diagram of a second configuration of an image processing apparatus according to the present embodiment.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure.
An embodiment of the present disclosure provides an image processing method, and fig. 1 is a flowchart of an image processing method provided by an embodiment of the present disclosure, where, as shown in fig. 1, the image processing method may include:
s101, determining a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the grabbing surfaces from the multi-dimensional image to be processed according to image data information of the multi-dimensional image to be processed, wherein the grabbing surfaces and the grabbing points are in one-to-one correspondence.
The image processing method provided by the embodiment of the disclosure is suitable for processing the image data information of the multidimensional image to be processed, and determining the grabbing point pose of the target object.
In the embodiment of the disclosure, the image processing method is applied to an image processing device, and the image processing device can be integrated in a robot so that the robot can grab the target object according to the grabbing point pose of the target object. In some possible implementations, the image processing method may be implemented by way of a processor invoking computer readable instructions stored in a memory.
It should be noted that the multi-dimensional image to be processed may be a three-dimensional, four-dimensional or two-dimensional image to be processed; this may be determined according to the actual situation and is not limited in the embodiments of the present disclosure.
In the embodiment of the disclosure, when the image processing device acquires the image data information of the multi-dimensional image to be processed, the image processing device may determine, according to the image data information, a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the plurality of grabbing surfaces from the multi-dimensional image to be processed.
The image processing device may acquire the image data information of the multi-dimensional image to be processed through an image acquisition device such as a camera, or may acquire it directly from another device. The specific manner may be determined according to the actual situation and is not limited in the embodiments of the present disclosure.
It should be noted that, if the image processing device acquires the multi-dimensional image to be processed through an image acquisition device such as a camera and obtains the image data information by reading it from the multi-dimensional image to be processed, the image processing device may control the camera through a central processing unit (Central Processing Unit, CPU) to acquire the multi-dimensional image to be processed, and obtain the image data information from it. After acquiring the image data information, the image processing device transmits it to a graphics processor (Graphics Processing Unit, GPU), which processes the image data information to determine the grabbing point pose of the target object.
In the embodiment of the disclosure, the image processing device uses the GPU to process the image data information of the multi-dimensional image to be processed in parallel; that is, the GPU processes the image data information of all pixel points in the multi-dimensional image to be processed at the same time. This improves the speed at which the image processing device processes the image data information, and therefore the speed at which it determines the grabbing point pose of the target object from the image data information of the multi-dimensional image to be processed.
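The data-parallel idea described above can be illustrated on the CPU with a vectorized operation that updates every pixel in one expression instead of looping over them; on a GPU the same formulation is executed across many threads. NumPy is used here purely for illustration and is not named in the disclosure.

```python
# Illustration of processing all pixels "at the same time" via one vectorized
# expression, as opposed to a per-pixel loop. On a GPU this data-parallel
# formulation maps onto thousands of threads.
import numpy as np

h, w = 4, 4
depth = np.linspace(0.0, 1.5, h * w).reshape(h, w)  # toy depth map

# One expression normalizes every pixel simultaneously.
normalized = (depth - depth.min()) / (depth.max() - depth.min())
```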
The plurality of grabbing surfaces are in one-to-one correspondence with the plurality of grabbing points.
The plurality of grabbing surfaces are a plurality of surfaces of a target object to be grabbed in the multi-dimensional image to be processed, and the plurality of grabbing points are center points of the plurality of grabbing surfaces, wherein one grabbing surface corresponds to one grabbing point.
For example, the target objects in the multi-dimensional image to be processed may be two hexahedrons, and the plurality of grabbing surfaces may be the two front surfaces of the two hexahedrons; that is, the number of grabbing surfaces is 2, the first grabbing surface being the front surface of the first hexahedron and the second grabbing surface being the front surface of the second hexahedron. The plurality of grabbing points are the center points of the two front surfaces; that is, the number of grabbing points is 2, the first grabbing point being the center point of the front surface of the first hexahedron and the second grabbing point being the center point of the front surface of the second hexahedron.
For example, the target object in the multi-dimensional image to be processed may be a tetrahedron, and the plurality of grabbing surfaces may be two side surfaces of the tetrahedron; that is, the number of grabbing surfaces is 2, the first grabbing surface being a first side surface of the tetrahedron and the second grabbing surface being the side surface adjacent to the first, that is, the second side surface. The plurality of grabbing points are the center points of the two side surfaces; that is, the number of grabbing points is 2, the first grabbing point being the center point of the first side surface and the second grabbing point being the center point of the second side surface.
In an embodiment of the present disclosure, the image data information includes color channel data information and depth data information.
For example, the image data information may be RGBD data information of the multi-dimensional image to be processed, where the color channel data information may be the RGB data information and the depth data information may be the depth information of the multi-dimensional image to be processed. When the image data information is RGBD data, the image processing device can determine the grabbing point pose of the target object according to both the RGB data and the depth data of the multi-dimensional image to be processed, which improves the accuracy with which the image processing device calculates the grabbing point pose of the target object.
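As a sketch of the data layout, the color channel data and the depth data can be combined into a single four-channel RGBD array; the shapes and values below are illustrative assumptions, not specifics from the disclosure.

```python
# Sketch of combining color channel data (RGB) and depth data (D) into a
# single RGBD array. Shapes and fill values are illustrative only.
import numpy as np

h, w = 2, 2
rgb = np.zeros((h, w, 3), dtype=np.float32)    # three color channels
depth = np.ones((h, w, 1), dtype=np.float32)   # one depth channel

rgbd = np.concatenate([rgb, depth], axis=-1)   # (h, w, 4) RGBD image
```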
In an embodiment of the present disclosure, the manner in which the image processing device determines, according to the image data information of the multi-dimensional image to be processed, a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the plurality of grabbing surfaces from the multi-dimensional image to be processed may be as follows: the image processing device inputs the image data information of the multi-dimensional image to be processed into the deep learning network model to obtain a plurality of pixel points and a plurality of center points corresponding to the pixel points.
In the embodiment of the disclosure, the image processing device includes a deep learning network model, and when the image processing device acquires image data information of a multidimensional image to be processed, the image processing device inputs the image data information into the deep learning network model, and the deep learning network model determines a plurality of pixel points and a plurality of center points from the image data information.
In the embodiment of the disclosure, the image processing device trains the initial deep learning network model by using sample image data information, and adjusts initial parameters of the initial deep learning network model, so that the initial deep learning network model after the initial parameters are adjusted can classify and detect the sample image data information, learn the surface information of a sample object in a sample multidimensional image, and determine a sample image point and a sample center point, thereby obtaining the deep learning network model. When the deep learning network model obtains the image data information of the multi-dimensional image to be processed, the deep learning network model can classify and detect the image data information, and a plurality of pixel points and a plurality of center points are determined from the image data information.
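The interface of such a network can be sketched as follows. This is a hypothetical stand-in, not the disclosed model: the function name, the dummy prediction rule used when no trained model is supplied, and the (H, W, 4) RGBD layout are all assumptions for illustration.

```python
import numpy as np

def predict_surface_points(rgbd, model=None):
    """Hypothetical interface for the deep learning network described above.

    rgbd : (H, W, 4) array holding the RGB channels plus a depth channel.
    Returns a per-pixel surface mask and, for every pixel, the predicted
    center point of the surface that pixel belongs to.
    """
    h, w, c = rgbd.shape
    assert c == 4, "expected RGB + depth channels"
    if model is None:
        # Stand-in for a trained network: treat every pixel with valid
        # depth as a surface pixel and predict the image center for it.
        mask = rgbd[..., 3] > 0
        centers = np.zeros((h, w, 2))
        centers[mask] = (h // 2, w // 2)
        return mask, centers
    return model(rgbd)

# Dummy RGBD frame with valid depth everywhere.
frame = np.ones((8, 8, 4))
mask, centers = predict_surface_points(frame)
```

A real model would replace the stand-in branch; the point here is only the one-to-one pairing of pixel points with predicted center points that the text describes.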
It should be noted that, a plurality of pixel points are in one-to-one correspondence with a plurality of center points, wherein one pixel point corresponds to one center point.
The sample image data information is data information for training the initial deep learning network model. The sample image data information may be a plurality of pieces of surface information of a polyhedron, or may be a plurality of pieces of surface information of a ball, or may be a plurality of pieces of surface information of a toy, or may be a plurality of pieces of surface information of other objects, and specific sample image data information may be determined according to actual situations, which is not limited in the embodiment of the present disclosure.
In the embodiment of the present disclosure, the plurality of pieces of surface information may be position information of a plurality of surfaces, may be size information of a plurality of surfaces, may also be a plurality of pieces of sample image point information in a plurality of surfaces, and may specifically be determined according to actual situations, which is not limited in the embodiment of the present disclosure.
In the embodiments of the present disclosure, before the image processing apparatus trains the initial deep learning network model, the image processing apparatus needs to acquire many pieces of sample image data information, such as: the surface information of the polyhedron, the surface information of the toy, the surface information of the ball, the surface information of the cup, etc. When the image processing device acquires the sample image data information, the image processing device trains an initial deep learning network model by utilizing the sample image data information, and adjusts initial parameters of the initial deep learning network model, so that the initial deep learning network model after the initial parameters are adjusted can classify and detect the sample image data information, learn the surface information of a sample object in a sample multidimensional image, and determine a sample image point and a sample center point, thereby obtaining the deep learning network model. Therefore, when the image processing apparatus inputs the image data information of the multi-dimensional image to be processed to the deep learning network model, the deep learning network model can determine the pixel points on the plurality of grasping surfaces from the image data information of the multi-dimensional image to be processed.
The sample image point is a point on the sample image, and the sample center point is a center point of the sample image.
It can be understood that the model parameters in the deep learning network model are parameters obtained by learning the surface information of the object in the sample multidimensional image, when the deep learning network model obtains the image data information of the unknown object in the multidimensional image to be processed, the deep learning network model can directly determine the surface information of the unknown object from the image data information according to the model parameters, thereby determining a plurality of pixel points and a plurality of center points corresponding to the surface information, and the image data information of the unknown object does not need to be marked and retrained, so that generalization and practicability of the image processing device when processing the image data information are improved.
In the embodiment of the disclosure, after the image processing apparatus determines a plurality of pixel points and a plurality of center points corresponding to the plurality of pixel points from the image data information of the multidimensional image to be processed by using the deep learning network model, the image processing apparatus may divide the plurality of center points into a plurality of groups of center points.
In the embodiment of the disclosure, the image processing device divides the plurality of center points to obtain a plurality of groups of center points. The image processing device may cluster the plurality of center points to determine the groups of center points corresponding to the plurality of grabbing surfaces, or it may divide the plurality of center points by using the deep learning network model, thereby determining the groups of center points. The specific manner in which the image processing device divides the plurality of center points into groups may be determined according to the actual situation, which is not limited in the embodiment of the present disclosure.
It should be noted that, the manner in which the image processing device clusters the plurality of center points may be a mean shift clustering algorithm, a hierarchical clustering algorithm, or a density clustering algorithm, and the specific clustering algorithm may be determined according to the actual situation, which is not limited in the embodiment of the present disclosure.
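The grouping step can be sketched with a deliberately simple distance-threshold rule standing in for the mean shift, hierarchical, or density clustering algorithms mentioned above; the `radius` value and the greedy assignment strategy are illustrative assumptions, not the disclosure's algorithm.

```python
import numpy as np

def group_center_points(points, radius=1.0):
    """Group center points that lie within `radius` of an existing group's mean.

    A simple stand-in for the clustering algorithms named in the text;
    `radius` is an assumed tuning value.
    """
    groups = []
    for p in points:
        for g in groups:
            if np.linalg.norm(np.mean(g, axis=0) - p) <= radius:
                g.append(p)
                break
        else:
            groups.append([p])
    return [np.array(g) for g in groups]

# Center points predicted for two grabbing surfaces.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
groups = group_center_points(pts, radius=1.0)
# → two groups of center points, one per grabbing surface
```

Each resulting group then corresponds to one grabbing surface, matching the one-to-one correspondence noted below.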
It should be noted that the plurality of grabbing surfaces are in one-to-one correspondence with the plurality of groups of center points, wherein one grabbing surface corresponds to one group of center points.
In the embodiment of the disclosure, after the image processing device divides the plurality of center points into a plurality of groups of center points, the image processing device may determine, from any one group of center points, the grabbing point corresponding to that group, until a plurality of grabbing points are determined from the plurality of groups of center points.
In this embodiment of the present disclosure, the image processing apparatus may determine the grabbing point corresponding to any one group of center points in several ways: it may randomly select one point from the group and use that point as the grabbing point; it may randomly select a portion of the center points in the group and determine the grabbing point according to the position information of that portion; or it may determine the grabbing point according to the position information of all points in the group.
In the embodiment of the disclosure, once the image processing apparatus has determined, in a given manner, the grabbing point corresponding to one group of center points, it may determine the remaining grabbing points from the other groups of center points in the same manner, thereby obtaining the plurality of grabbing points.
In the embodiment of the present disclosure, the process by which the image processing apparatus determines, from each group of center points, the corresponding grabbing point may be: the image processing device averages the group of position data of any one group of center points to obtain one piece of average position data, until the image processing device obtains a plurality of pieces of average position data from the plurality of groups of position data of the plurality of groups of center points.
It should be noted that one average position data corresponds to one set of position data.
It should be noted that, the set of position data may be three-dimensional coordinate data of a set of center points, three-dimensional pose data of a set of center points, or other position data of a set of center points, which may be specifically determined according to actual situations, which is not limited in the embodiments of the present disclosure.
In the embodiment of the present disclosure, after the image processing apparatus obtains the plurality of average position data, the image processing apparatus may use a plurality of points corresponding to the plurality of average position data as the plurality of capturing points.
It should be noted that, a plurality of average position data corresponds to a plurality of grabbing points one by one, wherein one average position data corresponds to one grabbing point.
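The averaging described above, one grabbing point per group obtained as the mean of that group's position data, can be sketched as:

```python
import numpy as np

def grab_points_from_groups(groups):
    """Average each group's position data to obtain one grabbing point
    per group, one average position datum per group of position data."""
    return [np.mean(g, axis=0) for g in groups]

# Two groups of 3-D center-point coordinates, one per grabbing surface.
groups = [
    np.array([[0.0, 0.0, 1.0], [0.2, 0.0, 1.0]]),
    np.array([[5.0, 5.0, 2.0], [5.2, 5.0, 2.0]]),
]
points = grab_points_from_groups(groups)
# points[0] ≈ [0.1, 0.0, 1.0]; points[1] ≈ [5.1, 5.0, 2.0]
```

The same call works whether the position data are three-dimensional coordinates or higher-dimensional pose data, since the mean is taken component-wise.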
In the embodiment of the disclosure, after the image processing device determines the plurality of grabbing points from the plurality of groups of center points, the image processing device may determine, from the group of pixel points corresponding to any one group of center points, the grabbing surface corresponding to that group, until a plurality of grabbing surfaces are determined from the plurality of groups of pixel points corresponding to the plurality of groups of center points.
It should be noted that any group of center points corresponds to any group of pixel points one by one, wherein one group of center points corresponds to one group of pixel points.
It should be noted that, a plurality of groups of pixel points are in one-to-one correspondence with a plurality of grabbing surfaces, wherein a group of pixel points corresponds to one grabbing surface.
In the embodiment of the disclosure, the image processing device needs to acquire the original image data information of the multi-dimensional image to be processed before determining a plurality of grabbing surfaces and grabbing points corresponding to the grabbing surfaces from the multi-dimensional image to be processed according to the image data information of the multi-dimensional image to be processed.
The image processing device may acquire the original image data information of the multi-dimensional image to be processed in several ways: it may capture the multi-dimensional image to be processed with an image acquisition device such as a camera and read the original image data information from that image, or it may acquire the original image data information of the multi-dimensional image to be processed directly from another device. The specific manner in which the image processing device acquires the original image data information may be determined according to practical situations.
It should be noted that the original image data information may be original RGBD data information.
In the embodiment of the disclosure, when the image processing device acquires the original image data information of the multidimensional image to be processed, the image processing device pre-processes the original image data information to obtain the image data information.
It can be understood that the image processing device performs preprocessing on the original image data information, so that accuracy when the preprocessed original image data information is matched with the sample image data information is improved, and accuracy when the image processing device processes the image data information of the multidimensional image to be processed is improved.
The image processing apparatus may preprocess the original image data information by removing noise from it, enlarging it, adjusting the amount of data it contains, or processing it in other ways; the specific preprocessing may be determined according to the actual situation, and the embodiment of the present disclosure is not limited thereto.
In the embodiment of the present disclosure, the process by which the image processing apparatus preprocesses the original image data information to obtain the image data information may be: when the amount of original data information in the original image data information does not satisfy the preset number value, the image processing device adjusts the amount of original data information to the preset number value.
It can be understood that the preset number is the preset data information amount when the image processing device processes the image data information, and the image processing device adjusts the data amount of the original data information to the preset number, so that the accuracy when the adjusted original data information is matched with the sample image data is improved, and the accuracy when the image processing device processes the image data information of the multi-dimensional image to be processed is improved.
In the embodiment of the disclosure, when the image processing device obtains the original image data information of the multidimensional image to be processed, the image processing device compares the number of the original image data information with a preset number value, and when the number of the original data information in the original image data information does not meet the preset number value, the image processing device adjusts the number of the original data information to the preset number.
It should be noted that, the number of the original data information does not satisfy the preset number value, and may be that the number of the original data information is greater than or less than the preset number value, and the number of the original data information satisfies the preset number value, and may be that the number of the original data information is equal to the preset number value.
It should be noted that the preset number value is a number value of original image data information preset in the image processing device. For example, the preset number value may be 65536, corresponding to 256 points on the abscissa and 256 points on the ordinate of the multi-dimensional image to be processed (256 × 256 = 65536).
For example, if the image processing device obtains 1024 points on the abscissa and 1024 points on the ordinate of the multi-dimensional image to be processed, the data amount of the original image data information is 1024 × 1024 = 1048576 points. Comparing this with the preset number value of 65536 points, 1048576 is greater than 65536, so the amount of original data information does not satisfy the preset number value.
In the embodiment of the disclosure, the image processing device may adjust the amount of original data information to the preset number value by increasing the original data information when the amount is smaller than the preset number value, and by decreasing the original data information when the amount is larger than the preset number value.
It should be noted that the image processing apparatus may increase the original data information by inserting one or more pieces of data between two adjacent pieces of original data information: the inserted values may be determined from the two adjacent pieces, or from all of the original data information. The image processing apparatus may also increase the original data information in other ways; the specific manner may be determined according to the actual situation.
It should be noted that, the manner in which the image processing apparatus reduces the original data information may be that the image processing apparatus deletes the data at the odd-numbered point positions of the original data information, may delete the data at the even-numbered point positions of the original data information, or may be that other manners of reducing the original data information, which may be specifically determined according to the actual situation, which is not limited in the embodiments of the present disclosure.
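One way the count adjustment above might look in code is sketched below. The stride-based down-sampling and neighbour-repetition up-sampling are illustrative choices among the manners the text allows, and the 256-point preset side follows the earlier 65536-point example.

```python
import numpy as np

def adjust_to_preset(data, preset_side=256):
    """Resample a square image so its total point count matches the preset
    number (preset_side * preset_side). Down-sampling keeps an evenly
    strided subset of points; up-sampling repeats neighbouring points.
    Both strategies are illustrative, not mandated by the disclosure."""
    side = data.shape[0]
    if side == preset_side:
        return data
    if side > preset_side:            # too many points: drop some
        step = side // preset_side
        return data[::step, ::step]
    factor = preset_side // side      # too few points: repeat neighbours
    return np.repeat(np.repeat(data, factor, axis=0), factor, axis=1)

raw = np.zeros((1024, 1024))          # 1048576 points > preset 65536
adjusted = adjust_to_preset(raw)      # now 256 * 256 = 65536 points
```

This assumes the image side is a multiple (or divisor) of the preset side; a general resampler would interpolate instead of striding or repeating.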
In the embodiment of the disclosure, after the image processing device adjusts the amount of original data information to the preset number value, the image processing device divides the adjusted original data information by a preset value to obtain the image data information.
It can be understood that dividing the original data information by the preset value allows subsequent computation on the data to converge quickly, which improves the convergence of the original data information in calculation and the accuracy of the image processing device when processing the image data information of the multi-dimensional image to be processed.
It should be noted that the preset value is a value preset in the image processing apparatus.
In the embodiment of the disclosure, the number of preset values in the image processing device may be multiple, and when the value ranges of the original image data information are different, different preset values may be corresponding, and the specific value of the preset value may be determined according to the actual situation, which is not limited in the embodiment of the disclosure.
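The division step can be sketched as a simple normalization. The divisor 255 (for 8-bit colour channels) is an assumed preset value, since the text leaves the concrete values open; depth data would use its own range as a different preset value.

```python
import numpy as np

def normalize(raw, divisor=255.0):
    """Divide the adjusted raw data by a preset value so values fall in
    [0, 1], helping downstream computation converge. 255 is an assumed
    divisor for 8-bit colour channels."""
    return raw.astype(np.float64) / divisor

pixels = np.array([0, 128, 255], dtype=np.uint8)
scaled = normalize(pixels)
# → values between 0.0 and 1.0
```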
S102, determining a plurality of grabbing parameters corresponding to the grabbing surfaces.
In the embodiment of the disclosure, after the image processing apparatus determines a plurality of capturing surfaces from the multi-dimensional image to be processed, the image processing apparatus may determine a plurality of capturing parameters corresponding to each of the plurality of capturing surfaces.
It should be noted that the plurality of capturing parameters include at least one of the following: the area parameter of the grabbing surface, the height parameter of the grabbing surface, the flatness parameter of the grabbing surface and the gradient parameter of the grabbing surface.
The plurality of capture parameters may be determined by the image processing device based on the plurality of capture surfaces.
In the embodiment of the disclosure, the image processing device assigns, for a single grabbing parameter, a value to each of the plurality of grabbing surfaces, determining the grabbing parameter values of the plurality of grabbing surfaces for that parameter; this continues until the image processing device has assigned values for all of the grabbing parameters, thereby determining the plurality of grabbing parameters corresponding to the plurality of grabbing surfaces.
In the embodiment of the disclosure, a first set of grabbing parameter values corresponding to different area parameters of the grabbing surface, a second set of grabbing parameter values corresponding to different height parameters of the grabbing surface, a third set of grabbing parameter values corresponding to different flatness parameters of the grabbing surface, and a fourth set of grabbing parameter values corresponding to different inclination parameters of the grabbing surface are set in the image processing device. When the image processing device obtains one grabbing surface of the plurality of grabbing surfaces, the image processing device determines a first grabbing parameter value corresponding to the area parameter of the grabbing surface from the first set of grabbing parameter values; determines a second grabbing parameter value corresponding to the height parameter of the grabbing surface from the second set of grabbing parameter values; determines a third grabbing parameter value corresponding to the flatness parameter of the grabbing surface from the third set of grabbing parameter values; and determines a fourth grabbing parameter value corresponding to the inclination parameter of the grabbing surface from the fourth set of grabbing parameter values, until the image processing device has determined four grabbing parameter values for each of the plurality of grabbing surfaces.
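The per-parameter assignment described above can be sketched as threshold lookup tables mapping each raw measurement to a preset grabbing parameter value. The table boundaries and scores below are invented examples, not values from the disclosure.

```python
def score_from_table(value, table):
    """Return the preset grabbing parameter value whose threshold bracket
    contains `value`. `table` is a list of (upper_bound, score) pairs
    sorted by upper bound; bounds and scores are assumed examples."""
    for upper, score in table:
        if value <= upper:
            return score
    return table[-1][1]

# Assumed lookup tables, one per grabbing parameter (area, height, ...).
AREA_TABLE   = [(10.0, 0.2), (50.0, 0.5), (float("inf"), 0.9)]
HEIGHT_TABLE = [(5.0, 0.3), (20.0, 0.6), (float("inf"), 0.8)]

surface = {"area": 30.0, "height": 25.0}     # hypothetical measurements
area_score = score_from_table(surface["area"], AREA_TABLE)
height_score = score_from_table(surface["height"], HEIGHT_TABLE)
```

Repeating the lookup for the flatness and inclination parameters yields the four grabbing parameter values per surface mentioned above.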
S103, multiple grabbing surfaces are evaluated by utilizing multiple grabbing parameters, and a target grabbing surface is determined from the multiple grabbing surfaces according to an evaluation result.
In the embodiment of the disclosure, after the image processing device determines the plurality of grabbing parameters corresponding to the plurality of grabbing surfaces, the image processing device may evaluate the plurality of grabbing surfaces by using the plurality of grabbing parameters, and determine the target grabbing surface from the plurality of grabbing surfaces.
In an embodiment of the present disclosure, the image processing apparatus evaluates a plurality of capture surfaces by using a plurality of capture parameters, and a process of determining a target capture surface from the plurality of capture surfaces may be: the image processing device utilizes the plurality of grabbing parameters to respectively evaluate each grabbing surface in the plurality of grabbing surfaces to obtain a plurality of grabbing surface evaluation values corresponding to the plurality of grabbing surfaces.
In this embodiment of the present disclosure, the image processing apparatus may obtain the parameter values corresponding to each grabbing surface according to the area parameter, the height parameter, the flatness parameter and the inclination parameter of that grabbing surface, until a plurality of first parameter values corresponding to the plurality of grabbing surfaces are obtained, and then obtain the plurality of grabbing surface evaluation values according to the plurality of first parameter values and the plurality of grabbing probability values of the plurality of grabbing surfaces.
The plurality of grabbing probability values are the probabilities that the respective grabbing surfaces are the target grabbing surface.
For example, the image processing apparatus may determine the plurality of grabbing surface evaluation values by formula (1):

P(c_i | x) = P(x | c_i) · P(c_i) / P(x)    (1)

If the set of grabbing parameters corresponding to one grabbing surface is the area parameter of the grabbing surface and the height parameter of the grabbing surface, then P(c_i | x) is the evaluation value of each of the plurality of grabbing surfaces under the condition that the grabbing parameters are the area parameter of the grabbing surface and the height parameter of the grabbing surface, where c_i is one of the plurality of grabbing surfaces, x is the area parameter and the height parameter of that grabbing surface, P(x | c_i) is the total grabbing parameter value corresponding to each grabbing surface, and P(c_i) is the probability that each grabbing surface is grabbed without considering the grabbing parameters. P(x) may be the probability when the grabbing parameters are the area parameter of the grabbing surface and the height parameter of the grabbing surface, that is, the value of P(x) is 1.
In the embodiment of the present disclosure, the total grabbing parameter value corresponding to each grabbing surface may be a product of a set of grabbing parameters corresponding to each grabbing surface, or may be a sum of a set of grabbing parameters corresponding to each grabbing surface, which may be specifically determined according to actual situations, and the embodiment of the present disclosure is not limited thereto.
If the total grabbing parameter value corresponding to each grabbing surface is the product of a group of grabbing parameters corresponding to each grabbing surface, and the grabbing parameters are the area parameters of grabbing surfaces and the height parameters of grabbing surfaces, the total grabbing parameter value corresponding to each grabbing surface is the product of the area parameters of grabbing surfaces corresponding to the grabbing surfaces and the height parameters of grabbing surfaces.
Assume the image processing device determines that there are only two grabbing surfaces, and the grabbing parameters are the area parameter and the height parameter of the grabbing surface. The area parameter value of the first grabbing surface is 0.5 and its height parameter value is 0.4; the area parameter value of the second grabbing surface is 0.2 and its height parameter value is 0.6. Then the total grabbing parameter value corresponding to the first grabbing surface is the product of 0.5 and 0.4, that is, 0.2; the total grabbing parameter value corresponding to the second grabbing surface is the product of 0.2 and 0.6, that is, 0.12.
In the embodiment of the present disclosure, the image processing apparatus may determine the probability that each of the plurality of gripping surfaces is gripped according to the number of the plurality of gripping surfaces.
For example, if the image processing apparatus determines that there are 5 grabbing surfaces, the probability that each grabbing surface is grabbed, that is, P(c_i), has a value of 0.2; if the image processing device determines that there are 2 grabbing surfaces, P(c_i) has a value of 0.5; if the image processing device determines that there are 3 grabbing surfaces, P(c_i) has a value of 1/3.
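Formula (1) applied to the two-surface example above (total grabbing parameter values 0.2 and 0.12, P(c_i) = 0.5 for each of the two surfaces, P(x) = 1) can be worked through as:

```python
def surface_evaluation(p_x_given_c, p_c, p_x=1.0):
    """Formula (1): P(c_i | x) = P(x | c_i) * P(c_i) / P(x)."""
    return p_x_given_c * p_c / p_x

# Two grabbing surfaces, so P(c_i) = 0.5 for each.
# Total grabbing parameter values are the area * height products.
first  = surface_evaluation(0.5 * 0.4, 0.5)   # 0.2  * 0.5 = 0.1
second = surface_evaluation(0.2 * 0.6, 0.5)   # 0.12 * 0.5 = 0.06
# The first surface scores higher, so it would become the target surface.
```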
In the embodiment of the present disclosure, if the set of grabbing parameters corresponding to one grabbing surface is the area parameter, the height parameter and the flatness parameter of the grabbing surface, then x in formula (1) is the area parameter, the height parameter and the flatness parameter of one of the plurality of grabbing surfaces, P(c_i | x) is the evaluation value of each grabbing surface under the condition that the grabbing parameters are the area parameter, the height parameter and the flatness parameter of the grabbing surface, P(x) is the probability when the grabbing parameters are the area parameter, the height parameter and the flatness parameter of the grabbing surface, that is, the value of P(x) is 1, and P(x | c_i) is the total grabbing parameter value corresponding to each grabbing surface.
If the total grabbing parameter value corresponding to each grabbing surface is the product of the set of grabbing parameters corresponding to that surface, and the grabbing parameters are the area parameter, the height parameter and the flatness parameter of the grabbing surface, then the total grabbing parameter value corresponding to each grabbing surface is the product of its area parameter, height parameter and flatness parameter.
In the embodiment of the present disclosure, if the set of grabbing parameters corresponding to one grabbing surface is the area parameter, the height parameter, the flatness parameter and the inclination parameter of the grabbing surface, then x in formula (1) is the area parameter, the height parameter, the flatness parameter and the inclination parameter of one of the plurality of grabbing surfaces, P(c_i | x) is the evaluation value of each grabbing surface under the condition that the grabbing parameters are the area parameter, the height parameter, the flatness parameter and the inclination parameter of the grabbing surface, P(x) is the probability when the grabbing parameters are the area parameter, the height parameter, the flatness parameter and the inclination parameter of the grabbing surface, that is, the value of P(x) is 1, and P(x | c_i) is the total grabbing parameter value corresponding to each grabbing surface.
If the total grabbing parameter value corresponding to each grabbing surface is the product of the set of grabbing parameters corresponding to that surface, and the grabbing parameters are the area parameter, the height parameter, the flatness parameter and the inclination parameter of the grabbing surface, then the total grabbing parameter value corresponding to each grabbing surface is the product of its area parameter, height parameter, flatness parameter and inclination parameter.
In the embodiment of the disclosure, after the image processing device obtains the plurality of grabbing surface evaluation values corresponding to the plurality of grabbing surfaces, the image processing device may determine a first grabbing surface evaluation value with the highest evaluation value from the plurality of grabbing surface evaluation values, and use the grabbing surface corresponding to the first grabbing surface evaluation value as the target grabbing surface.
In the embodiment of the disclosure, the image processing apparatus may determine the first grabbing surface evaluation value with the highest evaluation value from the plurality of grabbing surface evaluation values in several ways. The image processing apparatus may first randomly select one grabbing surface evaluation value from the plurality of grabbing surface evaluation values and compare it with the other grabbing surface evaluation values, so as to determine the first grabbing surface evaluation value with the highest evaluation value, and then take the grabbing surface corresponding to the first grabbing surface evaluation value as the target grabbing surface. Alternatively, the image processing apparatus may sort the plurality of grabbing surface evaluation values in descending order and take the grabbing surface corresponding to the evaluation value ranked first as the target grabbing surface; or it may sort them in ascending order and take the grabbing surface corresponding to the evaluation value ranked last as the target grabbing surface. The specific manner of determining the target grabbing surface may be determined according to the actual situation, which is not limited in the embodiment of the present disclosure.
S104, taking the grabbing point corresponding to the target grabbing surface as a target grabbing point.
In the embodiment of the disclosure, when the image processing apparatus evaluates the plurality of capturing surfaces by using the plurality of capturing parameter values, after determining the target capturing surface from the plurality of capturing surfaces, the image processing apparatus takes a capturing point corresponding to the target capturing surface as the target capturing point.
The target capturing point may be a center point of the target capturing surface. Of course, in other embodiments, the target capture point may not be the center point of the target capture surface, and may be, for example, a neighborhood point of the center point of the target capture surface.
S105, determining, according to the target grabbing point, a grabbing point pose corresponding to the target grabbing point, and grabbing, according to the grabbing point pose, a target object corresponding to the target grabbing point from the multi-dimensional image to be processed.
In the embodiment of the disclosure, after the image processing device determines the target capturing point, the image processing device may determine a capturing point pose corresponding to the target capturing point according to the target capturing point, so as to capture a target object corresponding to the target capturing point from the multi-dimensional image to be processed according to the capturing point pose.
It should be noted that, the pose of the capturing point corresponding to the target capturing point may be a six-dimensional pose of the target capturing point, may be a five-dimensional pose of the target capturing point, or may be a pose of other dimensions of the target capturing point, which may be specifically determined according to the actual situation, and the embodiment of the present disclosure does not limit this.
It should be noted that, the information of the target capturing point may be three-dimensional coordinate point information.
When the capturing point pose corresponding to the target capturing point is the six-dimensional pose of the target capturing point, the image processing device can determine the capturing point pose corresponding to the target capturing point according to the three-dimensional coordinate point information of the target capturing point and the rotation degree information of the target capturing point.
In the embodiment of the disclosure, the image processing device may determine the rotation degree information of the target grabbing point as follows: the image processing device performs plane fitting on a plurality of target pixel points on the target grabbing surface to obtain a fitted target grabbing surface, determines a tangent at the target grabbing point on the fitted plane, and uses a vector perpendicular to the tangent as the rotation degree information, thereby obtaining the rotation degree information of the target grabbing point.
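One conventional way to realize such a plane fit and obtain the perpendicular (normal) vector is a least-squares fit via singular value decomposition. The sketch below is an assumption about how this step might be implemented, not the patented method itself; all names are illustrative:

```python
import numpy as np

def fitted_plane_normal(points):
    # points: N x 3 array of 3-D coordinates of target pixel points on the
    # target grabbing surface. Fit a plane in the least-squares sense and
    # return its unit normal, which can serve as the rotation information.
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value of the
    # centered point set is perpendicular to the best-fit plane.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    # Orient the normal consistently (here: non-negative z component).
    return normal if normal[2] >= 0 else -normal
```

For points lying in a horizontal plane, the returned normal points straight up, which matches the intuition of a top-down grasp direction.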
The embodiment of the disclosure provides an exemplary flowchart of the image processing manner, as shown in fig. 2. The CPU in the intelligent robot controls a camera to collect original image data information of the multi-dimensional image to be processed; that is, the image processing device collects the original RGBD image data information of the multi-dimensional image to be processed. The CPU then controls the transmission of the original RGBD image data to the GPU, and the GPU preprocesses the original RGBD image data to obtain RGBD image data information, that is, the image data information. The GPU determines a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the grabbing surfaces from the RGBD image data information by using a deep learning network model, evaluates each of the plurality of grabbing surfaces by using a plurality of grabbing parameters, determines a target grabbing surface from the plurality of grabbing surfaces according to the evaluation result of each grabbing surface, takes the grabbing point corresponding to the target grabbing surface as the target grabbing point, and determines the grabbing point pose of the target object according to the target grabbing point. After the image processing device determines the grabbing point pose of the target object, the intelligent robot can grab the target object.
It can be understood that the image processing device determines a plurality of grabbing parameter values corresponding to the plurality of grabbing surfaces and evaluates the plurality of grabbing surfaces according to the plurality of grabbing parameter values, so that the target grabbing surface is determined from the plurality of grabbing surfaces instead of being determined according to a single height parameter value. This improves the accuracy with which the image processing device determines the target grabbing surface; and since the image processing device determines the grabbing point pose of the target object according to this more accurately determined target grabbing surface, the accuracy with which the image processing device determines the pose of the target object is also improved.
When the image processing device is applied to the logistics transmission process, the image processing device can determine the grabbing point pose of the target express according to the implementation mode, so that sorting and stacking of the express are realized.
When the image processing device is applied to the medical and cosmetic processes, the image processing device can determine the grabbing point pose of the target medicine according to the implementation mode, or the image processing device can determine the grabbing point pose of the target cosmetic product according to the implementation mode, so that the classified packaging of medicines and cosmetic products is realized.
When the image processing device is applied to heavy industry, the image processing device can determine the grabbing point pose of the target industrial product according to the implementation mode, so that the carrying of the target industrial product is realized.
When the image processing device is applied to the garbage treatment process, the image processing device can determine the grabbing point pose of the target garbage according to the implementation mode, so that the garbage classification treatment is realized.
Based on the same inventive concept, the presently disclosed embodiments provide an image processing apparatus 1, corresponding to an image processing method; fig. 3 is a schematic diagram of a composition structure of an image processing apparatus according to an embodiment of the present disclosure, where the image processing apparatus 1 may include:
a determining unit 11, configured to determine a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the plurality of grabbing surfaces from the multi-dimensional image to be processed according to image data information of the multi-dimensional image to be processed, where the plurality of grabbing surfaces and the plurality of grabbing points are in one-to-one correspondence; determine a plurality of grabbing parameters corresponding to the plurality of grabbing surfaces; take the grabbing point corresponding to the target grabbing surface as a target grabbing point; and determine, according to the target grabbing point, a grabbing point pose corresponding to the target grabbing point, so as to grab, according to the grabbing point pose, a target object corresponding to the target grabbing point from the multi-dimensional image to be processed;
and the evaluation unit 12 is used for evaluating the plurality of grabbing surfaces by utilizing the plurality of grabbing parameters and determining the target grabbing surface from the plurality of grabbing surfaces according to the evaluation result.
In some embodiments of the present disclosure, the evaluation unit 12 is specifically configured to evaluate each of the plurality of grabbing surfaces with the plurality of grabbing parameters, to obtain a plurality of grabbing surface evaluation values corresponding to the plurality of grabbing surfaces;
the determining unit 11 is specifically configured to determine a first gripping surface evaluation value with a highest evaluation value from the plurality of gripping surface evaluation values, and take a gripping surface corresponding to the first gripping surface evaluation value as the target gripping surface.
In some embodiments of the present disclosure, the plurality of grasping parameters includes at least one of:
the area parameter of the grabbing surface, the height parameter of the grabbing surface, the flatness parameter of the grabbing surface and the gradient parameter of the grabbing surface.
In some embodiments of the present disclosure, the determining unit 11 is specifically configured to input image data information of the multidimensional image to be processed into a deep learning network model, to obtain a plurality of pixels and a plurality of center points corresponding to the plurality of pixels, where the deep learning network model is a model obtained by training an initial deep learning network model with sample image data information of a sample multidimensional image, and the plurality of pixels and the plurality of center points are in one-to-one correspondence; dividing the plurality of center points into a plurality of groups of center points; determining a grabbing point corresponding to any group of center points according to any group of center points in the plurality of groups of center points until the plurality of grabbing points are determined from the plurality of groups of center points; and determining one grabbing surface corresponding to any group of center points according to any group of pixel points corresponding to any group of center points in the plurality of groups of center points until the plurality of grabbing surfaces are determined from the plurality of groups of pixel points corresponding to the plurality of groups of center points, wherein any group of center points corresponds to any group of pixel points one by one.
In some embodiments of the present disclosure, the determining unit 11 is specifically configured to average one set of position data of any set of center points to obtain one average position data, until a plurality of average position data are obtained from a plurality of sets of position data information of the plurality of sets of center points; and taking a plurality of points corresponding to the plurality of average position data as a plurality of grabbing points, wherein the plurality of average position data are in one-to-one correspondence with the plurality of grabbing points.
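A minimal sketch of this averaging step is given below. The names and the grouping labels are illustrative assumptions; how center points are grouped into sets is determined by the model output, not by this sketch:

```python
import numpy as np

def grasp_points_from_centers(center_points, group_ids):
    # center_points: N x 3 positions of predicted center points.
    # group_ids: length-N label assigning each center point to a group.
    # Each group's grasp point is the mean position of its center points,
    # so average position data and grasp points correspond one to one.
    centers = np.asarray(center_points, dtype=float)
    ids = np.asarray(group_ids)
    return {int(g): centers[ids == g].mean(axis=0) for g in np.unique(ids)}
```

Each key in the returned mapping identifies one group of center points, and its value is the grabbing point obtained by averaging that group's position data.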
In some embodiments of the present disclosure, the apparatus further comprises an acquisition unit 13 and a preprocessing unit 14;
the acquiring unit 13 is configured to acquire original image data information of the multidimensional image to be processed;
the preprocessing unit 14 is configured to preprocess the original image data information, so as to obtain the image data information.
In some embodiments of the present disclosure, the preprocessing unit 14 is specifically configured to adjust the number of the original data information to a preset number if the number of the original data information in the original image data information does not satisfy a preset number value; dividing the data of the original data information adjusted to the preset number by a preset value to obtain the image data information.
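The preprocessing described above might look like the following sketch, assuming the "preset number" refers to the number of channels of the RGBD data and the "preset value" is a normalization constant such as 255. Both the names and the defaults are assumptions for illustration, not taken from the patent:

```python
import numpy as np

def preprocess_rgbd(raw, preset_channels=4, preset_value=255.0):
    # raw: H x W x C array of original RGBD data. Pad (with zeros) or
    # truncate the channel dimension to the preset number, then divide
    # every value by the preset constant to obtain normalized image data.
    data = np.asarray(raw, dtype=float)
    c = data.shape[-1]
    if c < preset_channels:
        pad = np.zeros(data.shape[:-1] + (preset_channels - c,))
        data = np.concatenate([data, pad], axis=-1)
    elif c > preset_channels:
        data = data[..., :preset_channels]
    return data / preset_value
```

For instance, a 3-channel RGB input would be padded with an all-zero fourth channel and scaled into the [0, 1] range before being fed to the deep learning network model.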
In some embodiments of the present disclosure, the image data information includes color channel data information and depth data information.
It should be noted that, in practical applications, the determining unit 11, the evaluating unit 12, the acquiring unit 13, and the preprocessing unit 14 may be implemented by the processor 15 on the image processing apparatus 1, specifically a GPU (Graphics Processing Unit), an MPU (Microprocessor Unit), a DSP (Digital Signal Processor), or an FPGA (Field Programmable Gate Array); the data storage described above may be implemented by the memory 16 on the image processing apparatus 1.
The embodiment of the present disclosure also provides an image processing apparatus 1. As shown in fig. 4, the image processing apparatus 1 includes: a processor 15 and a memory 16, the memory 16 storing an image processing program executable by the processor 15; when the program is executed, the processor 15 performs the image processing method as described above.
In practical applications, the memory 16 may be a volatile memory, such as a random-access memory (RAM); or a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or a combination of the above kinds of memories, and provides instructions and data to the processor 15.
The presently disclosed embodiments provide a computer readable storage medium having a computer program thereon, which when executed by the processor 15 implements the image processing method as described above.
The embodiment provides a robot, which comprises a mechanical arm and an image processing device, wherein the image processing device is used for executing the method, and the mechanical arm is used for grabbing a target object at a grabbing point pose under the condition that the image processing device determines the grabbing point pose of the target object.
Specifically, after obtaining the grabbing point pose of the target object determined by the image processing device, the mechanical arm calculates, according to that grabbing point pose, the pose with which it will grab the target object there, and plans its motion path accordingly to grab the object.
It can be understood that the image processing device determines a plurality of grabbing parameter values corresponding to the plurality of grabbing surfaces and evaluates the plurality of grabbing surfaces according to the plurality of grabbing parameter values, so that the target grabbing surface is determined from the plurality of grabbing surfaces instead of being determined according to a single height parameter value. This improves the accuracy with which the image processing device determines the target grabbing surface; and since the image processing device determines the grabbing point pose of the target object according to this more accurately determined target grabbing surface, the accuracy with which the image processing device determines the pose of the target object is also improved.
When the image processing device is applied to the logistics transmission process, the image processing device can determine the grabbing point pose of the target express according to the implementation mode, so that sorting and stacking of the express are realized.
When the image processing device is applied to the medical and cosmetic processes, the image processing device can determine the grabbing point pose of the target medicine according to the implementation mode, or the image processing device can determine the grabbing point pose of the target cosmetic product according to the implementation mode, so that the classified packaging of medicines and cosmetic products is realized.
When the image processing device is applied to heavy industry, the image processing device can determine the grabbing point pose of the target industrial product according to the implementation mode, so that the carrying of the target industrial product is realized.
When the image processing device is applied to the garbage treatment process, the image processing device can determine the grabbing point pose of the target garbage according to the implementation mode, so that the garbage classification treatment is realized.
It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The foregoing description is only of the preferred embodiments of the present disclosure, and is not intended to limit the scope of the present disclosure.

Claims (10)

1. An image processing method, the method comprising:
acquiring original image data information of a multidimensional image to be processed;
when the number of the original data information in the original image data information does not meet a preset number value, adjusting the number of the original data information to be a preset number; dividing the data of the original data information adjusted to the preset number by a preset value respectively to obtain image data information of the multidimensional image to be processed;
determining a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the grabbing surfaces from the multi-dimensional image to be processed according to the image data information of the multi-dimensional image to be processed, wherein the grabbing surfaces and the grabbing points are in one-to-one correspondence;
Determining a plurality of grabbing parameters corresponding to the grabbing surfaces;
evaluating the plurality of grabbing surfaces by utilizing the plurality of grabbing parameters, and determining a target grabbing surface from the plurality of grabbing surfaces according to an evaluation result;
taking the grabbing point corresponding to the target grabbing surface as a target grabbing point;
and determining a grabbing point pose corresponding to the target grabbing point according to the target grabbing point, so as to grab a target object corresponding to the target grabbing point from the multi-dimensional image to be processed according to the grabbing point pose.
2. The method of claim 1, wherein the evaluating the plurality of gripping surfaces using the plurality of gripping parameters and determining a target gripping surface from the plurality of gripping surfaces based on the evaluation results comprises:
each grabbing surface of the grabbing surfaces is evaluated by the grabbing parameters, and a plurality of grabbing surface evaluation values corresponding to the grabbing surfaces are obtained;
and determining a first grabbing surface evaluation value with the highest evaluation value from the grabbing surface evaluation values, and taking the grabbing surface corresponding to the first grabbing surface evaluation value as the target grabbing surface.
3. The method of claim 1, wherein the plurality of grasping parameters comprises at least one of:
the area parameter of the grabbing surface, the height parameter of the grabbing surface, the flatness parameter of the grabbing surface and the gradient parameter of the grabbing surface.
4. A method according to any one of claims 1 to 3, wherein the determining a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the plurality of grabbing surfaces from the multi-dimensional image to be processed according to the image data information of the multi-dimensional image to be processed comprises:
inputting image data information of the multidimensional image to be processed into a deep learning network model to obtain a plurality of pixel points and a plurality of center points corresponding to the pixel points, wherein the deep learning network model is a model obtained by training an initial deep learning network model by using sample image data information of a sample multidimensional image, and the pixel points correspond to the center points one by one;
dividing the plurality of center points into a plurality of groups of center points;
determining a grabbing point corresponding to any group of center points according to any group of center points in the plurality of groups of center points until the plurality of grabbing points are determined from the plurality of groups of center points;
And determining one grabbing surface corresponding to any group of center points according to any group of pixel points corresponding to any group of center points in the plurality of groups of center points until the plurality of grabbing surfaces are determined from the plurality of groups of pixel points corresponding to the plurality of groups of center points, wherein any group of center points corresponds to any group of pixel points one by one.
5. The method of claim 4, wherein determining, from any one of the plurality of sets of center points, a grasp point corresponding to the any one set of center points until the plurality of grasp points are determined from the plurality of sets of center points, comprises:
averaging one group of position data of any group of center points to obtain average position data until a plurality of average position data are obtained from a plurality of groups of position data information of a plurality of groups of center points;
and taking a plurality of points corresponding to the plurality of average position data as a plurality of grabbing points, wherein the plurality of average position data are in one-to-one correspondence with the plurality of grabbing points.
6. A method according to any one of claims 1 to 3, wherein the image data information comprises color channel data information and depth data information.
7. An image processing apparatus, characterized in that the apparatus comprises:
an acquisition unit, configured to acquire original image data information of the multidimensional image to be processed;
a preprocessing unit, configured to adjust, if the number of original data information in the original image data information does not satisfy a preset number value, the number of original data information to a preset number; dividing the data of the original data information adjusted to the preset number by a preset value respectively to obtain image data information of the multidimensional image to be processed;
the determining unit is used for determining a plurality of grabbing surfaces and a plurality of grabbing points corresponding to the grabbing surfaces from the multi-dimensional image to be processed according to the image data information of the multi-dimensional image to be processed, wherein the grabbing surfaces and the grabbing points are in one-to-one correspondence; determining a plurality of grabbing parameters corresponding to the grabbing surfaces; taking the grabbing point corresponding to the target grabbing surface as a target grabbing point; according to the target grabbing points, grabbing point positions corresponding to the target grabbing points are determined, and according to the grabbing point positions, target objects corresponding to the target grabbing points are grabbed from the multidimensional image to be processed;
And the evaluation unit is used for evaluating the plurality of grabbing surfaces by utilizing the plurality of grabbing parameters and determining the target grabbing surface from the plurality of grabbing surfaces according to the evaluation result.
8. An image processing apparatus, characterized in that the apparatus comprises:
a memory and a graphics processor, the memory storing an image processing program executable by the graphics processor, the image processing program, when executed, performing the method of any one of claims 1 to 6 by the graphics processor.
9. A storage medium having stored thereon a computer program for application to an image processing apparatus, characterized in that the computer program, when executed by a graphics processor, implements the method of any of claims 1 to 6.
10. A robot, comprising: a robot arm for gripping a target object at a gripping point pose, and an image processing device for performing the method of any one of claims 1 to 6, in case the image processing device determines the gripping point pose of the target object.
CN202010117760.6A 2020-02-25 2020-02-25 Image processing method, device, storage medium and robot Active CN111325795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010117760.6A CN111325795B (en) 2020-02-25 2020-02-25 Image processing method, device, storage medium and robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010117760.6A CN111325795B (en) 2020-02-25 2020-02-25 Image processing method, device, storage medium and robot

Publications (2)

Publication Number Publication Date
CN111325795A CN111325795A (en) 2020-06-23
CN111325795B true CN111325795B (en) 2023-07-25

Family

ID=71172985

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010117760.6A Active CN111325795B (en) 2020-02-25 2020-02-25 Image processing method, device, storage medium and robot

Country Status (1)

Country Link
CN (1) CN111325795B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114078158A (en) * 2020-08-14 2022-02-22 边辕视觉科技(上海)有限公司 Method for automatically acquiring characteristic point parameters of target object
CN111928953A (en) * 2020-09-15 2020-11-13 深圳市商汤科技有限公司 Temperature measuring method and device, electronic equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6695843B2 (en) * 2017-09-25 2020-05-20 ファナック株式会社 Device and robot system
CN109598264B (en) * 2017-09-30 2020-10-16 北京猎户星空科技有限公司 Object grabbing method and device
US11185986B2 (en) * 2017-11-17 2021-11-30 The Hong Kong University Of Science And Technology Robotic fingertip design and grasping on contact primitives
CN108280856B (en) * 2018-02-09 2021-05-07 哈尔滨工业大学 Unknown object grabbing pose estimation method based on mixed information input network model
CN109986560B (en) * 2019-03-19 2023-02-14 埃夫特智能装备股份有限公司 Mechanical arm self-adaptive grabbing method for multiple target types
CN110238840B (en) * 2019-04-24 2021-01-29 中山大学 Mechanical arm autonomous grabbing method based on vision

Also Published As

Publication number Publication date
CN111325795A (en) 2020-06-23

Similar Documents

Publication Publication Date Title
CN109801337B (en) 6D pose estimation method based on instance segmentation network and iterative optimization
CN108280856B (en) Unknown object grabbing pose estimation method based on mixed information input network model
CN106530297B (en) Grasping body area positioning method based on point cloud registering
Richtsfeld et al. Segmentation of unknown objects in indoor environments
CN109685141B (en) Robot article sorting visual detection method based on deep neural network
Eppner et al. Grasping unknown objects by exploiting shape adaptability and environmental constraints
Cretu et al. Soft object deformation monitoring and learning for model-based robotic hand manipulation
US11794343B2 (en) System and method for height-map-based grasp execution
CN111325795B (en) Image processing method, device, storage medium and robot
CN107705322A (en) Motion estimate tracking and system
JP2011022992A (en) Robot with vision-based 3d shape recognition
CN107016391A (en) A kind of complex scene workpiece identification method
Chen et al. Combining reinforcement learning and rule-based method to manipulate objects in clutter
Le Louedec et al. Segmentation and detection from organised 3D point clouds: A case study in broccoli head detection
CN113762159B (en) Target grabbing detection method and system based on directional arrow model
CN108555902B (en) Method and device for sorting articles by robot and robot
CN106446832B (en) Video-based pedestrian real-time detection method
CN113724329A (en) Object attitude estimation method, system and medium fusing plane and stereo information
Yadav et al. An image matching and object recognition system using webcam robot
Hameed et al. Pose estimation of objects using digital image processing for pick-and-place applications of robotic arms
CN108805896B (en) Distance image segmentation method applied to urban environment
CN113524172B (en) Robot, article grabbing method thereof and computer-readable storage medium
Yang et al. Target position and posture recognition based on RGB-D images for autonomous grasping robot arm manipulation
CN106886791A (en) Fat location recognition methods in a kind of two-dimensional ct picture based on condition random field
Sial et al. Spatio-temporal RGBD cuboids feature for human activity recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant