CN111368852A - Article identification and pre-sorting system and method based on deep learning and robot - Google Patents


Info

Publication number
CN111368852A
CN111368852A (application CN201811605348.8A)
Authority
CN
China
Prior art keywords
target object
target
image data
neural network
sorting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811605348.8A
Other languages
Chinese (zh)
Inventor
姜楠
曲道奎
邹风山
王晓东
毕丰隆
徐佳新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Siasun Robot and Automation Co Ltd
Original Assignee
Shenyang Siasun Robot and Automation Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Siasun Robot and Automation Co Ltd filed Critical Shenyang Siasun Robot and Automation Co Ltd
Priority to CN201811605348.8A priority Critical patent/CN111368852A/en
Publication of CN111368852A publication Critical patent/CN111368852A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/464Salient features, e.g. scale invariant feature transforms [SIFT] using a plurality of salient features, e.g. bag-of-words [BoW] representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds

Abstract

The invention provides an article identification and pre-sorting method based on deep learning, which comprises the steps of: acquiring image data containing a target article by using an RGB-D (red-green-blue-depth) camera; performing target positioning on the image data by using a convolutional neural network; performing pixel-level segmentation on the positioned target object by using a multi-scale target detection FPN (feature pyramid network) to obtain a target article pixel point set; processing the target article pixel point set according to the correspondence between the image data and the point cloud to obtain a point cloud set; and processing the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, the pose information being used for sorting the target object. Accurate pose information of the target article can thus be provided, guaranteeing rapid sorting by the robot. The invention further provides a corresponding article identification and pre-sorting system and robot based on deep learning.

Description

Article identification and pre-sorting system and method based on deep learning and robot
Technical Field
The invention relates to the field of robot vision, in particular to an article identification and pre-sorting system and method based on deep learning and a robot.
Background
With the rapid development of electronic commerce, using a robot as the sorting actuator can greatly improve the flexibility and efficiency of sorting. For the robot to work autonomously beyond the teaching mode, identifying and positioning targets through machine vision is the more effective and practical approach.
The robot consists of a mechanical arm and a mobile chassis, so it can serve multiple stations. When machine vision is used to position a target, two methods are common: one is based on a two-dimensional camera and grasps the target by means of plane calibration; the other adopts a three-dimensional sensor, which can position and manipulate any article in space. In either case the vision recognition system becomes an indispensable auxiliary unit.
A vision recognition system is often very complex, requiring the ability to accurately capture images and react to external changes in real time. It is also frequently required to track external moving targets in real time, which places high demands on the real-time performance of the hardware and software; in actual use, traditional methods therefore still dominate.
Disclosure of Invention
The embodiments of the invention provide an article identification and pre-sorting system and method based on deep learning, and a robot, which can provide accurate pose information of target articles and guarantee rapid sorting by the robot.
In a first aspect, the invention provides a deep learning-based item identification pre-sorting method, which comprises the following steps:
acquiring image data containing a target object by using an RGBD camera;
performing target positioning on the image data by using a convolutional neural network, and performing pixel level segmentation on the positioned target object by using a multi-scale target detection FPN network to obtain a target object pixel point set;
processing the target article pixel point set according to the corresponding relation between the image data and the point cloud to obtain a point cloud set;
and processing the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, wherein the pose information is used for sorting the target object.
Optionally, before the performing target localization on the image data by using a convolutional neural network algorithm and performing pixel-level segmentation on the localized target object by using an FPN network to obtain a target item pixel point set, the method further includes:
and carrying out image recognition on the image data to obtain the identification information of the target object, wherein the identification information is used for identifying the uniqueness of the target object.
Optionally, after processing the point cloud set by ICP matching to obtain the pose information of the target object, the pose information being used for sorting the target object, the method further comprises:
and converting the pose information into position parameters of a robot coordinate system, so that the robot can grab the target object according to the position parameters.
Optionally, the performing target location on the image data by using a convolutional neural network includes:
the method comprises the steps of constructing a convolutional neural network, inputting an image of a target article into the convolutional neural network for training to obtain a trained convolutional neural network model, training a Softmax classifier through global target article features extracted by the convolutional neural network model, wherein the convolutional neural network comprises a convolutional pooling layer, a local feature fusion layer and a full connection layer, conducting modular preprocessing on image data, inputting results into the trained convolutional neural network model to obtain the features of the target article, and identifying by using the trained Softmax classifier to obtain the positioning of the target article.
Optionally, the performing target location on the image data by using a convolutional neural network includes:
receiving input image samples of multiple categories, normalizing the input image sample data of each category, convolving the normalized image sample data, mapping the convolved image sample data by adopting a preset asymmetric mapping matrix, arranging the mapped image sample data to obtain corresponding one-dimensional feature description, and calculating a neural network weight corresponding to the image of each category according to the one-dimensional feature description;
distributing the corresponding neural network weights of the plurality of category images by adopting a hierarchical structure, wherein the category number distributed in each layer is the maximum distinguishing classification number determined according to the asymmetric mapping matrix, the plurality of category images are sequentially distributed in the plurality of layers, and each layer forms a corresponding learning library;
processing the input test-category image sample data to obtain the corresponding one-dimensional feature description, and performing feed-forward learning with that one-dimensional feature description and the neural network weights in the learning library to determine whether the test category is among the learned image categories.
Optionally, the processing the point cloud set by the iterative closest point ICP matching to obtain the pose information of the target object includes:
determining any two three-dimensional point sets in the point cloud set, namely a first three-dimensional point set X1 and a second three-dimensional point set X2;
calculating, for each point in the second three-dimensional point set X2, the corresponding nearest point in the first three-dimensional point set X1;
solving for the rigid-body transformation that minimizes the average distance to the corresponding nearest points, obtaining translation and rotation parameters;
applying the translation and rotation parameters obtained in the previous step to the second three-dimensional point set X2 to obtain a new transformed point set;
and stopping the iterative computation when the average distance between the new transformed point set and the reference point set is smaller than a given threshold; otherwise taking the new transformed point set as the new second three-dimensional point set X2 and continuing to iterate until the objective function requirement is met, thereby obtaining the pose information of the target object.
Optionally, the performing image recognition on the image data to obtain the identification information of the target item includes:
and recognizing the image data by optical character recognition (OCR) to obtain the identification information of the target article.
In a second aspect, the present invention provides an item identification pre-sorting system based on deep learning, comprising:
the vision board card, configured to acquire image data containing a target article by using an RGB-D camera, process the target-article pixel point set according to the correspondence between the image data and the point cloud to obtain a point cloud set, and process the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, the pose information being used for sorting the target object;
the vision processing unit GPU server is used for carrying out target positioning on the image data by utilizing a convolutional neural network and carrying out pixel level segmentation on the positioned target object by utilizing a multi-scale target detection FPN network to obtain a target article pixel point set;
and the vision board card and the GPU server communicate via the TCP/IP protocol.
Optionally, the vision board is further configured to convert the pose information into a position parameter of a robot coordinate system, so that the robot grasps the target object according to the position parameter;
the GPU server is also used for carrying out image recognition on the image data to obtain the identification information of the target object, and the identification information is used for identifying the uniqueness of the target object.
Optionally, the GPU server is specifically configured to recognize the image data by using an optical character recognition OCR to obtain the identification information of the target item.
In a third aspect, the invention provides a robot for performing the deep learning based item identification pre-sorting method as described above.
According to the technical scheme, the embodiment of the invention has the following advantages:
the invention provides an article identification pre-sorting method based on deep learning, which comprises the following steps: the method comprises the steps of acquiring image data containing a target object by using an RGBD (red green blue) camera, performing target positioning on the image data by using a convolutional neural network, performing pixel level segmentation on the positioned target object by using a multi-scale target detection FPN (field programmable gate array) network to obtain a target object pixel point set, processing the target object pixel point set according to the corresponding relation between the image data and point cloud to obtain a point cloud set, processing the point cloud set in an iterative closest point ICP (inductively coupled plasma) matching mode to obtain pose information of the target object, wherein the pose information is used for sorting the target object, can provide accurate pose information of the target object and guarantee the rapid sorting of a robot, correspondingly provides an object identification pre-sorting system and the robot based on deep learning, and effectively improves the calculation speed by adopting a local private server, and accurate pose information of target objects can be provided, and a guarantee is provided for the robot to rapidly sort.
Drawings
FIG. 1 is a flow chart of a deep learning based item identification pre-sort method provided in an embodiment of the present invention;
FIG. 2 is a schematic diagram of a deep learning based item identification pre-sort method provided in an embodiment of the present invention;
fig. 3 is a block diagram of an item identification pre-sorting system based on deep learning provided in an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, the present invention provides a deep learning-based item identification pre-sorting method, which includes:
s101, acquiring image data containing the target object by using an RGBD camera.
The RGB-D camera adopts binocular stereo vision, a method of acquiring three-dimensional geometric information of an object from multiple images based on the parallax principle. In a machine vision system, binocular vision generally obtains two digital images of the surrounding scene from different angles, either simultaneously with two cameras or at different times with a single camera. Based on the parallax principle, the three-dimensional geometric information of an object can be recovered and the three-dimensional shape and position of the surrounding scene reconstructed. Binocular vision obtains three-dimensional information from the parallax by triangulation: the image planes of the two cameras and the measured object form a triangle. Given the known positional relationship between the two cameras, the three-dimensional size of an object in their common field of view and the three-dimensional coordinates of the spatial feature points can be obtained; a binocular vision system is thus composed of the two cameras.
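For a rectified stereo pair, the triangulation relationship described above reduces to the textbook formula Z = f·B/d. The sketch below is illustrative only; the focal length, baseline and disparity values are assumed, not taken from the patent.

```python
def depth_from_disparity(f_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point from a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a visible point")
    return f_px * baseline_m / disparity_px

# A point with 20 px disparity, seen by a rig with a 700 px focal
# length and a 10 cm baseline, lies 3.5 m from the cameras.
z = depth_from_disparity(700.0, 0.10, 20.0)  # → 3.5
```

Note the inverse relationship: halving the disparity doubles the depth, which is why depth precision degrades for distant objects.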
S102, carrying out target positioning on the image data by using a convolutional neural network, and carrying out pixel level segmentation on the positioned target object by using a multi-scale target detection FPN network to obtain a target object pixel point set.
The FPN network admits several positioning strategies: multi-scale features can be obtained from multi-scale input images; prediction can be made directly on the topmost convolutional feature map, similar to mainstream R-CNN-style detection; prediction can be made independently on the feature map of each layer at its own scale; or each layer's feature map can be fused with a shallower layer after upsampling before prediction. Specifically, the higher-layer feature map is upsampled 2x, the shallower feature map has its channels compressed by a 1x1 convolution, and the two are then fused. Anchors are configured on each scale, with areas of {32², 64², 128², 256², 512²} pixels and aspect ratios {1:2, 1:1, 2:1}, giving 15 anchors in total, each mapped back to a region of the original image.
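The top-down fusion step described above (2x upsampling of the deeper map, 1x1 channel compression of the shallower map, element-wise addition) can be sketched with plain NumPy arrays. The shapes and weight values below are illustrative, not taken from the patent.

```python
import numpy as np

def fpn_merge(shallow: np.ndarray, deep: np.ndarray, w_1x1: np.ndarray) -> np.ndarray:
    """One FPN top-down merge: 2x-upsample the deeper map, compress the
    shallower map's channels with a 1x1 convolution, then add element-wise."""
    up = deep.repeat(2, axis=0).repeat(2, axis=1)  # nearest-neighbour 2x upsampling
    lateral = shallow @ w_1x1                      # 1x1 conv == per-pixel matmul
    return up + lateral

deep = np.ones((4, 4, 8))      # higher-level (coarser) feature map, 8 channels
shallow = np.ones((8, 8, 16))  # shallower (finer) feature map, 16 channels
w = np.full((16, 8), 0.5)      # hypothetical 1x1-conv weights: 16 -> 8 channels
merged = fpn_merge(shallow, deep, w)  # shape (8, 8, 8)
```

In a real FPN the weights are learned and the merged map is further smoothed by a 3x3 convolution before prediction; this sketch only shows the shape bookkeeping of the merge itself.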
S103, processing the target article pixel point set according to the corresponding relation between the image data and the point cloud to obtain a point cloud set.
A point cloud is a collection of a vast number of points sampled on the surface of an object. A point cloud obtained according to the laser measurement principle comprises three-dimensional coordinates (XYZ) and laser reflection intensity; one obtained according to the photogrammetry principle comprises three-dimensional coordinates (XYZ) and color information (RGB); combining the two principles yields a point cloud comprising three-dimensional coordinates, laser reflection intensity and color information. Once the spatial coordinates of each sampling point on the object surface are obtained, the resulting point set constitutes the point cloud. A three-dimensional coordinate measuring machine yields few, widely spaced points, called a sparse point cloud; a three-dimensional laser scanner or photographic scanner yields many, densely spaced points, called a dense point cloud.
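The correspondence between image pixels and the point cloud used in step S103 follows the pinhole camera model: each pixel with a known depth back-projects to one 3D point. A minimal sketch, with assumed intrinsic parameters (fx, fy, cx, cy) that are illustrative only:

```python
import numpy as np

def pixels_to_points(depth_m: np.ndarray, pixels, fx: float, fy: float,
                     cx: float, cy: float) -> np.ndarray:
    """Back-project (u, v) pixels with known depth into camera-frame XYZ
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    pts = []
    for u, v in pixels:
        z = depth_m[v, u]
        pts.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return np.array(pts)

depth = np.full((480, 640), 2.0)  # a flat scene 2 m from the camera
cloud = pixels_to_points(depth, [(320, 240), (380, 240)],
                         fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```

Applying this to the segmented target-article pixel set yields exactly the point cloud subset that is then fed to ICP matching.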
And S104, processing the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, wherein the pose information is used for sorting the target object.
The ICP matching specifically includes:
determining any two three-dimensional point sets in the point cloud set, namely a first three-dimensional point set X1 and a second three-dimensional point set X2; calculating, for each point in X2, the corresponding nearest point in X1; solving for the rigid-body transformation that minimizes the average distance to the corresponding nearest points, obtaining translation and rotation parameters; applying those parameters to X2 to obtain a new transformed point set; and stopping the iterative computation when the average distance between the new transformed point set and the reference point set is smaller than a given threshold, otherwise taking the new transformed point set as the new X2 and continuing to iterate until the objective function requirement is met, thereby obtaining the pose information of the target object.
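The iteration above can be condensed into a minimal NumPy sketch. The nearest-neighbour search here is brute-force, and the "rigid transformation with minimum average distance" step is solved by SVD (the Kabsch method) — one standard solver; the patent does not prescribe a particular one.

```python
import numpy as np

def best_rigid_transform(src: np.ndarray, dst: np.ndarray):
    """Least-squares rotation R and translation t aligning src to dst (SVD/Kabsch)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, c_dst - R @ c_src

def icp(x2: np.ndarray, x1: np.ndarray, iters: int = 20, tol: float = 1e-6) -> np.ndarray:
    """Iterate: pair each point of X2 with its nearest point in X1, solve the
    rigid transform, apply it, and stop when the mean distance falls below tol."""
    cur = x2.copy()
    for _ in range(iters):
        d = np.linalg.norm(cur[:, None, :] - x1[None, :, :], axis=2)
        matched = x1[d.argmin(axis=1)]  # nearest point in X1 for each point of X2
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
        if np.linalg.norm(matched - cur, axis=1).mean() < tol:
            break
    return cur
```

A point set offset by a small translation snaps back onto the reference in a single iteration, because the nearest-neighbour pairing is then already correct; larger initial misalignments are where the iteration earns its keep.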
In order to facilitate the distinguishing and marking of the target object, before the performing target localization on the image data by using the convolutional neural network algorithm and performing pixel-level segmentation on the localized target object by using the FPN network to obtain a target item pixel point set, the method further includes:
the image data is subjected to image recognition to obtain identification information of the target object, and the identification information is used for identifying the uniqueness of the target object, for example, by adding a character signboard, and obtaining character content by OCR recognition, which is not limited herein.
After the point cloud set is processed in an ICP matching manner to obtain pose information of the target object, where the pose information is used for sorting the target object, the method further includes:
and S105, converting the pose information into position parameters of a robot coordinate system, so that the robot can grab the target object according to the position parameters.
With reference to fig. 2, the coordinate system can be converted by the robot's hand-eye calibration module, and the scheme can be applied to the article sorting process. In the hand-eye calibration stage, the object pose obtained from camera recognition is converted into the robot coordinate system: the calibration board remains static while the robot moves the camera, and the transformation is solved. In the object recognition and positioning stage, the object type, actual position and posture are obtained: identification information of the object is acquired through OCR recognition, pixel-level segmentation is performed, and finally the mapped point cloud is matched by ICP to obtain the pose information. In the grasping execution stage, the object pose converted through the hand-eye calibration is used to grasp the target.
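Once hand-eye calibration has produced the camera-to-base transform, mapping a camera-frame pose into robot coordinates is a single homogeneous-matrix product. The 4x4 values below are illustrative, not calibration results from the patent.

```python
import numpy as np

def to_robot_frame(T_base_cam: np.ndarray, pose_cam: np.ndarray) -> np.ndarray:
    """Map a 4x4 object pose from the camera frame to the robot base frame."""
    return T_base_cam @ pose_cam

# Hypothetical calibration: camera axes aligned with the base,
# camera origin 0.5 m above the base origin.
T_base_cam = np.eye(4)
T_base_cam[2, 3] = 0.5
pose_cam = np.eye(4)
pose_cam[:3, 3] = [0.1, 0.2, 0.3]  # object position in the camera frame
pose_base = to_robot_frame(T_base_cam, pose_cam)
```

The resulting translation column of `pose_base` is the grasp position handed to the robot controller.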
In one embodiment of step S102, the performing target localization on the image data by using a convolutional neural network includes:
the method comprises the steps of constructing a convolutional neural network, inputting an image of a target article into the convolutional neural network for training to obtain a trained convolutional neural network model, training a Softmax classifier through global target article features extracted by the convolutional neural network model, wherein the convolutional neural network comprises a convolutional pooling layer, a local feature fusion layer and a full connection layer, conducting modular preprocessing on image data, inputting results into the trained convolutional neural network model to obtain the features of the target article, and identifying by using the trained Softmax classifier to obtain the positioning of the target article.
In another embodiment of step S102, the performing target location on the image data by using a convolutional neural network includes:
receiving input image samples of multiple categories, normalizing the input image sample data of each category, convolving the normalized image sample data, mapping the convolved image sample data by adopting a preset asymmetric mapping matrix, arranging the mapped image sample data to obtain corresponding one-dimensional feature description, and calculating a neural network weight corresponding to the image of each category according to the one-dimensional feature description;
distributing the corresponding neural network weights of the plurality of category images by adopting a hierarchical structure, wherein the category number distributed in each layer is the maximum distinguishing classification number determined according to the asymmetric mapping matrix, the plurality of category images are sequentially distributed in the plurality of layers, and each layer forms a corresponding learning library;
processing the input test-category image sample data to obtain the corresponding one-dimensional feature description, and performing feed-forward learning with that one-dimensional feature description and the neural network weights in the learning library to determine whether the test category is among the learned image categories.
The invention provides an article identification and pre-sorting method based on deep learning, comprising: acquiring image data containing a target article by using an RGB-D camera; performing target positioning on the image data by using a convolutional neural network; performing pixel-level segmentation on the positioned target object by using a multi-scale target detection FPN (feature pyramid network) to obtain a target article pixel point set; processing the target article pixel point set according to the correspondence between the image data and the point cloud to obtain a point cloud set; and processing the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, the pose information being used for sorting the target object. Accurate pose information of the target article can thus be provided, and rapid sorting by the robot is guaranteed.
As shown in fig. 3, the present invention provides a deep learning-based item identification pre-sorting system, which includes:
the vision board card, configured to acquire image data containing a target article by using an RGB-D camera, process the target-article pixel point set according to the correspondence between the image data and the point cloud to obtain a point cloud set, and process the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, the pose information being used for sorting the target object;
the vision processing unit GPU server is used for carrying out target positioning on the image data by utilizing a convolutional neural network and carrying out pixel level segmentation on the positioned target object by utilizing a multi-scale target detection FPN network to obtain a target article pixel point set;
The vision board card and the GPU server communicate via the TCP/IP protocol; adopting a local private server effectively increases the computation speed.
The vision board card is also used for converting the pose information into position parameters of a robot coordinate system so that the robot can grab the target object according to the position parameters;
the GPU server is further configured to perform image recognition on the image data to obtain identification information of the target object, the identification information being used to identify the uniqueness of the target object; specifically, the GPU server recognizes the image data by optical character recognition (OCR) to obtain the identification information of the target article.
Deep learning computation is usually accelerated with a GPU; a more complex deep learning network needs a more powerful GPU, and a GPU with stronger computing capability usually draws more power. Such a graphics card is not suitable for mounting on the composite robot, as it would greatly reduce the robot's working time. The system architecture is therefore realized in an edge computing mode: an i7 computing board card on the composite robot body connects to a local private server through a wireless network, and the functions requiring GPU computation are placed on the private server. To facilitate system integration and customization, the vision software adopts ROS as the basic framework to realize multi-machine distributed processing.
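The patent specifies TCP/IP between the board card and the GPU server but no wire format. One common convention for such links is a length-prefixed JSON frame, sketched below; the message fields (`item_id`, `pose`) are hypothetical, not from the patent.

```python
import json
import struct

def pack_msg(obj) -> bytes:
    """Serialise a message as a 4-byte big-endian length prefix + UTF-8 JSON."""
    payload = json.dumps(obj).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def unpack_msg(data: bytes):
    """Inverse of pack_msg: read the length prefix, then decode the JSON body."""
    (n,) = struct.unpack(">I", data[:4])
    return json.loads(data[4:4 + n].decode("utf-8"))

msg = {"item_id": "A-1", "pose": [0.1, 0.2, 0.3, 0.0, 0.0, 1.57]}
roundtrip = unpack_msg(pack_msg(msg))  # == msg
```

The length prefix lets the receiver reassemble one logical message from a TCP byte stream regardless of how the kernel fragments it.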
The article identification and pre-sorting system provided by the invention can be applied in the e-commerce warehousing and logistics industry: a logistics warehouse holds many types of daily goods, and the system can replace the manual work of classifying and warehousing them. It can also be applied on industrial production lines, effectively replacing workers in certain harsh production environments for tasks such as reading instruments and meters and operating simple control buttons.
The article identification and pre-sorting system based on deep learning provided by the invention comprises a vision board card and a vision processing unit (GPU) server, the latter deployed as a local private server. The vision board card acquires image data containing a target article by using an RGB-D camera, processes the target-article pixel point set according to the correspondence between the image data and the point cloud to obtain a point cloud set, and processes the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, the pose information being used for sorting the target object. The GPU server performs target positioning on the image data by using a convolutional neural network and performs pixel-level segmentation on the positioned target object by using a multi-scale target detection FPN network to obtain the target article pixel point set. The vision board card and the GPU server communicate via the TCP/IP protocol; the computation speed is effectively improved, accurate pose information of the target article can be provided, and rapid sorting by the robot is guaranteed.
Accordingly, the present invention provides a robot for performing the deep learning based item identification pre-sorting method as described above.
The vision board card is arranged on the robot body and is in communication connection with the GPU server through a TCP/IP protocol.
When the robot performs multi-station operations on an industrial site, the system scheme of this patent can effectively save the investment that fixed robots would otherwise require and, under certain conditions, realize flexible multi-station operation. Likewise, in the production of goods such as medicines and daily supplies, the system can reliably distinguish different article types and realize functions such as classification and arrangement.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
The article identification and pre-sorting system, method and robot based on deep learning provided by the present invention have been described in detail above. Those skilled in the art will appreciate that the concepts of the embodiments of the present invention may vary in their specific implementations and ranges of application.

Claims (10)

1. A deep learning based item identification pre-sorting method, the method comprising:
acquiring image data containing a target object by using an RGBD camera;
performing target positioning on the image data by using a convolutional neural network, and performing pixel level segmentation on the positioned target object by using a multi-scale target detection FPN network to obtain a target object pixel point set;
processing the target article pixel point set according to the corresponding relation between the image data and the point cloud to obtain a point cloud set;
and processing the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, wherein the pose information is used for sorting the target object.
2. The deep learning based item identification pre-sorting method of claim 1, wherein before performing target localization on the image data by using a convolutional neural network and performing pixel-level segmentation of the localized target object by using the FPN network to obtain the target item pixel point set, the method further comprises:
and carrying out image recognition on the image data to obtain the identification information of the target object, wherein the identification information is used for identifying the uniqueness of the target object.
3. The deep learning based item identification pre-sorting method according to claim 1, wherein after the point cloud set is processed by ICP matching to obtain the pose information of the target object for sorting use, the method further comprises:
and converting the pose information into position parameters of a robot coordinate system, so that the robot can grab the target object according to the position parameters.
4. The deep learning based item identification pre-sorting method of claim 1, wherein the target locating the image data by using a convolutional neural network comprises:
the method comprises the steps of constructing a convolutional neural network, inputting an image of a target article into the convolutional neural network for training to obtain a trained convolutional neural network model, training a Softmax classifier through global target article features extracted by the convolutional neural network model, wherein the convolutional neural network comprises a convolutional pooling layer, a local feature fusion layer and a full connection layer, conducting modular preprocessing on image data, inputting results into the trained convolutional neural network model to obtain the features of the target article, and identifying by using the trained Softmax classifier to obtain the positioning of the target article.
5. The deep learning based item identification pre-sorting method of claim 1, wherein the target locating the image data by using a convolutional neural network comprises:
receiving input image samples of multiple categories, normalizing the input image sample data of each category, convolving the normalized image sample data, mapping the convolved image sample data by adopting a preset asymmetric mapping matrix, arranging the mapped image sample data to obtain corresponding one-dimensional feature description, and calculating a neural network weight corresponding to the image of each category according to the one-dimensional feature description;
distributing the corresponding neural network weights of the plurality of category images by adopting a hierarchical structure, wherein the category number distributed in each layer is the maximum distinguishing classification number determined according to the asymmetric mapping matrix, the plurality of category images are sequentially distributed in the plurality of layers, and each layer forms a corresponding learning library;
processing input test type image sample data to obtain corresponding one-dimensional feature description, and performing feed-forward learning on the one-dimensional feature description corresponding to the test type image sample data and the neural network weight in the learning library to obtain whether the test type is in the learned type image.
6. The deep learning-based item identification pre-sorting method according to claim 1, wherein the processing the point cloud set by means of iterative closest point ICP matching to obtain the pose information of the target object comprises:
determining any two three-dimensional point sets in the point cloud set, namely a first three-dimensional point set X1 and a second three-dimensional point set X2;
calculating a corresponding near point for each point in the second set of three-dimensional points X2 in the first set of three-dimensional points X1;
obtaining rigid body transformation which enables the corresponding close point to have the minimum average distance, and obtaining translation parameters and rotation parameters;
obtaining a new transformation point set by using the translation and rotation parameters obtained in the previous step for the second three-dimensional point set X2;
and when the average distance between the new transformation point set and the reference point set is smaller than a given threshold value, stopping iterative computation, otherwise, taking the new transformation point set as a new second three-dimensional point set X2 to continue iteration until the requirement of the objective function is met, and obtaining the pose information of the target object.
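The iteration of claim 6 can be sketched with NumPy as a minimal point-to-point ICP. The SVD-based (Kabsch) rigid fit and all names are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def best_rigid_transform(src, dst):
    # Least-squares rotation R and translation t mapping src onto dst
    # (Kabsch/SVD solution of the "minimum average distance" step).
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(X2, X1, max_iter=50, tol=1e-6):
    # Align the second point set X2 to the first (reference) set X1.
    src = X2.copy()
    prev_err = np.inf
    for _ in range(max_iter):
        # step 1: for each point of X2, find its closest point in X1
        d = np.linalg.norm(src[:, None, :] - X1[None, :, :], axis=2)
        nn = d.argmin(axis=1)
        err = d[np.arange(len(src)), nn].mean()
        # step 2: rigid transform (rotation + translation parameters)
        # minimizing the distance to those corresponding close points
        R, t = best_rigid_transform(src, X1[nn])
        # step 3: apply it to obtain the new transformed point set
        src = src @ R.T + t
        # step 4: iterate until the error change falls below the threshold
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return src
```

A production system would use a k-d tree for the closest-point search; the brute-force distance matrix here is O(n²) and only suitable for small clouds.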
7. The deep learning based item identification pre-sorting method according to claim 2, wherein the image recognition of the image data to obtain the identification information of the target item comprises:
and recognizing the image data by optical character recognition (OCR) to obtain the identification information of the target object.
8. An item identification pre-sorting system based on deep learning, comprising:
the vision board, used for acquiring image data containing a target object with an RGBD (red, green, blue and depth) camera, processing the target object's pixel point set according to the correspondence between the image data and the point cloud to obtain a point cloud set, and processing the point cloud set by iterative closest point (ICP) matching to obtain pose information of the target object, wherein the pose information is used for sorting the target object;
the graphics processing unit (GPU) server, used for performing target localization on the image data by using a convolutional neural network and performing pixel-level segmentation of the localized target object by using a multi-scale target detection FPN network to obtain the target item pixel point set;
and the visual board card and the GPU server are communicated by adopting a TCP/IP protocol.
9. The deep learning based item identification pre-sorting system of claim 8, wherein the vision board is further configured to convert the pose information into position parameters of a robot coordinate system, so that a robot grabs the target object according to the position parameters;
the GPU server is also used for carrying out image recognition on the image data to obtain the identification information of the target object, and the identification information is used for identifying the uniqueness of the target object.
10. A robot for performing the deep learning based item identification pre-sorting method of any of claims 1 to 7.
CN201811605348.8A 2018-12-26 2018-12-26 Article identification and pre-sorting system and method based on deep learning and robot Pending CN111368852A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811605348.8A CN111368852A (en) 2018-12-26 2018-12-26 Article identification and pre-sorting system and method based on deep learning and robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811605348.8A CN111368852A (en) 2018-12-26 2018-12-26 Article identification and pre-sorting system and method based on deep learning and robot

Publications (1)

Publication Number Publication Date
CN111368852A true CN111368852A (en) 2020-07-03

Family

ID=71209840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811605348.8A Pending CN111368852A (en) 2018-12-26 2018-12-26 Article identification and pre-sorting system and method based on deep learning and robot

Country Status (1)

Country Link
CN (1) CN111368852A (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070818A (en) * 2020-11-10 2020-12-11 纳博特南京科技有限公司 Robot disordered grabbing method and system based on machine vision and storage medium
CN112264309A (en) * 2020-09-30 2021-01-26 北京京东振世信息技术有限公司 Package sorting method, server and storage medium
CN112288819A (en) * 2020-11-20 2021-01-29 中国地质大学(武汉) Multi-source data fusion vision-guided robot grabbing and classifying system and method
CN112509145A (en) * 2020-12-22 2021-03-16 珠海格力智能装备有限公司 Material sorting method and device based on three-dimensional vision
CN112784717A (en) * 2021-01-13 2021-05-11 中北大学 Automatic pipe fitting sorting method based on deep learning
CN112788326A (en) * 2020-12-28 2021-05-11 北京迁移科技有限公司 Image data online acquisition system and method based on 3D vision
CN113021355A (en) * 2021-03-31 2021-06-25 重庆正格技术创新服务有限公司 Agricultural robot operation method for predicting sheltered crop picking point
CN113393522A (en) * 2021-05-27 2021-09-14 湖南大学 6D pose estimation method based on monocular RGB camera regression depth information
CN113609985A (en) * 2021-08-05 2021-11-05 诺亚机器人科技(上海)有限公司 Object pose detection method, detection device, robot and storage medium
CN113780464A (en) * 2021-09-26 2021-12-10 唐山百川智能机器股份有限公司 Method for detecting anti-loose identification of bogie fastener
CN113808197A (en) * 2021-09-17 2021-12-17 山西大学 Automatic workpiece grabbing system and method based on machine learning
CN113920142A (en) * 2021-11-11 2022-01-11 江苏昱博自动化设备有限公司 Sorting manipulator multi-object sorting method based on deep learning
CN114170521A (en) * 2022-02-11 2022-03-11 杭州蓝芯科技有限公司 Forklift pallet butt joint identification positioning method
CN114871120A (en) * 2022-05-26 2022-08-09 江苏省徐州医药高等职业学校 Medicine determining and sorting method and device based on image data processing
CN115755920A (en) * 2022-11-30 2023-03-07 南京蔚蓝智能科技有限公司 Automatic charging method for robot dog
CN116187718A (en) * 2023-04-24 2023-05-30 深圳市宏大供应链服务有限公司 Intelligent goods identification and sorting method and system based on computer vision
CN116228854A (en) * 2022-12-29 2023-06-06 中科微至科技股份有限公司 Automatic parcel sorting method based on deep learning

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544506A (en) * 2013-10-12 2014-01-29 Tcl集团股份有限公司 Method and device for classifying images on basis of convolutional neural network
CN103955939A (en) * 2014-05-16 2014-07-30 重庆理工大学 Boundary feature point registering method for point cloud splicing in three-dimensional scanning system
CN105590102A (en) * 2015-12-30 2016-05-18 中通服公众信息产业股份有限公司 Front car face identification method based on deep learning
CN106600639A (en) * 2016-12-09 2017-04-26 江南大学 Genetic algorithm and adaptive threshold constraint-combined ICP (iterative closest point) pose positioning technology
CN108171748A (en) * 2018-01-23 2018-06-15 哈工大机器人(合肥)国际创新研究院 A kind of visual identity of object manipulator intelligent grabbing application and localization method
CN108198145A (en) * 2017-12-29 2018-06-22 百度在线网络技术(北京)有限公司 For the method and apparatus of point cloud data reparation
CN108710919A (en) * 2018-05-25 2018-10-26 东南大学 A kind of crack automation delineation method based on multi-scale feature fusion deep learning

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112264309A (en) * 2020-09-30 2021-01-26 北京京东振世信息技术有限公司 Package sorting method, server and storage medium
CN112070818A (en) * 2020-11-10 2020-12-11 纳博特南京科技有限公司 Robot disordered grabbing method and system based on machine vision and storage medium
CN112070818B (en) * 2020-11-10 2021-02-05 纳博特南京科技有限公司 Robot disordered grabbing method and system based on machine vision and storage medium
CN112288819A (en) * 2020-11-20 2021-01-29 中国地质大学(武汉) Multi-source data fusion vision-guided robot grabbing and classifying system and method
CN112509145A (en) * 2020-12-22 2021-03-16 珠海格力智能装备有限公司 Material sorting method and device based on three-dimensional vision
CN112509145B (en) * 2020-12-22 2023-12-08 珠海格力智能装备有限公司 Material sorting method and device based on three-dimensional vision
CN112788326B (en) * 2020-12-28 2023-06-06 北京迁移科技有限公司 3D vision-based image data online acquisition system and method
CN112788326A (en) * 2020-12-28 2021-05-11 北京迁移科技有限公司 Image data online acquisition system and method based on 3D vision
CN112784717B (en) * 2021-01-13 2022-05-13 中北大学 Automatic pipe fitting sorting method based on deep learning
CN112784717A (en) * 2021-01-13 2021-05-11 中北大学 Automatic pipe fitting sorting method based on deep learning
CN113021355A (en) * 2021-03-31 2021-06-25 重庆正格技术创新服务有限公司 Agricultural robot operation method for predicting sheltered crop picking point
CN113393522A (en) * 2021-05-27 2021-09-14 湖南大学 6D pose estimation method based on monocular RGB camera regression depth information
CN113609985A (en) * 2021-08-05 2021-11-05 诺亚机器人科技(上海)有限公司 Object pose detection method, detection device, robot and storage medium
CN113609985B (en) * 2021-08-05 2024-02-23 诺亚机器人科技(上海)有限公司 Object pose detection method, detection device, robot and storable medium
CN113808197A (en) * 2021-09-17 2021-12-17 山西大学 Automatic workpiece grabbing system and method based on machine learning
CN113780464A (en) * 2021-09-26 2021-12-10 唐山百川智能机器股份有限公司 Method for detecting anti-loose identification of bogie fastener
CN113920142B (en) * 2021-11-11 2023-09-26 江苏昱博自动化设备有限公司 Sorting manipulator multi-object sorting method based on deep learning
CN113920142A (en) * 2021-11-11 2022-01-11 江苏昱博自动化设备有限公司 Sorting manipulator multi-object sorting method based on deep learning
CN114170521A (en) * 2022-02-11 2022-03-11 杭州蓝芯科技有限公司 Forklift pallet butt joint identification positioning method
CN114871120A (en) * 2022-05-26 2022-08-09 江苏省徐州医药高等职业学校 Medicine determining and sorting method and device based on image data processing
CN114871120B (en) * 2022-05-26 2023-11-07 江苏省徐州医药高等职业学校 Medicine determining and sorting method and device based on image data processing
CN115755920A (en) * 2022-11-30 2023-03-07 南京蔚蓝智能科技有限公司 Automatic charging method for robot dog
CN116228854B (en) * 2022-12-29 2023-09-08 中科微至科技股份有限公司 Automatic parcel sorting method based on deep learning
CN116228854A (en) * 2022-12-29 2023-06-06 中科微至科技股份有限公司 Automatic parcel sorting method based on deep learning
CN116187718B (en) * 2023-04-24 2023-08-04 深圳市宏大供应链服务有限公司 Intelligent goods identification and sorting method and system based on computer vision
CN116187718A (en) * 2023-04-24 2023-05-30 深圳市宏大供应链服务有限公司 Intelligent goods identification and sorting method and system based on computer vision

Similar Documents

Publication Publication Date Title
CN111368852A (en) Article identification and pre-sorting system and method based on deep learning and robot
CN108171748B (en) Visual identification and positioning method for intelligent robot grabbing application
CN108555908B (en) Stacked workpiece posture recognition and pickup method based on RGBD camera
CN111328396B (en) Pose estimation and model retrieval for objects in images
CN109816725B (en) Monocular camera object pose estimation method and device based on deep learning
CN109870983B (en) Method and device for processing tray stack image and system for warehousing goods picking
CN106156778B (en) The method of known object in the visual field of NI Vision Builder for Automated Inspection for identification
CN111080693A (en) Robot autonomous classification grabbing method based on YOLOv3
CN102141398B (en) Monocular vision-based method for measuring positions and postures of multiple robots
CN111695562B (en) Autonomous robot grabbing method based on convolutional neural network
CN111178250A (en) Object identification positioning method and device and terminal equipment
CN111179324A (en) Object six-degree-of-freedom pose estimation method based on color and depth information fusion
CN110084243B (en) File identification and positioning method based on two-dimensional code and monocular camera
CN107992881A (en) A kind of Robotic Dynamic grasping means and system
CN114952809B (en) Workpiece identification and pose detection method, system and mechanical arm grabbing control method
CN109461184B (en) Automatic positioning method for grabbing point for grabbing object by robot mechanical arm
CN110400315A (en) A kind of defect inspection method, apparatus and system
Ni et al. A new approach based on two-stream cnns for novel objects grasping in clutter
Zhuang et al. Instance segmentation based 6D pose estimation of industrial objects using point clouds for robotic bin-picking
CN115816460A (en) Manipulator grabbing method based on deep learning target detection and image segmentation
WO2023278550A1 (en) Systems and methods for picking objects using 3-d geometry and segmentation
Salem et al. Assessment of methods for industrial indoor object recognition
Zhang et al. A fast detection and grasping method for mobile manipulator based on improved faster R-CNN
CN111242057A (en) Product sorting system, method, computer device and storage medium
CN207752527U (en) A kind of Robotic Dynamic grasping system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200703