WO2021220191A2 - Automatic production of a training dataset for a neural network - Google Patents

Automatic production of a training dataset for a neural network

Info

Publication number
WO2021220191A2
WO2021220191A2 (PCT/IB2021/053529)
Authority
WO
WIPO (PCT)
Prior art keywords
objects
representation
scenography
work area
neural network
Prior art date
Application number
PCT/IB2021/053529
Other languages
French (fr)
Other versions
WO2021220191A3 (en)
Inventor
Carlo Bazzica
Original Assignee
Bazzica Engineering S.R.L.
Priority date
Filing date
Publication date
Application filed by Bazzica Engineering S.R.L. filed Critical Bazzica Engineering S.R.L.
Publication of WO2021220191A2 publication Critical patent/WO2021220191A2/en
Publication of WO2021220191A3 publication Critical patent/WO2021220191A3/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133 Distances to prototypes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects


Abstract

A neural network training method comprising the phases of: virtually creating a scenography (2) that represents a physical work area (3) in which different objects (7) to be handled are arranged, the scenography (2) modeling a 3D representation of the objects (7) and implementing the laws of physics to which the objects (7) would be subjected in a real physical environment; processing 3D images (10) extracted from the scenography (2) by extracting from each image labeling data (11) associated with the objects (7) represented in the image, thus forming a training dataset (12); and providing the training dataset (12) to a neural network (13) to carry out a network setup phase that implements the artificial intelligence for recognizing different types of objects (7) and for determining the positions of the individual objects (7) in the physical work area.

Description

AUTOMATIC PRODUCTION OF A TRAINING DATASET FOR A NEURAL NETWORK
Cross Reference to Related Applications
This patent application claims priority to Italian patent application no. 102020000009283 filed on 28.04.2020, the disclosure of which is incorporated herein by reference.
Technical Field of the Invention
The present invention relates, in general, to the field of neural networks, and more particularly to the automatic production of a training dataset for a neural network.
State of the Art
As is known, an increasing number of objects are sold online through e-commerce platforms where a vast variety of different objects characterized by an ever-shorter life cycle is offered.
Companies operating in this field have found themselves managing an increasing number of objects in a limited time-frame and sometimes even in small quantities.
In other words, the number of Stock Keeping Units (SKUs), i.e. different objects to be managed, has increased, while at the same time the average batch size of identical objects to be managed and the lead times have significantly decreased.
For the physical handling of these objects from the warehouses to the shipping channels, logistic handling systems have been developed in which a robotic device is provided with a grasping member designed to pick up the objects moving along conveyors, for example a conveyor belt, and to direct them to different destinations.
In order to carry out the grasping operations in a completely automatic way, the robotic device is provided with a vision system that uses artificial intelligence for recognizing the different types of objects and determining the positions of the individual objects on the conveyor belt, in order to grasp the objects arranged randomly on a support (conveyor belt and/or logistic bin).
In particular, the vision systems can conveniently use:
- “Active Stereo” and “Structured-Light” 3D “Depth Vision” artificial vision sensors, and
- Recognition of objects to be grasped by means of software frameworks based on convolutional neural network models of the “Deep Learning - R-CNN” (Region Based Convolutional Neural Network) type.
As is known, neural networks need training datasets to train the neural network and enable it to produce the expected result with regard to the automatic recognition of the types of objects and of their positions.
Production of a neural network training dataset, for example an R-CNN (Region Based Convolutional Neural Network) neural network, is certainly the most critical phase during the logistic system setup.
During production of a training dataset it is necessary to manually take a series of photographs of objects arranged randomly on the conveyor belt in a work area.
An operator must then manually act on the images by labeling via a graphic interface the images of the objects considered interesting for the creation of the training dataset.
Figure 1 shows a prior art labeling during which the bounds of the objects considered interesting (A, B, C, D, E, F, G) are indicated with a polyline; the angles that the longitudinal axes of the objects (elongated objects are shown in the example) form with respect to a reference plane are also indicated. In this way, a series of labeling data is manually produced.
The set of all the labeling data of all the images forms the training dataset based on which the neural network is then trained.
The article by Stephan R. Richter et al., “Playing for Data: Ground Truth from Computer Games”, 17 September 2016 (2016-09-17), Big Data Analytics in the Social and Ubiquitous Context: 5th International Workshop on Modeling Social Media, Ubiquitous and Social Environments, MUSE 2014, and First International Workshop on Machine Learning, proposes a solution to the high-cost problem caused by the amount of human effort required to create large datasets with pixel-level labels. In this article, an approach is presented that enables the rapid creation of pixel-level-accurate semantic label maps for images extracted from modern computer games. Although the source code and the inner workings of commercial games are inaccessible, the associations between image patches can be reconstructed from the communication between the game and the graphics hardware. This allows for a rapid propagation of semantic labels within and across the images synthesized by the game, without having access to the source code or the content. The presented approach is validated by producing dense semantic pixel-level labeling for 25,000 images synthesized by an open-world photorealistic computer game. Experiments on semantic segmentation datasets show that using the acquired data to supplement real-world images significantly increases accuracy and that the acquired data can reduce the amount of hand-tagged real-world data.
Object and Summary of the Invention
The Applicant has experienced that, depending on the morphological complexity of the set of objects, it may be necessary to acquire even several hundred images, associating each of them with the labeling data. During this tedious work phase, which can take several days of work time, it is sufficient for the operator to make only a few mistakes, for example associating the labeling data of an object with a different object or drawing incorrect polylines (or any other kind of error caused by the repetitiveness of this type of work), to produce extremely negative effects on the neural network training procedure; the entire vision system then becomes unstable, inaccurate and, ultimately, unusable.
The aim of the present invention is to provide an automatic neural network training dataset production methodology that overcomes the drawbacks of the known methodologies, which require the intervention of an operator.
According to the present invention, a logistic system is provided, as claimed in the appended claims.
Brief Description of the Drawings
Figure 1 shows a prior art solution.
Figure 2 schematically shows the automatic neural network training dataset production methodology of the present invention.
Figures 3 and 4 show examples of working environments virtually generated according to the automatic neural network training dataset production methodology of the present invention.
Detailed Description of Preferred Embodiments of the Invention
The present invention will now be described in detail with reference to the attached figures so as to allow a person skilled in the art to develop and implement it. Various modifications to the embodiments described will be immediately evident to those skilled in the art and the generic principles described can be applied to other embodiments and applications without thereby departing from the scope of the present invention, as defined in the appended claims. Therefore, the present invention should not be considered limited to the embodiments described and illustrated, but should be accorded the broadest scope according to the principles and characteristics described and claimed herein.
Unless otherwise defined, all the technical and scientific terms used herein have the same meaning commonly used by persons of ordinary experience in the field pertaining to the present invention. In the event of conflict, the present description, including the definitions provided, will be binding. Furthermore, the examples are provided for illustrative purposes only and as such should not be considered limiting.
In particular, the block diagrams included in the attached figures and described in the following are not intended as a representation of structural characteristics or constructive limitations, but must be interpreted as a representation of functional characteristics, i.e. intrinsic properties of the devices defined by the effects obtained, in other words functional limitations, that can be implemented in different ways so as to protect the functionality thereof (possibility of functioning).
In order to facilitate the understanding of the embodiments described herein, reference will be made to some specific embodiments and a specific language will be used to describe the same. The terminology used herein has the purpose of describing only particular embodiments, and is not intended to limit the scope of the present invention.
As shown in Figure 2, the present invention comprises the following macro phases:
A) virtually creating a scenography 2 representing a physical work area 3 in which different objects 7 to be handled are arranged; the scenography 2 models a 3D representation of the objects 7 and implements the laws of physics (e.g., gravity, laws of motion, etc.) to which the objects 7 would be subjected in a real physical environment;
B) processing 3D images 10 extracted from the scenography 2 by extracting from each image labeling data 11 associated with the objects 7 represented in the image 10, thus automatically forming a training dataset 12; and
C) providing the training dataset 12 to a neural network 13 to carry out a set-up phase of the neural network 13 which implements the artificial intelligence for recognizing different types of objects 7 and determining positions of individual objects 7 in a real work area.
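Purely by way of illustration, the three macro phases can be sketched in code as follows; every name in this sketch (create_scenography, shoot_and_label, etc.) is a hypothetical placeholder invented for the example and does not correspond to any library or to code disclosed in this application.

```python
# Illustrative sketch of macro phases A)-C). All names are hypothetical
# placeholders, not taken from the patent disclosure.
import random
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    class_name: str
    pose: tuple = (0.0, 0.0, 0.0)            # x, y, z in the virtual work area

@dataclass
class LabeledImage:
    objects: list                             # objects visible in the image
    labeling_data: list = field(default_factory=list)  # labeling data (11)

def create_scenography(object_classes, n_objects=10):
    """Phase A): place objects in the virtual work area (first arrangement)."""
    return [SceneObject(random.choice(object_classes)) for _ in range(n_objects)]

def mix(scene):
    """Stand-in for the physics-based mixing that yields the second arrangement."""
    for obj in scene:
        obj.pose = tuple(random.uniform(0.0, 1.0) for _ in range(3))
    return scene

def shoot_and_label(scene):
    """Phase B): one 'layering' image per object, with its labeling data."""
    images = []
    for obj in scene:
        label = {"class": obj.class_name, "position": obj.pose}
        images.append(LabeledImage(objects=[obj], labeling_data=[label]))
    return images

def build_training_dataset(object_classes, n_scenes=100):
    dataset = []
    for _ in range(n_scenes):
        scene = mix(create_scenography(object_classes))
        dataset.extend(shoot_and_label(scene))
    return dataset                            # phase C) feeds this to the network

if __name__ == "__main__":
    print(len(build_training_dataset(["bottle", "box", "can"])))
```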
In the example shown in Figure 3, the scenography 2 represents a physical work area 3 where a robotic device 4 is provided with a grasping member 5 operable to grasp different objects 7 which move along a conveyor device 8 on which the objects 7 are randomly arranged; the robotic device 4 represented in the scenography 2 is driven by an object recognition system of the Region Based Convolutional Neural Network type. However, it goes without saying that the work area may be different; for example, it may be represented by a bin in which the objects 7 are arranged randomly overlapping.
In greater detail, in phase A), the scenography 2 comprises the 3D representation of each object 7 (see Figure 4) including the representation of the walls 15 that delimit the internal volume of the object 7, and the representation of the portions of walls 15p of the object 7 that co-operate with the grasping member 5 to allow the object 7 to be grasped from the conveyor device 8.
The 3D representation of each object 7 is associated with characteristic parameters of the object 7 such as size, weight, density, etc., which are used by the laws of physics implemented in the scenography 2 to characterize the physical-dynamic behavior of the object 7.
The 3D representation of the object 7 is produced by a 3D CAD system and/or by a 3D scanner operated to scan a physical object 7.
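A minimal sketch of how the 3D representation of an object 7 and its characteristic parameters might be held together in data is given below; the field names and the derivation of the mass from density and a bounding-box volume are illustrative assumptions, not details taken from this disclosure.

```python
# Sketch of per-object data used by the physics of the scenography (phase A).
# Field names and the mass computation are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class ObjectModel:
    name: str
    mesh_file: str                 # e.g. exported from a 3D CAD system or a 3D scanner
    size_m: tuple                  # bounding dimensions (x, y, z) in metres
    density_kg_m3: float           # characteristic parameter used by the physics laws
    graspable_faces: list = field(default_factory=list)  # wall portions 15p

    @property
    def volume_m3(self) -> float:
        x, y, z = self.size_m
        return x * y * z           # coarse bounding-box volume

    @property
    def mass_kg(self) -> float:
        # The physics engine needs a mass; here it is derived from the density.
        return self.density_kg_m3 * self.volume_m3

box = ObjectModel("cardboard_box", "models/box.obj", (0.30, 0.20, 0.15),
                  density_kg_m3=120.0, graspable_faces=["top"])
print(f"{box.name}: mass = {box.mass_kg:.2f} kg")
```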
In greater detail, in the scenography 2 a mixing operation is modeled (in Figure 2 this operation is indicated with 2m) which rearranges the objects from a first arrangement into a second arrangement in which the objects are randomly arranged in space, one with respect to the other. This operation is extremely important as it serves to create a work environment in which the three-dimensional objects are arranged in bulk; the subsequent recognition operations thus take place in a “difficult” environment, in order to carry out an extremely competitive training.
In the example shown, the mixing operation is modeled by the fall of moving objects from a first (upper) position to a second (lower) position of the work area (Figure 4); the physical laws of dynamics which create the trajectories of falling objects and which represent the rebound of the objects contribute to the mixing operation.
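The following sketch illustrates a drop-based mixing operation of this kind using pybullet as an example physics engine; the application does not name any specific engine, and the object models and quantities used here are arbitrary stand-ins.

```python
# Sketch of the mixing operation 2m: bodies are spawned above the work area
# and dropped, and the physics engine computes trajectories, rebounds and the
# final random (second) arrangement. pybullet is only an example engine.
import random
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                               # headless simulation
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                          # floor of the work area

# First (upper) arrangement: bodies spawned at random poses above the floor.
bodies = []
for _ in range(8):
    pos = [random.uniform(-0.2, 0.2), random.uniform(-0.2, 0.2),
           random.uniform(0.3, 0.8)]
    orn = p.getQuaternionFromEuler([random.uniform(0, 3.14) for _ in range(3)])
    bodies.append(p.loadURDF("cube_small.urdf", pos, orn))

# Let the objects fall and settle into the second (lower) arrangement.
for _ in range(2000):                             # about 8 s at the default 240 Hz step
    p.stepSimulation()

for body in bodies:
    pos, orn = p.getBasePositionAndOrientation(body)
    print(body, [round(c, 3) for c in pos])
p.disconnect()
```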
In phase B) the images are extracted by carrying out a shooting phase in which the representation of the set of objects, as arranged in the second arrangement, is taken.
In the example shown, the images of the objects arranged randomly one with respect to the other, following the fall, are taken.
In particular, each individual object is represented isolated from the other objects of the set by extracting a plurality of layering images in which the object maintains its position relative to the other objects, which are not represented.
Phase B) is carried out on the individual layering images by providing labeling data associated with the objects represented individually in each image.
As is known, the labeling data can be represented in different formats depending on the R-CNN framework for which the dataset is intended.
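As an example of one such format, the sketch below serializes labeling data into a COCO-style annotation file, a layout accepted by several R-CNN frameworks; the image, category and coordinate values are invented for illustration.

```python
# Example of serializing labeling data (11) into a COCO-style annotation file,
# one common layout for R-CNN frameworks. All concrete values are invented.
import json

dataset = {
    "images": [{"id": 1, "file_name": "scene_0001_layer_A.png",
                "width": 1280, "height": 960}],
    "categories": [{"id": 1, "name": "bottle"}],
    "annotations": [{
        "id": 1, "image_id": 1, "category_id": 1,
        "bbox": [412.0, 233.0, 96.0, 240.0],                         # x, y, width, height
        "segmentation": [[412, 233, 508, 233, 508, 473, 412, 473]],  # polyline outline
        "area": 96.0 * 240.0,
        "iscrowd": 0,
    }],
}

with open("train_annotations.json", "w") as f:
    json.dump(dataset, f, indent=2)
```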
In phase B) an automatic procedure can extract the outline of the images of a generic class of objects 7, and on the basis of this outline information the labeling data can be extracted. In phase B), some images on which the labeling data extraction operation proves difficult may also be discarded. The mixing operation described above, which creates a totally random spatial arrangement of the objects in the second arrangement, could produce an arrangement which, as represented in the extracted images, makes the outline operation difficult. At the end of phase B), statistical data can be provided on the rejected images and on the images actually used for the production of labeling data.
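A possible implementation of this outline-based labeling on a single layering image is sketched below: from a binary object mask it derives a bounding box and the angle of the longitudinal axis (as in the labeling of Figure 1), discards masks that are too small to outline reliably, and tallies accepted and rejected images. The thresholds and the toy mask are illustrative assumptions.

```python
# Sketch of the automatic labeling step of phase B) on one layering mask.
import numpy as np

def label_from_mask(mask: np.ndarray, min_pixels: int = 50):
    ys, xs = np.nonzero(mask)
    if xs.size < min_pixels:                    # outline too difficult: discard
        return None
    x0, y0, x1, y1 = xs.min(), ys.min(), xs.max(), ys.max()
    # Orientation of the longitudinal axis from the second central moments.
    x_c, y_c = xs.mean(), ys.mean()
    cov = np.cov(np.stack([xs - x_c, ys - y_c]))
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]
    angle_deg = float(np.degrees(np.arctan2(major[1], major[0])))
    return {"bbox": [int(x0), int(y0), int(x1), int(y1)], "angle_deg": angle_deg}

# Toy example: an elongated diagonal blob stands in for a rendered layer mask.
mask = np.zeros((100, 100), dtype=np.uint8)
for i in range(20, 80):
    mask[i, i - 5:i + 5] = 1

accepted, rejected = 0, 0
for m in [mask, np.zeros((100, 100), dtype=np.uint8)]:   # second mask is empty
    result = label_from_mask(m)
    if result is None:
        rejected += 1                            # statistics on discarded images
    else:
        accepted += 1
        print(result)
print(f"accepted={accepted}, rejected={rejected}")
```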
Conveniently, phases A) - B) are carried out on a software platform configured for the creation of video games.
The system described above is completely implemented by a computer and therefore provides a training dataset in a completely automatic way, avoiding the possibility that manual errors introduced by an operator lead to the creation of an unsuitable training dataset. In addition, a long and tedious manual work phase is eliminated. In the real work area, a logistic system is obtained in which a real robotic device 4 is provided with a physical grasping member 5 for objects 7 that differ from one another and are arranged randomly on a transport and/or accumulation device; the robotic device 4 is guided by an object recognition system which uses a neural network that has been set up using the training dataset obtained by means of the method described above.
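For completeness, the sketch below shows what the deployment side could look like: a detector trained on the synthetic dataset is applied to a real camera frame and the most confident detection is turned into a pixel-space grasp target for the grasping member. torchvision's Faster R-CNN is used only as one example of an R-CNN framework, and the weight file and camera frame are hypothetical.

```python
# Sketch of deployment: the network trained on synthetic data guides grasping.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=4)                 # background + 3 object classes
# Hypothetical checkpoint produced by the training of phase C).
model.load_state_dict(torch.load("synthetic_trained_rcnn.pth"))
model.eval()

image = torch.rand(3, 480, 640)                  # stand-in for a real camera frame
with torch.no_grad():
    detections = model([image])[0]               # dict with boxes, labels, scores

# Pick the most confident detection and convert its box center to a grasp point.
if detections["scores"].numel() > 0:
    best = int(detections["scores"].argmax())
    x0, y0, x1, y1 = detections["boxes"][best].tolist()
    grasp_px = ((x0 + x1) / 2.0, (y0 + y1) / 2.0)   # pixel target for the grasping member
    print("class:", int(detections["labels"][best]), "grasp at", grasp_px)
else:
    print("no object detected")
```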

Claims

1. A logistic system comprising a robotic device (4) with a grasping member (5) operable to grasp different objects (7) randomly arranged on a transport and/or accumulation device in a work area (3); the robotic device (4) is driven by an object recognition system operating based on a neural network trained to recognize the different objects (7) and determine positions thereof in the work area (3) by using a dataset automatically produced by a computer-implemented video gaming software platform designed to:
A) virtually create a scenography (2) representing the work area (3) where the different objects (7) to be grasped are arranged; the scenography models a 3D representation of the objects (7) and implements the laws of physics to which the objects (7) would be subjected in a real environment; and
B) process a plurality of 3D images (10) extracted from the scenography (2) by extracting from each processed image labeling data (11) associated with the objects (7) represented in the processed image, thus forming a training dataset (12) for the neural network.
2. The logistic system of claim 1, wherein the 3D representation of an object (7) comprises the representation of external walls (15) that delimit an internal volume of the object (7), and the representation of portions (15p) of the object (7) intended to co-operate with a device, for example the grasping member (5), operating in the work area (3).
3. The logistic system of claim 1 or 2, wherein the 3D representation of an object (7) is associated with characteristic parameters of the object (7) such as size, weight, density, etc., which are used by the laws of physics implemented in the scenography (2) to characterize the physical-dynamic behavior of the object (7).
4. The logistic system of any one of the preceding claims, wherein the video gaming software platform is further designed to model in the scenography (2) an object mixing operation during which the arrangement of the objects (7) changes from a first arrangement to a second arrangement in which the objects (7) are mutually randomly spatially arranged.
5. The logistic system of claim 4, wherein the mixing operation is modeled to cause moving objects (7) to fall from an upper position to a lower position in the work area (3); the physical laws of dynamics that produce the trajectories of the falling objects (7) and that represent the rebound of objects (7) contribute to the mixing operation.
6. The logistic system of claim 4 or 5, wherein in phase B) the images are extracted by carrying out a shooting phase in which a representation of the objects (7) arranged in the second arrangement is taken.
7. The logistic system of claim 6, wherein in the shooting phase each individual object (7) is represented isolated from the other objects (7) by extracting layering images in which the object (7) maintains its position relative to the other objects (7) that are not represented.
8. The logistic system of claim 7, wherein phase B) is carried out on the layering images by providing labeling data (11) associated with the objects (7) individually represented in each image.
9. The logistic system of any of the preceding claims, wherein the 3D representation of an object (7) is obtained by means of a 3D CAD system or by means of a 3D scanner that scans the object (7).
PCT/IB2021/053529 2020-04-28 2021-04-28 Automatic production of a training dataset for a neural network WO2021220191A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IT102020000009283A IT202000009283A1 (en) 2020-04-28 2020-04-28 AUTOMATIC PRODUCTION OF A TRAINING DATA SET FOR A NEURAL NETWORK
IT102020000009283 2020-04-28

Publications (2)

Publication Number Publication Date
WO2021220191A2 true WO2021220191A2 (en) 2021-11-04
WO2021220191A3 WO2021220191A3 (en) 2022-01-13

Family

ID=71994716

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2021/053529 WO2021220191A2 (en) 2020-04-28 2021-04-28 Automatic production of a training dataset for a neural network

Country Status (2)

Country Link
IT (1) IT202000009283A1 (en)
WO (1) WO2021220191A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116416217A (en) * 2023-03-06 2023-07-11 赛那德科技有限公司 Method, system and equipment for generating unordered stacking parcel image

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6607261B2 (en) * 2015-12-24 2019-11-20 富士通株式会社 Image processing apparatus, image processing method, and image processing program
GB2568475A (en) * 2017-11-15 2019-05-22 Cubic Motion Ltd A method of generating training data

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116416217A (en) * 2023-03-06 2023-07-11 赛那德科技有限公司 Method, system and equipment for generating unordered stacking parcel image
CN116416217B (en) * 2023-03-06 2023-11-28 赛那德科技有限公司 Method, system and equipment for generating unordered stacking parcel image

Also Published As

Publication number Publication date
IT202000009283A1 (en) 2021-10-28
WO2021220191A3 (en) 2022-01-13

Similar Documents

Publication Publication Date Title
US20230264109A1 (en) System and method for toy recognition
US11308689B2 (en) Three dimensional scanning and data extraction systems and processes for supply chain piece automation
KR102332603B1 (en) Robotic system for palletizing packages using real-time placement simulation
Mahler et al. Learning ambidextrous robot grasping policies
US10755437B2 (en) Information processing device, image recognition method and non-transitory computer readable medium
CN107463946B (en) Commodity type detection method combining template matching and deep learning
Bormann et al. Towards automated order picking robots for warehouses and retail
Wang et al. Dense robotic packing of irregular and novel 3D objects
Hinkle et al. Predicting object functionality using physical simulations
Hutabarat et al. Combining virtual reality enabled simulation with 3D scanning technologies towards smart manufacturing
WO2021220191A2 (en) Automatic production of a training dataset for a neural network
Garcia-Garcia et al. A study of the effect of noise and occlusion on the accuracy of convolutional neural networks applied to 3D object recognition
Periyasamy et al. Synpick: A dataset for dynamic bin picking scene understanding
Buls et al. Generation of synthetic training data for object detection in piles
Imtiaz et al. Prehensile and non-prehensile robotic pick-and-place of objects in clutter using deep reinforcement learning
Le et al. Deformation-aware data-driven grasp synthesis
Jia et al. Robot Online 3D Bin Packing Strategy Based on Deep Reinforcement Learning and 3D Vision
CN112633187B (en) Automatic robot carrying method, system and storage medium based on image analysis
Sauvet et al. Model-based grasping of unknown objects from a random pile
Gouda et al. DoPose-6D dataset for object segmentation and 6D pose estimation
Mojtahedzadeh Safe robotic manipulation to extract objects from piles: From 3D perception to object selection
JP6730091B2 (en) Loading procedure determination device and loading procedure determination program
Gouda et al. Object class-agnostic segmentation for practical CNN utilization in industry
Fichtl et al. Bootstrapping relational affordances of object pairs using transfer
Jonker Robotic bin-picking pipeline for chicken fillets with deep learning-based instance segmentation using synthetic data

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21728113

Country of ref document: EP

Kind code of ref document: A2

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21728113

Country of ref document: EP

Kind code of ref document: A2