CN109903323B - Training method and device for transparent object recognition, storage medium and terminal - Google Patents

Training method and device for transparent object recognition, storage medium and terminal

Info

Publication number
CN109903323B
CN109903323B (application number CN201910167767.6A)
Authority
CN
China
Prior art keywords
images
depth
rgb
establishing
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910167767.6A
Other languages
Chinese (zh)
Other versions
CN109903323A (en)
Inventor
张�成
龙宇
王语诗
蔡自立
郑子璇
吉守龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201910167767.6A
Publication of CN109903323A
Application granted
Publication of CN109903323B
Legal status: Active

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a training method, a device, a storage medium and a terminal for transparent object recognition, wherein the method comprises the following steps: S1, establishing a first data set with a plurality of RGB images and a second data set with a plurality of depth images, wherein the RGB images are respectively in one-to-one correspondence with the depth images; S2, establishing a multi-mode fused deep convolutional neural network structure N1, wherein the N1 trains the RGB images and the depth images independently so as to respectively extract first characteristic information of the RGB images and second characteristic information of the depth images and obtain a network weight model M1; S3, establishing a multi-mode shared deep convolutional network structure N2, and inputting the first characteristic information and the second characteristic information into the N2 for fusion training so as to output classification parameter information and position coordinate information of the object and obtain a network weight model M2; and S4, inputting further pairs of RGB images and depth images to adjust the parameters of the network weight model M1 and the network weight model M2 so as to obtain optimized network weight models M11 and M22.

Description

Training method and device for transparent object recognition, storage medium and terminal
Technical Field
The invention relates to the technical field of image recognition, in particular to a training method, a training device, a storage medium and a terminal for transparent object recognition.
Background
Science and technology are developing rapidly, and the popularization of industrial robots not only frees up labor but also increases production speed and quality. In particular, the introduction of machine vision has further improved the efficiency of robotic grasping. However, for special articles such as transparent objects, machine vision still suffers from difficulties such as low recognition accuracy and long processing time.
Because images of transparent objects are easily affected by environmental factors, the stability and accuracy of single-modality object recognition systems are impaired to a certain extent. A common approach is to enhance the salient features of the object by altering the environment, but such methods are largely limited to semi-transparent or otherwise constrained objects; for example, patent CN104180772A requires the transparent object to have a rough surface and can only recognize flat transparent objects. Moreover, these methods often demand high equipment configurations or involve complex computation and cannot meet industrial requirements; for example, patent CN102753933B imposes a harsh setting that requires shielding from external light sources, and therefore has little practical industrial value and cannot adapt to variable and complex environments.
Therefore, the prior art has defects and needs to be improved urgently.
Disclosure of Invention
The embodiment of the invention provides a training method, a training device, a storage medium and a terminal for transparent object identification, which can improve the accuracy and efficiency of transparent object identification.
The embodiment of the invention provides a training method for transparent object identification, which comprises the following steps:
s1, establishing a first data set with a plurality of RGB images and a second data set with a plurality of depth images, wherein the RGB images are in one-to-one correspondence with the depth images respectively;
s2, establishing a multi-mode fusion depth convolution neural network structure N1, wherein the N1 is used for extracting a plurality of RGB images for independent training and extracting a plurality of depth images for independent training so as to respectively extract first characteristic information of the RGB images and second characteristic information of the depth images to obtain a network weight model M1;
s3, establishing a multi-mode shared deep convolutional network structure N2, inputting the first characteristic information and the second characteristic information into the N2 for fusion training, so as to output classification parameter information and position coordinate information of the object and obtain a network weight model M2;
and S4, inputting other multiple pairs of RGB images and depth images again to carry out parameter adjustment on the network weight model M1 and the network weight model M2 so as to obtain optimized network weight models M11 and M22.
In the training method for transparent object recognition according to the present invention, the step of establishing a first data set having a plurality of RGB images and a second data set having a plurality of depth images, where the plurality of RGB images respectively correspond to the plurality of depth images one to one includes:
collecting RGB images and depth images of a plurality of objects to be trained, wherein the RGB images correspond to the depth images one by one respectively;
carrying out boundary calibration on an object to be trained in the RGB image, and setting first classification parameter information of the object to be trained and first position coordinate information of the object to be trained in the RGB image;
establishing a first data set with a plurality of RGB images according to the first classification parameter information and the first position coordinate information;
performing boundary calibration on an object to be trained in the depth image according to the corresponding relation between the RGB image and the depth image, and setting second classification parameter information of the object to be trained and second position coordinate information of the object to be trained in the depth image;
and establishing a second data set with a plurality of depth images according to the second classification parameter information and the second position coordinate information.
In the training method for transparent object recognition according to the present invention, the step of establishing a multi-modal fused deep convolutional neural network structure N1, where N1 is used to extract a plurality of RGB images for individual training and a plurality of depth images for individual training, so as to extract first feature information of the RGB images and second feature information of the depth images respectively to obtain a network weight model M1, includes:
establishing a multimode fusion deep convolutional neural network structure N1, wherein the N1 comprises two independent convolutional neural network branches, and the two independent convolutional neural network branches are used for respectively and independently training the RGB image and the depth image; during training, the RGB images and the depth images which correspond to each other are randomly extracted from the first data set and the second data set as input each time, and the first characteristic information of the RGB images and the second characteristic information of the depth images are respectively extracted by using the convolutional neural network to obtain a network weight model M1.
In the training method for transparent object recognition according to the present invention, the RGB image and the depth image corresponding to each other are images of the same object acquired by using a color RGB camera and a depth camera, respectively.
In the training method for transparent object recognition according to the present invention, in step S2, the parameters of each layer are updated by back-propagating the error from the loss layer using the back-propagation algorithm, so that the network weight model is updated and optimized and finally converges.
A training apparatus for transparent object recognition, comprising:
a first establishing module, which is used for establishing a first data set with a plurality of RGB images and a second data set with a plurality of depth images, wherein the RGB images are respectively in one-to-one correspondence with the depth images;
the second establishing module is used for establishing a multi-mode fused depth convolution neural network structure N1, wherein the N1 is used for extracting a plurality of RGB images for independent training and extracting a plurality of depth images for independent training so as to respectively extract first characteristic information of the RGB images and second characteristic information of the depth images to obtain a network weight model M1;
the third establishing module is used for establishing a multi-modal shared deep convolutional network structure N2, inputting the first characteristic information and the second characteristic information into the N2 for fusion training, so as to output classification parameter information and position coordinate information of an object and obtain a network weight model M2;
and the optimization module is used for re-inputting other pairs of RGB images and depth images to perform parameter adjustment on the network weight model M1 and the network weight model M2 so as to obtain optimized network weight models M11 and M22.
In the training apparatus for transparent object recognition according to the present invention, the first establishing module includes:
an acquisition unit, which is used for acquiring RGB images and depth images of a plurality of objects to be trained, wherein the RGB images correspond to the depth images one to one;
the first calibration unit is used for carrying out boundary calibration on the object to be trained in the RGB image, and setting first classification parameter information of the object to be trained and first position coordinate information of the object to be trained in the RGB image;
the first establishing unit is used for establishing a first data set with a plurality of RGB images according to the first classification parameter information and the first position coordinate information;
the second calibration unit is used for performing boundary calibration on the object to be trained in the depth image according to the corresponding relation between the RGB image and the depth image, and setting second classification parameter information of the object to be trained and second position coordinate information of the object to be trained in the depth image;
and the second establishing unit is used for establishing a second data set with a plurality of depth images according to the second classification parameter information and the second position coordinate information.
In the training device for transparent object recognition according to the present invention, the RGB image and the depth image corresponding to each other are images of the same object acquired by using a color RGB camera and a depth camera, respectively.
A storage medium having stored therein a computer program which, when run on a computer, causes the computer to perform any of the methods described above.
A terminal comprising a processor and a memory, the memory having stored therein a computer program, the processor being adapted to perform the method of any preceding claim by invoking the computer program stored in the memory.
In this method, data of different modalities (RGB images and depth images) are first trained independently, so that the features of each modality are learned by its own series of neural network layers; the features of the two modalities are then learned complementarily through a fusion connection and a series of shared convolutional layers. Fusing the RGB information with the depth information in this way improves the recognition of transparent objects.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can also be derived from them without inventive effort.
Fig. 1 is a schematic flowchart of a training method for transparent object recognition according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a training device for transparent object recognition according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a terminal according to an embodiment of the present invention.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, belong to the scope of protection of the present invention.
The terms "first," "second," "third," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the objects so described are interchangeable under appropriate circumstances. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, or apparatus that includes a series of steps, an advanced driver assistance system, or a system that includes a series of modules or elements is not necessarily limited to those steps or modules or elements expressly listed, may include steps or modules or elements not expressly listed, and may include other steps or modules or elements inherent to such process, method, apparatus, advanced driver assistance system, or system.
Referring to fig. 1, fig. 1 is a flow chart of a training method for transparent object recognition. The training method for transparent object recognition comprises the following steps:
s1, establishing a first data set with a plurality of RGB images and a second data set with a plurality of depth images, wherein the RGB images are in one-to-one correspondence with the depth images.
Pictures of the objects to be trained, taken in real scenes, are used as training samples.
Specifically, the step S1 includes:
s11, collecting RGB images and depth images of a plurality of objects to be trained, wherein the RGB images correspond to the depth images one by one; s12, performing boundary calibration on the object to be trained in the RGB image, and setting first classification parameter information of the object to be trained and first position coordinate information of the object to be trained in the RGB image; s13, establishing a first data set with a plurality of RGB images according to the first classification parameter information and the first position coordinate information; s14, performing boundary calibration on the object to be trained in the depth image according to the corresponding relation between the RGB image and the depth image, and setting second classification parameter information of the object to be trained and second position coordinate information of the object to be trained in the depth image; and S15, establishing a second data set with a plurality of depth images according to the second classification parameter information and the second position coordinate information.
The RGB image and the depth image which correspond to each other are images of the same object which are acquired by a color RGB camera and a depth camera respectively.
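By way of illustration of steps S11 to S15, the following sketch shows one possible way to organize the two paired data sets in Python; the data structures and field names here are assumptions chosen for clarity and are not prescribed by the present application.

```python
# Illustrative sketch only: one possible way to organize the first (RGB) and
# second (depth) data sets so that samples stay in one-to-one correspondence.
# Field names and structure are assumptions, not taken from the patent.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Annotation:
    class_id: int                       # classification parameter information
    bbox: Tuple[int, int, int, int]     # position coordinates (x_min, y_min, x_max, y_max)

@dataclass
class Sample:
    image_path: str
    objects: List[Annotation]

def build_datasets(pairs):
    """pairs: iterable of (rgb_path, depth_path, rgb_annotations, depth_annotations),
    where the depth annotations are derived from the RGB ones via the RGB-depth
    coordinate correspondence described below."""
    first_dataset, second_dataset = [], []
    for rgb_path, depth_path, rgb_ann, depth_ann in pairs:
        first_dataset.append(Sample(rgb_path, rgb_ann))
        second_dataset.append(Sample(depth_path, depth_ann))
    return first_dataset, second_dataset
```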
In step S1, the color RGB module and the depth module are located at different positions on the sensor, so the acquired image information differs even for the same object at the same moment. Because the bounding boxes need to be unified, a matrix transformation is applied between the color RGB image and the depth image so that their coordinates correspond one to one. Two matrices are involved: a translation and a rotation.
Assume that a point in the depth image is (X, Y) and that its corresponding point in the color RGB image is (x, y). The translation gives x = X + dx and y = Y + dy, where dx and dy are the distances moved along the x and y directions, respectively. In homogeneous coordinates this is expressed as follows:

$$\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & dx \\ 0 & 1 & dy \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}$$

The translation matrix is therefore:

$$T = \begin{bmatrix} 1 & 0 & dx \\ 0 & 1 & dy \\ 0 & 0 & 1 \end{bmatrix}$$

A rotation matrix is also needed. Let the line connecting a point to the origin make an angle of b degrees with the X axis, let the rotation be counterclockwise by a degrees about the origin, and let the length of the line from the origin to the point be R, where [X, Y] is the depth image coordinate and [x, y] is the color RGB image coordinate. Then:

$$X = R\cos b, \qquad Y = R\sin b$$

$$x = R\cos(a + b) = R\cos a\cos b - R\sin a\sin b = X\cos a - Y\sin a$$

$$y = R\sin(a + b) = R\sin a\cos b + R\cos a\sin b = X\sin a + Y\cos a$$

Thus the rotation matrix can be written as:

$$R_a = \begin{bmatrix} \cos a & -\sin a \\ \sin a & \cos a \end{bmatrix}$$
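The translation-plus-rotation mapping above can be sketched numerically as follows; this is an illustrative example in homogeneous coordinates, and the offsets dx, dy and the angle a are hypothetical values that would in practice be obtained by calibrating the color RGB module against the depth module.

```python
# Illustrative sketch: map a depth-image point (X, Y) to the corresponding
# color-image point (x, y) with a translation followed by a counterclockwise
# rotation about the origin. Calibration values below are assumed, not measured.
import numpy as np

def depth_to_rgb_point(X, Y, dx, dy, a_deg):
    a = np.deg2rad(a_deg)
    T = np.array([[1.0, 0.0, dx],                 # translation matrix
                  [0.0, 1.0, dy],
                  [0.0, 0.0, 1.0]])
    R = np.array([[np.cos(a), -np.sin(a), 0.0],   # rotation matrix
                  [np.sin(a),  np.cos(a), 0.0],
                  [0.0,        0.0,       1.0]])
    x, y, _ = R @ T @ np.array([X, Y, 1.0])
    return x, y

# Example with assumed calibration: shift by (12, -3) pixels, rotate by 1.5 degrees.
print(depth_to_rgb_point(100.0, 50.0, dx=12.0, dy=-3.0, a_deg=1.5))
```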
s2, establishing a multi-mode fusion deep convolution neural network structure N1, wherein the N1 is used for extracting a plurality of RGB images for independent training and extracting a plurality of depth images for independent training so as to respectively extract first feature information of the RGB images and second feature information of the depth images to obtain a network weight model M1.
Establishing a multimode fusion deep convolutional neural network structure N1, wherein the N1 comprises two independent convolutional neural network branches, and the two independent convolutional neural network branches are used for respectively and independently training the RGB image and the depth image; during training, the RGB images and the depth images which correspond to each other are randomly extracted from the first data set and the second data set as input each time, and the first characteristic information of the RGB images and the second characteristic information of the depth images are respectively extracted by using the convolutional neural network to obtain a network weight model M1.
The parameters of each layer can be updated by back-propagating the error from the loss layer using the back-propagation algorithm, so that the network weight model is updated and optimized and finally converges.
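As an illustration of the two-branch structure of N1, a minimal PyTorch-style sketch is given below; the number of layers and the channel sizes are assumptions, since the present application deliberately leaves the exact layer composition open.

```python
# Illustrative sketch (layer sizes assumed): N1 with two independent convolutional
# branches, one for RGB images (3 channels) and one for depth images (1 channel).
import torch.nn as nn

def make_branch(in_channels: int) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    )

class N1(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_branch = make_branch(3)     # extracts the first characteristic information
        self.depth_branch = make_branch(1)   # extracts the second characteristic information

    def forward(self, rgb, depth):
        # A mutually corresponding RGB/depth pair is fed to the two branches in parallel.
        return self.rgb_branch(rgb), self.depth_branch(depth)
```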
And S3, establishing a multi-mode shared deep convolutional network structure N2, inputting the first characteristic information and the second characteristic information into the N2 for fusion training, so as to output classification parameter information and position coordinate information of the object and obtain a network weight model M2.
The N2 comprises a plurality of convolutional layers followed by a plurality of fully-connected layers, and its output comprises two kinds of parameters: the coordinate position parameters of the object and the classification parameters of the object.
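A matching sketch of N2 is given below; fusing the two feature maps by channel concatenation is an assumed design choice, and the head sizes (four bounding-box coordinates and a configurable number of classes) are illustrative.

```python
# Illustrative sketch (fusion by channel concatenation and layer sizes assumed):
# N2 applies shared convolutional layers to the fused features, then fully-connected
# layers with two output heads: position coordinates and classification parameters.
import torch
import torch.nn as nn

class N2(nn.Module):
    def __init__(self, fused_channels: int = 128, num_classes: int = 2):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(fused_channels, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(7),   # fixes the flattened size regardless of input resolution
        )
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(128 * 7 * 7, 256), nn.ReLU())
        self.bbox_head = nn.Linear(256, 4)            # position coordinate information
        self.cls_head = nn.Linear(256, num_classes)   # classification parameter information

    def forward(self, rgb_feat, depth_feat):
        fused = torch.cat([rgb_feat, depth_feat], dim=1)   # multi-modal fusion
        h = self.fc(self.shared(fused))
        return self.bbox_head(h), self.cls_head(h)
```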
And S4, inputting other multiple pairs of RGB images and depth images again to carry out parameter adjustment on the network weight model M1 and the network weight model M2 so as to obtain optimized network weight models M11 and M22.
Using the trained network weight models M1 and M2, new data are again drawn from the data sets and fed into the network, and the parameters of the whole network are fine-tuned so that the network captures the latent relation between input and output. Training the two parts of the network separately beforehand reduces the overall training time.
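Step S4 can be sketched as end-to-end fine-tuning of the two pre-trained models, as below; the optimizer, learning rate and loss weighting are assumptions, and the data loader is assumed to yield mutually corresponding RGB/depth batches together with their bounding-box and class targets.

```python
# Illustrative sketch (optimizer settings and loss weighting assumed): joint
# fine-tuning of the pre-trained N1 and N2 on newly drawn RGB/depth pairs.
# The fine-tuned weights correspond to the optimized models M11 and M22.
import torch
import torch.nn as nn

def fine_tune(n1, n2, loader, epochs=5, lr=1e-4, bbox_weight=1.0):
    optimizer = torch.optim.SGD(list(n1.parameters()) + list(n2.parameters()),
                                lr=lr, momentum=0.9)
    cls_loss_fn = nn.CrossEntropyLoss()
    bbox_loss_fn = nn.SmoothL1Loss()
    for _ in range(epochs):
        for rgb, depth, bbox_target, cls_target in loader:
            rgb_feat, depth_feat = n1(rgb, depth)
            bbox_pred, cls_pred = n2(rgb_feat, depth_feat)
            loss = cls_loss_fn(cls_pred, cls_target) + bbox_weight * bbox_loss_fn(bbox_pred, bbox_target)
            optimizer.zero_grad()
            loss.backward()    # back-propagate the loss-layer error through the whole network
            optimizer.step()
    return n1, n2
```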
In the present application, the network structures of N1 and N2 may include, but are not limited to, convolutional layers, pooling layers, nonlinear activation layers, fully-connected layers and normalization layers, in any combination; the particular combination of layers does not limit the scope of protection.
Referring to fig. 2, a training apparatus for transparent object recognition includes: a first establishing module 201, a second establishing module 202, a third establishing module 203 and an optimizing module 204.
The first establishing module 201 is configured to establish a first data set having a plurality of RGB images and a second data set having a plurality of depth images, where the plurality of RGB images are respectively in one-to-one correspondence with the plurality of depth images. The RGB images and the depth images which correspond to each other are images of the same object which are acquired by the color RGB camera and the depth camera respectively.
Wherein, the first establishing module comprises: an acquisition unit, used for acquiring RGB images and depth images of a plurality of objects to be trained, wherein the RGB images correspond to the depth images one to one; a first calibration unit, used for performing boundary calibration on the object to be trained in the RGB image and setting first classification parameter information of the object to be trained and first position coordinate information of the object to be trained in the RGB image; a first establishing unit, used for establishing a first data set with a plurality of RGB images according to the first classification parameter information and the first position coordinate information; a second calibration unit, used for performing boundary calibration on the object to be trained in the depth image according to the correspondence between the RGB image and the depth image and setting second classification parameter information of the object to be trained and second position coordinate information of the object to be trained in the depth image; and a second establishing unit, used for establishing a second data set with a plurality of depth images according to the second classification parameter information and the second position coordinate information.
The second establishing module 202 is configured to establish a multi-modal fused deep convolutional neural network structure N1, where the N1 is configured to extract a plurality of RGB images for individual training and a plurality of depth images for individual training, so as to extract first feature information of the RGB images and second feature information of the depth images respectively to obtain a network weight model M1. The second establishing module 202 establishes a multi-modal fused deep convolutional neural network structure N1, where N1 includes two independent convolutional neural network branches, and the two independent convolutional neural network branches are used to train the RGB image and the depth image separately; during training, the RGB images and the depth images which correspond to each other are randomly extracted from the first data set and the second data set as input each time, and the first characteristic information of the RGB images and the second characteristic information of the depth images are respectively extracted by using the convolutional neural network to obtain a network weight model M1.
The third establishing module 203 is configured to establish a multi-modal shared deep convolutional network structure N2, input the first feature information and the second feature information into the N2, perform fusion training, output classification parameter information and position coordinate information of an object, and obtain a network weight model M2. Wherein, the N2 comprises a plurality of convolutional neural networks, then a plurality of fully-connected networks are connected, and the output comprises two parameters, one is the coordinate position parameter of the object, and the other is the classification parameter of the object.
The optimization module 204 is configured to re-input other pairs of RGB images and depth images to perform parameter adjustment on the network weight model M1 and the network weight model M2 to obtain optimized network weight models M11 and M22.
Using the trained network weight models M1 and M2, new data are again drawn from the data sets and fed into the network, and the parameters of the whole network are fine-tuned so that the network captures the latent relation between input and output. Training the two parts of the network separately beforehand reduces the overall training time.
Finally, after obtaining the optimized network weight models M11 and M22, the transparent objects can be identified by using the network weight models M11 and M22, and the method has high accuracy and efficiency.
The present invention also provides a storage medium having stored therein a computer program which, when run on a computer, causes the computer to perform the training method for transparent object recognition described in any of the above embodiments.
Referring to fig. 3, the present invention further provides a terminal, which includes a processor 301 and a memory 302. The processor 301 is electrically connected to the memory 302.
The processor 301 is a control center of the terminal, connects various parts of the entire terminal using various interfaces and lines, and performs various functions of the terminal and processes data by running or calling a computer program stored in the memory 302 and calling data stored in the memory 302, thereby performing overall monitoring of the terminal.
In this embodiment, the processor 301 in the terminal loads instructions corresponding to one or more processes of the computer program into the memory 302 according to the following steps, and the processor 301 runs the computer program stored in the memory 302, thereby implementing various functions: establishing a first data set with a plurality of RGB images and a second data set with a plurality of depth images, wherein the RGB images are respectively in one-to-one correspondence with the depth images; establishing a multi-mode fused deep convolutional neural network structure N1, wherein the N1 is used for extracting a plurality of RGB images for individual training and extracting a plurality of depth images for individual training so as to respectively extract first characteristic information of the RGB images and second characteristic information of the depth images to obtain a network weight model M1; establishing a multi-mode shared deep convolutional network structure N2, inputting the first characteristic information and the second characteristic information into the N2 for fusion training, so as to output classification parameter information and position coordinate information of the object and obtain a network weight model M2; and re-inputting other pairs of RGB images and depth images to perform parameter adjustment on the network weight model M1 and the network weight model M2 to obtain optimized network weight models M11 and M22.
It should be noted that, a person skilled in the art can understand that all or part of the steps in the various methods of the above embodiments can be accomplished by related hardware through instructions of a program, and the program can be stored in a computer-readable storage medium, which can include but is not limited to: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, and the like.
The training method, apparatus, storage medium and terminal for transparent object recognition provided by the embodiments of the present invention are described in detail above. A specific example is used herein to explain the principle and implementation of the invention, and the description of the embodiments is only intended to help understand the method and core idea of the invention. Meanwhile, for those skilled in the art, there may be variations in the specific embodiments and the application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A training method for transparent object recognition, comprising the steps of:
s1, establishing a first data set with a plurality of RGB images and a second data set with a plurality of depth images, wherein the RGB images are respectively in one-to-one correspondence with the depth images;
s2, establishing a multi-mode fusion depth convolution neural network structure N1, wherein the N1 is used for extracting a plurality of RGB images for independent training and extracting a plurality of depth images for independent training so as to respectively extract first characteristic information of the RGB images and second characteristic information of the depth images to obtain a network weight model M1;
s3, establishing a multi-mode shared deep convolutional network structure N2, inputting the first characteristic information and the second characteristic information into the N2 for fusion training, so as to output classification parameter information and position coordinate information of the object and obtain a network weight model M2;
and S4, inputting other multiple pairs of RGB images and depth images again to carry out parameter adjustment on the network weight model M1 and the network weight model M2 so as to obtain optimized network weight models M11 and M22.
2. The method as claimed in claim 1, wherein the step of creating a first data set with a plurality of RGB images and a second data set with a plurality of depth images, the RGB images corresponding to the depth images one-to-one respectively comprises:
collecting RGB images and depth images of a plurality of objects to be trained, wherein the RGB images correspond to the depth images one by one respectively;
carrying out boundary calibration on an object to be trained in the RGB image, and setting first classification parameter information of the object to be trained and first position coordinate information of the object to be trained in the RGB image;
establishing a first data set with a plurality of RGB images according to the first classification parameter information and the first position coordinate information;
performing boundary calibration on an object to be trained in the depth image according to the corresponding relation between the RGB image and the depth image, and setting second classification parameter information of the object to be trained and second position coordinate information of the object to be trained in the depth image;
and establishing a second data set with a plurality of depth images according to the second classification parameter information and the second position coordinate information.
3. The training method for transparent object recognition according to claim 1, wherein the step of establishing a multi-modal fused deep convolutional neural network structure N1, where N1 is used for extracting a plurality of RGB images for individual training and a plurality of depth images for individual training, so as to respectively extract first feature information of the RGB images and second feature information of the depth images to obtain a network weight model M1, includes:
establishing a multi-mode fused deep convolutional neural network structure N1, wherein the N1 comprises two independent convolutional neural network branches, and the two independent convolutional neural network branches are used for respectively and independently training the RGB image and the depth image; during training, the RGB images and the depth images which correspond to each other are randomly extracted from the first data set and the second data set as input each time, and the first characteristic information of the RGB images and the second characteristic information of the depth images are respectively extracted by using the convolutional neural network to obtain a network weight model M1.
4. A training method for transparent object recognition according to claim 1, wherein the mutually corresponding RGB image and depth image are images of the same object captured by a color RGB camera and a depth camera, respectively.
5. The training method for transparent object recognition according to claim 1, wherein in step S2, the parameters of each layer are updated by back-propagating the error from the loss layer using the back-propagation algorithm, so that the network weight model is updated and optimized and finally converges.
6. A training apparatus for transparent object recognition, comprising:
a first establishing module, which is used for establishing a first data set with a plurality of RGB images and a second data set with a plurality of depth images, wherein the RGB images are respectively in one-to-one correspondence with the depth images;
the second establishing module is used for establishing a multi-mode fused depth convolution neural network structure N1, wherein the N1 is used for extracting a plurality of RGB images for independent training and extracting a plurality of depth images for independent training so as to respectively extract first characteristic information of the RGB images and second characteristic information of the depth images to obtain a network weight model M1;
the third establishing module is used for establishing a multi-modal shared deep convolutional network structure N2, inputting the first characteristic information and the second characteristic information into the N2 for fusion training, so as to output classification parameter information and position coordinate information of an object and obtain a network weight model M2;
and the optimization module is used for re-inputting other pairs of RGB images and depth images to perform parameter adjustment on the network weight model M1 and the network weight model M2 so as to obtain optimized network weight models M11 and M22.
7. The training device for transparent object recognition according to claim 6, wherein the first establishing module comprises:
an acquisition unit, which is used for acquiring RGB images and depth images of a plurality of objects to be trained, wherein the RGB images correspond to the depth images one to one;
the first calibration unit is used for calibrating the boundary of an object to be trained in the RGB image, and setting first classification parameter information of the object to be trained and first position coordinate information of the object to be trained in the RGB image;
the first establishing unit is used for establishing a first data set with a plurality of RGB images according to the first classification parameter information and the first position coordinate information;
the second calibration unit is used for performing boundary calibration on the object to be trained in the depth image according to the corresponding relation between the RGB image and the depth image, and setting second classification parameter information of the object to be trained and second position coordinate information of the object to be trained in the depth image;
and the second establishing unit is used for establishing a second data set with a plurality of depth images according to the second classification parameter information and the second position coordinate information.
8. Training device for transparent object recognition according to claim 6, wherein the mutually corresponding RGB images and depth images are images of the same object captured with a color RGB camera and a depth camera, respectively.
9. A storage medium, having stored thereon a computer program which, when run on a computer, causes the computer to carry out the method of any one of claims 1 to 5.
10. A terminal, characterized in that it comprises a processor and a memory, in which a computer program is stored, the processor being adapted to carry out the method of any one of claims 1 to 5 by calling the computer program stored in the memory.
CN201910167767.6A 2019-03-06 2019-03-06 Training method and device for transparent object recognition, storage medium and terminal Active CN109903323B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910167767.6A CN109903323B (en) 2019-03-06 2019-03-06 Training method and device for transparent object recognition, storage medium and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910167767.6A CN109903323B (en) 2019-03-06 2019-03-06 Training method and device for transparent object recognition, storage medium and terminal

Publications (2)

Publication Number Publication Date
CN109903323A CN109903323A (en) 2019-06-18
CN109903323B true CN109903323B (en) 2022-11-18

Family

ID=66946615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910167767.6A Active CN109903323B (en) 2019-03-06 2019-03-06 Training method and device for transparent object recognition, storage medium and terminal

Country Status (1)

Country Link
CN (1) CN109903323B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110458828B (en) * 2019-08-12 2023-02-10 广东工业大学 Laser welding defect identification method and device based on multi-mode fusion network
CN112082475B (en) * 2020-08-25 2022-05-24 中国科学院空天信息创新研究院 Living stumpage species identification method and volume measurement method
CN116665002B (en) * 2023-06-28 2024-02-27 北京百度网讯科技有限公司 Image processing method, training method and device for deep learning model
CN117115208A (en) * 2023-10-20 2023-11-24 城云科技(中国)有限公司 Transparent object tracking model, construction method and application thereof

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180330194A1 (en) * 2017-05-15 2018-11-15 Siemens Aktiengesellschaft Training an rgb-d classifier with only depth data and privileged information
CN108182441B (en) * 2017-12-29 2020-09-18 华中科技大学 Parallel multichannel convolutional neural network, construction method and image feature extraction method

Also Published As

Publication number Publication date
CN109903323A (en) 2019-06-18

Similar Documents

Publication Publication Date Title
CN109903323B (en) Training method and device for transparent object recognition, storage medium and terminal
CN109902548B (en) Object attribute identification method and device, computing equipment and system
CN106780631B (en) Robot closed-loop detection method based on deep learning
EP3427186A1 (en) Systems and methods for normalizing an image
CN107481292A (en) The attitude error method of estimation and device of vehicle-mounted camera
CN113920307A (en) Model training method, device, equipment, storage medium and image detection method
CN109683699A (en) The method, device and mobile terminal of augmented reality are realized based on deep learning
CN115699082A (en) Defect detection method and device, storage medium and electronic equipment
KR102158799B1 (en) Method, computer program and apparatus for recognition of building by using deep neural network model
CN111127548B (en) Grabbing position detection model training method, grabbing position detection method and grabbing position detection device
CN111259710B (en) Parking space structure detection model training method adopting parking space frame lines and end points
CN110756462B (en) Power adapter test method, device, system, control device and storage medium
CN112037142B (en) Image denoising method, device, computer and readable storage medium
Leiva et al. Collision avoidance for indoor service robots through multimodal deep reinforcement learning
CN111738403A (en) Neural network optimization method and related equipment
CN111753739A (en) Object detection method, device, equipment and storage medium
CN111950570A (en) Target image extraction method, neural network training method and device
CN113592015B (en) Method and device for positioning and training feature matching network
CN113222961B (en) Intelligent ship body detection system and method
CN112669452B (en) Object positioning method based on convolutional neural network multi-branch structure
CN112668596B (en) Three-dimensional object recognition method and device, recognition model training method and device
CN116071625B (en) Training method of deep learning model, target detection method and device
CN115219492B (en) Appearance image acquisition method and device for three-dimensional object
CN116460851A (en) Mechanical arm assembly control method for visual migration
CN113065521B (en) Object identification method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant