CN114871115A - Object sorting method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114871115A
CN114871115A (application CN202210460909.XA; granted as CN114871115B)
Authority
CN
China
Prior art keywords
image
sorting
identified
target
target candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210460909.XA
Other languages
Chinese (zh)
Other versions
CN114871115B (en)
Inventor
李澄非
徐傲
梁辉杰
邱世汉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuyi University
Original Assignee
Wuyi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuyi University filed Critical Wuyi University
Priority to CN202210460909.XA priority Critical patent/CN114871115B/en
Priority claimed from CN202210460909.XA external-priority patent/CN114871115B/en
Publication of CN114871115A publication Critical patent/CN114871115A/en
Application granted granted Critical
Publication of CN114871115B publication Critical patent/CN114871115B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B07SEPARATING SOLIDS FROM SOLIDS; SORTING
    • B07CPOSTAL SORTING; SORTING INDIVIDUAL ARTICLES, OR BULK MATERIAL FIT TO BE SORTED PIECE-MEAL, e.g. BY PICKING
    • B07C5/00Sorting according to a characteristic or feature of the articles or material being sorted, e.g. by control effected by devices which detect or measure such characteristic or feature; Sorting by manually actuated devices, e.g. switches
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B07SEPARATING SOLIDS FROM SOLIDS; SORTING
    • B07CPOSTAL SORTING; SORTING INDIVIDUAL ARTICLES, OR BULK MATERIAL FIT TO BE SORTED PIECE-MEAL, e.g. BY PICKING
    • B07C5/00Sorting according to a characteristic or feature of the articles or material being sorted, e.g. by control effected by devices which detect or measure such characteristic or feature; Sorting by manually actuated devices, e.g. switches
    • B07C5/36Sorting apparatus characterised by the means used for distribution
    • B07C5/361Processing or control devices therefor, e.g. escort memory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/32Normalisation of the pattern dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an object sorting method, device, equipment and storage medium. The method comprises: acquiring an image of the article to be identified; inputting the article image into a sorting network for article identification to obtain an identification result, wherein the sorting network performs feature extraction on the article image to obtain a feature map, predicts from the feature map the center point of a target candidate frame, the offset value of the target candidate frame and the size of the target candidate frame so as to determine a target detection frame, and obtains the identification result from the target detection frame; and sorting the objects according to the identification result. The sorting network reduces the complexity of the network structure and of the network computation, improves detection performance and detection efficiency, speeds up the algorithm, and has good robustness.

Description

Object sorting method, device, equipment and storage medium
Technical Field
The invention relates to the field of image processing, in particular to an object sorting method, device, equipment and storage medium.
Background
Object sorting is generally performed manually, which incurs high labor cost and low efficiency. There are also methods that identify images of the articles to be sorted with a neural network and control a manipulator to sort the articles according to the classification result; the efficiency of such methods is determined mainly by the performance of the neural network. Some neural networks perform poorly when the article images contain multiple highly similar articles, articles that occlude one another, articles too far from the camera, or articles rotating at high speed, which greatly reduces sorting efficiency.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems in the prior art, and provides an object sorting method, device, equipment and storage medium.
The technical scheme adopted by the invention for solving the problems is as follows:
in a first aspect of the present invention, an object sorting method includes:
acquiring an image of an article to be identified;
inputting the to-be-identified article image into a sorting network for article identification to obtain an identification result, wherein the sorting network performs feature extraction according to the to-be-identified article image to obtain a feature map, performs prediction according to the feature map, and outputs a center point of a target candidate frame, an offset value of the target candidate frame and a size of the target candidate frame through three prediction branches respectively, determines a target detection frame according to the center point of the target candidate frame, the offset value of the target candidate frame and the size of the target candidate frame, and obtains the identification result according to the target detection frame;
and sorting the objects according to the recognition result.
According to the first aspect of the present invention, before the step of inputting the image of the object to be identified to the sorting network, the method further comprises:
and adjusting the size of the image of the article to be identified through a size adjustment network.
According to the first aspect of the present invention, the resizing the image of the article to be identified through a resizing network includes:
when the width of the article image to be identified is larger than a preset width, scaling the image in the width direction so that its width equals the preset width;
when the height of the article image to be identified is larger than a preset height, scaling the image in the height direction so that its height equals the preset height;
when the width of the article image to be identified is smaller than the preset width, zero-padding the image in the width direction so that its width equals the preset width;
and when the height of the article image to be identified is smaller than the preset height, zero-padding the image in the height direction so that its height equals the preset height.
According to the first aspect of the present invention, predicting according to the feature map and outputting the center point of the target candidate frame includes:
generating a heatmap according to the feature map;
scaling the target candidate frame into the heatmap, and calculating the center coordinates of the Gaussian circle corresponding to the target candidate frame;
calculating the radius of the Gaussian circle according to the size of the target candidate frame;
calculating the Gaussian values of the Gaussian circle according to the center coordinates and the radius;
and taking the position corresponding to the maximum Gaussian value as the center point of the target candidate frame, and outputting it.
According to the first aspect of the present invention, predicting according to the feature map and outputting the offset value of the target candidate frame and the size of the target candidate frame includes:
performing maximum pooling on the Gaussian values, sorting them in descending order, and taking all Gaussian values ranked before a preset number as target Gaussian values;
and taking the pixel points corresponding to the target Gaussian values as target pixel points, and performing regression calculation on the target pixel points to obtain the offset value of the target candidate frame and the size of the target candidate frame.
According to the first aspect of the present invention, after the step of extracting features according to the image of the article to be identified to obtain the feature map, the method further includes:
and expanding the size of the feature map through a deconvolution network.
According to the first aspect of the invention, the network structure used for feature extraction from the article image to be identified is a ResNet-18 structure.
In a second aspect of the present invention, an object sorting apparatus includes:
the image acquisition unit is used for acquiring an image of an article to be identified;
the image identification unit is used for inputting the to-be-identified article image into a sorting network for article identification to obtain an identification result, wherein the sorting network performs feature extraction according to the to-be-identified article image to obtain a feature map, performs prediction according to the feature map, respectively outputs a central point of a target candidate frame, an offset value of the target candidate frame and the size of the target candidate frame through three prediction branches, determines a target detection frame according to the central point of the target candidate frame, the offset value of the target candidate frame and the size of the target candidate frame, and obtains the identification result according to the target detection frame;
and the sorting unit is used for sorting the objects according to the identification result.
In a third aspect of the present invention, an object sorting apparatus includes: memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the object sorting method according to the first aspect of the invention when executing the computer program.
In a fourth aspect of the present invention, a storage medium stores a computer program for executing the object sorting method according to the first aspect of the present invention.
The scheme has at least the following beneficial effects: identifying articles with the sorting network reduces the complexity of the network structure and of the network computation, improves detection performance and detection efficiency, speeds up the algorithm, and offers good robustness and universality, thereby improving object sorting efficiency.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The invention is further illustrated with reference to the following figures and examples.
FIG. 1 is a flow chart of a method of sorting objects in accordance with an embodiment of the present invention;
FIG. 2 is a partial block diagram of a sorting network;
fig. 3 is a structural view of an object sorting apparatus according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as the upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings, and is only for convenience of description and simplification of description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more and "a plurality of" means two or more; terms such as "greater than", "less than" and "exceeding" are understood to exclude the stated number, while "above", "below" and "within" are understood to include it. Where "first" and "second" are used to distinguish technical features, they are not to be understood as indicating or implying relative importance, the number of the features indicated, or their precedence.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
Embodiments of a first aspect of the present invention provide a method for sorting objects.
Referring to fig. 1, the object sorting method includes:
s100, acquiring an image of an article to be identified;
step S200, inputting an article image to be identified into a sorting network for article identification to obtain an identification result, wherein the sorting network performs feature extraction according to the article image to be identified to obtain a feature map, performs prediction according to the feature map, respectively outputs a center point of a target candidate frame, an offset value of the target candidate frame and the size of the target candidate frame through three prediction branches, determines a target detection frame according to the center point of the target candidate frame, the offset value of the target candidate frame and the size of the target candidate frame, and obtains the identification result according to the target detection frame;
and step S300, sorting the objects according to the recognition result.
Corresponding to step S100, the image of the article to be identified may be a still picture captured by a camera, or a frame extracted from a video captured by an image acquisition device.
After step S100, that is, before the article image to be identified is input to the sorting network for article identification, the image needs to be preprocessed, for example by denoising or resizing.
In this embodiment, the size of the image of the item to be identified is adjusted by the sizing network.
Specifically, when the width of the article image to be identified is larger than the preset width, the image is scaled in the width direction so that its width equals the preset width; when the height of the image is larger than the preset height, the image is scaled in the height direction so that its height equals the preset height; when the width of the image is smaller than the preset width, the image is zero-padded in the width direction so that its width equals the preset width; and when the height of the image is smaller than the preset height, the image is zero-padded in the height direction so that its height equals the preset height.
In this embodiment, the preset width of the image of the object to be identified is 512 pixels, and the preset height is 512 pixels.
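As an illustration only (not code from the patent), the per-axis scale-or-pad rule above can be sketched in NumPy; the function name `resize_and_pad` and the nearest-neighbour scaling are assumptions of the sketch:

```python
import numpy as np

TARGET_W, TARGET_H = 512, 512  # preset width and height from this embodiment

def resize_and_pad(img: np.ndarray) -> np.ndarray:
    """Scale a dimension down to the preset size if it is larger,
    zero-pad it up to the preset size if it is smaller."""
    h, w = img.shape[:2]
    # Scale down (nearest-neighbour) independently per axis, as described.
    if w > TARGET_W:
        cols = np.arange(TARGET_W) * w // TARGET_W
        img = img[:, cols]
        w = TARGET_W
    if h > TARGET_H:
        rows = np.arange(TARGET_H) * h // TARGET_H
        img = img[rows, :]
        h = TARGET_H
    # Zero-pad up to the preset size.
    out = np.zeros((TARGET_H, TARGET_W) + img.shape[2:], dtype=img.dtype)
    out[:h, :w] = img
    return out
```

Any interpolation method could replace the nearest-neighbour indexing; only the output size (512x512) matters for the sorting network.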
Referring to fig. 2, for step S200, an image of an article to be recognized is input to a sorting network.
Firstly, the sorting network extracts features according to the images of the articles to be identified to obtain a feature map.
Specifically, the network structure of the feature extraction network is a ResNet-18 structure.
Specifically, the ResNet-18 structure is as follows:
the first layer is a convolutional layer: the input image is convolved with a 7x7 kernel (stride 2, padding 3, 64 channels), followed by batch normalization and a ReLU activation, and finally max-pooled with a 3x3 kernel at stride 2;
the second layer is a residual block: the feature map output by the first layer is convolved with a 3x3 kernel (stride 1, padding 1, 64 channels), followed by batch normalization and ReLU, convolved again with a 3x3 kernel (stride 1, padding 1, 64 channels) with batch normalization and ReLU, and finally added to the feature map output by the first layer;
the third layer is a residual block identical in structure to the second layer, operating on the output of the second layer;
the fourth layer is a residual block: the output of the third layer is convolved with a 3x3 kernel (stride 2, padding 1, 128 channels), followed by batch normalization and ReLU, convolved again with a 3x3 kernel (stride 1, padding 1, 128 channels) with batch normalization and ReLU, and finally added to the output of the third layer after that output is downsampled by a 1x1 convolution (stride 2, 128 channels);
the fifth layer is a residual block with two 3x3 convolutions (stride 1, padding 1, 128 channels), each followed by batch normalization and ReLU, whose result is added to the output of the fourth layer;
the sixth layer is a residual block: the output of the fifth layer is convolved with a 3x3 kernel (stride 2, padding 1, 256 channels), followed by batch normalization and ReLU, convolved again with a 3x3 kernel (stride 1, padding 1, 256 channels) with batch normalization and ReLU, and finally added to the output of the fifth layer after that output is downsampled by a 1x1 convolution (stride 2, 256 channels);
the seventh layer is a residual block with two 3x3 convolutions (stride 1, padding 1, 256 channels), each followed by batch normalization and ReLU, whose result is added to the output of the sixth layer;
the eighth layer is a residual block: the output of the seventh layer is convolved with a 3x3 kernel (stride 2, padding 1, 512 channels), followed by batch normalization and ReLU, convolved again with a 3x3 kernel (stride 1, padding 1, 512 channels) with batch normalization and ReLU, and finally added to the output of the seventh layer after that output is downsampled by a 1x1 convolution (stride 2, 512 channels);
and the ninth layer is a residual block with two 3x3 convolutions (stride 1, padding 1, 512 channels), each followed by batch normalization and ReLU, whose result is added to the output of the eighth layer.
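The spatial sizes implied by this structure can be checked with the standard convolution size formula. The sketch below is for verification only; it assumes the 3x3 max pooling uses padding 1 (not stated above), and stride 1 for the second residual block of each stage, as required for the residual additions to be shape-compatible:

```python
def conv_out(n, k, s, p):
    """Spatial size after a convolution or pooling layer: floor((n + 2p - k)/s) + 1."""
    return (n + 2 * p - k) // s + 1

def resnet18_feature_size(n=512):
    """Trace the spatial size of an n x n input through the described backbone."""
    n = conv_out(n, k=7, s=2, p=3)  # first layer: 7x7 conv, stride 2
    n = conv_out(n, k=3, s=2, p=1)  # 3x3 max pooling, stride 2 (padding 1 assumed)
    # Layers 2-9: first conv of each residual block (stride 2 at stages 4, 6, 8).
    for stride in (1, 1, 2, 1, 2, 1, 2, 1):
        n = conv_out(n, k=3, s=stride, p=1)
        n = conv_out(n, k=3, s=1, p=1)  # second conv of the block, stride 1
    return n
```

For a 512x512 input this gives a 16x16 feature map (512 → 256 → 128 → 64 → 32 → 16), i.e. an overall downsampling factor of 32 before the deconvolution network.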
The network can obtain multi-scale information by cascading a plurality of funnel-shaped networks.
After the step of extracting features according to the image of the article to be identified to obtain the feature map, the method further comprises the following steps: and enlarging the size of the feature map through a deconvolution network.
The structure of the deconvolution network is as follows:
deconvolution is carried out by using convolution kernels with the size of 4x4, the step size of 2, the filling of 1 and the channel number of 256, then batch normalization and ReLU activation operation are carried out, deconvolution is carried out by using convolution kernels with the size of 4x4, the step size of 2, the filling of 1 and the channel number of 128, then batch normalization and ReLU activation operation are carried out, finally deconvolution is carried out by using convolution kernels with the size of 4x4, the step size of 2, the filling of 1 and the channel number of 64, and then batch normalization and ReLU activation operation are carried out.
There are three prediction branches, namely prediction branch A, prediction branch B and prediction branch C: branch A outputs the center point of the target candidate frame, branch B outputs the offset value of the target candidate frame, and branch C outputs the size of the target candidate frame.
Prediction branch A actually outputs a heatmap containing a number of key points, among which are the center points of the target candidate frames. Branch A has C channels, one per object class.
The offset value output by prediction branch B compensates for the pixel error introduced when points in the low-resolution heatmap are mapped back into the original image.
The size output by prediction branch C compensates for the width and height errors of the target candidate frame.
Prediction according to the feature map comprises the following steps:
generating a heatmap according to the feature map; specifically, the feature map is downsampled to generate the heatmap;
scaling the target candidate frame into the heatmap, and calculating the center coordinates of the Gaussian circle corresponding to the target candidate frame;
calculating the radius of the Gaussian circle according to the size of the target candidate frame;
calculating the Gaussian values of the Gaussian circle according to the center coordinates and the radius;
and taking the position corresponding to the maximum Gaussian value as the center point of the target candidate frame, and outputting it.
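A minimal NumPy sketch of these Gaussian-circle steps, with the radius chosen so that a box shifted by the radius still overlaps the ground truth with IoU ≥ 0.7 (the criterion used in this embodiment). The function names, the quadratic-formula form of the radius, and the sigma choice are assumptions, loosely following public CornerNet/CenterNet reference code:

```python
import math
import numpy as np

def gaussian_radius(box_h, box_w, min_iou=0.7):
    """Largest radius such that a box displaced by it keeps IoU >= min_iou
    with the ground truth; solves the three standard quadratic cases."""
    a1, b1 = 1.0, box_h + box_w
    c1 = box_w * box_h * (1 - min_iou) / (1 + min_iou)
    r1 = (b1 - math.sqrt(b1 * b1 - 4 * a1 * c1)) / (2 * a1)

    a2, b2 = 4.0, 2 * (box_h + box_w)
    c2 = (1 - min_iou) * box_w * box_h
    r2 = (b2 - math.sqrt(b2 * b2 - 4 * a2 * c2)) / (2 * a2)

    a3, b3 = 4 * min_iou, -2 * min_iou * (box_h + box_w)
    c3 = (min_iou - 1) * box_w * box_h
    r3 = (-b3 + math.sqrt(b3 * b3 - 4 * a3 * c3)) / (2 * a3)
    return min(r1, r2, r3)

def draw_gaussian(heatmap, cx, cy, radius):
    """Write a 2-D Gaussian (peak value 1 at the center point) into the heatmap."""
    sigma = (2 * radius + 1) / 6.0  # common choice; an assumption here
    h, w = heatmap.shape
    ys, xs = np.ogrid[:h, :w]
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    np.maximum(heatmap, g, out=heatmap)  # keep the max where objects overlap
    return heatmap
```

The position of the maximum Gaussian value is then the center point output by prediction branch A.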
The network structure of the prediction branch is as follows: convolution with a 3x3 kernel (stride 2, padding 1, 64 channels), followed by batch normalization and ReLU activation, and finally convolution with a 1x1 kernel (stride 2, 64 channels).
It should be noted that for points near the center of the target candidate frame, when a point lies within a certain radius of the center and the IoU between its corresponding rectangular frame and the target detection frame is greater than 0.7, the value at that point is set to the value of the Gaussian distribution, i.e. the Gaussian value, instead of 0.
The loss function for the heatmap prediction is:

L_k = -(1/N) · Σ_xyc { (1 - Ŷ_xyc)^α · log(Ŷ_xyc)                    if Y_xyc = 1
                     { (1 - Y_xyc)^β · (Ŷ_xyc)^α · log(1 - Ŷ_xyc)    otherwise

where α and β are hyper-parameters used to balance easy and hard samples, with α = 2 and β = 4; Y_xyc denotes the target value, Ŷ_xyc the predicted value, and N the number of key points (points with Gaussian values).
When Y_xyc = 1: for easily classified samples the predicted value Ŷ_xyc is close to 1, so (1 - Ŷ_xyc)^α is small and the loss is small, which reduces the weight of such samples; for hard samples the predicted value Ŷ_xyc is close to 0, so (1 - Ŷ_xyc)^α is large and the loss is large, which increases the weight of such samples.
When Y_xyc ≠ 1: to prevent the predicted value Ŷ_xyc from becoming too high and approaching 1, (Ŷ_xyc)^α acts as a penalty term in the loss function; meanwhile (1 - Y_xyc)^β becomes smaller the closer the point is to the center, and this weight reduces the penalty there.
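A NumPy sketch of this penalty-reduced heatmap focal loss with α = 2 and β = 4 as above (the function name and the eps guard against log(0) are assumptions of the sketch):

```python
import numpy as np

def heatmap_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-12):
    """Pixelwise focal loss over the heatmap, matching the case analysis:
    key-point centers (gt == 1) and Gaussian-weighted negatives elsewhere."""
    pos = (gt == 1.0)                # key-point centers
    n = max(pos.sum(), 1)            # N: number of key points
    pos_loss = ((1 - pred) ** alpha * np.log(pred + eps))[pos].sum()
    neg_loss = ((1 - gt) ** beta * pred ** alpha * np.log(1 - pred + eps))[~pos].sum()
    return -(pos_loss + neg_loss) / n
```

A confident, correct prediction yields a near-zero loss, while a hesitant prediction at a key point is penalized heavily, as described above.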
Predicting according to the feature map and outputting the offset value of the target candidate frame and the size of the target candidate frame comprises the following steps:
performing maximum pooling on the Gaussian values, sorting them in descending order, and taking all Gaussian values ranked before a preset number as target Gaussian values;
and taking the pixel points corresponding to the target Gaussian values as target pixel points, and performing regression calculation on the target pixel points to obtain the offset value of the target candidate frame and the size of the target candidate frame.
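The max-pool-then-top-K selection above can be sketched in NumPy as follows; the 3x3 window and the function name follow common CenterNet practice and are an illustration, not the patent's code:

```python
import numpy as np

def topk_peaks(heatmap, k=10):
    """3x3 max pooling keeps only local maxima (a max-pool-based NMS); the
    k largest Gaussian values are returned as (value, row, col) target points."""
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, constant_values=-np.inf)
    # Local 3x3 maximum at each pixel, built from the nine shifted views.
    windows = np.stack([padded[dy:dy + h, dx:dx + w]
                        for dy in range(3) for dx in range(3)])
    local_max = windows.max(axis=0)
    peaks = np.where(heatmap == local_max, heatmap, 0.0)
    flat = peaks.ravel()
    order = np.argsort(flat)[::-1][:k]   # sort descending, keep the top k
    return [(flat[i], *divmod(int(i), w)) for i in order if flat[i] > 0]
```

The offset and size regressions are then read out only at these target pixel points.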
The network structure of the prediction branch corresponding to the bias value of the target preselection box is as follows: convolution is carried out by using a convolution kernel with the size of 3x3, the step size of 2, the padding of 1 and the channel number of 64, then batch normalization and ReLU activation operation are carried out, and finally convolution is carried out by using a convolution kernel with the size of 1x1, the step size of 2 and the channel number of 2.
The network structure of the prediction branch corresponding to the size of the target candidate box is as follows: convolution is carried out by using a convolution kernel with the size of 3x3, the step size of 2, the padding of 1 and the channel number of 64, then batch normalization and ReLU activation operation are carried out, and finally convolution is carried out by using a convolution kernel with the size of 1x1, the step size of 2 and the channel number of 2.
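Since both prediction branches share the same convolution parameters, the spatial downsampling they perform can be checked with the standard convolution output-size formula. The 128x128 input below is a hypothetical example size, not one stated in the patent; batch normalization and ReLU do not change spatial dimensions.

```python
def conv_out(size, kernel, stride, padding):
    """Standard convolution output-size formula."""
    return (size + 2 * padding - kernel) // stride + 1

def branch_out(size):
    """Spatial size after the branch described above:
    3x3 conv (stride 2, padding 1), then 1x1 conv (stride 2)."""
    s = conv_out(size, kernel=3, stride=2, padding=1)  # BN + ReLU keep the size
    return conv_out(s, kernel=1, stride=2, padding=0)
```

For example, a hypothetical 128x128 feature map would shrink by a factor of four through this branch.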
The loss function for the bias value prediction is:
L_off = (1/N) Σ_p | Ô_p̃ - (p/R - p̃) |
wherein Ô_p̃ represents the predicted offset value, p represents the coordinates of the center point in the image, R represents the scaling factor of the thermodynamic diagram, and p̃ = ⌊p/R⌋ represents the rounded-down coordinates of the scaled center point; the entire process calculates the offset loss for the positive sample block using the L1 loss.
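The offset target is the fractional part lost when a center point is mapped onto the downscaled thermodynamic diagram, and the branch is trained to recover it with an L1 loss. The sketch below is a minimal NumPy illustration; the scaling factor R = 4 in the example is an assumed value.

```python
import numpy as np

def offset_target(p, R=4):
    """Offset lost by mapping image point p onto the R-times-smaller map."""
    p = np.asarray(p, dtype=float)
    p_tilde = np.floor(p / R)      # integer coordinates on the small map
    return p / R - p_tilde         # fractional remainder in [0, 1)

def offset_l1_loss(pred_offsets, true_offsets):
    """L1 loss averaged over the N positive (key-point) locations."""
    pred = np.asarray(pred_offsets, dtype=float)
    true = np.asarray(true_offsets, dtype=float)
    return np.abs(pred - true).sum() / len(pred)
```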
The loss function for the size prediction is:
L_size = (1/N) Σ_{k=1..N} | Ŝ_pk - s_k |
wherein N represents the number of key points, s_k represents the true size of the object, and Ŝ_pk represents the predicted size; the entire process uses the L1 loss to calculate the loss in length and width for the positive sample block.
The loss function of the entire prediction network is then: L_det = L_k + λ_size·L_size + λ_off·L_off; wherein λ_size and λ_off are both weight parameters, with λ_size = 0.1 and λ_off = 1.
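Combining the three terms with the stated weights λ_size = 0.1 and λ_off = 1 is then a one-line computation:

```python
def total_detection_loss(l_k, l_size, l_off, lam_size=0.1, lam_off=1.0):
    """L_det = L_k + lambda_size * L_size + lambda_off * L_off,
    with the weight values given in the text as defaults."""
    return l_k + lam_size * l_size + lam_off * l_off
```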
And then determining a target detection frame according to the central point of the target candidate frame, the bias value of the target candidate frame and the size of the target candidate frame, and obtaining a recognition result according to the target detection frame.
In step S300, the robot arm is controlled to sort the objects according to the recognition result.
It should be noted that training data is input into the sorting network for training, the trained network weight data is transferred to a Jetson Nano development board, and the sorting network is completed with the trained weight data, improving prediction accuracy.
The sorting network is used for identifying the articles, so that the network structure complexity and the network calculation complexity are reduced, the detection performance is improved, the operation speed of an algorithm is improved, the detection efficiency is improved, and the sorting network has good robustness and universality; and then the object sorting efficiency is improved.
Embodiments of a second aspect of the invention provide an object sorting apparatus.
Referring to fig. 3, the object sorting apparatus includes an image acquisition unit 10, an image recognition unit 20, and a sorting unit 30.
The image acquiring unit 10 is used for acquiring an image of an article to be identified; the image identification unit 20 is configured to input an image of an article to be identified to a sorting network for article identification to obtain an identification result, where the sorting network performs feature extraction according to the image of the article to be identified to obtain a feature map, performs prediction according to the feature map, and outputs a center point of a target candidate frame, an offset value of the target candidate frame, and a size of the target candidate frame through three prediction branches, respectively, determines a target detection frame according to the center point of the target candidate frame, the offset value of the target candidate frame, and the size of the target candidate frame, and obtains the identification result according to the target detection frame; the sorting unit 30 is used for sorting the objects according to the recognition result.
The image acquisition unit 10 may be an apparatus capable of image acquisition having a camera. The image recognition unit 20 may be a computer device with a sorting network. The sorting unit 30 may be a sorting robot having a robot arm.
It should be noted that the units of the object sorting device adopted in the embodiment of the second aspect of the present invention correspond one to one to the steps of the object sorting method adopted in the embodiment of the first aspect of the present invention; both have the same technical solutions, solve the same technical problems, and bring the same technical effects, and therefore a detailed description of the object sorting device is omitted.
In an embodiment of a third aspect of the invention, an object sorting apparatus is provided. The object sorting apparatus includes: memory, processor and computer program stored on the memory and executable on the processor, which when executed by the processor implements an object sorting method as described in embodiments of the first aspect of the invention.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
An embodiment of a fourth aspect of the invention provides a storage medium. The storage medium stores a computer program for executing the object sorting method according to an embodiment of the first aspect of the present invention.
One of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media and communication media. The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM or other memory technology, CD-ROM, digital versatile disks or other optical disk storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media as is well known to those skilled in the art.
The above description is only a preferred embodiment of the present invention, and the present invention is not limited to the above embodiment, and the present invention shall fall within the protection scope of the present invention as long as the technical effects of the present invention are achieved by the same means.

Claims (10)

1. A method of sorting objects, comprising:
acquiring an image of an article to be identified;
inputting the to-be-identified article image into a sorting network for article identification to obtain an identification result, wherein the sorting network performs feature extraction according to the to-be-identified article image to obtain a feature map, performs prediction according to the feature map, and outputs a center point of a target candidate frame, an offset value of the target candidate frame and a size of the target candidate frame through three prediction branches respectively, determines a target detection frame according to the center point of the target candidate frame, the offset value of the target candidate frame and the size of the target candidate frame, and obtains the identification result according to the target detection frame;
and sorting the objects according to the recognition result.
2. The method for sorting objects according to claim 1, wherein before the step of inputting the image of the object to be identified into the sorting network, the method further comprises:
and adjusting the size of the image of the article to be identified through a size adjustment network.
3. The object sorting method according to claim 2, wherein the resizing the image of the object to be identified through a resizing network comprises:
when the width of the article image to be identified is larger than a preset width, zooming the article image to be identified in the width direction to enable the width of the article image to be identified to be equal to the preset width;
when the height of the article image to be recognized is larger than a preset height, zooming the article image to be recognized in the height direction to enable the height of the article image to be recognized to be equal to the preset height;
when the width of the article image to be identified is smaller than a preset width, carrying out zero filling processing on the article image to be identified in the width direction to enable the width of the article image to be identified to be equal to the preset width;
and when the height of the object image to be recognized is smaller than the preset height, carrying out zero filling processing on the object image to be recognized in the height direction, so that the height of the object image to be recognized is equal to the preset height.
4. The method according to claim 1, wherein predicting the center point of the target frame candidate based on the feature map and outputting the center point of the target frame candidate comprises:
generating a thermodynamic diagram according to the feature map;
zooming a target candidate frame into the thermodynamic diagram, and calculating the center coordinates of the Gaussian circle corresponding to the target candidate frame;
calculating the radius of the Gaussian circle according to the size of the target candidate frame;
calculating the Gaussian value of the Gaussian circle according to the circle center coordinate and the radius;
and taking the position corresponding to the maximum value of the Gaussian value as the central point of the target candidate frame, and outputting the central point of the target candidate frame.
5. The method for sorting objects according to claim 4, wherein the predicting according to the feature map and outputting the offset value of the target pre-selection frame and the size of the target candidate frame comprises:
performing maximum pooling on the Gaussian values, then sorting the Gaussian values from large to small according to numerical values, and taking all the Gaussian values ranked before a preset numerical value as target Gaussian values;
and taking the pixel points corresponding to the target Gaussian values as target pixel points, and performing regression calculation according to the target pixel points to obtain the offset value of the target pre-selection frame and the size of the target candidate frame.
6. The method for sorting objects according to claim 1, wherein after the step of extracting features from the image of the object to be identified, the method further comprises:
and expanding the size of the feature map through a deconvolution network.
7. The method as claimed in claim 1, wherein the network structure for extracting features from the image of the object to be recognized is a Resnet-18 network structure.
8. An object sorting apparatus, comprising:
the image acquisition unit is used for acquiring an image of an article to be identified;
the image identification unit is used for inputting the to-be-identified article image into a sorting network for article identification to obtain an identification result, wherein the sorting network performs feature extraction according to the to-be-identified article image to obtain a feature map, performs prediction according to the feature map, respectively outputs a central point of a target candidate frame, an offset value of the target candidate frame and the size of the target candidate frame through three prediction branches, determines a target detection frame according to the central point of the target candidate frame, the offset value of the target candidate frame and the size of the target candidate frame, and obtains the identification result according to the target detection frame;
and the sorting unit is used for sorting the objects according to the identification result.
9. An object sorting apparatus comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements an object sorting method according to any one of claims 1 to 7.
10. A storage medium characterized in that it stores a computer program for executing the object sorting method according to any one of claims 1 to 7.
CN202210460909.XA 2022-04-28 Object sorting method, device, equipment and storage medium Active CN114871115B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210460909.XA CN114871115B (en) 2022-04-28 Object sorting method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210460909.XA CN114871115B (en) 2022-04-28 Object sorting method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114871115A true CN114871115A (en) 2022-08-09
CN114871115B CN114871115B (en) 2024-07-05

Family

ID=

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012083855A (en) * 2010-10-07 2012-04-26 Toyota Motor Corp Object recognition device and object recognition method
CN108229455A (en) * 2017-02-23 2018-06-29 北京市商汤科技开发有限公司 Object detecting method, the training method of neural network, device and electronic equipment
CN108846415A (en) * 2018-05-22 2018-11-20 长沙理工大学 The Target Identification Unit and method of industrial sorting machine people
CN110210474A (en) * 2019-04-30 2019-09-06 北京市商汤科技开发有限公司 Object detection method and device, equipment and storage medium
CN111144322A (en) * 2019-12-28 2020-05-12 广东拓斯达科技股份有限公司 Sorting method, device, equipment and storage medium
US20210241015A1 (en) * 2020-02-03 2021-08-05 Beijing Sensetime Technology Development Co., Ltd. Image processing method and apparatus, and storage medium
CN113361527A (en) * 2021-08-09 2021-09-07 浙江华睿科技股份有限公司 Multi-target object identification and positioning method and device, electronic equipment and storage medium
CN114359624A (en) * 2021-12-10 2022-04-15 五邑大学 Article sorting method and device and storage medium


Similar Documents

Publication Publication Date Title
CN109934065B (en) Method and device for gesture recognition
CN109658454B (en) Pose information determination method, related device and storage medium
CN111008961B (en) Transmission line equipment defect detection method and system, equipment and medium thereof
CN108229418B (en) Human body key point detection method and apparatus, electronic device, storage medium, and program
CN111046856B (en) Parallel pose tracking and map creating method based on dynamic and static feature extraction
CN104978738A (en) Method of detection of points of interest in digital image
CN108573471B (en) Image processing apparatus, image processing method, and recording medium
CN112364865B (en) Method for detecting small moving target in complex scene
CN110310305B (en) Target tracking method and device based on BSSD detection and Kalman filtering
CN107578424B (en) Dynamic background difference detection method, system and device based on space-time classification
CN112435223B (en) Target detection method, device and storage medium
CN115375917B (en) Target edge feature extraction method, device, terminal and storage medium
CN111915657A (en) Point cloud registration method and device, electronic equipment and storage medium
CN111062400A (en) Target matching method and device
CN111767915A (en) License plate detection method, device, equipment and storage medium
CN112102383A (en) Image registration method and device, computer equipment and storage medium
CN115063454A (en) Multi-target tracking matching method, device, terminal and storage medium
CN115345905A (en) Target object tracking method, device, terminal and storage medium
US20070223785A1 (en) Image processor and method
CN113112479A (en) Progressive target detection method and device based on key block extraction
CN111178200B (en) Method for identifying instrument panel indicator lamp and computing equipment
CN114871115B (en) Object sorting method, device, equipment and storage medium
CN114871115A (en) Object sorting method, device, equipment and storage medium
CN114898306A (en) Method and device for detecting target orientation and electronic equipment
CN113255405B (en) Parking space line identification method and system, parking space line identification equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant