CN112541508A - Fruit segmentation and recognition method and system and fruit picking robot - Google Patents
- Publication number
- CN112541508A (application number CN202011519247.6A)
- Authority
- CN
- China
- Prior art keywords
- fruit
- network
- target
- segmentation
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a fruit segmentation and recognition method and system and a fruit picking robot, belonging to the technical field of fruit picking robots. The contours of target fruits in fruit images are labeled; multi-scale features of the target fruits and features of feature-missing target fruits are extracted from the labeled fruit images; the obtained feature maps are passed to a region proposal network, and regions of interest of the same scale are obtained through non-maximum suppression; the fruit confidence, bounding-box coordinates, and segmentation mask of each region of interest are predicted through two fully connected layers and a fully convolutional network; the losses between the predicted fruit confidence, bounding-box coordinates, and segmentation mask and their labeled values are calculated, the network parameters are updated through gradient back-propagation, and iteration continues until the parameters are stable, yielding a recognition model for segmentation and recognition. The invention realizes an end-to-end detection process with high precision and strong robustness, can effectively segment fruits in orchard environments with various interferences, and lays a foundation for advancing the deployment of apple picking robots in practical applications.
Description
Technical Field
The invention relates to the technical field of fruit picking robots, in particular to a fruit segmentation and identification method and system and a fruit picking robot.
Background
The practical application of fruit picking robots is of great significance for advancing production automation and intelligent management in the fruit and vegetable industry. The vision system is the most fundamental and important link: whether target fruits can be accurately segmented in a complex orchard environment directly determines the operating quality and efficiency of the picking robot. Since intelligent picking emerged in the middle of the last century, recognition algorithms for target fruits have attracted the attention of numerous scholars at home and abroad, and a certain research base and body of achievements have accumulated in machine learning, deep learning, and related fields. However, current segmentation methods struggle with the many interferences present in natural environments, such as overlapping fruits, occlusion by branches and leaves, changes in illumination and weather, mixed noise, and backgrounds of similar color, which greatly limit the segmentation performance of each model. It is therefore necessary to further improve the detection precision and anti-interference capability of the model, and thereby the efficiency and stability of the vision system.
The main reasons for degraded detection performance are that the feature extraction capability of the model itself is insufficient, and that the various interferences in the orchard environment cause the target fruits to lose features of shape, color, texture, and the like, so the subsequent stages of the model lack the support needed to make correct judgments, leading to false and missed detections of target fruits.
Disclosure of Invention
The invention aims to provide a fruit segmentation and recognition method and system and a fruit picking robot that effectively improve the model's segmentation of clustered or overlapping target fruits, of target fruits at different scales, and of feature-missing target fruits, and that adapt to the recognition of different types of target fruits in complex orchard environments, so as to solve at least one technical problem in the background art.
In order to achieve the purpose, the invention adopts the following technical scheme:
in one aspect, the present invention provides a fruit segmentation and identification method, including:
step S110: collecting fruit images containing different interferences in an orchard environment, and labeling the outlines of target fruits in the fruit images;
step S120: extracting multi-scale features of the target fruits and features of feature-missing target fruits from the labeled fruit images;
step S130: passing the feature maps obtained in step S120 to a region proposal network, and obtaining regions of interest of the same scale through non-maximum suppression;
step S140: predicting the fruit confidence, bounding-box coordinates, and segmentation mask of the same-scale regions of interest through two fully connected layers and a fully convolutional network;
step S150: calculating the losses between the predicted fruit confidence, bounding-box coordinates, and segmentation mask and their labeled values, updating the network parameters through gradient back-propagation, and iterating until the parameters are stable to obtain a recognition model for segmentation and recognition of fruits in orchard environment images.
Preferably, the step S120 specifically includes:
step S121: inputting the labeled fruit images into a residual network in batches and performing successive downsampling through convolution operations;
step S122: introducing a feature pyramid network and integrating the feature representations of each layer of the residual network through top-down and lateral connections;
step S123: sampling the feature maps of all layers of the feature pyramid network to the same scale and integrating them, aggregating whole-image information through a Gaussian non-local attention mechanism, then resampling the aggregated feature representation back to each level and re-fusing to obtain a balanced feature pyramid, and outputting the balanced feature maps.
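As an illustration of steps S121 to S123, the pyramid-balancing idea (gather all levels to one scale, integrate, and redistribute) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the patent's implementation: the nearest-neighbour resizing, the choice of intermediate level, and the plain averaging are illustrative choices, and the Gaussian non-local attention refinement is reduced to a placeholder comment.

```python
import numpy as np

def resize_nearest(feat, out_h, out_w):
    # Nearest-neighbour resize of a (C, H, W) feature map.
    c, h, w = feat.shape
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return feat[:, rows][:, :, cols]

def balance_pyramid(levels):
    # Gather all levels at one intermediate scale, integrate them,
    # then redistribute the integrated map back to every level (residual add).
    c, h, w = levels[-2].shape
    gathered = [resize_nearest(f, h, w) for f in levels]
    fused = np.mean(gathered, axis=0)
    # (the Gaussian non-local attention refinement would operate on `fused` here)
    out = []
    for f in levels:
        _, hi, wi = f.shape
        out.append(f + resize_nearest(fused, hi, wi))  # re-fuse per level
    return out

levels = [np.random.rand(8, s, s) for s in (64, 32, 16, 8)]  # toy 4-level pyramid
balanced = balance_pyramid(levels)
```

Each output level keeps its original resolution but now carries information aggregated from the whole pyramid, which is the point of the balancing step.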
Preferably, the step S130 specifically includes:
step S131: inputting the balanced feature maps into the region proposal network, and generating predefined anchor boxes of different sizes and aspect ratios centered on the original-image coordinates corresponding to each spatial position of the feature map;
step S132: judging positive and negative samples based on the intersection-over-union (IoU) between each anchor box and all labeled boxes, generating the training targets of the region proposal network, and preliminarily predicting target fruits through classification and regression branches;
step S133: screening the generated candidate boxes through boundary elimination and non-maximum suppression, selecting the top N candidate boxes by confidence ranking, selecting positive and negative samples in a certain proportion, and inputting them into a RoI Align layer to be sampled to the same size.
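The screening in step S133 hinges on non-maximum suppression over confidence-ranked candidate boxes. Below is a minimal sketch, assuming axis-aligned (x1, y1, x2, y2) boxes and a hypothetical IoU threshold; the boundary-elimination and positive/negative sampling steps are omitted.

```python
import numpy as np

def iou(box, boxes):
    # IoU of one (x1, y1, x2, y2) box against an array of boxes.
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def nms_top_n(boxes, scores, iou_thr, top_n):
    # Rank candidates by confidence, keep the best, drop its overlaps, repeat.
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size and len(keep) < top_n:
        i = int(order[0])
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= iou_thr]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms_top_n(boxes, scores, iou_thr=0.5, top_n=2)  # near-duplicate of box 0 is suppressed
```

The surviving boxes are the ones that would then be cropped from the feature maps and resampled by the RoI Align layer.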
Preferably, the step S140 specifically includes:
step S141: inputting the same-size regions of interest into two fully connected branches, which respectively output the probability vector of each candidate box belonging to a target fruit and the corresponding box offsets;
step S142: performing target fruit mask prediction with a fully convolutional network in parallel with the two fully connected branches, producing a multi-dimensional feature representation for each candidate box and binarizing it to generate a segmentation map of background and foreground.
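Step S142's binarization can be illustrated as follows: a small mask-logit map for one candidate box is thresholded into a binary foreground mask and pasted back into a full-image background/foreground map. The box coordinates, mask resolution, and 0.5 threshold here are illustrative assumptions, not values from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binarize_mask(mask_logits, box, image_hw, thr=0.5):
    # Threshold the per-pixel foreground probability of one candidate box
    # and paste the binary mask into a full-image segmentation map.
    m = sigmoid(mask_logits) >= thr
    x1, y1, x2, y2 = box
    rows = np.arange(y2 - y1) * m.shape[0] // (y2 - y1)  # nearest-neighbour upsample
    cols = np.arange(x2 - x1) * m.shape[1] // (x2 - x1)
    seg = np.zeros(image_hw, dtype=np.uint8)
    seg[y1:y2, x1:x2] = m[rows][:, cols]
    return seg

logits = np.full((4, 4), -3.0)
logits[1:3, 1:3] = 3.0                        # high foreground logits at the centre
seg = binarize_mask(logits, (8, 8, 16, 16), (32, 32))
```

Everything outside the candidate box stays background; inside it, only pixels whose predicted foreground probability clears the threshold are marked as fruit.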
Preferably, the step S150 specifically includes:
adding the losses between the predicted fruit confidence, bounding-box coordinates, and segmentation mask and their labeled values to obtain the final loss function, performing back-propagation with stochastic gradient descent, and continuously optimizing the model parameters until they are stable and fit the training data, yielding the recognition model.
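A hedged sketch of the summed loss follows. The exact branch losses are not specified in this section, so cross-entropy for the confidence, smooth-L1 for the box offsets, and a squared-error stand-in for the mask term are assumptions; only the additive combination mirrors the text.

```python
import numpy as np

def total_loss(cls_pred, cls_gt, box_pred, box_gt, mask_pred, mask_gt):
    # Final loss = confidence loss + box loss + mask loss (step S150).
    eps = 1e-7
    l_cls = -np.mean(cls_gt * np.log(cls_pred + eps)
                     + (1 - cls_gt) * np.log(1 - cls_pred + eps))  # cross-entropy
    d = np.abs(box_pred - box_gt)
    l_box = np.mean(np.where(d < 1, 0.5 * d ** 2, d - 0.5))        # smooth L1
    l_mask = np.mean((mask_pred - mask_gt) ** 2)                   # mask term (stand-in)
    return l_cls + l_box + l_mask

# a perfect prediction costs (almost) nothing; a poor one costs more
perfect = total_loss(np.array([1.0]), np.array([1.0]),
                     np.zeros(4), np.zeros(4),
                     np.zeros((2, 2)), np.zeros((2, 2)))
worse = total_loss(np.array([0.5]), np.array([1.0]),
                   np.ones(4), np.zeros(4),
                   np.ones((2, 2)), np.zeros((2, 2)))
```

Each stochastic-gradient-descent iteration then applies the update theta ← theta − lr · ∇theta(total loss) to every network parameter until the parameters stabilize.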
In a second aspect, the present invention provides a fruit segmentation recognition system, comprising:
the fruit image acquisition module is used for acquiring fruit images containing different interferences in an orchard environment and marking the outline of a target fruit in the fruit images;
the first extraction module is used for extracting multi-scale features of the target fruits and features of feature-missing target fruits from the labeled fruit images to obtain feature maps;
the second extraction module is used, in combination with a region proposal network, for obtaining regions of interest of the same scale from the feature maps through non-maximum suppression;
the prediction module is used for predicting the fruit confidence, bounding-box coordinates, and segmentation mask of the same-scale regions of interest through two fully connected layers and a fully convolutional network;
and the recognition module is used for calculating the losses between the predicted fruit confidence, bounding-box coordinates, and segmentation mask and their labeled values, updating the network parameters through gradient back-propagation, and iterating until the parameters are stable to obtain a recognition model for segmentation and recognition of fruits in orchard environment images.
Preferably, the first extraction module includes:
the sampling unit is used for inputting the labeled fruit images into a residual network and performing successive downsampling through convolution operations;
the feature representation unit is used for introducing a feature pyramid network and integrating the feature representations of each layer of the residual network through top-down and lateral connections;
and the balancing unit is used for sampling the feature maps of all layers of the feature pyramid network to the same scale and integrating them, aggregating whole-image information through a Gaussian non-local attention mechanism, then resampling the aggregated feature representation back to each level and re-fusing to obtain a balanced feature pyramid, and outputting the balanced feature maps.
Preferably, the second extraction module includes:
the predefining unit is used for inputting the balanced feature maps into the region proposal network and generating predefined anchor boxes of different sizes and aspect ratios centered on the original-image coordinates corresponding to each spatial position of the feature map;
the preliminary prediction unit is used for judging positive and negative samples based on the intersection-over-union (IoU) between each anchor box and all labeled boxes, generating the training targets of the region proposal network, and preliminarily predicting target fruits through classification and regression branches;
and the screening unit is used for screening the generated candidate boxes through boundary elimination and non-maximum suppression, selecting the top N candidate boxes by confidence ranking, selecting positive and negative samples in a certain proportion, and passing them to a RoI Align layer to be sampled to the same size.
Preferably, the prediction module comprises:
the calculation unit is used for passing the same-size regions of interest to two fully connected branches, which respectively output the probability vector of each candidate box belonging to a target fruit and the corresponding box offsets;
and the segmentation unit is used for performing target fruit mask prediction with a fully convolutional network in parallel with the two fully connected branches, producing a multi-dimensional feature representation for each candidate box and binarizing it to generate a segmentation map of background and foreground.
In a third aspect, the present invention provides a fruit picking robot comprising a fruit segmentation identification system as described above.
The invention has the beneficial effects that it realizes an end-to-end detection process with high precision and strong robustness, exhibits good segmentation performance under the various interferences present in orchard environments, and lays a foundation for further advancing the deployment of apple picking robots in practical applications.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is an apple image collected in different time periods and different interference scenes according to an embodiment of the present invention.
Fig. 2 is a diagram illustrating an effect of different image enhancements applied to the same image according to an embodiment of the present invention.
Fig. 3 is a flowchart illustrating an embodiment of a partition architecture according to the present invention.
Fig. 4 is a flowchart illustrating a specific implementation of the feature obtaining stage according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of a gaussian non-local attention mechanism according to an embodiment of the present invention.
Fig. 6 is a diagram of a regional candidate network architecture according to an embodiment of the present invention.
Fig. 7 is an overall flowchart of the trained network in the inference phase according to the embodiment of the present invention.
FIG. 8 is a diagram illustrating the effect of segmenting a target fruit in a complex environment according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below by way of the drawings are illustrative only and are not to be construed as limiting the invention.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
For the purpose of facilitating an understanding of the present invention, the present invention will be further explained by way of specific embodiments with reference to the accompanying drawings, which are not intended to limit the present invention.
It should be understood by those skilled in the art that the drawings are merely schematic representations of embodiments and that the elements shown in the drawings are not necessarily required to practice the invention.
Example 1
Embodiment 1 of the invention provides a method for accurately locating and segmenting target fruits, suitable for the vision system of an apple picking robot, comprising the following steps:
step 1: image acquisition and labeling: under the orchard environment, images containing target fruits under different interferences are collected and labeled by using labelme image labeling software.
Step 2: image feature acquisition: to improve the model's detection of different types of target fruits during training, a backbone network architecture for feature extraction is designed. For an input image X:
step 2.1: first, after the convolution, pooling, and other downsampling operations of a residual network (ResNet), the semantic content of each spatial position on the feature map is gradually enriched; the feature representations output by each layer of residual blocks are denoted {F2, F3, F4, F5};
step 2.2: merging {F2, F3, F4, F5} through a top-down and lateral-connection framework to obtain a feature pyramid, denoted {A2, A3, A4, A5}; each layer's feature map is used for subsequent operations, improving the model's recognition of target fruits at different scales;
step 2.3: uniformly sampling {A2, A3, A4, A5} to the size of A4 and fusing them to obtain Aconcat; feeding the fused map into the Gaussian non-local attention mechanism to obtain A'concat, aggregating whole-image features while suppressing interference factors; then resampling and re-fusing to obtain the final balanced feature representations {P2, P3, P4, P5}.
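The Gaussian non-local attention of step 2.3 can be sketched with the embedded-Gaussian formulation: pairwise softmax similarity over all spatial positions, whole-map aggregation, then a residual add. The projection matrices and their sizes below are illustrative assumptions, not the patent's parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def nonlocal_block(x, w_theta, w_phi, w_g, w_z):
    # Embedded-Gaussian non-local attention over a (C, H, W) map:
    # every position aggregates whole-map information weighted by
    # softmax(theta(x) @ phi(x).T), followed by a residual add.
    c, h, w = x.shape
    flat = x.reshape(c, h * w).T              # (N, C) with N = H * W positions
    theta, phi, g = flat @ w_theta, flat @ w_phi, flat @ w_g
    attn = softmax(theta @ phi.T, axis=1)     # (N, N) pairwise similarities
    y = attn @ g                              # aggregate the whole map
    return x + (y @ w_z).T.reshape(c, h, w)   # residual connection

c, d = 8, 4
x = rng.standard_normal((c, 6, 6))
w_theta, w_phi, w_g = (rng.standard_normal((c, d)) * 0.1 for _ in range(3))
w_z = rng.standard_normal((d, c)) * 0.1
out = nonlocal_block(x, w_theta, w_phi, w_g, w_z)
```

Because every output position mixes in information from the entire map, features of partially occluded fruits can be reinforced by similar, unoccluded regions elsewhere in the image.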
Step 3: region of interest generation: each layer of the feature representations {P2, P3, P4, P5} is input into a region proposal network (RPN); predefined anchor boxes are generated on the input image and preliminarily adjusted, candidate boxes are obtained through screening, the corresponding regions of interest on the feature maps are cropped, and a RoI Align layer samples them to the same size.
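Anchor generation in step 3 can be sketched as follows; the scales, aspect ratios, and stride are illustrative assumptions, and the subsequent adjustment, screening, and RoI Align steps are omitted.

```python
import numpy as np

def make_anchors(feat_h, feat_w, stride, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    # One anchor per (position, scale, ratio), centred on the original-image
    # pixel that each feature-map cell maps back to.
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)  # area stays ~ s**2
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.asarray(anchors)

a = make_anchors(4, 4, stride=16)  # 4*4 positions x 9 shapes = 144 anchors
```

The RPN's regression branch then refines these fixed boxes toward the labeled fruit boxes before screening and RoI Align.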
Step 4: detection result prediction: two fully connected branches respectively predict the fruit confidence and bounding-box offsets, and a parallel mask branch generates the segmentation mask through a fully convolutional network (FCN).
Step 5: model training and optimization: training targets are generated from the labeling information, losses are calculated against the model's predictions, the model parameters are updated through gradient back-propagation, and optimization proceeds iteratively.
Embodiment 1 of the invention thus provides a method for accurately locating and segmenting target fruits for an apple picking robot, addressing the difficulty that the vision system of a fruit picking robot has in coping with the various interferences of natural environments. The method has high precision and strong robustness, realizes an end-to-end recognition process, and is suitable for the actual operation of an apple picking robot.
Example 2
Embodiment 2 of the present invention provides a fruit segmentation and recognition system, including:
the fruit image acquisition module is used for acquiring fruit images containing different interferences in an orchard environment and marking the outline of a target fruit in the fruit images;
the first extraction module is used for extracting multi-scale features of the target fruits and features of feature-missing target fruits from the labeled fruit images to obtain feature maps;
the second extraction module is used, in combination with a region proposal network, for obtaining regions of interest of the same scale from the feature maps through non-maximum suppression;
the prediction module is used for predicting the fruit confidence, bounding-box coordinates, and segmentation mask of the same-scale regions of interest through two fully connected layers and a fully convolutional network;
and the recognition module is used for calculating the losses between the predicted fruit confidence, bounding-box coordinates, and segmentation mask and their labeled values, updating the network parameters through gradient back-propagation, and iterating until the parameters are stable to obtain a recognition model for segmentation and recognition of fruits in orchard environment images.
The first extraction module comprises: a sampling unit for inputting the labeled fruit images into a residual network and performing successive downsampling through convolution operations; a feature representation unit for introducing a feature pyramid network and integrating the feature representations of each layer of the residual network through top-down and lateral connections; and a balancing unit for sampling the feature maps of all layers of the feature pyramid network to the same scale and integrating them, aggregating whole-image information through a Gaussian non-local attention mechanism, then resampling the aggregated feature representation back to each level and re-fusing to obtain a balanced feature pyramid, and outputting the balanced feature maps.
The second extraction module comprises: a predefining unit for inputting the balanced feature maps into the region proposal network and generating predefined anchor boxes of different sizes and aspect ratios centered on the original-image coordinates corresponding to each spatial position of the feature map; a preliminary prediction unit for judging positive and negative samples based on the intersection-over-union (IoU) between each anchor box and all labeled boxes, generating the training targets of the region proposal network, and preliminarily predicting target fruits through classification and regression branches; and a screening unit for screening the generated candidate boxes through boundary elimination and non-maximum suppression, selecting the top N candidate boxes by confidence ranking, selecting positive and negative samples in a certain proportion, and passing them to a RoI Align layer to be sampled to the same size.
The prediction module comprises: a calculation unit for passing the same-size regions of interest to two fully connected branches, which respectively output the probability vector of each candidate box belonging to a target fruit and the corresponding box offsets; and a segmentation unit for performing target fruit mask prediction with a fully convolutional network in parallel with the two fully connected branches, producing a multi-dimensional feature representation for each candidate box and binarizing it to generate a segmentation map of background and foreground.
In this embodiment 2, a fruit segmentation and recognition method is implemented based on the fruit segmentation and recognition system described above, comprising the following steps:
step S110: collecting fruit images containing different interferences in an orchard environment, and labeling the outlines of target fruits in the fruit images;
step S120: extracting the dimension of the target fruit and the characteristics of the target missing fruit in the marked fruit image;
step S130: the feature map obtained in the step S120 is transmitted to a regional candidate network, and the interested regions with the same scale are obtained through non-maximum suppression;
step S140: predicting the fruit confidence, the frame coordinates and the segmentation mask of the interested areas with the same scale through two full-connected layers and a full convolution network;
step S150: calculating the fruit confidence coefficient, the loss between the frame coordinate and the segmentation mask and the loss between the frame coordinate and the marked value, updating the network parameters through gradient feedback, continuously iterating until the parameters are stable, obtaining an identification model, and carrying out segmentation identification on fruits in the orchard environment image.
In this embodiment 2, the step S120 specifically includes:
step S121: inputting the annotated fruit images into a residual network in batches, and performing continuous downsampling through convolution operations;
step S122: introducing a feature pyramid network, and integrating the feature representations of all layers of the residual network through top-down and lateral connections;
step S123: sampling all layers of feature maps in the feature pyramid network to the same scale and integrating them, aggregating whole-image information through a Gaussian non-local attention mechanism, then resampling the aggregated feature representations and re-fusing them to obtain a balanced feature pyramid, and outputting the balanced feature maps.
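Steps S121 to S123 describe a Libra R-CNN-style balanced feature pyramid: every level is resampled to one scale, the levels are averaged, the balanced map is refined, and the result is redistributed back to each level. A minimal numpy sketch under those assumptions (function names and the nearest-neighbour resize are illustrative, and the Gaussian non-local refinement step is elided):

```python
import numpy as np

def balanced_feature_pyramid(feature_maps, target_level=1):
    """Sample all pyramid levels to one scale, average them into a balanced
    map, and redistribute it back to every level by residual addition."""
    C, tH, tW = feature_maps[target_level].shape

    def resize(x, h, w):
        # nearest-neighbour resize stands in for interpolation / max-pooling
        rows = np.arange(h) * x.shape[1] // h
        cols = np.arange(w) * x.shape[2] // w
        return x[:, rows][:, :, cols]

    # integrate: resample every level to the target scale and average
    balanced = np.mean([resize(f, tH, tW) for f in feature_maps], axis=0)
    # (the Gaussian non-local refinement of `balanced` is omitted here)
    # re-fuse: resample the balanced map back and add it to each level
    return [f + resize(balanced, f.shape[1], f.shape[2]) for f in feature_maps]
```

Each output level keeps its original resolution; only the averaged, whole-pyramid information is added back residually.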
In this embodiment 2, the step S130 specifically includes:
step S131: inputting the balanced feature maps into the region proposal network, and generating predefined anchor frames of different sizes and aspect ratios centred on the coordinate in the original image corresponding to each spatial position of the feature map;
step S132: judging positive and negative samples based on the intersection-over-union (IoU) between each anchor frame and all annotation frames, generating the training targets of the region proposal network, and preliminarily predicting the target fruit through a classification branch and a regression branch;
step S133: screening the generated candidate frames through boundary elimination and non-maximum suppression, selecting the top N candidate frames by confidence ranking, selecting positive and negative samples in a certain proportion, and inputting them into the RoI Align layer to be sampled to the same size.
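The anchor layout of step S131 can be sketched as follows: one anchor frame per scale/aspect-ratio pair, centred on each feature-map cell's corresponding image position. The stride, scale and ratio values here are illustrative assumptions, not values specified by the patent:

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride, scales=(32, 64), ratios=(0.5, 1.0, 2.0)):
    """Predefined (x1, y1, x2, y2) anchor frames of different sizes and
    aspect ratios, centred on each feature-map cell's position in the image."""
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride  # centre in image coords
            for s in scales:
                for r in ratios:
                    # area is s*s, width/height ratio is r
                    w, h = s * np.sqrt(r), s / np.sqrt(r)
                    anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(anchors)
```

For a 2 × 2 feature map this produces 2 × 2 × (2 scales × 3 ratios) = 24 anchors.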
In this embodiment 2, the step S140 specifically includes:
step S141: inputting the same-size regions of interest into two fully-connected branches, and outputting respectively the probability vector of each candidate frame belonging to the target fruit and the corresponding frame offsets;
step S142: realizing target fruit mask prediction through a full convolution network in parallel with the two fully-connected branches, segmenting out a multi-dimensional feature representation for each candidate frame, and performing binarization to generate a segmentation map of background and foreground.
In this embodiment 2, the step S150 specifically includes:
adding the losses between the fruit confidence, frame coordinates and segmentation mask and the annotated values to obtain the final loss function, performing back-propagation with the stochastic gradient descent method, continuously optimizing the model parameters until stable, and fitting the training data to obtain the recognition model.
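The optimization loop described in step S150 — summed losses minimized by stochastic gradient descent until the parameters settle — reduces to the following toy sketch; the quadratic objective and the learning rate are purely illustrative:

```python
import numpy as np

def sgd_fit(grad_fn, w0, lr=0.1, steps=200):
    """Iterate plain (stochastic) gradient descent: each step moves the
    parameters against the gradient of the summed loss until they stabilise."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        w = w - lr * grad_fn(w)  # gradient back-propagation update
    return w
```

With the gradient of the toy loss (w − 3)², the parameters converge to the minimiser 3, mirroring "continuously optimizing the model parameters until stable".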
Example 3
Embodiment 3 of the present invention provides a fruit picking robot comprising a fruit segmentation and recognition system, where the fruit segmentation and recognition system is capable of implementing a fruit segmentation and recognition method comprising the following steps:
step 1: and (5) making a data set. Collecting images containing different interferences in a complex orchard environment and marking the outlines of target fruits in the images to produce a data set for subsequent training, verification and testing of a model;
step 2: and (6) obtaining the characteristics. Inputting the picture into a combined backbone architecture of a residual error network, a feature pyramid network and a balanced feature pyramid, and fully extracting features of small-scale and target missing fruits in the picture;
and step 3: and generating a region of interest. Transmitting the characteristic diagrams of each layer in the steps to a regional candidate network, and obtaining the interested region with the same scale through operations such as non-maximum value inhibition;
and 4, step 4: and (6) result prediction. Embedding three parallel branches, and predicting fruit confidence, frame coordinates and a segmentation mask through two full-connected layers and a full convolution network respectively;
and 5: and (6) optimizing the model. And calculating the loss between the prediction information and the labeled value, updating network parameters through gradient return, and continuously performing iterative training and evaluation to finally enable the model to tend to be stable.
The feature extraction method in the step 2 comprises the following steps:
step 2.1: inputting the images into the residual network in batches, continuously downsampling them through convolution and similar operations, and gradually enriching the semantic capacity of the feature maps;
step 2.2: introducing a feature pyramid network, which efficiently integrates the feature representations of all layers of the residual network through top-down and lateral connections, improving the detection effect of the model on target fruits of different scales, especially small scales;
step 2.3: sampling all layers of feature maps in the pyramid to the same scale and integrating them, aggregating whole-image information through a Gaussian non-local attention mechanism to suppress interferences such as same-colour-system backgrounds, then resampling the refined feature representations and re-fusing them to obtain a balanced feature pyramid;
the region of interest generation method in step 3 comprises the following steps:
step 3.1: inputting each layer of the balanced feature maps into the region proposal network, and generating predefined anchor frames of different sizes and aspect ratios centred on the coordinate in the original image corresponding to each spatial position of the feature map;
step 3.2: judging positive and negative samples based on the intersection-over-union (IoU) between each anchor frame and all annotation frames, generating the training targets of the region proposal network, and preliminarily predicting the target fruit through the classification and regression branches;
step 3.3: screening the generated candidate frames through boundary elimination and non-maximum suppression, selecting the top N candidate frames by confidence ranking, selecting positive and negative samples in a certain proportion, and inputting them into the RoI Align layer to be sampled to the same size, in preparation for subsequent fruit recognition.
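The intersection-over-union criterion used in step 3.2 is the standard box-overlap ratio; a minimal implementation for two corner-coordinate frames:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) frames, used above
    to judge whether an anchor is a positive or a negative sample."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)  # intersection over union
```

For example, two 2 × 2 boxes overlapping in a unit square give IoU = 1/7.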
The target fruit prediction method in the step 4 comprises the following steps:
step 4.1: inputting the same-size regions of interest into two fully-connected branches, and outputting respectively the probability vector of each candidate box belonging to the target fruit and the corresponding frame offsets, so as to regress the target detection box more accurately;
step 4.2: in parallel with the two branches, a full convolution network realizes the task of target fruit mask prediction, segmenting out a K·m²-dimensional feature representation for each candidate box (K classes, each with an m × m mask), and at test time performing binarization with 0.5 as the threshold to generate a segmentation map of background and foreground.
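Step 4.2's test-time binarization — per-pixel sigmoid probabilities thresholded at 0.5 — can be sketched in numpy (the example logits are illustrative):

```python
import numpy as np

def binarize_mask(mask_logits, threshold=0.5):
    """Turn one candidate box's mask logits into a binary
    foreground (1) / background (0) segmentation map."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(mask_logits)))  # sigmoid per pixel
    return (probs >= threshold).astype(np.uint8)
```

A positive logit maps to a probability above 0.5 and therefore to foreground.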
The model optimization method in step 5 comprises the following steps: adding the five losses generated in the above process to obtain the final loss function, wherein the classification losses are calculated with the cross-entropy loss function, the regression losses are calculated with the smooth L1 loss, and for the loss of the mask branch, a sigmoid is applied at each spatial position and then the mean of the cross-entropies of all pixels on the RoI is taken. Finally, back-propagation is performed with the stochastic gradient descent method, the model parameters are continuously optimized, and the training data are fitted to obtain the recognition model.
In this embodiment 3, the fruit picking robot is used to pick fruits, and it is necessary to segment and identify the fruit targets in a complicated orchard environment. In the orchard environment, images of target fruits under different interferences are collected, including different time periods such as early morning, noon and night, as well as overlapping, occlusion, direct light, backlight, rain and the like, as shown in fig. 1(a) and 1(b). The acquired images are subjected to image enhancement processing such as fogging, brightness enhancement, contrast reduction, Gaussian noise, impulse noise and Poisson noise, as shown in fig. 2. The target fruits in the images are annotated, and a data set is produced for subsequent operations.
The overall architecture of the recognition model is shown in fig. 3 and can be divided into three stages, namely feature acquisition, RoIs generation and result prediction.
1. A characteristic acquisition stage:
as shown in fig. 4, for an input picture X, the specific implementation flow at this stage sequentially passes through three modules, namely ResNet, FPN and BFP, and the image features are extracted in depth by means of a combined backbone architecture of the three modules. Wherein:
ResNet extracts features:
the depth of a Convolutional Neural Network (CNN) is important to the performance influence of a model, the training of a deep network is usually accompanied with the problem of gradient disappearance or explosion, ResNet well solves the contradiction through the design of a residual block, the semantic capacity of a deep characteristic diagram is gradually enriched through the technologies of convolution, pooling, residual learning and the like, and the characteristic representation output by each layer of residual block is respectively marked as { F }2,F3,F4,F5};
FPN fusion characteristics:
since ResNet is subject to constant downsampling operations, albeit F5The method contains rich semantic features, but is not suitable for detecting small-scale target fruits, and has poor detection effect on long-range images. Thus, will { F2,F3,F4,F5Fusing the top-down and transverse connection frameworks to obtain a characteristic pyramid, and marking characteristic graphs of each layer as { A }2,A3,A4,A5And respectively using the characteristic diagram of each layer for subsequent operation, and improving the recognition effect of the model on target fruits with different scales.
BFP refining characteristics:
{A2, A3, A4, A5} are uniformly resampled by interpolation or pooling to the same resolution as A4 and fused to obtain the fused feature map A_concat ∈ R^(C×H×W), as shown in formula (1):

$$A_{concat} = \frac{1}{L}\sum_{l=l_{min}}^{l_{max}} A_l \tag{1}$$

wherein A_concat is the fused feature map, R denotes the real numbers, and C, H, W are the dimensions of A_concat, representing the number of channels and the spatial height and width respectively. L is the number of feature maps, i.e. the 4 layers {A2, A3, A4, A5}, with l ∈ {2, 3, 4, 5}, l_max = 5 and l_min = 2.
In the above formula, L is the number of feature maps in the feature pyramid, and l_min and l_max are the indices of the lowest-level and highest-level feature maps respectively. After A_concat is obtained, the feature map is input into the Gaussian non-local attention mechanism, whose schematic diagram is shown in fig. 5. For the finally output feature map E, each spatial position E_i can be expressed as:

$$E_i = \frac{1}{c(x)}\sum_{j} f(x_i, x_j)\, g(x_j) \tag{2}$$

wherein j indexes the feature points whose degree of association with the current spatial position i needs to be calculated, and f computes the pairwise similarity between two spatial positions using the embedded Gaussian form, as shown in formula (3):

$$f(x_i, x_j) = e^{\theta(x_i)^{T}\phi(x_j)} \tag{3}$$

θ, φ and g are three embedding spaces, each realized by a 1 × 1 convolution, and the result is finally normalized by c(x).
As shown in fig. 5, three new feature representations B, C, D ∈ R^(C×H×W) are obtained through three 1 × 1 convolutions and reshaped to R^(C×N), where N = H × W. Multiplying the transpose of B by C yields the association matrix S ∈ R^(N×N).
Each element S_ij of the matrix S represents the degree of association between the ith and the jth spatial positions, as shown in formula (4):

$$S_{ij} = \frac{e^{B_i^{T} C_j}}{\sum_{j=1}^{N} e^{B_i^{T} C_j}} \tag{4}$$
after obtaining the similarity matrix S, the similarity matrix S and D are subjected to matrix multiplication and dimension conversion to RC×H×WAnd then the obtained characteristic expression E belongs to R by carrying out pixel-level addition on the obtained characteristic expression E and the obtained characteristic expression AC×H×WAs shown in formula (5):
the above formula can conclude that E fuses context information of the whole graph, and can fuse similar characteristic information and inhibit interference factors through the similarity matrix S, so that the characteristic-missing fruits can also be subjected toAnd a good detection effect is achieved. Then re-sample E and match { A }2,A3,A4,A5The final characteristic pyramid (P) is obtained by fusion2,P3,P4,P5}
2. And a RoIs generation stage:
for Pl∈{P2,P3,P4,P5And the pre-defined anchor frames with different sizes and aspect ratios are generated by respectively inputting the pre-defined anchor frames into RPN corresponding to the original image receptive field, which is realized by connecting 13 × 3 convolution with 21 × 1 convolutions, and taking each space bit center on Pl as the center, and the pre/background and frame offsets are preliminarily predicted through 21 × 1 convolutions, as shown in FIG. 6. And carrying out operations such as frame elimination and non-maximum suppression on the candidate frame obtained by the RPN, reserving the candidate frame of the positive and negative samples according to a certain sampling proportion, extracting the characteristics of the region of interest from the specified characteristic level k according to the formula (6), and inputting the characteristics into the RoI Align layer to sample to a fixed size.
3. Result prediction phase
The fixed-size RoIs generated in the previous stage are input into two fully-connected branches, which generate the class confidence and the frame offsets of the fruit respectively, and a parallel FCN generates the segmentation mask of the fruit in each prediction frame.
4. Model training and optimization
In embodiment 3 of the present invention, the idea of model optimization is to calculate the loss between the predicted values generated by the model and the training targets generated from the annotation information, update the parameters with the gradient back-propagation algorithm, and repeat iterative training and evaluation on the verification set to obtain the optimal model, which the network uses to infer and segment the target fruits in an image; the flow chart of the network in the inference stage is shown in fig. 7.
The loss of the whole model mainly consists of the multi-task losses produced by the two multi-task branches, generated by the RPN and the result prediction stage respectively, as shown in formula (7):

$$L_{final} = L_{cls1} + L_{reg1} + L_{cls2} + L_{reg2} + L_{mask} \tag{7}$$

L_final is the total loss for the final gradient back-propagation, where L_cls1 and L_reg1 are computed from the predicted values and training targets of the two 1 × 1 convolutional layers of the RPN, and L_cls2, L_reg2 and L_mask are produced by the result prediction stage. Of the 5 loss functions in formula (7), L_cls1, L_cls2 and L_mask adopt the cross-entropy loss function, while L_reg1 and L_reg2 adopt the smooth L1 loss function.
When training the RPN, the Intersection over Union (IoU) between an anchor frame and a ground-truth frame is taken as the criterion for judging positive and negative samples: an anchor with IoU greater than 0.7 is judged a positive sample; less than 0.3 is regarded as a negative sample; between 0.3 and 0.7, it does not participate in training. After the positive and negative samples are determined, random sampling is carried out at a 1:1 ratio of positive to negative samples, and the training targets of the RPN are generated from the corresponding ground-truth frame information. When training the three task branches of the result prediction stage, an IoU threshold is first used as the criterion for judging positive and negative samples, sampling is then carried out in a certain proportion (1:3), and finally the corresponding training targets are generated and the loss between them and the predicted values is calculated to optimize the model. The segmentation effect of the model is shown in fig. 8.
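The 0.7/0.3 thresholds and the 1:1 random sampling described above can be sketched as follows (the per-side cap and the toy IoU matrix are illustrative assumptions):

```python
import numpy as np

def assign_and_sample(ious, pos_thr=0.7, neg_thr=0.3, cap=128, seed=0):
    """Label each anchor by its best IoU with any ground-truth frame
    (> 0.7 positive, < 0.3 negative, otherwise ignored), then randomly
    sample positives and negatives at a 1:1 ratio for RPN training."""
    rng = np.random.default_rng(seed)
    best = np.asarray(ious).max(axis=1)          # best overlap per anchor
    pos = np.flatnonzero(best > pos_thr)
    neg = np.flatnonzero(best < neg_thr)
    n = min(cap, len(pos), len(neg))             # enforce the 1:1 ratio
    return rng.choice(pos, n, replace=False), rng.choice(neg, n, replace=False)
```

Anchors whose best IoU falls between the two thresholds appear in neither set, so they are excluded from the loss, as described above.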
In summary, according to the fruit segmentation and recognition method for the fruit picking robot in the embodiments of the present invention, RGB images containing different interference types are collected in the orchard environment and the target fruits therein are annotated; the images are input into the residual network ResNet, feature representations of different levels are fused by the feature pyramid, sampled to the same size and combined, conveyed to the Gaussian non-local attention mechanism for interference suppression, and resampled and fused to obtain the balanced feature pyramid; each layer of feature map in the pyramid is input separately into the region proposal module, predefined anchor frames are generated on the input picture and preliminarily adjusted, candidate frames are screened out through non-maximum suppression, features are extracted on the corresponding feature maps and downsampled to the same size through the RoI Align layer to obtain the regions of interest; finally, the class confidence and frame coordinates are generated through two fully-connected layers, and the segmentation mask of the fruit is generated with a full convolution network.
Mask R-CNN, currently the most mainstream instance segmentation algorithm, extracts image features by means of deep neural networks and fits nonlinear data, which can effectively improve the segmentation effect of the model on clustered or overlapping target fruits; the feature pyramid network (FPN) constructs feature representations of different sizes with high-level semantic information through top-down and lateral connections, which can effectively improve the segmentation effect of the model on target fruits of different scales, especially small scales; the Gaussian non-local attention mechanism can be embedded into the network with a very small number of parameters, aggregating similar feature information across the whole image and suppressing interference factors, which can effectively improve the segmentation effect of the model on feature-missing target fruits. The embodiments of the present invention integrate these technical designs and are applicable to the recognition of different types of target fruits in complex orchard environments.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present disclosure and is not intended to limit the present disclosure, and various modifications and changes may be made to the present disclosure by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
Although the present disclosure has been described with reference to the specific embodiments shown in the drawings, it is not intended to limit the scope of the present disclosure, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive faculty based on the technical solutions disclosed in the present disclosure.
Claims (10)
1. A fruit segmentation and recognition method, characterized by comprising the following steps:
step S110: collecting fruit images containing different interferences in an orchard environment, and annotating the outlines of the target fruits in the fruit images;
step S120: extracting the features of target fruits of different scale sizes and of target-missing fruits in the annotated fruit images;
step S130: transmitting the feature map obtained in step S120 to a region proposal network, and obtaining regions of interest of the same scale through non-maximum suppression;
step S140: predicting the fruit confidence, frame coordinates and segmentation mask of the same-scale regions of interest through two fully-connected layers and a full convolution network;
step S150: calculating the losses between the predicted fruit confidence, frame coordinates and segmentation mask and the annotated values, updating the network parameters through gradient back-propagation, and iterating continuously until the parameters are stable to obtain a recognition model for segmentation and recognition of fruits in orchard-environment images.
2. The fruit segmentation and recognition method according to claim 1, wherein step S120 specifically comprises:
step S121: inputting the annotated fruit images into a residual network in batches, and performing continuous downsampling through convolution operations;
step S122: introducing a feature pyramid network, and integrating the feature representations of all layers of the residual network through top-down and lateral connections;
step S123: sampling all layers of feature maps in the feature pyramid network to the same scale and integrating them, aggregating whole-image information through a Gaussian non-local attention mechanism, then resampling the aggregated feature representations and re-fusing them to obtain a balanced feature pyramid, and outputting the balanced feature maps.
3. The fruit segmentation and recognition method according to claim 2, wherein step S130 specifically comprises:
step S131: inputting the balanced feature maps into a region proposal network, and generating predefined anchor frames of different sizes and aspect ratios centred on the coordinate in the original image corresponding to each spatial position of the feature map;
step S132: judging positive and negative samples based on the intersection-over-union (IoU) between each anchor frame and all annotation frames, generating the training targets of the region proposal network, and preliminarily predicting the target fruit through a classification branch and a regression branch;
step S133: screening the generated candidate frames through boundary elimination and non-maximum suppression, selecting the top N candidate frames by confidence ranking, selecting positive and negative samples in a certain proportion, and inputting them into the RoI Align layer to be sampled to the same size.
4. The fruit segmentation and recognition method according to claim 3, wherein step S140 specifically comprises:
step S141: inputting the same-size regions of interest into two fully-connected branches, and outputting respectively the probability vector of each candidate frame belonging to the target fruit and the corresponding frame offsets;
step S142: realizing target fruit mask prediction through a full convolution network in parallel with the two fully-connected branches, segmenting out a multi-dimensional feature representation for each candidate frame, and performing binarization to generate a segmentation map of background and foreground.
5. The fruit segmentation and recognition method according to claim 4, wherein step S150 specifically comprises:
adding the losses between the fruit confidence, frame coordinates and segmentation mask and the annotated values to obtain the final loss function, performing back-propagation with the stochastic gradient descent method, continuously optimizing the model parameters until stable, and fitting the training data to obtain the recognition model.
6. A fruit segmentation and recognition system, characterized by comprising:
a fruit image acquisition module, used for collecting fruit images containing different interferences in an orchard environment and annotating the outlines of the target fruits in the fruit images;
a first extraction module, used for extracting the features of target fruits of different scale sizes and of target-missing fruits in the annotated fruit images to obtain feature maps;
a second extraction module, used for obtaining, in combination with a region proposal network, regions of interest of the same scale in the feature maps through non-maximum suppression;
a prediction module, used for predicting the fruit confidence, frame coordinates and segmentation mask of the same-scale regions of interest through two fully-connected layers and a full convolution network;
and a recognition module, used for calculating the losses between the predicted fruit confidence, frame coordinates and segmentation mask and the annotated values, updating the network parameters through gradient back-propagation, and iterating continuously until the parameters are stable to obtain a recognition model for segmentation and recognition of fruits in orchard-environment images.
7. The fruit segmentation and recognition system according to claim 6, characterized in that the first extraction module comprises:
a sampling unit, used for inputting the annotated fruit images into a residual network and performing continuous downsampling through convolution operations;
a feature representation unit, used for introducing a feature pyramid network and integrating the feature representations of all layers of the residual network through top-down and lateral connections;
and a balancing unit, used for sampling all layers of feature maps in the feature pyramid network to the same scale and integrating them, aggregating whole-image information through a Gaussian non-local attention mechanism, then resampling the aggregated feature representations and re-fusing them to obtain a balanced feature pyramid, and outputting the balanced feature maps.
8. The fruit segmentation and recognition system according to claim 7, wherein the second extraction module comprises:
a predefined unit, used for combining the balanced feature maps with the region proposal network, taking the coordinate in the original image corresponding to each spatial position of the feature map as the centre, and generating predefined anchor frames of different sizes and aspect ratios;
a preliminary prediction unit, used for judging positive and negative samples based on the intersection-over-union (IoU) between each anchor frame and all annotation frames, generating the training targets of the region proposal network, and preliminarily predicting the target fruit through a classification branch and a regression branch;
and a screening unit, used for screening the generated candidate frames through boundary elimination and non-maximum suppression, then selecting the top N candidate frames by confidence ranking, selecting positive and negative samples in a certain proportion, and sampling them to the same size in combination with the RoI Align layer.
9. The fruit segmentation and recognition system according to claim 8, wherein the prediction module comprises:
a calculation unit, used for combining the same-size regions of interest with two fully-connected branches and outputting respectively the probability vector of each candidate frame belonging to the target fruit and the corresponding frame offsets;
and a segmentation unit, used for realizing target fruit mask prediction through a full convolution network in parallel with the two fully-connected branches, segmenting out a multi-dimensional feature representation for each candidate frame, and performing binarization to generate a segmentation map of background and foreground.
10. A fruit picking robot, characterized by comprising the fruit segmentation and recognition system according to any one of claims 6 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011519247.6A CN112541508A (en) | 2020-12-21 | 2020-12-21 | Fruit segmentation and recognition method and system and fruit picking robot |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112541508A true CN112541508A (en) | 2021-03-23 |
Family
ID=75019354
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011519247.6A Pending CN112541508A (en) | 2020-12-21 | 2020-12-21 | Fruit segmentation and recognition method and system and fruit picking robot |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112541508A (en) |
2020-12-21: Application CN202011519247.6A filed in China (CN); patent status: Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109584248A (en) * | 2018-11-20 | 2019-04-05 | 西安电子科技大学 | Infrared surface target instance segmentation method based on feature fusion and densely connected networks |
CN110619632A (en) * | 2019-09-18 | 2019-12-27 | 华南农业大学 | Mango instance adversarial segmentation method based on Mask R-CNN |
CN111667461A (en) * | 2020-05-06 | 2020-09-15 | 青岛科技大学 | Method for detecting abnormal targets on power transmission lines |
CN112017154A (en) * | 2020-07-08 | 2020-12-01 | 林智聪 | Radiographic defect detection method based on Mask R-CNN model |
Non-Patent Citations (2)
Title |
---|
Jiangmiao Pang et al.: "Libra R-CNN: Towards Balanced Learning for Object Detection", 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) * |
Yang Yu et al.: "Fruit detection for strawberry harvesting robot in non-structural environment based on Mask-RCNN", Computers and Electronics in Agriculture * |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113343749A (en) * | 2021-04-13 | 2021-09-03 | 山东师范大学 | Fruit identification method and system based on D2Det model |
CN113344845A (en) * | 2021-04-14 | 2021-09-03 | 山东师范大学 | Target fruit segmentation method and system based on anchor point set |
CN113343751A (en) * | 2021-04-15 | 2021-09-03 | 山东师范大学 | Small target fruit detection method and system |
CN113343750A (en) * | 2021-04-15 | 2021-09-03 | 山东师范大学 | Same-color target fruit detection method and system |
CN113283320A (en) * | 2021-05-13 | 2021-08-20 | 桂林安维科技有限公司 | Pedestrian re-identification method based on channel feature aggregation |
CN113099847B (en) * | 2021-05-25 | 2022-03-08 | 广东技术师范大学 | Fruit picking method based on fruit three-dimensional parameter prediction model |
CN113099847A (en) * | 2021-05-25 | 2021-07-13 | 广东技术师范大学 | Fruit picking method based on fruit three-dimensional parameter prediction model |
CN113378813A (en) * | 2021-05-28 | 2021-09-10 | 陕西大智慧医疗科技股份有限公司 | Modeling and target detection method and device based on attention balance feature pyramid |
CN113205085A (en) * | 2021-07-05 | 2021-08-03 | 武汉华信数据系统有限公司 | Image identification method and device |
CN113592906A (en) * | 2021-07-12 | 2021-11-02 | 华中科技大学 | Long video target tracking method and system based on annotation frame feature fusion |
CN113592906B (en) * | 2021-07-12 | 2024-02-13 | 华中科技大学 | Long video target tracking method and system based on annotation frame feature fusion |
CN114004859A (en) * | 2021-11-26 | 2022-02-01 | 山东大学 | Method and system for segmenting echocardiography left atrium map based on multi-view fusion network |
CN114004859B (en) * | 2021-11-26 | 2024-08-02 | 山东大学 | Method and system for segmenting echocardiographic left atrial map based on multi-view fusion network |
CN114511850A (en) * | 2021-12-30 | 2022-05-17 | 广西慧云信息技术有限公司 | Method for identifying image of fruit size and granule of sunshine rose grape |
CN114511850B (en) * | 2021-12-30 | 2024-05-14 | 广西慧云信息技术有限公司 | Method for identifying size particle image of sunlight rose grape fruit |
CN114902872A (en) * | 2022-04-26 | 2022-08-16 | 华南理工大学 | Visual guidance method for picking fruits by robot |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112541508A (en) | Fruit segmentation and recognition method and system and fruit picking robot | |
CN113449680B (en) | Knowledge distillation-based multimode small target detection method | |
CN110458844B (en) | Semantic segmentation method for low-illumination scene | |
CN110929610B (en) | Plant disease identification method and system based on CNN model and transfer learning | |
CN114092832B (en) | High-resolution remote sensing image classification method based on parallel hybrid convolutional network | |
Chen et al. | An improved Yolov3 based on dual path network for cherry tomatoes detection | |
Zuo et al. | HF-FCN: Hierarchically fused fully convolutional network for robust building extraction | |
CN103914705B (en) | Hyperspectral image classification and wave band selection method based on multi-target immune cloning | |
CN112950780B (en) | Intelligent network map generation method and system based on remote sensing image | |
CN112749675A (en) | Potato disease identification method based on convolutional neural network | |
CN111695640A (en) | Foundation cloud picture recognition model training method and foundation cloud picture recognition method | |
CN115393631A (en) | Hyperspectral image classification method based on Bayesian layer graph convolution neural network | |
CN117611932A (en) | Image classification method and system based on double pseudo tag refinement and sample re-weighting | |
CN116563683A (en) | Remote sensing image scene classification method based on convolutional neural network and multi-layer perceptron | |
CN113657196B (en) | SAR image target detection method, SAR image target detection device, electronic equipment and storage medium | |
CN118279320A (en) | Target instance segmentation model building method based on automatic prompt learning and application thereof | |
Kong et al. | Detection model based on improved faster-RCNN in apple orchard environment | |
CN117994240A (en) | Multi-scale two-level optical remote sensing image stripe noise intelligent detection method and device | |
CN117830711A (en) | Automatic image content auditing method based on deep learning | |
CN113536944A (en) | Distribution line inspection data identification and analysis method based on image identification | |
CN112818818A (en) | Novel ultra-high-definition remote sensing image change detection method based on AFFPN | |
CN116152699B (en) | Real-time moving target detection method for hydropower plant video monitoring system | |
Wang et al. | Strawberry ripeness classification method in facility environment based on red color ratio of fruit rind | |
CN111882545A (en) | Fabric defect detection method based on bidirectional information transmission and feature fusion | |
Jing et al. | Time series land cover classification based on semi-supervised convolutional long short-term memory neural networks |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2021-03-23