CN115496979A - Orchard young fruit growth posture visual identification method based on multiple feature fusion - Google Patents

Orchard young fruit growth posture visual identification method based on multiple feature fusion

Info

Publication number
CN115496979A
Authority
CN
China
Prior art keywords
feature
fusion
posture
growth
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211120206.9A
Other languages
Chinese (zh)
Inventor
吕继东
牛亮亮
徐黎明
邹凌
韩颖
戎海龙
许浩
卢文斌
孙晓琴
王凌云
杨洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou University
Original Assignee
Changzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou University filed Critical Changzhou University
Priority to CN202211120206.9A priority Critical patent/CN115496979A/en
Publication of CN115496979A publication Critical patent/CN115496979A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/70 Labelling scene content, e.g. deriving syntactic or semantic representations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection


Abstract

The invention relates to the technical field of image detection, and in particular to a visual identification method for the growth posture of young fruits in an orchard based on multi-feature fusion. The method comprises: collecting data images of orchard young fruits and adjusting the target detection frames in the images; converting the format of the labeled data set and cropping the converted data set as preprocessing; constructing a young fruit growth posture feature extraction model, and deeply fusing the shallow and high-level feature maps of the model with a Bi-FPN network; performing posture frame regression on the fused feature maps with a posture prediction layer and extracting the target region; and training the model on a training data set, storing the posture frame coordinates, and calculating the growth posture angle of the young fruit. The invention provides an effective solution for mechanized, automated and intelligent bagging, ensures timely and efficient bagging of young fruits, and reduces the cost of the bagging operation.

Description

Orchard young fruit growth posture visual identification method based on multiple feature fusion
Technical Field
The invention relates to the technical field of image processing, and in particular to a visual identification method for the growth posture of young fruits in an orchard based on multi-feature fusion.
Background
Bagging is an important technique for producing green, high-quality, premium fruits and vegetables: it effectively reduces bird and insect damage, prevents pesticide pollution, sunburn, wind and rain damage and scratching deformation, and improves the color and luster of the produce. In the planting and production of high-quality fruit, bagging is likewise an indispensable step. However, like picking ripe fruit, bagging young fruit is highly time-sensitive and involves an enormous workload; at present it is performed mainly by hand, or by hand assisted by simple machinery, which is time-consuming, physically demanding and yields uneven bagging quality. Moreover, the agricultural labor force is aging and shrinking, and the cost of manual bagging rises year by year, which raises production costs and weakens market competitiveness.
To bag a young fruit, the bag must be slipped on from the bottom of the fruit upwards, so information on the growth posture is essential. Furthermore, young fruits are small compared with ripe fruits, and their color is close to that of the surrounding branches and leaves, which makes identifying their growth posture particularly difficult.
Disclosure of Invention
To address the shortcomings of existing algorithms, the invention provides an effective solution for mechanized, automated and intelligent bagging that ensures timely and efficient bagging of young fruits and reduces the cost of the bagging operation.
The technical scheme adopted by the invention is as follows: a visual identification method for orchard young fruit growth postures based on multi-feature fusion comprises the following steps:
acquiring a data image of orchard young fruits, adjusting a target detection frame in the image, and enhancing and labeling the image;
further, the target detection frame is adjusted by replacing the horizontal frame with a posture frame carrying an angle parameter using roLabelImg, the posture frame comprising the target center point coordinates, the length and width, and the inclination angle.
Secondly, the labeled data set is converted in format, and the converted data set is cropped as preprocessing;
further, the format conversion expresses the inclination angle of the fruit as the clockwise angle between the long edge of the posture frame and the x-axis.
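By way of illustration only, the following Python sketch converts a roLabelImg-style annotation (center, width, height, angle) into the long-edge convention just described; the function name and the assumption that roLabelImg reports the clockwise angle of the w-edge are hypothetical and not part of the patent:

```python
def to_long_edge(cx, cy, w, h, theta_deg):
    """Convert a roLabelImg-style rotated box (cx, cy, w, h, angle) to the
    long-edge convention: the angle becomes the clockwise angle between the
    LONG edge and the x-axis, normalized into [0, 180)."""
    if w >= h:
        long_side, short_side, angle = w, h, theta_deg
    else:
        # the long edge is the h-edge, rotated 90 degrees from the w-edge
        long_side, short_side, angle = h, w, theta_deg + 90.0
    return cx, cy, long_side, short_side, angle % 180.0

print(to_long_edge(120.0, 80.0, 30.0, 60.0, 20.0))  # (120.0, 80.0, 60.0, 30.0, 110.0)
```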
Constructing a young fruit growth posture feature extraction model, and performing deep fusion on the shallow and high-level feature maps of the feature extraction model with a Bi-FPN network;
further, the feature extraction model comprises a Focus module, a feature extraction module P1, a CBL module, a feature extraction module P2, a CBL module, a feature extraction module P3, a CBL module, a feature extraction module P4, a CBL module and a feature extraction module P5 which are connected in sequence; the feature extraction module P2 consists of 2 BottleneckCSP modules and a CA attention mechanism module, the feature extraction modules P3 and P4 each consist of 8 BottleneckCSP modules and a CA attention mechanism module, and the feature extraction module P5 consists of an SPP module, 2 BottleneckCSP modules and a CA attention mechanism module.
The Focus module slides a 2×2 window with a stride of 2 over the image, sampling pixel values at every other pixel; the values at the same fixed position of each window are gathered into one layer, yielding four sub-images. The four sub-images are concatenated into a new image, and a convolution is applied to the new image to obtain a down-sampled feature map.
The CBL module consists of a convolution, batch normalization and a LeakyReLU activation function.
The SPP module feeds the sampled feature map into pooling layers with different kernel sizes of 1×1, 2×2 and 4×4;
the BottleneckCSP module is used for normalizing two branches after Concat through BatchNormalization and an LeakReLU activation function; wherein, one branch is composed of two CBL layers and a convolution layer; the other is composed of a convolution layer;
further, the Coordinate Information Embedding step of the CA attention mechanism module pools the input feature map globally along the horizontal and the vertical direction, yielding a horizontal and a vertical feature map;
in the Coordinate Attention Generation step, the two feature maps are spliced together and transformed by a convolution into a feature map F_1, which is normalized to obtain a feature map f;

the feature map f is decomposed along the horizontal and vertical directions into f_w ∈ R^{C/r×W} and f_h ∈ R^{C/r×H}, where r is the reduction ratio; 1×1 convolutions applied to f_w and f_h yield the feature maps F_w and F_h, and a sigmoid activation then produces the attention weights g_w and g_h of the feature map in the two spatial directions.
The original feature map is multiplied by attention weights in the horizontal and vertical directions, and the attention in both the horizontal and vertical directions is simultaneously applied to the input features.
Further, the deep fusion of the shallow feature map and the high feature map of the feature extraction model by adopting the Bi-FPN network comprises the following steps:
firstly, fusing a P5 characteristic layer with a P4 characteristic layer through up-sampling; secondly, performing up-sampling and P3 feature layer secondary fusion on the obtained fusion feature information again; finally, performing up-sampling and P2 feature layer three times of fusion on the feature information obtained by the second fusion to complete feature information fusion from top to bottom; in a similar way, firstly, fusing feature information obtained by down-sampling and secondary fusion of the information obtained by the third fusion; secondly, fusing the feature information obtained by down-sampling and first fusing the fusion information; and finally, fusing the fusion information with the P5 characteristic diagram through downsampling.
Performing posture frame regression on the feature map subjected to the fusion processing by adopting a posture prediction layer, and extracting a target region;
furthermore, an angle prediction channel is added to the head structure of the prediction layer; the channel dimension of the head detection layer is 3 × (C + 5 + 180), where 3 means that 3 anchor frames with preset aspect ratios are placed in each grid cell, and each anchor frame predicts the C class channels (C_0, C_1, …, C_n) and the frame parameter information (x, y, w, h, p_r), where p_r denotes the foreground confidence of the prediction box; each anchor box additionally predicts an angle over 180 channels.
Training the model through a training data set, storing coordinates of a posture frame, and calculating a young fruit growth posture angle;
further, the coordinates of the posture frame comprise the four vertex coordinates (x_1, y_1), (x_2, y_2), (x_3, y_3) and (x_4, y_4); the clockwise angle between the long edge and the x-axis is the growth posture angle of the young fruit, calculated as:

θ = arctan( (y_2 − y_1) / (x_2 − x_1) ),  if ‖P_1P_2‖ ≥ ‖P_2P_3‖;
θ = arctan( (y_3 − y_2) / (x_3 − x_2) ),  otherwise,

where P_i = (x_i, y_i) denotes the i-th vertex of the posture frame.
The invention has the beneficial effects that:
1. For the problem of visually identifying the growth posture of young fruits in the orchard growth environment, the posture angle information is obtained together with the position information of the young fruits.
2. An attention mechanism is introduced for the near-color background problem; compared with the same network without the attention mechanism, recognition performance is improved.
3. To further improve the network's ability to recognize smaller targets, a small-target detection layer is added; the growth posture of young peach fruits is identified better, and the missed-detection rate is greatly reduced compared with omitting this layer.
4. The invention provides an effective solution for mechanized, automated and intelligent bagging, ensures timely and efficient bagging of young fruits, reduces the cost of the bagging operation, and also provides a reference for solving the bagging problems of other vegetables and fruits.
Drawings
FIG. 1 is a diagram of a young fruit growth posture feature extraction model according to the present invention;
FIG. 2 is a schematic diagram of a CA attention mechanism model according to the present invention;
FIG. 3 is a diagram of the Bi-FPN structure with the added small-target detection layer according to the present invention;
FIG. 4 is a visualization of the growth posture angle of young peach fruits.
Detailed Description
The invention will be further described with reference to the accompanying drawings and embodiments. The drawings are simplified schematics that illustrate only the basic structure of the invention, and therefore show only the structures relevant to it.
A visual identification method for orchard young fruit growth postures based on multi-feature fusion comprises the following steps:
acquiring a young fruit data image of an orchard, adjusting a target detection frame in the image, and enhancing and labeling the image;
Young fruit data sets are collected at different times of day and under different weather conditions, so that they better approximate the various scenes of the natural growth state and strengthen the adaptability of the network. The image data are then enriched by image enhancement: the contrast enhancement factor is set to 1.5, the brightness is enhanced by a factor of 1.5, and rotation, Gaussian noise and similar operations are applied. The target detection frame is adjusted by replacing the horizontal frame with a posture frame carrying an angle parameter using roLabelImg; the posture frame comprises the target center point coordinates, the length and width, and the inclination angle, yielding a posture information data set.
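A minimal sketch of this enhancement step, assuming the Pillow and NumPy libraries (the patent names no library; the file names, rotation range and noise strength are placeholders):

```python
import random
from PIL import Image, ImageEnhance
import numpy as np

def enhance(path):
    """Apply the augmentations described above: contrast x1.5, brightness x1.5,
    a random rotation, and additive Gaussian noise."""
    img = Image.open(path).convert("RGB")
    img = ImageEnhance.Contrast(img).enhance(1.5)    # contrast enhancement factor 1.5
    img = ImageEnhance.Brightness(img).enhance(1.5)  # brightness enhanced by 1.5
    img = img.rotate(random.uniform(-15, 15))        # rotation range is an assumption
    arr = np.asarray(img, dtype=np.float32)
    arr += np.random.normal(0.0, 8.0, arr.shape)     # Gaussian noise; sigma is an assumption
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

aug = enhance("young_fruit_0001.jpg")  # placeholder file name
aug.save("young_fruit_0001_aug.jpg")
```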
Secondly, the labeled data set is converted in format, and the converted data set is cropped and segmented as preprocessing;
the format conversion represents the four vertex coordinates of the posture frame by the clockwise angle between the long edge of the posture frame and the x-axis, which expresses the inclination angle of the fruit.
A training set and a test set are constructed from the preprocessed young fruit data set. The preprocessing comprises: segmenting the young fruit images by cropping, enhancing the images with Mosaic data enhancement (applied in accordance with the data format), and then dividing the data into training and validation sets at a ratio of 8:2, as sketched below.
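A minimal sketch of the 8:2 division, assuming the samples are shuffled file paths (the mechanism is not specified in the patent):

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=42):
    """Shuffle and divide samples into a training and a validation set at 8:2."""
    rng = random.Random(seed)
    samples = samples[:]
    rng.shuffle(samples)
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

train_set, val_set = split_dataset([f"img_{i:04d}.jpg" for i in range(1000)])
print(len(train_set), len(val_set))  # 800 200
```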
Constructing a young fruit growth posture feature extraction model, and performing deep fusion on the shallow and high-level feature maps of the feature extraction model with a Bi-FPN network;
fig. 1 shows the feature extraction model, which is composed of the Focus, CBL (convolution + batch normalization + LeakyReLU), spatial pyramid pooling (SPP), BottleneckCSP and attention modules, and is used to extract features from an image.
The Focus module applies a series of slicing operations to the picture before it enters the backbone network. Specifically, a 2×2 window with a stride of 2 samples a value at every other pixel of the image, and the value at the same fixed position of each window is gathered into one layer, yielding four pictures; the W and H information is thus concentrated into the channel dimension, expanding the input channels by a factor of 4. The concatenated picture therefore has 12 channels instead of the 3 RGB channels of the original image. A convolution is then applied to the new picture, finally producing a 2× down-sampled feature map without information loss, which reduces the computation and speeds up the network.
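A minimal PyTorch sketch of this slicing operation, assuming a 3-channel input; the module and parameter names are illustrative:

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """Slice HxW into 4 sub-images (every other pixel), stack them on the
    channel axis (3 -> 12 channels), then convolve to the desired width."""
    def __init__(self, c_in=3, c_out=64):
        super().__init__()
        self.conv = nn.Conv2d(4 * c_in, c_out, kernel_size=3, padding=1)

    def forward(self, x):
        # x: (B, C, H, W) -> (B, 4C, H/2, W/2); no information is lost
        sliced = torch.cat(
            [x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]],
            dim=1,
        )
        return self.conv(sliced)

out = Focus()(torch.randn(1, 3, 640, 640))
print(out.shape)  # torch.Size([1, 64, 320, 320])
```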
The CBL consists of a convolution, BatchNormalization and a LeakyReLU activation function, and serves to extract effective information from the picture features.
The SPP module feeds the feature map of the previous layer into pooling layers with different kernel sizes of 1×1, 2×2 and 4×4; the fusion of multiple receptive fields improves the model's detection capability in complex scenes.
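A PyTorch sketch of such an SPP block; stride-1 pooling with asymmetric padding is assumed so that the 1×1, 2×2 and 4×4 branches keep the spatial size and can be concatenated (the patent names only the three kernel sizes):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPP(nn.Module):
    """Size-preserving max pooling with 1x1, 2x2 and 4x4 kernels, concatenated
    and fused by a 1x1 convolution."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.fuse = nn.Conv2d(3 * c_in, c_out, kernel_size=1)

    @staticmethod
    def _pool(x, k):
        if k == 1:
            return x  # 1x1 max pooling is the identity
        # pad (left, right, top, bottom) so stride-1 pooling preserves H and W
        p = (k - 1) // 2
        x = F.pad(x, (p, k - 1 - p, p, k - 1 - p), value=float("-inf"))
        return F.max_pool2d(x, kernel_size=k, stride=1)

    def forward(self, x):
        return self.fuse(torch.cat([self._pool(x, k) for k in (1, 2, 4)], dim=1))

print(SPP(256, 256)(torch.randn(1, 256, 20, 20)).shape)  # torch.Size([1, 256, 20, 20])
```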
The BottleneckCSP is built from two branches followed by a Concat, BatchNormalization and a LeakyReLU activation function. One branch consists of two CBL layers and a convolution layer, the other of a single convolution layer. The structure incorporates the idea of the residual structure: connecting the two different branches fuses features of different levels and greatly improves the feature extraction capability of the network, as sketched below.
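A PyTorch sketch of the CBL unit and of a BottleneckCSP built from it, following the branch structure described above; the channel widths and the final fusion layer are assumptions:

```python
import torch
import torch.nn as nn

class CBL(nn.Module):
    """Convolution + BatchNorm + LeakyReLU, the basic unit described above."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.LeakyReLU(0.1, inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class BottleneckCSP(nn.Module):
    """Two branches (CBL-CBL-conv and a single conv), concatenated and then
    normalized with BatchNorm + LeakyReLU."""
    def __init__(self, c_in, c_out):
        super().__init__()
        c_mid = c_out // 2
        self.branch1 = nn.Sequential(CBL(c_in, c_mid), CBL(c_mid, c_mid, k=3),
                                     nn.Conv2d(c_mid, c_mid, 1, bias=False))
        self.branch2 = nn.Conv2d(c_in, c_mid, 1, bias=False)
        self.bn = nn.BatchNorm2d(2 * c_mid)
        self.act = nn.LeakyReLU(0.1, inplace=True)
        self.out = CBL(2 * c_mid, c_out)  # final fusion layer: an assumption

    def forward(self, x):
        y = torch.cat([self.branch1(x), self.branch2(x)], dim=1)
        return self.out(self.act(self.bn(y)))

print(BottleneckCSP(64, 64)(torch.randn(1, 64, 80, 80)).shape)  # [1, 64, 80, 80]
```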
For small-target detection, a prediction head is added on the P2-layer (160×160 pixel) feature map of the backbone network to predict targets as small as 4×4 pixels.
As shown in FIG. 2, the attention mechanism module focuses attention on important areas of the image and ignores irrelevant target areas, resulting in better separation of the target from the background.
The Coordinate Information Embedding step of the CA attention mechanism module pools the input feature map globally along the horizontal and the vertical direction. Specifically, given an input x, each channel is encoded along the horizontal and the vertical coordinate with pooling kernels of sizes (1, W) and (H, 1) respectively, yielding a horizontal and a vertical feature map. Then, in the Coordinate Attention Generation step, the two feature maps are spliced together and transformed by a convolution into a feature map F_1, which is normalized to obtain a feature map f. The feature map f is decomposed along the horizontal and vertical directions into f_w ∈ R^{C/r×W} and f_h ∈ R^{C/r×H}, where r is the reduction ratio; 1×1 convolutions applied to f_w and f_h restore the original number of channels and yield the feature maps F_w and F_h, and a sigmoid activation then produces the attention weights g_w and g_h of the feature map in the two spatial directions. Finally, the original feature map is multiplied by the attention weights in the horizontal and vertical directions, applying attention in both directions to the input features simultaneously. CA attends to the relationships among channels while using precise positional information to capture long-range dependencies; it focuses on the target features and weakens background noise, which matters here because young fruits sit against a near-color background, and it improves the detection result.
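A PyTorch sketch of such a coordinate attention module; the reduction ratio r and the BatchNorm + ReLU used for the normalization step are assumptions:

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Coordinate attention as described above: directional global pooling,
    a shared transform, then per-direction sigmoid weights."""
    def __init__(self, channels, r=32):
        super().__init__()
        c_mid = max(8, channels // r)
        self.transform = nn.Sequential(          # shared conv -> F_1 -> normalize -> f
            nn.Conv2d(channels, c_mid, 1, bias=False),
            nn.BatchNorm2d(c_mid),
            nn.ReLU(inplace=True),
        )
        self.conv_w = nn.Conv2d(c_mid, channels, 1)  # f_w -> F_w
        self.conv_h = nn.Conv2d(c_mid, channels, 1)  # f_h -> F_h

    def forward(self, x):
        b, c, h, w = x.shape
        # coordinate information embedding: pool with (H, 1) and (1, W) kernels
        x_h = x.mean(dim=3, keepdim=True)                      # (b, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (b, c, w, 1)
        f = self.transform(torch.cat([x_h, x_w], dim=2))       # splice, convolve, normalize
        f_h, f_w = torch.split(f, [h, w], dim=2)               # decompose back
        g_h = torch.sigmoid(self.conv_h(f_h))                      # weights along H
        g_w = torch.sigmoid(self.conv_w(f_w.permute(0, 1, 3, 2)))  # weights along W
        return x * g_h * g_w  # apply both directional attentions at once

print(CoordAttention(64)(torch.randn(1, 64, 32, 32)).shape)  # [1, 64, 32, 32]
```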
As shown in fig. 3, bidirectional (top-down and bottom-up) feature fusion is applied repeatedly to the shallow and high-level feature maps of the feature extraction model through the Bi-FPN network structure, which deeply fuses the P2-P5 feature layers. First, the P5 feature layer is up-sampled and fused with the P4 feature layer; second, the resulting fused feature information is up-sampled again and fused with the P3 feature layer; finally, the feature information obtained from the second fusion is up-sampled and fused a third time with the P2 feature layer, completing the top-down fusion. Symmetrically, the information obtained from the third fusion is down-sampled and fused with the result of the second fusion; that result is down-sampled and fused with the result of the first fusion; and finally that fusion is down-sampled and fused with the P5 feature layer, completing the bottom-up fusion. This makes full use of the shallow information and reduces the negative influence of object scale variation.
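A PyTorch sketch of this bidirectional fusion over the P2-P5 layers; the learnable per-input weights follow Bi-FPN's fast normalized fusion, and the unified channel width is an assumption (the patent specifies only the fusion order):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Bi-FPN-style fast normalized fusion of same-shape feature maps."""
    def __init__(self, n_inputs, channels):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, *feats):
        w = F.relu(self.w)
        w = w / (w.sum() + 1e-4)
        return self.conv(sum(wi * f for wi, f in zip(w, feats)))

c = 64  # unified channel width (illustrative)
p2, p3, p4, p5 = (torch.randn(1, c, s, s) for s in (160, 80, 40, 20))
# top-down pass
t4 = WeightedFusion(2, c)(p4, F.interpolate(p5, scale_factor=2))  # 1st fusion
t3 = WeightedFusion(2, c)(p3, F.interpolate(t4, scale_factor=2))  # 2nd fusion
t2 = WeightedFusion(2, c)(p2, F.interpolate(t3, scale_factor=2))  # 3rd fusion
# bottom-up pass
b3 = WeightedFusion(2, c)(t3, F.max_pool2d(t2, 2))
b4 = WeightedFusion(2, c)(t4, F.max_pool2d(b3, 2))
b5 = WeightedFusion(2, c)(p5, F.max_pool2d(b4, 2))
print(b5.shape)  # torch.Size([1, 64, 20, 20])
```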
The target region is extracted by performing posture frame regression on the fused feature map with the posture prediction layer;
furthermore, an angle parameter θ_1 is added to the head structure of the posture prediction layer; its dimension comprises 180 angle prediction channels (1, 2, 3, …, 180), which turns the angle regression task into a classification task and thereby enables the prediction of the growth posture angle. The channel dimension of the head detection layer of the orchard young fruit growth posture visual identification network is 3 × (C + 5 + 180), where 3 means that anchor frames with 3 preset aspect ratios are placed in each grid cell, and each anchor frame predicts the C class channels (C_0, C_1, …, C_n) and the frame parameter information (x, y, w, h, p_r), where p_r denotes the foreground confidence of the prediction box; each anchor box additionally predicts an angle over 180 channels.
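A minimal sketch of such a head layer, showing how the 3 × (C + 5 + 180) channel layout arises; the 1×1 convolution and the single "young fruit" class are assumptions:

```python
import torch
import torch.nn as nn

def make_posture_head(c_in, num_classes, num_anchors=3, num_angles=180):
    """1x1 prediction head whose output channels follow the layout described
    above: per anchor, (x, y, w, h, p_r) + C class scores + 180 angle bins."""
    per_anchor = 5 + num_classes + num_angles
    return nn.Conv2d(c_in, num_anchors * per_anchor, kernel_size=1)

head = make_posture_head(c_in=256, num_classes=1)  # one class: young fruit (assumption)
out = head(torch.randn(1, 256, 160, 160))          # P2-sized grid for small targets
print(out.shape)  # torch.Size([1, 558, 160, 160]); 558 = 3 * (1 + 5 + 180)
```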
According to the growth characteristics of young fruits in the natural state of an orchard, the fruits hang downwards in many growth directions and at various angles, which poses a great challenge to intelligent bagging research. To guarantee the bagging of young fruits growing at multiple angles, the target position is detected with a posture frame, and the inclination angle of the fruit is represented by the angle between the long edge of the posture frame and the x-axis; this addresses the problems that the target is hard to find and the bagging angle cannot be determined during young fruit bagging. The young fruit images collected in the unstructured field growth environment have complex backgrounds whose color is similar to that of the branches and leaves, so this is target detection against a near-color background, which makes identification considerably harder. Therefore, the last BottleneckCSP layer of the backbone is replaced by the attention mechanism, which focuses attention on the important regions of the image and ignores irrelevant regions; position information and channel relationships are captured simultaneously, long-range dependencies are obtained, the feature representation of the network is enhanced, and the recognition effect in a complex environment is improved to a certain extent. Moreover, young fruits in an orchard are small, so their identification is a weak and small target detection problem under a complex background, which the small-target prediction head added on the P2 layer addresses.
Fifthly, the orchard young fruit growth posture visual identification network model is trained on the training data set; the resulting pre-trained weights are used to predict on the test set, returning the class name and the confidence; finally, the posture frame coordinates are stored and the growth posture angle is calculated.
As shown in fig. 3, the multi-feature-fusion orchard young fruit growth posture visual identification network detects the posture angle and yields the four vertex coordinates of the target, (x_1, y_1), (x_2, y_2), (x_3, y_3) and (x_4, y_4), counted clockwise from the upper-left corner. To unify the detection direction of posture targets, the clockwise angle between the long edge and the x-axis is taken as the growth posture angle of the young fruit. The angle of the target is computed from the predicted coordinates by formula (1):

θ = arctan( (y_2 − y_1) / (x_2 − x_1) ),  if ‖P_1P_2‖ ≥ ‖P_2P_3‖;
θ = arctan( (y_3 − y_2) / (x_3 − x_2) ),  otherwise,   (1)

where P_i = (x_i, y_i) denotes the i-th vertex of the posture frame. That is, by comparing the side lengths of the posture frame, the inverse trigonometric function arctan is applied to the slope of the longer side, and the resulting angle θ represents the growth posture of the young fruit.
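A minimal Python sketch of formula (1); the normalization of the result into [0, 180) degrees is an assumption, since the patent does not state how negative slopes are handled:

```python
import math

def posture_angle(pts):
    """Growth posture angle from four clockwise vertices [(x1,y1),...,(x4,y4)]:
    arctan of the slope of the longer of the two adjacent edges, mapped into
    [0, 180) degrees (the normalization is an assumption)."""
    (x1, y1), (x2, y2), (x3, y3), _ = pts
    e12 = math.hypot(x2 - x1, y2 - y1)
    e23 = math.hypot(x3 - x2, y3 - y2)
    dx, dy = (x2 - x1, y2 - y1) if e12 >= e23 else (x3 - x2, y3 - y2)
    return math.degrees(math.atan2(dy, dx)) % 180.0

print(posture_angle([(0, 0), (80, 40), (70, 60), (-10, 20)]))  # ~26.57
```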
The experimental results are as follows:
Young peach fruits in an orchard were taken as the research object. To explore the influence of different network models on the identification of the young fruit growth posture angle, the network designed in this experiment was compared with R3Det and R-CenterNet. According to the identification results, the multi-feature-fusion orchard young fruit growth posture visual identification network performs best among the three models. The trained weights were verified on 300 test images; the identification effect is shown in figure 4. The experiment shows that the model achieves the best average precision and the best average accuracy of angle estimation, so the network can effectively identify the growth posture of young peaches.
In light of the foregoing description of the preferred embodiment of the invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the invention is therefore not limited to the content of the specification and must be determined according to the scope of the claims.

Claims (8)

1. A visual identification method for orchard young fruit growth postures based on multi-feature fusion is characterized by comprising the following steps:
step one, acquiring data images of orchard young fruits, adjusting the target detection frames in the images, and enhancing and labeling the images;

step two, converting the format of the labeled data set and cropping the converted data set as preprocessing;

step three, constructing a young fruit growth posture feature extraction model, and performing deep fusion on the shallow and high-level feature maps of the feature extraction model with a Bi-FPN network;

step four, performing posture frame regression on the fused feature map with a posture prediction layer, and extracting the target region;

and step five, training the model on a training data set, storing the posture frame coordinates, and calculating the growth posture angle of the young fruit.
2. The visual identification method for the growth posture of the young fruit of the orchard based on the multi-feature fusion as claimed in claim 1, wherein the target detection frame is adjusted by replacing the horizontal frame with a posture frame carrying an angle parameter using roLabelImg, the posture frame comprising the target center point coordinates, the length and width, and the inclination angle.
3. The visual identification method for the growth posture of the young fruits of the orchard based on the multi-feature fusion as claimed in claim 1, wherein the format conversion expresses the inclination angle of the fruit as the clockwise angle between the long edge of the posture frame and the x-axis.
4. The visual identification method for the growth posture of the young fruit of the orchard based on the multi-feature fusion as claimed in claim 1, wherein the feature extraction model comprises a Focus module, a feature extraction module P1, a CBL module, a feature extraction module P2, a CBL module, a feature extraction module P3, a CBL module, a feature extraction module P4, a CBL module and a feature extraction module P5 which are sequentially connected; the feature extraction module P2 consists of 2 BottleneckCSP modules and a CA attention mechanism module, the feature extraction modules P3 and P4 each consist of 8 BottleneckCSP modules and a CA attention mechanism module, and the feature extraction module P5 consists of an SPP module, 2 BottleneckCSP modules and a CA attention mechanism module.
5. The visual identification method for the growth posture of the young fruits in the orchard based on the multi-feature fusion as claimed in claim 4, wherein the Coordinate Information Embedding step of the CA attention mechanism module pools the input feature map globally along the horizontal and the vertical direction, yielding a horizontal and a vertical feature map;

in the Coordinate Attention Generation step, the two feature maps are spliced together and transformed by a convolution into a feature map F_1, which is normalized to obtain a feature map f;

the feature map f is decomposed along the horizontal and vertical directions into f_w ∈ R^{C/r×W} and f_h ∈ R^{C/r×H}, where r is the reduction ratio; 1×1 convolutions applied to f_w and f_h yield the feature maps F_w and F_h, and a sigmoid activation then produces the attention weights g_w and g_h of the feature map in the two spatial directions;

the original feature map is multiplied by the attention weights in the horizontal and vertical directions, so that attention in both directions is applied to the input features simultaneously.
6. The visual identification method for the growth posture of the young fruit of the orchard based on the multi-feature fusion as claimed in claim 1, wherein the deep fusion of the shallow and high-level feature maps of the feature extraction model with the Bi-FPN network specifically comprises:

firstly, the P5 feature layer is up-sampled and fused with the P4 feature layer; secondly, the resulting fused feature information is up-sampled again and fused with the P3 feature layer; finally, the feature information obtained from the second fusion is up-sampled and fused a third time with the P2 feature layer, completing the top-down feature fusion; symmetrically, the information obtained from the third fusion is down-sampled and fused with the result of the second fusion, that result is down-sampled and fused with the result of the first fusion, and finally that fusion is down-sampled and fused with the P5 feature map, completing the bottom-up feature fusion.
7. The visual identification method for orchard young fruit growth postures based on multi-feature fusion as claimed in claim 1, wherein an angle prediction channel is added to the head structure of the posture prediction layer; the channel dimension of the head detection layer is 3 × (C + 5 + 180), where 3 means that 3 anchor frames with preset aspect ratios are placed in each grid cell, and each anchor frame predicts the C class channels (C_0, C_1, …, C_n) and the frame parameter information (x, y, w, h, p_r), where p_r denotes the foreground confidence of the prediction box; each anchor box additionally predicts an angle over 180 channels.
8. The visual identification method for the growth posture of the young fruit of the orchard based on the multi-feature fusion as claimed in claim 1, wherein the coordinates of the posture frame comprise the four vertex coordinates (x_1, y_1), (x_2, y_2), (x_3, y_3) and (x_4, y_4); the clockwise angle between the long edge and the x-axis is the growth posture angle of the young fruit, calculated as:

θ = arctan( (y_2 − y_1) / (x_2 − x_1) ),  if ‖P_1P_2‖ ≥ ‖P_2P_3‖;
θ = arctan( (y_3 − y_2) / (x_3 − x_2) ),  otherwise,

where P_i = (x_i, y_i) denotes the i-th vertex of the posture frame.
CN202211120206.9A 2022-09-15 2022-09-15 Orchard young fruit growth posture visual identification method based on multiple feature fusion Pending CN115496979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211120206.9A CN115496979A (en) 2022-09-15 2022-09-15 Orchard young fruit growth posture visual identification method based on multiple feature fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211120206.9A CN115496979A (en) 2022-09-15 2022-09-15 Orchard young fruit growth posture visual identification method based on multiple feature fusion

Publications (1)

Publication Number Publication Date
CN115496979A 2022-12-20

Family

ID=84468675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211120206.9A Pending CN115496979A (en) 2022-09-15 2022-09-15 Orchard young fruit growth posture visual identification method based on multiple feature fusion

Country Status (1)

Country Link
CN (1) CN115496979A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115937314A (en) * 2022-12-23 2023-04-07 南京林业大学 Camellia oleifera fruit growth posture detection method
CN115937314B (en) * 2022-12-23 2023-09-08 南京林业大学 Method for detecting growth posture of oil tea fruits
CN116740704A (en) * 2023-06-16 2023-09-12 安徽农业大学 Wheat leaf phenotype parameter change rate monitoring method and device based on deep learning
CN116740704B (en) * 2023-06-16 2024-02-27 安徽农业大学 Wheat leaf phenotype parameter change rate monitoring method and device based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination