CN116051902A - Straw identification method based on high-throughput graph computation

Straw identification method based on high-throughput graph computation

Info

Publication number
CN116051902A
Authority
CN
China
Prior art keywords
straw
training
calculation
pictures
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310105390.8A
Other languages
Chinese (zh)
Inventor
夏勇
周晓宇
陈传飞
薛巨峰
范东睿
王晓虹
杨卫兵
吴浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Nanjing Information High Speed Railway Research Institute
Yancheng Zhongke High Throughput Computing Research Institute Co., Ltd.
Original Assignee
Zhongke Nanjing Information High Speed Railway Research Institute
Yancheng Zhongke High Throughput Computing Research Institute Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Nanjing Information High Speed Railway Research Institute and Yancheng Zhongke High Throughput Computing Research Institute Co., Ltd.
Publication of CN116051902A
Current legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764: Arrangements for image or video recognition or understanding using classification, e.g. of video objects
    • G06V10/765: Arrangements for image or video recognition or understanding using rules for classification or partitioning the feature space
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/20: Image preprocessing
    • G06V10/32: Normalisation of the pattern dimensions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806: Fusion of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using neural networks

Abstract

The invention provides a straw identification method based on high-throughput graph computation. The method first establishes a straw-recognition training gallery management system and collects and inputs a large volume of straw training pictures; the straw-recognition training gallery then rapidly builds a training model through the high-throughput graph computation technology; an applet user next shoots actual straw scene pictures and uploads them to the application platform; finally, the application platform performs low-latency, high-accuracy identification through the high-throughput graph computation technology.

Description

Straw identification method based on high-throughput graph computation
Technical Field
The invention relates to the technical field of three-dimensional reconstruction and phenotypic feature extraction of crops, and in particular to a straw identification method based on high-throughput graph computation.
Background
With the launch of the "Materials Genome Initiative", the question of how to integrate data, codes, and computational tools in the design of new materials through high-throughput integrated materials computation, so as to share resources and accelerate development, has attracted growing attention in the industry. The "Materials Genome Initiative" is a typical methodology of scientific informatization (e-Science) applications and practice. Its core is "integration": the integration of computation with data, of computational data with experimental data, and of high-throughput materials computation with multi-scale simulation. On this basis, the concept of "high-throughput integrated materials computation" has been proposed, together with a materials design method oriented to it.
High-throughput integrated materials computation can draw on concepts and ideas from combinatorial chemistry and materials informatics. Combinatorial chemistry, which has been highly successful in drug discovery, is a strategy and method that combines "building blocks" of different structures or components in a parallel, systematic, and repeated way to rapidly obtain large numbers of compounds for high-throughput screening.
Materials informatics is the "processing and interpretation of materials science and engineering data by computation". Combined with materials computation, it uses known, reliable experimental data to explore as many real or hypothetical materials as possible by theoretical simulation, builds databases of their compositions, structures, and physical properties, and searches for relational patterns among material composition, structure, and properties by data mining to guide new material design.
The essence of high-throughput integrated materials computation is to apply the building-block and high-throughput-screening concepts of combinatorial chemistry to computer simulation of materials: to search or screen the basic building blocks of material composition through materials computation, construct new compounds, integrate data, codes, and materials-computation software with techniques from materials informatics, and establish quantitative models relating material composition, structure, and performance to guide new material design. This has become one of the hot topics in industry at home and abroad.
In China, the spatial distribution of ground features has traditionally been observed and mapped in the field by researchers, but this is constrained by manpower, material resources, and time, and large-area operations are difficult to handle. With the development of modern remote sensing, high-resolution ground images can be acquired rapidly and the spatial distribution of straw coverage extracted from them. In high-resolution remote sensing images, corn-straw coverage appears as irregular regions with similar local spectral curves and large variations in length and width, which makes extraction of straw coverage very difficult. At present there is little research on straw-coverage characteristics in high-resolution images, and existing extraction methods are limited in efficiency, speed, applicability, and accuracy.
To solve these technical problems, the invention provides a straw identification method based on high-throughput graph computation. It adopts the YOLOv5 algorithm together with the high-throughput graph computation technology, features smaller weight files, easier deployment, shorter training time, faster inference, and higher accuracy, and can effectively overcome the limitations of existing extraction methods in efficiency, speed, applicability, and accuracy.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a straw identification method based on high-throughput graph computation that adopts the YOLOv5 algorithm and the high-throughput graph computation technology. The method is built on the PyTorch framework and features smaller weight files, easier deployment, shorter training time, faster inference, and higher accuracy.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a straw identification method based on high-flux map calculation is characterized by comprising the following steps of: the method comprises the following steps:
s1: a straw recognition training gallery management system is established,
applying a single convolutional neural network CNN to the whole image by fusing a YouOyLookOnce 5 algorithm in a YouOyLookOnce series algorithm, dividing the image into grids, predicting class probability and boundary boxes of each grid, and predicting a boundary box and probability corresponding to each class for each grid by the network;
s2: a large number of straw training picture elements are collected and input,
the YOLOV5 network structure is divided into four parts: respectively an input end, a Backbone, neck end and an output end,
the input terminal comprises a Mosaic data enhancement, an adaptive anchor frame calculation and an adaptive picture scaling,
mosained data enhancement: the input end adopts a mode of enhancing Mosaic data, and adopts a mode of randomly zooming, randomly cutting and randomly arranging 4 pictures for splicing;
the adaptive anchor frame calculation: in network training, the network outputs a predicted frame on the basis of an initial anchor frame, then compares the predicted frame with a real frame, calculates the difference between the predicted frame and the real frame, reversely updates, iterates network parameters, embeds the predicted frame into a YOLOV5 algorithm in high-throughput graph calculation, and adaptively calculates optimal anchor frame values in different training sets during each training;
the adaptive picture scaling: in the target detection algorithm, different pictures are different in length and width, so that the original pictures are uniformly scaled to a standard size and then sent into a detection network;
the backbox comprises a Focus structure, a CSP structure and an SPP module:
the Focus structure: slicing the scaled, cut and arranged pictures, wherein the specific operation is to take a value every other pixel in a picture, inputting an original 608 multiplied by 3 image into a Focus structure in a Yolov5s algorithm, adopting slicing operation to change the original 608 multiplied by 3 image into a characteristic diagram of 304 multiplied by 12, and then carrying out convolution operation of 32 convolution kernels to finally change the original 608 multiplied by 304 multiplied by 32 characteristic diagram;
the CSP structure comprises: the CSP module divides the feature mapping of the base layer into two parts, and then merges the two parts through a cross-stage hierarchical structure;
the SPP module: in the SPP module, a mode of maximum pooling of k= {1×1,5×5,9×9,13×13} is used, and then the Concat operation is carried out on the feature graphs with different scales;
the Neck includes a FPN+ PAN structure:
the fpn+pan structure: the FPN is from top to bottom, the feature information of a high layer is transmitted and fused in an up-sampling mode to obtain a predicted feature map, a bottom-up feature pyramid is added behind the FPN layer by the Yolov5 algorithm, the FPN layer conveys strong semantic features from top to bottom, and the feature pyramid conveys strong positioning features from bottom to top;
the output terminal comprises GIOU_Loss:
the GIOU_Loss: the Yolov5 algorithm uses CIOU _ Loss as the Loss function of Boundingbox,
CIOU_Loss calculation formula:
$$\mathrm{CIOU\_Loss} = 1 - \mathrm{IoU} + \frac{\rho^{2}\left(b,\, b^{gt}\right)}{c^{2}} + \alpha v, \quad v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}, \quad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}$$
where b and b^gt denote the centres of the predicted and ground-truth boxes, ρ their Euclidean distance, and c the diagonal length of the smallest box enclosing both;
S3: the straw-recognition training gallery rapidly builds a training model through the high-throughput graph computation technology.
The straw-recognition training gallery is established through the YOLOv5 algorithm, and the training model is built rapidly from it through the high-throughput graph computation technology;
S4: the applet user shoots an actual straw scene picture and uploads it to the application platform.
The user shoots the actual straw scene picture through a mobile terminal and uploads it to the straw-recognition training gallery in the application platform;
S5: the application platform performs low-latency, high-accuracy identification through the high-throughput graph computation technology.
The actual straw scene pictures are uploaded to the straw-recognition training gallery in the application platform, after which the application platform identifies the uploaded pictures through the training model.
As a preferred technical scheme of the invention: in step S2, the 4 pictures used for Mosaic data enhancement are all actual straw scene pictures.
As a preferred technical scheme of the invention: in step S2, during adaptive picture scaling, a minimal black border is adaptively added to the original picture.
As a preferred technical scheme of the invention: in step S2, the feature pyramid in the FPN+PAN structure comprises two PAN structures.
Compared with the prior art, the invention has the following beneficial effects:
The invention adopts the YOLOv5 algorithm based on the PyTorch framework, which features smaller weight files, easier deployment, shorter training time, and faster inference (only about 0.007 s per picture); YOLOv5 is highly accurate, and all other performance metrics are comprehensively improved without any loss of input feature information;
In the invention, Mosaic data enhancement achieves the following:
a. Enriching the data set: 4 pictures are randomly selected, randomly scaled, and then randomly arranged for splicing, which greatly enriches the detection data set; random scaling in particular adds many small targets, making the network more robust.
b. Reducing GPU demand: during Mosaic-enhanced training, the data of 4 pictures are computed at once, so the mini-batch size does not need to be large and a single GPU can achieve good results;
The adaptive anchor-box calculation is embedded in the algorithm within the high-throughput graph computation, and the optimal anchor-box values for different training sets are computed adaptively at each training run.
In the invention, the Focus structure reduces the parameter count and accelerates computation without losing information;
The CSP structure enhances the learning ability of the CNN, maintaining accuracy while lightening the model, reducing the computational bottleneck, and lowering the memory cost;
The SPP module effectively increases the receptive range of the backbone features, clearly separating the most important contextual features.
In the invention, through the FPN+PAN structure, the FPN layer conveys strong semantic features from top to bottom while the feature pyramid conveys strong localization features from bottom to top, and different detection layers aggregate features from different backbone layers, further improving the feature-extraction capability.
In the invention, CIOU_Loss takes the scale information of the bounding-box aspect ratio into account, improving both the speed and the accuracy of identification.
Drawings
FIG. 1 is the first network-identification flow chart of the present invention;
FIG. 2 is the second network-identification flow chart of the present invention;
FIG. 3 is a diagram of the YOLOv5s network architecture in the present invention;
FIG. 4 shows the four original pictures fed into the input end in the present invention;
FIG. 5 shows the four pictures after Mosaic data enhancement;
FIG. 6 is a picture scaled by the conventional method;
FIG. 7 is a picture scaled by adaptive picture scaling in the present invention;
FIG. 8 shows the slicing operation of the Focus structure in the present invention;
FIG. 9 is a feature map extracted by FPN+PAN fusion in the present invention;
FIG. 10 is a schematic diagram of CIOU_Loss in the present invention;
FIG. 11 is a diagram of the recognition result in the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings and specific embodiments:
The high-throughput graph computation technology of the invention fuses the YOLO (You Only Look Once) series of algorithms to provide a highly reliable, low-latency, and efficient identification method. YOLO is one of the most representative algorithms in the field of object detection in deep learning. YOLO redefines object detection as a regression problem: it applies a single convolutional neural network (CNN) to the entire image, divides the image into grids, and predicts class probabilities and bounding boxes for each grid. The network then predicts, for each grid, a bounding box and the probabilities corresponding to each category (car, pedestrian, traffic light, etc.).
YOLOv5 is a single-stage object detection algorithm. It is based on the PyTorch framework and features smaller weight files, easier deployment, shorter training time, and faster inference (only about 0.007 s per picture). YOLOv5 is also very accurate: a mean average precision (mAP) of about 0.895 was achieved after training for only 100 epochs on the Roboflow blood cell count and detection (BCCD) dataset. Performance is comprehensively and greatly improved without any loss of input feature information.
As shown in figs. 1-11, the invention provides a straw identification method based on high-throughput graph computation, which comprises the following steps:
S1: a straw-recognition training gallery management system is established.
The You Only Look Once v5 (YOLOv5) algorithm of the You Only Look Once series is fused in: a single convolutional neural network (CNN) is applied to the whole image, the image is divided into grids, and the network predicts, for each grid, a bounding box and the probability corresponding to each class;
S2: a large number of straw training pictures are collected and input.
The YOLOv5 network structure is divided into four parts: the input end, the Backbone, the Neck, and the output end.
The input end comprises Mosaic data enhancement, adaptive anchor-box calculation, and adaptive picture scaling.
Mosaic data enhancement: the input end adopts Mosaic data enhancement, splicing 4 pictures by random scaling, random cropping, and random arrangement;
The adaptive anchor-box calculation: during network training, the network outputs prediction boxes on the basis of the initial anchor boxes, then compares them with the ground-truth boxes, computes the difference between the two, and updates the network parameters iteratively by back-propagation; this function is embedded into the YOLOv5 algorithm within the high-throughput graph computation, and the optimal anchor-box values for different training sets are computed adaptively at each training run;
The adaptive picture scaling: in the target detection algorithm, pictures differ in length and width, so the original pictures are uniformly scaled to a standard size and then fed into the detection network;
The Backbone comprises the Focus structure, the CSP structure, and the SPP module:
The Focus structure: the scaled, cropped, and arranged pictures are sliced, the specific operation being to take a value at every other pixel of a picture; in the YOLOv5s algorithm, an original 608×608×3 image fed into the Focus structure first becomes a 304×304×12 feature map through the slicing operation and, after one convolution with 32 convolution kernels, finally becomes a 304×304×32 feature map;
The CSP structure: the CSP module divides the feature map of the base layer into two parts and then merges them through a cross-stage hierarchical structure;
The SPP module: maximum pooling with k = {1×1, 5×5, 9×9, 13×13} is used, and a Concat operation is then performed on the feature maps of different scales;
The Neck comprises an FPN+PAN structure:
The FPN+PAN structure: the FPN is top-down and transmits and fuses high-level feature information by up-sampling to obtain the predicted feature map; the YOLOv5 algorithm adds a bottom-up feature pyramid behind the FPN layer, so that the FPN layer conveys strong semantic features from top to bottom while the feature pyramid conveys strong localization features from bottom to top;
The output end comprises the bounding-box loss CIOU_Loss:
The CIOU_Loss: the YOLOv5 algorithm uses CIOU_Loss as the loss function of the bounding box.
The CIOU_Loss calculation formula:
$$\mathrm{CIOU\_Loss} = 1 - \mathrm{IoU} + \frac{\rho^{2}\left(b,\, b^{gt}\right)}{c^{2}} + \alpha v, \quad v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}, \quad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}$$
where b and b^gt denote the centres of the predicted and ground-truth boxes, ρ their Euclidean distance, and c the diagonal length of the smallest box enclosing both;
S3: the straw-recognition training gallery rapidly builds a training model through the high-throughput graph computation technology.
The straw-recognition training gallery is established through the YOLOv5 algorithm, and the training model is built rapidly from it through the high-throughput graph computation technology;
S4: the applet user shoots an actual straw scene picture and uploads it to the application platform.
The user shoots the actual straw scene picture through a mobile terminal and uploads it to the straw-recognition training gallery in the application platform;
S5: the application platform performs low-latency, high-accuracy identification through the high-throughput graph computation technology.
The actual straw scene pictures are uploaded to the straw-recognition training gallery in the application platform, after which the application platform identifies the uploaded pictures through the training model.
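As a concrete illustration of steps S3 to S5, the following minimal sketch loads a YOLOv5 model assumed to have been trained on the straw-recognition gallery and runs inference on one uploaded scene picture. The weight path straw_best.pt, the image file name, and the confidence threshold are illustrative assumptions; the patent does not describe its gallery or platform interfaces at code level.

```python
# Hedged sketch of S3-S5: load assumed straw-trained YOLOv5 weights and
# identify straw in a picture uploaded from the mobile applet.
import torch

# S3: a YOLOv5 model trained on the straw-recognition gallery
# ("straw_best.pt" is an assumed weight file, not named by the patent)
model = torch.hub.load("ultralytics/yolov5", "custom", path="straw_best.pt")
model.conf = 0.25                       # confidence threshold (assumption)

# S4/S5: run low-latency inference on the uploaded scene picture
results = model("uploaded_scene.jpg")   # letterbox scaling handled internally
results.print()                         # classes, confidences, inference time
```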
The YOLOv5 network structure can be divided into four parts:
(1) The input end: Mosaic data enhancement, adaptive anchor-box calculation, and adaptive picture scaling;
(2) The Backbone: the Focus structure, the CSP structure, and the SPP module;
(3) The Neck: the FPN+PAN structure;
(4) The output end: CIOU_Loss;
The input end:
(1) Mosaic data enhancement
The input end of the high-throughput graph computation adopts Mosaic data enhancement: 4 pictures are spliced by random scaling, random cropping, and random arrangement.
Its main functions are:
a. Enriching the data set: 4 pictures are randomly selected, randomly scaled, and then randomly arranged for splicing, which greatly enriches the detection data set; random scaling in particular adds many small targets, making the network more robust.
b. Reducing GPU demand: during Mosaic-enhanced training, the data of 4 pictures are computed at once, so the mini-batch size does not need to be large and a single GPU can achieve good results (a minimal sketch of the splice follows this list).
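The sketch below splices 4 equally sized NumPy images around a jittered centre point on a 2× canvas, which is what the random-scale, random-crop, random-arrange splice amounts to; the grey fill value 114, the 608-pixel base size, the simplified corner crop, and the omission of label remapping are illustrative assumptions rather than the exact training pipeline.

```python
# Hedged sketch of 4-picture Mosaic splicing (labels omitted for brevity).
import random
import numpy as np

def mosaic4(images, out_size=608):
    """Paste 4 images of shape (h, w, 3) around a random centre point."""
    assert len(images) == 4
    canvas = np.full((out_size * 2, out_size * 2, 3), 114, dtype=np.uint8)
    # random mosaic centre, kept away from the canvas borders
    xc = random.randint(out_size // 2, out_size * 3 // 2)
    yc = random.randint(out_size // 2, out_size * 3 // 2)
    for i, img in enumerate(images):
        h, w = img.shape[:2]
        # quadrant corners: top-left, top-right, bottom-left, bottom-right
        if i == 0:   x1, y1, x2, y2 = max(xc - w, 0), max(yc - h, 0), xc, yc
        elif i == 1: x1, y1, x2, y2 = xc, max(yc - h, 0), min(xc + w, out_size * 2), yc
        elif i == 2: x1, y1, x2, y2 = max(xc - w, 0), yc, xc, min(yc + h, out_size * 2)
        else:        x1, y1, x2, y2 = xc, yc, min(xc + w, out_size * 2), min(yc + h, out_size * 2)
        # crop each source image to fit its quadrant (the random-crop effect);
        # the 2x canvas is usually randomly cropped back to out_size afterwards
        canvas[y1:y2, x1:x2] = img[:y2 - y1, :x2 - x1]
    return canvas
```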
(2) Adaptive anchor-box calculation
In network training, the network outputs prediction boxes on the basis of the initial anchor boxes, then compares them with the ground-truth boxes, computes the difference between the two, and updates the network parameters iteratively by back-propagation. This function is embedded in the algorithm within the high-throughput graph computation, and the optimal anchor-box values for different training sets are computed adaptively at each training run.
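A common way to realize this adaptive anchor computation is k-means clustering over the width/height of the ground-truth boxes before training, as sketched below; the IoU-based assignment, k = 9, the median update, and the area sort are assumptions chosen for illustration, not the exact embedded routine.

```python
# Hedged sketch: derive k anchor boxes from the training-set box sizes.
import numpy as np

def kmeans_anchors(wh, k=9, iters=30):
    """wh: (N, 2) array of ground-truth box widths/heights; assumes N >= k."""
    anchors = wh[np.random.choice(len(wh), k, replace=False)].astype(float)
    for _ in range(iters):
        # IoU between every box and every anchor, treating both as co-centred
        inter = np.minimum(wh[:, None, :], anchors[None, :, :]).prod(axis=2)
        union = wh.prod(axis=1)[:, None] + anchors.prod(axis=1)[None, :] - inter
        assign = (inter / union).argmax(axis=1)      # best-matching anchor
        for j in range(k):                           # move each anchor to the
            if (assign == j).any():                  # median of its cluster
                anchors[j] = np.median(wh[assign == j], axis=0)
    return anchors[np.argsort(anchors.prod(axis=1))]  # sorted by area
```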
(3) Adaptive picture scaling
In common object-detection algorithms, pictures differ in length and width, so the usual approach is to uniformly scale the original picture to a standard size before feeding it into the detection network.
In actual use, however, many pictures have different aspect ratios, so after scaling and filling, the black borders at the two ends differ in size; if the filling is excessive, information redundancy is introduced and the inference speed suffers. The high-throughput graph computation therefore modifies and optimizes this step by adaptively adding only the minimal black border to the original image.
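A minimal letterbox sketch of this adaptive scaling follows, assuming OpenCV and a 608-pixel target size; the fill value 114 and the omission of YOLOv5's further stride-modulo reduction of the padding are simplifying assumptions.

```python
# Hedged sketch: aspect-preserving resize plus minimal border padding.
import cv2

def letterbox(img, new_size=608, pad_value=114):
    h, w = img.shape[:2]
    r = min(new_size / h, new_size / w)        # single scale factor
    nh, nw = round(h * r), round(w * r)
    img = cv2.resize(img, (nw, nh))            # cv2.resize takes (width, height)
    # split the leftover pixels into minimal top/bottom and left/right borders
    top, bottom = (new_size - nh) // 2, new_size - nh - (new_size - nh) // 2
    left, right = (new_size - nw) // 2, new_size - nw - (new_size - nw) // 2
    return cv2.copyMakeBorder(img, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=(pad_value,) * 3)
```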
Backbone:
(1) The Focus structure:
The picture is sliced by taking a value at every other pixel, similar to adjacent down-sampling. Taking YOLOv5s as an example, an original 608×608×3 image fed into the Focus structure first becomes a 304×304×12 feature map through the slicing operation and, after one convolution with 32 convolution kernels, finally becomes a 304×304×32 feature map.
Function: the parameter count is reduced and computation is accelerated without losing information.
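The slicing step can be written compactly in PyTorch as below; this sketch reproduces the 608×608×3 → 304×304×12 → 304×304×32 flow described above, with the 3×3 kernel and the absence of BatchNorm/activation as simplifying assumptions.

```python
# Hedged sketch of the Focus slice-and-convolve operation.
import torch
import torch.nn as nn

class Focus(nn.Module):
    def __init__(self, c_in=3, c_out=32, k=3):
        super().__init__()
        self.conv = nn.Conv2d(c_in * 4, c_out, k, padding=k // 2)

    def forward(self, x):
        # slice: every other pixel in both spatial directions, 4 sub-images
        # concatenated along channels (3 -> 12), halving height and width
        x = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                       x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
        return self.conv(x)

x = torch.randn(1, 3, 608, 608)
print(Focus()(x).shape)   # torch.Size([1, 32, 304, 304])
```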
(2) The CSP structure:
The CSP module divides the feature map of the base layer into two parts and then merges them through a cross-stage hierarchical structure, which reduces the amount of computation while maintaining accuracy, addressing the heavy computation of the inference process from the perspective of network-structure design.
Function: the learning ability of the CNN is enhanced, so accuracy is maintained while the model is lightened; the computational bottleneck is reduced;
the memory cost is reduced.
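A simplified sketch of the cross-stage split-and-merge idea follows; the channel split via 1×1 convolutions and the plain Conv+SiLU bottleneck stack are assumptions standing in for YOLOv5's full CSP/C3 block, not a faithful reproduction of it.

```python
# Hedged sketch: split the feature map into two channel halves, process one
# through a bottleneck stack, then merge across stages with a 1x1 conv.
import torch
import torch.nn as nn

class CSPBlock(nn.Module):
    def __init__(self, c, n=1):
        super().__init__()
        self.part1 = nn.Conv2d(c, c // 2, 1)                 # shortcut branch
        self.part2 = nn.Sequential(
            nn.Conv2d(c, c // 2, 1),
            *[nn.Sequential(nn.Conv2d(c // 2, c // 2, 3, padding=1),
                            nn.SiLU()) for _ in range(n)])   # main branch
        self.merge = nn.Conv2d(c, c, 1)                      # cross-stage merge

    def forward(self, x):
        return self.merge(torch.cat([self.part1(x), self.part2(x)], dim=1))
```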
(3) The SPP module:
In the SPP module, maximum pooling with k = {1×1, 5×5, 9×9, 13×13} is used, and a Concat operation is then performed on the feature maps of different scales.
Function: the receptive range of the backbone features is increased more effectively, clearly separating the most important contextual features.
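A minimal PyTorch sketch of this SPP module follows; since the 1×1 pooling branch is the identity, the input itself is concatenated with the 5×5, 9×9, and 13×13 pooled maps, and the channel counts are illustrative assumptions.

```python
# Hedged sketch: parallel max-pooling at several kernel sizes plus Concat.
import torch
import torch.nn as nn

class SPP(nn.Module):
    def __init__(self, c_in=512, c_out=512, kernels=(5, 9, 13)):
        super().__init__()
        # stride 1 with k//2 padding keeps the spatial size at every scale
        self.pools = nn.ModuleList(
            nn.MaxPool2d(k, stride=1, padding=k // 2) for k in kernels)
        self.fuse = nn.Conv2d(c_in * (len(kernels) + 1), c_out, 1)

    def forward(self, x):
        # the k=1 branch is x itself; then Concat along channels and fuse
        return self.fuse(torch.cat([x] + [p(x) for p in self.pools], dim=1))
```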
Neck:
The FPN+PAN structure:
The FPN is top-down: the feature information of the high layers is transmitted and fused by up-sampling to obtain the predicted feature map. YOLOv5 adds a bottom-up feature pyramid behind the FPN layer, which contains two PAN structures.
In this way the FPN layer conveys strong semantic features from top to bottom while the feature pyramid conveys strong localization features from bottom to top, and different detection layers aggregate features from different backbone layers, further improving the feature-extraction capability.
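The heavily simplified two-level sketch below shows only the direction of the two information flows: one top-down fusion by up-sampling (FPN) and one bottom-up fusion by down-sampling (PAN). The uniform channel count and the plain 1×1 fusion convolutions are assumptions; the real Neck stacks several such exchanges interleaved with CSP blocks.

```python
# Hedged sketch of one FPN (top-down) and one PAN (bottom-up) exchange.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FPNPAN(nn.Module):
    def __init__(self, c=256):
        super().__init__()
        self.td = nn.Conv2d(c * 2, c, 1)                   # top-down fusion
        self.bu = nn.Conv2d(c * 2, c, 1)                   # bottom-up fusion
        self.down = nn.Conv2d(c, c, 3, stride=2, padding=1)

    def forward(self, p_low, p_high):
        # FPN: upsample the high-level (semantic) map and fuse with the low one
        up = F.interpolate(p_high, scale_factor=2, mode="nearest")
        fpn_low = self.td(torch.cat([p_low, up], dim=1))
        # PAN: downsample the fused low-level (localizing) map back upward
        pan_high = self.bu(torch.cat([p_high, self.down(fpn_low)], dim=1))
        return fpn_low, pan_high

# usage: p_low at 2x the spatial size of p_high, same channel count
p3, p4 = torch.randn(1, 256, 64, 64), torch.randn(1, 256, 32, 32)
fpn_p3, pan_p4 = FPNPAN()(p3, p4)
```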
CIOU_Loss:
YOLOv5 uses CIOU_Loss as the loss function of the bounding box.
The CIOU_Loss calculation formula:
$$\mathrm{CIOU\_Loss} = 1 - \mathrm{IoU} + \frac{\rho^{2}\left(b,\, b^{gt}\right)}{c^{2}} + \alpha v, \quad v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}, \quad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}$$
Function: the scale information of the bounding-box aspect ratio is taken into account, improving both the speed and the accuracy of identification.
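A minimal PyTorch sketch of CIOU_Loss for centre-format boxes, directly following the formula above; the epsilon guards and the mean reduction are implementation assumptions.

```python
# Hedged sketch: CIoU loss = 1 - IoU + centre-distance term + aspect term.
import math
import torch

def ciou_loss(pred, target, eps=1e-7):
    """pred, target: (N, 4) tensors of (cx, cy, w, h) with positive w, h."""
    px1, py1 = pred[:, 0] - pred[:, 2] / 2, pred[:, 1] - pred[:, 3] / 2
    px2, py2 = pred[:, 0] + pred[:, 2] / 2, pred[:, 1] + pred[:, 3] / 2
    tx1, ty1 = target[:, 0] - target[:, 2] / 2, target[:, 1] - target[:, 3] / 2
    tx2, ty2 = target[:, 0] + target[:, 2] / 2, target[:, 1] + target[:, 3] / 2
    inter = (torch.min(px2, tx2) - torch.max(px1, tx1)).clamp(0) * \
            (torch.min(py2, ty2) - torch.max(py1, ty1)).clamp(0)
    union = pred[:, 2] * pred[:, 3] + target[:, 2] * target[:, 3] - inter + eps
    iou = inter / union
    # squared centre distance over squared diagonal of the enclosing box
    cw = torch.max(px2, tx2) - torch.min(px1, tx1)
    ch = torch.max(py2, ty2) - torch.min(py1, ty1)
    rho2 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    c2 = cw ** 2 + ch ** 2 + eps
    # aspect-ratio consistency term v and its trade-off weight alpha
    v = (4 / math.pi ** 2) * (torch.atan(target[:, 2] / target[:, 3]) -
                              torch.atan(pred[:, 2] / pred[:, 3])) ** 2
    alpha = v / (1 - iou + v + eps)
    return (1 - iou + rho2 / c2 + alpha * v).mean()
```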
The above description is only a preferred embodiment of the present invention and does not limit the present invention in any other form; any modification or equivalent variation made according to the technical essence of the present invention falls within the scope of protection claimed by the present invention.

Claims (4)

1. A straw identification method based on high-throughput graph computation, characterized by comprising the following steps:
S1: establishing a straw-recognition training gallery management system,
fusing the You Only Look Once v5 (YOLOv5) algorithm of the You Only Look Once series: a single convolutional neural network (CNN) is applied to the whole image, the image is divided into grids, and the network predicts, for each grid, a bounding box and the probability corresponding to each class;
S2: collecting and inputting a large number of straw training pictures,
the YOLOv5 network structure being divided into four parts: the input end, the Backbone, the Neck, and the output end,
the input end comprising Mosaic data enhancement, adaptive anchor-box calculation, and adaptive picture scaling,
the Mosaic data enhancement: the input end adopts Mosaic data enhancement, splicing 4 pictures by random scaling, random cropping, and random arrangement;
the adaptive anchor-box calculation: during network training, the network outputs prediction boxes on the basis of the initial anchor boxes, then compares them with the ground-truth boxes, computes the difference between the two, and updates the network parameters iteratively by back-propagation; this function is embedded into the YOLOv5 algorithm within the high-throughput graph computation, and the optimal anchor-box values for different training sets are computed adaptively at each training run;
the adaptive picture scaling: in the target detection algorithm, pictures differ in length and width, so the original pictures are uniformly scaled to a standard size and then fed into the detection network;
the Backbone comprising the Focus structure, the CSP structure, and the SPP module:
the Focus structure: the scaled, cropped, and arranged pictures are sliced, the specific operation being to take a value at every other pixel of a picture; in the YOLOv5s algorithm, an original 608×608×3 image fed into the Focus structure first becomes a 304×304×12 feature map through the slicing operation and, after one convolution with 32 convolution kernels, finally becomes a 304×304×32 feature map;
the CSP structure: the CSP module divides the feature map of the base layer into two parts and then merges them through a cross-stage hierarchical structure;
the SPP module: maximum pooling with k = {1×1, 5×5, 9×9, 13×13} is used, and a Concat operation is then performed on the feature maps of different scales;
the Neck comprising an FPN+PAN structure:
the FPN+PAN structure: the FPN is top-down and transmits and fuses high-level feature information by up-sampling to obtain the predicted feature map; the YOLOv5 algorithm adds a bottom-up feature pyramid behind the FPN layer, so that the FPN layer conveys strong semantic features from top to bottom while the feature pyramid conveys strong localization features from bottom to top;
the output end comprising the bounding-box loss CIOU_Loss:
the CIOU_Loss: the YOLOv5 algorithm uses CIOU_Loss as the loss function of the bounding box,
the CIOU_Loss calculation formula being:
$$\mathrm{CIOU\_Loss} = 1 - \mathrm{IoU} + \frac{\rho^{2}\left(b,\, b^{gt}\right)}{c^{2}} + \alpha v$$
$$v = \frac{4}{\pi^{2}}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^{2}, \qquad \alpha = \frac{v}{(1 - \mathrm{IoU}) + v}$$
S3: the straw-recognition training gallery rapidly building a training model through the high-throughput graph computation technology,
the straw-recognition training gallery being established through the YOLOv5 algorithm, and the training model being built rapidly from it through the high-throughput graph computation technology;
S4: the applet user shooting an actual straw scene picture and uploading it to the application platform,
the user shooting the actual straw scene picture through a mobile terminal and uploading it to the straw-recognition training gallery in the application platform;
S5: the application platform performing low-latency, high-accuracy identification through the high-throughput graph computation technology,
the actual straw scene pictures being uploaded to the straw-recognition training gallery in the application platform, after which the application platform identifies the uploaded pictures through the training model.
2. The straw identification method based on high-throughput graph computation according to claim 1, characterized in that: in step S2, the 4 pictures used for Mosaic data enhancement are all actual straw scene pictures.
3. The straw identification method based on high-throughput graph computation according to claim 1, characterized in that: in step S2, during adaptive picture scaling, a minimal black border is adaptively added to the original picture.
4. The straw identification method based on high-throughput graph computation according to claim 1, characterized in that: in step S2, the feature pyramid in the FPN+PAN structure comprises two PAN structures.
CN202310105390.8A 2022-11-02 2023-02-13 Straw identification method based on high-throughput graph computation Pending CN116051902A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2022113601862 2022-11-02
CN202211360186 2022-11-02

Publications (1)

Publication Number Publication Date
CN116051902A true CN116051902A (en) 2023-05-02

Family

ID=86118112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310105390.8A Pending CN116051902A (en) 2022-11-02 2023-02-13 Straw identification method based on high-throughput graph computation

Country Status (1)

Country Link
CN (1) CN116051902A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination