CN114066920A - Harvester visual navigation method and system based on improved Segnet image segmentation - Google Patents
- Publication number
- CN114066920A (application number CN202111394892.4A)
- Authority
- CN
- China
- Prior art keywords
- image
- segnet
- model
- improved
- harvester
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000000034 method Methods 0.000 title claims abstract description 26
- 238000003709 image segmentation Methods 0.000 title claims abstract description 22
- 230000000007 visual effect Effects 0.000 title claims abstract description 15
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 20
- 230000011218 segmentation Effects 0.000 claims abstract description 13
- 238000003708 edge detection Methods 0.000 claims abstract description 10
- 238000007781 pre-processing Methods 0.000 claims abstract description 8
- 238000010586 diagram Methods 0.000 claims description 13
- 238000012549 training Methods 0.000 claims description 13
- 230000006870 function Effects 0.000 claims description 12
- 238000012545 processing Methods 0.000 claims description 12
- 238000003860 storage Methods 0.000 claims description 7
- 238000004590 computer program Methods 0.000 claims description 6
- 238000010276 construction Methods 0.000 claims description 6
- 238000011478 gradient descent method Methods 0.000 claims description 6
- 238000012360 testing method Methods 0.000 claims description 5
- 230000001186 cumulative effect Effects 0.000 claims description 4
- 238000005070 sampling Methods 0.000 claims description 4
- 230000006872 improvement Effects 0.000 claims description 3
- 238000003306 harvesting Methods 0.000 description 7
- 230000008901 benefit Effects 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 240000007594 Oryza sativa Species 0.000 description 2
- 235000007164 Oryza sativa Nutrition 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 230000008569 process Effects 0.000 description 2
- 235000009566 rice Nutrition 0.000 description 2
- 240000006394 Sorghum bicolor Species 0.000 description 1
- 235000011684 Sorghum saccharatum Nutrition 0.000 description 1
- 230000001133 acceleration Effects 0.000 description 1
- 238000012271 agricultural production Methods 0.000 description 1
- 230000002146 bilateral effect Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000005286 illumination Methods 0.000 description 1
- 238000010191 image analysis Methods 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/13—Edge detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/12—Edge-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10052—Images from lightfield camera
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20104—Interactive definition of region of interest [ROI]
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a harvester visual navigation method and system based on improved Segnet image segmentation, comprising the following steps: (1) collecting field crop images to be segmented and preprocessing them; (2) performing semantic segmentation on the acquired field crop images with an improved Segnet model to generate a target feature map, the improved Segnet model using a ShuffleNet V2 network as the encoder part of the Segnet model; (3) acquiring boundary pixel points and position information of the target in the feature map with an edge detection algorithm; (4) feeding these into the progressive probabilistic Hough transform (PPHT) algorithm and, by adjusting the threshold, minLineLength and maxLineGap parameters, finally outputting a straight line segment as the target straight-line path for navigation. The invention can rapidly perform semantic segmentation on the acquired image and, from the segmented feature map, compute and screen out a suitable path for harvester navigation.
Description
Technical Field
The invention relates to the technical fields of deep learning, computer vision and image analysis, and in particular to a harvester visual navigation method and system based on improved Segnet image segmentation.
Background
At present, technology that segments images using deep learning and extracts image information from the segmented result has been applied in agricultural production. The invention application with patent application number CN202110287975.7 uses a Segnet network model to segment collected images of lodged sorghum, adopting a MobileNet network in the Segnet encoding stage to lighten the network and thereby achieve fast segmentation. However, such a segmentation rate is still insufficient for a harvester moving dynamically in real time while harvesting crops: the acquired images must be segmented even faster, and a suitable navigation path must be computed and screened out from the segmented images.
Disclosure of Invention
Purpose of the invention: to address the above defects, the invention provides a harvester visual navigation method based on improved Segnet image segmentation, which rapidly performs semantic segmentation on the images acquired by the harvester and, from the segmented feature map, computes and screens out a suitable harvesting path for harvester navigation. The invention also provides a harvester visual navigation system based on improved Segnet image segmentation, through which a suitable harvesting path can be obtained in real time for harvester navigation.
Technical scheme: to solve the above problems, the invention provides a harvester visual navigation method based on improved Segnet image segmentation, comprising the following steps:
(1) collecting field crop images to be segmented for preprocessing;
(2) performing semantic segmentation on the acquired field crop image with an improved Segnet model to generate a target feature map; the improved Segnet model uses a ShuffleNet V2 network as the encoder part of the Segnet model;
(3) acquiring boundary pixel points and position information of a target in the feature map by using an edge detection algorithm;
(4) using the progressive probabilistic Hough transform (PPHT) algorithm and adjusting the threshold, minLineLength and maxLineGap parameters to finally output a straight line segment as the target straight-line path for navigation.
Beneficial effects: compared with the prior art, the invention has the following notable advantages. The encoder part of the Segnet network adopts a ShuffleNet V2 network, which simplifies the network mainly through depthwise separable convolutions, markedly improving computational efficiency at the cost of a small loss of precision. This lightens the Segnet network and speeds up the segmentation of field crop images. Boundary pixel points and position information are then obtained by detecting the edges of the segmented image, and a suitable straight-line path is selected with the PPHT algorithm for harvester navigation.
Further, the step (2) of generating the target feature map with the improved Segnet model specifically comprises:
(2.1) acquiring images of field crops, setting a region of interest for labeling, and dividing the labeled data set into a training set and a test set; performing image enhancement on the labeled images in the training set;
(2.2) performing iterative model training by adjusting the weight coefficient vector ω and the learning rate α of the improved Segnet model, and selecting the model with the largest mean intersection over union (MIoU) as the optimal model:

MIoU = \frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}

where k denotes the number of classes, and k+1 the number of classes including the empty class; p_{ij} denotes the number of false positive samples, p_{ji} the number of false negative samples, and p_{ii} the number of true samples;
(2.3) performing image segmentation on the field crop image to be segmented with the optimal model to obtain the target feature map.
Further, the ShuffleNet V2 network in step (2) is further improved; the construction of the basic unit in the ShuffleNet V2 network specifically comprises the following steps:
(a) dividing the input into two groups of feature maps along the channel dimension via a Channel Split operation, and passing the two groups into two branches;
(b) in one branch, the feature map first passes through a 1×1 convolution, is then fed into an ASPP structure based on depthwise separable convolution for convolution, the outputs are concatenated, and the result passes through another 1×1 convolution before being output; the ASPP structure consists of three depthwise separable convolutional layer branches of different sizes;
(c) concatenating the feature map output by this branch with the feature map output by the other branch, and outputting the result after a Channel Shuffle operation.
The ASPP structure based on depthwise separable convolution replaces the single 3×3 convolution kernel of the lightweight ShuffleNet V2 network to enlarge the receptive field, thereby capturing multi-scale spatial information and improving model performance.
Further, the ShuffleNet V2 network in step (2) is further improved; the construction of the spatial down-sampling unit in the ShuffleNet V2 network specifically comprises the following steps:
(d) copying the feature map output by the preceding layer as input to two branches; one branch passes it through a 3×3 depthwise separable convolutional layer followed by a 1×1 convolution and outputs the result; the other branch first applies a 1×1 convolution, feeds the result into the ASPP structure based on depthwise separable convolution, concatenates the outputs, and applies another 1×1 convolution to output its feature map;
(e) concatenating the feature maps output by the two branches, and outputting the result after a Channel Shuffle operation.
Further, the loss function of the improved Segnet model is optimized with stochastic gradient descent (SGD) and L1 regularization. The SGD update formula is:

\omega \leftarrow \omega - \alpha \nabla_{\omega} J(\omega)

where ω denotes the weight coefficient vector, α the learning rate, and J(ω) the loss function.
The L1 regularization formula is:

\tilde{J}(\omega; X, y) = J(\omega; X, y) + \lambda \Omega(\omega), \qquad \Omega(\omega) = \lVert \omega \rVert_1

where X is a training sample, y is the label corresponding to X, Ω(ω) is the penalty term, and \tilde{J} is the objective function.
Further, the three depthwise separable convolutional layers of different sizes in the ASPP structure are 3×3, 5×5 and 7×7, respectively.
Further, the stride of the depthwise separable convolutional layers in the ASPP structure of the basic unit is set to 1, and the stride of the depthwise separable convolutional layers in the ASPP structure of the down-sampling unit is set to 2.
The invention also provides a harvester vision navigation system based on improved Segnet image segmentation, which comprises:
the image acquisition module is used for acquiring a field crop image to be segmented for preprocessing;
the image processing module is used for performing semantic segmentation on the acquired field crop image with the improved Segnet model to generate a target feature map, the improved Segnet model using a ShuffleNet V2 network as the encoder part of the Segnet model, and for acquiring boundary pixel points and position information of the target in the feature map with an edge detection algorithm;
and the path decision unit is used for taking the acquired boundary pixel points and position information as input to the progressive probabilistic Hough transform (PPHT) algorithm and, by adjusting the threshold, minLineLength and maxLineGap parameters, finally outputting a straight line segment as the target straight-line path for navigation.
Beneficial effects: compared with the prior art, the system has the notable advantage that a suitable harvesting path can be acquired in real time for harvester navigation.
Furthermore, a computer-readable storage medium comprising a stored computer program is provided, wherein the computer program, when executed, controls a device in which the computer-readable storage medium is located to perform the above method.
Furthermore, an apparatus comprising a memory, a processor and a program stored on the memory and executable thereon is provided, the program implementing the steps of the above method when executed by the processor.
Drawings
Fig. 1 is a flowchart of a harvester visual navigation method based on improved Segnet image segmentation according to the present invention;
FIG. 2 is a flowchart of obtaining the target feature map using the improved Segnet model according to the present invention;
FIG. 3 is a structural diagram of the further improved ShuffleNet V2 basic unit and down-sampling unit according to the present invention;
FIG. 4 is a diagram illustrating the acceleration process of the TensorRT inference accelerator according to the present invention;
fig. 5 is a composition diagram of a harvester visual navigation system based on improved Segnet image segmentation according to the present invention;
FIG. 6 is a graph showing the effect of rice boundary detection according to the present invention.
Detailed Description
The invention will be further described with reference to the accompanying drawings.
As shown in fig. 1, the harvester visual navigation method based on improved Segnet image segmentation according to the present invention includes the following steps:
(1) acquiring target data and preprocessing the target data;
A Logitech C1000e high-definition monocular camera is selected, with the real-time frame rate set to 60 f/s, to collect field crop images in real time. In the industrial camera's continuous-capture mode, the field images are resized and a region of interest is set, preserving rich detail while compressing the image size to improve the real-time performance of the algorithm. Meanwhile, to reduce the influence of field illumination and false edge detections, preprocessing such as graying and bilateral filtering is applied to the images;
(2) performing semantic segmentation on the preprocessed image data by using an improved Segnet model, and extracting features to generate a target feature map;
the Segnet network consists of two parts, namely an encoder and a decoder, the Segnet encoder part adopts the first 13 layers of the VGG network, each layer contains convolution + BN (batch normalization) + ReLU, and downsampling processing is carried out. And each layer of the decoder and the encoder corresponds to each other one by one and is used for up-sampling recovered pixels, and finally, the output layer uses a softmax function to classify and outputs the classification probability of each pixel. However, when the embedded platform deploys the Segnet network, the parameter quantity of the network is too large, the required calculated quantity is also high, and the real-time requirement of the harvester for harvesting crops is difficult to meet, so that an improved Segnet model is formed by using the ShuffleNet V2 network as an encoder part in the Segnet model;
the ShuffleNet V2 network mainly utilizes the depth separable convolution to simplify the network, obviously improves the calculation efficiency under the condition of sacrificing a small amount of precision, and can be well used as the basis of a light-weight image semantic segmentation model. To achieve better processing speed, the ShuffleNet V2 network was further improved. As shown in fig. 3(a), the construction of the basic unit in the ShuffleNetV2 network specifically includes the following steps:
(a) dividing the input into two groups of feature maps according to the number of channels through Channel Split operation, and then respectively transmitting the two groups of feature maps into two branches;
(b) in one branch, the feature map first passes through a 1×1 convolution, is then fed into an ASPP structure based on depthwise separable convolution for convolution, the outputs are concatenated, and the result passes through another 1×1 convolution before being output; the ASPP structure consists of three depthwise separable convolutional layer branches of sizes 3×3, 5×5 and 7×7, each with stride set to 1. This ASPP (atrous spatial pyramid pooling) structure based on depthwise separable convolution replaces the single 3×3 convolution kernel of the original ShuffleNet V2 model; the modification enlarges the receptive field of the convolution operation so that multi-scale spatial information can be captured;
(c) concatenating the feature maps output by the two branches, and outputting the result after a Channel Shuffle operation.
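The Channel Split of step (a) and the Channel Shuffle of step (c) can be illustrated with a small NumPy sketch (channel-last layout and a group count of 2 are assumed here for illustration; this is not the patent's implementation):

```python
import numpy as np

def channel_split(x):
    """Split an (H, W, C) feature map into two halves along the channel axis."""
    c = x.shape[-1] // 2
    return x[..., :c], x[..., c:]

def channel_shuffle(x, groups=2):
    """ShuffleNet channel shuffle: interleave channels across groups so that
    information mixes between the two branches after concatenation."""
    h, w, c = x.shape
    x = x.reshape(h, w, groups, c // groups)   # (H, W, g, C/g)
    x = x.transpose(0, 1, 3, 2)                # swap group and channel axes
    return x.reshape(h, w, c)
```

On a 4-channel map, channels [0, 1, 2, 3] are split into [0, 1] and [2, 3]; after concatenation and shuffling the order becomes [0, 2, 1, 3], so each half of the next split sees channels from both branches.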
As shown in FIG. 3(b), the construction of the spatial down-sampling unit in the ShuffleNet V2 network specifically comprises the following steps:
(d) copying the feature map output by the preceding layer as input to two branches; one branch passes it through a 3×3 depthwise separable convolutional layer followed by a 1×1 convolution and outputs the result; the other branch first applies a 1×1 convolution, feeds the result into the ASPP structure based on depthwise separable convolution, concatenates the outputs, and applies another 1×1 convolution to output its feature map; the ASPP structure consists of three depthwise separable convolutional layer branches of sizes 3×3, 5×5 and 7×7, each with stride set to 2;
(e) concatenating the feature maps output by the two branches, and outputting the result after a Channel Shuffle operation.
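The parameter saving from the depthwise separable convolutions used throughout these units can be checked with a quick count; the 116-channel branch width below is an assumed example typical of ShuffleNet V2, not a value from the patent:

```python
def conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Depthwise k x k (one filter per input channel) plus pointwise 1 x 1."""
    return k * k * c_in + c_in * c_out

# A 3x3 layer with 116 input and 116 output channels:
std = conv_params(3, 116, 116)       # 121,104 parameters
sep = separable_params(3, 116, 116)  # 14,500 parameters, roughly 8x fewer
```

This roughly 8x reduction is what lets the ASPP branches afford three kernel sizes (3×3, 5×5, 7×7) where the original unit had a single 3×3 convolution.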
In addition, stochastic gradient descent (SGD) and L1 regularization can be adopted to optimize the loss function of the improved Segnet model. The SGD update formula is:

\omega \leftarrow \omega - \alpha \nabla_{\omega} J(\omega)

where ω denotes the weight coefficient vector, α the learning rate, and J(ω) the loss function.
The L1 regularization formula is:

\tilde{J}(\omega; X, y) = J(\omega; X, y) + \lambda \Omega(\omega), \qquad \Omega(\omega) = \lVert \omega \rVert_1

where X is a training sample, y is the label corresponding to X, Ω(ω) is the penalty term, and \tilde{J} is the objective function;
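A toy NumPy sketch of one such regularized update, using sign(ω) as the subgradient of the L1 penalty (the step size and λ below are illustrative assumptions, not the patent's training settings):

```python
import numpy as np

def sgd_l1_step(w, grad_J, alpha=0.01, lam=1e-4):
    """One SGD update on the L1-regularized objective
    J~(w) = J(w) + lam * ||w||_1, with d|w|/dw taken as sign(w)."""
    return w - alpha * (grad_J + lam * np.sign(w))

# Toy usage: for the quadratic loss J(w) = 0.5 * ||w||^2, grad_J = w,
# so repeated updates drive the weights toward zero.
w = np.array([1.0, -2.0, 0.5])
for _ in range(100):
    w = sgd_l1_step(w, w, alpha=0.1, lam=0.01)
```

The sign(ω) term is what gives L1 its sparsity-inducing pull toward zero, which is why it is paired here with a lightweight model.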
as shown in fig. 2, the specific steps of processing the image and generating the target feature map using the improved Segnet model include:
(2.1) finely labeling the region of interest in the collected farmland data set (2000 images in total) with the labelme annotation tool, and dividing the labeled data set into a training set and a test set at a ratio of 7:3; the labeled data are then converted into the TFRecord format for convenient reading by TensorFlow. In the TensorFlow environment, geometric transformations such as flipping, rotation, scaling and shifting are applied to each image frame for data enhancement, reducing network overfitting and improving the training effect.
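The geometric enhancement described above can be sketched with NumPy stand-ins for the TensorFlow ops (flips and right-angle rotations only; the scaling and shifting transforms mentioned in the text are omitted from this sketch):

```python
import numpy as np

def augment(img, label):
    """Yield flipped/rotated copies of an image and its label mask together,
    so that pixel-level annotations stay aligned with the pixels."""
    for k in range(4):                         # 0/90/180/270 degree rotations
        r_img, r_lab = np.rot90(img, k), np.rot90(label, k)
        yield r_img, r_lab                     # rotated copy
        yield np.fliplr(r_img), np.fliplr(r_lab)  # plus its mirror
```

Each source frame thus yields eight training pairs; applying the identical transform to image and mask is the essential point for segmentation data.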
(2.2) performing 30,000 to 50,000 iterations of training by adjusting the weight coefficient vector ω and the learning rate α of the improved Segnet model to obtain training results with different MIoU values, and finally selecting the model with the largest mean intersection over union (MIoU) as the optimal model, where the MIoU formula is:

MIoU = \frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}

where k denotes the number of classes, and k+1 the number of classes including the empty class; p_{ij} denotes the number of false positive samples, p_{ji} the number of false negative samples, and p_{ii} the number of true samples;
(2.3) performing image segmentation on the test set samples with the optimal model to obtain target feature maps; the optimal model can likewise segment the field crop images acquired in step (1) to obtain the target feature map;
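The MIoU criterion used in step (2.2) can be computed from a confusion matrix as follows (a sketch, assuming conf[i, j] counts pixels of true class i predicted as class j, so the diagonal holds the true samples p_ii):

```python
import numpy as np

def mean_iou(conf):
    """MIoU from a (k+1) x (k+1) confusion matrix.

    Per class i: IoU = p_ii / (row_sum + col_sum - p_ii), where the row sum
    adds the false negatives and the column sum the false positives.
    """
    inter = np.diag(conf).astype(float)
    union = conf.sum(axis=1) + conf.sum(axis=0) - inter
    return float(np.mean(inter / np.maximum(union, 1)))  # guard empty classes
```

For a two-class matrix [[3, 1], [1, 3]], each class has IoU 3/(4+4-3) = 0.6, so MIoU is 0.6.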
in order to increase the reasoning speed of the Segnet model, the obtained optimal model can be deployed in the embedded terminal, and operations such as merging layers, reducing the calculation precision, parallel optimization and the like are performed on the optimal model deployed in the embedded terminal by using the TensorRT (reasoning accelerator) of NVIDIA. As shown in FIG. 4, the method can perform operations such as merging layer, precision calibration, dynamic storage, kernel automatic adjustment, parallel optimization and the like on a trained network, so that the inference speed of a model is improved.
(3) For the generated target feature map, combining the characteristics of the combine harvester's edge-harvesting operation, the boundary pixel points and position information of the target in the feature map can be obtained with an edge detection algorithm. Specifically, they can be obtained by applying the Sobel operator edge detection algorithm followed by the Canny operator edge detection algorithm, the superposition of the two enhancing the detection effect.
(4) The obtained boundary pixel points and position information are used as input to the progressive probabilistic Hough transform (PPHT) algorithm: the HoughLinesP() function in OpenCV is called to obtain several straight line segments, and its parameters are tuned as follows: threshold (the minimum number of accumulator votes required to detect a line), adjusted within 150-200 pixel points; minLineLength (the minimum length, in pixels, of a detected segment), adjusted within 80-120 pixel points; and maxLineGap (the maximum gap, in pixels, allowed between collinear segments for them to be merged), adjusted within 0-10 pixel points. The PPHT algorithm thus finally outputs a straight line segment as the target straight-line path for navigation.
FIG. 6 shows the target path of the rice harvest boundary finally output by the above method. Combined with the size and heading information of the harvester, the harvester edge is kept from crossing the boundary while the harvester travels along the acquired straight boundary path.
In addition, the invention also provides a harvester visual navigation system based on the improved Segnet image segmentation, which comprises:
the image acquisition module is used for acquiring a field crop image to be segmented for preprocessing;
the image processing module is used for performing semantic segmentation on the acquired field crop image with the improved Segnet model to generate a target feature map, the improved Segnet model using a ShuffleNet V2 network as the encoder part of the Segnet model, and for acquiring boundary pixel points and position information of the target in the feature map with an edge detection algorithm;
and the path decision unit is used for taking the acquired boundary pixel points and position information as input to the progressive probabilistic Hough transform (PPHT) algorithm and, by adjusting the threshold, minLineLength and maxLineGap parameters, finally outputting a straight line segment as the target straight-line path for navigation.
As shown in FIG. 5, the image acquisition module adopts a monocular high-definition camera, and a high-performance vision computer contains the image processing module and the path decision unit. The monocular high-definition camera and the high-performance vision computer are mounted on the harvester, and the images acquired by the camera are transmitted to the vision computer for image processing and path decision.
Furthermore, a computer-readable storage medium comprising a stored computer program is provided, wherein the computer program, when executed, controls a device in which the computer-readable storage medium is located to perform the above method.
Furthermore, an apparatus comprising a memory, a processor and a program stored on the memory and executable thereon is provided, the program implementing the steps of the above method when executed by the processor.
Claims (10)
1. A harvester visual navigation method based on improved Segnet image segmentation is characterized by comprising the following steps:
(1) collecting field crop images to be segmented for preprocessing;
(2) performing semantic segmentation on the acquired field crop image with an improved Segnet model to generate a target feature map; the improved Segnet model uses a ShuffleNet V2 network as the encoder part of the Segnet model;
(3) acquiring boundary pixel points and position information of the target in the target feature map with an edge detection algorithm;
(4) using the progressive probabilistic Hough transform (PPHT) algorithm and adjusting the threshold, minLineLength and maxLineGap parameters to finally output a straight line segment as the target straight-line path for navigation.
2. The harvester visual navigation method based on improved Segnet image segmentation according to claim 1, wherein the step (2) of generating the target feature map with the improved Segnet model specifically comprises:
(2.1) acquiring images of field crops, setting a region of interest for labeling, and dividing the labeled data set into a training set and a test set; performing image enhancement on the labeled images in the training set;
(2.2) performing iterative model training by adjusting the weight coefficient vector ω and the learning rate α of the improved Segnet model, and selecting the model with the largest mean intersection over union (MIoU) as the optimal model:

MIoU = \frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}

where k denotes the number of classes, and k+1 the number of classes including the empty class; p_{ij} denotes the number of false positive samples, p_{ji} the number of false negative samples, and p_{ii} the number of true samples;
and (2.3) carrying out image segmentation on the field crop image to be segmented by using the optimal model to obtain a target characteristic diagram.
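The MIoU criterion of step (2.2) can be computed from a per-pixel confusion matrix; a minimal NumPy sketch (function and array names are illustrative, not from the patent):

```python
import numpy as np


def mean_iou(pred, label, num_classes):
    """Mean intersection over union over num_classes (i.e. k+1) categories.

    conf[i, j] counts pixels of true class i predicted as class j, so the
    diagonal conf[i, i] holds the true samples p_ii, and the off-diagonal
    row/column sums supply the false-negative and false-positive terms of
    the MIoU formula.
    """
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(label.ravel(), pred.ravel()):
        conf[t, p] += 1
    tp = np.diag(conf)
    # union for class i: predicted-as-i + labelled-as-i - intersection
    denom = conf.sum(axis=1) + conf.sum(axis=0) - tp
    iou = tp / np.maximum(denom, 1)
    return iou.mean()
```

During training, the checkpoint with the highest `mean_iou` on the test set would be retained as the optimal model.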
3. The harvester visual navigation method based on improved Segnet image segmentation as claimed in claim 1, wherein the ShuffleNet V2 network in step (2) is further improved, and the construction of the basic unit in the ShuffleNet V2 network specifically comprises the following steps:
(a) dividing the input into two groups of feature maps according to the number of channels through the Channel Split operation, and then feeding the two groups of feature maps into two branches respectively;
(b) subjecting the feature map fed into one branch to a 1 × 1 convolution, inputting it into an ASPP structure based on depth separable convolution for convolution operations, concatenating the outputs, and outputting the feature map after a further 1 × 1 convolution; the ASPP structure consists of three depth separable convolutional layer branches of different sizes;
(c) concatenating the feature map output by the one branch with the feature map output by the other branch, and producing the final output after a Channel Shuffle operation.
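The Channel Split and Channel Shuffle operations of steps (a) and (c) are pure tensor reshufflings and can be sketched framework-free in NumPy (NCHW layout; function names are illustrative):

```python
import numpy as np


def channel_split(x):
    """Step (a): divide the input into two groups along the channel axis."""
    c = x.shape[1]
    return x[:, : c // 2], x[:, c // 2:]


def channel_shuffle(x, groups=2):
    """Step (c): interleave channels across groups so that information
    mixes between the two branches after concatenation."""
    n, c, h, w = x.shape
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)    # swap group and per-group channel axes
             .reshape(n, c, h, w))
```

After concatenating the two branch outputs, the shuffle ensures the next unit's split does not always see the same half of the channels, which is the key trick that keeps the split/concat design expressive at low cost.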
4. The harvester visual navigation method based on improved Segnet image segmentation as claimed in claim 3, wherein the ShuffleNet V2 network in step (2) is further improved, and the construction of the spatial down-sampling unit in the ShuffleNet V2 network specifically comprises the following steps:
(d) copying the feature map output by the preceding layer as the input to two branches, wherein one branch outputs a feature map after sequentially passing through a 3 × 3 depth separable convolutional layer and a 1 × 1 convolution operation; the other branch is first subjected to a 1 × 1 convolution, then input into an ASPP structure based on depth separable convolution for convolution operations, after which the outputs are concatenated and a feature map is output after a 1 × 1 convolution operation;
(e) concatenating the feature map output by the one branch with the feature map output by the other branch, and producing the final output after a Channel Shuffle operation.
5. The harvester visual navigation method based on improved Segnet image segmentation as claimed in claim 1, further comprising optimizing the loss function of the improved Segnet model using the stochastic gradient descent (SGD) method and L1 regularization; the formula of the SGD method is as follows:

ω = ω − α · ∇J(ω)

where ω denotes the weight coefficient vector, α denotes the learning rate, and J(ω) denotes the loss function;

the L1 regularization formula is as follows:

J(ω) = J₀(ω) + λ Σ |ω|

where J₀(ω) denotes the original loss function and λ denotes the regularization coefficient.
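The SGD update with an L1 penalty from claim 5 can be sketched in plain NumPy on a least-squares toy loss (the data, λ and α values are illustrative assumptions, not the patent's training setup):

```python
import numpy as np


def sgd_l1(X, y, alpha=0.05, lam=0.01, epochs=200, seed=0):
    """Minimise J(w) = 0.5 * (x_i @ w - y_i)^2 + lam * ||w||_1 by SGD.

    Each step applies w <- w - alpha * dJ/dw on a single sample; the L1
    term contributes lam * sign(w) to the gradient, which drives weights
    that carry no signal toward exactly zero (sparsity).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):        # stochastic: one sample at a time
            grad = (X[i] @ w - y[i]) * X[i] + lam * np.sign(w)
            w -= alpha * grad
    return w
```

The L1 term is what distinguishes this from plain SGD: it shrinks all weights by a constant amount per step, so uninformative weights settle at zero while informative ones settle slightly below their least-squares value.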
6. The harvester visual navigation method based on improved Segnet image segmentation as claimed in claim 4, wherein the three depth separable convolutional layers of different sizes in the ASPP structure are 3 × 3, 5 × 5 and 7 × 7, respectively.
7. The harvester visual navigation method based on improved Segnet image segmentation as claimed in claim 4, wherein the stride of the depth separable convolutional layers in the ASPP structure in the basic unit is set to 1, and the stride of the depth separable convolutional layers in the ASPP structure in the down-sampling unit is set to 2.
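The parameter savings that motivate the depth separable convolutions of claims 6 and 7 follow from simple arithmetic: a depthwise k × k convolution (one filter per input channel) plus a 1 × 1 pointwise convolution replaces a full k × k convolution. A sketch, with a 116-channel width chosen purely as an illustrative ShuffleNet-scale assumption:

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_params(k, c_in, c_out):
    """Depthwise k x k convolution (k*k weights per input channel)
    followed by a 1 x 1 pointwise convolution that mixes channels."""
    return k * k * c_in + c_in * c_out


# one line per ASPP branch kernel size named in claim 6
for k in (3, 5, 7):
    std = conv_params(k, 116, 116)
    sep = separable_params(k, 116, 116)
    print(f"{k}x{k}: standard={std} separable={sep} ratio={std / sep:.1f}")
```

The saving grows with kernel size, which is why the 5 × 5 and 7 × 7 ASPP branches remain affordable in a lightweight encoder intended for onboard harvester hardware.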
8. A harvester visual navigation system based on improved Segnet image segmentation, comprising:
an image acquisition module, used for collecting field crop images to be segmented and preprocessing them;
an image processing module, used for performing semantic segmentation on the collected field crop image by using the improved Segnet model to generate a target feature map, wherein the improved Segnet model uses a ShuffleNet V2 network as the encoder part of the Segnet model; and for acquiring boundary pixel points and position information of the target in the feature map by using an edge detection algorithm;
and a path decision module, used for taking the acquired boundary pixel points and position information as the input of the progressive probabilistic Hough transform (PPHT) algorithm, and outputting a straight line segment as the target straight-line path for navigation by adjusting the threshold, minLineLength and maxLineGap parameters.
9. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the method of any of claims 1-7.
10. A device, characterized by comprising a memory, a processor and a program stored on the memory and executable on the processor, wherein the program, when executed by the processor, implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111394892.4A CN114066920B (en) | 2021-11-23 | 2021-11-23 | Harvester visual navigation method and system based on improved Segnet image segmentation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114066920A true CN114066920A (en) | 2022-02-18 |
CN114066920B CN114066920B (en) | 2024-07-05 |
Family
ID=80279478
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190392573A1 (en) * | 2018-06-22 | 2019-12-26 | Cnh Industrial Canada, Ltd. | Measuring crop residue from imagery using a machine-learned semantic segmentation model |
CN111104962A (en) * | 2019-11-05 | 2020-05-05 | 北京航空航天大学青岛研究院 | Semantic segmentation method and device for image, electronic equipment and readable storage medium |
CN111950349A (en) * | 2020-06-22 | 2020-11-17 | 华中农业大学 | Semantic segmentation based field navigation line extraction method |
Non-Patent Citations (2)
Title |
---|
XUAN JIA, ET AL.: "Ship Target Recognition and Positioning Based on Aerial Photos", IEEE, 24 May 2021 (2021-05-24), pages 7483 - 7487, XP034032210, DOI: 10.1109/CCDC52312.2021.9601629 * |
QI Liang; LI Bangyu; CHEN Liankai: "Ship target detection algorithm based on improved Faster R-CNN", 中国造船 (Shipbuilding of China), no. 1, 30 August 2020 (2020-08-30) *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant |