CN113808055B - Plant identification method, device and storage medium based on mixed expansion convolution - Google Patents


Info

Publication number
CN113808055B
CN113808055B (granted publication; application CN202110947152.2A)
Authority
CN
China
Prior art keywords
image; plant; samples; expansion; plant image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110947152.2A
Other languages
Chinese (zh)
Other versions
CN113808055A (en)
Inventor
郑禄
帖军
刘越
宋中山
王江晴
吴立锋
徐胜舟
肖博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Bitaoju Agricultural Development Co ltd
South Central Minzu University
Original Assignee
Wuhan Bitaoju Agricultural Development Co ltd
South Central University for Nationalities
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Bitaoju Agricultural Development Co ltd and South Central University for Nationalities
Priority claimed from application CN202110947152.2A
Publication of CN113808055A
Application granted
Publication of CN113808055B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T 5/30 — Image enhancement or restoration by the use of local operators; erosion or dilatation, e.g. thinning
    • G06N 3/045 — Computing arrangements based on biological models; neural networks; combinations of networks
    • G06N 3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06T 7/12 — Image analysis; segmentation; edge-based segmentation
    • G06T 2207/20081 — Indexing scheme for image analysis or image enhancement; training; learning
    • G06T 2207/20084 — Indexing scheme for image analysis or image enhancement; artificial neural networks [ANN]
    • G06T 2207/30188 — Subject of image; Earth observation; vegetation; agriculture

Abstract

The invention relates to the technical field of image processing and discloses a plant identification method, device and storage medium based on mixed expansion convolution (hybrid dilated convolution). The method comprises the following steps: acquiring a plant image to be processed in a preset scene and determining scene distance information of the image; performing expansion processing on the image according to the scene distance information to obtain an expanded plant image; inputting the expanded plant image into a preset mixed expansion image segmentation model to obtain a plant segmentation image; and performing plant identification according to the plant segmentation image. Compared with the prior art, in which plant images must be segmented manually and processing efficiency is therefore low, processing the plant image with the preset mixed expansion image segmentation model yields the plant segmentation image quickly and accurately and improves plant recognition accuracy.

Description

Plant identification method, device and storage medium based on mixed expansion convolution
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a plant identification method, apparatus and storage medium based on hybrid expansion convolution (also known as hybrid dilated convolution, HDC).
Background
Plants in natural environments are affected by many interference factors, mainly including changes in illumination intensity, uneven brightness, plant colors close to the background, occlusion by branches and leaves, and shadow coverage, so the appearance of plants changes greatly with the environment. Because of these interference factors, directly identifying plants with a target detection algorithm produces many misjudgments or omissions, so plants cannot be identified accurately.
The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.
Disclosure of Invention
The invention mainly aims to provide a plant identification method, device and storage medium based on mixed expansion convolution, which aim at solving the technical problem of how to improve the accuracy of plant identification.
In order to achieve the above object, the present invention provides a plant identification method based on a hybrid expansion convolution, comprising the steps of:
acquiring a plant image to be processed in a preset scene, and determining scene distance information of the plant image to be processed;
performing expansion processing on the plant image to be processed according to the scene distance information to obtain an expanded plant image;
inputting the expanded plant image into a preset mixed expansion image segmentation model to obtain a plant segmentation image;
and carrying out plant identification according to the plant segmentation image.
Preferably, the step of expanding the plant image to be processed according to the scene distance information to obtain an expanded plant image includes:
determining an image expansion rate according to the scene distance information;
acquiring image pixel information corresponding to the plant image to be processed;
performing expansion processing on the image pixel information according to the image expansion rate to obtain expansion pixel information;
and generating an expanded plant image according to the expanded pixel information.
Preferably, before the step of acquiring the plant image to be processed in the preset scene and determining the scene distance information of the plant image to be processed, the method further includes:
acquiring a plurality of plant image samples under different scenes;
respectively determining plant image category information corresponding to the plurality of plant image samples;
preprocessing the plurality of plant image samples according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples;
training an initial network model according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model;
and determining a preset mixed expansion image segmentation model according to the plant image verification samples and the initial mixed expansion image segmentation model.
Preferably, the step of preprocessing the plurality of plant image samples according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples includes:
respectively determining image illumination information corresponding to the plurality of plant image samples according to the plant image category information;
determining a preset gray world elimination rule according to the image illumination information;
respectively processing the plurality of plant image samples according to the preset gray world elimination rule to obtain a plurality of gray image samples;
and determining a plurality of plant image training samples and a plurality of plant image verification samples according to the plurality of gray level image samples.
Preferably, the step of determining a plurality of plant image training samples and a plurality of plant image verification samples from the plurality of gray image samples comprises:
respectively determining image brightness information corresponding to the plurality of gray image samples;
determining a brightness processing function according to the image brightness information;
respectively carrying out brightness adjustment on the plurality of gray image samples through the brightness processing function to obtain a plurality of brightness image samples;
and determining a plurality of plant image training samples and a plurality of plant image verification samples according to the plurality of brightness image samples.
Preferably, the step of determining a plurality of plant image training samples and a plurality of plant image verification samples from the plurality of luminance image samples includes:
respectively carrying out rotary mirror image processing on the brightness image samples to obtain a plurality of rotary image samples;
respectively carrying out noise processing on the plurality of rotation image samples to obtain a plurality of noise image samples;
respectively carrying out blurring processing on the plurality of noise image samples to obtain a plurality of blurred image samples;
dividing the plurality of blurred image samples according to a preset image proportion rule to obtain a plurality of plant image training samples and a plurality of plant image verification samples.
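As an illustrative sketch of the proportion-rule division in the last step (the patent allows ratios such as 6:4 or 8:2; the function name, the 8:2 default and the fixed seed below are assumptions, not part of the patent):

```python
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    """Divide preprocessed image samples into training and verification
    sets according to a preset proportion rule (e.g. 8:2 or 6:4)."""
    rng = random.Random(seed)          # fixed seed keeps splits reproducible
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Shuffling before cutting avoids ordering bias (e.g. all sunny-day images landing in one set) when the samples were collected in batches.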
Preferably, the step of training the initial network model according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model includes:
respectively determining image distance information corresponding to the plurality of plant image training samples;
respectively determining the expansion rates of the image samples corresponding to the plurality of plant image training samples according to the image distance information;
judging whether the expansion rate of the image sample meets a preset expansion factor relation or not;
and training an initial network model according to the plurality of plant image training samples and the image sample expansion rate corresponding to the plurality of plant image training samples when the image sample expansion rate meets the preset expansion factor relation, so as to obtain an initial mixed expansion image segmentation model.
Preferably, the step of determining a preset blended-up image segmentation model from the plurality of plant image verification samples and the initial blended-up image segmentation model includes:
verifying the initial mixed expansion image segmentation model according to the multiple plant image verification samples to obtain a model evaluation index score;
and when the model evaluation index score meets a preset index condition, taking the initial mixed expansion image segmentation model as a preset mixed expansion image segmentation model.
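The patent does not name the model evaluation index; mean intersection-over-union (mIoU) is a common score for segmentation models and can serve as a stand-in sketch of how the verification step might compute it:

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between a predicted segmentation mask
    and the ground-truth mask; classes absent from both are skipped."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union:                       # skip classes absent from both masks
            ious.append(inter / union)
    return float(np.mean(ious))
```

A "preset index condition" could then simply be a threshold such as `mean_iou(...) >= 0.8` on the verification samples.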
In addition, in order to achieve the above object, the present invention also provides a plant identification device based on a hybrid expansion convolution, the plant identification device based on the hybrid expansion convolution includes:
The acquisition module is used for acquiring a plant image to be processed in a preset scene and determining scene distance information of the plant image to be processed;
the processing module is used for performing expansion processing on the plant image to be processed according to the scene distance information to obtain an expanded plant image;
the segmentation module is used for inputting the expanded plant image into a preset mixed expansion image segmentation model to obtain a plant segmentation image;
and the identification module is used for carrying out plant identification according to the plant segmentation image.
In addition, in order to achieve the above object, the present invention also proposes a storage medium having stored thereon a plant identification program based on a hybrid expansion convolution, which when executed by a processor, implements the steps of the plant identification method based on a hybrid expansion convolution as described above.
According to the invention, a plant image to be processed in a preset scene is first acquired and its scene distance information determined; the image is then expanded according to the scene distance information to obtain an expanded plant image, which is input into a preset mixed expansion image segmentation model to obtain a plant segmentation image; finally, plant identification is performed according to the plant segmentation image. Compared with the prior art, in which plant images must be segmented manually and processing efficiency is therefore low, processing the plant image with the preset mixed expansion image segmentation model obtains the plant segmentation image quickly and accurately and improves plant recognition accuracy.
Drawings
FIG. 1 is a schematic diagram of a plant identification device based on a hybrid expansion convolution of a hardware operating environment in accordance with an embodiment of the present invention;
FIG. 2 is a flow chart of a first embodiment of a plant identification method based on mixed expansion convolution according to the present invention;
FIG. 3 is a flow chart of a second embodiment of a plant identification method based on mixed dilation convolution of the present invention;
FIG. 4 is a graph of HDC-Seg Net model loss for a second embodiment of a plant identification method based on mixed dilation convolution of the present invention;
FIG. 5 is a SegNet model loss diagram of a second embodiment of a plant identification method based on mixed dilation convolution of the present invention;
fig. 6 is a block diagram of a first embodiment of a plant identification device based on a hybrid dilation convolution of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Referring to fig. 1, fig. 1 is a schematic diagram of a plant identification device based on a hybrid expansion convolution in a hardware operation environment according to an embodiment of the present invention.
As shown in fig. 1, the plant identification apparatus based on the mixed expansion convolution may include: a processor 1001, such as a central processing unit (Central Processing Unit, CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 enables connected communication between these components. The user interface 1003 may include a display (Display) and optionally a standard wired interface and a wireless interface; in the present invention the wired interface of the user interface 1003 may be a USB interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (Random Access Memory, RAM) or a non-volatile memory (Non-Volatile Memory, NVM), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not limit the plant identification device based on hybrid expansion convolution, which may include more or fewer components than shown, combine certain components, or arrange components differently.
As shown in fig. 1, an operating system, a network communication module, a user interface module, and a plant identification program based on a hybrid expansion convolution may be included in a memory 1005 identified as a computer storage medium.
In the plant identification device based on the mixed expansion convolution shown in fig. 1, the network interface 1004 is mainly used for connecting a background server, and performing data communication with the background server; the user interface 1003 is mainly used for connecting user equipment; the plant recognition device based on the hybrid expansion convolution calls the plant recognition program based on the hybrid expansion convolution stored in the memory 1005 through the processor 1001, and executes the plant recognition method based on the hybrid expansion convolution provided by the embodiment of the invention.
Based on the above hardware structure, an embodiment of the plant identification method based on the mixed expansion convolution is provided.
Referring to fig. 2, fig. 2 is a schematic flow chart of a first embodiment of a plant identification method based on mixed expansion convolution according to the present invention.
In a first embodiment, the plant identification method based on mixed expansion convolution comprises the following steps:
step S10: and acquiring a plant image to be processed in a preset scene, and determining scene distance information of the plant image to be processed.
The execution subject of this embodiment is a plant recognition device based on mixed expansion convolution, that is, a device with image processing, data communication, program running and similar functions; it may also be another device, which is not limited in this embodiment.
The plant image to be processed in the preset scene may be an image of fruit growing on a tree in a natural environment; the fruit image may contain a single green citrus fruit or several green citrus fruits. In this embodiment, the plant image to be processed may be acquired through a built-in camera of a mobile terminal, through an external camera, or through a stand-alone camera.
The scene distance information is distance information between the camera and the plant, the scene distance information can be obtained according to pixel information of the plant image to be processed, and the scene distance information can be 10cm, 8cm and the like.
Step S20: and performing expansion processing on the plant image to be processed according to the scene distance information to obtain an expanded plant image.
An expanded plant image can be understood as a plant image in which the plant area of the image to be processed is enlarged while the image size remains consistent.
An image expansion rate is determined according to the scene distance information, image pixel information corresponding to the plant image to be processed is acquired, the image pixel information is expanded according to the image expansion rate to obtain expansion pixel information, and the expanded plant image is generated from the expansion pixel information. The expansion pixel information may correspond to the expansion rate of the plant image to be processed.
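The patent does not spell out the expansion operation or the distance-to-rate mapping. As a hedged sketch under those assumptions, one plausible reading is a grey-scale morphological dilation whose neighbourhood size is derived from the scene distance; the mapping `dilation_rate_from_distance` below is purely illustrative:

```python
import numpy as np

def dilation_rate_from_distance(distance_cm, base=1, step=10):
    """Illustrative mapping from scene distance to an integer expansion
    rate (the patent does not give the actual formula)."""
    return base + int(distance_cm // step)

def dilate_image(pixels, rate):
    """Grey-scale morphological dilation: each pixel becomes the maximum
    of its (2*rate+1) x (2*rate+1) neighbourhood, so plant regions grow
    while the image size stays unchanged."""
    h, w = pixels.shape
    padded = np.pad(pixels, rate, mode="edge")
    out = np.empty_like(pixels)
    for y in range(h):
        for x in range(w):
            out[y, x] = padded[y:y + 2 * rate + 1, x:x + 2 * rate + 1].max()
    return out
```

Edge padding keeps the output the same size as the input, matching the statement above that the image size remains consistent.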
The size of fruit in a natural environment varies with the growth period and with the shooting distance, and if small objects cannot be reconstructed, the model's segmentation of small fruit is affected. In the prior art, when a convolutional neural network processes an image, convolution and pooling operations are generally performed, so that the resulting feature map becomes smaller and can be passed to the final fully connected layer for classification. A semantic segmentation task, however, must classify every pixel in the image, so the smaller feature map has to be upsampled back to the original image size before pixel-wise prediction. This process has two major drawbacks: because pooling is irreversible, information is lost when the feature map is restored, and small objects cannot be reconstructed after repeated pooling. In this embodiment, dilated (expansion) convolution is therefore proposed to replace the conventional max-pooling and convolution operations, enlarging the receptive field while keeping the feature map the same size as the original image. Dilated convolution reduces the number of times each pixel is convolved, so it can effectively focus on the semantic information of local pixel blocks, and segmentation accuracy is not degraded by mixing each pixel with its surroundings.
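A minimal NumPy sketch of the dilated convolution described above: with "same" padding the output keeps the input size, while a dilation rate r stretches a k×k kernel to an effective size of k + (k−1)(r−1):

```python
import numpy as np

def dilated_conv2d(image, kernel, rate):
    """2-D convolution with a dilated kernel and 'same' zero padding:
    the output matches the input size while the receptive field grows."""
    k = kernel.shape[0]
    eff = k + (k - 1) * (rate - 1)      # effective kernel size
    pad = eff // 2
    padded = np.pad(image, pad)
    h, w = image.shape
    out = np.zeros_like(image, dtype=float)
    for y in range(h):
        for x in range(w):
            # strided slice samples k x k taps spaced `rate` pixels apart
            patch = padded[y:y + eff:rate, x:x + eff:rate]
            out[y, x] = (patch * kernel).sum()
    return out
```

With rate 1 this reduces to an ordinary convolution; raising the rate widens the view of each output pixel without any pooling, which is exactly the trade the paragraph above describes.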
In a specific implementation, if several identical dilated convolutions are stacked, a large number of holes appear in the receptive field, breaking the continuity and integrity of the data and losing local information. Because the sampled input signals are sparse, features extracted from distant positions lack correlation, which affects the classification result. Therefore, this embodiment adopts mixed (hybrid) expansion convolution, which has three characteristics. First, the dilation rates of the stacked convolutions cannot be chosen arbitrarily: they must not share a common divisor greater than 1. Second, the dilation rates are arranged in a sawtooth structure: convolutions with small rates focus on short-range information and those with large rates on long-range information, so the segmentation of both large and small objects can be served at once; by setting different rates, a wider and more continuous range of pixels is covered in the same configuration without losing large amounts of local information. Third, the dilation rates must satisfy a preset expansion formula so that no holes remain in the receptive field.
The preset expansion formula is:

M_i = max[ M_{i+1} - 2r_i, M_{i+1} - 2(M_{i+1} - r_i), r_i ]

where M_i is the maximum dilation rate of the i-th layer and r_i is the dilation rate of the i-th layer.
In this embodiment, assuming there are n dilated convolution layers with K×K kernels, then M_n = r_n, and the purpose of the design is to have M_2 <= K. It should be noted that the dilation rates in a group cannot be in a common-factor relationship, such as 2, 4, 8, otherwise the gridding problem still arises in the network.
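The recurrence and the two validity conditions above can be checked mechanically. A sketch in plain Python (function names are ours): the classic rate group (1, 2, 5) passes for a 3×3 kernel, while (2, 4, 8) fails both the M_2 <= K condition and the common-factor rule:

```python
from functools import reduce
from math import gcd

def max_reachable(rates):
    """Evaluate M_i = max[M_{i+1} - 2*r_i, M_{i+1} - 2*(M_{i+1} - r_i), r_i]
    backwards from M_n = r_n, returning [M_1, ..., M_n]."""
    M = rates[-1]
    Ms = [M]
    for r in reversed(rates[:-1]):
        M = max(M - 2 * r, M - 2 * (M - r), r)
        Ms.append(M)
    return list(reversed(Ms))

def is_valid_hdc(rates, kernel_size=3):
    """A rate group is valid when M_2 <= K and the rates share no common
    divisor greater than 1 (avoiding the gridding problem)."""
    Ms = max_reachable(rates)
    no_common_factor = reduce(gcd, rates) == 1
    return Ms[1] <= kernel_size and no_common_factor
```

Such a check would correspond to the "judging whether the expansion rate of the image sample meets a preset expansion factor relation" step in the training procedure.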
For the near-and-far fruit identification task, the size of fruit in the image changes greatly with the shooting angle and shooting distance during image acquisition; when the shooting distance is large, the fruits in the image are more numerous and smaller, which makes accurate segmentation considerably harder. Therefore, to perform semantic segmentation of fruit across multiple scenes, mixed expansion convolutions with different dilation rates are stacked to adapt to targets of different sizes, achieving multi-scale detection of the image. At the same time, stacking mixed expansion convolutions increases the depth of the model, so the model learns deeper semantic features and the segmentation effect is optimized.
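Under the standard receptive-field formula for stacked dilated convolutions (each k×k layer with rate r adds (k−1)·r pixels to the field; this formula comes from the dilated-convolution literature, not from the patent text), the sawtooth rates above enlarge the field quickly without adding parameters:

```python
def stacked_receptive_field(rates, kernel_size=3):
    """Receptive field of a stack of dilated convolutions applied in
    sequence: rf grows by (k - 1) * r per layer, starting from 1 pixel."""
    rf = 1
    for r in rates:
        rf += (kernel_size - 1) * r
    return rf
```

Three plain 3×3 layers see only a 7-pixel window, whereas the same three layers with rates (1, 2, 5) see 17 pixels, which is what lets one configuration cover both small nearby fruit and small distant fruit.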
Step S30: and inputting the expanded plant image into a preset mixed expanded image segmentation model to obtain a plant segmentation image.
The preset mixed expansion image segmentation model may be constructed as follows: acquire a plurality of plant image samples in different scenes; respectively determine the plant image category information corresponding to the samples; preprocess the samples according to the category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples; train an initial network model on the training samples to obtain an initial mixed expansion image segmentation model; and determine the preset mixed expansion image segmentation model from the verification samples and the initial model. Plant image category information includes weather information, fruit number information, and fruit size information of a plant image.
The plurality of plant image samples may be preprocessed according to the plant image category information as follows: respectively determine the image illumination information corresponding to the samples according to the category information; determine a preset gray world elimination rule according to the illumination information; process the samples according to that rule to obtain a plurality of gray image samples; respectively determine the image brightness information of the gray image samples; determine a brightness processing function from the brightness information; adjust the brightness of the gray image samples through that function to obtain a plurality of brightness image samples; and determine the plant image training samples and verification samples from the brightness image samples.
In order to enhance the richness of the green citrus dataset, the collected plant images are preprocessed in terms of color, brightness, rotation, noise, blurring and the like through application program interfaces provided in a computer vision and machine learning software library, so that the dataset is expanded.
In this embodiment, different illumination conditions in the natural environment may cause the colors in the captured image to deviate from the actual colors of the object. Applying a gray world algorithm to the plurality of plant image samples eliminates the influence of illumination on color rendering, bringing the images closer to the true colors of the objects.
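A minimal NumPy sketch of the gray world algorithm: each channel is rescaled so its mean equals the global mean, which removes a uniform color cast introduced by the illuminant:

```python
import numpy as np

def gray_world(image):
    """Gray-world colour constancy for an H x W x 3 image: scale each
    channel so its mean matches the mean over all channels."""
    img = image.astype(float)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gain = channel_means.mean() / channel_means   # per-channel correction
    return np.clip(img * gain, 0, 255).astype(np.uint8)
```

The underlying assumption is that the average reflectance of a scene is achromatic; any channel whose mean deviates from gray is treated as an illumination artifact.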
Because illumination intensity varies in reality, the brightness of the original dataset images can be adjusted in this embodiment by changing the parameters of a brightness function. An image that is too bright or too dark may leave object edges unclear, making bounding boxes difficult to draw during manual annotation and affecting model performance. To avoid generating such images, brightness-variation parameters are chosen in a range where target edges can still be accurately identified (0.3, 0.5, and 0.7), yielding a plurality of brightness image samples. This simulates the continuously changing illumination intensity of a natural environment and overcomes the lack of robustness to varied illumination that results from collecting images in a concentrated time window.
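The brightness function itself is not given in the patent; a simple multiplicative adjustment using the stated factors (0.3, 0.5, 0.7) is one plausible sketch:

```python
import numpy as np

def adjust_brightness(image, factor):
    """Scale pixel intensities by `factor` (e.g. 0.3/0.5/0.7 to darken,
    >1 to brighten), clipping to the valid 8-bit range."""
    return np.clip(image.astype(float) * factor, 0, 255).astype(np.uint8)
```

Clipping matters on the brightening side: without it, values above 255 would wrap around when cast back to uint8.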
The plurality of plant image training samples and verification samples may be determined from the brightness image samples as follows: respectively apply rotation and mirroring to the brightness image samples to obtain a plurality of rotation image samples; respectively apply noise to the rotation image samples to obtain a plurality of noise image samples; respectively blur the noise image samples to obtain a plurality of blurred image samples; and divide the blurred image samples according to a preset image proportion rule, which may be user-defined (for example 6:4 or 8:2), to obtain the plant image training samples and verification samples.
It should be noted that randomly rotating the brightness image samples helps enhance the generalization ability of the model. To further expand the image dataset, the images can also be rotated by 90, 180 and 270 degrees and mirrored. If the model can learn new features from noisy images, its noise resistance increases, so Gaussian noise and salt-and-pepper noise are used to generate noisy images, namely the noise image samples, for training the model's noise resistance. It is also understandable that in actual shooting the distance may be large, and incorrect focusing or movement of the phone can make the acquired image unclear. To improve the richness of the dataset and simulate image blur in actual shooting, the images are processed with Gaussian blur and median blur respectively to obtain the blurred image samples.
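The augmentations above can be sketched with NumPy alone. As assumptions not in the patent: a 3×3 box blur stands in for the Gaussian/median blur, and the 5% salt-and-pepper density is illustrative:

```python
import numpy as np

def augment(image, seed=0):
    """Produce the augmented variants described above for a 2-D image:
    90/180/270-degree rotations, a mirror image, a salt-and-pepper noise
    version, and a blurred version."""
    rng = np.random.default_rng(seed)
    variants = {
        "rot90": np.rot90(image, 1),
        "rot180": np.rot90(image, 2),
        "rot270": np.rot90(image, 3),
        "mirror": np.fliplr(image),
    }
    # salt-and-pepper: flip ~5 % of pixels to pure black or pure white
    noisy = image.copy()
    mask = rng.random(image.shape) < 0.05
    noisy[mask] = rng.choice([0, 255], size=int(mask.sum()))
    variants["noise"] = noisy
    # 3x3 box blur with edge padding (stand-in for Gaussian/median blur)
    padded = np.pad(image.astype(float), 1, mode="edge")
    blurred = sum(padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
                  for dy in range(3) for dx in range(3)) / 9.0
    variants["blur"] = blurred.astype(np.uint8)
    return variants
```

Each original image thus yields six extra samples, which is how a modest field-collected dataset is expanded before the proportion-rule split.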
In a specific implementation, a plurality of plant image training samples and verification samples are determined from the blurred image samples, an initial network model is trained on the training samples to obtain an initial mixed expansion image segmentation model, and the preset mixed expansion image segmentation model is determined from the verification samples and the initial model. The expanded plant image is then input into the preset mixed expansion image segmentation model to obtain a plant segmentation image, which may be a background-free segmentation image containing fruits of different sizes.
Step S40: and carrying out plant identification according to the plant segmentation image.
It will be appreciated that the number of fruits and size etc. can be identified from background-free segmented images where fruits of different sizes are present.
In this embodiment, a plant image to be processed in a preset scene is first obtained and scene distance information of the plant image to be processed is determined; expansion processing is then performed on the plant image to be processed according to the scene distance information to obtain an expanded plant image; the expanded plant image is input into a preset mixed expansion image segmentation model to obtain a plant segmentation image; and finally plant identification is performed according to the plant segmentation image. Compared with the prior art, in which the plant image needs to be manually segmented and the processing efficiency is therefore low, this embodiment processes the plant image using the preset mixed expansion image segmentation model, so that the plant segmentation image is acquired rapidly and accurately and the plant recognition accuracy is improved.
Further, referring to fig. 3, based on the above first embodiment of the plant identification method based on mixed expansion convolution, a second embodiment of the plant identification method based on mixed expansion convolution according to the present invention is proposed.
In a second embodiment, before step S10 in the plant identification method based on the mixed expansion convolution, the method further includes:
step S001: and acquiring a plurality of plant image samples under different scenes.
The plurality of plant image samples in different scenes can be a plurality of plant image samples in cloudy days, a plurality of plant image samples in sunny days and the like.
In order to adapt to complex and changeable real conditions, the constructed data set is made close to the real environment by selecting different shooting weather and times, different shooting distances and different fruit numbers during acquisition, so that the data set has diversity in illumination intensity, fruit size and fruit number. In this embodiment, the shooting scene may be selected from cloudy and sunny days, the shooting time may be selected as 2 pm, 5 pm and the like, and the plurality of plant image samples may be obtained according to the shooting scene and the shooting time.
Step S002: and respectively determining plant image category information corresponding to the plant image samples.
The plant image category information includes shooting distance information, fruit number information, fruit size information, weather information, and the like.
The shooting distance can be 1.0m or 0.5m from the trunk; the number of fruits appearing in the image can be divided into few (1-5), medium (5-10) and many (more than 10); the size of the fruits in the image can be divided into small or large; and the like. Referring to table 1, table 1 is a table of the plurality of plant image sample categories.
TABLE 1
Step S003: and respectively preprocessing the plurality of plant image samples according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples.
Image illumination information corresponding to the plurality of plant image samples is respectively determined according to the plant image category information, and a preset gray world elimination rule is determined according to the image illumination information. The plurality of plant image samples are respectively processed according to the preset gray world elimination rule to obtain a plurality of gray image samples. Image brightness information corresponding to the plurality of gray images is respectively determined, a brightness processing function is determined according to the image brightness information, and brightness adjustment is respectively carried out on the plurality of gray images through the brightness processing function to obtain a plurality of brightness image samples. The plurality of plant image training samples and the plurality of plant image verification samples are then determined according to the plurality of brightness image samples.
In order to enhance the richness of the green citrus dataset, the collected plant images are preprocessed in terms of color, brightness, rotation, noise, blurring and the like through application program interfaces provided in a computer vision and machine learning software library, so that the dataset is expanded.
In this embodiment, different illumination conditions in the natural environment may cause a certain deviation between the color represented by the captured image and the actual color of the object, and the influence of illumination on color rendering may be eliminated by using a gray world algorithm. The influence of illumination on the color of the image in the natural environment can be eliminated by applying a gray world algorithm to a plurality of plant image samples, so that the image is closer to the true color of the object.
Due to the variability of illumination intensity in reality, the brightness of the original data set images can be adjusted in this embodiment by changing the parameter of the brightness function. An image that is too bright or too dark may have unclear object edges, making it difficult to draw bounding boxes during manual annotation, which in turn affects the performance of the model. To avoid generating such images, the brightness variation parameters are selected within ranges (0.3, 0.5 and 0.7) in which the target edges can still be accurately identified, so as to obtain the plurality of brightness image samples. This simulates the continuously changing illumination intensity in the natural environment, and overcomes the defect that the neural network is not robust to various illumination intensities when the image acquisition time is concentrated.
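As an illustrative sketch of the gray world correction and brightness adjustment described above (assuming the NumPy library; the function names and the sample image are illustrative, not taken from the patent):

```python
import numpy as np

def gray_world(img):
    # Gray world color correction: scale each channel so its mean matches
    # the global mean, removing the illumination color cast.
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gain = channel_means.mean() / channel_means
    return np.clip(img * gain, 0, 255).astype(np.uint8)

def adjust_brightness(img, factor):
    # Scale pixel intensities; factors such as 0.3, 0.5 and 0.7 simulate
    # weaker illumination, as in the embodiment.
    return np.clip(img.astype(np.float64) * factor, 0, 255).astype(np.uint8)

# A flat greenish image: gray world equalizes the three channel means.
img = np.full((4, 4, 3), (60, 120, 90), dtype=np.uint8)
balanced = gray_world(img)
```

After correction every channel of this uniform image sits at the common mean (90), illustrating how the color cast is removed before brightness variants are generated.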
The processing mode of determining the plurality of plant image training samples and the plurality of plant image verification samples according to the plurality of brightness image samples can be that the plurality of brightness image samples are respectively subjected to rotation and mirror image processing to obtain a plurality of rotation image samples, the plurality of rotation image samples are respectively subjected to noise-adding processing to obtain a plurality of noise image samples, the plurality of noise image samples are respectively subjected to blurring processing to obtain a plurality of blurred image samples, and the plurality of blurred image samples are divided according to a preset image proportion rule to obtain the plurality of plant image training samples and the plurality of plant image verification samples. The preset image proportion rule can be a user-defined setting, such as 6:4 or 8:2.
It should be noted that random rotation of the plurality of brightness image samples helps to enhance the generalization ability of the model. In order to further expand the image data set, rotations of 90 degrees, 180 degrees and 270 degrees as well as mirror image processing can be performed on the images. If the model can learn new features from noisy images, its noise resistance is increased, so Gaussian noise and salt-and-pepper noise can be used to process the images, generating noisy images, namely the plurality of noise image samples, for training the noise resistance of the model. It will be appreciated that in an actual photographing process the shooting distance may be far, and incorrect focusing or movement of the phone can cause the acquired image to become unclear. In order to improve the richness of the data set, image blurring in the actual shooting process is simulated by processing the images with Gaussian blur and median blur respectively to obtain the plurality of blurred image samples.
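The rotation, mirror, noise and blur expansion steps described above might be sketched as follows (a minimal NumPy illustration on a grayscale image; a real pipeline would typically use OpenCV functions such as cv2.medianBlur, and all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def rotations_and_mirrors(img):
    # 90/180/270-degree rotations plus a horizontal mirror, as in the
    # data-set expansion step of the embodiment.
    return [np.rot90(img, k) for k in (1, 2, 3)] + [np.fliplr(img)]

def salt_and_pepper(img, amount=0.05):
    # Set a random fraction of pixels to pure black or white.
    noisy = img.copy()
    mask = rng.random(img.shape) < amount
    noisy[mask] = rng.choice(np.array([0, 255], dtype=img.dtype),
                             size=int(mask.sum()))
    return noisy

def median_blur3(img):
    # Naive 3x3 median filter (borders left unchanged), standing in for
    # the median blur used to simulate unclear shots.
    out = img.copy()
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y, x] = np.median(img[y - 1:y + 2, x - 1:x + 2])
    return out

img = np.arange(64, dtype=np.uint8).reshape(8, 8)
augmented = rotations_and_mirrors(img) + [salt_and_pepper(img), median_blur3(img)]
```

Each source image thus yields several expanded variants, which is how a modest data set can grow severalfold before the training/verification split.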
Assume that the plurality of plant image samples is expanded from 2200 images to 5600, with the A and B categories containing 1200 images each and the C, D, E, F, G, H, I and J categories containing 400 images each. 840 images are selected from each of categories A and B and 280 images from each remaining category, giving 3920 images in total as the plurality of plant image training samples to train the network model; the remaining 1680 images serve as the plurality of plant image verification samples to verify the performance of the model.
Step S004: and training the initial network model according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model.
Respectively determining image distance information corresponding to a plurality of plant image training samples, respectively determining image sample expansion rates corresponding to the plurality of plant image training samples according to the image distance information, judging whether the image sample expansion rates meet a preset expansion factor relation, and training an initial network model according to the plurality of plant image training samples and the image sample expansion rates corresponding to the plurality of plant image training samples when the image sample expansion rates meet the preset expansion factor relation to obtain an initial mixed expansion image segmentation model.
The size of fruits in the natural environment varies with their growth period and with the shooting distance, and if small objects cannot be reconstructed, the segmentation effect of the model on small fruits is affected. In the prior art, when a convolutional neural network is used for image processing, convolution and pooling operations are generally carried out, so that the feature map becomes smaller and can be transmitted to the final fully connected layer for classification. The image semantic segmentation task needs to classify each pixel point in the image, so the smaller feature map needs to be restored to the original image size through an up-sampling operation before pixel-level prediction is performed. This process has two major drawbacks: because the pooling operation is irreversible, information is lost when the feature map is restored; meanwhile, small object images cannot be reconstructed after multiple pooling operations. In this embodiment, however, dilation convolution is proposed to replace the conventional max-pooling and convolution operations, so that the receptive field is increased while the feature map is kept consistent with the original image size. The dilation convolution reduces the number of convolutions applied to each pixel, so that the semantic information of local pixel blocks can be effectively focused on, and the segmentation accuracy is not affected by mixing each pixel with its surrounding pixels.
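A minimal illustration of how a dilated convolution with "same" padding keeps the feature map at the original image size while enlarging the receptive field (assuming NumPy; a deliberately naive loop implementation, not the patent's model code):

```python
import numpy as np

def dilated_conv2d_same(img, kernel, rate):
    # Dilated (atrous) convolution with "same" padding: the output keeps
    # the input size, while the effective footprint of a k x k kernel
    # grows to k + (k - 1) * (rate - 1) pixels.
    k = kernel.shape[0]
    pad = rate * (k - 1) // 2
    padded = np.pad(img, pad, mode="constant")
    h, w = img.shape
    out = np.zeros_like(img, dtype=np.float64)
    for y in range(h):
        for x in range(w):
            # Sample the padded image at dilated offsets.
            patch = padded[y:y + (k - 1) * rate + 1:rate,
                           x:x + (k - 1) * rate + 1:rate]
            out[y, x] = (patch * kernel).sum()
    return out

img = np.ones((8, 8))
kernel = np.ones((3, 3)) / 9.0
out = dilated_conv2d_same(img, kernel, rate=2)
```

Unlike pooling, no resolution is lost: an 8x8 input yields an 8x8 output, yet each output pixel now aggregates a 5x5 neighborhood rather than 3x3.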
If a plurality of identical expansion convolutions are superimposed, a large number of holes are found in practice, destroying the continuity and integrity of the data and losing local information. Because the input signals become relatively sparse, feature information extracted from far apart locations has no correlation, which affects the classification result. Therefore, mixed expansion convolution is adopted in this embodiment, with three characteristics: first, the expansion rates of the superimposed convolutions cannot be selected arbitrarily and cannot share a common divisor greater than 1; second, the expansion rates are designed in a zigzag structure, where small-expansion-rate convolutions focus on short-distance information and large-expansion-rate convolutions focus on long-distance information, so the requirements for segmenting both large and small objects can be met at the same time; by setting different expansion rates, a wider and more continuous range of pixel information can be processed in the same configuration without losing a large amount of local information; third, the expansion rates need to satisfy a preset expansion formula so that no cavity exists in the receptive field.
It should be noted that the semantic segmentation SegNet model is a symmetric network composed of an encoder and a decoder: the encoder part first analyzes the pixels of an image and learns features from them to obtain higher-order semantic information, and the decoder part then classifies the pixels in the image using this higher-order information. The growth state of fruits in the natural environment cannot be predicted and is subject to interference from many environmental factors; meanwhile, the size of fruits in an image changes abruptly with the shooting angle and shooting distance, so the semantic segmentation model is required to have good robustness and the ability to segment fruits of different sizes. In this embodiment, the SegNet model is improved using mixed expansion convolution, which can maintain the receptive field of SegNet, while convolution layers with different expansion rates perform multi-scale detection, overcoming the influence of abrupt changes in citrus size on segmentation accuracy. The Hybrid Dilated Convolution (HDC) Block consists of three convolution layers with a convolution kernel size of 3×3, 1024 filters, and expansion rates of 1, 2 and 5 respectively. To further study the influence of the HDC Block on the SegNet model, refer to table 2, which is a model design training result table.
TABLE 2
New module | Accuracy/% | Recall/%
No additional module (SegNet model) | 78.46 | 77.57
One set of HDC Blocks | 79.23 | 80.68
Two sets of HDC Blocks | 82.43 | 82.89
Two sets of HDC Blocks (convolutions with BN and ReLU layers) | 82.62 | 83.59
As shown in the above table, adding one set of mixed expansion convolutions already produces an obvious gain. Following the symmetric structure of SegNet, adding two sets of mixed expansion convolutions yields better segmentation accuracy, and with two sets of mixed expansion convolutions, expansion rates set to [1, 2, 5], and a BN (batch normalization) layer and a ReLU (rectified linear unit) layer after each convolution layer, the segmentation effect is best. The addition of mixed expansion convolution gives the model multi-scale detection capability, while the appropriate network depth allows the model to extract higher-level semantic features, improving the overall segmentation effect. Using a BN layer and a ReLU layer after each convolution layer normalizes the mean and variance of the input, ensuring a stable inter-layer data distribution; this accelerates network convergence, controls overfitting, allows dropout and normalization operations to be eliminated or reduced, and preserves the characterization capability of the whole neural network.
Two groups of HDC Block structures are added between the encoder and decoder of the SegNet network, with each convolution followed by a BN layer and a ReLU layer; the improved model, namely the initial mixed expansion image segmentation model, has the best semantic segmentation effect. Within an HDC Block, the large-expansion-rate convolutions extract and generate more abstract features for large objects, while the small-expansion-rate convolutions focus on short-distance information, so the semantic segmentation effect of the model on small objects is better. Because the mixed expansion convolution groups (HDC Blocks) combine expansion convolutions with different expansion rates, the model gains the ability to extract object features of different sizes, i.e., multi-scale detection, to adapt to the continuously changing fruit sizes in natural environments. Meanwhile, adding the HDC Block structure lets the network extract higher-level semantic features, achieving more accurate semantic segmentation.
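The receptive-field gain of the [1, 2, 5] zigzag expansion rates can be checked with a short calculation (a sketch assuming stride 1 throughout; not taken from the patent):

```python
def receptive_field(kernel_sizes, rates):
    # Receptive field of stacked dilated convolutions with stride 1:
    # each layer adds (k - 1) * r pixels on top of the previous field.
    rf = 1
    for k, r in zip(kernel_sizes, rates):
        rf += (k - 1) * r
    return rf

# One HDC Block: three 3x3 convolutions with expansion rates 1, 2, 5.
hdc_rf = receptive_field([3, 3, 3], [1, 2, 5])
# The same depth of plain 3x3 convolutions, for comparison.
plain_rf = receptive_field([3, 3, 3], [1, 1, 1])
```

Three dilated layers see a 17-pixel-wide field versus 7 pixels for plain convolutions, which is why the block can handle both small and large fruits without any pooling.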
Step S005: and determining a preset mixed expansion image segmentation model according to the plant image verification samples and the initial mixed expansion image segmentation model.
And verifying the initial mixed expansion image segmentation model according to the plurality of plant image verification samples to obtain a model evaluation index score, and taking the initial mixed expansion image segmentation model as a preset mixed expansion image segmentation model when the model evaluation index score meets a preset index condition. The preset index condition is within a preset threshold range, etc.
The model evaluation index score may be precision, recall, F1 score, structural similarity (SSIM), and the like. The precision can be understood as the ratio of correctly predicted boxes among the currently traversed prediction boxes, i.e. the overall prediction accuracy; the recall can be understood as the ratio of currently detected label boxes to all label boxes, i.e. the probability that an actually positive sample is predicted as positive; the F1 score is the harmonic mean of precision and recall; and SSIM is an index that evaluates image quality by jointly considering luminance, contrast and structure.
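A minimal sketch of the pixel-level precision, recall and F1 computations defined above (pure Python; the example masks are illustrative):

```python
def pixel_metrics(pred, truth):
    # Pixel-level precision, recall and F1 for binary masks, matching
    # the definitions given in the embodiment.
    tp = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    fp = sum(p == 1 and t == 0 for p, t in zip(pred, truth))
    fn = sum(p == 0 and t == 1 for p, t in zip(pred, truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

truth = [1, 1, 1, 0, 0, 0, 1, 0]   # ground-truth fruit mask (flattened)
pred  = [1, 1, 0, 0, 1, 0, 1, 0]   # model prediction
p, r, f1 = pixel_metrics(pred, truth)
```

With one missed fruit pixel and one false alarm, precision and recall are both 0.75, and the F1 score (their harmonic mean) is likewise 0.75.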
In this embodiment, the HDC-SegNet model and the SegNet model may be trained according to the plurality of plant image training samples. Referring to fig. 4 and 5, fig. 4 is a loss chart of the HDC-SegNet model according to the second embodiment of the plant identification method based on mixed expansion convolution of the present invention, and fig. 5 is a loss chart of the SegNet model according to the second embodiment of the plant identification method based on mixed expansion convolution of the present invention; the final loss of the SegNet model is about 4.0 and the final loss of the HDC-SegNet model is about 2.5. After 2000 training steps, the loss curves of both models begin to converge, and they no longer decrease after 4000 training steps. The precision, recall, F1 score, SSIM value and processing time of the SegNet model before improvement, the PSPNet model and the improved HDC-SegNet model are then compared according to the plurality of plant image training samples; referring to table 3, table 3 is a model evaluation index score comparison table.
TABLE 3
As shown in the table above, compared with the model before improvement, the precision is improved by 4.16%, the recall by 6.02%, the F1 score by 5.09%, and the SSIM value by 2.88%. Because the HDC-SegNet model has multi-scale detection capability, it can segment small fruit targets that the SegNet model misses, so the recall is greatly improved. The improved model's precision is higher and its segmentation result images are closer to the Ground Truth, so the SSIM value is improved. The improved model performs similarly to the PSPNet model, but the HDC-SegNet model only stores the maximum indices of the feature maps, which saves memory and makes the model more efficient, so the HDC-SegNet model's image processing time is shorter.
It should be further noted that five preprocessing methods are used in the data preprocessing section to enhance the images, making the algorithm more robust. To verify the impact of these five data enhancement methods on the training model, one data enhancement method was removed at a time and the change in the model's F1 score was observed.
The color balance processing has the greatest influence on model performance: removing it reduces the model's F1 score by 4.41%, while the rotation transformation has the least influence. The color balance processing corrects the color cast of the image so that it appears closer to the real object, whereas rotation has little effect on the features of the fruits in the image, so the model can still classify the pixel points well. The experimental results show that applying various preprocessing steps improves the richness of the data set, allowing the model to learn more scenes close to reality and adapt to the complex and changeable conditions of actual scenes.
In order to verify the segmentation capability of the model on images with different fruit numbers, 200 images each with fruit numbers of 1-5, 5-10 and more than 10 were selected, and the segmentation effects of the HDC-SegNet model, the SegNet model and the PSPNet model were compared. For images with more than 10 fruits, the accuracy of the improved model is improved by 1.09%; for images with 5-10 fruits, by 0.39%; and for images with 1-5 fruits, by 0.82%. Because the PSPNet model predicts based on structure, its results are finer when segmenting fruits and its segmentation of target edges is more accurate, so the PSPNet model segments better than the SegNet model. Referring to table 4, which compares the accuracy of each model for different fruit numbers, the overall segmentation effect of the improved model, i.e. the HDC-SegNet model, is superior to that of the model before improvement.
TABLE 4
Number of fruits | HDC-SegNet model/% | SegNet model/% | PSPNet model/%
1~5 | 91.45 | 90.63 | 91.63
5~10 | 90.34 | 89.95 | 90.29
More than 10 | 84.57 | 83.48 | 84.26
In order to further explore the performance of the models with different fruit sizes, images with a medium number of fruits were selected, 200 images each with large and small fruits, and the segmentation effects of the HDC-SegNet model, the SegNet model and the PSPNet model were compared. Referring to table 5, which compares the accuracy of each model for different fruit sizes, the improved model has a significant advantage in segmenting small fruits: an improvement of 8.25% over the model before improvement and of 5.05% over the PSPNet model. The PSPNet model improves segmentation performance by combining multi-scale features, so it segments fruits of different sizes better than the SegNet model. The experimental results show that the HDC-SegNet model segments both small and large fruits well, with a particularly obvious gain for small fruits, which proves that the multi-scale detection formed by mixed expansion convolution helps the semantic segmentation model to segment small fruits better, giving the model stronger generalization and the ability to adapt to segmentation tasks with different fruit sizes.
TABLE 5
Fruit size | HDC-SegNet model/% | SegNet model/% | PSPNet model/%
Large | 91.75 | 89.49 | 91.28
Small | 83.62 | 75.37 | 78.57
In this embodiment, a plurality of plant image samples under different scenes are first acquired and plant image category information corresponding to the plurality of plant image samples is respectively determined; the plurality of plant image samples are then preprocessed according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples; finally, an initial network model is trained according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model, and a preset mixed expansion image segmentation model is determined according to the plurality of plant image verification samples and the initial mixed expansion image segmentation model. Compared with the prior art, in which the encoder part analyzes and learns features from the image pixels to obtain high-order semantic information and the decoder part then classifies the pixels in the image according to this high-order information, mixed expansion convolution is added to the preset mixed expansion image segmentation model in this embodiment to realize segmentation of plants of different sizes, so that close-range plants are segmented more accurately in natural environments.
In addition, the embodiment of the invention also provides a storage medium, wherein the storage medium stores a plant identification program based on the mixed expansion convolution, and the plant identification program based on the mixed expansion convolution realizes the steps of the plant identification method based on the mixed expansion convolution when being executed by a processor.
In addition, referring to fig. 6, an embodiment of the present invention further provides a plant identification device based on a hybrid expansion convolution, where the plant identification device based on the hybrid expansion convolution includes:
the acquiring module 6001 is configured to acquire a plant image to be processed in a preset scene, and determine scene distance information of the plant image to be processed.
The plant image to be processed in the preset scene can be an image of fruit growing on a tree in a natural environment; the fruit image can be an image of a single green citrus fruit, of a plurality of green citrus fruits, and the like. In this embodiment, the plant image to be processed in the preset scene can be obtained through a built-in camera of the mobile terminal, through an external camera, or through a video camera.
The scene distance information is distance information between the camera and the plant, the scene distance information can be obtained according to pixel information of the plant image to be processed, and the scene distance information can be 10cm, 8cm and the like.
The processing module 6002 is configured to perform expansion processing on the plant image to be processed according to the scene distance information, so as to obtain an expanded plant image.
An expanded plant image can be understood as a plant image in which the plant area of the plant image to be processed is enlarged while the image size remains unchanged.
An image expansion rate is determined according to the scene distance information, image pixel information corresponding to the plant image to be processed is acquired, expansion processing is performed on the image pixel information according to the image expansion rate to obtain expansion pixel information, and the expanded plant image is generated according to the expansion pixel information. The expansion pixel information may be the pixel information of the plant image to be processed after expansion at the corresponding expansion rate.
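The mapping from scene distance to image expansion rate is not specified in detail here; the following is a purely hypothetical sketch of one way such a mapping could look (all breakpoints and rates are invented placeholders, not values from the patent):

```python
def expansion_rate_from_distance(distance_cm):
    # Hypothetical piecewise mapping: the farther the scene, the smaller
    # the fruit appears in the image, so a larger expansion rate is used.
    # The thresholds below are illustrative placeholders only.
    if distance_cm <= 5:
        return 1.0   # close shot, the fruit region is already large
    if distance_cm <= 10:
        return 1.5
    return 2.0       # far shot, enlarge the plant region more

rate = expansion_rate_from_distance(8)
```

Any monotonically increasing mapping would serve the same purpose: producing an expanded plant image whose fruit regions are large enough for the segmentation model.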
The size of fruits in the natural environment varies with their growth period and with the shooting distance, and if small objects cannot be reconstructed, the segmentation effect of the model on small fruits is affected. In the prior art, when a convolutional neural network is used for image processing, convolution and pooling operations are generally carried out, so that the feature map becomes smaller and can be transmitted to the final fully connected layer for classification. The image semantic segmentation task needs to classify each pixel point in the image, so the smaller feature map needs to be restored to the original image size through an up-sampling operation before pixel-level prediction is performed. This process has two major drawbacks: because the pooling operation is irreversible, information is lost when the feature map is restored; meanwhile, small object images cannot be reconstructed after multiple pooling operations. In this embodiment, however, dilation convolution is proposed to replace the conventional max-pooling and convolution operations, so that the receptive field is increased while the feature map is kept consistent with the original image size. The dilation convolution reduces the number of convolutions applied to each pixel, so that the semantic information of local pixel blocks can be effectively focused on, and the segmentation accuracy is not affected by mixing each pixel with its surrounding pixels.
If a plurality of identical expansion convolutions are superimposed, a large number of holes are found in practice, destroying the continuity and integrity of the data and losing local information. Because the input signals become relatively sparse, feature information extracted from far apart locations has no correlation, which affects the classification result. Therefore, mixed expansion convolution is adopted in this embodiment, with three characteristics: first, the expansion rates of the superimposed convolutions cannot be selected arbitrarily and cannot share a common divisor greater than 1; second, the expansion rates are designed in a zigzag structure, where small-expansion-rate convolutions focus on short-distance information and large-expansion-rate convolutions focus on long-distance information, so the requirements for segmenting both large and small objects can be met at the same time; by setting different expansion rates, a wider and more continuous range of pixel information can be processed in the same configuration without losing a large amount of local information; third, the expansion rates need to satisfy a preset expansion formula so that no cavity exists in the receptive field.
The preset expansion formula is:
M_i = max[ M_{i+1} − 2r_i , M_{i+1} − 2(M_{i+1} − r_i) , r_i ]

where M_i is the maximum expansion rate of the i-th layer and r_i is the expansion rate of the i-th layer.
In this embodiment, assuming there are n expansion convolution kernels of size K×K, then M_n = r_n. The purpose of this design is to ensure that M_2 ≤ K. It should be noted that the expansion rates in a group cannot have a common-factor relationship, such as 2, 4 and 8; otherwise the gridding problem arises in the network.
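The preset expansion formula and the M_2 ≤ K design goal can be checked programmatically; a sketch in pure Python, using the embodiment's rates [1, 2, 5] and kernel size K = 3:

```python
def max_distances(rates):
    # Compute M_i backwards from M_n = r_n using the preset formula
    # M_i = max(M_{i+1} - 2*r_i, M_{i+1} - 2*(M_{i+1} - r_i), r_i).
    m = [0] * len(rates)
    m[-1] = rates[-1]
    for i in range(len(rates) - 2, -1, -1):
        nxt = m[i + 1]
        m[i] = max(nxt - 2 * rates[i], nxt - 2 * (nxt - rates[i]), rates[i])
    return m

def is_valid_hdc(rates, k):
    # Design goal: M_2 <= K, i.e. no cavity in the receptive field.
    return max_distances(rates)[1] <= k

valid = is_valid_hdc([1, 2, 5], 3)    # the embodiment's zigzag rates
gridded = is_valid_hdc([2, 4, 8], 3)  # common-factor rates -> gridding
```

For [1, 2, 5] the backward pass gives M_2 = 2 ≤ 3, so the group is hole-free, while [2, 4, 8] yields M_2 = 4 > 3, reproducing the gridding problem noted above.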
The method is characterized in that aiming at a near-scenic fruit identification task, the size of the fruit in an image is greatly changed along with the selection of a shooting angle and the change of the shooting distance in the image acquisition process, wherein when the shooting distance is far, the distribution of the fruit in the image is more and smaller, and the method brings great difficulty to accurate segmentation. Therefore, in order to carry out semantic segmentation on fruits in multiple scenes, mixed expansion convolution with different expansion rates is overlapped to adapt to targets with different sizes, and multi-scale detection of images is realized. Meanwhile, by superposition of mixed expansion convolution, the depth of the model can be increased, so that the model learns deeper semantic features, and the segmentation effect is optimized.
The segmentation module 6003 is configured to input the inflated plant image into a preset hybrid inflated image segmentation model to obtain a plant segmentation image.
The preset mixed expansion image segmentation model may be constructed as follows: acquire a plurality of plant image samples in different scenes; respectively determine the plant image category information corresponding to the plurality of plant image samples; respectively preprocess the plurality of plant image samples according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples; train an initial network model according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model; and determine the preset mixed expansion image segmentation model according to the plurality of plant image verification samples and the initial mixed expansion image segmentation model. The plant image category information may include weather information, fruit number information, fruit size information, and the like of a plant image.
The plurality of plant image samples may be preprocessed according to the plant image category information as follows: respectively determine the image illumination information corresponding to the plurality of plant image samples according to the plant image category information; determine a preset gray world elimination rule according to the image illumination information; respectively process the plurality of plant image samples according to the preset gray world elimination rule to obtain a plurality of gray image samples; respectively determine the image brightness information corresponding to the plurality of gray images; determine a brightness processing function according to the image brightness information; respectively adjust the brightness of the plurality of gray images through the brightness processing function to obtain a plurality of brightness image samples; and determine the plurality of plant image training samples and the plurality of plant image verification samples according to the plurality of brightness image samples.
In order to enhance the richness of the green citrus dataset, the collected plant images are preprocessed in terms of color, brightness, rotation, noise, blurring, and the like through application program interfaces provided by a computer vision and machine learning software library, thereby expanding the dataset.
In this embodiment, different illumination conditions in the natural environment may cause a certain deviation between the colors in the captured image and the actual colors of the objects; the influence of illumination on color rendering can be eliminated by using a gray world algorithm. Applying the gray world algorithm to the plurality of plant image samples removes the influence of natural illumination on image color, bringing the images closer to the true colors of the objects.
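A gray world correction of the kind described can be sketched as follows (a minimal NumPy illustration, not the patent's exact implementation): each channel is rescaled so that its mean equals the mean over all channels, removing a global color cast.

```python
import numpy as np

def gray_world(img):
    # Assume the average scene color is gray: scale each RGB channel
    # so its mean matches the overall mean, removing the illuminant cast.
    img = img.astype(np.float64)
    channel_means = img.reshape(-1, 3).mean(axis=0)
    gray_mean = channel_means.mean()
    balanced = img * (gray_mean / channel_means)
    return np.clip(balanced, 0, 255).astype(np.uint8)
```

After correction, an image with a strong color cast has roughly equal channel means, so a green fruit under warm evening light looks closer to its daytime color.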
Due to the variability of illumination intensity in reality, the brightness of the original dataset images can be adjusted in this embodiment by changing the parameter of the brightness function. An image that is too bright or too dark may have unclear object edges, making it difficult to draw bounding boxes during manual annotation, which affects model performance. To avoid generating such images, parameter values for the brightness variation (0.3, 0.5, and 0.7) are selected so that target edges can still be accurately identified, yielding a plurality of brightness image samples. This simulates the continuously changing illumination intensity of the natural environment and overcomes the lack of robustness to varied illumination that results when image acquisition is concentrated in time.
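The brightness adjustment can be sketched as a simple intensity scaling (an assumption for illustration: the patent specifies only the parameter values 0.3, 0.5, and 0.7, not the exact brightness function, and the function names here are hypothetical):

```python
import numpy as np

def adjust_brightness(img, factor):
    # Scale intensities and clip to the valid 8-bit range; factors
    # below 1 simulate weaker illumination, above 1 stronger.
    return np.clip(img.astype(np.float64) * factor, 0, 255).astype(np.uint8)

def brightness_samples(img, factors=(0.3, 0.5, 0.7)):
    # One darkened sample per parameter value used in this embodiment.
    return [adjust_brightness(img, f) for f in factors]
```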
The plurality of plant image training samples and the plurality of plant image verification samples may be determined from the plurality of brightness image samples as follows: respectively perform rotation and mirror processing on the plurality of brightness image samples to obtain a plurality of rotated image samples; respectively perform noise processing on the plurality of rotated image samples to obtain a plurality of noise image samples; respectively perform blurring processing on the plurality of noise image samples to obtain a plurality of blurred image samples; and divide the plurality of blurred image samples according to a preset image proportion rule to obtain the plurality of plant image training samples and the plurality of plant image verification samples. The preset image proportion rule can be user-defined, for example 6:4 or 8:2.
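The division by a preset image proportion rule can be sketched as a shuffled split (a minimal illustration; the function name and fixed seed are assumptions, not from the patent):

```python
import random

def split_samples(samples, train_ratio=0.8, seed=0):
    # Shuffle once, then cut at the preset ratio (8:2 here; 6:4
    # would be train_ratio=0.6) into training and verification sets.
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```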
It should be noted that randomly rotating the plurality of brightness image samples helps enhance the generalization ability of the model. To further expand the image dataset, the images can also be rotated by 90, 180, and 270 degrees and mirrored. If the model can learn new features from noisy images, its noise resistance increases; Gaussian noise and salt-and-pepper noise can be used to process the images, generating noisy images, namely a plurality of noise image samples, for training the noise resistance of the model. It will be appreciated that in actual shooting, a long shooting distance, incorrect focusing, or movement of the phone can make the acquired image unclear. To improve the richness of the dataset and simulate image blurring during actual shooting, the images are processed with Gaussian blur and median blur respectively to obtain a plurality of blurred image samples, and the like.
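The noise and blur steps can be sketched without a computer vision library (a hedged NumPy illustration for grayscale images; the patent itself relies on library-provided Gaussian/salt-and-pepper noise and Gaussian/median blur, so these hand-rolled versions are stand-ins):

```python
import numpy as np

def salt_pepper_noise(img, amount=0.02, seed=0):
    # Flip a random fraction of pixels to black or white to train
    # the model's noise resistance.
    rng = np.random.default_rng(seed)
    out = img.copy()
    mask = rng.random(img.shape[:2])
    out[mask < amount / 2] = 0
    out[mask > 1 - amount / 2] = 255
    return out

def median_blur3(img):
    # 3x3 median filter over a grayscale image (borders left
    # untouched), simulating the median blurring of unfocused shots.
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = np.median(img[i-1:i+2, j-1:j+2])
    return out
```

Note that a median filter removes exactly the isolated salt-and-pepper pixels the noise step adds, which is why both are applied to separate copies rather than in sequence.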
In a specific implementation, the plurality of plant image training samples and the plurality of plant image verification samples are determined according to the plurality of blurred image samples, the initial network model is trained according to the plurality of plant image training samples to obtain the initial mixed expansion image segmentation model, and the preset mixed expansion image segmentation model is determined according to the plurality of plant image verification samples and the initial mixed expansion image segmentation model. The expanded plant image is then input into the preset mixed expansion image segmentation model to obtain a plant segmentation image, which can be a background-free segmentation image containing fruits of different sizes, and the like.
The identifying module 6004 is configured to identify a plant according to the plant segmentation image.
It will be appreciated that the number, size, and other attributes of the fruits can be identified from the background-free segmented images containing fruits of different sizes.
In this embodiment, a plant image to be processed in a preset scene is first obtained and its scene distance information is determined; expansion processing is then performed on the plant image to be processed according to the scene distance information to obtain an expanded plant image; the expanded plant image is input into a preset mixed expansion image segmentation model to obtain a plant segmentation image; and finally plant identification is performed according to the plant segmentation image. Compared with the prior art, in which the plant image must be segmented manually and processing efficiency is low, this embodiment processes the plant image with the preset mixed expansion image segmentation model, acquires the plant segmentation image quickly and accurately, and improves plant recognition accuracy.
Other embodiments or specific implementations of the plant identification device based on the hybrid expansion convolution may refer to the above method embodiments, and will not be described herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description and do not represent the relative merits of the embodiments. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the terms first, second, third, etc. does not denote any order; these terms are to be interpreted as names.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, and of course may also be implemented by means of hardware, but in many cases the former is the preferred embodiment. Based on such understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (e.g. read-only memory (Read-Only Memory, ROM)/random access memory (Random Access Memory, RAM), magnetic disk, or optical disk), comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structure or equivalent process transformation made using the contents of this specification, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (7)

1. A plant identification method based on mixed expansion convolution, which is characterized by comprising the following steps:
acquiring a plant image to be processed in a preset scene, and determining scene distance information of the plant image to be processed;
performing expansion processing on the plant image to be processed according to the scene distance information to obtain an expanded plant image;
inputting the expanded plant image into a preset mixed expanded image segmentation model to obtain a plant segmentation image;
performing plant identification according to the plant segmentation image;
the step of expanding the plant image to be processed according to the scene distance information to obtain an expanded plant image comprises the following steps:
determining an image expansion rate according to the scene distance information;
acquiring image pixel information corresponding to the plant image to be processed;
Performing expansion processing on the image pixel information according to the image expansion rate to obtain expansion pixel information;
generating an expanded plant image according to the expanded pixel information;
the preset expansion formula of the image expansion rate is as follows:
M_i = max[M_{i+1} - 2r_i, M_{i+1} - 2(M_{i+1} - r_i), r_i]
wherein M_i is the maximum expansion rate of the i-th layer, and r_i is the expansion rate of the i-th layer;
the method for acquiring the plant image to be processed in the preset scene and determining the scene distance information of the plant image to be processed further comprises the following steps:
acquiring a plurality of plant image samples under different scenes;
respectively determining plant image category information corresponding to the plurality of plant image samples;
preprocessing the plurality of plant image samples according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples;
training an initial network model according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model;
determining a preset mixed expansion image segmentation model according to the plurality of plant image verification samples and the initial mixed expansion image segmentation model;
the step of training the initial network model according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model comprises the following steps:
Respectively determining image distance information corresponding to the plurality of plant image training samples;
respectively determining the expansion rates of the image samples corresponding to the plurality of plant image training samples according to the image distance information;
judging whether the expansion rate of the image sample meets a preset expansion factor relation or not;
and training an initial network model according to the plurality of plant image training samples and the image sample expansion rate corresponding to the plurality of plant image training samples when the image sample expansion rate meets the preset expansion factor relation, so as to obtain an initial mixed expansion image segmentation model.
2. The method of claim 1, wherein the step of preprocessing the plurality of plant image samples according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples, respectively, comprises:
respectively determining image illumination information corresponding to the plurality of plant image samples according to the plant image category information;
determining a preset gray world elimination rule according to the image illumination information;
respectively processing the plurality of plant image samples according to the preset gray world elimination rule to obtain a plurality of gray image samples;
And determining a plurality of plant image training samples and a plurality of plant image verification samples according to the plurality of gray level image samples.
3. The method of claim 2, wherein the step of determining a plurality of plant image training samples and a plurality of plant image verification samples from the plurality of gray scale image samples comprises:
respectively determining image brightness information corresponding to the plurality of gray images;
determining a brightness processing function according to the image brightness information;
respectively carrying out brightness adjustment on the gray images through the brightness processing function to obtain a plurality of brightness image samples;
and determining a plurality of plant image training samples and a plurality of plant image verification samples according to the plurality of brightness image samples.
4. The method of claim 3, wherein the step of determining a plurality of plant image training samples and a plurality of plant image verification samples from the plurality of luminance image samples comprises:
respectively performing rotation and mirror processing on the plurality of brightness image samples to obtain a plurality of rotated image samples;
respectively performing noise processing on the plurality of rotated image samples to obtain a plurality of noise image samples;
Respectively carrying out fuzzy processing on the plurality of noise image samples to obtain a plurality of fuzzy image samples;
dividing the plurality of fuzzy image samples according to a preset image proportion rule to obtain a plurality of plant image training samples and a plurality of plant image verification samples.
5. The method of claim 1, wherein the step of determining a preset blended-up image segmentation model from the plurality of plant image verification samples and the initial blended-up image segmentation model comprises:
verifying the initial mixed expansion image segmentation model according to the multiple plant image verification samples to obtain a model evaluation index score;
and when the model evaluation index score meets a preset index condition, taking the initial mixed expansion image segmentation model as a preset mixed expansion image segmentation model.
6. A plant identification device based on a hybrid expansion convolution, the plant identification device based on the hybrid expansion convolution comprising:
the acquisition module is used for acquiring a plant image to be processed in a preset scene and determining scene distance information of the plant image to be processed;
the processing module is used for performing expansion processing on the plant image to be processed according to the scene distance information to obtain an expanded plant image;
The segmentation module is used for inputting the expanded plant image into a preset mixed expanded image segmentation model to obtain a plant segmentation image;
the identification module is used for carrying out plant identification according to the plant segmentation image;
the processing module is also used for determining the image expansion rate according to the scene distance information;
acquiring image pixel information corresponding to the plant image to be processed;
performing expansion processing on the image pixel information according to the image expansion rate to obtain expansion pixel information;
generating an expanded plant image according to the expanded pixel information;
the preset expansion formula of the image expansion rate is as follows:
M_i = max[M_{i+1} - 2r_i, M_{i+1} - 2(M_{i+1} - r_i), r_i]
wherein M_i is the maximum expansion rate of the i-th layer, and r_i is the expansion rate of the i-th layer;
the acquisition module is also used for acquiring a plurality of plant image samples in different scenes;
respectively determining plant image category information corresponding to the plurality of plant image samples;
preprocessing the plurality of plant image samples according to the plant image category information to obtain a plurality of plant image training samples and a plurality of plant image verification samples;
training an initial network model according to the plurality of plant image training samples to obtain an initial mixed expansion image segmentation model;
Determining a preset mixed expansion image segmentation model according to the plurality of plant image verification samples and the initial mixed expansion image segmentation model;
the acquisition module is further used for respectively determining image distance information corresponding to the plurality of plant image training samples;
respectively determining the expansion rates of the image samples corresponding to the plurality of plant image training samples according to the image distance information;
judging whether the expansion rate of the image sample meets a preset expansion factor relation or not;
and training an initial network model according to the plurality of plant image training samples and the image sample expansion rate corresponding to the plurality of plant image training samples when the image sample expansion rate meets the preset expansion factor relation, so as to obtain an initial mixed expansion image segmentation model.
7. A storage medium having stored thereon a mixed-expansion convolution-based plant identification program which, when executed by a processor, implements the steps of the mixed-expansion convolution-based plant identification method of any one of claims 1 to 5.
CN202110947152.2A 2021-08-17 2021-08-17 Plant identification method, device and storage medium based on mixed expansion convolution Active CN113808055B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110947152.2A CN113808055B (en) 2021-08-17 2021-08-17 Plant identification method, device and storage medium based on mixed expansion convolution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110947152.2A CN113808055B (en) 2021-08-17 2021-08-17 Plant identification method, device and storage medium based on mixed expansion convolution

Publications (2)

Publication Number Publication Date
CN113808055A CN113808055A (en) 2021-12-17
CN113808055B true CN113808055B (en) 2023-11-24

Family

ID=78893737

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110947152.2A Active CN113808055B (en) 2021-08-17 2021-08-17 Plant identification method, device and storage medium based on mixed expansion convolution

Country Status (1)

Country Link
CN (1) CN113808055B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109559315A (en) * 2018-09-28 2019-04-02 天津大学 A kind of water surface dividing method based on multipath deep neural network
CN110287991A (en) * 2019-05-22 2019-09-27 平安科技(深圳)有限公司 Plant crude drug authenticity verification method, apparatus, computer equipment and storage medium
WO2020215236A1 (en) * 2019-04-24 2020-10-29 哈尔滨工业大学(深圳) Image semantic segmentation method and system
CN111860537A (en) * 2020-07-17 2020-10-30 中南民族大学 Deep learning-based green citrus identification method, equipment and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110363210B (en) * 2018-04-10 2023-05-05 腾讯科技(深圳)有限公司 Training method and server for image semantic segmentation model

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109559315A (en) * 2018-09-28 2019-04-02 天津大学 A kind of water surface dividing method based on multipath deep neural network
WO2020215236A1 (en) * 2019-04-24 2020-10-29 哈尔滨工业大学(深圳) Image semantic segmentation method and system
CN110287991A (en) * 2019-05-22 2019-09-27 平安科技(深圳)有限公司 Plant crude drug authenticity verification method, apparatus, computer equipment and storage medium
CN111860537A (en) * 2020-07-17 2020-10-30 中南民族大学 Deep learning-based green citrus identification method, equipment and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Automatic building identification in high-resolution imagery combining dilated convolution residual networks and pyramid pooling representation; Qiao Wenfan; Shen Li; Dai Yanshuai; Cao Yungang; Geography and Geo-Information Science (05); pp. 62-68 *

Also Published As

Publication number Publication date
CN113808055A (en) 2021-12-17

Similar Documents

Publication Publication Date Title
CN111723860B (en) Target detection method and device
CN108154192B (en) High-resolution SAR terrain classification method based on multi-scale convolution and feature fusion
CN106897673B (en) Retinex algorithm and convolutional neural network-based pedestrian re-identification method
CN109684922B (en) Multi-model finished dish identification method based on convolutional neural network
CN107808138B (en) Communication signal identification method based on FasterR-CNN
CN109978848B (en) Method for detecting hard exudation in fundus image based on multi-light-source color constancy model
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN109753996B (en) Hyperspectral image classification method based on three-dimensional lightweight depth network
CN113011357A (en) Depth fake face video positioning method based on space-time fusion
CN109635789B (en) High-resolution SAR image classification method based on intensity ratio and spatial structure feature extraction
CN111833322B (en) Garbage multi-target detection method based on improved YOLOv3
CN111860537B (en) Deep learning-based green citrus identification method, equipment and device
Backes et al. Plant leaf identification using multi-scale fractal dimension
CN112232180A (en) Night underwater fish target detection method
CN113627472A (en) Intelligent garden defoliating pest identification method based on layered deep learning model
CN108664839A (en) A kind of image processing method and equipment
CN111340019A (en) Grain bin pest detection method based on Faster R-CNN
CN113159045A (en) Verification code identification method combining image preprocessing and convolutional neural network
CN114399480A (en) Method and device for detecting severity of vegetable leaf disease
CN111832508B (en) DIE _ GA-based low-illumination target detection method
CN116543325A (en) Unmanned aerial vehicle image-based crop artificial intelligent automatic identification method and system
CN113808055B (en) Plant identification method, device and storage medium based on mixed expansion convolution
CN109063749B (en) Robust convolution kernel number adaptation method based on angular point radiation domain
CN116071339A (en) Product defect identification method based on improved whale algorithm optimization SVM
CN115761888A (en) Tower crane operator abnormal behavior detection method based on NL-C3D model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231012

Address after: 430000, No. 708, 823, Minzu Avenue, Hongshan District, Wuhan City, Hubei Province

Applicant after: SOUTH CENTRAL University FOR NATIONALITIES

Applicant after: Wuhan bitaoju Agricultural Development Co.,Ltd.

Address before: Central South University for nationalities, No.182 Minzu Avenue, Hongshan District, Wuhan City, Hubei Province

Applicant before: SOUTH CENTRAL University FOR NATIONALITIES

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant