CN113837062A - Classification method and device, storage medium and electronic equipment - Google Patents

Classification method and device, storage medium and electronic equipment

Info

Publication number
CN113837062A
Authority
CN
China
Prior art keywords
image
target
classified
classification
type information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111106836.6A
Other languages
Chinese (zh)
Inventor
房建东
黄居松
赵于东
李巴津
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inner Mongolia University of Technology
Original Assignee
Inner Mongolia University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inner Mongolia University of Technology filed Critical Inner Mongolia University of Technology
Priority to CN202111106836.6A priority Critical patent/CN113837062A/en
Publication of CN113837062A publication Critical patent/CN113837062A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent

Abstract

The present disclosure provides a classification method and apparatus, a storage medium and an electronic device. The classification method includes: acquiring a first image, wherein the first image includes a target to be classified and other objects; segmenting the first image with a pre-trained segmentation model to obtain a second image, wherein the segmentation model includes a fully convolutional neural network and the second image contains the target to be classified; and recognizing the second image to determine the type information of the target to be classified. In this method, the parameters of the segmentation model are obtained by adapting the parameters of a visual geometry group network; the segmentation model segments the first image into the second image, and the second image is then recognized to determine the type information of the target to be classified. This solves the problems that conventional methods are time-consuming and labor-intensive, have low accuracy, depend on manual feature extraction, and classify pasture poorly against a complex background, so the type information of the target to be classified remains reliable and the accuracy of the classification result is improved.

Description

Classification method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of classification technologies, and in particular, to a classification method and apparatus, a storage medium, and an electronic device.
Background
With the rapid development of computer and internet technologies, grassland resources can be managed automatically and digitally using techniques such as computer vision, which helps sustain animal husbandry and keep the grassland environment ecologically balanced. Classifying and recognizing pasture images is one of the basic requirements of digital pasture-resource management, and fast, accurate classification is important for pasture planting, production and related work.
At present, image classification and recognition methods include: (1) manual visual inspection, which classifies pasture by human experience; (2) traditional image-processing-based pasture classification, using algorithms such as naive Bayes, K-NN, the Support Vector Machine (SVM) and the BP neural network, where the pasture image is first preprocessed and separated from the background to obtain the pasture to be classified, first-, second- and third-order color features, texture features and shape features are then extracted, Principal Component Analysis (PCA) reduces the dimensionality of the extracted features, and the above algorithms finally classify the pasture; and (3) deep-learning-based forage classification with a Convolutional Neural Network (CNN), where multiple convolution layers extract deep features from the input pasture image and a fully connected layer plus a SoftMax classification layer finally classify the pasture.
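For context, the sketch below outlines a pipeline of the kind described in (2), assuming simplified colour-moment features, PCA for dimensionality reduction and an RBF-kernel SVM; the feature set, component count and kernel are illustrative placeholders rather than the exact choices used by these traditional methods.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def color_moments(image_rgb):
    """First-, second- and third-order colour moments per channel (9 values)."""
    x = image_rgb.reshape(-1, 3).astype(np.float64)
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    skew = np.cbrt(((x - mean) ** 3).mean(axis=0))
    return np.concatenate([mean, std, skew])

def train_traditional_classifier(images, labels):
    """images: background-separated HxWx3 arrays; labels: pasture class ids."""
    features = np.stack([color_moments(img) for img in images])
    clf = make_pipeline(PCA(n_components=5), SVC(kernel="rbf"))  # PCA reduction + SVM classifier
    clf.fit(features, labels)
    return clf
```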
However, method (1) classifies pasture by human experience, which is time-consuming, labor-intensive and not very accurate. In method (2), the naive Bayes algorithm assumes that sample attributes are independent, so it performs poorly when attributes are correlated, and pastures of the same family never have completely independent attributes; the SVM must construct and train several binary classifiers for a multi-class task, and training slows down as the number of training samples grows; the biggest problems of the BP neural network are that it has far too many weights to compute, lacks the weight sharing of a CNN, consumes a large amount of time, and depends heavily on the amount of training data. In general, traditional classification methods rely on hand-crafted features, which adapt poorly to pasture images taken under different conditions (such as changes in illumination, pasture posture or the surrounding background) and therefore give poor accuracy. Method (3) needs no manual feature extraction, but when the background is complex the pasture features are disturbed by surrounding background noise, and different kinds of pasture against similar backgrounds are also classified poorly.
Disclosure of Invention
In view of this, an object of the embodiments of the present disclosure is to provide a classification method and apparatus, a storage medium and an electronic device, so as to solve the problems that conventional methods are time-consuming and labor-intensive, have low accuracy, depend on manual feature extraction, and classify pasture poorly against a complex background.
In a first aspect, an embodiment of the present disclosure provides a classification method, including:
acquiring a first image, wherein the target image comprises a target to be classified and other objects;
segmenting the first image by utilizing a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and identifying the second image, and determining the type information of the target to be classified.
In a possible implementation manner, the recognizing the second image and determining the type information of the object to be classified includes:
and inputting the second image into a pre-trained classification model to obtain the type information of the target to be classified.
In one possible embodiment, the segmentation model and the classification model are both determined based on parameters in a visual geometry group network.
In a possible embodiment, the type information of the object to be classified includes a type and a proportion.
In a second aspect, an embodiment of the present disclosure further provides a classification apparatus, which includes:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is configured to acquire a first image, and the target image comprises a target to be classified and other objects;
the segmentation module is configured to segment the first image by using a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and the identification module is configured to identify the second image and determine the type information of the target to be classified.
In a possible implementation, the identification module is specifically configured to:
and inputting the second image into a pre-trained classification model to obtain the type information of the target to be classified.
In one possible embodiment, the segmentation model and the classification model are both determined based on parameters in a visual geometry group network.
In a possible embodiment, the type information of the object to be classified includes a type and a proportion.
In a third aspect, an embodiment of the present disclosure further provides a computer-readable storage medium having a computer program stored thereon, and the computer program, when executed by a processor, performs the following steps:
acquiring a first image, wherein the target image comprises a target to be classified and other objects;
segmenting the first image by utilizing a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and identifying the second image, and determining the type information of the target to be classified.
In a fourth aspect, the present disclosure also provides an electronic device, including: a processor and a memory, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over a bus when an electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of:
acquiring a first image, wherein the target image comprises a target to be classified and other objects;
segmenting the first image by utilizing a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and identifying the second image, and determining the type information of the target to be classified.
According to the method and the apparatus, the parameters of the segmentation model are obtained by adapting the parameters of the visual geometry group network; the segmentation model segments the first image, removing everything except the target to be classified, to obtain the second image, and the second image is then recognized to determine the type information of the target to be classified. This solves the problems that conventional methods are time-consuming and labor-intensive, have low accuracy, depend on manual feature extraction, and classify pasture poorly against a complex background; even if the background of the first image is complex, the accuracy of the type information of the target to be classified is ensured, and the accuracy of the classification result is improved.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the present disclosure or the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present disclosure, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 illustrates a flow chart of a classification method provided by the present disclosure;
FIG. 2 is a schematic diagram illustrating the structure of the FCN in the classification method provided by the present disclosure;
FIG. 3a shows a schematic diagram of a first image before segmentation in a classification method provided by the present disclosure;
FIG. 3b is a schematic diagram of a segmented second image in the classification method provided by the present disclosure;
FIG. 4 is a schematic structural diagram of a classification model in the classification method provided by the present disclosure;
FIG. 5 shows a schematic structural diagram of a classification apparatus provided by the present disclosure;
FIG. 6 shows a schematic structural diagram of an electronic device provided by the present disclosure.
Detailed Description
Various aspects and features of the disclosure are described herein with reference to the drawings.
It will be understood that various modifications may be made to the embodiments of the present application. Accordingly, the foregoing description should not be construed as limiting, but merely as exemplifications of embodiments. Other modifications will occur to those skilled in the art within the scope and spirit of the disclosure.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the disclosure and, together with a general description of the disclosure given above, and the detailed description of the embodiments given below, serve to explain the principles of the disclosure.
These and other characteristics of the present disclosure will become apparent from the following description of preferred forms of embodiment, given as non-limiting examples, with reference to the attached drawings.
It should also be understood that, although the present disclosure has been described with reference to some specific examples, a person of skill in the art shall certainly be able to achieve many other equivalent forms of the disclosure, having the characteristics as set forth in the claims and hence all coming within the field of protection defined thereby.
The above and other aspects, features and advantages of the present disclosure will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present disclosure are described hereinafter with reference to the accompanying drawings; however, it is to be understood that the disclosed embodiments are merely examples of the disclosure, which may be embodied in various forms. Well-known and/or repeated functions and structures are not described in detail so as not to obscure the present disclosure with unnecessary detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present disclosure in virtually any appropriately detailed structure.
The specification may use the phrases "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments in accordance with the disclosure.
In a first aspect, to facilitate an understanding of the present disclosure, the method for classifying pasture provided by the present disclosure is described in detail first. As shown in fig. 1, the classification method provided by the embodiment of the present disclosure specifically includes the following steps:
s101, a first image is obtained, wherein the target image comprises a target to be classified and other objects.
In a specific implementation, an image acquisition device captures a first image of a target area in which pasture grows; that is, the first image contains the target to be classified, namely pasture, together with other objects such as stones or bare ground.
S102, segmenting the first image by using a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a fully convolutional neural network, and the second image comprises the target to be classified.
Optionally, the segmentation model includes a Fully Convolutional Network (FCN). An FCN classifies the image at the pixel level and thereby performs semantic-level image segmentation. Moreover, an FCN accepts input images of any size, so the image acquisition device used in this embodiment is not restricted.
Specifically, the FCN consists of the first five convolution blocks of a Visual Geometry Group (VGG)-16 network plus three deconvolution layers that replace the fully connected layers; the FCN structure is shown in fig. 2. The VGG-16 model is trained on the ImageNet database of about one million images, so it has strong deep-feature learning ability and a large number of trained parameters and weights; its convolution layers are particularly good at extracting curve, edge and contour features of an image. Therefore, to avoid training the whole segmentation model from scratch, shorten the training time of the FCN (i.e. the segmentation model) and improve training efficiency, the trained VGG-16 model can serve as a pre-training model for the segmentation model, and its parameters are transferred into the segmentation model and then adjusted for the specific use scenario of the segmentation model. Concretely, transfer learning initializes the network with the first five convolution blocks of the VGG-16 model; these convolution blocks extract features from the input image and output a feature map, and three deconvolution layers progressively upsample the feature map produced by the fifth convolution block, the first two with a stride of 2 and the last with a stride of 8. Upsampling uses bilinear interpolation so that the edges of the pasture region of interest are reconstructed smoothly. The feature map of the last convolution layer is thus upsampled by the deconvolution layers back to the size of the input image, so a prediction can be generated for each pixel while the spatial information of the original input image is preserved; finally each pixel of the upsampled feature map is classified, i.e. it is decided whether the pixel is a grass pixel.
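As a concrete illustration, the following PyTorch sketch builds a segmentation network of this kind: the convolution blocks of an ImageNet-pretrained VGG-16 followed by three deconvolution (transposed-convolution) layers with strides 2, 2 and 8, and a final bilinear interpolation back to the input size. The kernel sizes, channel widths, two-class output and the absence of skip connections are assumptions for illustration, not details stated in this embodiment.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class FCNSegmenter(nn.Module):
    def __init__(self, num_classes=2):  # assumed: grass vs. background
        super().__init__()
        # Convolution blocks 1-5 of VGG-16 (ImageNet-pretrained), output stride 32.
        self.backbone = vgg16(pretrained=True).features
        # Three deconvolution layers replace the fully connected layers:
        # the first two upsample by 2, the last by 8 (32x in total).
        self.deconv1 = nn.ConvTranspose2d(512, 256, kernel_size=4, stride=2, padding=1)
        self.deconv2 = nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1)
        self.deconv3 = nn.ConvTranspose2d(128, num_classes, kernel_size=16, stride=8, padding=4)

    def forward(self, x):
        h, w = x.shape[2:]
        feat = self.backbone(x)              # feature map from convolution block 5
        out = torch.relu(self.deconv1(feat))
        out = torch.relu(self.deconv2(out))
        out = self.deconv3(out)
        # Bilinear interpolation restores the exact input resolution so that
        # every pixel receives a class score (grass pixel or not).
        return nn.functional.interpolate(out, size=(h, w), mode="bilinear", align_corners=False)
```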
The FCN is trained with back-propagation. In a convolutional network, the earlier convolution layers extract low-level features and the later ones extract high-level features, so transfer learning here uses staged learning at training initialization: first, the convolution kernels of the initialized convolution blocks 1 and 2 are fixed and the model only learns the kernels of convolution blocks 3, 4 and 5, training until the validation accuracy saturates; then, starting from the model learned so far, the FCN also learns the convolution kernels of the first two convolution blocks, completing the training of the segmentation model.
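A minimal sketch of this two-stage schedule, reusing the FCNSegmenter above, might look as follows; the module indices taken to delimit convolution blocks 1-2 inside VGG-16's feature stack, the optimizer and the learning rates are assumptions, and the fixed epoch counts stand in for "train until validation accuracy saturates".

```python
import torch

def run_stage(model, loader, criterion, optimizer, epochs):
    """One back-propagation training stage over (image, mask) batches."""
    model.train()
    for _ in range(epochs):
        for images, masks in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), masks)
            loss.backward()
            optimizer.step()

def staged_transfer_learning(model, loader, criterion, epochs=20):
    # Convolution blocks 1 and 2 of VGG-16 are the first 10 modules of .features (assumed indexing).
    early_params = list(model.backbone[:10].parameters())

    # Stage 1: freeze blocks 1-2; only blocks 3-5 and the deconvolution layers learn.
    for p in early_params:
        p.requires_grad = False
    opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=1e-3, momentum=0.9)
    run_stage(model, loader, criterion, opt, epochs)

    # Stage 2: start from the stage-1 weights and let blocks 1-2 learn as well.
    for p in early_params:
        p.requires_grad = True
    opt = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
    run_stage(model, loader, criterion, opt, epochs)
```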
Here, the trained segmentation model segments the first image to obtain a second image that contains the target to be classified; the final segmentation effect is shown in fig. 3a and 3b, where fig. 3a is the first image before segmentation and fig. 3b is the second image after segmentation. In the second image, everything except the target to be classified has been removed; in other words, the background region of the first image is ignored and only the highly recognizable target region containing the target to be classified is kept, which makes it easier to classify the target against a complex background and better suits practical applications. That is to say, using the VGG-16 model as the pre-training model of the segmentation model in this embodiment not only improves the training efficiency of the segmentation model but also ensures segmentation accuracy.
S103, identifying the second image and determining the type information of the target to be classified.
After the second image is obtained, the target to be classified contained in it is recognized and classified; specifically, the second image is input into a pre-trained classification model to obtain the type information of the target to be classified. The type information includes the type and the proportion: the types cover the 5 classes of nutritional pasture together with non-nutritional pasture (6 classes in total), and the proportion is the share of the target to be classified in the second image; if two or more types are present in the second image, the type information with the highest proportion is output.
Here, the classification model in this embodiment also uses the VGG-16 model as its pre-training model; that is, the model parameters of the VGG-16 network are "migrated" into the classification model and then adjusted for the specific use scenario of the classification model. Based on the network structure of the VGG-16 model and a convolutional neural network, and combining the types and characteristics of the 5 nutritional pastures, a nutritional-pasture classification model is constructed, as shown in fig. 4. The model is divided into 5 layers: a convolution layer, a pooling layer, two fully connected layers and a SoftMax classification layer; the length-512 one-dimensional vector output by the global average pooling layer is fully connected to 6 classification neurons, and a classification layer with a 5-label SoftMax function replaces the SoftMax classifier of the original VGG-16 model, its output generating the classification label through the 5-label SoftMax function. Optionally, the activation function of the classification model is the ReLU function, and the Adam optimization algorithm is selected.
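A minimal sketch of such a classification network is given below: the VGG-16 convolution blocks, global average pooling to a length-512 vector, two fully connected layers and SoftMax-based classification with the Adam optimizer. The hidden width of the first fully connected layer is an assumption, and because the text mentions both a 5-label SoftMax and 6 classification neurons, the class count is left as a parameter.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class PastureClassifier(nn.Module):
    def __init__(self, num_classes=6):
        super().__init__()
        self.backbone = vgg16(pretrained=True).features   # 13 convolution layers + pooling
        self.gap = nn.AdaptiveAvgPool2d(1)                 # global average pooling -> 512 values
        self.classifier = nn.Sequential(                   # two fully connected layers
            nn.Linear(512, 512),
            nn.ReLU(),                                     # ReLU activation, as named in the text
            nn.Linear(512, num_classes),                   # SoftMax is applied by the loss / at inference
        )

    def forward(self, x):
        feat = self.gap(self.backbone(x)).flatten(1)
        return self.classifier(feat)

model = PastureClassifier()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, as named in the text
criterion = nn.CrossEntropyLoss()                          # applies SoftMax internally
```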
The last three fully connected layers of the VGG-16 model contain about 1.2 × 10^8 parameters, designed for 1000 classification categories, whereas this work classifies only 6 categories; after the fully connected layers are replaced, the fully connected layers of the classification model in this embodiment contain only 2561 parameters, which provides the recognition accuracy and efficiency needed for classifying the 6 categories. The structure of the classification model is shown in fig. 4, and the main training flow of the classification model is as follows:
First, the 5 classes of nutritional pasture samples are input: 90% of the images of the 5 nutritional pasture classes are randomly drawn from the nutritional-pasture image library and fed to the model as the training sample set. The nutritional pasture images are then preprocessed, i.e. screened: blurred images are deleted and the remaining images are normalized to the same resolution. Next, the nutritional-pasture classification model is built: based on the VGG-16 model, the fully connected layers are reduced from 3 to 2 and the original SoftMax classification layer is replaced with the 5-label SoftMax classifier. Then fine-tuned transfer learning is applied: the parameters of the nutritional-pasture recognition model are optimized from the VGG-16 network parameters by transfer learning, which mainly determines the parameters of the 13 convolution layers and the pooling layers. Finally the model is trained: the model parameters are initialized randomly, the learning rate and training time are set, the parameters of the 13 convolution layers and the pooling layers are frozen, and the parameters of the 2 fully connected layers and the SoftMax classification layer are trained and optimized through iterations of the loss function, yielding the classification model.
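The sketch below illustrates this training flow under stated assumptions, reusing PastureClassifier from the previous snippet: the image-library path, the 224x224 resolution, the batch size, the learning rate and the epoch count are illustrative, and the random 90/10 split stands in for the random extraction of 90% of the images.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),   # normalize all images to the same resolution
    transforms.ToTensor(),
])
# "nutritional_pasture_images" is a hypothetical folder with one sub-folder per pasture class.
full_set = datasets.ImageFolder("nutritional_pasture_images", transform=preprocess)
n_train = int(0.9 * len(full_set))
train_set, test_set = random_split(full_set, [n_train, len(full_set) - n_train])
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = PastureClassifier()                 # VGG-16 parameters migrated via the pretrained backbone
for p in model.backbone.parameters():       # freeze the 13 convolution layers and pooling layers
    p.requires_grad = False
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(30):                     # iterate the loss to train the FC and SoftMax layers
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```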
After the classification model is obtained, it can be tested on a test set: likewise, the 5 classes of pasture images are selected, with 20 images per class used as the test set to verify the accuracy of the model; of course, the test-set pasture images may also include images that contain no nutritional pasture.
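Evaluation on the test set could then follow the usual top-1 accuracy check, as in the sketch below (test_set and model come from the previous snippet; the metric choice is an assumption).

```python
import torch
from torch.utils.data import DataLoader

test_loader = DataLoader(test_set, batch_size=32)

model.eval()
correct = total = 0
with torch.no_grad():
    for images, labels in test_loader:
        preds = model(images).argmax(dim=1)   # predicted class per test image
        correct += (preds == labels).sum().item()
        total += labels.numel()
print(f"top-1 accuracy: {correct / total:.3f}")
```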
In this way, both the segmentation model and the classification model in this embodiment are determined based on the parameters of the visual geometry group network, which improves the training efficiency of both models, improves the accuracy of segmentation and classification, and solves the problems that conventional methods are time-consuming and labor-intensive, have low accuracy, depend on manual feature extraction, and classify pasture poorly against a complex background.
Based on the same inventive concept, the second aspect of the present disclosure further provides a classification apparatus corresponding to the classification method, and since the principle of the apparatus in the present disclosure for solving the problem is similar to the classification method in the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not repeated.
Referring to fig. 5, the classification apparatus includes:
an obtaining module 501 configured to obtain a first image, where the target image includes a target to be classified and other objects;
a segmentation module 502 configured to segment the first image by using a pre-trained segmentation model to obtain a second image, where the segmentation model includes a full convolution neural network, and the second image includes the target to be classified;
an identifying module 503 configured to identify the second image and determine type information of the object to be classified.
In another embodiment, the identifying module 503 is specifically configured to:
and inputting the second image into a pre-trained classification model to obtain the type information of the target to be classified.
In another embodiment, the segmentation model and the classification model are both determined based on parameters in a visual geometry group network.
In another embodiment, the type information of the target to be classified includes a type and a proportion.
According to the method and the apparatus, the parameters of the segmentation model are obtained by adapting the parameters of the visual geometry group network; the segmentation model segments the first image, removing everything except the target to be classified, to obtain the second image, and the second image is then recognized to determine the type information of the target to be classified. This solves the problems that conventional methods are time-consuming and labor-intensive, have low accuracy, depend on manual feature extraction, and classify pasture poorly against a complex background; even if the background of the first image is complex, the accuracy of the type information of the target to be classified is ensured, and the accuracy of the classification result is improved.
The third aspect of the present disclosure also provides a storage medium, which is a computer-readable medium storing a computer program, and when the computer program is executed by a processor, the computer program implements the method provided in any embodiment of the present disclosure, including the following steps:
s11, acquiring a first image, wherein the target image comprises a target to be classified and other objects;
s12, segmenting the first image by using a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
s13, recognizing the second image, and determining the type information of the object to be classified.
When the computer program is executed by the processor to recognize the second image and determine the type information of the target to be classified, the processor specifically further performs the following step: the second image is input into a pre-trained classification model to obtain the type information of the target to be classified.
When the computer program is executed by a processor for the classification method, the segmentation model and the classification model are both determined based on parameters in a visual geometry group network.
When the computer program is used for executing the classification method by a processor, the type information of the target to be classified comprises a type and a proportion.
According to the method and the apparatus, the parameters of the segmentation model are obtained by adapting the parameters of the visual geometry group network; the segmentation model segments the first image, removing everything except the target to be classified, to obtain the second image, and the second image is then recognized to determine the type information of the target to be classified. This solves the problems that conventional methods are time-consuming and labor-intensive, have low accuracy, depend on manual feature extraction, and classify pasture poorly against a complex background; even if the background of the first image is complex, the accuracy of the type information of the target to be classified is ensured, and the accuracy of the classification result is improved.
It should be noted that the storage media described above in this disclosure can be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any storage medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The fourth aspect of the present disclosure also provides an electronic device. As shown in fig. 6, the electronic device includes at least a memory 601 and a processor 602; a computer program is stored on the memory 601, and the processor 602 implements the method provided by any embodiment of the present disclosure when executing the computer program on the memory 601. Illustratively, the computer program executed by the electronic device performs the following method:
s21, acquiring a first image, wherein the target image comprises a target to be classified and other objects;
s22, segmenting the first image by using a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
s23, recognizing the second image, and determining the type information of the object to be classified.
When performing, as stored in the memory, the recognition of the second image and the determination of the type information of the target to be classified, the processor further executes the following step: the second image is input into a pre-trained classification model to obtain the type information of the target to be classified.
The processor, when executing the classification method stored on the memory, determines the segmentation model and the classification model based on parameters in a visual geometry group network.
When the processor executes the classification method stored in the memory, the type information of the target to be classified comprises a type and a proportion.
According to the method and the apparatus, the parameters of the segmentation model are obtained by adapting the parameters of the visual geometry group network; the segmentation model segments the first image, removing everything except the target to be classified, to obtain the second image, and the second image is then recognized to determine the type information of the target to be classified. This solves the problems that conventional methods are time-consuming and labor-intensive, have low accuracy, depend on manual feature extraction, and classify pasture poorly against a complex background; even if the background of the first image is complex, the accuracy of the type information of the target to be classified is ensured, and the accuracy of the classification result is improved.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description covers only preferred embodiments of the present disclosure and illustrates the principles of the technology employed. Those skilled in the art will appreciate that the scope of the disclosure is not limited to technical solutions formed by the particular combination of features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, solutions in which the above features are replaced with (but not limited to) features with similar functions disclosed in the present disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
While the present disclosure has been described in detail with reference to the embodiments, the present disclosure is not limited to the specific embodiments, and those skilled in the art can make various modifications and alterations based on the concept of the present disclosure, and the modifications and alterations should fall within the scope of the present disclosure as claimed.

Claims (10)

1. A method of classification, comprising:
acquiring a first image, wherein the target image comprises a target to be classified and other objects;
segmenting the first image by utilizing a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and identifying the second image, and determining the type information of the target to be classified.
2. The classification method according to claim 1, wherein the identifying the second image and determining type information of the object to be classified comprises:
and inputting the second image into a pre-trained classification model to obtain the type information of the target to be classified.
3. The classification method according to claim 2, wherein the segmentation model and the classification model are each determined based on parameters in a visual geometry group network.
4. The classification method according to claim 1 or 2, wherein the type information of the object to be classified includes a type and a proportion.
5. A classification apparatus, comprising:
the system comprises an acquisition module, a classification module and a classification module, wherein the acquisition module is configured to acquire a first image, and the target image comprises a target to be classified and other objects;
the segmentation module is configured to segment the first image by using a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and the identification module is configured to identify the second image and determine the type information of the target to be classified.
6. The classification device according to claim 5, wherein the identification module is specifically configured to:
and inputting the second image into a pre-trained classification model to obtain the type information of the target to be classified.
7. The classification apparatus according to claim 6, wherein the segmentation model and the classification model are each determined based on parameters in a visual geometry group network.
8. The classification device according to claim 5 or 6, wherein the type information of the object to be classified includes a type and a proportion.
9. A storage medium, having a computer program stored thereon, the computer program when executed by a processor performing the steps of:
acquiring a first image, wherein the target image comprises a target to be classified and other objects;
segmenting the first image by utilizing a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and identifying the second image, and determining the type information of the target to be classified.
10. An electronic device, comprising: a processor and a memory, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over a bus when an electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of:
acquiring a first image, wherein the target image comprises a target to be classified and other objects;
segmenting the first image by utilizing a pre-trained segmentation model to obtain a second image, wherein the segmentation model comprises a full convolution neural network, and the second image comprises the target to be classified;
and identifying the second image, and determining the type information of the target to be classified.
CN202111106836.6A 2021-09-22 2021-09-22 Classification method and device, storage medium and electronic equipment Pending CN113837062A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111106836.6A CN113837062A (en) 2021-09-22 2021-09-22 Classification method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111106836.6A CN113837062A (en) 2021-09-22 2021-09-22 Classification method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN113837062A (en) 2021-12-24

Family

ID=78960383

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111106836.6A Pending CN113837062A (en) 2021-09-22 2021-09-22 Classification method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113837062A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546207A (en) * 2022-11-24 2022-12-30 杭州市富阳区科源食药质量安全工程研究院 Rapid detection method of impurities, computing equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108764306A (en) * 2018-05-15 2018-11-06 深圳大学 Image classification method, device, computer equipment and storage medium
US20190087648A1 (en) * 2017-09-21 2019-03-21 Baidu Online Network Technology (Beijing) Co., Ltd Method and apparatus for facial recognition
CN109685810A (en) * 2018-12-18 2019-04-26 清华大学 A kind of recognition methods of Bile fistula lesion and system based on deep learning
CN109785334A (en) * 2018-12-17 2019-05-21 深圳先进技术研究院 Cardiac magnetic resonance images dividing method, device, terminal device and storage medium
CN110111344A (en) * 2019-05-13 2019-08-09 广州锟元方青医疗科技有限公司 Pathological section image grading method, apparatus, computer equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190087648A1 (en) * 2017-09-21 2019-03-21 Baidu Online Network Technology (Beijing) Co., Ltd Method and apparatus for facial recognition
CN108764306A (en) * 2018-05-15 2018-11-06 深圳大学 Image classification method, device, computer equipment and storage medium
CN109785334A (en) * 2018-12-17 2019-05-21 深圳先进技术研究院 Cardiac magnetic resonance images dividing method, device, terminal device and storage medium
CN109685810A (en) * 2018-12-18 2019-04-26 清华大学 A kind of recognition methods of Bile fistula lesion and system based on deep learning
CN110111344A (en) * 2019-05-13 2019-08-09 广州锟元方青医疗科技有限公司 Pathological section image grading method, apparatus, computer equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HAZEM HIARY et al.: "Flower classification using deep convolutional neural networks", IET Computer Vision *
程俊华 et al.: "基于深度学习的复杂背景图像分类方法研究" [Research on classification methods for complex-background images based on deep learning], 《电子科技》 [Electronic Science and Technology] *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115546207A (en) * 2022-11-24 2022-12-30 杭州市富阳区科源食药质量安全工程研究院 Rapid detection method of impurities, computing equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110321910B (en) Point cloud-oriented feature extraction method, device and equipment
CN107506712B (en) Human behavior identification method based on 3D deep convolutional network
CN111079639B (en) Method, device, equipment and storage medium for constructing garbage image classification model
WO2022033150A1 (en) Image recognition method, apparatus, electronic device, and storage medium
US20190065817A1 (en) Method and system for detection and classification of cells using convolutional neural networks
CN111652317B (en) Super-parameter image segmentation method based on Bayes deep learning
US11443514B2 (en) Recognizing minutes-long activities in videos
CN105469376B (en) The method and apparatus for determining picture similarity
CN110826596A (en) Semantic segmentation method based on multi-scale deformable convolution
CN111046821B (en) Video behavior recognition method and system and electronic equipment
CN109002755B (en) Age estimation model construction method and estimation method based on face image
CN110222718B (en) Image processing method and device
CN109033978B (en) Error correction strategy-based CNN-SVM hybrid model gesture recognition method
CN112288011A (en) Image matching method based on self-attention deep neural network
CN110264407B (en) Image super-resolution model training and reconstruction method, device, equipment and storage medium
CN113269224A (en) Scene image classification method, system and storage medium
CN110705600A (en) Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium
CN111695640A (en) Foundation cloud picture recognition model training method and foundation cloud picture recognition method
CN116129291A (en) Unmanned aerial vehicle animal husbandry-oriented image target recognition method and device
CN113569895A (en) Image processing model training method, processing method, device, equipment and medium
CN113657414B (en) Object identification method
CN113837062A (en) Classification method and device, storage medium and electronic equipment
CN112132207A (en) Target detection neural network construction method based on multi-branch feature mapping
CN115424093A (en) Method and device for identifying cells in fundus image
CN114841887A (en) Image restoration quality evaluation method based on multi-level difference learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination