CN112016617A - Fine-grained classification method and device and computer-readable storage medium


Info

Publication number
CN112016617A
Authority
CN
China
Prior art keywords
original image
initial model
neural network
loss
acquiring
Prior art date
Legal status
Granted
Application number
CN202010880880.1A
Other languages
Chinese (zh)
Other versions
CN112016617B (en)
Inventor
杨若愚
Current Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202010880880.1A
Publication of CN112016617A
Application granted
Publication of CN112016617B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • G06N3/084: Backpropagation, e.g. using gradient descent


Abstract

The invention relates to artificial intelligence and discloses a fine-grained classification method comprising: constructing an initial model, acquiring an original image, and preprocessing the original image to form training data for training the initial model; acquiring loss data corresponding to the original image based on the initial model and the training data; performing backpropagation with stochastic gradient descent based on the loss data until the initial model converges within a preset range, thereby forming a classification model; and performing classification prediction on an image to be processed through the classification model. The invention also relates to blockchain technology, the original image being stored in a blockchain. The invention enables adaptive selection of local feature regions and improves the accuracy of classification prediction.

Description

Fine-grained classification method and device and computer-readable storage medium
Technical Field
The present invention relates to artificial intelligence, and in particular, to a fine-grained classification method, apparatus, electronic device, and computer-readable storage medium.
Background
Fine-grained classification refers to finer sub-class classification on top of basic class discrimination, such as distinguishing bird species or vehicle models. It currently has broad business demand and wide application scenarios in industry and everyday life.
At present, fine-grained classification methods fall mainly into two categories. The first locates local feature regions and extracts image features of these discriminative regions for classification. However, since no position annotation of the local key regions is available, most algorithms select local feature regions with windows of a pre-set size. Such a method cannot automatically adapt to the sizes of feature regions in different scenes and requires manually setting the aspect ratio and area of the sliding window, which weakens the adaptability of the fine-grained classification algorithm to data from different scenes. As a result, local feature regions cannot be accurately located, local features cannot be effectively extracted, and the algorithm generalizes poorly.
The second eliminates the influence of non-target regions on fine-grained classification through weakly supervised instance detection and segmentation. However, previous algorithms based on this idea cannot be trained end to end: the training process is repetitive and cannot be automated, requires continuous human intervention throughout, takes a long time, and does not guarantee the expected training result.
Disclosure of Invention
The invention provides a fine-grained classification method, a fine-grained classification apparatus, an electronic device, and a computer-readable storage medium, with the main aim of improving the accuracy and efficiency of fine-grained image classification.
In order to achieve the above object, the present invention provides a fine-grained classification method, including:
constructing an initial model, acquiring an original image, and preprocessing the original image to form training data for training the initial model;
acquiring loss data corresponding to the original image based on the initial model and the training data;
performing backpropagation with stochastic gradient descent based on the loss data until the initial model converges within a preset range to form a classification model;
and performing classification prediction on an image to be processed through the classification model.
Optionally, the training data is stored in a blockchain, and preprocessing the original image to form training data for training the initial model includes:
inputting the original image into the initial model to obtain saliency matrices corresponding to the original image;
determining, based on the saliency matrices, the information-strength ranking of the sum submatrices corresponding to the original image, and acquiring the image local feature regions of a preset number of top-ranked sum submatrices;
and determining a target image feature region as the training data according to the image local feature regions.
Optionally, the initial model includes a first neural network and a second neural network, and acquiring the loss data corresponding to the original image includes:
extracting and predicting features of the target image feature region through the first neural network, and acquiring a first prediction result and a first cross-entropy loss;
obtaining a pairwise ranking loss based on the information-strength ranking and the first prediction result; meanwhile, extracting a feature vector of the original image through the first neural network and concatenating it with the feature vector of the target image feature region;
and inputting the concatenated feature vectors into a fully connected layer of the first neural network for prediction, and acquiring a corresponding second prediction result and a second cross-entropy loss.
Optionally, the loss data is stored in a blockchain, wherein the loss data includes the first cross-entropy loss, the second cross-entropy loss, and the pairwise ranking loss.
Optionally, inputting the original image into the initial model and acquiring the saliency matrices corresponding to the original image includes:
inputting the original image into the first neural network, and acquiring the output results of the intermediate layers of the first neural network;
and passing each intermediate-layer output result through the second neural network to obtain the saliency matrix corresponding to that intermediate layer.
In order to solve the above problem, the present invention also provides a fine-grained classification apparatus, including:
a model construction and data acquisition module for constructing an initial model, acquiring an original image, and preprocessing the original image to form training data for training the initial model;
a loss data acquisition module for acquiring loss data corresponding to the original image based on the initial model and the training data;
a model training module for performing backpropagation with stochastic gradient descent based on the loss data until the initial model converges within a preset range to form a classification model;
and a classification prediction module for performing classification prediction on an image to be processed through the classification model.
Optionally, the training data is stored in a blockchain, and preprocessing the original image to form training data for training the initial model includes:
inputting the original image into the initial model to obtain saliency matrices corresponding to the original image;
determining, based on the saliency matrices, the information-strength ranking of the sum submatrices corresponding to the original image, and acquiring the image local feature regions of a preset number of top-ranked sum submatrices;
and determining a target image feature region as the training data according to the image local feature regions.
Optionally, the initial model includes a first neural network and a second neural network, and acquiring the loss data corresponding to the original image includes:
extracting and predicting features of the target image feature region through the first neural network, and acquiring a first prediction result and a first cross-entropy loss;
obtaining a pairwise ranking loss based on the information-strength ranking and the first prediction result; meanwhile, extracting a feature vector of the original image through the first neural network and concatenating it with the feature vector of the target image feature region;
and inputting the concatenated feature vectors into a fully connected layer of the first neural network for prediction, and acquiring a corresponding second prediction result and a second cross-entropy loss.
In order to solve the above problem, the present invention also provides an electronic device, including:
a memory storing at least one instruction; and
a processor that executes the instructions stored in the memory to implement the fine-grained classification method.
In order to solve the above problem, the present invention further provides a computer-readable storage medium storing at least one instruction, the at least one instruction being executed by a processor in an electronic device to implement the fine-grained classification method.
In the embodiments of the invention, an initial model is constructed, an original image is acquired and preprocessed to form training data; loss data corresponding to the original image are acquired according to the initial model and the training data; backpropagation with stochastic gradient descent is performed based on the loss data until training of the initial model is completed and a classification model is formed; finally, classification prediction is performed on the image to be processed through the classification model. In this way, key feature regions can be selected flexibly and local key feature regions can be selected adaptively, enabling effective extraction of fine-grained image features, so that fine-grained image features are classified and located and the features of local feature regions are highlighted.
Drawings
Fig. 1 is a schematic flowchart of a fine-grained classification method according to an embodiment of the present invention;
fig. 2 is a schematic block diagram of a fine-grained classification apparatus according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an internal structure of an electronic device implementing a fine-grained classification method according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a fine-grained classification method. Fig. 1 is a schematic flow chart of a fine-grained classification method according to an embodiment of the present invention. The method may be performed by an apparatus, which may be implemented by software and/or hardware.
In this embodiment, the fine-grained classification method includes:
s110: an initial model is built, an original image is obtained, and the original image is preprocessed to form training data for training the initial model.
Specifically, preprocessing the original image to form training data for training the initial model includes:
first, inputting the original image into the initial model to obtain saliency matrices corresponding to the original image;
then, determining the information-strength ranking of the corresponding sum submatrices based on the saliency matrices, and acquiring the image local feature regions of a preset number of top-ranked sum submatrices;
and finally, determining a target image feature region as the training data according to the acquired image local feature regions.
In another aspect, the initial model includes a first neural network and a second neural network, and inputting the original image into the initial model to obtain the saliency matrices corresponding to the original image further includes:
1. inputting the original image into the first neural network of the initial model and acquiring the output results of its intermediate layers; when there are multiple intermediate layers, there are multiple intermediate output results, and the positions of local feature regions of the original image can be preliminarily determined through these outputs, i.e., finding local features through the first neural network prepares for obtaining the loss function in the next step;
2. passing each intermediate-layer output result through the second neural network to obtain the saliency matrix corresponding to that intermediate layer.
The second neural network may adopt a feature pyramid network (FPN): after the intermediate-layer outputs pass through the convolution layers of the feature pyramid network, the saliency matrix corresponding to each intermediate layer is obtained. The first and second neural networks may also adopt other network structures or types, as long as local feature search and intermediate output acquisition can be realized.
The intermediate-layer outputs at different scales are passed through the 1x1 convolution layers in the FPN, and the saliency matrices of the original image at different scales are computed.
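As an illustrative aid, the following is a minimal sketch, assuming a PyTorch implementation, of taking intermediate-layer outputs from a ResNet50 backbone and passing each through a 1x1 convolution to obtain a per-scale saliency matrix. The class and function names, the choice of stages, and the sigmoid squashing are assumptions, not the patented implementation.

```python
# Minimal sketch (PyTorch assumed): per-scale 1x1-conv saliency heads on
# ResNet50 intermediate outputs. Names such as SaliencyHeads are illustrative.
import torch
import torch.nn as nn
from torchvision.models import resnet50


class SaliencyHeads(nn.Module):
    """Turn selected backbone stage outputs into single-channel saliency matrices."""

    def __init__(self, stage_channels=(512, 1024, 2048)):
        super().__init__()
        # One 1x1 convolution per intermediate scale, each producing a 1-channel map.
        self.heads = nn.ModuleList(nn.Conv2d(c, 1, kernel_size=1) for c in stage_channels)

    def forward(self, features):
        # features: list of intermediate feature maps, shallowest to deepest.
        return [torch.sigmoid(head(f)).squeeze(1) for head, f in zip(self.heads, features)]


backbone = resnet50()  # first neural network (backbone), randomly initialised here


def intermediate_outputs(x):
    """Run the ResNet50 stem and stages, keeping the last three stage outputs."""
    x = backbone.maxpool(backbone.relu(backbone.bn1(backbone.conv1(x))))
    c2 = backbone.layer1(x)
    c3 = backbone.layer2(c2)
    c4 = backbone.layer3(c3)
    c5 = backbone.layer4(c4)
    return [c3, c4, c5]


heads = SaliencyHeads()
image = torch.randn(1, 3, 448, 448)                  # dummy original image
saliency_maps = heads(intermediate_outputs(image))   # one saliency matrix per scale
```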
It is emphasized that, in order to further ensure the privacy and security of the original image or training data, the original image or training data may also be stored in a node of a blockchain.
S120: and acquiring loss data corresponding to the original image based on the initial model and the training data.
The original image is the image to be classified and predicted. It is input into the first neural network of the initial model, and the output of the first neural network is input into the second neural network. Both networks may be deep convolutional neural networks that extract image features; many different convolutional network structures may be chosen in actual training or application, and a practitioner may select a suitable structure according to the characteristics of the data set, so the specific network structure is not limited here.
As a specific example, in the present invention the first neural network adopts the ResNet50 structure: a 7x7 convolutional layer with 64 output channels and stride 2, a 3x3 max-pooling layer with stride 2, followed by four residual stages with stride 2 between consecutive stages, each stage containing a number of convolutional blocks. Each block comprises a 1x1 convolutional layer, a 3x3 convolutional layer, and a 1x1 convolutional layer. After the four residual stages, an average pooling layer, a 1000-dimensional fully connected layer, and a softmax activation function are attached.
In addition, the second neural network may adopt a feature pyramid network: the outputs of the four residual stages of the ResNet50 network are upsampled with nearest-neighbour interpolation from the highest stage to the lowest (i.e., 4, 3, 2, 1), and each upsampled result is concatenated with the 1x1 convolution output of the lower stage's intermediate result (corresponding to the 1x1 convolutions in the feature pyramid structure). Finally, a 1x1 convolution is applied to the result of the above operation to obtain the saliency matrix needed to locate local feature regions.
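The following is a hedged sketch, again assuming PyTorch, of the FPN-style second network just described: deeper stage outputs are nearest-neighbour upsampled, concatenated with the 1x1-convolved output of the stage below, and a final 1x1 convolution yields a saliency matrix per scale. The channel widths, the sigmoid, and the exact wiring are assumptions.

```python
# Sketch of the FPN-style second network (PyTorch assumed); not the patented code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SaliencyFPN(nn.Module):
    def __init__(self, channels=(256, 512, 1024, 2048), mid=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, mid, 1) for c in channels)
        self.out_conv = nn.ModuleList(nn.Conv2d(2 * mid, 1, 1) for _ in channels[:-1])

    def forward(self, c2, c3, c4, c5):
        laterals = [l(c) for l, c in zip(self.lateral, (c2, c3, c4, c5))]
        saliency = []
        top = laterals[-1]
        # Walk the stages from high (stage 4) down to low (stage 1).
        for i in range(len(laterals) - 2, -1, -1):
            up = F.interpolate(top, size=laterals[i].shape[-2:], mode="nearest")
            merged = torch.cat([laterals[i], up], dim=1)   # concatenate with lower stage
            saliency.append(torch.sigmoid(self.out_conv[i](merged)).squeeze(1))
            top = laterals[i]
        return saliency  # saliency matrices at different scales


fpn = SaliencyFPN()
c2, c3, c4, c5 = (torch.randn(1, c, s, s) for c, s in
                  [(256, 112), (512, 56), (1024, 28), (2048, 14)])
maps = fpn(c2, c3, c4, c5)
```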
Specifically, acquiring the loss data corresponding to the original image includes:
1. performing feature extraction and prediction on the target image feature region through the first neural network, and acquiring a first prediction result and a first cross-entropy loss;
2. obtaining a pairwise ranking loss based on the information-strength ranking and the first prediction result; meanwhile, extracting a feature vector of the original image through the first neural network and concatenating it with the feature vector of the target image feature region;
3. inputting the concatenated feature vectors into a fully connected layer of the neural network for prediction, and obtaining a corresponding second prediction result and a second cross-entropy loss.
As an example, dynamic programming can be used to solve for the information-strength ranking of the sum submatrices on the three saliency matrices and to output the top three sum submatrices by information strength from each, giving nine sum submatrices in total. The image local feature regions corresponding to these nine sum submatrices are then acquired, and three of them are selected by a non-maximum suppression algorithm as the local feature regions proposed by the neural network (i.e., the target image feature regions).
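As a sketch of the dynamic-programming step (not the patented code), the maximum-sum submatrix search below runs a 1-D Kadane pass over column sums for every pair of row bounds and keeps the top-scoring rectangles; the saliency matrix is assumed to be centred so that uninformative regions contribute negative values.

```python
# Illustrative maximum-sum submatrix search ("information strength" ranking).
import numpy as np


def top_sum_submatrices(sal, k=3):
    """Return the k rectangular submatrices of `sal` with the largest element sums.

    sal: 2-D NumPy array of saliency values, assumed centred (e.g. sal - sal.mean())
         so that low-information regions contribute negatively.
    Returns a list of (score, (top, left, bottom, right)) tuples.
    """
    h, w = sal.shape
    candidates = []
    for top in range(h):
        col_sums = np.zeros(w)
        for bottom in range(top, h):
            col_sums += sal[bottom]          # running column sums for rows top..bottom
            best, cur, left = -np.inf, 0.0, 0
            for right in range(w):           # 1-D Kadane pass over the column sums
                if cur <= 0:
                    cur, left = 0.0, right
                cur += col_sums[right]
                if cur > best:
                    best = cur
                    box = (top, left, bottom, right)
            candidates.append((best, box))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return candidates[:k]


sal = np.random.randn(14, 14)                # one (centred) saliency matrix
regions = top_sum_submatrices(sal, k=3)      # top-3 boxes on this scale
# Boxes gathered from the three scales can then be filtered with non-maximum
# suppression (e.g. torchvision.ops.nms) to keep three target feature regions.
```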
In addition, the first neural network (the backbone network) extracts features from the target image feature regions and predicts with an independent "first fully connected layer" to obtain the first prediction result (i.e., the predicted probability of the class to which the original image belongs); the first cross-entropy loss is then obtained from the first prediction result. Furthermore, the feature vectors extracted by the backbone network from the original image and the three image local feature regions are concatenated and fed into an independent "second fully connected layer" for prediction, yielding the second prediction result and the second cross-entropy loss.
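A minimal sketch of the two prediction heads, assuming PyTorch; the feature dimension, class count, and the way per-region confidences are read off for the ranking loss are illustrative assumptions.

```python
# Sketch of the two classification heads and cross-entropy losses (PyTorch assumed).
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, feat_dim = 200, 1024                       # dummy values

first_fc = nn.Linear(feat_dim, num_classes)             # "first fully connected layer"
second_fc = nn.Linear(feat_dim * 4, num_classes)        # "second fully connected layer"


def classification_losses(backbone_feats, label):
    """backbone_feats: list of 4 feature vectors (original image + 3 local regions),
    each of shape (batch, feat_dim); label: (batch,) ground-truth class indices."""
    # First head: predict the class from each local-region feature independently.
    region_logits = [first_fc(f) for f in backbone_feats[1:]]
    first_ce = sum(F.cross_entropy(lg, label) for lg in region_logits) / len(region_logits)

    # Second head: concatenate all four feature vectors and predict once.
    fused = torch.cat(backbone_feats, dim=1)
    second_ce = F.cross_entropy(second_fc(fused), label)

    # Per-region confidences of the ground-truth class feed the pairwise ranking loss.
    confidences = [lg.softmax(dim=1).gather(1, label[:, None]).squeeze(1)
                   for lg in region_logits]
    return first_ce, second_ce, confidences


feats = [torch.randn(2, feat_dim) for _ in range(4)]    # dummy features
labels = torch.randint(0, num_classes, (2,))
first_ce, second_ce, confs = classification_losses(feats, labels)
```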
In addition, the loss data may also be stored in the blockchain, wherein the loss data comprise the first cross-entropy loss, the second cross-entropy loss, and the pairwise ranking loss.
S130: and carrying out backward propagation of random gradient descent based on the loss data until the initial model converges in a preset range to form a classification model.
S140: and carrying out classification prediction on the image to be processed through the classification model.
Specifically, according to the total loss (the sum of the first cross entropy loss, the second cross entropy loss and the pairing sorting loss) obtained in the steps, the gradient descent is performed to the neural network for back propagation, the steps are repeated until the neural network converges in a preset range, namely, the training of the classification model is completed, and then the classification model is obtained according to the training to perform classification prediction on the correlation of the graph to be processed. Therefore, the invention relates to artificial intelligence, and deep learning is carried out based on a neural network to obtain a final classification model.
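A minimal training-loop sketch, assuming PyTorch; `compute_losses`, the learning rate, and the convergence tolerance are placeholders for whatever a concrete implementation provides.

```python
# Sketch of training until the total loss converges within a preset range.
import torch


def train_until_convergence(model, train_loader, compute_losses, lr=1e-3, tolerance=1e-3):
    """compute_losses(model, images, labels) -> (first_ce, second_ce, rank_loss)."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    prev_epoch_loss = float("inf")
    while True:
        epoch_loss = 0.0
        for images, labels in train_loader:
            first_ce, second_ce, rank_loss = compute_losses(model, images, labels)
            total_loss = first_ce + second_ce + rank_loss   # sum of the three losses
            optimizer.zero_grad()
            total_loss.backward()                           # backpropagation
            optimizer.step()                                # stochastic gradient descent step
            epoch_loss += total_loss.item()
        # "Converges within a preset range": stop once the epoch loss stops moving.
        if abs(prev_epoch_loss - epoch_loss) < tolerance:
            return model
        prev_epoch_loss = epoch_loss
```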
As a specific example, classifying and predicting an image with the fine-grained classification method provided by the present invention includes the following steps:
1. Input the original image into the backbone network and obtain the required intermediate-layer results.
2. Pass the intermediate-layer outputs at different scales through the 1x1 convolution layers in the FPN (feature pyramid network) and compute the saliency matrices of the original image at different scales.
3. Superimpose the saliency matrices at the different scales and merge them into a single saliency matrix.
4. According to the saliency matrix, crop three regions from the original image: the minimum bounding rectangle of the region with positive saliency values, the minimum bounding rectangle of the region with saliency values greater than 0.5, and the minimum bounding rectangle of the region with saliency values greater than 0.9. Input these crops together with the original image into the backbone network and extract four 1024-dimensional feature vectors. Here, the 0.5 region or the 0.9 region refers to the minimum bounding rectangle enclosing the matrix elements of the saliency matrix whose saliency values are greater than 0.5 or 0.9, respectively.
5. Concatenate the four extracted feature vectors and input them into a fully connected layer to obtain the image classification result.
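A hedged sketch of step 4, assuming PyTorch and NumPy: the saliency matrix (assumed to be upsampled to the image resolution and scaled to [0, 1]) is thresholded, the minimum bounding rectangle of each thresholded region is cropped, and the backbone (assumed here to return one 1024-dimensional vector per image) extracts a feature vector per crop.

```python
# Sketch of cropping minimum bounding rectangles from a thresholded saliency matrix.
import numpy as np
import torch
import torch.nn.functional as F


def min_bounding_rect(sal, threshold):
    """Smallest (top, left, bottom, right) rectangle covering sal > threshold."""
    ys, xs = np.nonzero(sal > threshold)
    if len(ys) == 0:
        return None
    return int(ys.min()), int(xs.min()), int(ys.max()) + 1, int(xs.max()) + 1


def extract_region_features(backbone, image, sal, thresholds=(0.0, 0.5, 0.9)):
    """image: (1, 3, H, W) tensor; sal: (H, W) NumPy saliency matrix in [0, 1]."""
    crops = [image]                                   # the original image itself
    for t in thresholds:
        box = min_bounding_rect(sal, t)
        if box is None:
            continue
        top, left, bottom, right = box
        crop = image[:, :, top:bottom, left:right]
        crops.append(F.interpolate(crop, size=image.shape[-2:], mode="bilinear",
                                   align_corners=False))
    # One 1024-dimensional feature vector per crop (backbone assumed to return that).
    feats = [backbone(c) for c in crops]
    return torch.cat(feats, dim=1)                    # concatenated for the final FC layer
```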
As a specific application scenario, the fine-grained classification method based on deformable key feature region extraction can be applied to tasks such as crop pest and disease identification: it can automatically locate lesion feature regions without any pest or disease lesion annotations and extract effective image classification features for the lesion regions.
Specifically, crop image information is input into the backbone network (the first neural network) of the initial model, the intermediate-layer outputs of the last three layers of the backbone network are connected to two independent layers, convolutional layer 1 and fully connected layer 1, and the crop category to which the input image belongs is output.
Then, after the crop-category prediction is obtained, the intermediate output of the shallow network is input into the deep sub-network dedicated to that crop category within the backbone network; the intermediate result of the deep sub-network before max pooling is input into the proposal net, which outputs a two-dimensional saliency map, and a number of local key regions are output by solving the ranked sequence of sum submatrices.
Then, the selected local key regions are input into the backbone network, formed by the shallow network shared across crop types and the dedicated deep sub-network of the determined crop category, and feature vectors are extracted. The extracted feature vectors are concatenated and input into fully connected layer 2, which outputs the prediction of the pest or disease category affecting the crop.
A tanh activation function is attached to the output of the proposal net to generate a saliency map with values in (-1, 1). The ranked sequence of sum submatrices is computed on this saliency map, the image regions corresponding to the submatrices with the highest saliency are input into a teacher network model, and the confidence of the annotated label is obtained. The pairwise ranking loss is computed from the confidence ranking of the extracted local feature regions and the saliency ranking. By backpropagating the pairwise ranking loss to update the parameters of the proposal net and the backbone network, the saliency values of non-target regions on the saliency map output by the model approach -1, while those of target regions approach 1.
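An illustrative sketch of the pairwise ranking loss described above, assuming PyTorch and a hinge (margin) formulation: for every pair of proposed regions, if one ranks higher by saliency, its teacher-network confidence is pushed to be higher as well. The margin value and the dummy scores are assumptions.

```python
# Sketch of a margin-based pairwise ranking loss between saliency order and confidence.
import torch
import torch.nn.functional as F


def pairwise_ranking_loss(saliency_scores, confidences, margin=0.05):
    """saliency_scores, confidences: 1-D tensors, one entry per proposed region.

    For every pair (i, j) with saliency_scores[i] > saliency_scores[j], penalise the
    case where confidences[i] does not exceed confidences[j] by at least `margin`.
    """
    loss, pairs = confidences.new_zeros(()), 0
    n = saliency_scores.numel()
    for i in range(n):
        for j in range(n):
            if saliency_scores[i] > saliency_scores[j]:
                loss = loss + F.relu(margin - (confidences[i] - confidences[j]))
                pairs += 1
    return loss / max(pairs, 1)


# Saliency map in (-1, 1) via tanh, as in the crop application above.
saliency_map = torch.tanh(torch.randn(1, 14, 14))
region_scores = torch.tensor([0.9, 0.4, 0.1])         # sum-submatrix scores (dummy)
teacher_conf = torch.tensor([0.7, 0.8, 0.2])          # teacher-network label confidences
loss = pairwise_ranking_loss(region_scores, teacher_conf)
```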
Based on the generated saliency-map matrix, the image regions of the minimum bounding rectangles of the regions whose saliency values fall within different intervals are selected. In forward propagation, these image regions are input into the backbone network and feature vectors are extracted from them, thereby suppressing non-target regions and reducing background interference.
Therefore, to achieve feasible local feature region localization without position supervision of the local key feature regions, the fine-grained classification method based on deformable key feature region extraction provided by the invention extracts saliency-map matrices from the intermediate layers of the backbone network through an FPN (feature pyramid network) and applies a maximum-sum submatrix method, enabling local key region selection without increasing the computational cost. The top-ranked submatrices by information strength are selected as the local key regions chosen by the model and are input into the backbone network to extract feature vectors, so that the network can effectively extract fine-grained image features, achieving the goals of classifying and locating fine-grained image features and highlighting the features of local feature regions.
Fig. 2 is a functional block diagram of the fine-grained classification apparatus according to the present invention.
The fine-grained classification apparatus 100 of the present invention may be installed in an electronic device. According to the functions realized, the fine-grained classification apparatus may include a model construction and data acquisition module 101, a loss data acquisition module 102, a model training module 103, and a classification prediction module 104. A module according to the present invention, which may also be referred to as a unit, is a series of computer program segments that are stored in a memory of the electronic device, can be executed by a processor of the electronic device, and perform a fixed function.
In the present embodiment, the functions regarding the respective modules/units are as follows:
the model construction and data acquisition module 101 is configured to construct an initial model, acquire an original image, and preprocess the original image to form training data for training the initial model;
the loss data acquisition module 102 is configured to acquire loss data corresponding to the original image based on the initial model and the training data;
the model training module 103 is configured to perform backpropagation with stochastic gradient descent based on the loss data until the initial model converges within a preset range to form a classification model;
and the classification prediction module 104 is configured to perform classification prediction on an image to be processed through the classification model.
Optionally, the training data is stored in a blockchain, and preprocessing the original image to form training data for training the initial model includes:
inputting the original image into the initial model to obtain saliency matrices corresponding to the original image;
determining, based on the saliency matrices, the information-strength ranking of the sum submatrices corresponding to the original image, and acquiring the image local feature regions of a preset number of top-ranked sum submatrices;
and determining a target image feature region as the training data according to the image local feature regions.
Optionally, the initial model includes a first neural network and a second neural network, and acquiring the loss data corresponding to the original image includes:
extracting and predicting features of the target image feature region through the backbone network of the first neural network, and acquiring a first prediction result and a first cross-entropy loss;
obtaining a pairwise ranking loss based on the information-strength ranking and the first prediction result; meanwhile, extracting a feature vector of the original image through the backbone network and concatenating it with the feature vector of the target image feature region;
and inputting the concatenated feature vectors into a fully connected layer of the first neural network for prediction, and acquiring a corresponding second prediction result and a second cross-entropy loss.
Fig. 3 is a schematic structural diagram of an electronic device implementing the fine-grained classification method according to the present invention.
The electronic device 1 may comprise a processor 10, a memory 11 and a bus, and may further comprise a computer program, such as a fine-grained classification program 12, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic apparatus 1 and various types of data, such as codes of fine-grained classification programs, etc., but also to temporarily store data that has been output or is to be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The processor 10 is a Control Unit (Control Unit) of the electronic device, connects various components of the whole electronic device by using various interfaces and lines, and executes various functions and processes data of the electronic device 1 by running or executing programs or modules (e.g., fine-grained classification programs, etc.) stored in the memory 11 and calling data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.
Fig. 3 shows only an electronic device with components, and it will be understood by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than those shown, or some components may be combined, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The fine-grained classification program 12 stored by the memory 11 in the electronic device 1 is a combination of instructions that, when executed in the processor 10, may implement:
constructing an initial model, acquiring an original image, and preprocessing the original image to form training data for training the initial model;
acquiring loss data corresponding to the original image based on the initial model and the training data;
performing backpropagation with stochastic gradient descent based on the loss data until the initial model converges within a preset range to form a classification model;
and performing classification prediction on an image to be processed through the classification model.
Optionally, the training data is stored in a blockchain, and preprocessing the original image to form training data for training the initial model includes:
inputting the original image into the initial model, and acquiring saliency matrices corresponding to the original image;
determining, based on the saliency matrices, the information-strength ranking of the sum submatrices corresponding to the original image, and acquiring the image local feature regions of a preset number of top-ranked sum submatrices;
and determining a target image feature region as the training data according to the image local feature regions.
Optionally, the initial model includes a first neural network and a second neural network, and acquiring the loss data corresponding to the original image includes:
extracting and predicting features of the target image feature region through the backbone network of the first neural network, and acquiring a first prediction result and a first cross-entropy loss;
obtaining a pairwise ranking loss based on the information-strength ranking and the first prediction result; meanwhile, extracting a feature vector of the original image through the backbone network and concatenating it with the feature vector of the target image feature region;
and inputting the concatenated feature vectors into a fully connected layer of the first neural network for prediction, and acquiring a corresponding second prediction result and a second cross-entropy loss.
Optionally, the loss data is stored in a blockchain, wherein the loss data include the first cross-entropy loss, the second cross-entropy loss, and the pairwise ranking loss.
Optionally, inputting the original image into the initial model and acquiring the saliency matrices corresponding to the original image includes:
inputting the original image into the first neural network, and acquiring the output results of the intermediate layers of the first neural network;
and passing each intermediate-layer output result through the second neural network to obtain the saliency matrix corresponding to that intermediate layer.
Specifically, for the way the processor 10 implements these instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, which is not repeated here. It is emphasized that, to further ensure the privacy and security of the original image or training data, the original image or training data may also be stored in a node of a blockchain.
Further, if the integrated modules/units of the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
A blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods, each data block containing information about a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. Terms such as first and second are used to denote names and do not indicate any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A fine-grained classification method, characterized in that the method comprises:
constructing an initial model, acquiring an original image, and preprocessing the original image to form training data for training the initial model;
acquiring loss data corresponding to the original image based on the initial model and the training data;
performing backpropagation with stochastic gradient descent based on the loss data until the initial model converges within a preset range to form a classification model;
and performing classification prediction on an image to be processed through the classification model.
2. A fine-grained classification method according to claim 1, wherein the training data is stored in a blockchain, and preprocessing the original image to form training data for training the initial model comprises:
inputting the original image into the initial model, and acquiring saliency matrices corresponding to the original image;
determining, based on the saliency matrices, the information-strength ranking of the sum submatrices corresponding to the original image, and acquiring the image local feature regions of a preset number of top-ranked sum submatrices;
and determining a target image feature region as the training data according to the image local feature regions.
3. A fine-grained classification method according to claim 2, wherein the initial model comprises a first neural network and a second neural network, and acquiring the loss data corresponding to the original image comprises:
extracting and predicting features of the target image feature region through the first neural network, and acquiring a first prediction result and a first cross-entropy loss;
obtaining a pairwise ranking loss based on the information-strength ranking and the first prediction result; meanwhile, extracting a feature vector of the original image through the first neural network and concatenating it with the feature vector of the target image feature region;
and inputting the concatenated feature vectors into a fully connected layer of the first neural network for prediction, and acquiring a corresponding second prediction result and a second cross-entropy loss.
4. A fine-grained classification method according to claim 3, wherein the loss data is stored in a blockchain, and the loss data comprises the first cross-entropy loss, the second cross-entropy loss, and the pairwise ranking loss.
5. A fine-grained classification method according to claim 3, wherein inputting the original image into the initial model and acquiring the saliency matrices corresponding to the original image comprises:
inputting the original image into the first neural network, and acquiring the output results of the intermediate layers of the first neural network;
and passing each intermediate-layer output result through the second neural network to obtain the saliency matrix corresponding to that intermediate layer.
6. A fine-grained classification apparatus, characterized in that the apparatus comprises:
a model construction and data acquisition module for constructing an initial model, acquiring an original image, and preprocessing the original image to form training data for training the initial model;
a loss data acquisition module for acquiring loss data corresponding to the original image based on the initial model and the training data;
a model training module for performing backpropagation with stochastic gradient descent based on the loss data until the initial model converges within a preset range to form a classification model;
and a classification prediction module for performing classification prediction on an image to be processed through the classification model.
7. A fine-grained classification apparatus according to claim 6, wherein the training data is stored in a blockchain, and preprocessing the original image to form training data for training the initial model comprises:
inputting the original image into the initial model to obtain saliency matrices corresponding to the original image;
determining, based on the saliency matrices, the information-strength ranking of the sum submatrices corresponding to the original image, and acquiring the image local feature regions of a preset number of top-ranked sum submatrices;
and determining a target image feature region as the training data according to the image local feature regions.
8. A fine-grained classification apparatus according to claim 7, wherein the initial model comprises a first neural network and a second neural network, and acquiring the loss data corresponding to the original image comprises:
extracting and predicting features of the target image feature region through the first neural network, and acquiring a first prediction result and a first cross-entropy loss;
obtaining a pairwise ranking loss based on the information-strength ranking and the first prediction result; meanwhile, extracting a feature vector of the original image through the first neural network and concatenating it with the feature vector of the target image feature region;
and inputting the concatenated feature vectors into a fully connected layer of the first neural network for prediction, and acquiring a corresponding second prediction result and a second cross-entropy loss.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively connected to the processor; wherein
the memory stores instructions executable by the processor, and the instructions are executed by the processor to implement the fine-grained classification method according to any one of claims 1 to 6.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the fine-grained classification method according to any one of claims 1 to 6.
CN202010880880.1A 2020-08-27 2020-08-27 Fine granularity classification method, apparatus and computer readable storage medium Active CN112016617B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010880880.1A CN112016617B (en) 2020-08-27 2020-08-27 Fine granularity classification method, apparatus and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010880880.1A CN112016617B (en) 2020-08-27 2020-08-27 Fine granularity classification method, apparatus and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN112016617A true CN112016617A (en) 2020-12-01
CN112016617B CN112016617B (en) 2023-12-01

Family

ID=73503617

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010880880.1A Active CN112016617B (en) 2020-08-27 2020-08-27 Fine granularity classification method, apparatus and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112016617B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102833085A (en) * 2011-06-16 2012-12-19 北京亿赞普网络技术有限公司 System and method for classifying communication network messages based on mass user behavior data
WO2019154262A1 (en) * 2018-02-07 2019-08-15 腾讯科技(深圳)有限公司 Image classification method, server, user terminal, and storage medium
CN108549926A (en) * 2018-03-09 2018-09-18 中山大学 A kind of deep neural network and training method for refining identification vehicle attribute
US20190318405A1 (en) * 2018-04-16 2019-10-17 Microsoft Technology Licensing , LLC Product identification in image with multiple products
CN110084285A (en) * 2019-04-08 2019-08-02 安徽艾睿思智能科技有限公司 Fish fine grit classification method based on deep learning
CN110619369A (en) * 2019-09-23 2019-12-27 常熟理工学院 Fine-grained image classification method based on feature pyramid and global average pooling

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507934A (en) * 2020-12-16 2021-03-16 平安银行股份有限公司 Living body detection method, living body detection device, electronic apparatus, and storage medium
CN112801391A (en) * 2021-02-04 2021-05-14 科大智能物联技术有限公司 Artificial intelligent scrap steel impurity deduction rating method and system
CN112801391B (en) * 2021-02-04 2021-11-19 科大智能物联技术股份有限公司 Artificial intelligent scrap steel impurity deduction rating method and system
CN115222955A (en) * 2022-06-13 2022-10-21 北京医准智能科技有限公司 Training method and device of image matching model, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112016617B (en) 2023-12-01

Similar Documents

Publication Publication Date Title
CN109978893B (en) Training method, device, equipment and storage medium of image semantic segmentation network
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
US20180336683A1 (en) Multi-Label Semantic Boundary Detection System
CN112465071A (en) Image multi-label classification method and device, electronic equipment and medium
CN112016617A (en) Fine-grained classification method and device and computer-readable storage medium
CN113283446B (en) Method and device for identifying object in image, electronic equipment and storage medium
CN110222718B (en) Image processing method and device
CN112396005A (en) Biological characteristic image recognition method and device, electronic equipment and readable storage medium
CN108596240B (en) Image semantic segmentation method based on discriminant feature network
CN112487207A (en) Image multi-label classification method and device, computer equipment and storage medium
CN112137591B (en) Target object position detection method, device, equipment and medium based on video stream
CN112132216B (en) Vehicle type recognition method and device, electronic equipment and storage medium
CN111695609A (en) Target damage degree determination method, target damage degree determination device, electronic device, and storage medium
CN111860496A (en) License plate recognition method, device, equipment and computer readable storage medium
CN111414916A (en) Method and device for extracting and generating text content in image and readable storage medium
CN114049568A (en) Object shape change detection method, device, equipment and medium based on image comparison
CN113487621A (en) Medical image grading method and device, electronic equipment and readable storage medium
CN112329666A (en) Face recognition method and device, electronic equipment and storage medium
CN112183303A (en) Transformer equipment image classification method and device, computer equipment and medium
CN114463685A (en) Behavior recognition method and device, electronic equipment and storage medium
CN113705686B (en) Image classification method, device, electronic equipment and readable storage medium
CN114187476A (en) Vehicle insurance information checking method, device, equipment and medium based on image analysis
CN112561893A (en) Picture matching method and device, electronic equipment and storage medium
CN113343882A (en) Crowd counting method and device, electronic equipment and storage medium
CN111915615A (en) Image segmentation method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant