CN116758355A - Image classification method and device, electronic equipment and storage medium - Google Patents
Image classification method and device, electronic equipment and storage medium
- Publication number
- CN116758355A (application number CN202310834475.XA)
- Authority
- CN
- China
- Prior art keywords
- image
- semantic
- target
- instance
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Abstract
The embodiments of the present application provide an image classification method and apparatus, an electronic device, and a storage medium, belonging to the technical fields of artificial intelligence and digital healthcare. The method comprises the following steps: acquiring training image features and inputting them into an original image classification model comprising an original encoder and an original decoder; encoding the training image features with the original encoder to obtain training image vectors, the training image vectors comprising image domain vectors, image semantic vectors, and image instance vectors; decoding the training image vectors with the original decoder to obtain an image prediction category; inputting the training image vectors into a decoupling model for decoupling processing to obtain image decoupling information; adjusting the parameters of the original image classification model according to a preset image reference category, the image prediction category, and the image decoupling information to obtain a target image classification model; and inputting target image features into the target image classification model for image classification to obtain a target image category. The embodiments of the present application can improve the accuracy of image classification.
Description
Technical Field
The present application relates to the field of artificial intelligence and digital medical technology, and in particular, to an image classification method and apparatus, an electronic device, and a storage medium.
Background
With the development of intelligent technology, it has been applied in a wide range of fields. For example, intelligent image classification has been used to assist medical personnel in medical research. Medical images of different internal tissues are complex, and at the same time medical image classification demands high accuracy.
In the related art, to improve the accuracy of medical image classification, the label information of a large amount of labeled image data is migrated to unlabeled image data in order to determine the image category of the unlabeled image data. Specifically, the image vector distance between the unlabeled image data and the labeled image data is calculated, and medical images are classified according to that distance, thereby assisting medical personnel in medical research. However, classifying unlabeled image data by image vector distance alone achieves only low accuracy. How to improve the accuracy of image data classification is therefore a technical problem to be solved.
Disclosure of Invention
The main objective of the embodiments of the present application is to provide an image classification method and apparatus, an electronic device, and a storage medium, aiming to improve the accuracy of medical image classification.
To achieve the above object, a first aspect of an embodiment of the present application provides an image classification method, including:
acquiring training image features; wherein the training image features include: image domain features, image semantic features, and image instance features;
inputting the training image features into a preset original image classification model; wherein the original image classification model comprises: domain encoder, semantic encoder, instance encoder, and original decoder;
the image domain features are encoded through the domain encoder to obtain image domain vectors, the image semantic features are encoded through the semantic encoder to obtain image semantic vectors, and the image instance features are encoded through the instance encoder to obtain image instance vectors;
decoding the image domain vector, the image semantic vector and the image instance vector through the original decoder to obtain an image prediction category;
inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information;
performing loss calculation according to a preset image reference category, the image prediction category and the image decoupling information to obtain target loss data;
performing parameter adjustment on the original image classification model according to the target loss data to obtain a target image classification model;
and inputting the acquired target image characteristics into the target image classification model to perform image classification processing to obtain target image categories.
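The training step enumerated above can be illustrated end to end with a toy sketch. The patent does not disclose concrete encoder or decoder architectures, so the three encoders below are simple scaling functions and the decoder a softmax over the summed vectors; every name and number is an assumption made only to show the data flow (features → vectors → prediction → loss).

```python
import math

def encode(features, weight):
    # Toy stand-in for a learned encoder: maps a feature list to a vector.
    return [weight * f for f in features]

def decode(domain_vec, semantic_vec, instance_vec):
    # The original decoder fuses the three vectors into class scores
    # and normalises them with a softmax.
    fused = [d + s + i for d, s, i in zip(domain_vec, semantic_vec, instance_vec)]
    exps = [math.exp(x) for x in fused]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(pred, reference_index):
    # Classification loss against the preset image reference category.
    return -math.log(pred[reference_index])

# One training example over 3 candidate classes (all values illustrative).
domain_feat, semantic_feat, instance_feat = [0.2, 0.1, 0.0], [1.0, 0.3, 0.1], [0.5, 0.2, 0.0]

d_vec = encode(domain_feat, 0.8)    # domain encoder
s_vec = encode(semantic_feat, 1.2)  # semantic encoder
i_vec = encode(instance_feat, 0.9)  # instance encoder

pred = decode(d_vec, s_vec, i_vec)  # image prediction category (distribution)
loss = cross_entropy(pred, 0)       # preset reference category = class 0
predicted_class = max(range(len(pred)), key=pred.__getitem__)
print(predicted_class, round(loss, 3))
```

In a real training loop the loss would be backpropagated to adjust the encoder and decoder parameters, which is the parameter-adjustment step the method describes.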
In some embodiments, the decoupling model comprises: a discriminator, a classifier and a parser; the image decoupling information includes: image domain information, image semantic information, and image instance information; inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information, wherein the method comprises the following steps of:
extracting domain information of the image domain vector through the discriminator to obtain the image domain information;
extracting semantic information from the image semantic vector through the classifier to obtain the image semantic information;
and extracting the instance information of the image instance vector through the parser to obtain the image instance information.
In some embodiments, the classifier includes: the similarity calculation layer, the clustering layer and the semantic extraction layer; the extracting semantic information from the image semantic vector by the classifier to obtain image semantic information comprises:
taking the image domain information as a screening condition, and screening the image semantic vector through the similarity calculation layer to obtain a reference semantic vector;
performing similarity measurement calculation on the image semantic vector and the reference semantic vector through the similarity calculation layer to obtain similarity data;
clustering the image semantic vectors through the clustering layer to obtain a target vector set; the reference semantic vector is used as a clustering center, and the similarity data is used as a clustering parameter;
and obtaining the structural information of the target vector set through the semantic extraction layer to obtain the image semantic information.
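A hedged sketch of the classifier pipeline above: the patent names a similarity measurement and a clustering parameter without specifying them, so cosine similarity and a fixed threshold stand in here, and all vectors are invented for illustration.

```python
import math

def cosine(u, v):
    # Similarity measurement of the similarity calculation layer
    # (cosine similarity is an assumption; the patent names no metric).
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Toy image semantic vectors; the first plays the role of the screened
# reference semantic vector that serves as the cluster center.
semantic_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
reference = semantic_vectors[0]

# Similarity data between every semantic vector and the reference.
similarity = [cosine(v, reference) for v in semantic_vectors]

# Clustering layer: keep the vectors close enough to the center; the
# result is the target vector set read by the semantic extraction layer.
THRESHOLD = 0.5  # illustrative clustering parameter
target_vector_set = [v for v, s in zip(semantic_vectors, similarity) if s > THRESHOLD]
print(len(target_vector_set))
```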
In some embodiments, the calculating the loss according to the preset image reference category, the image prediction category and the image decoupling information to obtain target loss data specifically includes:
performing loss calculation according to the image reference category and the image prediction category to obtain classified loss data;
performing loss calculation on the image decoupling information to obtain decoupling loss data;
and performing data splicing on the classified loss data and the decoupling loss data to obtain the target loss data.
In some embodiments, the performing loss calculation on the image decoupling information to obtain decoupling loss data includes:
performing loss calculation on the image domain information to obtain an image domain loss value;
performing loss calculation on the image semantic information to obtain an image semantic loss value;
performing loss calculation on the image instance information to obtain an image instance loss value;
and carrying out data combination on the image domain loss value, the image semantic loss value and the image instance loss value to obtain the decoupling loss data.
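The two combination steps (classification loss plus the three decoupling losses) can be illustrated numerically. This is a hedged reading: the patent describes "data splicing" and "data combination" without giving a formula, so a plain sum stands in for both operations, and every number is invented.

```python
# Illustrative loss values; in practice these come from the model.
classification_loss = 0.40  # from the reference vs. predicted category
domain_loss, semantic_loss, instance_loss = 0.10, 0.05, 0.02  # per-vector losses

# "Data combination" of the three decoupling losses.
decoupling_loss = domain_loss + semantic_loss + instance_loss

# "Data splicing" of classification loss and decoupling loss
# into the target loss data used for parameter adjustment.
target_loss = classification_loss + decoupling_loss
print(round(target_loss, 2))
```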
In some embodiments, the target image features include target semantic features; inputting the obtained target image features into the target image classification model for image classification processing to obtain the target image category includes:
inputting the target image features into the target image classification model; wherein the target image classification model comprises: a target encoder and a target decoder;
encoding the target semantic features through the target encoder to obtain target semantic vectors;
and decoding the target semantic vector through the target decoder to obtain the target image category.
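A minimal sketch of this inference path, assuming toy stand-ins for the trained target encoder and decoder; the class names and scaling weight are invented purely for illustration.

```python
def target_encoder(semantic_features):
    # Stand-in for the trained semantic encoder.
    return [1.5 * f for f in semantic_features]

def target_decoder(semantic_vector, class_names):
    # Stand-in decoder: the class with the largest score wins.
    best = max(range(len(semantic_vector)), key=semantic_vector.__getitem__)
    return class_names[best]

classes = ["normal", "abnormal"]  # hypothetical target image categories
target_category = target_decoder(target_encoder([0.2, 0.8]), classes)
print(target_category)
```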
To achieve the above object, a second aspect of an embodiment of the present application provides an image classification apparatus, including:
the feature acquisition module is used for acquiring training image features; wherein the training image features include: image domain features, image semantic features, and image instance features;
the feature input module is used for inputting the training image features into a preset original image classification model; wherein the original image classification model comprises: domain encoder, semantic encoder, instance encoder and decoder;
the encoding module is used for encoding the image domain features through a domain encoder to obtain image domain vectors, encoding the image semantic features through the semantic encoder to obtain image semantic vectors, and encoding the image instance features through the instance encoder to obtain image instance vectors;
the decoding module is used for decoding the image domain vector, the image semantic vector and the image instance vector through the decoder to obtain an image prediction category;
the decoupling module is used for inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information;
the loss calculation module is used for carrying out loss calculation according to a preset image reference category, the image prediction category and the image decoupling information to obtain target loss data;
the parameter adjustment module is used for carrying out parameter adjustment on the original image classification model according to the target loss data to obtain a target image classification model;
and the image classification module is used for inputting the acquired target image characteristics into the target image classification model to perform image classification processing to obtain target image categories.
To achieve the above object, a third aspect of the embodiments of the present application provides an electronic device, including a memory storing a computer program and a processor that implements the method of the first aspect when executing the computer program.
To achieve the above object, a fourth aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of the first aspect.
According to the image classification method and apparatus, the electronic device, and the storage medium of the embodiments of the present application, image domain features, image semantic features, and image instance features are acquired and input into a preset original image classification model. The image domain features are encoded by a domain encoder to obtain image domain vectors, the image semantic features are encoded by a semantic encoder to obtain image semantic vectors, and the image instance features are encoded by an instance encoder to obtain image instance vectors. The image domain vectors, image semantic vectors, and image instance vectors are decoded by an original decoder to obtain an image prediction category, and are also input into a preset decoupling model for decoupling processing to obtain image decoupling information. Loss calculation is then performed on a preset image reference category, the image prediction category, and the image decoupling information to obtain target loss data, and the parameters of the original image classification model are adjusted according to the target loss data, so that a target image classification model capable of more accurate image classification is constructed. The acquired target image features are input into the target image classification model to obtain the target image category. Thus, for complex medical images, classifying from the three perspectives of image domain, image semantics, and image instance can improve the accuracy of medical image classification and provide medical researchers with more accurate classification results.
Drawings
FIG. 1 is a flow chart of an image classification method provided by an embodiment of the present application;
FIG. 2 is a flow chart of an image classification method according to another embodiment of the present application;
fig. 3 is a flowchart of step S105 in fig. 1;
fig. 4 is a flowchart of step S302 in fig. 3;
fig. 5 is a flowchart of step S303 in fig. 3;
fig. 6 is a flowchart of step S106 in fig. 1;
fig. 7 is a flowchart of step S602 in fig. 6;
fig. 8 is a flowchart of step S108 in fig. 1;
fig. 9 is a schematic structural diagram of an image classification apparatus according to an embodiment of the present application;
fig. 10 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
It should be noted that although functional blocks are divided in the device diagrams and a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the block division in the device or the order in the flowcharts. The terms "first", "second" and the like in the description, the claims, and the above drawings are used to distinguish similar elements and are not necessarily used to describe a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the application only and is not intended to be limiting of the application.
First, several nouns involved in the present application are parsed:
Artificial intelligence (Artificial Intelligence, AI): a new technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence; research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is also a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Unsupervised domain adaptation (Unsupervised Domain Adaptation, UDA): aims to transfer knowledge from a labeled source domain to an unlabeled target domain. Most UDA methods focus on learning domain-invariant feature representations at the domain level or the category level using a convolutional neural network (CNN) based framework. A basic problem with category-level UDA is the generation of pseudo labels for samples in the target domain; these pseudo labels are typically too noisy to allow accurate domain alignment, which inevitably affects the performance of UDA. With the success of Transformers on various tasks, it has been found that cross-attention in Transformers is robust to noisy input pairs, enabling better feature alignment.
Decoupling: coupling refers to the phenomenon in which two or more systems, or two forms of motion, interact and influence each other. Mathematically, decoupling refers to transforming an equation containing multiple variables into a system of equations each expressed in a single variable, so that the variables no longer jointly and directly affect the result of one equation at the same time, thereby simplifying analysis and calculation.
Image instance segmentation (Instance Segmentation): instance segmentation further refines semantic segmentation (Semantic Segmentation) by separating the foreground of an object from the background, achieving object separation at the pixel level. Image semantic segmentation and image instance segmentation are two different concepts: semantic segmentation only distinguishes and segments objects of different categories, whereas instance segmentation further segments different instances within the same category.
Cross entropy (Cross Entropy): cross entropy is an important concept in Shannon's information theory, mainly used to measure the difference between two probability distributions. The performance of a language model is typically measured in terms of cross entropy and perplexity. The meaning of cross entropy is the difficulty of recognizing text with the model or, from a compression perspective, how many bits are needed on average to encode each word.
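As a worked illustration of this definition, the following toy example (all distributions invented here) shows that cross entropy is small when the predicted distribution is close to the reference one-hot distribution and large when it is not:

```python
import math

def cross_entropy(p, q):
    # H(p, q) = -sum_x p(x) * log q(x): the expected number of nats
    # needed to encode samples from p using a code optimised for q.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

reference = [1.0, 0.0, 0.0]    # one-hot ground-truth distribution
good_pred = [0.9, 0.05, 0.05]  # confident, correct prediction
bad_pred = [0.2, 0.4, 0.4]     # diffuse, mostly wrong prediction

h_good = cross_entropy(reference, good_pred)  # -ln 0.9 ≈ 0.105
h_bad = cross_entropy(reference, bad_pred)    # -ln 0.2 ≈ 1.609
print(round(h_good, 3), round(h_bad, 3))
```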
k-means clustering algorithm (k-means clustering algorithm): the k-means clustering algorithm is an iteratively solved cluster-analysis algorithm. To divide the data into K groups, K objects are randomly selected as initial cluster centers, the distance between each object and each cluster center is calculated, and each object is assigned to the nearest cluster center. The cluster centers and the objects assigned to them together represent a cluster.
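The steps above can be sketched in plain Python; the 1-D points and initial centers below are chosen purely for illustration.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def k_means(points, centers, iterations=10):
    """Plain k-means: assign each point to its nearest center, then move
    each center to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        centers = [
            [sum(p[d] for p in cluster) / len(cluster) for d in range(len(cluster[0]))]
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers, clusters

# Two well-separated 1-D groups; initial centers picked from the data.
points = [[0.0], [0.2], [0.1], [4.0], [4.2], [3.9]]
centers, clusters = k_means(points, centers=[[0.0], [4.0]])
print([len(c) for c in clusters], [round(c[0], 2) for c in centers])
```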
With the development of intelligent technology, it has been applied in a wide range of fields. For example, intelligent image recognition has been used to assist medical personnel in medical research. Medical images of different internal tissues are complex, and medical image classification demands high accuracy.
In the related art, to make medical image classification easier and more efficient, image sample data containing a large amount of label information is set as source domain sample data, image sample data lacking label information is set as target domain sample data, and the label information of the source domain sample data is migrated to the target domain sample data to classify the latter. Conventionally, this migration relies mainly on calculating the image vector distance between the source domain sample data and the target domain sample data, determining which source domain sample the target domain sample is close to according to that distance, and migrating the corresponding label information, thereby classifying the target domain sample data. However, abnormal image data cannot be accurately classified through semantic distance, and relying on image vector distance alone yields low classification accuracy for target domain sample data.
On this basis, the embodiments of the present application provide an image classification method and apparatus, an electronic device, and a storage medium. Image domain features, image semantic features, and image instance features are input into an original image classification model: the image domain features are encoded by a domain encoder to obtain image domain vectors, the image semantic features are encoded by a semantic encoder to obtain image semantic vectors, and the image instance features are encoded by an instance encoder to obtain image instance vectors. The image domain vectors, image semantic vectors, and image instance vectors are then decoded by an original decoder to obtain an image prediction category, so that images are classified more accurately by combining multiple kinds of image features. The three vectors are also input into a decoupling module for decoupling processing to obtain image decoupling information, from which the accuracy of the feature encoding process can be assessed. Loss calculation is then performed on the image reference category, the image prediction category, and the image decoupling information to obtain target loss data, and the parameters of the original image classification model are adjusted according to the target loss data to obtain a target image classification model. By constructing a target image classification model that classifies images more accurately and can classify abnormal images, the target image features are input into the target image classification model for image classification processing to obtain the target image category, so that images are classified accurately and previously misclassified images can be classified correctly.
Therefore, for medical images that are prone to abnormal points and cannot be classified by conventional models, extracting the image domain, image semantics, and image instances of the medical images for classification enables abnormal medical images to be classified and also improves the accuracy of classifying normal medical images.
The image classification method and device, the electronic device and the storage medium provided by the embodiment of the application are specifically described through the following embodiments, and the image classification method in the embodiment of the application is described first.
The embodiments of the present application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
The embodiment of the application provides an image classification method, which relates to the technical fields of artificial intelligence and digital medical treatment. The image classification method provided by the embodiment of the application can be applied to a terminal, a server, or software running in the terminal or the server. In some embodiments, the terminal may be a smart phone, tablet, notebook, desktop, etc.; the server side may be configured as an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, big data and artificial intelligence platforms; the software may be an application that implements the image classification method, but is not limited to the above forms.
The application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In the embodiments of the present application, when related processing is required according to user information, user behavior data, user image data, user history data, user location information, and other data related to user identity or characteristics, permission or consent of the user is obtained first, and the collection, use, processing, and the like of the data comply with related laws and regulations and standards. In addition, when the embodiment of the application needs to acquire the sensitive personal information of the user, the independent permission or independent consent of the user is acquired through popup or jump to a confirmation page and the like, and after the independent permission or independent consent of the user is definitely acquired, the necessary relevant data of the user for enabling the embodiment of the application to normally operate is acquired.
Fig. 1 is an optional flowchart of an image classification method according to an embodiment of the present application, where the method in fig. 1 may include, but is not limited to, steps S101 to S108.
Step S101, obtaining training image features; wherein the training image features include: image domain features, image semantic features, and image instance features;
step S102, inputting training image features into a preset original image classification model; wherein, the original image classification model comprises: domain encoder, semantic encoder, instance encoder, and original decoder;
step S103, coding the image domain features through a domain coder to obtain image domain vectors, coding the image semantic features through a semantic coder to obtain image semantic vectors, and coding the image instance features through an instance coder to obtain image instance vectors;
step S104, decoding the image domain vector, the image semantic vector and the image instance vector through an original decoder to obtain an image prediction category;
step S105, inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information;
Step S106, carrying out loss calculation according to a preset image reference category, an image prediction category and image decoupling information to obtain target loss data;
step S107, carrying out parameter adjustment on the original image classification model according to the target loss data to obtain a target image classification model;
step S108, inputting the acquired target image characteristics into a target image classification model for image classification processing to obtain target image categories.
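The flow of steps S101 to S108 can be sketched as follows. This is a minimal numpy illustration under stated assumptions: each encoder is reduced to a hypothetical single linear map (the embodiment describes VAE networks), the feature dimension (8), latent dimension (4) and number of categories (3) are invented for the example, and a softmax over the stitched vectors stands in for the original decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(features, weights):
    # Each encoder is sketched as a single linear map; the embodiment
    # describes VAE networks, omitted here for brevity.
    return features @ weights

# Hypothetical dimensions: 8-dim input features, 4-dim latent vectors.
W_domain, W_semantic, W_instance = (rng.normal(size=(8, 4)) for _ in range(3))
x_domain, x_semantic, x_instance = (rng.normal(size=8) for _ in range(3))

z_d = encode(x_domain, W_domain)      # image domain vector (step S103)
z_s = encode(x_semantic, W_semantic)  # image semantic vector
z_i = encode(x_instance, W_instance)  # image instance vector

# Step S104: the original decoder consumes the stitched (concatenated) vectors.
z = np.concatenate([z_d, z_s, z_i])
W_decoder = rng.normal(size=(12, 3))  # 3 hypothetical image categories
logits = z @ W_decoder
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted_category = int(np.argmax(probs))  # the image prediction category
```

The decoupling model (step S105) and loss calculation (steps S106 to S107) would operate on `z_d`, `z_s` and `z_i` as described in the embodiments below.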
In steps S101 to S108 of the embodiment of the application, training image features composed of image domain features, image semantic features and image instance features are obtained and input into a preset original image classification model. The image domain features are encoded by a domain encoder to obtain image domain vectors, the image semantic features are encoded by a semantic encoder to obtain image semantic vectors, and the image instance features are encoded by an instance encoder to obtain image instance vectors. The image domain vectors, image semantic vectors and image instance vectors are respectively input into an original decoder and a decoupling model: the original decoder decodes them to obtain an image prediction category, and the decoupling model decouples them to obtain image decoupling information, which can feed back whether the vectors formed by encoding the image features are distinguishable. Loss calculation is then performed on the image reference category, the image prediction category and the image decoupling information to obtain target loss data, and parameter adjustment is performed on the original image classification model according to the target loss data, so as to construct an accurate target image classification model. Because the target loss data includes loss data of the image decoupling information, the adjusted target image classification model encodes the image features more accurately and reduces noise influence in the encoding process. Therefore, inputting the target image features into the target image classification model for image classification processing yields the target image category, making image classification more accurate.
For example, for medical image classification, by constructing and extracting the image semantics, image domain and image instances of medical images for classification, image features at the instance level can be considered, and abnormal medical images can be accurately classified.
In step S101 of some embodiments, training image features are acquired; that is, training images are acquired and input into a feature extractor for feature extraction to obtain the training image features. The training images may be extracted directly from a target database, or candidate images from a plurality of internet platforms may be acquired in real time and the training images screened from the candidate images according to image demand information; the acquisition mode of the training images is not particularly limited. The present embodiment is applied to medical image classification, where images may be classified by disease level for specific internal or external tissues: internal tissues include the stomach, abdomen, heart, brain, etc., and external tissues include the arm, foot, hand, head, etc. If the acquired training image is an electroencephalogram, brain diseases are classified according to the electroencephalogram; if the acquired training image is an ultrasound image of the heart region, heart disease levels are classified based on the ultrasound image. After a training image is obtained, it is input into the feature extractor, and image domain features, image semantic features and image instance features are extracted from the training image.
Specifically, the training images include source domain images and target domain images, and the target domain images do not include label information. The source domain data is D_s = {(x_i, y_i)}, i = 1, …, n_s, comprising n_s source domain images, where x_i is a source domain image feature and y_i is its label information; the target domain data is D_t = {x_j}, j = 1, …, n_t, comprising n_t target domain images, where x_j is a target domain image feature. Since the training image features comprise image domain features, image semantic features and image instance features, the training image is input into the feature extractor to judge its image domain and obtain the image domain features; that is, whether the training image has label information is analyzed, and the image domain features are determined according to the label information. The image domain features comprise source domain image features, characterized by the presence of label information, and target domain image features, characterized by its absence. Meanwhile, the feature extractor performs semantic feature extraction on the training image using a multi-layer extraction mechanism: candidate image features of the training image are first extracted, object recognition is performed on the candidate image features to obtain image object information, and semantic extraction is performed on the image object information to obtain the image semantic features. The candidate image features include color features, texture features and shape features; object recognition on the candidate image features mainly performs similarity matching between object models in a knowledge base and the extracted image reference features to determine the image object information of each candidate image feature.
The object is identified from the image object information, and the image object information and the semantic reference features are mapped using the recognition rules and methods of the knowledge base to obtain the image semantic features. After the image semantic feature extraction is completed, the feature extractor extracts instance features from the training image to obtain the image instance features, which can represent the target objects of the training image.
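As one hedged illustration of the candidate color features mentioned above, the sketch below computes a per-channel intensity histogram; the bin count and L1 normalization are assumptions made for the example, not the embodiment's actual feature extractor.

```python
import numpy as np

def color_histogram(image, bins=4):
    # Candidate color feature: a per-channel intensity histogram,
    # L1-normalized so images of different sizes are comparable.
    hists = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
             for c in range(image.shape[-1])]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

# A tiny all-black RGB image: every pixel falls in the first bin per channel.
feat = color_histogram(np.zeros((8, 8, 3), dtype=np.uint8))
```

Texture and shape features would be extracted analogously and concatenated into the candidate image features.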
In step S102 of some embodiments, the original image classification model includes a domain encoder, a semantic encoder, an instance encoder and an original decoder. By constructing an original image classification model combining the domain encoder, the semantic encoder and the instance encoder, a corresponding encoder performs feature encoding for each kind of training image feature, and the generated training image vector synthesizes the image domain features, image semantic features and image instance features, so that abnormal medical images can be classified normally and the accuracy of medical image classification is improved.
It should be noted that when a certain internal tissue is photographed, the medical image is affected by the internal environment of the human body, so noise exists in the generated medical image and the imaging effect may be abnormal. Referring to fig. 2, the domain encoder, the semantic encoder and the instance encoder are respectively connected to the feature extractor ε_F. The feature extractor ε_F inputs the image domain features to the domain encoder ε_d, the image semantic features to the semantic encoder ε_s, and the image instance features to the instance encoder ε_i, so that each kind of training image feature undergoes a separate encoding process. Therefore, by combining the image domain features, image semantic features and image instance features of a medical image, the noise influence can be reduced and abnormal medical images can be classified normally, making medical image classification more accurate.
In step S103 of some embodiments, referring to fig. 2, the feature extractor inputs the image domain features to the domain encoder ε_d for encoding. The image domain features include source domain image features and target domain image features, so the domain encoder ε_d encodes the image domain features to output an image domain vector Z_d, where Z_d is 1 or 0: source domain image features are characterized by a 1 vector and target domain image features by a 0 vector. Meanwhile, the feature extractor inputs the image semantic features to the semantic encoder ε_s for encoding to obtain an image semantic vector Z_s; that is, selected semantic vectors are chosen from candidate semantic vectors according to the image semantic features, and the selected semantic vectors are vector-stitched to obtain Z_s. The feature extractor inputs the image instance features to the instance encoder ε_i for encoding to obtain an image instance vector Z_i, i.e., the image instance features are characterized by Z_i, so that decoding by the original decoder according to the image domain vector, the image semantic vector and the image instance vector yields a more accurate image prediction category.
It should be noted that the domain encoder, the semantic encoder and the instance encoder are variational autoencoder (VAE) networks. The VAE networks map the low-dimensional training image features to image vectors Z in a higher-dimensional space, where the training image features are represented more richly, so that the image prediction category obtained by decoding these higher-dimensional image vectors is more accurate.
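A VAE-style encoder head can be sketched as follows. This is a generic reparameterization-trick illustration, not the embodiment's actual network; the weight matrices and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def vae_encode(x, w_mu, w_logvar):
    # A VAE-style encoder head: predict a mean and log-variance for the
    # latent distribution, then sample the latent vector z with the
    # reparameterization trick z = mu + sigma * eps.
    mu = x @ w_mu
    logvar = x @ w_logvar
    eps = rng.normal(size=mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    return z, mu, logvar

# Hypothetical 4-dim feature mapped to a 16-dim latent vector.
x = rng.normal(size=4)
w_mu = rng.normal(size=(4, 16))
w_logvar = rng.normal(size=(4, 16))
z, mu, logvar = vae_encode(x, w_mu, w_logvar)
```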
In step S104 of some embodiments, referring to fig. 2, a training image vector is obtained by vector-stitching the image domain vector, the image semantic vector and the image instance vector, and the training image vector is then input to the original decoder ε_D for decoding; that is, image classification is performed according to the training image vector to obtain the image prediction category, making the image category prediction for medical images more accurate.
Referring to fig. 3, in some embodiments, the decoupling model includes: a discriminator, a classifier and a parser; the image decoupling information includes: image domain information, image semantic information, and image instance information; step S105 may include, but is not limited to, steps S301 to S303:
step S301, extracting domain information of the image domain vector through a discriminator to obtain image domain information;
step S302, extracting semantic information from the semantic vectors of the images through a classifier to obtain semantic information of the images;
step S303, extracting the instance information of the image instance vector through a parser to obtain the image instance information.
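The discriminator of step S301 can be illustrated as a minimal logistic unit scoring the image domain vector; the single-weight-vector form is an assumption made for this sketch, not the embodiment's actual discriminator network.

```python
import numpy as np

def discriminate(z_d, w):
    # Domain discriminator: a logistic unit scoring whether the image domain
    # vector came from the source domain (score near 1) or the target
    # domain (score near 0).
    return float(1.0 / (1.0 + np.exp(-np.dot(z_d, w))))

w = np.ones(4)  # hypothetical discriminator weights
source_score = discriminate(np.ones(4), w)   # resembles a source-domain vector
target_score = discriminate(-np.ones(4), w)  # resembles a target-domain vector
```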
In step S301 of some embodiments, a decoupling model including a discriminator, a classifier and a parser is constructed, and the image domain vector is input to the discriminator for domain information extraction to obtain the image domain information; that is, the discriminator distinguishes whether the image domain vector indicates that the training image belongs to a source domain image or a target domain image.
In step S302 of some embodiments, semantic information extraction is performed on the image semantic vectors through a classifier to obtain image semantic information. The image semantic information can determine whether images from different domains but of the same image category share the same semantics, so the image semantic vectors of the same image category can be adjusted according to the image semantic information: vectors of images from different domains but of the same category are made similar, and the image category obtained by decoding the image semantic vectors becomes more accurate. Therefore, the image semantic vectors of both the source domain images and the target domain images are input into the classifier for semantic information extraction, so that whether the semantics of different domains are the same can be judged according to the image semantic information; that is, whether the image semantic vectors output by the original image classification model for training images of different image domains are reasonable is judged, and the semantic encoder of the original image classification model can be parameter-adjusted accordingly, making the image semantic vectors it encodes for images of the same image category more similar.
In step S303 of some embodiments, instance information extraction is performed on the image instance vector by the parser; that is, the training image undergoes instance separation according to the image instance vector, separating the foreground from the background to achieve pixel-level object separation and obtain image instance information capable of representing the image category of the training image.
In steps S301 to S303 of the embodiment of the application, the discriminator extracts from the image domain vector the image domain information characterizing which image domain the training image comes from; the classifier performs semantic extraction on the image semantic vector to obtain the image semantic information, from which it can be determined whether training images in different image domains belong to the same image category; and the parser extracts instance information from the image instance vector to obtain the image instance information, so as to determine which target objects the training image includes. Where the training image is a medical image, the image instance information characterizes the medical image as including target objects representing each disease category.
Referring to fig. 4, in some embodiments, the classifier includes: the similarity calculation layer, the clustering layer and the semantic extraction layer; step S302 may include, but is not limited to, steps S401 to S404:
step S401, taking the image domain information as a screening condition, and screening the image semantic vector through a similarity calculation layer to obtain a reference semantic vector;
step S402, performing similarity measurement calculation on the image semantic vector and the reference semantic vector through a similarity calculation layer to obtain similarity data;
step S403, clustering the image semantic vectors through a clustering layer to obtain a target vector set, wherein the reference semantic vector is used as the clustering center and the similarity data as a clustering parameter;
step S404, obtaining the structural information of the target vector set through a semantic extraction layer to obtain the image semantic information.
In step S401 of some embodiments, the image domain information is used as a filtering condition, and the similarity calculation layer filters the image semantic vectors; that is, image semantic vectors from the same image domain are obtained as reference semantic vectors. For example, if the image domain information characterizes the training image as a source domain image, an image semantic vector whose training image is a source domain image is obtained as the reference semantic vector; if the image domain information characterizes the training image as a target domain image, an image semantic vector whose training image is a target domain image is obtained as the reference semantic vector.
In step S402 of some embodiments, image semantic vectors of the same image domain are acquired as reference semantic vectors, and the similarity between each image semantic vector and the reference semantic vector is then calculated to obtain the similarity data. The similarity data may be any one of the following: the Manhattan distance, the Euclidean distance, and the included-angle cosine value; the type of similarity measure is not particularly limited. If the similarity data is the cosine similarity, the included-angle cosine value between the image semantic vector and the reference semantic vector is calculated: the larger the cosine value, the more similar the image semantic vector and the reference semantic vector; the smaller the cosine value, the more dissimilar they are. The included-angle cosine value between the image semantic vector and the reference semantic vector is therefore calculated as the similarity data, so that the vector similarity between them can be judged through the similarity data.
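The included-angle cosine value described above can be computed as follows; this is the standard formula, shown as a minimal sketch.

```python
import numpy as np

def cosine_similarity(a, b):
    # Included-angle cosine between an image semantic vector and a reference
    # semantic vector: 1.0 means identical direction (most similar),
    # 0.0 means orthogonal (dissimilar).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```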
In step S403 of some embodiments, the clustering layer uses the reference semantic vector as the clustering center and the similarity data as a clustering parameter, and clusters the image semantic vectors through a k-means clustering algorithm. That is, similarity data exceeding a preset similarity threshold is obtained as target similarity data by comparing the similarity data with the preset threshold; the target similarity data is then sorted, and the k top-ranked image semantic vectors are taken as the target vector set. If the similarity data does not exceed the preset similarity threshold, the corresponding image semantic vector cannot join the target vector set. Therefore, the image semantic vectors are clustered according to the similarity data to obtain the target vector set, making vector classification simpler.
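One hedged reading of the threshold-and-top-k selection described above is sketched below; the threshold value and k are hypothetical parameters, not values from the embodiment.

```python
def select_target_vector_set(similarities, threshold=0.5, k=3):
    # Keep indices of image semantic vectors whose similarity to the
    # clustering center (reference semantic vector) exceeds the threshold,
    # then return the k most similar as the target vector set.
    kept = [i for i, s in enumerate(similarities) if s > threshold]
    kept.sort(key=lambda i: similarities[i], reverse=True)
    return kept[:k]

# Similarities of five image semantic vectors to the clustering center.
chosen = select_target_vector_set([0.9, 0.2, 0.7, 0.6, 0.95])
```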
In step S404 of some embodiments, the structural information of the target vector set is acquired; that is, the semantic information of the reference semantic vector in the target vector set is acquired. Since image structures of the same category are not necessarily similar, all image semantic vectors in the same target vector set adopt the same image semantic information, so that the image semantic information of the same image category in the same image domain is identical and discernible image semantic information is extracted. It can then be judged more simply according to the image semantic information whether training images in the same image domain share the same image category, and thereby whether the image semantic vectors generated for same-category training images in different image domains are similar.
In steps S401 to S404 of the embodiment of the application, the image domain information is used as a screening condition, and the similarity calculation layer screens the image semantic vectors to obtain reference semantic vectors; that is, image semantic vectors with the same image domain information are screened out as reference semantic vectors, and the included-angle cosine value between a reference semantic vector and the image semantic vectors of the same image domain information is calculated as the similarity data. With the similarity data as a clustering parameter and the reference semantic vector as the clustering center, the clustering layer clusters the image semantic vectors to obtain the target vector set, and the semantic information of the reference semantic vector in the target vector set is taken as the image semantic information. Whether the image semantic information of the same category but different image domains is the same can then be judged according to the image semantic information, and the parameters of the semantic encoder can be adjusted accordingly, so that the semantic encoder outputs similar and discernible image semantic vectors for target images of different image domains but the same image category.
Referring to fig. 5, in some embodiments, step S303 may include, but is not limited to, steps S501 to S503:
step S501, performing image reconstruction processing on the image instance vector through a reconstruction layer to obtain an instance reference image;
step S502, carrying out target detection on an example reference image through a target detection layer to obtain image target information;
in step S503, image object information is used as a segmentation parameter, and image instance segmentation processing is performed on the instance reference image through the instance segmentation layer, so as to obtain image instance information.
In step S501 of some embodiments, the image instance vector is reconstructed into an image through the reconstruction layer to obtain an instance reference image, which approximates the training image.
In step S502 of some embodiments, the target detection layer performs target detection on the instance reference image; that is, it recognizes the target objects in the instance reference image and marks them, for example by establishing an identification frame around each target object or representing each target object with a different color, so as to obtain the image target information.
In step S503 of some embodiments, the image instance information includes the position and identifier of each target object. With the image target information as a segmentation parameter, the instance segmentation layer performs image instance segmentation on the instance reference image; that is, each target object is segmented from the background to obtain the image instance information, so that the separated image instance information can improve the accuracy of image classification.
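The separation of target objects from the background in steps S501 to S503 can be caricatured as follows; the box format and the mean-intensity thresholding rule are assumptions made for this sketch, not the embodiment's segmentation network.

```python
import numpy as np

def segment_instances(image, boxes):
    # Hypothetical instance segmentation: for each detected box
    # (r0, r1, c0, c1) from the target detection layer, separate foreground
    # (above the patch's mean intensity) from background.
    masks = []
    for r0, r1, c0, c1 in boxes:
        patch = image[r0:r1, c0:c1]
        masks.append(patch > patch.mean())
    return masks

image = np.zeros((6, 6))
image[1:3, 1:3] = 10.0  # a bright target object on a dark background
masks = segment_instances(image, [(0, 4, 0, 4)])
```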
In steps S501 to S503 of the embodiment of the application, an instance reference image is obtained by reconstructing the image instance vector through the reconstruction layer; the target detection layer then performs target detection on the instance reference image to detect its target objects and obtain the image target information; and with the image target information as a segmentation parameter, the instance segmentation layer performs image instance segmentation on the instance reference image to obtain the image instance information. Instance loss data can thus be calculated according to the image instance information, and whether the original image classification model has the discrimination capability to extract instance features is determined according to the instance loss data, so that training the original image classification model with the instance loss data enables abnormal medical images to be accurately recognized.
Referring to fig. 6, in some embodiments, step S106 includes, but is not limited to, steps S601 to S603:
step S601, performing loss calculation according to an image reference category and an image prediction category to obtain classified loss data;
step S602, performing loss calculation on the image decoupling information to obtain decoupling loss data;
and step S603, performing data splicing on the classified loss data and the decoupled loss data to obtain target loss data.
In step S601 of some embodiments, the image reference category is the label information of the source domain image; since the target domain image has no label information, the label information of the source domain image is used as the image reference category, and loss calculation is performed on the image reference category and the image prediction category of the source domain image to obtain the classification loss data. The classification loss data can characterize the classification accuracy of the original image classification model on source domain images.
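The classification loss on labeled source domain images can be illustrated with a plain negative log-likelihood term; the embodiment uses an ELBO function, of which this is only the reconstruction-style classification term, shown as a minimal sketch.

```python
import math

def classification_loss(predicted_probs, reference_category):
    # Negative log-likelihood of the image reference category (the source
    # domain label) under the predicted category distribution: zero when the
    # model assigns probability 1 to the correct category, large otherwise.
    return -math.log(predicted_probs[reference_category])

# A uniform 4-way prediction incurs loss ln(4); a confident correct one, 0.
uniform_loss = classification_loss([0.25, 0.25, 0.25, 0.25], 0)
```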
It should be noted that the loss calculation is performed on the image reference category and the image prediction category through a preset loss function to obtain the classification loss data. The loss function is an ELBO function, and the calculation on the image reference category and the image prediction category through the ELBO function is shown in formula (1):

L_ELBO = E_{Z_d, Z_s, Z_i}[log p(y | Z_d, Z_s, Z_i)] − KL(q(ŷ | Z_d, Z_s, Z_i) ‖ p(y))    (1)

where ε_d is the domain encoder, ε_s is the semantic encoder, ε_i is the instance encoder, ε_D is the original decoder, Z_d is the image domain vector, Z_s is the image semantic vector, Z_i is the image instance vector, p(y) is the probability distribution of the image reference category, q(ŷ) is the probability distribution of the image prediction category, and L_ELBO is the ELBO loss function.
In step S602 of some embodiments, since the training images include source domain images and target domain images, the classification loss data alone cannot judge whether the original image classification model can accurately classify target domain images without label information, yet the target image classification model constructed later is applied to unlabeled image data. Therefore, loss calculation is performed on the image decoupling information to obtain the decoupling loss data. Because the image decoupling information characterizes the image domain information, image semantic information and image instance information in the encoded image vectors of the training image, the decoupling loss data adjusts the model parameters of the original image classification model so that discriminative image vectors are constructed after the target image is encoded, thereby improving the accuracy of image classification.
In step S603 of some embodiments, the target loss data is obtained by performing data stitching on the classification loss data and the decoupling loss data, so as to adjust model parameters of the original image classification model through the target loss data, so as to construct a target image classification model with more accurate classification.
In steps S601 to S603 of the embodiment of the application, classification loss data is obtained by loss calculation on the image reference category and the image prediction category, and can determine the accuracy of the original image classification model in classifying source domain images; decoupling loss data is then obtained by loss calculation on the image decoupling information, and can judge whether the original image classification model extracts image vectors with strong discrimination for images of different domains; finally, the classification loss data and the decoupling loss data are spliced to obtain the target loss data, so that parameter adjustment is performed on the original image classification model according to the target loss data to construct a target image classification model with more accurate image classification.
Referring to fig. 7, in some embodiments, step S602 may include, but is not limited to, steps S701 to S704:
Step S701, performing loss calculation on the image domain information to obtain an image domain loss value;
step S702, performing loss calculation on the image semantic information to obtain an image semantic loss value;
step S703, performing loss calculation on the image instance information to obtain an image instance loss value;
step S704, carrying out data combination on the image domain loss value, the image semantic loss value and the image instance loss value to obtain decoupling loss data.
In step S701 of some embodiments, the image domain loss value is obtained by performing loss calculation on the image domain information. The image domain loss value measures the accuracy of the image domain vector constructed by the domain encoder, and the model parameters of the original image classification model are adjusted according to it, so that the image domain vector output by the adjusted domain encoder is more accurate and the image semantic vector output by the semantic encoder gains more discrimination, thereby improving the accuracy of image classification.
The image domain information is subjected to loss calculation through a cross entropy function to obtain an image domain loss value, so that the image domain loss value is easy to calculate.
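The cross-entropy computation of step S701 can be sketched as follows. The discriminator architecture and the number of image domains are not fixed by this embodiment, so the two-domain logits below (source = 0, target = 1) are illustrative assumptions.

```python
import numpy as np

def domain_cross_entropy(logits, domain_labels):
    """Image domain loss value: cross entropy over per-image domain logits.

    logits: (N, num_domains) raw discriminator outputs.
    domain_labels: (N,) integer domain index (e.g. 0 = source, 1 = target).
    Returns the mean loss over the batch.
    """
    # Numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(domain_labels)), domain_labels].mean()

# Two images: one source-domain, one target-domain.
logits = np.array([[2.0, 0.1], [0.2, 1.5]])
labels = np.array([0, 1])
loss = domain_cross_entropy(logits, labels)
```

As the text notes, the cross-entropy form keeps the image domain loss value easy to compute; flipping the labels would make the loss strictly larger, which is what drives the parameter adjustment.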
In step S702 of some embodiments, loss calculation is performed on the image semantic information. Since the image semantic information characterizes the structural information of the same set of target vectors, images of the same image category carry the same image semantic information even when they come from different image domains. The loss of the image semantic information, namely the difference between the image semantic information of different image domains within the same image category, is therefore calculated, and the semantic encoder is adjusted according to the image semantic loss value so that it outputs approximately equal image semantic vectors for the same image category, making the image classification more accurate.
Specifically, image semantic information with the same image category but different image domains is obtained and respectively defined as source domain semantic information and target domain semantic information; and calculating cosine similarity between image semantic vectors corresponding to the source domain semantic information and the target domain semantic information, namely calculating cosine similarity between a clustering center of the target domain image and a clustering center of the source domain image of the same image category, and taking the cosine similarity as an image semantic loss value so as to adjust a semantic encoder in an original image classification model according to the image semantic loss value, so that the semantic encoder encodes similar image semantic vectors for the images of the same image category.
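A minimal sketch of this semantic loss between the two clustering centres. The text takes the cosine similarity itself as the image semantic loss value; the convention of minimising one minus the cosine similarity (so that aligning the centres drives the loss to zero) is an assumption added here.

```python
import numpy as np

def semantic_alignment_loss(source_center, target_center):
    """Image semantic loss between the source-domain and target-domain
    clustering centres of the same image category.

    Returns 1 - cos(angle); minimising it pulls the two centres of the
    same category toward each other across domains.
    """
    cos = source_center @ target_center / (
        np.linalg.norm(source_center) * np.linalg.norm(target_center))
    return 1.0 - cos

# Identical centres -> loss 0; orthogonal centres -> loss 1.
aligned = semantic_alignment_loss(np.array([1.0, 0.0]), np.array([2.0, 0.0]))
misaligned = semantic_alignment_loss(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```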
In step S703 of some embodiments, the image instance information is subjected to loss calculation to obtain an image instance loss value, and the image instance loss value characterizes the accuracy of image instance segmentation performed by the parser, and then the original image classification model is adjusted according to the image instance loss value, so that the image instance vector output by the instance encoder of the original image classification model is more accurate, and the noise influence in the image classification process can be effectively reduced through the image instance vector, thereby improving the accuracy of image classification model classification.
It should be noted that a specific formula for performing loss calculation on the image instance information is shown in formula (2):

L_ins = ε_A(Z)^T ε_A(Z) + ‖Z − ε_A(Z)‖²   (2)

where L_ins is the image instance loss value, Z represents the set of latent variables of all images in the source domain images and the target domain images, and ε_A denotes the parser. The term ε_A(Z)^T ε_A(Z) helps the parser output image instance information with discrimination, so that the image instance information can be separated, and the term ‖Z − ε_A(Z)‖² ensures that the intrinsic characteristics of the instance are preserved.
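A NumPy sketch of the image instance loss built from the two terms named above, ε_A(Z)^T ε_A(Z) and ‖Z − ε_A(Z)‖². Summing the two terms per image with unit weights and averaging over the batch is an assumption, since the excerpt does not give the exact weighting.

```python
import numpy as np

def instance_loss(Z, eps_A):
    """Image instance loss value over latent variables Z of shape (N, d).

    eps_A: the parser, mapping (N, d) latents to (N, d) outputs.
    Per image: eps_A(Z)^T eps_A(Z) regularises the parser output toward
    separable instance information; ||Z - eps_A(Z)||^2 preserves the
    intrinsic characteristics of the instance.
    """
    A = eps_A(Z)
    separation = np.sum(A * A, axis=1)             # eps_A(Z)^T eps_A(Z)
    reconstruction = np.sum((Z - A) ** 2, axis=1)  # ||Z - eps_A(Z)||^2
    return np.mean(separation + reconstruction)

# With an identity parser the reconstruction term vanishes and only the
# separation term remains.
Z = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = instance_loss(Z, lambda z: z)
```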
In step S704 of some embodiments, the decoupling loss data is obtained by data merging the image domain loss value, the image semantic loss value, and the image instance loss value, so that the decoupling loss data is simply constructed.
Specifically, the calculation formula of the decoupling loss data is shown in formula (3):

L_dec = L_dom + L_sem + L_ins   (3)

where L_dec is the decoupling loss data, L_dom is the image domain loss value, L_sem is the image semantic loss value, and L_ins is the image instance loss value.
In the steps S701 to S704 shown in the embodiment of the present application, an image domain loss value is obtained by performing loss calculation on the image domain information, and the image domain loss value characterizes whether the image domain vector output by the domain encoder is accurate; loss calculation is performed on the image semantic information to obtain an image semantic loss value, which characterizes whether the semantic encoder encodes similar image semantic vectors for images of the same image category from different image domains, and thus whether discriminative image semantic vectors are encoded; loss calculation is performed on the image instance information to obtain an image instance loss value, which characterizes whether the instance segmentation of the training image is accurate; and finally, the image domain loss value, the image semantic loss value and the image instance loss value are combined to obtain decoupling loss data, so that parameter adjustment is performed on the original image classification model according to the decoupling loss data to construct a target image classification model with more accurate image classification.
In step S107 of some embodiments, the classification loss data and the decoupling loss data are added to obtain target loss data, and parameter adjustment is performed on the original image classification model according to the target loss data to obtain the target image classification model. Specifically, parameter adjustment is performed on the semantic encoder in the original image classification model through the target loss data to obtain a target encoder, so that the target encoder can output discriminative image semantic vectors, improving the accuracy of image classification. The target image classification model is constructed from the target encoder and a target decoder. Because the target image subsequently input to the target image classification model carries no label information, the image domain features of the target image do not need to be extracted; only the image semantic features of the target image are extracted. Since the adjusted encoder can output discriminative image semantic vectors, the target image category obtained by decoding the image semantic vector through the target decoder is more accurate.
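The loss combination and parameter adjustment of step S107 can be sketched as below. The plain gradient-descent update and the learning rate are assumptions, since the embodiment does not name an optimiser.

```python
import numpy as np

def target_loss(classification_loss, decoupling_loss):
    """Step S107: combine the classification loss data and the decoupling
    loss data into the target loss data, here as an unweighted sum."""
    return classification_loss + decoupling_loss

def adjust_parameters(params, grads, lr=1e-3):
    """One gradient-descent update of the semantic encoder's parameters
    against the target loss (optimiser and learning rate are assumed).

    params, grads: dicts mapping parameter name -> numpy array.
    """
    return {name: p - lr * grads[name] for name, p in params.items()}
```

In training, the gradients of the target loss with respect to the semantic encoder's weights would be backpropagated and fed to `adjust_parameters` each iteration until the encoder outputs discriminative image semantic vectors.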
In some embodiments, referring to fig. 8, the target image features include target semantic features, and step S108 may include, but is not limited to, steps S801 to S803:
Step S801, inputting target image characteristics into a target image classification model; wherein the target image classification model comprises: a target encoder and a target decoder;
step S802, encoding the target semantic features through a target encoder to obtain target semantic vectors;
step S803, decoding the target semantic vector through a target decoder to obtain a target image category.
In step S801 of some embodiments, after the target image classification model including the target encoder and the target decoder is constructed, the target encoder being the semantic encoder after parameter adjustment, the target image is input to the feature extractor for feature extraction to obtain the target semantic features, and the target semantic features are input to the target image classification model for image classification.
In step S802 of some embodiments, the target semantic features are encoded by the target encoder to obtain a target semantic vector. The target encoder is the parameter-adjusted semantic encoder and can encode the target semantic features into a discriminative target semantic vector. Because the semantic encoder's parameters were adjusted during training according to the image instance loss value of the image instance information, the target semantic vector it outputs is less affected by noise, so that the constructed target semantic vector improves the accuracy of image classification.
In step S803 of some embodiments, the target semantic vector is decoded by the target decoder, and the decoding operation of the target decoder is the same as that of the original decoder, that is, the target semantic vector is subjected to image class identification to obtain the target image class, so that the classification of the target image is more accurate.
In steps S801 to S803 shown in the embodiment of the present application, target semantic features are input to a target image classification model, where the target image classification model includes a target encoder and a target decoder, and then the target encoder encodes the target semantic features to obtain a target semantic vector with discrimination, and then the target decoder decodes the target semantic vector to identify an image class according to the target semantic vector to obtain a target image class, so that the target image classification is more accurate.
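Steps S801 to S803 can be sketched as a simple encode-then-decode pass. The linear stand-ins for the target encoder and target decoder, and the choice of three image categories, are illustrative assumptions; the real components are the trained networks described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the trained components (shapes are assumptions):
W_enc = rng.standard_normal((16, 8))  # target encoder: features -> semantic vector
W_dec = rng.standard_normal((8, 3))   # target decoder: semantic vector -> 3 classes

def classify(target_semantic_features):
    """S801-S803: encode the target semantic features into a target
    semantic vector, then decode it into a target image category."""
    target_semantic_vector = target_semantic_features @ W_enc  # S802
    class_scores = target_semantic_vector @ W_dec              # S803
    return int(np.argmax(class_scores))

category = classify(rng.standard_normal(16))
```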
Referring to fig. 2, the embodiment of the application acquires a medical image in the medical field as a training image, inputs the training image into a feature extractor for feature extraction to obtain an image domain feature, an image semantic feature and an image instance feature, inputs the image domain feature into a domain encoder, inputs the image semantic feature into a semantic encoder, and inputs the image instance feature into an instance encoder. The domain encoder encodes the image domain feature to obtain an image domain vector, the semantic encoder encodes the image semantic feature to obtain an image semantic vector, and the instance encoder encodes the image instance feature to obtain an image instance vector. The image domain vector, the image semantic vector and the image instance vector are integrated to obtain a training image vector, and the training image vector is input to the original decoder ε_D for decoding to obtain the image prediction category. The training image vector is simultaneously input into the decoupling model, and the discriminator extracts from the image domain vector the image domain information characterizing which image domain the training image is derived from. Meanwhile, taking the image domain information as a screening condition, the image semantic vectors are screened through the similarity calculation layer to obtain reference semantic vectors, and the cosine included angle values between the reference semantic vectors and the image semantic vectors of the same image domain information are calculated as similarity data. With the similarity data as clustering parameters and the reference semantic vectors as clustering centers, the image semantic vectors are clustered by the clustering layer to obtain target vector sets, and then the semantic information of the reference semantic vectors in the target vector sets is acquired as the image semantic information.
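The screening-and-clustering stage described above can be sketched as a nearest-centre assignment under cosine similarity. The single-pass assignment is an assumption, as the embodiment does not name a specific clustering algorithm for the clustering layer.

```python
import numpy as np

def cosine(a, b):
    """Cosine of the included angle between two vectors (the similarity data)."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def cluster_semantic_vectors(semantic_vectors, reference_vectors):
    """Assign each image semantic vector to the reference semantic vector
    (clustering centre) with the largest cosine similarity, yielding the
    target vector sets."""
    assignments = []
    for v in semantic_vectors:
        sims = [cosine(v, c) for c in reference_vectors]
        assignments.append(int(np.argmax(sims)))
    return assignments

centres = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
vectors = [np.array([0.9, 0.1]), np.array([0.1, 0.9])]
sets = cluster_semantic_vectors(vectors, centres)
```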
The image instance vector is subjected to image reconstruction through the reconstruction layer to obtain an instance reference image; target detection is performed on the instance reference image through the target detection layer to obtain image target information; and with the image target information as a segmentation parameter, image instance segmentation processing is performed on the instance reference image through the instance segmentation layer to obtain image instance information. Loss calculation is performed on the image domain information to obtain an image domain loss value, on the image semantic information to obtain an image semantic loss value, and on the image instance information to obtain an image instance loss value. The image domain loss value, the image semantic loss value and the image instance loss value are combined to obtain decoupling loss data; loss calculation is performed according to the image reference category and the image prediction category to obtain classification loss data; and the classification loss data and the decoupling loss data are spliced to obtain target loss data. The target loss data is used to perform parameter adjustment on the semantic encoder in the original image classification model, so that the semantic encoder outputs discriminative image semantic vectors. The target semantic features are encoded through the target encoder to obtain discriminative target semantic vectors, and then the target semantic vectors are decoded through the target decoder to obtain the target image category, so that the target image classification is more accurate.
Therefore, by constructing a target image classification model that combines image domain information, image semantic information and image instance information with contrastive learning, clustering and instance separation methods, abnormal medical images can be accurately classified through the target image classification model, and the classification of normal medical images is also more accurate.
Referring to fig. 9, an embodiment of the present application further provides an image classification apparatus, which may implement the above image classification method, where the apparatus includes:
the feature acquisition module is used for acquiring training image features; wherein training image features includes: image domain features, image semantic features, and image instance features;
the feature input module is used for inputting training image features into a preset original image classification model; wherein, the original image classification model comprises: domain encoder, semantic encoder, instance encoder and decoder;
the encoding module is used for encoding the image domain features through a domain encoder to obtain image domain vectors, encoding the image semantic features through a semantic encoder to obtain image semantic vectors, and encoding the image instance features through an instance encoder to obtain image instance vectors;
the decoding module is used for decoding the image domain vector, the image semantic vector and the image instance vector through a decoder to obtain an image prediction category;
the decoupling module is used for inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information;
The loss calculation module is used for carrying out loss calculation according to a preset image reference category, an image prediction category and image decoupling information to obtain target loss data;
the parameter adjustment module is used for carrying out parameter adjustment on the original image classification model according to the target loss data to obtain a target image classification model;
the image classification module is used for inputting the acquired target image characteristics into the target image classification model to perform image classification processing to obtain target image categories.
The specific implementation of the image classification device is basically the same as the specific embodiment of the image classification method, and will not be described herein.
The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the image classification method when executing the computer program. The electronic equipment can be any intelligent terminal including a tablet personal computer, a vehicle-mounted computer and the like.
Referring to fig. 10, fig. 10 illustrates a hardware structure of an electronic device according to another embodiment, the electronic device includes:
the processor 1001 may be implemented by using a general-purpose CPU (central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc. to execute related programs to implement the technical solution provided by the embodiments of the present application;
the memory 1002 may be implemented in the form of read-only memory (Read Only Memory, ROM), static storage, dynamic storage, or random access memory (Random Access Memory, RAM). The memory 1002 may store an operating system and other application programs, and when the technical solutions provided in the embodiments of the present disclosure are implemented by software or firmware, relevant program codes are stored in the memory 1002, and the processor 1001 invokes the image classification method for executing the embodiments of the present disclosure;
an input/output interface 1003 for implementing information input and output;
the communication interface 1004 is configured to implement communication interaction between the present device and other devices, and may implement communication in a wired manner (e.g. USB, network cable, etc.), or may implement communication in a wireless manner (e.g. mobile network, WIFI, bluetooth, etc.);
a bus 1005 for transferring information between the various components of the device (e.g., the processor 1001, memory 1002, input/output interface 1003, and communication interface 1004);
wherein the processor 1001, the memory 1002, the input/output interface 1003, and the communication interface 1004 realize communication connection between each other inside the device through the bus 1005.
The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the image classification method when being executed by a processor.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
According to the image classification method and apparatus, the electronic device and the storage medium provided by the embodiments of the application, the image domain features, the image semantic features and the image instance features are acquired and input into a preset original image classification model. The image domain features are encoded by the domain encoder to obtain image domain vectors, the image semantic features are encoded by the semantic encoder to obtain image semantic vectors, and the image instance features are encoded by the instance encoder to obtain image instance vectors. The image domain vectors, the image semantic vectors and the image instance vectors are decoded by the original decoder to obtain image prediction categories, and are also input into a preset decoupling model for decoupling processing to obtain image decoupling information. Loss calculation is then performed according to the preset image reference categories, the image prediction categories and the image decoupling information to obtain target loss data, and parameter adjustment is performed on the original image classification model according to the target loss data to construct a target image classification model that classifies images more accurately. Finally, the acquired target image features are input into the target image classification model to obtain more accurate target image categories, for example for medical images.
The embodiments described in the embodiments of the present application are for more clearly describing the technical solutions of the embodiments of the present application, and do not constitute a limitation on the technical solutions provided by the embodiments of the present application, and those skilled in the art can know that, with the evolution of technology and the appearance of new application scenarios, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
It will be appreciated by persons skilled in the art that the embodiments of the application are not limited by the illustrations, and that more or fewer steps than those shown may be included, or certain steps may be combined, or different steps may be included.
The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.
The terms "first," "second," "third," "fourth," and the like in the description of the application and in the above figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present application, "at least one (item)" means one or more, and "a plurality" means two or more. "and/or" for describing the association relationship of the association object, the representation may have three relationships, for example, "a and/or B" may represent: only a, only B and both a and B are present, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", wherein a, b, c may be single or plural.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is merely a logical function division, and there may be another division manner in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, including multiple instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method of the various embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing a program.
The preferred embodiments of the present application have been described above with reference to the accompanying drawings, and are not thereby limiting the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.
Claims (10)
1. A method of classifying images, the method comprising:
acquiring training image characteristics; wherein the training image features include: image domain features, image semantic features, and image instance features;
inputting the training image features into a preset original image classification model; wherein the original image classification model comprises: domain encoder, semantic encoder, instance encoder, and original decoder;
the image domain features are encoded through the domain encoder to obtain image domain vectors, the image semantic features are encoded through the semantic encoder to obtain image semantic vectors, and the image instance features are encoded through the instance encoder to obtain image instance vectors;
decoding the image domain vector, the image semantic vector and the image instance vector through the original decoder to obtain an image prediction category;
inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information;
performing loss calculation according to a preset image reference category, the image prediction category and the image decoupling information to obtain target loss data;
Performing parameter adjustment on the original image classification model according to the target loss data to obtain a target image classification model;
and inputting the acquired target image characteristics into the target image classification model to perform image classification processing to obtain target image categories.
2. The method of claim 1, wherein the decoupling model comprises: a discriminator, a classifier and a parser; the image decoupling information includes: image domain information, image semantic information, and image instance information; inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information, wherein the method comprises the following steps of:
extracting domain information of the image domain vector through the discriminator to obtain the image domain information;
extracting semantic information from the image semantic vector through the classifier to obtain the image semantic information;
and extracting the instance information of the image instance vector through the analyzer to obtain the image instance information.
3. The method of claim 2, wherein the classifier comprises: the similarity calculation layer, the clustering layer and the semantic extraction layer; the extracting semantic information from the image semantic vector by the classifier to obtain image semantic information comprises:
Taking the image domain information as a screening condition, and screening the image semantic vector through the similarity calculation layer to obtain a reference semantic vector;
performing similarity measurement calculation on the image semantic vector and the reference semantic vector through the similarity calculation layer to obtain similarity data;
clustering the image semantic vectors through the clustering layer to obtain a target vector set; the reference semantic vector is used as a clustering center, and the similarity data is used as a clustering parameter;
and obtaining the structural information of the target vector set through the semantic extraction layer to obtain the image semantic information.
4. The method of claim 2, wherein the parser comprises: a reconstruction layer, a target detection layer, and an instance segmentation layer; the extracting, by the parser, the instance information of the image instance vector to obtain image instance information includes:
performing image reconstruction processing on the image instance vector through the reconstruction layer to obtain an instance reference image;
performing target detection on the example reference image through the target detection layer to obtain image target information;
And taking the image target information as a segmentation parameter, and carrying out image instance segmentation processing on the instance reference image through the instance segmentation layer to obtain image instance information.
5. The method according to claim 2, wherein the performing loss calculation according to the preset image reference category, the image prediction category and the image decoupling information to obtain target loss data specifically includes:
performing loss calculation according to the image reference category and the image prediction category to obtain classified loss data;
performing loss calculation on the image decoupling information to obtain decoupling loss data;
and performing data splicing on the classified loss data and the decoupling loss data to obtain the target loss data.
6. The method according to claim 5, wherein the performing loss calculation on the image decoupling information to obtain decoupling loss data includes:
performing loss calculation on the image domain information to obtain an image domain loss value;
performing loss calculation on the image semantic information to obtain an image semantic loss value;
performing loss calculation on the image instance information to obtain an image instance loss value;
and combining the image domain loss value, the image semantic loss value, and the image instance loss value to obtain the decoupling loss data.
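Claims 5 and 6 compose the target loss from a classification loss and three decoupling loss components, but leave the combination operation ("data splicing" / "data combination") open. One common realization is a weighted sum; the sketch below assumes that, and the weight names are illustrative, not from the claims.

```python
def decoupling_loss(domain_loss, semantic_loss, instance_loss,
                    weights=(1.0, 1.0, 1.0)):
    # Claim 6: combine the domain, semantic, and instance loss values
    # into the decoupling loss data (here: a weighted sum).
    w_d, w_s, w_i = weights
    return w_d * domain_loss + w_s * semantic_loss + w_i * instance_loss

def target_loss(classification_loss, domain_loss, semantic_loss,
                instance_loss, decouple_weight=0.1):
    # Claim 5: combine the classification loss with the decoupling
    # loss to obtain the target loss data used for parameter tuning.
    return classification_loss + decouple_weight * decoupling_loss(
        domain_loss, semantic_loss, instance_loss)
```

Keeping the decoupling term as a separately weighted addend lets its influence on the parameter adjustment of the classification model be tuned independently of the classification objective.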
7. The method of any one of claims 1 to 6, wherein the target image features comprise target semantic features, and the inputting the acquired target image features into the target image classification model for image classification processing to obtain a target image category comprises:
inputting the target image features into the target image classification model; wherein the target image classification model comprises: a target encoder and a target decoder;
encoding the target semantic features through the target encoder to obtain target semantic vectors;
and decoding the target semantic vector through the target decoder to obtain the target image category.
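The claim leaves the decoder's internals unspecified. One plausible realization of the decoding step, assuming the decoder scores the target semantic vector against learned per-class embeddings and returns the highest-scoring category (the names `semantic_vec` and `class_embeddings` are hypothetical):

```python
import numpy as np

def decode_category(semantic_vec, class_embeddings):
    """Score the target semantic vector against one embedding per
    category and return the index of the best-matching category."""
    scores = class_embeddings @ semantic_vec  # one score per category
    return int(np.argmax(scores))
```

In a trained model the class embeddings would be decoder parameters learned jointly with the encoders during the loss-driven parameter adjustment described in claims 5 and 6.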
8. An image classification apparatus, the apparatus comprising:
the feature acquisition module is used for acquiring training image features; wherein the training image features include: image domain features, image semantic features, and image instance features;
the feature input module is used for inputting the training image features into a preset original image classification model; wherein the original image classification model comprises: domain encoder, semantic encoder, instance encoder and decoder;
the encoding module is used for encoding the image domain features through the domain encoder to obtain image domain vectors, encoding the image semantic features through the semantic encoder to obtain image semantic vectors, and encoding the image instance features through the instance encoder to obtain image instance vectors;
the decoding module is used for decoding the image domain vector, the image semantic vector and the image instance vector through the decoder to obtain an image prediction category;
the decoupling module is used for inputting the image domain vector, the image semantic vector and the image instance vector into a preset decoupling model for decoupling processing to obtain image decoupling information;
the loss calculation module is used for carrying out loss calculation according to a preset image reference category, the image prediction category and the image decoupling information to obtain target loss data;
the parameter adjustment module is used for carrying out parameter adjustment on the original image classification model according to the target loss data to obtain a target image classification model;
and the image classification module is used for inputting the acquired target image characteristics into the target image classification model to perform image classification processing to obtain target image categories.
9. An electronic device, comprising a memory and a processor, wherein the memory stores a computer program and the processor implements the image classification method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the image classification method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310834475.XA CN116758355A (en) | 2023-07-07 | 2023-07-07 | Image classification method and device, electronic equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116758355A true CN116758355A (en) | 2023-09-15 |
Family
ID=87958906
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310834475.XA Pending CN116758355A (en) | 2023-07-07 | 2023-07-07 | Image classification method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116758355A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118053018A (en) * | 2024-01-23 | 2024-05-17 | 北京透彻未来科技有限公司 | Semantic segmentation model construction method based on pathology big model |
CN118197027A (en) * | 2024-05-15 | 2024-06-14 | 广东力创信息技术有限公司 | Unmanned aerial vehicle scheduling checking method based on pipeline early warning and related device |
CN118197027B (en) * | 2024-05-15 | 2024-07-26 | 广东力创信息技术有限公司 | Unmanned aerial vehicle scheduling checking method based on pipeline early warning and related device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hoang Ngan Le et al. | Robust hand detection and classification in vehicles and in the wild | |
CN111582342B (en) | Image identification method, device, equipment and readable storage medium | |
CN116758355A (en) | Image classification method and device, electronic equipment and storage medium | |
CN112182166A (en) | Text matching method and device, electronic equipment and storage medium | |
CN111954250B (en) | Lightweight Wi-Fi behavior sensing method and system | |
CN115239675A (en) | Training method of classification model, image classification method and device, equipment and medium | |
CN116129141B (en) | Medical data processing method, apparatus, device, medium and computer program product | |
CN114298997B (en) | Fake picture detection method, fake picture detection device and storage medium | |
Park et al. | Neurocartography: Scalable automatic visual summarization of concepts in deep neural networks | |
CN114998583B (en) | Image processing method, image processing apparatus, device, and storage medium | |
Viedma et al. | Relevant features for gender classification in NIR periocular images | |
CN113160987B (en) | Health state prediction method, apparatus, computer device and storage medium | |
CN114329004A (en) | Digital fingerprint generation method, digital fingerprint generation device, data push method, data push device and storage medium | |
CN110675312B (en) | Image data processing method, device, computer equipment and storage medium | |
CN116741396A (en) | Article classification method and device, electronic equipment and storage medium | |
CN112132026A (en) | Animal identification method and device | |
Fan et al. | [Retracted] Accurate Recognition and Simulation of 3D Visual Image of Aerobics Movement | |
CN116543798A (en) | Emotion recognition method and device based on multiple classifiers, electronic equipment and medium | |
CN115392474B (en) | Local perception graph representation learning method based on iterative optimization | |
CN116432648A (en) | Named entity recognition method and recognition device, electronic equipment and storage medium | |
CN116049434A (en) | Construction method and device of power construction safety knowledge graph and electronic equipment | |
CN117351382A (en) | Video object positioning method and device, storage medium and program product thereof | |
CN104778479B (en) | A kind of image classification method and system based on sparse coding extraction | |
CN114973285B (en) | Image processing method, device, equipment and medium | |
CN117633668A (en) | Multi-classification identification method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||