CN116152513A - Image classification method, image classification device and electronic equipment - Google Patents

Image classification method, image classification device and electronic equipment

Info

Publication number
CN116152513A
CN116152513A (application CN202210822389.2A)
Authority
CN
China
Prior art keywords
target
classification model
domain
image
image classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210822389.2A
Other languages
Chinese (zh)
Inventor
赵幸福
周迅溢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Xiaofei Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd filed Critical Mashang Xiaofei Finance Co Ltd
Priority to CN202210822389.2A
Publication of CN116152513A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning, using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image classification method, an image classification device and electronic equipment, belonging to the field of image classification. The image classification method provided by the application comprises the following steps: acquiring a target image, wherein the target image is an image to be classified belonging to a target domain; inputting the target image into a second image classification model, wherein the second image classification model is obtained by adjusting a first image classification model based on target domain feature information corresponding to the target domain, and the first image classification model is a trained image classification model used for classifying images belonging to a source domain; and outputting a classification result of the target image.

Description

Image classification method, image classification device and electronic equipment
Technical Field
The application belongs to the field of image classification, and particularly relates to an image classification method, an image classification device and electronic equipment.
Background
In the field of image classification, an image classification model is generally trained using images of a source domain as training samples; however, a limited set of training samples can hardly cover images of all styles. During application of the image classification model, if the trained model is used to identify images to be identified that belong to a target domain differing considerably from the source domain, the identification accuracy is low.
In order to solve the above-mentioned problems, in the related art, an image to be identified belonging to a target domain may be input into a domain migration model trained in advance, so as to convert the image to be identified of the target domain into an image to be identified of a source domain; and then inputting the images to be identified of the source domain into a pre-trained image classification model to realize classification of the images to be identified.
However, in the related art, the image to be identified needs to be preprocessed by the domain migration model before it can be classified by the image classification model, so the classification speed is low.
Disclosure of Invention
The embodiment of the application provides an image classification method, an image classification device and electronic equipment, which can solve the problem of low image classification speed in the related art.
In a first aspect, an embodiment of the present application provides an image classification method, including:
acquiring a target image, wherein the target image is an image to be classified belonging to a target domain;
inputting the target image to a second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain;
and outputting a classification result of the target image.
In a second aspect, an embodiment of the present application provides an image classification apparatus, including: the device comprises an acquisition module, an input module and an output module;
the acquisition module is used for acquiring a target image, wherein the target image is an image to be classified belonging to a target domain;
the input module is used for inputting the target image into the second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain;
and the output module is used for outputting the classification result of the target image.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the method according to the first aspect.
In the embodiment of the application, a target image is acquired, where the target image is an image to be classified belonging to a target domain; the target image is input into the second image classification model; the second image classification model is obtained by adjusting the first image classification model based on target domain feature information corresponding to the target domain; the first image classification model is a trained image classification model used for classifying images belonging to a source domain; and a classification result of the target image is output. Because the second image classification model is obtained by adjusting the first image classification model based on the target domain feature information corresponding to the target domain, it is suitable for classifying images belonging to the target domain, so the target image can be classified directly, without preprocessing by a domain migration model, which increases the image classification speed.
Drawings
FIG. 1 is a schematic flow chart of an image classification method in the related art;
FIG. 2 is a schematic flow chart of an image classification method provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart of another image classification method provided by an embodiment of the present application;
FIG. 4 is a schematic flow chart of another image classification method provided by an embodiment of the present application;
FIG. 5 is a schematic flow chart diagram of another image classification method provided by an embodiment of the present application;
FIG. 6 is a schematic flow chart diagram of another image classification method provided by an embodiment of the present application;
FIG. 7 is a schematic flow chart diagram of another image classification method provided by an embodiment of the present application;
FIG. 8 is a schematic flow chart of a training method of a target classification model provided in an embodiment of the present application;
FIG. 9 is a schematic flow chart of another training method for a target classification model provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of an image classification apparatus provided in an embodiment of the present application;
FIG. 11 is a schematic block diagram of an electronic device according to an embodiment of the present application;
FIG. 12 is a schematic hardware structure of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly described below with reference to the drawings in the embodiments of the present application. It is apparent that the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application fall within the scope of protection of the present application.
The terms "first", "second" and the like in the description and in the claims are used to distinguish between similar objects and do not necessarily describe a particular sequence or chronological order. It is to be understood that the data so used may be interchanged, where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described herein. Moreover, objects identified by "first", "second", etc. are generally of one type, and the number of objects is not limited; for example, the first object may be one or more. Furthermore, in the description and claims, "and/or" means at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The applicant notes that in the related art, as shown in fig. 1, in order to solve the problem that the image scene difference between the training of the image classification model and the application of the image classification model is large, a target image (an image to be identified of a target domain) is generally input into a domain migration model trained in advance, so as to convert the image to be identified of the target domain into an image of a source domain; and then inputting the image of the source domain into a pre-trained first image classification model to realize the prediction classification of the image to be identified. The first image classification model may be an image classification model obtained by training with an image sample of the source domain as a training sample, and further the first image classification model may be suitable for classifying images belonging to the source domain. Thus, the target image is classified by the domain migration model and the first image classification model in sequence, and the image classification speed is low.
Based on this, as shown in fig. 2, in the image classification method provided in the embodiment of the present application, the first image classification model may be adjusted by using the target domain feature information corresponding to the target domain, so as to obtain the second image classification model suitable for classifying the image belonging to the target domain. Therefore, the target image can be classified directly through the second image classification model, and compared with the related technology, the target image does not need to be preprocessed through the domain migration model, and the image classification speed is high.
The image classification method, the image classification device and the electronic equipment provided by the embodiment of the application are described in detail below through specific embodiments and application scenes thereof with reference to the accompanying drawings.
Fig. 3 is a schematic flow chart of an image classification method according to an embodiment of the present application.
As shown in fig. 3, the image classification method provided in the embodiment of the present application may include:
step 310: acquiring a target image, wherein the target image is an image to be classified belonging to a target domain;
step 320: inputting the target image to a second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain;
step 330: outputting a classification result of the target image.
In step 310, the target image is an image to be classified belonging to the target domain, and there is a style difference between the image belonging to the target domain and the image belonging to the source domain.
In step 320, the first image classification model may be an image classification model trained using image samples of the source domain as training samples, so the first image classification model is suitable for classifying images belonging to the source domain, but less suitable for classifying images belonging to the target domain. The second image classification model is an image classification model obtained by adjusting the first image classification model based on the target domain feature information corresponding to the target domain. The target domain feature information can represent features specific to the target domain and is used to distinguish the style difference between images belonging to the target domain and images belonging to the source domain; therefore, an image classification model obtained by adjusting the first image classification model based on the target domain feature information is suitable for classifying images belonging to the target domain.
In step 330, the embodiment of the present application may employ the second image classification model to classify the target image directly. Since the second image classification model is obtained by adjusting the first image classification model based on the target domain feature information corresponding to the target domain, it can be applied to the classification of images belonging to the target domain: the image to be classified belonging to the target domain can be input directly into the second image classification model, and its classification result output. Compared with the related art, the image to be classified does not need to be input into a domain migration model for preprocessing; it is classified directly by the second image classification model, so the image classification speed is high, which solves the problem of low image classification speed in the related art.
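The single-pass flow in this step can be sketched as follows. This is a minimal illustration only: the function names and the stub model below are assumptions, not APIs defined by this application.

```python
# Minimal sketch of the single-pass inference flow described above.
# classify_target_image and make_second_model are illustrative names,
# not defined by this application.

def make_second_model():
    """Stand-in for the adjusted (second) image classification model:
    a stub that maps a feature sum to one of two labels."""
    def second_model(image):
        return "class_A" if sum(image) >= 0 else "class_B"
    return second_model

def classify_target_image(image, model):
    """Classify a target-domain image directly; no separate
    domain-migration preprocessing pass is required."""
    return model(image)

model = make_second_model()
result = classify_target_image([0.2, -0.1, 0.4], model)
print(result)  # class_A
```

Note the contrast with the related art: there is no call to a domain migration model before classification, which is why the single-model path is faster.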
According to the image classification method provided by the embodiment of the application, a target image is acquired, where the target image is an image to be classified belonging to a target domain; the target image is input into the second image classification model; the second image classification model is obtained by adjusting the first image classification model based on target domain feature information corresponding to the target domain; the first image classification model is a trained image classification model used for classifying images belonging to a source domain; and a classification result of the target image is output. Because the second image classification model is obtained by adjusting the first image classification model based on the target domain feature information corresponding to the target domain, it is suitable for classifying images belonging to the target domain, so the target image can be classified directly and quickly, without domain migration preprocessing.
In a specific embodiment, in order to increase the classification speed of the target image, before the target image is acquired for classification, the embodiment of the application may be trained in advance to obtain the second image classification model, so that the target image is classified by directly using the second image classification model. The following is an example.
Fig. 4 is a schematic flow chart of another image classification method provided in an embodiment of the present application.
As shown in fig. 4, the image classification method provided in the embodiment of the present application may include:
step 410: acquiring a first image classification model;
the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain;
step 420: acquiring target domain feature information corresponding to a target domain, wherein the target domain feature information is used for representing the characteristic of the target domain;
step 430: adjusting the first image classification model based on the target domain characteristic information to obtain a second image classification model, wherein the second image classification model is suitable for classifying images belonging to the target domain;
step 440: acquiring a target image, wherein the target image is an image to be classified belonging to the target domain;
step 450: inputting the target image to the second image classification model;
step 460: and outputting a classification result of the target image.
Wherein steps 410 through 430 may be performed before step 440.
Wherein step 440 may refer to the specific content of step 310, step 450 may refer to the specific content of step 320, and step 460 may refer to the specific content of step 330.
In step 410, the first image classification model may be an image classification model trained using the image samples of the source domain as training samples, and thus the first image classification model may be applicable to classification of images belonging to the source domain. The first image classification model may be a neural network learning model.
The structure of the first image classification model may consist of five convolution layers and a fully connected layer used for classification prediction, as shown in fig. 7; it may instead adopt a VGGNet (Visual Geometry Group Network) structure, a ResNet (Residual Network) structure, or the like, which is not particularly limited in this application.
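As a rough illustration of the five-convolution-layer structure mentioned above, the sketch below propagates an input image's spatial size through five convolution layers to the size seen by the fully connected layer. The kernel/stride/padding values and the 224x224 input are assumptions for illustration, not taken from the application.

```python
# Illustrative sketch: spatial-size propagation through five convolution
# layers, whose output feeds a fully connected classification layer.
# Layer parameters are assumed for illustration, not specified by the application.

def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of one convolution layer."""
    return (size + 2 * padding - kernel) // stride + 1

def backbone_output_size(input_size, layers):
    """Propagate an input's spatial size through (kernel, stride, padding) layers."""
    size = input_size
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
    return size

# Five convolution layers, e.g. for a 224x224 input image.
five_conv_layers = [(7, 2, 3), (3, 2, 1), (3, 2, 1), (3, 2, 1), (3, 2, 1)]
feature_map_size = backbone_output_size(224, five_conv_layers)
print(feature_map_size)  # 7: the fully connected layer then maps the
                         # 7 x 7 x channels feature map to class scores
```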
It can be understood that, in practical applications, in order to alleviate the problem of insufficient generalization ability of the image classification model, when training samples for the first image classification model are collected, as many training samples of as many types as possible are generally collected, so that the training samples match the actual usage distribution as closely as possible. In reality, however, images are so varied that the training samples can hardly cover all distributions, so there is a gap between the data distribution of the training-sample images (corresponding to the source domain) and the data distribution of the images to be identified (corresponding to the target domain), resulting in low accuracy when the first image classification model identifies images to be identified from the target domain. In other words, the first image classification model is less suitable for classifying images belonging to the target domain.
In step 420, the target domain feature information corresponding to the target domain is used to distinguish a style difference between the image belonging to the target domain and the image belonging to the source domain, and the target domain feature information may be used to characterize a feature specific to the target domain. The domain-specific features may be features predefined according to actual usage requirements.
For example, the domain-specific features may include at least one of exposure, the proportion of the image occupied by the object, and color style. The target domain feature information may be, for instance, high exposure, an object-to-image ratio lower than 50%, and black-and-white; the source domain feature information may be low exposure, an object-to-image ratio higher than 70%, and color. The target domain feature information can thus distinguish the style difference between an image belonging to the target domain and an image belonging to the source domain.
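The example style features above can be sketched as simple per-domain records; the field names and values below are a hypothetical encoding for illustration, not a format defined by the application.

```python
# Hypothetical encoding of the example domain-specific features above.
# Field names and values are assumptions for illustration only.
target_domain_info = {"exposure": "high", "object_ratio": 0.4, "color": "black_and_white"}
source_domain_info = {"exposure": "low", "object_ratio": 0.8, "color": "color"}

def style_differences(a, b):
    """Return the features on which two domains differ in style."""
    return sorted(k for k in a if a[k] != b[k])

diff = style_differences(target_domain_info, source_domain_info)
print(diff)  # ['color', 'exposure', 'object_ratio']
```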
In step 420, the manner in which the target domain feature information is obtained is not limited. For example, the target domain feature information may be target domain feature information extracted from an image belonging to a target domain in real time, or target domain feature information received from another terminal and prepared in advance, and the present application is not particularly limited.
In step 430, the first image classification model is adjusted based on the target domain feature information to obtain a second image classification model, so that the second image classification model is suitable for classifying images belonging to the target domain. Wherein the second image classification model may be a neural network learning model.
In steps 440 to 460, after the second image classification model is obtained, a target image (an image to be classified belonging to the target domain) may be input into the second image classification model, and a classification result of the target image output. It can be understood that the image to be classified belonging to the target domain can be input directly into the second image classification model and its classification result output. Compared with the related art, the image to be classified does not need to be input into a domain migration model for preprocessing; it is classified directly by the second image classification model, so the image classification speed is high, which solves the problem of low image classification speed in the related art.
According to the image classification method provided by the embodiment of the application, a first image classification model is obtained, wherein the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain; acquiring target domain feature information corresponding to a target domain, wherein the target domain feature information is used for representing the characteristic of the target domain; adjusting the first image classification model based on the target domain characteristic information to obtain a second image classification model, wherein the second image classification model is suitable for classifying images belonging to the target domain; acquiring a target image, wherein the target image is an image to be classified belonging to the target domain; inputting the target image to the second image classification model; and outputting a classification result of the target image. Therefore, the first image classification model is adjusted based on the target domain characteristic information corresponding to the target domain, the obtained second image classification model can be suitable for classifying images belonging to the target domain, compared with the related art, the image to be classified in the target domain does not need to be input into the domain migration model for preprocessing, the image to be classified in the target domain is directly classified through the second image classification model, the image classification speed is high, and the problem that the image classification speed is low in the related art is solved.
Fig. 5 may be a schematic flow chart of an image classification method provided in another embodiment of the present application. FIG. 5 may further define a process for adjusting the first image classification model based on the target domain feature information, based on the embodiment shown in FIG. 4.
In a specific embodiment, in order to make the second image classification model more suitable for classifying the image belonging to the target domain, as shown in fig. 5, in step 430, the adjusting the first image classification model based on the target domain feature information to obtain the second image classification model includes:
step 4301: acquiring source domain feature information corresponding to the source domain in the first image classification model, wherein the source domain feature information is used for representing the characteristic of the source domain;
step 4302: replacing the source domain feature information in the first image classification model with the target domain feature information to obtain an adjusted image classification model;
step 4303: and obtaining a second image classification model based on the adjusted image classification model.
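The three steps above can be sketched as follows. The dict-based model layout and all names here are assumptions for illustration; the application does not define a concrete model format.

```python
# Sketch of steps 4301-4303: represent the model as a dict holding its
# domain feature vectors plus its other (unchanged) parameters.
# The layout and names are illustrative assumptions.

def adjust_model(first_model, target_feature_vectors):
    """Step 4302: replace the source-domain feature vectors with the
    target-domain ones; all other parameters are left untouched."""
    adjusted = dict(first_model)
    adjusted["domain_feature_vectors"] = [list(v) for v in target_feature_vectors]
    return adjusted

first_model = {
    "domain_feature_vectors": [[0.9, 0.1], [0.8, 0.2]],  # source domain info (N = 2)
    "classifier_weights": [[0.5, -0.5]],                 # unchanged by the adjustment
}
target_vectors = [[0.1, 0.9], [0.2, 0.8]]                # target domain info (step 420)

second_model = adjust_model(first_model, target_vectors)
print(second_model["domain_feature_vectors"])  # [[0.1, 0.9], [0.2, 0.8]]
```

Only the domain feature information changes; the classification parameters learned on the source domain are reused as-is, which is what makes the adjustment cheap.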
In step 4301, since the source domain feature information may be used to characterize the source domain-specific features, a style feature specific to the source domain image may be derived based on the source domain feature information. For images belonging to a source domain and a target domain, source domain feature information of the image belonging to the source domain may be different from target domain feature information of the image belonging to the target domain. Further, there may often be a large style difference between the image belonging to the source domain and the image belonging to the target domain.
For example, domain-specific features may include information in terms of image dimensions. The source domain feature information may be feature information of a three-dimensional image. The target domain feature information may be feature information of a two-dimensional image. It can be seen that there may be a large style difference between the image of the target domain and the image of the source domain.
In step 4301, the manner in which the source domain characteristic information is acquired is not limited. For example, the source domain feature information may be source domain feature information extracted from the first image classification model in real time, source domain feature information extracted from the first image classification model in advance, source domain feature information extracted from a training image sample of the first image classification model after processing the training image sample by using a trained domain classification model, or the like, which is not particularly limited in the present application.
In step 4302, after replacing the source domain feature information in the first image classification model with the target domain feature information, the resulting adjusted image classification model may be applied to classification of images belonging to the target domain. It will be appreciated that since the source domain feature information in the first image classification model is replaced, the adjusted image classification model (or the second image classification model) may be changed from being applicable to the classification of images belonging to the source domain to being applicable to the classification of images belonging to the target domain.
In step 4303, based on the adjusted image classification model, deriving a second image classification model may include: and taking the adjusted image classification model as a second image classification model. In this way, the difficulty in acquiring the second image classification model is reduced on the basis that the second image classification model can be applied to classification of images belonging to the target domain.
Alternatively, in step 4303, obtaining the second image classification model based on the adjusted image classification model may include: training the adjusted image classification model using image samples belonging to the target domain to obtain the second image classification model. It can be understood that, because the target domain feature information obtained in practical applications may not represent the target domain's characteristic features well, after the adjusted image classification model is obtained, the application may further train it using a small number of image samples belonging to the target domain as training samples to obtain the second image classification model, thereby further improving the accuracy with which the second image classification model identifies the target image.
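The optional fine-tuning in this variant of step 4303 can be sketched as a few mistake-driven updates on a small set of labelled target-domain samples. The perceptron-style linear model, labels, and learning rate below are illustrative assumptions, not the application's actual training procedure.

```python
# Sketch: fine-tune classifier weights on a small number of labelled
# target-domain samples (labels are +1 or -1). The model and
# hyperparameters are assumptions for illustration only.

def fine_tune(weights, samples, lr=0.1, epochs=5):
    """Perceptron-style updates: adjust weights only on misclassified samples."""
    w = list(weights)
    for _ in range(epochs):
        for features, label in samples:
            score = sum(wi * x for wi, x in zip(w, features))
            pred = 1 if score >= 0 else -1
            if pred != label:
                w = [wi + lr * label * x for wi, x in zip(w, features)]
    return w

target_samples = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]  # small target-domain set
w = fine_tune([0.0, 0.0], target_samples)
score = sum(wi * x for wi, x in zip(w, [0.0, 1.0]))
print(1 if score >= 0 else -1)  # -1: the second sample is now classified correctly
```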
It should be appreciated that the process of step 430 is not limited to the operations listed above. In fact, in the embodiments of the present application, any process that adjusts the first image classification model based on the target domain feature information to obtain a second image classification model suitable for classifying target-domain images may be used; the replacement operation described above is only one example. For instance, in some cases, instead of replacing the source domain feature information, it may merely be faded, for example by reducing its weight or importance ratio to a lower value, while the target domain feature information is strengthened, for example by increasing its weight or importance ratio to a higher value. In the embodiments of the present application, "higher" and "lower" are merely relative concepts: "higher" may be, for example, more than 50% and less than 99%, and "lower" more than 1% and less than 50%.
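The fading alternative just described can be sketched as a weighted blend of the two feature vectors. The 0.9/0.1 split below is an arbitrary assumption within the relative ranges given above.

```python
# Sketch of fading rather than replacing: down-weight the source-domain
# feature vector and up-weight the target-domain one. The weights are
# illustrative assumptions within the relative ranges described above.

def blend_features(source_vec, target_vec, target_weight=0.9):
    """Combine feature vectors with a higher weight on the target domain."""
    source_weight = 1.0 - target_weight
    return [source_weight * s + target_weight * t
            for s, t in zip(source_vec, target_vec)]

blended = blend_features([1.0, 0.0], [0.0, 1.0], target_weight=0.9)
print(blended)  # roughly [0.1, 0.9], up to floating-point rounding
```

With target_weight=1.0 this reduces to the replacement operation of step 4302, so replacement can be seen as the limiting case of fading.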
In the image classification method provided by the embodiment of the application, the source domain characteristic information corresponding to the source domain in the first image classification model is obtained, and the source domain characteristic information is used for representing the characteristic of the source domain; replacing the source domain feature information in the first image classification model with the target domain feature information to obtain an adjusted image classification model; and obtaining a second image classification model based on the adjusted image classification model. In this way, the source domain feature information in the first image classification model is replaced by the target domain feature information, so that the second image classification model can be better suitable for classifying the image belonging to the target domain.
In a specific embodiment, the source domain feature information in the first image classification model may be in the form of feature vectors. That is, in step 4301, the source domain feature information may include N specified feature vectors and the target domain feature information may include N target feature vectors, where N is an integer and N ≥ 1. For example, N is 1, 2, or 3, etc.; the present application is not particularly limited. It should be understood that in the embodiments of the present application, source domain feature information in the form of feature vectors is only an example. In some cases, the source domain feature information may take other forms, such as custom strings. The embodiments of the present application do not specifically limit the forms of the source domain feature information and the target domain feature information; the emphasis is that the source domain feature information represents features specific to the source domain, and the target domain feature information represents features specific to the target domain.
Accordingly, in the step 4302, the replacing the source domain feature information in the first image classification model with the target domain feature information to obtain an adjusted image classification model includes:
and replacing N appointed characteristic vectors in the first image classification model with the N target characteristic vectors to obtain an adjusted image classification model.
In the embodiments of the present application, the number of target feature vectors may be determined according to the number of specified feature vectors contained in the source domain feature information. Specifically, the number of specified feature vectors contained in the source domain feature information can be determined to be N, and based on this number, N target feature vectors corresponding to the N specified feature vectors are obtained as the target domain feature information. In this way, all specified feature vectors contained in the source domain feature information are replaced with target feature vectors one by one, avoiding the problem that unreplaced source domain feature information remaining in the adjusted image classification model would affect the accuracy of the second image classification model.
In practical applications, the embodiments of the present application may replace the N specified feature vectors in the first image classification model in combination with the specific structure of the first image classification model. For example, the first image classification model may include N specified convolution layers and a first prediction layer; the ith specified feature vector among the N specified feature vectors corresponds to the ith specified convolution layer among the N specified convolution layers, where i is an integer and N ≥ i ≥ 1.
It can be appreciated that the first image classification model may include L convolution layers (L greater than N): the first L-N non-specified convolution layers near the input end are typically used to extract local feature information of the input image, while the last N specified convolution layers near the first prediction layer are typically used to extract global feature information of the input image, and domain feature information is generally associated with the overall feature information of the image. The first prediction layer may be a fully connected layer or another output layer for classification prediction, which is not particularly limited in this application.
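A minimal sketch of this split, assuming the model's L layers are held in an ordered list from input to prediction layer (the function name is illustrative, not from the patent):

```python
def split_layers(layers, n_specified):
    """Split an ordered list of L convolution layers into the first
    L-N non-specified layers (local features) and the last N specified
    layers nearest the prediction layer (global / domain features)."""
    return layers[:-n_specified], layers[-n_specified:]
```

For L = 5 and N = 3 as in fig. 7, the first two layers are non-specified and the last three are the specified convolution layers.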
In the step 420, the obtaining the target domain feature information corresponding to the target domain includes: acquiring target domain characteristic information corresponding to a target domain by using a target classification model, wherein the target classification model is a trained domain classification model;
wherein the object classification model comprises: the N target convolution layers are connected with the second prediction layer; the ith target convolutional layer in the N target convolutional layers corresponds to the ith appointed convolutional layer in the N appointed convolutional layers; and outputting an ith target feature vector in the N target feature vectors by the ith target convolution layer.
It can be appreciated that the target classification model may include J convolution layers (J greater than N) and a second prediction layer: the first J-N non-target convolution layers near the input end are typically used to extract local feature information of the input image, while the last N target convolution layers near the second prediction layer are typically used to extract global feature information of the input image, and domain feature information is generally related to the overall features of the image. In the case that the target classification model is a trained domain classification model, the N target feature vectors output by the N target convolution layers may be used as the target domain feature information corresponding to the target domain. The second prediction layer may be a fully connected layer or another output layer for classification prediction, which is not particularly limited in this application.
Accordingly, the replacing the N specified feature vectors in the first image classification model with the N target feature vectors includes:
for each of the N specified feature vectors, performing the following replacement operation: correspondingly replacing the ith specified feature vector among the N specified feature vectors with the ith target feature vector among the N target feature vectors, where i is an integer and 1 ≤ i ≤ N.
Wherein, in the case of N = 1, the sum of the output of the first specified convolution layer and the first target feature vector is taken as the input of the first prediction layer;
in the case of N ≥ 2 and 1 ≤ q ≤ N-1 (q an integer), the sum of the output of the q-th specified convolution layer and the q-th target feature vector is taken as the input of the (q+1)-th specified convolution layer;
in the case of N ≥ 2 and q = N, the sum of the output of the q-th specified convolution layer and the q-th target feature vector is taken as the input of the first prediction layer.
In this way, according to the correspondence between the target convolution layer in the target classification model and the specified convolution layer in the first image classification model, the ith specified feature vector in the N specified feature vectors can be replaced with the ith target feature vector in the N target feature vectors in an orderly manner.
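The ordered replacement and the q-indexed summation rules above can be sketched abstractly by treating each layer as a plain callable. This is an illustrative outline under that assumption, not the patent's implementation; all names are hypothetical:

```python
def adjusted_forward(x, plain_layers, specified_layers, target_vecs, predict):
    """Forward pass of the adjusted model: each of the N specified
    convolution layers has its corresponding target feature vector added
    to its output, and the sum feeds the next specified layer (or, for
    the last specified layer, the first prediction layer)."""
    for layer in plain_layers:                 # first L-N non-specified layers
        x = layer(x)
    for layer, target_vec in zip(specified_layers, target_vecs):
        x = layer(x) + target_vec              # sum per the q-indexed rules
    return predict(x)
```

The q = N case falls out naturally: after the loop, the last sum is what reaches the prediction callable.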
For example, as shown in fig. 7, L is 5, J is 5, and N is 3. The N specified convolution layers may be the 3 specified convolution layers in the image classification model adjacent to the first fully connected layer, and the N target convolutional layers may be the 3 target convolutional layers in the domain classification model close to the second fully connected layer. It should be understood that the values of L and N are only examples: in the embodiments of the present application, L may take values other than 5 (for example, 4, 3, or 6), and N may take values other than 3 (for example, 2, 1, or 4).
As shown in fig. 7, i may be 1, 2 or 3, for example, and the adjustment process for the first image classification model includes: correspondingly replacing the 1 st appointed characteristic vector in the 3 appointed characteristic vectors with the 1 st target characteristic vector in the 3 target characteristic vectors; correspondingly replacing the 2 nd appointed characteristic vector in the 3 appointed characteristic vectors with the 2 nd target characteristic vector in the 3 target characteristic vectors; correspondingly replacing the 3 rd appointed characteristic vector in the 3 appointed characteristic vectors with the 3 rd target characteristic vector in the 3 target characteristic vectors to obtain an adjusted image classification model, and further obtaining a second image classification model based on the adjusted image classification model;
Further, as shown in fig. 7, when the second image classification model is subsequently used to classify the target image, the target image is input to the second image classification model, where the target image is an image to be identified belonging to the target domain and is represented, for example, by data with length × height × channels of 256×256×3. The target image is subjected to dimension reduction by the first convolution layer of the second image classification model, which outputs a feature vector representing a 128×128×64 image; the second convolution layer then performs dimension reduction and outputs a feature vector representing a 64×64×128 image; the first specified convolution layer (i.e., the third convolution layer) then performs dimension reduction and outputs a feature vector representing a 32×32×256 image. The sum of the feature vector output by the first specified convolution layer and the 1st target feature vector is input to the second specified convolution layer (i.e., the fourth convolution layer) for dimension reduction, which outputs a feature vector representing a 16×16×512 image; the sum of the feature vector output by the second specified convolution layer and the 2nd target feature vector is input to the third specified convolution layer (i.e., the fifth convolution layer) for dimension reduction, which outputs a feature vector representing an 8×8×1024 image; and the sum of the feature vector output by the third specified convolution layer and the 3rd target feature vector is input to the first fully connected layer of the second image classification model for classification processing, which outputs the classification result of the target image.
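The shape progression quoted in this example can be traced with a small helper, assuming (as the quoted numbers suggest) that each of the 5 convolution layers halves the spatial size and outputs the next channel width; the function name is hypothetical:

```python
def dim_flow(size=256, widths=(64, 128, 256, 512, 1024)):
    """Trace (size, size, channels) shapes through the 5 convolution
    layers: each layer halves the spatial size and moves to the next
    channel width, starting from a size x size input image."""
    shapes = []
    for channels in widths:
        size //= 2
        shapes.append((size, size, channels))
    return shapes
```

Starting from a 256×256 input this reproduces the sequence 128, 64, 32, 16, 8 for the spatial size, matching the example's last feature map of 8×8×1024.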
Fig. 6 is a schematic flow chart of an image classification method according to another embodiment of the present application. Fig. 6 may further define a process of acquiring the target domain feature information on the basis of the embodiment shown in fig. 4.
The process of acquiring the target domain feature information using the target classification model is specifically described below. As shown in fig. 6, in the step 420, obtaining the target domain feature information corresponding to the target domain may include:
step 4201: obtaining a target classification model, wherein the target classification model is a trained domain classification model;
step 4202: acquiring M images belonging to a target domain;
step 4203: inputting the M images to the target classification model;
step 4204: processing the M images through the target classification model to obtain target domain characteristic information corresponding to the target domain; wherein M is a positive integer.
The target classification model is a trained domain classification model, and the domain classification model is suitable for performing domain classification on images belonging to various different domains. For example, the domain classification model may consist of 5 convolution layers and a fully connected layer for classification prediction as shown in fig. 7, or may adopt a VGGNet structure, a ResNet structure, etc., which is not particularly limited in this application.
In this way, the embodiment of the application can extract the target domain characteristic information corresponding to the target domain from the image belonging to the target domain by adopting the target classification model so as to accurately represent the information specific to the target domain.
In the embodiments of the present application, in order to obtain more comprehensive target domain feature information, the target domain feature information may include one or more target feature vectors, depending on the total number of convolution layers of the target classification model. In step 4204, processing the M images through the target classification model to obtain the target domain feature information corresponding to the target domain includes:
processing the M images through the target classification model to obtain N target feature vectors, wherein N is an integer and is more than or equal to 1;
and taking the N target feature vectors as target domain feature information corresponding to the target domain.
For example, if the target classification model includes 5 convolution layers and a prediction layer for classification prediction, the feature vectors output by 1 to 3 of the convolution layers nearest the prediction layer may be taken as target feature vectors. If the target classification model includes 500 convolution layers and a prediction layer for classification prediction, the feature vectors output by 1 to 300 of the convolution layers nearest the prediction layer may be taken as target feature vectors.
In this way, one or more target feature vectors can be flexibly obtained as the target domain feature information, depending on the total number of convolution layers of the target classification model.
For example, as shown in fig. 7, feature vectors output by 3 convolution layers near the second prediction layer in the domain classification model may be used as target feature vectors.
In a particular embodiment, one of the N target feature vectors may correspond to output by one of the target convolutional layers in the target classification model. Specifically, the object classification model may include: the N target convolution layers are connected with the second prediction layer;
processing the M images through the target classification model to obtain N target feature vectors, wherein the N target feature vectors comprise:
processing M target inputs corresponding to the M images through the ith target convolution layer in the target classification model to obtain the ith target feature vector among the N target feature vectors, where i is an integer and N ≥ i ≥ 1.
As shown in fig. 7, N is, for example, 3, and the target classification model may include 3 target convolutional layers.
And when i is 1, processing M target inputs corresponding to the M images through a 1 st target convolution layer in the target classification model to obtain a 1 st target feature vector in the 3 target feature vectors.
And when i is 2, processing M target inputs corresponding to the M images through a 2 nd target convolution layer in the target classification model to obtain a 2 nd target feature vector in the 3 target feature vectors.
And when i is 3, processing M target inputs corresponding to the M images through a 3 rd target convolution layer in the target classification model to obtain a 3 rd target feature vector in the 3 target feature vectors.
Thus, one target feature vector can be correspondingly output through one target convolution layer in the target classification model, and N target feature vectors can be correspondingly output through N target convolution layers in the target classification model.
In a specific embodiment, the number M of images input to the target classification model is an integer greater than or equal to 1. The larger M, the more stable the obtained target feature vector. The following is an example.
In the case where M is equal to 1, as shown in fig. 7, an image belonging to the target domain is input to the domain classification model; the target domain image may be represented by data with length × height × channels of 256×256×3. The target domain image is subjected to dimension reduction by the first convolution layer, which outputs a feature vector representing a 128×128×64 image; the second convolution layer then performs dimension reduction and outputs a feature vector representing a 64×64×128 image; the first target convolution layer (i.e., the third convolution layer) then performs dimension reduction and outputs a feature vector representing a 32×32×256 image; the feature vector output by the first target convolution layer is input to the second target convolution layer (i.e., the fourth convolution layer) for dimension reduction, which outputs a feature vector representing a 16×16×512 image; the feature vector output by the second target convolution layer is input to the third target convolution layer (i.e., the fifth convolution layer) for dimension reduction, which outputs a feature vector representing an 8×8×1024 image; and the feature vector output by the third target convolution layer is input to the second fully connected layer for classification processing, which outputs the domain classification result.
In the embodiments of the present application, the domain classification result output by the domain classification model may not be used; instead, the feature vectors output by the 3 convolution layers (among the 5 convolution layers of the domain classification model) close to the second fully connected layer are obtained. Thus, processing the M target inputs corresponding to the M images through the ith target convolution layer in the target classification model to obtain the ith target feature vector among the N target feature vectors may include: taking the feature vector output by the first target convolution layer as the 1st target feature vector, the feature vector output by the second target convolution layer as the 2nd target feature vector, and the feature vector output by the third target convolution layer as the 3rd target feature vector.
And when M is greater than 1, the processing, by the ith target convolution layer in the target classification model, the M target inputs corresponding to the M images, to obtain an ith target feature vector in the N target feature vectors includes:
respectively processing M target inputs corresponding to the M images through the ith target convolution layer, and extracting M domain feature vectors, wherein one domain feature vector in the M domain feature vectors corresponds to one target input in the M target inputs;
And calculating the average value of the M domain feature vectors, and taking the average value of the M domain feature vectors as the ith target feature vector in the N target feature vectors, wherein M is more than or equal to 2.
Therefore, each target feature vector in the N target feature vectors is obtained by carrying out average calculation on M domain feature vectors, so that the stability is good, and the specific information of the target domain can be accurately represented.
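The averaging step for one target convolution layer can be sketched as follows (the function name is illustrative; the M per-image domain feature vectors are assumed to be equal-length numeric vectors):

```python
import numpy as np

def target_feature_vector(domain_vecs):
    """Average the M per-image domain feature vectors extracted from one
    target convolution layer into a single, more stable target feature
    vector for the target domain."""
    return np.mean(np.asarray(domain_vecs), axis=0)
```

Repeating this for each of the N target convolution layers yields the N target feature vectors used as the target domain feature information.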
As shown in fig. 7, in the case where N is equal to 3 and M is equal to 10, for example, the operation of inputting an image belonging to the target domain to the domain classification model may be performed 10 times, and then the average of the 10 domain feature vectors output by the first target convolutional layer is taken as the 1st target feature vector, the average of the 10 domain feature vectors output by the second target convolutional layer is taken as the 2nd target feature vector, and the average of the 10 domain feature vectors output by the third target convolutional layer is taken as the 3rd target feature vector. Since each of the 3 target feature vectors is obtained by averaging, stability is good, and the information specific to the target domain can be accurately represented. It should be understood that, in the embodiments of the present application, instead of directly taking the average of the plurality of domain feature vectors, specific domain feature vectors that meet a set condition may be selected; depending on the condition, there may be one or several such vectors, and where there are several, they may be averaged. In this way, domain feature vectors can be selected as required, better meeting continuously changing requirements.
In another specific embodiment, in order to enable the target classification model to have a better domain classification capability, and ensure that domain features of different domains are distinguished on the premise that the image class is unchanged, as shown in fig. 8, the training process of the target classification model may include:
step 810: obtaining K training data sets, wherein one training data set in the K training data sets corresponds to one image type, and the image types corresponding to all training data sets in the K training data sets are different; each training data set comprises a plurality of images belonging to different domains, wherein K is an integer and is more than or equal to 2;
step 820: acquiring an initial domain classification model;
step 830: and training the initial domain classification model through the K training data sets to obtain the target classification model.
In this way, since the images in one training data set belong to different domains but share the same image category, training the initial domain classification model with such a training data set treats each domain as one class for domain classification. Because the image categories within one training data set are identical, the training is not disturbed by image-category differences and can learn the inter-domain differences, so that the inter-domain differences (i.e., the domain feature information) can conveniently be extracted later using the target classification model.
For example, in the embodiment of the present application, K training data sets may be sequentially used to train the initial domain classification model. As shown in fig. 9, in step 830, the training the initial domain classification model through the K training data sets to obtain the target classification model includes:
step 8301: training the initial domain classification model through a first training data set of the K training data sets to obtain a first domain classification model;
step 8302: training the s-1 domain classification model through the s-th training data set in the K training data sets to obtain an s-domain classification model, wherein s is a positive integer, and K is more than or equal to s is more than or equal to 2;
step 8303: and under the condition of s=K, obtaining a K domain classification model, and taking the K domain classification model as the target classification model.
For example, in the case where K is equal to 3, training the initial domain classification model by a first training data set of the 3 training data sets to obtain a first domain classification model; training the first domain classification model through a second training data set of the 3 training data sets to obtain a second domain classification model; training the second domain classification model through a third training data set in the 3 training data sets to obtain a third domain classification model; and taking the third domain classification model as the target classification model.
In this way, K training data sets are adopted to train the initial domain classification model successively, and the more the number of the training data sets is, the more accurately the trained target classification model can learn the inter-domain difference.
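The successive training of steps 8301-8303 reduces to a simple fold over the K training data sets. A schematic sketch, where `train_step` is a hypothetical callable standing in for one full training pass (model names and signatures are illustrative):

```python
def train_sequentially(initial_model, datasets, train_step):
    """Steps 8301-8303: train the initial model on the 1st data set to get
    the 1st domain classification model, then refine the (s-1)-th model on
    the s-th data set; the K-th model becomes the target classification
    model."""
    model = initial_model
    for dataset in datasets:            # datasets[0] .. datasets[K-1]
        model = train_step(model, dataset)
    return model
```

Each iteration plays the role of step 8302, and the value returned after the last iteration is the K-th domain classification model of step 8303.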
Specifically, when training the target classification model with a single training data set, in step 8302, the s-th training data set includes P images belonging to different domains, where P is a positive integer and P ≥ 2;
in the step 8302, the training the s-1 th domain classification model according to the s-th training data set in the K training data sets to obtain an s-th domain classification model may include:
inputting a first image among the P images into the s-1 th domain classification model to obtain a first predicted value result corresponding to the first image; and obtaining a first contrast loss value based on the first predicted value result and a first truth result, wherein the first truth result is obtained based on the first image;
inputting a j-th image among the P images into the s-1 th domain classification model to obtain a j-th predicted value result corresponding to the j-th image; and obtaining a j-th contrast loss value based on the j-th predicted value result and a j-th truth result, wherein the j-th truth result is obtained based on the j-th image, j is a positive integer, and P ≥ j ≥ 2;
and adjusting parameters of the s-1 th domain classification model based on the first contrast loss value, ..., and the j-th contrast loss value to obtain the s-th domain classification model.
Therefore, the s-1 domain classification model is trained by adopting the P images in the s training data set, and the more the number of images in the training data set is, the more accurately the trained target classification model can learn the inter-domain difference.
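One training round over the P images of the s-th training data set can be outlined as follows; the prediction and loss callables are placeholders for the domain classification model and the contrast loss, not the patent's actual implementations:

```python
def domain_training_round(model_predict, images, truths, loss_fn):
    """Sketch of one round per steps above: compare each of the P
    predictions with its truth result and accumulate the contrast loss
    values used to adjust the (s-1)-th model's parameters."""
    return sum(loss_fn(model_predict(img), truth)
               for img, truth in zip(images, truths))
```

A real implementation would backpropagate through this accumulated loss rather than merely summing it, but the per-image prediction/truth pairing is the same.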
In addition, when training the target classification model with the training data sets, a Center Loss function can be used, so that the intra-domain distance is minimized while domain-specific features remain separable and the distance between domains is increased, allowing the inter-domain differences (i.e., the domain-specific information) to be accurately extracted.
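For reference, the Center Loss over a batch of domain feature vectors can be sketched as below. Here the per-domain centres are supplied directly for illustration; during real training they are learned parameters updated alongside the model:

```python
import numpy as np

def center_loss(features, labels, centers):
    """Center Loss: half the summed squared distance between each domain
    feature vector and the centre of its domain, pulling features of the
    same domain together (shrinking the intra-domain distance)."""
    diffs = features - centers[labels]          # per-sample offset from its centre
    return 0.5 * float(np.sum(diffs ** 2))
```

Minimizing this term alongside the classification loss tightens each domain's cluster, while the classification loss itself keeps the clusters of different domains apart.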
The image classification method provided by the embodiment of the application is described below in connection with an actual application scenario.
As shown in fig. 7, in an actual application scenario, taking more than 3 training data sets as an example, where one training data set includes 10 images, the flow of the image classification method provided in the embodiment of the present application may include:
acquiring more than 3 training data sets, wherein one training data set of the more than 3 training data sets corresponds to one image type, and the image types corresponding to all the training data sets in the more than 3 training data sets are different; each training dataset comprising a plurality of images belonging to different domains;
Acquiring an initial domain classification model;
training the initial domain classification model through more than 3 training data sets to obtain a trained domain classification model;
sequentially inputting 10 images belonging to a target domain into a trained domain classification model, extracting domain feature vectors output by 3 rd, 4 th and 5 th convolution layers of the domain classification model, taking the average value of 10 domain feature vectors output by the 3 rd convolution layer of the domain classification model as a 1 st target feature vector, taking the average value of 10 domain feature vectors output by the 4 th convolution layer of the domain classification model as a 2 nd target feature vector, and taking the average value of 10 domain feature vectors output by the 5 th convolution layer of the domain classification model as a 3 rd target feature vector;
replacing the 1 st appointed feature vector in the first image classification model with the 1 st target feature vector, replacing the 2 nd appointed feature vector in the first image classification model with the 2 nd target feature vector, and replacing the 3 rd appointed feature vector in the first image classification model with the 3 rd target feature vector to obtain an adjusted image classification model;
training the adjusted image classification model by adopting a small amount of image samples belonging to the target domain to obtain a second image classification model;
inputting a target image into the second image classification model, wherein the target image is an image to be identified belonging to the target domain and may be represented by data with length × height × channels of 256×256×3;
the target image is subjected to dimension reduction by the first convolution layer of the second image classification model, which outputs a feature vector representing a 128×128×64 image; the second convolution layer of the second image classification model then performs dimension reduction and outputs a feature vector representing a 64×64×128 image; the first specified convolution layer (i.e., the third convolution layer) of the second image classification model then performs dimension reduction and outputs a feature vector representing a 32×32×256 image; the sum of the feature vector output by the first specified convolution layer and the 1st target feature vector is input to the second specified convolution layer (i.e., the fourth convolution layer) of the second image classification model for dimension reduction, which outputs a feature vector representing a 16×16×512 image; the sum of the feature vector output by the second specified convolution layer and the 2nd target feature vector is input to the third specified convolution layer (i.e., the fifth convolution layer) of the second image classification model for dimension reduction, which outputs a feature vector representing an 8×8×1024 image; and the sum of the feature vector output by the third specified convolution layer and the 3rd target feature vector is input to the first fully connected layer of the second image classification model for classification processing, which outputs the classification result of the target image.
In this way, the first image classification model is adjusted based on the 3 target feature vectors corresponding to the target domain, and the resulting second image classification model is suitable for classifying images belonging to the target domain. Compared with the related art, the image to be classified in the target domain does not need to be input into a domain migration model for preprocessing; it is classified directly by the second image classification model, so the image classification speed is high.
The execution subject of the image classification method provided by the embodiments of the present application may be an image classification apparatus. In the embodiments of the present application, the image classification apparatus is described by taking, as an example, the case in which the image classification method is performed by the image classification apparatus.
Fig. 10 is a schematic structural diagram of an image classification apparatus according to an embodiment of the present application.
As shown in fig. 10, an image classification apparatus 1000 provided in an embodiment of the present application may include:
an acquisition module 1001, an input module 1002, and an output module 1003;
the acquiring module 1001 is configured to acquire a target image, where the target image is an image to be classified that belongs to a target domain;
the input module 1002 is configured to input the target image to a second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain;
The output module 1003 is configured to output a classification result of the target image.
The image classification device provided by the embodiment of the application comprises an acquisition module, an input module and an output module; the acquisition module is used for acquiring a target image, wherein the target image is an image to be classified belonging to a target domain; the input module is used for inputting the target image into a second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain; and the output module is used for outputting the classification result of the target image. Therefore, the second image classification model is an image classification model obtained by adjusting the first image classification model based on the target domain characteristic information corresponding to the target domain, and can be suitable for classifying images belonging to the target domain.
Optionally, the image classification device provided in the embodiment of the present application further includes a processing module;
before the acquisition module acquires the target image, the acquisition module is further configured to: acquiring a first image classification model; acquiring target domain feature information corresponding to a target domain, wherein the target domain feature information is used for representing the characteristic of the target domain;
the processing module is configured to adjust the first image classification model based on the target domain feature information to obtain a second image classification model, where the second image classification model is suitable for classifying images belonging to the target domain.
In this way, the first image classification model is adjusted based on the target domain feature information corresponding to the target domain, and the obtained second image classification model can be applied to classification of images belonging to the target domain. Furthermore, the second image classification model can be obtained through pre-training before the target image is obtained for classification, so that the target image can be classified by directly adopting the second image classification model, and the classification speed of the target image is improved.
Optionally, in the image classification device provided in the embodiment of the present application, in a process of adjusting the first image classification model based on the target domain feature information to obtain the second image classification model, the processing module is specifically configured to: acquiring source domain feature information corresponding to the source domain in the first image classification model, wherein the source domain feature information is used for representing the characteristic of the source domain; replacing the source domain feature information in the first image classification model with the target domain feature information to obtain an adjusted image classification model; and obtaining a second image classification model based on the adjusted image classification model.
In this way, the source domain feature information in the first image classification model is replaced by the target domain feature information, so that the obtained second image classification model can be better suitable for classifying images belonging to the target domain.
Optionally, in the image classification device provided in the embodiment of the present application, the source domain feature information includes N specified feature vectors, the target domain feature information includes N target feature vectors, N is an integer, and N is greater than or equal to 1;
in the process of replacing the source domain feature information in the first image classification model with the target domain feature information to obtain the adjusted image classification model, the processing module is specifically configured to: replace the N specified feature vectors in the first image classification model with the N target feature vectors to obtain the adjusted image classification model.
In this way, N target feature vectors corresponding to the N designated feature vectors are acquired and used as target domain feature information, so that all the designated feature vectors contained in the source domain feature information are replaced with the target feature vectors one by one, and the problem that the accuracy of the second image classification model is affected due to the fact that the unreplaced source domain feature information exists in the adjusted image classification model is avoided.
Optionally, in the image classification apparatus provided in the embodiment of the present application, the first image classification model includes: the N appointed convolution layers are connected with the first prediction layer; the ith appointed feature vector in the N appointed feature vectors corresponds to the ith appointed convolution layer in the N appointed convolution layers, wherein i is an integer, and N is more than or equal to i is more than or equal to 1;
in the process of acquiring the target domain characteristic information corresponding to the target domain, the acquisition module is used for: acquiring target domain characteristic information corresponding to a target domain by using a target classification model, wherein the target classification model is a trained domain classification model;
wherein the object classification model comprises: the N target convolution layers are connected with the second prediction layer; the ith target convolutional layer in the N target convolutional layers corresponds to the ith appointed convolutional layer in the N appointed convolutional layers; the ith target feature vector in the N target feature vectors is output by the ith target convolution layer;
In the process of replacing the N specified feature vectors in the first image classification model with the N target feature vectors, the processing module is specifically configured to: for each of the N specified feature vectors, performing the following replacement operation: correspondingly replacing the ith appointed characteristic vector in the N appointed characteristic vectors with the ith target characteristic vector in the N target characteristic vectors; wherein i is more than or equal to 1 and less than or equal to N;
wherein, in the case where N = 1, the sum of the output of the first specified convolution layer and the first target feature vector is taken as the input of the first prediction layer;
in the case where N ≥ 2 and N-1 ≥ q ≥ 1, the sum of the output of the q-th specified convolution layer and the q-th target feature vector is taken as the input of the (q+1)-th specified convolution layer, q being an integer;
in the case where N ≥ 2 and q = N, the sum of the output of the q-th specified convolution layer and the q-th target feature vector is taken as the input of the first prediction layer.
In this way, according to the correspondence between the target convolution layer in the target classification model and the specified convolution layer in the first image classification model, the ith specified feature vector in the N specified feature vectors can be replaced with the ith target feature vector in the N target feature vectors in an orderly manner.
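The replacement operation and the wiring cases above can be sketched with layers as plain callables (a hypothetical representation chosen for brevity; a real framework would patch buffers inside the trained network):

```python
def adjust_model(conv_layers, specified_vectors, target_vectors, predict):
    # Replacement operation: for i = 1..N, the i-th specified feature
    # vector is replaced by the i-th target feature vector.
    assert len(specified_vectors) == len(target_vectors)
    vectors = list(target_vectors)

    def forward(x):
        # Wiring rules: for q < N the sum of the q-th specified convolution
        # layer's output and the q-th vector feeds layer q+1; the N-th sum
        # feeds the first prediction layer (this also covers N = 1).
        for layer, vec in zip(conv_layers, vectors):
            x = layer(x) + vec
        return predict(x)

    return forward
```

With two toy layers `2*x` and `x+1`, specified vectors `[10, 20]` replaced by target vectors `[1, 2]`, an input of 3 yields `2*3+1 = 7`, then `7+1+2 = 10`, which the identity prediction layer returns unchanged.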
Optionally, in the image classification device provided in the embodiment of the present application, in a process of acquiring target domain feature information corresponding to a target domain, the acquiring module is specifically configured to:
obtaining a target classification model, wherein the target classification model is a trained domain classification model;
acquiring M images belonging to a target domain;
inputting the M images to the target classification model;
processing the M images through the target classification model to obtain target domain characteristic information corresponding to the target domain; wherein M is a positive integer.
In this way, the embodiment of the application can extract the target domain characteristic information corresponding to the target domain from the image belonging to the target domain by adopting the target classification model so as to accurately represent the information specific to the target domain.
Optionally, in the image classification device provided in the embodiment of the present application, in a process of processing the M images by using the target classification model to obtain target domain feature information corresponding to the target domain, the obtaining module is specifically configured to: processing the M images through the target classification model to obtain N target feature vectors, wherein N is an integer and is more than or equal to 1; and taking the N target feature vectors as target domain feature information corresponding to the target domain. In this way, one or more target feature vectors can be flexibly acquired as target domain feature information.
In the image classification apparatus provided in the embodiment of the present application, the object classification model may include: the N target convolution layers are connected with the second prediction layer;
in the process of processing the M images through the object classification model to obtain N object feature vectors, the processing module is specifically configured to: processing M target inputs corresponding to the M images through an ith target convolution layer in the target classification model to obtain an ith target feature vector in the N target feature vectors; wherein i is an integer, and N is greater than or equal to i and greater than or equal to 1.
Thus, one target feature vector can be correspondingly output through one target convolution layer in the target classification model, and N target feature vectors can be correspondingly output through N target convolution layers in the target classification model.
In the process of processing M target inputs corresponding to the M images through the ith target convolution layer in the target classification model to obtain the ith target feature vector in the N target feature vectors, the processing module may be specifically configured to:
Respectively processing M target inputs corresponding to the M images through the ith target convolution layer, and extracting M domain feature vectors, wherein one domain feature vector in the M domain feature vectors corresponds to one target input in the M target inputs;
and calculating the average value of the M domain feature vectors, and taking the average value of the M domain feature vectors as the ith target feature vector in the N target feature vectors, wherein M is more than or equal to 2.
Therefore, each target feature vector in the N target feature vectors is obtained by carrying out average calculation on M domain feature vectors, so that the stability is good, and the specific information of the target domain can be accurately represented.
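A minimal sketch of this averaging step, with `domain_vectors` holding the M per-image domain feature vectors extracted by one target convolution layer:

```python
import numpy as np

def target_feature_vector(domain_vectors):
    # i-th target feature vector = mean of the M per-image domain feature
    # vectors extracted by the i-th target convolution layer (M >= 2).
    m = np.asarray(domain_vectors, dtype=float)
    assert m.shape[0] >= 2, "averaging requires M >= 2 images"
    return m.mean(axis=0)
```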
Optionally, the image classification device provided in the embodiment of the application further includes a target classification model training module. The target classification model training module is used to:
obtaining K training data sets, wherein one training data set in the K training data sets corresponds to one image type, and the image types corresponding to all training data sets in the K training data sets are different; each training data set comprises a plurality of images belonging to different domains, wherein K is an integer and is more than or equal to 2;
Acquiring an initial domain classification model;
and training the initial domain classification model through the K training data sets to obtain the target classification model.
In this way, since the images in one training data set belong to different domains but share the same image category, training the initial domain classification model with such training data sets allows each domain to be treated as one class for domain classification. Because the image categories of the images in one training data set are the same, the training is not interfered with by image-category differences, and the model learns the inter-domain differences, which makes it convenient to subsequently extract the inter-domain differences (namely the domain feature information) by using the target classification model.
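This training setup can be sketched as follows, assuming (hypothetically) that each training data set is a list of (feature, domain_label) pairs for a single image category; because the classifier is fitted on domain labels only, it can pick up only inter-domain differences, not category differences (a simplified softmax classifier, not the patent's actual network):

```python
import numpy as np

def train_domain_classifier(datasets, num_domains, steps=300, lr=0.5):
    # Concatenate the K single-category datasets and fit a linear softmax
    # classifier on DOMAIN labels by gradient descent.
    X = np.vstack([x for ds in datasets for x, _ in ds])
    y = np.array([d for ds in datasets for _, d in ds])
    W = np.zeros((X.shape[1], num_domains))
    for _ in range(steps):
        logits = X @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(y)), y] -= 1.0          # softmax cross-entropy gradient
        W -= lr * (X.T @ p) / len(y)
    return W

def predict_domain(W, x):
    return int(np.argmax(x @ W))
```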
Optionally, in the process of adjusting the first image classification model based on the target domain feature information to obtain a second image classification model, the processing module may be configured to: taking the adjusted image classification model as a second image classification model; or training the adjusted image classification model by using the image samples belonging to the target domain to obtain the second image classification model.
In this way, the adjusted image classification model is used as the second image classification model, and the acquisition difficulty of the second image classification model is reduced on the basis that the second image classification model can be suitable for classifying images belonging to the target domain. Or, a small amount of image samples belonging to the target domain are used as training samples to train the adjusted image classification model to obtain a second image classification model, so that the accuracy of identifying the target image by the second image classification model is further improved.
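The two options can be expressed as one small helper (hypothetical names; `fine_tune` stands for any training routine run over the small set of labelled target-domain samples):

```python
def build_second_model(adjusted_model, target_samples=None, fine_tune=None):
    # Option 1: use the adjusted model directly as the second image
    # classification model (lowest acquisition cost).
    # Option 2: when a small set of labelled target-domain samples is
    # available, fine-tune the adjusted model on them for extra accuracy.
    if target_samples and fine_tune is not None:
        return fine_tune(adjusted_model, target_samples)
    return adjusted_model
```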
The image classification device in the embodiment of the present application may be an electronic device, or may be a component in the electronic device, for example, an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. By way of example, the electronic device may be a mobile phone, tablet computer, notebook computer, palm computer, vehicle-mounted electronic device, mobile internet device (Mobile Internet Device, MID), augmented reality (AR)/virtual reality (VR) device, robot, wearable device, ultra-mobile personal computer (UMPC), netbook or personal digital assistant (PDA), etc., and may also be a server, network attached storage (Network Attached Storage, NAS), personal computer (PC), television (TV), teller machine or self-service machine, etc.; the embodiments of the present application are not specifically limited in this respect.
The image classification device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system, an ios operating system, or other possible operating systems, which are not specifically limited in the embodiments of the present application.
The image classification device provided in the embodiment of the present application can implement each process implemented by the embodiments of the methods of fig. 2 to 9, and in order to avoid repetition, a description is omitted here.
Optionally, as shown in fig. 11, the embodiment of the present application further provides an electronic device 1100, including a processor 1101 and a memory 1102, where the memory 1102 stores a program or an instruction that can be executed on the processor 1101, and the program or the instruction implements each step in any of the image classification methods provided in the embodiment of the present application when executed by the processor 1101, and can achieve the same technical effect. For example, the program or instructions when executed by the processor 1101 implement the following: acquiring a target image, wherein the target image is an image to be classified belonging to a target domain; inputting the target image to a second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain; and outputting a classification result of the target image. Therefore, the second image classification model is an image classification model obtained by adjusting the first image classification model based on the target domain characteristic information corresponding to the target domain, and can be suitable for classifying images belonging to the target domain.
It should be noted that, the electronic device in the embodiment of the present application includes a mobile electronic device and a non-mobile electronic device.
Fig. 12 is a schematic hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1200 includes, but is not limited to: radio frequency unit 1201, network module 1202, audio output unit 1203, input unit 1204, sensor 1205, display unit 1206, user input unit 1207, interface unit 1208, memory 1209, and processor 1210.
Those skilled in the art will appreciate that the electronic device 1200 may further include a power source (e.g., a battery) for powering the various components, and the power source may be logically connected to the processor 1210 through a power management system, so as to perform functions such as managing charging, discharging, and power consumption through the power management system. The electronic device structure shown in fig. 12 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than illustrated, combine certain components, or have a different arrangement of components, which are not described in detail herein.
The user input unit 1207 is configured to obtain a target image, where the target image is an image to be classified that belongs to a target domain;
An input unit 1204 for inputting the target image to a second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain;
the processor 1210 is further configured to output a classification result of the target image.
In the electronic device provided by the embodiment of the application, the user input unit is used for acquiring a target image, wherein the target image is an image to be classified belonging to a target domain; the input unit is used for inputting the target image into the second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain; the processor is also used for outputting the classification result of the target image. Therefore, the second image classification model is an image classification model obtained by adjusting the first image classification model based on the target domain characteristic information corresponding to the target domain, and can be suitable for classifying images belonging to the target domain.
It should be understood that in the embodiment of the present application, the input unit 1204 may include a graphics processor (Graphics Processing Unit, GPU) 12041 and a microphone 12042, and the graphics processor 12041 processes image data of still pictures or videos obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 1206 may include a display panel 12061, and the display panel 12061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1207 includes at least one of a touch panel 12071 and other input devices 12072. The touch panel 12071 is also called a touch screen. The touch panel 12071 may include two parts, a touch detection device and a touch controller. Other input devices 12072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
Memory 1209 may be used to store software programs as well as various data. The memory 1209 may mainly include a first memory area storing programs or instructions and a second memory area storing data, where the first memory area may store an operating system, and application programs or instructions required for at least one function (such as a sound playing function, an image playing function, etc.). Further, the memory 1209 may include volatile memory or nonvolatile memory, or the memory 1209 may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be Random Access Memory (RAM), Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), or Direct Rambus RAM (DRRAM). Memory 1209 in embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
Processor 1210 may include one or more processing units; optionally, processor 1210 integrates an application processor that primarily processes operations involving an operating system, user interface, application programs, and the like, and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into processor 1210.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored, and when the program or the instruction is executed by a processor, the program or the instruction implement each process of the embodiment of the method, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here.
Wherein the processor is a processor in the electronic device described in the above embodiment. The readable storage medium includes computer readable storage medium such as computer readable memory ROM, random access memory RAM, magnetic or optical disk, etc.
The embodiment of the application further provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled with the processor, and the processor is used for running a program or an instruction, implementing each process of the above method embodiment, and achieving the same technical effect, so as to avoid repetition, and not repeated here.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
The embodiments of the present application provide a computer program product, which is stored in a storage medium, and the program product is executed by at least one processor to implement the respective processes of the above method embodiments, and achieve the same technical effects, and are not repeated herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a computer software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those of ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are also within the protection of the present application.

Claims (14)

1. An image classification method, comprising:
acquiring a target image, wherein the target image is an image to be classified belonging to a target domain;
inputting the target image to a second image classification model; the second image classification model is an image classification model obtained by adjusting the first image classification model based on target domain characteristic information corresponding to a target domain; the first image classification model is a trained image classification model and is used for classifying images belonging to a source domain;
and outputting a classification result of the target image.
2. The method of claim 1, wherein prior to acquiring the target image, the method further comprises:
acquiring a first image classification model;
acquiring target domain feature information corresponding to a target domain, wherein the target domain feature information is used for representing the characteristic of the target domain;
and adjusting the first image classification model based on the target domain characteristic information to obtain a second image classification model, wherein the second image classification model is suitable for classifying images belonging to the target domain.
3. The method of claim 2, wherein adjusting the first image classification model based on the target domain feature information to obtain a second image classification model comprises:
Acquiring source domain feature information corresponding to the source domain in the first image classification model, wherein the source domain feature information is used for representing the characteristic of the source domain;
replacing the source domain feature information in the first image classification model with the target domain feature information to obtain an adjusted image classification model;
and obtaining a second image classification model based on the adjusted image classification model.
4. A method according to claim 3, wherein the source domain feature information comprises N specified feature vectors, the target domain feature information comprises N target feature vectors, N is an integer, and N ≥ 1;
the replacing the source domain feature information in the first image classification model with the target domain feature information to obtain an adjusted image classification model includes:
and replacing N appointed characteristic vectors in the first image classification model with the N target characteristic vectors to obtain an adjusted image classification model.
5. The method according to claim 4, wherein the first image classification model comprises: N specified convolutional layers connected to a first prediction layer; an i-th specified feature vector of the N specified feature vectors corresponds to an i-th specified convolutional layer of the N specified convolutional layers, wherein i is an integer and N ≥ i ≥ 1;
the obtaining the target domain feature information corresponding to the target domain comprises: obtaining the target domain feature information corresponding to the target domain by using a target classification model, wherein the target classification model is a trained domain classification model;
wherein the target classification model comprises: N target convolutional layers connected to a second prediction layer; an i-th target convolutional layer of the N target convolutional layers corresponds to the i-th specified convolutional layer of the N specified convolutional layers; an i-th target feature vector of the N target feature vectors is output by the i-th target convolutional layer;
the replacing the N specified feature vectors in the first image classification model with the N target feature vectors comprises:
performing, for each of the N specified feature vectors, the following replacement operation: replacing the i-th specified feature vector of the N specified feature vectors with the i-th target feature vector of the N target feature vectors, wherein 1 ≤ i ≤ N;
wherein, in the case of N = 1, a sum of an output of the first specified convolutional layer and the first target feature vector is used as an input of the first prediction layer;
in the case of N ≥ 2 and N − 1 ≥ q ≥ 1, a sum of an output of the q-th specified convolutional layer and the q-th target feature vector is used as an input of the (q+1)-th specified convolutional layer, wherein q is an integer;
in the case of N ≥ 2 and q = N, a sum of an output of the q-th specified convolutional layer and the q-th target feature vector is used as an input of the first prediction layer.
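The forward pass described in claim 5 can be sketched as follows: after each specified convolutional layer, its output is summed with the corresponding target-domain feature vector before being fed to the next layer (or, for the last layer, to the prediction layer). This is a minimal sketch under simplifying assumptions — the "convolution layers" and prediction layer below are toy 1-D callables standing in for real network layers.

```python
import numpy as np

# Sketch of claim 5's adjusted forward pass: the q-th layer output plus the
# q-th target feature vector becomes the input of the (q+1)-th layer; the
# final sum is the input of the first prediction layer. Shapes are illustrative.
def forward_with_target_features(x, conv_layers, target_vectors, predict):
    """conv_layers: list of N callables; target_vectors: list of N arrays."""
    n = len(conv_layers)
    for q in range(n):
        out = conv_layers[q](x)
        x = out + target_vectors[q]  # sum fed to the next layer / prediction layer
    return predict(x)

# Toy stand-ins: each "convolution layer" is a simple map on a 1-D vector.
layers = [lambda v: 2.0 * v, lambda v: v + 1.0]
targets = [np.array([0.5, 0.5]), np.array([-1.0, 1.0])]
logits = forward_with_target_features(np.array([1.0, 2.0]),
                                      layers, targets,
                                      predict=lambda v: v)
```

With these toy layers the intermediate sums trace the N ≥ 2 case of the claim: the first sum feeds the second layer, and the second sum feeds the prediction layer.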
6. The method according to claim 2, wherein the obtaining the target domain feature information corresponding to the target domain comprises:
obtaining a target classification model, wherein the target classification model is a trained domain classification model;
acquiring M images belonging to the target domain;
inputting the M images into the target classification model; and
processing the M images through the target classification model to obtain the target domain feature information corresponding to the target domain, wherein M is a positive integer.
7. The method according to claim 6, wherein the processing the M images through the target classification model to obtain the target domain feature information corresponding to the target domain comprises:
processing the M images through the target classification model to obtain N target feature vectors, wherein N is an integer and N ≥ 1; and
using the N target feature vectors as the target domain feature information corresponding to the target domain.
8. The method according to claim 7, wherein the target classification model comprises: N target convolutional layers connected to a second prediction layer;
the processing the M images through the target classification model to obtain N target feature vectors comprises:
processing, by an i-th target convolutional layer in the target classification model, M target inputs corresponding to the M images to obtain an i-th target feature vector of the N target feature vectors, wherein i is an integer and N ≥ i ≥ 1.
9. The method according to claim 8, wherein the processing, by the i-th target convolutional layer in the target classification model, the M target inputs corresponding to the M images to obtain the i-th target feature vector of the N target feature vectors comprises:
processing the M target inputs corresponding to the M images respectively through the i-th target convolutional layer to extract M domain feature vectors, wherein each of the M domain feature vectors corresponds to one of the M target inputs; and
calculating a mean of the M domain feature vectors, and using the mean of the M domain feature vectors as the i-th target feature vector of the N target feature vectors, wherein M ≥ 2.
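The averaging step of claim 9 reduces to an element-wise mean over the M per-input domain feature vectors. A minimal sketch, assuming the layer is a callable that returns one domain feature vector per target input (the identity layer below is a toy stand-in):

```python
import numpy as np

# Sketch of claim 9: the i-th target feature vector is the mean of the M
# domain feature vectors that the i-th target convolution layer extracts
# from the M target-domain inputs. Feature dimensions are illustrative.
def target_feature_vector(layer, target_inputs):
    """layer: callable returning a domain feature vector per input; M >= 2 inputs."""
    domain_vectors = [layer(x) for x in target_inputs]  # M domain feature vectors
    return np.mean(domain_vectors, axis=0)              # element-wise mean

# Toy layer: identity, so the result is simply the mean of the inputs.
inputs = [np.array([1.0, 3.0]), np.array([3.0, 5.0])]
vec = target_feature_vector(lambda x: x, inputs)
```

Averaging over M ≥ 2 images gives a single vector intended to summarize the target domain rather than any one image.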
10. The method according to claim 6, wherein a training process of the target classification model comprises:
obtaining K training data sets, wherein each of the K training data sets corresponds to one image type, the image types corresponding to the K training data sets are different from one another, and each training data set comprises a plurality of images belonging to different domains, wherein K is an integer and K ≥ 2;
obtaining an initial domain classification model; and
training the initial domain classification model on the K training data sets to obtain the target classification model.
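The data setup of claim 10 can be sketched as assembling supervised pairs where the label is the domain, not the image type. This is an illustrative assumption about how the K datasets feed the domain classifier; all names below are hypothetical.

```python
# Sketch of claim 10's data setup: K training sets (one per image type), each
# containing images drawn from several domains; the initial domain classification
# model is then trained to predict the domain label of each image.
def build_domain_training_pairs(training_sets):
    """training_sets: K lists of (image, domain_id) tuples, K >= 2."""
    pairs = []
    for dataset in training_sets:             # one dataset per image type
        for image, domain_id in dataset:      # images span multiple domains
            pairs.append((image, domain_id))  # supervise on the domain label
    return pairs

k_sets = [[("img_a", 0), ("img_b", 1)],       # image type 1: two domains
          [("img_c", 0), ("img_d", 1)]]       # image type 2: same domains
pairs = build_domain_training_pairs(k_sets)
```

Because every image type contributes examples from every domain, the classifier is pushed to learn domain-discriminative features rather than type-discriminative ones.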
11. The method according to claim 3, wherein the obtaining a second image classification model based on the adjusted image classification model comprises:
using the adjusted image classification model as the second image classification model;
or,
training the adjusted image classification model with image samples belonging to the target domain to obtain the second image classification model.
12. An image classification apparatus, comprising: an acquisition module, an input module, and an output module; wherein
the acquisition module is configured to acquire a target image, the target image being an image to be classified that belongs to a target domain;
the input module is configured to input the target image into a second image classification model, wherein the second image classification model is obtained by adjusting a first image classification model based on target domain feature information corresponding to the target domain, and the first image classification model is a trained image classification model for classifying images belonging to a source domain; and
the output module is configured to output a classification result of the target image.
13. An electronic device, comprising a processor and a memory storing a program or instructions, wherein the program or instructions, when executed by the processor, implement the steps of the method according to any one of claims 1 to 11.
14. A readable storage medium, storing thereon a program or instructions, wherein the program or instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 11.
CN202210822389.2A 2022-07-13 2022-07-13 Image classification method, image classification device and electronic equipment Pending CN116152513A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210822389.2A CN116152513A (en) 2022-07-13 2022-07-13 Image classification method, image classification device and electronic equipment


Publications (1)

Publication Number Publication Date
CN116152513A true CN116152513A (en) 2023-05-23

Family

ID=86339515


Country Status (1)

Country Link
CN (1) CN116152513A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination