CN116363421A - Image feature classification method and device, electronic equipment and medium - Google Patents

Image feature classification method and device, electronic equipment and medium

Info

Publication number
CN116363421A
CN116363421A (application CN202310269093.7A)
Authority
CN
China
Prior art keywords
image
noise
feature
domain
initial
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310269093.7A
Other languages
Chinese (zh)
Inventor
马占宇
童煜钧
罗悦恒
梁孔明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202310269093.7A priority Critical patent/CN116363421A/en
Publication of CN116363421A publication Critical patent/CN116363421A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an image feature classification method, device, electronic equipment and medium. By applying this technical scheme, inter-domain difference values between images of different domains can be measured in advance by adding and removing noise, and domain generalization training can be performed on the feature extractor using these inter-domain difference values, yielding an image classification model whose feature extractor is more robust. This avoids the problem in the related art that the classification performance of a traditional image classification model degrades greatly in a domain shift scenario.

Description

Image feature classification method and device, electronic equipment and medium
Technical Field
The present disclosure relates to image data processing technologies, and in particular, to a method and apparatus for classifying features of an image, an electronic device, and a medium.
Background
In the field of image classification, artificial intelligence algorithms often produce unsatisfactory classification results. For example, when a traditional image classification model encounters a domain shift scenario (i.e., the training images and the images to be classified in actual use follow different domain distributions, such as a photo and a sketch of the same object), its performance degrades greatly and the classification results become inaccurate.
Disclosure of Invention
The embodiments of the application provide an image feature classification method, device, electronic equipment and medium, thereby solving the problem in the related art that the classification performance of a traditional image classification model degrades greatly in a domain shift scenario.
According to one aspect of the embodiment of the present application, a method for classifying features of an image is provided, including:
acquiring an image to be processed;
inputting the image to be processed into an image classification model, and extracting image features of the image to be processed using a target feature extractor in the image classification model, wherein the target feature extractor is obtained by domain generalization training using inter-domain difference values between sample images from a plurality of different source domains;
and classifying the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model.
Optionally, in another embodiment of the method according to the present application, before the acquiring of the image to be processed, the method further includes:
acquiring sample images of a plurality of source domains aiming at the same object and an initial image classification model, wherein the initial image classification model comprises an initial feature extractor and a denoising model;
extracting initial features of a plurality of sample images by using the initial feature extractor, and adding noise data to the plurality of initial features to obtain a plurality of noise features;
denoising the plurality of noise features by using the denoising model to obtain a plurality of denoising features;
and performing iterative training on the initial feature extractor based on at least one noise feature and at least one noise cancellation feature until the image classification model containing the target feature extractor is obtained.
Optionally, in another embodiment of the method according to the present application, the iteratively training the initial feature extractor based on at least one noise feature and at least one noise cancellation feature includes:
iteratively training the initial feature extractor based on at least one noise feature corresponding to a first source domain and based on at least one noise cancellation feature corresponding to a second source domain;
wherein the first source domain and the second source domain belong to different source domains.
Optionally, in another embodiment of the method according to the present application, the iteratively training the initial feature extractor based on at least one noise feature and at least one noise cancellation feature includes:
respectively calculating an inter-domain difference value between one noise feature and one noise elimination feature, wherein the inter-domain difference value is an offset distance value between the noise feature and the noise elimination feature;
and if the number of the inter-domain difference values is multiple, taking the average value of the inter-domain difference values as a loss function, and performing iterative training on the initial feature extractor.
Optionally, in another embodiment based on the above method of the present application, the loss function is obtained by the following formula:

$$\mathcal{L} = \left\| f_t^A - \hat{f}_t^B \right\|^2$$

wherein $\mathcal{L}$ is the loss function, $t$ is the identifier of the noise step, $f_t^A$ is the noise feature, and $\hat{f}_t^B$ is the noise cancellation feature.
Optionally, in another embodiment of the method according to the present application, adding noise data to the plurality of initial features to obtain a plurality of noise features includes:
and gradually adding Gaussian noise data to each initial feature to obtain a plurality of noise features with different variances, wherein the size of one Gaussian noise data is the same as the size of the corresponding initial feature.
According to still another aspect of the embodiments of the present application, there is provided an image feature classification apparatus, including:
an acquisition module configured to acquire an image to be processed;
the extraction module is configured to input the image to be processed into an image classification model and extract image features of the image to be processed using a target feature extractor in the image classification model, wherein the target feature extractor is obtained by domain generalization training using inter-domain difference values between sample images from a plurality of different source domains;
and the classification module is configured to classify the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model.
According to still another aspect of the embodiments of the present application, there is provided an electronic device including:
a memory for storing executable instructions; and
and a display, configured to execute the executable instructions in cooperation with the memory so as to complete the operations of any one of the above image feature classification methods.
According to still another aspect of the embodiments of the present application, there is provided a computer-readable storage medium storing computer-readable instructions that, when executed, perform the operations of any one of the above-described feature classification methods of images.
In the present application, an image to be processed is acquired; the image to be processed is input into an image classification model, and image features of the image to be processed are extracted using a target feature extractor in the image classification model, where the target feature extractor is obtained by domain generalization training using inter-domain difference values between sample images from a plurality of different source domains; the image to be processed is then classified based on its image features and a classifier in the image classification model. By applying this technical scheme, inter-domain difference values between images of different domains can be measured in advance through noise adding and noise removing, and domain generalization training can be performed on the feature extractor using these values, yielding an image classification model whose feature extractor is more robust. This avoids the problem in the related art that the classification performance of a traditional image classification model degrades greatly in a domain shift scenario.
The technical scheme of the present application is described in further detail below through the accompanying drawings and examples.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the application and, together with the description, serve to explain the principles of the application.
The present application will be more clearly understood from the following detailed description with reference to the accompanying drawings, in which:
FIG. 1 is a schematic diagram of a feature classification method for an image according to the present application;
FIG. 2 is a flow chart of a method for classifying features of an image according to the present disclosure;
FIG. 3 is a schematic diagram of a training process for a feature extractor in an image feature classification method according to the present application;
FIG. 4 is a schematic diagram illustrating a process of diffusing and reconstructing initial features in an image feature classification method according to the present disclosure;
FIG. 5 is a schematic structural diagram of an electronic device for classifying features of an image according to the present disclosure;
fig. 6 is a schematic structural diagram of an electronic device for classifying features of an image according to the present application.
Detailed Description
Various exemplary embodiments of the present application will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present application unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely exemplary in nature and is in no way intended to limit the application, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but are intended to be part of the specification where appropriate.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
In addition, the technical solutions of the embodiments of the present application may be combined with each other, provided that the combination can be implemented by those skilled in the art; when the technical solutions are contradictory or cannot be implemented, the combination should be considered not to exist and is not within the scope of protection claimed in the present application.
It should be noted that all directional indicators (such as up, down, left, right, front and rear) in the embodiments of the present application are merely used to explain the relative positional relationship, movement conditions and the like between the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indicators change correspondingly.
A feature classification method for performing an image according to an exemplary embodiment of the present application is described below with reference to fig. 1 to 4. It should be noted that the following application scenario is only shown for the convenience of understanding the spirit and principles of the present application, and embodiments of the present application are not limited in any way in this respect. Rather, embodiments of the present application may be applied to any scenario where applicable.
The application also provides an image feature classification method, an image feature classification device, electronic equipment and a medium.
Fig. 1 schematically shows a flow diagram of a method for classifying features of an image according to an embodiment of the present application. As shown in fig. 1, the method is applied to a base station device, and includes:
s101, acquiring an image to be processed.
S102, inputting the image to be processed into an image classification model, and extracting image features of the image to be processed using a target feature extractor in the image classification model, wherein the target feature extractor is obtained by domain generalization training using inter-domain difference values between sample images from a plurality of different source domains.
S103, classifying the image to be processed based on the image characteristics of the image to be processed and the classifier in the image classification model.
In the related art, artificial intelligence algorithms for image classification often suffer performance degradation on pictures whose distribution differs from that of the training images, because these algorithms assume the data are independent and identically distributed (i.i.d.). In real life, however, inter-domain differences and domain shifts exist between the training images and the images in the application scenario.
In one mode, the present application addresses the problem that a traditional image classification model degrades greatly in performance when encountering a domain shift scenario, leading to inaccurate classification results. To this end, inter-domain difference values between images of different domains are measured in advance by adding and removing noise, and domain generalization training is performed on the feature extractor using these inter-domain difference values, so as to obtain an image classification model whose feature extractor is more robust.
The domain generalization training provided in the application trains on a plurality of labelled source domains so that the model can extract domain-shared information among the domains; when applied, the model then has better robustness and generalization in the face of an unseen target domain.
Further, the present application is specifically described herein with reference to fig. 2:
step 1, acquiring sample images of a plurality of source domains aiming at the same object and an initial image classification model.
The initial image classification model comprises an initial feature extractor and a denoising model.
In one mode, an embodiment of the present application proposes a method of measuring inter-domain differences between a plurality of different images. Specifically, a process of adding noise to image features based on a diffusion model is proposed to measure the distance between different domains of the same image.
As an example, this includes a "diffusion-reconstruction" process on the image features.
The diffusion process gradually adds noise data to the image features until they approach standard Gaussian noise, yielding the noise features.
The reconstruction process then removes the noise from the noise features through a denoising model until the image features are reconstructed. The difference accumulated over the diffusion-reconstruction process (i.e., the difference between the noise features and the noise cancellation features) is later calculated and used as the inter-domain difference value; this value serves as a loss function to train the feature extractor in the image classification model, so that the model reduces the inter-domain difference and a domain-invariant feature extractor is obtained, thereby achieving domain generalization.
In one embodiment, there are a plurality of sample images, and each set of sample images includes several images: sample images of multiple source domains for the same object. For example, a set of sample images may consist of a first sample image obtained by photographing a certain person and a second sample image obtained by sketching the same person.
It can be appreciated that the first sample image and the second sample image are sample images of multiple source domains for the same object; that is, each source domain corresponds to one generation style of image, such as photographs, sketches, drawings, color images, black-and-white images, and the like.
In one mode, the initial image classification model in the application comprises an initial feature extractor, a classifier and a denoising model. The application then performs optimization training on the initial feature extractor and the classifier to obtain the finally trained target feature extractor and classifier, which are deployed in the image classification model for subsequent image classification.
As an example, as shown in fig. 3, a schematic flow chart of training the feature extractor is provided in the present application.
It should be noted that, in the process of optimization training of the initial feature extractor and the classifier, the feature extractor and the classifier may first be trained on a plurality of labelled source domains according to the empirical risk minimization (ERM) method, that is, mixing labelled pictures from the source domains while training the model with a cross-entropy loss.
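The ERM pretraining described above can be sketched in numpy as follows. This is a minimal illustration, not the patent's implementation: the network is omitted, `logits` stand in for classifier outputs, and all names and dimensions are assumptions.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean cross-entropy loss over a batch, with a numerically stable softmax."""
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def mixed_source_batch(domains, batch_size, rng):
    """ERM step: mix labelled samples drawn from all source domains into one batch."""
    xs, ys = [], []
    for _ in range(batch_size):
        dom = domains[rng.integers(len(domains))]   # pick a source domain at random
        i = rng.integers(len(dom["x"]))             # pick a labelled sample within it
        xs.append(dom["x"][i])
        ys.append(dom["y"][i])
    return np.stack(xs), np.array(ys)

rng = np.random.default_rng(0)
# two toy source domains (e.g. "photo" and "sketch"): 3-dim features, 2 classes
domains = [
    {"x": rng.standard_normal((10, 3)), "y": rng.integers(0, 2, 10)},
    {"x": rng.standard_normal((10, 3)), "y": rng.integers(0, 2, 10)},
]
x, y = mixed_source_batch(domains, batch_size=8, rng=rng)
```

In practice the cross-entropy would be computed on the classifier's outputs for `x` and backpropagated through both the classifier and the feature extractor.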
Step 2, extracting initial features of the plurality of sample images using the initial feature extractor, and adding noise data to the initial features to obtain a plurality of noise features.
In one mode, in the embodiment of the present application, Gaussian noise data is gradually added to each initial feature, where each Gaussian noise sample has the same size as the corresponding initial feature, so as to obtain a plurality of noise features.
The feature extractor F_θ mentioned in the embodiment of the application is responsible for extracting features of the input image via a convolutional neural network; by training on multiple labelled source domains and performing domain generalization while matching the "diffusion-reconstruction" process (i.e., adding noise to and removing noise from the image features), the feature extractor can extract domain-invariant semantic features.
In addition, the classifier G_ω mentioned in the embodiment of the application is responsible for classifying the features: after an image passes through the feature extractor, the features are input into the classifier, which gives the final classification result.
Furthermore, the denoising model D_φ mentioned in the embodiment of the present application serves the reconstruction process: the diffused (noise-added) features are denoised back toward the original features, i.e., the features are reconstructed.
In one mode, as shown in fig. 4, the process of adding noise data to a plurality of initial features to obtain a plurality of noise features may include the following steps:
Step A1, sampling the noise step identifier t: the "diffusion-reconstruction" process (i.e., adding noise to the image features) comprises T steps, only one of which is performed at a time; the step number t to be trained is uniformly sampled from [1, T].
Step B1, selecting the noise ε_t to be added: noise data ε_t of the same size as the initial feature is sampled from a standard Gaussian distribution, i.e., ε_t ~ N(0, I).
Step C1, calculating the noise feature after the initial feature is diffused (i.e., after noise is added): the noise data ε_t sampled in step B1 is added to the initial feature f_0 of the sample image. As an example, the diffusion (i.e., noise-adding) process is:

$$f_t = \sqrt{\bar{\alpha}_t}\, f_0 + \sqrt{1-\bar{\alpha}_t}\, \varepsilon_t$$

wherein $f_t$ is the noise feature and $\bar{\alpha}_t$ is a manually defined parameter. Optionally, the embodiment of the application can also control the noise added at each step of the diffusion process.
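The diffusion step can be sketched in numpy as below. The linear schedule for $\bar{\alpha}_t$ is an assumption for illustration (the patent only states that the parameter is manually defined), and the feature dimension is arbitrary:

```python
import numpy as np

def diffuse(f0, t, alpha_bar, eps):
    """Noise-adding step: f_t = sqrt(abar_t) * f0 + sqrt(1 - abar_t) * eps_t."""
    return np.sqrt(alpha_bar[t]) * f0 + np.sqrt(1.0 - alpha_bar[t]) * eps

T = 1000
# assumed DDPM-style schedule: alpha_bar decays toward 0, so f_T is nearly pure noise
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))

rng = np.random.default_rng(0)
f0 = rng.standard_normal(64)            # a toy 64-dimensional image feature
t = int(rng.integers(1, T))             # step identifier sampled uniformly
eps_t = rng.standard_normal(f0.shape)   # eps_t ~ N(0, I), same size as f0
f_t = diffuse(f0, t, alpha_bar, eps_t)
```

With this schedule, features at small t stay close to f_0, while features near t = T are dominated by the Gaussian noise, matching the "gradually adding noise up to standard Gaussian noise" description.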
Step 3, denoising the plurality of noise features using the denoising model to obtain a plurality of noise cancellation features.
After noise data is added to the initial feature and the noise feature is obtained, the noise feature can be reconstructed (i.e., denoised) using the pre-trained denoising model to obtain the corresponding noise cancellation feature.
For the denoising process, the noise feature f_t is input into the denoising model D_φ to obtain the predicted noise ε_φ(f_t, t), and the noise cancellation feature after denoising is

$$\hat{f}_t = \frac{f_t - \sqrt{1-\bar{\alpha}_t}\, \varepsilon_\phi(f_t, t)}{\sqrt{\bar{\alpha}_t}}$$
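A minimal numpy sketch of the reconstruction (noise-removal) step. The closed-form inversion and the schedule are assumptions consistent with the diffusion formula above; here a perfect noise predictor stands in for the trained denoising model D_φ:

```python
import numpy as np

def reconstruct(f_t, t, predicted_noise, alpha_bar):
    """Noise cancellation: invert f_t = sqrt(abar)*f0 + sqrt(1-abar)*eps for f0."""
    a = alpha_bar[t]
    return (f_t - np.sqrt(1.0 - a) * predicted_noise) / np.sqrt(a)

T = 1000
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))  # assumed schedule

rng = np.random.default_rng(1)
f0 = rng.standard_normal(64)
t = 500
eps_t = rng.standard_normal(f0.shape)
f_t = np.sqrt(alpha_bar[t]) * f0 + np.sqrt(1.0 - alpha_bar[t]) * eps_t

# if the denoising model predicted the true noise, the feature is recovered exactly
f_hat = reconstruct(f_t, t, eps_t, alpha_bar)
```

Any error in the predicted noise leaves a residual in `f_hat`, which is exactly the kind of diffusion-reconstruction discrepancy the patent later uses as an inter-domain difference signal.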
And 4, respectively calculating inter-domain difference values between a noise characteristic and a noise canceling characteristic.
The inter-domain difference value is an offset distance value between the noise feature and the noise cancellation feature.
In one manner, embodiments of the present application entail iterative training of an initial feature extractor based on at least one noise feature corresponding to a first source domain and based on at least one noise cancellation feature corresponding to a second source domain;
wherein the first source domain and the second source domain belong to different source domains.
By way of example, the process of calculating the inter-domain difference value according to the embodiment of the present application includes the following steps:
Step A2, extracting features of different domains: the embodiment of the application randomly selects a pair of images (x^A, x^B) that belong to different domains but are generated for the same object, and inputs them into the feature extractor F_θ to obtain the corresponding initial features f_A and f_B.
In other words, the two initial features represent image features extracted from the different domains A and B, respectively.
Step B2, sampling the noise step identifier t and the noise data ε_t: the step number t to be trained is uniformly sampled from [1, T], and noise data ε_t of the same size as the initial features is sampled from a standard Gaussian distribution.
Step C2, calculating the noise features: the noise data ε_t is added to the initial features f_A and f_B respectively, obtaining the noise features after the noise-adding process:

$$f_t^A = \sqrt{\bar{\alpha}_t}\, f_A + \sqrt{1-\bar{\alpha}_t}\, \varepsilon_t$$

$$f_t^B = \sqrt{\bar{\alpha}_t}\, f_B + \sqrt{1-\bar{\alpha}_t}\, \varepsilon_t$$

wherein $\bar{\alpha}_t$ is a manually defined parameter.
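Because the same noise sample ε_t is added to both domains' features in step C2, the gap between the two noise features shrinks by exactly the factor √ᾱ_t. A small numpy sketch of this step (the schedule and feature dimensions are illustrative assumptions):

```python
import numpy as np

T = 1000
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))  # assumed schedule

rng = np.random.default_rng(3)
f_a = rng.standard_normal(64)    # initial feature from domain A (e.g. a photo)
f_b = rng.standard_normal(64)    # initial feature from domain B (e.g. a sketch)
t = int(rng.integers(1, T))
eps_t = rng.standard_normal(64)  # one shared noise sample for both domains

# noise-adding with the same t and the same eps_t for both domains
f_t_a = np.sqrt(alpha_bar[t]) * f_a + np.sqrt(1.0 - alpha_bar[t]) * eps_t
f_t_b = np.sqrt(alpha_bar[t]) * f_b + np.sqrt(1.0 - alpha_bar[t]) * eps_t
```

Sharing ε_t means the shared noise cancels in the difference, so any remaining gap between the two diffused features reflects only the domain gap between f_a and f_b.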
Step D2, calculating the noise cancellation features: the noise features f_t^A and f_t^B are respectively input into the denoising model D_φ to obtain the predicted noise ε_φ(f_t^A, t) and ε_φ(f_t^B, t).
After obtaining the noise predicted by the denoising model for the noise features, the predicted noise can be removed from f_t^A and f_t^B (i.e., denoising), obtaining the corresponding noise cancellation features:

$$\hat{f}_t^A = \frac{f_t^A - \sqrt{1-\bar{\alpha}_t}\, \varepsilon_\phi(f_t^A, t)}{\sqrt{\bar{\alpha}_t}}$$

$$\hat{f}_t^B = \frac{f_t^B - \sqrt{1-\bar{\alpha}_t}\, \varepsilon_\phi(f_t^B, t)}{\sqrt{\bar{\alpha}_t}}$$
Step E2, matching the noise feature f_t^A with the noise cancellation feature $\hat{f}_t^B$: in order to make the different domains share the same diffusion and reconstruction process, the embodiment of the application matches the noise features and noise cancellation features of domains A and B obtained above, uses the offset distance value between the features as the inter-domain difference value, and uses this inter-domain difference value as a loss function (the reconstruction loss) to train the initial feature extractor F_θ.
Further, the offset distance value may be calculated between the noise feature of the first source domain and the noise cancellation feature of the second source domain. For example, the noise feature f_t^A is generated by adding noise to the image features extracted from a photo of a given person, while the noise cancellation feature $\hat{f}_t^B$ is obtained by adding noise to, and then denoising, the image features extracted from a sketch of the same person.
And step 5, if the number of the inter-domain difference values is multiple, taking the average value of the inter-domain difference values as a loss function, and performing iterative training on the initial feature extractor.
In one manner, the embodiment of the present application may obtain the loss function by the following formula:

L_rec = ‖z_t^A − ẑ_t^B‖²

wherein L_rec is the loss function, t is the subscript (time step) of the noise feature, z_t^A is the noise feature, and ẑ_t^B is the noise elimination feature.
And step 6, performing iterative training on the initial feature extractor until an image classification model containing the target feature extractor is obtained.
In one approach, in order to further match the "diffusion-reconstruction" (i.e., noise adding and noise removing) process of the different domains, the embodiment of the present application may further constrain the noises predicted by the denoising model for the different domains, ε̂_t^A and ε̂_t^B, to be identical, and likewise use this as an inter-domain difference value and as a loss function to train the feature extractor F_θ, i.e. the noise matching loss:

L_noise = ‖ε̂_t^A − ε̂_t^B‖²
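As a minimal illustration (not the patented implementation), the reconstruction loss and the noise matching loss described above can be sketched as mean squared offset distances between feature arrays. The array shapes and the random stand-in features are assumptions made for this sketch:

```python
import numpy as np

def reconstruction_loss(z_t_a, z_hat_b):
    """Offset distance between a noise feature from domain A and a
    denoised (noise-eliminated) feature from domain B, used as the
    inter-domain difference value, i.e. the reconstruction loss."""
    return float(np.mean(np.sum((z_t_a - z_hat_b) ** 2, axis=-1)))

def noise_matching_loss(eps_a, eps_b):
    """Penalises any mismatch between the noises predicted by the
    denoising model for the two domains (the noise matching loss)."""
    return float(np.mean(np.sum((eps_a - eps_b) ** 2, axis=-1)))

rng = np.random.default_rng(0)
z_t_a = rng.normal(size=(8, 64))    # noise features from domain A
z_hat_b = rng.normal(size=(8, 64))  # denoised features from domain B
eps_a = rng.normal(size=(8, 64))    # noise predicted for domain A
eps_b = rng.normal(size=(8, 64))    # noise predicted for domain B

l_rec = reconstruction_loss(z_t_a, z_hat_b)
l_noise = noise_matching_loss(eps_a, eps_b)
```

In a real setup, both quantities would be computed on batches of features produced by the feature extractor and the denoising model, and backpropagated through the feature extractor.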
and 7, acquiring an image to be processed, and inputting the image to be processed into an image classification model.
And 8, extracting image features of the image to be processed by using a target feature extractor in the image classification model.
The feature extractor is obtained by performing domain generalization training on inter-domain difference values in sample images of a plurality of different source domains. The method comprises the following steps:
An image input step: that is, a number of sample images are randomly sampled from each of a plurality of labeled source domains D_1, …, D_N for the same object, so as to form a batch of images (batch) mixed from the plurality of source domains, the plurality of labeled source domains obeying different joint distributions P_i(x, y), i = 1, …, N, where x represents a picture, y represents a label, and N is the number of source domains.
It should be noted that the target domain of the sample images under test will have a distribution P_T(x, y) different from that of the source domains; that is, the domain generalization objective is to train on a plurality of labeled source domains such that the model still performs well on image domains it has not seen.
Preprocessing step: the image is subjected to augmentation processing: the sample image is scaled to 256 × 256 and then randomly flipped horizontally.
Convolutional neural network feature extraction step: the image to be processed is input into an image classification network F_θ(·) (a CNN), where θ is a trainable parameter, so as to obtain image features; ResNet-50 or other mainstream neural networks can be used here.
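The image input and preprocessing steps above can be sketched as follows. The nearest-neighbour resize and the toy two-domain dataset are simplifications assumed for illustration; a real pipeline would use bilinear interpolation and actual photo/sketch images:

```python
import numpy as np

def resize_nearest(img, size=256):
    """Nearest-neighbour resize to size x size (stand-in for the
    bilinear scaling a production pipeline would use)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def preprocess(img, rng):
    """Scale the sample image to 256 x 256, then randomly flip it
    horizontally, as in the preprocessing step above."""
    img = resize_nearest(img, 256)
    if rng.random() < 0.5:
        img = img[:, ::-1]
    return img

def sample_mixed_batch(domains, batch_size, rng):
    """Randomly draw samples across the N labelled source domains so
    that one batch mixes images from several distributions."""
    batch = []
    for _ in range(batch_size):
        d = rng.integers(len(domains))                     # pick a source domain
        x, y = domains[d][rng.integers(len(domains[d]))]   # pick a sample in it
        batch.append((preprocess(x, rng), y, d))
    return batch

rng = np.random.default_rng(0)
# two toy source domains ("photo" and "sketch") of the same objects
domains = [
    [(rng.random((300, 400, 3)), 0), (rng.random((300, 400, 3)), 1)],
    [(rng.random((128, 128, 3)), 0), (rng.random((128, 128, 3)), 1)],
]
batch = sample_mixed_batch(domains, batch_size=4, rng=rng)
```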
And 9, classifying the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model.
In the application, an image to be processed is acquired; inputting an image to be processed into an image classification model, and extracting image features of the image to be processed by utilizing a target feature extractor in the image classification model, wherein the feature extractor is a feature extractor obtained by performing domain generalization training on inter-domain difference values in sample images of a plurality of different source domains; and classifying the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model.
By applying the above technical scheme, the inter-domain difference values between images of different domains can be measured in advance by means of adding and removing noise, and domain generalization training can be performed on the feature extractor using these inter-domain difference values, so as to obtain an image classification model whose feature extractor has stronger robustness. This further avoids the problem in the related art that the classification performance of a conventional image classification model drops sharply when a domain shift scenario is encountered.
Optionally, in another embodiment of the method according to the present application, before the capturing the image to be processed, the method further includes:
acquiring sample images of a plurality of source domains aiming at the same object and an initial image classification model, wherein the initial image classification model comprises an initial feature extractor and a denoising model;
extracting initial features of a plurality of sample images by using the initial feature extractor, and adding noise data to the plurality of initial features to obtain a plurality of noise features;
denoising the plurality of noise features by using the denoising model to obtain a plurality of denoising features;
and performing iterative training on the initial feature extractor based on at least one noise feature and at least one noise cancellation feature until the image classification model containing the target feature extractor is obtained.
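The iterative training described above can be illustrated with a deliberately simplified sketch: the denoising model is omitted, the feature extractor F_θ is reduced to a single linear map W (a hypothetical stand-in), and the inter-domain difference is taken directly between the features of paired samples from two toy domains; gradient descent on that difference then plays the role of the iterative training:

```python
import numpy as np

rng = np.random.default_rng(0)

# paired sample images of the same objects from two source domains,
# flattened to vectors (toy stand-ins for photo / sketch inputs)
x_a = rng.normal(size=(16, 32))
x_b = x_a + 0.5 * rng.normal(size=(16, 32))    # simulated domain shift

W = rng.normal(scale=0.1, size=(32, 8))        # toy linear feature extractor

def inter_domain_loss(W):
    """Mean squared offset distance between features of paired
    samples, used here as the inter-domain difference value."""
    f_a, f_b = x_a @ W, x_b @ W
    return float(np.mean(np.sum((f_a - f_b) ** 2, axis=1)))

lr = 0.01
losses = [inter_domain_loss(W)]
for _ in range(50):                            # iterative training
    diff = (x_a - x_b) @ W                     # f_a - f_b
    grad = 2.0 * (x_a - x_b).T @ diff / len(x_a)   # analytic gradient wrt W
    W -= lr * grad
    losses.append(inter_domain_loss(W))
```

Minimising this difference drives the extractor toward domain-invariant features; the patented method does the same through the diffusion/denoising losses rather than a direct feature match.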
Optionally, in another embodiment of the method according to the present application, the iteratively training the initial feature extractor based on at least one noise feature and at least one noise cancellation feature includes:
iteratively training the initial feature extractor based on at least one noise feature corresponding to a first source domain and based on at least one noise cancellation feature corresponding to a second source domain;
wherein the first source domain and the second source domain belong to different source domains.
Optionally, in another embodiment of the method according to the present application, the iteratively training the initial feature extractor based on at least one noise feature and at least one noise cancellation feature includes:
respectively calculating an inter-domain difference value between one noise feature and one noise elimination feature, wherein the inter-domain difference value is an offset distance value between the noise feature and the noise elimination feature;
and if the number of the inter-domain difference values is multiple, taking the average value of the inter-domain difference values as a loss function, and performing iterative training on the initial feature extractor.
Alternatively, in another embodiment based on the above method of the present application, the loss function is obtained by the following formula:
L = ‖z_t^A − ẑ_t^B‖²

wherein L is the loss function, t is the subscript (time step) of the noise feature, z_t^A is the noise feature, and ẑ_t^B is the noise elimination feature.
Optionally, in another embodiment of the method according to the present application, adding noise data to the plurality of initial features to obtain a plurality of noise features includes:
and gradually adding Gaussian noise data to each initial feature to obtain a plurality of noise features with different variances, wherein the size of one Gaussian noise data is the same as the size of the corresponding initial feature.
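A sketch of the gradual Gaussian noise addition, assuming a standard linear variance schedule from denoising diffusion models (the schedule itself is not specified here, so the values below are illustrative assumptions):

```python
import numpy as np

def make_alpha_bar(T=100, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule,
    as used in standard denoising diffusion models."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def add_noise(z0, t, alpha_bar, rng):
    """Noise feature z_t = sqrt(abar_t)*z0 + sqrt(1 - abar_t)*eps,
    where eps is Gaussian noise of the same size as the initial
    feature z0; larger t gives larger noise variance."""
    eps = rng.normal(size=z0.shape)
    zt = np.sqrt(alpha_bar[t]) * z0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return zt, eps

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
z0 = rng.normal(size=(4, 64))                                   # initial image features
z_small, _ = add_noise(z0, t=5, alpha_bar=alpha_bar, rng=rng)   # low-variance noise feature
z_large, _ = add_noise(z0, t=95, alpha_bar=alpha_bar, rng=rng)  # high-variance noise feature
```

Sampling several values of t for each initial feature yields the "plurality of noise features with different variances" mentioned above.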
By applying the above technical scheme, the inter-domain difference values between images of different domains can be measured in advance by means of adding and removing noise, and domain generalization training can be performed on the feature extractor using these inter-domain difference values, so as to obtain an image classification model whose feature extractor has stronger robustness. This further avoids the problem in the related art that the classification performance of a conventional image classification model drops sharply when a domain shift scenario is encountered.
Optionally, in another embodiment of the present application, as shown in fig. 5, the present application further provides an image feature classification apparatus. Comprising the following steps:
an acquisition module 201 configured to acquire an image to be processed;
an extraction module 202 configured to input the image to be processed into an image classification model, and extract image features of the image to be processed by using a target feature extractor in the image classification model, wherein the feature extractor is a feature extractor obtained by performing domain generalization training on inter-domain difference values in sample images of a plurality of different source domains;
the classification module 203 is configured to perform classification processing on the image to be processed based on the image features of the image to be processed and a classifier in the image classification model.
By applying the above technical scheme, the inter-domain difference values between images of different domains can be measured in advance by means of adding and removing noise, and domain generalization training can be performed on the feature extractor using these inter-domain difference values, so as to obtain an image classification model whose feature extractor has stronger robustness. This further avoids the problem in the related art that the classification performance of a conventional image classification model drops sharply when a domain shift scenario is encountered.
In another embodiment of the present application, the extraction module 202 is configured to:
acquiring sample images of a plurality of source domains aiming at the same object and an initial image classification model, wherein the initial image classification model comprises an initial feature extractor and a denoising model;
extracting initial features of a plurality of sample images by using the initial feature extractor, and adding noise data to the plurality of initial features to obtain a plurality of noise features;
denoising the plurality of noise features by using the denoising model to obtain a plurality of denoising features;
and performing iterative training on the initial feature extractor based on at least one noise feature and at least one noise cancellation feature until the image classification model containing the target feature extractor is obtained.
In another embodiment of the present application, the extraction module 202 is configured to:
iteratively training the initial feature extractor based on at least one noise feature corresponding to a first source domain and based on at least one noise cancellation feature corresponding to a second source domain;
wherein the first source domain and the second source domain belong to different source domains.
In another embodiment of the present application, the extraction module 202 is configured to:
respectively calculating an inter-domain difference value between one noise feature and one noise elimination feature, wherein the inter-domain difference value is an offset distance value between the noise feature and the noise elimination feature;
and if the number of the inter-domain difference values is multiple, taking the average value of the inter-domain difference values as a loss function, and performing iterative training on the initial feature extractor.
In another embodiment of the present application, the extraction module 202 is configured to:
the loss function is obtained by the following formula:
Figure BDA0004134091780000141
wherein the said
Figure BDA0004134091780000142
For the loss function, t is the sign of the noise signature, the +.>
Figure BDA0004134091780000143
For the noise characteristics, the +.>
Figure BDA0004134091780000144
Is the noise cancellation feature.
In another embodiment of the present application, the extraction module 202 is configured to:
and gradually adding Gaussian noise data to each initial feature to obtain a plurality of noise features with different variances, wherein the size of one Gaussian noise data is the same as the size of the corresponding initial feature.
Fig. 6 is a block diagram of a logic structure of an electronic device, according to an example embodiment. The electronic device 300 may, for example, be a device that performs the image feature classification method described above.
In an exemplary embodiment, there is also provided a non-transitory computer readable storage medium including instructions, such as a memory including instructions, executable by an electronic device processor to perform a method of feature classification of an image as described above, the method comprising: acquiring an image to be processed; inputting the image to be processed into an image classification model, and extracting image features of the image to be processed by utilizing a target feature extractor in the image classification model, wherein the feature extractor is a feature extractor obtained by performing domain generalization training on inter-domain difference values in sample images of a plurality of different source domains; and classifying the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model. Optionally, the above instructions may also be executed by a processor of the electronic device to perform the other steps involved in the above-described exemplary embodiments. For example, the non-transitory computer readable storage medium may be ROM, random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
In an exemplary embodiment, there is also provided an application/computer program product comprising one or more instructions executable by a processor of an electronic device to perform a method of classifying features of an image as described above, the method comprising: acquiring an image to be processed; inputting the image to be processed into an image classification model, and extracting image features of the image to be processed by utilizing a target feature extractor in the image classification model, wherein the feature extractor is a feature extractor obtained by performing domain generalization training on inter-domain difference values in sample images of a plurality of different source domains; and classifying the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model. Optionally, the above instructions may also be executed by a processor of the electronic device to perform the other steps involved in the above-described exemplary embodiments.
Fig. 6 is an example diagram of an electronic device 300. It will be appreciated by those skilled in the art that Fig. 6 is merely an example of the electronic device 300 and is not meant to be limiting; the electronic device 300 may include more or fewer components than shown, combine certain components, or use different components, e.g., it may also include input-output devices, network access devices, buses, etc.
The processor 302 may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor 302 may be any conventional processor or the like, the processor 302 being a control center of the electronic device 300, with various interfaces and lines connecting the various parts of the overall electronic device 300.
The memory 301 may be used to store computer readable instructions 303 and the processor 302 implements the various functions of the electronic device 300 by executing or executing computer readable instructions or modules stored in the memory 301 and invoking data stored in the memory 301. The memory 301 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data created according to the use of the electronic device 300, and the like. In addition, the Memory 301 may include a hard disk, a Memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card), at least one magnetic disk storage device, a Flash Memory device, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or other nonvolatile/volatile storage device.
The modules integrated with the electronic device 300 may be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product. Based on such understanding, the present invention may implement all or part of the flow of the method of the above-described embodiments, or may be implemented by means of computer readable instructions to instruct related hardware, where the computer readable instructions may be stored in a computer readable storage medium, where the computer readable instructions, when executed by a processor, implement the steps of the method embodiments described above.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (9)

1. A method for classifying features of an image, comprising:
acquiring an image to be processed;
inputting the image to be processed into an image classification model, and extracting image features of the image to be processed by utilizing a target feature extractor in the image classification model, wherein the feature extractor is a feature extractor obtained by performing domain generalization training on inter-domain difference values in sample images of a plurality of different source domains;
and classifying the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model.
2. The method of claim 1, further comprising, prior to said acquiring the image to be processed:
acquiring sample images of a plurality of source domains aiming at the same object and an initial image classification model, wherein the initial image classification model comprises an initial feature extractor and a denoising model;
extracting initial features of a plurality of sample images by using the initial feature extractor, and adding noise data to the plurality of initial features to obtain a plurality of noise features;
denoising the plurality of noise features by using the denoising model to obtain a plurality of denoising features;
and performing iterative training on the initial feature extractor based on at least one noise feature and at least one noise cancellation feature until the image classification model containing the target feature extractor is obtained.
3. The method of claim 2, wherein the iteratively training the initial feature extractor based on at least one noise feature and at least one noise cancellation feature comprises:
iteratively training the initial feature extractor based on at least one noise feature corresponding to a first source domain and based on at least one noise cancellation feature corresponding to a second source domain;
wherein the first source domain and the second source domain belong to different source domains.
4. A method according to claim 2 or 3, wherein said iteratively training said initial feature extractor based on at least one noise feature and at least one noise cancellation feature comprises:
respectively calculating an inter-domain difference value between one noise feature and one noise elimination feature, wherein the inter-domain difference value is an offset distance value between the noise feature and the noise elimination feature;
and if the number of the inter-domain difference values is multiple, taking the average value of the inter-domain difference values as a loss function, and performing iterative training on the initial feature extractor.
5. The method of claim 4, wherein the loss function is obtained by the following formula:
L = ‖z_t^A − ẑ_t^B‖²

wherein L is the loss function, t is the subscript (time step) of the noise feature, z_t^A is the noise feature, and ẑ_t^B is the noise elimination feature.
6. The method of claim 2, wherein adding noise data to the plurality of initial features results in a plurality of noise features, comprising:
and gradually adding Gaussian noise data to each initial feature to obtain a plurality of noise features with different variances, wherein the size of one Gaussian noise data is the same as the size of the corresponding initial feature.
7. An image feature classification apparatus, comprising:
an acquisition module configured to acquire an image to be processed;
the extraction module is configured to input the image to be processed into an image classification model, and extract image features of the image to be processed by utilizing a target feature extractor in the image classification model, wherein the feature extractor is a feature extractor obtained by performing domain generalization training on inter-domain difference values in sample images of a plurality of different source domains;
and the classification module is configured to classify the image to be processed based on the image characteristics of the image to be processed and a classifier in the image classification model.
8. An electronic device, comprising:
a memory for storing executable instructions; the method comprises the steps of,
a processor for executing the executable instructions with the memory to perform the operations of the feature classification method of an image of any of claims 1-6.
9. A computer readable storage medium storing computer readable instructions, wherein the instructions when executed perform the operations of the feature classification method of an image of any of claims 1-6.
CN202310269093.7A 2023-03-15 2023-03-15 Image feature classification method and device, electronic equipment and medium Pending CN116363421A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310269093.7A CN116363421A (en) 2023-03-15 2023-03-15 Image feature classification method and device, electronic equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310269093.7A CN116363421A (en) 2023-03-15 2023-03-15 Image feature classification method and device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN116363421A true CN116363421A (en) 2023-06-30

Family

ID=86935670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310269093.7A Pending CN116363421A (en) 2023-03-15 2023-03-15 Image feature classification method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN116363421A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210182687A1 (en) * 2019-12-12 2021-06-17 Samsung Electronics Co., Ltd. Apparatus and method with neural network implementation of domain adaptation
CN113378981A (en) * 2021-07-02 2021-09-10 湖南大学 Noise scene image classification method and system based on domain adaptation
WO2021258967A1 (en) * 2020-06-24 2021-12-30 华为技术有限公司 Neural network training method and device, and data acquisition method and device
JP2022164597A (en) * 2021-04-16 2022-10-27 富士通株式会社 Method of domain adaptation applied to image segmentation, apparatus, and storage medium
CN115272880A (en) * 2022-07-29 2022-11-01 大连理工大学 Multimode remote sensing target recognition method based on metric learning
CN115578248A (en) * 2022-11-28 2023-01-06 南京理工大学 Generalized enhanced image classification algorithm based on style guidance
CN115731424A (en) * 2022-12-03 2023-03-03 北京邮电大学 Image classification model training method and system based on enhanced federal domain generalization

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210182687A1 (en) * 2019-12-12 2021-06-17 Samsung Electronics Co., Ltd. Apparatus and method with neural network implementation of domain adaptation
WO2021258967A1 (en) * 2020-06-24 2021-12-30 华为技术有限公司 Neural network training method and device, and data acquisition method and device
JP2022164597A (en) * 2021-04-16 2022-10-27 富士通株式会社 Method of domain adaptation applied to image segmentation, apparatus, and storage medium
CN113378981A (en) * 2021-07-02 2021-09-10 湖南大学 Noise scene image classification method and system based on domain adaptation
CN115272880A (en) * 2022-07-29 2022-11-01 大连理工大学 Multimode remote sensing target recognition method based on metric learning
CN115578248A (en) * 2022-11-28 2023-01-06 南京理工大学 Generalized enhanced image classification algorithm based on style guidance
CN115731424A (en) * 2022-12-03 2023-03-03 北京邮电大学 Image classification model training method and system based on enhanced federal domain generalization

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
HO J et al.: "Denoising diffusion probabilistic models", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS *
LI P et al.: "A simple feature augmentation for domain generalization", PROCEEDINGS OF THE IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION *
SONG J et al.: "Denoising diffusion implicit models", ARXIV PREPRINT ARXIV:2010.02502 *
WANG J et al.: "Domain Generalization via Frequency-domain-based Feature Disentanglement and Interaction", PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, pages 4821 - 4829 *
XU Hai et al.: "Visual domain generalization techniques and research progress", Journal of Guangzhou University (Natural Science Edition) *
TAO Yang et al.: "Blind domain-adaptive classifier based on feature enhancement of a reconstruction classification network", Information & Communication, no. 06 *

Similar Documents

Publication Publication Date Title
WO2020048271A1 (en) Gan network-based vehicle damage image enhancement method and apparatus
CN108229531B (en) Object feature extraction method and device, storage medium and electronic equipment
CN109344762B (en) Image processing method and device
CN110599387A (en) Method and device for automatically removing image watermark
US11281939B2 (en) Method and apparatus for training an object identification neural network, and computer device
CN111767906B (en) Face detection model training method, face detection device and electronic equipment
Kim et al. Defocus and motion blur detection with deep contextual features
CN110349161B (en) Image segmentation method, image segmentation device, electronic equipment and storage medium
US9779488B2 (en) Information processing device, image processing method and medium
CN112613508A (en) Object identification method, device and equipment
CN114943649A (en) Image deblurring method, device and computer readable storage medium
CN113158773B (en) Training method and training device for living body detection model
CN108289176B (en) Photographing question searching method, question searching device and terminal equipment
CN111222446B (en) Face recognition method, face recognition device and mobile terminal
CN116109878B (en) Image reproduction identification method, system, device and storage medium
CN111340722B (en) Image processing method, processing device, terminal equipment and readable storage medium
CN110349108B (en) Method, apparatus, electronic device, and storage medium for processing image
CN110210425B (en) Face recognition method and device, electronic equipment and storage medium
CN116363421A (en) Image feature classification method and device, electronic equipment and medium
CN116152079A (en) Image processing method and image processing model training method
CN113326893A (en) Training and recognition method and device of license plate recognition model and electronic equipment
CN114119377A (en) Image processing method and device
CN113658050A (en) Image denoising method, denoising device, mobile terminal and storage medium
CN113920493B (en) Method, device, equipment and storage medium for detecting lost articles
CN116188511B (en) Method and device for optimizing human labels based on edge detection

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination