CN118334410A - Cross-domain image classification method and system based on self-adaptive optimal transmission
- Publication number
- CN118334410A (CN202410342453.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- source
- classification
- target
- domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Landscapes
- Image Processing (AREA)
Abstract
The invention discloses a cross-domain image classification method and system based on adaptive optimal transmission. The method comprises the following steps: acquiring original images collected in a source domain and a target domain, applying image preprocessing and data enhancement, and extracting a feature embedding of each enhanced image with an image feature extractor; classifying each image from its feature embedding with an image classifier to obtain a predicted classification label; solving for the adaptive optimal transmission distance between the source domain image set and the target domain image set, which serves as the degree of difference between the two sets; calculating the image classification loss of the image classifier on the source domain images; constructing an objective function and training iteratively to obtain a trained image feature extractor and image classifier; and, with the trained networks, extracting feature embeddings of target domain images, classifying them, and outputting classification labels. The method and system can effectively improve the robustness and generalization of cross-domain image classification.
Description
Technical Field
The invention relates to the technical field of cross-domain image classification, in particular to a cross-domain image classification method and system based on self-adaptive optimal transmission.
Background
Cross-domain image classification addresses the problem of quickly migrating an image classification system from an existing environment to a new one. With the rapid growth in the volume of image data, image classification is becoming increasingly important in many application fields. Traditional image classification requires a large number of labeled image samples and also requires that samples in the source domain and the target domain be independent and identically distributed in order to perform well. In practice, however, many fields have plenty of unlabeled image data but only a small amount of labeled image data, and labeling a large number of image samples costs so much manpower and material resources that it is often infeasible. Because image data in different domains come from different sources, there is always some difference in feature distribution or feature space between domains. For example, image acquisition equipment and acquisition conditions differ; images shot indoors and outdoors, in different scenes, under different illumination, or from different angles differ; and differences in resolution, expression, action and the like can also change the feature distribution. The goal of cross-domain image classification is to quickly migrate an image classification system trained on a large number of annotated images in an existing environment (the source domain) to a new environment (the target domain).
At present, cross-domain image classification based on optimal transmission theory is one of the most promising research directions in the field. Optimal transmission theory studies the degree of difference between two probability distributions (for example, between a source domain image set and a target domain image set), which is one of the core problems that cross-domain image classification must solve. However, the classical optimal transmission problem imposes a mass conservation constraint on the probability mass, which severely restricts the performance of cross-domain image classification systems. In cross-domain image classification scenarios, small samples, outliers, noise, long-tail effects, and data that violate the independent and identically distributed assumption are common. In these scenarios the mass conservation constraint distorts the transmission map and seriously harms the robustness and generalization of the system. In deep learning in particular, the mini-batch sampling training paradigm aggravates the problem, so existing cross-domain image classification techniques based on classical optimal transmission theory cannot meet the accuracy and robustness requirements of many practical image classification applications.
Disclosure of Invention
In order to overcome the defects and shortcomings in the prior art, the invention provides a cross-domain image classification method and system based on self-adaptive optimal transmission.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
A cross-domain image classification method based on self-adaptive optimal transmission comprises the following steps:
acquiring original images collected in a source domain and a target domain, wherein the source domain images have classification labels and the target domain images do not;
performing image preprocessing on the original images collected in the source domain and the target domain;
performing data enhancement on the preprocessed images;
constructing an image feature extractor based on a deep convolutional network, and extracting features from the enhanced images with the image feature extractor to obtain feature embeddings of the images;
constructing an image classifier based on a fully connected network, and classifying the images from their feature embeddings with the image classifier to obtain predicted classification labels;
based on the feature embeddings and predicted classification labels of the images, solving the adaptive optimal transmission distance problem between the source domain image set and the target domain image set to obtain the adaptive optimal transmission distance between the two sets, which serves as the degree of difference between them;
calculating the image classification loss of the image classifier on the source domain images based on the predicted classification labels;
constructing an objective function based on the image classification loss and the degree of difference between the source domain image set and the target domain image set, backpropagating gradients of the objective function through the image feature extractor and the image classifier, updating the network parameters, and training iteratively to obtain a trained image feature extractor and image classifier;
acquiring a target domain image, extracting its feature embedding with the trained image feature extractor, classifying the feature embedding with the trained image classifier, and outputting the classification label of the image.
As a preferred technical scheme, the image feature extractor is constructed based on a deep convolutional network; specifically, it is constructed using the deep residual network ResNet.
As a preferred technical scheme, the image classifier is constructed based on a fully connected network; specifically, it is constructed using a single-layer or multi-layer fully connected network followed by a softmax layer.
As a preferred technical scheme, the adaptive optimal transmission distance problem between the source domain image set and the target domain image set is solved to obtain the adaptive optimal transmission distance between the two sets, which is specifically expressed as follows:
$$C_{ij} = \|g(x_i) - g(z_j)\| - \alpha\, y_i \tanh\big(f(g(z_j))\big)$$
$$\Gamma_{\le}(\mu, \nu) = \{\pi \in P(X \times Z) \mid \pi 1_m \le \mu,\ \pi^{T} 1_n \le \nu\}$$
Wherein $X = \{x_i\}$ represents the source domain image set, $Z = \{z_j\}$ represents the target domain image set, $W(X, Z)$ represents the adaptive optimal transmission distance between the source domain image set and the target domain image set, $\Gamma_{\le}(\mu, \nu)$ represents the transmission mapping space, $\pi$ represents a transmission map, $\pi_{ij}$ represents the probability mass transmitted from source domain image $x_i$ to target domain image $z_j$, $C_{ij}$ represents the cost of transmitting a unit of probability mass from source domain image $x_i$ to target domain image $z_j$, $\mu$ represents the probability measure of the source domain image set, $\nu$ represents the probability measure of the target domain image set, $\delta_{x_i}$ represents the Dirac function at source domain image $x_i$, $\delta_{z_j}$ represents the Dirac function at target domain image $z_j$, $p_i$ represents the probability mass of measure $\mu$ at source domain image $x_i$, $q_j$ represents the probability mass of measure $\nu$ at target domain image $z_j$, $P$ represents the probability measure space, $1_m$ represents the m-dimensional all-ones vector, $1_n$ represents the n-dimensional all-ones vector, $g(x_i)$ represents the feature embedding to which the image feature extractor maps source domain image $x_i$, $g(z_j)$ represents the feature embedding to which the image feature extractor maps target domain image $z_j$, $f(\cdot)$ represents the image classifier, $\alpha$ is a non-negative coefficient, and $\tanh$ represents the hyperbolic tangent function.
As a preferred technical scheme, the image classification loss of the image classifier on the source domain images is calculated, specifically expressed as:
where $g(\cdot)$ represents the image feature extractor, $f(\cdot)$ represents the image classifier, $x_i$ represents a source domain image, and $y_i$ represents its classification label.
The invention also provides a cross-domain image classification system based on adaptive optimal transmission, which comprises: an original image acquisition module, an image preprocessing module, an image enhancement module, an image feature extractor, an image classifier, an adaptive optimal transmission distance calculation module, an image classification loss calculation module, an objective function construction module, a training module, and a classification result output module;
the original image acquisition module is used for acquiring original images collected in a source domain and a target domain, wherein the source domain images have classification labels and the target domain images do not;
the image preprocessing module is used for preprocessing the original images collected in the source domain and the target domain;
the image enhancement module is used for performing data enhancement on the preprocessed images;
the image feature extractor is constructed based on a deep convolutional network and is used for extracting features from the enhanced images to obtain feature embeddings of the images;
the image classifier is constructed based on a fully connected network and is used for classifying the images from their feature embeddings to obtain predicted classification labels;
the adaptive optimal transmission distance calculation module is used for solving, based on the feature embeddings and predicted classification labels of the images, the adaptive optimal transmission distance problem between the source domain image set and the target domain image set, so as to obtain the adaptive optimal transmission distance between the two sets, which serves as the degree of difference between them;
the image classification loss calculation module is used for calculating the image classification loss of the image classifier on the source domain images based on the predicted classification labels;
the objective function construction module is used for constructing an objective function based on the image classification loss and the degree of difference between the source domain image set and the target domain image set;
the training module is used for backpropagating gradients of the objective function through the image feature extractor and the image classifier, updating the network parameters, and training iteratively to obtain a trained image feature extractor and image classifier;
the classification result output module is used for acquiring a target domain image, extracting its feature embedding with the trained image feature extractor, classifying the feature embedding with the trained image classifier, and outputting the classification label of the image.
As a preferred technical scheme, the image feature extractor is constructed based on a deep convolutional network; specifically, it is constructed using the deep residual network ResNet.
As a preferred technical scheme, the image classifier is constructed based on a fully connected network; specifically, it is constructed using a single-layer or multi-layer fully connected network followed by a softmax layer.
As a preferred technical scheme, the adaptive optimal transmission distance problem between the source domain image set and the target domain image set is solved to obtain the adaptive optimal transmission distance between the two sets, which is specifically expressed as follows:
$$C_{ij} = \|g(x_i) - g(z_j)\| - \alpha\, y_i \tanh\big(f(g(z_j))\big)$$
$$\Gamma_{\le}(\mu, \nu) = \{\pi \in P(X \times Z) \mid \pi 1_m \le \mu,\ \pi^{T} 1_n \le \nu\}$$
Wherein $X = \{x_i\}$ represents the source domain image set, $Z = \{z_j\}$ represents the target domain image set, $W(X, Z)$ represents the adaptive optimal transmission distance between the source domain image set and the target domain image set, $\Gamma_{\le}(\mu, \nu)$ represents the transmission mapping space, $\pi$ represents a transmission map, $\pi_{ij}$ represents the probability mass transmitted from source domain image $x_i$ to target domain image $z_j$, $C_{ij}$ represents the cost of transmitting a unit of probability mass from source domain image $x_i$ to target domain image $z_j$, $\mu$ represents the probability measure of the source domain image set, $\nu$ represents the probability measure of the target domain image set, $\delta_{x_i}$ represents the Dirac function at source domain image $x_i$, $\delta_{z_j}$ represents the Dirac function at target domain image $z_j$, $p_i$ represents the probability mass of measure $\mu$ at source domain image $x_i$, $q_j$ represents the probability mass of measure $\nu$ at target domain image $z_j$, $P$ represents the probability measure space, $1_m$ represents the m-dimensional all-ones vector, $1_n$ represents the n-dimensional all-ones vector, $g(x_i)$ represents the feature embedding to which the image feature extractor maps source domain image $x_i$, $g(z_j)$ represents the feature embedding to which the image feature extractor maps target domain image $z_j$, $f(\cdot)$ represents the image classifier, $\alpha$ is a non-negative coefficient, and $\tanh$ represents the hyperbolic tangent function.
As a preferred technical scheme, the image classification loss calculation module is used for calculating, based on the predicted classification labels of the images, the image classification loss of the image classifier on the source domain images, specifically expressed as:
where $g(\cdot)$ represents the image feature extractor, $f(\cdot)$ represents the image classifier, $x_i$ represents a source domain image, and $y_i$ represents its classification label.
Compared with the prior art, the invention has the following advantages and beneficial effects:
The invention measures the difference between the source domain image set and the target domain image set with adaptive optimal transmission and performs cross-domain image classification on that basis. This overcomes the limitations of classical optimal transmission theory, effectively improves the robustness and generalization of cross-domain image classification, and meets the accuracy and robustness requirements of cross-domain image classification.
Drawings
Fig. 1 is a flow chart of a cross-domain image classification method based on adaptive optimal transmission.
Detailed Description
The present invention is described in further detail below with reference to the drawings and examples, so that its objects, technical solutions and advantages become clearer. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
As shown in fig. 1, the present embodiment provides a cross-domain image classification method based on adaptive optimal transmission, which includes the following steps:
S1: Image acquisition: original images collected from an old environment (the source domain) and a new environment (the target domain) are acquired. Source domain images have classification labels (taking a pedestrian recognition task as an example, the label is 'yes' if the image contains a pedestrian and 'no' otherwise); target domain images have no classification labels;
In this embodiment, the source domain is, for example, a sunny-day automatic driving scene with a large number of manually labeled images, and the target domain is, for example, a snowy-day automatic driving scene whose images lack labels or have not been manually labeled;
S2: Image preprocessing: the original images are preprocessed with techniques such as cropping and scaling to remove noise information and to make the dimensions of the images collected from the old and new environments consistent;
In this embodiment, the original images are automatic driving images collected in sunny and snowy scenes;
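The cropping and scaling of step S2 can be sketched as follows. This is a minimal illustration only: the function names, the centre-crop choice, and nearest-neighbour scaling are assumptions for the example, and images are modelled as plain 2-D lists of pixel values rather than any particular image library type.

```python
def center_crop(img, size):
    """Crop a size x size window from the centre of a 2-D list image."""
    h, w = len(img), len(img[0])
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]


def resize_nearest(img, out_h, out_w):
    """Scale an image to out_h x out_w with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // out_h][c * w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def preprocess(img, crop, out):
    """Crop then scale, so source and target images share one size."""
    return resize_nearest(center_crop(img, crop), out, out)


# Hypothetical 6 x 8 image whose pixel value encodes its position.
image = [[r * 10 + c for c in range(8)] for r in range(6)]
print(preprocess(image, crop=4, out=2))
```

Applying the same `preprocess` call to both domains is what keeps the image dimensions consistent.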
S3: Data enhancement: to address the scarcity of image samples and improve the generalization performance of the image classification system, data enhancement is applied to the preprocessed images, including but not limited to random cropping, horizontal or vertical flipping, and changing illumination conditions;
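A minimal sketch of the enhancement operations listed in step S3 is given below. The function names, the 0.5 flip probability, and the brightness range of 0.8 to 1.2 are illustrative assumptions, not values from this embodiment; images are again modelled as 2-D lists of 8-bit pixel values.

```python
import random


def horizontal_flip(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def vertical_flip(img):
    """Reverse the row order (top to bottom)."""
    return [row[:] for row in img[::-1]]


def random_crop(img, size, rng):
    """Cut a size x size window at a random position."""
    h, w = len(img), len(img[0])
    top = rng.randint(0, h - size)
    left = rng.randint(0, w - size)
    return [row[left:left + size] for row in img[top:top + size]]


def adjust_brightness(img, factor):
    """Scale pixel values to imitate an illumination change, clipped to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def augment(img, size, rng):
    """Apply a random crop, random flips, and a random brightness change."""
    out = random_crop(img, size, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    if rng.random() < 0.5:
        out = vertical_flip(out)
    return adjust_brightness(out, rng.uniform(0.8, 1.2))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible between runs.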
S4: Feature extraction: a deep convolutional network is used as the image feature extractor to extract semantic features from the enhanced images and obtain their feature embeddings;
In this embodiment, the image feature extractor is implemented with a deep convolutional network, including but not limited to the deep residual network ResNet. Taking ResNet50 as an example: the network has five convolutional stages and an average pooling layer. For a 224×224 color input image, the first convolutional layer conv1 has 64 kernels of size 7×7 and a stride of 2, and outputs a 112×112 feature map with 64 channels; a 3×3 max pooling layer then downsamples this to 56×56 with 64 channels; four groups of residual blocks are then stacked, producing a 7×7 output with 2048 channels; finally, an average pooling layer yields the feature embedding of the image.
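The feature-map sizes given for ResNet50 can be checked with the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The sketch below is illustrative only: the padding values, the pooling stride, the per-group strides, and the intermediate channel counts follow the commonly used ResNet50 configuration and are assumptions, since this embodiment states only the kernel sizes, the conv1 stride, and the resulting sizes.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet50_shapes(size=224):
    """Trace (layer, spatial size, channels) through a ResNet50-style network."""
    shapes = []
    size = conv_out(size, kernel=7, stride=2, padding=3)   # conv1
    shapes.append(("conv1", size, 64))
    size = conv_out(size, kernel=3, stride=2, padding=1)   # 3x3 max pooling
    shapes.append(("maxpool", size, 64))
    # Four residual groups; the last three are assumed to halve the resolution.
    groups = [("group1", 1, 256), ("group2", 2, 512),
              ("group3", 2, 1024), ("group4", 2, 2048)]
    for name, stride, channels in groups:
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        shapes.append((name, size, channels))
    shapes.append(("avgpool", 1, 2048))                     # global average pooling
    return shapes


for layer in resnet50_shapes():
    print(layer)
```

Under these assumptions the trace reproduces the 112×112, 56×56, and 7×7 sizes stated in the embodiment, and the final embedding has 2048 dimensions.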
S5: Cross-domain image classification: based on the feature embedding of each image, a fully connected network is used as the image classifier to classify the image and obtain its predicted classification label (taking a pedestrian recognition task as an example, the label is 'yes' if the image contains a pedestrian and 'no' otherwise);
In this embodiment, the image classifier is implemented with a single-layer or multi-layer fully connected network followed by a softmax layer;
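A single fully connected layer followed by a softmax layer can be sketched as below. The weights, bias, and two-dimensional embedding are made-up toy values for illustration; a real classifier would take the extractor's embedding as input and learn its parameters during training.

```python
import math


def linear(x, weights, bias):
    """Fully connected layer: one output score per row of the weight matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding, weights, bias):
    """Return class probabilities and the index of the most probable class."""
    probs = softmax(linear(embedding, weights, bias))
    label = max(range(len(probs)), key=probs.__getitem__)
    return probs, label


# Toy example: class 0 = 'no pedestrian', class 1 = 'pedestrian' (hypothetical).
probs, label = classify([1.0, 2.0], weights=[[1.0, 0.0], [0.0, 1.0]], bias=[0.0, 0.0])
print(probs, label)
```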
S6: Calculating the degree of difference between the source domain image set and the target domain image set: based on the feature embeddings and predicted classification labels of the images, the degree of difference between the source domain image set and the target domain image set is calculated with an adaptive optimal transmission model, specifically as follows:
The degree-of-difference problem between the source domain image set and the target domain image set is modeled as solving an adaptive optimal transmission problem between the image sets, as follows:
Let the source domain image set $X = \{x_i\}$ follow the probability measure $\mu = \sum_i p_i \delta_{x_i}$ and the target domain image set $Z = \{z_j\}$ follow the probability measure $\nu = \sum_j q_j \delta_{z_j}$, where $\delta$ is the Dirac function, $\mu$ has probability mass $p_i$ at image $x_i$, and $\nu$ has probability mass $q_j$ at image $z_j$; $p_i$ and $q_j$ belong to the probability simplex, i.e. $p_i \ge 0$, $\sum_i p_i = 1$ and $q_j \ge 0$, $\sum_j q_j = 1$. The source domain image set has classification labels, with classification label set $\{y_i\}$, while the target domain images have no classification labels. $C$ is the transmission cost function, with $C_{ij}$ representing the cost of transmitting a unit of probability mass from source domain image $x_i$ to target domain image $z_j$. The transmission map is represented by a joint probability $\pi$, where $\pi_{ij}$ represents the probability mass transmitted from source domain image $x_i$ to target domain image $z_j$. $P$ represents the probability measure space, $1_m$ represents the m-dimensional all-ones vector, and $1_n$ represents the n-dimensional all-ones vector. The degree of difference between the source domain image set and the target domain image set is converted into the problem of solving the adaptive optimal transmission distance between the image sets, which is specifically expressed as follows:
Wherein the transmission mapping space is:
$$\Gamma_{\le}(\mu, \nu) = \{\pi \in P(X \times Z) \mid \pi 1_m \le \mu,\ \pi^{T} 1_n \le \nu\}$$
After the adaptive optimal transmission problem between the image sets is solved, the adaptive optimal transmission distance $W(X, Z)$ between the image sets is obtained and taken as the degree of difference between the source domain image set and the target domain image set.
Specifically, taking the automatic driving sunny-scene and snowy-scene image sets as an example, this embodiment uses the adaptive optimal transmission model to calculate the degree of difference between them;
Let the automatic driving sunny-scene image set $X = \{x_i\}$ follow the probability measure $\mu = \sum_i p_i \delta_{x_i}$ and the automatic driving snowy-scene image set $Z = \{z_j\}$ follow the probability measure $\nu = \sum_j q_j \delta_{z_j}$, where $\delta$ is the Dirac function, $\mu$ has probability mass $p_i$ at image $x_i$, and $\nu$ has probability mass $q_j$ at image $z_j$; $p_i$ and $q_j$ satisfy $p_i \ge 0$, $\sum_i p_i = 1$ and $q_j \ge 0$, $\sum_j q_j = 1$. The automatic driving sunny-scene image set has classification labels, with classification label set $\{y_i\}$, while the automatic driving snowy-scene image set has no classification labels. $C$ is the transmission cost function, with $C_{ij}$ representing the cost of transmitting a unit of probability mass from sunny-scene image $x_i$ to snowy-scene image $z_j$. The transmission map is represented by a joint probability $\pi$, where $\pi_{ij}$ represents the probability mass transmitted from sunny-scene image $x_i$ to snowy-scene image $z_j$. $P$ represents the probability measure space, $1_m$ represents the m-dimensional all-ones vector, and $1_n$ represents the n-dimensional all-ones vector. The deep network comprises an image feature extractor $g(\cdot)$ that maps images to the feature space and a classifier $f(\cdot)$ that maps feature embeddings to the classification label space. The following formula is used to calculate the cost of transmitting a unit of probability mass from sunny-scene image $x_i$ to snowy-scene image $z_j$:
$$C_{ij} = \|g(x_i) - g(z_j)\| - \alpha\, y_i \tanh\big(f(g(z_j))\big).$$
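The cost formula can be evaluated on toy data as in the sketch below. Several points are assumptions made for the example rather than statements of this embodiment: the norm is taken as Euclidean, $y_i$ is treated as a one-hot vector, $f(g(z_j))$ is treated as a vector of class scores, and the product of $y_i$ with the tanh term is read as an inner product. The function and argument names are hypothetical.

```python
import math


def cost_matrix(src_feats, src_labels, tgt_feats, tgt_scores, alpha):
    """C[i][j] = ||g(x_i) - g(z_j)|| - alpha * <y_i, tanh(f(g(z_j)))>.

    src_feats  : embeddings g(x_i) of source images
    src_labels : one-hot labels y_i of source images (assumed encoding)
    tgt_feats  : embeddings g(z_j) of target images
    tgt_scores : classifier outputs f(g(z_j)) for target images
    """
    costs = []
    for gx, y in zip(src_feats, src_labels):
        row = []
        for gz, fz in zip(tgt_feats, tgt_scores):
            # Feature-space term: Euclidean distance between embeddings.
            feature_term = math.sqrt(sum((a - b) ** 2 for a, b in zip(gx, gz)))
            # Label-space term: agreement between y_i and the squashed scores.
            label_term = sum(yk * math.tanh(sk) for yk, sk in zip(y, fz))
            row.append(feature_term - alpha * label_term)
        costs.append(row)
    return costs


C = cost_matrix([[0.0, 0.0]], [[1, 0]], [[3.0, 4.0]], [[2.0, 0.0]], alpha=0.5)
print(C)
```

With alpha set to 0 the cost reduces to the pure feature distance; a larger alpha lowers the cost of pairs whose predicted target class agrees with the source label.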
The cost function $C$ is constructed to align the images in the feature space and the label space simultaneously, because aligning in only the feature space or only the label space is incomplete. Calculating the degree of difference between the automatic driving sunny-scene image set and snowy-scene image set is modeled as solving the adaptive optimal transmission distance problem between the two image sets:
Wherein $\alpha$ is a non-negative coefficient, and the transmission mapping space is:
$$\Gamma_{\le}(\mu, \nu) = \{\pi \in P(X \times Z) \mid \pi 1_m \le \mu,\ \pi^{T} 1_n \le \nu\}$$
After the adaptive optimal transmission problem is solved, the adaptive optimal transmission distance between the image sets is obtained and used as the degree of difference between the automatic driving sunny-scene and snowy-scene image sets;
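The inequality constraints that define the transmission mapping space can be illustrated with a small membership check. This sketch does not solve the adaptive optimal transmission problem and does not reproduce its objective; it only tests whether a candidate map is non-negative with marginals bounded by $\mu$ and $\nu$. It assumes that rows of the map index source images and columns index target images, and all names are hypothetical.

```python
def in_relaxed_plan_set(pi, p, q, tol=1e-9):
    """True if pi >= 0, every row sum <= p_i, and every column sum <= q_j."""
    if any(v < -tol for row in pi for v in row):
        return False
    row_sums = [sum(row) for row in pi]
    col_sums = [sum(col) for col in zip(*pi)]
    rows_ok = all(r <= pi_mass + tol for r, pi_mass in zip(row_sums, p))
    cols_ok = all(c <= qj + tol for c, qj in zip(col_sums, q))
    return rows_ok and cols_ok


def transported_mass(pi):
    """Total probability mass actually moved by the map."""
    return sum(sum(row) for row in pi)


def plan_cost(pi, cost):
    """Total transmission cost: sum over i, j of pi_ij * C_ij."""
    return sum(m * c for pi_row, c_row in zip(pi, cost)
               for m, c in zip(pi_row, c_row))


p = [0.5, 0.5]   # source masses
q = [0.5, 0.5]   # target masses
partial = [[0.3, 0.0], [0.0, 0.1]]
print(in_relaxed_plan_set(partial, p, q), transported_mass(partial))
```

A classical coupling, whose marginals equal $\mu$ and $\nu$ exactly, also passes this check; the relaxed set additionally admits maps such as `partial` that move less than the full mass, which is where it differs from the mass conservation constraint of the classical problem.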
S7: Calculating the image classification loss: based on the predicted classification labels of the images, the classification loss of the image classifier on the source domain images is calculated; this embodiment uses the cross-entropy loss function. As described in step S6, the source domain image set is $X = \{x_i\}$ and the corresponding classification label set is $\{y_i\}$; $g(\cdot)$ represents the image feature extractor and $f(\cdot)$ represents the image classifier. The cross-entropy loss function is specifically expressed as:
In this embodiment, the cross-entropy loss function is used to calculate the classification loss of the image classifier on the automatic driving sunny-scene images. The invention is not limited to the cross-entropy loss function; other classification loss functions are also applicable.
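A cross-entropy loss over labeled source images can be sketched as below. The averaging over the batch, the use of integer class indices for labels, and the small clipping constant are assumptions for the example; the exact normalization used in this embodiment is not stated here.

```python
import math


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class.

    probs  : per-image class probabilities f(g(x_i)), e.g. softmax outputs
    labels : true class index y_i for each image
    eps    : lower clip that avoids log(0)
    """
    total = 0.0
    for p, y in zip(probs, labels):
        total -= math.log(max(p[y], eps))
    return total / len(labels)


# Two source images, two classes; the second prediction is more confident.
print(cross_entropy([[0.6, 0.4], [0.1, 0.9]], [0, 1]))
```

The loss is zero only when every true class receives probability 1, and it grows as the classifier assigns less probability to the true class.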
S8: Calculating the objective function of the neural network: the objective function comprises the image classification loss and the degree of difference between the source domain image set and the target domain image set; specifically, it is the sum of the two, where the image classification loss is calculated in step S7 and the degree of difference is calculated in step S6;
S9: Gradient backpropagation: gradients of the objective function obtained in step S8 are backpropagated through the deep neural network (comprising the image feature extractor and the image classifier), and the parameters of the deep neural network are updated;
S10: Training the neural network: steps S4 to S9 are repeated until the neural network converges; for example, the network is judged to have converged when the number of update iterations reaches the maximum number of iterations;
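The structure of steps S8 to S10 (sum two terms into one objective, take a gradient step, stop at a maximum iteration count) can be shown on a toy problem. Everything concrete in this sketch is a stand-in: a single scalar parameter replaces the network weights, two quadratic functions replace the classification loss and the degree of difference, and a central finite difference replaces backpropagation.

```python
def train(objective_terms, theta, lr=0.1, max_iters=200, h=1e-6):
    """Gradient descent on the sum of the given terms, stopped by iteration count."""

    def objective(t):
        # S8: the objective is the sum of its terms.
        return sum(term(t) for term in objective_terms)

    for _ in range(max_iters):            # S10: repeat until the iteration limit
        # S9: estimate the gradient (finite difference stands in for backprop).
        grad = (objective(theta + h) - objective(theta - h)) / (2 * h)
        theta -= lr * grad                # parameter update
    return theta, objective(theta)


def toy_loss(t):
    """Hypothetical stand-in for the classification loss."""
    return (t - 1.0) ** 2


def toy_gap(t):
    """Hypothetical stand-in for the degree of difference."""
    return (t + 1.0) ** 2


theta, value = train([toy_loss, toy_gap], theta=3.0)
print(theta, value)
```

Because both terms are minimized jointly, the parameter settles between the two individual minima, which mirrors how the summed objective trades classification accuracy on the source domain against the difference between the two domains.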
S11: Outputting the image classification result: a target domain image is input, its feature embedding is extracted by the trained image feature extractor, the feature embedding is classified by the trained image classifier, and the classification label of the image is output.
In this embodiment, the automatic driving snowy-scene images are classified with the trained image classifier, and a classification result, such as a pedestrian recognition or object classification result, is output.
The above examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to them; any other change, modification, substitution, combination, or simplification that does not depart from the spirit and principle of the present invention is an equivalent replacement and is included in the protection scope of the present invention.
Claims (10)
1. A cross-domain image classification method based on adaptive optimal transmission, characterized by comprising the following steps:
acquiring original images collected in a source domain and a target domain, wherein the source domain images have classification labels and the target domain images do not;
performing image preprocessing on the original images collected in the source domain and the target domain;
performing data enhancement on the preprocessed images;
constructing an image feature extractor based on a deep convolutional network, the image feature extractor performing feature extraction on the data-enhanced images to obtain feature embeddings of the images;
constructing an image classifier based on a fully connected network, the image classifier classifying the images according to their feature embeddings to obtain predicted classification labels of the images;
based on the feature embeddings and predicted classification labels of the images, solving the adaptive optimal transmission distance problem between the source domain image set and the target domain image set to obtain the adaptive optimal transmission distance between the source domain image set and the target domain image set, the adaptive optimal transmission distance being used as the degree of difference between the source domain image set and the target domain image set;
calculating the image classification loss of the image classifier on the source domain images based on the predicted classification labels of the images;
constructing an objective function based on the image classification loss and the degree of difference between the source domain image set and the target domain image set, performing gradient backpropagation on the image feature extractor and the image classifier based on the objective function, updating the network parameters, and training iteratively to obtain a trained image feature extractor and a trained image classifier;
acquiring a target domain image, extracting features with the trained image feature extractor to obtain the feature embedding of the image, classifying the feature embedding of the target domain image with the trained image classifier, and outputting the classification label of the image.
2. The cross-domain image classification method based on adaptive optimal transmission according to claim 1, wherein the image feature extractor is constructed based on a deep convolutional network, specifically by adopting the deep residual network ResNet.
3. The cross-domain image classification method based on adaptive optimal transmission according to claim 1, wherein the image classifier is constructed based on a fully connected network, specifically by adopting a single-layer or multi-layer fully connected network followed by a softmax layer.
4. The cross-domain image classification method based on adaptive optimal transmission according to claim 1, wherein the adaptive optimal transmission distance problem between the source domain image set and the target domain image set is solved to obtain the adaptive optimal transmission distance between the source domain image set and the target domain image set, specifically expressed as:
$$C_{ij} = \lVert g(x_i) - g(z_j) \rVert - \alpha\, y_i \tanh\big(f(g(z_j))\big)$$
$$\Gamma_{\le}(\mu,\nu) = \{\pi \in P(X \times Z) \mid \pi \mathbf{1}_m \le \mu,\ \pi^{T} \mathbf{1}_n \le \nu\}$$
wherein $X$ represents the source domain image set, $Z$ represents the target domain image set, $W(X, Z)$ represents the adaptive optimal transmission distance between the source domain image set and the target domain image set, $\Gamma$ represents the transmission mapping space, $\pi$ represents a transmission mapping, $\pi_{ij}$ represents the probability mass transmitted from the source domain image $x_i$ to the target domain image $z_j$, $C_{ij}$ represents the cost of transmitting a unit of probability mass from the source domain image $x_i$ to the target domain image $z_j$, $\mu$ represents the probability measure of the source domain image set, $\nu$ represents the probability measure of the target domain image set, $\delta_{x_i}$ represents the Dirac function of the source domain image, $\delta_{z_j}$ represents the Dirac function of the target domain image, $p_i$ represents the probability mass of the measure $\mu$ at the source domain image $x_i$, $q_j$ represents the probability mass of the measure $\nu$ at the target domain image $z_j$, $P$ represents the probability measure space, $\mathbf{1}_m$ represents the $m$-dimensional all-ones vector, $\mathbf{1}_n$ represents the $n$-dimensional all-ones vector, $g(x_i)$ represents the image feature extractor mapping the source domain image to the feature space, $g(z_j)$ represents the image feature extractor mapping the target domain image to the feature space, $f(\cdot)$ represents the image classifier, $\alpha$ is a non-negative coefficient, and $\tanh$ represents the hyperbolic tangent function.
5. The cross-domain image classification method based on adaptive optimal transmission according to claim 1, wherein the image classification loss of the image classifier on the source domain image is calculated, specifically expressed as:
where $g(\cdot)$ represents the image feature extractor, $f(\cdot)$ represents the image classifier, $x_i$ represents a source domain image, and $y_i$ represents its classification label.
6. A cross-domain image classification system based on adaptive optimal transmission, comprising: an original image acquisition module, an image preprocessing module, an image enhancement module, an image feature extractor, an image classifier, an adaptive optimal transmission distance calculation module, an image classification loss calculation module, an objective function construction module, a training module, and a classification result output module;
the original image acquisition module is used for acquiring original images collected in a source domain and a target domain, wherein the source domain images have classification labels and the target domain images do not;
the image preprocessing module is used for preprocessing the original images collected in the source domain and the target domain;
the image enhancement module is used for performing data enhancement on the preprocessed images;
the image feature extractor is constructed based on a deep convolutional network and is used for performing feature extraction on the data-enhanced images to obtain feature embeddings of the images;
the image classifier is constructed based on a fully connected network and is used for classifying the images according to their feature embeddings to obtain predicted classification labels of the images;
the adaptive optimal transmission distance calculation module is used for solving the adaptive optimal transmission distance problem between the source domain image set and the target domain image set based on the feature embeddings and predicted classification labels of the images, so as to obtain the adaptive optimal transmission distance between the source domain image set and the target domain image set, the adaptive optimal transmission distance being used as the degree of difference between the source domain image set and the target domain image set;
the image classification loss calculation module is used for calculating the image classification loss of the image classifier on the source domain images based on the predicted classification labels of the images;
the objective function construction module is used for constructing an objective function based on the image classification loss and the degree of difference between the source domain image set and the target domain image set;
the training module is used for performing gradient backpropagation on the image feature extractor and the image classifier based on the objective function, updating the network parameters, and training iteratively to obtain a trained image feature extractor and a trained image classifier;
the classification result output module is used for acquiring a target domain image, extracting features with the trained image feature extractor to obtain the feature embedding of the image, classifying the feature embedding of the target domain image with the trained image classifier, and outputting the classification label of the image.
7. The cross-domain image classification system based on adaptive optimal transmission according to claim 6, wherein the image feature extractor is constructed based on a deep convolutional network, specifically by adopting the deep residual network ResNet.
8. The cross-domain image classification system based on adaptive optimal transmission according to claim 6, wherein the image classifier is constructed based on a fully connected network, specifically by adopting a single-layer or multi-layer fully connected network followed by a softmax layer.
9. The cross-domain image classification system based on adaptive optimal transmission according to claim 6, wherein the adaptive optimal transmission distance problem between the source domain image set and the target domain image set is solved to obtain the adaptive optimal transmission distance between the source domain image set and the target domain image set, specifically expressed as:
$$C_{ij} = \lVert g(x_i) - g(z_j) \rVert - \alpha\, y_i \tanh\big(f(g(z_j))\big)$$
$$\Gamma_{\le}(\mu,\nu) = \{\pi \in P(X \times Z) \mid \pi \mathbf{1}_m \le \mu,\ \pi^{T} \mathbf{1}_n \le \nu\}$$
wherein $X$ represents the source domain image set, $Z$ represents the target domain image set, $W(X, Z)$ represents the adaptive optimal transmission distance between the source domain image set and the target domain image set, $\Gamma$ represents the transmission mapping space, $\pi$ represents a transmission mapping, $\pi_{ij}$ represents the probability mass transmitted from the source domain image $x_i$ to the target domain image $z_j$, $C_{ij}$ represents the cost of transmitting a unit of probability mass from the source domain image $x_i$ to the target domain image $z_j$, $\mu$ represents the probability measure of the source domain image set, $\nu$ represents the probability measure of the target domain image set, $\delta_{x_i}$ represents the Dirac function of the source domain image, $\delta_{z_j}$ represents the Dirac function of the target domain image, $p_i$ represents the probability mass of the measure $\mu$ at the source domain image $x_i$, $q_j$ represents the probability mass of the measure $\nu$ at the target domain image $z_j$, $P$ represents the probability measure space, $\mathbf{1}_m$ represents the $m$-dimensional all-ones vector, $\mathbf{1}_n$ represents the $n$-dimensional all-ones vector, $g(x_i)$ represents the image feature extractor mapping the source domain image to the feature space, $g(z_j)$ represents the image feature extractor mapping the target domain image to the feature space, $f(\cdot)$ represents the image classifier, $\alpha$ is a non-negative coefficient, and $\tanh$ represents the hyperbolic tangent function.
10. The cross-domain image classification system based on adaptive optimal transmission according to claim 6, wherein the image classification loss calculation module is configured to calculate an image classification loss of the image classifier on the source domain image based on the prediction classification label of the image, specifically expressed as:
where $g(\cdot)$ represents the image feature extractor, $f(\cdot)$ represents the image classifier, $x_i$ represents a source domain image, and $y_i$ represents its classification label.
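The transmission cost and the inequality constraints recited in claims 4 and 9 can be illustrated with a short sketch. Several points are assumptions not stated in the claims: the norm is taken to be Euclidean, $y_i$ is taken to be a one-hot label vector whose product with the tanh term is an inner product, the classifier $f$ is taken to return one score per class, and the transmission mapping is stored as `plan[i][j]` with `i` indexing source images and `j` indexing target images. The function names `cost_matrix` and `is_feasible` are hypothetical.

```python
import math


def cost_matrix(src_feats, tgt_feats, src_onehot, tgt_scores, alpha):
    """Return C[i][j] = ||g(x_i) - g(z_j)|| - alpha * <y_i, tanh(f(g(z_j)))>.

    src_feats[i]  : feature embedding g(x_i) of source image i
    tgt_feats[j]  : feature embedding g(z_j) of target image j
    src_onehot[i] : one-hot label y_i (assumed encoding)
    tgt_scores[j] : classifier output f(g(z_j)), one score per class (assumed)
    """
    cost = []
    for gx, y in zip(src_feats, src_onehot):
        row = []
        for gz, scores in zip(tgt_feats, tgt_scores):
            # Euclidean distance between the two embeddings (assumed norm).
            distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(gx, gz)))
            # Agreement between the source label and the target prediction.
            agreement = sum(yk * math.tanh(s) for yk, s in zip(y, scores))
            row.append(distance - alpha * agreement)
        cost.append(row)
    return cost


def is_feasible(plan, mu, nu, tol=1e-9):
    """Check plan >= 0, source-side marginals <= mu, target-side marginals <= nu."""
    if any(value < -tol for row in plan for value in row):
        return False
    rows_ok = all(sum(row) <= p + tol for row, p in zip(plan, mu))
    cols_ok = all(sum(col) <= q + tol for col, q in zip(zip(*plan), nu))
    return rows_ok and cols_ok
```

With `alpha = 0` the cost reduces to the plain feature distance; a larger `alpha` lowers the cost of pairing a source image with a target image that the classifier already assigns to the same class.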
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410342453.6A CN118334410A (en) | 2024-03-25 | 2024-03-25 | Cross-domain image classification method and system based on self-adaptive optimal transmission |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118334410A (en) | 2024-07-12 |
Family
ID=91765145
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410342453.6A Pending CN118334410A (en) | 2024-03-25 | 2024-03-25 | Cross-domain image classification method and system based on self-adaptive optimal transmission |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118334410A (en) |
2024-03-25 CN CN202410342453.6A patent/CN118334410A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||