CN114283049A - Image processing method, image processing device, computer equipment and storage medium - Google Patents

Image processing method, image processing device, computer equipment and storage medium Download PDF

Info

Publication number
CN114283049A
CN114283049A (application CN202111081567.2A)
Authority
CN
China
Prior art keywords
feature
semantic
image
migration
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111081567.2A
Other languages
Chinese (zh)
Inventor
宋奕兵 (Yibing Song)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202111081567.2A
Publication of CN114283049A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The application provides an image processing method and apparatus, a computer device, and a storage medium, belongs to the technical field of computers, and can be applied to various scenes such as cloud technology, AI, intelligent traffic, and vehicle-mounted scenes. The method comprises the following steps: mapping an image to be processed to a first hidden space of a source field to obtain a first semantic feature; acquiring a migration feature, and migrating the first semantic feature into a second semantic feature based on the migration feature; and generating a target image based on the second semantic feature, the target image having the style of the target field. According to the scheme, the image to be processed is mapped to the first hidden space of the source field, the first semantic feature obtained by the mapping is migrated based on the migration feature, and the semantic direction of the image to be processed in the first hidden space is thereby migrated to the second hidden space to obtain the second semantic feature, from which the target image is generated. Image migration can thus be completed without training on a large amount of data, which saves time and places low demands on the computational power of the training platform.

Description

Image processing method, image processing device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image processing method and apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, generative adversarial models are widely applied to cross-field image migration tasks, such as converting a real face image into a hand-drawn-style face image. How to better realize the image migration task is a direction for improvement.
At present, when cross-field image migration is realized, data of a field b is generally used to fine-tune the generator of an adversarial model pre-trained in a field a, so that the fine-tuned generator corresponds structurally to the generator of the adversarial model in the field a, thereby realizing image migration from the field a to the field b.
However, such technical solutions all require a large-scale data set of the field b to train the adversarial generative model, which consumes a great deal of time and places high demands on the computational power of the training platform.
Disclosure of Invention
The embodiments of the application provide an image processing method and apparatus, a computer device, and a storage medium, with which image migration can be completed without training on a large amount of data, saving time and placing low demands on the computational power of the training platform. The technical scheme is as follows:
in one aspect, an image processing method is provided, and the method includes:
mapping an image to be processed to a first hidden space of a source field to obtain a first semantic feature, wherein the image to be processed has the style of the source field, and the first semantic feature is used for representing the semantic direction of the image to be processed in the first hidden space;
acquiring a migration feature, wherein the migration feature is used for migrating the semantic feature in the first hidden space to a second hidden space of a target field, the migration feature is obtained based on matching a first model and a second model, the first model is an adversarial generative model pre-trained in the source field, and the second model is an adversarial generative model pre-trained in the target field;
migrating the first semantic feature into a second semantic feature based on the migration feature, wherein the second semantic feature is used for representing the semantic direction of the image to be processed in the second hidden space;
and generating a target image based on the second semantic features, wherein the target image has the style of the target field.
In another aspect, there is provided an image processing apparatus, the apparatus including:
the mapping module is used for mapping an image to be processed to a first hidden space of a source field to obtain a first semantic feature, wherein the image to be processed has the style of the source field, and the first semantic feature is used for representing the semantic direction of the image to be processed in the first hidden space;
an obtaining module, configured to obtain a migration feature, where the migration feature is used to migrate semantic features in the first hidden space to a second hidden space of a target field, and the migration feature is obtained based on matching a first model and a second model, where the first model is an adversarial generative model pre-trained in the source field, and the second model is an adversarial generative model pre-trained in the target field;
a migration module, configured to migrate the first semantic feature into a second semantic feature based on the migration feature, where the second semantic feature is used to represent a semantic direction of the to-be-processed image in the second hidden space;
and the image generation module is used for generating a target image based on the second semantic features, wherein the target image has the style of the target field.
In some embodiments, the migration module is configured to, for any first semantic element in the first semantic feature, determine, from the migration feature, a migration element corresponding to the first semantic element, where the first semantic element represents the semantic direction of any image semantic in the image to be processed; and migrate the first semantic element based on the migration element to obtain a corresponding second semantic element in the second semantic feature.
In some embodiments, the apparatus further comprises:
a first determining module, configured to determine a plurality of first sample migration features based on the first model and a first sample image set that trains the first model, where the plurality of first sample migration features are in one-to-one correspondence with a plurality of first sample images in the first sample image set;
a second determining module, configured to determine, based on the second model and a second sample image set for training the second model, a plurality of second sample migration features, where the plurality of second sample migration features are in one-to-one correspondence with a plurality of second sample images in the second sample image set;
a third determining module, configured to determine the migration feature based on the plurality of first sample migration features and the plurality of second sample migration features.
In some embodiments, the first determining module comprises:
a first obtaining sub-module, configured to, for any first sample image in the first sample image set, obtain a first hidden space feature of the first sample image in a first hidden space of the first model;
the first dimensionality reduction submodule is used for carrying out dimensionality reduction on the first hidden space characteristic to obtain a first characteristic vector matrix;
a first determining submodule, configured to determine a first sample migration feature corresponding to the first sample image based on the first implicit spatial feature and the first feature vector matrix.
In some embodiments, the first dimension reduction submodule includes:
a first obtaining unit, configured to obtain an average value of each element in the first hidden spatial feature;
a first determining unit, configured to determine, based on the average, a plurality of first eigenvalues and a plurality of first eigenvectors of a first covariance matrix, where the first eigenvalues are in one-to-one correspondence with the first eigenvectors;
a second determining unit, configured to determine a first eigenvector matrix based on the plurality of first eigenvalues, where a row vector in the first eigenvector matrix is a first eigenvector corresponding to the first eigenvalue that satisfies the sorting condition.
In some embodiments, the first determining unit is configured to determine, based on the average, a covariance corresponding to each element in the first implicit spatial feature, so as to obtain the first covariance matrix; and performing eigenvalue decomposition on the first covariance matrix to obtain the plurality of first eigenvalues and the plurality of first eigenvectors.
In some embodiments, the second determining unit is configured to sort the plurality of first eigenvalues; select at least one first eigenvalue sorted before a first target order from the sorted plurality of first eigenvalues; and use at least one first eigenvector corresponding to the at least one first eigenvalue as a row vector to obtain the first eigenvector matrix.
In some embodiments, the second determining module comprises:
a second obtaining sub-module, configured to, for any second sample image in the second sample image set, obtain a second hidden space feature of the second sample image in a second hidden space of the second model;
the second dimension reduction submodule is used for reducing the dimension of the second hidden space feature to obtain a second feature vector matrix;
and the second determining submodule is used for determining a second sample migration characteristic corresponding to the second sample image based on the second implicit spatial characteristic and the second characteristic vector matrix.
In some embodiments, the second dimension reduction submodule includes:
the second obtaining unit is used for obtaining the average value of each element in the second hidden space characteristic;
a third determining unit, configured to determine, based on the average, a plurality of second eigenvalues and a plurality of second eigenvectors of a second covariance matrix, where the second eigenvalues and the second eigenvectors are in one-to-one correspondence;
a fourth determining unit, configured to determine a second eigenvector matrix based on the plurality of second eigenvalues, where a row vector in the second eigenvector matrix is a second eigenvector corresponding to the second eigenvalue that satisfies the sorting condition.
In some embodiments, the third determining unit is configured to determine, based on the average value, a covariance corresponding to each element in the second implicit spatial feature, so as to obtain the second covariance matrix; and performing eigenvalue decomposition on the second covariance matrix to obtain a plurality of second eigenvalues and a plurality of second eigenvectors.
In some embodiments, the fourth determining unit is configured to sort the plurality of second eigenvalues; select at least one second eigenvalue sorted before a second target order from the sorted plurality of second eigenvalues; and use at least one second eigenvector corresponding to the at least one second eigenvalue as a row vector to obtain the second eigenvector matrix.
In another aspect, a computer device is provided, and the computer device includes a processor and a memory, where the memory is used to store at least one computer program, and the at least one computer program is loaded and executed by the processor to implement the operations performed in the image processing method in the embodiments of the present application.
In another aspect, a computer-readable storage medium is provided, in which at least one computer program is stored, and the at least one computer program is loaded and executed by a processor to implement the operations performed in the image processing method in the embodiments of the present application.
In another aspect, a computer program product is provided, the computer program product comprising computer program code stored in a computer readable storage medium. The processor of the computer device reads the computer program code from the computer-readable storage medium, and executes the computer program code, so that the computer device executes the image processing method provided in the various alternative implementations of the aspects.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
the embodiment of the application provides an image processing method, which comprises the steps of mapping an image to be processed to a first hidden space in a source field, enabling the first semantic space feature obtained through mapping to be migrated based on migration features, migrating the semantic direction of the image to be processed in the first hidden space to a second hidden space to obtain a second semantic feature, and accordingly generating a target image.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed for the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a schematic diagram of an implementation environment of an image processing method according to an embodiment of the present application;
FIG. 2 is a flow chart of an image processing method provided according to an embodiment of the present application;
FIG. 3 is a flow chart of an image processing method provided according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a technical framework provided in accordance with an embodiment of the present application;
fig. 5 is a block diagram of an image processing apparatus provided according to an embodiment of the present application;
fig. 6 is a block diagram of another image processing apparatus provided according to an embodiment of the present application;
fig. 7 is a block diagram of a terminal according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a server provided according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The terms "first," "second," and the like in this application are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency or limitation on the number or order of execution.
The term "at least one" in this application means one or more, and the meaning of "a plurality" means two or more.
Hereinafter, terms related to the present application are explained.
GAN (Generative Adversarial Network): a deep learning model that typically includes a generator and a discriminator, where the discriminator provides a loss function based on adversarial learning for training the generator. A GAN can be used as an image generation model. An image generation model can generate arbitrary images in the field of its training data set, and generally takes random noise with a Gaussian distribution as input to produce a meaningful image (such as a human face).
StyleGAN: a pair generative model with a leading effect.
StyleGAN 2: an improved antagonistic generative model with better effect aiming at the defects of StyleGAN is one of the antibiotic networks with the best comprehensive performance at present.
Hidden space: the input space of the countermeasure generating network generator is usually composed of randomly sampled noise based on a standard gaussian distribution.
PCA (Principal Component Analysis) aims to convert many indicators into a few comprehensive indicators through dimension reduction. The main idea of PCA is to map n-dimensional features onto k dimensions; these k dimensions are completely new orthogonal features, also called principal components, reconstructed on the basis of the original n-dimensional features. The task of PCA is to find, one by one, a set of mutually orthogonal axes in the original space, the choice of each new axis depending strongly on the data itself. The first new axis is the direction of largest variance in the original data; the second is the direction orthogonal to the first axis with the largest variance; the third is the direction orthogonal to the first two axes with the largest variance; and so on until n such axes are obtained. With axes obtained in this way, most of the variance is contained in the first k axes, and the variance along the remaining axes is almost zero. The remaining axes can therefore be ignored and only the first k axes, which contain most of the variance, are kept. In effect, this retains only the feature dimensions that carry most of the variance and discards those whose variance is almost zero, thereby realizing dimension reduction of the data features.
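As an illustration only, and not as part of the claimed solution, the dimension reduction described above can be sketched in a few lines of NumPy; the function name, shapes, and variable names are assumptions:

```python
import numpy as np

def pca_project(X, k):
    """Project rows of X (m samples x n features) onto the top-k principal axes."""
    mean = X.mean(axis=0)                    # average value of each feature
    cov = np.cov(X - mean, rowvar=False)     # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalue decomposition (symmetric matrix)
    order = np.argsort(eigvals)[::-1]        # sort eigenvalues, largest first
    P = eigvecs[:, order[:k]].T              # top-k eigenvectors as row vectors
    return (X - mean) @ P.T, P               # k-dimensional projection and the basis P
```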
The image processing method provided by the embodiment of the application can be executed by computer equipment. In some embodiments, the computer device is a terminal or a server. An implementation environment of the image processing method provided in the embodiment of the present application is described below by taking a computer device as an example, and fig. 1 is a schematic diagram of an implementation environment of an image processing method provided in the embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102.
The terminal 101 and the server 102 can be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
In some embodiments, the terminal 101 is a smartphone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, a smart voice interaction device, a smart appliance, a vehicle-mounted terminal, or the like, but is not limited thereto. An application program supporting image processing is installed and run on the terminal 101. Those skilled in the art will appreciate that the number of terminals may be greater or fewer: there may be only one terminal, or several tens or hundreds of terminals, or more. Neither the number of terminals nor the device type is limited in the embodiments of the present application.
In some embodiments, the server 102 is an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, web services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), big data, and artificial intelligence platforms. The server 102 provides background services for the application program supporting image processing. In some embodiments, the server 102 undertakes the primary computing work and the terminal 101 the secondary computing work; or the server 102 undertakes the secondary computing work and the terminal 101 the primary computing work; or the server 102 and the terminal 101 compute cooperatively using a distributed computing architecture.
In this implementation environment, the application for image processing is capable of migrating an image to be processed from a style of a domain a to a style of a domain b.
For example, the terminal sends the image to be processed to the server through the application program, and the server maps the image to be processed to a first hidden space of the domain a to obtain a first semantic feature, where the first semantic feature is used to represent the semantic direction of the image to be processed in the first hidden space. The server then migrates the first semantic feature into a second semantic feature based on a migration feature, the second semantic feature being used to represent the semantic direction of the image to be processed in a second hidden space of the domain b. Finally, the server generates a target image based on the second semantic feature, the target image having the style of the domain b. The server returns the target image to the terminal, which displays it.
Fig. 2 is a flowchart of an image processing method according to an embodiment of the present application, and as shown in fig. 2, the image processing method is described as being executed by a server in the embodiment of the present application. The image processing method comprises the following steps:
201. mapping an image to be processed to a first hidden space of a source field to obtain a first semantic feature, wherein the image to be processed has the style of the source field, and the first semantic feature is used for representing the semantic direction of the image to be processed in the first hidden space.
In this embodiment of the application, the image to be processed is an image uploaded by a terminal and has the style of the source field. The server can map the image to be processed to a first hidden space of the source field based on a generator of a first model pre-trained in the source field, so as to obtain the semantic direction of the image to be processed in the first hidden space; the first model is a generative adversarial model. If the image to be processed is a face image, its semantic directions in the first hidden space include gender, facial features, age, and the like.
It should be noted that, in the embodiments of the present application, the semantic direction may also be referred to as a semantic orientation; this is not limited in the embodiments of the present application.
202. Acquiring a migration feature, wherein the migration feature is used for migrating semantic features in the first hidden space to a second hidden space of a target field, the migration feature is obtained based on matching a first model and a second model, the first model is an adversarial generative model pre-trained in the source field, and the second model is an adversarial generative model pre-trained in the target field.
In the embodiment of the application, the migration feature is obtained based on the matching of the pre-trained first model and the pre-trained second model, and the migration feature can be directly obtained when the image to be processed is processed.
203. And migrating the first semantic feature into a second semantic feature based on the migration feature, wherein the second semantic feature is used for representing the semantic direction of the image to be processed in a second hidden space.
In this embodiment, the server may migrate, based on the migration feature, the semantic direction in the first hidden space represented by the first semantic feature to a semantic direction in a second hidden space in the target domain, so as to obtain a second semantic feature. The migration feature is derived based on matching a first hidden space of a first model pre-trained in a source domain and a second hidden space of a second model pre-trained in a target domain.
204. Based on the second semantic features, a target image is generated, the target image having a style of the target domain.
In this embodiment, after obtaining the second semantic features, the server may process the second semantic features based on a generator of a second model pre-trained in a target field to obtain a target image.
The embodiments of the application provide an image processing method: an image to be processed is mapped to a first hidden space of a source field, the first semantic feature obtained by the mapping is migrated based on a migration feature, and the semantic direction of the image to be processed in the first hidden space is thereby migrated to a second hidden space to obtain a second semantic feature, from which a target image is generated. Image migration can thus be completed without training on a large amount of data, which saves time and places low demands on the computational power of the training platform.
Fig. 2 illustrates a main flow of an image processing method provided in an embodiment of the present application, and the image processing method is further described below based on an application scenario, referring to fig. 3, where fig. 3 is a flow chart of an image processing method provided in an embodiment of the present application, and is described as an example executed by a server in the embodiment of the present application. The image processing method comprises the following steps:
301. Determining a plurality of first sample migration features based on a first model and a first sample image set for training the first model, wherein the first model is an adversarial generative model pre-trained in the source field, and the plurality of first sample migration features are in one-to-one correspondence with the plurality of first sample images in the first sample image set.
In this embodiment of the present application, the server can obtain a first model pre-trained in the source field, where the first model is an adversarial generative model, such as a StyleGAN or StyleGAN2 model. The first model may be pre-trained by the server based on the plurality of first sample images in the first sample image set, or the pre-trained first model and the first sample image set used to train it may be obtained directly. Based on the first model, the server maps each of the first sample images in the first sample image set to the first hidden space of the first model, so as to obtain the first sample migration feature corresponding to each first sample image.
In some embodiments, any one of the first sample images in the first sample image set is taken as an example for illustration. The server can obtain a first hidden space feature of the first sample image in a first hidden space of the first model, wherein the server can input the first sample image into the first model and map the first sample image into the first hidden space of the first model, so that the first hidden space feature is obtained. And then the server can perform dimension reduction on the first hidden space feature to obtain a first feature vector matrix, wherein the server can perform dimension reduction on the first hidden space feature based on a hidden space decomposition algorithm. Finally, the server can determine a first sample migration feature corresponding to the first sample image based on the first hidden space feature and the first feature vector matrix, wherein the first hidden space feature is in the form of a vector matrix, and the server obtains the first sample migration feature by multiplying the first hidden space feature and the first feature vector matrix, so that the first sample migration feature corresponding to the first sample image is also in the form of a vector matrix. By mapping the first sample image to the first hidden space, the main features in the first sample image, namely the features having influence on image migration, can be determined in a dimension reduction mode, so that the hidden spaces in different fields can be matched conveniently.
In some embodiments, the server can employ principal component analysis to perform dimension reduction on the first hidden space feature. Correspondingly, the server obtains the average value of each element in the first hidden space feature, and then determines, based on the average value, a plurality of first eigenvalues and a plurality of first eigenvectors of a first covariance matrix, where the first eigenvalues are in one-to-one correspondence with the first eigenvectors. The server then determines a first eigenvector matrix based on the plurality of first eigenvalues, where the row vectors of the first eigenvector matrix are the first eigenvectors corresponding to the first eigenvalues that satisfy the sorting condition. The server can determine, based on the average value, the covariance corresponding to each element in the first hidden space feature to obtain the first covariance matrix, and then perform eigenvalue decomposition on the first covariance matrix to obtain the plurality of first eigenvalues and the plurality of first eigenvectors. The server can sort the plurality of first eigenvalues, select at least one first eigenvalue sorted before a first target order from the sorted plurality of first eigenvalues, and use the at least one first eigenvector corresponding to the at least one first eigenvalue as row vectors to obtain the first eigenvector matrix. By calculating the first covariance matrix and performing eigenvalue decomposition on it, n1-dimensional features can be obtained, and by sorting and screening, the n1-dimensional features can be mapped onto k dimensions to obtain the first eigenvector matrix, where the k-dimensional features are reconstructed on the basis of the original n1-dimensional features, n1 is greater than k, and n1 and k are positive integers, thereby realizing dimension reduction of the features.
For example, the server performs dimension reduction on the first hidden space feature using the PCA method. First, the average value of each element in the first hidden space feature X1 is calculated; then a first covariance matrix is calculated based on the average value, and a plurality of first eigenvalues and a plurality of first eigenvectors of the first covariance matrix are computed by eigenvalue decomposition. The first k first eigenvalues are selected, and the k corresponding first eigenvectors are used as row vectors to form a first eigenvector matrix P1. The first sample migration feature is then T1 = P1 · X1.
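Reusing the pca_project helper sketched above, one hedged reading of this example is as follows; X1, P1, T1 and k are the names from the example, while the matrix layout is an assumption:

```python
# Assume X1 is the first hidden space feature, an n1 x m matrix whose columns
# are hidden-space vectors; the column layout is an assumption, not from the patent.
_, P1 = pca_project(X1.T, k)   # eigen-basis of the first hidden space (k x n1)
T1 = P1 @ X1                   # first sample migration feature, T1 = P1 · X1
```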
It should be noted that the server can also use other hidden space decomposition algorithms, which is not limited in this embodiment of the present application.
302. Determining a plurality of second sample migration features based on a second model and a second sample image set for training the second model, wherein the second model is an adversarial generative model pre-trained in the target field, and the plurality of second sample migration features are in one-to-one correspondence with the plurality of second sample images in the second sample image set.
In this embodiment of the present application, the server can obtain a second model pre-trained in the target field, where the second model is an adversarial generative model, such as a StyleGAN or StyleGAN2 model. The second model may be pre-trained by the server based on the plurality of second sample images in the second sample image set, or the pre-trained second model and the second sample image set used to train it may be obtained directly. Based on the second model, the server maps each of the second sample images in the second sample image set to the second hidden space of the second model, so as to obtain the second sample migration feature corresponding to each second sample image.
In some embodiments, any one of the second sample images in the second sample image set is taken as an example for illustration. The server can obtain a second hidden space feature of the second sample image in a second hidden space of the second model, wherein the server can input the second sample image into the second model and map the second sample image into the second hidden space of the second model, so that the second hidden space feature is obtained. And then the server can perform dimension reduction on the second hidden space feature to obtain a second feature vector matrix, wherein the server can perform dimension reduction on the second hidden space feature based on a hidden space decomposition algorithm. Finally, the server can determine a second sample migration feature corresponding to the second sample image based on the second implicit spatial feature and the second feature vector matrix, wherein the second implicit spatial feature is in the form of a vector matrix, the server obtains the second sample migration feature by multiplying the second implicit spatial feature and the second feature vector matrix, and the second sample migration feature corresponding to the second sample image is also in the form of a vector matrix. By mapping the second sample image to the second hidden space, the main features in the second sample image, namely the features having influence on image migration, can be determined in a dimension reduction mode, so that the hidden spaces in different fields can be matched conveniently.
In some embodiments, the server can employ principal component analysis to perform dimension reduction on the second hidden space feature. Correspondingly, the server obtains the average value of each element in the second hidden space feature, and then determines, based on the average value, a plurality of second eigenvalues and a plurality of second eigenvectors of a second covariance matrix, where the second eigenvalues are in one-to-one correspondence with the second eigenvectors. The server then determines a second eigenvector matrix based on the plurality of second eigenvalues, where the row vectors of the second eigenvector matrix are the second eigenvectors corresponding to the second eigenvalues that satisfy the sorting condition. The server can determine, based on the average value, the covariance corresponding to each element in the second hidden space feature to obtain the second covariance matrix, and then perform eigenvalue decomposition on the second covariance matrix to obtain the plurality of second eigenvalues and the plurality of second eigenvectors. The server can sort the plurality of second eigenvalues, select at least one second eigenvalue sorted before a second target order from the sorted plurality of second eigenvalues, and use the at least one second eigenvector corresponding to the at least one second eigenvalue as row vectors to obtain the second eigenvector matrix. By calculating the second covariance matrix and performing eigenvalue decomposition on it, n2-dimensional features can be obtained, and by sorting and screening, the n2-dimensional features can be mapped onto k dimensions to obtain the second eigenvector matrix, where the k-dimensional features are reconstructed on the basis of the original n2-dimensional features, n2 is greater than k, and n2 and k are positive integers, thereby realizing dimension reduction of the features.
For example, the server performs dimension reduction on the second hidden space feature using the PCA method. First, the average value of each element in the second hidden space feature X2 is calculated; then a second covariance matrix is calculated based on the average value, and a plurality of second eigenvalues and a plurality of second eigenvectors of the second covariance matrix are computed by eigenvalue decomposition. The first k second eigenvalues are selected, and the k corresponding second eigenvectors are used as row vectors to form a second eigenvector matrix P2. The second sample migration feature is then T2 = P2 · X2.
303. Determining a migration characteristic based on the plurality of first sample migration characteristics and the plurality of second sample migration characteristics.
In this embodiment, the server is capable of determining an average value of the plurality of first sample migration features and the plurality of second sample migration features as the migration feature.
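A minimal sketch of this step, assuming all sample migration features share one shape; the function name is illustrative:

```python
import numpy as np

def migration_feature(first_feats, second_feats):
    """Element-wise mean over all first and second sample migration features;
    the result is taken as the migration feature T."""
    stacked = np.stack(list(first_feats) + list(second_feats))
    return stacked.mean(axis=0)
```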
304. Mapping an image to be processed to a first hidden space of a source field to obtain a first semantic feature, wherein the image to be processed has the style of the source field, and the first semantic feature is used for representing the semantic direction of the image to be processed in the first hidden space.
In an embodiment of the application, the server is capable of obtaining a first model pre-trained in the source domain. The server can also acquire the image to be processed uploaded by the terminal, and the image to be processed has the style of the source field. Then the server inputs the image to be processed into the first model, and the generator of the first model maps the image to be processed to a first hidden space to obtain a first semantic feature.
For example, the first model is a StyleGAN model pre-trained in a source domain, the image to be processed is a real face image uploaded by a terminal, and the server maps the real face image to a first hidden space based on the StyleGAN model to obtain a first semantic feature, wherein the first semantic feature includes features in semantic directions such as age, gender, expression, skin color, face orientation, and the like.
In some embodiments, the server maps the image to be processed to the first hidden space of the source field to obtain a first image hidden space feature, and then decomposes the first image hidden space feature into a first semantic feature and an image retention feature, where the image retention feature is a feature that does not need to be migrated. For example, the first semantic feature covers face-related semantic directions such as gender, age, facial features, and expression, while the image retention feature covers background-related features such as lighting, buildings, and sky.
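The decomposition and its inverse (the fusion described later in step 306) can be pictured as a split and a splice of the hidden-space feature; the boundary index semantic_dims below is a hypothetical design choice that the patent does not fix:

```python
import numpy as np

def split_hidden(w, semantic_dims):
    """Split a first image hidden space feature w into (first semantic feature,
    image retention feature); semantic_dims is a hypothetical boundary index."""
    return w[:semantic_dims], w[semantic_dims:]

def fuse_hidden(z, keep):
    """Inverse of split_hidden: splice a (migrated) semantic feature back
    together with the untouched image retention feature."""
    return np.concatenate([z, keep])
```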
305. And acquiring a migration feature, wherein the migration feature is used for migrating the semantic features in the first hidden space to a second hidden space in the target field.
In the embodiment of the present application, the migration features determined based on the above steps 301 to 303 are directly obtained.
306. Migrating the first semantic feature into a second semantic feature based on the migration feature, wherein the second semantic feature is used for representing the semantic direction of the image to be processed in a second hidden space of the target field.
In the embodiment of the present application, the migration feature can be used to adjust the semantic direction of different image semantics. The server can match the source field with the target field based on the migration characteristics, namely, the semantic direction of the image to be processed in the first hidden space is migrated into the semantic direction in the second hidden space.
In some embodiments, the server is capable of multiplying the migration feature by the first semantic feature to obtain a second semantic feature. The migration feature, the first semantic feature and the second semantic feature are all expressed in the form of a feature matrix. See formula (1).
Z2 = T · Z1    (1)
where Z2 represents the second semantic feature, T represents the migration feature, and Z1 represents the first semantic feature.
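Expressed as code, formula (1) is a single matrix product; the shapes of T and Z1 are assumed compatible:

```python
Z2 = T @ Z1   # formula (1): second semantic feature = migration feature x first semantic feature
```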
In some embodiments, the first semantic feature includes a plurality of first semantic elements, and different first semantic elements represent semantic directions of different image semantics, such as age, skin color, gender, and the like. The following describes the migration of the first semantic feature into the second semantic feature from the perspective of semantic elements: for any first semantic element in the first semantic feature, the server determines a migration element corresponding to the first semantic element from the migration feature, wherein the first semantic element represents the semantic direction of any image semantic in the image to be processed; the server migrates the first semantic element based on the migration element to obtain a corresponding second semantic element in the second semantic feature.
In some embodiments, after obtaining the second semantic feature, the server further fuses the second semantic feature with the unmigrated image retention feature, for example by splicing the second semantic feature with the image retention feature to obtain a second image hidden space feature, and generates the target image based on the second image hidden space feature. The fusion is the inverse of the operation used to decompose the first image hidden space feature, which is not limited in this embodiment of the present application.
307. Based on the second semantic features, a target image is generated, the target image having a style of a target domain.
In the embodiment of the application, the server can obtain a second model pre-trained in the target field, the server can input the second semantic features into the second model, and the generator of the second model generates the target image based on the second semantic features, so that the image to be processed is transferred from the style of the source field to the style of the target field.
It should be noted that, to make the image processing method described in the foregoing steps 301 to 307 easier to understand, refer to fig. 4, which is a schematic diagram of a technical framework provided according to an embodiment of the present application. The source field is a field a and the target field is a field b. The server inputs the image to be processed into the generator of the first model pre-trained in the field a to obtain a first image hidden space feature, and decomposes this feature into a first semantic feature and an image retention feature. Through the migration feature determined in steps 301 to 303, the server matches the first hidden space with the second hidden space, that is, migrates the first semantic feature into a second semantic feature. The server fuses the second semantic feature with the image retention feature to obtain a second image hidden space feature, and inputs it into the generator of the second model pre-trained in the field b to obtain a target image with the style of the field b. It should also be noted that if the first semantic feature and the image retention feature are fused and then input into the generator of the adversarial generative model pre-trained in the field a, an image with the style of the field a is obtained; the higher the similarity between this image and the image to be processed, the better trained the first model in the field a is.
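Putting the pieces of fig. 4 together, a hedged end-to-end sketch might look as follows; encoder_a, generator_b and semantic_dims are illustrative placeholders rather than components named by the patent, and T is assumed to be square over the semantic part:

```python
import numpy as np

def migrate_image(image, encoder_a, generator_b, T, semantic_dims):
    """End-to-end sketch of the fig. 4 pipeline (field a -> field b)."""
    w1 = encoder_a(image)                              # first image hidden space feature
    z1, keep = w1[:semantic_dims], w1[semantic_dims:]  # semantic / retention split
    z2 = T @ z1                                        # formula (1): migrate the semantic direction
    w2 = np.concatenate([z2, keep])                    # second image hidden space feature
    return generator_b(w2)                             # target image with the style of field b
```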
The embodiments of the application provide an image processing method: an image to be processed is mapped to a first hidden space of a source field, the first semantic feature obtained by the mapping is migrated based on a migration feature, and the semantic direction of the image to be processed in the first hidden space is thereby migrated to a second hidden space to obtain a second semantic feature, from which a target image is generated. Image migration can thus be completed without training on a large amount of data, which saves time and places low demands on the computational power of the training platform.
Fig. 5 is a block diagram of an image processing apparatus provided according to an embodiment of the present application. The apparatus is for performing the steps in the above-described image processing method, and referring to fig. 5, the apparatus comprises: a mapping module 51, an acquisition module 52, a migration module 53, and an image generation module 54.
A mapping module 51, configured to map an image to be processed to a first hidden space in a source field to obtain a first semantic feature, where the image to be processed has a style of the source field, and the first semantic feature is used to represent a semantic direction of the image to be processed in the first hidden space;
an obtaining module 52, configured to obtain a migration feature, where the migration feature is used to migrate a semantic feature in the first hidden space to a second hidden space in the target field, and the migration feature is obtained based on matching a first model and a second model, where the first model is a countermeasure type generation model pre-trained in the source field, and the second model is a countermeasure type generation model pre-trained in the target field;
a migration module 53, configured to migrate the first semantic feature into a second semantic feature based on the migration feature, where the second semantic feature is used to indicate a semantic direction of the to-be-processed image in the second hidden space;
an image generation module 54 for generating a target image based on the second semantic feature, the target image having a style of the target domain.
In some embodiments, the migration module 53 is configured to, for any first semantic element in the first semantic feature, determine a migration element corresponding to the first semantic element from the migration feature, where the first semantic element represents a semantic direction of any image semantic in the image to be processed; and migrating the first semantic element based on the migration element to obtain a corresponding second semantic element in the second semantic feature.
In some embodiments, fig. 6 is a block diagram of another image processing apparatus provided in an embodiment of the present application, and referring to fig. 6, the apparatus further includes:
a first determining module 55, configured to determine a plurality of first sample migration features based on a first model and a first sample image set for training the first model, where the plurality of first sample migration features are in one-to-one correspondence with a plurality of first sample images in the first sample image set;
a second determining module 56, configured to determine a plurality of second sample migration features based on a second model and a second sample image set for training the second model, where the plurality of second sample migration features are in one-to-one correspondence with a plurality of second sample images in the second sample image set;
a third determining module 57, configured to determine the migration characteristic based on the plurality of first sample migration characteristics and the plurality of second sample migration characteristics.
In some embodiments, the first determining module 55 includes:
a first obtaining sub-module 551, configured to, for any first sample image in the first sample image set, obtain a first hidden spatial feature of the first sample image in a first hidden space of the first model;
a first dimension reduction submodule 552, configured to perform dimension reduction on the first hidden spatial feature to obtain a first feature vector matrix;
a first determining sub-module 553, configured to determine a first sample migration feature corresponding to the first sample image based on the first implicit spatial feature and the first feature vector matrix.
In some embodiments, the first dimension reduction submodule 552 includes:
a first obtaining unit 5521, configured to obtain an average value of each element in the first implicit spatial feature;
a first determining unit 5522, configured to determine, based on the average, a plurality of first eigenvalues and a plurality of first eigenvectors of a first covariance matrix, where the first eigenvalues are in one-to-one correspondence with the first eigenvectors;
the second determining unit 5523 is configured to determine a first eigenvector matrix based on the plurality of first eigenvalues, where a row vector in the first eigenvector matrix is a first eigenvector corresponding to the first eigenvalue satisfying the sorting condition.
In some embodiments, the first determining unit 5522 is configured to determine, based on the average, a covariance corresponding to each element in the first implicit spatial feature, so as to obtain the first covariance matrix; and carrying out eigenvalue decomposition on the first covariance matrix to obtain a plurality of first eigenvalues and a plurality of first eigenvectors.
In some embodiments, the second determining unit 5523 is configured to sort the plurality of first eigenvalues; select at least one first eigenvalue sorted before the first target order from the sorted plurality of first eigenvalues; and use at least one first eigenvector corresponding to the at least one first eigenvalue as a row vector to obtain the first eigenvector matrix.
In some embodiments, the second determining module 56 includes:
a second obtaining submodule 561, configured to, for any second sample image in the second sample image set, obtain a second hidden space feature of the second sample image in a second hidden space of the second model;
a second dimension reduction submodule 562, configured to perform dimension reduction on the second hidden spatial feature to obtain a second feature vector matrix;
a second determining submodule 563 configured to determine, based on the second implicit spatial feature and the second feature vector matrix, a second sample migration feature corresponding to the second sample image.
In some embodiments, the second dimension reduction submodule 562 includes:
a second obtaining unit 5621, configured to obtain an average value of each element in the second implicit spatial feature;
a third determining unit 5622, configured to determine, based on the average, a plurality of second eigenvalues and a plurality of second eigenvectors of a second covariance matrix, where the second eigenvalues are in one-to-one correspondence with the second eigenvectors;
a fourth determining unit 5623, configured to determine a second eigenvector matrix based on the plurality of second eigenvalues, where a row vector in the second eigenvector matrix is a second eigenvector corresponding to the second eigenvalue that satisfies the sorting condition.
In some embodiments, the third determining unit 5622 is configured to determine, based on the average, a covariance corresponding to each element in the second implicit spatial feature, so as to obtain the second covariance matrix; and performing eigenvalue decomposition on the second covariance matrix to obtain a plurality of second eigenvalues and a plurality of second eigenvectors.
In some embodiments, the fourth determining unit 5623 is configured to sort the plurality of second eigenvalues; select at least one second eigenvalue sorted before the second target order from the sorted plurality of second eigenvalues; and use at least one second eigenvector corresponding to the at least one second eigenvalue as a row vector to obtain the second eigenvector matrix.
The embodiment of the application provides an image processing device, which can migrate a mapped first semantic space feature based on a migration feature by mapping an image to be processed to a first hidden space in a source field, migrate a semantic direction of the image to be processed in the first hidden space to a second hidden space to obtain a second semantic feature, thereby generating a target image, completing image migration without using a large amount of data for training, saving time and having low computational power requirement on a training platform.
It should be noted that: in the image processing apparatus provided in the above embodiment, only the division of the functional modules is illustrated when performing image processing, and in practical applications, the functions may be distributed by different functional modules as needed, that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the image processing apparatus and the image processing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
In this embodiment of the present application, the computer device can be configured as a terminal or a server, when the computer device is configured as a terminal, the terminal can be used as an execution subject to implement the technical solution provided in the embodiment of the present application, when the computer device is configured as a server, the server can be used as an execution subject to implement the technical solution provided in the embodiment of the present application, or the technical solution provided in the present application can be implemented through interaction between the terminal and the server, which is not limited in this embodiment of the present application.
When the computer device is configured as a terminal, fig. 7 is a block diagram of a terminal 700 according to an embodiment of the present application. The terminal 700 may be a portable mobile terminal such as a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 700 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.
In general, terminal 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit) which is responsible for rendering and drawing the content required to be displayed by the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. Memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 702 is used to store at least one computer program for execution by the processor 701 to implement the image processing method provided by the method embodiments herein.
In some embodiments, the terminal 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Various peripheral devices may be connected to peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, a positioning component 708, and a power source 709.
The peripheral interface 703 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 701 and the memory 702. In some embodiments, processor 701, memory 702, and peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 704 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 704 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. In some embodiments, the radio frequency circuitry 704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, generations of mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, the display screen 705 also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 701 as a control signal for processing. At this point, the display 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 705, disposed on the front panel of the terminal 700; in other embodiments, there may be at least two displays 705, respectively disposed on different surfaces of the terminal 700 or in a folded design; in still other embodiments, the display 705 may be a flexible display disposed on a curved surface or a folded surface of the terminal 700. The display 705 may even be arranged in a non-rectangular irregular shape, that is, an irregularly shaped screen. The display 705 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 706 is used to capture images or video. In some embodiments, camera assembly 706 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 706 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuitry 707 may include a microphone and a speaker. The microphone is used for collecting sound waves of the user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 701 for processing or to the radio frequency circuit 704 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 700. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electric signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker can be a traditional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert an electric signal into a sound wave audible to humans, but also convert an electric signal into a sound wave inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuitry 707 may also include a headphone jack.
The positioning component 708 is used to locate the current geographic location of the terminal 700 for navigation or LBS (Location Based Service). The positioning component 708 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 709 is provided to supply power to various components of terminal 700. The power source 709 may be alternating current, direct current, disposable batteries, or rechargeable batteries. When the power source 709 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. The wired rechargeable battery is a battery charged through a wired line, and the wireless rechargeable battery is a battery charged through a wireless coil. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 700 also includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyro sensor 712, pressure sensor 713, fingerprint sensor 714, optical sensor 715, and proximity sensor 716.
The acceleration sensor 711 can detect the magnitude of acceleration in three coordinate axes of a coordinate system established with the terminal 700. For example, the acceleration sensor 711 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 701 may control the display screen 705 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 712 may detect a body direction and a rotation angle of the terminal 700, and the gyro sensor 712 may cooperate with the acceleration sensor 711 to acquire a 3D motion of the terminal 700 by the user. From the data collected by the gyro sensor 712, the processor 701 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
Pressure sensors 713 may be disposed on a side frame of terminal 700 and/or underneath display 705. When the pressure sensor 713 is disposed on a side frame of the terminal 700, a user's grip signal on the terminal 700 may be detected, and the processor 701 performs right-left hand recognition or shortcut operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed at a lower layer of the display screen 705, the processor 701 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 705. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 714 is used for collecting a fingerprint of a user, and the processor 701 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 identifies the identity of the user according to the collected fingerprint. When the user identity is identified as a trusted identity, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking a screen, viewing encrypted information, downloading software, paying, changing settings, and the like. The fingerprint sensor 714 may be disposed on the front, back, or side of the terminal 700. When a physical button or a vendor Logo is provided on the terminal 700, the fingerprint sensor 714 may be integrated with the physical button or the vendor Logo.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the ambient light intensity is high, the display brightness of the display screen 705 is increased; when the ambient light intensity is low, the display brightness of the display screen 705 is adjusted down. In another embodiment, processor 701 may also dynamically adjust the shooting parameters of camera assembly 706 based on the ambient light intensity collected by optical sensor 715.
A proximity sensor 716, also referred to as a distance sensor, is typically disposed on a front panel of the terminal 700. The proximity sensor 716 is used to collect the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the display 705 to switch from the bright-screen state to the off-screen state; when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually increases, the processor 701 controls the display 705 to switch from the off-screen state to the bright-screen state.
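Purely as a toy illustration of this switching rule (not the terminal's actual control logic), the behavior can be sketched as:

def proximity_screen_state(previous_distance, distance, current_state):
    # Distance shrinking: the user approaches the front panel, so switch the screen off.
    if distance < previous_distance:
        return "off"
    # Distance growing: the user moves away, so switch the screen back on.
    if distance > previous_distance:
        return "on"
    return current_state  # an unchanged distance keeps the current state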
Those skilled in the art will appreciate that the structure shown in fig. 7 does not constitute a limitation of terminal 700; the terminal may include more or fewer components than those shown, combine certain components, or adopt a different arrangement of components.
When the computer device is configured as a server, fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application. The server 800 may vary considerably in configuration or performance, and may include one or more processors (CPUs) 801 and one or more memories 802, where the memories 802 store at least one computer program that is loaded and executed by the processors 801 to implement the image processing method provided by each of the method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and the server may also include other components for implementing device functions, which are not described herein again.
The embodiment of the present application further provides a computer-readable storage medium, where at least one computer program is stored in the computer-readable storage medium, and the at least one computer program is loaded and executed by a processor of a computer device to implement the operations performed by the computer device in the image processing method according to the foregoing embodiments. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In some embodiments, the computer program according to the embodiments of the present application may be deployed to be executed on one computer device, on multiple computer devices located at one site, or on multiple computer devices distributed at multiple sites and interconnected by a communication network; the multiple computer devices distributed at multiple sites and interconnected by a communication network may constitute a blockchain system.
Embodiments of the present application also provide a computer program product comprising computer program code stored in a computer readable storage medium. The processor of the computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computer device performs the image processing method provided in the above-described various alternative implementations.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. An image processing method, characterized in that the method comprises:
mapping an image to be processed to a first hidden space of a source field to obtain a first semantic feature, wherein the image to be processed has the style of the source field, and the first semantic feature is used for representing the semantic direction of the image to be processed in the first hidden space;
acquiring a migration feature, wherein the migration feature is used for migrating semantic features in the first hidden space to a second hidden space of a target field, the migration feature being obtained by matching a first model with a second model, the first model being a generative adversarial model pre-trained in the source field, and the second model being a generative adversarial model pre-trained in the target field;
migrating the first semantic feature into a second semantic feature based on the migration feature, wherein the second semantic feature is used for representing the semantic direction of the image to be processed in the second hidden space;
and generating a target image based on the second semantic features, wherein the target image has the style of the target field.
2. The method of claim 1, wherein said migrating the first semantic feature into a second semantic feature based on the migration feature comprises:
for any first semantic element in the first semantic feature, determining, from the migration feature, a migration element corresponding to the first semantic element, wherein the first semantic element represents the semantic direction of one image semantic in the image to be processed;
and migrating the first semantic element based on the migration element to obtain a corresponding second semantic element in the second semantic feature.
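A minimal numpy sketch of this per-element migration, assuming purely for illustration that the semantic features are 1-D vectors and that each migration element acts as an additive offset; the claim itself does not fix the form of the element-wise operation:

import numpy as np

def migrate_elements(first_semantic, migration_feature):
    # first_semantic[i]: semantic direction of one image semantic (a first semantic element).
    # migration_feature[i]: the migration element corresponding to that first semantic element.
    assert first_semantic.shape == migration_feature.shape
    second_semantic = np.empty_like(first_semantic)
    for i in range(first_semantic.size):
        # Assumed additive migration of each element, yielding the second semantic element.
        second_semantic[i] = first_semantic[i] + migration_feature[i]
    return second_semantic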
3. The method of claim 1, further comprising:
determining a plurality of first sample migration features based on the first model and a first sample image set used to train the first model, the plurality of first sample migration features being in one-to-one correspondence with a plurality of first sample images in the first sample image set;
determining a plurality of second sample migration features based on the second model and a second sample image set used to train the second model, the plurality of second sample migration features being in one-to-one correspondence with a plurality of second sample images in the second sample image set;
and determining the migration feature based on the plurality of first sample migration features and the plurality of second sample migration features.
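One hedged reading of claim 3, in which the migration feature is aggregated as the difference between the mean of the second sample migration features and the mean of the first sample migration features; this aggregation rule is an assumption made only for illustration:

import numpy as np

def migration_feature(first_sample_feats, second_sample_feats):
    # first_sample_feats: (n1, d) array, one row per first sample image (source field).
    # second_sample_feats: (n2, d) array, one row per second sample image (target field).
    # Assumed rule: mean target-field feature minus mean source-field feature.
    return second_sample_feats.mean(axis=0) - first_sample_feats.mean(axis=0)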
4. The method of claim 3, wherein said determining a plurality of first sample migration features based on the first model and the first sample image set used to train the first model comprises:
for any first sample image in the first sample image set, acquiring a first hidden space feature of the first sample image in a first hidden space of the first model;
reducing the dimension of the first hidden space feature to obtain a first eigenvector matrix;
and determining a first sample migration feature corresponding to the first sample image based on the first hidden space feature and the first eigenvector matrix.
5. The method of claim 4, wherein said reducing the dimension of the first hidden space feature to obtain a first eigenvector matrix comprises:
obtaining an average value of each element in the first hidden space feature;
determining a plurality of first eigenvalues and a plurality of first eigenvectors of a first covariance matrix based on the average value, wherein the first eigenvalues are in one-to-one correspondence with the first eigenvectors;
and determining the first eigenvector matrix based on the plurality of first eigenvalues, wherein the row vectors of the first eigenvector matrix are first eigenvectors corresponding to first eigenvalues that meet a sorting condition.
6. The method of claim 5, wherein said determining a plurality of first eigenvalues and a plurality of first eigenvectors of a first covariance matrix based on the average value comprises:
determining the covariance corresponding to each element in the first hidden space feature based on the average value to obtain the first covariance matrix;
and performing eigenvalue decomposition on the first covariance matrix to obtain the plurality of first eigenvalues and the plurality of first eigenvectors.
7. The method of claim 5, wherein determining a first eigenvector matrix based on the plurality of first eigenvalues comprises:
sorting the plurality of first eigenvalues;
selecting, from the sorted plurality of first eigenvalues, at least one first eigenvalue ranked before a first target order;
and taking at least one first eigenvector corresponding to the at least one first eigenvalue as a row vector to obtain the first eigenvector matrix.
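Claims 4 to 7 together read as a principal-component-style reduction: center the hidden space feature, form its covariance matrix, eigendecompose it, sort the eigenvalues, and keep the leading eigenvectors as row vectors. A minimal numpy sketch, assuming the first hidden space feature is an n-by-d matrix of samples and that "before the first target order" means the k largest eigenvalues:

import numpy as np

def first_eigenvector_matrix(hidden_feature, k):
    # hidden_feature: (n, d) array of hidden-space samples (assumed layout).
    mean = hidden_feature.mean(axis=0)                           # claim 5: average value of the elements
    centered = hidden_feature - mean
    cov = centered.T @ centered / (hidden_feature.shape[0] - 1)  # claim 6: first covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)                       # claim 6: eigenvalue decomposition
    order = np.argsort(eigvals)[::-1]                            # claim 7: sort (descending is assumed)
    top = order[:k]                                              # claim 7: eigenvalues before the target order
    return eigvecs[:, top].T                                     # claim 7: eigenvectors as row vectors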
8. The method of claim 3, wherein said determining a plurality of second sample migration features based on the second model and the second sample image set used to train the second model comprises:
for any second sample image in the second sample image set, obtaining a second hidden space feature of the second sample image in a second hidden space of the second model;
reducing the dimension of the second hidden space feature to obtain a second eigenvector matrix;
and determining a second sample migration feature corresponding to the second sample image based on the second hidden space feature and the second eigenvector matrix.
9. The method of claim 8, wherein said reducing the dimension of the second hidden space feature to obtain a second eigenvector matrix comprises:
obtaining an average value of each element in the second hidden space feature;
determining a plurality of second eigenvalues and a plurality of second eigenvectors of a second covariance matrix based on the average value, the second eigenvalues being in one-to-one correspondence with the second eigenvectors;
and determining the second eigenvector matrix based on the plurality of second eigenvalues, wherein the row vectors of the second eigenvector matrix are second eigenvectors corresponding to second eigenvalues that meet a sorting condition.
10. The method of claim 9, wherein said determining a plurality of second eigenvalues and a plurality of second eigenvectors of a second covariance matrix based on the average value comprises:
determining the covariance corresponding to each element in the second hidden space feature based on the average value to obtain the second covariance matrix;
and performing eigenvalue decomposition on the second covariance matrix to obtain the plurality of second eigenvalues and the plurality of second eigenvectors.
11. The method of claim 9, wherein determining a second eigenvector matrix based on the plurality of second eigenvalues comprises:
sorting the plurality of second eigenvalues;
selecting, from the sorted plurality of second eigenvalues, at least one second eigenvalue ranked before a second target order;
and taking at least one second eigenvector corresponding to the at least one second eigenvalue as a row vector to obtain the second eigenvector matrix.
12. An image processing apparatus, characterized in that the apparatus comprises:
the mapping module is used for mapping an image to be processed to a first hidden space of a source field to obtain a first semantic feature, wherein the image to be processed has the style of the source field, and the first semantic feature is used for representing the semantic direction of the image to be processed in the first hidden space;
an obtaining module, configured to obtain a migration feature, wherein the migration feature is used to migrate semantic features in the first hidden space to a second hidden space of a target field, the migration feature being obtained by matching a first model with a second model, the first model being a generative adversarial model pre-trained in the source field, and the second model being a generative adversarial model pre-trained in the target field;
a migration module, configured to migrate the first semantic feature into a second semantic feature based on the migration feature, where the second semantic feature is used to represent a semantic direction of the to-be-processed image in the second hidden space;
and the image generation module is used for generating a target image based on the second semantic features, wherein the target image has the style of the target field.
13. A computer device, characterized in that the computer device comprises a processor and a memory for storing at least one computer program, the at least one computer program being loaded and executed by the processor to perform the image processing method according to any one of claims 1 to 11.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store at least one computer program for executing the image processing method according to any one of claims 1 to 11.
15. A computer program product, characterized in that the computer program product comprises computer program code stored in a computer-readable storage medium, a processor of a computer device reading the computer program code from the computer-readable storage medium and executing the computer program code, so that the computer device performs the image processing method according to any one of claims 1 to 11.
CN202111081567.2A 2021-09-15 2021-09-15 Image processing method, image processing device, computer equipment and storage medium Pending CN114283049A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111081567.2A CN114283049A (en) 2021-09-15 2021-09-15 Image processing method, image processing device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111081567.2A CN114283049A (en) 2021-09-15 2021-09-15 Image processing method, image processing device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114283049A true CN114283049A (en) 2022-04-05

Family

ID=80868583

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111081567.2A Pending CN114283049A (en) 2021-09-15 2021-09-15 Image processing method, image processing device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114283049A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116993590A (en) * 2023-08-09 2023-11-03 中国电信股份有限公司技术创新中心 Image processing method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN110097019B (en) Character recognition method, character recognition device, computer equipment and storage medium
CN109299315B (en) Multimedia resource classification method and device, computer equipment and storage medium
CN111489378B (en) Video frame feature extraction method and device, computer equipment and storage medium
CN110083791B (en) Target group detection method and device, computer equipment and storage medium
CN111091132A (en) Image recognition method and device based on artificial intelligence, computer equipment and medium
CN111104980B (en) Method, device, equipment and storage medium for determining classification result
CN110544272A (en) face tracking method and device, computer equipment and storage medium
CN110675412A (en) Image segmentation method, training method, device and equipment of image segmentation model
CN114283050A (en) Image processing method, device, equipment and storage medium
CN111738365B (en) Image classification model training method and device, computer equipment and storage medium
CN110942046A (en) Image retrieval method, device, equipment and storage medium
CN110705614A (en) Model training method and device, electronic equipment and storage medium
CN110647881A (en) Method, device, equipment and storage medium for determining card type corresponding to image
CN114282035A (en) Training and searching method, device, equipment and medium of image searching model
CN111192072A (en) User grouping method and device and storage medium
CN113570510A (en) Image processing method, device, equipment and storage medium
CN109388732B (en) Music map generating and displaying method, device and storage medium
CN114283049A (en) Image processing method, image processing device, computer equipment and storage medium
CN113361376B (en) Method and device for acquiring video cover, computer equipment and readable storage medium
CN113822916B (en) Image matching method, device, equipment and readable storage medium
CN113343709B (en) Method for training intention recognition model, method, device and equipment for intention recognition
CN113569822B (en) Image segmentation method and device, computer equipment and storage medium
CN114817709A (en) Sorting method, device, equipment and computer readable storage medium
CN112907939B (en) Traffic control subarea dividing method and device
CN112560472B (en) Method and device for identifying sensitive information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination