CN111833413A - Image processing method, image processing device, electronic equipment and computer readable storage medium - Google Patents

Image processing method, image processing device, electronic equipment and computer readable storage medium

Info

Publication number
CN111833413A
Authority
CN
China
Prior art keywords: person; face image; face; image; synthesized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010710400.7A
Other languages: Chinese (zh)
Other versions: CN111833413B (en)
Inventor
郑子奇
徐国强
邱寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202010710400.7A
Publication of CN111833413A
Priority to PCT/CN2021/096713 (published as WO2022016996A1)
Application granted
Publication of CN111833413B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00: 2D [Two Dimensional] image generation
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161: Detection; Localisation; Normalisation
    • G06V 40/168: Feature extraction; Face representation
    • G06V 40/174: Facial expression recognition

Abstract

The embodiments of the application provide an image processing method, an image processing device, electronic equipment and a computer readable storage medium. The method includes the following steps: acquiring a face image set, where the face image set includes a face image of a first person, a face image of a second person in a specified posture and a face image of a third person with a specified expression; extracting features of each face image in the face image set to obtain a first face feature set, where the first face feature set includes the face features of the first person, the posture features of the second person and the expression features of the third person; and performing face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image has the face features of the first person, the posture features of the second person and the expression features of the third person. By adopting the method and the device, the quality of the generated face image can be improved. In addition, the application also relates to blockchain technology: the synthesized face image of the first person can be written into a blockchain.

Description

Image processing method, image processing device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Image generation is an information processing technology that has become popular in recent years. It covers many areas, among which face image generation is a particularly important research direction. Its main purpose is to generate high-quality face images for commercial or experimental use, or to generate face images that satisfy constraints specified by a user.
At present, face image generation technology is mainly applied at scale in fields such as media and marketing, for example in the production of virtual characters. However, fine-grained editing of faces remains difficult, and the granularity of control is insufficient. On the one hand, such requirements are hard to satisfy because the technology is immature; on the other hand, the related technologies require the support of massive data. Meanwhile, even relatively mature related technologies suffer from various defects, such as the identity of the generated face not matching the source person, insufficient expressiveness of expressions, low definition of the generated face image, and monotonous image content. Therefore, how to improve the quality of generated face images has become an urgent problem to be solved.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, electronic equipment and a computer readable storage medium, which can improve the quality of a generated face image.
In a first aspect, an embodiment of the present application provides an image processing method, including:
acquiring a face image set, wherein the face image set comprises a face image of a first person, a face image of a second person in a specified posture and a face image of a third person in a specified expression;
extracting the features of each face image in the face image set to obtain a first face feature set, wherein the first face feature set comprises the face features of the first person, the posture features of the second person and the expression features of the third person;
and performing face synthesis according to the first face feature set to obtain a synthesized face image of the first person, wherein the synthesized face image of the first person has the face feature of the first person, the posture feature of the second person and the expression feature of the third person.
Optionally, the performing face synthesis according to the first face feature set to obtain a synthesized face image of the first person includes:
utilizing the trained convolutional neural network model to perform upsampling on the first human face feature set to obtain a synthesized human face image of the first person, wherein the upsampling comprises any one of the following items: bilinear interpolation, nearest neighbor interpolation, and transposed convolution.
Optionally, the upsampling includes a transposed convolution, and the upsampling the first human face feature set by using the trained convolutional neural network model to obtain a synthesized human face image of the first person includes:
and performing a transposed convolution on the first face feature set by using a transposed convolution layer included in the trained convolutional neural network model to obtain a synthesized face image of the first person.
Optionally, the performing a transpose convolution on the first human face feature set by using a transpose convolution layer included in the trained convolutional neural network model to obtain a synthesized human face image of the first person includes:
constructing a feature map by using the first face feature set;
and performing a transposed convolution on the feature map by using a transposed convolution layer included in the trained convolutional neural network model to obtain a synthesized face image of the first person.
Optionally, after obtaining the synthetic face image of the first person, the method further includes:
carrying out image detection on the synthesized face image of the first person to obtain an image detection result;
performing face correction on the synthesized face image of the first person according to the image detection result to obtain a corrected synthesized face image of the first person;
and outputting the corrected synthesized face image of the first person.
Optionally, the method further comprises:
acquiring a facial image data set, wherein the facial image data set comprises at least one group of images, and each group of images in the at least one group of images comprises an original facial image, at least one facial image corresponding to each posture in at least one posture and at least one facial image corresponding to each expression in at least one expression;
and training an initial convolutional neural network model by using the face image data set to obtain a trained convolutional neural network model.
Optionally, the training an initial convolutional neural network model by using the face image data set to obtain a trained convolutional neural network model includes:
performing feature extraction on a target group of images in the at least one group of images to obtain a second face feature set;
utilizing an initial convolutional neural network model to perform upsampling on the second face feature set to obtain at least one first synthesized face image;
and performing restoration processing on each first synthesized face image in the at least one first synthesized face image through the initial convolutional neural network model to obtain a second synthesized face image corresponding to each first synthesized face image, where the second synthesized face image matches the original face image included in the target group of images, so as to obtain a converged convolutional neural network model as the trained convolutional neural network model.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
an acquisition module, configured to acquire a face image set, where the face image set includes a face image of a first person, a face image of a second person in a specified posture and a face image of a third person with a specified expression;
a processing module, configured to perform feature extraction on each face image in the face image set to obtain a first face feature set, where the first face feature set includes the face features of the first person, the posture features of the second person and the expression features of the third person;
the processing module is further configured to perform face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image of the first person has the face feature of the first person, the posture feature of the second person, and the expression feature of the third person.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory, where the processor and the memory are connected to each other, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program, where the computer program is executed by a processor to implement the method according to the first aspect.
In summary, the electronic device may acquire a face image set, where the face image set includes a face image of a first person, a face image of a second person in a specified posture and a face image of a third person with a specified expression. The electronic device may extract features of each face image in the face image set to obtain a first face feature set, and perform face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image has the face features of the first person, the posture features of the second person and the expression features of the third person.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of another image processing method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Because changes in face posture and expression are diverse and complex, and because face image generation technologies in the prior art have defects, such as the particular instability of face generation models and the possible loss of detail, generating multi-angle face expression images is very difficult, and the quality of the generated face images is not high.
The image processing scheme described in the embodiments of the present application specifically includes: acquiring a face image set, where the face image set includes a face image of a first person, a face image of a second person in a specified posture and a face image of a third person with a specified expression; extracting features of each face image in the face image set to obtain a first face feature set, where the first face feature set includes the face features of the first person, the posture features of the second person and the expression features of the third person; and performing face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image has the face features of the first person, the posture features of the second person and the expression features of the third person. This process can generate faces with different expressions at multiple angles, and can meet certain image quality requirements while editing face attributes with a certain accuracy.
in one application scenario, the user may upload the front photo of the person a, the side photo of the person B, such as the photo turned by 60 ° on the right side, and the expression photo of the person B to the electronic device, and then, by using the above-mentioned image processing scheme, the photo with the expression on the side of the person a may be synthesized, where the expression is the expression corresponding to the expression photo of the person B. The person a may be the user himself or another person.
In another application scenario, a user may upload a frontal photo of person A to the electronic device, select a side photo of person B and an expression photo of person B from a plurality of photos provided by the electronic device for face synthesis, and then submit a synthesis instruction to the electronic device by clicking a synthesis button. After detecting the synthesis instruction, the electronic device may acquire the frontal photo of person A, the side photo of person B and the expression photo of person B, and then, by means of the above image processing scheme, synthesize a side-view photo of person A with an expression, where the expression corresponds to the expression photo of person B.
In one application scenario, the electronic device may provide a face synthesis interface for the user, so that the user can set a face image set based on the face synthesis interface, for example by setting a frontal photo of person A, a side photo of person B and an expression photo of person B. The face synthesis interface may include a synthesis button, and the user may submit a synthesis instruction to the electronic device by clicking the synthesis button; after detecting the synthesis instruction, the electronic device may obtain the face image set. In one embodiment, the synthesis instruction may carry the face image set, such as the frontal photo of person A, the side photo of person B and the expression photo of person B.
In one embodiment, in order to prevent a synthesized face image from being passed off as a real face for criminal purposes, the synthesized face image of the first person may be written into a blockchain, so that the synthesized face image can be traced. Alternatively, the electronic device may write into the blockchain the synthesized face image of the first person together with the identifier of the user who requested the synthesis, or the device information of the user terminal corresponding to that user. The identifier of the user may be, for example, the user's account number or mobile phone number, which can uniquely identify the user. The device information of the user terminal may be, for example, the physical address, device number or internet protocol address of the user terminal, which can uniquely identify the user terminal. In one embodiment, the electronic device referred to in the embodiments of the present application may or may not be the user terminal corresponding to the user. In one embodiment, in a case where the synthesized face image of the first person has been corrected, the electronic device may write the corrected synthesized face image of the first person into the blockchain.
In one embodiment, in consideration of the privacy of the synthesized face image, a digest of the synthesized face image of the first person may be calculated to obtain digest information, and the digest information is then written into the blockchain. The embodiments of the present application do not limit the algorithm used to calculate the digest information of the synthesized face image of the first person. Correspondingly, digest calculation may be performed on the synthesized face image of the first person together with the above user identifier to obtain first digest information, which is then written into the blockchain; or digest calculation may be performed on the synthesized face image of the first person together with the above device information of the user terminal to obtain second digest information, which is then written into the blockchain. In one embodiment, in a case where the electronic device has corrected the synthesized face image of the first person, the electronic device may perform digest calculation on the corrected synthesized face image of the first person to obtain third digest information, and then write the third digest information into the blockchain.
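By way of illustration only, the digest calculation described above can be realized with a standard cryptographic hash. The following is a minimal Python sketch that assumes SHA-256 as the digest algorithm (the application does not fix one) and uses a hypothetical write_to_blockchain helper in place of whatever ledger client an actual deployment would use:

```python
import hashlib

def compute_digest(image_bytes: bytes, extra: bytes = b"") -> str:
    # Hash the synthesized face image, optionally combined with the user
    # identifier or the device information of the user terminal.
    h = hashlib.sha256()
    h.update(image_bytes)
    h.update(extra)  # e.g. user account number or device number, as bytes
    return h.hexdigest()

# Hypothetical usage; write_to_blockchain stands in for the actual chain API.
# digest = compute_digest(open("synth_face.png", "rb").read(), b"user-123")
# write_to_blockchain(digest)
```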
Please refer to fig. 1, which is a flowchart illustrating an image processing method according to an embodiment of the present disclosure. The method can be applied to electronic equipment, and the electronic equipment can be a terminal or a server. The terminal may be an intelligent terminal such as a notebook computer, a desktop computer, etc., and the server includes, but is not limited to, a single server or a server cluster. Specifically, the method may comprise the steps of:
s101, a face image set is obtained, and the face image set comprises a face image of a first person, a face image of a second person in a specified posture and a face image of a third person in a specified expression.
The first person, the second person and the third person may be the same person or different persons. The specified posture may also be referred to as a fixed posture; for example, the specified posture may be a specified left-right rotation angle, a specified up-down rotation angle, or a rotation angle within a specified plane. The posture described in the embodiments of the present application may refer to the posture of a face. Accordingly, the specified expression may also be referred to as a fixed expression; for example, the specified expression may be angry, happy or sad. A face image in the embodiments of the present application may refer to an image that includes a person's face.
In one embodiment, the electronic device may perform step S101 when detecting the composition instruction.
In an embodiment, when the synthesis instruction is detected, the process of acquiring the face image set may be that the electronic device acquires the face image set carried by the synthesis instruction. Alternatively, when the synthesis instruction is detected, the process of acquiring the face image set may be that the electronic device reads the face image set from a specified directory.
S102, extracting the features of each face image in the face image set to obtain a first face feature set, wherein the first face feature set comprises the face features of the first person, the posture features of the second person and the expression features of the third person.
The face features may also be referred to as facial features. The posture features here may refer to the posture features of a face.
In one embodiment, the electronic device may invoke a feature extraction algorithm to perform feature extraction on each face image in the face image set, so as to obtain a first face feature set. By adopting the process, the characteristics can be effectively extracted, and the accuracy of the extracted characteristics is guaranteed.
In an embodiment, the electronic device may specifically invoke a feature extraction algorithm to perform feature extraction on a face image of a first person to obtain facial features of the first person, perform feature extraction on a face image of a second person in a specified posture to obtain posture features of the second person, and perform feature extraction on a face image of a third person in a specified expression to obtain expression features of the third person.
In one embodiment, the feature extraction algorithm employed by the electronic device may also be different depending on the features extracted. For example, the electronic device may specifically perform feature extraction on a face image of a first person by using a facial feature extraction algorithm, perform feature extraction on a face image of a specified pose of a second person by using a pose feature extraction algorithm, and perform feature extraction on a face image of a specified expression of a third person by using an expression feature extraction algorithm.
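As an illustration of such attribute-specific extraction, the following sketch builds three small convolutional encoders, one per attribute, in PyTorch. The framework, layer sizes and feature dimensions are assumptions made for the example and are not specified by the application:

```python
import torch
import torch.nn as nn

def make_encoder(feat_dim: int) -> nn.Sequential:
    # A small convolutional encoder mapping a 3x128x128 face image to a
    # feature vector; all sizes here are illustrative assumptions.
    return nn.Sequential(
        nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # -> 32x64x64
        nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # -> 64x32x32
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # -> 128x16x16
        nn.Flatten(),
        nn.Linear(128 * 16 * 16, feat_dim),
    )

identity_encoder = make_encoder(256)    # face features of the first person
posture_encoder = make_encoder(64)      # posture features of the second person
expression_encoder = make_encoder(64)   # expression features of the third person
# The first face feature set is then the triple
# (identity_encoder(img_a), posture_encoder(img_b), expression_encoder(img_c)).
```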
S103, carrying out face synthesis according to the first face feature set to obtain a synthesized face image of the first person, wherein the synthesized face image of the first person has the face feature of the first person, the posture feature of the second person and the expression feature of the third person.
In the embodiment of the application, the electronic device can perform feature fusion according to the first human face feature set, so as to obtain a synthetic human face image of the first person. After obtaining the synthetic face image of the first person, the electronic device may output the synthetic face image of the first person. The face corresponding to the synthesized face image of the first person obtained by the process is the face of the first person, the posture corresponding to the synthesized face image of the first person is the specified posture, and the expression corresponding to the synthesized face image of the first person is the specified expression.
In an embodiment, in order to perform feature fusion effectively, the electronic device may specifically perform face synthesis according to preset fusion parameters and the first face feature set, so as to obtain a synthesized face image of the first person. The fusion parameters can be set empirically.
In one embodiment, in order to perform feature fusion efficiently, the electronic device may also perform upsampling according to the first face feature set to obtain the synthesized face image of the first person. The upsampling may include any one of the following: bilinear interpolation, nearest neighbor interpolation and transposed convolution.
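To make the fusion and upsampling concrete, the following sketch (in the same illustrative PyTorch setting as above) concatenates the three feature vectors, constructs a feature map from them, and upsamples the map back to image resolution with transposed convolutions; all dimensions are assumptions:

```python
import torch
import torch.nn as nn

class FaceSynthesizer(nn.Module):
    # Fuses identity, posture and expression features and upsamples them
    # into a synthesized face image (illustrative dimensions throughout).
    def __init__(self):
        super().__init__()
        self.fuse = nn.Linear(256 + 64 + 64, 128 * 16 * 16)
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # -> 32x32
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 64x64
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),    # -> 128x128
        )

    def forward(self, f_id, f_pose, f_expr):
        x = self.fuse(torch.cat([f_id, f_pose, f_expr], dim=1))
        x = x.view(-1, 128, 16, 16)   # construct a feature map from the feature set
        return self.decode(x)         # synthesized face image of the first person
```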
In the embodiment shown in fig. 1, the electronic device may acquire a face image set, where the face image set includes a face image of a first person, a face image of a second person in a specified posture and a face image of a third person with a specified expression. The electronic device may extract features of each face image in the face image set to obtain a first face feature set, and perform face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image has the face features of the first person, the posture features of the second person and the expression features of the third person. By adopting this process, vivid multi-angle face expression images can be generated, and the quality of the generated face images can be improved.
Please refer to fig. 2, which is a flowchart illustrating another image processing method according to an embodiment of the present disclosure. The method can be applied to electronic equipment, and the electronic equipment can be a terminal or a server. The terminal may be an intelligent terminal such as a notebook computer, a desktop computer, etc., and the server includes, but is not limited to, a single server or a server cluster. Compared with the embodiment of fig. 1, the embodiment of the present application can improve the stability of the synthesized face image through the image rectification process of steps S204 to S206, so that the quality of the output synthesized face image is higher. Specifically, the method may comprise the steps of:
s201, a face image set is obtained, wherein the face image set comprises a face image of a first person, a face image of a second person in a specified posture and a face image of a third person in a specified expression.
S202, extracting the features of each face image in the face image set to obtain a first face feature set, wherein the first face feature set comprises the face features of the first person, the posture features of the second person and the expression features of the third person.
S203, carrying out face synthesis according to the first face feature set to obtain a synthesized face image of the first person, wherein the synthesized face image of the first person has the face feature of the first person, the posture feature of the second person and the expression feature of the third person.
Steps S201 to S203 may refer to steps S101 to S103 in the embodiment of fig. 1, and details are not described herein again.
And S204, carrying out image detection on the synthesized face image of the first person to obtain an image detection result.
S205, performing face correction on the synthesized face image of the first person according to the image detection result to obtain a corrected synthesized face image of the first person.
And S206, outputting the corrected synthesized face image of the first person.
In steps S204 to S206, in order to make the output synthesized face image more realistic and stable, the electronic device may correct the synthesized face image of the first person. Specifically, the electronic device may perform image detection on the synthesized face image of the first person to obtain an image detection result, correct the synthesized face image of the first person according to the image detection result to obtain a corrected synthesized face image of the first person, and output the corrected synthesized face image of the first person.
In one embodiment, the process by which the electronic device performs image detection on the synthesized face image of the first person to obtain an image detection result, and corrects the synthesized face image according to that result, may be as follows: the electronic device performs face detection on the synthesized face image of the first person to obtain, as the image detection result, the coordinates of each of a plurality of key points in the synthesized face image; the electronic device calculates a transformation matrix for transforming the coordinates of each key point to the coordinates of the preset key point corresponding to that key point; and the electronic device obtains the corrected synthesized face image of the first person from the synthesized face image and the transformation matrix. The key points here may refer to face key points. The preset key points may refer to the key points of any image, or of a specified image, in the face image set. Since key points are one of the most important attributes of a human face, detecting the key points of a given person, head posture or expression and using them to correct the key points of the generated face makes the output synthesized face image more stable in most cases, reduces the possibility of distortion, and makes the corrected synthesized face image look more natural.
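As an illustrative sketch of this correction step, a transformation matrix can be estimated from the detected key points to the preset key points and applied to the synthesized image. The sketch below assumes OpenCV and an external key point detector, neither of which is named by the application:

```python
import cv2
import numpy as np

def correct_face(synth_img: np.ndarray,
                 detected_pts: np.ndarray,  # Nx2 key points detected in the synthesized image
                 preset_pts: np.ndarray     # Nx2 corresponding preset key points
                 ) -> np.ndarray:
    # Estimate a transformation matrix that maps the detected key points
    # onto the preset key points (a partial affine / similarity fit).
    matrix, _ = cv2.estimateAffinePartial2D(detected_pts, preset_pts)
    h, w = synth_img.shape[:2]
    # Warp the synthesized face image with the estimated matrix.
    return cv2.warpAffine(synth_img, matrix, (w, h))
```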
In an embodiment, in order to perform feature extraction effectively, the process by which the electronic device performs feature extraction on each face image in the face image set to obtain the first face feature set may be that the electronic device performs feature extraction on each face image in the face image set by using a trained convolutional neural network model, so as to obtain the first face feature set.
In an embodiment, the process by which the electronic device performs feature extraction on each face image in the face image set by using the trained convolutional neural network model to obtain the first face feature set may specifically be that the electronic device performs feature extraction on each face image in the face image set by using the convolutional layers included in the trained convolutional neural network model, so as to obtain the first face feature set.
In an embodiment, in order to perform feature fusion effectively, the process by which the electronic device performs face synthesis according to the first face feature set to obtain the synthesized face image of the first person may be that the electronic device performs upsampling according to the first face feature set to obtain the synthesized face image of the first person. The upsampling may include any one of the following: bilinear interpolation, nearest neighbor interpolation and transposed convolution.
In an embodiment, the electronic device may specifically perform upsampling on the first face feature set through the trained convolutional neural network model to obtain a synthesized face image of the first person. The up-sampling is realized through the trained convolutional neural network model, so that the up-sampling efficiency is higher.
In an embodiment, when the upsampling includes bilinear interpolation, the electronic device may specifically perform a bilinear interpolation operation according to the first face feature set through the trained convolutional neural network model, so as to obtain a synthesized face image of the first person. Images obtained by bilinear interpolation are of high quality and do not exhibit discontinuous pixel values.
In an embodiment, when the upsampling includes bilinear interpolation, the electronic device may specifically construct a feature map according to the first face feature set, and then perform bilinear interpolation operation according to the feature map by using the trained convolutional neural network model, so as to obtain a synthesized face image of the first person. The processing procedure of the bilinear interpolation method can be referred to the following formula:
y_{i,j} = x_{i^-,j^-}(1-\Delta i)(1-\Delta j) + x_{i^+,j^-}\,\Delta i\,(1-\Delta j) + x_{i^-,j^+}(1-\Delta i)\,\Delta j + x_{i^+,j^+}\,\Delta i\,\Delta j    (formula 1.1)
where y represents the synthesized face image and x represents the feature map; \Delta i = i - i^- and \Delta j = j - j^-; i denotes the abscissa and j the ordinate of the pixel at the target position; i^- denotes i rounded down and i^+ denotes i rounded up; j^- denotes j rounded down and j^+ denotes j rounded up.
In one embodiment, y may be taken as the synthetic face image of the first person.
In one embodiment, after y is obtained, y may be input into a first convolutional layer of the trained convolutional neural network model to perform a convolution operation, so as to obtain the final synthesized face image as the synthesized face image of the first person. Alternatively, after y is obtained, y is input into the first convolutional layer for a convolution operation to obtain a first synthesized face image, and a bilinear interpolation operation is then performed according to the first synthesized face image to obtain the final synthesized face image as the synthesized face image of the first person. In this process the bilinear interpolation operation and the convolution operation work together, so that the final synthesized face image has a higher resolution.
It should be noted that the bilinear interpolation in the embodiments of the present application essentially finds the four pixels surrounding the target position (i, j) in x, and uses their pixel values to calculate the pixel value at the target position in y.
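For illustration, in a PyTorch implementation (an assumed framework; the application names none) this bilinear upsampling step corresponds directly to torch.nn.functional.interpolate with mode='bilinear':

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 128, 16, 16)  # feature map built from the first face feature set
# Bilinear interpolation: each output pixel is a weighted average of the
# four surrounding input pixels, as in formula 1.1.
y = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
print(y.shape)  # torch.Size([1, 128, 32, 32])
```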
In an embodiment, when the upsampling includes nearest neighbor interpolation, the electronic device may specifically perform nearest neighbor interpolation operation according to the first face feature set through the trained convolutional neural network model to obtain a synthesized face image of the first person.
In an embodiment, when the upsampling includes nearest neighbor interpolation, the electronic device may specifically construct a feature map according to the first face feature set, and then perform nearest neighbor interpolation operation according to the feature map by using the trained convolutional neural network model, so as to obtain a synthesized face image of the first person. The processing procedure of the method using nearest neighbor interpolation can be referred to the following formula:
y_{i,j} = x_{u^*,v^*}, \quad (u^*, v^*) = \arg\min_{(u,v)} \left[ (u-i)^2 + (v-j)^2 \right]    (formula 1.2)
where y represents the synthesized face image and x represents the feature map; i and j denote the abscissa and the ordinate of the pixel at the target position; u and v denote the abscissa and the ordinate of a pixel in x.
In one embodiment, y may be taken as the synthetic face image of the first person.
In one embodiment, after y is obtained, y may be input into a second convolutional layer of the trained convolutional neural network model to perform a convolution operation, so as to obtain the final synthesized face image. The second convolutional layer here may be the same as or different from the first convolutional layer. Alternatively, after y is obtained, y is input into the second convolutional layer for a convolution operation to obtain a second synthesized face image, and a nearest neighbor interpolation operation is then performed according to the second synthesized face image to obtain the final synthesized face image as the synthesized face image of the first person.
It should be noted that, the nearest neighbor interpolation referred to in the embodiment of the present application essentially finds the pixel value of the pixel nearest to the target position (i, j) in the source image as the pixel value of the pixel of the target position in the synthetic face image.
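Under the same assumed PyTorch setting, nearest neighbor upsampling simply copies the value of the closest source pixel:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 128, 16, 16)  # feature map from the first face feature set
# Nearest neighbor interpolation: each output pixel takes the value of the
# source pixel closest to the target position (i, j), as in formula 1.2.
y = F.interpolate(x, scale_factor=2, mode="nearest")
print(y.shape)  # torch.Size([1, 128, 32, 32])
```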
In an embodiment, when the upsampling includes a transposed convolution, the electronic device may perform the transposed convolution according to the first face feature set by using a transposed convolution layer included in the trained convolutional neural network model, so as to obtain a synthesized face image of the first person.
In an embodiment, the electronic device may specifically construct a feature map by using the first face feature set, and perform a transposed convolution on the feature map by using a transposed convolution layer included in the trained convolutional neural network model, so as to obtain a synthesized face image of the first person. The processing procedure of the transposed convolution can be referred to in the following formula:
y_{row,col} = \sum_{r \in J} \sum_{c \in I} k_{r,c}\, x_{r',c'}, \qquad r' = row - r,\; c' = col - c    (formula 1.3)
where y represents the synthesized face image, x represents the feature map and k denotes the convolution kernel of the transposed convolution layer; r and c index the rows and columns of the convolution kernel, and r' and c' index the position in x covered by the sliding window of the kernel; row and col denote the row and column of the output pixel; J and I denote the preset row range and the preset column range of the kernel; positions of x outside the feature map are treated as zero.
In one embodiment, y may be taken as the synthetic face image of the first person.
In one embodiment, the transposed convolution operation may also be performed multiple times, so as to obtain the final synthesized face image as the synthesized face image of the first person. For example, the electronic device may arrange one transposed convolution layer after another, so that the next transposed convolution layer performs a transposed convolution operation on the output of the previous transposed convolution layer, and the final synthesized face image is obtained as the synthesized face image of the first person.
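As a concrete illustration (again under the assumed PyTorch setting), a transposed convolution layer that doubles the spatial resolution of the feature map, followed by a second such layer as in the multi-layer variant just described, might look like this; the kernel size, stride and channel counts are assumptions:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 128, 16, 16)  # feature map from the first face feature set

# One transposed convolution layer: kernel 4, stride 2, padding 1 doubles
# the spatial size (16x16 -> 32x32).
up1 = nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1)

# A second transposed convolution layer operating on the output of the
# first one (32x32 -> 64x64), yielding the final synthesized image here.
up2 = nn.ConvTranspose2d(64, 3, kernel_size=4, stride=2, padding=1)

y = up2(torch.relu(up1(x)))
print(y.shape)  # torch.Size([1, 3, 64, 64])
```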
In one embodiment, the aforementioned trained convolutional neural network model can be obtained by:
1. The electronic device acquires a face image data set, where the face image data set includes at least one group of images, and each group of images in the at least one group of images includes an original face image, at least one face image corresponding to each posture in at least one posture, and at least one face image corresponding to each expression in at least one expression. For example, one group of images in the at least one group may include an original face image of person C, one face image for each of 54 expressions, and one face image for each of 4 postures.
2. The electronic equipment trains the initial convolutional neural network model by using the face image data set to obtain the trained convolutional neural network model.
Specifically, the electronic device trains an initial convolutional neural network model by using the face image data set, and a process of obtaining the trained convolutional neural network model may be as follows:
First, the electronic device performs feature extraction on a target group of images in the at least one group of images to obtain a second face feature set. The target group of images may be any group of the at least one group of images, for example a group randomly selected from the at least one group of images. In one embodiment, the electronic device may perform feature extraction on the target group of images through the initial convolutional neural network model. In one embodiment, the electronic device may specifically perform feature extraction on the target group of images through the convolutional layers included in the initial convolutional neural network model.
Second, the electronic device performs upsampling on the second face feature set by using the initial convolutional neural network model to obtain at least one first synthesized face image. In an embodiment, the electronic device may perform bilinear interpolation according to the second face feature set by using the initial convolutional neural network model to obtain the at least one first synthesized face image; or perform nearest neighbor interpolation according to the second face feature set by using the initial convolutional neural network model to obtain the at least one first synthesized face image; or perform a transposed convolution according to the second face feature set by using a transposed convolution layer included in the initial convolutional neural network model to obtain the at least one first synthesized face image.
Third, the electronic device performs restoration processing on each first synthesized face image in the at least one first synthesized face image through the initial convolutional neural network model to obtain a second synthesized face image corresponding to each first synthesized face image, where the second synthesized face image matches the original face image included in the target group of images, so that a converged convolutional neural network model is obtained as the trained convolutional neural network model. Matching here may mean similar, comparable or identical. In the embodiments of the present application, the electronic device may repeatedly execute the above first to third steps until the model converges, and the converged convolutional neural network model is taken as the trained convolutional neural network model.
In an embodiment, the process by which the electronic device performs restoration processing on each first synthesized face image in the at least one first synthesized face image through the initial convolutional neural network model to obtain the second synthesized face image corresponding to each first synthesized face image may be that the electronic device performs feature extraction on each first synthesized face image and the original face image to obtain a third face feature set, and then performs upsampling on the third face feature set through the initial convolutional neural network model to obtain the second synthesized face image corresponding to each first synthesized face image.
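To make the training procedure concrete, the following minimal training-loop sketch reuses the encoder and decoder sketches above and assumes an L1 reconstruction loss and the Adam optimizer; the application specifies none of these choices, so this is only one plausible reading of the three training steps:

```python
import torch
import torch.nn as nn

model = FaceSynthesizer()  # decoder sketch from above (illustrative)
l1 = nn.L1Loss()
params = (list(model.parameters())
          + list(identity_encoder.parameters())
          + list(posture_encoder.parameters())
          + list(expression_encoder.parameters()))
opt = torch.optim.Adam(params, lr=1e-4)

def train_step(original, posed, expressive):
    # Step 1: feature extraction on the target group of images.
    f_id = identity_encoder(original)
    f_pose = posture_encoder(posed)
    f_expr = expression_encoder(expressive)
    # Step 2: upsample the second face feature set into a first synthesized image.
    first_synth = model(f_id, f_pose, f_expr)
    # Step 3: restore the first synthesized image to a second synthesized image
    # and penalize its mismatch with the original face image, so that the
    # restored image matches the original as the model converges.
    second_synth = model(identity_encoder(first_synth), f_pose, f_expr)
    loss = l1(second_synth, original)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```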
As can be seen, in the embodiment shown in fig. 2, the electronic device may correct the synthesized face image of the first person to obtain a corrected synthesized face image of the first person, and this process makes the output synthesized face image more stable and realistic.
Fig. 3 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure. The apparatus may be applied to the aforementioned electronic device. Specifically, the apparatus may include:
an obtaining module 301, configured to obtain a facial image set, where the facial image set includes a facial image of a first person, a facial image of a specified pose of a second person, and a facial image of a specified expression of a third person.
The processing module 302 is configured to perform feature extraction on each face image in the face image set to obtain a first face feature set, where the first face feature set includes a face feature of the first person, a posture feature of the second person, and an expression feature of the third person.
The processing module 302 is further configured to perform face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image of the first person has the facial features of the first person, the posture features of the second person, and the expression features of the third person.
In an optional implementation manner, the processing module 302 performs face synthesis according to the first face feature set to obtain a synthesized face image of the first person, specifically, performs upsampling on the first face feature set by using a trained convolutional neural network model to obtain the synthesized face image of the first person, where the upsampling includes any one of: bilinear interpolation, nearest neighbor interpolation, and transposed convolution.
In an optional implementation manner, the upsampling includes a transposed convolution, and the processing module 302 performs upsampling on the first human face feature set by using the trained convolutional neural network model to obtain a synthetic human face image of the first person, specifically, performs transposed convolution on the first human face feature set by using a transposed convolution layer included in the trained convolutional neural network model to obtain the synthetic human face image of the first person.
In an optional implementation manner, the processing module 302 performs a transposed convolution on the first face feature set by using a transposed convolution layer included in the trained convolutional neural network model to obtain a synthesized face image of the first person; specifically, a feature map is constructed by using the first face feature set, and a transposed convolution is performed on the feature map by using the transposed convolution layer included in the trained convolutional neural network model to obtain the synthesized face image of the first person.
In an alternative embodiment, the image processing apparatus further comprises an output module 303.
In an optional implementation manner, the processing module 302 is further configured to, after obtaining the synthetic face image of the first person, perform image detection on the synthetic face image of the first person to obtain an image detection result; and carrying out face correction on the synthesized face image of the first person according to the image detection result to obtain the corrected synthesized face image of the first person.
In an alternative embodiment, the output module 303 is configured to output the corrected synthetic face image of the first person.
In an optional implementation, the processing module 302 is further configured to obtain a facial image dataset, where the facial image dataset includes at least one group of images, and each group of images in the at least one group of images includes an original facial image, at least one facial image corresponding to each pose in at least one pose, and at least one facial image corresponding to each expression in at least one expression; and training an initial convolutional neural network model by using the face image data set to obtain a trained convolutional neural network model.
In an optional implementation manner, the processing module 302 trains an initial convolutional neural network model by using the face image data set to obtain a trained convolutional neural network model; specifically, feature extraction is performed on a target group of images in the at least one group of images to obtain a second face feature set; the initial convolutional neural network model is used to perform upsampling on the second face feature set to obtain at least one first synthesized face image; and restoration processing is performed on each first synthesized face image in the at least one first synthesized face image through the initial convolutional neural network model to obtain a second synthesized face image corresponding to each first synthesized face image, where the second synthesized face image matches the original face image included in the target group of images, so that a converged convolutional neural network model is obtained as the trained convolutional neural network model.
In the embodiment shown in fig. 3, the image processing apparatus may acquire a face image set, where the face image set includes a face image of a first person, a face image of a second person in a specified posture and a face image of a third person with a specified expression. The image processing apparatus may extract features of each face image in the face image set to obtain a first face feature set, and perform face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image has the face features of the first person, the posture features of the second person and the expression features of the third person.
Please refer to fig. 4, which is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device described in this embodiment may include: one or more processors 1000, one or more input devices 2000, one or more output devices 3000, and memory 4000. The processor 1000, the input device 2000, the output device 3000, and the memory 4000 may be connected by a bus. The input device 2000 and the output device 3000 are optional devices in the electronic device, that is, the electronic device may only include the processor 1000 and the memory 4000. In one embodiment, the input device 2000, the output device 3000 may be a standard wired or wireless communication interface. In an embodiment, the input device 2000 may be a touch screen or a touch display screen, and the output device 3000 may be a display screen or a touch display screen, which is not limited in the embodiments of the present application.
The processor 1000 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 4000 may be a high-speed RAM memory or a non-volatile memory (e.g., a disk memory). The memory 4000 is used to store a set of program codes, and the input device 2000, the output device 3000 and the processor 1000 may call the program codes stored in the memory 4000. Specifically:
a processor 1000, configured to obtain a facial image set, where the facial image set includes a facial image of a first person, a facial image of a second person in a specified posture, and a facial image of a third person in a specified expression; extracting the features of each face image in the face image set to obtain a first face feature set, wherein the first face feature set comprises the face features of the first person, the posture features of the second person and the expression features of the third person; and performing face synthesis according to the first face feature set to obtain a synthesized face image of the first person, wherein the synthesized face image of the first person has the face feature of the first person, the posture feature of the second person and the expression feature of the third person.
In an embodiment, the processor 1000 performs face synthesis according to the first face feature set to obtain a synthesized face image of the first person, specifically, performs upsampling on the first face feature set by using a trained convolutional neural network model to obtain the synthesized face image of the first person, where the upsampling includes any one of: bilinear interpolation, nearest neighbor interpolation, and transposed convolution.
In an embodiment, the upsampling includes a transposed convolution, and the processor 1000 performs upsampling on the first human face feature set by using the trained convolutional neural network model to obtain a synthetic human face image of the first person, specifically, performs transposed convolution on the first human face feature set by using a transposed convolution layer included in the trained convolutional neural network model to obtain a synthetic human face image of the first person.
In an embodiment, the processor 1000 performs a transposed convolution on the first face feature set by using a transposed convolution layer included in the trained convolutional neural network model to obtain a synthesized face image of the first person; specifically, a feature map is constructed by using the first face feature set, and a transposed convolution is performed on the feature map by using the transposed convolution layer included in the trained convolutional neural network model to obtain the synthesized face image of the first person.
In one embodiment, the processor 1000 is further configured to, after obtaining the synthesized face image of the first person, perform image detection on the synthesized face image of the first person to obtain an image detection result; performing face correction on the synthesized face image of the first person according to the image detection result to obtain a corrected synthesized face image of the first person; the corrected synthetic face image of the first person is output through the output device 3000.
In one embodiment, the processor 1000 is further configured to obtain a facial image data set, where the facial image data set includes at least one group of images, and each group of images in the at least one group of images includes an original facial image, at least one facial image corresponding to each pose in at least one pose, and at least one facial image corresponding to each expression in at least one expression; and training an initial convolutional neural network model by using the face image data set to obtain a trained convolutional neural network model.
In an embodiment, the processor 1000 trains an initial convolutional neural network model by using the face image data set to obtain a trained convolutional neural network model; specifically, feature extraction is performed on a target group of images in the at least one group of images to obtain a second face feature set; the initial convolutional neural network model is used to perform upsampling on the second face feature set to obtain at least one first synthesized face image; and restoration processing is performed on each first synthesized face image in the at least one first synthesized face image through the initial convolutional neural network model to obtain a second synthesized face image corresponding to each first synthesized face image, where the second synthesized face image matches the original face image included in the target group of images, so that a converged convolutional neural network model is obtained as the trained convolutional neural network model.
In a specific implementation, the processor 1000, the input device 2000, and the output device 3000 described in this embodiment of the present application may perform the implementations described in the embodiments of fig. 1 and fig. 2, as well as the implementation described in this embodiment of the present application, and details are not repeated here.
The functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of software functional modules.
Those skilled in the art will understand that all or part of the processes of the methods in the embodiments described above may be implemented by a computer program, which may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The computer-readable storage medium may be volatile or non-volatile. For example, the computer storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. The computer-readable storage medium may include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the blockchain node, and the like.
A blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each data block contains information on a batch of network transactions and is used to verify the validity (tamper resistance) of that information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
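As a toy illustration of the cryptographic linking described above (real blockchain platforms add consensus, networking, and storage layers on top), each block can carry the hash of its predecessor:

    import hashlib
    import json
    import time

    def make_block(data, prev_hash):
        block = {"timestamp": time.time(), "data": data, "prev_hash": prev_hash}
        block["hash"] = hashlib.sha256(
            json.dumps(block, sort_keys=True).encode()).hexdigest()
        return block

    genesis = make_block("genesis", "0" * 64)
    block1 = make_block({"record": "stored data digest"}, genesis["hash"])
    assert block1["prev_hash"] == genesis["hash"]  # tampering breaks the chain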
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An image processing method, comprising:
acquiring a face image set, wherein the face image set comprises a face image of a first person, a face image of a second person in a specified posture, and a face image of a third person with a specified expression;
extracting the features of each face image in the face image set to obtain a first face feature set, wherein the first face feature set comprises the face features of the first person, the posture features of the second person and the expression features of the third person;
and performing face synthesis according to the first face feature set to obtain a synthesized face image of the first person, wherein the synthesized face image of the first person has the face feature of the first person, the posture feature of the second person and the expression feature of the third person.
2. The method according to claim 1, wherein the performing face synthesis according to the first face feature set to obtain a synthesized face image of the first person comprises:
upsampling the first face feature set by using the trained convolutional neural network model to obtain a synthesized face image of the first person, wherein the upsampling comprises any one of the following: bilinear interpolation, nearest neighbor interpolation, and transposed convolution.
3. The method of claim 2, wherein the upsampling comprises transposed convolution, and wherein upsampling the first face feature set by using the trained convolutional neural network model to obtain the synthesized face image of the first person comprises:
performing a transposed convolution on the first face feature set by using a transposed convolution layer included in the trained convolutional neural network model to obtain the synthesized face image of the first person.
4. The method according to claim 3, wherein the performing the transposed convolution on the first face feature set by using the transposed convolution layer included in the trained convolutional neural network model to obtain the synthesized face image of the first person comprises:
constructing a feature map by using the first face feature set;
and performing a transposed convolution on the feature map by using the transposed convolution layer included in the trained convolutional neural network model to obtain the synthesized face image of the first person.
5. The method of any of claims 1-4, wherein after obtaining the synthesized face image of the first person, the method further comprises:
carrying out image detection on the synthesized face image of the first person to obtain an image detection result;
performing face correction on the synthesized face image of the first person according to the image detection result to obtain a corrected synthesized face image of the first person;
and outputting the corrected synthesized face image of the first person.
6. The method according to any one of claims 2-4, further comprising:
acquiring a face image data set, wherein the face image data set comprises at least one group of images, and each group of images in the at least one group of images comprises an original face image, at least one face image corresponding to each posture in at least one posture, and at least one face image corresponding to each expression in at least one expression;
and training an initial convolutional neural network model by using the face image data set to obtain a trained convolutional neural network model.
7. The method of claim 6, wherein the training an initial convolutional neural network model using the face image dataset to obtain a trained convolutional neural network model comprises:
performing feature extraction on a target group of images in the at least one group of images to obtain a second face feature set;
utilizing an initial convolutional neural network model to perform upsampling on the second face feature set to obtain at least one first synthesized face image;
and restoring each first synthesized face image in the at least one first synthesized face image through the initial convolutional neural network model to obtain a second synthesized face image corresponding to each first synthesized face image, and matching each second synthesized face image with the original face image included in the target group of images until the convolutional neural network model converges, wherein the converged convolutional neural network model is taken as the trained convolutional neural network model.
8. An image processing apparatus characterized by comprising:
an acquisition module, configured to acquire a face image set, wherein the face image set comprises a face image of a first person, a face image of a second person in a specified posture, and a face image of a third person with a specified expression;
a processing module, configured to extract features of each face image in the face image set to obtain a first face feature set, wherein the first face feature set comprises the face feature of the first person, the posture feature of the second person, and the expression feature of the third person;
the processing module is further configured to perform face synthesis according to the first face feature set to obtain a synthesized face image of the first person, where the synthesized face image of the first person has the face feature of the first person, the posture feature of the second person, and the expression feature of the third person.
9. An electronic device, comprising a processor and a memory, the processor and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.
CN202010710400.7A 2020-07-22 2020-07-22 Image processing method, image processing device, electronic equipment and computer readable storage medium Active CN111833413B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010710400.7A CN111833413B (en) 2020-07-22 2020-07-22 Image processing method, image processing device, electronic equipment and computer readable storage medium
PCT/CN2021/096713 WO2022016996A1 (en) 2020-07-22 2021-05-28 Image processing method, device, electronic apparatus, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010710400.7A CN111833413B (en) 2020-07-22 2020-07-22 Image processing method, image processing device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111833413A true CN111833413A (en) 2020-10-27
CN111833413B CN111833413B (en) 2022-08-26

Family

ID=72924678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010710400.7A Active CN111833413B (en) 2020-07-22 2020-07-22 Image processing method, image processing device, electronic equipment and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN111833413B (en)
WO (1) WO2022016996A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022016996A1 (en) * 2020-07-22 2022-01-27 平安科技(深圳)有限公司 Image processing method, device, electronic apparatus, and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020067362A1 (en) * 1998-11-06 2002-06-06 Agostino Nocera Luciano Pasquale Method and system generating an avatar animation transform using a neutral face image
CN106845330A (en) * 2016-11-17 2017-06-13 北京品恩科技股份有限公司 A kind of training method of the two-dimension human face identification model based on depth convolutional neural networks
CN107680158A (en) * 2017-11-01 2018-02-09 长沙学院 A kind of three-dimensional facial reconstruction method based on convolutional neural networks model
CN109344724A (en) * 2018-09-05 2019-02-15 深圳伯奇科技有限公司 A kind of certificate photo automatic background replacement method, system and server
CN110097606A (en) * 2018-01-29 2019-08-06 微软技术许可有限责任公司 Face synthesis
CN111383307A (en) * 2018-12-29 2020-07-07 上海智臻智能网络科技股份有限公司 Video generation method and device based on portrait and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102254336B (en) * 2011-07-14 2013-01-16 清华大学 Method and device for synthesizing face video
US10755145B2 (en) * 2017-07-07 2020-08-25 Carnegie Mellon University 3D spatial transformer network
CN113261013A (en) * 2019-01-18 2021-08-13 斯纳普公司 System and method for realistic head rotation and facial animation synthesis on mobile devices
CN111539903B (en) * 2020-04-16 2023-04-07 北京百度网讯科技有限公司 Method and device for training face image synthesis model
CN111583399B (en) * 2020-06-28 2023-11-07 腾讯科技(深圳)有限公司 Image processing method, device, equipment, medium and electronic equipment
CN111833413B (en) * 2020-07-22 2022-08-26 平安科技(深圳)有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium


Also Published As

Publication number Publication date
CN111833413B (en) 2022-08-26
WO2022016996A1 (en) 2022-01-27

Similar Documents

Publication Publication Date Title
CN110555795B (en) High resolution style migration
CN110503703B (en) Method and apparatus for generating image
JP7373554B2 (en) Cross-domain image transformation
JP2023545565A (en) Image detection method, model training method, image detection device, training device, equipment and program
CN110517214B (en) Method and apparatus for generating image
WO2023035531A1 (en) Super-resolution reconstruction method for text image and related device thereof
TW201115252A (en) Document camera with image-associated data searching and displaying function and method applied thereto
US10489839B2 (en) Information presentation method and information presentation apparatus
CN113870104A (en) Super-resolution image reconstruction
US20220189189A1 (en) Method of training cycle generative networks model, and method of building character library
WO2020062191A1 (en) Image processing method, apparatus and device
CN113673519B (en) Character recognition method based on character detection model and related equipment thereof
CN111047509A (en) Image special effect processing method and device and terminal
AU2017383979A1 (en) Projection image construction method and device
CN111833413B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
JP2023541351A (en) Character erasure model training method and device, translation display method and device, electronic device, storage medium, and computer program
CN114742722A (en) Document correction method, device, electronic equipment and storage medium
JP2023545052A (en) Image processing model training method and device, image processing method and device, electronic equipment, and computer program
CN114638375A (en) Video generation model training method, video generation method and device
US20160335771A1 (en) Incremental global non-rigid alignment of three-dimensional scans
WO2016018682A1 (en) Processing image to identify object for insertion into document
CN113610864B (en) Image processing method, device, electronic equipment and computer readable storage medium
WO2022105120A1 (en) Text detection method and apparatus from image, computer device and storage medium
CN112348069A (en) Data enhancement method and device, computer readable storage medium and terminal equipment
CN113362249A (en) Text image synthesis method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant