CN113762015A - Image processing method and device - Google Patents
- Publication number: CN113762015A
- Application number: CN202110009410.2A
- Authority: CN (China)
- Prior art keywords
- image
- visual angle
- limb
- template
- perspective
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The invention discloses an image processing method and device, and relates to the technical field of computers. One embodiment of the method comprises: segmenting a limb area from an image to be converted at a first visual angle to obtain a limb image at the first visual angle; searching a template library for a limb image template of a second visual angle corresponding to the limb image of the first visual angle; generating a limb image at the second visual angle according to the limb image at the first visual angle and the limb image template at the second visual angle; and, based on the image to be converted at the first visual angle, correcting the limb image at the second visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted. This embodiment can convert the image visual angle automatically, without manual intervention; it addresses problems such as shoulder distortion and perspective offset caused by the selfie visual angle, reduces conversion time and labor cost, and enables batch conversion of image visual angles.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an image processing method and apparatus.
Background
In some scenarios it is desirable to convert an image from one perspective to another, for example from a selfie perspective to an other-shot perspective, that is, the perspective of a photo taken by another person. When one person or a group takes a selfie, arm length, camera angle and similar factors can cause problems such as an arm entering the frame, distorted body proportions, and an unnatural look, so the image needs to be converted from the selfie perspective to the other-shot perspective to offer more choices for shooting. There are two existing schemes for converting a selfie-perspective image into an other-shot-perspective image. In the first scheme, image cropping removes parts of the frame such as the outstretched arm and the selfie stick, yielding an image that is smaller but largely resembles an other-shot perspective. In the second scheme, manual post-production retouching changes the perspective effect of the picture by hand: the arm in the frame is painted over, and the arm and similar areas are filled with background pattern using retouching functions such as the "clone stamp", thereby converting the selfie perspective into the other-shot perspective.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
In the first scheme, the part to be cropped must be selected manually, a large area may have to be cropped away to remove unnatural content such as the arm, and problems caused by the selfie perspective, such as shoulder distortion and perspective offset, remain unsolved, so the result is mechanical. The second scheme requires a certain level of post-production retouching skill, involves many retouching steps, takes a long time, and cannot convert image perspectives in batches within a short time.
Disclosure of Invention
In view of this, embodiments of the present invention provide an image processing method and apparatus, which can automatically convert an image viewing angle without manual intervention, solve the problems of shoulder distortion, perspective offset, and the like caused by a self-portrait viewing angle, reduce conversion time, reduce labor cost, and implement batch conversion of image viewing angles.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided an image processing method.
An image processing method comprising: segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library; generating a limb image of a second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle; and based on the image to be converted of the first visual angle, correcting the limb image of the second visual angle to obtain a converted image of the second visual angle corresponding to the image to be converted.
Optionally, the searching for the limb image template of the second view angle corresponding to the limb image of the first view angle in the template library includes: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in the template library by utilizing a nearest neighbor algorithm; and searching the limb image template of the second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
Optionally, before searching for the limb image template of the second perspective corresponding to the limb image of the first perspective in the template library, the method includes: pre-establishing an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective; performing limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle; and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into the template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
Optionally, the correcting the limb image of the second view angle based on the image to be converted of the first view angle to obtain a converted image of the second view angle corresponding to the image to be converted includes: inputting the image to be converted at the first visual angle and the limb image at the second visual angle into a trained preset generation model so as to output the converted image at the second visual angle corresponding to the image to be converted.
Optionally, the preset generation model is trained by using an image sample set of a first perspective and a limb image template set of a second perspective, the image sample set of the first perspective includes the image sample of the first perspective, and the limb image template set of the second perspective includes the limb image template of the second perspective.
Optionally, the preset generation model is implemented based on a Siamese (twin) network and a generative adversarial network.
Optionally, the generating a limb image of a second perspective according to the limb image of the first perspective and the limb image template of the second perspective includes: for each part of the limb, filling the limb image template of the second visual angle according to the limb image of the first visual angle to generate the limb image of the second visual angle.
According to another aspect of the embodiments of the present invention, there is provided an image processing apparatus.
An image processing apparatus comprising: the first visual angle limb image generation module is used for segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; the template searching module is used for searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library; the second visual angle limb image generation module is used for generating a limb image of a second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle; and the correction processing module is used for correcting the limb image at the second visual angle based on the image to be converted at the first visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted.
Optionally, the template searching module is further configured to: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in the template library by utilizing a nearest neighbor algorithm; and searching the limb image template of the second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
Optionally, the system further comprises a template library establishing module, configured to: pre-establishing an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective; performing limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle; and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into the template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
Optionally, the correction processing module is further configured to: input the image to be converted at the first visual angle and the limb image at the second visual angle into a trained preset generation model so as to output the converted image at the second visual angle corresponding to the image to be converted.
Optionally, the preset generation model is trained by using an image sample set of a first perspective and a limb image template set of a second perspective, the image sample set of the first perspective includes the image sample of the first perspective, and the limb image template set of the second perspective includes the limb image template of the second perspective.
Optionally, the preset generation model is implemented based on a Siamese (twin) network and a generative adversarial network.
Optionally, the second perspective limb image generation module is further configured to: for each part of the limb, fill the limb image template of the second visual angle according to the limb image of the first visual angle to generate the limb image of the second visual angle.
According to yet another aspect of an embodiment of the present invention, an electronic device is provided.
An electronic device, comprising: one or more processors; a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the image processing method provided by the embodiments of the present invention.
According to yet another aspect of an embodiment of the present invention, a computer-readable medium is provided.
A computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements an image processing method provided by an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library; generating a limb image at a second visual angle according to the limb image at the first visual angle and a limb image template at the second visual angle; and based on the image to be converted of the first visual angle, correcting the limb image of the second visual angle to obtain a converted image of the second visual angle corresponding to the image to be converted. The image visual angle can be automatically converted without manual intervention, the problems of shoulder distortion, perspective deviation and the like caused by the self-photographing visual angle are solved, the conversion time is shortened, the labor cost is reduced, and batch conversion of the image visual angles can be realized.
Further effects of the above non-conventional optional implementations are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of an image processing method according to one embodiment of the present invention;
FIG. 2 is a flow chart illustrating the conversion of an image from a selfie perspective to an other-shot perspective according to one embodiment of the present invention;
FIG. 3 is a block diagram of a GAN correction model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main blocks of an image processing apparatus according to one embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of main steps of an image processing method according to an embodiment of the present invention.
As shown in fig. 1, the image processing method according to an embodiment of the present invention mainly includes steps S101 to S104 as follows.
Step S101: segmenting the limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle.
Step S102: searching the template library for a limb image template of a second visual angle corresponding to the limb image of the first visual angle.
Searching the template library for a limb image template of a second view angle corresponding to the limb image of the first view angle may include: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in a template library by utilizing a nearest neighbor algorithm; and searching a limb image template of a second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle. The limb image template is a limb image serving as a template, and the template library may be a database for storing all the limb image templates.
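The two-stage lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each limb image has already been reduced to a fixed-length feature vector and that "closest" means smallest Euclidean distance, neither of which the text specifies. The function name `find_second_view_template` and the sample data are hypothetical.

```python
import numpy as np

def find_second_view_template(query_feature, first_view_features, second_view_templates):
    """Return the index and second-view template paired with the nearest first-view template."""
    query = np.asarray(query_feature, dtype=float)
    library = np.asarray(first_view_features, dtype=float)
    # Assumed metric: Euclidean distance from the query to every first-view template.
    distances = np.linalg.norm(library - query, axis=1)
    nearest = int(np.argmin(distances))
    # Paired templates share an index, so the match maps directly to the second view.
    return nearest, second_view_templates[nearest]

# Hypothetical toy library: three first-view feature vectors and their paired templates.
first_view_features = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
second_view_templates = ["other_shot_a", "other_shot_b", "other_shot_c"]
index, template = find_second_view_template([0.9, 1.2], first_view_features, second_view_templates)
```

Storing the paired templates at matching indices keeps the second stage a plain index lookup; for a large library the linear scan could be replaced by any nearest neighbor index.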
The first viewing angle may be a selfie viewing angle, and the second viewing angle may be an other-shot viewing angle, that is, the viewing angle of a photo taken by another person.
Before searching the template library for the limb image template of the second view angle corresponding to the limb image of the first view angle, the method may include: pre-establishing an image pair, wherein the image pair comprises an image sample of a first view angle and an image sample of a second view angle corresponding to the image sample of the first view angle; limb identification is carried out on the image sample of the first visual angle and the image sample of the second visual angle in the image pair, and a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle are obtained respectively; and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into a template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
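The construction of the template library can be sketched as below. The sketch assumes a limb identification function is supplied by the caller; `TemplateLibrary`, `build_template_library` and `fake_recognizer` are hypothetical names introduced only for illustration, and the string "templates" stand in for real limb images.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

@dataclass
class TemplateLibrary:
    """Stores first-view and second-view limb templates at matching indices."""
    first_view: List[object] = field(default_factory=list)
    second_view: List[object] = field(default_factory=list)

    def add_pair(self, first_template: object, second_template: object) -> None:
        self.first_view.append(first_template)
        self.second_view.append(second_template)

def build_template_library(
    image_pairs: Sequence[Tuple[object, object]],
    recognize_limbs: Callable[[object], object],
) -> TemplateLibrary:
    """Run limb identification on both samples of each pair and store the results together."""
    library = TemplateLibrary()
    for first_sample, second_sample in image_pairs:
        library.add_pair(recognize_limbs(first_sample), recognize_limbs(second_sample))
    return library

def fake_recognizer(sample: object) -> str:
    # Stand-in for a real limb identification step; it only tags the sample.
    return f"limbs({sample})"

library = build_template_library(
    [("selfie_1", "other_1"), ("selfie_2", "other_2")], fake_recognizer
)
```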
Limb identification may use the DensePose technique, which maps all human-body pixels of a 2D RGB image onto a 3D model in real time.
Step S103: generating a limb image of the second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle.
Generating the limb image of the second perspective according to the limb image of the first perspective and the limb image template of the second perspective may include: for each part of the limb, filling the limb image template of the second visual angle according to the limb image of the first visual angle to generate the limb image of the second visual angle.
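A part-by-part fill might look like the following sketch. The text does not say how appearance is transferred between views, so this example makes a deliberately crude assumption: each body part in the second-view template is painted with the mean color of the same part in the first-view limb image. The part label maps, the background label `0`, and the function name `fill_template_by_part` are all hypothetical.

```python
import numpy as np

def fill_template_by_part(first_image, first_parts, template_parts, background=0):
    """Paint each part of the second-view template with the mean color of that part in the first view."""
    height, width = template_parts.shape
    output = np.zeros((height, width, first_image.shape[2]), dtype=float)
    for part_id in np.unique(template_parts):
        if part_id == background:
            continue
        source_mask = first_parts == part_id
        if not source_mask.any():
            # Part not visible in the first view: leave it empty for the later correction step.
            continue
        output[template_parts == part_id] = first_image[source_mask].mean(axis=0)
    return output

# Toy 2x2 example: part 1 is reddish, part 2 is green, label 0 is background.
first_image = np.array([[[10.0, 0.0, 0.0], [30.0, 0.0, 0.0]],
                        [[0.0, 50.0, 0.0], [0.0, 0.0, 0.0]]])
first_parts = np.array([[1, 1], [2, 0]])
template_parts = np.array([[0, 1], [2, 2]])
rough = fill_template_by_part(first_image, first_parts, template_parts)
```

The result contains only the person, with empty background and possibly empty parts, which is why a later correction step is still needed.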
Step S104: based on the image to be converted at the first visual angle, correcting the limb image at the second visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted.
Correcting the limb image at the second view angle based on the image to be converted at the first view angle, to obtain a converted image at the second view angle corresponding to the image to be converted, may include: inputting the image to be converted at the first visual angle and the limb image at the second visual angle into the trained preset generation model so as to output the converted image at the second visual angle corresponding to the image to be converted.
The preset generation model is trained using an image sample set of the first visual angle and a limb image template set of the second visual angle. The image sample set of the first visual angle comprises image samples of the first visual angle, and the limb image template set of the second visual angle comprises limb image templates of the second visual angle, each obtained by performing limb identification on the image sample of the second visual angle corresponding to an image sample of the first visual angle. The training target is the image sample of the second visual angle corresponding to each image sample of the first visual angle.
The preset generation model may be implemented based on a Siamese (twin) network and a generative adversarial network.
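Steps S101 to S104 chain together as shown in the sketch below. It only illustrates the data flow: the four stages are passed in as callables, and the function name `convert_view` and the string-based stand-ins are hypothetical, not part of the described method.

```python
def convert_view(image, segment, lookup, fill, correct):
    """Chain the four steps; each stage is supplied by the caller."""
    limb_image = segment(image)          # S101: segment the limb area
    template = lookup(limb_image)        # S102: find the second-view limb template
    rough = fill(limb_image, template)   # S103: generate the second-view limb image
    return correct(image, rough)         # S104: correct it using the original image

# Stand-in stages that only record what they received.
result = convert_view(
    "selfie",
    segment=lambda img: f"limbs[{img}]",
    lookup=lambda limbs: "template_B",
    fill=lambda limbs, tpl: f"rough[{limbs}+{tpl}]",
    correct=lambda img, rough: f"converted[{img}|{rough}]",
)
```

Note that the correction stage receives both the original image and the rough second-view limb image, matching the two inputs of the preset generation model.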
FIG. 2 is a flow chart illustrating a process of converting an image from a selfie perspective to an other-shot perspective according to an embodiment of the present invention.
As shown in FIG. 2, the present invention provides a deep-learning-based method for converting the selfie perspective of an image. The method synthesizes an other-shot image from a selfie image: a nearest neighbor method retrieves the other-shot image closest to the given selfie image, and a GAN (generative adversarial network) correction model (i.e., the preset generation model) then synthesizes the corresponding other-shot photo. This avoids the defects of the traditional conversion methods, achieves a good perspective conversion effect, and runs automatically without manual intervention. The process is described in detail below.
A selfie/other-shot image pair is created, which comprises an image sample of the first viewing angle and the corresponding image sample of the second viewing angle. The human limb parts are recognized using the DensePose technique, yielding a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle. The two templates form a limb image template pair, which is stored in a selfie/other-shot database (i.e., the template library).
The limb area is segmented from the selfie image to be converted (i.e., the image to be converted at the first visual angle) to obtain a limb image at the first visual angle. DensePose identifies the limb parts of this limb image, and a nearest neighbor search over the template library of limb image template pairs finds the closest other-shot limb image (i.e., the closest other-shot limb image template). Specifically, a nearest neighbor algorithm first finds the limb image template of the first visual angle that is closest to the limb image of the first visual angle, and the limb image template of the second visual angle paired with that template is then looked up in the template library.
According to the limb image of the first visual angle, each body part of the retrieved limb image template of the second visual angle is filled in to obtain a rough other-shot limb image (the limb image of the second visual angle, i.e., the rough other-shot image in FIG. 2). This image contains only the person, so background filling and limb repair are still required.
The image sample set of the first view and the limb image template set of the second view are used as training samples, and the image sample set of the second view corresponding to the image sample set of the first view is used as the labels of the training samples. Specifically, a GAN correction model (i.e., the preset generation model) can be trained based on a Siamese (twin) network and CycleGAN (cycle-consistent generative adversarial network). The image sample set of the first visual angle comprises image samples of the first visual angle, the image sample set of the second visual angle comprises image samples of the second visual angle, and the limb image template set of the second visual angle comprises limb image templates of the second visual angle, each obtained by performing limb identification on the image sample of the second visual angle corresponding to an image sample of the first visual angle. The GAN correction model fills in the background of the limb image at the second visual angle and repairs and refines the limbs.
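The training data described above can be assembled as in the following sketch: each record holds the two model inputs and the label. The helper name `make_training_triples`, the dictionary keys, and the stand-in recognizer are hypothetical; real samples would be images rather than strings.

```python
def make_training_triples(first_view_samples, second_view_samples, recognize_limbs):
    """Build (input image, input limb template, target) records from paired samples."""
    if len(first_view_samples) != len(second_view_samples):
        raise ValueError("each first-view sample needs a corresponding second-view sample")
    triples = []
    for first_sample, second_sample in zip(first_view_samples, second_view_samples):
        triples.append({
            "input_image": first_sample,                           # first-view image sample
            "input_limb_template": recognize_limbs(second_sample), # second-view limb template
            "target": second_sample,                               # label: second-view image sample
        })
    return triples

triples = make_training_triples(
    ["selfie_1", "selfie_2"],
    ["other_1", "other_2"],
    recognize_limbs=lambda sample: f"limbs({sample})",  # stand-in for limb identification
)
```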
The image to be converted at the first visual angle and the limb image at the second visual angle are input into the trained GAN correction model, which performs background filling and limb repair and outputs the converted other-shot picture (i.e., the converted image at the second visual angle) corresponding to the image to be converted.
Those skilled in the art will understand that in other embodiments the first view angle may be the other-shot view angle and the second view angle may be the selfie view angle, so that the image processing method of the embodiment of the present invention can also convert an image from the other-shot view angle to the selfie view angle.
Fig. 3 is a schematic diagram of a framework of a GAN correction model according to an embodiment of the present invention.
As shown in FIG. 3, when the GAN correction model is used, the image to be converted and the rough other-shot image (i.e., the limb image at the other-shot view angle) are taken as input, and the converted other-shot image is obtained through a CNN (convolutional neural network) module and a generative adversarial network (e.g., CycleGAN).
When the GAN correction model is trained, the dual-input, single-output characteristic of a Siamese (twin) network is used: the image sample set of the first visual angle and the limb image template set of the second visual angle serve as the two inputs, which are convolved by the CNN module and then fed into the generative adversarial network, while the image sample set of the second visual angle corresponding to the image sample set of the first visual angle serves as the target output. After repeated iterations, the trained GAN correction model is obtained. The correspondence between the two image sample sets means that, for each corresponding pair, the object (a person, in the embodiment of the present invention) in the image sample of the first perspective and in the image sample of the second perspective is the same while the perspectives differ; for example, the image sample of the first perspective shows a person in a certain posture taken from the selfie perspective, and the corresponding image sample of the second perspective shows the same person in that posture taken from the other-shot perspective.
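The dual-input, single-output data flow can be illustrated with the toy forward pass below. It is only a shape-level sketch under strong assumptions: the two branches share one encoder (a common reading of "twin network" that the text does not state explicitly), a linear map with ReLU stands in for the CNN module, a single linear map stands in for the generator, and no adversarial training is shown. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(image, weights):
    """Shared encoder stand-in: flatten the image, apply a linear map, then ReLU."""
    return np.maximum(weights @ image.reshape(-1), 0.0)

def generate(image_to_convert, rough_limb_image, encoder_weights, generator_weights):
    """Encode both inputs with the same weights, fuse the features, and emit one image."""
    features_a = encode(image_to_convert, encoder_weights)
    features_b = encode(rough_limb_image, encoder_weights)
    fused = np.concatenate([features_a, features_b])
    return (generator_weights @ fused).reshape(image_to_convert.shape)

shape = (4, 4, 3)                                # toy image size: 4 * 4 * 3 = 48 values
encoder_weights = rng.normal(size=(8, 48))       # 48 inputs -> 8 features per branch
generator_weights = rng.normal(size=(48, 16))    # 16 fused features -> 48 output values
converted = generate(rng.random(shape), rng.random(shape), encoder_weights, generator_weights)
```

In a real model the generator output would be compared against the corresponding second-view image sample, and the weights would be updated over many iterations.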
Fig. 4 is a schematic diagram of main blocks of an image processing apparatus according to an embodiment of the present invention.
As shown in fig. 4, the image processing apparatus 400 according to an embodiment of the present invention mainly includes: a first perspective limb image generation module 401, a template search module 402, a second perspective limb image generation module 403, and a correction processing module 404.
The first perspective limb image generating module 401 is configured to segment a limb area from an image to be converted from a first perspective to obtain a limb image from the first perspective.
The template searching module 402 is configured to search a limb image template of a second view angle corresponding to the limb image of the first view angle in the template library.
The second perspective limb image generating module 403 is configured to generate a limb image of a second perspective according to the limb image of the first perspective and the limb image template of the second perspective.
The correction processing module 404 is configured to perform correction processing on the limb image at the second view angle based on the image to be converted at the first view angle, so as to obtain a converted image at the second view angle corresponding to the image to be converted.
In one embodiment, the template lookup module is specifically configured to: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in a template library by utilizing a nearest neighbor algorithm; and searching a limb image template of a second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
In one embodiment, the template library establishing module may further be included to: pre-establishing an image pair, wherein the image pair comprises an image sample of a first view angle and an image sample of a second view angle corresponding to the image sample of the first view angle; performing limb identification on the image samples in the image pair to respectively obtain a limb image template of a first visual angle corresponding to the image sample of the first visual angle and a limb image template of a second visual angle corresponding to the image sample of the second visual angle; and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into a template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
In one embodiment, the correction processing module is specifically configured to: input the image to be converted at the first visual angle and the limb image at the second visual angle into a trained preset generative model, which outputs the converted image at the second visual angle corresponding to the image to be converted.
In one embodiment, the preset generative model is trained by using an image sample set of a first perspective and a limb image template set of a second perspective, wherein the image sample set of the first perspective comprises the image sample of the first perspective, and the limb image template set of the second perspective comprises the limb image template of the second perspective.
In one embodiment, the preset generative model is implemented based on a twin (Siamese) network and a generative adversarial network (GAN).
In one embodiment, the second perspective limb image generation module is specifically configured to: fill the limb image template of the second visual angle, part by part of the limb, according to the limb image of the first visual angle, so as to generate the limb image of the second visual angle.
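One possible reading of the part-by-part filling is sketched below. The text does not say how a part is filled, so this sketch assumes that both the first-view limb image and the second-view template carry integer part labels (0 for background), and it paints each template part with the mean pixel intensity of the same part in the first view. The function name and the label convention are hypothetical.

```python
def fill_template_by_part(first_view_labels, first_view_pixels,
                          second_view_template, background=0):
    """Paint each part of the second-view template from the first-view limb image.

    first_view_labels and second_view_template are 2-D grids of part labels;
    first_view_pixels holds the pixel intensity at each first-view position.
    Assumption: a part is filled with the mean intensity of that part in the
    first view.
    """
    sums, counts = {}, {}
    for label_row, pixel_row in zip(first_view_labels, first_view_pixels):
        for label, pixel in zip(label_row, pixel_row):
            if label != background:
                sums[label] = sums.get(label, 0) + pixel
                counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    # Background, and parts not visible in the first view, are left at 0.
    return [[means.get(label, 0) for label in row] for row in second_view_template]


labels = [[1, 1], [2, 0]]
pixels = [[10, 30], [50, 99]]
template = [[0, 1], [2, 2]]
filled = fill_template_by_part(labels, pixels, template)
```

A real implementation would more plausibly warp the texture of each part into the template region; the mean fill is used here only to keep the example short.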
Fig. 5 shows an exemplary system architecture 500 of an image processing method or an image processing apparatus to which an embodiment of the present invention can be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 serves as the medium providing communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 501, 502, 503. The background management server may process data such as a received image processing request and feed back a processing result (for example, a processed image) to the terminal device.
It should be noted that the image processing method provided by the embodiment of the present invention is generally executed by the server 505, and accordingly, the image processing apparatus is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use with a terminal device or server implementing an embodiment of the invention is shown. The terminal device or the server shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 610 as necessary, so that a computer program read from it can be installed into the storage section 608 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprising a first visual angle limb image generation module, a template searching module, a second visual angle limb image generation module and a correction processing module. In some cases the names of these modules do not limit the modules themselves; for example, the first visual angle limb image generation module may also be described as "a module for segmenting a limb area from an image to be converted at a first visual angle to obtain a limb image at the first visual angle".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; searching a template library for a limb image template of a second visual angle corresponding to the limb image of the first visual angle; generating a limb image at the second visual angle according to the limb image at the first visual angle and the limb image template at the second visual angle; and, based on the image to be converted at the first visual angle, correcting the limb image at the second visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted.
According to the technical scheme of the embodiment of the invention, a limb area is segmented from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; a limb image template of a second visual angle corresponding to the limb image of the first visual angle is searched for in a template library; a limb image at the second visual angle is generated according to the limb image at the first visual angle and the limb image template at the second visual angle; and, based on the image to be converted at the first visual angle, the limb image at the second visual angle is corrected to obtain a converted image at the second visual angle corresponding to the image to be converted. The image visual angle can thus be converted automatically without manual intervention, which addresses problems such as shoulder distortion and perspective deviation caused by the selfie visual angle, shortens the conversion time, reduces labor cost, and enables batch conversion of image visual angles.
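The four steps summarized above, together with batch conversion, can be wired up as in the following Python sketch. Each step is passed in as a callable because the text does not fix a concrete implementation of any of them; `convert_view`, `convert_batch`, and the parameter names are hypothetical, and the lambdas in the usage part are placeholders that only trace the data flow.

```python
def convert_view(image, segment_limbs, find_template, generate_limb_image, correct):
    """Run the four steps of the method on one first-view image."""
    limb_first = segment_limbs(image)                # step 1: segment the limb area
    template_second = find_template(limb_first)      # step 2: look up the second-view template
    limb_second = generate_limb_image(limb_first, template_second)  # step 3: generate
    return correct(image, limb_second)               # step 4: correct against the input


def convert_batch(images, **steps):
    """Apply the same pipeline to every image, with no manual intervention."""
    return [convert_view(image, **steps) for image in images]


# Placeholder steps operating on strings, purely to show the data flow.
steps = dict(
    segment_limbs=lambda img: f"limb[{img}]",
    find_template=lambda limb: "tmpl",
    generate_limb_image=lambda limb, tmpl: f"{limb}+{tmpl}",
    correct=lambda img, limb2: f"{img}|{limb2}",
)
single = convert_view("selfie", **steps)
batch = convert_batch(["a", "b"], **steps)
```

Keeping the steps as injected callables mirrors the module split of the apparatus: each module can be replaced without touching the orchestration.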
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (16)
1. An image processing method, comprising:
segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle;
searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library;
generating a limb image of a second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle;
and based on the image to be converted of the first visual angle, correcting the limb image of the second visual angle to obtain a converted image of the second visual angle corresponding to the image to be converted.
2. The method of claim 1, wherein the searching for the limb image template of the second perspective corresponding to the limb image of the first perspective in the template library comprises:
searching a limb image template of a first visual angle closest to the limb image of the first visual angle in the template library by utilizing a nearest neighbor algorithm;
and searching the limb image template of the second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
3. The method of claim 2, further comprising, before searching the template library for the limb image template of the second perspective corresponding to the limb image of the first perspective:
pre-establishing an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective;
performing limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle;
and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into the template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
4. The method according to claim 3, wherein the performing a correction process on the limb image of the second perspective based on the image to be converted of the first perspective to obtain a converted image of the second perspective corresponding to the image to be converted comprises:
inputting the image to be converted at the first visual angle and the limb image at the second visual angle into a trained preset generative model, so as to output the converted image at the second visual angle corresponding to the image to be converted.
5. The method according to claim 4, wherein the preset generative model is trained using an image sample set of the first perspective and a limb image template set of the second perspective, the image sample set of the first perspective comprising the image sample of the first perspective, and the limb image template set of the second perspective comprising the limb image template of the second perspective.
6. The method of claim 5, wherein the preset generative model is implemented based on a twin (Siamese) network and a generative adversarial network.
7. The method of claim 3, wherein generating the limb image of the second perspective from the limb image of the first perspective and the limb image template of the second perspective comprises:
filling the limb image template of the second visual angle, part by part of the limb, according to the limb image of the first visual angle, to generate the limb image of the second visual angle.
8. An image processing apparatus, comprising:
a first visual angle limb image generation module, configured to segment a limb area from an image to be converted at a first visual angle to obtain a limb image at the first visual angle;
a template searching module, configured to search a template library for a limb image template of a second visual angle corresponding to the limb image of the first visual angle;
a second visual angle limb image generation module, configured to generate a limb image of the second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle;
and a correction processing module, configured to correct the limb image at the second visual angle based on the image to be converted at the first visual angle, to obtain a converted image at the second visual angle corresponding to the image to be converted.
9. The apparatus of claim 8, wherein the template searching module is further configured to:
search the template library, using a nearest neighbor algorithm, for a limb image template of the first visual angle closest to the limb image of the first visual angle;
and look up the limb image template of the second visual angle corresponding to the limb image template of the first visual angle thus found.
10. The apparatus of claim 9, further comprising a template library creation module configured to:
pre-establish an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective;
perform limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle;
and store the limb image template of the first visual angle and the limb image template of the second visual angle in the template library in correspondence with each other, according to the correspondence between the image sample of the first visual angle and the image sample of the second visual angle.
11. The apparatus of claim 10, wherein the correction processing module is further configured to:
input the image to be converted at the first visual angle and the limb image at the second visual angle into a trained preset generative model, so as to output the converted image at the second visual angle corresponding to the image to be converted.
12. The apparatus of claim 11, wherein the preset generative model is trained using an image sample set of the first perspective and a limb image template set of the second perspective, the image sample set of the first perspective comprising the image sample of the first perspective, and the limb image template set of the second perspective comprising the limb image template of the second perspective.
13. The apparatus of claim 12, wherein the preset generative model is implemented based on a twin (Siamese) network and a generative adversarial network.
14. The apparatus of claim 10, wherein the second perspective limb image generation module is further configured to:
fill the limb image template of the second visual angle, part by part of the limb, according to the limb image of the first visual angle, to generate the limb image of the second visual angle.
15. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
16. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110009410.2A (CN113762015A) | 2021-01-05 | 2021-01-05 | Image processing method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113762015A | 2021-12-07 |
Family
ID=78786324
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110009410.2A (CN113762015A, pending) | Image processing method and device | 2021-01-05 | 2021-01-05 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113762015A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104599284A (en) * | 2015-02-15 | 2015-05-06 | 四川川大智胜软件股份有限公司 | Three-dimensional facial reconstruction method based on multi-view cellphone selfie pictures |
CN109785322A (en) * | 2019-01-31 | 2019-05-21 | 北京市商汤科技开发有限公司 | Simple eye human body attitude estimation network training method, image processing method and device |
CN110264539A (en) * | 2019-06-18 | 2019-09-20 | 北京字节跳动网络技术有限公司 | Image generating method and device |
CN110650239A (en) * | 2018-06-26 | 2020-01-03 | 百度在线网络技术(北京)有限公司 | Image processing method, image processing device, computer equipment and storage medium |
CN111260545A (en) * | 2020-01-20 | 2020-06-09 | 北京百度网讯科技有限公司 | Method and device for generating image |
CN111339918A (en) * | 2020-02-24 | 2020-06-26 | 深圳市商汤科技有限公司 | Image processing method, image processing device, computer equipment and storage medium |
CN111553968A (en) * | 2020-05-11 | 2020-08-18 | 青岛联合创智科技有限公司 | Method for reconstructing animation by three-dimensional human body |
CN111639580A (en) * | 2020-05-25 | 2020-09-08 | 浙江工商大学 | Gait recognition method combining feature separation model and visual angle conversion model |
Non-Patent Citations (2)
Title |
---|
MA, LIQIAN, LIN, ZHE, BARNES CONNELLY, EFROS ALEXEI A, LU JINGWAN: "Unselfie: Translating Selfies to Neutral-pose Portraits in the Wild", ARXIV, 29 July 2020 (2020-07-29), pages 1 - 19 * |
SHUREN ZHOU; PENG LUO; DEEPAK KUMAR JAIN; XIANGYUAN LAN; YUDONG ZHANG: "Double-Domain Imaging and Adaption for Person Re-Identification", IEEE XPLORE, 24 July 2019 (2019-07-24), pages 1 - 9 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||