CN113762015A - Image processing method and device - Google Patents

Info

Publication number
CN113762015A
Authority
CN
China
Prior art keywords
image
visual angle
limb
template
perspective
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110009410.2A
Other languages
Chinese (zh)
Inventor
左鑫孟
梅涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202110009410.2A
Publication of CN113762015A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an image processing method and device, and relates to the field of computer technology. One embodiment of the method comprises: segmenting a limb area from an image to be converted at a first visual angle to obtain a limb image at the first visual angle; searching a template library for a limb image template of a second visual angle corresponding to the limb image of the first visual angle; generating a limb image at the second visual angle according to the limb image at the first visual angle and the limb image template of the second visual angle; and, based on the image to be converted at the first visual angle, correcting the limb image at the second visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted. This embodiment converts the image visual angle automatically, without manual intervention; solves problems such as shoulder distortion and perspective offset caused by the selfie visual angle; reduces conversion time and labor cost; and enables batch conversion of image visual angles.

Description

Image processing method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to an image processing method and apparatus.
Background
In some scenarios it is desirable to convert an image from one perspective to another, for example from a selfie perspective to an other-shot perspective (the perspective of a photo taken by another person). When one person or several people take a selfie, the arm length, the shooting angle, and similar factors can cause problems such as the arm entering the frame and distorted, unnatural body proportions. The image therefore needs to be converted from the selfie perspective to the other-shot perspective to offer more varied shooting options. There are two existing schemes for converting a selfie-perspective image into an other-shot-perspective image. The first scheme crops the image to remove elements such as the outstretched arm and the selfie stick, yielding a smaller image that is basically at the other-shot perspective. The second scheme relies on manual retouching: the perspective of the picture is changed by hand, the arm that enters the frame is painted out, and the vacated regions are filled with background patterns using retouching tools such as the clone stamp, thereby converting the selfie perspective into the other-shot perspective.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
In the first scheme, the region to crop must be selected manually, and a large part of the image may have to be cropped to remove unnatural elements such as the arm. The scheme is mechanical and cannot solve problems such as shoulder distortion and perspective offset caused by the selfie perspective. The second scheme requires some retouching skill, involves many retouching steps, takes a long time, and cannot convert image perspectives in batches within a short time.
Disclosure of Invention
In view of this, embodiments of the present invention provide an image processing method and apparatus, which can automatically convert an image viewing angle without manual intervention, solve the problems of shoulder distortion, perspective offset, and the like caused by a self-portrait viewing angle, reduce conversion time, reduce labor cost, and implement batch conversion of image viewing angles.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided an image processing method.
An image processing method comprising: segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library; generating a limb image of a second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle; and based on the image to be converted of the first visual angle, correcting the limb image of the second visual angle to obtain a converted image of the second visual angle corresponding to the image to be converted.
Optionally, the searching for the limb image template of the second view angle corresponding to the limb image of the first view angle in the template library includes: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in the template library by utilizing a nearest neighbor algorithm; and searching the limb image template of the second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
Optionally, before searching for the limb image template of the second perspective corresponding to the limb image of the first perspective in the template library, the method includes: pre-establishing an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective; performing limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle; and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into the template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
Optionally, the correcting the limb image of the second view angle based on the image to be converted of the first view angle to obtain a converted image of the second view angle corresponding to the image to be converted includes: inputting the image to be converted at the first visual angle and the limb image at the second visual angle into a trained preset generation model to output the converted image at the second visual angle corresponding to the image to be converted.
Optionally, the preset generation model is trained by using an image sample set of a first perspective and a limb image template set of a second perspective, the image sample set of the first perspective includes the image sample of the first perspective, and the limb image template set of the second perspective includes the limb image template of the second perspective.
Optionally, the preset generation model is implemented based on a twin (Siamese) network and a generative adversarial network (GAN).
Optionally, the generating a limb image of a second perspective according to the limb image of the first perspective and the limb image template of the second perspective includes: for each part of the limb, filling the limb image template of the second perspective according to the limb image of the first perspective to generate the limb image of the second perspective.
According to another aspect of the embodiments of the present invention, there is provided an image processing apparatus.
An image processing apparatus comprising: the first visual angle limb image generation module is used for segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; the template searching module is used for searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library; the second visual angle limb image generation module is used for generating a limb image of a second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle; and the correction processing module is used for correcting the limb image at the second visual angle based on the image to be converted at the first visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted.
Optionally, the template searching module is further configured to: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in the template library by utilizing a nearest neighbor algorithm; and searching the limb image template of the second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
Optionally, the system further comprises a template library establishing module, configured to: pre-establishing an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective; performing limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle; and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into the template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
Optionally, the correction processing module is further configured to: input the image to be converted at the first visual angle and the limb image at the second visual angle into a trained preset generation model to output the converted image at the second visual angle corresponding to the image to be converted.
Optionally, the preset generation model is trained by using an image sample set of a first perspective and a limb image template set of a second perspective, the image sample set of the first perspective includes the image sample of the first perspective, and the limb image template set of the second perspective includes the limb image template of the second perspective.
Optionally, the preset generation model is implemented based on a twin (Siamese) network and a generative adversarial network (GAN).
Optionally, the second perspective limb image generation module is further configured to: for each part of the limb, fill the limb image template of the second perspective according to the limb image of the first perspective to generate the limb image of the second perspective.
According to yet another aspect of an embodiment of the present invention, an electronic device is provided.
An electronic device, comprising: one or more processors; a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the image processing method provided by the embodiments of the present invention.
According to yet another aspect of an embodiment of the present invention, a computer-readable medium is provided.
A computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements an image processing method provided by an embodiment of the present invention.
One embodiment of the above invention has the following advantages or benefits: segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle; searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library; generating a limb image at a second visual angle according to the limb image at the first visual angle and a limb image template at the second visual angle; and based on the image to be converted of the first visual angle, correcting the limb image of the second visual angle to obtain a converted image of the second visual angle corresponding to the image to be converted. The image visual angle can be automatically converted without manual intervention, the problems of shoulder distortion, perspective deviation and the like caused by the self-photographing visual angle are solved, the conversion time is shortened, the labor cost is reduced, and batch conversion of the image visual angles can be realized.
Further effects of the optional implementations mentioned above are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of the main steps of an image processing method according to one embodiment of the present invention;
FIG. 2 is a flow chart illustrating the conversion of an image from a self-portrait perspective to an other-portrait perspective according to one embodiment of the present invention;
FIG. 3 is a block diagram of a GAN rectification model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the main blocks of an image processing apparatus according to one embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of main steps of an image processing method according to an embodiment of the present invention.
As shown in fig. 1, the image processing method according to an embodiment of the present invention mainly includes steps S101 to S104 as follows.
Step S101: segmenting the limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle.
Step S102: searching the template library for a limb image template of a second visual angle corresponding to the limb image of the first visual angle.
Searching the template library for a limb image template of a second view angle corresponding to the limb image of the first view angle may include: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in a template library by utilizing a nearest neighbor algorithm; and searching a limb image template of a second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle. The limb image template is a limb image serving as a template, and the template library may be a database for storing all the limb image templates.
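The two-stage lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the patent does not say how limb images are compared, so the sketch assumes each limb image has already been reduced to a fixed-length feature vector and uses Euclidean distance. The names `euclidean`, `find_second_view_template`, and `library` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: every limb image (query or template) is represented by a
# fixed-length feature vector; the patent does not specify the representation.
Vector = Sequence[float]
TemplatePair = Tuple[Vector, Vector]  # (first-view template, second-view template)


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_second_view_template(query: Vector, library: List[TemplatePair]) -> Vector:
    """Find the nearest first-view template, then return its paired second-view template."""
    if not library:
        raise ValueError("template library is empty")
    # Stage 1: nearest neighbor over the first-view templates only.
    _, paired_second = min(library, key=lambda pair: euclidean(query, pair[0]))
    # Stage 2: the second-view template is looked up through the stored pairing.
    return paired_second


# Toy template library with two stored pairs.
library = [
    ([0.0, 0.0, 1.0], [0.1, 0.0, 0.9]),
    ([1.0, 1.0, 0.0], [0.8, 1.1, 0.2]),
]
match = find_second_view_template([0.9, 1.2, 0.1], library)
```

A linear scan is enough for a small library; a larger library would likely call for an index structure, which the patent leaves open.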
The first viewing angle may be a selfie viewing angle, and the second viewing angle may be an other-shot viewing angle (the viewing angle of a photo taken by another person).
Before searching the template library for the limb image template of the second view angle corresponding to the limb image of the first view angle, the method may include: pre-establishing an image pair, wherein the image pair comprises an image sample of a first view angle and an image sample of a second view angle corresponding to the image sample of the first view angle; limb identification is carried out on the image sample of the first visual angle and the image sample of the second visual angle in the image pair, and a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle are obtained respectively; and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into a template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
Limb identification may use the DensePose technique, which maps all human body pixels of a 2D RGB image to a 3D model in real time.
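The construction of the template library can be sketched as below. The limb identifier is passed in as a function so that the sketch does not depend on any particular DensePose interface; `toy_identify_limbs` is a hypothetical stand-in that only labels nonzero pixels, and the grid-of-integers image representation is likewise an assumption made for illustration.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]         # assumption: a 2D grid of pixel values
LimbTemplate = List[List[int]]  # assumption: a 2D grid of body-part labels, 0 = background


def build_template_library(
    image_pairs: List[Tuple[Image, Image]],
    identify_limbs: Callable[[Image], LimbTemplate],
) -> List[Tuple[LimbTemplate, LimbTemplate]]:
    """Identify limbs in both samples of each pair and store the two templates together."""
    library = []
    for first_view_sample, second_view_sample in image_pairs:
        first_template = identify_limbs(first_view_sample)
        second_template = identify_limbs(second_view_sample)
        # Keeping both templates in one entry preserves the correspondence
        # between the first-view sample and the second-view sample.
        library.append((first_template, second_template))
    return library


def toy_identify_limbs(image: Image) -> LimbTemplate:
    """Hypothetical stand-in for a real limb identifier: nonzero pixels become part 1."""
    return [[1 if pixel > 0 else 0 for pixel in row] for row in image]


pairs = [([[0, 5], [7, 0]], [[3, 0], [0, 9]])]
template_library = build_template_library(pairs, toy_identify_limbs)
```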
Step S103: generating a limb image of the second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle.
Generating the limb image of the second perspective according to the limb image of the first perspective and the limb image template of the second perspective may include: for each part of the limb, filling the limb image template of the second perspective according to the limb image of the first perspective to generate the limb image of the second perspective.
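The patent does not state how each part is filled, so the following is only a simplified sketch of per-part filling. It assumes both views carry a grid of body-part labels, and it fills every template pixel of a part with the mean pixel value of the same part in the first-view limb image; a realistic system would transfer texture rather than a single average. All names here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

Grid = List[List[int]]


def mean_value_per_part(part_labels: Grid, pixels: Grid) -> Dict[int, float]:
    """Average pixel value of each body part; label 0 is background and is skipped."""
    totals: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for label_row, pixel_row in zip(part_labels, pixels):
        for label, value in zip(label_row, pixel_row):
            if label != 0:
                totals[label] += value
                counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


def fill_template(template_labels: Grid, first_labels: Grid, first_pixels: Grid) -> List[List[float]]:
    """Fill each part of the second-view template from the same part in the first view."""
    part_value = mean_value_per_part(first_labels, first_pixels)
    # Background and parts not visible in the first view stay at 0.0; they are
    # left for the later correction stage to repair.
    return [[part_value.get(label, 0.0) for label in row] for row in template_labels]


first_labels = [[1, 1, 0], [2, 2, 0]]
first_pixels = [[10, 20, 0], [40, 60, 0]]
second_template = [[0, 1, 1], [0, 2, 3]]
rough_second_view = fill_template(second_template, first_labels, first_pixels)
```

In this toy input, part 3 appears only in the second-view template, so it remains unfilled, which mirrors why a correction step follows.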
Step S104: based on the image to be converted at the first visual angle, correcting the limb image at the second visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted.
Correcting the limb image at the second view angle based on the image to be converted at the first view angle, to obtain a converted image at the second view angle corresponding to the image to be converted, may include: inputting the image to be converted at the first visual angle and the limb image at the second visual angle into the trained preset generation model to output the converted image at the second visual angle corresponding to the image to be converted.
The preset generation model is trained using an image sample set of the first visual angle and a limb image template set of the second visual angle. The image sample set of the first visual angle comprises the image samples of the first visual angle, and the limb image template set of the second visual angle comprises the limb image templates of the second visual angle. Each limb image template of the second visual angle is obtained by performing limb identification on the image sample of the second visual angle corresponding to an image sample of the first visual angle. The training target is the image sample of the second visual angle corresponding to each image sample of the first visual angle.
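The arrangement of training data described above can be sketched as follows: each example has two inputs and one target. The sketch covers only data assembly, not the model or its loss. Images are represented by file-name strings purely as placeholders, and `TrainingExample`, `make_training_examples`, and `toy_identify_limbs` are hypothetical names.

```python
from typing import Callable, List, NamedTuple, Tuple


class TrainingExample(NamedTuple):
    first_view_sample: str     # input 1: image sample at the first visual angle
    second_view_template: str  # input 2: limb image template of the second visual angle
    target: str                # training target: image sample at the second visual angle


def make_training_examples(
    image_pairs: List[Tuple[str, str]],
    identify_limbs: Callable[[str], str],
) -> List[TrainingExample]:
    """Build (input 1, input 2, target) triples from corresponding image pairs."""
    examples = []
    for first_sample, second_sample in image_pairs:
        # The second-view template is derived from the second-view sample,
        # which also serves as the target output for this example.
        template = identify_limbs(second_sample)
        examples.append(TrainingExample(first_sample, template, second_sample))
    return examples


def toy_identify_limbs(image_name: str) -> str:
    """Hypothetical stand-in for limb identification on a stored image."""
    return "limbs:" + image_name


examples = make_training_examples([("selfie_001.png", "other_001.png")], toy_identify_limbs)
```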
The preset generation model may be implemented based on a twin (Siamese) network and a generative adversarial network (GAN).
FIG. 2 is a flow chart illustrating a process of converting an image from a selfie perspective to an other-shot perspective according to an embodiment of the present invention.
As shown in FIG. 2, the present invention provides a deep-learning-based method for converting the selfie perspective of an image. The method synthesizes an other-shot image from a selfie image: a nearest neighbor method retrieves the other-shot image closest to the given selfie image, and a GAN (generative adversarial network) correction model (i.e., the preset generation model) then synthesizes the corresponding other-shot photo. This avoids the drawbacks of the conventional conversion methods, achieves a good perspective conversion effect, and runs automatically without manual intervention. The process is described in detail below.
A selfie/other-shot image pair is created, comprising an image sample at the first visual angle and the corresponding image sample at the second visual angle. DensePose is used to identify the human limb parts, yielding a limb image template of the first visual angle corresponding to the image sample at the first visual angle and a limb image template of the second visual angle corresponding to the image sample at the second visual angle. The two templates form a limb image template pair, which is stored in the selfie/other-shot database (i.e., the template library).
The limb area is segmented from the picture to be converted (i.e., the image to be converted at the first visual angle) to obtain the limb image at the first visual angle. DensePose is used to identify the limb parts of this limb image, and a nearest neighbor search over the template library of limb image template pairs finds the closest other-shot limb image (i.e., the limb image template of the second visual angle). Specifically, a nearest neighbor algorithm searches the template library for the first-visual-angle limb image template closest to the limb image at the first visual angle, and the second-visual-angle limb image template corresponding to that template (i.e., the closest other-shot limb image) is then looked up.
Each body part of the retrieved limb image template of the second visual angle is filled according to the limb image at the first visual angle, giving a rough other-shot limb image (the limb image at the second visual angle, i.e., the rough other-shot image in FIG. 2). This image contains only the portrait, so background filling and limb repair are still required.
The image sample set of the first view and the limb image template set of the second view are used as training samples, and the image sample set of the second view corresponding to the image sample set of the first view is used as the labels of the training samples. Specifically, the GAN correction model (i.e., the preset generation model) may be trained based on a twin (Siamese) network and CycleGAN (a cycle-consistent generative adversarial network). The image sample set of the first visual angle comprises the image samples of the first visual angle, the image sample set of the second visual angle comprises the image samples of the second visual angle, and the limb image template set of the second visual angle comprises the limb image templates of the second visual angle. Each limb image template of the second visual angle is obtained by performing limb identification on the image sample of the second visual angle corresponding to an image sample of the first visual angle. The GAN correction model fills in the background of the limb image at the second visual angle and repairs and refines the limbs.
The image to be converted at the first visual angle and the limb image at the second visual angle are input into the trained GAN correction model, which performs background filling and limb repair and outputs the converted other-shot picture (i.e., the converted image at the second visual angle) corresponding to the image to be converted.
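The overall flow, from segmentation through correction, can be sketched as a composition of four stages. Each stage is supplied as a function, because the patent leaves the concrete segmentation, lookup, filling, and correction models open. The toy stages below only build descriptive strings so that the data flow is visible; `convert_view`, `convert_batch`, and the stage names are hypothetical.

```python
from typing import Any, Callable, List


def convert_view(
    image: Any,
    segment_limbs: Callable[[Any], Any],
    find_template: Callable[[Any], Any],
    fill_template: Callable[[Any, Any], Any],
    correct: Callable[[Any, Any], Any],
) -> Any:
    """Run steps S101 to S104 on one image to be converted."""
    limb_first = segment_limbs(image)                         # S101: first-view limb image
    template_second = find_template(limb_first)               # S102: second-view template
    limb_second = fill_template(limb_first, template_second)  # S103: rough second-view limb image
    return correct(image, limb_second)                        # S104: corrected, converted image


def convert_batch(images: List[Any], **stages: Callable[..., Any]) -> List[Any]:
    """Apply the same pipeline to many images with no manual step in between."""
    return [convert_view(image, **stages) for image in images]


# Toy stages that record which inputs each step received.
stages = dict(
    segment_limbs=lambda image: f"limb({image})",
    find_template=lambda limb: f"template_for({limb})",
    fill_template=lambda limb, template: f"filled({template})",
    correct=lambda image, rough: f"corrected({image}, {rough})",
)
converted = convert_batch(["a.jpg", "b.jpg"], **stages)
```

Note that the correction stage receives both the original image and the rough second-view limb image, matching the two inputs of the correction model described above.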
It can be understood by those skilled in the art that, in other embodiments, the first view angle may be an other-shot view angle and the second view angle may be a selfie view angle, so that conversion from the other-shot view angle to the selfie view angle can also be implemented by the image processing method of the embodiment of the present invention.
Fig. 3 is a schematic diagram of a framework of a GAN correction model according to an embodiment of the present invention.
As shown in FIG. 3, when the GAN correction model is used, the image to be converted and the rough other-shot image (i.e., the limb image at the other-shot view angle) are taken as input, and the converted other-shot image is obtained through a CNN (convolutional neural network) module and a generative adversarial network (e.g., CycleGAN).
When the GAN correction model is trained, the dual-input, single-output characteristic of the twin (Siamese) network is used: the image sample set of the first visual angle and the limb image template set of the second visual angle serve as the two inputs. After the two inputs are convolved by the CNN module, they are fed into the generative adversarial network, with the image sample set of the second visual angle corresponding to the image sample set of the first visual angle as the target output. After repeated iterations, the trained GAN correction model is obtained. The image sample set of the first perspective corresponding to the image sample set of the second perspective means that, for each corresponding pair of image samples in the two sets, the object (in the embodiment of the present invention, a person) is the same but the perspectives differ. For example, the image sample of the first perspective shows a person in a certain posture taken from a selfie perspective, and the corresponding image sample of the second perspective shows the person in the same posture taken from an other-shot perspective.
Fig. 4 is a schematic diagram of main blocks of an image processing apparatus according to an embodiment of the present invention.
As shown in fig. 4, the image processing apparatus 400 according to an embodiment of the present invention mainly includes: a first perspective limb image generation module 401, a template search module 402, a second perspective limb image generation module 403, and a correction processing module 404.
The first perspective limb image generating module 401 is configured to segment a limb area from an image to be converted from a first perspective to obtain a limb image from the first perspective.
The template searching module 402 is configured to search a limb image template of a second view angle corresponding to the limb image of the first view angle in the template library.
The second perspective limb image generating module 403 is configured to generate a limb image of a second perspective according to the limb image of the first perspective and the limb image template of the second perspective.
The correction processing module 404 is configured to perform correction processing on the limb image at the second view angle based on the image to be converted at the first view angle, so as to obtain a converted image at the second view angle corresponding to the image to be converted.
In one embodiment, the template lookup module is specifically configured to: searching a limb image template of a first visual angle closest to the limb image of the first visual angle in a template library by utilizing a nearest neighbor algorithm; and searching a limb image template of a second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
In one embodiment, the apparatus may further include a template library establishing module configured to: pre-establish an image pair, wherein the image pair comprises an image sample of a first view angle and an image sample of a second view angle corresponding to the image sample of the first view angle; perform limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle; and correspondingly store the limb image template of the first visual angle and the limb image template of the second visual angle into a template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
In one embodiment, the correction processing module is specifically configured to: input the image to be converted at the first visual angle and the limb image at the second visual angle into the trained preset generation model to output the converted image at the second visual angle corresponding to the image to be converted.
In one embodiment, the preset generative model is trained by using an image sample set of a first perspective and a limb image template set of a second perspective, wherein the image sample set of the first perspective comprises the image sample of the first perspective, and the limb image template set of the second perspective comprises the limb image template of the second perspective.
In one embodiment, the preset generative model is implemented based on a twin (Siamese) network and a generative adversarial network.
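The source does not describe how the twin network and the generative adversarial network are combined. As a purely illustrative sketch, the defining property of a twin (Siamese) network, namely two branches sharing one set of weights, can be shown as follows. The names `shared_weights`, `encode`, and `generator_input`, the layer sizes, and the idea of concatenating the two encodings as generator input are all assumptions made for illustration; adversarial training is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
# One weight matrix used by both branches (sizes are arbitrary assumptions).
shared_weights = rng.normal(size=(16, 4))

def encode(flat_image):
    """Encode a flattened image with the shared weights."""
    return np.tanh(flat_image @ shared_weights)

def generator_input(image_to_convert, second_view_limb):
    """Hypothetical: concatenate both branch encodings as input for a generator."""
    return np.concatenate([encode(image_to_convert), encode(second_view_limb)])

image_to_convert = rng.normal(size=16)  # stand-in for the first-perspective image
second_view_limb = rng.normal(size=16)  # stand-in for the second-perspective limb image
features = generator_input(image_to_convert, second_view_limb)

# Feeding the same input to both branches yields identical encodings,
# which is what weight sharing implies.
same = generator_input(image_to_convert, image_to_convert)
```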
In one embodiment, the second perspective limb image generation module is specifically configured to: for each part of the limb, fill the limb image template of the second perspective according to the limb image of the first perspective, so as to generate the limb image of the second perspective.
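The source does not specify how the filling is carried out. A minimal sketch, assuming that both the limb image and the template come with integer part-label maps and that each template region is simply filled with the mean color of the same part in the first-perspective image, might look like this. The function `fill_template`, the label convention (0 for background), and the mean-color rule are all assumptions.

```python
import numpy as np

def fill_template(first_view_image, first_view_parts, second_view_parts):
    """Fill each part region of the second-perspective template with that part's mean color."""
    height, width = second_view_parts.shape
    output = np.zeros((height, width, 3), dtype=np.float64)
    for part_id in np.unique(second_view_parts):
        if part_id == 0:  # 0 marks background in this sketch
            continue
        source_mask = first_view_parts == part_id
        if not source_mask.any():
            continue  # part not visible in the first perspective; leave it empty
        mean_color = first_view_image[source_mask].mean(axis=0)
        output[second_view_parts == part_id] = mean_color
    return output

# Toy 2x2 example with two limb parts (labels 1 and 2).
first_view_image = np.array(
    [[[10, 0, 0], [30, 0, 0]],
     [[0, 50, 0], [0, 0, 0]]], dtype=np.float64
)
first_view_parts = np.array([[1, 1], [2, 0]])
second_view_parts = np.array([[0, 1], [2, 2]])

output = fill_template(first_view_image, first_view_parts, second_view_parts)
```

Processing part by part keeps each limb's appearance tied to its own source region, even when the part occupies a different position or area in the second-perspective template.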
Fig. 5 shows an exemplary system architecture 500 of an image processing method or an image processing apparatus to which an embodiment of the present invention can be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 provides a medium for communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 501, 502, 503 to interact with the server 505 over the network 504 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 501, 502, 503, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server providing various services, for example a back-end management server (by way of example only) that supports shopping websites browsed by users on the terminal devices 501, 502, 503. The back-end management server may process data such as a received image processing request and feed the processing result (for example, a processed image) back to the terminal device.
It should be noted that the image processing method provided by the embodiment of the present invention is generally executed by the server 505, and accordingly, the image processing apparatus is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, a block diagram of a computer system 600 suitable for use with a terminal device or server implementing an embodiment of the invention is shown. The terminal device or the server shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the system 600. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprises a first visual angle limb image generation module, a template searching module, a second visual angle limb image generation module and a correction processing module. The names of these modules do not limit the modules themselves in some cases, for example, the first perspective limb image generation module may also be described as "a module for segmenting a limb area from an image to be converted from a first perspective to obtain a limb image from the first perspective".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: segmenting a limb area from the image to be converted of the first perspective to obtain a limb image of the first perspective; searching a template library for a limb image template of a second perspective corresponding to the limb image of the first perspective; generating a limb image of the second perspective according to the limb image of the first perspective and the limb image template of the second perspective; and, based on the image to be converted of the first perspective, correcting the limb image of the second perspective to obtain a converted image of the second perspective corresponding to the image to be converted.
According to the technical solution of the embodiments of the present invention, a limb area is segmented from the image to be converted of the first perspective to obtain a limb image of the first perspective; a limb image template of a second perspective corresponding to the limb image of the first perspective is searched for in a template library; a limb image of the second perspective is generated according to the limb image of the first perspective and the limb image template of the second perspective; and, based on the image to be converted of the first perspective, the limb image of the second perspective is corrected to obtain a converted image of the second perspective corresponding to the image to be converted. The perspective of an image can thus be converted automatically without manual intervention, which solves problems such as shoulder distortion and perspective deviation caused by the selfie perspective, shortens conversion time, reduces labor cost, and allows image perspectives to be converted in batches.
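The way the four steps chain together can be sketched as a simple function composition. This is only an outline of the data flow summarized above: `convert_view` and its four callable parameters are hypothetical names, and the lambda stand-ins below merely record which inputs each step receives.

```python
def convert_view(image, segment, lookup, generate, correct):
    """Chain the four steps: segment, look up template, generate, correct."""
    first_view_limb = segment(image)                      # limb image of the first perspective
    second_view_template = lookup(first_view_limb)        # template of the second perspective
    second_view_limb = generate(first_view_limb, second_view_template)
    # The original image is passed again so the correction step can refer to it.
    return correct(image, second_view_limb)

# String stand-ins that only trace the data flow.
result = convert_view(
    "img",
    segment=lambda im: f"limb({im})",
    lookup=lambda limb: f"tpl({limb})",
    generate=lambda limb, tpl: f"gen({limb},{tpl})",
    correct=lambda im, limb: f"out({im},{limb})",
)
```

Note that the image to be converted enters the chain twice: once for segmentation and once more at the correction step.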
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. An image processing method, comprising:
segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle;
searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library;
generating a limb image of a second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle;
and based on the image to be converted of the first visual angle, correcting the limb image of the second visual angle to obtain a converted image of the second visual angle corresponding to the image to be converted.
2. The method of claim 1, wherein the searching for the limb image template of the second perspective corresponding to the limb image of the first perspective in the template library comprises:
searching a limb image template of a first visual angle closest to the limb image of the first visual angle in the template library by utilizing a nearest neighbor algorithm;
and searching the limb image template of the second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
3. The method of claim 2, wherein, prior to searching the template library for the limb image template of the second perspective corresponding to the limb image of the first perspective, the method further comprises:
pre-establishing an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective;
performing limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle;
and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into the template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
4. The method according to claim 3, wherein the performing a correction process on the limb image of the second perspective based on the image to be converted of the first perspective to obtain a converted image of the second perspective corresponding to the image to be converted comprises:
and inputting the images to be converted at the first visual angle and the limb images at the second visual angle into a trained preset generation model so as to output the converted images at the second visual angle corresponding to the images to be converted.
5. The method according to claim 4, wherein the preset generative model is trained using an image sample set of the first perspective and a limb image template set of the second perspective, the image sample set of the first perspective comprising the image sample of the first perspective, and the limb image template set of the second perspective comprising the limb image template of the second perspective.
6. The method of claim 5, wherein the preset generative model is implemented based on a twin (Siamese) network and a generative adversarial network.
7. The method of claim 3, wherein generating the limb image of the second perspective from the limb image of the first perspective and the limb image template of the second perspective comprises:
and according to each part of the limb, filling the limb image template of the second visual angle according to the limb image of the first visual angle to generate the limb image of the second visual angle.
8. An image processing apparatus characterized by comprising:
the first visual angle limb image generation module is used for segmenting a limb area from the image to be converted at the first visual angle to obtain a limb image at the first visual angle;
the template searching module is used for searching a limb image template of a second visual angle corresponding to the limb image of the first visual angle in a template library;
the second visual angle limb image generation module is used for generating a limb image of a second visual angle according to the limb image of the first visual angle and the limb image template of the second visual angle;
and the correction processing module is used for correcting the limb image at the second visual angle based on the image to be converted at the first visual angle to obtain a converted image at the second visual angle corresponding to the image to be converted.
9. The apparatus of claim 8, wherein the template lookup module is further configured to:
searching a limb image template of a first visual angle closest to the limb image of the first visual angle in the template library by utilizing a nearest neighbor algorithm;
and searching the limb image template of the second visual angle corresponding to the limb image template of the first visual angle according to the searched limb image template of the first visual angle.
10. The apparatus of claim 9, further comprising a template library creation module configured to:
pre-establishing an image pair comprising an image sample of a first perspective and an image sample of a second perspective corresponding to the image sample of the first perspective;
performing limb identification on the image samples in the image pair to respectively obtain a limb image template of the first visual angle corresponding to the image sample of the first visual angle and a limb image template of the second visual angle corresponding to the image sample of the second visual angle;
and correspondingly storing the limb image template of the first visual angle and the limb image template of the second visual angle into the template library according to the corresponding relation between the image sample of the first visual angle and the image sample of the second visual angle.
11. The apparatus of claim 10, wherein the corrective processing module is further configured to:
and inputting the images to be converted at the first visual angle and the limb images at the second visual angle into a trained preset generation model so as to output the converted images at the second visual angle corresponding to the images to be converted.
12. The apparatus of claim 11, wherein the preset generative model is trained using an image sample set of the first perspective and a limb image template set of the second perspective, the image sample set of the first perspective comprising the image sample of the first perspective, and the limb image template set of the second perspective comprising the limb image template of the second perspective.
13. The apparatus of claim 12, wherein the preset generative model is implemented based on a twin (Siamese) network and a generative adversarial network.
14. The apparatus of claim 10, wherein the second perspective limb image generation module is further configured to:
and according to each part of the limb, filling the limb image template of the second visual angle according to the limb image of the first visual angle to generate the limb image of the second visual angle.
15. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
16. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202110009410.2A 2021-01-05 2021-01-05 Image processing method and device Pending CN113762015A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110009410.2A CN113762015A (en) 2021-01-05 2021-01-05 Image processing method and device

Publications (1)

Publication Number Publication Date
CN113762015A true CN113762015A (en) 2021-12-07

Family

ID=78786324

Country Status (1)

Country Link
CN (1) CN113762015A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104599284A (en) * 2015-02-15 2015-05-06 四川川大智胜软件股份有限公司 Three-dimensional facial reconstruction method based on multi-view cellphone selfie pictures
CN109785322A (en) * 2019-01-31 2019-05-21 北京市商汤科技开发有限公司 Simple eye human body attitude estimation network training method, image processing method and device
CN110264539A (en) * 2019-06-18 2019-09-20 北京字节跳动网络技术有限公司 Image generating method and device
CN110650239A (en) * 2018-06-26 2020-01-03 百度在线网络技术(北京)有限公司 Image processing method, image processing device, computer equipment and storage medium
CN111260545A (en) * 2020-01-20 2020-06-09 北京百度网讯科技有限公司 Method and device for generating image
CN111339918A (en) * 2020-02-24 2020-06-26 深圳市商汤科技有限公司 Image processing method, image processing device, computer equipment and storage medium
CN111553968A (en) * 2020-05-11 2020-08-18 青岛联合创智科技有限公司 Method for reconstructing animation by three-dimensional human body
CN111639580A (en) * 2020-05-25 2020-09-08 浙江工商大学 Gait recognition method combining feature separation model and visual angle conversion model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MA, LIQIAN, LIN, ZHE, BARNES CONNELLY, EFROS ALEXEI A, LU JINGWAN: "Unselfie: Translating Selfies to Neutral-pose Portraits in the Wild", ARXIV, 29 July 2020 (2020-07-29), pages 1 - 19 *
SHUREN ZHOU; PENG LUO; DEEPAK KUMAR JAIN; XIANGYUAN LAN; YUDONG ZHANG: "Double-Domain Imaging and Adaption for Person Re-Identification", IEEE XPLORE, 24 July 2019 (2019-07-24), pages 1 - 9 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination