CN116051695A - Image processing method and device

Info

Publication number
CN116051695A
Application number
CN202310022691.4A
Authority
CN (China)
Prior art keywords
target object; human body; three-dimensional; target; view angle
Legal status
Pending
Inventors
孙家岭; 罗一衍
Assignee (applicant)
Vivo Mobile Communication Co Ltd
Filing date
2023-01-03
Publication date
2023-05-02

Classifications

    • G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G06T 17/00 - Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06V 20/40 - Scenes; Scene-specific elements in video content
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Abstract

The application discloses an image processing method and device, and belongs to the technical field of image processing. The method includes: performing human body detection on image frame data of a video to obtain a human body detection frame including a target object; performing three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each of N view angles, where the N view angles are different view angles that take the shooting view angle of the target object in the image frame data as a reference view angle, and N is an integer greater than 0; and generating a target image corresponding to the image frame data according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles.

Description

Image processing method and device
Technical Field
The application belongs to the technical field of image processing, and particularly relates to an image processing method and an image processing device.
Background
The human-body multi-view motion copy special effect is common in movie clips and on the short video platforms of mobile phones. At present, to realize multi-view motion copying, a plurality of cameras are used to shoot two-dimensional (2D) images of the same motion from multiple view angles, and then, through video post-processing, the two-dimensional images of the other view angles are superimposed on a reference two-dimensional image to obtain a two-dimensional special-effect image, thereby producing a video with the multi-view motion copy special effect.
However, this approach requires the photographer to use a plurality of cameras, each of which shoots a two-dimensional image of a corresponding view angle, followed by video post-processing; the operation is cumbersome and places high demands on the photographing equipment and the post-processing.
Disclosure of Invention
The embodiments of the present application aim to provide an image processing method and an image processing apparatus, so as to solve the problem that electronic equipment cannot automatically produce a video with the multi-view motion copy special effect.
In a first aspect, an embodiment of the present application provides an image processing method, including:
performing human body detection on image frame data of the video to obtain a human body detection frame comprising a target object;
performing three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each view angle of N view angles; wherein the N view angles are different view angles taking a shooting view angle of the target object in the image frame data as a reference view angle, N is an integer greater than 0;
and generating a target image corresponding to the image frame data according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles.
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
the detection module is used for detecting the human body of the image frame data of the video to obtain a human body detection frame comprising a target object;
the three-dimensional reconstruction module is used for carrying out three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each view angle of N view angles; wherein the N view angles are different view angles taking a shooting view angle of the target object in the image frame data as a reference view angle, N is an integer greater than 0;
and the generating module is used for generating a target image corresponding to the image frame data according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the method as described in the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and where the processor is configured to execute a program or instructions to implement a method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product stored in a storage medium, the program product being executable by at least one processor to implement the method according to the first aspect.
In the embodiment of the application, human body detection is performed on the image frame data of the video to obtain a human body detection frame including a target object; three-dimensional human body reconstruction is performed on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each of N view angles; and a target image corresponding to the image frame data is then generated according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles. In other words, the electronic device only needs to perform three-dimensional human body reconstruction on the target object in the image frame data of the video to obtain a target image that corresponds to the image frame data and has the multi-view motion copy special effect, thereby realizing a video with the multi-view motion copy special effect; the operation is simple, and no additional photographing equipment or post-processing is required.
Drawings
Fig. 1 is a flowchart of an image processing method provided in an embodiment of the present application;
fig. 2 is a schematic diagram of a relationship among image frame data, three-dimensional human body data of a target object at a reference view angle, three-dimensional human body data of a target object at different target view angles, and a target image corresponding to the image frame data in the embodiment of the present application;
fig. 3 is a schematic structural diagram of an image processing apparatus provided in an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly below with reference to the drawings in the embodiments of the present application. It is apparent that the described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application fall within the protection scope of the present application.
The terms "first", "second", and the like in the description and the claims are used to distinguish between similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can be implemented in orders other than those illustrated or described herein; moreover, the objects distinguished by "first", "second", and the like are usually of one type, and the number of objects is not limited, for example, the first object may be one or more than one. In addition, in the description and the claims, "and/or" indicates at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The image processing method provided by the embodiments of the present application is described in detail below through specific embodiments and their application scenarios with reference to the accompanying drawings.
Please refer to fig. 1, which is a flowchart of an image processing method according to an embodiment of the present application. The method can be applied to an electronic device, which may be, for example, a mobile phone, a tablet computer, or a notebook computer. As shown in fig. 1, the method may include steps 1100 to 1300, which are described in detail below.
Step 1100, performing human body detection on image frame data of the video to obtain a human body detection frame including the target object.
The image frame data is extracted by the electronic device during video recording or from a recorded video, and is used to produce image frames with the multi-view motion copy special effect. The target object is the person in the image frame data on whom three-dimensional human body reconstruction is to be performed.
In this embodiment, during video recording the electronic device may acquire the image frame at the current time point in real time as the image frame data, or may extract one image frame every predetermined number of image frames (for example, every 5 image frames) as the image frame data. Human body detection is then performed on the acquired image frame data to obtain a human body detection frame including the target object in the image frame data.
It should be understood that the human body detection frame may be a dotted-line frame or a solid-line frame; of course, the human body detection frame may also take other representation forms, which is not limited in this embodiment. For example, during video recording the electronic device may acquire the image frame data 21 shown in fig. 2 and perform human body detection on the image frame data 21 to obtain the human body detection frame 22 including the target object 201.
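As a non-limiting illustration of step 1100, the following Python sketch samples every fifth frame of a video with OpenCV and runs a person detector on each sampled frame. The patent does not name a specific detector, so detect_person() is a hypothetical stand-in for any human-detection network that returns one bounding box.

```python
import cv2  # OpenCV, assumed available, used only for video decoding


def detect_person(frame):
    """Hypothetical stand-in for the human-detection network of step 1100.

    Assumed to return one (x1, y1, x2, y2) human body detection frame for the
    target object, or None if no person is found.
    """
    raise NotImplementedError("plug in any person detector here")


def sample_detection_frames(video_path, step=5):
    """Take one frame every `step` frames as image frame data and detect the person."""
    capture = cv2.VideoCapture(video_path)
    index = 0
    results = []
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:  # e.g. every 5 image frames, as in this embodiment
            box = detect_person(frame)
            if box is not None:
                results.append((frame, box))  # image frame data + detection frame
        index += 1
    capture.release()
    return results
```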
After step 1100 of performing human body detection on the image frame data of the video to obtain the human body detection frame including the target object is performed, the method proceeds to:
step 1200, performing three-dimensional human body reconstruction on the target object according to the human body detection frame, so as to obtain three-dimensional human body data of the target object under each view angle of the N view angles.
The N view angles are different view angles taking a shooting view angle of a target object in the image frame data as a reference view angle, and N is an integer greater than 0. For example, the photographing view angle of the target object 201 in the image frame data 21 shown in fig. 2 may be taken as a reference view angle, and the corresponding N view angles may be three views of a left view angle, a top view angle, and a right view angle, respectively.
In some embodiments of the present application, step 1200 of performing three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain the three-dimensional human body data of the target object under each of the N view angles may further include the following steps 1210 to 1230:
step 1210, clipping the image frame data according to the human body detection frame to obtain a character image; wherein the character image includes the target object.
In step 1210, the electronic device crops the image frame data according to the human body detection frame to obtain a character image containing only the target object, and records the position information of the human body detection frame; for example, the upper-left corner coordinates and the lower-right corner coordinates of the human body detection frame 22 shown in fig. 2 may be recorded. It can be understood that the recorded position information of the human body detection frame is later used for projecting the three-dimensional human body data of the target object under the different view angles back into the image frame data.
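A minimal sketch of the cropping in step 1210, assuming the detection frame is given as pixel coordinates (x1, y1, x2, y2); the recorded corner coordinates are kept alongside the crop for the later projection step.

```python
import numpy as np


def crop_person(frame: np.ndarray, box):
    """Crop the character image out of the image frame data and keep the box position.

    `box` is assumed to be (x1, y1, x2, y2) in pixels, with (x1, y1) the upper-left
    and (x2, y2) the lower-right corner of the human body detection frame.
    """
    x1, y1, x2, y2 = [int(round(v)) for v in box]
    person_image = frame[y1:y2, x1:x2].copy()  # character image containing only the target object
    box_info = {"top_left": (x1, y1), "bottom_right": (x2, y2)}  # recorded for projection later
    return person_image, box_info
```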
Step 1220, obtaining a first parameter for reconstructing the three-dimensional human body of the target object according to the character image.
Wherein the first parameter includes a body type parameter β of the target object and a joint rotation parameter θ of the target object.
In step 1220, the electronic device inputs the cropped character image into the deep network of the human body reconstruction algorithm to obtain the body type parameter β of the target object and the joint rotation parameter θ of the target object.
Step 1230, obtaining three-dimensional human body data of the target object under each view angle of the N view angles according to the first parameter and a preset three-dimensional human body reconstruction model.
Optionally, step 1230 of obtaining the three-dimensional human body data of the target object under each of the N view angles according to the first parameter and the preset three-dimensional human body reconstruction model may further include the following steps 1231 to 1233:
step 1231, obtaining first three-dimensional vertex coordinate information of the target object under the reference view angle according to the body type parameter of the target object, the joint rotation parameter of the target object and the preset three-dimensional human body reconstruction model.
The preset three-dimensional human body reconstruction model may be the Skinned Multi-Person Linear (SMPL) model, or may be another model for three-dimensional human body reconstruction, which is not limited in this embodiment.
In step 1231, the electronic device inputs the body type parameter β of the target object and the joint rotation parameter θ of the target object into the SMPL model to obtain the first three-dimensional vertex coordinate information of the target object at the reference view angle. Specifically, the first three-dimensional vertex coordinate information V of the target object at the reference view angle may include the position information of each of the 6890 three-dimensional vertices representing the target object at the reference view angle. It can be appreciated that the three-dimensional human body data 202 of the target object at the reference view angle shown in fig. 2 can be obtained by rendering the first three-dimensional vertex coordinate information V.
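The following sketch shows one way step 1231 could be realized with the smplx package, assuming β and θ have already been regressed from the character image by an HMR-style network. The package, the model directory, and the parameter shapes are assumptions for illustration and are not prescribed by the patent.

```python
import torch
import smplx  # SMPL implementation, assumed installed; model files are obtained separately


def reference_view_vertices(betas, body_pose, global_orient, model_dir="models"):
    """Run the SMPL model once to get the 6890 reference-view vertices V.

    betas:         (1, 10) tensor, body type parameter beta
    body_pose:     (1, 69) tensor, joint rotation parameter theta (23 joints, axis-angle)
    global_orient: (1, 3)  tensor, root orientation (part of theta)
    """
    smpl = smplx.create(model_dir, model_type="smpl", gender="neutral")
    output = smpl(betas=betas, body_pose=body_pose, global_orient=global_orient)
    vertices = output.vertices[0].detach().cpu().numpy()  # (6890, 3) vertex coordinates V
    faces = smpl.faces                                     # triangle indices, reused when rendering
    return vertices, faces
```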
Step 1232, obtaining second three-dimensional vertex coordinate information of the target object under the target view according to the first three-dimensional vertex coordinate information and the target view; wherein the N views include the target view.
It may be appreciated that the second three-dimensional vertex coordinate information of the target object under the target perspective may include: position information representing each of 6890 three-dimensional vertices of the target object at the target perspective.
Optionally, in step 1232, obtaining the second three-dimensional vertex coordinate information of the target object under the target view angle according to the first three-dimensional vertex coordinate information and the target view angle may further include: acquiring a second parameter of the target view angle relative to the reference view angle, wherein the second parameter includes a rotation parameter and a translation parameter of the camera of the electronic device at the target view angle relative to the reference view angle; and obtaining the second three-dimensional vertex coordinate information of the target object under the target view angle according to the second parameter and the first three-dimensional vertex coordinate information.
Specifically, the electronic device takes any one of the N view angles as a target view angle i, obtains the rotation parameter R_i and the translation parameter t_i of the camera of the electronic device at the target view angle i relative to the reference view angle, and transforms the first three-dimensional vertex coordinate information V of the target object under the reference view angle into the second three-dimensional vertex coordinate information V_i of the target object under the target view angle i according to R_i and t_i. The second three-dimensional vertex coordinate information V_i, the rotation parameter R_i, and the translation parameter t_i satisfy the following formula (1):
V_i=R_i*V+t_i (1)
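A small numpy sketch of formula (1): the reference-view vertices V are rotated and translated into each target view angle i. The example R_i/t_i values (a 90 degree yaw used as a "left view angle") are illustrative assumptions only; the patent does not specify how the second parameters are chosen.

```python
import numpy as np


def transform_to_view(V, R_i, t_i):
    """Apply formula (1): V_i = R_i * V + t_i, for all 6890 vertices at once."""
    return V @ R_i.T + t_i  # (6890, 3) second vertex coordinates under view i


# Illustrative second parameters for a left view angle: rotate 90 degrees about the vertical axis.
angle = np.pi / 2.0
R_left = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0,           1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_left = np.zeros(3)
# V_left = transform_to_view(V, R_left, t_left)
```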
Step 1233, rendering the second three-dimensional vertex coordinate information to obtain three-dimensional human body data of the target object under the target view angle.
In step 1233, the electronic device may render the second three-dimensional vertex coordinate information of the target object under the target view angle to obtain the three-dimensional human body data M_3d of the target object under the target view angle; this three-dimensional human body data may also be referred to as a smooth three-dimensional human body surface mesh. For example, referring to fig. 2, the electronic device renders the second three-dimensional vertex coordinate information of the target object under the left view angle to obtain the three-dimensional human body data 203 of the target object under the left view angle, renders the second three-dimensional vertex coordinate information of the target object under the top view angle to obtain the three-dimensional human body data 204 of the target object under the top view angle, and renders the second three-dimensional vertex coordinate information of the target object under the right view angle to obtain the three-dimensional human body data 205 of the target object under the right view angle.
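Step 1233 can be illustrated with an off-screen mesh renderer. The sketch below uses pyrender and trimesh as one possible choice (the patent does not prescribe a renderer); the image size, camera intrinsics, mesh placement, and lighting are assumptions.

```python
import numpy as np
import trimesh
import pyrender  # assumed available; requires an OpenGL/EGL context for off-screen rendering


def render_view(V_i, faces, width=512, height=512):
    """Render the transformed vertices V_i into an RGBA image of the target object."""
    mesh = pyrender.Mesh.from_trimesh(
        trimesh.Trimesh(vertices=V_i, faces=faces, process=False))
    scene = pyrender.Scene(bg_color=[0.0, 0.0, 0.0, 0.0])  # transparent background for compositing

    # Place the mesh in front of the camera; pyrender cameras look down the -z axis.
    mesh_pose = np.eye(4)
    mesh_pose[2, 3] = -2.5
    scene.add(mesh, pose=mesh_pose)

    # Assumed intrinsics and lighting; the patent leaves these unspecified.
    camera = pyrender.IntrinsicsCamera(fx=500.0, fy=500.0, cx=width / 2.0, cy=height / 2.0)
    scene.add(camera, pose=np.eye(4))
    scene.add(pyrender.DirectionalLight(color=np.ones(3), intensity=3.0), pose=np.eye(4))

    renderer = pyrender.OffscreenRenderer(viewport_width=width, viewport_height=height)
    color, _depth = renderer.render(scene, flags=pyrender.RenderFlags.RGBA)
    renderer.delete()
    return color  # (height, width, 4) rendered view, i.e. one M_3d rendering
```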
After step 1200 of performing three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain the three-dimensional human body data of the target object under each of the N view angles is performed, the method proceeds to:
step 1300, generating a target image corresponding to the image frame data according to the three-dimensional human body data and the image frame data of the target object under each view angle of the N view angles.
The target image corresponding to the image frame data may be understood as a two-dimensional special effect image obtained by performing multi-angle motion copy special effect processing on the image frame data.
In some embodiments of the present application, step 1300 of generating the target image corresponding to the image frame data according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles may further include: projecting, according to a preset projection model, the three-dimensional human body data of the target object under each of the N view angles to a first position corresponding to that view angle in the image frame data, so as to generate the target image corresponding to the image frame data.
Specifically, based on the human body detection frame information of the target object, the three-dimensional human body data M_3d rendered for each of the N view angles may be projected to the corresponding positions in the image frame data to obtain the target image M_2d corresponding to the image frame data, which satisfies the following formula (2):
M_2d=π(M_3d) (2)
where π(·) represents the camera projection model.
Referring to fig. 2, the electronic device projects the three-dimensional human body data 203 of the target object under the left view angle to the left side of the target object 201 in the image frame data 21, projects the three-dimensional human body data 204 of the target object under the top view angle above the target object 201 in the image frame data 21, and projects the three-dimensional human body data 205 of the target object under the right view angle to the right side of the target object 201 in the image frame data 21, thereby obtaining a target image 22 that contains motion copies at four view angles: the reference view angle, the left view angle, the top view angle, and the right view angle.
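The compositing described by formula (2) and fig. 2 can be sketched as pasting the rendered views next to the recorded detection frame. Resizing each view to the detection-frame size and the left/top/right placement offsets below are illustrative assumptions, not values taken from the patent.

```python
import cv2
import numpy as np


def paste_rgba(frame, rgba, top_left):
    """Alpha-blend a rendered RGBA view onto the image frame data at the given first position."""
    h, w = rgba.shape[:2]
    x, y = top_left
    y0, y1 = max(y, 0), min(y + h, frame.shape[0])
    x0, x1 = max(x, 0), min(x + w, frame.shape[1])
    if y0 >= y1 or x0 >= x1:
        return frame  # placement falls completely outside the frame
    crop = rgba[y0 - y:y1 - y, x0 - x:x1 - x]
    alpha = crop[..., 3:4].astype(np.float32) / 255.0
    frame[y0:y1, x0:x1] = (alpha * crop[..., :3]
                           + (1.0 - alpha) * frame[y0:y1, x0:x1]).astype(frame.dtype)
    return frame


def compose_target_image(frame, box_info, views):
    """Place the rendered left/top/right views around the recorded human body detection frame.

    `views` maps "left"/"top"/"right" to rendered RGBA images produced by render_view().
    """
    (x1, y1), (x2, y2) = box_info["top_left"], box_info["bottom_right"]
    w, h = x2 - x1, y2 - y1
    offsets = {"left": (x1 - w, y1), "top": (x1, y1 - h), "right": (x2, y1)}
    target = frame.copy()
    for name, rgba in views.items():
        resized = cv2.resize(rgba, (w, h))  # match the detection-frame size
        target = paste_rgba(target, resized, offsets[name])
    return target
```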
According to the embodiment of the application, human body detection is performed on the image frame data of the video to obtain a human body detection frame including the target object; three-dimensional human body reconstruction is performed on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each of N view angles; and a target image corresponding to the image frame data is then generated according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles. In other words, the electronic device only needs to perform three-dimensional human body reconstruction on the target object in the image frame data of the video to obtain a target image that corresponds to the image frame data and has the multi-view motion copy special effect, thereby realizing a video with the multi-view motion copy special effect; the operation is simple, and no additional photographing equipment or post-processing is required.
It should be noted that, the image processing apparatus provided in the embodiment of the present application can implement each process implemented by the method embodiment of fig. 1, and in order to avoid repetition, a description is omitted here.
It should be noted that, in the image processing method provided in the embodiment of the present application, the execution subject may be an image processing apparatus. In the embodiment of the present application, an image processing apparatus provided in the embodiment of the present application will be described by taking an example in which the image processing apparatus executes an image processing method.
Referring to fig. 3, the embodiment of the present application further provides an image processing apparatus 300, where the image processing apparatus 300 includes a detection module 301, a three-dimensional reconstruction module 302, and a generation module 303.
The detection module 301 is configured to perform human body detection on image frame data of a video to obtain a human body detection frame including a target object;
the three-dimensional reconstruction module 302 is configured to perform three-dimensional human body reconstruction on the target object according to the human body detection frame, so as to obtain three-dimensional human body data of the target object under each view angle of the N view angles; wherein the N view angles are different view angles taking a shooting view angle of the target object in the image frame data as a reference view angle, N is an integer greater than 0;
and the generating module 303 is configured to generate a target image corresponding to the image frame data according to the three-dimensional human body data and the image frame data of the target object under each of the N views.
In one embodiment, the three-dimensional reconstruction module 302 is specifically configured to:
cutting the image frame data according to the human body detection frame to obtain a figure image; wherein the character image includes the target object;
obtaining a first parameter for reconstructing a three-dimensional human body of the target object according to the figure image; wherein the first parameter includes a body type parameter of the target object and a joint rotation parameter of the target object;
and obtaining three-dimensional human body data of the target object under each view angle of N view angles according to the first parameters and a preset three-dimensional human body reconstruction model.
In one embodiment, the three-dimensional reconstruction module 302 is specifically configured to:
obtaining first three-dimensional vertex coordinate information of the target object under the reference view angle according to the body type parameter of the target object, the joint rotation parameter of the target object and the preset three-dimensional human body reconstruction model;
obtaining second three-dimensional vertex coordinate information of the target object under the target view angle according to the first three-dimensional vertex coordinate information and the target view angle; wherein the N views include the target view;
and obtaining three-dimensional human body data of the target object under the target visual angle according to the second three-dimensional vertex coordinate information.
In one embodiment, the three-dimensional reconstruction module 302 is specifically configured to:
acquiring a second parameter of the target view angle relative to the reference view angle; wherein the second parameter includes a rotation parameter and a translation parameter of the camera of the electronic device at the target view angle relative to the reference view angle;
and obtaining second three-dimensional vertex coordinate information of the target object under the target view angle according to the second parameter and the first three-dimensional vertex coordinate information.
In one embodiment, the generating module 303 is specifically configured to:
and according to a preset projection model, projecting the three-dimensional human body data of the target object under each view angle of N view angles to a first position corresponding to the view angle in the image frame data, and generating a target image corresponding to the image frame data.
In the embodiment of the application, human body detection is performed on the image frame data of the video to obtain a human body detection frame including a target object; three-dimensional human body reconstruction is performed on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each of N view angles; and a target image corresponding to the image frame data is then generated according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles. In other words, the electronic device only needs to perform three-dimensional human body reconstruction on the target object in the image frame data of the video to obtain a target image that corresponds to the image frame data and has the multi-view motion copy special effect, thereby realizing a video with the multi-view motion copy special effect; the operation is simple, and no additional photographing equipment or post-processing is required.
The image processing apparatus in the embodiments of the present application may be an electronic device, or may be a component in an electronic device, for example, an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. By way of example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a mobile internet device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA), and may also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, or the like, which is not specifically limited in the embodiments of the present application.
The image processing apparatus in the embodiments of the present application may be an apparatus having an operating system. The operating system may be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of the present application.
The image processing apparatus provided in this embodiment of the present application can implement each process implemented by the method embodiment of fig. 1, and in order to avoid repetition, a description is omitted here.
Optionally, as shown in fig. 4, the embodiment of the present application further provides an electronic device 400, including a processor 401 and a memory 402, where the memory 402 stores a program or an instruction that can be executed on the processor 401, and the program or the instruction implements each step of the embodiment of the image processing method when executed by the processor 401, and the steps achieve the same technical effects, so that repetition is avoided, and no further description is given here.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 5 is a schematic hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 500 includes, but is not limited to: radio frequency unit 501, network module 502, audio output unit 503, input unit 504, sensor 505, display unit 506, user input unit 507, interface unit 508, memory 509, processor 510, and image processing chip.
Those skilled in the art will appreciate that the electronic device 500 may further include a power source (for example, a battery) for powering the various components, and the power source may be logically coupled to the processor 510 via a power management system, so that functions such as charging management, discharging management, and power-consumption management are performed through the power management system. The electronic device structure shown in fig. 5 does not constitute a limitation of the electronic device; the electronic device may include more or fewer components than shown, may combine certain components, or may have a different arrangement of components, which will not be described in detail here.
The processor 510 is configured to perform human body detection on image frame data of a video to obtain a human body detection frame including a target object; performing three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each view angle of N view angles; wherein the N view angles are different view angles taking a shooting view angle of the target object in the image frame data as a reference view angle, N is an integer greater than 0; and generating a target image corresponding to the image frame data according to the three-dimensional human body data and the image frame data of the target object under each view angle of the N view angles.
In the embodiment of the application, human body detection is performed on the image frame data of the video to obtain a human body detection frame including a target object; three-dimensional human body reconstruction is performed on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each of N view angles; and a target image corresponding to the image frame data is then generated according to the image frame data and the three-dimensional human body data of the target object under each of the N view angles. In other words, the electronic device only needs to perform three-dimensional human body reconstruction on the target object in the image frame data of the video to obtain a target image that corresponds to the image frame data and has the multi-view motion copy special effect, thereby realizing a video with the multi-view motion copy special effect; the operation is simple, and no additional photographing equipment or post-processing is required.
In one embodiment, the processor 510 is specifically configured to crop the image frame data according to the human body detection frame to obtain a character image; wherein the character image includes the target object; obtaining a first parameter for reconstructing a three-dimensional human body of the target object according to the figure image; wherein the first parameter includes a body type parameter of the target object and a joint rotation parameter of the target object; and obtaining three-dimensional human body data of the target object under each view angle of N view angles according to the first parameters and a preset three-dimensional human body reconstruction model.
In one embodiment, the processor 510 is specifically configured to obtain first three-dimensional vertex coordinate information of the target object under the reference view angle according to the body type parameter of the target object, the joint rotation parameter of the target object, and the preset three-dimensional human body reconstruction model; obtaining second three-dimensional vertex coordinate information of the target object under the target view angle according to the first three-dimensional vertex coordinate information and the target view angle; wherein the N views include the target view; and rendering the second three-dimensional vertex coordinate information to obtain three-dimensional human body data of the target object under the target view angle.
In one embodiment, the processor 510 is specifically configured to acquire a second parameter of the target view angle relative to the reference view angle, where the second parameter includes a rotation parameter and a translation parameter of the camera of the electronic device at the target view angle relative to the reference view angle; and obtain second three-dimensional vertex coordinate information of the target object under the target view angle according to the second parameter and the first three-dimensional vertex coordinate information.
In one embodiment, the processor 510 is specifically configured to project, according to a preset projection model, three-dimensional human body data of the target object under each of N view angles to a first position corresponding to the view angle in the image frame data, and generate a target image corresponding to the image frame data.
It should be appreciated that in embodiments of the present application, the input unit 504 may include a graphics processor (Graphics Processing Unit, GPU) 5041 and a microphone 5042, with the graphics processor 5041 processing image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The display unit 506 may include a display panel 5061, and the display panel 5061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 507 includes at least one of a touch panel 5071 and other input devices 5072. Touch panel 5071, also referred to as a touch screen. Touch panel 5071 may include two parts, a touch detection device and a touch controller. Other input devices 5072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
The memory 509 may be used to store software programs as well as various data. The memory 509 may mainly include a first storage area storing programs or instructions and a second storage area storing data, wherein the first storage area may store an operating system, and application programs or instructions required for at least one function (such as a sound playing function and an image playing function), and the like. Further, the memory 509 may include volatile memory or nonvolatile memory, or the memory 509 may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synchlink DRAM (SLDRAM), or a direct Rambus RAM (DRRAM). The memory 509 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
Processor 510 may include one or more processing units; optionally, the processor 510 integrates an application processor that primarily processes operations involving an operating system, user interface, application programs, etc., and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 510.
The embodiment of the present application further provides a readable storage medium, where a program or an instruction is stored, and when the program or the instruction is executed by a processor, the program or the instruction implements each process of the embodiment of the image processing method, and the same technical effects can be achieved, so that repetition is avoided, and no further description is given here.
The processor is the processor in the electronic device described in the above embodiments. The readable storage medium includes a computer-readable storage medium, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the application further provides a chip, the chip includes a processor and a communication interface, the communication interface is coupled with the processor, and the processor is used for running a program or an instruction, so as to implement each process of the embodiment of the image processing method, and achieve the same technical effect, so that repetition is avoided, and no redundant description is provided here.
It should be understood that the chip mentioned in the embodiments of the present application may also be referred to as a system-level chip, a system chip, a chip system, or a system-on-chip chip.
The embodiments of the present application provide a computer program product stored in a storage medium, where the program product is executed by at least one processor to implement the respective processes of the embodiments of the image processing method described above, and achieve the same technical effects, and are not repeated herein.
It should be noted that, in this document, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such a process, method, article, or apparatus. Without further limitation, an element preceded by "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, and may also include performing the functions in a substantially simultaneous manner or in a reverse order depending on the functions involved; for example, the described methods may be performed in an order different from that described, and various steps may also be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solutions of the present application may be embodied essentially or in a part contributing to the prior art in the form of a computer software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those of ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are also within the protection of the present application.

Claims (10)

1. An image processing method, the method comprising:
performing human body detection on image frame data of the video to obtain a human body detection frame comprising a target object;
performing three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each view angle of N view angles; wherein the N view angles are different view angles taking a shooting view angle of the target object in the image frame data as a reference view angle, N is an integer greater than 0;
and generating a target image corresponding to the image frame data according to the three-dimensional human body data and the image frame data of the target object under each view angle of the N view angles.
2. The method according to claim 1, wherein the reconstructing the three-dimensional human body of the target object according to the human body detection frame to obtain three-dimensional human body data of the target object at each of N view angles includes:
cutting the image frame data according to the human body detection frame to obtain a figure image; wherein the character image includes the target object;
obtaining a first parameter for reconstructing a three-dimensional human body of the target object according to the figure image; wherein the first parameter includes a body type parameter of the target object and a joint rotation parameter of the target object;
and obtaining three-dimensional human body data of the target object under each view angle of N view angles according to the first parameters and a preset three-dimensional human body reconstruction model.
3. The method according to claim 2, wherein the obtaining three-dimensional human body data of the target object at each of N view angles according to the first parameter and a preset three-dimensional human body reconstruction model includes:
obtaining first three-dimensional vertex coordinate information of the target object under the reference view angle according to the body type parameter of the target object, the joint rotation parameter of the target object and the preset three-dimensional human body reconstruction model;
obtaining second three-dimensional vertex coordinate information of the target object under the target view angle according to the first three-dimensional vertex coordinate information and the target view angle; wherein the N views include the target view;
and rendering the second three-dimensional vertex coordinate information to obtain three-dimensional human body data of the target object under the target view angle.
4. A method according to claim 3, wherein said obtaining second three-dimensional vertex coordinate information of said target object at said target perspective from said first three-dimensional vertex coordinate information and target perspective comprises:
acquiring a second parameter of the target view angle relative to the reference view angle; wherein the second parameter comprises a rotation parameter and a translation parameter of a camera of an electronic device at the target view angle relative to the reference view angle;
and obtaining second three-dimensional vertex coordinate information of the target object under the target view angle according to the second parameter and the first three-dimensional vertex coordinate information.
5. The method according to claim 1, wherein generating the target image corresponding to the image frame data from the three-dimensional human body data and the image frame data of the target object for each of the N views includes:
and according to a preset projection model, projecting the three-dimensional human body data of the target object under each view angle of N view angles to a first position corresponding to the view angle in the image frame data, and generating a target image corresponding to the image frame data.
6. An image processing apparatus, characterized in that the apparatus comprises:
the detection module is used for detecting the human body of the image frame data of the video to obtain a human body detection frame comprising a target object;
the three-dimensional reconstruction module is used for carrying out three-dimensional human body reconstruction on the target object according to the human body detection frame to obtain three-dimensional human body data of the target object under each view angle of N view angles; wherein the N view angles are different view angles taking a shooting view angle of the target object in the image frame data as a reference view angle, N is an integer greater than 0;
and the generation module is used for generating a target image corresponding to the image frame data according to the three-dimensional human body data of the target object and the image frame data under each view angle of the N view angles.
7. The apparatus of claim 6, wherein the three-dimensional reconstruction module is configured to:
cutting the image frame data according to the human body detection frame to obtain a figure image; wherein the character image includes the target object;
obtaining a first parameter for reconstructing a three-dimensional human body of the target object according to the figure image; wherein the first parameter includes a body type parameter of the target object and a joint rotation parameter of the target object;
and obtaining three-dimensional human body data of the target object under each view angle of N view angles according to the first parameters and a preset three-dimensional human body reconstruction model.
8. The apparatus of claim 7, wherein the three-dimensional reconstruction module is configured to:
obtaining first three-dimensional vertex coordinate information of the target object under the reference view angle according to the body type parameter of the target object, the joint rotation parameter of the target object and the preset three-dimensional human body reconstruction model;
obtaining second three-dimensional vertex coordinate information of the target object under the target view angle according to the first three-dimensional vertex coordinate information and the target view angle; wherein the N views include the target view;
and obtaining three-dimensional human body data of the target object under the target visual angle according to the second three-dimensional vertex coordinate information.
9. The apparatus of claim 8, wherein the three-dimensional reconstruction module is configured to:
acquiring a second parameter of the target view angle relative to the reference view angle; wherein the second parameter comprises a rotation parameter and a translation parameter of a camera of an electronic device at the target view angle relative to the reference view angle;
and obtaining second three-dimensional vertex coordinate information of the target object under the target view angle according to the second parameter and the first three-dimensional vertex coordinate information.
10. The apparatus of claim 6, wherein the generating module is specifically configured to:
and according to a preset projection model, projecting the three-dimensional human body data of the target object under each view angle of N view angles to a first position corresponding to the view angle in the image frame data, and generating a target image corresponding to the image frame data.