WO2022052620A1 - Image generation method and electronic device - Google Patents

Image generation method and electronic device

Info

Publication number
WO2022052620A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
depth information
depth
pixel
dimensional model
Prior art date
Application number
PCT/CN2021/106178
Other languages
English (en)
Chinese (zh)
Inventor
安世杰
张渊
郑文
Original Assignee
北京达佳互联信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2022052620A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 11/00 2D [Two Dimensional] image generation
    • G06T 11/40 Filling a planar surface by adding surface attributes, e.g. colour or texture
    • G06T 17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/60 Rotation of whole images or parts thereof
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10028 Range image; Depth image; 3D point clouds
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/20212 Image combination
    • G06T 2207/20221 Image fusion; Image merging

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to an image generation method and an electronic device.
  • two-dimensional images of the same scene at different angles are captured by dual cameras, difference information between the two-dimensional images at the different angles is determined, the difference information is converted into depth information of the two-dimensional images, and a three-dimensional image is reconstructed based on the depth information.
  • Embodiments of the present disclosure provide an image generation method and electronic device, which can optimize the image effect of the generated three-dimensional image.
  • the technical solution is as follows:
  • an image generation method comprising:
  • the first image area is the image area where the target object is located
  • the second image area is the image area where the background is located
  • the third image is obtained by fusing image data in the first image area into the depth-filled second image based on the first depth information and the third depth information.
  • the obtaining of the third image by fusing image data in the first image area into the depth-filled second image based on the first depth information and the third depth information includes:
  • a first three-dimensional model is created based on the image data of the first image area, where the first three-dimensional model is a three-dimensional model corresponding to the target object;
  • the pixel information corresponding to the first three-dimensional model and the second three-dimensional model is fused to obtain the third image, wherein
  • the depth information of the pixel points corresponding to the first three-dimensional model in the third image is the first depth information, and
  • the depth information of the pixel points corresponding to the second three-dimensional model in the third image is the third depth information.
  • the third image is obtained by fusing pixel information corresponding to the first three-dimensional model and the second three-dimensional model based on the first depth information and the third depth information, comprising:
  • the depth information of each pixel of the target object is based on the depth information of the target key point of the target object, and the target key point is a key point of the target object;
  • the pixel information of the first pixel point is the pixel information of the target key point in the first three-dimensional model
  • the depth information of the first pixel point is the first depth information of the target key point
  • the pixel information of the second pixel point is the pixel information of the other pixel points in the first three-dimensional model
  • the depth information of the second pixel is the third depth information of the other pixels.
  • acquiring the second image by replacing the image data of the first image area based on the image data of the second image area includes:
  • the background is filled in the removed outline of the region to obtain the second image.
  • the step of filling the background in the removed region outline to obtain the second image includes:
  • the removed first image is input into an image completion model to obtain the second image, and the image completion model is used to fill the background in the outline of the region.
  • the determining of the first depth information of the first image area and the second depth information of the second image area in the first image includes:
  • the first image is input into a first depth determination model to obtain the first depth information and the second depth information.
  • the first depth determination model includes a feature extraction layer, a feature map generation layer, a feature fusion layer, and a depth determination layer;
  • the inputting the first image into the first depth determination model to obtain the first depth information and the second depth information includes:
  • the first depth information and the second depth information are obtained by convolution processing the fused feature map through the depth determination layer.
  • the method further includes:
  • the first coordinate is the position coordinate of the special effect element in the image coordinate system of the third image
  • the second coordinate is the depth coordinate of the special effect element in the camera coordinate system of the third image
  • the fourth image is obtained by fusing the special effect element to the first target pixel point of the third image based on the first coordinate and the second coordinate, and the first target pixel point is the pixel point whose position coordinate is the first coordinate and whose depth coordinate is the second coordinate.
  • the method further includes:
  • the third image is rotated to generate a video.
  • the rotating the third image to generate a video includes:
  • the pixels in the third image are rotated to generate a video.
  • the determining a rotation angle to rotate in a direction corresponding to each coordinate axis of the camera coordinate system includes:
  • the rotation angle of the direction is determined based on the display angle weight and the preset display angle.
  • a method for training a depth determination model comprising:
  • a sampling weight of the first image set is determined based on a first quantity and a second quantity, the first quantity being the quantity of sample images included in the first image set, and the second quantity being the total quantity of sample images included in the plurality of first image sets, where the sampling weight is positively correlated with the second quantity and negatively correlated with the first quantity;
  • based on the sampling weight, the first image set is sampled to obtain a second image set;
  • the second depth determination model is trained based on the plurality of second image sets to obtain the first depth determination model.
  • an image generating apparatus comprising:
  • a first determining unit configured to determine first depth information of a first image area and second depth information of a second image area in a first image, where the first image area is the image area where the target object is located, and the second image area is the image area where the background is located;
  • a replacement unit configured to acquire a second image by replacing the image data of the first image area based on the image data of the second image area;
  • a filling unit configured to obtain third depth information of the third image area by filling the depth of the third image area based on the second depth information, where the third image area is the image area in the second image corresponding to the first image area;
  • the first fusion unit is configured to obtain the third image by fusing the image data in the first image area into the depth-filled second image based on the first depth information and the third depth information.
  • the first fusion unit includes:
  • a first creation subunit configured to create a first three-dimensional model based on the image data of the first image area, where the first three-dimensional model is a three-dimensional model corresponding to the target object;
  • a second creation subunit configured to create a second three-dimensional model based on the depth-filled second image, where the second three-dimensional model is a three-dimensional model corresponding to the background;
  • a fusion subunit configured to fuse pixel information corresponding to the first three-dimensional model and the second three-dimensional model based on the first depth information and the third depth information, to obtain the third image, wherein,
  • the depth information of the pixels corresponding to the first three-dimensional model in the third image is the first depth information
  • the depth information of the pixels corresponding to the second three-dimensional model in the third image is the third depth information.
  • the fusion subunit is configured to: determine, from the first three-dimensional model, depth information of each pixel of the target object, where the depth information of each pixel is based on the depth information of a target key point of the target object, and the target key point is a key point of the target object; determine a first pixel point based on the target key point, where the first pixel point is the pixel point corresponding to the target key point in the second three-dimensional model; assign pixel information and depth information to the first pixel point, where the pixel information of the first pixel point is the pixel information of the target key point in the first three-dimensional model, and the depth information of the first pixel point is the first depth information of the target key point; determine second pixel points based on the positional relationship between the target key point and other pixels of the target object, where the second pixel points are the pixel points corresponding to the other pixels in the second three-dimensional model; and assign pixel information and depth information to the second pixel points to obtain the third image, where the pixel information of the second pixel points is the pixel information of the other pixels in the first three-dimensional model, and the depth information of the second pixel points is the third depth information of the other pixels.
  • the replacement unit includes:
  • a segmentation subunit configured to perform image segmentation on the first image, and determine an area outline corresponding to the first image area
  • a removal subunit configured to remove image data within the outline of the region
  • the completion sub-unit is configured to fill the background in the region outline after removal to obtain the second image.
  • the completion subunit is configured to input the removed first image into an image completion model to obtain the second image, and the image completion model is used for Fill background in area outline.
  • the first determination unit is configured to input the first image into a first depth determination model to obtain the first depth information and the second depth information.
  • the first depth determination model includes a feature extraction layer, a feature map generation layer, a feature fusion layer, and a depth determination layer;
  • the first determining unit includes:
  • a feature extraction subunit configured to input the first image to the feature extraction layer, and extract multiple layers of features of the first image through the feature extraction layer to obtain a plurality of image features of the first image;
  • sampling subunit configured to sample the plurality of image features through the feature map generation layer to obtain a plurality of feature maps of different scales
  • a feature fusion subunit configured to fuse the plurality of feature maps through the feature fusion layer to obtain a fused feature map
  • the convolution subunit is configured to obtain the first depth information and the second depth information by convolution processing the fused feature map through the depth determination layer.
  • the apparatus further includes:
  • a third determining unit is configured to determine a first coordinate and a second coordinate of the special effect element to be added, where the first coordinate is the position coordinate of the special effect element in the image coordinate system of the third image, the The second coordinate is the depth coordinate of the special effect element in the camera coordinate system of the third image;
  • the second fusion unit is configured to obtain a fourth image by fusing the special effect element to a first target pixel point of the third image based on the first coordinate and the second coordinate, where the first target pixel point is the pixel point whose position coordinate is the first coordinate and whose depth coordinate is the second coordinate.
  • the apparatus further includes:
  • a generating unit configured to rotate the third image to generate a video.
  • the generating unit includes:
  • a coordinate setting subunit configured to set the position coordinate corresponding to the target key point of the target object as the coordinate origin of the camera coordinate system corresponding to the third image
  • a determination subunit configured to determine a rotation angle to rotate in a direction corresponding to each coordinate axis of the camera coordinate system
  • a generating subunit is configured to rotate pixels in the third image based on the rotation angle to generate a video.
  • the determining subunit includes:
  • an obtaining subunit configured to obtain a preset display angle, a preset motion speed and a preset number of display frames of the target key point in each direction; based on the preset motion speed and the preset number of display frames, determine the display angle weight; based on the display angle weight and the preset display angle, determine the rotation angle of the direction.
  • an apparatus for training a depth determination model comprising:
  • an acquiring unit configured to acquire a plurality of first image sets, each of which corresponds to an image scene
  • a second determination unit configured to, for each first image set, determine a sampling weight of the first image set based on a first number and a second number, the first number being the number of sample images included in the first image set, and the second number being the total number of sample images included in the plurality of first image sets, where the sampling weight is positively correlated with the second number and negatively correlated with the first number;
  • a sampling unit configured to sample the first image set based on the sampling weight to obtain a second image set
  • the model training unit is configured to train a second depth determination model based on the plurality of second image sets to obtain a first depth determination model.
  • an electronic device includes a processor and a memory, the memory stores at least one piece of program code, and the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the first image area is the image area where the target object is located
  • the second image area is the image area where the background is located
  • the third image is obtained by fusing image data in the first image area into the depth-filled second image based on the first depth information and the third depth information.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • a first three-dimensional model is created based on the image data of the first image area, where the first three-dimensional model is a three-dimensional model corresponding to the target object;
  • the pixel information corresponding to the first three-dimensional model and the second three-dimensional model is fused to obtain the third image, wherein
  • the depth information of the pixel points corresponding to the first three-dimensional model in the third image is the first depth information, and
  • the depth information of the pixel points corresponding to the second three-dimensional model in the third image is the third depth information.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the depth information of each pixel of the target object is based on the depth information of the target key point of the target object, and the target key point is a key point of the target object;
  • the pixel information of the first pixel point is the pixel information of the target key point in the first three-dimensional model
  • the depth information of the first pixel point is the first depth information of the target key point
  • the pixel information of the second pixel point is the pixel information of the other pixel points in the first three-dimensional model
  • the depth information of the second pixel is the third depth information of the other pixels.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the background is filled in the removed outline of the region to obtain the second image.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the removed first image is input into an image completion model to obtain the second image, and the image completion model is used to fill the background in the outline of the region.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the first image is input into a first depth determination model to obtain the first depth information and the second depth information.
  • the first depth determination model includes a feature extraction layer, a feature map generation layer, a feature fusion layer and a depth determination layer; the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the first depth information and the second depth information are obtained by convolution processing the fused feature map through the depth determination layer.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the first coordinate is the position coordinate of the special effect element in the image coordinate system of the third image
  • the second coordinate is the depth coordinate of the special effect element in the camera coordinate system of the third image
  • the fourth image is obtained by fusing the special effect element to the first target pixel point of the third image based on the first coordinate and the second coordinate, and the first target pixel point is the pixel point whose position coordinate is the first coordinate and whose depth coordinate is the second coordinate.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the third image is rotated to generate a video.
  • the at least one piece of program code is loaded and executed by the processor to achieve the following steps:
  • the pixels in the third image are rotated to generate a video.
  • the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • the rotation angle of the direction is determined based on the display angle weight and the preset display angle.
  • an electronic device is provided.
  • the electronic device includes a processor and a memory, the memory stores at least one piece of program code, and the at least one piece of program code is loaded and executed by the processor to implement the following steps:
  • a sampling weight of the first image set is determined based on a first quantity and a second quantity, the first quantity being the quantity of sample images included in the first image set, and the second quantity being the total quantity of sample images included in the plurality of first image sets, where the sampling weight is positively correlated with the second quantity and negatively correlated with the first quantity;
  • based on the sampling weight, the first image set is sampled to obtain a second image set;
  • the second depth determination model is trained based on the plurality of second image sets to obtain the first depth determination model.
  • a computer-readable storage medium is provided, and at least one piece of program code is stored in the computer-readable storage medium, and the at least one piece of program code is loaded and executed by a processor to implement the following steps:
  • the first image area is the image area where the target object is located
  • the second image area is the image area where the background is located
  • the third image is obtained by fusing image data in the first image area into the depth-filled second image based on the first depth information and the third depth information.
  • a computer-readable storage medium is provided, and at least one piece of program code is stored in the computer-readable storage medium, and the at least one piece of program code is loaded and executed by a processor to implement the following steps:
  • a sampling weight of the first image set is determined based on a first quantity and a second quantity, the first quantity being the quantity of sample images included in the first image set, and the second quantity being the total quantity of sample images included in the plurality of first image sets, where the sampling weight is positively correlated with the second quantity and negatively correlated with the first quantity;
  • based on the sampling weight, the first image set is sampled to obtain a second image set;
  • the second depth determination model is trained based on the plurality of second image sets to obtain the first depth determination model.
  • a computer program product or a computer program comprising computer program code, the computer program code being stored in a computer-readable storage medium
  • the processor of the computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computer device performs the following steps:
  • the first image area is the image area where the target object is located
  • the second image area is the image area where the background is located
  • the third image is obtained by fusing image data in the first image region into the depth-filled second image based on the first depth information and the third depth information.
  • a computer program product or a computer program comprising computer program code, the computer program code being stored in a computer-readable storage medium
  • the processor of the computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computer device performs the following steps:
  • a sampling weight of the first image set is determined based on a first quantity and a second quantity, the first quantity being the quantity of sample images included in the first image set, and the second quantity being the total quantity of sample images included in the plurality of first image sets, where the sampling weight is positively correlated with the second quantity and negatively correlated with the first quantity;
  • based on the sampling weight, the first image set is sampled to obtain a second image set;
  • the second depth determination model is trained based on the plurality of second image sets to obtain the first depth determination model.
  • the second image is obtained after background filling and depth filling in the first image
  • the second image is fused with the first image area where the target object is located in the first image to obtain the third image.
  • In this way, when the perspective of the third image changes, background holes can be filled in, distortion or loss at the boundary of the target object is prevented, and the image effect of the generated image is optimized.
  • FIG. 1 is a flowchart of an image generation method provided according to an exemplary embodiment
  • FIG. 2 is a flowchart of an image generation method provided according to an exemplary embodiment
  • FIG. 3 is a schematic diagram of an image processing provided according to an exemplary embodiment
  • FIG. 4 is a schematic diagram of an image processing provided according to an exemplary embodiment
  • FIG. 5 is a schematic diagram of an image processing provided according to an exemplary embodiment
  • FIG. 6 is a flowchart of an image generation method provided according to an exemplary embodiment
  • FIG. 7 is a flowchart of an image generation method provided according to an exemplary embodiment
  • FIG. 8 is a schematic diagram of an image processing provided according to an exemplary embodiment
  • FIG. 9 is a flowchart of an image generation method provided according to an exemplary embodiment.
  • FIG. 10 is a schematic diagram of an image processing provided according to an exemplary embodiment
  • FIG. 11 is a block diagram of an image generating apparatus provided according to an exemplary embodiment
  • FIG. 12 is a flowchart of a training method for a depth determination model provided according to an exemplary embodiment
  • FIG. 13 is a schematic structural diagram of an electronic device provided according to an exemplary embodiment.
  • In order to display the collected images in the form of three-dimensional images, the electronic device performs image processing on the collected images, generates a three-dimensional image, and displays the three-dimensional image to the user.
  • a three-dimensional image refers to an image with a three-dimensional effect.
  • the solutions provided by the embodiments of the present disclosure are applied in an electronic device, and the electronic device is an electronic device with an image acquisition function.
  • the electronic device is a camera, or the electronic device is a mobile phone, a tablet computer, or a wearable device with a camera.
  • the electronic device is not specifically limited.
  • the image generation method provided by the embodiments of the present disclosure can be applied in the following scenarios:
  • an electronic device when an electronic device captures an image, it directly converts the captured two-dimensional image into a three-dimensional image according to the method provided by the embodiment of the present disclosure.
  • the electronic device stores the two-dimensional image after capturing it; when the user shares the two-dimensional image through the electronic device, the electronic device converts the two-dimensional image into a three-dimensional image according to the method provided by the embodiment of the present disclosure, and shares the three-dimensional image.
  • sharing an image includes at least one of sharing an image with other users, sharing an image with a social display platform, and sharing an image with a short video platform, and the like.
  • the electronic device captures and stores a two-dimensional image; when the user generates a video through the electronic device, the electronic device obtains a plurality of selected two-dimensional images, converts the multiple two-dimensional images into multiple three-dimensional images through the method provided by the embodiment of the present disclosure, and synthesizes the multiple three-dimensional images into a video. For example, when a user shares a video on a short video platform, the user first selects multiple two-dimensional selfie images containing human faces, converts the multiple two-dimensional selfie images into multiple three-dimensional selfie images by using the method provided by the embodiment of the present disclosure, synthesizes the three-dimensional selfie images into a video, and shares the obtained video to the short video platform.
  • FIG. 1 is a flowchart of an image generation method provided according to an exemplary embodiment. As shown in Figure 1, the method includes the following steps:
  • Step 101 Determine the first depth information of the first image area and the second depth information of the second image area in the first image, where the first image area is the image area where the target object is located, and the second image area is the image area where the background is located.
  • Step 102 Obtain a second image by replacing the image data of the first image area based on the image data of the second image area.
  • Step 103 Obtain third depth information of the third image area by filling the depth of the third image area based on the second depth information, where the third image area is an image area corresponding to the first image area in the second image.
  • Step 104 Obtain a third image by fusing the image data in the first image area into the depth-filled second image based on the first depth information and the third depth information.
  • obtaining a third image by fusing image data in the first image region into the depth-filled second image based on the first depth information and the third depth information including:
  • a first three-dimensional model is created based on the image data of the first image area, where the first three-dimensional model is a three-dimensional model corresponding to the target object;
  • the pixel information corresponding to the first three-dimensional model and the second three-dimensional model is fused to obtain the third image, wherein
  • the depth information of the pixels corresponding to the first three-dimensional model in the third image is the first depth information, and
  • the depth information of the pixels corresponding to the second three-dimensional model in the third image is the third depth information.
  • the third image is obtained by fusing the pixel information of the first three-dimensional model and the second three-dimensional model based on the first depth information and the third depth information, including:
  • the depth information of each pixel of the target object is based on the depth information of the target key point of the target object, and the target key point is a key point of the target object;
  • the first pixel point is the pixel point corresponding to the target key point in the second three-dimensional model
  • the pixel information of the first pixel point is the pixel information of the target key point in the first three-dimensional model, and the depth information of the first pixel point is the first depth of the target key point information;
  • the pixel information of the second pixel is the pixel information of the other pixel in the first three-dimensional model
  • the depth information of the second pixel is the third depth information of the other pixels.
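  • As a simplified, non-limiting illustration of the fusion described above, the following Python sketch composites the target-object pixels over the depth-filled second image using the object mask and the two depth maps. It collapses the three-dimensional models and the key-point-based assignment of the disclosure into a plain per-pixel composite; all function and variable names are hypothetical.

```python
import numpy as np

def fuse_object_over_background(first_image, object_mask, first_depth,
                                second_image, third_depth):
    """Simplified stand-in for the fusion of step 104 (not the key-point-based
    fusion itself).

    first_image  : H x W x 3 array, original image containing the target object.
    object_mask  : H x W boolean array, True inside the first image area.
    first_depth  : H x W array, depth information of the first image area.
    second_image : H x W x 3 array, background-completed second image.
    third_depth  : H x W array, depth information filled into the third image area.
    """
    third_image = second_image.copy()
    depth = third_depth.copy()
    # Inside the object mask, keep the object's own pixel information and depth,
    # so the target object stays in front of the completed background.
    third_image[object_mask] = first_image[object_mask]
    depth[object_mask] = first_depth[object_mask]
    return third_image, depth
```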
  • obtaining the second image by replacing the image data of the first image area based on the image data of the second image area includes:
  • a background is filled in the removed area outline to obtain a second image, including:
  • the second image is obtained by inputting the removed first image into the image completion model, which is used to fill the background in the contour of the region.
  • determining the first depth information of the first image area and the second depth information of the second image area in the first image includes:
  • the first image is input into the first depth determination model to obtain first depth information and second depth information.
  • the first depth determination model includes a feature extraction layer, a feature map generation layer, a feature fusion layer, and a depth determination layer;
  • the inputting of the first image into the first depth determination model to obtain the first depth information and the second depth information includes:
  • the fused feature map is obtained by fusing multiple feature maps through the feature fusion layer;
  • the first depth information and the second depth information are obtained by convolution processing the fused feature map through the depth determination layer.
  • the method further includes:
  • the first coordinate is the position coordinate of the special effect element in the image coordinate system of the third image
  • the second coordinate is the depth coordinate of the special effect element in the camera coordinate system of the third image
  • the fourth image is obtained by fusing the special effect element to the first target pixel point of the third image, and the first target pixel point is the pixel point whose position coordinate is the first coordinate and whose depth coordinate is the second coordinate.
  • the method further includes:
  • rotating the third image to generate a video includes:
  • the pixels in the third image are rotated to generate a video.
  • determining a rotation angle to rotate in a direction corresponding to each coordinate axis of the camera coordinate system includes:
  • the rotation angle of the direction is determined based on the display angle weight and the preset display angle.
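  • The disclosure does not spell out the exact weighting formula, so the sketch below is only one plausible reading: for one coordinate axis, a per-frame display angle weight is derived from the preset motion speed and the preset number of display frames, and the rotation angle of each frame is that weight times the preset display angle. The linear weighting and all names are assumptions.

```python
import numpy as np

def rotation_angles(preset_angle_deg, motion_speed, num_frames):
    """Hypothetical per-frame rotation angles for one coordinate axis."""
    frame_index = np.arange(1, num_frames + 1)
    # Assumed: the display angle weight grows linearly with the frame index,
    # scaled by the motion speed, and is clipped so the rotation never exceeds
    # the preset display angle.
    weight = np.clip(motion_speed * frame_index / num_frames, 0.0, 1.0)
    return weight * preset_angle_deg

# Example: rotate up to 10 degrees about one axis over 30 display frames.
angles = rotation_angles(preset_angle_deg=10.0, motion_speed=1.0, num_frames=30)
```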
  • the second image is obtained after background filling and depth filling in the first image
  • the second image is fused with the first image area where the target object is located in the first image to obtain the third image.
  • In this way, when the perspective of the third image changes, background holes can be filled in, distortion or loss at the boundary of the target object is prevented, and the image effect of the generated image is optimized.
  • Fig. 2 is a flowchart of an image generation method provided according to an exemplary embodiment.
  • the training of the first depth determination model is taken as an example for description.
  • the method includes the following steps:
  • Step 201 The electronic device acquires a plurality of first image sets, and each first image set corresponds to an image category.
  • the image category is used to represent the scene to which the image belongs, that is, the image category is the image scene, and the image scene includes an indoor scene and an outdoor scene.
  • the first image set includes a plurality of sample images, and the sample images mark the depth information of the pixels in the sample images.
  • the step of acquiring the first image set by the electronic device includes: acquiring a plurality of images by the electronic device, where the categories of the multiple images are the image category; and marking the depth information of the pixels in the multiple images to obtain a plurality of sample images, and forming the plurality of sample images into the first image set.
  • After acquiring the multiple first image sets, the electronic device divides the multiple first image sets into training data and test data.
  • the training data is used to train the model
  • the test data is used to determine whether the trained model meets the requirements.
  • the electronic device selects some sample images from the plurality of first image sets as training data, and uses the remaining sample images in the plurality of first image sets as test data. In some embodiments, the electronic device selects some sample images from each first image set, composes the training data from the sample images selected from each first image set, and composes the test data from the remaining sample images of each first image set. For example, the electronic device acquires two first image sets, which are image set A and image set B respectively.
  • The shooting scene of the sample images included in image set A is outdoor, that is, the image category of image set A is outdoor; the shooting scene of the sample images included in image set B is indoor, that is, the image category of image set B is indoor. The electronic device selects some sample images from image set A and image set B respectively, composes the selected sample images into training data, and composes the remaining sample images in image set A and the remaining sample images in image set B into test data.
  • Since each first image set corresponds to one image category, subsequent model training is performed through multiple image sets, so that the model can be trained according to the difference in depth under different image categories, thereby improving the accuracy of the trained first depth determination model.
  • Step 202 For each first image set, the electronic device determines the sampling weight of the first image set based on the first quantity and the second quantity.
  • the first number is the number of sample images included in the first image set
  • the second number is the total number of sample images included in the multiple first image sets
  • the sampling weight is positively correlated with the second number
  • the sampling weight is negatively correlated with the first quantity. Since each first image set corresponds to one image category, the electronic device determines the sampling weights of the first image sets of different image categories based on the number of sample images in the different image sets, so that subsequent model training is performed based on the sampling weights of the different image categories to improve accuracy.
  • the electronic device uses the ratio of the second number to the first number as the sampling weight. For example, if the second quantity is K and the first quantity is k_i, the sampling weight is K/k_i, where i represents the label (image category) of the first image set.
  • The electronic device uses the ratio of the second quantity to the first quantity as the sampling weight, so that the sampling weight of a first image set with a larger first quantity is smaller and the sampling weight of a first image set with a smaller first quantity is larger. In this way, the sample images of each image category can be balanced during the model training process, and deviation of the model training can be prevented.
  • Step 203 Based on the sampling weight, the electronic device samples the first image set to obtain a second image set.
  • the electronic device acquires sample images from the first image set based on the sampling weight, and composes the acquired sample images into a second image set.
  • the electronic device determines a third number, the third number being the expected total number of images in the second image set. For each first image set, the electronic device determines a fourth number based on the sampling weight of the first image set and the third number, where the fourth number is the number of sample images that need to be collected from the first image set, and collects the fourth number of sample images from the first image set.
  • the fourth number of sample images are adjacent sample images in the first image set, or the fourth number of sample images are randomly sampled sample images in the first image set, or the fourth number of sample images are sample images obtained by uniform sampling in the first image set, and the like.
  • the manner in which the electronic device samples the sample image from the first image set is not specifically limited.
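  • The following is a minimal sketch of steps 202-203, assuming the ratio-based weight K/k_i described in step 202 and random sampling within each first image set; the helper name, the use of numpy's random generator and the rounding of the fourth number are illustrative choices, not part of the disclosure.

```python
import numpy as np

def sample_second_image_sets(first_image_sets, total_samples, seed=0):
    """first_image_sets: dict mapping image category i -> list of sample images.
    total_samples: desired total number of sampled images (the "third number")."""
    rng = np.random.default_rng(seed)
    K = sum(len(images) for images in first_image_sets.values())   # second quantity

    # Step 202: the sampling weight of each first image set is K / k_i.
    weights = {i: K / len(images) for i, images in first_image_sets.items()}
    weight_sum = sum(weights.values())

    # Step 203: draw a "fourth number" of images from each set in proportion
    # to its normalised weight, which balances the image categories.
    second_image_sets = {}
    for i, images in first_image_sets.items():
        n_i = round(total_samples * weights[i] / weight_sum)
        idx = rng.choice(len(images), size=min(n_i, len(images)), replace=False)
        second_image_sets[i] = [images[j] for j in idx]
    return second_image_sets
```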
  • Step 204 The electronic device trains the second depth determination model to obtain the first depth determination model based on the plurality of second image sets.
  • the electronic device adjusts the model parameters of the second depth determination model based on the second image set and the loss function to obtain the trained first depth determination model, and the process is implemented through the following steps (1)-(3), including:
  • the electronic device determines the loss value of the second depth determination model based on the second image set and the loss function.
  • the electronic device obtains the first depth determination model by training the second depth determination model, and the process includes: for each sample image in the second image set, the electronic device inputs the sample image into the second depth determination model to output the depth information of the sample image, and inputs the depth information output by the second depth determination model and the depth information marked in the sample image into the loss function to obtain the loss value of the second depth determination model.
  • the loss function is a vector loss function; for example, the loss function includes at least one of a depth x-direction loss function, a depth y-direction loss function, a normal vector loss function, and a reverse robust loss function (Reversed HuBer).
  • the electronic device constructs a second depth determination model.
  • the electronic device constructs the second depth determination model through a convolutional neural network.
  • the second depth determination model includes a feature extraction layer, a feature map generation layer, a feature fusion layer and a depth determination layer.
  • Each layer in the second depth determination model consists of convolutional layers, and the convolutional layers are of the same structure or of different structures.
  • the convolution layer in the second depth determination model is at least one of Depthwise Convolution (depth convolution structure), Pointwise Convolution (pointwise convolution structure) or Depthwise-Pointwise Convolution (depth pointwise convolution structure).
  • the structure of the convolution layer is not specifically limited.
  • the feature extraction layer consists of four convolutional layers.
  • the feature extraction layer is used to extract multi-layer features of the sample image to obtain multiple image features of the sample image.
  • the sample image is a 3-channel image.
  • the electronic device inputs the 3-channel sample image into the first convolutional layer, which converts the 3-channel sample image into a 16-channel sample image; the subsequent convolutional layers then convert the 16-channel sample image into a 128-channel sample image. For sample images with different channel numbers, the image features of the sample images are extracted respectively, so that different image features corresponding to different convolutional layers can be obtained.
  • the feature map generation layer is used to sample multiple image features to obtain multiple feature maps of different scales.
  • the features of the local image and the global image in the sample image are determined from the image features of the different convolutional layers output by the feature extraction layer, and the relative relationship between the position of each pixel in the sample image and the global image is recorded, so as to provide local feature information and global feature information to the feature fusion layer and the depth determination layer.
  • the feature map generation layer consists of five convolutional layers.
  • the first to fourth convolutional layers are used to sample the 128-channel sample images; the first to fourth convolutional layers are respectively connected to the fifth convolutional layer, and the sampled images are input to the fifth convolutional layer. The fifth convolutional layer performs scale conversion on the four received sample images to obtain multiple feature maps of different scales, and the multiple feature maps of different scales are input to the feature fusion layer.
  • the feature fusion layer is used to perform feature fusion on the multiple feature maps to obtain a fused feature map.
  • the feature fusion layer gradually restores the image resolution and reduces the number of channels, fuses the features of the feature extraction layer, and takes into account the features of different depths in the sample image.
  • the feature fusion layer includes three layers of convolution layers.
  • The first convolutional layer downsamples the feature map of the 128-channel sample image to obtain the feature map of a 64-channel sample image; the second convolutional layer downsamples the feature map of the 64-channel sample image to obtain the feature map of a 32-channel sample image; the third convolutional layer downsamples the feature map of the 32-channel sample image to obtain the feature map of a 16-channel sample image. The obtained multiple feature maps are then fused to obtain a fused feature map, and the fused feature map is input to the depth determination layer.
  • the depth determination layer is used to determine the depth information of each pixel of the sample image based on the fused feature map.
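  • The channel counts below follow the description above (a four-convolution feature extraction layer going from 3 to 16 and up to 128 channels, a five-convolution feature map generation layer, a three-convolution feature fusion layer going 128 -> 64 -> 32 -> 16 channels, and a depth determination layer), but the intermediate channel counts, kernel sizes and the exact wiring of the branches are not specified in the disclosure, so this PyTorch module is only an illustrative sketch.

```python
import torch
import torch.nn as nn

class DepthDeterminationModel(nn.Module):
    """Illustrative sketch of the second depth determination model."""

    def __init__(self):
        super().__init__()
        # Feature extraction layer: four convolutions, 3 -> 16 -> 32 -> 64 -> 128
        # channels (the intermediate 32/64 are assumptions).
        chans = [3, 16, 32, 64, 128]
        self.feature_extraction = nn.ModuleList(
            nn.Sequential(nn.Conv2d(chans[i], chans[i + 1], 3, padding=1), nn.ReLU())
            for i in range(4)
        )
        # Feature map generation layer: four convolutions sample the 128-channel
        # features (different dilation rates stand in for the different scales),
        # and a fifth convolution merges them.
        self.sampling = nn.ModuleList(
            nn.Conv2d(128, 32, 3, padding=d, dilation=d) for d in (1, 2, 4, 8)
        )
        self.merge = nn.Conv2d(4 * 32, 128, 1)
        # Feature fusion layer: three convolutions, 128 -> 64 -> 32 -> 16 channels.
        self.fusion = nn.Sequential(
            nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
        )
        # Depth determination layer: produces one depth value per pixel.
        self.depth = nn.Conv2d(16, 1, 3, padding=1)

    def forward(self, x):
        for layer in self.feature_extraction:
            x = layer(x)                               # image features per convolution
        sampled = [conv(x) for conv in self.sampling]  # multi-scale feature maps
        fused = self.fusion(self.merge(torch.cat(sampled, dim=1)))
        return self.depth(fused)                       # per-pixel depth information

# model = DepthDeterminationModel()
# depth_map = model(torch.randn(1, 3, 128, 128))       # shape 1 x 1 x 128 x 128
```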
  • the electronic device first acquires the multiple first image sets and then constructs the second depth determination model; or, the electronic device constructs the second depth determination model first, and then acquires the multiple first image sets; or, the electronic device simultaneously acquires the plurality of first image sets and constructs the second depth determination model. That is, the electronic device executes step 201 first, and then executes step 202; or, the electronic device executes step 202 first, and then executes step 201; or, the electronic device executes step 201 and step 202 at the same time. In this embodiment of the present disclosure, the execution order of step 201 and step 202 is not specifically limited.
  • the electronic device updates the model parameters of the second depth determination model through the loss value and the model optimizer to obtain a third depth determination model.
  • the optimizer is used to update model parameters using stochastic gradient descent.
  • the electronic device updates model parameters through the stochastic gradient descent method, where the model parameters include gradient values.
  • the electronic device determines the loss value of the third depth determination model based on the training data and the vector loss function, and completes the model training to obtain the first depth determination model once the loss value is less than the preset loss value.
  • After the electronic device adjusts the model parameters of the second depth determination model, it continues to perform model training on the obtained third depth determination model. This process is similar to steps (1)-(2) and will not be repeated here. Each time step (2) is performed, the electronic device determines whether the model training is completed based on the loss value of the model: in response to the loss value being not less than the preset loss value, it is determined that the model training has not been completed, and steps (1)-(2) are continued; in response to the loss value being less than the preset loss value, it is determined that the model training is completed, and the first depth determination model is obtained.
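  • A minimal training-loop sketch of steps (1)-(3), assuming PyTorch, stochastic gradient descent as the optimizer and, as a stand-in for the full vector loss, only the reverse robust (reversed Huber / berHu) term; `loader` is assumed to yield batches of sample images and their marked depth information.

```python
import torch

def berhu_loss(pred, target):
    """Reversed Huber (berHu) term, used here as a stand-in for the full vector
    loss (depth x/y-direction, normal vector and reversed Huber terms)."""
    err = (pred - target).abs()
    c = 0.2 * err.max().clamp(min=1e-6)
    return torch.where(err <= c, err, (err ** 2 + c ** 2) / (2 * c)).mean()

def train(model, loader, preset_loss=0.05, lr=1e-3, max_rounds=50):
    """Steps (1)-(3): compute the loss on the second image sets, update the model
    parameters by stochastic gradient descent, and stop once the loss value is
    less than the preset loss value."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss = None
    for _ in range(max_rounds):
        for sample_image, depth_label in loader:
            optimizer.zero_grad()
            loss = berhu_loss(model(sample_image), depth_label)
            loss.backward()
            optimizer.step()
        if loss is not None and loss.item() < preset_loss:   # training completed
            break
    return model
```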
  • After the electronic device completes the model training, it evaluates the prediction result of the first depth determination model.
  • the electronic device tests the first depth determination model based on the test data, and obtains a test result of the first depth determination model, where the test result is used to indicate whether the first depth determination model meets the requirements.
  • In response to the test result indicating that the first depth determination model meets the requirements, it is determined that the first depth determination model is an available depth determination model, and the depth information of images is subsequently determined based on the first depth determination model; when the test result indicates that the first depth determination model does not meet the requirements, the first depth determination model continues to be trained until it meets the requirements.
  • the electronic device adopts at least one of a Mean Relative Error (average relative error) algorithm or a Root Mean Squared Error (root mean square error) algorithm to determine the test result of the first depth determination model.
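  • For reference, the two metrics named above can be computed as in the short numpy sketch below; the masking of invalid depth values and any per-image averaging used in practice are omitted.

```python
import numpy as np

def mean_relative_error(pred, gt):
    """Mean relative error: average of |pred - gt| / gt over all pixels."""
    return float(np.mean(np.abs(pred - gt) / gt))

def root_mean_squared_error(pred, gt):
    """Root mean squared error between predicted and labelled depth."""
    return float(np.sqrt(np.mean((pred - gt) ** 2)))
```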
  • FIG. 4 and FIG. 5 are renderings of test results of a first depth determination model provided according to an exemplary embodiment. Pixels with the same depth information are marked with the same mark, and the more similar the depth information is, the more similar the marks are. For example, different depth information is distinguished by different colors, and the more similar the depth information is, the more similar the colors are.
  • the training process of the first depth determination model is performed by the electronic device currently used to generate the image; or, performed by another electronic device other than the current device.
  • the process for the electronic device to acquire the first depth determination model is as follows: the electronic device sends an acquisition request to the other electronic device, where the acquisition request is used to request the first depth determination model; the other electronic device acquires the first depth determination model based on the acquisition request and sends the first depth determination model to the electronic device; and the electronic device receives the first depth determination model.
  • the process of training the first depth determination model by other electronic devices is similar to the process of training the first depth determination model by the electronic device, and details are not described herein again.
  • the sampling weight is determined based on the first number and the second number, where the first number is the number of sample images in the first image set and the second number is the total number of sample images in the multiple first image sets. Therefore, when the first image sets are sampled based on the sampling weights, the number of sample images drawn from each first image set can be controlled, so that the sample images of each image category remain balanced during model training.
  • Fig. 6 is a flowchart of an image generation method provided according to an exemplary embodiment.
  • processing an image to generate a three-dimensional dynamic image is taken as an example for description.
  • the method includes the following steps:
  • Step 601 The electronic device determines the first depth information of the first image area and the second depth information of the second image area in the first image.
  • the first image area is the image area where the target object is located
  • the second image area is the image area where the background is located
  • the background is the part of the first image excluding the target object.
  • the target object is a designated object, a human or other animal face, or the like.
  • the electronic device obtains the first depth information and the second depth information by using the first depth determination model, and the process is as follows: the electronic device inputs the first image into the first depth determination model, and obtains the first depth information and the second depth information.
  • the structure of the first depth determination model is the same as that of the second depth determination model.
  • the first depth determination model includes a feature extraction layer, a feature map generation layer, a feature fusion layer and a depth determination layer. This step is realized through the following steps (1)-(4), including:
  • the electronic device inputs the first image to the feature extraction layer, and extracts multi-layer features of the first image through the feature extraction layer to obtain multiple image features of the first image.
  • This step is similar to the process of extracting image features by the electronic device through the feature extraction layer in the second depth determination model in step (1) of step 204, and details are not described here.
  • the electronic device samples multiple image features through the feature map generation layer to obtain multiple feature maps of different scales.
  • This step is similar to the process in step (1) of step 204 in which the electronic device generates feature maps through the feature map generation layer in the second depth determination model, and details are not repeated here.
  • the electronic device fuses multiple feature maps through the feature fusion layer to obtain a fused feature map.
  • This step is similar to the process of feature fusion performed by the electronic device through the feature fusion layer in the second depth determination model in step (1) of step 204, and details are not repeated here.
  • the electronic device obtains the first depth information and the second depth information by convolution processing the fused feature map through the depth determination layer.
  • This step is similar to the process of determining the depth information of the image by the electronic device through the depth determination layer in the second depth determination model in step (1) of step 204, and details are not repeated here.
  • the first depth information and the second depth information of the first image are determined by the pre-trained first depth determination model, thereby shortening the determination time of the first depth information and the second depth information, and further The image processing speed is improved, so that the solution can be applied to the scene of instant imaging.
  • the electronic device detects whether the target object exists in the first image; in response to the presence of the target object in the first image, the electronic device performs step 601, and in response to the absence of the target object in the first image, the process ends.
  • In response to the presence of the target object in the first image, the electronic device further detects an area ratio between the first image area where the target object is located and the first image; in response to the area ratio being greater than a preset threshold, step 601 is executed, and in response to the area ratio not being greater than the preset threshold, the process ends.
  • the first image is an RGB (Red Green Blue) three-channel image.
  • Step 602 The electronic device acquires the second image by replacing the image data of the first image area based on the image data of the second image area.
  • the image data includes information such as the position and pixel value of the pixel in the image.
  • the electronic device removes the image data in the first image area through a mask, and then fills the background of the first image area through the second image area to obtain a second image.
  • this step is implemented through the following steps (1)-(3):
  • the electronic device performs image segmentation on the first image, and determines the area contour corresponding to the first image area.
  • the electronic device divides the first image by using the image segmentation model to obtain an area outline corresponding to the first image area.
  • the image segmentation model is an image segmentation model acquired in advance by the electronic device.
  • the image segmentation model is a mask segmentation model.
  • the area outline is marked in the first image.
  • the electronic device removes the image data within the outline of the area.
  • the electronic device removes the pixel values of the pixel points in the outline of the area, so as to remove the image data in the outline of the area.
  • an image mask of the first image area is thereby obtained. Referring to Fig. 7 and Fig. 8, the image on the left side of Fig. 7 and the image on the left side of Fig. 8 show the mask image of the region outline.
  • the electronic device fills the background in the removed area outline to obtain a second image.
  • step (3) includes: the electronic device inputs the first image after removal into the image completion model to obtain the second image.
  • the image completion model is used to fill the background within the area outline.
  • the electronic device inputs the removed first image into the image completion model, and the image completion model fills the background in the outline of the region based on the image data of the second image area, and the obtained second image is a complete background image.
  • the right image of Figure 7 and the right image of Figure 8 are complete background images.
  • the image completion model determines image features of the second image area and, based on the image features of the second image area, fills the background within the area outline. A non-limiting sketch of this removal-and-completion step is given below.
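  • A non-limiting sketch of step 602 follows. A classical OpenCV inpainting call (cv2.inpaint) stands in here for the learned image completion model described above, and the segmentation mask is assumed to come from the image segmentation model.

```python
# Non-limiting sketch: remove the target-object region and fill it from the background.
import cv2
import numpy as np

def build_background_image(first_image: np.ndarray, region_mask: np.ndarray) -> np.ndarray:
    """first_image: HxWx3 uint8 image; region_mask: HxW uint8, 255 inside the
    area outline of the target object (first image area), 0 in the background."""
    removed = first_image.copy()
    removed[region_mask > 0] = 0                 # remove image data inside the outline
    # fill the removed outline from the surrounding second image area; classical
    # inpainting stands in for the learned image completion model
    return cv2.inpaint(removed, region_mask, 5, cv2.INPAINT_TELEA)

# Hypothetical usage, assuming a segmentation model that labels the target object:
# mask = (segment(first_image) == TARGET_CLASS).astype(np.uint8) * 255
# second_image = build_background_image(first_image, mask)
```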
  • Step 603 The electronic device obtains third depth information of the third image area by filling the depth of the third image area based on the second depth information, where the third image area is the image area corresponding to the first image area in the second image .
  • the electronic device performs depth information diffusion to the third image area based on the second depth information to obtain the third depth information.
  • the diffusion mode is Poisson diffusion mode.
  • the electronic device determines the variation pattern of depth information between adjacent pixels in the second image area, and determines the depth information of each pixel in the third image area based on that variation pattern; or, for each pixel in the third image area, the electronic device determines the depth information of the pixels on the area outline and assigns the determined depth information to that pixel.
  • the electronic device fills the depth of the third image area so that it matches the depth of the second image area, making the generated background more harmonious and the generated three-dimensional image more realistic. A minimal depth-diffusion sketch is given below.
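  • The following is a minimal sketch of the depth-filling idea, assuming a dense depth map and a binary hole mask. A simple iterative (Jacobi-style) diffusion stands in for the Poisson diffusion mentioned above, and image-border wrap-around is ignored for brevity.

```python
# Non-limiting sketch: diffuse background depth into the removed region.
import numpy as np

def diffuse_depth(depth: np.ndarray, hole: np.ndarray, iters: int = 500) -> np.ndarray:
    """depth: HxW float map holding the second depth information outside the hole;
    hole: HxW bool mask, True where the third image area still needs depth."""
    d = depth.astype(np.float64).copy()
    d[hole] = depth[~hole].mean()                # rough initial guess inside the hole
    for _ in range(iters):
        # each hole pixel becomes the mean of its four neighbours while the known
        # background depths stay fixed; border wrap-around is ignored for brevity
        neighbours = (np.roll(d, 1, 0) + np.roll(d, -1, 0) +
                      np.roll(d, 1, 1) + np.roll(d, -1, 1))
        d[hole] = 0.25 * neighbours[hole]
    return d
```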
  • Step 604 The electronic device creates a first three-dimensional model based on the image data of the first image area, where the first three-dimensional model is a three-dimensional model corresponding to the target object.
  • the first three-dimensional model is a three-dimensional model generated based on image data of the first image area.
  • the electronic device creates the first three-dimensional model based on at least one key point of the target object in the first image area. For example, the electronic device identifies at least one key point of the target object in the first image area, and based on the at least one key point, creates a first three-dimensional model through a three-dimensional model generation algorithm.
  • the figure on the right in FIG. 9 is a first three-dimensional model created based on the face image of the figure on the left. For example, if the target object is a face, the at least one key point is a face key point.
  • the three-dimensional model generation algorithm is a 3DMM (3D Morphable Model; 3D, three-dimensional) algorithm, in which case the first three-dimensional model is a mesh model. A simplified key-point meshing sketch is given below.
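  • A full 3DMM fit is beyond the scope of a short example. As a simplified stand-in, the sketch below triangulates a mesh directly over the detected key points using their estimated depths; the key-point detector and the depth map are assumed to exist upstream.

```python
# Non-limiting sketch: a simple mesh over detected key points (stand-in for 3DMM).
import numpy as np
from scipy.spatial import Delaunay

def keypoint_mesh(keypoints_xy: np.ndarray, depth_map: np.ndarray):
    """keypoints_xy: Nx2 (x, y) pixel positions of the target object's key points;
    depth_map: HxW depth estimates (first depth information over the target object)."""
    xs = keypoints_xy[:, 0].astype(int)
    ys = keypoints_xy[:, 1].astype(int)
    zs = depth_map[ys, xs]                          # depth sampled at each key point
    vertices = np.column_stack([keypoints_xy, zs])  # Nx3 (x, y, z) mesh vertices
    triangles = Delaunay(keypoints_xy).simplices    # triangulate in the image plane
    return vertices, triangles
```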
  • Step 605 The electronic device creates a second three-dimensional model based on the depth-filled second image, where the second three-dimensional model is a three-dimensional model corresponding to the background.
  • This step is similar to step 604 and will not be repeated here; a non-limiting back-projection sketch is given below.
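  • As a non-limiting illustration of step 605, the sketch below back-projects every pixel of the depth-filled second image with a pinhole camera model to obtain the vertices of the second three-dimensional model; the focal length and principal point are assumed values.

```python
# Non-limiting sketch: back-project the depth-filled background into 3D points.
import numpy as np

def background_point_cloud(second_image: np.ndarray, depth: np.ndarray,
                           fx: float = 500.0, fy: float = 500.0):
    """second_image: HxWx3 colours; depth: HxW depth map after depth filling."""
    h, w = depth.shape
    cx, cy = w / 2.0, h / 2.0                 # assumed principal point
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth                 # X = (u - cx) * Z / fx
    y = (v - cy) / fy * depth                 # Y = (v - cy) * Z / fy
    points = np.stack([x, y, depth], axis=-1) # HxWx3 camera-space vertices
    return points, second_image               # per-vertex positions and colours
```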
  • Step 606 Based on the first depth information and the third depth information, the electronic device fuses pixel information corresponding to the first three-dimensional model and the second three-dimensional model to obtain a third image.
  • the depth information of the pixels corresponding to the first three-dimensional model in the third image is the first depth information
  • the depth information of the pixels corresponding to the second three-dimensional model in the third image is the third depth information
  • the third image is generated by fusing the first three-dimensional model and the second three-dimensional model, so that the third image contains the three-dimensional target object and the three-dimensional background. This ensures that background holes can be filled when the perspective is changed, while also preventing distortion or missing content at the boundary of the target object, optimizing the image effect of the generated three-dimensional image.
  • the electronic device determines a coordinate system, and fuses the first three-dimensional model and the second three-dimensional model into the coordinate system, so that the depth information of the pixels corresponding to the first three-dimensional model and the second three-dimensional model is based on the coordinate system.
  • the pixel information corresponding to the first three-dimensional model and the second three-dimensional model are respectively assigned to corresponding pixel positions to obtain a third image.
  • the electronic device establishes a coordinate system based on the first three-dimensional model or the second three-dimensional model, maps the other three-dimensional model into that coordinate system, and assigns the pixel information of the first three-dimensional model and the second three-dimensional model to the corresponding pixel positions, respectively, to obtain the third image.
  • the electronic device determines the corresponding pixel positions based on the positions, in the second image, of the key points of the first three-dimensional model or the second three-dimensional model and on the parameter information between the key points of the first three-dimensional model and the second three-dimensional model, respectively.
  • This step is realized through the following steps (A1)-(A5), including:
  • the electronic device determines, from the first three-dimensional model, the depth information of each pixel point of the target object.
  • the depth information of each pixel point is referenced to the depth information of the target key point of the target object, and the target key point is a key point among the at least one key point.
  • the target key point is the pixel point corresponding to the nose in the face image, or the target key point is the center point of the first three-dimensional model.
  • the electronic device selects a target key point from the at least one key point of the target object and determines the depth information of the target key point as the first depth information; the electronic device determines, based on the model parameters of the first three-dimensional model, the depth information of each pixel point relative to the target key point, and then determines the depth information of each pixel point in the first three-dimensional model based on the first depth information of the target key point and the depth information of each pixel point in the first three-dimensional model relative to the target key point. For example, if the first three-dimensional model is a mesh image determined by a 3DMM algorithm, the depth information of each pixel point of the target object is determined based on the parameter information of each pixel point in the mesh image.
  • the electronic device determines the first pixel point based on the target key point
  • the first pixel point is the pixel point corresponding to the target key point in the second three-dimensional model.
  • the first three-dimensional model and the second three-dimensional model are three-dimensional models corresponding to the target object and the background in the first image. Therefore, the first three-dimensional model and the second three-dimensional model can be mapped to the same image coordinate system.
  • the electronic device maps the first three-dimensional model to the second three-dimensional model.
  • the electronic device selects the center point of the second three-dimensional model as the first pixel point, or the electronic device determines the mapping relationship between the first three-dimensional model and the second three-dimensional model based on the first mapping relationship and the second mapping relationship , the first mapping relationship is the mapping relationship between the first three-dimensional model and the first image, the second mapping relationship is the mapping relationship between the second three-dimensional model and the first image, and based on the mapping relationship between the first three-dimensional model and the second three-dimensional model, The first pixel point corresponding to the target key point is determined from the second three-dimensional model.
  • the electronic device assigns pixel information and depth information of the first pixel point.
  • the pixel information of the first pixel point is the pixel information of the target key point in the first three-dimensional model
  • the depth information of the first pixel point is the first depth information of the target key point.
  • the pixel information includes information such as pixel values of pixel points.
  • the electronic device modifies the depth information of the first pixel point in the second three-dimensional model to the first depth information of the target key point, and modifies the pixel information of the first pixel point to the pixel information of the target key point. For example, the electronic device determines the position of the nose in the face in the first three-dimensional model as the target key point, and then determines the depth information of the first pixel point as the first depth information of the nose.
  • the electronic device directly assigns the pixel information of the target key point and the first depth information to the first pixel point. In some embodiments, the electronic device sets a new layer on the second three-dimensional model, and modifies the pixel information and depth information of the first pixel point in the layer to the pixel information and the first depth information of the target key point.
  • In this way, the first three-dimensional model and the second three-dimensional model do not affect each other, while the result still appears as a single integrated whole, thereby optimizing the image effect of the generated three-dimensional image.
  • the electronic device determines the second pixel point based on the positional relationship between the target key point and other pixel points in the target object.
  • the second pixel point is the pixel point corresponding to other pixel points in the second three-dimensional model.
  • the electronic device sets the target key point at the origin of the coordinate system corresponding to the second three-dimensional model, and sets the origin of the coordinate system corresponding to the first three-dimensional model and the second three-dimensional model at the position of the pixel point corresponding to the target key point in the second image.
  • the electronic device assigns the pixel information and depth information of the second pixel point to obtain a third image, and the pixel information of the second pixel point is the pixel information of the other pixel points in the first three-dimensional model, and the depth of the second pixel point The information is the third depth information of the other pixels.
  • This step is similar to step (A3) and will not be repeated here.
  • the electronic device fuses the first three-dimensional model and the second three-dimensional model based on the positional relationship of different pixels in the same image, so that when the perspective is changed, background holes can be filled while the target object is prevented from being distorted or missing at its boundary, optimizing the image quality of the resulting three-dimensional image. A minimal fusion sketch is given below.
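  • A minimal sketch of the fusion in steps (A1)-(A5) follows, assuming the background model is stored as per-pixel depth and colour grids and the foreground vertices carry depths relative to the target key point (for example, the nose tip); these data layouts are assumptions for the example only.

```python
# Non-limiting sketch: write the foreground model into the background model grids.
import numpy as np

def fuse_models(bg_depth, bg_color, fg_pixels, fg_rel_depth, fg_color, key_depth):
    """bg_depth: HxW third depth information; bg_color: HxWx3 background colours;
    fg_pixels: Nx2 integer (x, y) image positions of foreground pixel points;
    fg_rel_depth: N depths relative to the target key point;
    fg_color: Nx3 colours; key_depth: first depth information of the key point."""
    fused_depth, fused_color = bg_depth.copy(), bg_color.copy()
    fg_depth = key_depth + fg_rel_depth          # (A1) depths referenced to the key point
    xs, ys = fg_pixels[:, 0], fg_pixels[:, 1]
    fused_depth[ys, xs] = fg_depth               # (A3)/(A5) assign depth information
    fused_color[ys, xs] = fg_color               # (A3)/(A5) assign pixel information
    return fused_depth, fused_color
```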
  • the electronic device can also add special effects elements to the third image to obtain a fourth image with special effects elements.
  • the process is as follows: the electronic device determines a first coordinate and a second coordinate of the special effect element to be added, where the first coordinate is the position coordinate of the special effect element in the image coordinate system of the third image, and the second coordinate is the depth coordinate of the special effect element in the camera coordinate system of the third image, that is, the coordinate corresponding to the depth information of the special effect element.
  • the electronic device obtains the fourth image by fusing the special effect element to a first target pixel point of the third image based on the first coordinate and the second coordinate, where the first target pixel point is the pixel point whose position coordinate is the first coordinate and whose depth coordinate is the second coordinate.
  • the electronic device converts pixel positions into a camera coordinate system based on the principle of camera imaging.
  • the coordinates in this coordinate system are homogeneous coordinates (X, Y, 1), the depth of a pixel in this coordinate system is the distance estimated from the depth map, and multiplying the homogeneous coordinates (whose depth coordinate is 1) by the depth Z yields the true three-dimensional coordinates (X, Y, Z), which constitute the reconstructed three-dimensional model.
  • the electronic device selects different positions and depths in the three-dimensional image, and places different dynamic effects to obtain a fourth image. For example, referring to Figure 10, place butterfly elements around the face with depths of 1, 2, and 3.5, respectively. This process is similar to (A1)-(A5) in step 606, and will not be repeated here.
  • the electronic device adds special effect elements to the third image based on the depth information, so that the added special effect elements blend more vividly with the third image, optimizing the image effect of the generated three-dimensional image. A placement sketch is given below.
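  • The sketch below illustrates placing a special effect element at a chosen first coordinate and second coordinate, following the homogeneous-coordinate reconstruction described above; the camera intrinsics are assumed values and the effect is reduced to a single coloured point for brevity.

```python
# Non-limiting sketch: place a special effect element at (first coordinate, second coordinate).
import numpy as np

def place_effect(points: np.ndarray, colors: np.ndarray, effect_rgb,
                 first_coord, second_coord: float, fx: float = 500.0, fy: float = 500.0):
    """points/colors: HxWx3 reconstructed scene; first_coord: (u, v) position in the
    image coordinate system; second_coord: depth Z in the camera coordinate system."""
    h, w = points.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    u, v = first_coord
    # homogeneous pixel (u, v, 1) back-projected and scaled by the depth Z
    target_xyz = np.array([(u - cx) / fx * second_coord,
                           (v - cy) / fy * second_coord,
                           second_coord])
    fused_points, fused_colors = points.copy(), colors.copy()
    fused_points[v, u] = target_xyz              # depth coordinate of the effect element
    fused_colors[v, u] = effect_rgb              # pixel information of the effect element
    return fused_points, fused_colors
```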
  • After the electronic device generates the three-dimensional third image, it can also rotate the third image to generate a video.
  • the process is achieved through the following steps (B1)-(B3), including:
  • the electronic device sets the position coordinate corresponding to the target key point of the target object as the coordinate origin of the camera coordinate system corresponding to the third image.
  • the electronic device determines a rotation angle for rotation in the direction corresponding to each coordinate axis of the camera coordinate system.
  • the electronic device determines the rotation angle in the direction corresponding to each coordinate axis.
  • the rotation angle is a preset rotation angle, or the rotation angle is a rotation angle generated based on a rotation instruction.
  • the electronic device obtains a preset display angle, a preset motion speed, and a preset number of display frames of the target key point in each direction; determines a display angle weight based on the preset motion speed and the preset number of display frames; and determines the rotation angle of the direction based on the display angle weight and the preset display angle.
  • For example, the preset display angle in the X (or Y) direction is AmpX (or AmpY), t is the preset display frame number (which can also be expressed in terms of time), and s is the preset motion speed; then each frame is rotated by an angle of AmpX*sin(s*t) around the X axis (or by AmpY*sin(s*t) around the Y axis), where sin(s*t) is the display angle weight.
  • the display trajectory of the third image is determined by the preset motion trajectory, so that the third image can be rotated and displayed along a specified route, avoiding trajectory confusion when a video is generated from the third image. A sketch of the per-frame angles is given below.
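  • A minimal sketch of the preset motion trajectory follows, computed directly from the AmpX*sin(s*t) formula above; the concrete amplitude, speed, and frame-count values in the usage comment are assumptions.

```python
# Non-limiting sketch: per-frame rotation angles of the preset motion trajectory.
import math

def trajectory_angles(amp_x: float, amp_y: float, s: float, num_frames: int):
    angles = []
    for t in range(num_frames):
        weight = math.sin(s * t)            # display angle weight
        angles.append((amp_x * weight,      # rotation around the X axis for frame t
                       amp_y * weight))     # rotation around the Y axis for frame t
    return angles

# Hypothetical values: with s = 2*pi/60 and 60 frames the view sweeps once and
# returns to the initial position, matching steps (B2)-(B3).
# frames = trajectory_angles(amp_x=10.0, amp_y=6.0, s=2 * math.pi / 60, num_frames=60)
```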
  • the electronic device acquires the rotation instruction and, based on the rotation instruction, determines the rotation angle of each direction from the rotation angle corresponding to the rotation instruction and the preset display angle.
  • the rotation instruction is an instruction input by the user through the screen received by the electronic device, or the rotation instruction is an instruction generated by an angle sensor in the electronic device.
  • the electronic device receives a gesture operation input by the user, and determines the rotation angle based on the gesture operation. In other embodiments, the electronic device determines the current tilt angle of the electronic device through an angle sensor, and determines the tilt angle as the rotation angle.
  • the electronic device obtains the quaternion attitude of the gyroscope based on the attitude of the electronic device, calculates the inclination angles x_angle and y_angle about the X axis and the Y axis, rotates by the angle min(x_angle, AmpX) around the X axis, and then rotates by the angle min(y_angle, AmpY) around the Y axis, as sketched below.
  • the electronic device determines the motion trajectory of the third image based on the received rotation instruction, so that the motion trajectory of the third image is more flexible.
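  • The sketch below derives clamped rotation angles from a gyroscope quaternion attitude, as in the rotation-instruction branch above; the standard quaternion-to-Euler conversion and the mapping of roll/pitch to x_angle/y_angle are assumptions for the example.

```python
# Non-limiting sketch: clamped rotation angles from a gyroscope quaternion (w, x, y, z).
import math

def clamped_tilt_angles(q, amp_x: float, amp_y: float):
    w, x, y, z = q
    # standard quaternion -> Euler conversion: roll about X, pitch about Y
    x_angle = math.degrees(math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y)))
    y_angle = math.degrees(math.asin(max(-1.0, min(1.0, 2 * (w * y - z * x)))))
    # rotate by at most the preset display angle in each direction, as in
    # min(x_angle, AmpX) / min(y_angle, AmpY) above
    return (math.copysign(min(abs(x_angle), amp_x), x_angle),
            math.copysign(min(abs(y_angle), amp_y), y_angle))
```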
  • the electronic device rotates the pixels in the third image based on the rotation angle to generate a video.
  • the electronic device translates the origin of the coordinate system to that pixel point, rotates the pixels in the third image based on the pixel point and the rotation angle, and obtains the video.
  • the target key point moves according to the above motion trajectory and finally returns to the initial position; repeating the above (B2)-(B3) yields a three-dimensional dynamic video.
  • generating a three-dimensional dynamic video from the third image based on the motion trajectory enriches the way in which the image is displayed.
  • the second image is obtained by background filling and depth filling of the first image, and the second image is fused with the first image area where the target object is located in the first image to obtain the third image; thus, when the perspective of the third image changes, background holes can be filled while distortion or missing content at the boundary of the target object is prevented, optimizing the image effect of the generated image.
  • Fig. 11 is a block diagram of an image generation apparatus according to an exemplary embodiment.
  • the device includes:
  • the first determining unit 1101 is configured to determine the first depth information of the first image area and the second depth information of the second image area in the first image, where the first image area is the image area where the target object is located, the The second image area is the image area where the background is located;
  • a replacement unit 1102 configured to acquire a second image by replacing the image data of the first image area based on the image data of the second image area;
  • the filling unit 1103 is configured to obtain third depth information of the third image area by filling the depth of the third image area based on the second depth information, where the third image area is the image area in the second image corresponding to the first image area;
  • the first fusion unit 1104 is configured to, based on the first depth information and the third depth information, fuse the image data in the first image area into the depth-filled second image to obtain the third image.
  • the first fusion unit 1104 includes:
  • a first creation subunit configured to create a first three-dimensional model based on the image data of the first image area, where the first three-dimensional model is a three-dimensional model corresponding to the target object;
  • a second creation subunit configured to create a second three-dimensional model based on the depth-filled second image, where the second three-dimensional model is a three-dimensional model corresponding to the background;
  • a fusion subunit configured to fuse pixel information corresponding to the first three-dimensional model and the second three-dimensional model based on the first depth information and the third depth information, to obtain the third image, wherein,
  • the depth information of the pixels corresponding to the first three-dimensional model in the third image is the first depth information
  • the depth information of the pixels corresponding to the second three-dimensional model in the third image is the third depth information.
  • the fusion subunit is configured to: determine, from the first three-dimensional model, the depth information of each pixel point of the target object, where the depth information of each pixel point is referenced to the depth information of the target key point of the target object, and the target key point is a key point of the target object; determine a first pixel point based on the target key point, the first pixel point being the pixel point corresponding to the target key point in the second three-dimensional model; assign the pixel information and depth information of the first pixel point, where the pixel information of the first pixel point is the pixel information of the target key point in the first three-dimensional model,
  • and the depth information of the first pixel point is the first depth information of the target key point; determine a second pixel point based on the positional relationship between the target key point and other pixel points in the target object, the second pixel point being the pixel point corresponding to the other pixel points in the second three-dimensional model; and assign the pixel information and depth information of the second pixel point to obtain the third image, where
  • the pixel information of the second pixel point is the pixel information of the other pixel points in the first three-dimensional model, and the depth information of the second pixel point is the third depth information of the other pixel points.
  • the replacement unit 1102 includes:
  • a segmentation subunit configured to perform image segmentation on the first image, and determine an area outline corresponding to the first image area
  • a removal subunit configured to remove image data within the outline of the region
  • the completion sub-unit is configured to fill the background in the region outline after removal to obtain the second image.
  • the completion subunit is configured to input the first image after removal into an image completion model to obtain the second image, and the image completion model is used to fill the background within the area outline.
  • the first determination unit is configured to input the first image into a first depth determination model to obtain the first depth information and the second depth information.
  • the first depth determination model includes a feature extraction layer, a feature map generation layer, a feature fusion layer, and a depth determination layer;
  • the first determining unit 1101 includes:
  • a feature extraction subunit configured to input the first image to the feature extraction layer, and extract multiple layers of features of the first image through the feature extraction layer to obtain a plurality of image features of the first image;
  • sampling subunit configured to sample the plurality of image features through the feature map generation layer to obtain a plurality of feature maps of different scales
  • a feature fusion subunit configured to fuse the plurality of feature maps through the feature fusion layer to obtain a fused feature map
  • the convolution subunit is configured to obtain the first depth information and the second depth information by convolution processing the fused feature map through the depth determination layer.
  • the apparatus further includes:
  • a third determining unit is configured to determine a first coordinate and a second coordinate of the special effect element to be added, where the first coordinate is the position coordinate of the special effect element in the image coordinate system of the third image, the The second coordinate is the depth coordinate of the special effect element in the camera coordinate system of the third image;
  • the second fusion unit is configured to obtain a fourth image by fusing the special effect element to the first target pixel point of the third image based on the first coordinate and the second coordinate.
  • the first target pixel point is a pixel point whose position coordinate is the first coordinate and whose depth coordinate is the second coordinate.
  • the apparatus further includes:
  • a generating unit configured to rotate the third image to generate a video.
  • the generating unit includes:
  • a coordinate setting subunit configured to set the position coordinate corresponding to the target key point of the target object as the coordinate origin of the camera coordinate system corresponding to the third image
  • a determination subunit configured to determine a rotation angle to rotate in a direction corresponding to each coordinate axis of the camera coordinate system
  • a generating subunit is configured to rotate pixels in the third image based on the rotation angle to generate a video.
  • the determining subunit is configured to obtain a preset display angle, a preset motion speed and a preset number of display frames of the target key point in each direction; based on the preset motion speed and The display angle weight is determined by the preset display frame number; the rotation angle of the direction is determined based on the display angle weight and the preset display angle.
  • the second image is obtained by background filling and depth filling of the first image, and the second image is fused with the first image area where the target object is located in the first image to obtain the third image; thus, when the perspective of the third image changes, background holes can be filled while distortion or missing content at the boundary of the target object is prevented, optimizing the image effect of the generated image.
  • When the image generation apparatus provided by the above embodiments generates an image, the division into the above functional modules is only used as an example for description.
  • In practical applications, the above functions can be allocated to different functional modules as required,
  • that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the image generating apparatus and the image generating method embodiments provided by the above embodiments belong to the same concept, and the specific implementation process thereof is detailed in the method embodiments, which will not be repeated here.
  • Fig. 12 is a block diagram of an apparatus for training a depth determination model according to an exemplary embodiment.
  • the device includes:
  • the acquiring unit 1201 is configured to acquire a plurality of first image sets, each of which corresponds to an image scene;
  • the second determining unit 1202 is configured to, for each first image set, determine a sampling weight of the first image set based on a first quantity and a second quantity, where the first quantity is the number of sample images included in the first image set, the second quantity is the total number of sample images included in the plurality of first image sets, the sampling weight is positively correlated with the second quantity, and the sampling weight is negatively correlated with the first quantity;
  • a sampling unit 1203, configured to sample the first image set based on the sampling weight to obtain a second image set
  • the model training unit 1204 is configured to train a second depth determination model based on the plurality of second image sets to obtain the first depth determination model.
  • the sampling weight is determined based on the first quantity and the second quantity, where the first quantity is the number of sample images in the first image set and the second quantity is the total number of sample images in the plurality of first image sets.
  • Therefore, when each first image set is sampled based on its sampling weight, the number of sample images drawn from each first image set can be controlled, preventing first image sets that contain more sample images from dominating and keeping the sample images balanced across image scenes. A weighting sketch is given below.
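  • A minimal sketch of this balanced sampling follows; the concrete weight formula (total count divided by per-set count) is an assumption that satisfies the stated positive and negative correlations, not necessarily the formula used in the embodiment.

```python
# Non-limiting sketch: balanced sampling of the first image sets.
import random

def sampling_weights(set_sizes):
    total = sum(set_sizes)                        # second quantity
    return [total / size for size in set_sizes]   # larger sets get smaller weights

def draw_second_sets(first_sets, total_draws: int = 1000, seed: int = 0):
    rng = random.Random(seed)
    weights = sampling_weights([len(s) for s in first_sets])
    # the expected share of each set is proportional to weight * set size, which is
    # identical for every set, so every image scene contributes about equally
    share = [w * len(s) for w, s in zip(weights, first_sets)]
    norm = sum(share)
    return [[rng.choice(images) for _ in range(max(1, round(total_draws * sh / norm)))]
            for images, sh in zip(first_sets, share)]
```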
  • When the apparatus for training a depth determination model provided by the above embodiments trains the depth determination model, the division into the above functional modules is only used as an example for description.
  • In practical applications, the above functions can be allocated to different functional modules as required,
  • that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
  • the apparatus for training a depth determination model provided by the above embodiments and the embodiments of the training method for a depth determination model belong to the same concept, and the specific implementation process is detailed in the method embodiment, which will not be repeated here.
  • FIG. 13 shows a structural block diagram of an electronic device 1300 provided by an exemplary embodiment of the present disclosure.
  • the electronic device 1300 is a portable mobile terminal, such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer, or a desktop computer.
  • Electronic device 1300 may also be called user equipment, portable terminal, laptop terminal, desktop terminal, and the like by other names.
  • the electronic device 1300 includes: a processor 1301 and a memory 1302 .
  • the processor 1301 includes one or more processing cores, such as a 4-core processor, an 8-core processor, and the like. In some embodiments, the processor 1301 is implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), and PLA (Programmable Logic Array). In some embodiments, the processor 1301 also includes a main processor and a coprocessor; the main processor is a processor for processing data in a wake-up state, also referred to as a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state.
  • the processor 1301 is integrated with a GPU (Graphics Processing Unit, image processor), and the GPU is used for rendering and drawing the content that needs to be displayed on the display screen.
  • the processor 1301 further includes an AI (Artificial Intelligence, artificial intelligence) processor, where the AI processor is used to process computing operations related to machine learning.
  • memory 1302 includes one or more computer-readable storage media that are non-transitory. In some embodiments, memory 1302 also includes high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 1302 is used to store at least one instruction, and the at least one instruction is executed by the processor 1301 to implement the image generation method provided by the method embodiments of the present disclosure.
  • the electronic device 1300 may also optionally include: a peripheral device interface 1303 and at least one peripheral device.
  • the processor 1301, the memory 1302 and the peripheral device interface 1303 are connected by a bus or a signal line.
  • each peripheral device is connected to the peripheral device interface 1303 through a bus, signal line or circuit board.
  • the peripheral device includes at least one of: a radio frequency circuit 1304 , a display screen 1305 , a camera assembly 1306 , an audio circuit 1307 , a positioning assembly 1308 , and a power supply 1309 .
  • the peripheral device interface 1303 may be used to connect at least one peripheral device related to I/O (Input/Output) to the processor 1301 and the memory 1302 .
  • processor 1301, memory 1302, and peripheral interface 1303 are integrated on the same chip or circuit board; in some other embodiments, any one or two of processor 1301, memory 1302, and peripheral interface 1303 are implemented on a separate chip or circuit board, which is not limited by the embodiments of the present disclosure.
  • the radio frequency circuit 1304 is used for receiving and transmitting RF (Radio Frequency, radio frequency) signals, also called electromagnetic signals.
  • the radio frequency circuit 1304 communicates with communication networks and other communication devices via electromagnetic signals.
  • the radio frequency circuit 1304 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals.
  • radio frequency circuitry 1304 includes: an antenna system, an RF transceiver, one or more amplifiers, tuners, oscillators, digital signal processors, codec chipsets, subscriber identity module cards, and the like.
  • radio frequency circuitry 1304 communicates with other terminals via at least one wireless communication protocol.
  • the wireless communication protocol includes but is not limited to: World Wide Web, Metropolitan Area Network, Intranet, various generations of mobile communication networks (2G, 3G, 4G and 5G), wireless local area network and/or WiFi (Wireless Fidelity, Wireless Fidelity) network.
  • the radio frequency circuit 1304 further includes a circuit related to NFC (Near Field Communication, short-range wireless communication), which is not limited in the present disclosure.
  • the display screen 1305 is used to display UI (User Interface, user interface).
  • the UI includes graphics, text, icons, video, and any combination thereof.
  • the display screen 1305 also has the ability to acquire touch signals on or above the surface of the display screen 1305 .
  • the touch signal is input to the processor 1301 as a control signal for processing.
  • the display screen 1305 is also used to provide virtual buttons and/or virtual keyboards, also called soft buttons and/or soft keyboards.
  • the display screen 1305 is a flexible display screen disposed on a curved or folded surface of the electronic device 1300. Furthermore, the display screen 1305 can also be set as a non-rectangular irregular figure, that is, a special-shaped screen.
  • the display screen 1305 is made of materials such as LCD (Liquid Crystal Display, liquid crystal display), OLED (Organic Light-Emitting Diode, organic light emitting diode).
  • the camera assembly 1306 is used to capture images or video.
  • camera assembly 1306 includes a front-facing camera and a rear-facing camera.
  • the front camera is arranged on the front panel of the terminal, and the rear camera is arranged on the back of the terminal.
  • there are at least two rear cameras, each of which is any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize the background blur function, or the main camera and the wide-angle camera are fused to realize panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions.
  • the camera assembly 1306 also includes a flash.
  • the flash is a single color temperature flash, and in some embodiments, the flash is a dual color temperature flash. Dual color temperature flash refers to the combination of warm light flash and cold light flash, which is used for light compensation under different color temperatures.
  • the audio circuit 1307 includes a microphone and a speaker.
  • the microphone is used to collect the sound waves of the user and the environment, convert the sound waves into electrical signals, and input them to the processor 1301 for processing, or to the radio frequency circuit 1304 to realize voice communication.
  • there are multiple microphones which are respectively disposed in different parts of the electronic device 1300 .
  • the microphones are array microphones or omnidirectional collection microphones.
  • the speaker is used to convert the electrical signal from the processor 1301 or the radio frequency circuit 1304 into sound waves.
  • the loudspeaker is a conventional thin-film loudspeaker, and in some embodiments, the loudspeaker is a piezoelectric ceramic loudspeaker.
  • when the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as distance measurement.
  • the audio circuit 1307 also includes a headphone jack.
  • the positioning component 1308 is used to locate the current geographic location of the electronic device 1300 to implement navigation or LBS (Location Based Service).
  • the positioning component 1308 is a positioning component based on the GPS (Global Positioning System) of the United States, the Beidou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
  • Power supply 1309 is used to power various components in electronic device 1300 .
  • the power source 1309 is alternating current, direct current, a disposable battery, or a rechargeable battery.
  • the rechargeable battery is a wired rechargeable battery or a wireless rechargeable battery. Wired rechargeable batteries are batteries that are charged through wired lines, and wireless rechargeable batteries are batteries that are charged through wireless coils. The rechargeable battery is also used to support fast charging technology.
  • the electronic device 1300 also includes one or more sensors 1310 .
  • the one or more sensors 1310 include, but are not limited to, an acceleration sensor 1311 , a gyro sensor 1312 , a pressure sensor 1313 , a fingerprint sensor 1314 , an optical sensor 1315 and a proximity sensor 1316 .
  • the acceleration sensor 1311 detects the magnitude of acceleration on the three coordinate axes of the coordinate system established by the electronic device 1300 .
  • the acceleration sensor 1311 is used to detect the components of the gravitational acceleration on the three coordinate axes.
  • the processor 1301 controls the display screen 1305 to display the user interface in a landscape view or a portrait view based on the gravitational acceleration signal collected by the acceleration sensor 1311 .
  • the acceleration sensor 1311 is also used for game or user movement data collection.
  • the gyroscope sensor 1312 detects the body direction and rotation angle of the electronic device 1300 , and the gyroscope sensor 1312 cooperates with the acceleration sensor 1311 to collect 3D actions of the user on the electronic device 1300 .
  • the processor 1301 can implement the following functions: motion sensing (such as changing the UI based on the user's tilt operation), image stabilization during shooting, game control, and inertial navigation.
  • the pressure sensor 1313 is disposed on the side frame of the electronic device 1300 and/or the lower layer of the display screen 1305 .
  • the pressure sensor 1313 can detect the user's holding signal of the electronic device 1300 , and the processor 1301 performs left and right hand recognition or quick operation based on the holding signal collected by the pressure sensor 1313 .
  • the processor 1301 controls the operability controls on the UI interface based on the user's pressure operation on the display screen 1305.
  • the operability controls include at least one of button controls, scroll bar controls, icon controls, and menu controls.
  • the fingerprint sensor 1314 is used to collect the user's fingerprint, and the processor 1301 identifies the user's identity based on the fingerprint collected by the fingerprint sensor 1314, or the fingerprint sensor 1314 identifies the user's identity based on the collected fingerprint. When the user's identity is identified as a trusted identity, the processor 1301 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, and changing settings.
  • the fingerprint sensor 1314 is disposed on the front, back, or side of the electronic device 1300 . When the electronic device 1300 is provided with physical buttons or a manufacturer's logo, the fingerprint sensor 1314 is integrated with the physical buttons or the manufacturer's logo.
  • Optical sensor 1315 is used to collect ambient light intensity.
  • the processor 1301 controls the display brightness of the display screen 1305 based on the ambient light intensity collected by the optical sensor 1315 . In some embodiments, when the ambient light intensity is high, the display brightness of the display screen 1305 is increased; when the ambient light intensity is low, the display brightness of the display screen 1305 is decreased. In another embodiment, the processor 1301 further dynamically adjusts the shooting parameters of the camera assembly 1306 based on the ambient light intensity collected by the optical sensor 1315 .
  • Proximity sensor 1316 also referred to as a distance sensor, is typically provided on the front panel of electronic device 1300 .
  • Proximity sensor 1316 is used to collect the distance between the user and the front of electronic device 1300 .
  • when the proximity sensor 1316 detects that the distance between the user and the front of the electronic device 1300 gradually decreases, the processor 1301 controls the display screen 1305 to switch from the bright-screen state to the off-screen state; when the proximity sensor 1316 detects that the distance between the user and the front of the electronic device 1300 gradually increases, the processor 1301 controls the display screen 1305 to switch from the off-screen state to the bright-screen state.
  • the structure shown in FIG. 13 does not constitute a limitation on the electronic device 1300, which can include more or fewer components than shown, combine some components, or adopt a different component arrangement.
  • a computer-readable storage medium in which at least one piece of program code is stored, and at least one piece of program code is loaded and executed by a server to implement the image generation method in the above embodiment.
  • a computer-readable storage medium stores at least one piece of program code, and the at least one piece of program code is loaded and executed by a server to implement the training method for a depth determination model in the above embodiment.
  • the computer-readable storage medium is a memory.
  • the computer-readable storage medium is ROM (Read-Only Memory, read-only memory), RAM (Random Access Memory, random access memory), CD-ROM (Compact Disc Read-Only Memory, compact disc read-only storage) devices), magnetic tapes, floppy disks, and optical data storage devices, etc.
  • a computer program product or computer program is provided, comprising computer program code stored in a computer-readable storage medium; a processor of a computer device reads the computer program code from the computer-readable storage medium and executes it, so that the computer device performs the operations performed in the image generation method described above.
  • a computer program product or computer program is provided, comprising computer program code stored in a computer-readable storage medium; a processor of a computer device reads the computer program code from the computer-readable storage medium and executes it, so that the computer device performs the operations performed in the above-mentioned training method for a depth determination model.


Abstract

Disclosed are an image generation method and an electronic device, relating to the technical field of image processing. The method comprises: determining first depth information of a first image area and second depth information of a second image area in a first image, the first image area being the image area where a target object is located and the second image area being the image area where a background is located; obtaining a second image by replacing image data of the first image area on the basis of image data of the second image area; obtaining third depth information of a third image area by filling the depth of the third image area on the basis of the second depth information; and obtaining a third image by fusing, on the basis of the first depth information and the third depth information, the image data of the first image area into the depth-filled second image.
PCT/CN2021/106178 2020-09-10 2021-07-14 Procédé de génération d'image et dispositif électronique WO2022052620A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010947268.1A CN114170349A (zh) 2020-09-10 2020-09-10 图像生成方法、装置、电子设备及存储介质
CN202010947268.1 2020-09-10

Publications (1)

Publication Number Publication Date
WO2022052620A1 true WO2022052620A1 (fr) 2022-03-17

Family

ID=80475637

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/106178 WO2022052620A1 (fr) 2020-09-10 2021-07-14 Procédé de génération d'image et dispositif électronique

Country Status (2)

Country Link
CN (1) CN114170349A (fr)
WO (1) WO2022052620A1 (fr)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115205161B (zh) * 2022-08-18 2023-02-21 荣耀终端有限公司 一种图像处理方法及设备


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101271583A (zh) * 2008-04-28 2008-09-24 清华大学 一种基于深度图的快速图像绘制方法
US20120033852A1 (en) * 2010-08-06 2012-02-09 Kennedy Michael B System and method to find the precise location of objects of interest in digital images
CN102307312A (zh) * 2011-08-31 2012-01-04 四川虹微技术有限公司 一种对dibr技术生成的目标图像进行空洞填充的方法
CN102592275A (zh) * 2011-12-16 2012-07-18 天津大学 虚拟视点绘制方法
CN111222440A (zh) * 2019-12-31 2020-06-02 江西开心玉米网络科技有限公司 一种人像背景分离方法、装置、服务器及存储介质

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115334239A (zh) * 2022-08-10 2022-11-11 青岛海信移动通信技术股份有限公司 前后摄像头拍照融合的方法、终端设备和存储介质
CN115334239B (zh) * 2022-08-10 2023-12-15 青岛海信移动通信技术有限公司 前后摄像头拍照融合的方法、终端设备和存储介质
CN116543075A (zh) * 2023-03-31 2023-08-04 北京百度网讯科技有限公司 图像生成方法、装置、电子设备及存储介质
CN116543075B (zh) * 2023-03-31 2024-02-13 北京百度网讯科技有限公司 图像生成方法、装置、电子设备及存储介质
CN116704129A (zh) * 2023-06-14 2023-09-05 维坤智能科技(上海)有限公司 基于全景图的三维图像生成方法、装置、设备及存储介质
CN116704129B (zh) * 2023-06-14 2024-01-30 维坤智能科技(上海)有限公司 基于全景图的三维图像生成方法、装置、设备及存储介质
CN117422848A (zh) * 2023-10-27 2024-01-19 神力视界(深圳)文化科技有限公司 三维模型的分割方法及装置
CN117197003A (zh) * 2023-11-07 2023-12-08 杭州灵西机器人智能科技有限公司 一种多条件控制的纸箱样本生成方法
CN117197003B (zh) * 2023-11-07 2024-02-27 杭州灵西机器人智能科技有限公司 一种多条件控制的纸箱样本生成方法

Also Published As

Publication number Publication date
CN114170349A (zh) 2022-03-11


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21865676

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 27.06.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21865676

Country of ref document: EP

Kind code of ref document: A1