WO2020211573A1 - Method and device for processing images - Google Patents

Method and device for processing images

Info

Publication number
WO2020211573A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
shadow
illumination
network
target object
Prior art date
Application number
PCT/CN2020/078582
Other languages
English (en)
Chinese (zh)
Inventor
王光伟
Original Assignee
北京字节跳动网络技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2020211573A1

Links

Images

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 - Image enhancement or restoration
    • G06T5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20081 - Training; Learning
    • G06T2207/20084 - Artificial neural networks [ANN]
    • G06T2207/20212 - Image combination
    • G06T2207/20221 - Image fusion; Image merging

Definitions

  • the embodiments of the present disclosure relate to the field of computer technology, and more particularly to methods and devices for processing images.
  • The virtual object image to be added to the real scene image is usually an image preset by a technician according to the shape of the virtual object.
  • the embodiments of the present disclosure propose methods and devices for processing images.
  • An embodiment of the present disclosure provides a method for processing an image, the method including: acquiring a target object illumination image and a target virtual object image, where the target object illumination image includes an object image and a shadow image corresponding to the object image; inputting the target object illumination image into a pre-trained shadow extraction model to obtain a resulting shadow image including distance information, where the distance information is used to characterize the distance, in the target object illumination image, between a pixel point of the shadow image and the pixel point corresponding to the object image; generating, based on the resulting shadow image, illumination direction information corresponding to the target object illumination image; generating, based on the illumination direction information, a virtual object illumination image corresponding to the target virtual object image, where the illumination direction corresponding to the virtual shadow image in the virtual object illumination image matches the illumination direction indicated by the illumination direction information; and fusing the virtual object illumination image and the target object illumination image to add the virtual object illumination image to the target object illumination image and obtain a result image.
  • generating the illumination direction information corresponding to the target object illumination image includes: inputting the resulting shadow image into a pre-trained illumination direction recognition model to obtain the illumination direction information.
  • the distance information is the pixel value of the pixel in the resulting shadow image.
  • The shadow extraction model is obtained by training through the following steps: obtaining a preset training sample set, where a training sample includes a sample object illumination image and a sample result shadow image predetermined for the sample object illumination image; obtaining a pre-established generative adversarial network, where the generative adversarial network includes a generation network and a discrimination network, the generation network is used to recognize an input object illumination image and output a resulting shadow image, and the discrimination network is used to determine whether an input image is an image output by the generation network; and, based on a machine learning method, using the sample object illumination image included in a training sample in the training sample set as the input of the generation network, using the resulting shadow image output by the generation network and the sample result shadow image corresponding to the input sample object illumination image as the input of the discrimination network, training the generation network and the discrimination network, and determining the trained generation network as the shadow extraction model.
  • the method further includes: displaying the obtained result image.
  • the method further includes: sending the obtained result image to a user terminal connected in communication, and controlling the user terminal to display the result image.
  • an embodiment of the present disclosure provides an apparatus for processing an image.
  • The apparatus includes: an image acquisition unit configured to acquire a target object illumination image and a target virtual object image, where the target object illumination image includes an object image and a shadow image corresponding to the object image; an image input unit configured to input the target object illumination image into a pre-trained shadow extraction model to obtain a resulting shadow image including distance information, where the distance information is used to characterize the distance, in the target object illumination image, between a pixel point of the shadow image and the pixel point corresponding to the object image; an information generation unit configured to generate, based on the resulting shadow image, illumination direction information corresponding to the target object illumination image; an image generation unit configured to generate, based on the illumination direction information, a virtual object illumination image corresponding to the target virtual object image, where the illumination direction corresponding to the virtual shadow image in the virtual object illumination image matches the illumination direction indicated by the illumination direction information; and an image fusion unit configured to fuse the virtual object illumination image and the target object illumination image to add the virtual object illumination image to the target object illumination image and obtain a result image.
  • the information generating unit is further configured to: input the resulting shadow image into a pre-trained light direction recognition model to obtain light direction information.
  • the distance information is the pixel value of the pixel in the resulting shadow image.
  • The shadow extraction model is obtained by training through the following steps: obtaining a preset training sample set, where a training sample includes a sample object illumination image and a sample result shadow image predetermined for the sample object illumination image; obtaining a pre-established generative adversarial network, where the generative adversarial network includes a generation network and a discrimination network, the generation network is used to recognize an input object illumination image and output a resulting shadow image, and the discrimination network is used to determine whether an input image is an image output by the generation network; and, based on a machine learning method, using the sample object illumination image included in a training sample in the training sample set as the input of the generation network, using the resulting shadow image output by the generation network and the sample result shadow image corresponding to the input sample object illumination image as the input of the discrimination network, training the generation network and the discrimination network, and determining the trained generation network as the shadow extraction model.
  • the device further includes: an image display unit configured to display the obtained result image.
  • the device further includes: an image sending unit configured to send the obtained result image to a user terminal connected in communication, and control the user terminal to display the result image.
  • The embodiments of the present disclosure provide an electronic device, including: one or more processors; and a storage device on which one or more programs are stored, where when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any one of the foregoing embodiments of the method for processing an image.
  • the embodiments of the present disclosure provide a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, the method of any one of the above methods for processing an image is implemented.
  • The method and device for processing an image provided by the embodiments of the present disclosure acquire a target object illumination image and a target virtual object image, where the target object illumination image includes an object image and a shadow image corresponding to the object image; then input the target object illumination image into a pre-trained shadow extraction model to obtain a resulting shadow image including distance information, where the distance information is used to characterize the distance, in the target object illumination image, between a pixel point of the shadow image and the pixel point corresponding to the object image; then generate, based on the resulting shadow image, illumination direction information corresponding to the target object illumination image; then generate, based on the illumination direction information, a virtual object illumination image corresponding to the target virtual object image, where the illumination direction corresponding to the virtual shadow image in the virtual object illumination image matches the illumination direction indicated by the illumination direction information; and finally fuse the virtual object illumination image and the target object illumination image to add the virtual object illumination image to the target object illumination image and obtain a result image. In this way, a shadow image corresponding to the target virtual object image can be generated, so that the virtual object image can be better integrated into the target object illumination image, which improves the realism of the result image and helps to improve the display effect of the image.
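  • The end-to-end flow summarized above can be sketched as follows. This is only an illustrative outline: the helper functions named here (shadow_extraction_model, estimate_light_direction, render_virtual_object, fuse) are hypothetical placeholders for the components discussed in the detailed description, not APIs defined by the disclosure.

```python
def process_image(target_illum_img, target_virtual_obj_img):
    # 1. Extract the resulting shadow image (with per-pixel distance information)
    #    from the target object illumination image using the pre-trained model.
    resulting_shadow = shadow_extraction_model(target_illum_img)
    # 2. Derive the illumination direction information from the resulting shadow image.
    light_dir = estimate_light_direction(resulting_shadow)
    # 3. Render the target virtual object image with a virtual shadow whose
    #    illumination direction matches the estimated direction.
    virtual_illum_img = render_virtual_object(target_virtual_obj_img, light_dir)
    # 4. Fuse the virtual object illumination image into the target object
    #    illumination image to obtain the result image.
    return fuse(target_illum_img, virtual_illum_img)
```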
  • FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present disclosure can be applied
  • Fig. 2 is a flowchart of an embodiment of a method for processing an image according to the present disclosure
  • FIG. 3 is a schematic diagram of an application scenario of the method for processing images according to an embodiment of the present disclosure
  • FIG. 4 is a flowchart of another embodiment of a method for processing an image according to the present disclosure.
  • Fig. 5 is a schematic structural diagram of an embodiment of an image processing apparatus according to the present disclosure.
  • Fig. 6 is a schematic structural diagram of a computer system suitable for implementing an electronic device of an embodiment of the present disclosure.
  • FIG. 1 shows an exemplary system architecture 100 to which an embodiment of the method for processing images or the apparatus for processing images of the present disclosure can be applied.
  • the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105.
  • the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and so on.
  • Various communication client applications such as image processing applications, web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software, can be installed on the terminal devices 101, 102, and 103.
  • the terminal devices 101, 102, and 103 may be hardware or software.
  • If the terminal devices 101, 102, and 103 are hardware, they can be various electronic devices with cameras, including but not limited to smart phones, tablets, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and the like.
  • If the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple software or software modules (for example, multiple software or software modules for providing distributed services), or as a single software or software module. There is no specific limitation here.
  • the server 105 may be a server that provides various services, for example, an image processing server that processes the illumination images of the target object obtained by shooting the terminal devices 101, 102, and 103.
  • the image processing server can analyze and process the received data such as the illumination image of the target object, and obtain the processing result (for example, the result image).
  • the server can also feed back the obtained processing result to the terminal device.
  • The method for processing images provided by the embodiments of the present disclosure can be executed by the server 105, and can also be executed by the terminal devices 101, 102, 103. Accordingly, the device for processing images can be set in the server 105, and can also be set in the terminal devices 101, 102, 103.
  • the server can be hardware or software.
  • the server can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • If the server is software, it can be implemented as multiple software or software modules (for example, multiple software or software modules for providing distributed services), or as a single software or software module. There is no specific limitation here.
  • The numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • the above system architecture may not include the network, but only include the terminal device or the server.
  • the method for processing images includes the following steps:
  • Step 201 Obtain a target object illumination image and a target virtual object image.
  • the execution subject of the method for processing images may remotely or locally obtain the target object illumination image and the target virtual object image through a wired connection or a wireless connection.
  • the illumination image of the target object is the image to be processed.
  • the illumination image of the target object includes the object image and the shadow image corresponding to the object image.
  • the illuminated image of the target object may be an image obtained by shooting an object in the illuminated scene.
  • the light source in the illumination scene in which the illumination image of the target object is captured is parallel light or sunlight. It can be understood that in an illuminated scene, when an object blocks the light source, shadows will be generated.
  • the target virtual object image is an image used to process the illumination image of the target object.
  • the target virtual object image may be an image predetermined according to the shape of the virtual object. Specifically, it may be a pre-drawn image, or it may be an image pre-extracted from an existing image according to the contour of the object.
  • the "virtual" of the target virtual object image is relative to the target object illumination image, which means that the virtual object corresponding to the target virtual object image does not actually exist in the target virtual object image. Objects in the real scene of the illuminated image.
  • Step 202 Input the illumination image of the target object into a pre-trained shadow extraction model to obtain a resultant shadow image including distance information.
  • the above-mentioned execution subject may input the target object illumination image into a pre-trained shadow extraction model to obtain a resultant shadow image including distance information.
  • the resulting shadow image may be a shadow image extracted from the illumination image of the target object and added with distance information.
  • The distance information is used to characterize the distance, in the target object illumination image, between a pixel point of the shadow image and the pixel point corresponding to the object image. Specifically, because a point on the object blocks the light source, a shadow point is generated on a projection surface (such as the ground, a wall, or a desktop). A shadow point in the shadow corresponds to a pixel point in the shadow image; accordingly, for each pixel point in the shadow image, the pixel point corresponding to the object point that generated the corresponding shadow point can be regarded as the pixel point in the object image corresponding to that pixel point of the shadow image.
  • the distance information can be embodied in the resulting shadow image in various forms.
  • the distance information can be recorded in the resulting shadow image in digital form.
  • Each pixel in the resulting shadow image may correspond to a number, and the number may be the distance between that pixel and the corresponding pixel in the object image.
  • the distance information may be the pixel value of the pixel in the resulting shadow image.
  • various ways can be used to characterize the distance using pixel values. As an example, the larger the pixel value, the longer the distance; or the smaller the pixel value, the longer the distance.
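  • As an illustration of encoding the distance information as pixel values, the following sketch builds a resulting shadow image from a shadow mask and an object mask; it assumes NumPy and SciPy are available and adopts the convention that a larger pixel value means a longer distance, which is only one of the options mentioned above.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def encode_result_shadow(shadow_mask, object_mask):
    """Build a resulting shadow image whose pixel values encode, for every
    shadow pixel, the distance to the nearest object pixel."""
    # Distance from every pixel to the nearest object pixel.
    dist = distance_transform_edt(~object_mask)
    result = np.zeros(shadow_mask.shape, dtype=np.uint8)
    d = dist[shadow_mask]
    if d.size:
        scale = d.max() if d.max() > 0 else 1.0
        # Map shadow-pixel distances to 1..255 (0 is reserved for non-shadow pixels).
        result[shadow_mask] = 1 + (254.0 * d / scale).astype(np.uint8)
    return result
```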
  • the shadow extraction model can be used to characterize the correspondence between the illumination image of the object and the resulting shadow image.
  • As an example, the shadow extraction model may be a correspondence table that is pre-made by technicians based on statistics over a large number of object illumination images and the resulting shadow images corresponding to those object illumination images, and that stores multiple object illumination images and the corresponding resulting shadow images; it may also be a model obtained by training an initial model (such as a neural network) with a machine learning method based on preset training samples.
  • the shadow extraction model may be trained by the above-mentioned executive body or other electronic devices through the following steps:
  • the illuminated image of the sample object may be an image obtained by shooting the sample object in an illuminated scene.
  • the sample object illumination image may include a sample object image and a sample shadow image.
  • the sample result shadow image may be an image obtained by extracting a sample shadow image from a sample object illumination image, and adding sample distance information to the extracted sample shadow image.
  • A pre-established generative adversarial network is obtained, where the generative adversarial network includes a generation network and a discrimination network.
  • The generation network is used to recognize the input object illumination image and output the resulting shadow image, and the discrimination network is used to determine whether the input image is an image output by the generation network.
  • The above-mentioned generative adversarial network may be a generative adversarial network of various structures.
  • the generative adversarial network may be a deep convolutional generative adversarial network (Deep Convolutional Generative Adversarial Network, DCGAN).
  • The above-mentioned generative adversarial network may be an untrained generative adversarial network with newly initialized parameters, or a trained generative adversarial network.
  • the generation network may be a convolutional neural network for image processing (for example, a convolutional neural network with various structures including a convolutional layer, a pooling layer, a depooling layer, and a deconvolutional layer).
  • the above-mentioned discriminant network may also be a convolutional neural network (for example, a convolutional neural network of various structures including a fully connected layer, where the above-mentioned fully connected layer can implement a classification function).
  • the discriminant network can also be other models used to implement classification functions, such as Support Vector Machine (SVM).
  • If the discrimination network determines that the image input to it is an image output by the generation network, it can output 1 (or 0); if it determines that the image is not an image output by the generation network, it can output 0 (or 1). It should be noted that the discrimination network can also output other preset information to characterize the discrimination result, which is not limited to the values 1 and 0.
  • Based on a machine learning method, the sample object illumination image included in a training sample in the training sample set is used as the input of the generation network, the resulting shadow image output by the generation network and the sample result shadow image corresponding to the input sample object illumination image are used as the input of the discrimination network, the generation network and the discrimination network are trained, and the trained generation network is determined as the shadow extraction model.
  • The parameters of either one of the generation network and the discrimination network (which can be called the first network) can be fixed first, and the network whose parameters are not fixed (which can be called the second network) can be optimized; then the parameters of the second network can be fixed and the first network optimized. These iterations are carried out continuously until the discrimination network cannot distinguish whether an input image was output by the generation network. At this point, the resulting shadow image generated by the generation network is close to the sample result shadow image, and the discrimination network cannot accurately distinguish real data from generated data (that is, its accuracy is about 50%).
  • the generation network at this time can be determined as the shadow extraction model.
  • the above-mentioned executive body or other electronic devices can use the existing back propagation algorithm and gradient descent algorithm to train the generation network and the discrimination network.
  • the parameters of the generation network and the discrimination network after each training will be adjusted, and the generation network and the discrimination network obtained after each adjustment of the parameters are used as the generation network and the discrimination network used in the next training.
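  • A minimal PyTorch-style sketch of the alternating training procedure described above is given below. The network architectures, optimizer settings, and binary cross-entropy loss are illustrative assumptions; the disclosure only requires that the generation network maps an object illumination image to a resulting shadow image and that the discrimination network classifies whether its input was generated.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a 3-channel object illumination image to a 1-channel resulting
    shadow image whose pixel values encode distance information."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Outputs a logit indicating whether the input shadow image is real."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1),
        )

    def forward(self, x):
        return self.net(x)

def train_shadow_extractor(loader, epochs=10, device="cpu"):
    g, d = Generator().to(device), Discriminator().to(device)
    opt_g = torch.optim.Adam(g.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(d.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for illum_img, sample_shadow in loader:  # one training sample pair
            illum_img, sample_shadow = illum_img.to(device), sample_shadow.to(device)
            # Fix the generation network, optimize the discrimination network.
            fake_shadow = g(illum_img).detach()
            real_logit, fake_logit = d(sample_shadow), d(fake_shadow)
            loss_d = bce(real_logit, torch.ones_like(real_logit)) + \
                     bce(fake_logit, torch.zeros_like(fake_logit))
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            # Fix the discrimination network, optimize the generation network.
            gen_logit = d(g(illum_img))
            loss_g = bce(gen_logit, torch.ones_like(gen_logit))
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return g  # the trained generation network is used as the shadow extraction model
```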
  • Step 203 Based on the resulting shadow image, generate light direction information corresponding to the light image of the target object.
  • the above-mentioned execution subject may generate the illumination direction information corresponding to the illumination image of the target object.
  • the light direction information can be used to indicate the light direction, which can include but is not limited to at least one of the following: text, numbers, symbols, and images.
  • As an example, the illumination direction information may be an arrow marked in the resulting shadow image, where the direction of the arrow is the illumination direction; or the illumination direction information may be a two-dimensional vector, where the direction corresponding to the two-dimensional vector is the illumination direction.
  • the light direction indicated by the light direction information is the projection of the actual light direction in the three-dimensional coordinate system on the projection surface where the shadow in the three-dimensional coordinate system is located. It can be understood that in practice, the light direction (that is, the projection of the actual light direction on the projection surface of the shadow) is usually the same as the extension direction of the shadow. Furthermore, the above-mentioned execution subject may determine the extension direction of the shadow based on the pixel points in the resulting shadow image and the distance information corresponding to the pixel points, and then determine the light direction.
  • As an example, the above-mentioned execution subject may select, from the resulting shadow image, the pixel whose corresponding distance information represents the closest distance as a first pixel, and the pixel whose corresponding distance information represents the farthest distance as a second pixel; the execution subject may then determine the direction from the first pixel to the second pixel as the illumination direction.
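  • A small sketch of this nearest/farthest-pixel rule, assuming the distance information is stored as pixel values (larger value meaning farther, an assumed convention) and NumPy is available:

```python
import numpy as np

def estimate_light_direction(resulting_shadow):
    """resulting_shadow: 2D uint8 array; nonzero pixels belong to the shadow
    and their values encode distance to the object image."""
    ys, xs = np.nonzero(resulting_shadow)
    if xs.size == 0:
        raise ValueError("no shadow pixels found")
    dists = resulting_shadow[ys, xs].astype(float)
    near = np.argmin(dists)   # first pixel: closest to the object
    far = np.argmax(dists)    # second pixel: farthest from the object
    v = np.array([xs[far] - xs[near], ys[far] - ys[near]], dtype=float)
    return v / (np.linalg.norm(v) + 1e-8)  # unit vector: projected illumination direction
```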
  • Step 204 based on the illumination direction information, generate a virtual object illumination image corresponding to the target virtual object image.
  • the above-mentioned execution subject may generate the virtual object illumination image corresponding to the target virtual object image.
  • the lighting image of the virtual object includes the above-mentioned target virtual object image and the virtual shadow image corresponding to the target virtual object image.
  • the lighting direction corresponding to the virtual shadow image in the lighting image of the virtual object matches the lighting direction indicated by the lighting direction information.
  • the matching means that the angular deviation of the illumination direction corresponding to the virtual shadow image with respect to the illumination direction indicated by the illumination direction information is less than or equal to the preset angle.
  • the above-mentioned execution subject may use various methods to generate the virtual object illumination image corresponding to the target virtual object image based on the illumination direction information.
  • a light source can be constructed in the rendering engine based on the light direction indicated by the light direction information, and then the target virtual object image can be rendered based on the constructed light source to obtain the virtual object light image.
  • Since the illumination direction indicated by the illumination direction information is the projection of the actual illumination direction onto the projection surface where the shadow is located, in the process of constructing the light source it is necessary to first determine the actual illumination direction based on the illumination direction information, and then construct the light source based on the actual illumination direction.
  • The actual illumination direction can be determined from the illumination direction on the projection surface where the shadow is located together with the illumination direction component on a plane perpendicular to that projection surface; in this embodiment, the illumination direction component on the plane perpendicular to the projection surface where the shadow is located can be predetermined.
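  • For illustration, the actual illumination direction can then be assembled from the projected two-dimensional direction and a predetermined component perpendicular to the projection surface. The sketch below assumes the perpendicular component is specified as an elevation angle; the coordinate convention (x and y spanning the projection surface, z pointing away from it) is an assumption made for illustration.

```python
import numpy as np

def actual_light_direction(projected_dir_2d, elevation_deg=45.0):
    """Combine the projected illumination direction on the shadow's projection
    surface with a predetermined elevation angle to obtain a 3D direction."""
    d = np.asarray(projected_dir_2d, dtype=float)
    d = d / (np.linalg.norm(d) + 1e-8)
    elev = np.radians(elevation_deg)
    # The light travels along the projected direction and downward toward the surface.
    direction = np.array([d[0] * np.cos(elev), d[1] * np.cos(elev), -np.sin(elev)])
    return direction / np.linalg.norm(direction)
```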
  • The above-mentioned execution subject may also pre-store an initial virtual shadow image corresponding to the target virtual object image. The execution subject can then adjust the initial virtual shadow image based on the illumination direction information to obtain the virtual shadow image, and combine the virtual shadow image and the target virtual object image to generate the virtual object illumination image.
  • Since the light source corresponding to the target object illumination image is parallel light or sunlight, it can be considered here that the illumination direction corresponding to the virtual shadow image in the virtual object illumination image is the same as the illumination direction indicated by the aforementioned illumination direction information, without needing to consider the influence of the position at which the virtual object illumination image is added to the target object illumination image on the illumination direction corresponding to the virtual shadow image.
  • Step 205 Fusion of the lighting image of the virtual object and the lighting image of the target object to add the lighting image of the virtual object to the lighting image of the target object to obtain a result image.
  • the above-mentioned execution subject may fuse the virtual object illumination image and the target object illumination image to add the virtual object illumination image to the target object illumination image to obtain the result image.
  • the result image is the target object illumination image with the virtual object illumination image added.
  • The position at which the virtual object illumination image is added to the target object illumination image can be predetermined (for example, the center of the image), or it can be determined after recognizing the target object illumination image (for example, after the object image and shadow image in the target object illumination image are recognized, an area of the target object illumination image that contains neither the object image nor the shadow image can be determined as the location for adding the virtual object illumination image).
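  • A simple compositing sketch of this fusion step; alpha blending is an assumed choice (the disclosure does not prescribe a particular blending method), and the virtual object illumination image is assumed to come with an alpha mask covering the virtual object and its virtual shadow.

```python
import numpy as np

def fuse(target_illum, virtual_illum, virtual_alpha, top_left):
    """Paste the virtual object illumination image onto the target object
    illumination image at the chosen position and return the result image."""
    result = target_illum.astype(float).copy()
    y, x = top_left
    h, w = virtual_illum.shape[:2]
    region = result[y:y + h, x:x + w]
    a = virtual_alpha[..., None].astype(float)  # per-pixel mask in [0, 1]
    result[y:y + h, x:x + w] = a * virtual_illum + (1.0 - a) * region
    return result.astype(np.uint8)
```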
  • the execution subject may display the obtained result image.
  • the above-mentioned execution subject may also send the obtained result image to the user terminal connected in communication, and control the user terminal to display the result image.
  • the user terminal is a terminal used by the user to communicate with the execution subject.
  • the above-mentioned execution subject may send a control signal to the user terminal, thereby controlling the user terminal to display the result image.
  • In this way, this implementation can control the user terminal to display a more realistic result image, thereby improving the display effect of the image.
  • Fig. 3 is a schematic diagram of an application scenario of the method for processing an image according to this embodiment.
  • The server 301 first obtains the cat illumination image 302 (target object illumination image) and the football image 303 (target virtual object image), where the cat illumination image 302 includes the image of the cat (object image) and the shadow image of the cat (shadow image). Then, the server 301 can input the cat illumination image 302 into the pre-trained shadow extraction model 304 to obtain the cat shadow image 305 (resulting shadow image) including distance information, where the distance information is used to represent the distance, in the cat illumination image 302, between a pixel of the cat's shadow image and the corresponding pixel of the cat's image.
  • Next, the server 301 may generate the illumination direction information 306 corresponding to the cat illumination image 302 based on the cat shadow image 305 including the distance information. Then, the server 301 can generate a football illumination image 307 (virtual object illumination image) corresponding to the football image 303 based on the illumination direction information 306, where the illumination direction corresponding to the football shadow image (virtual shadow image) in the football illumination image 307 matches the illumination direction indicated by the illumination direction information 306. Finally, the server 301 may merge the football illumination image 307 and the cat illumination image 302 to add the football illumination image 307 to the cat illumination image 302 and obtain a result image 308.
  • The method provided by the above-mentioned embodiments of the present disclosure can generate the virtual object illumination image corresponding to the virtual object image, so that a corresponding virtual shadow image can be added to the virtual object image; after the virtual object illumination image and the target object illumination image are fused, the realism of the generated result image is improved. In addition, the present disclosure can determine the illumination direction of the virtual shadow image corresponding to the virtual object image based on the illumination direction of the shadow image in the target object illumination image, so that the virtual object image is better integrated into the target object illumination image, further improving the realism of the result image and helping to improve its display effect.
  • FIG. 4 shows a flow 400 of another embodiment of a method for processing an image.
  • the process 400 of the method for processing an image includes the following steps:
  • Step 401 Obtain a target object illumination image and a target virtual object image.
  • the execution subject of the method for processing images may remotely or locally obtain the target object illumination image and the target virtual object image through a wired connection or a wireless connection.
  • the illumination image of the target object is the image to be processed.
  • the illumination image of the target object includes the object image and the shadow image corresponding to the object image.
  • the target virtual object image is an image used to process the illumination image of the target object.
  • the target virtual object image may be an image predetermined according to the shape of the virtual object.
  • Step 402 Input the illumination image of the target object into a pre-trained shadow extraction model, and obtain a resulting shadow image including distance information.
  • the above-mentioned execution subject may input the target object illumination image into a pre-trained shadow extraction model to obtain a resulting shadow image including distance information.
  • the resulting shadow image may be a shadow image extracted from the illumination image of the target object and added with distance information.
  • the distance information is used to characterize the distance between the pixel point of the shadow image and the pixel point corresponding to the object image in the target object illumination image.
  • the shadow extraction model can be used to characterize the correspondence between the illumination image of the object and the resulting shadow image.
  • Step 403 Input the resulting shadow image into a pre-trained light direction recognition model to obtain light direction information.
  • the above-mentioned execution subject may input the result shadow image into a pre-trained light direction recognition model to obtain light direction information.
  • the light direction information can be used to indicate the light direction, which can include but is not limited to at least one of the following: text, numbers, symbols, and images.
  • the light direction recognition model can be used to characterize the corresponding relationship between the resulting shadow image and light direction information.
  • As an example, the illumination direction recognition model may be a correspondence table that is pre-made by technicians based on statistics over a large number of resulting shadow images and the illumination direction information corresponding to those resulting shadow images, and that stores multiple resulting shadow images and the corresponding illumination direction information; it may also be a model obtained by training an initial model (such as a neural network) with a machine learning method based on preset training samples.
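  • When the illumination direction recognition model is a neural network, it could, for instance, be a small convolutional network that regresses a two-dimensional direction vector from the single-channel resulting shadow image. The architecture and output encoding below are assumptions made for illustration, not details specified by the disclosure; such a model would be trained on resulting shadow images paired with known illumination direction information.

```python
import torch
import torch.nn as nn

class LightDirectionNet(nn.Module):
    """Regresses a 2D unit vector (the illumination direction information)
    from a 1-channel resulting shadow image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 2)  # (dx, dy) of the illumination direction

    def forward(self, shadow_img):
        v = self.head(self.features(shadow_img))
        return v / (v.norm(dim=1, keepdim=True) + 1e-8)  # normalize to a unit vector
```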
  • Step 404 based on the illumination direction information, generate a virtual object illumination image corresponding to the target virtual object image.
  • the above-mentioned execution subject may generate the virtual object illumination image corresponding to the target virtual object image.
  • the lighting image of the virtual object includes the above-mentioned target virtual object image and the virtual shadow image corresponding to the target virtual object image.
  • the lighting direction corresponding to the virtual shadow image in the lighting image of the virtual object matches the lighting direction indicated by the lighting direction information.
  • the matching means that the angular deviation of the illumination direction corresponding to the virtual shadow image with respect to the illumination direction indicated by the illumination direction information is less than or equal to the preset angle.
  • Step 405 Fusion of the lighting image of the virtual object and the lighting image of the target object to add the lighting image of the virtual object to the lighting image of the target object to obtain a result image.
  • the above-mentioned execution subject may merge the illumination image of the virtual object and the illumination image of the target object to add the illumination image of the virtual object to the illumination image of the target object to obtain the result image.
  • the result image is the target object illumination image with the virtual object illumination image added.
  • step 401, step 402, step 404, and step 405 can be respectively performed in a manner similar to step 201, step 202, step 204, and step 205 in the foregoing embodiment.
  • The above descriptions of step 201, step 202, step 204, and step 205 are also applicable to step 401, step 402, step 404, and step 405, and will not be repeated here.
  • The process 400 of the method for processing an image in this embodiment highlights the step of using the illumination direction recognition model to generate illumination direction information. Therefore, the solution described in this embodiment can more conveniently determine the illumination direction corresponding to the target object illumination image, and can thus generate the result image more quickly, improving the efficiency of image processing.
  • the present disclosure provides an embodiment of a device for processing images.
  • The device embodiment corresponds to the method embodiment shown in Fig. 2.
  • the device can be specifically applied to various electronic devices.
  • the apparatus 500 for processing images in this embodiment includes: an image acquisition unit 501, an image input unit 502, an information generation unit 503, an image generation unit 504, and an image fusion unit 505.
  • The image acquisition unit 501 is configured to acquire a target object illumination image and a target virtual object image, where the target object illumination image includes an object image and a shadow image corresponding to the object image;
  • the image input unit 502 is configured to input the target object illumination image into a pre-trained shadow extraction model to obtain a resulting shadow image including distance information, where the distance information is used to characterize the distance, in the target object illumination image, between a pixel point of the shadow image and the pixel point corresponding to the object image;
  • the information generation unit 503 is configured to generate, based on the resulting shadow image, illumination direction information corresponding to the target object illumination image;
  • the image generation unit 504 is configured to generate, based on the illumination direction information, the virtual object illumination image corresponding to the target virtual object image, where the illumination direction corresponding to the virtual shadow image in the virtual object illumination image matches the illumination direction indicated by the illumination direction information;
  • the image fusion unit 505 is configured to fuse the virtual object illumination image and the target object illumination image to add the virtual object illumination image to the target object illumination image and obtain a result image.
  • the image acquisition unit 501 of the apparatus 500 for processing images may remotely or locally acquire the target object illumination image and the target virtual object image through a wired connection or a wireless connection.
  • the illumination image of the target object is the image to be processed.
  • the illumination image of the target object includes the object image and the shadow image corresponding to the object image.
  • the target virtual object image is an image used to process the illumination image of the target object.
  • the target virtual object image may be an image predetermined according to the shape of the virtual object.
  • the image input unit 502 may input the illumination image of the target object into a pre-trained shadow extraction model to obtain a resulting shadow image including distance information.
  • the resulting shadow image may be a shadow image extracted from the illumination image of the target object and added with distance information.
  • the distance information is used to characterize the distance between the pixel point of the shadow image and the pixel point corresponding to the object image in the target object illumination image.
  • the distance information can be embodied in the resulting shadow image in various forms.
  • the shadow extraction model can be used to characterize the correspondence between the illumination image of the object and the resulting shadow image.
  • the information generating unit 503 may generate light direction information corresponding to the light image of the target object.
  • the light direction information can be used to indicate the light direction, which can include but is not limited to at least one of the following: text, numbers, symbols, and images.
  • Based on the illumination direction information generated by the information generation unit 503, the image generation unit 504 generates the virtual object illumination image corresponding to the target virtual object image.
  • the lighting image of the virtual object includes the above-mentioned target virtual object image and the virtual shadow image corresponding to the target virtual object image.
  • the lighting direction corresponding to the virtual shadow image in the lighting image of the virtual object matches the lighting direction indicated by the lighting direction information.
  • the matching means that the angular deviation of the illumination direction corresponding to the virtual shadow image with respect to the illumination direction indicated by the illumination direction information is less than or equal to the preset angle.
  • The image fusion unit 505 may fuse the virtual object illumination image and the target object illumination image to add the virtual object illumination image to the target object illumination image and obtain the result image, where the result image is the target object illumination image with the virtual object illumination image added.
  • the information generating unit 503 may be further configured to input the resulting shadow image into a pre-trained light direction recognition model to obtain light direction information.
  • the distance information is the pixel value of the pixel in the resulting shadow image.
  • The shadow extraction model can be obtained by training through the following steps: obtaining a preset training sample set, where a training sample includes a sample object illumination image and a sample result shadow image predetermined for the sample object illumination image; obtaining a pre-established generative adversarial network, where the generative adversarial network includes a generation network and a discrimination network, the generation network is used to recognize an input object illumination image and output a resulting shadow image, and the discrimination network is used to determine whether an input image is an image output by the generation network; and, based on a machine learning method, using the sample object illumination image included in a training sample in the training sample set as the input of the generation network, using the resulting shadow image output by the generation network and the sample result shadow image corresponding to the input sample object illumination image as the input of the discrimination network, training the generation network and the discrimination network, and determining the trained generation network as the shadow extraction model.
  • the apparatus 500 may further include: an image display unit (not shown in the figure), configured to display the obtained result image.
  • the apparatus 500 may further include: an image sending unit (not shown in the figure) configured to send the obtained result image to the user terminal connected in communication, and to control the user The terminal displays the result image.
  • The apparatus 500 provided by the above-mentioned embodiment of the present disclosure can generate the virtual object illumination image corresponding to the virtual object image, so that a corresponding virtual shadow image can be added to the virtual object image; after the virtual object illumination image and the target object illumination image are fused, the realism of the generated result image is improved. In addition, the present disclosure can determine the illumination direction of the virtual shadow image corresponding to the virtual object image based on the illumination direction of the shadow image in the target object illumination image, so that the virtual object image can be better integrated into the target object illumination image, further improving the realism of the result image and helping to improve its display effect.
  • FIG. 6 shows a schematic structural diagram of an electronic device (such as the terminal device or the server in FIG. 1) 600 suitable for implementing the embodiments of the present disclosure.
  • the terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), vehicle-mounted terminals (e.g. Mobile terminals such as car navigation terminals) and fixed terminals such as digital TVs, desktop computers, etc.
  • the electronic device shown in FIG. 6 is only an example, and should not bring any limitation to the function and use range of the embodiments of the present disclosure.
  • The electronic device 600 may include a processing device (such as a central processing unit, a graphics processor, etc.) 601, which can execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • the RAM 603 also stores various programs and data required for the operation of the electronic device 600.
  • the processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to the bus 604.
  • The following devices can be connected to the I/O interface 605: an input device 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, or gyroscope; an output device 607 including, for example, a liquid crystal display (LCD), a speaker, or a vibrator; a storage device 608 including, for example, a magnetic tape or a hard disk; and a communication device 609.
  • the communication device 609 may allow the electronic device 600 to perform wireless or wired communication with other devices to exchange data.
  • Although FIG. 6 shows an electronic device 600 having various devices, it should be understood that it is not required to implement or include all of the illustrated devices; more or fewer devices may alternatively be implemented or provided.
  • the process described above with reference to the flowchart can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 609, or installed from the storage device 608, or installed from the ROM 602.
  • When the computer program is executed by the processing device 601, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable signal medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device .
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, RF (Radio Frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device is caused to: obtain a target object illumination image and a target virtual object image, where the target object illumination image includes an object image and a shadow image corresponding to the object image; input the target object illumination image into a pre-trained shadow extraction model to obtain a resulting shadow image including distance information, where the distance information is used to characterize the distance, in the target object illumination image, between a pixel point of the shadow image and the pixel point corresponding to the object image; generate, based on the resulting shadow image, illumination direction information corresponding to the target object illumination image; generate, based on the illumination direction information, a virtual object illumination image corresponding to the target virtual object image, where the illumination direction corresponding to the virtual shadow image in the virtual object illumination image matches the illumination direction indicated by the illumination direction information; and fuse the virtual object illumination image and the target object illumination image to add the virtual object illumination image to the target object illumination image and obtain a result image.
  • the computer program code used to perform the operations of the present disclosure can be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in the flowchart or block diagrams can represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • Each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or can be realized by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented in a software manner, or may be implemented in a hardware manner.
  • the name of the unit does not constitute a limitation on the unit itself under certain circumstances.
  • the image acquisition unit can also be described as "a unit that acquires the illumination image of the target object and the image of the virtual object".

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention relate to a method and a device for processing an image. One embodiment of the method comprises: acquiring an illumination image of a target object and an image of a target virtual object, the illumination image of the target object comprising an image of the object and a shadow image corresponding to the object image; inputting the illumination image of the target object into a pre-trained shadow extraction model to obtain a resulting shadow image comprising distance information; generating, based on the resulting shadow image, illumination direction information corresponding to the illumination image of the target object; generating, based on the illumination direction information, an illumination image of the virtual object corresponding to the image of the target virtual object; and fusing the illumination image of the virtual object with the illumination image of the target object so as to add the illumination image of the virtual object to the illumination image of the target object and obtain a result image. According to this embodiment, an image of the virtual object can be better fused into the illumination image of the target object, which improves the realism of the result image and thus helps to improve the display effect of the image.
PCT/CN2020/078582 2019-04-16 2020-03-10 Procédé et dispositif de traitement d'image WO2020211573A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910302471.0A CN110033423B (zh) 2019-04-16 2019-04-16 用于处理图像的方法和装置
CN201910302471.0 2019-04-16

Publications (1)

Publication Number Publication Date
WO2020211573A1 true WO2020211573A1 (fr) 2020-10-22

Family

ID=67238554

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/078582 WO2020211573A1 (fr) 2019-04-16 2020-03-10 Procédé et dispositif de traitement d'image

Country Status (2)

Country Link
CN (1) CN110033423B (fr)
WO (1) WO2020211573A1 (fr)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033423B (zh) * 2019-04-16 2020-08-28 北京字节跳动网络技术有限公司 用于处理图像的方法和装置
CN111144491B (zh) * 2019-12-26 2024-05-24 南京旷云科技有限公司 图像处理方法、装置及电子系统
CN111292408B (zh) * 2020-01-21 2022-02-01 武汉大学 一种基于注意力机制的阴影生成方法
CN111667420B (zh) * 2020-05-21 2023-10-24 维沃移动通信有限公司 图像处理方法及装置
CN111915642B (zh) * 2020-09-14 2024-05-14 北京百度网讯科技有限公司 图像样本的生成方法、装置、设备和可读存储介质
CN112686988A (zh) * 2020-12-31 2021-04-20 北京北信源软件股份有限公司 三维建模方法、装置、电子设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102696057A (zh) * 2010-03-25 2012-09-26 比兹摩德莱恩有限公司 增强现实系统
CN104913784A (zh) * 2015-06-19 2015-09-16 北京理工大学 一种行星表面导航特征自主提取方法
CN106558090A (zh) * 2015-09-21 2017-04-05 三星电子株式会社 3d渲染和阴影信息存储方法和设备
CN107808409A (zh) * 2016-09-07 2018-03-16 中兴通讯股份有限公司 一种增强现实中进行光照渲染的方法、装置及移动终端
US20190102950A1 (en) * 2017-10-03 2019-04-04 ExtendView Inc. Camera-based object tracking and monitoring
CN110033423A (zh) * 2019-04-16 2019-07-19 北京字节跳动网络技术有限公司 用于处理图像的方法和装置

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100594519C (zh) * 2008-03-03 2010-03-17 北京航空航天大学 用球面全景摄像机实时生成增强现实环境光照模型的方法
DE102008028945A1 (de) * 2008-06-18 2009-12-31 Siemens Aktiengesellschaft Verfahren und Visualisierungsmodul zur Visualisierung von Unebenheiten der Innen-Oberfläche eines Hohlorgans, Bildbearbeitungseinrichtung und Tomographiesystem
CN101520904B (zh) * 2009-03-24 2011-12-28 上海水晶石信息技术有限公司 带有现实环境估算的增强现实的方法及其系统
CN102426695A (zh) * 2011-09-30 2012-04-25 北京航空航天大学 一种单幅图像场景的虚实光照融合方法
CN104766270B (zh) * 2015-03-20 2017-10-03 北京理工大学 一种基于鱼眼镜头的虚实光照融合方法
CN108986199B (zh) * 2018-06-14 2023-05-16 北京小米移动软件有限公司 虚拟模型处理方法、装置、电子设备及存储介质
WO2020056689A1 (fr) * 2018-09-20 2020-03-26 太平洋未来科技(深圳)有限公司 Procédé et appareil d'imagerie ra et dispositif électronique
CN109523617B (zh) * 2018-10-15 2022-10-18 中山大学 一种基于单目摄像机的光照估计方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102696057A (zh) * 2010-03-25 2012-09-26 比兹摩德莱恩有限公司 增强现实系统
CN104913784A (zh) * 2015-06-19 2015-09-16 北京理工大学 一种行星表面导航特征自主提取方法
CN106558090A (zh) * 2015-09-21 2017-04-05 三星电子株式会社 3d渲染和阴影信息存储方法和设备
CN107808409A (zh) * 2016-09-07 2018-03-16 中兴通讯股份有限公司 一种增强现实中进行光照渲染的方法、装置及移动终端
US20190102950A1 (en) * 2017-10-03 2019-04-04 ExtendView Inc. Camera-based object tracking and monitoring
CN110033423A (zh) * 2019-04-16 2019-07-19 北京字节跳动网络技术有限公司 用于处理图像的方法和装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
NGUYEN, VU ET AL.: "Shadow Detection with Conditional Generative Adversarial Networks", 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 22 October 2017 (2017-10-22), XP033283326, ISSN: 2380-7504, DOI: 20200603162940Y *

Also Published As

Publication number Publication date
CN110033423B (zh) 2020-08-28
CN110033423A (zh) 2019-07-19

Similar Documents

Publication Publication Date Title
WO2020211573A1 (fr) Procédé et dispositif de traitement d'image
CN109858445B (zh) 用于生成模型的方法和装置
CN109214343B (zh) 用于生成人脸关键点检测模型的方法和装置
JP7104683B2 (ja) 情報を生成する方法および装置
CN111476871B (zh) 用于生成视频的方法和装置
US20200234478A1 (en) Method and Apparatus for Processing Information
US11436863B2 (en) Method and apparatus for outputting data
CN110188719B (zh) 目标跟踪方法和装置
CN109829432B (zh) 用于生成信息的方法和装置
US10970938B2 (en) Method and apparatus for generating 3D information
CN109754464B (zh) 用于生成信息的方法和装置
CN110059623B (zh) 用于生成信息的方法和装置
CN109800730B (zh) 用于生成头像生成模型的方法和装置
CN109981787B (zh) 用于展示信息的方法和装置
CN111524216B (zh) 生成三维人脸数据的方法和装置
CN110059624B (zh) 用于检测活体的方法和装置
WO2020253716A1 (fr) Procédé et dispositif de génération d'image
WO2023185391A1 (fr) Procédé d'apprentissage de modèle de segmentation interactive, procédé de génération de données de marquage et dispositif
CN108597034B (zh) 用于生成信息的方法和装置
CN111402122A (zh) 图像的贴图处理方法、装置、可读介质和电子设备
CN115937033A (zh) 图像生成方法、装置及电子设备
CN109829431B (zh) 用于生成信息的方法和装置
WO2024060708A1 (fr) Procédé et appareil de détection de cible
CN109816791B (zh) 用于生成信息的方法和装置
CN110619602B (zh) 一种图像生成方法、装置、电子设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20790905

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 02.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20790905

Country of ref document: EP

Kind code of ref document: A1