WO2022083389A1 - Virtual image generation method and apparatus - Google Patents

Virtual image generation method and apparatus

Info

Publication number
WO2022083389A1
WO2022083389A1 (PCT/CN2021/119751)
Authority
WO
WIPO (PCT)
Prior art keywords
shoe
image
leg
dimensional
area
Prior art date
Application number
PCT/CN2021/119751
Other languages
French (fr)
Chinese (zh)
Inventor
车广富
郭景昊
张夏杰
安山
Original Assignee
北京沃东天骏信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司 and 北京京东世纪贸易有限公司
Publication of WO2022083389A1 publication Critical patent/WO2022083389A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/005 General purpose rendering architectures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G06T15/205 Image-based rendering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00 Manipulating 3D models or images for computer graphics
    • G06T19/006 Mixed reality

Definitions

  • the present disclosure relates to the field of computer technologies, and in particular, to a method for generating a virtual image, a device for generating a virtual image, and a non-volatile computer-readable storage medium.
  • VR: Virtual Reality
  • AR: Augmented Reality
  • the virtual shoe try-on technology realized by the combination of AR augmented reality technology and smartphone camera can help users see the effect of shoes on their feet.
  • the virtual shoe trial technology needs to perform virtual occlusion on the three-dimensional shoe model to replace the real shoes of human feet.
  • a virtual leg model is modeled outside the shoe mouth area of the 3D shoe model, and the virtual leg model and the 3D shoe model are rendered on the screen to generate a virtual image of a human foot wearing shoes.
  • a method for generating a virtual image, including: acquiring a leg region in a to-be-processed image containing a leg and a foot; rendering a three-dimensional shoe model into a two-dimensional shoe image corresponding to the to-be-processed image according to pose parameters of the foot in the to-be-processed image and intrinsic parameters of the camera; determining the part of the shoe region occluded by the leg according to the overlap between the leg region and the shoe region in the two-dimensional shoe image; and rendering a composite image of the to-be-processed image and the two-dimensional shoe image according to the occluded part, so as to generate a virtual image with a leg occlusion effect.
  • determining the part of the shoe region occluded by the leg according to the overlap between the leg region and the shoe region in the two-dimensional shoe image includes: determining the outer contour of the shoe in the two-dimensional shoe image according to the position of the shoe-body region, determining the inner contour of the shoe according to the position of the shoe-mouth region, and determining the occluded part of the shoe region according to the inner contour and the intersection points of the leg-region contour with the outer contour.
  • determining the part of the shoe region occluded by the leg according to the intersection points and the inner contour includes: finding, on the inner contour, the point closest to each intersection point, and determining the occluded part of the shoe region from the intersection points and the closest points.
  • rendering the three-dimensional shoe model into the two-dimensional shoe image corresponding to the to-be-processed image includes: making the shoe-mouth area of the three-dimensional shoe model transparent; rendering the transparency-processed model into the two-dimensional shoe image; and determining the shoe-body region and the shoe-mouth region of the two-dimensional shoe image from its binary image.
  • making the shoe-mouth area of the three-dimensional shoe model transparent includes: detecting the shoe-mouth area of the model, covering it with a closed mesh, and making the covering part of the closed mesh transparent.
  • acquiring the leg region in the to-be-processed image including the legs and feet includes: inputting the to-be-processed image into a machine learning model to determine the leg region in the to-be-processed image.
  • the machine learning model includes a convolutional neural network module and a spatial pyramid pooling module connected in sequence.
  • the convolutional neural network module is set according to the Fast-SCNN (Fast Segmentation Convolutional Neural Network, fast segmentation convolutional neural network) model.
  • Fast-SCNN: Fast Segmentation Convolutional Neural Network
  • when the image to be processed is each frame of a video, generating a virtual image with a leg occlusion effect includes generating such a virtual image for each frame; the generation method further includes generating a video with a leg occlusion effect from the virtual images corresponding to the frames.
  • an apparatus for generating a virtual image, including: a determining unit configured to acquire a leg region in a to-be-processed image containing a leg and a foot, and to determine the part of the shoe region occluded by the leg according to the overlap between the leg region and the shoe region in the two-dimensional shoe image; and a processing unit configured to render a three-dimensional shoe model into a two-dimensional shoe image corresponding to the to-be-processed image according to pose parameters of the foot and intrinsic parameters of the camera, and to render a composite image of the to-be-processed image and the two-dimensional shoe image according to the occluded part, so as to generate a virtual image with a leg occlusion effect.
  • the determining unit determines the outer contour of the shoe in the two-dimensional shoe image according to the position of the shoe-body region, determines the inner contour according to the position of the shoe-mouth region, and determines the part of the shoe region occluded by the leg according to the inner contour and the intersection points of the leg-region contour with the outer contour.
  • the determining unit determines the point on the inner contour that is closest to the intersection point; and determines the portion of the shoe area that is occluded by the leg according to the intersection point and the closest point.
  • the processing unit performs transparency processing on the shoe opening area in the three-dimensional shoe model; and renders the three-dimensional shoe model after the transparent processing into the two-dimensional shoe image.
  • the determining unit determines the shoe body area of the two-dimensional shoe image and the shoe mouth area of the two-dimensional shoe image according to the binary image of the two-dimensional shoe image.
  • the processing unit detects the shoe opening area of the three-dimensional shoe model, and uses a closed mesh to cover the shoe opening area of the three-dimensional shoe model; and performs transparency processing on the covering part of the closed mesh.
  • the determining unit determines the position where the closed mesh covers the shoe mouth area of the three-dimensional shoe model according to the size of the part of the shoe area that is not covered by the leg.
  • the determining unit inputs the image to be processed into the machine learning model, and determines the leg region in the image to be processed.
  • the machine learning model includes a convolutional neural network module and a spatial pyramid pooling module connected in sequence.
  • the convolutional neural network module is set up according to the Fast-SCNN model.
  • the image to be processed is each frame image in the video; the processing unit generates a virtual image with a leg blocking effect corresponding to each frame image, and generates a leg blocking effect according to the virtual image corresponding to each frame image. video.
  • an apparatus for generating a virtual image, including: a memory; and a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the method for generating a virtual image in any one of the above embodiments.
  • a non-volatile computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements the method for generating a virtual image in any one of the foregoing embodiments.
  • FIG. 1 shows a flowchart of some embodiments of the method for generating a virtual image of the present disclosure
  • FIG. 2 shows a flowchart of some embodiments of step 130 in FIG. 1;
  • 3a-3c show schematic diagrams of some embodiments of the method for generating a virtual image of the present disclosure
  • FIG. 4 shows a block diagram of some embodiments of the apparatus for generating virtual images of the present disclosure
  • FIG. 5 shows a block diagram of other embodiments of the virtual image generating apparatus of the present disclosure
  • FIG. 6 shows a block diagram of further embodiments of the apparatus for generating virtual images of the present disclosure.
  • the inventor of the present disclosure found that the above-mentioned related art has the following problems: the virtual image of human feet wearing shoes is only generated by approximate modeling, lacking real three-dimensional structure information, resulting in poor effect of generating the virtual image.
  • the present disclosure proposes a technical solution for generating a virtual image, which can synthesize a virtual image by using a real leg image and a three-dimensional shoe model to improve the effect of the virtual image.
  • the position of the virtual leg model relative to the shoe model is fixed and cannot reflect the real three-dimensional structure of the leg (such as the user's actual trouser position, shape, posture, and leg position), which degrades the quality of the virtual image.
  • the visual clues of the legs can be extracted in the real scene (for example, using a neural network model), and combined with the outline of the 3D model of the shoe after the shoe opening area is transparent, the area that needs to be covered in the virtual image can be accurately divided.
  • the technical solutions of the present disclosure can be implemented through the following embodiments.
  • FIG. 1 shows a flowchart of some embodiments of the method for generating a virtual image of the present disclosure.
  • the generation method includes: step 110, acquiring a leg area; step 120, rendering a two-dimensional shoe image; step 130, determining the part occluded by the leg; and step 140, generating a virtual image.
  • step 110 the leg region in the to-be-processed image including the legs and feet is acquired.
  • the image to be processed is input into a machine learning model that determines the region of the leg in the image to be processed.
  • the machine learning model can be a convolutional neural network module (as set up according to the Fast-SCNN model).
  • a skeleton network for extracting leg regions is set up based on the lightweight model Fast-SCNN.
  • a simple and efficient composite coefficient can be used to build a more structured network structure, thereby compressing the amount of parameters of the model and improving the training speed.
  • Fast-SCNN can reduce the amount of floating-point operations of the model, thereby improving computing performance.
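As a rough illustration of why a Fast-SCNN-style lightweight backbone reduces floating-point operations, the following back-of-the-envelope comparison contrasts the multiply-add count of a standard convolution with the depthwise-separable convolutions such models rely on. The shapes are invented for illustration and are not taken from this disclosure.

```python
# Illustrative FLOP comparison: standard vs. depthwise-separable convolution.
H, W = 128, 128            # feature-map size (made up)
Cin, Cout, K = 64, 128, 3  # input/output channels, kernel size (made up)

standard = H * W * K * K * Cin * Cout                 # standard conv multiply-adds
separable = H * W * K * K * Cin + H * W * Cin * Cout  # depthwise + pointwise

print(standard // separable)  # → 8 (about 8x fewer multiply-adds here)
```

The ratio grows with the number of output channels, which is why separable convolutions dominate mobile segmentation backbones.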
  • the machine learning model includes a convolutional neural network module and an SPP (Spatial Pyramid Pooling) module connected in sequence.
  • the SPP module includes a convolution processing module, an upsampling module, and a concatenation (Concat) module that are connected in sequence.
  • the machine learning model includes a convolutional neural network module, a first SPP module, and a second SPP module.
  • the convolutional neural network module is connected to the convolution processing module and the concatenation module of the first SPP module, and to the concatenation module of the second SPP module; the first SPP module is connected to the second SPP module.
  • the SPP module preserves global context information well and avoids misclassification in image processing. Moreover, the SPP module is more robust in recognizing small or inconspicuous objects and can attend to the different sub-regions that contain them, thereby improving the accuracy of leg-region recognition.
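The pooling scheme described above can be sketched as follows. This is a generic NumPy illustration of spatial pyramid pooling over a feature map, not the exact module of this disclosure; the feature shape and pyramid levels are assumptions.

```python
import numpy as np

def spatial_pyramid_pool(feature, levels=(1, 2, 4)):
    """Average-pool a CxHxW feature map over several grid sizes and
    concatenate the results into one fixed-length context vector."""
    c, h, w = feature.shape
    pooled = []
    for n in levels:                      # one grid per pyramid level
        for i in range(n):
            for j in range(n):
                ys, ye = i * h // n, (i + 1) * h // n
                xs, xe = j * w // n, (j + 1) * w // n
                pooled.append(feature[:, ys:ye, xs:xe].mean(axis=(1, 2)))
    return np.concatenate(pooled)

feat = np.ones((8, 16, 16))               # toy 8-channel feature map
vec = spatial_pyramid_pool(feat)
print(vec.shape)  # → (168,)  i.e. 8 channels x (1 + 4 + 16) regions
```

Pooling at several scales is what lets the module keep both global context and the sub-regions that contain small objects.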
  • the aforementioned machine learning model can be trained with a loss function set as SoftMax Loss.
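A minimal NumPy sketch of such a SoftMax (cross-entropy) loss for the logits of a single pixel; the logit values are invented for illustration.

```python
import numpy as np

def softmax_cross_entropy(logits, label):
    """SoftMax cross-entropy for one pixel: logits is a (num_classes,)
    vector, label the ground-truth class index (e.g. leg / background)."""
    z = logits - logits.max()                 # shift for numerical stability
    log_probs = z - np.log(np.exp(z).sum())   # log-softmax
    return -log_probs[label]

loss = softmax_cross_entropy(np.array([2.0, 0.5]), label=0)
```

In training, this per-pixel loss would be averaged over all pixels and images in a batch.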
  • the three-dimensional shoe model is rendered into a two-dimensional shoe image corresponding to the to-be-processed image according to the posture parameters of the foot in the to-be-processed image and the camera's internal parameters.
  • the internal parameters of the camera are parameters related to the characteristics of the camera itself, such as the focal length and pixel size of the camera.
  • a PnP (Perspective-n-Point, multi-point perspective) algorithm can be used to determine the pose parameters
  • the three-dimensional shoe model can be rendered into a two-dimensional shoe image using a rendering tool such as OpenGL.
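The rendering step relies on projecting the 3D model through the estimated pose and the camera intrinsics. Below is a minimal pinhole-projection sketch of that geometry; the intrinsics, pose, and vertex are invented numbers standing in for the PnP result and the OpenGL projection.

```python
import numpy as np

# Camera intrinsics: focal length and principal point (illustrative values).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                         # rotation from the estimated foot pose
t = np.array([0.0, 0.0, 2.0])         # translation: 2 m in front of the camera

vertex = np.array([0.1, 0.0, 0.0])    # one 3D shoe-model vertex (meters)
cam = R @ vertex + t                  # transform into camera coordinates
uv = (K @ cam)[:2] / cam[2]           # perspective divide -> pixel coordinates
print(uv)  # → [360. 240.]
```

A renderer applies this same projection to every vertex (plus depth testing and shading) to produce the two-dimensional shoe image.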
  • the shoe opening area of the three-dimensional shoe model is detected, and a closed mesh (mesh) is used as a baffle to cover the shoe opening area; the covering part of the closed mesh is transparentized.
  • a transparent baffle mesh can be attached to the 3D shoe model on the inner side of the shoe-mouth area to make the shoe-mouth area transparent.
  • the position of the closed mesh covering the shoe opening area is determined according to the size of the part of the shoe area that is not covered by the leg. For example, the position of the baffle can be moved a preset distance toward the sole, so that the edge thickness of the shoe opening area exceeds a threshold value.
  • the uncropped part of the shoe opening area has a certain thickness, which can increase the spatial layering of the virtual picture and enhance the real effect.
  • step 130 according to the overlapping portion of the leg region and the shoe region in the two-dimensional shoe image, the portion of the shoe region that is occluded by the leg is determined.
  • the leg region can be accurately segmented using the deep learning model, so as to determine the boundary between the leg and the shoe model; the occluded part of the shoe-mouth area can then be accurately obtained as the cropping region. After cropping, the visual effect of virtual occlusion is produced.
  • step 130 may be implemented according to the embodiment in FIG. 2 .
  • FIG. 2 shows a flowchart of some embodiments of step 130 in FIG. 1 .
  • step 130 includes: step 1310 , determining the inner and outer contours of the shoe; and step 1320 , determining the portion occluded by the leg.
  • step 1310 the outer contour of the shoe is determined according to the position of the shoe body region in the two-dimensional shoe image, and the inner contour of the shoe is determined according to the position of the shoe mouth region in the two-dimensional image.
  • the shoe body area and the shoe mouth area may be determined for use in determining the inner and outer contours using the embodiment of Figure 3a.
  • Figure 3a shows a schematic diagram of some embodiments of the method for generating a virtual image of the present disclosure.
  • the outer contour and the inner contour can be determined, and then the occlusion part can be determined through the remaining steps in FIG. 2 .
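Contour extraction from the binary masks can be sketched as follows. This simple 4-neighbour boundary test is a generic stand-in for whatever contour tracing an implementation would apply to the shoe-body and shoe-mouth binary images; the toy mask is invented.

```python
import numpy as np

def contour_pixels(mask):
    """Boundary of a binary mask: foreground pixels that have at least
    one 4-neighbour outside the mask."""
    padded = np.pad(mask, 1)                       # pad with background
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior                        # on mask, not fully inside

shoe_body = np.zeros((6, 6), dtype=bool)
shoe_body[1:5, 1:5] = True                         # toy 4x4 "shoe body" blob
outer = contour_pixels(shoe_body)
print(outer.sum())  # → 12 boundary pixels around the 4x4 block
```

Applied to the shoe-body mask this yields the outer contour, and to the shoe-mouth mask the inner contour.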
  • step 1320 the part of the shoe area that is blocked by the leg is determined according to the position of the intersection of the contour of the leg area and the outer contour, and the inner contour.
  • the shoe body area and the shoe mouth area may be determined for use in determining the inner and outer contours through the embodiments in Figures 3b and 3c.
  • Figure 3b shows a schematic diagram of some embodiments of the method for generating a virtual image of the present disclosure.
  • the segmentation mask binary image of the leg region 30 can be inferred.
  • the intersection of the contour of the leg region 30 and the contour of the shoe body region 31 can be determined.
  • Figure 3c shows a schematic diagram of some embodiments of the method for generating a virtual image of the present disclosure.
  • the outline of the shoe body region 31 is an outer outline 311
  • the outline of the shoe mouth region 32 is an inner outline 321 .
  • the intersection points of the contour of the leg region 30 and the outer contour 311 are 3a, 3b, the point closest to the intersection 3a on the inner contour 321 is 3d, and the closest point to the intersection 3b is 3c.
  • connection points 3a, 3b, 3c, 3d form a closed area (located on the left side of the shoe body), which is defined as the part of the shoe area that is obscured by the legs.
  • the intersection of the contour of the leg area 30 and the inner contour 321 and the closed area formed by 3a, 3b determine the part of the shoe area that is obscured by the leg.
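The construction of the closed cropping region can be sketched as follows, with invented coordinates standing in for the intersection points 3a/3b and the inner contour 321.

```python
import numpy as np

# Inner (shoe-mouth) contour points and the two points where the leg contour
# crosses the outer shoe contour -- all coordinates invented for illustration.
inner_contour = np.array([[2.0, 1.0], [3.0, 1.0], [4.0, 1.0], [5.0, 1.0]])
crossing_a = np.array([1.0, 0.0])     # leg contour x outer contour (point 3a)
crossing_b = np.array([6.0, 0.0])     # leg contour x outer contour (point 3b)

def closest(contour, point):
    """Point on the contour nearest to the given point."""
    return contour[np.argmin(np.linalg.norm(contour - point, axis=1))]

# Connect intersection points and their nearest inner-contour points into
# the closed quadrilateral that marks the leg-occluded part of the shoe.
quad = np.array([crossing_a, closest(inner_contour, crossing_a),
                 closest(inner_contour, crossing_b), crossing_b])
print(quad)
```

Filling this polygon in the image gives the region to crop from the rendered shoe.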
  • a virtual image can be generated through step 140 in FIG. 1 .
  • step 140 a composite image of the to-be-processed image and the two-dimensional shoe image is rendered according to the part blocked by the leg, so as to generate a virtual image with a leg blocking effect.
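The compositing of step 140 can be sketched as a masked paste: draw the rendered shoe over the camera frame everywhere except the occluded part, where the original frame (the real leg) stays on top. The arrays and mask layout here are purely illustrative.

```python
import numpy as np

frame = np.full((4, 4, 3), 100, dtype=np.uint8)   # camera image (real leg)
shoe = np.full((4, 4, 3), 200, dtype=np.uint8)    # rendered 2D shoe image
shoe_mask = np.zeros((4, 4), dtype=bool)
shoe_mask[1:, :] = True                           # where the shoe renders
occluded = np.zeros((4, 4), dtype=bool)
occluded[1, :] = True                             # shoe part behind the leg

visible = shoe_mask & ~occluded                   # shoe pixels actually shown
out = frame.copy()
out[visible] = shoe[visible]                      # paste only visible shoe
print(out[1, 0, 0], out[2, 0, 0])  # → 100 200  (leg kept, shoe drawn)
```

Running this per frame and re-encoding the results yields the video with the leg-occlusion effect.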
  • the images to be processed are frames of images in the video.
  • a video with a leg blocking effect can be generated according to the virtual image corresponding to each generated frame of image.
  • in this way, the virtual occlusion part is accurately determined from the visual cues of the real leg combined with the position of the shoe in the two-dimensional shoe image.
  • a virtual image can be synthesized by using the real leg image and the three-dimensional shoe model to improve the effect of the virtual image.
  • FIG. 4 shows a block diagram of some embodiments of an apparatus for generating a virtual image of the present disclosure.
  • the virtual image generating apparatus 4 includes a determination unit 41 and a processing unit 42 .
  • the determination unit 41 acquires the leg area in the to-be-processed image including the legs and feet; and determines the part of the shoe area covered by the leg according to the overlapping part of the leg area and the shoe area in the two-dimensional shoe image.
  • the determining unit 41 determines the outer contour of the shoe according to the position of the shoe body region in the two-dimensional shoe image, and determines the inner contour of the shoe according to the position of the shoe mouth region in the two-dimensional image; The intersection of the contours and the inner contour, determine the portion of the shoe area that is occluded by the leg.
  • the determining unit 41 determines the point closest to the intersection on the inner contour; and determines the part of the shoe area that is occluded by the leg according to the intersection and the closest point.
  • the determining unit 41 inputs the image to be processed into the machine learning model, and determines the leg region in the image to be processed.
  • the machine learning model includes a convolutional neural network module and a spatial pyramid pooling module connected in sequence.
  • the convolutional neural network module is set up according to the Fast-SCNN model.
  • the processing unit 42 renders the three-dimensional shoe model into a two-dimensional shoe image corresponding to the to-be-processed image according to the pose parameters of the foot in the to-be-processed image and the intrinsic parameters of the camera, and renders a composite image of the to-be-processed image and the two-dimensional shoe image according to the part occluded by the leg, so as to generate a virtual image with a leg occlusion effect.
  • the processing unit 42 performs transparency processing on the shoe mouth area in the three-dimensional shoe model; and renders the three-dimensional shoe model after the transparent processing as the two-dimensional shoe image.
  • the determining unit 41 determines the shoe body area and the shoe opening area according to the binary image of the two-dimensional shoe image.
  • the processing unit 42 detects the shoe opening area of the three-dimensional shoe model, and covers the shoe opening area with a closed mesh; and performs transparency processing on the covering part of the closed mesh.
  • the determining unit 41 determines the position where the closed mesh covers the shoe opening area according to the size of the part of the shoe area that is not covered by the legs.
  • the image to be processed is each frame of image in the video; the processing unit 42 generates a video with a leg blocking effect according to the generated virtual image corresponding to each frame of image.
  • FIG. 5 shows a block diagram of other embodiments of the virtual image generating apparatus of the present disclosure.
  • the virtual image generating apparatus 5 of this embodiment includes: a memory 51 and a processor 52 coupled to the memory 51 , and the processor 52 is configured to execute the present disclosure based on instructions stored in the memory 51 The method for generating a virtual image in any one of the embodiments.
  • the memory 51 may include, for example, a system memory, a fixed non-volatile storage medium, and the like.
  • the system memory stores, for example, an operating system, application programs, a boot loader, a database, and other programs.
  • FIG. 6 shows a block diagram of further embodiments of the apparatus for generating virtual images of the present disclosure.
  • the virtual image generating apparatus 6 of this embodiment includes: a memory 610 and a processor 620 coupled to the memory 610; the processor 620 is configured to execute, based on instructions stored in the memory 610, the method for generating a virtual image in any one of the foregoing embodiments.
  • Memory 610 may include, for example, system memory, fixed non-volatile storage media, and the like.
  • the system memory stores, for example, an operating system, an application program, a boot loader, and other programs.
  • the virtual image generating apparatus 6 may further include an input/output interface 630, a network interface 640, a storage interface 650, and the like. These interfaces 630 , 640 , 650 and the memory 610 and the processor 620 may be connected, for example, through a bus 660 .
  • the input and output interface 630 provides a connection interface for input and output devices such as a display, a mouse, a keyboard, a touch screen, a microphone, and a speaker.
  • Network interface 640 provides a connection interface for various networked devices.
  • the storage interface 650 provides a connection interface for external storage devices such as SD cards and U disks.
  • embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media having computer-usable program code embodied therein, including but not limited to disk storage, CD-ROM, optical storage, and the like.
  • the methods and systems of the present disclosure may be implemented in many ways.
  • the methods and systems of the present disclosure may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above order of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise.
  • the present disclosure can also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing methods according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Graphics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Geometry (AREA)
  • Computer Hardware Design (AREA)
  • Processing Or Creating Images (AREA)

Abstract

Provided are a virtual image generation method and apparatus (4, 5, 6), relating to the technical field of computers. The method comprises: obtaining a leg region in an image to be processed containing a leg and a foot; according to pose parameters of the foot in the image to be processed and the intrinsic parameters of a camera, rendering a three-dimensional shoe model into a two-dimensional shoe image corresponding to the image to be processed; according to the overlapping part of the leg region with the shoe region in the two-dimensional shoe image, determining the part of the shoe region that is obscured by the leg; and, according to the part obscured by the leg, rendering a composite image of the image to be processed and the two-dimensional shoe image, so as to generate a virtual image having a leg-obscuring effect.

Description

虚拟图像的生成方法和装置Method and device for generating virtual image
相关申请的交叉引用CROSS-REFERENCE TO RELATED APPLICATIONS
本申请是以CN申请号为202011134938.4,申请日为2020年10月21日的申请为基础,并主张其优先权,该CN申请的公开内容在此作为整体引入本申请中。This application is based on the CN application number 202011134938.4 and the filing date is October 21, 2020, and claims its priority. The disclosure of the CN application is hereby incorporated into this application as a whole.
技术领域technical field
本公开涉及计算机技术领域,特别涉及一种虚拟图像的生成方法、虚拟图像的生成装置和非易失性计算机可读存储介质。The present disclosure relates to the field of computer technologies, and in particular, to a method for generating a virtual image, a device for generating a virtual image, and a non-volatile computer-readable storage medium.
背景技术Background technique
随着VR(Virtual Reality,虚拟现实)和AR(Augmented Reality,增强现实)技术的进步,通过虚拟试穿方式形成导购转化的功能越来越受到大众欢迎。通过AR增强现实技术与智能手机相机的结合实现的虚拟试鞋技术,可以帮助用户看到鞋款穿在自己脚上的效果。With the advancement of VR (Virtual Reality, Virtual Reality) and AR (Augmented Reality, Augmented Reality) technology, the function of forming shopping guide conversion through virtual try-on is more and more popular. The virtual shoe try-on technology realized by the combination of AR augmented reality technology and smartphone camera can help users see the effect of shoes on their feet.
为了实现人脚穿鞋视觉效果,虚拟试鞋技术需对三维鞋模型进行虚拟遮挡,以替换人脚的真实鞋子。In order to realize the visual effect of human feet wearing shoes, the virtual shoe trial technology needs to perform virtual occlusion on the three-dimensional shoe model to replace the real shoes of human feet.
在相关技术中,在三维鞋模型的鞋口区域外建模一条虚拟的腿部模型,将虚拟的腿部模型和三维鞋模型渲染到屏幕上以生成人脚穿鞋的虚拟图像。In the related art, a virtual leg model is modeled outside the shoe mouth area of the 3D shoe model, and the virtual leg model and the 3D shoe model are rendered on the screen to generate a virtual image of a human foot wearing shoes.
发明内容SUMMARY OF THE INVENTION
根据本公开的一些实施例,提供了一种虚拟图像的生成方法,包括:获取包含腿部和脚部的待处理图像中的腿部区域;根据待处理图像中脚部的姿态参数和相机的内部参数,将三维鞋模型渲染为与待处理图像对应的二维鞋图像;根据腿部区域与二维鞋图像中鞋区域的重叠部分,确定鞋区域中被腿部遮挡的部分;根据被腿部遮挡的部分,渲染待处理图像和二维鞋图像的合成图像,以生成具有腿部遮挡效果的虚拟图像。According to some embodiments of the present disclosure, a method for generating a virtual image is provided, including: acquiring a leg area in an image to be processed including legs and feet; Internal parameters, rendering the 3D shoe model as a 2D shoe image corresponding to the image to be processed; according to the overlapping part of the leg area and the shoe area in the 2D shoe image, determine the part of the shoe area that is occluded by the leg; The partially occluded part is rendered, and the composite image of the to-be-processed image and the two-dimensional shoe image is rendered to generate a virtual image with a leg occlusion effect.
In some embodiments, determining the portion of the shoe region occluded by the leg according to the overlapping portion of the leg region and the shoe region in the two-dimensional shoe image includes: determining the outer contour of the shoe in the two-dimensional shoe image according to the position of the shoe-body region in the two-dimensional shoe image, and determining the inner contour of the shoe in the two-dimensional shoe image according to the position of the shoe-opening region in the two-dimensional image; and determining the portion of the shoe region occluded by the leg according to the intersection points of the leg region's contour with the outer contour, together with the inner contour.
In some embodiments, determining the portion of the shoe region occluded by the leg according to the intersection points and the inner contour includes: determining, on the inner contour, the points closest to the intersection points; and determining the portion of the shoe region occluded by the leg according to the intersection points and the closest points.
In some embodiments, rendering the three-dimensional shoe model into the two-dimensional shoe image corresponding to the to-be-processed image includes: making the shoe-opening region of the three-dimensional shoe model transparent; rendering the transparency-processed three-dimensional shoe model into the two-dimensional shoe image; and determining the shoe-body region and the shoe-opening region of the two-dimensional shoe image according to a binary image of the two-dimensional shoe image.
In some embodiments, making the shoe-opening region of the three-dimensional shoe model transparent includes: detecting the shoe-opening region of the three-dimensional shoe model and covering it with a closed mesh; and making the portion covered by the closed mesh transparent.
In some embodiments, acquiring the leg region in the to-be-processed image containing the leg and the foot includes: inputting the to-be-processed image into a machine learning model to determine the leg region in the to-be-processed image.
In some embodiments, the machine learning model includes a convolutional neural network module and a spatial pyramid pooling module connected in sequence.
In some embodiments, the convolutional neural network module is configured according to the Fast-SCNN (Fast Segmentation Convolutional Neural Network) model.
In some embodiments, the to-be-processed image is each frame of a video, and generating the virtual image with the leg-occlusion effect includes: generating, for each frame, a corresponding virtual image with the leg-occlusion effect. The generation method further includes: generating a video with the leg-occlusion effect according to the virtual images corresponding to the frames.
According to other embodiments of the present disclosure, an apparatus for generating a virtual image is provided, including: a determination unit configured to acquire a leg region in a to-be-processed image containing a leg and a foot, and to determine, according to the overlapping portion of the leg region and the shoe region in a two-dimensional shoe image, the portion of the shoe region occluded by the leg; and a processing unit configured to render a three-dimensional shoe model into the two-dimensional shoe image corresponding to the to-be-processed image according to pose parameters of the foot in the to-be-processed image and intrinsic parameters of the camera, and to render a composite image of the to-be-processed image and the two-dimensional shoe image according to the occluded portion, to generate a virtual image with a leg-occlusion effect.
In some embodiments, the determination unit determines the outer contour of the shoe in the two-dimensional shoe image according to the position of the shoe-body region in the two-dimensional shoe image, determines the inner contour of the shoe in the two-dimensional shoe image according to the position of the shoe-opening region in the two-dimensional image, and determines the portion of the shoe region occluded by the leg according to the intersection points of the leg region's contour with the outer contour, together with the inner contour.
In some embodiments, the determination unit determines, on the inner contour, the points closest to the intersection points, and determines the portion of the shoe region occluded by the leg according to the intersection points and the closest points.
In some embodiments, the processing unit makes the shoe-opening region of the three-dimensional shoe model transparent and renders the transparency-processed three-dimensional shoe model into the two-dimensional shoe image. The determination unit determines the shoe-body region and the shoe-opening region of the two-dimensional shoe image according to a binary image of the two-dimensional shoe image.
In some embodiments, the processing unit detects the shoe-opening region of the three-dimensional shoe model, covers it with a closed mesh, and makes the portion covered by the closed mesh transparent.
In some embodiments, the determination unit determines the position at which the closed mesh covers the shoe-opening region of the three-dimensional shoe model according to a preset size of the portion of the shoe region that is not occluded by the leg.
In some embodiments, the determination unit inputs the to-be-processed image into a machine learning model to determine the leg region in the to-be-processed image.
In some embodiments, the machine learning model includes a convolutional neural network module and a spatial pyramid pooling module connected in sequence.
In some embodiments, the convolutional neural network module is configured according to the Fast-SCNN model.
In some embodiments, the to-be-processed image is each frame of a video; the processing unit generates, for each frame, a corresponding virtual image with the leg-occlusion effect, and generates a video with the leg-occlusion effect according to the virtual images corresponding to the frames.
According to still other embodiments of the present disclosure, an apparatus for generating a virtual image is provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute the method for generating a virtual image in any one of the above embodiments based on instructions stored in the memory.
According to still further embodiments of the present disclosure, a non-volatile computer-readable storage medium is provided, on which a computer program is stored, the program, when executed by a processor, implementing the method for generating a virtual image in any one of the above embodiments.
Description of Drawings
The accompanying drawings, which constitute a part of the specification, describe embodiments of the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
The present disclosure can be understood more clearly from the following detailed description with reference to the accompanying drawings:
FIG. 1 shows a flowchart of some embodiments of the method for generating a virtual image of the present disclosure;
FIG. 2 shows a flowchart of some embodiments of step 130 in FIG. 1;
FIGS. 3a to 3c show schematic diagrams of some embodiments of the method for generating a virtual image of the present disclosure;
FIG. 4 shows a block diagram of some embodiments of the apparatus for generating a virtual image of the present disclosure;
FIG. 5 shows a block diagram of other embodiments of the apparatus for generating a virtual image of the present disclosure;
FIG. 6 shows a block diagram of still other embodiments of the apparatus for generating a virtual image of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specified, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
Meanwhile, it should be understood that, for convenience of description, the dimensions of the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or uses.
Techniques, methods, and devices known to those of ordinary skill in the relevant art may not be discussed in detail, but, where appropriate, such techniques, methods, and devices should be regarded as part of the specification.
In all the examples shown and discussed herein, any specific value should be interpreted as merely exemplary rather than limiting. Therefore, other examples of the exemplary embodiments may have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further discussed in subsequent drawings.
The inventors of the present disclosure found the following problem in the above related art: a virtual image of a human foot wearing a shoe is generated only through approximate modeling, which lacks real three-dimensional structural information, resulting in a poor virtual image.
In view of this, the present disclosure proposes a technical solution for generating a virtual image that synthesizes the virtual image from a real leg image and a three-dimensional shoe model, improving the quality of the virtual image.
As mentioned above, the position of a virtual human-leg model relative to the shoe model is fixed and cannot reflect the real three-dimensional structure of the leg (such as the user's actual trouser position, shape, posture, and leg position). This degrades the virtual image.
To address this technical problem, visual cues of the leg can be extracted from the real scene (for example, using a neural network model) and combined with the contour of the three-dimensional shoe model whose shoe-opening region has been made transparent, so as to accurately delineate the region of the virtual image that needs to be occluded. For example, the technical solution of the present disclosure can be implemented through the following embodiments.
FIG. 1 shows a flowchart of some embodiments of the method for generating a virtual image of the present disclosure.
As shown in FIG. 1, the generation method includes: step 110, acquiring a leg region; step 120, rendering a two-dimensional shoe image; step 130, determining the portion occluded by the leg; and step 140, generating a virtual image.
In step 110, the leg region in the to-be-processed image containing the leg and the foot is acquired.
In some embodiments, the to-be-processed image is input into a machine learning model to determine the leg region in the to-be-processed image. For example, the machine learning model may be a convolutional neural network module (for example, configured according to the Fast-SCNN model).
In this way, the backbone network for extracting the leg region is built on the lightweight Fast-SCNN model. A simple and efficient compound coefficient can be used to construct the network in a more structured manner, compressing the number of model parameters and speeding up training. Moreover, Fast-SCNN reduces the model's floating-point operations, improving computational performance.
In some embodiments, the machine learning model includes a convolutional neural network module and an SPP (Spatial Pyramid Pooling) module connected in sequence. For example, the SPP module includes a convolution processing module, an upsampling module, and a concatenation (Concat) module connected in sequence.
In some embodiments, the machine learning model includes a convolutional neural network module, a first SPP module, and a second SPP module. The convolutional neural network module is connected to the convolution processing module and the concatenation module of the first SPP module, and to the concatenation module of the second SPP module; the first SPP module is connected to the second SPP module.
In this way, the SPP module preserves complete contextual information well, avoiding misclassification in image processing. Moreover, the SPP module is more robust in recognizing small, inconspicuous targets and can attend to the different sub-regions containing them, thereby improving the accuracy of leg-region recognition.
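For illustration only, the grid-pooling operation at the heart of a spatial pyramid pooling module can be sketched as follows. The bin sizes and the use of average pooling here are assumptions made for the sketch, not the configuration of the disclosed embodiments:

```python
def spp_pool(feature_map, bin_sizes=(1, 2)):
    """Average-pool a 2D feature map over several grid resolutions and
    concatenate the pooled values into one fixed-length vector, so that
    both global context (coarse bins) and local sub-regions (fine bins)
    are represented."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for bins in bin_sizes:
        for by in range(bins):
            for bx in range(bins):
                # Integer bounds of this grid cell.
                y0, y1 = by * h // bins, (by + 1) * h // bins
                x0, x1 = bx * w // bins, (bx + 1) * w // bins
                cell = [feature_map[y][x]
                        for y in range(y0, y1) for x in range(x0, x1)]
                pooled.append(sum(cell) / len(cell))
    return pooled

fmap = [[1.0, 2.0], [3.0, 4.0]]
print(spp_pool(fmap))  # [2.5, 1.0, 2.0, 3.0, 4.0]
```

The first value is the global average (1x1 bin); the remaining four come from the 2x2 grid, which is what lets the module attend to small sub-regions.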
In some embodiments, the above machine learning model can be trained with a loss function configured using SoftMax Loss.
In step 120, the three-dimensional shoe model is rendered into a two-dimensional shoe image corresponding to the to-be-processed image according to the pose parameters of the foot in the to-be-processed image and the intrinsic parameters of the camera. For example, the intrinsic parameters of the camera are parameters related to the camera's own characteristics, such as its focal length and pixel size.
In some embodiments, a PnP (Perspective-n-Point) algorithm can be used to determine the pose parameters.
In some embodiments, a rendering tool such as OpenGL can be used to render the three-dimensional shoe model into the two-dimensional shoe image.
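The core of this rendering step is projecting each vertex of the three-dimensional shoe model through the estimated foot pose and the camera intrinsics. The following is a minimal pinhole-projection sketch; the rotation, translation, and intrinsic values are illustrative assumptions, and a real renderer such as OpenGL additionally handles rasterization, texturing, and depth testing:

```python
def project_point(p, R, t, fx, fy, cx, cy):
    """Project a 3D model point into pixel coordinates with a pinhole camera.
    R (3x3 rotation) and t (3-vector) come from the foot-pose estimate,
    e.g. a PnP solver; fx, fy, cx, cy are the camera intrinsics."""
    # Transform into camera coordinates: pc = R @ p + t
    pc = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
    # Perspective division, then mapping through focal length and principal point.
    u = fx * pc[0] / pc[2] + cx
    v = fy * pc[1] / pc[2] + cy
    return u, v

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# A point on the camera's optical axis projects onto the principal point.
print(project_point([0.0, 0.0, 2.0], identity, [0.0, 0.0, 0.0],
                    500, 500, 320, 240))  # (320.0, 240.0)
```

Applying this projection to every visible vertex of the shoe model yields the two-dimensional shoe image aligned with the to-be-processed photograph.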
In some embodiments, the shoe-opening region of the three-dimensional shoe model is detected, and a closed mesh is used as a baffle to cover the shoe-opening region; the portion covered by the closed mesh is then made transparent. For example, a transparent baffle mesh can be fitted to the three-dimensional shoe model and placed on the inner side of the shoe-opening region to make the shoe-opening region transparent.
In some embodiments, the position at which the closed mesh covers the shoe-opening region is determined according to a preset size of the portion of the shoe region that is not occluded by the leg. For example, the baffle can be moved a preset distance toward the sole so that the edge thickness of the shoe-opening region exceeds a threshold.
In this way, depending on the depth of the baffle position, the uncropped part of the shoe-opening region retains a certain thickness, which increases the sense of spatial depth in the virtual picture and enhances realism.
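Moving the baffle a preset distance toward the sole can be sketched as a simple translation of the baffle-mesh vertices along an assumed sole direction. The vertex list, direction vector, and distance below are illustrative placeholders rather than values from the disclosed embodiments:

```python
def offset_baffle(vertices, sole_dir, distance):
    """Translate the baffle-mesh vertices a preset distance toward the sole.
    sole_dir is assumed to be a unit vector pointing from the shoe opening
    toward the sole; moving the baffle deeper leaves a visible rim of the
    shoe opening, which gives the rendered opening its edge thickness."""
    return [[v[i] + distance * sole_dir[i] for i in range(3)] for v in vertices]

baffle = [[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
# Move the baffle 0.25 units downward (toward the sole along -y).
print(offset_baffle(baffle, [0.0, -1.0, 0.0], 0.25))
# [[0.0, 0.75, 0.0], [1.0, 0.75, 0.0]]
```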
In step 130, the portion of the shoe region occluded by the leg is determined according to the overlapping portion of the leg region and the shoe region in the two-dimensional shoe image.
In some embodiments, the leg region can be accurately segmented using a deep learning model, so that the boundary region between the leg and the shoe model can be determined and the occluded part of the shoe-opening region can be accurately obtained as the cropping region. After rendering and on-screen display, a visual effect of virtual occlusion is produced. For example, step 130 can be implemented according to the embodiment in FIG. 2.
FIG. 2 shows a flowchart of some embodiments of step 130 in FIG. 1.
As shown in FIG. 2, step 130 includes: step 1310, determining the inner contour and the outer contour of the shoe; and step 1320, determining the portion occluded by the leg.
In step 1310, the outer contour of the shoe is determined according to the position of the shoe-body region in the two-dimensional shoe image, and the inner contour of the shoe is determined according to the position of the shoe-opening region in the two-dimensional image.
In some embodiments, the shoe-body region and the shoe-opening region used for determining the inner contour and the outer contour can be determined through the embodiment in FIG. 3a.
FIG. 3a shows a schematic diagram of some embodiments of the method for generating a virtual image of the present disclosure.
As shown in FIG. 3a, after the shoe-opening region in the three-dimensional shoe model is made transparent, the model is rendered into a two-dimensional shoe image. The shoe-body region 31 and the shoe-opening region 32 are determined according to the binary image of the two-dimensional shoe image.
After the shoe-body region 31 and the shoe-opening region 32 are determined, the outer contour and the inner contour can be determined, and the occluded portion can then be determined through the remaining steps in FIG. 2.
In step 1320, the portion of the shoe region occluded by the leg is determined according to the positions of the intersection points of the leg region's contour with the outer contour, together with the inner contour.
In some embodiments, the portion occluded by the leg can be determined through the embodiments in FIGS. 3b and 3c.
FIG. 3b shows a schematic diagram of some embodiments of the method for generating a virtual image of the present disclosure.
As shown in FIG. 3b, a binary segmentation-mask image of the leg region 30 can be inferred using the neural network module. Combining the binary image of the shoe in FIG. 3a with the binary image of the leg in FIG. 3b, the intersection points of the contour of the leg region 30 with the contour of the shoe-body region 31 can be determined.
FIG. 3c shows a schematic diagram of some embodiments of the method for generating a virtual image of the present disclosure.
As shown in FIG. 3c, the contour of the shoe-body region 31 is the outer contour 311, and the contour of the shoe-opening region 32 is the inner contour 321. The intersection points of the contour of the leg region 30 with the outer contour 311 are 3a and 3b; on the inner contour 321, the point closest to intersection point 3a is 3d, and the point closest to intersection point 3b is 3c.
Connecting the points 3a, 3b, 3c, and 3d forms a closed region (on the left side of the shoe body), and this closed region is determined as the portion of the shoe region occluded by the leg.
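For illustration, the construction of this closed region from the intersection points and their nearest inner-contour points can be sketched as follows. The contours are simplified here to short point lists; an actual implementation would operate on the dense contours extracted from the binary masks:

```python
def closest_point(point, contour):
    """Return the contour point nearest to the given point
    (squared Euclidean distance; ties broken by first occurrence)."""
    return min(contour,
               key=lambda q: (q[0] - point[0]) ** 2 + (q[1] - point[1]) ** 2)

def occlusion_polygon(intersections, inner_contour):
    """Build the closed occlusion region: each intersection of the leg
    contour with the outer contour is paired with its nearest point on
    the inner contour, mirroring points 3a, 3b and 3d, 3c in FIG. 3c."""
    nearest = [closest_point(p, inner_contour) for p in intersections]
    # Order the vertices so the polygon closes: 3a -> 3b -> 3c -> 3d.
    return [intersections[0], intersections[1], nearest[1], nearest[0]]

inner = [(2, 1), (3, 1), (4, 1), (5, 1)]  # simplified inner contour 321
print(occlusion_polygon([(1, 0), (6, 0)], inner))
# [(1, 0), (6, 0), (5, 1), (2, 1)]
```

The returned quadrilateral is the region that is cropped from the rendered shoe so the real leg shows through.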
In some embodiments, the portion of the shoe region occluded by the leg can also be determined according to the closed region formed by the points 3a and 3b together with the intersection points of the contour of the leg region 30 and the inner contour 321.
Once the portion of the shoe region occluded by the leg has been determined, the virtual image can be generated through step 140 in FIG. 1.
In step 140, a composite image of the to-be-processed image and the two-dimensional shoe image is rendered according to the portion occluded by the leg, generating a virtual image with a leg-occlusion effect.
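For illustration only, the per-pixel compositing rule of step 140 can be sketched as follows: a shoe pixel is drawn only where the rendered shoe has coverage and the pixel does not belong to the leg-occluded portion, so the real leg stays in front of the shoe. The tiny 2x2 "images" are illustrative placeholders:

```python
def composite(photo, shoe_img, shoe_mask, occluded_mask):
    """Composite the rendered shoe over the photo. Shoe pixels are kept
    only where shoe_mask is set and occluded_mask is not, i.e. the part
    of the shoe hidden behind the leg is cropped away."""
    h, w = len(photo), len(photo[0])
    return [[shoe_img[y][x] if shoe_mask[y][x] and not occluded_mask[y][x]
             else photo[y][x]
             for x in range(w)] for y in range(h)]

photo    = [["P", "P"], ["P", "P"]]   # to-be-processed image
shoe     = [["S", "S"], ["S", "S"]]   # rendered 2D shoe image
mask     = [[1, 1], [0, 1]]           # where the rendered shoe has coverage
occluded = [[1, 0], [0, 0]]           # shoe pixels hidden behind the leg
print(composite(photo, shoe, mask, occluded))
# [['P', 'S'], ['P', 'S']]
```

The top-left shoe pixel is suppressed by the occlusion mask, producing the leg-occlusion effect in the final virtual image.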
In some embodiments, the to-be-processed image is each frame of a video. A video with a leg-occlusion effect can be generated according to the virtual images generated for the frames.
In the above embodiments, the virtually occluded portion is determined accurately according to the acquired visual cues of the real leg combined with the position of the shoe in the two-dimensional shoe image. In this way, a virtual image can be synthesized from a real leg image and a three-dimensional shoe model, improving the quality of the virtual image.
FIG. 4 shows a block diagram of some embodiments of the apparatus for generating a virtual image of the present disclosure.
As shown in FIG. 4, the virtual image generation apparatus 4 includes a determination unit 41 and a processing unit 42.
The determination unit 41 acquires the leg region in the to-be-processed image containing the leg and the foot, and determines the portion of the shoe region occluded by the leg according to the overlapping portion of the leg region and the shoe region in the two-dimensional shoe image.
In some embodiments, the determination unit 41 determines the outer contour of the shoe according to the position of the shoe-body region in the two-dimensional shoe image, determines the inner contour of the shoe according to the position of the shoe-opening region in the two-dimensional image, and determines the portion of the shoe region occluded by the leg according to the intersection points of the leg region's contour with the outer contour, together with the inner contour.
In some embodiments, the determination unit 41 determines, on the inner contour, the points closest to the intersection points, and determines the portion of the shoe region occluded by the leg according to the intersection points and the closest points.
In some embodiments, the determination unit 41 inputs the to-be-processed image into a machine learning model to determine the leg region in the to-be-processed image.
In some embodiments, the machine learning model includes a convolutional neural network module and a spatial pyramid pooling module connected in sequence.
In some embodiments, the convolutional neural network module is configured according to the Fast-SCNN model.
The processing unit 42 renders the three-dimensional shoe model into the two-dimensional shoe image corresponding to the to-be-processed image according to the pose parameters of the foot in the to-be-processed image and the intrinsic parameters of the camera, and renders a composite image of the to-be-processed image and the two-dimensional shoe image according to the portion occluded by the leg, generating a virtual image with a leg-occlusion effect.
In some embodiments, the processing unit 42 makes the shoe-opening region of the three-dimensional shoe model transparent and renders the transparency-processed three-dimensional shoe model into the two-dimensional shoe image. The determination unit 41 determines the shoe-body region and the shoe-opening region according to the binary image of the two-dimensional shoe image.
In some embodiments, the processing unit 42 detects the shoe-opening region of the three-dimensional shoe model, covers it with a closed mesh, and makes the portion covered by the closed mesh transparent.
In some embodiments, the determination unit 41 determines the position at which the closed mesh covers the shoe-opening region according to a preset size of the portion of the shoe region that is not occluded by the leg.
In some embodiments, the to-be-processed image is each frame of a video; the processing unit 42 generates a video with a leg-occlusion effect according to the virtual images generated for the frames.
FIG. 5 shows a block diagram of other embodiments of the apparatus for generating a virtual image of the present disclosure.
As shown in FIG. 5, the virtual image generation apparatus 5 of this embodiment includes: a memory 51 and a processor 52 coupled to the memory 51, the processor 52 being configured to execute the method for generating a virtual image in any one of the embodiments of the present disclosure based on instructions stored in the memory 51.
The memory 51 may include, for example, a system memory, a fixed non-volatile storage medium, and the like. The system memory stores, for example, an operating system, application programs, a boot loader, a database, and other programs.
FIG. 6 shows a block diagram of still other embodiments of the apparatus for generating a virtual image of the present disclosure.
As shown in FIG. 6, the virtual image generation apparatus 6 of this embodiment includes: a memory 610 and a processor 620 coupled to the memory 610, the processor 620 being configured to execute the method for generating a virtual image in any one of the foregoing embodiments based on instructions stored in the memory 610.
The memory 610 may include, for example, a system memory, a fixed non-volatile storage medium, and the like. The system memory stores, for example, an operating system, application programs, a boot loader, and other programs.
The virtual image generation apparatus 6 may further include an input/output interface 630, a network interface 640, a storage interface 650, and the like. These interfaces 630, 640, and 650, as well as the memory 610 and the processor 620, may be connected, for example, through a bus 660. The input/output interface 630 provides connection interfaces for input/output devices such as a display, a mouse, a keyboard, a touch screen, a microphone, and speakers. The network interface 640 provides connection interfaces for various networked devices. The storage interface 650 provides connection interfaces for external storage devices such as SD cards and USB flash drives.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Therefore, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, and optical storage) containing computer-usable program code.
So far, the method for generating a virtual image, the apparatus for generating a virtual image, and the non-volatile computer-readable storage medium according to the present disclosure have been described in detail. Some details well known in the art have not been described in order to avoid obscuring the concept of the present disclosure. Based on the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed herein.
The methods and systems of the present disclosure may be implemented in many ways, for example, in software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless otherwise specified. Furthermore, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, these programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers recording media storing programs for executing the methods according to the present disclosure.
Although some specific embodiments of the present disclosure have been described in detail by way of examples, those skilled in the art should understand that the above examples are for illustration only and are not intended to limit the scope of the present disclosure. Those skilled in the art should understand that the above embodiments may be modified without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (13)

  1. A method for generating a virtual image, comprising:
    acquiring a leg region in a to-be-processed image containing a leg and a foot;
    rendering a three-dimensional shoe model into a two-dimensional shoe image corresponding to the to-be-processed image according to pose parameters of the foot in the to-be-processed image and intrinsic parameters of a camera;
    determining, according to an overlapping portion of the leg region and a shoe region in the two-dimensional shoe image, a portion of the shoe region that is occluded by the leg;
    rendering a composite image of the to-be-processed image and the two-dimensional shoe image according to the portion occluded by the leg, so as to generate a virtual image with a leg occlusion effect.
  2. The generation method according to claim 1, wherein the determining, according to the overlapping portion of the leg region and the shoe region in the two-dimensional shoe image, the portion of the shoe region that is occluded by the leg comprises:
    determining an outer contour of the shoe in the two-dimensional shoe image according to a position of a shoe body region in the two-dimensional shoe image, and determining an inner contour of the shoe in the two-dimensional shoe image according to a position of a shoe opening region in the two-dimensional image;
    determining the portion of the shoe region that is occluded by the leg according to intersection points of a contour of the leg region with the outer contour, and the inner contour.
  3. The generation method according to claim 2, wherein the determining the portion of the shoe region that is occluded by the leg according to the intersection points of the contour of the leg region with the outer contour, and the inner contour comprises:
    determining, on the inner contour, points closest to the intersection points;
    determining the portion of the shoe region that is occluded by the leg according to the intersection points and the closest points.
  4. The generation method according to claim 1, wherein the rendering the three-dimensional shoe model into the two-dimensional shoe image corresponding to the to-be-processed image comprises:
    performing transparency processing on a shoe opening region of the three-dimensional shoe model;
    rendering the three-dimensional shoe model after the transparency processing into the two-dimensional shoe image;
    determining a shoe body region of the two-dimensional shoe image and a shoe opening region of the two-dimensional shoe image according to a binary image of the two-dimensional shoe image.
  5. The generation method according to claim 4, wherein the performing transparency processing on the shoe opening region of the three-dimensional shoe model comprises:
    detecting the shoe opening region of the three-dimensional shoe model, and covering the shoe opening region of the three-dimensional shoe model with a closed mesh;
    performing transparency processing on the portion covered by the closed mesh.
  6. The generation method according to claim 5, wherein the detecting the shoe opening region of the three-dimensional shoe model and covering the shoe opening region of the three-dimensional shoe model with the closed mesh comprises:
    determining, according to a preset size of a portion of the shoe region that is not occluded by the leg, a position at which the closed mesh covers the shoe opening region of the three-dimensional shoe model.
  7. The generation method according to any one of claims 1-6, wherein the acquiring the leg region in the to-be-processed image containing the leg and the foot comprises:
    inputting the to-be-processed image into a machine learning model to determine the leg region in the to-be-processed image.
  8. The generation method according to claim 7, wherein
    the machine learning model comprises a convolutional neural network module and a spatial pyramid pooling module connected in sequence.
  9. The generation method according to claim 8, wherein
    the convolutional neural network module is configured according to the fast segmentation convolutional neural network (Fast-SCNN) model.
  10. The generation method according to any one of claims 1-6, wherein the to-be-processed image is each frame image in a video,
    the generating the virtual image with the leg occlusion effect comprises:
    generating a virtual image with the leg occlusion effect corresponding to each of the frame images;
    and the method further comprises:
    generating a video with the leg occlusion effect according to the virtual images corresponding to the frame images.
  11. A device for generating a virtual image, comprising:
    a determining unit configured to acquire a leg region in a to-be-processed image containing a leg and a foot, and to determine, according to an overlapping portion of the leg region and a shoe region in a two-dimensional shoe image, a portion of the shoe region that is occluded by the leg;
    a processing unit configured to render a three-dimensional shoe model into the two-dimensional shoe image corresponding to the to-be-processed image according to pose parameters of the foot in the to-be-processed image and intrinsic parameters of a camera, and to render a composite image of the to-be-processed image and the two-dimensional shoe image according to the portion occluded by the leg, to generate a virtual image with a leg occlusion effect.
  12. A device for generating a virtual image, comprising:
    a memory; and
    a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the method for generating a virtual image according to any one of claims 1-10.
  13. A non-volatile computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method for generating a virtual image according to any one of claims 1-10.
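The rendering step of claim 1 maps the three-dimensional shoe model into the image plane through the foot pose and the camera's intrinsic parameters. The claim does not fix a particular projection model; the sketch below assumes the usual pinhole model, and the function and parameter names are illustrative, not taken from the disclosure:

```python
import numpy as np

def project_shoe_model(vertices, rotation, translation, K):
    """Project 3-D shoe-model vertices into pixel coordinates using the foot
    pose (rotation, translation) and the camera intrinsic matrix K, under a
    pinhole-camera assumption."""
    cam = vertices @ rotation.T + translation   # model space -> camera space
    pix = cam @ K.T                             # camera space -> homogeneous pixels
    return pix[:, :2] / pix[:, 2:3]             # perspective divide -> (u, v)
```

A renderer would rasterize the model's triangles over these projected vertices; the projection itself is what ties the 2-D shoe image to the pose and intrinsic parameters named in the claim.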
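Claims 2 and 3 locate the occluded part of the shoe from the points where the leg contour crosses the shoe's outer contour, paired with the closest points on the inner (shoe-opening) contour. A rough numpy sketch of that geometry, with vertex-level nearest-point matching as a simplifying assumption (the disclosure does not prescribe this exact construction):

```python
import numpy as np

def nearest_contour_point(contour, point):
    """Return the contour vertex closest to `point` (vertex-level approximation
    of the claim's 'closest point on the inner contour')."""
    d = np.linalg.norm(contour - point, axis=1)
    return contour[int(np.argmin(d))]

def occluded_shoe_part(leg_outer_intersections, inner_contour):
    """Approximate the leg-occluded part of the shoe region as the polygon
    bounded by the leg/outer-contour intersection points and, for each of
    them, the nearest point on the shoe-opening (inner) contour."""
    nearest = [nearest_contour_point(inner_contour, p)
               for p in leg_outer_intersections]
    # Close the polygon: intersection points first, then the matched
    # inner-contour points in reverse order.
    return np.vstack([leg_outer_intersections, nearest[::-1]])
```

Pixels inside this polygon would then be drawn from the to-be-processed image rather than the shoe image, producing the leg-occlusion effect of claim 1's compositing step.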
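Claims 4-6 render the shoe with a transparent shoe opening and then split the binary image into shoe-body and shoe-opening regions. One way to realize that split, sketched here under the assumption that the transparent opening shows up as a hole enclosed by opaque shoe-body pixels (so a flood fill from the image border separates outside background from the enclosed hole):

```python
import numpy as np
from collections import deque

def split_shoe_regions(alpha):
    """Given the alpha channel of the rendered 2-D shoe image, return
    (shoe_body, shoe_opening) boolean masks. Background reachable from the
    border is flood-filled; any remaining transparent pixels are the
    enclosed shoe-opening hole."""
    body = alpha > 0
    h, w = body.shape
    outside = np.zeros_like(body)
    # Seed the flood fill with every transparent border pixel.
    q = deque((r, c) for r in range(h) for c in range(w)
              if (r in (0, h - 1) or c in (0, w - 1)) and not body[r, c])
    for r, c in q:
        outside[r, c] = True
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not body[nr, nc] and not outside[nr, nc]:
                outside[nr, nc] = True
                q.append((nr, nc))
    opening = ~body & ~outside
    return body, opening
```

In practice a library routine such as a hole-filling morphology operation would replace the hand-rolled flood fill; the point is only that the binary image alone suffices to recover both regions named in claim 4.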
PCT/CN2021/119751 2020-10-21 2021-09-23 Virtual image generation method and apparatus WO2022083389A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011134938.4A CN112330784A (en) 2020-10-21 2020-10-21 Virtual image generation method and device
CN202011134938.4 2020-10-21

Publications (1)

Publication Number Publication Date
WO2022083389A1 true WO2022083389A1 (en) 2022-04-28

Family ID: 74311276

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/119751 WO2022083389A1 (en) 2020-10-21 2021-09-23 Virtual image generation method and apparatus

Country Status (2)

Country Link
CN (1) CN112330784A (en)
WO (1) WO2022083389A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330784A (en) * 2020-10-21 2021-02-05 北京沃东天骏信息技术有限公司 Virtual image generation method and device
CN113034655A (en) * 2021-03-11 2021-06-25 北京字跳网络技术有限公司 Shoe fitting method and device based on augmented reality and electronic equipment
JP7508145B1 (en) 2023-04-25 2024-07-01 株式会社wearcoord Coordination device, coordination system, coordination method, and coordination program

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2647305A1 (en) * 2010-12-03 2013-10-09 Alu Group, S.L. Method for virtually trying on footwear
CN110751716A (en) * 2019-05-08 2020-02-04 叠境数字科技(上海)有限公司 Virtual shoe fitting method based on single-view-angle RGBD sensor
CN111369686A (en) * 2020-03-03 2020-07-03 足购科技(杭州)有限公司 AR imaging virtual shoe fitting method and device capable of processing local shielding objects
CN111507806A (en) * 2020-04-23 2020-08-07 北京百度网讯科技有限公司 Virtual shoe fitting method, device, equipment and storage medium
CN112330784A (en) * 2020-10-21 2021-02-05 北京沃东天骏信息技术有限公司 Virtual image generation method and device

Also Published As

Publication number Publication date
CN112330784A (en) 2021-02-05

Similar Documents

Publication Publication Date Title
WO2022083389A1 (en) Virtual image generation method and apparatus
JP7249390B2 (en) Method and system for real-time 3D capture and live feedback using a monocular camera
JP6742405B2 (en) Head-mounted display with facial expression detection function
JP5818773B2 (en) Image processing apparatus, image processing method, and program
JP4950787B2 (en) Image processing apparatus and method
KR101556992B1 (en) 3d scanning system using facial plastic surgery simulation
KR101339900B1 (en) Three dimensional montage generation system and method based on two dimensinal single image
US8933928B2 (en) Multiview face content creation
KR101560508B1 (en) Method and arrangement for 3-dimensional image model adaptation
CN104123749A (en) Picture processing method and system
JP2018530045A (en) Method for 3D reconstruction of objects from a series of images, computer-readable storage medium and apparatus configured to perform 3D reconstruction of objects from a series of images
WO2021078179A1 (en) Image display method and device
JP2011521357A (en) System, method and apparatus for motion capture using video images
JP2013235537A (en) Image creation device, image creation program and recording medium
US11138743B2 (en) Method and apparatus for a synchronous motion of a human body model
WO2012096907A1 (en) Mesh animation
CN115496863B (en) Short video generation method and system for scene interaction of movie and television intelligent creation
KR102570897B1 (en) Apparatus and method for generating 3-dimentional model
CN108734772A (en) High accuracy depth image acquisition methods based on Kinect fusion
KR101696007B1 (en) Method and device for creating 3d montage
WO2024088061A1 (en) Face reconstruction and occlusion region recognition method, apparatus and device, and storage medium
CN111783497A (en) Method, device and computer-readable storage medium for determining characteristics of target in video
JP7326965B2 (en) Image processing device, image processing program, and image processing method
CN107369209A (en) A kind of data processing method
WO2020183598A1 (en) Learning data generator, learning data generating method, and learning data generating program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21881808

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21881808

Country of ref document: EP

Kind code of ref document: A1