CN111462337B - Image processing method, device and computer readable storage medium


Info

Publication number
CN111462337B
CN111462337B
Authority
CN
China
Prior art keywords
image
key point
human body
virtual
distance
Prior art date
Legal status
Active
Application number
CN202010231304.4A
Other languages
Chinese (zh)
Other versions
CN111462337A (en)
Inventor
赵琦
颜忠伟
毕铎
王科
Current Assignee
MIGU Culture Technology Co Ltd
Original Assignee
MIGU Culture Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by MIGU Culture Technology Co Ltd
Priority to CN202010231304.4A
Publication of CN111462337A
Application granted
Publication of CN111462337B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/50 Depth or shape recovery
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands

Abstract

The invention discloses an image processing method, an image processing device, and a computer readable storage medium, which relate to the technical field of communications and are used for improving the display effect of combined AR (augmented reality) images of a real user and a virtual object. The method comprises the following steps: acquiring a projection image of a user on a display screen; extracting the human body contour in the projection image; obtaining a virtual image of a virtual object according to the human body contour, wherein the matching degree of the contour of the virtual object and the human body contour meets a first preset requirement; determining a relative positional relationship between the virtual image and the projection image; and obtaining an AR image according to the projection image, the virtual image, and the relative positional relationship. The embodiment of the invention can improve the display effect of combined AR images of the real user and the virtual object.

Description

Image processing method, device and computer readable storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image processing method, an image processing device, and a computer readable storage medium.
Background
In AR (Augmented Reality) photography, the height and motion of a virtual object (e.g., a virtual character) are fixed. However, when users pose for a group photo with the virtual character, different users have different heights and shooting postures. In this case, how to improve the sense of realism of a real user and a virtual character photographed together, and thereby the display effect of the AR image, is a technical problem to be solved.
Disclosure of Invention
The embodiment of the invention provides an image processing method, an image processing device, and a computer readable storage medium, which are used for improving the display effect of combined AR (augmented reality) images of a real user and a virtual object.
In a first aspect, an embodiment of the present invention provides an image processing method, including:
acquiring a projection image of a user on a display screen;
extracting the human body outline in the projection image;
obtaining a virtual image of a virtual object according to the human body contour, wherein the matching degree of the contour of the virtual object and the human body contour meets a first preset requirement;
determining a relative positional relationship between the virtual image and the projected image;
and obtaining an AR image according to the projection image, the virtual image and the relative position relation.
Wherein the extracting the human body contour in the projection image includes:
respectively carrying out image conversion on the projection images to obtain at least one gray scale image;
calculating the average value of the gray level images to obtain a background gray level image;
and calculating the difference between each gray level image and the background gray level image to obtain the human body outline of the user.
Wherein the virtual object comprises a virtual character; the obtaining a virtual image of a virtual object according to the human body contour includes:
determining a first key point on the human body contour, wherein the first key point at least comprises a head key point and a hand key point;
determining a target contour of the virtual object in the candidate images of the virtual object;
determining a second keypoint on the target contour corresponding to the first keypoint, wherein the second keypoint comprises at least a head keypoint and a hand keypoint;
calculating the similarity between the human body contour and the target contour based on the first key point and the second key point;
and if the similarity meets a second preset requirement, taking the candidate image as the virtual image.
Wherein the calculating the similarity between the human body contour and the target contour based on the first key point and the second key point includes:
for each first target key point in the first key points, calculating Euclidean distance between the first target key point and a second target key point, wherein the second target key point is a key point corresponding to the first target key point in the second key points;
and calculating the similarity between the human body contour and the target contour based on the obtained Euclidean distance.
Wherein the calculating the similarity between the human body contour and the target contour based on the obtained Euclidean distance includes:
multiplying each Euclidean distance by a corresponding weight to obtain a first value corresponding to each Euclidean distance;
adding the first numerical values to obtain the similarity between the human body contour and the target contour;
wherein the method further comprises:
and presetting the weight, wherein the weight of the Euclidean distance obtained based on the head key point and/or the hand key point is larger than the weight of the Euclidean distance obtained based on other key points.
Wherein the determining the relative positional relationship between the virtual image and the projection image includes:
determining a distance between an actual photographing position of the user and the projection image;
and determining a depth distance between the virtual image and the projection image according to the distance.
Wherein said determining a depth distance between said virtual image and said projected image from said distance comprises:
determining a depth distance between the virtual image and the projected image from the distance using the following formula:
Δd = (Δθ × D²) / P
wherein Δd represents the depth distance, Δθ represents the binocular parallax of the user, D represents the distance between the actual photographing position of the user and the projected image of the user on the display screen, P represents the distance between the eyes of the user, and Δθ and P are constants.
In a second aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a program stored on the memory and executable on the processor; the processor is configured to read a program in a memory to implement the steps in the image processing method according to the first aspect.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps in the image processing method according to the first aspect.
In the embodiment of the invention, the human body contour is extracted from the projection image of the user on the display screen, and the virtual image of the virtual object is obtained according to the human body contour. Then, according to the relative positional relationship between the virtual object and the projection image, the projection image and the virtual image are AR-combined. Because the matching degree between the contour of the virtual object and the human body contour meets the first preset requirement, and the relative positional relationship between the virtual object and the projection image is taken into account during combination, the AR image obtained by the embodiment of the invention achieves a higher matching degree between the shape and posture of the virtual object and those of the user, thereby enhancing the realism of the image and improving the display effect of the AR image.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort to a person of ordinary skill in the art.
Fig. 1 is a flowchart of an image processing method provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of binocular imaging of a person;
FIG. 3 is a mathematical schematic diagram of the principle of binocular imaging;
FIG. 4 is a first schematic diagram of photographing according to an embodiment of the present invention;
FIG. 5 is a second schematic diagram of photographing according to an embodiment of the present invention;
fig. 6 is a block diagram of an image processing apparatus provided in an embodiment of the present invention;
fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, fig. 1 is a flowchart of an image processing method according to an embodiment of the present invention, as shown in fig. 1, including the following steps:
step 101, obtaining a projection image of a user on a display screen.
When a user wants to take a picture, a camera is usually used, so that an image of the user is displayed on a display screen. In the embodiment of the present invention, this displayed image of the user is referred to as a projection image. Because the projection image is to be AR-combined with a virtual object (such as a virtual character or a virtual article), it needs to include the human body contour of the user, and preferably the complete human body contour. The human body contour of the user conveys information such as the user's height and standing posture.
Step 102, extracting the human body outline in the projection image.
In the embodiment of the invention, a plurality of projection images of the user are continuously acquired, and the human body contour is then determined based on these projection images. Specifically, in this step, each of the obtained projection images is converted to obtain at least one gray-scale image, each projection image having a corresponding gray-scale image. The average of the gray-scale images is then calculated to obtain a background gray-scale image. Finally, the difference between each gray-scale image and the background gray-scale image is calculated to obtain the human body contour of the user. In this way, the obtained human body contour information is more accurate.
Taking continuous shooting of five images of the user as an example, the five images are converted into gray-scale images, denoted f_gi(x, y), i = 1, 2, 3, 4, 5. The gray-scale images of the five images are added and averaged according to the following formula (1), giving the gray-scale image of the background image, denoted f_b(x, y):
f_b(x, y) = (1/5) × [f_g1(x, y) + f_g2(x, y) + f_g3(x, y) + f_g4(x, y) + f_g5(x, y)]   (1)
Then, each gray-scale image is differenced from the gray-scale image of the background image, so that the human body contour information can be obtained, which can be expressed as formula (2):
f_d(x, y) = |f_gi(x, y) - f_b(x, y)|   (2)
wherein f_d(x, y) represents the human body contour information.
Step 103, obtaining a virtual image of a virtual object according to the human body contour, wherein the matching degree of the contour of the virtual object and the human body contour meets a first preset requirement.
The first preset requirement may be that the matching degree is greater than a certain preset value, and the preset value may be set according to actual needs.
Taking the virtual object being a virtual character as an example, all images of the virtual character in a virtual-character library can be searched and compared against the current human body contour information to find the virtual-character posture most similar to the human body contour. The user's height and hand posture are considered during matching. When the Euclidean distance is used to score the similarity of the human body contours, the weights of the head posture and the hand posture are appropriately increased to emphasize the similarity of the head and the hands. By matching against all the images, the image of the virtual character most similar to the user's height and posture is found.
Specifically, in this step, a virtual image of the virtual object may be acquired according to the following procedure:
step 1031, determining a first key point on the human body contour, wherein the first key point at least comprises a head key point and a hand key point.
Here, the first key point can be determined on the contour of the human body by means of a marking. Of course, the first keypoints may also comprise other keypoints on the human body contour.
Step 1032, determining a target contour of the virtual object in the candidate image of the virtual object.
In practical applications, multiple images of multiple virtual objects may be pre-stored, with different virtual objects having different heights, postures, and the like. These images are referred to herein as candidate images of the virtual object. Once the user selects the virtual object for the group photo, the candidate images of that virtual object can be obtained directly from the pre-stored images.
Taking a virtual object as an example of a virtual character, the target contour determined herein is the human body contour of the virtual character. The determination manner of the human body outline of the virtual object is not limited in the embodiment of the invention.
Step 1033, corresponding to the first key point, determining a second key point on the target contour, wherein the second key point at least comprises a head key point and a hand key point.
"corresponding to the first key point" means that the key point, i.e., the second key point, is determined at a corresponding position of the human body contour of the virtual character according to the position of the first key point in the human body contour. In this way, the obtained height, posture and the like of the virtual object can be more similar to the height and posture of the user. Optionally, the second keypoints may also include keypoints of other positions.
Step 1034, calculating the similarity between the human body contour and the target contour based on the first key point and the second key point.
In this step, the euclidean distance between the key points is mainly calculated, and then the similarity between the human body contour and the target contour is calculated according to the euclidean distance.
When the Euclidean distance is calculated, the calculation is performed based on two corresponding key points on the human body contour and the target contour. Specifically, for each first target key point in the first key points, a euclidean distance between the first target key point and a second target key point is calculated, wherein the second target key point is a key point corresponding to the first target key point in the second key points. Wherein the first target keypoint is any one of the first keypoints. Then, based on the obtained euclidean distance, a similarity between the human body contour and the target contour is calculated.
And multiplying each Euclidean distance by a corresponding weight to obtain a first value corresponding to each Euclidean distance, and adding the first values to obtain the similarity between the human body contour and the target contour.
In the embodiment of the present invention, the weights may be preset, where the weight of the Euclidean distance obtained based on the head key point and/or the hand key point is greater than the weight of the Euclidean distance obtained based on other key points.
In the embodiment of the invention, by increasing the weights corresponding to the head key points and hand key points, the height, posture, and the like of the obtained virtual object are made more similar to those of the user, further improving the display effect of the image.
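A minimal sketch of this weighted Euclidean-distance score is shown below; the key-point names and weight values are illustrative assumptions, the patent only requiring that head and hand weights exceed the others:

```python
import math

def weighted_contour_score(first_keypoints, second_keypoints, weights):
    """Weighted sum of Euclidean distances between corresponding key points.

    first_keypoints / second_keypoints: dicts mapping a key-point name
    (e.g. "head", "left_hand") to its (x, y) coordinate on the human body
    contour and the target contour, respectively.
    """
    score = 0.0
    for name, (x1, y1) in first_keypoints.items():
        x2, y2 = second_keypoints[name]        # corresponding second key point
        dist = math.hypot(x1 - x2, y1 - y2)    # Euclidean distance
        score += weights[name] * dist          # "first value" for this key point
    return score

# Illustrative weights: head and hand key points dominate the score.
weights = {"head": 3.0, "left_hand": 2.0, "right_hand": 2.0,
           "left_knee": 1.0, "right_knee": 1.0}
```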
And step 1035, if the similarity meets a second preset requirement, taking the candidate image as the virtual image.
The similarity meeting the second preset requirement may be that the similarity is greater than a certain preset value, and the preset value may be set according to actual needs.
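Putting steps 1031 to 1035 together, candidate selection might then look like the sketch below, which reuses the hypothetical weighted_contour_score above and reads the second preset requirement as the best candidate's weighted distance falling within a threshold (one possible reading; the comparison direction against the preset value is left open above):

```python
def pick_virtual_image(first_keypoints, candidates, weights, threshold):
    """Select the candidate image of the virtual object whose contour best
    matches the user's contour key points.

    candidates: iterable of (candidate_image, second_keypoints) pairs drawn
    from the pre-stored images of the virtual object."""
    best_score, best_image = float("inf"), None
    for image, second_keypoints in candidates:
        score = weighted_contour_score(first_keypoints, second_keypoints, weights)
        if score < best_score:
            best_score, best_image = score, image
    # Second preset requirement: the match must be close enough (assumed reading).
    return best_image if best_score <= threshold else None
```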
Step 104, determining a relative position relation between the virtual image and the projection image.
In an embodiment of the present invention, the relative positional relationship may be represented by a depth distance between the virtual image and the projection image.
Specifically, in this step, a distance between the actual photographing position of the user and the projection image is determined, and then a depth distance between the virtual image and the projection image is determined according to the distance.
As shown in fig. 2, a principle diagram of binocular imaging of a person is shown. Because there is a distance of about 60 mm between a person's two eyes, the left and right eyes view an object from slightly different angles, so the images presented on the two retinas differ. The brain judges the spatial position of the object from this difference, which is how people perceive objects stereoscopically.
As shown in fig. 3, a mathematical representation of the principle of binocular imaging is provided. Referring to fig. 3, the geometric relationship between binocular parallax and the spatial position of an object is as shown in formula (3):
Δθ ≈ (P × Δd) / D²   (3)
wherein P is the distance between a person's two eyes, D is the viewing distance, and Δd is the relative depth of the object. From the above formula, the functional relationship between binocular parallax and the relative depth (depth distance) of the object can be obtained.
As shown in fig. 4, when the user performs photographing, the standing position, that is, the actual photographing position, is determined, and then the distance D between the actual photographing position of the user and the projection image is determined. Then, the position of the virtual image is dynamically adjusted according to the projection image of the user on the display screen so as to better present the stereoscopic viewing angle.
Specifically, the depth distance between the virtual image and the projected image is determined according to the following formula (4):
Δd = (Δθ × D²) / P   (4)
wherein Δd represents the depth distance, Δθ represents the binocular parallax of the user, D represents the distance between the actual photographing position of the user and the projection image of the user on the display screen, P represents the distance between the eyes of the user, and Δθ and P are constants.
When the D value is determined, a distance between a certain point of the human body and a point corresponding to the certain point in the projection image can be used as the D value. Δd may be a distance between a certain point in the projected image and a certain point in the virtual image, such as a point on the user's toe and a point on the virtual character's toe in the projected image, etc.
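For illustration, formula (4) translates directly into a one-line computation; the default constants below (Δθ = 0.01 rad, P = 0.06 m) are placeholders assumed for the example:

```python
def depth_distance(d, delta_theta=0.01, p=0.06):
    """Δd = (Δθ × D²) / P, i.e. formula (4).

    d: distance D between the user's actual photographing position and the
       projection image on the display screen, in metres.
    delta_theta: binocular parallax Δθ, a constant (placeholder value, radians).
    p: distance P between the user's eyes, a constant (~0.06 m per fig. 2).
    """
    return delta_theta * d ** 2 / p

# Example: standing 2 m from the screen gives 0.01 * 2**2 / 0.06 ≈ 0.67 m.
```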
In the final AR composite photo, what most obviously affects the composite effect is the relative depth distance between the user's projection and the virtual character, and this depth distance is dynamically adjusted based on formula (4) to achieve the best photographing effect.
For example, when the user performs AR photographing with the virtual character, to ensure the best visual effect, the appearance position of the virtual character needs to be adjusted in real time according to the actual position where the user stands, that is, Δd is determined from the value of D. As shown in fig. 5, across photographing sessions at different times, the appearance position of the virtual character is adjusted in real time according to formula (4) based on the distance D obtained from the user's real scene, i.e., Δd is determined, so as to ensure that the relative positions of the virtual character and the user's projection image are as indicated by line 51, achieving the best viewing-distance effect.
And step 105, obtaining an AR image according to the projection image, the virtual image and the relative position relation.
After the position of the virtual image is determined, the projection image and the virtual image can be synthesized to obtain an AR image. The specific synthesis method is not limited in the embodiment of the present invention.
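One possible realization of this synthesis step is sketched below; it assumes the virtual image carries an alpha channel and that the depth distance has already been mapped to a screen-space position and scale, a mapping the patent leaves open:

```python
import numpy as np

def compose_ar(projection_bgr, virtual_bgra, top_left):
    """Alpha-blend the virtual image onto the projection image at top_left.

    projection_bgr: the user's projection image (H x W x 3, uint8).
    virtual_bgra: the matched virtual image with alpha channel (h x w x 4, uint8),
        already scaled and positioned according to the relative positional
        relationship; assumed to fit entirely inside the projection image."""
    out = projection_bgr.copy()
    x, y = top_left
    h, w = virtual_bgra.shape[:2]
    roi = out[y:y + h, x:x + w].astype(np.float32)
    rgb = virtual_bgra[:, :, :3].astype(np.float32)
    alpha = virtual_bgra[:, :, 3:4].astype(np.float32) / 255.0
    out[y:y + h, x:x + w] = (alpha * rgb + (1.0 - alpha) * roi).astype(np.uint8)
    return out
```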
In the embodiment of the invention, the human body contour is extracted from the projection image of the user on the display screen, and the virtual image of the virtual object is obtained according to the human body contour. Then, according to the relative positional relationship between the virtual object and the projection image, the projection image and the virtual image are AR-combined. Because the matching degree between the contour of the virtual object and the human body contour meets the first preset requirement, and the relative positional relationship between the virtual object and the projection image is taken into account during combination, the AR image obtained by the embodiment of the invention achieves a higher matching degree between the shape and posture of the virtual object and those of the user, thereby enhancing the realism of the image and improving the display effect of the AR image.
The embodiment of the invention also provides an image processing device. Referring to fig. 6, fig. 6 is a block diagram of an image processing apparatus according to an embodiment of the present invention. Since the principle of the image processing apparatus for solving the problem is similar to that of the image processing method in the embodiment of the present invention, the implementation of the image processing apparatus can refer to the implementation of the method, and the repetition is omitted.
As shown in fig. 6, the image processing apparatus 600 includes:
a first obtaining module 601, configured to obtain a projection image of a user on a display screen; a first extraction module 602, configured to extract a human body contour in the projection image; a second obtaining module 603, configured to obtain a virtual image of a virtual object according to the human body contour, where a matching degree between the contour of the virtual object and the human body contour meets a first preset requirement; a first determining module 604, configured to determine a relative positional relationship between the virtual image and the projection image; and a fourth obtaining module 605, configured to obtain an AR image according to the projection image, the virtual image, and the relative positional relationship.
Optionally, the first extraction module 602 may include:
the conversion sub-module is used for respectively carrying out image conversion on the projection images to obtain at least one gray level image; the first calculation sub-module is used for calculating the average value of the gray level images to obtain a background gray level image; and the second calculation sub-module is used for calculating the difference between each gray level image and the background gray level image to obtain the human body outline of the user.
Optionally, the virtual object includes a virtual character; the second obtaining module 603 includes:
a first determining sub-module for determining a first keypoint on the human body contour, wherein the first keypoint comprises at least a head keypoint and a hand keypoint; a second determining sub-module, configured to determine a target contour of the virtual object in a candidate image of the virtual object; a third determining sub-module for determining a second keypoint on the target contour corresponding to the first keypoint, wherein the second keypoint comprises at least a head keypoint and a hand keypoint; a first computing sub-module for computing a similarity between the human body contour and the target contour based on the first key point and the second key point; and a fourth determining sub-module, configured to take the candidate image as the virtual image if the similarity meets a second preset requirement.
Optionally, the first computing submodule includes:
a first calculation unit, configured to calculate, for each first target key point of the first key points, a euclidean distance between the first target key point and a second target key point, where the second target key point is a key point corresponding to the first target key point in the second key points; and a second calculation unit for calculating a similarity between the human body contour and the target contour based on the obtained Euclidean distance.
Optionally, the second computing unit includes:
the first calculating subunit is used for multiplying each Euclidean distance by a corresponding weight to obtain a first value corresponding to each Euclidean distance; and the second calculating subunit is used for adding the first numerical values to obtain the similarity between the human body contour and the target contour.
Optionally, the second computing unit may further include: and the setting sub-module is used for presetting the weight, wherein the weight of the Euclidean distance obtained based on the head key point and/or the hand key point is larger than the weight of the Euclidean distance obtained based on other key points.
Optionally, the first determining module 604 includes:
a first determining sub-module for determining a distance between an actual photographing position of the user and the projection image; and the second determining submodule is used for determining the depth distance between the virtual image and the projection image according to the distance.
Optionally, the second determining submodule is configured to determine the depth distance between the virtual image and the projection image from the distance using the following formula:
Δd = (Δθ × D²) / P
wherein Δd represents the depth distance, Δθ represents the binocular parallax of the user, D represents the distance between the actual photographing position of the user and the projection image of the user on the display screen, P represents the distance between the eyes of the user, and Δθ and P are constants.
The device provided by the embodiment of the present invention may execute the above method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein.
As shown in fig. 7, an electronic device according to an embodiment of the present invention includes a processor 700 and a memory 710. The processor 700 is configured to read the program in the memory 710 and execute the following processes:
acquiring a projection image of a user on a display screen;
extracting the human body outline in the projection image;
obtaining a virtual image of a virtual object according to the human body contour, wherein the matching degree of the contour of the virtual object and the human body contour meets a first preset requirement;
determining a relative positional relationship between the virtual image and the projected image;
and obtaining an Augmented Reality (AR) image according to the projection image, the virtual image and the relative position relation.
In fig. 7, the bus architecture may comprise any number of interconnected buses and bridges, linking together one or more processors represented by the processor 700 and various circuits of the memory represented by the memory 710. The bus architecture may also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further herein. The bus interface provides an interface. The processor 700 is responsible for managing the bus architecture and general processing, and the memory 710 may store data used by the processor 700 in performing operations.
The processor 700 is further configured to read the program, and perform the following steps:
respectively carrying out image conversion on the projection images to obtain at least one gray scale image;
calculating the average value of the gray level images to obtain a background gray level image;
and calculating the difference between each gray level image and the background gray level image to obtain the human body outline of the user.
The virtual object comprises a virtual character; the processor 700 is further configured to read the program, and perform the following steps:
determining a first key point on the human body contour, wherein the first key point at least comprises a head key point and a hand key point;
determining a target contour of the virtual object in the candidate images of the virtual object;
determining a second keypoint on the target contour corresponding to the first keypoint, wherein the second keypoint comprises at least a head keypoint and a hand keypoint;
calculating the similarity between the human body contour and the target contour based on the first key point and the second key point;
and if the similarity meets a second preset requirement, taking the candidate image as the virtual image.
The processor 700 is further configured to read the program, and perform the following steps:
for each first target key point in the first key points, calculating Euclidean distance between the first target key point and a second target key point, wherein the second target key point is a key point corresponding to the first target key point in the second key points;
and calculating the similarity between the human body contour and the target contour based on the obtained Euclidean distance.
The processor 700 is further configured to read the program, and perform the following steps:
multiplying each Euclidean distance by a corresponding weight to obtain a first value corresponding to each Euclidean distance;
and adding the first numerical values to obtain the similarity between the human body contour and the target contour.
The processor 700 is further configured to read the program, and perform the following steps:
and presetting the weight, wherein the weight of the Euclidean distance obtained based on the head key point and/or the hand key point is larger than the weight of the Euclidean distance obtained based on other key points.
The processor 700 is further configured to read the program, and perform the following steps:
determining a distance between an actual photographing position of the user and the projection image;
and determining a depth distance between the virtual image and the projection image according to the distance.
The processor 700 is further configured to read the program, and perform the following steps:
determining a depth distance between the virtual image and the projected image from the distance using the following formula:
Δd = (Δθ × D²) / P
wherein Δd represents the depth distance, Δθ represents the binocular parallax of the user, D represents the distance between the actual photographing position of the user and the projection image of the user on the display screen, P represents the distance between the eyes of the user, and Δθ and P are constants.
The embodiment of the invention also provides a computer readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the processes of the above image processing method embodiment and achieves the same technical effects, which are not repeated here. The computer readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general hardware platform, or by hardware, though in many cases the former is the preferred implementation. In light of this understanding, the technical solutions of the present invention may be embodied essentially, or in the part contributing to the prior art, in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the methods described in the various embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above-described embodiments, which are merely illustrative and not restrictive. Many forms may be made by those of ordinary skill in the art in light of the present invention without departing from its spirit and the scope of the claims, all of which fall within the protection of the present invention.

Claims (10)

1. An image processing method, comprising:
acquiring a projection image of a user on a display screen;
extracting the human body outline in the projection image;
obtaining a virtual image of a virtual object according to the human body contour, wherein the matching degree of the contour of the virtual object and the human body contour meets a first preset requirement;
determining a relative positional relationship between the virtual image and the projected image, wherein the relative positional relationship is determined according to a depth distance between the virtual image and the projected image, the depth distance being determined according to a distance between an actual photographing position of the user and the projected image;
and obtaining an Augmented Reality (AR) image according to the projection image, the virtual image and the relative position relation.
2. The method of claim 1, wherein there is at least one projection image; the extracting the human body contour in the projection image comprises the following steps:
respectively carrying out image conversion on the projection images to obtain at least one gray scale image;
calculating the average value of the gray level images to obtain a background gray level image;
and calculating the difference between each gray level image and the background gray level image to obtain the human body outline of the user.
3. The method of claim 1, wherein the virtual object comprises a virtual character; the obtaining a virtual image of a virtual object according to the human body contour includes:
determining a first key point on the human body contour, wherein the first key point at least comprises a head key point and a hand key point;
determining a target contour of the virtual object in the candidate images of the virtual object;
determining a second keypoint on the target contour corresponding to the first keypoint, wherein the second keypoint comprises at least a head keypoint and a hand keypoint;
calculating the similarity between the human body contour and the target contour based on the first key point and the second key point;
and if the similarity meets a second preset requirement, taking the candidate image as the virtual image.
4. A method according to claim 3, wherein said calculating a similarity between said body contour and said target contour based on said first keypoint and said second keypoint comprises:
for each first target key point in the first key points, calculating Euclidean distance between the first target key point and a second target key point, wherein the second target key point is a key point corresponding to the first target key point in the second key points;
and calculating the similarity between the human body contour and the target contour based on the obtained Euclidean distance.
5. The method of claim 4, wherein the calculating the similarity between the human body contour and the target contour based on the obtained euclidean distance comprises:
multiplying each Euclidean distance by a corresponding weight to obtain a first value corresponding to each Euclidean distance;
and adding the first numerical values to obtain the similarity between the human body contour and the target contour.
6. The method of claim 5, wherein the method further comprises:
and presetting the weight, wherein the weight of the Euclidean distance obtained based on the head key point and/or the hand key point is larger than the weight of the Euclidean distance obtained based on other key points.
7. The method of claim 1, wherein the determining the relative positional relationship between the virtual image and the projected image comprises:
determining a distance between an actual photographing position of the user and the projection image;
and determining a depth distance between the virtual image and the projection image according to the distance.
8. The method of claim 7, wherein the determining a depth distance between the virtual image and the projected image based on the distance comprises:
determining a depth distance between the virtual image and the projected image from the distance using the following formula:
Δd = (Δθ × D²) / P
wherein Δd represents the depth distance, Δθ represents the binocular parallax of the user, D represents the distance between the actual photographing position of the user and the projection image of the user on the display screen, P represents the distance between the eyes of the user, and Δθ and P are constants.
9. An electronic device, comprising: a memory, a processor, and a program stored on the memory and executable on the processor; characterized in that the processor is configured to read the program in the memory to implement the steps in the image processing method according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps in the image processing method according to any one of claims 1 to 8.
CN202010231304.4A 2020-03-27 2020-03-27 Image processing method, device and computer readable storage medium Active CN111462337B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010231304.4A CN111462337B (en) 2020-03-27 2020-03-27 Image processing method, device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010231304.4A CN111462337B (en) 2020-03-27 2020-03-27 Image processing method, device and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN111462337A CN111462337A (en) 2020-07-28
CN111462337B true CN111462337B (en) 2023-08-18

Family

ID=71685711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010231304.4A Active CN111462337B (en) 2020-03-27 2020-03-27 Image processing method, device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111462337B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113179376A (en) * 2021-04-29 2021-07-27 山东数字人科技股份有限公司 Video comparison method, device and equipment based on three-dimensional animation and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005266223A (en) * 2004-03-18 2005-09-29 Casio Comput Co Ltd Camera system and program
JP2007025861A (en) * 2005-07-13 2007-02-01 Toppan Printing Co Ltd Virtual reality system and method, and interpolation image generation device and method
CN103617639A (en) * 2013-06-27 2014-03-05 苏州金螳螂展览设计工程有限公司 Mirror surface induction interactive group photo system and method
CN106097435A (en) * 2016-06-07 2016-11-09 北京圣威特科技有限公司 A kind of augmented reality camera system and method
WO2016207628A1 (en) * 2015-06-22 2016-12-29 Ec Medica Ltd Augmented reality imaging system, apparatus and method
CN108227931A (en) * 2018-01-23 2018-06-29 北京市商汤科技开发有限公司 For controlling the method for virtual portrait, equipment, system, program and storage medium
CN108398787A (en) * 2018-03-20 2018-08-14 京东方科技集团股份有限公司 Augmented reality shows equipment, method and augmented reality glasses
CN110910512A (en) * 2019-11-29 2020-03-24 北京达佳互联信息技术有限公司 Virtual object self-adaptive adjusting method and device, computer equipment and storage medium
CN110909680A (en) * 2019-11-22 2020-03-24 咪咕动漫有限公司 Facial expression recognition method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111462337A (en) 2020-07-28

Similar Documents

Publication Publication Date Title
CN103140879B (en) Information presentation device, digital camera, head mounted display, projecting apparatus, information demonstrating method and information are presented program
JP4829141B2 (en) Gaze detection apparatus and method
CN107484428B (en) Method for displaying objects
EP2856426B1 (en) Body measurement
WO2019035155A1 (en) Image processing system, image processing method, and program
CN109711472B (en) Training data generation method and device
JP2009020761A (en) Image processing apparatus and method thereof
JP5795250B2 (en) Subject posture estimation device and video drawing device
JPWO2006049147A1 (en) Three-dimensional shape estimation system and image generation system
KR101759188B1 (en) the automatic 3D modeliing method using 2D facial image
CN110544302A (en) Human body action reconstruction system and method based on multi-view vision and action training system
EP3506149A1 (en) Method, system and computer program product for eye gaze direction estimation
JP2020119127A (en) Learning data generation method, program, learning data generation device, and inference processing method
CN111815768B (en) Three-dimensional face reconstruction method and device
KR20150031085A (en) 3D face-modeling device, system and method using Multiple cameras
CN114333046A (en) Dance action scoring method, device, equipment and storage medium
CN113902781A (en) Three-dimensional face reconstruction method, device, equipment and medium
CN113706373A (en) Model reconstruction method and related device, electronic equipment and storage medium
CN111462337B (en) Image processing method, device and computer readable storage medium
CN107659772B (en) 3D image generation method and device and electronic equipment
US20190340773A1 (en) Method and apparatus for a synchronous motion of a human body model
JP2000331019A (en) Method and device for indexing aspect image and recording medium with aspect image indexing program recorded
CN113240811B (en) Three-dimensional face model creating method, system, equipment and storage medium
KR101351745B1 (en) Apparatus and method for generating 3 dimension face
JP2019220032A (en) Program, apparatus and method that generate display image which is transformed from original image on the basis of target image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant