CN111416938B

CN111416938B - Augmented reality close-shooting method and device and computer readable storage medium

Info

Publication number: CN111416938B
Application number: CN202010232139.4A
Authority: CN
Inventors: 赵琦; 张健; 杜欧杰; 王科
Original assignee: MIGU Culture Technology Co Ltd
Current assignee: MIGU Culture Technology Co Ltd
Priority date: 2020-03-27
Filing date: 2020-03-27
Publication date: 2021-11-02
Anticipated expiration: 2040-03-27
Also published as: CN111416938A

Abstract

The invention provides an augmented reality close-up method, equipment and a computer readable storage medium, belonging to the technical field of image processing, wherein the method comprises the following steps: collecting an image; acquiring first depth information of an entity object in the image; according to the first depth information, executing a target operation, wherein the target operation is used for adjusting at least one of the entity object and the photographed virtual object; based on a result of the target operation, forming an augmented reality snap image including the virtual object and the physical object. According to the method and the device, the depth information of the entity object in the image is acquired and utilized, so that the help of the augmented reality close shot can be avoided, and the reality of the close shot is improved.

Description

Augmented reality close-shooting method and device and computer readable storage medium

Technical Field

The embodiment of the invention relates to the technical field of image processing, in particular to an augmented reality close-shooting method, augmented reality close-shooting equipment and a computer-readable storage medium.

Background

Augmented Reality (AR) is a method of creating spatiotemporal consistency in visual effects by fusing virtual objects (e.g., virtual characters) and real objects (e.g., real characters) together. For example, a performance star, a sports star, and the like are shot in advance by a green screen movie, and a user photograph is superimposed on the star green screen movie, so that the user feels like a star in time.

Because the AR co-shooting generally adopts an image synthesis mode, that is, a two-dimensional image of a real object and a two-dimensional image of a virtual object, which are obtained by current shooting, are synthesized, some help-penetrating is easily caused. For example, when the user wants to take a photo with a virtual star, the limb of the user posing the photographing gesture may pass through the body of the virtual star, affecting the realism of the AR taking.

Disclosure of Invention

The embodiment of the invention provides an augmented reality auction method, augmented reality auction equipment and a computer-readable storage medium, which are used for solving the problem that the reality is influenced by the easy help penetration of AR auction at present.

In order to solve the technical problem, the invention is realized as follows:

in a first aspect, an embodiment of the present invention provides an augmented reality auction method, including:

collecting an image;

acquiring first depth information of an entity object in the image;

according to the first depth information, executing a target operation, wherein the target operation is used for adjusting at least one of the entity object and the photographed virtual object;

based on a result of the target operation, forming an augmented reality snap image including the virtual object and the physical object.

Optionally, the target operation includes at least one of:

outputting prompt information, wherein the prompt information is used for prompting a user to adjust the posture of a real entity object, and the real entity object is the user or other real entity objects;

adjusting a pose of the virtual object;

and adjusting the position of the virtual object in the snap-shot image.

Optionally, the step of obtaining first depth information of the entity object in the image includes:

screening K adjacent images of the images from a preset database, wherein the second depth information of each adjacent image is stored in the preset database, and K is an integer larger than zero;

determining the first depth information according to the second depth information of the K neighbor images.

Optionally, the step of screening K neighbor images of the images from a preset database includes:

and screening the adjacent images from the preset database by adopting a cosine similarity algorithm.

Optionally, the step of determining the first depth information according to the second depth information of the K neighboring images includes:

determining the first depth information according to the weighted values of the second depth information of the K adjacent images;

wherein, the weighted value of the second depth information of any neighboring image C is calculated according to the following formula:

wherein d (p) represents a weighted value of the second depth information of the any one neighboring image C;

representing a weight value corresponding to any one of the neighboring images C;

i represents the image;

S_i(p) a scale invariant feature transform, SIFT, feature vector representing the image i;

S_c(p + f (p)) represents a SIFT flow of said image i to said any neighboring image C;

D_Ci(p+f_i(p)) means for migrating the second depth information of said any neighboring image C onto said image i, f_i(p) denotes the SIFT flow between the any neighbor image C to the image i.

Optionally, the step of executing the target operation according to the first depth information includes:

and if the real object is not in contact with the virtual object in the close-up posture, adjusting the position of the virtual object in the close-up image according to the first depth information.

if the entity object is contacted with the virtual object in a close-time posture, determining the action amplitude of the entity object according to the first depth information;

if the action amplitude of the entity object is smaller than or equal to a preset threshold value, selecting a target posture image from a preset posture image library, wherein the posture of the virtual object in the target posture image is matched with the posture of the entity object so as to adjust the posture of the virtual object;

and if the action amplitude of the entity object is larger than the preset threshold, outputting the prompt information to enable the posture of the entity object to be matched with the posture of the virtual object, wherein the posture of the virtual object is a posture selected by a user in advance.

Optionally, before the step of executing the target operation according to the first depth information, the method further includes:

and determining whether the gesture of the real object and the virtual object in time is contacted or not according to the limb key point action of the real object.

In a second aspect, an embodiment of the present invention further provides an augmented reality close-up apparatus, including:

the acquisition module is used for acquiring images;

the acquisition module is used for acquiring first depth information of the entity object in the image;

an execution module, configured to execute a target operation according to the first depth information, where the target operation is used to adjust at least one of the entity object and the photographed virtual object;

an imaging module to form an augmented reality snap image including the virtual object and the physical object based on a result of the target operation.

Optionally, the target operation includes at least one of:

adjusting a pose of the virtual object;

and adjusting the position of the virtual object in the snap-shot image.

Optionally, the obtaining module includes:

the screening unit is used for screening K adjacent images of the images from a preset database, wherein the preset database stores second depth information of each adjacent image, and K is an integer larger than zero;

a first determining unit configured to determine the first depth information according to second depth information of the K neighboring images.

Optionally, the screening unit is configured to screen the neighboring image from the preset database by using a cosine similarity algorithm.

Optionally, the first determining unit is configured to determine the first depth information according to a weighted value of the second depth information of the K neighboring images;

i represents the image;

S_c(p + F (P)) represents the figureSIFT flow like i to the any neighbor image C;

Optionally, the executing module includes:

and the adjusting unit is used for adjusting the position of the virtual object in the snap image according to the first depth information if the real object is not in contact with the virtual object in the snap gesture.

Optionally, the executing module includes:

the second determining unit is used for determining the action amplitude of the entity object according to the first depth information if the entity object is contacted with the virtual object in a close-shot posture;

a selecting unit, configured to select a target pose image from a preset pose image library if the motion amplitude of the entity object is smaller than or equal to a preset threshold, where a pose of the virtual object in the target pose image matches a pose of the entity object to adjust the pose of the virtual object;

and the prompting unit is used for outputting the prompting information if the action amplitude of the entity object is larger than the preset threshold value so as to enable the posture of the entity object to be matched with the posture of the virtual object, wherein the posture of the virtual object is a posture selected by a user in advance.

Optionally, the augmented reality close-up apparatus further includes:

and the determining module is used for determining whether the real object is contacted with the virtual object in a close-time gesture according to the limb key point action of the real object.

In a third aspect, an embodiment of the present invention further provides an augmented reality close-up apparatus, including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the computer program, when executed by the processor, implements any of the steps of the augmented reality close-up method described above.

In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of any one of the above augmented reality close-up methods are implemented.

According to the embodiment of the invention, the target operation for adjusting the entity object and/or the virtual object in the close shot is executed by acquiring and according to the depth information (namely the first depth information) of the entity object in the image, so that the help penetrating phenomenon in the AR close shot can be avoided, and the reality of the close shot is improved.

Drawings

Fig. 1 is a schematic flow chart of an augmented reality close-up method in an embodiment of the present invention;

fig. 2 is a schematic flowchart of a method for acquiring first depth information according to an embodiment of the present invention;

FIG. 3 is a schematic diagram illustrating a real object and a virtual object being photographed without contact according to an embodiment of the present invention;

FIG. 4 is a schematic representation of a critical point of a limb;

fig. 5 is a schematic flow chart of an augmented reality close-up method in an embodiment of the present invention;

fig. 6 is a schematic structural diagram of an augmented reality close-up apparatus in an embodiment of the present invention;

fig. 7 is a schematic structural diagram of an augmented reality close-up apparatus in an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention, are within the scope of the invention.

Referring to fig. 1, fig. 1 is a schematic flow chart of an augmented reality close-up method according to an embodiment of the present invention, including:

101. collecting an image;

specifically, an image of a real object in reality can be acquired by an AR camera (or referred to as an AR camera);

102. acquiring first depth information of an entity object in the image;

the first depth information indicates a distance between a real object in reality and a lens when an image is shot;

103. according to the first depth information, executing a target operation, wherein the target operation is used for adjusting at least one of the entity object and the photographed virtual object;

specifically, the imaging of a real object in reality in a close-up image is adjusted, and/or the imaging of a virtual object in the close-up image is adjusted to avoid the help penetration;

104. based on a result of the target operation, forming an augmented reality snap image including the virtual object and the physical object.

In the embodiment of the invention, the target operation for adjusting the entity object and/or the photographed virtual object is executed by acquiring and according to the depth information (namely the first depth information) of the entity object in the image, so that the help penetrating phenomenon in the AR photographing can be avoided, for example, the situation that the body of a user passes through the body of a virtual star when the user and the virtual star are photographed in a combined manner is avoided, and the reality of the photographing is improved.

The above-described information processing method is exemplified below.

Optionally, the target operation includes at least one of:

adjusting a pose of the virtual object;

and adjusting the position of the virtual object in the snap-shot image.

The posture of the real physical object includes the spatial position of any part (such as the trunk, the limbs, the fingers, the toes and the like) on the physical object and the relative position relationship among the parts, and the posture of the virtual object also includes the same.

In this embodiment of the present invention, before the step of executing the target operation according to the first depth information, the method further includes:

acquiring the gesture of the virtual object selected by the user.

In particular, the gesture of the virtual object is pre-selected by the user.

Wherein the neighboring image may also be referred to as a similar image or a neighboring image, and the second depth information of the object in the neighboring image is similar to the first depth information.

Specifically, after the image is collected, the image features in the image are extracted by adopting a Convolutional Neural Network (CNN) algorithm of deep learning, and then the image features extracted from the image are compared with data in a preset database to screen out K neighbor images.

For example, referring to fig. 2, the method for obtaining the first depth information of the entity object may specifically include the following steps:

firstly, extracting the characteristics of an acquired image and a candidate image in a preset database, wherein the characteristic extraction step of the candidate image can be executed in advance;

then, screening K adjacent images of the images from the candidate images, wherein each adjacent image corresponds to one depth information (namely, second depth information);

then, carrying out depth map fusion of the neighboring images;

finally, the first depth information of the entity object is estimated.

Further optionally, the step of screening K neighbor images of the images from a preset database includes:

The formula of the cosine similarity algorithm is as follows:

where I represents the image, I is a candidate image in the preset database, c (I) represents a position p (in coordinates (x, y)) in the image, c (I) represents a position p (in coordinates (x, y)) in the candidate image, modulo of c (I) in | c (I) |, modulo of c (I) in |, and modulo of c (I) in | c (I) |.

representing a weight value corresponding to any one neighboring image C, namely representing a migration weight from the any one neighboring image C to the image i, wherein the contribution weights of different neighboring images to depth migration after the neighboring images are migrated to the image i are different;

i represents the image;

S_i(p) tableA Scale-invariant feature transform (SIFT) feature vector representing the image i;

p denotes the position in the image, expressed in coordinates (x, y).

Gestures of the entity object and the virtual object in time are divided into two types, namely a contact type and a non-contact type, wherein the contact type can be gestures needing contact, such as handshake and shoulder holding, and the non-contact type can be gestures needing no contact, such as side-by-side standing.

In the embodiment of the present invention, referring to fig. 3 (where the z-axis represents depth information), if the gesture of the real object 302 and the virtual object 301 does not contact, then it is not easy to generate interactive help, and it is only necessary to adjust the position of the virtual object in the captured image according to the first depth information of the real object, so that the position of the virtual object and the position of the real object are visually consistent, and in particular, the virtual object appears to be adjacent to the real object.

Of course, it is also necessary to adjust the size of the virtual object in the image according to the size of the solid object in the image, and/or adjust the position of the virtual object in the snap image according to the position in the image of the solid object (i.e. the position of the solid object in the x, y plane (expressed by coordinates (x, y)) (i.e. adjust the position of the virtual object in the x, y plane)).

if the action amplitude of the entity object is smaller than or equal to a preset threshold value, selecting a target posture image from a preset posture image library, wherein the posture of the virtual object in the target posture image is matched with the posture of the entity object so as to adjust the posture of the virtual object in time;

specifically, a target posture image is selected from a preset posture image library, and the posture of the virtual object in the target posture image is used as the posture of the virtual object in the shooting process so as to replace the shooting posture of the virtual object selected by the user in advance;

In the embodiment of the invention, before the target operation is executed according to the first depth information, the gesture of the virtual object selected by the user in advance can be received. Then, it is decided whether to replace the posture of the virtual object or prompt the user to adjust the posture of the real physical object, according to the magnitude of the motion of the physical object.

The action amplitude refers to the distance between the movable part farthest from the central axis and the central axis of the solid object in all the movable parts of the solid object. For example, if the physical object is a human body, the action amplitude refers to a distance between a point on a limb of the human body farthest from a central axis of the human body and the central axis of the human body.

In the embodiment of the invention, after the coordinates of the entity object in the image on the x and y planes are obtained, if the posture of the entity object and the virtual object in the close shot belongs to the contact posture, the entity object and the virtual object are in the close shot without the help penetration based on a physical collision engine. When rendering is performed based on the physical collision engine, the action amplitude of the entity object needs to be judged.

Specifically, the action amplitude of the entity object is extracted (which needs to be according to the first depth information), an action amplitude threshold N (namely, the preset threshold) is set, and if the action amplitude of the entity object is not greater than the threshold N, the corresponding snap-in posture of the virtual object is reselected according to the posture of the entity object; if the action amplitude of the entity object is larger than the threshold value N, prompt information is given, the user is enabled to adjust the posture of the entity object in reality, the posture of the entity object is enabled to be matched with the posture of the virtual object, the reality of the beat-in is guaranteed, and the phenomenon of help penetration is avoided. That is, if the motion amplitude of the physical object is small, a gesture image in which a virtual object gesture matches the physical object gesture is selected from a preset gesture image library to perform virtual enhanced co-shooting with the physical object. If the action magnitude of the physical object is large, prompting the user to adjust the gesture of the physical object in reality to match the gesture of the virtual object preselected by the user.

Specifically, the body key points (specifically, see fig. 4, where 401 is a nose, 402 is a neck, 403 is a right shoulder, 404 is a left shoulder, 405 is a right wrist, 406 is a left wrist, 407 is a right elbow, 408 is a left elbow, 409 is a right hip, 410 is a left hip, 411 is a right knee, 412 is a left knee, 413 is a right ankle, and 414 is a left ankle) of the entity object during the close-up preview can be tracked in real time, and it is determined whether the gesture of the entity object and the virtual object in close-up belongs to the contact class or the non-contact class. Optionally, the body key points of the entity object may be obtained based on a deep learning algorithm (e.g., openpos algorithm).

Of course, in other alternative specific embodiments, whether the gesture of the real object in the captured preview image is in contact with the gesture of the virtual object may be determined according to the relative position relationship between the real object and the virtual object in the captured preview image, or whether the gesture of the real object in the captured preview image is in contact with the gesture of the virtual object may be determined according to the relative position relationship between the real object and the virtual object in the captured preview image.

Optionally, before the step of acquiring an image, the method further includes:

establishing the preset gesture image library, wherein the preset gesture image library comprises images of a plurality of gestures of the virtual object.

In the embodiment of the invention, a preset gesture image library of the virtual object can be established in advance, so that a user can select the close-up gesture of the virtual object, and the close-up gesture of the virtual object can be automatically matched according to the gesture of the user.

Of course, a preset posture image library of a plurality of virtual objects may be established in advance, or the preset posture image library includes images of a plurality of different postures of the plurality of virtual objects. So that the user can select a virtual object to be photographed from among the plurality of virtual objects.

Finally, please refer to fig. 5, which illustrates the augmented reality auction method provided by the embodiment of the present invention, taking the example of the user and the virtual star auction:

501. planning a close shot plot;

502. planning virtual stars in the day, namely setting the stars in the day;

503. setting a virtual star shooting gesture;

504. setting a virtual star shooting position;

505. planning a close-shot action;

506. planning an interactive action;

507. estimating the contact surface of the user and the virtual star;

508. contact deformation;

509. and (5) performing AR snap shot.

Referring to fig. 6, fig. 6 is a schematic structural diagram of an augmented reality close-up apparatus provided in an embodiment of the present invention, where the augmented reality close-up apparatus includes:

the acquisition module 601 is used for acquiring images;

an obtaining module 602, configured to obtain first depth information of the entity object in the image;

an executing module 603, configured to execute a target operation according to the first depth information, where the target operation is used to adjust at least one of the entity object and the photographed virtual object;

an imaging module 604 for forming an augmented reality snap image including the virtual object and the physical object based on a result of the target operation.

Optionally, the target operation includes at least one of:

adjusting a pose of the virtual object;

and adjusting the position of the virtual object in the snap-shot image.

Optionally, the obtaining module 602 includes:

i represents the image;

Optionally, the executing module 603 includes:

a selecting unit, configured to select a target pose image from a preset pose image library if the motion amplitude of the entity object is smaller than or equal to a preset threshold, where a pose of the virtual object in the target pose image matches a pose of the entity object to adjust the pose of the virtual object; and/or the presence of a gas in the gas,

Optionally, the augmented reality close-up apparatus further includes:

a pre-establishment module to establish the preset pose image library, the preset pose image library including images of a plurality of poses of the virtual object.

The embodiment of the present invention is a product embodiment corresponding to the method embodiment described above, and can achieve the same technical effects, and for avoiding repetition, details are not described here, and please refer to the method embodiment in detail.

Referring to fig. 7, fig. 7 is a schematic structural diagram of an augmented reality close-up apparatus according to an embodiment of the present invention, where the augmented reality close-up apparatus 700 includes a processor 701, a memory 702, and a computer program stored in the memory 702 and capable of being executed on the processor 701, and when executed by the processor 701, the computer program implements the following steps:

collecting an image;

acquiring first depth information of an entity object in the image;

Optionally, the target operation includes at least one of:

adjusting a pose of the virtual object;

and adjusting the position of the virtual object in the snap-shot image.

Optionally, the computer program may further implement the following steps when executed by the processor 701:

the step of acquiring first depth information of a solid object in the image comprises:

the step of screening K neighbor images of the image from a preset database comprises:

the step of determining the first depth information from the second depth information of the K neighbor images comprises:

i represents the image;

the step of executing the target operation according to the first depth information includes:

before the step of executing the target operation according to the first depth information, the method further includes:

before the step of acquiring the image, the method further comprises the following steps:

The augmented reality close-up device can achieve all processes of the method embodiment and achieve the same technical effect, and for avoiding repetition, the detailed description is omitted here, and please refer to the method embodiment in detail.

An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program implements each process in the foregoing method embodiments, and can achieve the same technical effect, and in order to avoid repetition, details are not repeated here. The computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

While the present invention has been described with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, which are illustrative and not restrictive, and it will be apparent to those skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. An augmented reality close-up method, comprising:

collecting an image;

acquiring first depth information of an entity object in the image;

according to the first depth information, executing a target operation, wherein the target operation is used for adjusting at least one of the entity object and the photographed virtual object; this includes: if the entity object is contacted with the virtual object in a close-time posture, determining the action amplitude of the entity object according to the first depth information; if the action amplitude of the entity object is smaller than or equal to a preset threshold value, selecting a target posture image from a preset posture image library, and using the posture of the virtual object in the target posture image as the posture of the virtual object for close shooting so as to replace the close shooting posture of the virtual object selected by a user in advance; if the action amplitude of the entity object is larger than the preset threshold, outputting prompt information to enable the posture of the entity object to be matched with the posture of the virtual object, wherein the posture of the virtual object is a posture selected by a user in advance, the prompt information is used for prompting the user to adjust the posture of the entity object in reality, and the entity object in reality is the user or other entity objects in reality;

2. The method of claim 1, wherein the target operation further comprises: and adjusting the position of the virtual object in the snap-shot image.

3. The method of claim 1, wherein the step of obtaining first depth information of the physical object in the image comprises:

4. The method according to claim 3, wherein said step of screening K neighbor images of said image from a predetermined database comprises:

5. The method according to claim 3, wherein the step of determining the first depth information from the second depth information of the K neighboring images comprises:

i represents the image;

6. The method of claim 2, wherein the step of performing a target operation according to the first depth information comprises:

7. The method of claim 1, wherein the step of performing the target operation according to the first depth information is preceded by:

8. An augmented reality auction device comprising a processor, a memory and a computer program stored on the memory and executable on the processor, the computer program when executed by the processor implementing the steps of the augmented reality auction method of any one of claims 1 to 7.

9. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the augmented reality auction method according to any one of claims 1 to 7.