CN114489337A - AR interaction method, device, equipment and storage medium


Publication number
CN114489337A
Authority
CN
China
Prior art keywords
real
virtual object
user
target object
virtual
Prior art date
Legal status
Pending
Application number
CN202210082222.7A
Other languages
Chinese (zh)
Inventor
彭心
Current Assignee
Shenzhen TetrasAI Technology Co Ltd
Original Assignee
Shenzhen TetrasAI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen TetrasAI Technology Co Ltd filed Critical Shenzhen TetrasAI Technology Co Ltd
Priority to CN202210082222.7A
Publication of CN114489337A

Classifications

    • G - PHYSICS
        • G06 - COMPUTING; CALCULATING OR COUNTING
            • G06F - ELECTRIC DIGITAL DATA PROCESSING
                • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
                    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
                        • G06F 3/011 - Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
            • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
                • G06T 13/00 - Animation
                    • G06T 13/20 - 3D [Three Dimensional] animation
                        • G06T 13/40 - 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
                • G06T 19/00 - Manipulating 3D models or images for computer graphics
                    • G06T 19/006 - Mixed reality

Abstract

The present disclosure provides an AR interaction method, apparatus, device, and storage medium. From a plurality of first video images of a real scene collected by a first user terminal, the method identifies the real position of a target object in the real scene and the action of the target object, generates a virtual object according to the action, and binds and saves the virtual object with the real position. In this way, a virtual object created by the user can be obtained from video images collected by an ordinary terminal and bound to the real position, so that AR interaction is realized while the dependence on specific devices, the cost of devices, and the consumption of and dependence on computing resources at the user device side are effectively reduced. This lowers the threshold of AR interaction, simplifies its complexity, increases its applicability and convenience, and improves the user's AR experience.

Description

AR interaction method, device, equipment and storage medium
Technical Field
The present disclosure relates to the field of AR technologies, and in particular, to an AR interaction method, apparatus, device, and storage medium.
Background
With the development and progress of science and technology, various new technologies keep emerging and evolving, and people's quality of life keeps improving. Augmented Reality (AR) technology fuses virtual information with the real world; with the help of specific equipment such as AR glasses, people can experience the real world being mapped into a virtual world and thereby carry out entertainment, interaction, and the like in that virtual world.
However, entertainment or interaction experiences based on AR technology mostly have to take place at a specific site and through specific equipment, which imposes considerable limitations.
Disclosure of Invention
The embodiments of the present disclosure provide at least an AR interaction method, apparatus, device, and storage medium.
The embodiment of the present disclosure provides an AR interaction method, which includes:
acquiring a plurality of first video images of a real scene, which are acquired by a first user side, wherein the first video images comprise a target object;
identifying a real position of the target object in the real scene and identifying an action of the target object based on the plurality of first video images;
generating a virtual object according to the action of the target object;
and binding and storing the virtual object and the real position.
In the embodiments of the present disclosure, a virtual object created by the user can be obtained from video images collected by an ordinary terminal, and the virtual object is then bound and saved with the real position, thereby realizing AR interaction. This effectively reduces the dependence on specific equipment and the associated cost, reduces the consumption of and dependence on computing resources at the user device side, lowers the threshold of AR interaction, simplifies its complexity, increases its applicability and convenience, and helps improve the user's AR experience.
In an alternative embodiment, identifying a real position of the target object in the real scene based on the plurality of first video images includes:
extracting image features based on at least one of the plurality of first video images;
and matching in an AR scene model corresponding to the real scene based on the image characteristics to obtain the real position of the target object in the real scene.
In the embodiment of the disclosure, the image features in the video image are matched in the AR scene model corresponding to the real scene to obtain the real position of the target object in the real scene, so that the accuracy of the obtained real position of the target object in the real scene can be ensured.
In an alternative embodiment, the act of identifying the target object includes:
for each first video image, determining an image position of the target object in the first video image;
determining a movement track of the target object based on a plurality of image positions determined by the plurality of first video images;
and determining the action of the target object based on the movement track.
In the embodiments of the present disclosure, the movement track of the target object can be determined by recognizing its position in each image, and the action of the target object is then determined from the track. This recognition approach is simple and effective, reduces the dependence on specific equipment and the restrictions of time and place on AR interaction, and simplifies what AR interaction requires of the user.
In an optional embodiment, the generating a virtual object according to the action of the target object includes:
and generating a virtual object based on the moving track, wherein the outline of the virtual object is matched with the moving track.
In an optional embodiment, after the binding saving of the virtual object with the real position, the method includes:
and displaying an AR effect picture of the virtual object in the real scene to a first user through the first user terminal.
In the embodiments of the present disclosure, the virtual object can be displayed directly through the user terminal, and the AR effect picture in the real scene can be fed back to the user in time. This reduces the dependence on specific presentation equipment, lowers the cost and complexity of AR interaction, and helps improve the user's AR experience.
In an optional embodiment, after the displaying, by the first user end, the AR effect picture of the virtual object in the real scene to the first user, the method includes:
receiving a control operation of the first user for the virtual object in the AR effect picture;
and controlling the virtual object to execute an instruction corresponding to the control operation in the AR effect picture.
In the embodiment of the disclosure, by controlling the virtual object to execute the instruction corresponding to the control operation in the real scene, the interactivity of the user in use can be effectively increased.
In an optional embodiment, after the binding the virtual object with the real location is saved, the method further includes:
acquiring a second video image acquired by the first user side;
identifying whether a scene location in the second video image is consistent with the real location;
if the scene position is consistent with the real position, acquiring the virtual object bound and saved with the real position as a target virtual object;
and displaying an AR effect picture of the target virtual object in the real scene to a first user through the first user terminal.
In the embodiments of the present disclosure, when the scene position in the collected second video image is consistent with the real position in the first video images, the virtual object bound with the real position can be used directly as the target virtual object and then displayed to the first user. In this way, the storage, accumulation, and retrieval of content in AR interaction can be effectively realized, and the convenience of user interaction is increased.
In an optional embodiment, in response to there being a plurality of virtual objects bound and saved with the real position, the acquiring the virtual object bound and saved with the real position as a target virtual object includes:
and determining a preset number of target virtual objects from the plurality of virtual objects.
In an optional embodiment, after the binding the virtual object with the real location is saved, the method further includes:
in response to a sharing permission operation applied by the first user to the virtual object, setting the virtual object to a sharing state to enable a second user to invoke viewing of the virtual object.
In the embodiment of the present disclosure, the virtual object may be set to a sharing state, so that different users may share the virtual object with each other, thereby effectively increasing the applicability and linkage of AR interaction.
The embodiment of the present disclosure further provides an AR interaction apparatus, the apparatus includes:
the image acquisition module is used for acquiring a plurality of first video images of a real scene collected by a first user terminal, wherein the first video images comprise a target object;
the action recognition module is used for recognizing the real position of the target object in the real scene and recognizing the action of the target object based on the plurality of first video images;
the object generation module is used for generating a virtual object according to the action of the target object;
and the object storage module is used for binding and storing the virtual object and the real position.
In an optional implementation manner, the action recognition module is specifically configured to:
extracting image features based on at least one of the plurality of first video images;
and matching in an AR scene model corresponding to the real scene based on the image characteristics to obtain the real position of the target object in the real scene.
In an optional implementation manner, the action recognition module is specifically configured to:
for each first video image, determining an image position of the target object in the first video image;
determining a movement track of the target object based on a plurality of image positions determined by the plurality of first video images;
and determining the action of the target object based on the movement track.
In an optional embodiment, the object generation module is specifically configured to:
and generating a virtual object based on the moving track, wherein the outline of the virtual object is matched with the moving track.
In an optional implementation manner, the apparatus further includes a first display module, where the first display module is specifically configured to:
and displaying an AR effect picture of the virtual object in the real scene to a first user through the first user terminal.
In an optional embodiment, the apparatus further comprises an object control module, the object control module is configured to:
receiving a control operation of the first user for the virtual object in the AR effect picture;
and controlling the virtual object to execute an instruction corresponding to the control operation in the real scene.
In an optional implementation manner, the apparatus further includes a second display module, where the second display module is specifically configured to:
acquiring a second video image acquired by the first user side;
identifying whether a scene location in the second video image is consistent with the real location;
if the scene position is consistent with the real position, acquiring the virtual object bound and saved with the real position as a target virtual object;
and displaying an AR effect picture of the target virtual object in the real scene to a first user through the first user terminal.
In an optional implementation manner, in response to there being a plurality of virtual objects bound and saved with the real position, the second display module, when configured to acquire the virtual object bound and saved with the real position as a target virtual object, is specifically configured to:
and determining a preset number of target virtual objects from the plurality of virtual objects.
In an optional implementation manner, the apparatus further includes an object sharing module, where the sharing module is specifically configured to:
in response to a sharing permission operation applied by the first user to the virtual object, setting the virtual object to a sharing state to enable a second user to invoke viewing of the virtual object.
An embodiment of the present disclosure further provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions being executed by the processor to perform the steps of the AR interaction method described above.
The embodiment of the present disclosure further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps in the AR interaction method.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required in the embodiments are briefly described below. The drawings, which are incorporated in and form a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those of ordinary skill in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a flowchart of an AR interaction method according to an embodiment of the present disclosure;
fig. 2 is a schematic diagram of a real scene acquisition according to an embodiment of the present disclosure;
FIG. 3 is a diagram illustrating a first video image according to an embodiment of the disclosure;
FIG. 4 is a diagram of drawing in AR interaction;
fig. 5 is a flowchart of another AR interaction method provided in the embodiments of the present disclosure;
FIG. 6 is a schematic diagram of effect display during AR interaction;
fig. 7 is a schematic diagram of an AR interaction apparatus according to an embodiment of the present disclosure;
fig. 8 is a second schematic diagram of an AR interaction apparatus according to a second embodiment of the disclosure;
fig. 9 is a schematic diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The term "and/or" herein merely describes an associative relationship, meaning that three relationships may exist, e.g., a and/or B, may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, and may mean including any one or more elements selected from the group consisting of A, B and C.
Research shows that entertainment, interaction, and other experiences based on AR technology mostly have to be carried out with specific equipment at a specific place, which is highly restrictive; moreover, because of this restriction, the user can only view and interact through the specific equipment, so the user's ways of interacting are, to some extent, limited.
Based on this research, the AR interaction method of the present disclosure obtains a virtual object created by the user from video images collected by a terminal and then binds and saves the virtual object with a real position, thereby realizing AR interaction. This effectively reduces the dependence on specific equipment and the associated cost, reduces the consumption of and dependence on computing resources at the user device side, lowers the threshold of AR interaction, simplifies its complexity, improves its applicability and convenience, and improves the user's AR experience.
To facilitate understanding of the embodiments, an AR interaction method disclosed in the embodiments of the present disclosure is first described in detail. The execution subject of the AR interaction method provided in the embodiments of the present disclosure is generally an electronic device with certain computing capability, such as a terminal device, a server, or another processing device. In some possible implementations, the AR interaction method may be implemented by a processor calling computer-readable instructions stored in a memory.
Referring to fig. 1, fig. 1 is a flowchart illustrating an AR interaction method according to an embodiment of the disclosure. As shown in fig. 1, an AR interaction method provided in the embodiment of the present disclosure includes:
S101: a plurality of first video images of a real scene collected by a first user terminal are obtained, wherein the first video images include a target object.
In this step, when the user wants to perform AR interaction, the first user terminal may be used to collect the plurality of first video images of the real scene. For example, a segment of video containing the target object may be collected, and key frames are then extracted from the video to obtain the plurality of first video images; alternatively, a plurality of consecutive first video images of the target object may be collected directly by continuous shooting. Correspondingly, the background or the device realizing AR interaction may obtain the collected video from the first user terminal and analyze it to obtain the plurality of first video images, or may directly obtain the plurality of consecutive first video images collected by the first user terminal.
The target object may be a human body part, such as a finger, an arm, a leg, or the like, or a prop or an object such as a wand.
Referring to fig. 2 and fig. 3, fig. 2 is a schematic diagram of real-scene acquisition provided in an embodiment of the present disclosure, and fig. 3 is a schematic diagram of a first video image provided in an embodiment of the present disclosure. Take the example of a user shopping in a mall: during shopping the user sees a decorated center stage, likes it, and wants to draw a cat on the stage. The user then uses a finger as the target object for drawing, draws towards the stage with the finger, and at the same time uses a mobile phone or other terminal to shoot the drawing process, thereby collecting a video of the finger drawing over the stage. The collected video can then be uploaded to a background server, or to a background supporting AR interaction, and the background can extract key frames from the video to obtain a plurality of first video images containing the finger, i.e., the target object.
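As a rough illustration of the key-frame extraction mentioned above, the sketch below (Python with OpenCV; the file name, sampling interval, and sharpness threshold are assumptions for illustration, not values taken from the disclosure) samples frames from the uploaded video and keeps only the sharper ones as the plurality of first video images:

```python
import cv2

def extract_first_video_images(video_path, step=10, blur_threshold=80.0):
    """Sample every `step`-th frame and keep the ones that are sharp enough."""
    capture = cv2.VideoCapture(video_path)
    keyframes = []
    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % step == 0:
            # Variance of the Laplacian is a simple sharpness (definition) measure.
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            if cv2.Laplacian(gray, cv2.CV_64F).var() >= blur_threshold:
                keyframes.append(frame)
        index += 1
    capture.release()
    return keyframes

first_video_images = extract_first_video_images("finger_drawing.mp4")  # assumed file name
```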
S102: based on the plurality of first video images, a real position of the target object in the real scene is identified, and a motion of the target object is identified.
In this step, after the plurality of first video images are acquired, the image content of the plurality of first video images may be analyzed, so that the real position of the target object in the real scene may be identified, and the motion exhibited by the target object in the plurality of first video images may also be identified.
The real position of the target object in the real scene may refer to the real position pointed to by the target object, or to the real position at which the target object is located. In the example shown in fig. 2 and fig. 3, if the finger is located above the stage in the plurality of collected first video images and shows no obvious pointing gesture at the beginning, during, or at the end of the collection, the stage below the finger can be regarded as the real position of the target object. As another example, if the user cannot approach the stage but still wants to draw on it for AR interaction, the user can give an explicit indication, such as pointing at the stage and holding the gesture for a while, at the beginning, during, or at the end of collecting the first video images; after this indication towards the stage is identified from the collected first video images, the stage can be taken as the real position indicated by the finger, i.e., the real position of the target object.
Here, the real position of the target object in the real scene may be identified by combining an AR scene model corresponding to the real scene with the collected first video images, that is, by matching image features of the first video images against the AR scene model to obtain the actual position information of the real position. Alternatively, the first video images themselves may be used: for example, in combination with depth information of the image content, the plane coordinates of the first video images and the depth information are converted into world coordinates of the real scene to obtain the position information of the real position of the target object.
In a possible implementation, when the real position is determined through the AR scene model, image features may be extracted based on at least one of the plurality of first video images, and matching is performed in the AR scene model corresponding to the real scene based on the image features to obtain the real position of the target object in the real scene.
Specifically, at least one image that meets certain requirements, such as a definition requirement or an image precision requirement, may be screened from the plurality of first video images. Image features are then extracted from the screened image, for example two-dimensional feature points that can represent the target object and two-dimensional feature points of the position or article corresponding to the target object. The extracted two-dimensional feature points are compared with the three-dimensional point cloud pre-stored in a pre-established AR scene model, so as to obtain the virtual position of the target object in the AR scene model. Because the three-dimensional point cloud in the AR scene model is stored according to its positional relationship, that is, the content of the three-dimensional point cloud corresponds one to one with the real scene, the real position in the real scene can be obtained from the virtual position; this real position is the real position of the target object in the real scene.
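A minimal sketch of this 2D-to-3D matching step is given below, assuming the AR scene model exposes its pre-stored point cloud as an array of 3D points with matching ORB descriptors; these data structures, and the use of PnP to recover the pose, are illustrative assumptions rather than the exact procedure of the disclosure:

```python
import cv2
import numpy as np

def locate_in_scene_model(image, model_points_3d, model_descriptors, camera_matrix):
    """Match 2D features of a screened first video image against the AR scene
    model's 3D point cloud and recover the corresponding pose in the real scene."""
    orb = cv2.ORB_create(nfeatures=2000)
    keypoints, descriptors = orb.detectAndCompute(image, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(descriptors, model_descriptors)

    image_points = np.float32([keypoints[m.queryIdx].pt for m in matches])
    world_points = np.float32([model_points_3d[m.trainIdx] for m in matches])

    # The recovered pose ties the image to the scene model; the target object's
    # image position can then be mapped to a real position in the real scene.
    ok, rvec, tvec, _ = cv2.solvePnPRansac(world_points, image_points,
                                           camera_matrix, None)
    return (rvec, tvec) if ok else None
```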
The AR scene model may be pre-established and stored in the background, and directly retrieved from the background when needed. For example, the AR scene model may be generated from a three-dimensional point cloud of the real scene collected in advance by a point cloud collection device such as a lidar, or generated from scene images of the real scene collected by historical users. Alternatively, the AR scene model may be generated on demand when the user needs to perform AR interaction: if the user wants to perform AR interaction but no AR scene model corresponding to the real scene is stored in the background, the user may use the first user terminal to receive guidance information sent by the background, collect a plurality of scene images of the real scene according to the guidance information, and upload the collected scene images to the background so that the background can generate an AR scene model of the real scene. Correspondingly, when the AR scene model is generated from images collected by historical users, guidance information may likewise have been sent to the historical users to guide them to collect scene images of the real scene, and the AR scene model is then generated from the scene images they collected.
Further, the guidance information sent to the first terminal may include a plurality of acquisition points to guide the user to collect, at each acquisition point, scene images used for establishing the AR scene model of the real scene. After the plurality of scene images collected by the user are obtained, feature points corresponding to every two of the scene images are matched, the matched points are restored to world coordinates of the real scene, and the restored coordinates are spliced together to obtain the AR scene model of the real scene. The established AR scene model may then be stored on the server so that subsequent users can use it directly.
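For the pairwise matching and coordinate restoration described above, the following sketch shows one possible two-view fragment; the full model would be obtained by registering and splicing the fragments of all image pairs, and the camera intrinsics and the choice of ORB features are assumptions made for illustration:

```python
import cv2
import numpy as np

def reconstruct_pair(image_a, image_b, camera_matrix):
    """Match feature points between two scene images and restore the matched
    points to 3D coordinates, giving one fragment of the AR scene model."""
    orb = cv2.ORB_create(nfeatures=3000)
    kp_a, des_a = orb.detectAndCompute(image_a, None)
    kp_b, des_b = orb.detectAndCompute(image_b, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des_a, des_b)

    pts_a = np.float32([kp_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kp_b[m.trainIdx].pt for m in matches])

    # Relative pose between the two acquisition points.
    E, mask = cv2.findEssentialMat(pts_a, pts_b, camera_matrix, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, pts_a, pts_b, camera_matrix, mask=mask)

    proj_a = camera_matrix @ np.hstack([np.eye(3), np.zeros((3, 1))])
    proj_b = camera_matrix @ np.hstack([R, t])
    points_4d = cv2.triangulatePoints(proj_a, proj_b, pts_a.T, pts_b.T)
    return (points_4d[:3] / points_4d[3]).T   # N x 3 point-cloud fragment
```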
In a possible implementation, when the first user wants to interact but the current environment has not been modeled and needs to be modeled in real time, the first user may open an AR interface by scanning a two-dimensional code in the real scene to send an interaction request to the server. After receiving the interaction request from the user terminal, the server may send a position request command to the client, obtain the user's current position after the user agrees, and then send guidance information to the user terminal to guide the user to stand at different collection points in the real scene and collect a plurality of scene images for establishing the AR scene model, which are then sent to the server for modeling.
When recognizing the action of the target object, the action may be obtained from the overall track of the target object across the plurality of first video images, or a sub-action of the target object may be determined from the image content, such as gesture content, of the target object in each first video image. For example, if the target object appears in two consecutive first video images as a clenched fist and then an open palm, the action corresponding to the target object can be determined accordingly.
In a possible implementation, to identify the action of the target object from the plurality of first video images, the image position of the target object in each first video image may be determined first; a movement track of the target object is then determined based on the plurality of image positions determined from the plurality of first video images; and the action of the target object is determined based on the movement track.
Here, image recognition may be performed on each of the plurality of first video images to recognize the image position of the target object in that image; the recognized image positions are then integrated to obtain the movement track of the target object across the plurality of first video images, and finally the action of the target object is obtained from the movement track.
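A sketch of how the per-frame image positions can be integrated into a movement track is given below; the target detector is an assumed external function, for example a fingertip detector, and is not specified by the disclosure:

```python
import numpy as np

def movement_track(first_video_images, detect_target):
    """Collect the target object's image position in every first video image
    into an ordered track; `detect_target` is an assumed detector returning the
    (x, y) image position of the target object, or None if it is not visible."""
    track = []
    for image in first_video_images:
        position = detect_target(image)
        if position is not None:
            track.append(position)
    return np.float32(track)   # ordered polyline describing the target's motion
```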
S103: and generating a virtual object according to the action of the target object.
The virtual object generated according to the action may be an object freely created by the user, that is, a virtual object drawn by the action; or it may be a virtual object that the action calls up through a preset mapping relationship between actions and objects.
In a possible embodiment, when the user freely creates an object, the virtual object may be generated based on the movement track: given the movement track of the target object identified from the plurality of first video images, the content traced by the movement track is taken as the virtual object, and the outline of the virtual object matches the movement track.
For example, if the movement trajectory of the target object is circular, the outline of the corresponding virtual object is also circular.
For example, please refer to fig. 4, which is a schematic diagram of drawing in AR interaction. As shown in fig. 4, taking a finger as the target object, if the user's finger is recognized from the plurality of first video images by computer vision, its movement track is captured, and that track has the shape of a cat, then the virtual object can be regarded as the cat drawn by the user.
Here, the virtual object is an AR image created from the user's drawing action in the real scene.
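One way to realize the "outline matched to the movement track" idea is sketched below; the dictionary layout of the virtual object is purely an assumed placeholder representation:

```python
import numpy as np

def virtual_object_from_track(track, close_shape=True):
    """Build a simple virtual object whose outline follows the movement track."""
    outline = np.asarray(track, dtype=np.float32)
    if close_shape and not np.allclose(outline[0], outline[-1]):
        outline = np.vstack([outline, outline[:1]])      # close the drawn shape
    return {
        "type": "user_drawn",
        "outline": outline,                 # contour matched to the movement track
        "anchor": outline.mean(axis=0),     # local origin used when placing the object
    }
```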
S104: and binding and storing the virtual object and the real position.
In this step, the virtual object and the real position may be bound and saved, so that a corresponding virtual effect is presented when needed.
In addition, after the virtual object is generated and stored, a shortcut between the virtual object and the user terminal may be set, so that the virtual object can be displayed directly when the user subsequently shoots a video image of the same view, or can be called up directly through the shortcut.
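A minimal in-memory sketch of the binding record described in S104 might look as follows; the record fields, the rounding of positions into keys, and the module-level store are illustrative assumptions, and a real deployment would keep the bindings on the background server:

```python
import time
import uuid

bindings = {}   # in-memory stand-in for the background store of bound objects

def bind_and_save(virtual_object, real_position, creator_id, shared=False):
    """Bind a virtual object to a real position and keep it for later retrieval."""
    key = tuple(round(float(c), 2) for c in real_position)   # coarse position key
    record = {
        "id": uuid.uuid4().hex,
        "virtual_object": virtual_object,
        "real_position": list(real_position),   # e.g. coordinates in the AR scene model
        "creator": creator_id,
        "shared": shared,
        "created_at": time.time(),
    }
    bindings.setdefault(key, []).append(record)
    return record["id"]
```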
According to the AR interaction method provided by the embodiment of the disclosure, a plurality of first video images of a real scene, which are acquired by a first user side, are obtained, wherein the first video images comprise a target object; identifying a real position of the target object in the real scene and identifying an action of the target object based on the plurality of first video images; generating a virtual object according to the action of the target object; and binding and storing the virtual object and the real position.
Therefore, by means of the video images collected by the terminal, a virtual object created by the user can be obtained and then bound and saved with the real position, thereby realizing AR interaction. This effectively reduces the dependence on specific equipment and the associated cost, reduces the consumption of and dependence on computing resources at the user device side, lowers the threshold of AR interaction, simplifies its complexity, improves its applicability and convenience, and improves the user's AR experience.
Referring to fig. 5, fig. 5 is a flowchart of another AR interaction method according to an embodiment of the present disclosure. As shown in fig. 5, the AR interaction method provided in the embodiment of the present disclosure includes:
S501: a plurality of first video images of a real scene collected by a first user terminal are obtained, wherein the first video images include a target object.
S502: based on the plurality of first video images, a real position of the target object in the real scene is identified, and a motion of the target object is identified.
S503: and generating a virtual object according to the action of the target object.
S504: and binding and storing the virtual object and the real position.
S505: and displaying an AR effect picture of the virtual object in the real scene to a first user through the first user terminal.
In this step, after the virtual object is obtained, the virtual object bound and saved with the real position may be called up on the server, and the corresponding AR data is sent to the first user terminal, so that the AR effect picture of the virtual object in the real scene is displayed to the first user through the first user terminal.
For example, please refer to fig. 6, which is a schematic diagram of an effect display during AR interaction. As shown in fig. 6, the virtual object may be displayed at the bound real position on the first user terminal directly through position guidance, or it may be displayed at the real-scene position corresponding to the direction and angle of the first user terminal, without needing to be viewed through specific equipment.
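The display step can be pictured as projecting the bound object into the current camera view. The sketch below assumes the device pose (rvec, tvec) has been recovered by relocalisation as in the earlier PnP sketch, and the scale factor and colours are arbitrary illustrative choices:

```python
import cv2
import numpy as np

def render_ar_effect(frame, record, rvec, tvec, camera_matrix, scale=0.001):
    """Overlay the bound virtual object's outline where its real position
    projects into the current camera frame."""
    outline_2d = np.asarray(record["virtual_object"]["outline"], np.float32)
    # Lift the drawn outline into 3D around the bound real position (flat placement).
    outline_3d = np.hstack([outline_2d * scale,
                            np.zeros((len(outline_2d), 1), np.float32)])
    outline_3d = outline_3d + np.float32(record["real_position"])

    points_2d, _ = cv2.projectPoints(outline_3d, rvec, tvec, camera_matrix, None)
    cv2.polylines(frame, [np.int32(points_2d).reshape(-1, 1, 2)], True,
                  color=(0, 255, 255), thickness=2)
    return frame
```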
The descriptions of step S501 to step S504 may refer to the descriptions of step S101 to step S104, and the same technical effect and the same technical problem can be achieved, which are not described herein again.
Next, this embodiment will be further described with reference to some specific embodiments.
In some possible embodiments, step S505 includes:
receiving a control operation of the first user for the virtual object in the AR effect picture;
and controlling the virtual object to execute an instruction corresponding to the control operation in the AR effect picture.
In this step, while the AR effect picture of the virtual object in the real scene is displayed to the first user through the first terminal, the first user may need to adjust the position or the motion of the virtual object in the displayed AR effect picture. In that case, a control operation applied by the first user to the virtual object in the AR effect picture may be received, for example an operating action or operating voice applied through the first terminal, or, as in the process of generating the virtual object, a video image containing the control operation may be collected through the first terminal. After the control operation is received or recognized, the virtual object can be controlled to execute the instruction corresponding to the control operation in the real scene.
The control of the virtual object in the AR effect picture may be to control the virtual object to move, for example, to move in a small area where the real position is located, or to make various actions, or to change the bound real position, and so on.
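A sketch of executing the instruction corresponding to a control operation is shown below; the operation dictionary format is an assumption for illustration, since the disclosure does not fix a concrete message format:

```python
def apply_control_operation(record, operation):
    """Execute the instruction corresponding to a received control operation,
    e.g. {"type": "move", "offset": (dx, dy, dz)} or {"type": "action", "name": "wave"}."""
    if operation["type"] == "move":
        # Move within a small area around the bound real position.
        record["real_position"] = [c + d for c, d in
                                   zip(record["real_position"], operation["offset"])]
    elif operation["type"] == "action":
        record["virtual_object"]["current_action"] = operation["name"]
    elif operation["type"] == "rebind":
        # Change the bound real position entirely.
        record["real_position"] = list(operation["new_position"])
    return record
```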
In some possible embodiments, after the binding the virtual object with the real location is saved, the method further includes:
acquiring a second video image acquired by the first user side;
identifying whether a scene location in the second video image is consistent with the real location;
if the scene position is consistent with the real position, acquiring the virtual object bound and saved with the real position as a target virtual object;
and displaying an AR effect picture of the target virtual object in the real scene to a first user through the first user terminal.
In this step, when the first user comes to the real scene again, or wants to interact again after the last interaction has ended, the user may use the first user terminal to collect second video images of the real scene. Correspondingly, after the second video images collected by the first user terminal are obtained, the scene position contained in the second video images can be identified, and it can be determined whether this scene position is consistent with the real position. If they are consistent, the virtual object bound and saved with the real position can be acquired directly and used as the target virtual object, so that an AR effect picture of the target virtual object in the real scene can be displayed to the first user through the first user terminal.
The scene position in the second video image may be the same position as the real position, and may be viewed from the same angle and the same spot in the real scene, or from a different angle and a different spot.
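The consistency check between the scene position of the second video image and the stored real position can be sketched as a simple distance test against the binding store introduced earlier; the tolerance value and the distance metric are assumptions:

```python
import numpy as np

def find_bound_objects(scene_position, bindings, tolerance=0.5):
    """Return all virtual objects whose bound real position is consistent with
    (i.e. close enough to) the scene position identified in the second video image."""
    matched = []
    for key, records in bindings.items():
        if np.linalg.norm(np.float32(key) - np.float32(scene_position)) <= tolerance:
            matched.extend(records)
    return matched
```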
Further, the first user may perform AR interaction several times at the same position, creating a virtual object each time, so that as interactions accumulate a plurality of virtual objects become bound and saved at the same position; the first user may also create a plurality of virtual objects within one interaction. Likewise, if multiple users perform AR interaction at the same position, each may create virtual objects, and the created virtual objects can be shared among different users, so that the virtual objects are retained and stored at that real position.
Therefore, in a possible implementation, in response to there being a plurality of virtual objects bound and saved with the real position, the acquiring the virtual object bound and saved with the real position as a target virtual object includes:
and determining a preset number of target virtual objects from the plurality of virtual objects.
Here, when a plurality of virtual objects are bound and saved at the real position and the first user performs AR interaction again at that position, there is no need to create a virtual object again: if the bound and saved virtual objects already contain what the first user needs, a preset number of target virtual objects can be determined directly from the plurality of virtual objects.
The target virtual objects determined from the plurality of virtual objects may be virtual objects selected by the user according to factors such as preference and time, or may be recommended automatically by the system, for example by preferentially selecting the most recently generated virtual objects or the virtual objects created by the user himself.
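Under the automatic-recommendation option mentioned above, the selection of a preset number of target virtual objects could be as simple as the following sketch; the ranking rule and preset number are assumptions:

```python
def select_target_virtual_objects(candidates, user_id, preset_number=3):
    """Prefer the requesting user's own objects, then the most recently created ones."""
    ranked = sorted(candidates,
                    key=lambda r: (r["creator"] != user_id, -r["created_at"]))
    return ranked[:preset_number]
```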
The plurality of virtual objects bound to the real position may include a plurality of objects authored by the user himself for a plurality of times, or may include objects authored by other users and allowed to be shared.
Further, in some possible embodiments, after the binding saving of the virtual object and the real position, the method further includes:
in response to a sharing permission operation applied by the first user to the virtual object, setting the virtual object to a sharing state to enable a second user to invoke viewing of the virtual object.
In this step, after creating the virtual object, the first user may choose to share it so that other users can interact with it. Correspondingly, a sharing permission operation may be applied to the virtual object through the first terminal or the like; after the sharing permission operation is received, the virtual object is set to a sharing state in response, so that a second user can call up and view the virtual object, which improves the convenience of AR interaction to a certain extent.
According to the AR interaction method provided by the embodiment of the disclosure, a plurality of first video images of a real scene, which are acquired by a first user side, are obtained, wherein the first video images comprise a target object; identifying a real position of the target object in the real scene and identifying an action of the target object based on the plurality of first video images; generating a virtual object according to the action of the target object; binding and saving the virtual object and the real position; and displaying an AR effect picture of the virtual object in the real scene to a first user through the first user terminal.
Therefore, with the help of the video images collected by the user terminal, a virtual object created by the user can be obtained and then bound and saved with the real position, and the AR effect picture can be presented through the user terminal, thereby realizing AR interaction. This effectively reduces the dependence on specific equipment and the associated cost, reduces the consumption of and dependence on computing resources at the user device side, lowers the threshold of AR interaction, simplifies its complexity, improves its applicability and convenience, and improves the user's AR experience.
It will be understood by those skilled in the art that, in the methods of the present disclosure, the order in which the steps are written does not imply a strict order of execution or impose any limitation on the implementation; the specific order of execution of the steps should be determined by their function and possible inherent logic.
Based on the same inventive concept, the embodiments of the present disclosure also provide an AR interaction apparatus corresponding to the AR interaction method. Since the principle by which the apparatus solves the problem is similar to that of the AR interaction method of the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
Referring to fig. 7 and 8, fig. 7 is a first schematic diagram of an AR interaction apparatus provided in an embodiment of the disclosure, and fig. 8 is a second schematic diagram of the AR interaction apparatus provided in the embodiment of the disclosure. As shown in fig. 7, an apparatus 700 provided by an embodiment of the present disclosure includes:
an image obtaining module 710, configured to obtain a plurality of first video images of a real scene collected by a first user terminal, where the first video images include a target object;
a motion recognition module 720, configured to recognize a real position of the target object in the real scene and recognize a motion of the target object based on the plurality of first video images;
an object generating module 730, configured to generate a virtual object according to the action of the target object;
and an object storage module 740, configured to bind and store the virtual object and the real location.
In an optional implementation manner, the action recognition module 720 is specifically configured to:
extracting image features based on at least one of the plurality of first video images;
and matching in an AR scene model corresponding to the real scene based on the image characteristics to obtain the real position of the target object in the real scene.
In another optional implementation, the action recognition module 720 is specifically configured to:
for each first video image, determining an image position of the target object in the first video image;
determining a movement track of the target object based on a plurality of image positions determined by the plurality of first video images;
and determining the action of the target object based on the movement track.
In an optional implementation manner, the object generation module 730 is specifically configured to:
and generating a virtual object based on the moving track, wherein the outline of the virtual object is matched with the moving track.
In an optional implementation manner, the apparatus further includes a first display module 750, where the first display module 750 is specifically configured to:
and displaying an AR effect picture of the virtual object in the real scene to a first user through the first user terminal.
In an optional embodiment, the apparatus further comprises an object control module 760, and the object control module 760 is configured to:
receiving a control operation of the first user for the virtual object in the AR effect picture;
and controlling the virtual object to execute an instruction corresponding to the control operation in the real scene.
In an alternative embodiment, the apparatus further comprises a second display module 770, wherein the second display module 770 is specifically configured to:
acquiring a second video image acquired by the first user side;
identifying whether a scene location in the second video image is consistent with the real location;
if the scene position is consistent with the real position, acquiring the virtual object bound and saved with the real position as a target virtual object;
and displaying an AR effect picture of the target virtual object in the real scene to a first user through the first user terminal.
In an optional embodiment, in response to there being a plurality of virtual objects bound and saved with the real position, the second display module 770, when configured to acquire the virtual object bound and saved with the real position as a target virtual object, is specifically configured to:
and determining a preset number of target virtual objects from the plurality of virtual objects.
In an optional implementation manner, the apparatus further includes an object sharing module 780, where the sharing module 780 is specifically configured to:
in response to a sharing permission operation applied by the first user to the virtual object, setting the virtual object to a sharing state to enable a second user to invoke viewing of the virtual object.
The AR interaction apparatus provided by the present disclosure can, with the help of the video images collected by the terminal, obtain a virtual object created by the user and then bind and save the virtual object with the real position, thereby realizing AR interaction. This effectively reduces the dependence on specific equipment and the associated cost, reduces the consumption of and dependence on computing resources at the user device side, lowers the threshold of AR interaction, simplifies its complexity, increases its applicability and convenience, and helps improve the user's AR experience.
The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.
Corresponding to the above AR interaction method, an embodiment of the present disclosure further provides an electronic device 900, as shown in fig. 9, which is a schematic structural diagram of the electronic device 900 provided in the embodiment of the present disclosure, and includes:
a processor 910, a memory 920, and a bus 930. The memory 920 is used for storing execution instructions and includes an internal memory 921 and an external memory 922. The internal memory 921 temporarily stores operation data of the processor 910 and data exchanged with the external memory 922, such as a hard disk; the processor 910 exchanges data with the external memory 922 through the internal memory 921. When the electronic device 900 runs, the processor 910 communicates with the memory 920 through the bus 930, so that the processor 910 executes the steps of the AR interaction method described above.
The embodiments of the present disclosure also provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the AR interaction method described in the above method embodiments. The storage medium may be a volatile or non-volatile computer-readable storage medium.
The embodiments of the present disclosure also provide a computer program product, where the computer program product carries a program code, and instructions included in the program code may be used to execute the steps of the AR interaction method in the foregoing method embodiments, which may be referred to specifically in the foregoing method embodiments, and are not described herein again.
The computer program product may be implemented by hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied in a computer storage medium; in another alternative embodiment, the computer program product is embodied in a software product, such as a Software Development Kit (SDK).
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are merely specific implementations of the present disclosure, used to illustrate rather than limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the technical field can still, within the technical scope disclosed herein, modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions of some of the technical features; such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present disclosure, and shall all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (12)

1. An AR interaction method, the method comprising:
acquiring a plurality of first video images of a real scene, which are acquired by a first user side, wherein the first video images comprise a target object;
identifying a real position of the target object in the real scene and identifying an action of the target object based on the plurality of first video images;
generating a virtual object according to the action of the target object;
and binding and storing the virtual object and the real position.
2. The method of claim 1, wherein identifying the real position of the target object in the real scene based on the plurality of first video images comprises:
extracting image features based on at least one of the plurality of first video images;
and matching in an AR scene model corresponding to the real scene based on the image characteristics to obtain the real position of the target object in the real scene.
3. The method of claim 1, wherein the act of identifying the target object comprises:
for each first video image, determining an image position of the target object in the first video image;
determining a movement track of the target object based on a plurality of image positions determined by the plurality of first video images;
and determining the action of the target object based on the movement track.
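By way of example only, the toy classifier below turns the per-frame image positions of claim 3 into a movement track and maps it to an action label; the thresholds and action names are invented for illustration.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def movement_track(image_positions: List[Point]) -> List[Point]:
    """The track is simply the ordered sequence of per-frame image positions."""
    return list(image_positions)


def classify_action(track: List[Point]) -> str:
    """Toy classifier: a long path with small net displacement becomes 'circle';
    otherwise the dominant displacement axis decides a swipe direction."""
    if len(track) < 2:
        return "none"
    (x0, y0), (x1, y1) = track[0], track[-1]
    dx, dy = x1 - x0, y1 - y0
    path_len = sum(hypot(bx - ax, by - ay)
                   for (ax, ay), (bx, by) in zip(track, track[1:]))
    if path_len > 0 and path_len > 4 * hypot(dx, dy):
        return "circle"
    if abs(dx) >= abs(dy):
        return "swipe_right" if dx > 0 else "swipe_left"
    return "swipe_down" if dy > 0 else "swipe_up"


# usage: positions of the target object in consecutive first video images
print(classify_action(movement_track([(10, 100), (60, 102), (120, 98)])))  # swipe_right
```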
4. The method of claim 3, wherein generating a virtual object based on the action of the target object comprises:
and generating a virtual object based on the moving track, wherein the outline of the virtual object is matched with the moving track.
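One possible (purely hypothetical) way to make the outline of the virtual object match the movement track, as claim 4 describes, is to offset the track on both sides and close the resulting ribbon into a contour:

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def outline_from_track(track: List[Point], half_width: float = 5.0) -> List[Point]:
    """Turn a movement track into a closed outline by offsetting each track point
    perpendicular to the local direction, so the virtual object's contour follows
    the track (out along one side, back along the other)."""
    left, right = [], []
    for i, (x, y) in enumerate(track):
        ax, ay = track[max(i - 1, 0)]
        bx, by = track[min(i + 1, len(track) - 1)]
        dx, dy = bx - ax, by - ay
        norm = hypot(dx, dy) or 1.0
        nx, ny = -dy / norm, dx / norm            # unit normal to the track direction
        left.append((x + nx * half_width, y + ny * half_width))
        right.append((x - nx * half_width, y - ny * half_width))
    return left + right[::-1]                     # closed contour matching the track


print(outline_from_track([(0, 0), (10, 0), (20, 5)]))
```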
5. The method of claim 1, wherein after the binding and storing of the virtual object and the real position, the method further comprises:
and displaying, to a first user through the first user end, an AR effect picture of the virtual object in the real scene.
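Purely as an illustration of how the first user end could composite the AR effect picture of claim 5, the snippet below overlays a stored outline on a camera frame with OpenCV; the frame, outline and colours are stand-ins, not part of the disclosure.

```python
import cv2
import numpy as np

# Stand-in camera frame and virtual-object outline (pixel coordinates).
frame = np.zeros((480, 640, 3), dtype=np.uint8)
outline = np.array([[100, 100], [200, 120], [220, 220], [110, 240]], dtype=np.int32)

# Overlay the virtual object on the real-scene frame to form the AR effect picture.
ar_picture = frame.copy()
cv2.fillPoly(ar_picture, [outline], color=(0, 128, 0))                       # body
cv2.polylines(ar_picture, [outline], isClosed=True, color=(0, 255, 0), thickness=2)  # contour

cv2.imwrite("ar_effect_picture.png", ar_picture)   # or display it on the user end
```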
6. The method of claim 5, wherein after the displaying, to the first user through the first user end, of the AR effect picture of the virtual object in the real scene, the method further comprises:
receiving a control operation of the first user for the virtual object in the AR effect picture;
and controlling the virtual object to execute an instruction corresponding to the control operation in the AR effect picture.
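A minimal sketch of the control loop of claim 6, assuming hypothetical "drag" and "pinch" operations mapped to instructions executed on the virtual object in the AR effect picture:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class VirtualObject:
    outline: List[Tuple[float, float]]
    scale: float = 1.0
    offset: Tuple[float, float] = (0.0, 0.0)


def move(obj: VirtualObject, dx: float, dy: float) -> None:
    obj.offset = (obj.offset[0] + dx, obj.offset[1] + dy)


def zoom(obj: VirtualObject, factor: float) -> None:
    obj.scale *= factor


# Illustrative mapping from a control operation received from the first user to
# the instruction the virtual object executes in the AR effect picture.
HANDLERS: Dict[str, Callable[..., None]] = {"drag": move, "pinch": zoom}


def apply_control(obj: VirtualObject, operation: str, *args) -> None:
    HANDLERS[operation](obj, *args)


obj = VirtualObject(outline=[(0, 0), (10, 0), (10, 10)])
apply_control(obj, "drag", 5.0, -3.0)   # the user drags the object in the AR picture
apply_control(obj, "pinch", 1.5)        # the user scales it
print(obj.offset, obj.scale)
```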
7. The method of claim 1, wherein after the binding and storing of the virtual object and the real position, the method further comprises:
acquiring a second video image acquired by the first user end;
identifying whether a scene location in the second video image is consistent with the real position;
if the scene location is consistent with the real position, acquiring the virtual object bound and stored with the real position as a target virtual object;
and displaying, to a first user through the first user end, an AR effect picture of the target virtual object in the real scene.
8. The method according to claim 7, wherein, in response to there being a plurality of virtual objects bound and stored with the real position, the acquiring of the virtual object bound and stored with the real position as a target virtual object comprises:
and determining a preset number of target virtual objects from the plurality of virtual objects.
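For claims 7 and 8, the sketch below assumes a simple distance tolerance as the consistency check between the scene location of the second video image and a saved real position, and caps the returned target virtual objects at a preset number; the store, tolerance and object names are illustrative assumptions.

```python
from math import dist
from typing import Dict, List, Tuple

RealPosition = Tuple[float, float, float]

# Illustrative store of virtual objects bound and stored with real positions (claim 1).
bindings: Dict[RealPosition, List[str]] = {
    (1.0, 0.0, 2.0): ["graffiti_a", "graffiti_b", "graffiti_c"],
}


def lookup_bound_objects(scene_location: RealPosition,
                         tolerance: float = 0.5,
                         preset_number: int = 2) -> List[str]:
    """If the location recognized from the second video image is consistent with a
    saved real position (within `tolerance`), return at most `preset_number` of the
    virtual objects bound to that position as target virtual objects."""
    for position, objects in bindings.items():
        if dist(scene_location, position) <= tolerance:   # "consistent" check
            return objects[:preset_number]
    return []


print(lookup_bound_objects((1.2, 0.1, 1.9)))   # -> ['graffiti_a', 'graffiti_b']
```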
9. The method of claim 1, wherein after the binding and storing of the virtual object and the real position, the method further comprises:
in response to a sharing permission operation applied by the first user to the virtual object, setting the virtual object to a sharing state so that a second user can call up and view the virtual object.
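The sharing state of claim 9 can be pictured as a flag on the stored object, as in the hypothetical sketch below (the owner and user names are invented):

```python
from dataclasses import dataclass


@dataclass
class StoredVirtualObject:
    owner: str
    shared: bool = False   # sharing state toggled by the owner's permission operation


def set_sharing(obj: StoredVirtualObject, allow: bool) -> None:
    """Applied when the first user performs a sharing permission operation."""
    obj.shared = allow


def can_view(obj: StoredVirtualObject, user: str) -> bool:
    """A second user may call up and view the object only once it is shared."""
    return user == obj.owner or obj.shared


obj = StoredVirtualObject(owner="first_user")
set_sharing(obj, True)
print(can_view(obj, "second_user"))   # True once the sharing state is set
```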
10. An AR interaction apparatus, comprising:
the system comprises an image acquisition module, a target object acquisition module and a target object acquisition module, wherein the image acquisition module is used for acquiring a plurality of first video images of a real scene, which are acquired by a first user terminal, and the first video images comprise the target object;
the action recognition module is used for recognizing the real position of the target object in the real scene and recognizing the action of the target object based on the plurality of first video images;
the object generation module is used for generating a virtual object according to the action of the target object;
and the object storage module is used for binding and storing the virtual object and the real position.
11. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the AR interaction method of any of claims 1 to 9.
12. A computer-readable storage medium, having stored thereon a computer program which, when executed by a processor, performs the steps of the AR interaction method according to any one of claims 1 to 9.
CN202210082222.7A 2022-01-24 2022-01-24 AR interaction method, device, equipment and storage medium Pending CN114489337A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210082222.7A CN114489337A (en) 2022-01-24 2022-01-24 AR interaction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210082222.7A CN114489337A (en) 2022-01-24 2022-01-24 AR interaction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114489337A true CN114489337A (en) 2022-05-13

Family

ID=81474216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210082222.7A Pending CN114489337A (en) 2022-01-24 2022-01-24 AR interaction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114489337A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115016688A (en) * 2022-06-28 2022-09-06 维沃移动通信有限公司 Virtual information display method and device and electronic equipment

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130307875A1 (en) * 2012-02-08 2013-11-21 Glen J. Anderson Augmented reality creation using a real scene
CN108537867A (en) * 2018-04-12 2018-09-14 北京微播视界科技有限公司 According to the Video Rendering method and apparatus of user's limb motion
CN108550190A (en) * 2018-04-19 2018-09-18 腾讯科技(深圳)有限公司 Augmented reality data processing method, device, computer equipment and storage medium
CN108762505A (en) * 2018-05-29 2018-11-06 腾讯科技(深圳)有限公司 Virtual object control method, device, storage medium based on gesture and equipment
CN109754471A (en) * 2019-01-10 2019-05-14 网易(杭州)网络有限公司 Image processing method and device, storage medium, electronic equipment in augmented reality
CN111638796A (en) * 2020-06-05 2020-09-08 浙江商汤科技开发有限公司 Virtual object display method and device, computer equipment and storage medium
CN111640197A (en) * 2020-06-09 2020-09-08 上海商汤智能科技有限公司 Augmented reality AR special effect control method, device and equipment
CN112684894A (en) * 2020-12-31 2021-04-20 北京市商汤科技开发有限公司 Interaction method and device for augmented reality scene, electronic equipment and storage medium
WO2021073292A1 (en) * 2019-10-15 2021-04-22 北京市商汤科技开发有限公司 Ar scene image processing method and apparatus, and electronic device and storage medium
CN112774194A (en) * 2021-01-13 2021-05-11 腾讯科技(深圳)有限公司 Interaction method of virtual objects and related device

Similar Documents

Publication Publication Date Title
CN108305317B (en) Image processing method, device and storage medium
CN110716645A (en) Augmented reality data presentation method and device, electronic equipment and storage medium
US20190139297A1 (en) 3d skeletonization using truncated epipolar lines
CN112243583B (en) Multi-endpoint mixed reality conference
CN112148189A (en) Interaction method and device in AR scene, electronic equipment and storage medium
CN109636919B (en) Holographic technology-based virtual exhibition hall construction method, system and storage medium
US11782272B2 (en) Virtual reality interaction method, device and system
JP7387202B2 (en) 3D face model generation method, apparatus, computer device and computer program
CN112199016B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN110545442B (en) Live broadcast interaction method and device, electronic equipment and readable storage medium
CN108525305A (en) Image processing method, device, storage medium and electronic equipment
TW202304212A (en) Live broadcast method, system, computer equipment and computer readable storage medium
CN111860252A (en) Image processing method, apparatus and storage medium
CN111638797A (en) Display control method and device
WO2022252866A1 (en) Interaction processing method and apparatus, terminal and medium
CN111667588A (en) Person image processing method, person image processing device, AR device and storage medium
CN113867531A (en) Interaction method, device, equipment and computer readable storage medium
CN112308977B (en) Video processing method, video processing device, and storage medium
WO2023168957A1 (en) Pose determination method and apparatus, electronic device, storage medium, and program
CN112905014A (en) Interaction method and device in AR scene, electronic equipment and storage medium
CN112148125A (en) AR interaction state control method, device, equipment and storage medium
CN112882576A (en) AR interaction method and device, electronic equipment and storage medium
CN111651052A (en) Virtual sand table display method and device, electronic equipment and storage medium
CN111142967A (en) Augmented reality display method and device, electronic equipment and storage medium
CN114489337A (en) AR interaction method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination