CN113813595A - Method and device for realizing interaction - Google Patents

Method and device for realizing interaction

Info

Publication number
CN113813595A
Authority
CN
China
Prior art keywords
interactive
matrix
rendering
model
engine
Prior art date
Legal status
Pending
Application number
CN202110056976.0A
Other languages
Chinese (zh)
Inventor
吴朝阳
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202110056976.0A
Publication of CN113813595A

Classifications

    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63F - CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00 - Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/20 - Input arrangements for video game devices
    • A63F13/21 - Input arrangements for video game devices characterised by their sensors, purposes or types
    • A63F13/216 - Input arrangements for video game devices characterised by their sensors, purposes or types using geographical information, e.g. location of the game device or player using GPS
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 - Complex mathematical operations
    • G06F17/16 - Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • A - HUMAN NECESSITIES
    • A63 - SPORTS; GAMES; AMUSEMENTS
    • A63F - CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F2300/00 - Features of games using an electronically generated display having two or more dimensions, e.g. on a television screen, showing representations related to the game
    • A63F2300/60 - Methods for processing data by generating or executing the game program
    • A63F2300/66 - Methods for processing data by generating or executing the game program for rendering three dimensional images

Abstract

The invention discloses a method and a device for realizing interaction, and relates to the field of computer technology. One embodiment of the method comprises: obtaining face key point data from a real-time face image; determining the position and posture of an interactive model in an interactive scene according to the face key point data, wherein the interactive model corresponds to the face image; and rendering the model in the interactive scene, wherein the model in the interactive scene comprises the interactive model. This embodiment can realize interactive applications such as AR interactive games without requiring the mobile terminal device to support a SLAM function.

Description

Method and device for realizing interaction
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for realizing interaction.
Background
Currently, developing an interactive application, for example an AR (augmented reality) interactive game based on a mobile terminal device, requires core technology modules such as a rendering engine, a physics engine and particle special effects, and also requires the mobile terminal device to support a SLAM (simultaneous localization and mapping) function in order to implement the AR interactive game.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
existing approaches are limited to mobile terminal devices that support the SLAM function; on mobile terminal devices that do not support SLAM, the existing methods for developing interactive applications cannot realize such interaction.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for implementing interaction, which can implement interactive applications such as AR interactive games without a mobile device supporting a SLAM function.
To achieve the above object, according to an aspect of an embodiment of the present invention, a method for implementing interaction is provided.
A method of enabling interaction, comprising: obtaining face key point data from a real-time face image; determining the position and the posture of an interactive model in an interactive scene according to the face key point data, wherein the interactive model corresponds to the face image; rendering a model in the interactive scene, wherein the model in the interactive scene comprises the interactive model.
Optionally, before obtaining the face key point data from the real-time face image, the method includes: initializing a physics engine and a rendering engine, the initializing the physics engine comprising constructing objects having physical attributes in the interactive scene, the initializing the rendering engine comprising loading a model in the interactive scene to the rendering engine, the physics engine and the rendering engine for rendering the model in the interactive scene.
Optionally, the physics engine and the rendering engine are encapsulated using a graphics device application program interface and an underlying application program interface algorithm library associated with image processing.
Optionally, the face key point data includes rotation information, scaling information, and two-dimensional position information of a specific key point; determining the position and the posture of an interactive model in an interactive scene according to the face key point data, wherein the determining comprises the following steps: calculating to obtain three-dimensional position information of the interactive model in the interactive scene by using the scaling information, the view matrix and the projection matrix in the rendering engine according to the two-dimensional position information of the specific key point; and determining the posture of the interactive model in the interactive scene by using the rotation information.
Optionally, the calculating, according to the two-dimensional position information of the specific keypoint, three-dimensional position information of the interactive model in the interactive scene by using the scaling information, the view matrix in the rendering engine, and the projection matrix includes: multiplying the view matrix in the rendering engine by the projection matrix, and then calculating an inverse matrix to obtain a projection view inverse matrix; and combining the two-dimensional position information of the specific key point and the scaling information into a three-dimensional coordinate point, and multiplying the projection view inverse matrix with the three-dimensional coordinate point to obtain the three-dimensional position information of the interactive model in the interactive scene.
Optionally, the rendering the model in the interactive scene includes: synchronizing a physical matrix and a rendering matrix through the physical engine, performing physical attribute simulation on the object in the interactive scene, and performing collision detection and processing on the object, wherein the rendering matrix comprises the view matrix and the projection matrix, and the physical matrix is a matrix in the physical engine; updating, by the rendering engine, the rendering matrix based on a result of the physical property simulation and/or a result of the collision detection and processing, and rendering the interactive scene based on the updated rendering matrix.
Optionally, the specific key point is a tip of a nose of the face image.
According to another aspect of the embodiments of the present invention, an apparatus for implementing interaction is provided.
An apparatus for enabling interaction, comprising: the face image recognition module is used for obtaining face key point data from a real-time face image; the interactive model pose determining module is used for determining the position and the posture of an interactive model in an interactive scene according to the face key point data, and the interactive model corresponds to the face image; and the rendering module is used for rendering the model in the interactive scene, wherein the model in the interactive scene comprises the interactive model.
Optionally, the system further comprises an initialization module, configured to: initializing a physics engine and a rendering engine, the initializing the physics engine comprising constructing objects having physical attributes in the interactive scene, the initializing the rendering engine comprising loading a model in the interactive scene to the rendering engine, the physics engine and the rendering engine for rendering the model in the interactive scene.
Optionally, the physics engine and the rendering engine are encapsulated using a graphics device application program interface and an underlying application program interface algorithm library associated with image processing.
Optionally, the face key point data includes rotation information, scaling information, and two-dimensional position information of a specific key point; the interactive model pose determination module is further configured to: calculating to obtain three-dimensional position information of the interactive model in the interactive scene by using the scaling information, the view matrix and the projection matrix in the rendering engine according to the two-dimensional position information of the specific key point; and determining the posture of the interactive model in the interactive scene by using the rotation information.
Optionally, the interactive model pose determination module is further configured to: multiplying the view matrix in the rendering engine by the projection matrix, and then calculating an inverse matrix to obtain a projection view inverse matrix; and combining the two-dimensional position information of the specific key point and the scaling information into a three-dimensional coordinate point, and multiplying the projection view inverse matrix with the three-dimensional coordinate point to obtain the three-dimensional position information of the interactive model in the interactive scene.
Optionally, the rendering module is further configured to: synchronizing a physical matrix and a rendering matrix through the physical engine, performing physical attribute simulation on the object in the interactive scene, and performing collision detection and processing on the object, wherein the rendering matrix comprises the view matrix and the projection matrix, and the physical matrix is a matrix in the physical engine; updating, by the rendering engine, the rendering matrix based on a result of the physical property simulation and/or a result of the collision detection and processing, and rendering the interactive scene based on the updated rendering matrix.
Optionally, the specific key point is a tip of a nose of the face image.
According to yet another aspect of an embodiment of the present invention, an electronic device is provided.
An electronic device, comprising: one or more processors; a memory for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for implementing interaction provided by embodiments of the present invention.
According to yet another aspect of an embodiment of the present invention, a computer-readable medium is provided.
A computer readable medium, on which a computer program is stored, which when executed by a processor implements the method for implementing interaction provided by embodiments of the present invention.
One embodiment of the above invention has the following advantages or benefits: face key point data is obtained from a real-time face image; the position and posture of an interactive model in an interactive scene are determined according to the face key point data, the interactive model corresponding to the face image; and the model in the interactive scene is rendered, the model in the interactive scene comprising the interactive model. The method can realize interactive applications such as AR interactive games without requiring the mobile terminal device to support a SLAM function.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a diagram illustrating the main steps of a method for implementing interaction according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a system for implementing interaction according to an embodiment of the invention;
FIG. 3 is a flow diagram of a rendering stage according to one embodiment of the invention;
FIG. 4 is a schematic flow diagram of location estimation according to one embodiment of the present invention;
FIG. 5 is a schematic diagram of the main modules of an apparatus for implementing interaction according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 7 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of main steps of a method for realizing interaction according to an embodiment of the present invention.
As shown in fig. 1, the method for implementing interaction according to an embodiment of the present invention mainly includes the following steps S101 to S103.
Step S101: and obtaining the face key point data from the real-time face image.
In step S101, image data containing a face image captured by the camera device may be received in real time, and the face key point data is obtained from that image data.
Before obtaining the face key point data from the real-time face image, the method may include: initializing a physics engine and a rendering engine, where initializing the physics engine comprises constructing an object with physical attributes (the object with physical attributes is a rigid body), initializing the rendering engine comprises loading the model in the interactive scene into the rendering engine, and the physics engine and the rendering engine are used for rendering the model in the interactive scene. Before obtaining the face key point data from the real-time face image, the method may also include initializing a visual algorithm component, which comprises loading a trained model, that is, model files trained by the visual algorithm component in advance and used mainly for image recognition.
The physics engine, the rendering engine and the visual algorithm component are obtained by encapsulating a graphics device application program interface and an underlying application program interface (API) algorithm library related to image processing. The underlying API algorithm library related to image processing may include OpenCV (an open-source computer vision library), OpenGL ES (a subset of the OpenGL three-dimensional graphics API; OpenGL is a professional graphics program interface and a powerful, easy-to-call underlying graphics library), and the like.
Step S102: and determining the position and the posture of an interactive model in an interactive scene according to the key point data of the human face, wherein the interactive model corresponds to the human face image.
The face key point data comprises rotation information, scaling information and a facial contour point sequence, where the facial contour point sequence contains the two-dimensional position information of a specific key point of the face. Determining the position and posture of the interactive model in the interactive scene according to the face key point data may include: calculating the three-dimensional position information of the interactive model in the interactive scene from the two-dimensional position information of the specific key point, using the scaling information and the view matrix and projection matrix in the rendering engine; and determining the posture of the interactive model in the interactive scene using the rotation information.
Calculating the three-dimensional position information of the interactive model in the interactive scene from the two-dimensional position information of the specific key point, using the scaling information and the view matrix and projection matrix in the rendering engine, may include: multiplying the view matrix in the rendering engine by the projection matrix and inverting the result to obtain a projection-view inverse matrix; and combining the two-dimensional position information of the specific key point with the scaling information into a three-dimensional coordinate point, and multiplying the projection-view inverse matrix by the three-dimensional coordinate point to obtain the three-dimensional position information of the interactive model in the interactive scene.
The specific key point may be the tip of the nose in the face image.
Step S103: rendering the model in the interactive scene, wherein the model in the interactive scene comprises the interactive model.
The interactive model is a model, generated from the face image, that participates in the interaction; for example, it may be a model that swings by a certain angle when the face swings by that angle, although the form of interaction with the face realized by the interactive model is not limited to this example. The model in the interactive scene may also include other interaction-related models, for example an object that moves and collides inside the interactive model when the interactive model swings along with the face; the specific implementation is likewise not limited to this example.
The interactive scene is a scene in which a person interacts with the interactive application, and it can be displayed on a screen.
Rendering the model in the interactive scene may include: synchronizing a physical matrix and a rendering matrix through the physics engine, performing physical attribute simulation on the object in the interactive scene, and performing collision detection and processing on the object, where the rendering matrix comprises the view matrix and the projection matrix, and the physical matrix is a matrix in the physics engine. The matrix in the physics engine is used to transform the basic collision bodies in the physics engine, for example from the rigid-body coordinate system into the world coordinate system; the matrix format may be 4×4, and the physical matrix and the rendering matrix may be synchronized through the world coordinate system as an intermediate coordinate system. The rendering engine then updates the rendering matrix based on the result of the physical attribute simulation and/or the result of the collision detection and processing, and renders the interactive scene based on the updated rendering matrix.
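Purely as an illustration, the three steps could be tied together per frame roughly as in the following Python sketch; the three stage functions are hypothetical placeholders supplied by the application, not interfaces defined by this application.

    # Hypothetical per-frame driver for steps S101 to S103 (illustrative only).
    def run_frame(get_face_keypoints, estimate_pose, render_scene):
        keypoints = get_face_keypoints()              # step S101: key point data from the live face image
        if keypoints is None:                         # no face detected in this frame
            return
        position, posture = estimate_pose(keypoints)  # step S102: pose of the interactive model
        render_scene(position, posture)               # step S103: render the model in the interactive scene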
The method for realizing interaction according to the embodiment of the present invention is described in detail below, taking as an example the implementation of an AR interactive game based on face key points and a physics engine; it mainly comprises two stages, initialization and game rendering.
The initialization stage comprises initializing a rendering engine, initializing a physics engine and initializing a visual algorithm component (visual component for short). Initializing the physics engine comprises constructing an object (rigid body) with physical attributes in the interactive scene; initializing the rendering engine comprises loading the model in the interactive scene into the rendering engine, the physics engine and the rendering engine being used for rendering the model in the interactive scene; initializing the visual algorithm component comprises loading a trained model, that is, model files trained by the visual algorithm component in advance and used mainly for image recognition.
Initialization serves to create in advance the objects needed for subsequent graphics and image operations, so that the relevant algorithms can be called directly in the following steps; it also loads basic models for display and the trained models, achieving the purpose of preprocessing. After initialization, the image algorithm can be called directly for face recognition, and graphics rendering can be called for model display.
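As a rough illustration of this initialization, the sketch below uses made-up component classes and file paths; none of these identifiers come from this application, and a real implementation would use the encapsulated rendering engine, physics engine and visual component described above.

    # Illustrative stand-ins only; class names, method names and asset paths are invented.
    class RenderingEngine:
        def __init__(self):
            self.models = {}                     # models loaded for display in the interactive scene
        def load_model(self, name, path):
            self.models[name] = path             # placeholder for real mesh/texture loading

    class PhysicsEngine:
        def __init__(self):
            self.rigid_bodies = []               # objects with physical attributes
        def add_rigid_body(self, name, shape, mass):
            self.rigid_bodies.append({"name": name, "shape": shape, "mass": mass})

    class VisionComponent:
        def __init__(self):
            self.trained_model = None
        def load_trained_model(self, path):
            self.trained_model = path            # placeholder for a pre-trained keypoint model

    def initialize():
        renderer = RenderingEngine()
        renderer.load_model("maze", "assets/maze.obj")            # model in the interactive scene
        physics = PhysicsEngine()
        physics.add_rigid_body("ball", shape="sphere", mass=1.0)  # rigid body with physical attributes
        vision = VisionComponent()
        vision.load_trained_model("assets/face_keypoints.bin")    # trained model for image recognition
        return renderer, physics, vision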
FIG. 2 is a flow diagram of a rendering stage according to one embodiment of the invention.
Each frame of the rendering stage requires update processing; the per-frame flow is shown in fig. 2 and includes a vision phase, a pose estimation phase and an engine phase.
In the visual stage, the face key point data is obtained from the real-time face image (i.e. the frame image in the figure). The visual component receives image data including a face image shot by a camera device in real time, and calculates face key point data corresponding to the current image data through an initialized training model (a model for image recognition) and a related algorithm library (mainly a visual algorithm library packaged based on OpenCV, other deep learning libraries and the like), wherein the face key point data comprises rotation information, scaling information and two-dimensional position information of a specific key point.
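As an illustrative sketch only, the vision phase might look roughly like the following; OpenCV's VideoCapture is used merely as an example frame source, and detect_keypoints stands in for the trained model and algorithm library described above rather than any API defined by this application.

    import cv2  # OpenCV, part of the underlying API layer mentioned above

    def vision_phase(capture, detect_keypoints):
        # One vision-phase step: grab a frame and run the assumed key point detector.
        ok, frame = capture.read()               # real-time frame containing the face image
        if not ok:
            return None
        # Expected to return e.g. {"rotation": ..., "scale": s, "nose_tip": (x, y)}
        return detect_keypoints(frame)

    # Example usage (assumed): vision_phase(cv2.VideoCapture(0), my_detector)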
In the pose estimation stage, the position and posture of the interactive model corresponding to the face image in the interactive scene are determined according to the face key point data. The three-dimensional position information of the interactive model in the interactive scene is calculated from the two-dimensional position information of the specific key point, using the scaling information and the view matrix and projection matrix in the rendering engine; the position estimation flow is shown in fig. 3. In the embodiment of the present invention, the nose tip, which is a stable point, is used as the specific key point. The view matrix V in the rendering engine is multiplied by the projection matrix P, and the result is inverted to obtain the projection-view inverse matrix (PV)⁻¹. The two-dimensional position information of the specific key point (namely the nose tip data (x, y)) and the value z converted from the scaling information s are combined into a three-dimensional coordinate point (x, y, z), and the projection-view inverse matrix (PV)⁻¹ is multiplied by the three-dimensional coordinate point (x, y, z) to obtain the three-dimensional position information of the interactive model in the interactive scene (namely a three-dimensional point in space). The posture of the interactive model in the interactive scene is determined using the rotation information (for example, the rotation angle of the face), thereby obtaining the position and posture matrix information of the interactive model in the interactive scene, which is then updated. Calculating the position and posture of the interactive model in real time from the face key point data allows this logic to be modularized, making the code reusable and convenient for related business development.
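A minimal numerical sketch of this back-projection is given below. It assumes 4×4 view and projection matrices held as NumPy arrays and a nose-tip point already expressed in the coordinate range expected by the rendering engine; the homogeneous divide at the end is an assumption of the sketch, not a detail stated in this application.

    import numpy as np

    def estimate_position(nose_tip_xy, z_from_scale, view, proj):
        # Back-project the 2D nose tip (x, y) plus the depth z derived from the
        # scaling information, using the projection-view inverse matrix (PV)^-1.
        pv_inv = np.linalg.inv(proj @ view)           # (PV)^-1
        x, y = nose_tip_xy
        point = np.array([x, y, z_from_scale, 1.0])   # (x, y, z) in homogeneous form
        world = pv_inv @ point                        # multiply (PV)^-1 by the coordinate point
        return world[:3] / world[3]                   # assumed perspective divide

    def estimate_posture(rotation_info):
        # The posture comes directly from the rotation information of the face
        # (e.g. rotation angles); returned unchanged in this sketch.
        return rotation_info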
The scaling information may be a value in the interval [0, 1], specifically the ratio of the width of the face in the horizontal direction to the width of the whole image. The scaling information s may be converted into z through a preset piecewise linear function; the piecewise linear function may be adjusted as needed and each segment may take the form y = kx + b (where k and b are constants). The view matrix and the projection matrix are used for spatial transformation: the view matrix transforms points of world space into the coordinate system of the camera device, and the projection matrix projects points in the camera coordinate system onto a plane while keeping the projected two-dimensional information and the depth information corresponding to the scaling information. The world space in the rendering engine is the space under the world coordinate system.
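The piecewise linear conversion from s to z might be sketched as follows; the breakpoint values are invented for illustration and are not specified in this application.

    def scale_to_depth(s, breakpoints=((0.0, 8.0), (0.3, 4.0), (0.6, 2.0), (1.0, 1.0))):
        # Piecewise linear mapping from the face-width ratio s in [0, 1] to a depth z.
        # The (s_i, z_i) breakpoints are made up; each segment has the form y = kx + b.
        s = min(max(s, 0.0), 1.0)             # clamp to the documented interval
        x0, y0 = breakpoints[0]
        for x1, y1 in breakpoints[1:]:
            if s <= x1:
                k = (y1 - y0) / (x1 - x0)     # slope of this segment
                return y0 + k * (s - x0)      # y = kx + b evaluated at s
            x0, y0 = x1, y1
        return breakpoints[-1][1]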
In the engine phase, the models in the interactive scene, including the interactive model, are rendered. The physical matrix and the rendering matrix are synchronized through the physics engine (i.e. the physical matrix synchronization in the figure), physical attribute simulation is performed on the object (rigid body) in the interactive scene, and collision detection and processing are performed on the object, where the rendering matrix comprises the view matrix and the projection matrix and the physical matrix is a matrix in the physics engine. Taking a maze as an example of the interactive model: the maze rotates by an angle as the face swings, and one object may be a small ball (only an example) that moves and collides inside the maze as the maze rotates; the physical attribute simulation of the object in the interactive scene then includes simulating physical attributes such as the motion and collision of the ball. One function of the matrix in the physics engine is to transform the coordinates of the ball from the rigid-body coordinate system to the world coordinate system (denoted coordinate A); one function of the view matrix in the rendering engine is to transform points in world space (corresponding to the position of the ball, denoted coordinate B) into the camera coordinate system, and the projection matrix projects points in the camera coordinate system onto a plane. Synchronization between the physics engine and the rendering engine must therefore be ensured for the ball: its position in the rendering world must match its position in the physics world, that is, coordinate A must equal coordinate B. Collision detection and collision processing can be implemented by registering a callback. The rendering engine updates the rendering matrix based on the result of the physical attribute simulation and/or the result of the collision detection and processing, and renders the interactive scene (i.e. the model rendering in the figure) based on the updated rendering matrix. Rendering the interactive scene may include rendering particle effects and related nodes (objects to be rendered); particle effects are commonly used to trigger a series of particle effects (explosion, flame, snow, etc.) after rigid bodies collide in the physics world.
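As an illustration only, one frame of the engine phase could be sketched as follows; the simulate and on_collision callables and the 4×4 NumPy matrices are assumptions of the sketch and do not name any particular physics or rendering engine.

    import numpy as np

    def engine_phase(rigid_to_world, ball_local, view, proj, simulate, on_collision):
        # One engine-phase frame: keep the physics and rendering sides consistent in
        # world coordinates, step the simulation, process collisions, then render.

        # Coordinate A: the ball's position in the physics world (rigid-body -> world).
        ball_world = rigid_to_world @ np.append(ball_local, 1.0)

        # Physical attribute simulation plus collision detection; collisions are handled
        # through a registered callback, e.g. to trigger a particle effect.
        new_rigid_to_world, collisions = simulate(rigid_to_world)
        for contact in collisions:
            on_collision(contact)

        # Coordinate B: the same world-space position fed through the rendering matrices,
        # so the rendering world stays synchronized with the physics world.
        render_matrix = proj @ view                   # updated rendering matrix
        clip_position = render_matrix @ ball_world    # position used when drawing the ball
        return new_rigid_to_world, clip_position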
FIG. 4 is a diagram illustrating a system architecture for implementing interaction according to an embodiment of the present invention.
A schematic diagram of a system structure for implementing an AR interactive game based on face key points and physics engine capability is shown in fig. 4; it mainly includes an API (application program interface) layer, a component layer and a business layer.
The API layer includes a graphics device API and an underlying API algorithm library associated with image processing. The graphics device API is an API related to graphics rendering; it is not tied to a specific device and can run on any operating system with graphics hardware, such as Android, iOS or Windows. The underlying API algorithm library associated with image processing may include OpenCV and the like. The component layer encapsulates the underlying APIs to provide simple, easy-to-use interfaces and algorithm capabilities, and mainly comprises a rendering engine with a rendering function, a physics engine with a physics simulation function and a visual algorithm component (i.e. a visual algorithm library) with an image analysis function. The business layer executes business logic on top of the component layer, mainly comprising pose estimation, loading of the models in the interactive scene, the particle explosion effect and the like.
The business layer can be implemented with a script that supports hot updating (for example a Lua script; Lua is a small, lightweight scripting language written in standard C and released as open source, designed to be embedded in applications to provide flexible extension and customization), which facilitates business development and flexible extension.
FIG. 5 is a schematic diagram of the main modules of an apparatus for implementing interaction according to an embodiment of the present invention.
As shown in fig. 5, an apparatus 500 for implementing interaction according to an embodiment of the present invention mainly includes: the system comprises a face image recognition module 501, an interactive model pose determination module 502 and a rendering module 503.
A face image recognition module 501, configured to obtain face key point data from a real-time face image.
And the interactive model pose determining module 502 is configured to determine a position and a posture of the interactive model in the interactive scene according to the face key point data, where the interactive model corresponds to the face image.
And a rendering module 503, configured to render the models in the interactive scene, where the models in the interactive scene include the interactive models.
In one embodiment, the apparatus may further include an initialization module to: initializing a physics engine and a rendering engine, wherein the initializing the physics engine comprises constructing objects with physical attributes in an interactive scene, the initializing the rendering engine comprises loading a model in the interactive scene to the rendering engine, and the physics engine and the rendering engine are used for rendering the model in the interactive scene.
In one embodiment, the physics engine and the rendering engine are encapsulated using a graphics device application program interface and an underlying application program interface algorithm library associated with image processing.
In one embodiment, the face key point data comprises rotation information, scaling information and two-dimensional position information of a specific key point; the interactive model pose determination module is specifically configured to: calculating to obtain three-dimensional position information of the interactive model in the interactive scene by using the scaling information, the view matrix and the projection matrix in the rendering engine according to the two-dimensional position information of the specific key point; and determining the posture of the interactive model in the interactive scene by using the rotation information.
In one embodiment, the interactive model pose determination module is specifically configured to: multiplying a view matrix in a rendering engine by a projection matrix, and then calculating an inverse matrix to obtain a projection view inverse matrix; and combining the two-dimensional position information and the scaling information of the specific key point into a three-dimensional coordinate point, and multiplying the projection view inverse matrix with the three-dimensional coordinate point to obtain the three-dimensional position information of the interactive model in the interactive scene.
In one embodiment, the rendering module is specifically configured to: synchronizing a physical matrix and a rendering matrix through a physical engine, performing physical attribute simulation on an object in an interactive scene, and performing collision detection and processing on the object, wherein the rendering matrix comprises a view matrix and a projection matrix, and the physical matrix is a matrix in the physical engine; and updating, by the rendering engine, the rendering matrix based on the result of the physical property simulation and/or the result of the collision detection and processing, and rendering the interactive scene based on the updated rendering matrix.
In one embodiment, the particular keypoint may be the tip of the nose of the face image.
In addition, the specific implementation contents of the apparatus for implementing the interaction in the embodiment of the present invention have been described in detail in the above method for implementing the interaction, so that repeated contents are not described herein.
Fig. 6 illustrates an exemplary system architecture 600 of a method or apparatus for implementing interaction to which embodiments of the present invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves to provide a medium for communication links between the terminal devices 601, 602, 603 and the server 605. Network 604 may include various types of connections, such as wire, wireless communication links, or fiber optic cables, to name a few.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. The terminal devices 601, 602, 603 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 601, 602, 603. The backend management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (for example, target push information, product information — just an example) to the terminal device.
It should be noted that the method for implementing interaction provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the apparatus for implementing interaction is generally disposed in the server 605.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, a block diagram of a computer system 700 suitable for use with a terminal device or server implementing an embodiment of the invention is shown. The terminal device or the server shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU)701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data necessary for the operation of the system 700 are also stored. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program performs the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 701.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprises a face image recognition module, an interactive model pose determination module and a rendering module. The names of these modules do not in some cases constitute a limitation on the module itself, and for example, the face image recognition module may also be described as a "module for obtaining face key point data from a real-time face image".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to comprise: obtaining face key point data from a real-time face image; determining the position and the posture of an interactive model in an interactive scene according to the key point data of the human face, wherein the interactive model corresponds to the human face image; rendering the model in the interactive scene, wherein the model in the interactive scene comprises the interactive model.
According to the technical scheme of the embodiment of the invention, the face key point data is obtained from the real-time face image; determining the position and the posture of an interactive model in an interactive scene according to the key point data of the human face, wherein the interactive model corresponds to the human face image; rendering the model in the interactive scene, wherein the model in the interactive scene comprises the interactive model. The method can realize interactive applications such as AR interactive games without the need of supporting the SLAM function by the mobile terminal equipment.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A method for implementing interaction, comprising:
obtaining face key point data from a real-time face image;
determining the position and the posture of an interactive model in an interactive scene according to the face key point data, wherein the interactive model corresponds to the face image;
rendering a model in the interactive scene, wherein the model in the interactive scene comprises the interactive model.
2. The method of claim 1, wherein prior to obtaining face key point data from a live face image, comprising:
initializing a physics engine and a rendering engine, the initializing the physics engine comprising constructing objects having physical attributes in the interactive scene, the initializing the rendering engine comprising loading a model in the interactive scene to the rendering engine, the physics engine and the rendering engine for rendering the model in the interactive scene.
3. The method of claim 2, wherein the physics engine and the rendering engine are encapsulated using a graphics device application program interface and an underlying application program interface algorithm library associated with image processing.
4. The method of claim 2, wherein the face keypoint data comprises rotation information, scaling information and two-dimensional position information of a specific keypoint;
determining the position and the posture of an interactive model in an interactive scene according to the face key point data, wherein the determining comprises the following steps:
calculating to obtain three-dimensional position information of the interactive model in the interactive scene by using the scaling information, the view matrix and the projection matrix in the rendering engine according to the two-dimensional position information of the specific key point; and
and determining the posture of the interactive model in the interactive scene by using the rotation information.
5. The method of claim 4, wherein calculating three-dimensional position information of the interactive model in the interactive scene according to the two-dimensional position information of the specific keypoint by using the scaling information, the view matrix and the projection matrix in the rendering engine comprises:
multiplying the view matrix in the rendering engine by the projection matrix, and then calculating an inverse matrix to obtain a projection view inverse matrix;
and combining the two-dimensional position information of the specific key point and the scaling information into a three-dimensional coordinate point, and multiplying the projection view inverse matrix with the three-dimensional coordinate point to obtain the three-dimensional position information of the interactive model in the interactive scene.
6. The method of claim 4, wherein the rendering the model in the interactive scene comprises:
synchronizing a physical matrix and a rendering matrix through the physical engine, performing physical attribute simulation on the object in the interactive scene, and performing collision detection and processing on the object, wherein the rendering matrix comprises the view matrix and the projection matrix, and the physical matrix is a matrix in the physical engine;
updating, by the rendering engine, the rendering matrix based on a result of the physical property simulation and/or a result of the collision detection and processing, and rendering the interactive scene based on the updated rendering matrix.
7. The method of claim 4, wherein the specific key point is a tip of a nose of the face image.
8. An apparatus for enabling interaction, comprising:
the face image recognition module is used for obtaining face key point data from a real-time face image;
the interactive model pose determining module is used for determining the position and the posture of an interactive model in an interactive scene according to the face key point data, and the interactive model corresponds to the face image;
and the rendering module is used for rendering the model in the interactive scene, wherein the model in the interactive scene comprises the interactive model.
9. The apparatus of claim 8, further comprising an initialization module to:
initializing a physics engine and a rendering engine, the initializing the physics engine comprising constructing objects having physical attributes in the interactive scene, the initializing the rendering engine comprising loading a model in the interactive scene to the rendering engine, the physics engine and the rendering engine for rendering the model in the interactive scene.
10. The apparatus of claim 9, wherein the physics engine and the rendering engine are encapsulated using a graphics device application program interface and an underlying application program interface algorithm library associated with image processing.
11. The apparatus of claim 9, wherein the face keypoint data comprises rotation information, scaling information, and two-dimensional position information of a specific keypoint;
the interactive model pose determination module is further configured to:
calculating to obtain three-dimensional position information of the interactive model in the interactive scene by using the scaling information, the view matrix and the projection matrix in the rendering engine according to the two-dimensional position information of the specific key point; and
and determining the posture of the interactive model in the interactive scene by using the rotation information.
12. The apparatus of claim 11, wherein the interactive model pose determination module is further configured to:
multiplying the view matrix in the rendering engine by the projection matrix, and then calculating an inverse matrix to obtain a projection view inverse matrix;
and combining the two-dimensional position information of the specific key point and the scaling information into a three-dimensional coordinate point, and multiplying the projection view inverse matrix with the three-dimensional coordinate point to obtain the three-dimensional position information of the interactive model in the interactive scene.
13. The apparatus of claim 11, wherein the rendering module is further configured to:
synchronizing a physical matrix and a rendering matrix through the physical engine, performing physical attribute simulation on the object in the interactive scene, and performing collision detection and processing on the object, wherein the rendering matrix comprises the view matrix and the projection matrix, and the physical matrix is a matrix in the physical engine;
updating, by the rendering engine, the rendering matrix based on a result of the physical property simulation and/or a result of the collision detection and processing, and rendering the interactive scene based on the updated rendering matrix.
14. The apparatus of claim 11, wherein the specific key point is a tip of a nose of the face image.
15. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs which,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
16. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202110056976.0A 2021-01-15 2021-01-15 Method and device for realizing interaction Pending CN113813595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110056976.0A CN113813595A (en) 2021-01-15 2021-01-15 Method and device for realizing interaction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110056976.0A CN113813595A (en) 2021-01-15 2021-01-15 Method and device for realizing interaction

Publications (1)

Publication Number Publication Date
CN113813595A 2021-12-21

Family

ID=78912364

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110056976.0A Pending CN113813595A (en) 2021-01-15 2021-01-15 Method and device for realizing interaction

Country Status (1)

Country Link
CN (1) CN113813595A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination