WO2020042727A1 - Interaction method for an application scenario, mobile terminal, and storage medium - Google Patents

Interaction method for an application scenario, mobile terminal, and storage medium

Info

Publication number
WO2020042727A1
WO2020042727A1 PCT/CN2019/091402 CN2019091402W WO2020042727A1 WO 2020042727 A1 WO2020042727 A1 WO 2020042727A1 CN 2019091402 W CN2019091402 W CN 2019091402W WO 2020042727 A1 WO2020042727 A1 WO 2020042727A1
Authority
WO
WIPO (PCT)
Prior art keywords
mobile terminal
focus
interactive
face
frame image
Prior art date
Application number
PCT/CN2019/091402
Other languages
English (en)
French (fr)
Inventor
雷翔
李峰
熊一冬
路昊
张宸
Original Assignee
腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 (Tencent Technology (Shenzhen) Company Limited)
Priority to JP2020572602A priority Critical patent/JP7026267B2/ja
Priority to EP19854820.8A priority patent/EP3845282A4/en
Publication of WO2020042727A1 publication Critical patent/WO2020042727A1/zh
Priority to US17/027,038 priority patent/US11383166B2/en

Classifications

    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/60Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor
    • A63F13/65Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition
    • A63F13/655Generating or modifying game content before or while executing the game program, e.g. authoring tools specially adapted for game development or game-integrated level editor automatically by game devices or servers from real world data, e.g. measurement in live racing competition by importing photos, e.g. of the player
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/40Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment
    • A63F13/42Processing input control signals of video game devices, e.g. signals generated by the player or derived from the environment by mapping the input signals into game commands, e.g. mapping the displacement of a stylus on a touch screen to the steering angle of a virtual vehicle
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/20Input arrangements for video game devices
    • A63F13/21Input arrangements for video game devices characterised by their sensors, purposes or types
    • A63F13/213Input arrangements for video game devices characterised by their sensors, purposes or types comprising photodetecting means, e.g. cameras, photodiodes or infrared cells
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/50Controlling the output signals based on the game progress
    • A63F13/52Controlling the output signals based on the game progress involving aspects of the displayed game scene
    • A HUMAN NECESSITIES
    • A63 SPORTS; GAMES; AMUSEMENTS
    • A63F CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/90Constructional details or arrangements of video game devices not provided for in groups A63F13/20 or A63F13/25, e.g. housing, wiring, connections or cabinets
    • A63F13/92Video game devices specially adapted to be hand-held while playing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174Facial expression recognition

Definitions

  • Embodiments of the present application relate to the field of computer technology, and in particular, to an interaction method of an application scenario, a mobile terminal, and a storage medium.
  • A computer simulation system can create a virtual world and allow users to experience it. It uses a computer to generate a simulated environment: an interactive, three-dimensional, dynamic simulation of vision and entity behavior based on multi-source information fusion, which immerses the user in that environment.
  • A major feature of games is that they allow players to interact. To this end, players currently have to interact with the terminal through finger input on the touch screen: in current mobile terminal games, users still operate the touch screen with their fingers, for example by tapping the touch screen, operating an on-screen handle button, or sliding a finger across the screen.
  • Embodiments of the present application provide an interaction method of an application scenario, a mobile terminal, and a storage medium, which are used to implement immersive interaction on a mobile terminal.
  • an embodiment of the present application provides an interaction method for an application scenario, including:
  • the mobile terminal performs real-time image collection of the target face through a camera configured on the mobile terminal to obtain a first frame image and a second frame image, where the first frame image and the second frame image are two adjacent frames captured one after the other;
  • the mobile terminal compares the first frame image with the second frame image to obtain the target face's motion and corresponding amplitude
  • the mobile terminal generates a control instruction of a simulated object in an interactive application scene according to the action of the target face and the corresponding amplitude, and the simulated object and the interactive object are displayed in the interactive application scene;
  • the mobile terminal controls the simulation object to interact with the interactive object in the interactive application scene according to the control instruction.
  • an embodiment of the present application further provides a computer-readable storage medium including instructions, which, when run on a computer, cause the computer to execute the foregoing method.
  • an embodiment of the present application further provides a mobile terminal including one or more processors and one or more memories storing a program unit, where the program unit is executed by the processor, and the program unit includes:
  • An image acquisition module is configured to perform real-time image acquisition of a target face through a camera configured by a mobile terminal to obtain a first frame image and a second frame image, where the first frame image and the second frame image are Two adjacent frames of images obtained before and after being taken respectively;
  • a comparison module configured to compare the first frame image and the second frame image to obtain the target human face motion and corresponding amplitude
  • the instruction generation module is configured to generate a control instruction of a simulated object in an interactive application scene according to the motion of the target face and a corresponding amplitude, where the simulated object and the interactive object are displayed in the interactive application scene;
  • the interaction module is configured to control the simulation object to interact with the interactive object in the interactive application scene according to the control instruction.
  • the constituent modules of the mobile terminal may also perform the steps described in the foregoing aspect and various possible implementation manners.
  • an embodiment of the present application provides a mobile terminal.
  • The mobile terminal includes a processor and a memory; the memory is configured to store instructions; and the processor is configured to execute the instructions in the memory, so that the mobile terminal executes any of the methods of the foregoing aspect.
  • In the embodiments of the present application, the camera configured on the mobile terminal captures real-time images of the target face to obtain a first frame image and a second frame image, which are two adjacent frames captured one after the other.
  • The first frame image and the second frame image are compared to obtain the action of the target face and the corresponding amplitude, and based on these a control instruction for the simulated object in the interactive application scene is generated.
  • The simulated object and the interactive object are displayed in the interactive application scene, and the simulated object is controlled to interact with the interactive object in the interactive application scene according to the control instruction.
  • the motion and amplitude of a human face can be obtained according to the comparison result of multiple frames of images captured by the camera in real time, and then a control instruction for the simulated object can be generated, and the interaction between the simulated object and the interactive object can be achieved through the control instruction.
  • scene interaction can be performed by relying on the user's facial expressions, rather than issuing instructions through the user's finger, so immersive interaction on a mobile terminal can be achieved.
  • FIG. 1 is a schematic diagram of an interaction scenario between a user and a mobile terminal according to an embodiment of the present application
  • FIG. 2 is a schematic flow block diagram of an application scenario interaction method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart block diagram of another interaction method for an application scenario according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a magnetic attraction effect provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a damping effect provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of interaction detection in a game scenario according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of interaction detection in a game scenario according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of interaction detection in another game scenario according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of interaction detection in another game scenario according to an embodiment of the present application.
  • FIG. 10-a is a schematic structural diagram of a mobile terminal according to an embodiment of the present application.
  • FIG. 10-b is a schematic structural diagram of a comparison module according to an embodiment of the present application.
  • FIG. 10-c is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.
  • FIG. 10-d is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.
  • FIG. 10-e is a schematic structural diagram of an interaction module according to an embodiment of the present application.
  • FIG. 10-f is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of an application scenario interaction method applied to a terminal according to an embodiment of the present application.
  • Embodiments of the present application provide an interaction method, terminal, and storage medium for an application scenario, which are used to implement immersive interaction on a mobile terminal.
  • FIG. 1 illustrates a schematic diagram of an interaction scenario between a user and a mobile terminal according to an embodiment of the present application.
  • the mobile terminal (referred to as the terminal) can interact with the user.
  • The terminal can be a mobile phone, a tablet, an e-book reader, a Moving Picture Experts Group Audio Layer III (MP3) player, a Moving Picture Experts Group Audio Layer IV (MP4) player, a laptop computer, a desktop computer, and so on.
  • a camera is configured on the terminal, and the camera may be a front camera.
  • the camera can collect the user's face image.
  • The terminal executes the application scenario interaction method provided in the embodiments of the present application. While remaining compatible with the existing input interaction methods of the mobile terminal, the camera captures the user's face image frame by frame. By comparing the facial data of adjacent frames, the terminal calculates the actions made by the user and their corresponding amplitudes and maps them to control instructions in different interactive application scenarios. Based on these control instructions, interaction between the simulated object and the interactive objects in an interactive application scene can be realized. Compared with related technologies, this greatly improves the accuracy and smoothness of face control, making a new interaction method that requires no finger operation possible. This interaction method emphasizes the immersive experience: because it requires no touch-screen operation with a finger and relies entirely on facial expression recognition, it can greatly improve the user's sense of immersion.
  • FIG. 2 is a schematic flow block diagram of an interaction method of an application scenario according to an embodiment of the present application.
  • an interaction method of an application scenario provided by an embodiment of the present application may include the following steps:
  • The mobile terminal collects real-time images of the target face through a camera configured on the mobile terminal and obtains a first frame image and a second frame image, where the first frame image and the second frame image are two adjacent frames captured one after the other.
  • the terminal is provided with a camera.
  • The camera of the mobile terminal is initialized. After initialization is completed, the camera starts to collect images of objects appearing in its field of view, and the captured object is identified as a user.
  • the user's face is taken in real time as a target face, and images taken at different frame moments are generated.
  • the camera installed on the terminal may be a rear camera or a front camera.
  • the terminal first obtains a first frame image and a second frame image to be processed. Each frame image may be a face image generated by shooting a user's face through a camera.
  • The face image may also be referred to as a facial image or a head image.
  • images at multiple frame moments can be obtained.
  • The image obtained by the camera at the first frame moment is defined as the first frame image, and the image captured by the camera at the second frame moment is defined as the second frame image; the terms "first frame image" and "second frame image" are only used to distinguish images captured by the camera at different frame moments.
  • Feature point detection and calculation can be performed on the first frame image obtained at the first frame moment to obtain the feature points of the first frame image, and likewise on the second frame image obtained at the second frame moment to obtain the feature points of the second frame image.
  • the feature points of the image play a very important role in the image matching algorithm based on the feature points.
  • the feature points of the image can reflect the essential characteristics of the image, can identify the target object in the image, and can complete the matching of the image by matching the feature points.
  • In this embodiment, the feature points of an image may be local image feature points. There may be multiple implementations for feature point extraction from the first frame image and the second frame image.
  • For example, the feature points may be ORB (Oriented FAST and Rotated BRIEF) feature points; feature point extraction may also use speeded-up robust features (SURF) or the scale-invariant feature transform (SIFT), among others.
  • Image features can thus be ORB, SURF, or SIFT feature points. The face in each frame image can be identified by detecting feature points; for example, the facial features can be identified by detecting feature points, and these features can be used as localization (anchor) points of the face, as sketched below.
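  • As an illustration only, the following is a minimal sketch of per-frame feature extraction and matching, assuming OpenCV is used as the library; the patent names ORB, SURF, and SIFT merely as options and does not prescribe any implementation.

```python
import cv2

def extract_orb_features(frame_bgr):
    """Detect ORB keypoints and descriptors for one camera frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    orb = cv2.ORB_create(nfeatures=500)
    return orb.detectAndCompute(gray, None)   # (keypoints, descriptors)

def match_adjacent_frames(desc_prev, desc_curr):
    """Match ORB descriptors of two adjacent frames (Hamming distance for ORB)."""
    if desc_prev is None or desc_curr is None:
        return []                              # no detectable features in one frame
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    return sorted(matcher.match(desc_prev, desc_curr), key=lambda m: m.distance)
```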
  • In some embodiments, before the mobile terminal performs real-time image collection of the target face through the camera configured on the mobile terminal in step 201, the method provided in the embodiments of the present application further includes the following steps:
  • the mobile terminal detects whether a touch input is generated on the touch screen of the mobile terminal;
  • when no touch input is generated on the touch screen, the mobile terminal triggers and executes the step of collecting real-time images of the target face through the camera configured on the mobile terminal.
  • For non-contact control between the user and the mobile terminal, it is first determined whether there is a touch input on the touch screen. When it is determined that no touch input is generated on the touch screen, the user has not touched the screen of the mobile terminal with a finger, and the camera of the mobile terminal is then started to collect the user's image, as in the sketch below.
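  • A minimal sketch of this gating step follows; TouchScreen and Camera are hypothetical stand-ins for the platform input and camera APIs, which the patent does not specify.

```python
class TouchScreen:
    """Placeholder wrapper: would query the platform's touch-input state."""
    def has_touch_input(self) -> bool:
        return False

class Camera:
    """Placeholder wrapper: would return the latest camera frame buffer."""
    def capture_frame(self):
        return "frame"

def maybe_capture_face_frame(touch: TouchScreen, cam: Camera):
    if touch.has_touch_input():
        return None              # finger interaction takes priority; skip face capture
    return cam.capture_frame()   # no touch input: fall back to face-driven control
```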
  • the mobile terminal compares the first frame image and the second frame image to obtain the target human face motion and the corresponding amplitude.
  • The terminal may determine the action performed by the target face and the amplitude corresponding to that action from the change in position of the face between the earlier and later frame images.
  • the action of the target face refers to the user's action captured by the camera of the terminal.
  • the action can be the movement of the face up, down, left, and right, and the amplitude refers to the direction and distance of the movement of the face.
  • For example, the action of the target face may be turning the face left or right, which corresponds to moving left or right on the screen, with the rotation amplitude corresponding to the on-screen movement distance.
  • Similarly, raising or lowering the head corresponds to moving up or down on the screen, with the movement amplitude corresponding to the on-screen movement distance.
  • In some embodiments, in step 202, the mobile terminal comparing the first frame image with the second frame image to obtain the action of the target face and the corresponding amplitude includes: determining a first pixel position at which a face anchor point appears in the first frame image and a second pixel position at which the same face anchor point appears in the second frame image; comparing the first pixel position with the second pixel position to obtain the relative displacement between them; and determining the action of the target face and the corresponding amplitude according to the relative displacement between the first pixel position and the second pixel position.
  • the position of the face in the image can be output, and the position of the face in the image collected by the camera is expressed in pixels.
  • Face anchor point detection is then performed within the detected face region.
  • The features of the face anchor point can be determined in advance by a statistical classification method, so as to detect whether there is a pixel position within the face region that matches the features of the anchor point; if such a pixel position exists, it is determined to be the location of the face anchor point within the face region.
  • The face anchor point used in the embodiments of the present application refers to a positioning reference point used to determine whether the face of the target object has turned or moved.
  • The selection of face anchor points can be based on the facial features available within the face region.
  • A face anchor point can refer to the pixel position of a single facial organ, or to the pixel positions of multiple organs, which is not limited here.
  • the pixel position of the same face anchor point is detected in each frame of the image.
  • The pixel position of the face anchor point in the first frame image is called the first pixel position, and the pixel position of the same face anchor point in the second frame image is called the second pixel position.
  • the two pixel positions in the two frames of the image are compared to determine the relative displacement, and the motion and amplitude of the target face are calculated based on the relative displacement.
  • An example is given below. Taking the up-and-down movement of the human face as an example, the coordinates of the current eyebrow center point are compared with the coordinate data of the previous eyebrow center point: an increase in the Y coordinate indicates a head-up, and a decrease indicates a head-down.
  • For left-right turning, the current nose tip coordinate is compared with the previous nose tip coordinate data: an increase in the X coordinate indicates a left turn, and a decrease indicates a right turn.
  • If the coordinates of the current face center point, compared with the coordinate data of the previous center point, remain within a certain coordinate range for a certain period of time, it is determined that the face has performed a click operation. A sketch of this comparison is given below.
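  • As an illustration of this comparison, the following sketch derives an action and an amplitude from the anchor-point displacements described above; the anchor names, the epsilon threshold, and the coordinate convention are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class FaceAnchors:
    eyebrow_center: Tuple[float, float]   # (x, y) pixel position
    nose_tip: Tuple[float, float]
    face_center: Tuple[float, float]

def classify_action(prev: FaceAnchors, curr: FaceAnchors, eps: float = 2.0):
    """Return (action, amplitude) from the displacement of face anchor points."""
    dy = curr.eyebrow_center[1] - prev.eyebrow_center[1]   # Y increase = head up, as stated above
    dx = curr.nose_tip[0] - prev.nose_tip[0]               # X increase = left turn, as stated above
    if abs(dy) > eps and abs(dy) >= abs(dx):
        return ("head_up" if dy > 0 else "head_down", abs(dy))
    if abs(dx) > eps:
        return ("turn_left" if dx > 0 else "turn_right", abs(dx))
    # face essentially unchanged: candidate frame for the dwell-based "click" operation
    return ("hold", 0.0)
```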
  • the mobile terminal generates a control instruction of the simulated object in the interactive application scene according to the action of the target face and the corresponding amplitude, and the simulated object and the interactive object are displayed in the interactive application scene.
  • The interactive application scene in the embodiments of the present application may specifically be a game scene or an interactive scene of a software application.
  • the method for processing an interactive application scenario provided in the embodiment of the present application may be applicable to a scenario constructed for a game character or a scenario constructed for a user object in a software application system.
  • A simulated object is displayed in the interactive application scene. The simulated object may be a game character in a game scene, for example a hero or a soldier, or the person or thing controlled by the user in a strategy game, which is not limited here.
  • The interactive application scene also displays interactive objects.
  • The interactive objects refer to objects that can interact with the simulated object in the interactive application scene.
  • The interactive objects can be a variety of objects, such as props in a game scene.
  • the motion and the corresponding amplitude may be mapped to a control instruction of the simulation object in the interactive application scenario.
  • The control instruction may also be called an "interaction instruction": the user makes an action with his or her face, and based on the type and amplitude of that action, a corresponding control instruction for the simulated object can be generated, and the control instruction can control the simulated object.
  • In some embodiments, after step 203, in which the mobile terminal generates the control instruction of the simulated object in the interactive application scene according to the action of the target face and the corresponding amplitude,
  • the method provided in the embodiment of the present application further includes the following steps:
  • the mobile terminal determines whether the focus remains stable within the range of the interactive object for a preset time according to the control instruction, and the focus is a reference point where the target face is mapped in the interactive application scene;
  • when the focus remains stable within the range of the interactive object for the preset duration, the mobile terminal locks the interactive object from the range of the interactive objects.
  • the target face is mapped with a focus in the interactive application scene, and the focus is a reference point of the user's face mapped in the interactive application scene.
  • The interactive object range refers to the area in the interactive application scene where the interactive object is located. Only when the focus enters the range of the interactive object can the simulated object interact with the interactive object; if the focus does not enter the range of the interactive object, the simulated object cannot interact with it.
  • Locking of interactive objects can use a default configuration; or, when there are multiple interactive objects, an interactive object can be locked based on the distance between the focus and each interactive object's range. The locked interactive object is the object that the simulated object needs to interact with, as in the dwell sketch below.
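  • The following is a minimal sketch of such a dwell-based lock, assuming a hypothetical FocusLock helper and a configurable dwell time; the patent fixes neither the data structures nor the preset duration.

```python
import time

LOCK_DWELL_SECONDS = 1.0  # the "preset duration"; the patent does not fix a value

class FocusLock:
    """Lock an interactive object once the focus dwells inside its range."""
    def __init__(self, dwell: float = LOCK_DWELL_SECONDS):
        self.dwell = dwell
        self._candidate = None       # object the focus is currently inside
        self._enter_time = None
        self.locked_object = None

    def update(self, focus_xy, interactive_objects):
        """interactive_objects: iterable of (object_id, contains) pairs, where
        contains(focus_xy) reports whether the focus is inside that object's range."""
        for obj_id, contains in interactive_objects:
            if contains(focus_xy):
                if self._candidate != obj_id:
                    self._candidate, self._enter_time = obj_id, time.monotonic()
                elif time.monotonic() - self._enter_time >= self.dwell:
                    self.locked_object = obj_id   # focus stayed stable long enough
                return self.locked_object
        # focus is outside every interactive object: reset the dwell timer
        self._candidate = self._enter_time = None
        self.locked_object = None
        return None
```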
  • the mobile terminal controls the simulated object to interact with the interactive object in the interactive application scene according to the control instruction.
  • The terminal may control the manner in which the simulated object interacts with the interactive object according to the control instruction; there can be multiple types of interaction between the simulated object and the interactive object.
  • An example is as follows. Taking a fish-eating game as the interactive application scene, the interactive objects are fish props set in the game scene. If the control instruction generated in the foregoing embodiment is "open mouth", the game character (for example, a mouth) can be controlled to open and eat a fish, realizing the interaction between the game character and the interactive objects.
  • the control instruction of the simulation object is generated through image detection, and the whole process is in a non-touch form, using first-person vision to enhance the sense of game substitution and bring a unique game experience.
  • In some embodiments, in step 204, the mobile terminal controlling the simulated object to interact with the interactive object in the interactive application scene according to the control instruction includes the following:
  • the distance between the focus and the interactive object is calculated in real time, where the focus is the reference point at which the target face is mapped in the interactive application scene;
  • whether the focus is within the range of the interactive object is determined according to the distance calculated in real time;
  • when the focus is within the range of the interactive object, the displacement rate corresponding to the focus is updated according to the distance calculated in real time;
  • the control instruction is updated according to the updated displacement rate, and the updated control instruction is used to control the simulated object to interact with the interactive object.
  • When controlling the simulated object to interact with the interactive object, the terminal can calculate the distance between the simulated object and the interactive object in real time, and that distance can be used to determine whether the focus is within the range of the interactive object. The subsequent interaction process is executed only when the focus is within the range of the interactive object; if the focus is not within that range, the interaction cannot be performed.
  • The displacement rate corresponding to the focus is updated according to the distance calculated in real time; that is, the displacement rate can be updated in real time as the distance changes, the control instruction can be updated according to the continuously updated displacement rate, and the updated control instruction can be used to control the interaction between the simulated object and the interactive object.
  • the simulated object can be moved in real time according to the control instruction.
  • the interaction mode between the simulated object and the interactive object can be determined in combination with the specific scene.
  • The displacement rate corresponding to the focus may be updated according to the real-time change of the distance; with the displacement rate updated in real time, a magnetic attraction effect and a damping effect can be achieved. Specifically, updating the displacement rate corresponding to the focus according to the distance calculated in real time includes:
  • the displacement rate is first reduced in the moving direction of the focus, and then the displacement rate is increased in the opposite direction of the moving direction.
  • the terminal can adopt the magnetic focusing method, that is, the distance between the simulated object and the interactive object can be calculated in real time.
  • As the focus moves away from the interactive object, the displacement rate is first reduced and then increased in the opposite direction, which increases the resistance to moving the focus away from the interactive object and realizes dynamic adjustment of the displacement rate corresponding to the focus.
  • The control instruction is updated accordingly, so that it produces a suction effect toward the interaction point, which reduces the difficulty of, and the precision required for, interactive operations on items.
  • FIG. 4 it is a schematic diagram of a magnetic attraction effect provided by an embodiment of the present application.
  • The speed of the simulated object under the control of the control instruction is calculated by a formula whose terms are as follows:
  • "outside the interactive object" means that the focus mapped from the human face on the device is outside the range of the interactive object, and "inside the interactive object" means that the focus mapped from the human face on the device is within the range of the interactive object.
  • the initial speed, initial direction, target speed, and target direction are all movement data corresponding to the movement of the face on the screen.
  • the magnetic constant can be adjusted according to the actual scene.
  • The magnetic constant is used to scale the speed of the face when mapping it to the speed of the focus.
  • the value of magnetic constant can be 1.32.
  • An example is as follows: for an interactive object, when the focus driven by the face moves closer to the object, the resistance to moving away from it increases; that is, a larger range of face motion is required to move the focus out of the object. A sketch of this speed update is given below.
  • the embodiment of the present application makes great improvements in the accuracy and smoothness of face control by magnetic focusing, which makes it possible to play games that rely entirely on facial expression recognition for interaction.
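  • The exact formula behind FIG. 4 is not reproduced in the text above, so the following is only a minimal sketch of a magnetic-attraction speed update consistent with the description: outside the object the face speed is amplified by the magnetic constant, and motion that would leave the object is divided by the same constant. The mapping itself is an assumption; only the constant value 1.32 comes from the embodiment.

```python
MAGNETIC_CONSTANT = 1.32  # value given in the embodiment

def magnetic_focus_speed(face_speed: float,
                         moving_toward_object: bool,
                         focus_inside_object: bool) -> float:
    """Return the focus displacement rate derived from the face speed."""
    if not focus_inside_object:
        # outside the object: amplify face motion so the focus reaches it easily
        return face_speed * MAGNETIC_CONSTANT
    if moving_toward_object:
        return face_speed                      # inside and approaching: unchanged
    # inside and moving away: shrink the rate, so a larger face motion
    # is needed to pull the focus out of the interactive object
    return face_speed / MAGNETIC_CONSTANT
```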
  • When controlling the simulated object to interact with the interactive object, the terminal may also apply a damping effect. Based on the distance between the simulated object and the interactive object calculated in real time, when the distance between the focus and the interactive object increases, the displacement rate in the moving direction of the focus is first reduced and then increased in the opposite direction of the moving direction. The terminal determines whether the displacement rate corresponding to the focus exceeds a threshold; if it does, the face is being operated too fast, which would otherwise cause high-speed displacement in both directions. In that case, the displacement rate corresponding to the focus is first reduced in the same direction and then increased in the opposite direction, producing a damping effect and reducing the uncertainty of control.
  • FIG. 5 it is a schematic diagram of a damping effect provided by an embodiment of the present application.
  • The speed of the simulated object under the control of the control instruction is calculated by a formula whose terms are as follows:
  • the initial speed, the initial direction, the target speed, and the target direction are the movement data of the face corresponding to the movement on the screen.
  • the damping constant can be adjusted according to the actual scene.
  • The damping constant is used to improve the acceleration and deceleration experience when face turning is mapped to focus turning; for example, the value of the damping constant can be 0.53.
  • the displacement rate corresponding to the focus can be reduced, a damping effect can be generated, and the uncertainty of the control can be reduced.
  • In this way, a great improvement is made in the accuracy and smoothness of face control, making games that rely entirely on facial expression recognition for interaction possible. A sketch of the damping step is given below.
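  • Likewise, the damping formula itself is not reproduced above; the sketch below uses a simple blend toward the previous focus speed as an assumed stand-in to illustrate the behaviour. Only the damping constant 0.53 comes from the embodiment; the threshold value is illustrative.

```python
DAMPING_CONSTANT = 0.53   # value given in the embodiment
RATE_THRESHOLD = 40.0     # pixels per frame; illustrative, not from the patent

def damped_focus_speed(prev_focus_speed: float, raw_face_speed: float) -> float:
    """Blend the new face-driven speed with the previous focus speed."""
    if abs(raw_face_speed) <= RATE_THRESHOLD:
        return raw_face_speed                           # normal motion: pass through
    # the face moved too fast: damp the change so the focus does not overshoot
    return prev_focus_speed + DAMPING_CONSTANT * (raw_face_speed - prev_focus_speed)
```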
  • FIG. 3 is a schematic flowchart of another interaction method of an application scenario provided by an embodiment of the present application. Referring to FIG. 3, the process of defocus determination is described in detail below.
  • the method provided by the embodiment of the present application further includes the following steps:
  • a mobile terminal determines pixel coordinates of multiple key points of a face on each frame of image collected in real time.
  • The terminal can capture the pixel coordinates of multiple face key points through the camera; for example, the number of face key points can be set to 90.
  • the mobile terminal determines whether the target face has lost focus according to the pixel coordinates of multiple face key points.
  • The terminal uses these key point pixel coordinates to determine whether the target face has lost focus. For example, based on analysis of the pixel coordinates of the 90 face key points captured by the camera, when the 90 key points cannot all be collected, or the head is lowered beyond a certain amount (for example, 45 degrees), the face is determined to be out of focus.
  • When the target face has lost focus, focus correction is performed on the target face.
  • The user's face data is calculated in real time to determine whether the user has lost focus, and if so the focus is corrected. Through steps 301 to 303, real-time defocus determination is performed on the focus mapped from the face, so that focus correction can be applied immediately and the aforementioned focus-based magnetic focusing and damping effects can continue without interruption. A sketch of the defocus check is given below.
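  • A minimal sketch of the defocus check described in steps 301 to 303, assuming the 90 expected key points and the 45-degree limit stated above; the head-pitch estimate would come from a separate head-pose step that the text does not specify.

```python
EXPECTED_KEYPOINTS = 90   # number of face key points mentioned above
MAX_PITCH_DEGREES = 45.0  # head lowered beyond this counts as out of focus

def is_out_of_focus(keypoints, head_pitch_degrees: float) -> bool:
    """keypoints: list of (x, y); head_pitch_degrees: estimated downward pitch."""
    if keypoints is None or len(keypoints) < EXPECTED_KEYPOINTS:
        return True                   # incomplete key points: face left the view
    if head_pitch_degrees > MAX_PITCH_DEGREES:
        return True                   # head lowered too far to track reliably
    return False
```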
  • In the embodiments of the present application, the camera configured on the mobile terminal captures real-time images of the target face to obtain a first frame image and a second frame image, which are two adjacent frames captured one after the other. The two frame images are compared to obtain the action of the target face and the corresponding amplitude, and based on these a control instruction for the simulated object in the interactive application scene is generated.
  • The simulated object and the interactive object are displayed in the interactive application scene, and according to the control instruction the simulated object is controlled to interact with the interactive object in the interactive application scene.
  • the motion and amplitude of a human face can be obtained according to the comparison result of multiple frames of images captured by the camera in real time, and then a control instruction for the simulated object can be generated, and the interaction between the simulated object and the interactive object can be achieved through the control instruction.
  • scene interaction can be performed by relying on the user's facial expressions, rather than issuing instructions through the user's finger, so immersive interaction on a mobile terminal can be achieved.
  • The embodiments of the present application remain compatible with the existing input interaction mode while capturing the user's facial features frame by frame through the camera. By comparing the facial data of adjacent frames, the user's actions and corresponding amplitudes are calculated and mapped to different in-game inputs, realizing a new interaction method that requires no finger operation. A large improvement is made in the accuracy and smoothness of face control, making games that rely entirely on facial expression recognition for interaction possible.
  • the hardware requirements for the terminal in the embodiments of the present application are as follows: either a mobile phone or a personal computer including a camera.
  • FIG. 6 it is a schematic flowchart of interaction detection in a game scenario provided by an embodiment of the present application.
  • the main implementation logic can include the following processes:
  • A face image of the user can be collected through the camera of the terminal, and the user can also operate the touch screen with a finger.
  • S02. Determine whether finger data is input.
  • the terminal can detect whether there is finger data input by the user on the touch screen.
  • the user can tap and swipe on the screen.
  • the terminal can determine whether the camera has face data input.
  • In step S04, the terminal determines whether the input data is erroneous, that is, whether a complete face image can be detected.
  • the terminal compares multiple frames of continuous images before and after.
  • the terminal can compare the face images of the two frames before and after.
  • the specific comparison process please refer to the description of the foregoing embodiment, and no further description will be given here.
  • the terminal can determine the action currently performed by the user and the corresponding amplitude.
  • the coordinates of the current eyebrow center point are compared with the coordinates data of the previous eyebrow center point.
  • An increase in the Y coordinate indicates a head up; a decrease in the Y coordinate indicates a head down.
  • The current nose tip coordinate is compared with the previous nose tip coordinate data.
  • An increase in the X coordinate indicates a left turn, and a decrease in the X coordinate indicates a right turn.
  • If the distance between the outer corner of the left eye and the outer corner of the right eye in the current face is larger than the previously recorded distance, the face has moved forward (closer to the camera).
  • If the coordinates of the current face center point, compared with the coordinate data of the previous center point, remain within a certain coordinate range for a certain period of time, it is determined to be a fixation (click) operation.
  • Rotating the face to the left or right corresponds to moving left or right on the screen, and the rotation amplitude corresponds to the on-screen movement distance.
  • Raising or lowering the head corresponds to moving up or down on the screen, and the amplitude corresponds to the on-screen movement distance. Opening and closing the mouth corresponds to the relevant on-screen operation, for example biting to eat a fish.
  • Fixation: when the focus remains stable within a certain range over an on-screen interactive item for a period of x seconds, the focus is locked and the corresponding operation is performed.
  • Magnetic focusing: by calculating the distance between the simulated object mapped from the human face in the device and the interactive object, and dynamically adjusting the displacement rate corresponding to the focus when the focus is in the range of the interactive object, a suction effect toward the interaction point is generated, which reduces the difficulty of, and the precision required for, interactive operations on items.
  • Defocus determination: the user's face data is obtained through real-time calculation to determine whether the user has lost focus, and defocus correction is provided. For example, according to analysis of the coordinates of the face key points (90 of them) captured by the camera, when the points are incomplete or the head is lowered beyond a certain range (45 degrees), the face is determined to be out of focus.
  • The terminal may then drive the game performance based on the control instructions generated in the foregoing steps; a compact sketch of the overall detection flow described above is given below.
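  • As a structural illustration only, the following sketch strings the stages above together (touch-input check, camera capture, face validation, adjacent-frame comparison, instruction mapping, driving the game). Every dependency is injected, and all helper names are assumptions rather than APIs defined by the patent.

```python
def detection_loop(touch, camera, game, detect_face, classify, map_to_instruction):
    """Structural sketch of the detection flow; every dependency is injected."""
    prev = None
    while game.is_running():
        if touch.has_touch_input():            # finger data takes priority (S02)
            game.handle_touch(touch.read())
            continue
        frame = camera.capture_frame()         # face data input from the camera
        face = detect_face(frame)
        if face is None:                       # no complete face image (S04)
            game.show_refocus_hint()           # defocus-correction UI prompt
            prev = None
            continue
        if prev is not None:                   # compare the two adjacent frames
            action, amplitude = classify(prev, face)
            game.apply(map_to_instruction(action, amplitude))
        prev = face                            # keep this frame for the next comparison
```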
  • The following are examples of various game scenarios:
  • FIG. 7 is a schematic diagram of interaction detection in a game scenario according to an embodiment of the present application.
  • a game level game scenario is taken as an example.
  • The simulated object controlled by the player is a drunk man, and the interactive object is an obstacle in the level game scene.
  • The camera of the mobile terminal collects images of the player's face in real time and compares two adjacent frames of the captured images to obtain the player's action and the corresponding amplitude. Based on the player's action and amplitude, control instructions for the drunk man can be generated; for example, the drunk man is controlled to move to the left, and when he moves to the left he may collide with an obstacle.
  • In this embodiment of the present application, the player can steer the drunk man with the face to avoid the obstacle, thereby realizing the interaction between the drunk man and the obstacle.
  • FIG. 8 is a schematic diagram of interaction detection in another game scenario according to an embodiment of the present application.
  • The simulated object controlled by the player is a fish character, and the interactive object in the sea-level game scene is a mouth.
  • The camera of the mobile terminal collects images of the player's face in real time and compares two adjacent frames of the captured images to obtain the player's action and the corresponding amplitude. Based on the player's action and amplitude, control instructions for the fish can be generated; for example, the fish is controlled to move to the left, and when the fish moves to the left the mouth eats a fish; as another example, the fish is controlled to move to the right.
  • In this embodiment of the present application, the swimming of the fish can be controlled with the face, and prey can be hunted (eaten) by biting with the mouth, thereby realizing the interaction between the fish and the mouth.
  • FIG. 9 is a schematic diagram of interaction detection in another game scenario according to an embodiment of the present application.
  • In this game scene, when the recognized face moves out of the camera's view, the game performs user interface (UI) interaction in the form of defocus correction; that is, the user adjusts the position of the face according to the UI contour prompt that appears on the screen, so as to refocus.
  • For the detailed process of defocus correction, see the previous example.
  • data is stored to facilitate the recall of the face image of the previous frame when the data of the next frame is compared.
  • The technical solution provided by the embodiments of the present application realizes a game that relies entirely on facial expression recognition for interaction, and greatly improves the sense of immersion of games on the mobile terminal. In addition, it does not depend on the screen-touch interaction mode, which improves accessibility when playing games in some special scenarios. Furthermore, using this technical solution, a game for people with hand impairments can also be developed, so that interaction can be completed when the user cannot conveniently use a finger or the mobile terminal is not configured with a touch screen.
  • An embodiment of the present application further provides a computer-readable storage medium.
  • the computer-readable storage medium includes instructions that, when run on a computer, cause the computer to execute the foregoing method.
  • When the instructions included in the computer-readable storage medium are run on a computer, the computer is caused to perform the steps of the method described in the foregoing embodiments.
  • FIG. 10-a is a schematic structural diagram of a mobile terminal according to an embodiment of the present application. Please refer to FIG. 10-a.
  • A mobile terminal 1000 provided by an embodiment of the present application may include one or more processors and one or more memories storing a program unit, where the program unit is executed by the processor and includes an image acquisition module 1001, a comparison module 1002, an instruction generation module 1003, and an interaction module 1004, wherein:
  • the image acquisition module 1001 is configured to perform real-time image acquisition of a target face through a camera configured on the mobile terminal to obtain a first frame image and a second frame image, where the first frame image and the second frame image are two adjacent frames of images captured one after the other;
  • the comparison module 1002 is configured to compare the first frame image and the second frame image to obtain the target face motion and corresponding amplitude
  • the instruction generation module 1003 is configured to generate a control instruction of a simulated object in an interactive application scene according to the action of the target face and the corresponding amplitude, where the simulated object and the interactive object are displayed in the interactive application scene;
  • the interaction module 1004 is configured to control the simulation object to interact with the interactive object in the interactive application scene according to the control instruction.
  • FIG. 10-b is a schematic structural diagram of a comparison module according to an embodiment of the present application.
  • the comparison module 1002 includes:
  • the pixel position determining unit 10021 is configured to determine a first pixel position at which a face anchor point appears in the first frame image, and a second pixel position at which the face anchor point appears in the second frame image ;
  • the displacement determining unit 10022 is configured to compare the first pixel position and the second pixel position to obtain a relative displacement between the first pixel position and the second pixel position;
  • the action determining unit 10023 is configured to determine the action and the corresponding amplitude of the target face according to a relative displacement between the first pixel position and the second pixel position.
  • FIG. 10-c is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.
  • the program unit further includes:
  • The touch detection module 1005 is configured to detect whether a touch input is generated on the touch screen of the mobile terminal before the image acquisition module 1001 performs real-time image acquisition of the target face through the camera configured on the mobile terminal, and to trigger the image acquisition module 1001 when no touch input is generated on the touch screen.
  • FIG. 10-d is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.
  • the program unit further includes:
  • The focus detection module 1006 is configured to, after the instruction generation module 1003 generates the control instruction of the simulated object in the interactive application scene according to the action of the target face and the corresponding amplitude, determine according to the control instruction whether the focus remains stable within the range of the interactive object for a preset duration, where the focus is the reference point at which the target face is mapped in the interactive application scene;
  • the object locking module 1007 is configured to lock the interactive object from the interactive object range when the focus remains stable within the interactive object range for a preset period of time.
  • FIG. 10-e is a schematic structural diagram of an interaction module according to an embodiment of the present application.
  • the interaction module 1004 includes:
  • the distance calculation unit 10041 is configured to calculate the distance between the focus and the interactive object in real time, where the focus is a reference point where the target face is mapped in the interactive application scene;
  • the range determining unit 10042 is configured to determine whether the focus is within a range of an interactive object according to the distance calculated in real time;
  • the rate updating unit 10043 is configured to update the displacement rate corresponding to the focus according to the distance calculated in real time when the focus is within the range of the interactive object;
  • the interaction unit 10044 is configured to update the control instruction according to the updated displacement rate, and use the updated control instruction to control the simulation object to interact with the interactive object.
  • The rate updating unit 10043 is specifically configured to: when the distance between the focus and the interactive object decreases, first reduce the displacement rate in the moving direction of the focus and then increase the displacement rate; or, when the distance between the focus and the interactive object increases, first reduce the displacement rate in the moving direction of the focus and then increase the displacement rate in the opposite direction of the moving direction.
  • FIG. 10-f is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.
  • the program unit further includes:
  • the key point collection module 1008 is configured to determine the pixel coordinates of multiple face key points on each frame of image collected in real time;
  • a defocus determination module 1009 configured to determine whether the target face has lost focus according to the pixel coordinates of the multiple key points of the face;
  • the focusing module 1010 is configured to perform focus correction on the target human face when the target human face loses focus.
  • In the embodiments of the present application, the camera configured on the mobile terminal captures real-time images of the target face to obtain a first frame image and a second frame image, which are two adjacent frames captured one after the other. The two frame images are compared to obtain the action of the target face and the corresponding amplitude, and based on these a control instruction for the simulated object in the interactive application scene is generated.
  • The simulated object and the interactive object are displayed in the interactive application scene, and the simulated object is controlled to interact with the interactive object in the interactive application scene according to the control instruction.
  • the motion and amplitude of a human face can be obtained according to the comparison result of multiple frames of images captured by the camera in real time, and then a control instruction for the simulated object can be generated, and the interaction between the simulated object and the interactive object can be achieved through the control instruction.
  • scene interaction can be performed by relying on the user's facial expressions, rather than issuing instructions through the user's finger, so immersive interaction on a mobile terminal can be achieved.
  • FIG. 11 is a schematic structural diagram of a terminal to which the application scenario interaction method provided in an embodiment of the present application is applied. As shown in FIG. 11, for convenience of explanation, only the parts related to the embodiments of the present application are shown; for specific technical details that are not shown, refer to the method part of the embodiments of the present application.
  • the terminal can be any terminal device including a mobile phone, a tablet computer, a Personal Digital Assistant (PDA), a sales terminal (Point of Sales, POS), an on-board computer, etc. Taking the terminal as a mobile phone as an example:
  • FIG. 11 is a block diagram showing a partial structure of a mobile phone related to a terminal provided in an embodiment of the present application.
  • the mobile phone includes: a radio frequency (RF) circuit 1010, a memory 1020, an input unit 1030, a display unit 1040, a sensor 1050, an audio circuit 1060, a wireless fidelity (WiFi) module 1070, and a processor 1080 , And power supply 1090 and other components.
  • RF radio frequency
  • the RF circuit 1010 may be configured to receive and transmit signals during transmission and reception of information or during a call.
  • In particular, downlink information from a base station is received and handed to the processor 1080 for processing; in addition, uplink data is transmitted to the base station.
  • the RF circuit 1010 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
  • the RF circuit 1010 can also communicate with a network and other devices through wireless communication.
  • The above wireless communication can use any communication standard or protocol, including but not limited to Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Messaging Service (SMS), and so on.
  • the memory 1020 may be configured to store software programs and modules, and the processor 1080 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1020.
  • The memory 1020 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created according to the use of the mobile phone (such as audio data or a phone book).
  • the memory 1020 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
  • the input unit 1030 may be configured to receive inputted numeric or character information, and generate key signal inputs related to user settings and function control of the mobile phone.
  • the input unit 1030 may include a touch panel 1031 and other input devices 1032.
  • the touch panel 1031, also known as a touch screen, can collect the user's touch operations on or near it (such as operations performed by the user on or near the touch panel 1031 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program.
  • the touch panel 1031 may include two parts, a touch detection device and a touch controller.
  • the touch detection device detects the user's touch position, detects the signal caused by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and sends them to the processor 1080, and can also receive commands sent by the processor 1080 and execute them.
  • various types such as resistive, capacitive, infrared, and surface acoustic wave can be used to implement the touch panel 1031.
  • the input unit 1030 may include other input devices 1032.
  • other input devices 1032 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, an operation lever, and the like.
  • the display unit 1040 may be configured to display information input by the user or information provided to the user and various menus of the mobile phone.
  • the display unit 1040 may include a display panel 1041.
  • the display panel 1041 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like.
  • the touch panel 1031 may cover the display panel 1041. When the touch panel 1031 detects a touch operation on or near it, the touch panel 1031 transmits the touch operation to the processor 1080 to determine the type of the touch event, and the processor 1080 then provides a corresponding visual output on the display panel 1041 according to the type of the touch event.
  • although the touch panel 1031 and the display panel 1041 are implemented as two independent components to implement the input and output functions of the mobile phone, in some embodiments the touch panel 1031 and the display panel 1041 can be integrated to implement the input and output functions of the mobile phone.
  • the mobile phone may further include at least one sensor 1050, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 1041 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 1041 and/or the backlight when the mobile phone is moved close to the ear.
  • as one type of motion sensor, the accelerometer sensor can detect the magnitude of acceleration in various directions (generally along three axes), can detect the magnitude and direction of gravity when stationary, and can be used in applications that recognize the attitude of the mobile phone (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition-related functions (such as a pedometer or tapping); other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor may also be configured on the mobile phone, and details are not repeated here.
  • the audio circuit 1060, the speaker 1061, and the microphone 1062 can provide an audio interface between the user and the mobile phone.
  • the audio circuit 1060 can convert received audio data into an electrical signal and transmit it to the speaker 1061, and the speaker 1061 converts the electrical signal into a sound signal for output.
  • conversely, the microphone 1062 converts a collected sound signal into an electrical signal, which the audio circuit 1060 receives and converts into audio data; the audio data is then output to the processor 1080 for processing and sent, for example, to another mobile phone via the RF circuit 1010, or output to the memory 1020 for further processing.
  • WiFi is a short-range wireless transmission technology.
  • the mobile phone can help users send and receive emails, browse web pages, and access streaming media through the WiFi module 1070. It provides users with wireless broadband Internet access.
  • although FIG. 11 shows the WiFi module 1070, it can be understood that it is not an essential part of the mobile phone and can be omitted as needed without changing the essence of the invention.
  • the processor 1080 is the control center of the mobile phone, and connects all parts of the entire mobile phone through various interfaces and lines.
  • the processor 1080 performs the various functions of the mobile phone and processes data by running or executing software programs and/or modules stored in the memory 1020 and calling data stored in the memory 1020, thereby monitoring the mobile phone as a whole.
  • the processor 1080 may include one or more processing units; preferably, the processor 1080 may integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, and an application program, etc.
  • the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may not be integrated into the processor 1080.
  • the mobile phone also includes a power supply 1090 (such as a battery) for supplying power to various components.
  • the power supply can be logically connected to the processor 1080 through a power management system, so as to implement functions such as charging management, discharging management, and power consumption management through the power management system.
  • the mobile phone may further include a camera 1011.
  • the camera 1011 may be a front-facing camera of the mobile phone. After the camera 1011 collects multiple frames of face images, the processor 1080 processes these face images.
  • the processor 1080 included in the terminal is further configured to control and execute the flow of the application scenario interaction method performed by the terminal.
  • the device embodiments described above are only schematic. The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • the connection relationship between the modules indicates that there is a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
  • the technical solutions of the embodiments of the present application, or the part thereof that contributes to the related technologies, can essentially be embodied in the form of a software product. The computer software product is stored in a readable storage medium, such as a computer floppy disk, a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments of the present application.
  • the mobile terminal performs real-time image acquisition of a target face through a camera configured on the mobile terminal to obtain a first frame image and a second frame image, where the first frame image and the second frame image are two adjacent frames captured one after the other.
  • the mobile terminal compares the first frame image with the second frame image to obtain the motion of the target face and the corresponding amplitude.
  • the mobile terminal generates, according to the motion of the target face and the corresponding amplitude, a control instruction for a simulated object in an interactive application scene, where the simulated object and an interactive object are displayed in the interactive application scene.
  • the mobile terminal controls, according to the control instruction, the simulated object to interact with the interactive object in the interactive application scene.
  • the motion of a human face and its amplitude can be obtained from the comparison result of multiple frames of images captured by the camera in real time, a control instruction for the simulated object can then be generated, and the interaction between the simulated object and the interactive object can be achieved through the control instruction (a minimal sketch of this frame-comparison pipeline is given directly after this list).
  • scene interaction can be performed by relying on the user's facial expressions rather than on instructions issued through the user's finger, so immersive interaction on a mobile terminal can be achieved.
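The bullets above summarize the full pipeline: acquire two adjacent frames, compare the pixel position of a face anchor point, derive the motion and its amplitude, generate a control instruction, and use it to drive the simulated object. The Python sketch below is a minimal illustration of that loop; the simulated anchor-point values, the instruction format, and the left/right sign convention are assumptions made for the example and are not part of the original disclosure.

```python
# Minimal sketch of the face-driven control pipeline summarized above.
# Anchor-point positions are given in pixels; the instruction format is
# an illustrative assumption, not taken from the original disclosure.

def compare_frames(first_anchor, second_anchor):
    """Relative displacement of the face anchor point between two adjacent frames."""
    dx = second_anchor[0] - first_anchor[0]
    dy = second_anchor[1] - first_anchor[1]
    return dx, dy

def to_control_instruction(dx, dy):
    """Map the displacement to a move instruction: direction plus amplitude."""
    if abs(dx) >= abs(dy):
        return {"action": "move_left" if dx > 0 else "move_right",
                "amplitude": abs(dx)}
    return {"action": "move_up" if dy > 0 else "move_down",
            "amplitude": abs(dy)}

if __name__ == "__main__":
    # Simulated anchor-point positions (pixels) of consecutively captured frames.
    frames = [(320, 240), (326, 241), (331, 239), (330, 252)]
    for first, second in zip(frames, frames[1:]):
        dx, dy = compare_frames(first, second)
        print(to_control_instruction(dx, dy))   # instruction driving the simulated object
```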

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

An interaction method for an application scene, a mobile terminal, and a storage medium, used to implement immersive interaction on a mobile terminal. An embodiment of the present application provides an interaction method for an application scene, including: a mobile terminal performs real-time image acquisition of a target face through a camera configured on the mobile terminal to obtain a first frame image and a second frame image, the first frame image and the second frame image being two adjacent frames captured one after the other (201); the mobile terminal compares the first frame image with the second frame image to obtain the motion of the target face and the corresponding amplitude (202); the mobile terminal generates, according to the motion of the target face and the corresponding amplitude, a control instruction for a simulated object in an interactive application scene, the simulated object and an interactive object being displayed in the interactive application scene (203); the mobile terminal controls, according to the control instruction, the simulated object to interact with the interactive object in the interactive application scene (204).

Description

一种应用场景的交互方法和移动终端以及存储介质
本申请要求于2018年08月28日提交中国专利局、申请号为201810989371.5、发明名称“一种应用场景的交互方法和终端以及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及计算机技术领域,尤其涉及一种应用场景的交互方法和移动终端以及存储介质。
背景技术
计算机仿真系统可以创建虚拟世界,并使用户来体验虚拟世界,它利用计算机生成一种模拟环境,是一种多源信息融合的交互式的三维动态视景和实体行为的系统仿真,使用户沉浸到该环境中。
目前,重要的应用是移动端游戏,该游戏的一大特色就是让玩家在游戏中进行操作交互,为此玩家需要通过手指和终端的触摸屏幕进行交互。在目前的移动端游戏中,用户仍然需要手指和触摸屏幕进行游戏交互,例如通过手指点击触摸屏幕、或者手指操作手柄按钮、或者手指滑动触摸屏幕等操作完成。
发明内容
本申请实施例提供了一种应用场景的交互方法和移动终端以及存储介质,用于实现移动终端上的沉浸式交互。
本申请实施例提供以下技术方案:
一方面,本申请实施例提供一种应用场景的交互方法,包括:
移动终端通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,其中,所述第一帧图像和所述第二帧 图像是分别拍摄得到的前后相邻的两帧图像;
移动终端将所述第一帧图像和所述第二帧图像进行对比,得到所述目标人脸的动作和相应的幅度;
移动终端根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,所述交互式应用场景中显示有所述模拟对象和可交互物件;
移动终端根据所述控制指令控制所述模拟对象在所述交互式应用场景中与所述可交互物件进行交互。
另一方面,本申请实施例还提供了一种计算机可读存储介质,包括指令,当其在计算机上运行时,使得计算机执行上述的方法。
另一方面,本申请实施例还提供一种移动终端,包括括一个或多个处理器,以及一个或多个存储程序单元的存储器,其中,程序单元由处理器执行,该程序单元包括:
图像采集模块,被设置为通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,其中,所述第一帧图像和所述第二帧图像是分别拍摄得到的前后相邻的两帧图像;
对比模块,被设置为将所述第一帧图像和所述第二帧图像进行对比,得到所述目标人脸的动作和相应的幅度;
指令生成模块,被设置为根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,所述交互式应用场景中显示有所述模拟对象和可交互物件;
交互模块,被设置为根据所述控制指令控制所述模拟对象在所述交互式应用场景中与所述可交互物件进行交互。
在前述方面中,移动终端的组成模块还可以执行前述一方面以及各种可能的实现方式中所描述的步骤,详见前述对前述一方面以及各种可能的 实现方式中的说明。
另一方面,本申请实施例提供一种移动终端,该移动终端包括:处理器、存储器;存储器被设置为存储指令;处理器被设置为执行存储器中的指令,使得移动终端执行如前述一方面中任一项的方法。
在本申请实施例中,通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,第一帧图像和第二帧图像是分别拍摄得到的前后相邻的两帧图像,将第一帧图像和第二帧图像进行对比,得到目标人脸的动作和相应的幅度,根据目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,交互式应用场景中显示有模拟对象和可交互物件,根据控制指令控制模拟对象在交互式应用场景中与可交互物件进行交互。本申请实施例可以依据摄像头实时拍摄的多帧图像的比对结果获取到人脸的动作以及幅度,进而可以生成模拟对象的控制指令,通过该控制指令实现模拟对象与可交互物件的交互,本申请实施例中可以依赖于用户的人脸表情进行场景交互,而不通过用户的手指来下发指令,因此可以实现在移动终端上的沉浸式交互。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域的技术人员来讲,还可以根据这些附图获得其他的附图。
图1为本申请实施例中用户和移动终端之间交互场景示意图;
图2为本申请实施例提供的一种应用场景的交互方法的流程方框示意图;
图3为本申请实施例提供的另一种应用场景的交互方法的流程方框示意图;
图4为本申请实施例提供的磁吸效果的示意图;
图5为本申请实施例提供的阻尼效果的示意图;
图6为本申请实施例提供的游戏场景下的交互检测流程示意图;
图7为本申请实施例提供的一种游戏场景下的交互检测示意图;
图8为本申请实施例提供的另一种游戏场景下的交互检测示意图;
图9为本申请实施例提供的另一种游戏场景下的交互检测示意图;
图10-a为本申请实施例提供的一种移动终端的组成结构示意图;
图10-b为本申请实施例提供的一种对比模块的组成结构示意图;
图10-c为本申请实施例提供的另一种移动终端的组成结构示意图;
图10-d为本申请实施例提供的另一种移动终端的组成结构示意图;
图10-e为本申请实施例提供的一种交互模块的组成结构示意图;
图10-f为本申请实施例提供的另一种移动终端的组成结构示意图;
图11为本申请实施例提供的应用场景的交互方法应用于终端的组成结构示意图。
具体实施方式
本申请实施例提供了一种应用场景的交互方法和终端以及存储介质,用于实现移动终端上的沉浸式交互。
为使得本申请实施例的发明目的、特征、优点能够更加的明显和易懂,下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,下面所描述的实施例仅仅是本申请一部分实施例,而非全部实施例。基于本申请中的实施例,本领域的技术人员所获得的所有其他实施例,都属于本申请保护的范围。
本申请的说明书和权利要求书及上述附图中的术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,以便包含一系列单元的过程、方法、系统、产品或设备不必限于那些单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
请参考图1,其示出了本申请实施例提供的用户和移动终端之间交互场景示意图。移动终端(简称终端)与用户之间可以进行交互,终端可以是手机、平板电脑、电子书阅读器、动态影像专家压缩标准音频层面3(Moving Picture Experts Group Audio Layer III,MP3)播放器、动态影像专家压缩标准音频层面4(Moving Picture Experts Group Audio Layer IV,MP4)播放器、膝上型便携计算机和台式计算机等等。
终端上配置有摄像头,该摄像头具体可以是前置摄像头。摄像头可以采集用户的人脸图像,该终端执行本申请实施例提供的应用场景的交互方法,在兼容现有移动端的输入交互方式的同时,通过摄像头逐帧捕捉用户的脸部图像,通过前后帧的脸部数据对比,计算得出用户所做出的动作和相应幅度,并对应成不同交互式应用场景下的控制指令,基于该控制指令可以实现在交互式应用场景中的模拟对象与可交互物件之间的交互。相比相关技术,在脸部控制的精准度和顺滑度方面,做出极大的提升,使无需任何手指操作的新的交互方式成为了可能。该交互方式运用于强调沉浸式体验,因不需要任何通过手指触碰屏幕来进行交互的操作,完全使用脸部表情识别来进行交互操作,所以能大幅提升用户的代入感。
以下从移动终端的角度进行详细说明。本申请应用场景的交互方法的一个实施例,具体可以应用于基于人脸图像的交互检测处理中,图2为本申请实施例提供的一种应用场景的交互方法的流程方框示意图。请参阅图2所示,本申请一个实施例提供的应用场景的交互方法,可以包括如下步骤:
201、移动终端通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,其中,第一帧图像和第二帧图像 是分别拍摄得到的前后相邻的两帧图像。
在本申请实施例中,终端中设置有摄像头,首先对移动终端的摄像头进行初始化运行,在初始化运行完成之后,启动摄像头采集在摄像头的视野内出现的物体图像,在该物体图像被识别为用户人脸图像的情况下,将该用户人脸作为目标人脸进行实时拍摄,并生成在不同帧时刻拍摄的图像,本申请实施例中终端上安装的摄像头可以是后置摄像头或者前置摄像头。终端首先获取到待处理的第一帧图像和第二帧图像,每帧图像可以是通过摄像头对用户的人脸进行拍摄后生成的人脸图像,本申请实施例中人脸图像也可以称为面部图像或者头部图像等。
本申请实施例中摄像头采集在不同帧时刻的图像时可以得到多个帧时刻的图像,其中为了区别上述多个帧时刻的图像,将摄像头在第一帧时刻拍摄得到的图像定义为第一帧图像,将摄像头在第二帧时刻拍摄得到的图像定义为第二帧图像,第一帧图像和第二帧图像只是用于区分摄像头在不同帧时刻拍摄到的图像。
在本申请实施例中,对于在第一帧时刻拍摄得到的第一帧图像进行特征点的检测计算可以得到第一帧图像的特征点,对于在第二帧时刻拍摄得到的第二帧图像进行特征点的检测计算可以得到第二帧图像的特征点。其中,图像的特征点在基于特征点的图像匹配算法中有着十分重要的作用,图像特征点能够反映图像本质特征,能够标识图像中目标物体,通过特征点的匹配能够完成图像的匹配,本申请实施例图像的特征点可以是局部图像特征点,对于第一帧图像和第二帧图像的特征点提取可以有多种实现方式,例如可以是ORB(英文名称:ORiented Binary Robust Independent Elementary Features)特征点的提取,也可以是加速鲁棒特征(Speeded Up Robust Features,SURF)的提取,还可以是尺度不变特征变换(Scale-Invariant Feature Transform,SIFT)的提取等,因此本申请实施例中局部图像特点可以是ORB特征点、SURF特征点、SIFT特征点。通过对特征点的检测可以识别出每个帧图像上的人脸,例如通过特征点检测可 以识别出人脸中的五官器官,这些五官器官可以作为人脸的定位点。
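The paragraph above notes that local feature points (for example ORB, SURF, or SIFT features) can be extracted from each frame and matched to locate the face and its facial landmarks. The sketch below illustrates one way to do this with ORB features; using OpenCV, and summarizing the inter-frame motion by the median keypoint displacement, are assumptions of this example and are not prescribed by the original text.

```python
# Illustrative ORB feature extraction and matching between two adjacent frames.
import cv2
import numpy as np

def median_face_shift(gray_first, gray_second, max_features=500):
    """Return the median (dx, dy) displacement of matched ORB keypoints
    between two grayscale frames, or None if no match can be made."""
    orb = cv2.ORB_create(nfeatures=max_features)
    kp1, des1 = orb.detectAndCompute(gray_first, None)
    kp2, des2 = orb.detectAndCompute(gray_second, None)
    if des1 is None or des2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if not matches:
        return None
    shifts = np.array([np.array(kp2[m.trainIdx].pt) - np.array(kp1[m.queryIdx].pt)
                       for m in matches])
    return tuple(np.median(shifts, axis=0))

# Usage (assuming two grayscale face crops from adjacent frames):
#   dx, dy = median_face_shift(prev_gray, curr_gray)
```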
在本申请的一些实施例中,步骤201移动终端通过移动终端配置的摄像头对目标人脸进行实时的图像采集之前,本申请实施例提供的方法还包括如下步骤:
移动终端检测移动终端的触摸屏幕上是否产生有触摸输入;
当触摸屏幕上没有产生触摸输入时,移动终端触发执行如下步骤:通过移动终端配置的摄像头对目标人脸进行实时的图像采集。
其中,为了实现用户与移动终端的全程为无接触控制,首先判断触摸屏幕上是否产生有触摸输入,在确定触摸屏幕上没有产生触摸输入时,说明用户没有通过手指来触摸移动终端的屏幕,此时再启动移动终端的摄像头来采集用户的图像。
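The preceding paragraphs describe first checking whether any touch input has been produced on the touch screen, and only starting camera-based face capture when there is none, so that the whole interaction stays contact-free. A tiny sketch of that gate is shown below; the touch-event queue and the camera start call are hypothetical placeholders, not APIs from the original disclosure.

```python
# Hypothetical gate: start real-time face capture only when no touch
# input has been produced on the touch screen.
def maybe_start_face_capture(touch_events, camera):
    if touch_events:                      # a touch input was produced
        return False                      # fall back to ordinary touch interaction
    camera.start_realtime_capture()       # assumed camera API
    return True
```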
202、移动终端将第一帧图像和第二帧图像进行对比,得到目标人脸的动作和相应的幅度。
在本申请实施例中,终端在获取到前后帧的人脸图像之后,通过人脸图像在前后帧中的位置变化,可以确定出目标人脸所做的动作,以及该动作所对应的幅度。其中,目标人脸的动作指的是终端的摄像头所拍摄到的额用户动作,例如动作可以是脸部的上下左右移动,幅度指的是脸部动作移动的方向以及距离。
举例说明,本申请实施例中,通过前后帧图像的对比,可以识别出目标人脸的动作为左右转动脸部,即对应屏幕脸部左右移动,转动幅度对应屏幕移动距离。又如通过前后帧图像的对比,可以识别出人脸的动作为抬头低头,对应屏幕脸部上下移动,移动幅度对应屏幕移动距离。
在本申请的一些实施例中,步骤202移动终端将第一帧图像和第二帧图像进行对比,得到目标人脸的动作和相应的幅度,包括:
确定脸部定位点出现在第一帧图像中的第一像素位置,以及脸部定位 点出现在第二帧图像中的第二像素位置;
将第一像素位置和第二像素位置进行对比,得到第一像素位置和第二像素位置之间的相对位移;
根据第一像素位置和第二像素位置之间的相对位移确定目标人脸的动作和相应的幅度。
在本申请实施例,通过前述步骤对目标人脸采集得到的多帧图像进行脸部检测之后,可以输出图像中的脸部位置,脸部位置在摄像头采集到的图像中表示的位置是以像素为单位的,接下来从对该脸部位置进行脸部定位点检测,该脸部定位点的定位点特征可以通过预先的统计分类的方式来确定,从而从脸部位置上检测是否存在满足预置的定位点特征,如果在脸部位置上存在符合该定位点特征的像素位置,则确定在脸部位置上存在符合该定位点特征的像素位置就是脸部定位点在脸部位置中的定位点位置。其中本申请实施例采用的脸部定位点是指在目标对象的脸部位置上用于定位脸部位置是否发生转向的定位参考点。在实际应用中,脸部定位点的选择可以基于脸部位置上可实现的五官特征来选取脸部定位点,需要说明的是,脸部定位点可以指的是脸部位置上的某个器官所在的像素位置,也可以指的是多个器官所在的像素位置,此处不做限定。
在每帧图像中都检测同一个脸部定位点所在的像素位置,例如脸部定位点在第一帧图像中的像素位置称为第一像素位置,脸部定位点在第二帧图像中的像素位置称为第二像素位置,比对在两帧图像中的这两个像素位置,确定出相对位移,基于该相对位移计算出目标人脸的动作以及幅度。举例说明如下,以人脸的上下移动为例,当前人脸眉心点坐标与上一次眉心点坐标数据对比,Y坐标增大表示抬头;Y坐标减少表示低头。以人脸的左右移动为例,当前人脸鼻尖坐标与上一次鼻尖坐标数据对比,X坐标增大表示左转,X坐标减小表示右转。以人脸的前后移动为例,当前人脸左眼远角到右眼远角距离,与上一次同样距离数据对比是变大的,则说明人脸产生了前移,若该数据对比是变小的,则说明人脸产生了后移。以用 户的点击为例,当前人脸中心点坐标与上一次中心点坐标数据对比,在一定坐标范围内保持一定时间,则确定人脸产生的是点击操作。
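The paragraph above (in the original Chinese) spells out the per-landmark comparison rules: the Y coordinate of the brow center distinguishes head up from head down, the X coordinate of the nose tip distinguishes left from right turns, the distance between the outer corners of the two eyes distinguishes forward from backward movement, and a face center point that stays within a small coordinate range for a period of time is treated as a click. The sketch below implements those rules; the thresholds, landmark names, and the click-state bookkeeping are chosen only for illustration.

```python
import math, time

# Thresholds and landmark names below are illustrative assumptions.
MOVE_EPS = 3.0        # pixels: ignore jitter smaller than this
CLICK_RADIUS = 10.0   # pixels: "stays within a certain coordinate range"
CLICK_HOLD_S = 1.0    # seconds the center point must stay put for a click

def classify_face_action(prev, curr, click_state):
    """prev/curr: dicts of landmark name -> (x, y) pixel coordinates.
    click_state: {'anchor': (x, y), 'since': timestamp}, carried across frames."""
    actions = []
    # Brow-center Y increases -> head up; decreases -> head down.
    dy = curr["brow_center"][1] - prev["brow_center"][1]
    if abs(dy) > MOVE_EPS:
        actions.append(("head_up" if dy > 0 else "head_down", abs(dy)))
    # Nose-tip X increases -> turn left; decreases -> turn right.
    dx = curr["nose_tip"][0] - prev["nose_tip"][0]
    if abs(dx) > MOVE_EPS:
        actions.append(("turn_left" if dx > 0 else "turn_right", abs(dx)))
    # Outer-eye-corner distance grows -> face moved forward; shrinks -> backward.
    prev_span = math.dist(prev["left_eye_outer"], prev["right_eye_outer"])
    curr_span = math.dist(curr["left_eye_outer"], curr["right_eye_outer"])
    if abs(curr_span - prev_span) > MOVE_EPS:
        actions.append(("move_forward" if curr_span > prev_span else "move_backward",
                        abs(curr_span - prev_span)))
    # Click: face center point held inside a small range for a set time.
    if math.dist(curr["center"], click_state["anchor"]) <= CLICK_RADIUS:
        if time.monotonic() - click_state["since"] >= CLICK_HOLD_S:
            actions.append(("click", 0.0))
    else:
        click_state["anchor"] = curr["center"]
        click_state["since"] = time.monotonic()
    return actions
```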
203、移动终端根据目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,交互式应用场景中显示有模拟对象和可交互物件。
本申请实施例中交互式应用场景具体可以是游戏场景,也可以是应用程序的交互场景。举例说明,本申请实施例提供的交互式应用场景的处理方法可以适用于为游戏角色搭建的场景,也可以适用于在软件应用系统中为用户对象搭建的场景。本申请实施例中所述的交互式应用场景中显示有模拟对象,该模拟对象可以是游戏场景中的游戏角色,也可以是游戏场景中的英雄和士兵,例如模拟对象可以是策略游戏由用户控制的人或事物,此处不做限定。在交互式应用场景中除了显示有模拟对象之外,交互式应用场景中还显示有可交互物件,该可交互物件是指交互式应用场景中能够与模拟对象进行交互的物件,该物件在不同的交互式场景下可以是为多种物件,例如可以是游戏场景中的道具。
在本申请实施例中,终端在获取到目标人脸的动作以及相应的幅度之后,将该动作和相应的幅度可以映射为模拟对象在交互式应用场景中的控制指令,该控制指令也可以称为“交互指令”,即用户通过自己的脸部做出动作,基于该动作的类型以及幅度可以对应成模拟对象的控制指令,通过该控制指令可以控制模拟对象。
在本申请的一些实施例中,接下来对本申请实施例中可交互物件的锁定方式进行说明,步骤203移动终端根据目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令之后,本申请实施例提供的方法还包括如下步骤:
移动终端根据控制指令确定焦点是否在可交互物件范围内保持稳定达到预设的时长,焦点为目标人脸映射在交互式应用场景中的参考点;
当模拟对象在可交互物件范围内保持稳定达到预设的时长时,移动终端从可交互物件范围内锁定出可交互物件。
其中,目标人脸在交互式应用场景中映射有一个焦点,该焦点是用户的人脸映射在交互式应用场景中的参考点,首先通过该控制指令确定出该焦点是否在可交互物件范围内,可交互物件范围是指在交互式应用场景中可交互物件所处的一定区域大小的范围,只有焦点进入该可交互物件范围内模拟对象才能与可交互物件进行交互,若焦点没有进入该可交互物件范围内,模拟对象无法与可交互物件进行交互。确定该模拟对象在该可交互物件范围保持稳定的时长是否达到预设的时长,预设的时长以x秒表示,则在一定范围内维持定点达到x秒时,确定对应屏幕中的交互物件,进行焦点锁定,达成对应操作。可以理解的是,在交互式应用场景中,可交互物件的锁定可以采用默认配置,也可以在有多个可交互物件时,通过焦点与可交互物件范围的距离来锁定出可交互物件,被锁定的可交互物件即为模拟对象需要进行交互的物件。
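The paragraphs above describe locking an interactable object once the focus, that is, the point the target face maps to in the scene, has stayed inside that object's range for a preset duration ("x seconds"). A minimal dwell-timer sketch of that focus lock follows; the duration value and the object's range test are assumed parameters of the example.

```python
import time

class FocusLock:
    """Lock an interactable object once the focus stays within its range
    for `hold_seconds` (the value here is illustrative, not from the source)."""
    def __init__(self, hold_seconds=2.0):
        self.hold_seconds = hold_seconds
        self._entered_at = None

    def update(self, focus_xy, obj):
        """obj.contains(focus_xy) is an assumed range test for the object."""
        if not obj.contains(focus_xy):
            self._entered_at = None           # focus left the object's range
            return None
        if self._entered_at is None:
            self._entered_at = time.monotonic()
        if time.monotonic() - self._entered_at >= self.hold_seconds:
            return obj                        # lock this interactable object
        return None
```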
204、移动终端根据控制指令控制模拟对象在交互式应用场景中与可交互物件进行交互。
在本申请实施例中,终端生成控制指令之后,终端可以根据该控制指令来控制模拟对象与可交互物件的交互方式,本申请实施例中根据交互式应用场景的设置不同,模拟对象与可交互物件之间的交互行为可以有多种。举例说明如下,以交互式应用场景为吃鱼游戏为例,可交互物件为在游戏场景中设置的鱼类道具,若通过前述实施例生成的控制指令为张嘴,则可以控制游戏角色(例如嘴巴)开始张嘴吃鱼,实现游戏角色与可交互物件之间的交互。本申请实施例中模拟对象的控制指令通过图像检测来生成,全程以非触控的形式,以第一人称视觉增强游戏代入感,并带来与众不同的游戏体验。
在本申请的一些实施例中,步骤204移动终端根据控制指令控制模拟对象在交互式应用场景中与可交互物件进行交互,包括:
实时计算焦点与可交互物件之间的距离,焦点为目标人脸映射在交互式应用场景中的参考点;
根据实时计算出的距离确定焦点是否在可交互物件范围内;
当焦点在可交互物件范围内时,根据实时计算出的距离更新焦点对应的位移速率;
根据更新后的位移速率更新控制指令,并使用更新后的控制指令控制模拟对象与可交互物件进行交互。
其中,在控制模拟对象与可交互物件进行交互时,终端可以实时计算模拟对象与可交互物件的距离,通过该距离可以判断焦点是否在可交互物件范围内,只有焦点在可交互物件范围内时才能执行后续的交互流程,若焦点没有在可交互物件范围内则无法进行交互。当焦点在可交互物件范围内时,根据实时计算出的距离更新焦点对应的位移速率,即可以根据该距离的实时变化来实时的更新焦点对应的位移速率,根据不断更新的位移速率可以更新控制指令,使得该控制指令可以用于控制模拟对象与可交互物件的交互。例如,焦点对应的位移速率发生变化时,模拟对象可以根据控制指令进行实时的移,在不同的应用场景下,模拟对象与可交互物件之间的交互方式可以结合具体场景来确定。
可选的,在本申请的一些实施例中,在实时计算出焦点与可交互物件之间的距离之后,根据该距离的实时变化情况可以更新焦点对应的位移速率,以实现对位移速率的实时更新,可以实现磁吸效果和阻尼效果。具体的,根据实时计算出的距离更新焦点对应的位移速率,包括:
当焦点与可交互物件之间的距离减小时,在焦点的移动方向上先减少位移速率再增加位移速率;或者,
当焦点与可交互物件之间的距离增大时,在焦点的移动方向上先减少位移速率,再在移动方向的相反方向上增加位移速率。
其中,终端可以采用磁吸对焦的方式,即可以通过实时计算模拟对象 与可交互物件的距离,当焦点处于可交互物品状态下,焦点与可交互物件之间的距离减小时,在焦点的移动方向上先减少位移速率再增加位移速率,增大了焦点脱离可交互物件的移动阻力,实现动态调节焦点对应的位移速率,基于该增大后的移动阻力来更新控制指令,使得该控制指令可以产生交互点的吸力作用,使得进行物品交互操作难度降低,精确度提升。
如图4所示,为本申请实施例提供的磁吸效果的示意图。模拟对象在控制指令的控制下,所产生的速度通过如下公式计算:
速度=(初始速度×初始方向-目标速度×目标方向)×(当前时间/总时间)×磁吸常数。
其中,交互物外指的是人脸映射于设备中的焦点在可交互物件范围外,交互物内指的是人脸映射于设备中的焦点在可交互物件范围内。初始速度、初始方向、目标速度、目标方向,这些都是人脸的移动对应到屏幕上的移动数据,磁吸常数可以根据实际场景来调整,磁吸常数用于扩大人脸转速到焦点移速的体验,例如磁吸常数的取值可以为1.32。举例说明如下,对于可交互物件,当脸部操作光标靠近时,会加大移动阻力,即需要更大的动作幅度才能脱离物体。本申请实施例通过磁吸对焦在脸部控制的精准度和顺滑度方面,做出了较大提升,使完全依托于脸部表情识别来进行交互的游戏成为了可能。
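The paragraphs above give the magnetic ("snap") focusing velocity as (initial speed × initial direction − target speed × target direction) × (current time / total time) × magnetic constant, with 1.32 cited as an example constant. The sketch below transcribes that formula directly; treating the directions as 2-D unit vectors is an interpretation made for this example, since the source states the formula only in words.

```python
MAGNETIC_CONSTANT = 1.32   # example value given in the description

def magnetic_velocity(initial_speed, initial_dir, target_speed, target_dir,
                      current_time, total_time, k=MAGNETIC_CONSTANT):
    """velocity = (initial_speed*initial_dir - target_speed*target_dir)
                  * (current_time / total_time) * k
    Directions are 2-D unit vectors (x, y); reading them as vectors is an
    assumption of this sketch."""
    scale = (current_time / total_time) * k
    return ((initial_speed * initial_dir[0] - target_speed * target_dir[0]) * scale,
            (initial_speed * initial_dir[1] - target_speed * target_dir[1]) * scale)

# Example: focus drifting right while the locked object pulls it back left.
print(magnetic_velocity(8.0, (1, 0), 5.0, (-1, 0), current_time=0.2, total_time=0.5))
```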
在本申请的另一些实施例中,在控制模拟对象与可交互物件进行交互时,终端可以采用阻尼效果的方式,根据实时计算出的模拟对象与可交互物件之间的距离,焦点与可交互物件之间的距离增大时,在焦点的移动方向上先减少位移速率,再在移动方向的相反方向上增加位移速率。判断该焦点对应的位移速率是否超过阈值,若位移速率超过该阈值,则说明脸部操作过快导致来回双方向的高速位移发生,此时可以在同方向上先减小焦点对应的位移速率,再在相反方向上增加位移速率,产生阻尼效果,减缓操控的不确定性。
如图5所示,为本申请实施例提供的阻尼效果的示意图。模拟对象在 控制指令的控制下,所产生的速度通过如下公式计算:
速度=cos(初始速度×方向/目标速度×方向)×(当前时间/总时间)×阻尼常数。
其中,初始速度、初始方向、目标速度、目标方向,这些都是人脸的移动对应到屏幕上的移动数据,阻尼常数可以根据实际场景来调整,阻尼常数用于增强人脸转向映射到焦点转向的加减速的体验,例如阻尼常数的取值可以为0.53。本申请实施例中可以减少焦点对应的位移速率,产生阻尼效果,减缓操控的不确定性,从而在脸部控制的精准度和顺滑度方面,做出了较大提升,使完全依托于脸部表情识别来进行交互的游戏成为了可能。
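The damping formula above is velocity = cos(initial speed × direction / (target speed × direction)) × (current time / total time) × damping constant, with 0.53 given as an example constant; it is applied when the face moves so fast that the focus would otherwise oscillate at high speed in both directions. The sketch below is a literal transcription; reading the directions as signed scalars (+1/−1) is an interpretation on our part, as the source states the formula only in words.

```python
import math

DAMPING_CONSTANT = 0.53   # example value given in the description

def damped_speed(initial_speed, initial_dir, target_speed, target_dir,
                 current_time, total_time, c=DAMPING_CONSTANT):
    """speed = cos(initial_speed*initial_dir / (target_speed*target_dir))
               * (current_time / total_time) * c
    Directions are read as signed scalars (+1 or -1); this reading is an
    assumption of the sketch."""
    ratio = (initial_speed * initial_dir) / (target_speed * target_dir)
    return math.cos(ratio) * (current_time / total_time) * c

# Example: a fast reversal (rightward motion against a leftward target).
print(damped_speed(12.0, +1, 6.0, -1, current_time=0.1, total_time=0.5))
```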
前述描述了交互式应用场景下基于人脸映射出的焦点进行的交互检测过程,图3为本申请实施例提供的另一种应用场景的交互方法的流程方框示意图,接下来请参阅图3所示,接下来对脱焦判定的过程进行详细说明,在本申请的一些实施例中,本申请实施例提供的方法还包括如下步骤:
301、移动终端确定实时采集到的每一帧图像上的多个人脸关键点像素坐标。
其中,在每一帧图像上,终端通过摄像头可以捕捉到多个人脸关键点像素坐标,人脸关键点的个数可以设置为90。
302、移动终端根据多个人脸关键点像素坐标判断目标人脸是否失去焦点。
终端通过这些人脸关键点像素坐标来确定目标人脸是否失去焦点,例如,根据摄像头捕捉到的90个人脸关键点像素坐标分析,当这些关键点无法采集到90个,或者抬头低头超过一定幅度(例如45度),就判断为脱焦。
303、当目标人脸失去焦点时,对目标人脸进行对焦校正。
在本申请实施例中,当脸部无法识别或脱离摄像头判定时,通过实时计算用户脸部数据,判定用户是否失去焦点,给予对焦校正,通过步骤301至步骤303对人脸所映射的焦点进行实时的脱焦判定,从而可以即时进行对焦校正,以使得前述基于焦点的磁吸对焦和阻尼效果能够即时完成。
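Steps 301 to 303 above determine defocus from the per-frame face keypoints: if the camera cannot capture all of the keypoints (90 in the example), or the head is raised or lowered beyond a certain amplitude (45 degrees in the example), the target face is judged to have lost focus and a focus correction is triggered. A small sketch of that check follows; the pitch estimate and the re-focus callback are assumed inputs of the example.

```python
EXPECTED_KEYPOINTS = 90     # number of face keypoints in the example
MAX_PITCH_DEG = 45.0        # head-up/down amplitude beyond which defocus is judged

def is_defocused(keypoints, pitch_deg):
    """keypoints: list of (x, y) face keypoint pixel coordinates for one frame;
    pitch_deg: estimated head pitch (the estimation itself is assumed elsewhere)."""
    if len(keypoints) < EXPECTED_KEYPOINTS:    # not all keypoints captured
        return True
    return abs(pitch_deg) > MAX_PITCH_DEG      # raised/lowered beyond the limit

def update_focus(keypoints, pitch_deg, show_refocus_ui):
    """When defocused, trigger the focus-correction prompt (assumed callback)."""
    if is_defocused(keypoints, pitch_deg):
        show_refocus_ui()   # e.g., the UI outline prompt for re-focusing
        return False
    return True
```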
通过以上实施例对本申请实施例的描述可知,通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,第一帧图像和第二帧图像是分别拍摄得到的前后相邻的两帧图像,将第一帧图像和第二帧图像进行对比,得到目标人脸的动作和相应的幅度,根据目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,交互式应用场景中显示有模拟对象和可交互物件,根据控制指令控制模拟对象在交互式应用场景中与可交互物件进行交互。本申请实施例可以依据摄像头实时拍摄的多帧图像的比对结果获取到人脸的动作以及幅度,进而可以生成模拟对象的控制指令,通过该控制指令实现模拟对象与可交互物件的交互,本申请实施例中可以依赖于用户的人脸表情进行场景交互,而不通过用户的手指来下发指令,因此可以实现在移动终端上的沉浸式交互。
为便于更好的理解和实施本申请实施例的上述方案,下面举例相应的应用场景来进行具体说明。
本申请实施例在兼容现有输入交互方式的同时,通过摄像头逐帧捕捉用户脸部特征,通过前后帧脸部数据对比,计算得出用户所做出的动作和相应幅度,并对应成不同游戏输入,实现了无需任何手指操作的新的交互方式。在脸部控制的精准度和顺滑度方面,做出了较大提升,使完全依托于脸部表情识别来进行交互的游戏成为了可能。
本申请实施例对于终端的硬件需求如下:包含摄像头的手机或个人计算机均可。如图6所示,为本申请实施例提供的游戏场景下的交互检测流程示意图。主要的实现逻辑可以包括如下过程:
S01、用户输入。
其中,用户可以通过终端的摄像头采集人脸图像,也可以通过手指来操作触摸屏。
S02、判断是否手指数据输入。
终端可以检测触摸屏上是否有用户输入的手指数据。
S03、手指屏幕点击滑动。
在检测到有手指数据输入的情况下,用户可以在屏幕上点击滑动。
S04、判断是否摄像头脸部数据输入。
在没有检测到手指数据的情况下,终端可以判断摄像头是否有脸部数据输入。
S05、判断输入数据是否错误。
在步骤S04之后,终端判断输入的数据是否有误,即是否可以检测到完整的人脸图像。
S06、提示用户。
在无法输入数据有误的情况下,提示用户重新输入,或者提述用户输入失败。
S07、判断是否有对比数据。
终端在采集到多帧连续的人脸图像的情况下,针对前后连续的多帧图像进行对比。
S08、数据对比。
在数据对比环节,终端可以比对前后两帧的人脸图像,具体对比过程详见前述实施例的描述,此处不再展开说明。
需要说明的是,第一次输入没有对比数据时,会跳转到数据存储,在第二次数据对比时,会拿之前存储的数据与当前的数据做对比。
S09、根据数据对比,判断出用户不同脸部动作,及幅度。
在完成数据对比之后,终端可以判断出用户当前所做的动作以及相应的幅度。
举例说明如下,以人脸的上下移动为例,当前人脸眉心点坐标与上一次眉心点坐标数据对比,Y坐标增大表示抬头;Y坐标减少表示低头。以人脸的左右移动为例,当前人脸鼻尖坐标与上一次鼻尖坐标数据对比,X坐标增大表示左转,X坐标减小表示右转。以人脸的前后移动为例,当前人脸左眼远角到右眼远角距离,与上一次同样距离数据对比是变大的,则说明人脸产生了前移,若该数据对比是变小的,则说明人脸产生了后移。以用户的点击为例,当前人脸中心点坐标与上一次中心点坐标数据对比,在一定坐标范围内保持一定时间,则确定人脸产生的是点击操作。
举例说明如下,请参阅表1所示,为基础功能与实现机制的对应关系表:
Figure PCTCN2019091402-appb-000001
请参阅表2所示,为特殊功能与实现机制的对应关系表:
Figure PCTCN2019091402-appb-000002
S10、生成控制指令(例如上下左右前后移动,点击)。
在上述表1和表2中,左右转动脸部是指对应屏幕左右移动,转动幅度对应屏幕移动距离。抬头低头对应屏幕上下移动,幅度对应屏幕移动距离。张嘴闭嘴对应屏幕的相关操作,例如:咬合吃鱼操作。在一定范围内维持定点x秒,定点为维持具体视线的焦点在范围内保持稳定,对应屏幕交互物品,进行焦点锁定,达成对应操作。
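Tables 1 and 2 (referenced above as embedded images) and the paragraph above map face actions to screen operations: turning the face maps to left/right movement, raising or lowering the head to up/down movement, opening and closing the mouth to a bite operation (such as eating a fish), and holding the focus steady within a range for x seconds to a focus lock. The sketch below shows one such action-to-operation mapping; the operation names are placeholders chosen for illustration, not taken from the original tables.

```python
# Illustrative mapping from detected face actions to in-game operations.
ACTION_TO_OPERATION = {
    "turn_left": "move_screen_left",
    "turn_right": "move_screen_right",
    "head_up": "move_screen_up",
    "head_down": "move_screen_down",
    "mouth_open": "start_bite",       # e.g., the fish-eating bite operation
    "mouth_close": "finish_bite",
    "hold_focus": "lock_interactable_object",
}

def to_operation(action, amplitude):
    """Translate (action, amplitude) into an operation plus a move distance."""
    return {"operation": ACTION_TO_OPERATION.get(action, "none"),
            "distance": amplitude}
```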
接下来对模拟对象与可交互物件之间的交互方式进行说明:
磁吸对焦:通过实时计算人脸在设备中所映射的模拟对象与可交互物件距离,当处于可交互物品状态下,动态调节人脸映射与设备中的焦点对 应的位移速率,产生交互点的吸力作用,使得进行物品交互操作难度降低,精确度提升。
脱焦判定:通过实时计算获取用户脸部数据,判定用户是否失去焦点,给予脱焦校正。例如,根据摄像头捕捉到的人脸关键点(90个)的坐标分析,当点不全,或者抬头低头超过一定幅度(45度),就判断为脱焦。
阻尼效果:当脸部操作过快导致来回双方的高速移位发生时,产生阻尼效果,减缓操控的不确定性。
S11、驱动相应游戏表现。
其中,终端可以基于上述步骤生成的控制指令来驱动游戏表现。接下来对各种游戏场景进行举例说明:
以基于脸部识别进行精确控制操作的游戏为例,在游戏场景中,用户将全程用脸部控制操作来进行游戏,通过手机的摄像头,根据每个不同关卡的玩法,以第一人称的形式来控制游戏中的角色完成不同的操作,如:移动、锁定、咬合、啄等等。不同于传统游戏的触摸式操作,本申请实施例中全程以非触控的形式,以第一人称视觉增强游戏代入感,并带来与众不同的游戏体验。
主要特点如下:
全程无触碰,整个游戏过程全部无触控操作,仅以脸部识别完成所有的游戏行为。
精准操控交互,通过磁吸对焦、脱焦矫正、阻尼效果等等产品技术方案,将原本较为模糊不精确的脸部控制变得实用且更加的精准。会使得脸部控制的难度降低,能更加精准的控制。
沉浸式代入感,第一人称视角,完整全面的契合脸部控制玩法,带来感同身受的体验感,增强游戏的冲击力。
图7为本申请实施例提供的一种游戏场景下的交互检测示意图,如图 7所示,在游戏场景的一种版本样图中,以玩法关卡游戏场景为例,玩家控制的模拟对象是一位醉汉的角色,在玩法关卡游戏场景中可交互物件为障碍物。移动终端的摄像头实时采集玩家的人脸,通过对拍摄到的前后相邻的两帧图像进行对比,得到玩家的动作以及相应的幅度,根据玩家的动作以及幅度可以生成对醉汉的控制指令,比如控制该醉汉向左移动,当醉汉向左移动后,该醉汉与障碍物发生碰撞,比如控制该醉汉向右移动,当醉汉向右移动后,该醉汉成功躲避障碍物。通过图7所示的场景举例,本申请实施例可以通过脸部控制醉汉的移动来躲避障碍,以此实现醉汉和障碍物的交互。
图8为本申请实施例提供的另一种游戏场景下的交互检测示意图,如图8所示,在游戏场景的另一种版本样图中,以海底玩法关卡游戏场景为例,玩家控制的模拟对象是一个鱼的角色,在海底玩法关卡游戏场景中可交互物件为嘴巴。移动终端的摄像头实时采集玩家的人脸,通过对拍摄到的前后相邻的两帧图像进行对比,得到玩家的动作以及相应的幅度,根据玩家的动作以及幅度可以生成对鱼的控制指令,比如控制该鱼向左移动,当鱼向左移动后,该嘴巴吃掉一条鱼,比如控制该鱼向右移动,当鱼向右移动后,鱼成功躲避掉嘴巴的捕捉。通过图8所示的场景举例,本申请实施例可以通过脸部控制鱼的游动,以及嘴巴咬合进行捕猎(吃鱼),以此实现鱼和嘴巴的交互。
图9为本申请实施例提供的另一种游戏场景下的交互检测示意图,如图9所示,在游戏场景的另一种版本样图中,当脸部识别出现脱离摄像头的情况下,游戏中会以脱焦矫正的形式进行用户界面(User Interface,UI)交互,即用户按照屏幕中出现的UI轮廓提示,进行脸部位置调整,达到重新对焦的目的,对于脱焦矫正的详细过程,请参阅前述的举例说明。
S12、数据存储。
在每获取到一帧的人脸图像之后,都进行数据存储,以方便下一帧的数据对比时可以调用前一帧的人脸图像。
本申请实施例提供的技术方案,实现了完全依托于脸部表情识别来进行交互的游戏,大幅提升了移动端上的沉浸式体验游戏的代入感。此外,不依赖于屏幕触碰的交互方式,提高了在一些特殊场景下进行游戏时的无障碍性。并且,利用该技术方案,也可以开发针对手部有缺陷的特殊人群的游戏,从而在用户不方便使用手指或者移动端没有配置触摸屏幕时,完成交互。
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本申请实施例并不受所描述的动作顺序的限制,因为依据本申请实施例,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本申请实施例所必须的。
本申请的实施例还提供了一种计算机可读存储介质。可选地,在本实施例中,上述计算机可读存储介质包括指令,当其在计算机上运行时,使得计算机执行上述的方法。
可选地,在本实施例中,当计算机可读存储介质包括的指令在计算机上运行时,使得计算机执行以下步骤:
S11,通过所述移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,其中,所述第一帧图像和所述第二帧图像是分别拍摄得到的前后相邻的两帧图像;
S12,将所述第一帧图像和所述第二帧图像进行对比,得到所述目标人脸的动作和相应的幅度;
S13,根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,所述交互式应用场景中显示有所述模拟对象和可交互物件;
S14,根据所述控制指令控制所述模拟对象在所述交互式应用场景中与所述可交互物件进行交互。
可选地,本实施例中的具体示例可以参考上述实施例中所描述的示例,本实施例在此不再赘述。
为便于更好的实施本申请实施例的上述方案,下面还提供用于实施上述方案的相关装置。
图10-a为本申请实施例提供的一种移动终端的组成结构示意图,请参阅图10-a所示,本申请实施例提供的一种移动终端1000,可以包括一个或多个处理器,以及一个或多个存储程序单元的存储器,其中,程序单元由处理器执行,该程序单元包括:图像采集模块1001、对比模块1002、指令生成模块1003、交互模块1004,其中,
图像采集模块1001,被设置为通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,其中,所述第一帧图像和所述第二帧图像是分别拍摄得到的前后相邻的两帧图像;
对比模块1002,被设置为将所述第一帧图像和所述第二帧图像进行对比,得到所述目标人脸的动作和相应的幅度;
指令生成模块1003,被设置为根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,所述交互式应用场景中显示有所述模拟对象和可交互物件;
交互模块1004,被设置为根据所述控制指令控制所述模拟对象在所述交互式应用场景中与所述可交互物件进行交互。
图10-b为本申请实施例提供的一种对比模块的组成结构示意图。在本申请的一些实施例中,如图10-b所示,所述对比模块1002,包括:
像素位置确定单元10021,被设置为确定脸部定位点出现在所述第一帧图像中的第一像素位置,以及所述脸部定位点出现在所述第二帧图像中的第二像素位置;
位移确定单元10022,被设置为将所述第一像素位置和所述第二像素 位置进行对比,得到所述第一像素位置和所述第二像素位置之间的相对位移;
动作确定单元10023,被设置为根据所述第一像素位置和所述第二像素位置之间的相对位移确定所述目标人脸的动作和相应的幅度。
图10-c为本申请实施例提供的另一种移动终端的组成结构示意图。在本申请的一些实施例中,如图10-c所示,相对于图10-a所示,所述程序单元还包括:
触摸检测模块1005,被设置为所述图像采集模块1001通过移动终端配置的摄像头对目标人脸进行实时的图像采集之前,检测移动终端的触摸屏幕上是否产生有触摸输入;当所述触摸屏幕上没有产生触摸输入时,触发执行所述图像采集模块。
图10-d为本申请实施例提供的另一种移动终端的组成结构示意图。在本申请的一些实施例中,如图10-d所示,相对于图10-a所示,所述程序单元还包括:
焦点检测模块1006,被设置为所述指令生成模块1003根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令之后,根据所述控制指令焦点是否在可交互物件范围内保持稳定达到预设的时长,所述焦点为所述目标人脸映射在所述交互式应用场景中的参考点;
物件锁定模块1007,被设置为当所述焦点在所述可交互物件范围内保持稳定达到预设的时长时,从所述可交互物件范围内锁定出可交互物件。
图10-e为本申请实施例提供的一种交互模块的组成结构示意图。在本申请的一些实施例中,如图10-e所示,所述交互模块1004包括:
距离计算单元10041,被设置为实时计算焦点与所述可交互物件之间的距离,所述焦点为所述目标人脸映射在所述交互式应用场景中的参考点;
范围确定单元10042,被设置为根据实时计算出的所述距离确定所述 焦点是否在可交互物件范围内;
速率更新单元10043,被设置为当所述焦点在所述可交互物件范围内时,根据实时计算出的所述距离更新所述焦点对应的位移速率;
交互单元10044,被设置为根据更新后的所述位移速率更新所述控制指令,并使用更新后的控制指令控制所述模拟对象与所述可交互物件进行交互。
在本申请的一些实施例中,速率更新单元10043,具体被设置为当所述焦点与所述可交互物件之间的距离减小时,在所述焦点的移动方向上先减少所述位移速率再增加所述位移速率;或者,当所述焦点与所述可交互物件之间的距离增大时,在所述焦点的移动方向上先减少所述位移速率,再在所述移动方向的相反方向上增加所述位移速率。
图10-f为本申请实施例提供的另一种移动终端的组成结构示意图。在本申请的一些实施例中,如图10-f所示,相对于图10-a所示,所述程序单元还包括:
关键点采集模块1008,被设置为确定实时采集到的每一帧图像上的多个人脸关键点像素坐标;
脱焦判定模块1009,被设置为根据所述多个人脸关键点像素坐标判断所述目标人脸是否失去焦点;
对焦模块1010,被设置为当所述目标人脸失去焦点时,对所述目标人脸进行对焦校正。
通过以上对本申请实施例的描述可知,通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,第一帧图像和第二帧图像是分别拍摄得到的前后相邻的两帧图像,将第一帧图像和第二帧图像进行对比,得到目标人脸的动作和相应的幅度,根据目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,交互式应用场景中显示有模拟对象和可交互物件,根据控制指令控制模拟对象 在交互式应用场景中与可交互物件进行交互。本申请实施例可以依据摄像头实时拍摄的多帧图像的比对结果获取到人脸的动作以及幅度,进而可以生成模拟对象的控制指令,通过该控制指令实现模拟对象与可交互物件的交互,本申请实施例中可以依赖于用户的人脸表情进行场景交互,而不通过用户的手指来下发指令,因此可以实现在移动终端上的沉浸式交互。
本申请实施例还提供了另一种终端,图11为本申请实施例提供的应用场景的交互方法应用于终端的组成结构示意图。如图11所示,为了便于说明,仅示出了与本申请实施例相关的部分,具体技术细节未揭示的,请参照本申请实施例方法部分。该终端可以为包括手机、平板电脑、个人数字助理(Personal Digital Assistant,PDA)、销售终端(Point of Sales,POS)、车载电脑等任意终端设备,以终端为手机为例:
图11示出的是与本申请实施例提供的终端相关的手机的部分结构的框图。参考图11,手机包括:射频(Radio Frequency,RF)电路1010、存储器1020、输入单元1030、显示单元1040、传感器1050、音频电路1060、无线保真(wireless fidelity,WiFi)模块1070、处理器1080、以及电源1090等部件。本领域技术人员可以理解,图11中示出的手机结构并不构成对手机的限定,可以包括比图示更多或更少的部件,或者组合某些部件,或者不同的部件布置。
下面结合图11对手机的各个构成部件进行具体的介绍:
RF电路1010可被设置为收发信息或通话过程中,信号的接收和发送,特别地,将基站的下行信息接收后,给处理器1080处理;另外,将设计上行的数据发送给基站。通常,RF电路1010包括但不限于天线、至少一个放大器、收发信机、耦合器、低噪声放大器(Low Noise Amplifier,LNA)、双工器等。此外,RF电路1010还可以通过无线通信与网络和其他设备通信。上述无线通信可以使用任一通信标准或协议,包括但不限于全球移动通讯系统(Global System of Mobile communication,GSM)、通用分组无线服务(General Packet Radio Service,GPRS)、码分多址(Code Division  Multiple Access,CDMA)、宽带码分多址(Wideband Code Division Multiple Access,WCDMA)、长期演进(Long Term Evolution,LTE)、电子邮件、短消息服务(Short Messaging Service,SMS)等。
存储器1020可被设置为存储软件程序以及模块,处理器1080通过运行存储在存储器1020的软件程序以及模块,从而执行手机的各种功能应用以及数据处理。存储器1020可主要包括存储程序区和存储数据区,其中,存储程序区可存储操作系统、至少一个功能所需的应用程序(比如声音播放功能、图像播放功能等)等;存储数据区可存储根据手机的使用所创建的数据(比如音频数据、电话本等)等。此外,存储器1020可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件、闪存器件、或其他易失性固态存储器件。
输入单元1030可被设置为接收输入的数字或字符信息,以及产生与手机的用户设置以及功能控制有关的键信号输入。具体地,输入单元1030可包括触控面板1031以及其他输入设备1032。触控面板1031,也称为触摸屏,可收集用户在其上或附近的触摸操作(比如用户使用手指、触笔等任何适合的物体或附件在触控面板1031上或在触控面板1031附近的操作),并根据预先设定的程式驱动相应的连接装置。可选的,触控面板1031可包括触摸检测装置和触摸控制器两个部分。其中,触摸检测装置检测用户的触摸方位,并检测触摸操作带来的信号,将信号传送给触摸控制器;触摸控制器从触摸检测装置上接收触摸信息,并将它转换成触点坐标,再送给处理器1080,并能接收处理器1080发来的命令并加以执行。此外,可以采用电阻式、电容式、红外线以及表面声波等多种类型实现触控面板1031。除了触控面板1031,输入单元1030还可以包括其他输入设备1032。具体地,其他输入设备1032可以包括但不限于物理键盘、功能键(比如音量控制按键、开关按键等)、轨迹球、鼠标、操作杆等中的一种或多种。
显示单元1040可被设置为显示由用户输入的信息或提供给用户的信息以及手机的各种菜单。显示单元1040可包括显示面板1041,可选的, 可以采用液晶显示器(Liquid Crystal Display,LCD)、有机发光二极管(Organic Light-Emitting Diode,OLED)等形式来配置显示面板1041。进一步的,触控面板1031可覆盖显示面板1041,当触控面板1031检测到在其上或附近的触摸操作后,传送给处理器1080以确定触摸事件的类型,随后处理器1080根据触摸事件的类型在显示面板1041上提供相应的视觉输出。虽然在图11中,触控面板1031与显示面板1041是作为两个独立的部件来实现手机的输入和输入功能,但是在某些实施例中,可以将触控面板1031与显示面板1041集成而实现手机的输入和输出功能。
手机还可包括至少一种传感器1050,比如光传感器、运动传感器以及其他传感器。具体地,光传感器可包括环境光传感器及接近传感器,其中,环境光传感器可根据环境光线的明暗来调节显示面板1041的亮度,接近传感器可在手机移动到耳边时,关闭显示面板1041和/或背光。作为运动传感器的一种,加速计传感器可检测各个方向上(一般为三轴)加速度的大小,静止时可检测出重力的大小及方向,可被设置为识别手机姿态的应用(比如横竖屏切换、相关游戏、磁力计姿态校准)、振动识别相关功能(比如计步器、敲击)等;至于手机还可配置的陀螺仪、气压计、湿度计、温度计、红外线传感器等其他传感器,在此不再赘述。
音频电路1060、扬声器1061,传声器1062可提供用户与手机之间的音频接口。音频电路1060可将接收到的音频数据转换后的电信号,传输到扬声器1061,由扬声器1061转换为声音信号输出;另一方面,传声器1062将收集的声音信号转换为电信号,由音频电路1060接收后转换为音频数据,再将音频数据输出处理器1080处理后,经RF电路1010以发送给比如另一手机,或者将音频数据输出至存储器1020以便进一步处理。
WiFi属于短距离无线传输技术,手机通过WiFi模块1070可以帮助用户收发电子邮件、浏览网页和访问流式媒体等,它为用户提供了无线的宽带互联网访问。虽然图11示出了WiFi模块1070,但是可以理解的是,其并不属于手机的必须构成,完全可以根据需要在不改变发明的本质的范围 内而省略。
处理器1080是手机的控制中心,利用各种接口和线路连接整个手机的各个部分,通过运行或执行存储在存储器1020内的软件程序和/或模块,以及调用存储在存储器1020内的数据,执行手机的各种功能和处理数据,从而对手机进行整体监控。可选的,处理器1080可包括一个或多个处理单元;优选的,处理器1080可集成应用处理器和调制解调处理器,其中,应用处理器主要处理操作系统、用户界面和应用程序等,调制解调处理器主要处理无线通信。可以理解的是,上述调制解调处理器也可以不集成到处理器1080中。
手机还包括给各个部件供电的电源1090(比如电池),优选的,电源可以通过电源管理系统与处理器1080逻辑相连,从而通过电源管理系统实现管理充电、放电、以及功耗管理等功能。
手机还可以包括摄像头1011,该摄像头1011可以是手机的前置摄像头,摄像头1011在采集到多帧人脸图像之后,由处理器1080对多帧人脸图像进行处理。在本申请实施例中,该终端所包括的处理器1080还具有控制执行以上由终端执行的应用场景的交互方法流程。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请实施例提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。本领域普通技术人员在不付出创造性劳动的情况下,即可以理解并实施。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请实施例可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来 实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请实施例而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请实施例的技术方案本质上或者说对相关技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述的方法。
综上所述,以上实施例仅用以说明本申请实施例的技术方案,而非对其限制;尽管参照上述实施例对本申请进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对上述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围。
工业实用性
在本申请实施例中,通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,第一帧图像和第二帧图像是分别拍摄得到的前后相邻的两帧图像,将第一帧图像和第二帧图像进行对比,得到目标人脸的动作和相应的幅度,根据目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,交互式应用场景中显示有模拟对象和可交互物件,根据控制指令控制模拟对象在交互式应用场景中与可交互物件进行交互。本申请实施例可以依据摄像头实时拍摄的多帧图像的比对结果获取到人脸的动作以及幅度,进而可以生成模拟对象的控制指令,通过该控制指令实现模拟对象与可交互物件的交互,本申请实施例中可以依赖于用户的人脸表情进行场景交互,而不通过用户的手指来下发指令,因此可以实现在移动终端上的沉浸式交互。

Claims (16)

  1. 一种应用场景的交互方法,包括:
    移动终端通过所述移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,其中,所述第一帧图像和所述第二帧图像是分别拍摄得到的前后相邻的两帧图像;
    所述移动终端将所述第一帧图像和所述第二帧图像进行对比,得到所述目标人脸的动作和相应的幅度;
    所述移动终端根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,所述交互式应用场景中显示有所述模拟对象和可交互物件;
    所述移动终端根据所述控制指令控制所述模拟对象在所述交互式应用场景中与所述可交互物件进行交互。
  2. 根据权利要求1所述的方法,其中,所述移动终端所述将所述第一帧图像和所述第二帧图像进行对比,得到所述目标人脸的动作和相应的幅度,包括:
    确定脸部定位点出现在所述第一帧图像中的第一像素位置,以及所述脸部定位点出现在所述第二帧图像中的第二像素位置;
    将所述第一像素位置和所述第二像素位置进行对比,得到所述第一像素位置和所述第二像素位置之间的相对位移;
    根据所述第一像素位置和所述第二像素位置之间的相对位移确定所述目标人脸的动作和相应的幅度。
  3. 根据权利要求1所述的方法,其中,所述移动终端通过所述移动终端配置的摄像头对目标人脸进行实时的图像采集之前,所述方法还包括:
    所述移动终端检测所述移动终端的触摸屏幕上是否产生有触摸输入;
    当所述触摸屏幕上没有产生触摸输入时,所述移动终端触发执行如下步骤:通过所述移动终端配置的摄像头对目标人脸进行实时的图像采集。
  4. 根据权利要求1所述的方法,其中,所述移动终端根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令 之后,所述方法还包括:
    所述移动终端根据所述控制指令确定焦点是否在可交互物件范围内保持稳定达到预设的时长,所述焦点为所述目标人脸映射在所述交互式应用场景中的参考点;
    当所述焦点在所述可交互物件范围内保持稳定达到预设的时长时,所述移动终端从所述可交互物件范围内锁定出可交互物件。
  5. 根据权利要求1所述的方法,其中,所述移动终端根据所述控制指令控制所述模拟对象在所述交互式应用场景中与所述可交互物件进行交互,包括:
    实时计算焦点与所述可交互物件之间的距离,所述焦点为所述目标人脸映射在所述交互式应用场景中的参考点;
    根据实时计算出的所述距离确定所述焦点是否在可交互物件范围内;
    当所述焦点在所述可交互物件范围内时,根据实时计算出的所述距离更新所述焦点对应的位移速率;
    根据更新后的所述位移速率更新所述控制指令,并使用更新后的控制指令控制所述模拟对象与所述可交互物件进行交互。
  6. 根据权利要求5所述的方法,其中,所述根据实时计算出的所述距离更新所述焦点对应的位移速率,包括:
    当所述焦点与所述可交互物件之间的距离减小时,在所述焦点的移动方向上先减少所述位移速率再增加所述位移速率;或者,
    当所述焦点与所述可交互物件之间的距离增大时,在所述焦点的移动方向上先减少所述位移速率,再在所述移动方向的相反方向上增加所述位移速率。
  7. 根据权利要求4至6中任一项所述的方法,其中,所述方法还包括:
    所述移动终端确定实时采集到的每一帧图像上的多个人脸关键点像素坐标;
    所述移动终端根据所述多个人脸关键点像素坐标判断所述目标人脸是否失去焦点;
    当所述目标人脸失去焦点时,所述移动终端对所述目标人脸进行对焦 校正。
  8. 一种移动终端,包括一个或多个处理器,以及一个或多个存储程序单元的存储器,其中,所述程序单元由所述处理器执行,所述程序单元包括:
    图像采集模块,被设置为通过移动终端配置的摄像头对目标人脸进行实时的图像采集,得到第一帧图像和第二帧图像,其中,所述第一帧图像和所述第二帧图像是分别拍摄得到的前后相邻的两帧图像;
    对比模块,被设置为将所述第一帧图像和所述第二帧图像进行对比,得到所述目标人脸的动作和相应的幅度;
    指令生成模块,被设置为根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令,所述交互式应用场景中显示有所述模拟对象和可交互物件;
    交互模块,被设置为根据所述控制指令控制所述模拟对象在所述交互式应用场景中与所述可交互物件进行交互。
  9. 根据权利要求8所述的移动终端,其中,所述对比模块,包括:
    像素位置确定单元,被设置为确定脸部定位点出现在所述第一帧图像中的第一像素位置,以及所述脸部定位点出现在所述第二帧图像中的第二像素位置;
    位移确定单元,被设置为将所述第一像素位置和所述第二像素位置进行对比,得到所述第一像素位置和所述第二像素位置之间的相对位移;
    动作确定单元,被设置为根据所述第一像素位置和所述第二像素位置之间的相对位移确定所述目标人脸的动作和相应的幅度。
  10. 根据权利要求8所述的移动终端,其中,所述程序单元还包括:
    触摸检测模块,被设置为所述图像采集模块通过移动终端配置的摄像头对目标人脸进行实时的图像采集之前,检测所述移动终端的触摸屏幕上是否产生有触摸输入;当所述触摸屏幕上没有产生触摸输入时,触发执行所述图像采集模块。
  11. 根据权利要求8所述的移动终端,其中,所述程序单元还包括:
    焦点检测模块,被设置为所述指令生成模块根据所述目标人脸的动作和相应的幅度生成模拟对象在交互式应用场景中的控制指令之后,根据所 述控制指令确定焦点是否在可交互物件范围内保持稳定达到预设的时长,所述焦点为所述目标人脸映射在所述交互式应用场景中的参考点;
    物件锁定模块,被设置为当所述焦点在所述可交互物件范围内保持稳定达到预设的时长时,从所述可交互物件范围内锁定出可交互物件。
  12. 根据权利要求8所述的移动终端,其中,所述交互模块包括:
    距离计算单元,被设置为实时计算焦点与所述可交互物件之间的距离,所述焦点为所述目标人脸映射在所述交互式应用场景中的参考点;
    范围确定单元,被设置为根据实时计算出的所述距离确定所述焦点是否在可交互物件范围内;
    速率更新单元,被设置为当所述焦点在所述可交互物件范围内时,根据实时计算出的所述距离更新所述焦点对应的位移速率;
    交互单元,被设置为根据更新后的所述位移速率更新所述控制指令,并使用更新后的控制指令控制所述模拟对象与所述可交互物件进行交互。
  13. 根据权利要求12所述的移动终端,其中,所述速率更新单元,具体被设置为当所述焦点与所述可交互物件之间的距离减小时,在所述焦点的移动方向上先减少所述位移速率再增加所述位移速率;或者,当所述焦点与所述可交互物件之间的距离增大时,在所述焦点的移动方向上先减少所述位移速率,再在所述移动方向的相反方向上增加所述位移速率。
  14. 根据权利要求11至13中任一项所述的移动终端,其中,所述程序单元还包括:
    关键点采集模块,被设置为确定实时采集到的每一帧图像上的多个人脸关键点像素坐标;
    脱焦判定模块,被设置为根据所述多个人脸关键点像素坐标判断所述目标人脸是否失去焦点;
    对焦模块,被设置为当所述目标人脸失去焦点时,对所述目标人脸进行对焦校正。
  15. 一种计算机可读存储介质,包括指令,当其在计算机上运行时,使得计算机执行如权利要求1至7任意一项所述的方法。
  16. 一种移动终端,所述移动终端包括:处理器和存储器;
    所述存储器,被设置为存储指令;
    所述处理器,被设置为执行所述存储器中的所述指令,执行如权利要求1至7中任一项所述的方法。
PCT/CN2019/091402 2018-08-28 2019-06-14 一种应用场景的交互方法和移动终端以及存储介质 WO2020042727A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2020572602A JP7026267B2 (ja) 2018-08-28 2019-06-14 アプリケーションシーンにおけるインタラクティブ方法並びにその方法を実行するモバイル端末及びコンピュータプログラム
EP19854820.8A EP3845282A4 (en) 2018-08-28 2019-06-14 INTERACTION PROCEDURES FOR APPLICATION SCENARIO, MOBILE DEVICE AND STORAGE MEDIUM
US17/027,038 US11383166B2 (en) 2018-08-28 2020-09-21 Interaction method of application scene, mobile terminal, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810989371.5 2018-08-28
CN201810989371.5A CN109224437A (zh) 2018-08-28 2018-08-28 一种应用场景的交互方法和终端以及存储介质

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/027,038 Continuation US11383166B2 (en) 2018-08-28 2020-09-21 Interaction method of application scene, mobile terminal, and storage medium

Publications (1)

Publication Number Publication Date
WO2020042727A1 true WO2020042727A1 (zh) 2020-03-05

Family

ID=65068815

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/091402 WO2020042727A1 (zh) 2018-08-28 2019-06-14 一种应用场景的交互方法和移动终端以及存储介质

Country Status (5)

Country Link
US (1) US11383166B2 (zh)
EP (1) EP3845282A4 (zh)
JP (1) JP7026267B2 (zh)
CN (1) CN109224437A (zh)
WO (1) WO2020042727A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113209622A (zh) * 2021-05-28 2021-08-06 北京字节跳动网络技术有限公司 动作的确定方法、装置、可读介质和电子设备

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109224437A (zh) * 2018-08-28 2019-01-18 腾讯科技(深圳)有限公司 一种应用场景的交互方法和终端以及存储介质
CN110244775A (zh) * 2019-04-29 2019-09-17 广州市景沃电子有限公司 基于移动设备夹持云台的自动跟踪方法及装置
CN110197171A (zh) * 2019-06-06 2019-09-03 深圳市汇顶科技股份有限公司 基于用户的动作信息的交互方法、装置和电子设备
CN110705510B (zh) * 2019-10-16 2023-09-05 杭州优频科技有限公司 一种动作确定方法、装置、服务器和存储介质
CN111013135A (zh) * 2019-11-12 2020-04-17 北京字节跳动网络技术有限公司 一种交互方法、装置、介质和电子设备
CN111068308A (zh) * 2019-11-12 2020-04-28 北京字节跳动网络技术有限公司 基于嘴部动作的数据处理方法、装置、介质和电子设备
CN111768474B (zh) * 2020-05-15 2021-08-20 完美世界(北京)软件科技发展有限公司 动画生成方法、装置、设备
CN112843693B (zh) * 2020-12-31 2023-12-29 上海米哈游天命科技有限公司 拍摄图像的方法、装置、电子设备及存储介质
CN114415929B (zh) * 2022-01-24 2024-10-11 维沃移动通信有限公司 电子设备的控制方法、装置、电子设备和可读存储介质
CN114779937A (zh) * 2022-04-28 2022-07-22 脑陆(重庆)智能科技研究院有限公司 多媒体互动成像方法、装置、存储介质和计算机程序产品
CN115185381A (zh) * 2022-09-15 2022-10-14 北京航天奥祥通风科技股份有限公司 基于头部的运动轨迹控制终端的方法和装置
CN115830518B (zh) * 2023-02-15 2023-05-09 南京瀚元科技有限公司 一种红外场景下电力巡检视频智能抽帧的方法
CN117528254B (zh) * 2023-11-06 2024-05-31 北京中数文化科技有限公司 一种基于ai智能终端的交互场景照片制作系统

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105307737A (zh) * 2013-06-14 2016-02-03 洲际大品牌有限责任公司 互动视频游戏
CN105630169A (zh) * 2015-12-25 2016-06-01 北京像素软件科技股份有限公司 一种体感输入方法及装置
CN108153422A (zh) * 2018-01-08 2018-06-12 维沃移动通信有限公司 一种显示对象控制方法和移动终端
CN108255304A (zh) * 2018-01-26 2018-07-06 腾讯科技(深圳)有限公司 基于增强现实的视频数据处理方法、装置和存储介质
CN109224437A (zh) * 2018-08-28 2019-01-18 腾讯科技(深圳)有限公司 一种应用场景的交互方法和终端以及存储介质

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4553346B2 (ja) * 2003-10-22 2010-09-29 キヤノン株式会社 焦点調節装置及び焦点調節方法
CN101393599B (zh) * 2007-09-19 2012-02-08 中国科学院自动化研究所 一种基于人脸表情的游戏角色控制方法
CN103442201B (zh) 2007-09-24 2018-01-02 高通股份有限公司 用于语音和视频通信的增强接口
WO2010132568A1 (en) * 2009-05-13 2010-11-18 Wms Gaming, Inc. Player head tracking for wagering game control
JP2013039231A (ja) * 2011-08-16 2013-02-28 Konami Digital Entertainment Co Ltd ゲーム装置、ゲーム装置の制御方法、ならびに、プログラム
US10223838B2 (en) * 2013-03-15 2019-03-05 Derek A. Devries Method and system of mobile-device control with a plurality of fixed-gradient focused digital cameras
JP5796052B2 (ja) * 2013-09-24 2015-10-21 レノボ・イノベーションズ・リミテッド(香港) 画面表示制御方法、画面表示制御方式、電子機器及びプログラム
US9857591B2 (en) * 2014-05-30 2018-01-02 Magic Leap, Inc. Methods and system for creating focal planes in virtual and augmented reality
JP6290754B2 (ja) * 2014-09-11 2018-03-07 株式会社パスコ 仮想空間表示装置、仮想空間表示方法及びプログラム
CN104656893B (zh) * 2015-02-06 2017-10-13 西北工业大学 一种信息物理空间的远程交互式操控系统及方法
CN106249882B (zh) * 2016-07-26 2022-07-12 华为技术有限公司 一种应用于vr设备的手势操控方法与装置
CN106951069A (zh) * 2017-02-23 2017-07-14 深圳市金立通信设备有限公司 一种虚拟现实界面的控制方法及虚拟现实设备
US10761649B2 (en) * 2018-02-27 2020-09-01 Perfect Shiny Technology (Shenzhen) Limited Touch input method and handheld apparatus using the method

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105307737A (zh) * 2013-06-14 2016-02-03 洲际大品牌有限责任公司 互动视频游戏
CN105630169A (zh) * 2015-12-25 2016-06-01 北京像素软件科技股份有限公司 一种体感输入方法及装置
CN108153422A (zh) * 2018-01-08 2018-06-12 维沃移动通信有限公司 一种显示对象控制方法和移动终端
CN108255304A (zh) * 2018-01-26 2018-07-06 腾讯科技(深圳)有限公司 基于增强现实的视频数据处理方法、装置和存储介质
CN109224437A (zh) * 2018-08-28 2019-01-18 腾讯科技(深圳)有限公司 一种应用场景的交互方法和终端以及存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3845282A4 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113209622A (zh) * 2021-05-28 2021-08-06 北京字节跳动网络技术有限公司 动作的确定方法、装置、可读介质和电子设备

Also Published As

Publication number Publication date
US11383166B2 (en) 2022-07-12
JP7026267B2 (ja) 2022-02-25
US20210001228A1 (en) 2021-01-07
EP3845282A1 (en) 2021-07-07
JP2021516836A (ja) 2021-07-08
EP3845282A4 (en) 2021-10-27
CN109224437A (zh) 2019-01-18

Similar Documents

Publication Publication Date Title
WO2020042727A1 (zh) 一种应用场景的交互方法和移动终端以及存储介质
US10318011B2 (en) Gesture-controlled augmented reality experience using a mobile communications device
US20220076000A1 (en) Image Processing Method And Apparatus
WO2021135601A1 (zh) 辅助拍照方法、装置、终端设备及存储介质
EP3617995A1 (en) Augmented reality processing method, object recognition method, and related apparatus
US11623142B2 (en) Data processing method and mobile terminal
US9268404B2 (en) Application gesture interpretation
US9400548B2 (en) Gesture personalization and profile roaming
WO2020108261A1 (zh) 拍摄方法及终端
US11366528B2 (en) Gesture movement recognition method, apparatus, and device
CN108712603B (zh) 一种图像处理方法及移动终端
WO2020020134A1 (zh) 拍摄方法及移动终端
WO2020233323A1 (zh) 显示控制方法、终端设备及计算机可读存储介质
CN109558000B (zh) 一种人机交互方法及电子设备
CN109495616B (zh) 一种拍照方法及终端设备
US12028476B2 (en) Conversation creating method and terminal device
CN111079030A (zh) 一种群组搜索方法及电子设备
WO2024055748A1 (zh) 一种头部姿态估计方法、装置、设备以及存储介质
CN111026562B (zh) 一种消息发送方法及电子设备
CN107807740B (zh) 一种信息输入方法及移动终端
CN107643821B (zh) 一种输入控制方法、装置及电子设备
JP6481360B2 (ja) 入力方法、入力プログラムおよび入力装置
CN113849142B (zh) 图像展示方法、装置、电子设备及计算机可读存储介质
CN113970997B (zh) 一种虚拟键盘展示方法和相关装置
CN113419663A (zh) 控制方法、移动终端及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19854820

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020572602

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019854820

Country of ref document: EP

Effective date: 20210329