CN111443853B - Digital human control method and device - Google Patents

Digital human control method and device

Info

Publication number: CN111443853B
Application number: CN202010220091.5A
Authority: CN (China)
Other versions: CN111443853A (Chinese)
Inventors: 李扬, 郑磊, 李士岩
Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Legal status: Active (granted)

Classifications

    • G: PHYSICS
      • G06: COMPUTING; CALCULATING OR COUNTING
        • G06F: ELECTRIC DIGITAL DATA PROCESSING
          • G06F3/04845: GUI interaction techniques for image manipulation, e.g. dragging, rotation, expansion or change of colour
          • G06F3/016: Input arrangements with force or tactile feedback as computer-generated output to the user
          • G06F3/0486: Drag-and-drop
          • G06F3/0488: GUI interaction using features of the input device, e.g. commands input through traced gestures on a touch-screen or digitiser
          • G06F3/167: Audio in a user interface, e.g. using voice commands for navigating, audio feedback
        • G06Q: ICT SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES
          • G06Q10/02: Reservations, e.g. for tickets, services or events
          • G06Q30/0281: Customer communication at a business location, e.g. providing product or service information, consulting
          • G06Q50/40

Abstract

The embodiment of the present application provides a method and an apparatus for controlling a digital person, relating to the technical field of artificial intelligence. The method includes: a device bearing a first digital person identifies a target user; in the case that the target user is determined to have an uncompleted first task, the first digital person is controlled to execute the first task, where the first task was generated by a trigger operation of the target user on a device bearing a second digital person; and the first digital person executing the first task is displayed. In this way, multi-device interaction is realized and a more comprehensive and convenient service is provided to the user.

Description

Digital human control method and device
Technical Field
The present application relates to artificial intelligence in the field of data processing technologies, and in particular, to a method and an apparatus for controlling a digital human.
Background
At present, robots can be placed in venues such as shopping malls and exhibition halls, and users can learn about related services from videos or audio played by the robots and through voice interaction with them.
However, the interaction mode between such robots and users is relatively fixed, the robots' actions are rigid, and the experience lacks a human touch.
Disclosure of Invention
The embodiments of the present application provide a method and an apparatus for controlling a digital person, which aim to solve the technical problem that, in the prior art, the interaction between robots and users is fixed and rigid and lacks humanization.
A first aspect of the embodiments of the present application provides a method for controlling a digital person, applied to a device bearing a first digital person. The method includes: identifying a target user; in the case that the target user is determined to have an uncompleted first task, controlling the first digital person to execute the first task, where the first task was triggered and generated by the target user on a device bearing a second digital person; and displaying the first digital person performing the first task. Thus, if a user leaves an uncompleted first task on one digital human device, then when the user interacts with another digital person, the other digital person can automatically identify the target user, acquire the first task, and continue executing it, realizing multi-device interaction and providing a more comprehensive and convenient service to the user.
In a possible implementation manner, the method further includes:
in the case that a trigger operation of the target user on a target object is received in a graphical user interface, acquiring position information of the target object in the graphical user interface; and controlling the first digital person to contact the target object according to the position information. In this way, the interaction between the digital person and the user-triggered target object can be controlled on the basis of the user's trigger operation in the graphical user interface, improving the interaction between the digital person and the user.
In a possible implementation manner, the method further includes:
in the case that a touch operation or a slide operation by the first digital person on a first object in the graphical user interface is detected, controlling the first object to change position in the graphical user interface along with the motion of the first digital person. In this way, the interaction between the digital person and GUI objects can be controlled on the basis of the first digital person's operations in the graphical user interface, enabling a more humanized service.
In one possible implementation, the controlling the first digital person to contact the target object according to the position information includes:
controlling the first digital person to transition from a first action to a second action according to the position information, where the first action is the action currently performed by the first digital person in the first task, and the second action is an action of touching or sliding the target object.
In one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the controlling the first digital person to transition from a first action to a second action includes:
controlling the digital person to transition from the first action to the second action in a fusion mode when no conflicting skeletal points exist between the first action and the second action, where, in the fusion mode, the digital person is controlled to execute a fused action computed from the positions of the skeletal points during the first action, the positions of the skeletal points during the second action, and the rules of human body motion. This allows for a fluid motion transition, close to that of a real human.
In one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the controlling the digital person to transition from a first action to a second action includes:
controlling the digital person to transition from the first action to the second action in a connection mode when conflicting skeletal points exist between the first action and the second action, where, in the connection mode, a motion path is calculated between the positions of the skeletal points during the first action and their positions during the second action, and the digital person is controlled to transition from the first action to the second action along that path. This likewise allows for a fluid motion transition, close to that of a real human.
In one possible implementation, the determining that the target user has an uncompleted first task includes: retrieving, from a database and based on the identifier of the target user, the uncompleted first task of the target user.
In one possible implementation, the determining that the target user has an incomplete first task includes: receiving the first task from the second digital person.
A second aspect of the embodiments of the present application provides a control apparatus for a digital person, which is applied to a device for carrying a first digital person, the apparatus including:
the processing module is used for identifying a target user;
the processing module is further configured to control the first digital person to execute the first task in the case that the target user is determined to have an uncompleted first task, where the first task was triggered and generated by the target user on a device bearing a second digital person; and
and the display module is used for displaying the first digital person executing the first task.
In a possible implementation manner, the processing module is further configured to:
acquire, in the case that a trigger operation of the target user on a target object is received in a graphical user interface, the position information of the target object in the graphical user interface;
and control the first digital person to contact the target object according to the position information.
In a possible implementation manner, the processing module is further configured to:
in the case that a touch operation or a slide operation by the first digital person on a first object in the graphical user interface is detected, control the first object to change position in the graphical user interface along with the motion of the first digital person.
In a possible implementation manner, the processing module is specifically configured to:
control the first digital person to transition from a first action to a second action according to the position information, where the first action is the action currently performed by the first digital person in the first task, and the second action is an action of touching or sliding the target object.
In one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the processing module is specifically configured to:
control the digital person to transition from the first action to the second action in a fusion mode when no conflicting skeletal points exist between the first action and the second action, where, in the fusion mode, the digital person is controlled to execute a fused action computed from the positions of the skeletal points during the first action, the positions of the skeletal points during the second action, and the rules of human body motion.
In one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the processing module is specifically configured to:
control the digital person to transition from the first action to the second action in a connection mode when conflicting skeletal points exist between the first action and the second action, where, in the connection mode, a motion path is calculated between the positions of the skeletal points during the first action and their positions during the second action, and the digital person is controlled to transition from the first action to the second action along that path.
In a possible implementation manner, the processing module is specifically configured to:
retrieve, from a database and based on the identifier of the target user, the uncompleted first task of the target user.
In a possible implementation manner, the processing module is specifically configured to:
receiving the first task from the second digital person.
A third aspect of the embodiments of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the preceding first aspects.
A fourth aspect of embodiments of the present application provides a non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of the preceding first aspects.
In summary, compared with the prior art, the embodiments of the present application have the following beneficial effects:
The embodiments of the present application provide a method and an apparatus for controlling a digital person. If a user leaves an uncompleted first task on one digital human device, then when the user interacts with another digital person, the other digital person can automatically identify the target user, acquire the first task, and continue executing it, realizing multi-device interaction and providing a more comprehensive and convenient service to the user. Specifically, the device bearing the first digital person identifies the target user; in the case that the target user is determined to have an uncompleted first task, the first digital person is controlled to execute the first task, where the first task was triggered and generated by the target user on the device bearing the second digital person; and the first digital person executing the first task is displayed.
Drawings
Fig. 1 is a schematic diagram of an apparatus architecture to which a control method of a digital human provided in an embodiment of the present application is applicable;
FIG. 2 is a schematic diagram of a digital personal device provided by an embodiment of the present application;
fig. 3 is a schematic flowchart of a control method of a digital human provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a digital human control device according to an embodiment of the present application;
fig. 5 is a block diagram of an electronic device for implementing a method of controlling a digital person according to an embodiment of the present application.
Detailed Description
The following describes exemplary embodiments of the present application with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and are to be considered exemplary only. Those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. For clarity and conciseness, descriptions of well-known functions and constructions are omitted below. Provided there is no conflict, the embodiments described below and their features can be combined with each other.
The device bearing the first digital person and the device bearing the second digital person described in the embodiments of the present application may be arranged in different positions or regions; based on the method of these embodiments, multiple digital human devices in different positions or regions can provide continuous service to a user. The images of the first digital person and the second digital person may be the same; the specific images are not limited in the embodiments of the present application.
The device bearing the first digital person and the device bearing the second digital person may each be referred to as a digital human device. A digital human device may be a digital human intelligent interactive air screen or any other electronic device capable of carrying a digital person; the specific device used is not limited in the embodiments of the present application.
Illustratively, taking a digital human intelligent interactive air screen as an example, the device may include a transparent air screen presenting a graphical user interface (GUI), in which controls, switches, and the like for receiving user operations may be set so that the user can perform trigger operations. It can be understood that the specific content of the GUI may be determined according to the actual application scenario and is not specifically limited in the embodiments of the present application. In a possible implementation, the air screen may be mounted on a rotating base so that it rotates to follow the user's position, providing a face-to-face service experience.
Fig. 1 shows a possible technical architecture of a digital human device, which may include a skill layer, an operating system, a hardware layer, a software platform, a capability layer, and a base layer. The skill layer is the user-facing layer and may provide intelligent welcoming, intelligent explanation, intelligent recommendation, interactive marketing, and business handling. The operating system may include, for example, the NIRO OS, the Android operating system, and a voice dialogue system. The hardware layer may include an intelligent recognition chip, a driving server, an organic light-emitting diode (OLED) self-luminous air screen, a machine vision camera, and a 4-microphone (MIC) array. The software platform may include a digital human platform, human-computer interaction capabilities, text and voice customer service, and industry solutions. The capability layer may provide speech recognition, natural language understanding, speech synthesis, video analysis, image recognition, and the like. The base layer may include Baidu Brain and a base cloud.
The digital person described in the embodiments of the present application is a product of digital character technology and artificial intelligence technology. Digital character technologies such as portrait modeling and motion capture give the digital person a vivid and natural appearance, while artificial intelligence technologies such as speech recognition, natural language understanding, and dialogue understanding give it well-rounded cognition, understanding, and expression capabilities. The digital person can use electronic screens, holographic displays, and similar equipment as carriers and interact with users through them.
Illustratively, a digital person can identify the user's identity on the basis of artificial intelligence technology and provide uninterrupted service by combining multiple modes such as natural conversation and a traditional user interface, improving business efficiency and reducing labor cost. In a possible implementation, the artificial intelligence system behind the digital person can also summarize and analyze the information expressed by the user, construct a user portrait, and accurately match user requirements.
In a possible implementation, the digital person can support different images, timbres, and the like; the user can select the specific image of the digital person, or the digital human device can automatically select an image that suits the user's habits on the basis of the user portrait.
In a possible implementation, a user can interact with a digital person through voice, gestures, expressions, or body movements. When the user is within the recognizable area of the digital human device, instructions can be issued directly through voice, gestures, expressions, or body movements, without a wake-up word.
In one possible application scenario, the digital person can be applied to new retail, identifying a user's age, gender, and so on, and then performing targeted marketing, greeting and customer acquisition, commodity recommendation and introduction, and the like on the basis of the user's characteristics. In another possible scenario, the digital person can be arranged in the hall of a financial enterprise to provide business consultation and distribution, intelligent investment advice, credit and borrowing services, and the like. In yet another possible scenario, the digital person can be arranged in an exhibition hall for intelligent welcoming, in-hall explanation, content consultation, and the like. The embodiments of the present application do not limit the specific application scenario of the digital person.
In a possible implementation, the digital person may be a digital object controlled through skeletal point positions. In other words, the motion control of the digital person may adopt logic similar to robot control: by controlling the positions of the digital person's skeletal points, the digital person performs human-like motions and expressions that are natural and smooth, so that the user feels as if being served by a real person.
For example, fig. 2 is a schematic diagram of a possible digital human device: a screen 21 carrying the digital person is disposed on a base 22, and the base can rotate 360 degrees to provide the user with all-around service.
Fig. 3 is a schematic flowchart of a digital human control method according to an embodiment of the present application. The method includes the following steps:
s101: a target user is identified.
In this embodiment of the application, the target user may be a user whose distance from the device bearing the first digital person (hereinafter, the first digital human device) does not exceed a distance threshold, a user that can be identified by the first digital human device, or the like.
In a possible implementation manner, a face recognition module may be disposed in the first digital human device, and when the user is located in a recognition range of the face recognition module, the first digital human device may automatically recognize the target user.
In another possible implementation, the user may issue a command, such as a voice command or a gesture, to the first digital human device, or perform operations such as clicking and sliding in its graphical user interface, whereupon the first digital human device identifies the target user.
It can be understood that, in actual application, the first digital human device may also be triggered to identify the target user in other ways according to the actual application scenario, which is not specifically limited in the embodiments of the present application.
The specific algorithm by which the first digital human device identifies the target user may adopt any feasible technology, and this is likewise not specifically limited in the embodiments of the present application.
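For illustration only, the following Python sketch shows one way step S101 could be structured; the FaceRecognizer interface, the TargetUser type, and the distance threshold are hypothetical and not part of the original disclosure.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TargetUser:
    user_id: str       # hypothetical identifier, e.g. a face-embedding key
    distance_m: float  # estimated distance from the device, in meters

class FaceRecognizer:
    """Hypothetical wrapper around the device's face-recognition module."""
    def detect(self) -> Optional[TargetUser]:
        ...  # camera capture + recognition; implementation-specific

DISTANCE_THRESHOLD_M = 2.0  # assumed recognition range

def identify_target_user(recognizer: FaceRecognizer) -> Optional[TargetUser]:
    """S101: return a user within the recognizable range, or None."""
    user = recognizer.detect()
    if user is not None and user.distance_m <= DISTANCE_THRESHOLD_M:
        return user
    return None
```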
S102: controlling the first digital person to execute a first task in the case that the target user is determined to have the first task unfinished; the first task is triggered and generated by the target user in a device bearing a second digital person.
In one possible application scenario of this embodiment, the target user may previously have triggered the generation of a first task on the device bearing the second digital person (hereinafter, the second digital human device) and left before the task finished executing, so that the target user has an uncompleted first task.
In another possible application scenario, the target user may previously have triggered the generation of a first task on the second digital human device, where the first task involves content that must be executed at another location or area and therefore must be completed by a digital human device there (e.g., the first digital human device), so that the target user has an uncompleted first task.
In specific implementation, in the case that the target user is determined to have an uncompleted first task, the first digital human device can be controlled to continue executing the first task, realizing multi-device interaction and providing a more comprehensive and convenient service. Moreover, from the user's perspective, the user does not need to issue instructions repeatedly for the same task: the operation steps are simplified, and the experience feels like one digital person serving the user continuously, as the sketch after this paragraph illustrates.
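Continuing the sketch above, steps S101 to S103 could be chained as follows; the device, task_store, and method names are hypothetical placeholders for whatever runtime the digital human device actually exposes.

```python
def continue_unfinished_task(device, recognizer, task_store):
    """S101-S103 as one control flow: identify the user, fetch the
    uncompleted first task, resume it, and display the digital person."""
    user = identify_target_user(recognizer)  # S101 (sketched above)
    if user is None:
        return
    # S102: the first task was generated on the second digital human device
    # and is looked up here by the target user's identifier.
    task = task_store.find_unfinished(user.user_id)
    if task is not None:
        device.digital_person.execute(task)    # resume execution
        device.display(device.digital_person)  # S103: show it to the user
```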
In one possible implementation, if the target user selected the second digital person's image on the second device, the first digital human device may, upon recognizing the target user, automatically set the first digital person's image to be consistent with the second digital person's, better matching the user's preference.
In one possible implementation, if the target user has not performed the operation of selecting a digital human image, a user portrait may be constructed from the user's historical data, and a digital human image adapted to that portrait may be selected for the user, realizing a finer-grained user service.
For example, suppose that at airport A in region A the user instructs the second digital human device: "please reserve a ride-hailing car for 10 minutes after my flight arrives at airport B", and then leaves. The second digital human device may save this uncompleted reservation task, and when the first digital human device at airport B recognizes the user, it may inform the user whether the reservation succeeded and, if so, continue with the license plate number, pick-up location, and other details, completing the follow-up processing of the uncompleted task. In a possible implementation, the specific reservation operation may be performed by the first digital person, the second digital person, a third-party device, or the like; this example is illustrative only and is not intended to limit the specific application scenario.
S103: displaying a first digital person performing the first task.
In this embodiment of the application, the first digital person executing the first task may be displayed on the display screen of the first digital human device, possibly accompanied by voice, expressions, actions, and the like.
In summary, in the method provided by this embodiment, if a user leaves an uncompleted first task on one digital human device, then when the user interacts with another digital person, the other digital person can automatically identify the target user, acquire the first task, and continue executing it, realizing multi-device interaction and providing a more comprehensive and convenient service to the user.
In the embodiment corresponding to fig. 3, in a possible implementation manner, the method further includes:
in the case that a trigger operation of the target user on a target object is received in a graphical user interface, acquiring position information of the target object in the graphical user interface; and controlling the first digital person to contact the target object according to the position information.
In this embodiment of the application, the user's trigger operation in the GUI can influence the first digital person's action. The trigger operation may be, for example, a click or drag on any element (the target object) in the GUI, and the first digital human device may determine the target object's position information from the position of the trigger operation in the GUI. For example, if the trigger operation is a drag, the position information of the target object may be the end position of the drag in the graphical user interface.
The first digital human device may then control the first digital person to contact the target object. For example, in a game scenario in which the first digital person plays a ball-dragging interaction with the user, the user may drag the ball to a position in the GUI; the first digital human device may control the first digital person's hand to move to that position, contact the ball, and drag it.
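A minimal sketch of this trigger-to-contact flow, under the assumption of a hypothetical gui/digital_person API:

```python
def on_target_object_triggered(gui, digital_person, event):
    """Move the first digital person's hand to the triggered target
    object's position in the GUI and make contact with it."""
    if event.kind == "drag":
        x, y = event.end_position  # end position of the drag in the GUI
    else:                          # e.g. a click
        x, y = event.position
    target_object = gui.object_at(x, y)    # resolve the triggered element
    digital_person.move_hand_to(x, y)      # transition toward the object
    digital_person.contact(target_object)  # touch (or go on to drag) it
```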
In this way, the interaction between the digital person and the user-triggered target object can be controlled on the basis of the user's trigger operation in the graphical user interface, improving the interaction between the digital person and the user.
In the embodiment corresponding to fig. 3, in a possible implementation, the method further includes: in the case that a touch operation or a slide operation by the first digital person on a first object in the graphical user interface is detected, controlling the first object to change position in the graphical user interface along with the motion of the first digital person.
In this embodiment of the application, the first digital person's action may drive the motion of an element in the GUI. For example, goods may be displayed in the GUI; when introducing them, the first digital person may carry a first goods item to a display stand, much as a real promoter would. If a touch operation (e.g., holding the first goods) or a slide operation (e.g., dragging the first goods) by the first digital person is detected, the first goods may be controlled to change position along with the first digital person's hand motion, for example following the hand, which is not specifically limited in the embodiments of the present application. In this way, the interaction between the digital person and GUI objects can be controlled on the basis of the first digital person's operations in the graphical user interface, enabling a more humanized service.
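Sketched per-frame, again with hypothetical names, the first object simply tracks the hand while a touch or slide is detected:

```python
def update_followed_object(gui, digital_person, first_object):
    """Per-frame update: while the first digital person touches or slides
    the first object, the object's GUI position follows the hand."""
    if (digital_person.is_touching(first_object)
            or digital_person.is_sliding(first_object)):
        hand_x, hand_y = digital_person.hand_position()
        gui.move(first_object, hand_x, hand_y)  # object follows the hand
```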
In the embodiment corresponding to fig. 3, in a possible implementation, the controlling the first digital person to contact the target object according to the position information includes: controlling the first digital person to transition from a first action to a second action according to the position information, where the first action is the action currently performed by the first digital person in the first task, and the second action is an action of touching or sliding the target object.
In this embodiment of the application, when the user triggers the target object in the GUI, the first digital person may be performing a first action in the first task; the first digital person can then be controlled to transition from the first action to a second action that interacts with the target object, for example touching or sliding it.
In one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the controlling the first digital person to transition from a first action to a second action includes: controlling the digital person to transition from the first action to the second action in a fusion mode when no conflicting skeletal points exist between the first action and the second action, where, in the fusion mode, the digital person is controlled to execute a fused action computed from the positions of the skeletal points during the first action, the positions of the skeletal points during the second action, and the rules of human body motion.
In this embodiment of the application, it may first be determined whether conflicting skeletal points exist between the first action and the second action. For example, if the first action is bending down to pick up an object and the second action is reaching out to receive an apple, fusing the two would produce an unnatural combined action, so it may be determined that conflicting skeletal points exist. It can be understood that, in a specific scenario, any two actions unsuitable for fusion may be treated as having conflicting skeletal points; the embodiments of the present application do not specifically limit how this determination is made.
When no conflicting skeletal points exist between the first action and the second action, the two actions can be fused according to the positions of the skeletal points during the first action, their positions during the second action, and the rules of human body motion, and the digital person can be controlled to execute the fused action.
For example, suppose the first action of the first digital person is smiling, and an operation of the user clicking an apple in the GUI is received, indicating a second action of picking the apple. A fused action, picking the apple while smiling, can then be calculated from the motion rules of a real person performing both and from the positions of the skeletal points in the reach-and-pick action, and the digital person can be controlled to execute the fused action.
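One plausible reading of the fusion mode, sketched with NumPy, is a per-skeletal-point weighted blend; the patent only requires that the rules of human body motion constrain the fused action, so the weighting interface below is an assumption.

```python
import numpy as np

def fuse_actions(pose_first: np.ndarray, pose_second: np.ndarray,
                 weights: np.ndarray) -> np.ndarray:
    """Blend two skeletal poses into one fused pose.

    pose_first:  (N, 3) skeletal point positions during the first action
    pose_second: (N, 3) skeletal point positions during the second action
    weights:     (N, 1) per-point blend weights derived from human-motion
                 rules, e.g. arm points follow the apple-picking action
                 while facial points keep the smiling action
    """
    return weights * pose_second + (1.0 - weights) * pose_first
```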
In the embodiment corresponding to fig. 3, in one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the controlling the digital person to transition from a first action to a second action includes: controlling the digital person to transition from the first action to the second action in a connection mode when conflicting skeletal points exist between the first action and the second action, where, in the connection mode, a motion path is calculated between the positions of the skeletal points during the first action and their positions during the second action, and the digital person is controlled to transition from the first action to the second action along that path.
In this embodiment of the application, when conflicting skeletal points exist between the first action and the second action, the positions through which each skeletal point passes while transitioning can be calculated. According to a preset moving order of the skeletal points, a motion path is obtained between the positions of the skeletal points during the first action and their positions during the second action, and the digital person is controlled to transition from the first action to the second action along that path.
For example, suppose the first action is bending down to pick up an object, and an operation of the user dragging and throwing an apple in the GUI is received, indicating a second action of catching the apple. A motion path can be calculated from the skeletal point positions of the bending action to those of the catching action, so that the digital person transitions naturally from bending down to catching the apple. In this way, a smooth and natural motion transition is obtained.
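A sketch of the connection mode under the same assumptions; plain linear interpolation is shown, while the preset moving order of skeletal points mentioned above could instead stagger the interpolation per joint.

```python
import numpy as np

def transition_path(pose_first: np.ndarray, pose_second: np.ndarray,
                    steps: int = 30) -> list[np.ndarray]:
    """Compute intermediate skeletal poses for transitioning from the
    first action to the second when conflicting skeletal points exist."""
    return [(1.0 - t) * pose_first + t * pose_second
            for t in np.linspace(0.0, 1.0, steps)]
```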
In the embodiment corresponding to fig. 3, in a possible implementation, the determining that the target user has an uncompleted first task includes: retrieving, from a database and based on the identifier of the target user, the uncompleted first task of the target user.
In this embodiment of the application, the identifier of the target user may be any information that can identify the target user, such as a facial image or identity information, which is not specifically limited in the embodiments of the present application.
The second digital human device may store the association between the uncompleted first task and the identifier of the target user in the database; after the first digital human device recognizes the target user, it can retrieve the uncompleted first task from the database according to that identifier.
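A minimal sketch of such a lookup with SQLite; the schema (tasks table, status column) is hypothetical, since the patent only requires that the association between the task and the user identifier is stored in a database.

```python
import sqlite3
from typing import Optional

def find_unfinished_task(db_path: str, user_id: str) -> Optional[tuple]:
    """Retrieve the target user's uncompleted first task, if any,
    keyed by the user identifier stored by the second device."""
    with sqlite3.connect(db_path) as conn:
        return conn.execute(
            "SELECT task_id, payload FROM tasks "
            "WHERE user_id = ? AND status = 'unfinished'",
            (user_id,),
        ).fetchone()
```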
In an embodiment corresponding to fig. 3, in a possible implementation manner, the determining that the target user has an incomplete first task includes: receiving the first task from the second digital person.
In this embodiment, the second digital human device and the first digital human device may also communicate directly, for example through the full-duplex WebSocket protocol, a universal asynchronous receiver/transmitter (UART), or Bluetooth. The second digital human device can then send the first task to the first digital human device, which continues to execute it.
Illustratively, suppose user A issues an instruction at the second digital human device requesting a video call, via the digital human devices, with user B, and the first digital human device is currently serving user B. The second digital human device may send the video-call task to the first digital human device, after which a video call is established between the two devices, completing the task.
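A sketch of the WebSocket hand-off using the third-party websockets package; the message format and port are assumptions, since the patent names WebSocket, UART, and Bluetooth only as transport options.

```python
# Requires the third-party package: pip install websockets
import asyncio
import json
import websockets

async def send_first_task(first_device_url: str, task: dict) -> None:
    """Second digital human device pushes the uncompleted first task."""
    async with websockets.connect(first_device_url) as ws:
        await ws.send(json.dumps({"type": "first_task", "task": task}))

async def serve_tasks(host: str = "0.0.0.0", port: int = 8765) -> None:
    """First digital human device receives tasks and continues them."""
    async def handler(ws):  # single-argument handler (websockets >= 11)
        async for message in ws:
            msg = json.loads(message)
            if msg.get("type") == "first_task":
                print("continuing first task:", msg["task"])  # hand off

    async with websockets.serve(handler, host, port):
        await asyncio.Future()  # run until cancelled

# Example: asyncio.run(send_first_task("ws://first-device:8765",
#                                      {"action": "video_call", "peer": "B"}))
```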
Fig. 4 is a schematic structural diagram of an embodiment of the digital human control apparatus provided in the present application, applied to a device bearing a first digital person. As shown in fig. 4, the apparatus provided in this embodiment includes:
a processing module 41 for identifying a target user;
the processing module 41 is further configured to control the first digital person to execute the first task in the case that the target user is determined to have an uncompleted first task, where the first task was triggered and generated by the target user on a device bearing a second digital person; and
a display module 42 for displaying a first digital person performing the first task.
In a possible implementation manner, the processing module is further configured to:
acquire, in the case that a trigger operation of the target user on a target object is received in a graphical user interface, the position information of the target object in the graphical user interface;
and control the first digital person to contact the target object according to the position information.
In a possible implementation manner, the processing module is further configured to:
in the case that a touch operation or a slide operation by the first digital person on a first object in the graphical user interface is detected, control the first object to change position in the graphical user interface along with the motion of the first digital person.
In a possible implementation manner, the processing module is specifically configured to:
control the first digital person to transition from a first action to a second action according to the position information, where the first action is the action currently performed by the first digital person in the first task, and the second action is an action of touching or sliding the target object.
In one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the processing module is specifically configured to:
control the digital person to transition from the first action to the second action in a fusion mode when no conflicting skeletal points exist between the first action and the second action, where, in the fusion mode, the digital person is controlled to execute a fused action computed from the positions of the skeletal points during the first action, the positions of the skeletal points during the second action, and the rules of human body motion.
In one possible implementation, the digital person is a digital object controlled through skeletal point positions, and the processing module is specifically configured to:
control the digital person to transition from the first action to the second action in a connection mode when conflicting skeletal points exist between the first action and the second action, where, in the connection mode, a motion path is calculated between the positions of the skeletal points during the first action and their positions during the second action, and the digital person is controlled to transition from the first action to the second action along that path.
In a possible implementation manner, the processing module is specifically configured to:
retrieve, from a database and based on the identifier of the target user, the uncompleted first task of the target user.
In a possible implementation manner, the processing module is specifically configured to:
receiving the first task from the second digital person.
The embodiments of the present application thus provide a method and an apparatus for controlling a digital person. If a user leaves an uncompleted first task on one digital human device, then when the user interacts with another digital person, the other digital person can automatically identify the target user, acquire the first task, and continue executing it, realizing multi-device interaction and providing a more comprehensive and convenient service to the user. Specifically, the device bearing the first digital person identifies the target user; in the case that the target user is determined to have an uncompleted first task, the first digital person is controlled to execute the first task, where the first task was triggered and generated by the target user on the device bearing the second digital person; and the first digital person executing the first task is displayed.
The digital human control apparatus provided in the embodiments of the present application can be used to execute the method shown in the corresponding embodiments above; its implementation and principle are the same and are not repeated here.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device for the digital human control method according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant as examples only and are not meant to limit the implementations of the present application described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 501, a memory 502, and interfaces for connecting the components, including high-speed and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to an interface). In other embodiments, multiple processors and/or multiple buses may be used, as desired, along with multiple memories. Likewise, multiple electronic devices may be connected, each providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 501 is taken as an example.
Memory 502 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by at least one processor to cause the at least one processor to perform the method of controlling a digital human provided herein. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the digital human control method provided by the present application.
The memory 502, which is a non-transitory computer-readable storage medium, may be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as program instructions/modules (e.g., the processing module 41 and the display module 42 shown in fig. 4) corresponding to the control method of the digital human in the embodiments of the present application. The processor 501 executes various functional applications of the server and data processing, i.e., implements the digital human control method in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 502.
The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the digital person's control electronics, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 optionally includes memory located remotely from processor 501, which may be connected to the digital person's control electronics via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device of the digital person control method may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the digital person's control electronics, such as a touch screen, keypad, mouse, track pad, touch pad, pointer stick, one or more mouse buttons, track ball, joystick, or other input device. The output devices 504 may include a display device, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solution of the embodiments of the present application, if a new instruction is generated while the digital person is executing a task, and the action corresponding to the new instruction has no conflicting skeleton point with the action the digital person is currently executing, the two actions can be fused and output together. This yields action connection similar to that of a real person, making movements more natural and smooth and making the digital person more human-like. Specifically, the digital person may be controlled to perform a first task; when a first instruction is received, a first action currently executed by the digital person in the first task and a second action indicated by the first instruction are acquired; when no conflicting skeleton point exists between the first action and the second action, the first action and the second action are fused according to the position of each skeleton point when the digital person executes the first action, the position of each skeleton point when the digital person executes the second action, and human body motion rules, to obtain a fused action; and the digital person is controlled to execute the fused action.
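To make the fusion concrete, below is a minimal sketch of one way the conflict test and fusion could be realized, assuming an action is represented as a mapping from skeleton point names to target positions and that two actions "conflict" when they drive the same skeleton point. The `Action` type, the point names, and the merge strategy are illustrative assumptions, not the patented implementation.

```python
# A minimal sketch, assuming an action maps skeleton point names to
# target positions and that two actions "conflict" when they both drive
# the same skeleton point. All names here are illustrative assumptions.
from typing import Dict, Tuple

Position = Tuple[float, float, float]
Action = Dict[str, Position]  # skeleton point name -> target position


def conflicting_points(first: Action, second: Action) -> set:
    """Return the skeleton points that both actions try to drive."""
    return set(first) & set(second)


def fuse(first: Action, second: Action) -> Action:
    """Merge two non-conflicting actions into a single fused action."""
    if conflicting_points(first, second):
        raise ValueError("actions share skeleton points; fusion not applicable")
    fused: Action = dict(first)
    fused.update(second)  # each point keeps the target of the action driving it
    return fused


# Example: a walking action (legs) fused with a waving action (right arm).
walking = {"left_knee": (0.1, 0.5, 0.0), "right_knee": (-0.1, 0.5, 0.0)}
waving = {"right_elbow": (0.3, 1.3, 0.1), "right_wrist": (0.4, 1.5, 0.2)}
print(fuse(walking, waving))  # the digital person walks and waves at once
```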
It should be understood that the flows shown above may be used in various forms, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above-described embodiments should not be construed as limiting the scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the protection scope of the present application.
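Building on the scheme summarized above, the following is a hypothetical end-to-end instruction handler, reusing `conflicting_points` and `fuse` from the sketch above; the `DigitalPerson` interface and instruction format are assumptions for illustration only. The conflicting-point branch defers to the connection manner recited in claims 4 and 10 below; a sketch of that motion-path transition follows the claims.

```python
# Hypothetical instruction handler; the person/instruction interfaces are
# illustrative assumptions. Reuses the conflicting_points() and fuse()
# helpers from the sketch above.

def on_first_instruction(person, instruction):
    first_action = person.current_action()   # action in the first task
    second_action = instruction.action       # action the instruction indicates
    if not conflicting_points(first_action, second_action):
        # No shared skeleton points: fuse and output both actions at once.
        person.execute(fuse(first_action, second_action))
    else:
        # Shared skeleton points: fall back to a connecting transition
        # (see the motion-path sketch after the claims below).
        person.transition(first_action, second_action)
```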

Claims (14)

1. A digital person control method, applied to a device carrying a first digital person, the method comprising:
identifying a target user;
controlling the first digital person to execute a first task in a case where it is determined that the target user has the first task unfinished, wherein the first task is generated by a trigger of the target user on a device carrying a second digital person; and
displaying the first digital person performing the first task;
in a case where a trigger operation of the target user on a target object is received in a graphical user interface, acquiring position information of the target object in the graphical user interface; and
controlling the first digital person to transition from a first action to a second action according to the position information, wherein the first action is an action currently performed by the first digital person in the first task, and the second action is an action of touching or sliding the target object.
2. The method of claim 1, further comprising:
in a case where a touch operation or a slide operation of the first digital person on a first object in the graphical user interface is detected, controlling the first object to change its position in the graphical user interface following the action of the first digital person.
3. The method of claim 1, wherein the first digital person is a digital object controlled based on skeleton point positions, and the controlling the first digital person to transition from a first action to a second action comprises:
controlling the first digital person to transition from the first action to the second action in a fusion manner in a case where no conflicting skeleton point exists between the first action and the second action, wherein the fusion manner comprises controlling the first digital person to execute a fused action according to the position of each skeleton point when the first digital person executes the first action, the position of each skeleton point when the first digital person executes the second action, and human body motion rules.
4. The method of claim 1, wherein the first digital person is a digital object controlled based on skeleton point positions, and the controlling the first digital person to transition from a first action to a second action comprises:
controlling the first digital person to transition from the first action to the second action in a connection manner in a case where a conflicting skeleton point exists between the first action and the second action, wherein the connection manner comprises calculating a motion path between the position of each skeleton point when the first digital person executes the first action and the position of each skeleton point when the first digital person executes the second action, and controlling the first digital person to transition from the first action to the second action along the motion path.
5. The method of claim 1 or 2, wherein the determining that the target user has the first task unfinished comprises:
querying a database, based on an identifier of the target user, to determine that the target user has the unfinished first task.
6. The method of claim 1 or 2, wherein the determining that the target user has the first task unfinished comprises:
receiving the first task from the second digital person.
7. A digital person control apparatus, applied to a device carrying a first digital person, the apparatus comprising:
a processing module configured to identify a target user;
the processing module being further configured to control the first digital person to execute a first task in a case where it is determined that the target user has the first task unfinished, wherein the first task is generated by a trigger of the target user on a device carrying a second digital person; and
a display module configured to display the first digital person performing the first task;
the processing module is further configured to:
acquire, in a case where a trigger operation of the target user on a target object is received in a graphical user interface, position information of the target object in the graphical user interface; and
control the first digital person to transition from a first action to a second action according to the position information, wherein the first action is an action currently performed by the first digital person in the first task, and the second action is an action of touching or sliding the target object.
8. The apparatus of claim 7, wherein the processing module is further configured to:
control, in a case where a touch operation or a slide operation of the first digital person on a first object in the graphical user interface is detected, the first object to change its position in the graphical user interface following the action of the first digital person.
9. The apparatus of claim 7, wherein the first digital person is a digital object controlled based on skeleton point positions, and the processing module is specifically configured to:
control the first digital person to transition from the first action to the second action in a fusion manner in a case where no conflicting skeleton point exists between the first action and the second action, wherein the fusion manner comprises controlling the first digital person to execute a fused action according to the position of each skeleton point when the first digital person executes the first action, the position of each skeleton point when the first digital person executes the second action, and human body motion rules.
10. The apparatus of claim 9, wherein the first digital person is a digital object controlled based on skeleton point positions, and the processing module is specifically configured to:
control the first digital person to transition from the first action to the second action in a connection manner in a case where a conflicting skeleton point exists between the first action and the second action, wherein the connection manner comprises calculating a motion path between the position of each skeleton point when the first digital person executes the first action and the position of each skeleton point when the first digital person executes the second action, and controlling the first digital person to transition from the first action to the second action along the motion path.
11. The apparatus according to claim 7 or 8, wherein the processing module is specifically configured to:
query a database, based on an identifier of the target user, to determine that the target user has the unfinished first task.
12. The apparatus according to claim 7 or 8, wherein the processing module is specifically configured to:
receive the first task from the second digital person.
13. An electronic device, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-6.
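For the connection manner recited in claims 4 and 10, the sketch below interpolates each skeleton point from its position in the first action to its position in the second action. The claims only require calculating a motion path; per-point linear interpolation is used here purely as an illustrative stand-in for real human-body motion rules, and all names are hypothetical.

```python
# Hypothetical sketch of the "connection manner" in claims 4 and 10:
# compute a motion path between the two poses and play it frame by frame.
# Linear interpolation stands in for real human-motion constraints.
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]
Pose = Dict[str, Position]  # skeleton point name -> position


def motion_path(start: Pose, end: Pose, frames: int = 10) -> List[Pose]:
    """Interpolate every skeleton point present in both poses."""
    path: List[Pose] = []
    for i in range(1, frames + 1):
        t = i / frames
        pose = {
            point: tuple(s + t * (e - s)
                         for s, e in zip(start[point], end[point]))
            for point in start.keys() & end.keys()
        }
        path.append(pose)
    return path


# Example: the right wrist conflicts (lowered in the first action, raised
# in the second), so the transition is played as a short motion path.
first_pose = {"right_wrist": (0.4, 0.9, 0.0)}
second_pose = {"right_wrist": (0.4, 1.5, 0.2)}
for pose in motion_path(first_pose, second_pose, frames=3):
    print(pose)
```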
CN202010220091.5A 2020-03-25 2020-03-25 Digital human control method and device Active CN111443853B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010220091.5A CN111443853B (en) 2020-03-25 2020-03-25 Digital human control method and device


Publications (2)

Publication Number Publication Date
CN111443853A CN111443853A (en) 2020-07-24
CN111443853B CN111443853B (en) 2021-07-20

Family

ID=71654560

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010220091.5A Active CN111443853B (en) 2020-03-25 2020-03-25 Digital human control method and device

Country Status (1)

Country Link
CN (1) CN111443853B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112379812B (en) * 2021-01-07 2021-04-23 深圳追一科技有限公司 Simulation 3D digital human interaction method and device, electronic equipment and storage medium
CN116430991A (en) * 2023-03-06 2023-07-14 北京黑油数字展览股份有限公司 Exhibition hall digital person explanation method and system based on mixed reality and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105807929A (en) * 2016-03-10 2016-07-27 沈愉 Virtual person as well as control system and device therefor
CN108491147A (en) * 2018-04-16 2018-09-04 青岛海信移动通信技术股份有限公司 A kind of man-machine interaction method and mobile terminal based on virtual portrait
CN110442438A (en) * 2019-02-26 2019-11-12 北京蓦然认知科技有限公司 Task cooperative method, equipment and system between a kind of more equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10936998B2 (en) * 2018-03-29 2021-03-02 Adp, Llc Metadata-based chat wizard


Also Published As

Publication number Publication date
CN111443853A (en) 2020-07-24

Similar Documents

Publication Publication Date Title
US11080520B2 (en) Automatic machine recognition of sign language gestures
JP2021524629A (en) Transformer mode input fusion for wearable systems
Lee et al. Towards augmented reality driven human-city interaction: Current research on mobile headsets and future challenges
CN112036509A (en) Method and apparatus for training image recognition models
US20210081029A1 (en) Gesture control systems
CN112667068A (en) Virtual character driving method, device, equipment and storage medium
CN111443853B (en) Digital human control method and device
US11294475B1 (en) Artificial reality multi-modal input switching model
CN113325952A (en) Method, apparatus, device, medium and product for presenting virtual objects
Zobl et al. A real-time system for hand gesture controlled operation of in-car devices
CN111913585A (en) Gesture recognition method, device, equipment and storage medium
CN111443854B (en) Action processing method, device and equipment based on digital person and storage medium
CN108549487A (en) Virtual reality exchange method and device
Mohd et al. Multi-modal data fusion in enhancing human-machine interaction for robotic applications: A survey
Lapointe et al. A literature review of AR-based remote guidance tasks with user studies
CN112382291B (en) Voice interaction processing method and device, electronic equipment and storage medium
Ismail et al. Vision-based technique and issues for multimodal interaction in augmented reality
CN111273783B (en) Digital human control method and device
CN112527110A (en) Non-contact interaction method and device, electronic equipment and medium
US11086406B1 (en) Three-state gesture virtual controls
CN112148196A (en) Display method and device of virtual keyboard
CN113498029B (en) Interactive broadcast
CN111309153A (en) Control method and device for man-machine interaction, electronic equipment and storage medium
CN112102447A (en) Image processing method, device, equipment and storage medium
CN112382292A (en) Voice-based control method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant