CN115904059A - Human-computer interaction continuing method, device and system


Info

Publication number
CN115904059A
Authority
CN
China
Prior art keywords
instruction, user, historical, data, dialog
Prior art date
Legal status
Pending
Application number
CN202111165583.XA
Other languages
Chinese (zh)
Inventor
张田 (Zhang Tian)
陈开济 (Chen Kaiji)
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202111165583.XA
Priority to PCT/CN2022/120571 (published as WO2023051379A1)
Publication of CN115904059A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/31 Indexing; Data structures therefor; Storage structures
    • G06F 16/33 Querying
    • G06F 16/335 Filtering based on additional data, e.g. user or group profiles
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/26 Speech to text systems

Abstract

The application discloses a human-computer interaction continuation method, device and system, relates to the field of communication technology, and can realize human-computer interaction continuation in a timely, efficient and accurate manner. In the present application, when a user interacts with a public device (e.g., a first device), the public device associates the corresponding historical dialogue data with the relevant user, for reference when a public device subsequently continues a task. For example, when the second device receives a second instruction, it can determine the user (e.g., the first user) to whom the second instruction relates and retrieve the relevant historical dialogue data accordingly. The second device can then respond to the second instruction in time based on the relevant historical dialogue data and the second instruction, accurately completing the continuation task. In this way, cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be realized in a timely, efficient and accurate manner.

Description

Human-computer interaction continuing method, device and system
Technical Field
The embodiment of the application relates to the technical field of communication, in particular to a method, equipment and a system for continuing human-computer interaction.
Background
With the diversified development of terminal technologies, the same user can use more and more terminal devices. As a result, the scope of a user's human-computer interaction has expanded from one terminal device to multiple terminal devices. For example, in daily life, multiple users at home can interact, anytime, anywhere and across devices, with devices such as smart phones, tablet computers, smart speakers, smart televisions, smart air conditioners, smart refrigerators and smart curtains.
In human-computer interaction scenarios like the above, how to realize human-computer interaction continuation in a timely, efficient and accurate manner becomes a challenging problem. For example, how to timely, efficiently and accurately enable user B to take over the interaction between user A and device A, enable device B to take over the interaction between device A and user A, or enable device B to take over the conversation between device A and user A and then interact with user B, becomes a challenging problem.
Disclosure of Invention
The application provides a human-computer interaction continuation method, device and system, which can realize human-computer interaction continuation in a timely, efficient and accurate manner.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
In a first aspect, a method for continuing human-computer interaction is provided, where the method includes: the first device receives a first instruction, executes a first task indicated by the first instruction, and generates first historical dialogue data, where the first historical dialogue data is related to the first user. When receiving a second instruction, the second device determines that the second instruction is related to the first user; then, the second device acquires historical dialogue data that is related to the first user and includes the first historical dialogue data; finally, the second device executes a second task according to the first historical dialogue data and the second instruction. According to the first aspect, when a user interacts with a public device (e.g., the first device), the public device associates the historical dialogue data with the relevant user, for reference when a public device subsequently continues the task. For example, when the second device receives the second instruction, it can determine the user (e.g., the first user) to whom the second instruction relates and retrieve the relevant historical dialogue data accordingly. The second device can then respond to the second instruction in time based on the relevant historical dialogue data and the second instruction, accurately completing the continuation task. In this way, cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be realized in a timely, efficient and accurate manner.
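Purely as an illustration of this record-then-continue pattern, and not the patent's actual implementation, the flow might be sketched as follows in Python; all names (history, record, latest_history_for, the device and user identifiers) are hypothetical:

    import time

    # In-memory historical dialogue data; each entry is associated with the related user(s).
    history = []

    def record(device_id, user_ids, intent, slots):
        """First device: after executing the first task, generate historical
        dialogue data associated with the first user."""
        history.append({"time": time.time(), "device": device_id,
                        "users": list(user_ids), "intent": intent, "slots": slots})

    def latest_history_for(user_id):
        """Second device: retrieve the most recent historical dialogue data
        related to the determined user, to be combined with the second instruction."""
        related = [h for h in history if user_id in h["users"]]
        return max(related, key=lambda h: h["time"]) if related else None

    # First device executes "play episode_3" for the first user and records it.
    record("smart_tv", ["first_user"], "play_program",
           {"program": "episode_3", "progress_s": 754})

    # Second device, on receiving "continue playing" and determining that it relates
    # to the first user, retrieves the history and can continue the task.
    print(latest_history_for("first_user"))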
In a possible implementation manner, that the first historical dialogue data is related to the first user includes: the first instruction corresponding to the first historical dialogue data is issued by the first user; or the first device detects the first user when receiving the first instruction corresponding to the first historical dialogue data. The scheme provided by the present application can continue tasks corresponding to instructions issued by the same user as well as tasks corresponding to instructions issued by different users, covering a wide range of human-computer interaction continuation scenarios.
In a possible implementation manner, the method further includes: the first device detects a personal portable device of the first user when receiving the first instruction. Illustratively, the first device may determine the related user based on the detected personal portable device. Because of their privacy and portability, personal portable devices are usually carried by their users; therefore, if the first device detects the personal portable device of the first user, it can be determined that the first user is located near the first device. This method can improve the accuracy of detecting the user to whom an instruction relates.
In a possible implementation manner, the first historical dialogue data is stored in a third device, and the acquiring, by the second device, the historical dialogue data related to the first user includes: the second device obtains the historical dialogue data related to the first user from the third device. According to this scheme, historical dialogue data can be stored on the third device, which facilitates unified management, searching, and the like.
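A minimal sketch of such a centralized lookup, assuming the third device is a home hub or cloud server exposing a simple HTTP query interface (the URL, endpoint and JSON response format below are assumptions, not anything specified by the patent):

    import json
    import urllib.request

    # Hypothetical query interface on the third device (e.g., a home hub).
    HUB_URL = "http://home-hub.local:8080/history"

    def fetch_history(user_id):
        """Second device requests, from the third device, the historical
        dialogue data related to the given user."""
        with urllib.request.urlopen(HUB_URL + "?user=" + user_id) as resp:
            return json.loads(resp.read().decode("utf-8"))

    # Example (assuming the hub is reachable):
    # records = fetch_history("first_user")
    # e.g., [{"intent": "play_program", "slots": {"program": "episode_3"}, ...}]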
In a possible implementation manner, the first historical dialogue data is stored in the first device, and the acquiring, by the second device, the historical dialogue data related to the first user includes: the second device obtains the historical dialogue data related to the first user from the first device. According to this scheme, historical dialogue data can be stored on the first device, offering high flexibility at low cost.
In a possible implementation manner, the determining, by the second device when receiving the second instruction, that the second instruction is related to the first user includes: when receiving the second instruction, the second device detects one or more nearby fourth devices to determine the user related to the second instruction, where the one or more fourth devices include a personal portable device of the first user. Illustratively, the second device may determine the related user based on the detected personal portable device. Because of their privacy and portability, personal portable devices are usually carried by their users; therefore, if the second device detects the personal portable device of the first user, it can be determined that the first user is related to the second instruction. This method can improve the accuracy of detecting the user to whom an instruction relates.
In a possible implementation manner, the acquiring, by the second device, the historical dialogue data related to the first user includes: the second device obtains the historical dialogue data related to the first user based on the identification of the personal portable device of the first user. By associating historical dialogue data with the identification of a personal portable device, the second device, having determined the related user through a detected personal portable device, can retrieve the relevant historical dialogue data based on the detected device identification. This method can improve the accuracy of the acquired historical dialogue data.
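The two steps, detecting nearby personal portable devices and then retrieving history keyed by their identifications, could be combined as in the sketch below. All identifiers and mappings are illustrative; in practice the nearby device list might come from a Bluetooth or Wi-Fi scan performed on receipt of the second instruction:

    NEARBY_DEVICE_IDS = ["watch-aa:bb:cc", "tv-remote-01"]   # assumed scan result
    DEVICE_TO_USER = {"watch-aa:bb:cc": "first_user"}        # registered portable devices
    HISTORY_BY_DEVICE_ID = {                                 # history keyed by device identification
        "watch-aa:bb:cc": [{"intent": "play_program",
                            "slots": {"program": "episode_3", "progress_s": 754}}],
    }

    def history_for_nearby_users():
        """Map detected personal portable devices to users, then retrieve the
        historical dialogue data keyed by those device identifications."""
        results = {}
        for dev_id in NEARBY_DEVICE_IDS:
            if dev_id in DEVICE_TO_USER:  # a known personal portable device
                results[DEVICE_TO_USER[dev_id]] = HISTORY_BY_DEVICE_ID.get(dev_id, [])
        return results

    print(history_for_nearby_users())  # {'first_user': [{'intent': 'play_program', ...}]}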
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to continue playing the first program; the executing, by the second device, the second task according to the first historical dialogue data and the second instruction includes: the second device resumes playback of the first program from the playing progress reached on the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, resuming the playback progress of a program (e.g., video or music).
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to switch the program being played; the executing, by the second device, the second task according to the first historical dialogue data and the second instruction includes: the second device plays the program following the first program played by the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, switching the program being played (e.g., video or music). Both playback cases are sketched below.
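A minimal sketch of both playback continuations, assuming a hypothetical history entry format and instruction names (continue_playing, switch_program) chosen here purely for illustration:

    def continue_playback(history_entry, instruction, playlist):
        """Decide the second task from the history and the second instruction:
        resume the first program's progress, or switch to the next program."""
        program = history_entry["slots"]["program"]
        if instruction == "continue_playing":
            # Resume from the progress the first device had reached.
            return {"program": program,
                    "start_at_s": history_entry["slots"].get("progress_s", 0)}
        if instruction == "switch_program":
            # Play the program following the one the first device was playing.
            next_program = playlist[(playlist.index(program) + 1) % len(playlist)]
            return {"program": next_program, "start_at_s": 0}
        raise ValueError("unsupported instruction: " + instruction)

    entry = {"slots": {"program": "episode_3", "progress_s": 754}}
    print(continue_playback(entry, "continue_playing", ["episode_3", "episode_4"]))
    print(continue_playback(entry, "switch_program", ["episode_3", "episode_4"]))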
In a possible implementation, the second device is the same device as the first device. As one scenario, the method provided by the present application is applicable to scenarios in which users continue an interaction on the same device, including scenarios in which multiple users continue an interaction on the same device and scenarios in which the same user continues an interaction on the same device.
In a possible implementation manner, the first instruction and the second instruction are voice instructions or text instructions. As an example, in the scheme provided by the present application, the human-computer interaction mode may be voice interaction or text interaction; this is not limited in the present application.
In one possible implementation, the first historical dialog data is also associated with a second user. Based on the method provided by the application, all users related to the instruction can be determined, for example, the instruction may be related to a plurality of users.
In a possible implementation manner, the method further includes: when receiving a third instruction, the fourth device determines that the third instruction is related to the second user; the fourth device acquires historical dialogue data that is related to the second user and includes the first historical dialogue data; and the fourth device executes a third task according to the first historical dialogue data and the third instruction. When a user interacts with a public device (e.g., the second device), the public device associates the historical dialogue data with the relevant user, for reference when a public device subsequently continues the task. For example, when the fourth device receives the third instruction, it can determine the user (e.g., the second user) to whom the third instruction relates and retrieve the relevant historical dialogue data accordingly. The fourth device can then respond to the third instruction in time based on the relevant historical dialogue data and the third instruction, accurately completing the continuation task. In this way, cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be realized in a timely, efficient and accurate manner.
In a possible implementation manner, the first historical dialogue data characterizes at least the occurrence time of the historical dialogue corresponding to the first task, the intent of the historical dialogue, the identification of the first device, and the user and slots related to the historical dialogue.
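One way such a record could be represented is sketched below; the field names are illustrative only, but the fields themselves follow the characterization above:

    from dataclasses import dataclass
    from typing import Dict, List

    @dataclass
    class DialogRecord:
        """One unit of historical dialogue data (a sketch, not the patent's format)."""
        occurrence_time: float     # occurrence time of the historical dialogue
        intent: str                # intent of the historical dialogue, e.g. "play_program"
        device_id: str             # identification of the first device
        users: List[str]           # user(s) related to the historical dialogue
        slots: Dict[str, object]   # slots of the historical dialogue, e.g. program, progress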
In a possible implementation, the second instruction is also associated with a third user. Based on the method provided by the application, all users related to the instruction can be determined, for example, the instruction may be related to a plurality of users.
In a possible implementation manner, the method further includes: when receiving a fourth instruction, the fifth device determines that the fourth instruction is related to the third user; the fifth device acquires historical dialogue data that is related to the third user and includes second historical dialogue data; and the fifth device executes a fourth task according to the second historical dialogue data and the fourth instruction. When a user interacts with a public device (e.g., the second device), the public device associates the historical dialogue data with the relevant user, for reference when a public device subsequently continues the task. For example, when the fifth device receives the fourth instruction, it can determine the user (e.g., the third user) to whom the fourth instruction relates and retrieve the relevant historical dialogue data accordingly. The fifth device can then respond to the fourth instruction in time based on the relevant historical dialogue data and the fourth instruction, accurately completing the continuation task. In this way, cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be realized in a timely, efficient and accurate manner.
In a second aspect, a method for continuing human-computer interaction is provided, where the method includes: the first device receives a first instruction, executes a first task indicated by the first instruction, and generates first historical dialogue data. Wherein the first historical conversation data is associated with the first user.
According to the scheme provided by the second aspect, when a user interacts with a public device (e.g., the first device), the public device associates the historical dialogue data with the relevant user, for reference when a public device subsequently continues the task, so that cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be achieved in a timely, efficient and accurate manner.
In a possible implementation manner, that the above-mentioned first historical dialogue data is related to the first user includes: the first instruction corresponding to the first historical dialogue data is issued by the first user; or the first device detects the first user when receiving the first instruction corresponding to the first historical dialogue data. The scheme provided by the present application can continue tasks corresponding to instructions issued by the same user as well as tasks corresponding to instructions issued by different users, covering a wide range of human-computer interaction continuation scenarios.
In a possible implementation manner, the method further includes: the first device detects a personal portable device of the first user when receiving the first instruction. Illustratively, the first device may determine the related user based on the detected personal portable device. Because of their privacy and portability, personal portable devices are usually carried by their users; therefore, if the first device detects the personal portable device of the first user, it can be determined that the first user is located near the first device. This method can improve the accuracy of detecting the user to whom an instruction relates.
In one possible implementation, the first historical dialogue data is stored in the third device. According to this scheme, historical dialogue data can be stored on the third device, which facilitates unified management, searching, and the like.
In one possible implementation, the first historical dialogue data is stored in the first device. According to this scheme, historical dialogue data can be stored on the first device, offering high flexibility at low cost.
In a possible implementation manner, the first instruction is a voice instruction or a text instruction. As an example, in the scheme provided by the present application, the human-computer interaction mode may be voice interaction or text interaction; this is not limited in the present application.
In one possible implementation, the first historical dialog data is also associated with a second user. Based on the method provided by the application, all users related to the instruction can be determined, for example, the instruction may be related to a plurality of users.
In a possible implementation manner, the first historical dialogue data characterizes at least the occurrence time of the historical dialogue corresponding to the first task, the intent of the historical dialogue, the identification of the first device, and the user and slots related to the historical dialogue.
In a third aspect, a method for continuing human-computer interaction is provided, where the method includes: the second device determines that the second instruction is related to the first user when receiving the second instruction; then, the second device acquires historical dialogue data related to the first user and comprising the first historical dialogue data; finally, the second device executes a second task according to the first historical dialog data and the second instruction.
According to the scheme provided by the third aspect, when receiving the second instruction, the second device can determine the user (e.g., the first user) to whom the second instruction relates and retrieve the relevant historical dialogue data accordingly. The second device can then respond to the second instruction in time based on the relevant historical dialogue data and the second instruction, accurately completing the continuation task. In this way, cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be realized in a timely, efficient and accurate manner.
In a possible implementation manner, the first historical dialogue data is stored in a third device, and the acquiring, by the second device, the historical dialogue data related to the first user includes: the second device obtains the historical dialogue data related to the first user from the third device. According to this scheme, historical dialogue data can be stored on the third device, which facilitates unified management, searching, and the like.
In a possible implementation manner, the first historical dialogue data is stored in the first device, and the acquiring, by the second device, the historical dialogue data related to the first user includes: the second device obtains the historical dialogue data related to the first user from the first device. According to this scheme, historical dialogue data can be stored on the first device, offering high flexibility at low cost.
In a possible implementation manner, the determining, by the second device when receiving the second instruction, that the second instruction is related to the first user includes: when receiving the second instruction, the second device detects one or more nearby fourth devices to determine the user related to the second instruction, where the one or more fourth devices include a personal portable device of the first user. Illustratively, the second device may determine the related user based on the detected personal portable device. Because of their privacy and portability, personal portable devices are usually carried by their users; therefore, if the second device detects the personal portable device of the first user, it can be determined that the first user is related to the second instruction. This method can improve the accuracy of detecting the user to whom an instruction relates.
In a possible implementation manner, the acquiring, by the second device, the historical dialogue data related to the first user includes: the second device obtains the historical dialogue data related to the first user based on the identification of the personal portable device of the first user. By associating historical dialogue data with the identification of a personal portable device, the second device, having determined the related user through a detected personal portable device, can retrieve the relevant historical dialogue data based on the detected device identification. This method can improve the accuracy of the acquired historical dialogue data.
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to continue playing the first program; the executing, by the second device, the second task according to the first historical dialogue data and the second instruction includes: the second device resumes playback of the first program from the playing progress reached on the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, resuming the playback progress of a program (e.g., video or music).
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to switch the program being played; the executing, by the second device, the second task according to the first historical dialogue data and the second instruction includes: the second device plays the program following the first program played by the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, switching the program being played (e.g., video or music).
In a possible implementation, the second device is the same device as the first device. As one scenario, the method provided by the present application is applicable to scenarios in which users continue an interaction on the same device, including scenarios in which multiple users continue an interaction on the same device and scenarios in which the same user continues an interaction on the same device.
In a possible implementation, the second instruction is a voice instruction or a text instruction. As an example, in the scheme provided by the present application, the human-computer interaction mode may be voice interaction or text interaction; this is not limited in the present application.
In a possible implementation manner, the first historical dialogue data characterizes at least the occurrence time of the historical dialogue corresponding to the first task, the intent of the historical dialogue, the identification of the first device, and the user and slots related to the historical dialogue.
In a possible implementation, the second instruction is also associated with a third user. Based on the method provided by the application, all users related to the instruction can be determined, for example, the instruction may be related to a plurality of users.
In a fourth aspect, a first device is provided, where the first device includes: an instruction detection unit, configured to perform instruction detection, such as voice instruction detection; and a processing unit, configured to determine, when the instruction detection unit detects a second instruction, that the second instruction is related to the first user; obtain historical dialogue data that is related to the first user and includes the first historical dialogue data; and execute a second task according to the first historical dialogue data and the second instruction.
According to the solution provided by the fourth aspect, when receiving the second instruction, the second device can determine the user (e.g., the first user) to whom the second instruction relates and retrieve the relevant historical dialogue data accordingly. It can then respond to the second instruction in time based on the relevant historical dialogue data and the second instruction, accurately completing the continuation task. In this way, cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be realized in a timely, efficient and accurate manner.
In a possible implementation manner, the first historical dialogue data is stored in a third device, and the first device further includes: a transceiver unit, configured to obtain the historical dialogue data related to the first user from the third device. According to this scheme, historical dialogue data can be stored on the third device, which facilitates unified management, searching, and the like.
In a possible implementation manner, the first historical dialogue data is stored in the first device, and the second device further includes: a transceiver unit, configured to obtain the historical dialogue data related to the first user from the first device. According to this scheme, historical dialogue data can be stored on the first device, offering high flexibility at low cost.
In a possible implementation manner, the first device further includes a device discovery unit; the processing unit is specifically configured to, when the instruction detection unit detects the second instruction, detect one or more nearby fourth devices through the device discovery unit to determine the user related to the second instruction, where the one or more fourth devices include a personal portable device of the first user. Illustratively, the second device may determine the related user based on the detected personal portable device. Because of their privacy and portability, personal portable devices are usually carried by their users; therefore, if the second device detects the personal portable device of the first user, it can be determined that the first user is related to the second instruction. This method can improve the accuracy of detecting the user to whom an instruction relates.
In a possible implementation manner, the processing unit is specifically configured to obtain the historical dialogue data related to the first user based on the identification of the personal portable device of the first user. By associating historical dialogue data with the identification of a personal portable device, the second device, having determined the related user through a detected personal portable device, can retrieve the relevant historical dialogue data based on the detected device identification. This method can improve the accuracy of the acquired historical dialogue data.
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to continue playing the first program; the processing unit is specifically configured to resume playback of the first program from the playing progress reached on the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, resuming the playback progress of a program (e.g., video or music).
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to switch the program being played; the processing unit is specifically configured to play the program following the first program played by the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, switching the program being played (e.g., video or music).
In a possible implementation, the second device is the same device as the first device. As one scenario, the method provided by the present application is applicable to scenarios in which users continue an interaction on the same device, including scenarios in which multiple users continue an interaction on the same device and scenarios in which the same user continues an interaction on the same device.
In a possible implementation, the second instruction is a voice instruction or a text instruction. As an example, in the scheme provided by the present application, the human-computer interaction mode may be voice interaction or text interaction; this is not limited in the present application.
In a possible implementation manner, the first historical dialogue data characterizes at least the occurrence time of the historical dialogue corresponding to the first task, the intent of the historical dialogue, the identification of the first device, and the user and slots related to the historical dialogue.
In a fifth aspect, a second device is provided, where the second device includes: an instruction detection unit, configured to perform instruction detection; and a processing unit, configured to determine, when the instruction detection unit detects a second instruction, that the second instruction is related to the first user; obtain historical dialogue data that is related to the first user and includes the first historical dialogue data; and execute a second task according to the first historical dialogue data and the second instruction.
According to the solution provided by the fifth aspect, when receiving the second instruction, the second device can determine the user (e.g., the first user) to whom the second instruction relates and retrieve the relevant historical dialogue data accordingly. The second device can then respond to the second instruction in time based on the relevant historical dialogue data and the second instruction, accurately completing the continuation task. In this way, cross-device, cross-user, and cross-device cross-user human-computer interaction continuation can be realized in a timely, efficient and accurate manner.
In a possible implementation manner, the first historical dialogue data is stored in a third device, and the second device further includes: a transceiver unit, configured to obtain the historical dialogue data related to the first user from the third device. According to this scheme, historical dialogue data can be stored on the third device, which facilitates unified management, searching, and the like.
In a possible implementation manner, the first historical dialogue data is stored in the first device, and the second device further includes: a transceiver unit, configured to obtain the historical dialogue data related to the first user from the first device. According to this scheme, historical dialogue data can be stored on the first device, offering high flexibility at low cost.
In a possible implementation manner, the second device further includes a device discovery unit; the processing unit is specifically configured to: when the second instruction is received, detect one or more nearby fourth devices through the device discovery unit to determine the user related to the second instruction, where the one or more fourth devices include a personal portable device of the first user. Illustratively, the second device may determine the related user based on the detected personal portable device. Because of their privacy and portability, personal portable devices are usually carried by their users; therefore, if the second device detects the personal portable device of the first user, it can be determined that the first user is related to the second instruction. This method can improve the accuracy of detecting the user to whom an instruction relates.
In a possible implementation manner, the processing unit is specifically configured to obtain the historical dialogue data related to the first user based on the identification of the personal portable device of the first user. By associating historical dialogue data with the identification of a personal portable device, the second device, having determined the related user through a detected personal portable device, can retrieve the relevant historical dialogue data based on the detected device identification. This method can improve the accuracy of the acquired historical dialogue data.
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to continue playing the first program; the processing unit is specifically configured to resume playback of the first program from the playing progress reached on the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, resuming the playback progress of a program (e.g., video or music).
In a possible implementation manner, the first historical dialogue data represents that the first task is playing a first program, and the second instruction is used to instruct to switch the program being played; the processing unit is specifically configured to play the program following the first program played by the first device, according to the first historical dialogue data and the second instruction. The scheme provided by the present application is applicable to various human-computer interaction continuation scenarios, for example, switching the program being played (e.g., video or music).
In one possible implementation, the second device is the same device as the first device. As one scenario, the method provided by the present application is applicable to scenarios in which users continue an interaction on the same device, including scenarios in which multiple users continue an interaction on the same device and scenarios in which the same user continues an interaction on the same device.
In a possible implementation, the second instruction is a voice instruction or a text instruction. As an example, in the scheme provided by the present application, the human-computer interaction mode may be voice interaction or text interaction; this is not limited in the present application.
In a sixth aspect, there is provided a first device comprising: a memory for storing a computer program; a transceiver for receiving or transmitting a radio signal; a processor configured to execute the computer program, so that the first device performs the method according to any one of the possible implementation manners of the second aspect.
In a seventh aspect, a second device is provided, which includes: a memory for storing a computer program; a transceiver for receiving or transmitting a radio signal; a processor configured to execute the computer program, so that the second device performs the method according to any one of the possible implementation manners of the third aspect.
In an eighth aspect, there is provided a communication system comprising: the first device in any one of the possible implementations of the fourth aspect or the sixth aspect, and the second device in any one of the possible implementations of the fifth aspect or the seventh aspect. The communication system is adapted to implement a method as in any one of the possible implementations of the first aspect.
In a possible implementation manner, the communication system further includes a third device, and the third device is configured to store and/or manage historical conversation data.
In a possible implementation manner, the communication system further includes a fourth device, where the fourth device is a public device, and the fourth device can receive and respond to an instruction of the user, such as a voice instruction or a text instruction.
In a possible implementation manner, the communication system further includes a fifth device, where the fifth device is a public device, and the fifth device may receive and respond to an instruction of a user, such as a voice instruction or a text instruction.
In a ninth aspect, a computer readable storage medium is provided, having computer program code stored thereon, which, when executed by a processor, causes the processor to implement the method as in any one of the possible implementations of the second or third aspect.
In a tenth aspect, a chip system is provided, where the chip system includes a processor and a memory, and the memory stores computer program code; the computer program code, when executed by the processor, causes the processor to implement the method in any one of the possible implementations of the second aspect or the third aspect. The chip system may consist of a chip, or may include a chip and other discrete components.
In an eleventh aspect, a computer program product is provided that includes computer instructions. The computer instructions, when executed on a computer, cause the computer to implement a method as in any one of the possible implementations of the second aspect or the third aspect.
Drawings
Fig. 1 is an exemplary diagram of a human-computer interaction application scenario provided in an embodiment of the present application;
fig. 2 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application;
fig. 3 is a schematic diagram of a human-computer interaction continuation scene provided in an embodiment of the present application;
fig. 4 is a schematic diagram of another human-computer interaction continuation scene provided in the embodiment of the present application;
fig. 5 is a diagram illustrating a network architecture according to an embodiment of the present application;
fig. 6 is a flowchart of a human-computer interaction continuation method provided in an embodiment of the present application;
fig. 7 is a flowchart of another human-computer interaction continuation method provided in an embodiment of the present application;
fig. 8 is a first exemplary diagram of a human-computer interaction continuation scenario provided in an embodiment of the present application;
fig. 9 is a first interaction flowchart of human-computer interaction continuation provided in an embodiment of the present application;
fig. 10 is a second exemplary diagram of a human-computer interaction continuation scenario provided in an embodiment of the present application;
fig. 11 is a second interaction flowchart of human-computer interaction continuation provided in an embodiment of the present application;
fig. 12 is a third exemplary diagram of a human-computer interaction continuation scenario provided in an embodiment of the present application;
fig. 13 is a third interaction flowchart of human-computer interaction continuation provided in an embodiment of the present application;
fig. 14 is a block diagram of a terminal device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. In the description of the embodiments herein, "/" means "or" unless otherwise specified; for example, A/B may mean A or B. "And/or" herein merely describes an association between associated objects, and indicates that three relationships may exist; for example, A and/or B may mean: A exists alone, both A and B exist, or B exists alone. In addition, in the description of the embodiments of the present application, "a plurality" means two or more.
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present embodiment, "a plurality" means two or more unless otherwise specified.
The embodiment of the application provides a human-computer interaction continuing method, which is applied to a human-computer interaction process of multiple devices and/or multiple users. For example, the method may be applied in a cross-device and/or cross-user human-machine interaction process.
In the embodiment of the present application, "human" in "human-computer interaction" refers to one or more users, and "machine" refers to one or more terminal devices. The embodiment of the application does not limit the specific way of man-machine interaction. For example, the human-computer interaction mode may include, but is not limited to, a voice interaction mode, a gesture interaction mode, a touch interaction mode, a key interaction mode, a somatosensory interaction mode, a facial expression interaction mode and the like. For a detailed description of the man-machine interaction mode, reference may be made to conventional technologies, which are not described herein.
The terminal device can support a human-computer interaction function (such as a voice interaction function). For example, the terminal device includes a voice interaction functional unit therein, which is used for supporting human-computer voice interaction. For another example, a voice assistant application (hereinafter referred to as "voice assistant") may be installed in the terminal device to support voice interaction such as intelligent conversation or instant question and answer with the user through the voice assistant. For another example, a man-machine interaction service (e.g., a voice interaction service) may be integrated into an application (e.g., a video application, a music application, etc.) installed in the terminal device. For a specific implementation manner of the human-computer interaction function, reference may be made to conventional technologies, which are not described herein in detail.
As an example, the method provided by the embodiments of the present application may be applied to a human-computer interaction process between the same user and multiple devices, for example, a scenario in which device A takes over a historical conversation between device B and user A and interacts with user A.
As another example, the method provided by the embodiments of the present application may be applied to a human-computer interaction process between multiple users and the same device, for example, a scenario in which user A takes over a historical conversation between user B and device A and interacts with device A.
As another example, the method provided by the embodiments of the present application may be applied to a cross human-computer interaction process between multiple users and multiple devices, for example, a scenario in which user A takes over a historical conversation between user B and device A and interacts with device B.
For example, in the embodiments of the present application, the process of human-computer interaction among multiple devices and/or multiple users may occur in a smart home scenario or a driving scenario. For example, in a smart home scenario, multiple users (e.g., the male and female homeowners) may interact across multiple smart home devices anytime and anywhere. For another example, in a driving scenario, multiple users may interact anytime across devices such as smart phones, tablet computers, smart air conditioners, and in-vehicle devices (such as a navigator, a smart speaker, and a smart large-screen player). The present application does not limit the specific human-computer interaction scenario.
Taking the smart home scenario as an example, the smart home scenario may include, but is not limited to, one or more portable terminal devices (also referred to as "personal portable devices", hereinafter "portable devices") and a plurality of smart home devices. As shown in fig. 1, the smart home scenario includes portable device 1 and portable device 2; devices for entertainment, such as a smart television and a smart speaker; devices for health, such as a smart fresh air system, a smart air conditioner, a smart refrigerator, a smart air purifier, and a smart air quality sensor; and a smart desk lamp for lighting.
In the embodiments of the present application, fig. 1 is only an example of a smart home scenario, and the present application does not limit the specific architecture of the smart home scenario. For example, the smart home system may further include smart home devices with other functions, such as security devices (e.g., an alarm and a smart door lock); kitchen appliances (e.g., dishwashers and disinfection cabinets); home-decoration devices (e.g., motorized window shades); or cleaning devices (e.g., a sweeping robot).
For another example, in one possible architecture, the smart home scenario may also include one or more routers. A router may also be referred to as a smart host or home gateway. Routers are used to connect two or more network hardware devices and act as gateways between networks; they are specialized intelligent network devices that read the address of each packet and decide how to forward it. In a smart home scenario, the router connects wirelessly to the host, so that a user can conveniently use a portable device such as a smart phone to manage and control different smart home devices. Generally, the router can provide a Wi-Fi hotspot, and the smart home devices and portable devices can access the Wi-Fi network through the Wi-Fi hotspot provided by the router.
In some embodiments, the portable device may be used to control smart home devices via wireless communication technologies, for example, directly via Bluetooth, infrared, or wireless fidelity (Wi-Fi) direct connection; alternatively, the smart home devices may be controlled indirectly via a router or a central control device.
In the embodiments of the present application, portable devices serving a single user, such as a smart phone, a smart band, a smart watch, and smart earphones, are usually carried by the user; for example, when the user interacts with another device, these portable devices are also on the user's person. Based on this, in the embodiments of the present application, some portable devices serving a single user may be understood as private devices of that user (also referred to as "personal portable devices").
As an example, in the embodiment of the present application, the private device may also be a terminal device such as a netbook, a tablet computer, a palmtop computer, a telephone watch, a cellular phone, a cordless phone, and a Personal Digital Assistant (PDA).
The counterpart of a private device is a public device. A public device is a terminal device that can be shared by multiple users. For example, public devices may include, but are not limited to, smart home devices such as the smart television, smart speaker, smart desk lamp, smart fresh air system, smart air conditioner, smart refrigerator, smart air purifier, and smart air quality sensor shown in fig. 1. For another example, a public device may be an in-vehicle device in a driving scenario, or a device in another human-computer interaction scenario (e.g., the HiLink ecosystem), which is not limited in this application. As another example, public devices may further include devices capable of providing services to multiple users, such as a tablet computer, a palmtop computer, a netbook, an ultra-mobile personal computer (UMPC), a smart camera, a motion sensing game machine, a portable multimedia player (PMP), an augmented reality (AR)/virtual reality (VR) device, a session initiation protocol (SIP) phone, a terminal device in an internet of things (IoT) smart system, a wireless device in a smart city, a wireless device in a smart home, and the like. The terminal devices in the IoT system are, for example, the smart home devices shown in fig. 1, such as the smart television, smart speaker, and smart air conditioner.
Referring to fig. 2, fig. 2 shows a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application. As shown in fig. 2, the terminal device may include a processor 210, a memory (including an external memory interface 220 and an internal memory 221), a Universal Serial Bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone interface 270D, a sensor module 280, keys 290, a motor 291, an indicator 292, a camera 293, a display 294, and the like. Among them, the sensor module 280 may include a gyro sensor, an acceleration sensor, a magnetic sensor, a touch sensor, a fingerprint sensor, and a pressure sensor. In some embodiments, the sensor module 280 may also include a barometric pressure sensor, a distance sensor, a proximity light sensor, a temperature sensor, an ambient light sensor, a bone conduction sensor, and the like.
It is to be understood that the illustrated structure of the embodiment of the present invention does not specifically limit the terminal device. In other embodiments of the present application, a terminal device may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components may be used. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 210 may include one or more processing units. For example: the processor 210 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a flight controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), among others. Wherein, the different processing units may be independent devices or may be integrated in one or more processors.
A memory may also be provided in processor 210 for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 210. If the processor 210 needs to use the instruction or data again, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 210, thereby increasing the efficiency of the system.
In some embodiments, processor 210 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The charge management module 240 is configured to receive a charging input from a charger. The power management module 241 is used to connect the battery 242, the charging management module 240 and the processor 210. The power management module 241 receives input from the battery 242 and/or the charging management module 240, and provides power to the processor 210, the internal memory 221, the display 294, the camera 293, and the wireless communication module 260.
The wireless communication function of the terminal device may be implemented by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, the modem processor, the baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in a terminal device may be used to cover a single or multiple communications bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 250 may provide a wireless communication solution applied on the terminal device, including 2G/3G/4G/5G/6G. The mobile communication module 250 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 250 may receive electromagnetic waves through the antenna 1, perform processing such as filtering and amplification on the received electromagnetic waves, and transmit the result to the modem processor for demodulation. The mobile communication module 250 may also amplify a signal modulated by the modem processor and convert it into electromagnetic waves for radiation through the antenna 1. In some embodiments, at least some of the functional modules of the mobile communication module 250 may be disposed in the processor 210. In some embodiments, at least some of the functional modules of the mobile communication module 250 may be disposed in the same device as at least some of the modules of the processor 210.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 270A, the receiver 270B, etc.) or displays an image or video through the display screen 294. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be separate from the processor 210, and may be disposed in the same device as the mobile communication module 250 or other functional modules.
The wireless communication module 260 may provide wireless communication solutions applied on the terminal device, including a wireless local area network (WLAN) (such as a Wi-Fi network), Bluetooth (BT), a global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like. The wireless communication module 260 may be one or more devices integrating at least one communication processing module. The wireless communication module 260 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering processing on the electromagnetic wave signals, and transmits the processed signals to the processor 210. The wireless communication module 260 may also receive a signal to be transmitted from the processor 210, perform frequency modulation and amplification on it, and convert it into electromagnetic waves for radiation via the antenna 2.
In some embodiments, the antenna 1 of the terminal device is coupled to the mobile communication module 250, and the antenna 2 is coupled to the wireless communication module 260, so that the terminal device can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), new radio (NR), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, and the like. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or a satellite based augmentation system (SBAS).
The terminal device implements the display function through the GPU, the display screen 294, and the application processor. The GPU is a microprocessor for image processing, coupled to a display screen 294 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 210 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 294 is used to display images, videos, and the like. The display screen 294 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the terminal device may include 1 or N display screens 294, where N is a positive integer greater than 1.
The terminal device can implement a shooting function through the ISP, the camera module 293, the video codec, the GPU, the display screen 294, and the application processor.
The external memory interface 220 may be used to connect an external memory card, such as a Micro SD card, to extend the storage capability of the terminal device. The external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
The internal memory 221 may be used to store computer-executable program code, which includes instructions. The internal memory 221 may include a program storage area and a data storage area. The program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like. The data storage area may store data (such as audio data and a phonebook) created during use of the terminal device, and the like. In addition, the internal memory 221 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS). The processor 210 executes the various functional applications and data processing of the terminal device by running the instructions stored in the internal memory 221 and/or the instructions stored in the memory provided in the processor.
The terminal device may implement audio functions, such as music playing and recording, through the audio module 270, the speaker 270A, the receiver 270B, the microphone 270C, the application processor, and the like. For the specific operation principles and functions of the audio module 270, the speaker 270A, the receiver 270B and the microphone 270C, reference may be made to the descriptions in the conventional art.
The keys 290 include a power key, a volume key, and the like. The keys 290 may be mechanical keys or touch keys. The terminal device may receive a key input and generate a key signal input related to user settings and function control of the terminal device.
It should be noted that the hardware modules included in the terminal device shown in fig. 2 are only described by way of example, and do not limit the specific structure of the terminal device. For example, the terminal device may also include other functional modules. For example, if the terminal device is a smartphone, the terminal device may further include a Subscriber Identity Module (SIM) interface. If the terminal equipment is a smart television, the terminal equipment can also comprise a remote controller.
As described above, how to timely, efficiently and accurately implement a human-computer interaction continuation (e.g., a cross-device and/or cross-user human-computer interaction continuation) in a human-computer interaction scenario of multiple devices and/or multiple users becomes a challenging problem.
In the embodiments of this application, cross-device human-computer interaction continuation means, for example, that after a user instructs the device 1 to perform a task 1, the device 2 continues by executing a task 2 according to the user's latest instruction. Cross-user human-computer interaction continuation means, for example, that after the user 1 instructs a device to perform a task 1, the same device continues by executing a task 2 according to the latest instruction of the user 2. Cross-device cross-user human-computer interaction continuation means, for example, that after the user 1 instructs the device 1 to perform a task 1, the device 2 continues by executing a task 2 according to the latest instruction of the user 2.
For example, in a case where the user performs human-computer interaction by voice, as shown in (a) of fig. 3, the television 1 in the living room completes a dialog (e.g., dialog 1) with the user 1 upon receiving a first instruction from the user 1. Then, as shown in (b) of fig. 3, assuming that the user 1 walks into the bedroom and issues a second instruction to the television 2 to instruct the television 2 to continue the previous task of the television 1, how the television 2 completes a seamless continuation of dialog 1 becomes an important issue (here, the television 1 and the television 2 are both smart televisions).
Based on the above scenario, as one possible implementation, the television 2 may access the historical dialog through the account logged in on the device. For example, if the television 2 and the television 1 are logged in with the same account, the television 2 acquires dialog 1 between the user 1 and the television 1 and completes the processing of continuing dialog 1.
However, with this method, the television 2 actually recognizes an account rather than a real user, so cross-user human-computer interaction continuation cannot truly be realized. For example, if several family members operate a device through voice instructions, the device regards every received voice instruction as the behavior of the user corresponding to the logged-in account and cannot attribute the instructions to the actual family members.
In addition, if the accounts logged in on the television 1 and the television 2 are different, the television 2 cannot complete the continuation of dialog 1. For example, if the television 1 is logged in with the account of the user 2 and the television 2 is logged in with the account of the user 1, the television 2 cannot complete the continuation of dialog 1. For another example, if the television 1 is logged in with the account of the user 1 and the television 2 is logged in with the account of the user 2, the television 2 likewise cannot complete the continuation of dialog 1. With this method, in the above situations, a public device can realize neither cross-device human-computer interaction continuation nor cross-device cross-user human-computer interaction continuation.
There is also the case where at least one of the television 1 and the television 2 is not logged in with any account, in which the television 2 cannot complete the continuation of dialog 1. For example, if neither the television 1 nor the television 2 is logged in with an account, if the television 1 is not logged in with an account while the television 2 is logged in with the account of the user 1 or the user 2, or if the television 1 is logged in with the account of the user 1 or the user 2 while the television 2 is not logged in with an account, the television 2 cannot complete the continuation of dialog 1. With this method, in the above situations, a public device can realize neither cross-device human-computer interaction continuation nor cross-device cross-user human-computer interaction continuation.
As another possible implementation, the television 2 may access the historical dialog by recognizing a physiological characteristic, such as a voiceprint, of the user issuing the voice instruction. For example, if the physiological characteristic recognized by the television 2 matches the physiological characteristic of the user corresponding to dialog 1, the television 2 acquires dialog 1 between the user 1 and the television 1 and completes the processing of continuing dialog 1. However, with this method, in the situation described below, a public device can realize neither cross-device human-computer interaction continuation nor cross-device cross-user human-computer interaction continuation.
As shown in (a) of fig. 4, the television 1 in the living room completes a dialog (e.g., dialog 1) with the user 1 upon receiving a first instruction from the user 1. Then, as shown in (b) of fig. 4, the user 2 walks into the bedroom and issues a second instruction to the television 2. Because the physiological characteristic of the user 2 recognized by the television 2 does not match the physiological characteristic of the user corresponding to dialog 1 (i.e., the user 1), the television 2 determines that the received second instruction and dialog 1 come from different users. In this case, the television 2 cannot complete the processing of continuing dialog 1.
In view of the disadvantages of the above methods, in the method provided in this application, when a public device receives an instruction (such as a voice instruction), the public device determines the identity of the user and accesses historical dialog data through the device identifier of a private device, so as to determine the intent and specific response content of the historical dialog. In this way, the public device can accurately determine the purpose of the received instruction, respond to the instruction timely, efficiently and accurately, and complete the processing of continuing the historical dialog.
For example, a public device, upon receiving an instruction (e.g., a voice instruction), may determine the relevant user through the detected surrounding private devices and access historical dialog data through the device identifiers of the detected private devices. The historical dialog data is used at least to characterize the intents, slots, and the like of the historical dialogs, and carries a user identifier (e.g., the device identifier of a private device of the user).
It will be appreciated that, in the embodiments of this application, a private device is usually carried around by its user due to its privacy and portability and may therefore be used to represent the user. Based on this, the public device can determine the user identity through the detected surrounding private devices. Further, the intent of the user's historical dialog can be accurately determined based on the user identity, so that the received instruction (such as a voice instruction) can be responded to timely, efficiently and accurately. With this method, cross-device human-computer interaction continuation, cross-user human-computer interaction continuation, and cross-device cross-user human-computer interaction continuation can all be realized.
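As a minimal illustration of this flow (not part of the claimed method), the following Python sketch chains the steps together. All function names and the canned data are assumptions introduced here for illustration only.

    # Minimal sketch of the continuation flow on a public device.
    # Every helper is a hypothetical stand-in for a step described in this
    # embodiment; the stubs return canned data for illustration.

    def detect_nearby_private_devices():
        # Stand-in for near field discovery (Bluetooth, NFC, Wi-Fi, same LAN).
        return ["mac-of-mobile-phone-1"]

    def fetch_history_by_device_ids(device_ids, store):
        # A historical dialog is relevant if it carries a detected identifier.
        return [d for d in store if set(d["user_ids"]) & set(device_ids)]

    def resolve_purpose(history, instruction):
        # Combine the intent/slots of the history with the new instruction.
        if history and "continue" in instruction:
            return {"intent": "RESUME_VIDEO", "slots": history[-1]["slots"]}
        return {"intent": "UNKNOWN", "slots": {}}

    store = [{"user_ids": ["mac-of-mobile-phone-1"], "intent": "PLAY_VIDEO",
              "slots": {"VideoName": "video 1"}}]
    ids = detect_nearby_private_devices()
    history = fetch_history_by_device_ids(ids, store)
    print(resolve_purpose(history, "please continue playing the video played before"))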
The embodiment of the present application does not limit a specific storage location of the historical dialog data.
For example, the historical dialog data may be stored in a public device and support being shared between different public devices or opened to other public devices. For example, in the scenario shown in fig. 3 or fig. 4, the historical dialog data of the television 1 and the user 1 may be saved in the television 1, which supports sharing the data with the television 2 or allowing the television 2 to access and read it.
For another example, the historical dialog data may be stored in a cloud device (e.g., a dialog server), which supports access and reading by public devices.
Taking the case where the historical dialog data is stored in a cloud device as an example, the human-computer interaction continuation method provided in the embodiments of this application may be implemented based on a communication system including the cloud device. Illustratively, the cloud device may be a dialog server. The dialog server is used to provide services for the human-computer interaction functions (such as a voice assistant) of the terminal devices (including public devices and private devices) in the communication system. Further, the communication system architecture provided in the embodiments of this application includes one or more public devices and one or more private devices.
Referring to fig. 5, fig. 5 is a diagram illustrating an exemplary architecture of a communication system according to an embodiment of this application. As shown in fig. 5, the communication system architecture includes a dialog server, a television 1 and a television 2 (i.e., public devices), and a mobile phone 1 and a mobile phone 2 (i.e., private devices). The mobile phone 1 belongs to the user 1, and the mobile phone 2 belongs to the user 2. The dialog server is used to store historical dialog data of the public devices (such as the television 1 and the television 2) and the private devices (including the mobile phone 1 and the mobile phone 2). The dialog server also supports the public devices, including the television 1 and the television 2, in accessing and reading the historical dialog data.
As shown in fig. 6, taking the historical dialog data stored in the cloud device (e.g., the dialog server shown in fig. 5) as an example, in combination with the communication system architecture shown in fig. 5, the method for continuing the human-computer interaction provided in the embodiment of the present application may include the following steps S601-S606:
S601, the first device receives the first instruction, executes a first task indicated by the first instruction, and generates first historical dialog data.
The first device is a public device, such as a smart television or a smart speaker. The first instruction is used to instruct the first device to perform the first task.
For example, the first instruction may include, but is not limited to, a voice instruction, a text instruction, a gesture instruction, a touch instruction, a key instruction, a somatosensory instruction, a facial expression instruction, and the like; reference may be made to conventional descriptions and illustrations of human-computer interaction manners.
For example, if the first command is a voice command "please play video 1", the first task is to play video 1. For another example, if the first command is a voice command "please play music 1", the first task is to play music 1.
Taking the human-computer interaction continuation scenario shown in fig. 3 or fig. 4 as an example, when receiving a first instruction of the user 1, the television 1 shown in (a) of fig. 3 or fig. 4 executes the first task indicated by the first instruction and completes dialog 1 with the user 1.
Illustratively, the first historical dialog data is associated with a first user. For example, the first user is the issuer of the first instruction, or the first device detects the first user when the first device receives the first instruction.
As an example, if the first device detects a personal portable device of the first user, the first device determines that the first user is detected. Because a personal portable device is usually carried by its user, when the first device detects the personal portable device of the first user, it indicates that the first user is beside the first device, and it can be determined that the first user is related to the first instruction and/or the first task. Detecting the personal portable device of the first user therefore allows the relevance of the first user to the first instruction and/or the first task to be determined more readily, so that task continuation can be realized more naturally and unobtrusively and with higher accuracy. That the first device determines that the first user is detected may mean that the first device discovers the personal portable device of the first user, or that a short-range connection is established between the first device and the personal portable device of the first user, or may take another form; this is not specifically limited in the embodiments of this application.
As another example, the first device determines that the first user is detected if the first device detects a physiological characteristic of the first user (e.g., a voiceprint, a facial characteristic, a fingerprint, a gesture, etc.).
As another example, if the first device detects a login account of the first user (e.g., a video application login account, a music application login account, etc.), the first device determines that the first user is detected.
As an example, the first device may also determine the user related to the first historical dialog data based on two or more of the detected personal portable device, physiological characteristic, and login account.
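A minimal sketch of combining such signals follows, assuming each kind of evidence has already been reduced to the name of the user it suggests; the simple majority vote is an illustrative choice, not a rule defined by this application.

    def identify_user(device_owner=None, voiceprint_owner=None, account_owner=None):
        # Each argument names the user suggested by one kind of evidence:
        # a detected personal portable device, a recognized physiological
        # characteristic, or a login account. None means "unavailable".
        votes = [u for u in (device_owner, voiceprint_owner, account_owner)
                 if u is not None]
        # Majority vote over whatever evidence is present.
        return max(set(votes), key=votes.count) if votes else None

    print(identify_user(device_owner="user 1", voiceprint_owner="user 1"))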
The first historical dialog data is used at least to characterize the occurrence time of the historical dialog corresponding to the first task, the intent of the historical dialog, the identifier of the first device, the users (including the first user) related to the historical dialog, the slots related to the historical dialog, and the like.
Taking the human-computer interaction continuation scenario shown in fig. 3 or fig. 4 as an example, dialog 1 shown in (a) of fig. 3 or fig. 4 corresponds to the first historical dialog data. The intent of the historical dialog characterized by the first historical dialog data refers to the action that the user instructs the first device to perform through the first instruction, such as playing/pausing a video (PLAY_VIDEO/PAUSE_VIDEO), playing/pausing music (PLAY_MUSIC/PAUSE_MUSIC), resuming a video (RESUME_VIDEO), or resuming music (RESUME_MUSIC). The identifier of the first device may be a public device ID or name, such as "smart television in the living room" or "television 1".
In the embodiments of this application, the users related to the historical dialog characterized by the first historical dialog data may be represented by the identifier of the private device (e.g., the mobile phone 1) of the user (including the first user) who issued the first instruction, such as the MAC address or the ID of the mobile phone 1. The slots are, for example, the video name (VideoName), the music name (MusicName), the video lead actor (VideoArtist), and the music singer (MusicArtist).
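Gathering these elements into one record, a sketch of what the first historical dialog data could look like follows; the field names are illustrative assumptions that mirror the elements listed above, not a format defined by this application.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class HistoricalDialogData:
        time: str        # occurrence time of the historical dialog
        intent: str      # e.g. "PLAY_VIDEO", "RESUME_MUSIC"
        device_id: str   # identifier of the public device, e.g. "television 1"
        user_ids: List[str] = field(default_factory=list)   # private-device identifiers
        slots: Dict[str, str] = field(default_factory=dict)  # e.g. {"VideoName": "video 1"}

    dialog_1 = HistoricalDialogData(
        time="13:00", intent="PLAY_VIDEO", device_id="television 1",
        user_ids=["mac-of-mobile-phone-1"], slots={"VideoName": "video 1"})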
S602, the first device sends the first historical dialog data to the third device.
For example, the third device is a cloud device (such as a dialog server), a central control device, a smart home service device, or the like; alternatively, the third device may be a public device or a private device used to store historical dialog data. The specific function and structure of the third device are not limited in this application.
The first device sends the first historical dialog data to the third device so that the first historical dialog data is stored in the third device.
In the embodiments of this application, the third device stores historical dialog data of one or more public devices (including the first device). For example, a public device (e.g., the first device) may send historical dialog data to the third device in real time during human-computer interaction. Taking the communication system shown in fig. 5 as an example, the dialog server (i.e., the third device) may store the historical dialog data of each human-computer interaction of the television 1 and of the television 2.
As a possible implementation, the third device (e.g., the dialog server) may maintain the relevant historical dialog data at the granularity of a public device. For example, the historical dialog data may be organized as follows:
{ "identification of first device": xxx;
"user identification": [ 'identification of handset 1' ];
first historical dialogue data;
……
}。
The above example uses, as the user identifier, the identifier of the mobile phone 1 of the user (including the first user) who issued the first instruction.
S603, when the second device receives the second instruction, the second device determines the user related to the second instruction.
The second device is a public device, such as a smart television or a smart speaker.
For example, the second instruction may include, but is not limited to, a voice instruction, a text instruction, a gesture instruction, a touch instruction, a key instruction, a somatosensory instruction, a facial expression instruction, and the like.
As an example, the second device, upon receiving the second instruction, may determine the user related to the second instruction based on two or more of: the detected one or more fourth devices, a physiological characteristic (e.g., a voiceprint, a facial characteristic, a fingerprint, or a gesture), and a login account (e.g., a video application login account or a music application login account).
It can be understood that, in the embodiments of this application, when the second device receives the second instruction, the users to whom the private devices located around the second device belong are generally the users related to the second instruction. A private device (such as the fourth device) is, for example, a smartphone, a smart watch, or a smart band. For example, the user to whom a fourth device located around the second device belongs may be the issuer of the second instruction. For another example, the user to whom a fourth device located around the second device belongs may be a participant of the second instruction.
Based on this, in the embodiment of the present application, as a possible implementation manner, when the second device receives the second instruction, the second device may detect a surrounding private device to discover a private device related to the second instruction, and further determine a user related to the second instruction. That is, the second device, upon receiving the second instruction, may determine the user associated with the second instruction by detecting one or more fourth devices in the vicinity.
As a possible implementation manner, the second device may also determine the user related to the second instruction by detecting one or more fourth devices around at any time after receiving the second instruction and before executing the second task.
As an example, the second device, upon receiving the second instruction, may detect one or more fourth devices in the vicinity based on a device discovery technique.
In one possible implementation, the device discovery technique may be a near field discovery technique. With near field discovery, devices that are close to each other can exchange data through short-range communication to discover and identify each other. For example, the second device may discover and identify one or more surrounding fourth devices in one or more of the following ways: detecting surrounding private devices with which a wireless connection, such as Bluetooth, NFC, or Wi-Fi Direct, is established, and detecting surrounding private devices located in the same local area network as the second device. The embodiments of this application do not limit the specific manner and method of near field discovery.
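As one concrete possibility (the embodiment does not mandate any particular discovery mechanism), a device could enumerate nearby Bluetooth LE devices with the third-party Python library bleak and match the discovered addresses against known private-device identifiers; the known-identifier set below is an assumption.

    import asyncio
    from bleak import BleakScanner  # third-party library: pip install bleak

    # Assumed identifier of a known private device (e.g. mobile phone 1).
    KNOWN_PRIVATE_DEVICES = {"AA:BB:CC:DD:EE:FF"}

    async def discover_nearby_private_devices() -> set:
        # Scan for advertising BLE devices for a few seconds and keep only
        # the addresses that belong to known private devices.
        found = await BleakScanner.discover(timeout=5.0)
        return {d.address for d in found} & KNOWN_PRIVATE_DEVICES

    print(asyncio.run(discover_nearby_private_devices()))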
Taking the human-computer interaction continuation scenario shown in fig. 3 and the communication system architecture shown in fig. 5 as an example, assume that the user 1 shown in fig. 3 carries the mobile phone 1 shown in fig. 5 and the user 2 carries the mobile phone 2 shown in fig. 5. As shown in (b) of fig. 3, because the television 2 is located in the bedroom and the user 1, carrying the mobile phone 1, is also in the bedroom, the television 2 can detect the surrounding mobile phone 1 of the user 1 (i.e., the fourth device) when receiving the second instruction of the user.
Taking the human-computer interaction continuation scenario shown in fig. 4 and the communication system architecture shown in fig. 5 as an example, assume that the user 1 shown in fig. 4 carries the mobile phone 1 shown in fig. 5 and the user 2 carries the mobile phone 2 shown in fig. 5. As shown in (b) of fig. 4, because the television 2 is located in the bedroom and the user 2, carrying the mobile phone 2, is also in the bedroom, the television 2 can detect the surrounding mobile phone 2 of the user 2 (i.e., the fourth device) when receiving the second instruction of the user.
In some scenarios, if multiple users, each carrying a private device (i.e., a fourth device), are around the second device, the second device may discover, through detection, the multiple private devices belonging to those users. For example, when receiving the first instruction, the television 1 in the living room shown in (a) of fig. 3 or (a) of fig. 4 detects that the mobile phone 1 of the user 1 and the mobile phone 2 of the user 2 are located in the living room.
S604, the second device acquires the relevant historical dialog data from the third device according to the determined user related to the second instruction. The relevant historical dialog data includes the first historical dialog data.
In some embodiments, the first instruction is associated with a first user, and the second instruction is associated with the first user.
In other embodiments, the first instruction is associated with a first user and a second user, and the second instruction is associated with the second user.
In other embodiments, the first instruction is associated with a first user and the second instruction is associated with the first user and a second user.
As a possible implementation, if the second device, upon receiving the second instruction, determines by detecting one or more surrounding fourth devices (e.g., including a private device of the first user) that the first user is related to the second instruction, the second device may obtain the historical dialog data related to the first user from the third device based on the identifier of the detected private device of the first user. That the second device detects one or more surrounding fourth devices may mean that the second device discovers the one or more fourth devices, or that the second device establishes connections with the one or more fourth devices, or may take another form; this is not specifically limited in the embodiments of this application.
The identifier of a private device is used to uniquely identify the private device and is therefore also called the unique identifier (unique id) of the private device. For example, the identifier of a private device may be its media access control (MAC) address, its identity number (ID), or the like; the embodiments of this application are not limited in this respect.
The relevant historical dialog data acquired by the second device from the third device includes the data of at least one historical dialog. A historical dialog may be related to at least one fourth device; for example, dialog 1 may be related to both the mobile phone 1 and the mobile phone 2 in a human-computer interaction scenario as shown in (a) of fig. 3 or (a) of fig. 4. Based on this, in the embodiments of this application, the data of one historical dialog carries the identifier of at least one fourth device. The second device may take the obtained identifiers of the one or more fourth devices as a reference and retrieve, from the third device, the historical dialog data related to the one or more fourth devices (in the embodiments of this application, historical dialog data related to a fourth device means that the user to whom the fourth device belongs is the initiator or a participant of the historical dialog).
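Continuing the HistoricalDialogData sketch from above, the retrieval in S604 might be expressed as the following filter; the in-memory list stands in for the store kept on the third device.

    def related_history(store, detected_ids):
        # A dialog is relevant when any identifier of a detected fourth
        # device appears among the identifiers carried by the dialog data,
        # i.e. when the device's user initiated or participated in it.
        wanted = set(detected_ids)
        return [d for d in store if wanted & set(d.user_ids)]

    # Reusing the dialog_1 record from the earlier sketch:
    print(related_history([dialog_1], ["mac-of-mobile-phone-1"]))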
S605, the second device determines the purpose of the second instruction according to the relevant historical dialog data and the second instruction.
For example, in this embodiment of the present application, the second device may determine the purpose of the second instruction according to the intent and the slots characterized by the relevant historical dialog data, in combination with the content of the second instruction.
For example, assuming that the relevant historical dialog data characterizes the intent of the historical dialog as "play video" and the slot as "video 1", and the content of the second instruction received by the second device is "continue playing the video played before", the second device may determine that the purpose of the second instruction is to continue the playing progress of video 1.
For another example, assuming that the relevant historical dialog data characterizes the intent of the historical dialog as "play video" and the slot as "video 1", the content of the second instruction received by the second device is "play the next video", and the next video after video 1 is video 2, the second device may determine that the purpose of the second instruction is to switch to video 2.
In some examples, if the relevant historical dialog data that the second device retrieves from the third device, with the identifiers of the one or more fourth devices as a reference, includes the data of a single historical dialog, the second device determines the purpose of the second instruction based on the data of that historical dialog and the content of the second instruction.
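A toy version of this resolution step is sketched below, assuming simple keyword matching on the instruction text; a real system would rely on the natural-language-understanding components of the voice assistant rather than keywords.

    def resolve_purpose(history: dict, instruction: str) -> dict:
        # 'history' carries the intent and slots of the relevant dialog.
        if history["intent"] == "PLAY_VIDEO":
            if "continue" in instruction:
                # e.g. "continue playing the video played before"
                return {"intent": "RESUME_VIDEO", "slots": dict(history["slots"])}
            if "next" in instruction:
                # e.g. "play the next video"
                return {"intent": "PLAY_NEXT_VIDEO", "slots": dict(history["slots"])}
        # Otherwise treat the instruction as the start of a fresh dialog.
        return {"intent": "UNKNOWN", "slots": {}}

    print(resolve_purpose({"intent": "PLAY_VIDEO", "slots": {"VideoName": "video 1"}},
                          "please continue playing the video played before"))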
S606, the second device responds to the second instruction and executes the second task.
In the embodiments of this application, that the second device responds to the second instruction includes that the second device executes the second task indicated by the second instruction according to the purpose of the second instruction.
In some embodiments, if the second device is the first device, the second device (i.e., the first device) continues to execute the action corresponding to the second instruction following the first task.
For example, assuming that the second device determines that the purpose of the second instruction is to continue its playing progress of video 1, the second device responds to the second instruction and continues playing video 1 following its previous progress of playing video 1. For another example, assuming that the second device determines that the purpose of the second instruction is to switch to the next video (e.g., video 2) after the previously played video 1, the second device terminates playing video 1 and starts playing video 2 in response to the second instruction.
In other embodiments, if the second device is different from the first device, the second device continues to execute the action corresponding to the second instruction following the first task performed by the first device in the first historical dialog.
For example, assuming that the second device determines that the purpose of the second instruction is to continue the playing progress of video 1 on the first device, the second device, in response to the second instruction, acquires the playback information for video 1 from the first device (for example, including the specific hour, minute and second to which the video was played) and continues playing video 1 following the progress at which the first device previously played it. For another example, assuming that the second device determines that the purpose of the second instruction is to switch to the next video after video 1 played before on the first device, the second device, in response to the second instruction, obtains the historical playing information (such as the name of the recently played video and the video list information) from the first device and plays the next video (such as video 2).
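The cross-device case can be sketched as follows, assuming a hypothetical interface through which the first device reports its last playback position; the embodiment itself does not define such an interface, and the numbers are made up.

    class FirstDevice:
        # Hypothetical stand-in for the first device (e.g. television 1).
        def playback_position(self, video_name: str) -> int:
            # Assume video 1 stopped at second 1325 (22 min 5 s).
            return 1325

    class SecondDevice:
        # Hypothetical stand-in for the second device (e.g. television 2).
        def continue_task(self, first: FirstDevice, slots: dict) -> None:
            video = slots["VideoName"]
            # Fetch where the first device stopped and resume from there.
            offset = first.playback_position(video)
            print(f"resuming {video} at {offset} seconds")

    SecondDevice().continue_task(FirstDevice(), {"VideoName": "video 1"})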
In some embodiments, the responding, by the second device, to the second instruction further includes sending, by the second device, a response message to at least one of the one or more fourth devices, where the response message is used to notify the user that the task indicated by the second instruction has been executed.
In some embodiments, during the execution of the second task, the second device may further refine the users related to the second instruction by detecting surrounding private devices. For example, assume that, in S603, the second device determines through detection of surrounding private devices that the users related to the second instruction include the first user; if, during the execution of the second task, the second device determines by detecting surrounding private devices that the second user is also related to the second instruction, the second device may update the users related to the second instruction to the first user and the second user.
Further, after the second device responds to the received second instruction, the second device may also send the content of the current dialog to the third device (e.g., the dialog server) to update the historical dialog data for reference when a public device subsequently continues a task. That is, the human-computer interaction continuation method provided in the embodiments of this application further includes the following step S607:
S607, the second device sends the second historical dialog data to the third device (e.g., the dialog server).
The second device sends the second historical dialog data to the third device (e.g., the dialog server) so that the third device updates the stored historical dialog data.
The second historical dialog data is used at least to characterize the occurrence time of the historical dialog corresponding to the second task executed after the second device received the second instruction, the intent of that historical dialog, the identifier of the second device, and the users and slots related to that historical dialog. Illustratively, the users related to the historical dialog may be represented by the identifiers of the one or more fourth devices.
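Continuing the earlier sketches, the archived second historical dialog data might look like the record below; all values are assumptions matching the scenario of fig. 3 and fig. 4, and the upload helper merely stands in for sending the record to the third device.

    # Illustrative second historical dialog data as uploaded in S607.
    second_historical_dialog_data = {
        "time": "20:00",
        "intent": "RESUME_VIDEO",
        "device_id": "television 2",            # identifier of the second device
        "user_ids": ["mac-of-mobile-phone-1"],  # identifiers of the fourth device(s)
        "slots": {"VideoName": "video 1"},
    }

    def upload(server_store: list, record: dict) -> None:
        # Stand-in for transmitting the record to the dialog server.
        server_store.append(record)

    server_store = []
    upload(server_store, second_historical_dialog_data)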
The embodiment shown in fig. 6 is described only with the example in which the historical dialog data is stored in a cloud device, a central control device, or another type of third device. It can be understood that a person skilled in the art may implement more or fewer steps than those in fig. 6. The historical dialog data may also be stored elsewhere; for example, as shown in fig. 7, the historical dialog data may be stored on the first device. Specifically, in the human-computer interaction continuation method shown in fig. 6, S602 is replaced by S701, S604 is replaced by S702, and S607 is replaced by S703:
S701, the first device stores the first historical dialog data.
S702, the second device obtains the relevant historical dialog data from the first device according to the determined user related to the second instruction. The relevant historical dialog data includes the first historical dialog data.
S703, the second device stores the second historical dialog data.
In one implementation, the historical dialog data may also be saved on the first device that received the instruction and synchronized by the first device to other related electronic devices. In this way, when the second device later receives an instruction, because the second device has already synchronized the historical dialog data of the other devices, it can directly obtain the relevant historical dialog data locally without acquiring it from other devices.
The embodiment of the present application does not specifically limit the method for storing and acquiring the historical dialogue data.
According to the human-computer interaction continuation method provided in the embodiments of this application, when a user interacts with a public device (such as the first device), the public device tags the dialog with the identifiers of the surrounding private devices, thereby providing a reference for a public device that subsequently continues the task. For example, when the second device receives the second instruction, it may detect the surrounding private devices and retrieve the relevant historical dialog data with the identifiers of the detected private devices as a reference. Further, the purpose of the second instruction is determined according to the relevant historical dialog data and the content of the second instruction, so that the second instruction can be responded to in time and the continued task can be completed accurately. In this way, cross-device human-computer interaction continuation, cross-user human-computer interaction continuation, and cross-device cross-user human-computer interaction continuation are realized timely, efficiently and accurately. For example, human-computer interaction continuation can be realized timely, efficiently and accurately in any of the following cases: the first instruction is related to the first user, the second instruction is related to the first user, and the second device is different from the first device; the first instruction is related to the first user and a second user, the second instruction is related to the second user, and the second device is the same as the first device; the first instruction is related to the first user, the second instruction is related to the first user and a second user, and the second device is the same as the first device; the first instruction is related to the first user and a second user, the second instruction is related to the second user, and the second device is different from the first device; or the first instruction is related to the first user, the second instruction is related to the first user and a second user, and the second device is different from the first device.
It should be noted that, in some embodiments of this application, if the relevant historical dialog data that the second device retrieves from the third device (e.g., the dialog server), with the identifiers of the one or more fourth devices as a reference, includes the data of a single historical dialog, the second device may execute S605 described above to determine the purpose of the second instruction. Further, the second device executes S606-S607 described above.
In other embodiments of this application, if the relevant historical dialog data that the second device retrieves from the third device (e.g., the dialog server), with the identifiers of the one or more fourth devices as a reference, includes the data of multiple historical dialogs, then as one implementation, the second device may determine the purpose of the second instruction according to the data of the most recent historical dialog among the multiple historical dialogs and the content of the second instruction. Further, the second device executes S606-S607 described above.
As another implementation, if the relevant historical dialog data that the second device retrieves from the third device (e.g., the dialog server), with the identifiers of the one or more fourth devices as a reference, includes the data of multiple historical dialogs, the second device may determine the data of the most relevant historical dialog from the data of the multiple historical dialogs in any one of the following modes 1 to 4, so as to determine the purpose of the second instruction based on the data of the most relevant historical dialog and the content of the second instruction.
Mode 1: the second device determines the data of the most relevant historical dialog from the data of the multiple historical dialogs in combination with the login account corresponding to the second instruction.
In this case, when the second device receives the second instruction, it also obtains the information of the login account corresponding to the second instruction. In addition, the relevant historical dialog data obtained by the second device in S604 also carries login account information.
Based on this, when the relevant historical dialog data that the second device retrieves from the third device (e.g., the dialog server), with the identifiers of the one or more fourth devices as a reference, includes the data of multiple historical dialogs, the second device may determine the data of the most relevant historical dialog from the data of the multiple historical dialogs with the login account information obtained in S603 as a reference. For example, the most relevant historical dialog is the historical dialog whose login account is consistent with the login account obtained by the second device in S603.
Further, in this case, after the second device responds to the received second instruction, the second historical dialog data sent by the second device to the third device (e.g., the dialog server) in S607 also includes the login account information that the second device obtained in S603.
Mode 2: the second device determines the data of the most relevant historical dialog from the data of the multiple historical dialogs in combination with the detected physiological characteristic.
It will be appreciated that some physiological characteristics of a user can uniquely characterize the user; based on this, the second device may also determine the most relevant historical dialog with reference to a physiological characteristic. Taking the voiceprint as an example, a voiceprint is a spectrum of sound waves carrying speech information, displayed by an electro-acoustic instrument. A voiceprint is not only specific but also relatively stable. For example, the user identity may be determined by analyzing the formant frequencies, trends, waveforms, and the like of the voiceprint.
Therefore, in the case where the relevant historical dialog data includes the data of multiple historical dialogs, the second device also detects a physiological characteristic of the user when receiving the second instruction. In addition, the relevant historical dialog data obtained by the second device in S604 also carries physiological characteristic information. For example, the physiological characteristics may include, but are not limited to, one or more of the following: a voiceprint, facial features, a fingerprint, a gesture, and the like.
Based on this, when the relevant historical dialog data that the second device retrieves from the third device (e.g., the dialog server), with the identifiers of the one or more fourth devices as a reference, includes the data of multiple historical dialogs, the second device may determine the data of the most relevant historical dialog from the data of the multiple historical dialogs with the physiological characteristic detected in S603 as a reference. For example, the most relevant historical dialog is the historical dialog whose physiological characteristic is consistent with the physiological characteristic detected by the second device in S603.
Further, in this case, after the second device responds to the received second instruction, the second historical dialog data sent by the second device to the third device (e.g., the dialog server) in S607 also includes the physiological characteristic information of the user detected by the second device in S603.
Mode 3: the second device determines the data of the most relevant historical dialog from the data of the multiple historical dialogs in combination with the login account and the detected physiological characteristic.
In this case, when the second device receives the second instruction, it also obtains the information of the login account and detects a physiological characteristic of the user. In addition, the relevant historical dialog data obtained by the second device in S604 also carries login account information and physiological characteristic information.
Based on this, when the relevant historical dialog data that the second device retrieves from the third device (e.g., the dialog server), with the identifiers of the one or more fourth devices as a reference, includes the data of multiple historical dialogs, the second device may determine the data of the most relevant historical dialog from the data of the multiple historical dialogs with the login account information obtained in S603 and the detected physiological characteristic as a reference. For example, the most relevant historical dialog is the historical dialog whose login account is consistent with the login account obtained by the second device in S603 and whose physiological characteristic is consistent with the physiological characteristic detected by the second device in S603.
Further, in this case, after the second device responds to the received second instruction, the data of the current dialog sent by the second device to the third device (e.g., the dialog server) in S607 also includes the login account information obtained by the second device in S603 and the detected physiological characteristic information of the user.
Mode 4: the second device presents information related to the multiple historical dialogs to the user so that the user can select the data of the most relevant historical dialog.
For example, the second device may present the intents and slots of the multiple historical dialogs to the user by means of a pop-up window or the like.
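Modes 1 to 3 amount to narrowing the candidate dialogs by whichever extra evidence was captured, with the most recent dialog as the fallback described earlier; a sketch follows, in which the record fields "account" and "voiceprint" are assumptions, and mode 4 (asking the user) is not shown.

    def most_relevant_dialog(dialogs, account=None, voiceprint=None):
        # Modes 1-3: filter by the login account and/or the physiological
        # characteristic captured when the second instruction was received.
        candidates = dialogs
        if account is not None:
            candidates = [d for d in candidates if d.get("account") == account]
        if voiceprint is not None:
            candidates = [d for d in candidates if d.get("voiceprint") == voiceprint]
        # Fall back to the most recent dialog when nothing matches.
        return max(candidates or dialogs, key=lambda d: d["time"])

    dialogs = [
        {"time": "13:00", "account": "account-1", "voiceprint": "vp-1"},
        {"time": "15:00", "account": "account-2", "voiceprint": "vp-2"},
    ]
    print(most_relevant_dialog(dialogs, account="account-1"))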
According to the human-computer interaction continuation method provided in the embodiments of this application, when a user interacts with a public device (such as the first device), the public device tags the dialog with the identifiers of the surrounding private devices, login account information, and/or physiological characteristics, so that a reference is available when a public device subsequently continues a task. For example, when the second device receives the second instruction, it may detect the surrounding private devices, obtain login account information, and/or detect a physiological characteristic, and retrieve the relevant historical dialog data with this information as a reference. Further, the purpose of the second instruction is determined according to the relevant historical dialog data and the content of the second instruction, so that the second instruction can be responded to in time and the continued task can be completed accurately. In this way, cross-device human-computer interaction continuation, cross-user human-computer interaction continuation, and cross-device cross-user human-computer interaction continuation are realized timely, efficiently and accurately.
A method for continuing human-computer interaction provided in the embodiments of the present application will be specifically described below with reference to several specific examples of human-computer interaction scenarios.
Scenario example one:
Referring to fig. 8, fig. 8 shows an example of a continuation scenario based on the human-computer interaction shown in fig. 4. As shown in (a) of fig. 8, assume that at 13:00 the user 1 issues to the television 1 (i.e., the first device) in the living room a voice instruction (i.e., a first instruction) to play video 1; at this time, the television 1 detects and discovers the surrounding mobile phone 1, associates the dialog record (i.e., dialog 1) with the mobile phone 1, and stores it in the dialog server (i.e., the third device) as the first historical dialog data. Next, at 15:00, the user 2 issues to the television 1 in the living room a voice instruction to play video 2; at this time, the television 1 detects the surrounding mobile phone 2, associates the dialog record (e.g., dialog 2) with the mobile phone 2, and stores it in the dialog server. Then, at 20:00, the user 1 issues to the television 2 (i.e., the second device) in the bedroom a voice instruction (i.e., a second instruction) to continue playing the video previously played on the television 1.
For example, the data of dialog 1 and the data of dialog 2 may be as shown in (b) of fig. 8. The data of dialog 1 and the data of dialog 2 shown in (b) of fig. 8 may be stored in the dialog server. Dialog 1 and dialog 2 are historical dialogs.
As shown in fig. 9, the human-computer interaction continuation method applied to scenario example one may include the following steps S901-S907:
S901, the television 2 (i.e., the second device) detects the surrounding private devices (i.e., the one or more fourth devices) when receiving the second instruction of the user.
When the television 2 receives the second instruction, the users to whom the private devices located around the television 2 belong are generally the users related to the interaction instruction, such as the issuer of the interaction instruction and the participants of the interaction instruction. Therefore, the television 2 can determine the user related to the second instruction by detecting the surrounding private devices.
It should be noted that the second instruction of the user may include, but is not limited to, a voice instruction, a text instruction, a gesture instruction, a touch instruction, a key instruction, a somatosensory instruction, a facial expression instruction, and the like of the user; the embodiments of this application are not limited in this respect. The example scenario shown in fig. 8 merely takes a voice instruction as the second instruction.
As shown in fig. 8, the second instruction received by the television 2, "please continue playing the video played on the television 1 before", is used to instruct continuing the playing progress of the video on the television 1. When receiving the second instruction of the user, the television 2 detects the surrounding mobile phone 1.
As an example, the television 2 may detect surrounding private devices based on technologies such as near field discovery.
When the television 2 detects the mobile phone 1, the television 2 further performs the following steps S902-S905:
S902, the television 2 (i.e., the second device) obtains the identifier of the detected mobile phone 1 (i.e., the one or more fourth devices).
As shown in fig. 8, when receiving the second instruction "please continue playing the video played on the television 1 before" from the user, the television 2 detects the surrounding mobile phone 1, and then the television 2 obtains the identifier (also called unique id) of the mobile phone 1.
As an example, the identifier of the mobile phone 1 may be a MAC address, an ID, and the like of the mobile phone 1, and the embodiment of the present application does not limit a specific identification manner.
S903, the television 2 (i.e., the second device) obtains the relevant historical dialog data from the dialog server (i.e., the third device) according to the identifier of the mobile phone 1 (i.e., the one or more fourth devices).
The dialog server stores historical dialog data of one or more public devices (such as the television 1 and the television 2 shown in fig. 8). Taking the human-computer interaction continuation scenario shown in fig. 8 as an example, the dialog server stores the data of dialog 1 between the user 1 and the television 1 and the data of dialog 2 between the user 2 and the television 1. The data of dialog 1 and the data of dialog 2 carry the identifiers of the related private devices. For example, the data of dialog 1 carries the identifier of the mobile phone 1 of the user 1, that is, dialog 1 is related to the mobile phone 1; the data of dialog 2 carries the identifier of the mobile phone 2 of the user 2, that is, dialog 2 is related to the mobile phone 2.
As shown in (a) of fig. 8, because the television 2 detects the surrounding mobile phone 1 when receiving the second instruction of the user, the television 2 acquires the relevant historical dialog data from the dialog server with the identifier of the mobile phone 1 as a reference. Because the data of dialog 1 carries the identifier of the mobile phone 1, the television 2 can obtain the data of dialog 1 shown in (b) of fig. 8, that is, the relevant historical dialog data.
The historical dialogue data at least characterizes the occurrence time of the historical dialogue, the intention (such as playing/pausing a video, playing/pausing music, continuing to play a video, continuing to play music, etc.), the identification of the public device (for example, represented by the ID or name of the public device), the user (for example, represented by the identifier of a private device), the slot (such as a video name, a music name, a video creator, a music singer, etc.), and the like.
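For illustration only, the historical dialogue data and the dialogue server's lookup keyed by private-device identifiers (as in steps S902-S903) could be sketched as follows. Every name here (DialogRecord, DialogServer, find_by_device) is a hypothetical placeholder, not an interface defined by the embodiment.

```python
# A minimal sketch of the historical dialogue record described above, and of
# a dialogue server lookup keyed by private-device identifiers.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class DialogRecord:
    occurred_at: datetime          # occurrence time of the historical dialog
    intent: str                    # e.g. "play_video", "resume_video"
    public_device: str             # identification of the public device, e.g. "tv1"
    user_devices: list[str]        # identifiers of the related private devices
    slots: dict[str, str] = field(default_factory=dict)  # e.g. {"video": "video 1"}

class DialogServer:
    def __init__(self) -> None:
        self._records: list[DialogRecord] = []

    def archive(self, record: DialogRecord) -> None:
        self._records.append(record)

    def find_by_device(self, device_ids: list[str]) -> list[DialogRecord]:
        """Return dialogs whose data carries any of the given identifiers."""
        wanted = set(device_ids)
        return [r for r in self._records if wanted & set(r.user_devices)]
```

For example, the television 2 could call find_by_device with the identifier of the mobile phone 1 and would get back the data of dialog 1, since that record carries the identifier of the mobile phone 1.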
As shown in (b) of fig. 8, the data of dialog 1 characterizes the occurrence time of dialog 1: 13:00; the intent: play video; the public device: television 1; the user: mobile phone 1; and the slot: video 1. That is, dialog 1 records that the user to whom the mobile phone 1 belongs interacted with the television 1 at 13:00 to instruct the television 1 to play video 1.
S904, the television 2 (i.e. the second device) determines the purpose of the second instruction according to the relevant historical dialog data and the content of the second instruction.
As an example, the television 2 may determine the purpose of the second instruction according to the intention and slot characterized by the data of dialog 1, together with the content of the second instruction.
As shown in (b) of fig. 8, the data of dialog 1 characterizes that the intention of dialog 1 is to play a video and the slot is video 1, and the content of the second instruction is "please continue playing the video that was played on the television 1 before"; the television 2 may therefore determine that the purpose of the second instruction is to continue the television 1's playing progress of video 1.
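As a non-limiting illustration, the determination in step S904 might be sketched as follows, reusing the hypothetical DialogRecord defined in the sketch above. Matching on the word "continue" is a deliberately simplified stand-in for whatever natural-language understanding the device actually applies.

```python
# A minimal sketch of step S904, reusing the hypothetical DialogRecord above:
# combine the intent and slot of the relevant historical dialogue data with
# the content of the second instruction to resolve its purpose.
def determine_purpose(records: list[DialogRecord], instruction: str) -> dict:
    latest = max(records, key=lambda r: r.occurred_at)  # most recent relevant dialog
    if "continue" in instruction and latest.intent == "play_video":
        return {
            "intent": "resume_video",
            "source_device": latest.public_device,  # e.g. television 1
            "slots": latest.slots,                  # e.g. {"video": "video 1"}
        }
    raise ValueError("purpose cannot be resolved from the relevant history")
```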
S905, the television 2 (i.e. the second device) responds to the second instruction, and executes the second task.
For example, the television 2 executes the second task indicated by the second instruction according to the purpose of the second instruction. Since the television 2 determined in step S904, based on the relevant historical dialogue data and the content of the second instruction, that the purpose of the second instruction is to continue the television 1's playing progress of video 1, the television 2 executes the second task indicated by the second instruction: it acquires the playing information for video 1 from the television 1 (for example, played to xx hours xx minutes xx seconds) and continues playing video 1 from the television 1's playing progress.
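A minimal sketch of step S905 follows, consuming the purpose returned by the determine_purpose sketch above. Both helper functions are hypothetical stand-ins, since the embodiment defines no such API: the first represents the inter-device query for the playing information, the second represents handing the video and offset to the local player.

```python
# A minimal sketch of step S905; get_play_position and play_from are
# hypothetical placeholders for inter-device and player interfaces.
def get_play_position(device_id: str, video: str) -> float:
    # Hypothetical inter-device query, e.g. asking television 1 how many
    # seconds of the video it has already played.
    return 0.0

def play_from(video: str, position_seconds: float) -> None:
    # Hypothetical local playback call.
    print(f"playing {video!r} from {position_seconds:.0f}s")

def execute_resume(purpose: dict) -> None:
    pos = get_play_position(purpose["source_device"], purpose["slots"]["video"])
    play_from(purpose["slots"]["video"], pos)
```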
In some embodiments, responding to the second instruction further includes the television 2 sending a response message to the mobile phone 1, so as to notify the user that the task indicated by the second instruction has been executed.
Further, after the television 2 responds to the second instruction, the television 2 may also tag the current dialog with the identifier of the detected fourth device (i.e., the mobile phone 1) and upload it to the dialogue server for archiving. That is, as shown in fig. 9, the human-computer interaction continuation method provided in the embodiment of the present application further includes the following steps S906-S907:
S906, the television 2 (i.e., the second device) sends the data of the current dialog (i.e., the second historical dialogue data) to the dialogue server (i.e., the third device).
Taking the continuation scenario of human-computer interaction shown in fig. 8 as an example, after the television 2 responds to the received second instruction and executes the second task, the television 2 sends the data of dialog 3 (i.e., the second historical dialogue data, such as the data of dialog 3 shown in (b) of fig. 8) to the dialogue server for reference when a public device subsequently continues a task. As shown in (b) of fig. 8, the data of dialog 3 carries the identifier of the mobile phone 1, which identifies the user of the current dialog.
S907, the dialogue server (i.e., the third device) updates the historical dialogue data.
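For illustration, steps S906-S907 could be sketched as follows, reusing the hypothetical DialogRecord and DialogServer from the sketch above. The intent string, device name, and slot contents are assumptions mirroring the scenario, not values prescribed by the embodiment.

```python
# A minimal sketch of steps S906-S907: after responding, the second device
# tags the data of the current dialog with the identifiers of the detected
# private devices and sends it to the dialogue server for archiving.
from datetime import datetime

def archive_current_dialog(server: DialogServer, detected_ids: list[str]) -> None:
    server.archive(DialogRecord(
        occurred_at=datetime.now(),
        intent="resume_video",
        public_device="tv2",        # the second device
        user_devices=detected_ids,  # e.g. the identifier of the mobile phone 1
        slots={"video": "video 1"},
    ))
```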
In the human-computer interaction continuation method provided by the above scenario example one of the present application, when users interact with a public device (for example, user 1 interacts with the television 1 at 13:00 and user 2 interacts with the television 1 at 15:00), the public device (e.g., the television 1) tags the dialogs with the identifiers of the surrounding private devices (e.g., tags dialog 1 with the identifier of the mobile phone 1 and tags dialog 2 with the identifier of the mobile phone 2) for reference when a public device subsequently continues a task. For example, when the television 2 receives an instruction, it can retrieve the relevant historical dialogue data by detecting the surrounding private devices and using the identifier of the detected private device (e.g., the mobile phone 1) as a reference. The purpose of the instruction is then determined according to the relevant historical dialogue data (such as dialog 1) and the content of the instruction, so that the instruction can be responded to in time and cross-device human-computer interaction continuation can be completed accurately.
Scenario example two:
Referring to fig. 10, fig. 10 shows an example of a continuation scenario based on the human-computer interaction shown in fig. 4. As shown in (a) of fig. 10, it is assumed that at 14:00, when user 1 and user 2 are in the living room together, user 1 issues a voice instruction (i.e., a first instruction) to the television 1 (i.e., the first device) to play movie 1. At this time, the television 1 detects and finds the surrounding mobile phones 1 and 2, and stores the dialog record (e.g., dialog 1, i.e., the first historical dialogue data) in the dialogue server (i.e., the third device) in association with the mobile phones 1 and 2. Subsequently, a second voice instruction "please change another movie of his" is issued to the television 1.
Illustratively, the data of dialog 1 may be as shown in (b) of fig. 10. The data of dialog 1 shown in (b) of fig. 10 may be stored in the dialogue server. Dialog 1 is a historical dialog.
As shown in fig. 11, the continuation method of human-computer interaction applied to scenario example two may include the following steps S1101-S1107:
S1101, the television 1 (i.e., the second device) detects the surrounding private devices (i.e., one or more fourth devices) upon receiving the second instruction of the user.
When the television 1 receives the second instruction, the users to whom the private devices located around the television 1 belong are generally users related to the second instruction, such as the issuer of the second instruction or a participant in the second instruction. Thus, the television 1 can determine the users related to the second instruction from the detected surrounding private devices.
In the continuation scenario example shown in fig. 10, since user 1 and user 2 are in the living room together when the television 1 receives the second instruction "please change another movie of his", the television 1 can find the surrounding mobile phones 1 and 2 through detection. Further, the television 1 performs the following steps S1102-S1105:
S1102, the television 1 (i.e., the second device) obtains the identities of the detected mobile phone 1 and mobile phone 2 (i.e., the one or more fourth devices).
As an example, the identifier of the mobile phone 1 may be the MAC address, device ID, or the like of the mobile phone 1, and likewise for the mobile phone 2; the embodiment of the present application does not limit the specific identification manner.
S1103, the television 1 (i.e. the second device) obtains the relevant historical dialog data from the dialog server (i.e. the third device) according to the identifier of the mobile phone 1 and the identifier of the mobile phone 2.
Taking the continuation scenario of human-computer interaction shown in fig. 10 as an example, the dialogue server stores the data of dialog 1. The data of dialog 1 carries the identifier of the mobile phone 1 and the identifier of the mobile phone 2; that is, dialog 1 is related to both the mobile phone 1 and the mobile phone 2.
As shown in (a) of fig. 10, since the television 1 detects the surrounding mobile phones 1 and 2 when it receives the second instruction of the user, the television 1 acquires the relevant historical dialogue data from the dialogue server with the identifier of the mobile phone 1 and the identifier of the mobile phone 2 as references. Since the data of dialog 1 carries the identifier of the mobile phone 1 and the identifier of the mobile phone 2, the television 1 can acquire the data of dialog 1 as shown in (b) of fig. 10, that is, the relevant historical dialogue data. The data of dialog 1 characterizes the occurrence time of dialog 1: 14:00; the intent: play video; the public device: television 1; the users: mobile phone 1 and mobile phone 2; and the slot: movie 1 of actor A. That is, dialog 1 records that the user to whom the mobile phone 1 belongs and the user to whom the mobile phone 2 belongs interacted with the television 1 at 14:00 to instruct it to play movie 1 of actor A.
S1104, the television 1 (i.e. the second device) determines the purpose of the second instruction according to the relevant historical dialogue data and the content of the second instruction.
As an example, the television 1 may determine the purpose of the second instruction according to the intention and slot characterized by the data of dialog 1, together with the content of the second instruction.
As shown in (b) of fig. 10, the data of dialog 1 characterizes that the intention of dialog 1 is to play a video and the slot is movie 1 of actor A. Given that the content of the second instruction is "please change another movie of his", the television 1 can determine that "his" in the second instruction refers to actor A. Further, the television 1 may determine that the purpose of the second instruction is to change to another movie of actor A.
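As a non-limiting illustration, this reference resolution might be sketched as follows, reusing the hypothetical DialogRecord from scenario example one. The pronoun list and the "actor" slot key are assumptions made for illustration; the embodiment does not specify how the slot is structured.

```python
# A minimal sketch of resolving "his" in the second instruction against the
# slot of the relevant historical dialog (movie 1 of actor A).
def resolve_pronoun(instruction: str, latest: DialogRecord) -> str | None:
    if any(p in instruction.split() for p in ("he", "his", "him")):
        return latest.slots.get("actor")  # e.g. "actor A" from dialog 1
    return None
```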
S1105, the television 1 (i.e., the second device) executes the second task in response to the second instruction.
For example, the television set 1 executes the second task indicated by the second instruction according to the purpose of the second instruction.
Since the television 1 determined in step S1104, according to the relevant historical dialogue data and the content of the second instruction, that the purpose of the second instruction is to change to another movie of actor A, the television 1 executes the task indicated by the second instruction: it stops playing movie 1 and starts playing another movie of actor A, such as movie 2. For example, movie 2 satisfies one or more of the following conditions: a subject close to that of movie 1, a genre close to that of movie 1, the highest rating, and the like; the present application is not limited to these examples.
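For illustration, such a selection among the actor's other movies could be sketched as follows. The dictionary keys and the equal weighting of the three conditions are assumptions, not part of the embodiment.

```python
# A minimal sketch of the selection conditions listed above: among the other
# movies of actor A, prefer a candidate whose subject and genre are close to
# those of movie 1 and whose rating is highest.
def pick_next_movie(candidates: list[dict], current: dict) -> dict:
    def score(m: dict) -> float:
        s = float(m["rating"])                                   # highest score
        s += 1.0 if m["subject"] == current["subject"] else 0.0  # close subject
        s += 1.0 if m["genre"] == current["genre"] else 0.0      # close genre
        return s
    return max(candidates, key=score)
```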
Further, after the television 1 responds to the second instruction, the television 1 may also tag the current dialog with the identifiers of the detected fourth devices (i.e., the mobile phone 1 and the mobile phone 2) and upload it to the dialogue server for archiving. That is, as shown in fig. 11, the human-computer interaction continuation method provided in the embodiment of the present application further includes the following steps S1106-S1107:
S1106, the television 1 (i.e., the second device) sends the data of the current dialog (i.e., the second historical dialogue data) to the dialogue server (i.e., the third device).
Taking the continuation scenario of human-computer interaction shown in fig. 10 as an example, after the television 1 responds to the received second instruction, the television 1 sends the data of dialog 2 (i.e., the second historical dialogue data, such as the data of dialog 2 shown in (b) of fig. 10) to the dialogue server for reference when a public device subsequently continues a task. As shown in (b) of fig. 10, the data of dialog 2 carries the identifier of the mobile phone 1 and the identifier of the mobile phone 2, which identify the users of the current dialog.
S1107, the dialogue server (i.e., the third device) updates the historical dialogue data.
According to the human-computer interaction continuation method provided by the above scenario example two of the present application, when a user interacts with a public device while another user is present (for example, user 1 interacts with the television 1 at 14:00 while user 1 and user 2 are together), the public device (such as the television 1) tags the dialog with the identifiers of the surrounding private devices (for example, tags dialog 1 with the identifier of the mobile phone 1 and the identifier of the mobile phone 2) for reference when a public device subsequently continues a task. For example, when the television 1 receives an instruction, it can retrieve the relevant historical dialogue data by detecting the surrounding private devices and using the identifiers of the detected private devices (e.g., the mobile phone 1 and the mobile phone 2) as references. The purpose of the instruction is then determined from the relevant historical dialogue data (such as dialog 1) and the content of the instruction, so that the instruction can be responded to in time and cross-user human-computer interaction continuation can be completed accurately.
It should be noted that the continuation scenario example shown in fig. 10 and the human-computer interaction continuation method described above take the case where user 1 and user 2 are always together as an example. The human-computer interaction continuation method provided by the embodiment of the present application is still applicable to the situation where user 1 issues instruction 1 to the second device at a first moment, user 1 or user 2 issues instruction 2 to the second device at a second moment, user 1 and user 2 are together at the first moment, and user 1 and user 2 are not together at the second moment.
For the above case, it is assumed that the second device, when receiving instruction 2 of user 2 at the second moment, detects and obtains the identifier of the surrounding mobile phone 2 (where the mobile phone 2 belongs to user 2). The second device then retrieves the relevant historical dialogue data from the dialogue server according to the identifier of the mobile phone 2. Because the data of dialog 1 corresponding to instruction 1 carries the identifier of the mobile phone 1 and the identifier of the mobile phone 2, the second device can obtain the data of dialog 1 from the dialogue server according to the identifier of the mobile phone 2. Finally, the second device determines the purpose of instruction 2 from the data of dialog 1 and the content of instruction 2, so as to respond to instruction 2 in time. Further, the second device sends the data of the current dialog to the dialogue server for archiving.
Similarly, it is assumed that the second device, when receiving instruction 2 of user 1 at the second moment, detects and obtains the identifier of the surrounding mobile phone 1 (where the mobile phone 1 belongs to user 1). The second device then retrieves the relevant historical dialogue data from the dialogue server according to the identifier of the mobile phone 1. Because the data of dialog 1 corresponding to instruction 1 carries the identifier of the mobile phone 1 and the identifier of the mobile phone 2, the second device can obtain the data of dialog 1 from the dialogue server according to the identifier of the mobile phone 1. Finally, the second device determines the purpose of instruction 2 from the data of dialog 1 and the content of instruction 2, so as to respond to instruction 2 in time. Further, the second device sends the data of the current dialog to the dialogue server for archiving.
Similarly, the human-computer interaction continuing method provided by the embodiment of the application is still applicable to the case that the user 1 sends the instruction 1 to the second device at the first moment, the user 1 or the user 2 sends the instruction 2 to the second device at the second moment, the user 1 and the user 2 are not together at the first moment, and the user 1 and the user 2 are together at the second moment.
For the above situation, it is assumed that when the second device receives instruction 2 of user 2 at the second moment, the identifiers of the surrounding mobile phones 1 and 2 (where the mobile phone 1 belongs to user 1 and the mobile phone 2 belongs to user 2) are obtained through detection. The second device then retrieves the relevant historical dialogue data from the dialogue server according to the identifier of the mobile phone 1 or the mobile phone 2. Because the data of dialog 1 corresponding to instruction 1 carries the identifier of the mobile phone 1, the second device can obtain the data of dialog 1 from the dialogue server according to the identifier of the mobile phone 1. Finally, the second device determines the purpose of instruction 2 according to the data of dialog 1 and the content of instruction 2, so as to respond to instruction 2 in time. Further, the second device sends the data of the current dialog to the dialogue server for archiving.
It is assumed that when the second device receives the instruction 2 of the user 1 at the second time, the identifiers of the surrounding mobile phones 1 and 2 (where the mobile phone 1 belongs to the user 1 and the mobile phone 2 belongs to the user 2) are obtained through detection. The second device then retrieves the relevant historical dialogue data from the dialogue server according to the identity of the handset 1 or the handset 2. Because the data of the dialog 1 corresponding to the instruction 1 carries the identifier of the mobile phone 1, the second device can obtain the data of the dialog 1 from the dialog server according to the identifier of the mobile phone 1. Finally, the second device determines the purpose of the instruction 2 based on the data of the dialog 1 and the content of the instruction 2, so that the instruction 2 is responded to in time. Further, the second device sends the data of the current session to the session server for archiving.
Scenario example three:
Referring to fig. 12, fig. 12 shows an example of a continuation scenario based on the human-computer interaction shown in fig. 4. As shown in (a) of fig. 12, it is assumed that at 14:00, when user 1 and user 2 are in the living room together, user 1 issues a voice instruction (i.e., a first instruction) to the television 1 (i.e., the first device) to play movie 1. At this time, the television 1 detects and finds the surrounding mobile phones 1 and 2, and stores the dialog record (e.g., dialog 1, i.e., the first historical dialogue data) in the dialogue server (i.e., the third device) in association with the mobile phones 1 and 2. Then, user 2 goes to the bedroom, and a second voice instruction "please change another movie of his" is issued to the television 2 (i.e., the second device).
Illustratively, the data of dialog 1 may be as shown in (b) of fig. 12. The data of dialog 1 shown in (b) of fig. 12 may be stored in the dialogue server. Dialog 1 is a historical dialog.
As shown in fig. 13, the continuation method of human-computer interaction applied to scenario example three may include the following steps S1301-S1307:
S1301, the television 2 (i.e., the second device) detects the surrounding private devices (i.e., one or more fourth devices) when receiving the second instruction of the user.
In the continuation scenario example shown in fig. 12, since user 2 is in the bedroom when the television 2 receives the second instruction "please change another movie of his", the television 2 can find the surrounding mobile phone 2 through detection. Further, the television 2 executes the following steps S1302-S1305:
S1302, the television 2 (i.e., the second device) obtains the identity of the detected mobile phone 2 (i.e., the one or more fourth devices).
As an example, the identifier of the mobile phone 2 may be the MAC address, device ID, or the like of the mobile phone 2; the embodiment of the present application does not limit the specific identification manner.
S1303, the television 2 (i.e. the second device) obtains the relevant historical dialog data from the dialog server (i.e. the third device) according to the identifier of the mobile phone 2.
Taking the continuation scenario of human-computer interaction shown in fig. 12 as an example, the dialogue server stores the data of dialog 1.
As shown in (a) of fig. 12, since the television 2 detects the surrounding mobile phone 2 when it receives the second instruction of the user, the television 2 acquires the relevant historical dialogue data from the dialogue server with the identifier of the mobile phone 2 as a reference. Since the data of dialog 1 is related to both the mobile phone 1 and the mobile phone 2, the television 2 can acquire the data of dialog 1 as shown in (b) of fig. 12, that is, the relevant historical dialogue data. The data of dialog 1 characterizes the occurrence time of dialog 1: 14:00; the intent: play video; the public device: television 1; the users: mobile phone 1 and mobile phone 2; and the slot: movie 1 of actor A. That is, dialog 1 records that the user to whom the mobile phone 1 belongs and the user to whom the mobile phone 2 belongs interacted with the television 1 at 14:00 to instruct it to play movie 1 of actor A.
S1304, the television 2 (i.e., the second device) determines the purpose of the second instruction according to the relevant historical dialog data and the content of the second instruction.
As shown in (b) of fig. 12, the data of dialog 1 characterizes that the intention of dialog 1 is to play a video and the slot is movie 1 of actor A. The television 2 may therefore determine that "his" in the second instruction "please change another movie of his" refers to actor A, and that the purpose of the second instruction is to change to another movie of actor A.
S1305, the television 2 (i.e., the second device) executes the second task in response to the second instruction.
For example, the television set 2 executes the second task indicated by the second instruction according to the purpose of the second instruction.
Since the television 2 determined in step S1304, according to the relevant historical dialogue data and the content of the second instruction, that the purpose of the second instruction is to change to another movie of actor A, the television 2 executes the task indicated by the second instruction: it stops playing movie 1 and starts playing another movie of actor A, such as movie 2. For example, movie 2 satisfies one or more of the following conditions: a subject close to that of movie 1, a genre close to that of movie 1, the highest rating, and the like; the present application is not limited to these examples.
Further, after the television 2 responds to the second instruction, the television 2 may also tag the current dialog with the identifier of the detected fourth device (i.e., the mobile phone 2) and upload it to the dialogue server for archiving. That is, as shown in fig. 13, the human-computer interaction continuation method provided in the embodiment of the present application further includes the following steps S1306-S1307:
S1306, the television 2 (i.e., the second device) sends the data of the current dialog (i.e., the second historical dialogue data) to the dialogue server (i.e., the third device).
Taking the continuation scenario of human-computer interaction shown in fig. 12 as an example, after the television 2 responds to the received second instruction, the television 2 sends the data of dialog 2 (i.e., the second historical dialogue data, such as the data of dialog 2 shown in (b) of fig. 12) to the dialogue server for reference when a public device subsequently continues a task. As shown in (b) of fig. 12, the data of dialog 2 carries the identifier of the mobile phone 2, which identifies the user of the current dialog.
S1307, the dialogue server (i.e., the third device) updates the historical dialogue data.
According to the human-computer interaction continuation method provided by the above scenario example three of the present application, when a user interacts with a public device while another user is present (for example, user 1 interacts with the television 1 at 14:00 while user 1 and user 2 are together), the public device (such as the television 1) tags the dialog with the identifiers of the surrounding private devices (for example, tags dialog 1 with the identifier of the mobile phone 1 and the identifier of the mobile phone 2) for reference when a public device subsequently continues a task. For example, when the television 2 receives an instruction, it can retrieve the relevant historical dialogue data by detecting the surrounding private devices and using the identifier of the detected private device (e.g., the mobile phone 2) as a reference. The purpose of the instruction is then determined from the relevant historical dialogue data (such as dialog 1) and the content of the instruction, so that the instruction can be responded to in time and cross-device, cross-user human-computer interaction continuation can be completed accurately.
It should be understood that the various aspects of the embodiments of the present application may be combined reasonably, and the explanation or description of each term appearing in the embodiments may be referred to or used in the other embodiments, without limitation.
It should also be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not imply any order of execution, and the order of execution of the processes should be determined by their functions and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
It is understood that, in order to implement the functions of any of the above embodiments, the terminal device (including the first device and the second device) includes a hardware structure and/or a software module for executing each function. Those skilled in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends upon the particular application and design constraints of the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, the terminal device (including the first device and the second device) may be divided into functional modules; for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. It should be noted that the division of modules in the embodiment of the present application is schematic and is only one kind of logical function division; there may be other division manners in actual implementation.
For example, in the case where the functional modules are divided in an integrated manner, fig. 14 shows a block diagram of a terminal device provided in the embodiment of the present application. As shown in fig. 14, the terminal device may include an instruction detecting unit 1410, a processing unit 1420, a device discovery unit 1430, a transceiving unit 1440, and a storage unit 1450.
The instruction detection unit 1410 is configured to support the terminal device to perform instruction detection, such as performing voice instruction or text instruction detection. The processing unit 1420 is configured to support the terminal device to determine a user related to the instruction after the instruction detecting unit 1410 detects the instruction, obtain related historical dialog data according to the user related to the instruction, perform a corresponding task according to the related historical dialog data and the instruction, and/or perform other processes related to the embodiments of the present application. The device discovery unit 1430 is used to support peripheral device (e.g., private device) detection by the terminal device and/or other processes related to embodiments of the present application. The transceiving unit 1440 is used for transmitting and receiving radio signals, for example, for supporting the terminal device to receive historical dialogue data from other devices (e.g., a third device), and/or other processes related to embodiments of the present application. The storage unit 1450 is used for supporting the terminal device to store historical dialogue data, to store computer programs and processing data and/or processing results for implementing the methods provided by the embodiments of the present application, and/or to perform other processes related to the embodiments of the present application.
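As a non-limiting illustration, the cooperation of these units when an instruction arrives might be sketched as follows. All class and method names are assumptions mirroring the description above; the embodiment does not define such an interface.

```python
# A minimal sketch of how the units of fig. 14 could cooperate on an
# incoming instruction; names are hypothetical placeholders.
class TerminalDevice:
    def __init__(self, detector, processor, discovery, transceiver, storage):
        self.detector = detector        # instruction detecting unit 1410
        self.processor = processor      # processing unit 1420
        self.discovery = discovery      # device discovery unit 1430
        self.transceiver = transceiver  # transceiving unit 1440
        self.storage = storage          # storage unit 1450

    def on_input(self, raw):
        instruction = self.detector.detect(raw)                # e.g. a voice instruction
        nearby_ids = self.discovery.scan()                     # surrounding private devices
        history = self.transceiver.fetch_history(nearby_ids)   # from the third device
        self.processor.respond(history, instruction)           # determine purpose, execute
        self.storage.save(history)                             # keep dialogue data locally
```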
As an example, the transceiving unit 1440 may include a radio frequency circuit. Specifically, the terminal device may receive and transmit a wireless signal through the radio frequency circuit. Typically, the radio frequency circuitry includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency circuitry may also communicate with other devices via wireless communication. The wireless communication may use any communication standard or protocol including, but not limited to, global system for mobile communications, general packet radio service, code division multiple access, wideband code division multiple access, long term evolution, email, short message service, and the like.
It should be understood that the respective modules in the terminal device may be implemented in software and/or hardware, which is not particularly limited. In other words, the electronic device is presented in the form of functional modules. A "module" herein may refer to an application-specific integrated circuit (ASIC), a circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or other devices that can provide the described functionality.
In an alternative approach, when the data transfer described above is implemented using software, it may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are implemented in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center through wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware or in software instructions executed by a processor. The software instructions may consist of corresponding software modules that may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. Additionally, the ASIC may reside in an electronic device. Of course, the processor and the storage medium may also reside as discrete components in a terminal device.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions.

Claims (31)

1. A continuation method of human-computer interaction is characterized in that the method comprises the following steps:
the method comprises the steps that a first device receives a first instruction, executes a first task indicated by the first instruction, and generates first historical conversation data, wherein the first historical conversation data is related to a first user;
a second device, when receiving a second instruction, determines that the second instruction is related to the first user;
the second device obtains historical conversation data related to the first user, wherein the historical conversation data comprises the first historical conversation data;
and the second device executes a second task according to the first historical conversation data and the second instruction.
2. The method of claim 1, wherein the first historical conversation data being related to the first user comprises:
the first instruction corresponding to the first historical conversation data is sent by the first user; or,
when the first device receives the first instruction corresponding to the first historical conversation data, the first device detects the first user.
3. The method of claim 2, further comprising:
the first device detects a personal portable device of the first user upon receiving the first instruction.
4. The method of any of claims 1-3, wherein the first historical conversation data is stored in a third device, and wherein the second device obtains historical conversation data associated with the first user, comprising:
the second device obtains historical dialog data associated with the first user from the third device.
5. The method of any of claims 1-3, wherein the first historical conversation data is stored in the first device, and wherein the second device obtains historical conversation data associated with the first user, comprising:
the second device obtains the historical conversation data associated with the first user from the first device.
6. The method of any of claims 1-5, wherein the second device, upon receiving a second instruction, determining that the second instruction is relevant to the first user comprises:
when the second device receives the second instruction, detecting one or more surrounding fourth devices to determine a user related to the second instruction; the one or more fourth devices comprise a personal portable device of the first user.
7. The method of claim 6, wherein the second device obtains historical conversation data associated with the first user, comprising:
the second device obtains the historical conversation data associated with the first user based on the identification of the personal portable device of the first user.
8. The method of any of claims 1-7, wherein the first historical conversation data characterizes the first task as playing a first program, and the second instruction is used to instruct to continue playing the first program; the second device executes a second task according to the first historical conversation data and the second instruction, including:
and the second device continues the playing progress of the first device for the first program according to the first historical conversation data and the second instruction.
9. The method according to any one of claims 1-7, wherein the first historical conversation data characterizes that the first task is playing a first program, and the second instruction is used for instructing switching of the played program; the second device executes a second task according to the first historical conversation data and the second instruction, comprising:
and the second device plays the next program of the first program played by the first device according to the first historical conversation data and the second instruction.
10. The method according to any of claims 1-9, characterized in that the second device is identical to the first device.
11. The method of any of claims 1-10, wherein the first instruction and the second instruction are voice instructions or text instructions.
12. The method of any of claims 1-11, wherein the first historical conversation data is further associated with a second user.
13. The method of claim 12, further comprising:
the fourth device determines that the third instruction is related to the second user when receiving the third instruction;
the fourth device obtaining historical conversation data related to the second user, wherein the historical conversation data comprises the first historical conversation data;
and the fourth device executes a third task according to the first historical conversation data and the third instruction.
14. The method of any of claims 1-13, wherein the first historical conversation data is at least used to characterize an occurrence time of the historical conversation corresponding to the first task, an intent of the historical conversation, an identification of the first device, and a user and a slot related to the historical conversation.
15. A human-computer interaction continuation method is characterized by comprising the following steps:
a second device, when receiving a second instruction, determines that the second instruction is related to a first user;
the second device acquires historical conversation data related to the first user, wherein the historical conversation data comprises first historical conversation data, the first historical conversation data is generated by a first device after receiving a first instruction and executing a first task indicated by the first instruction, and the first historical conversation data is related to the first user;
and the second equipment executes a second task according to the first historical conversation data and the second instruction.
16. The method of claim 15, wherein the first historical conversation data is stored in a third device, and wherein the second device obtains historical conversation data associated with the first user, comprising:
the second device obtains the historical conversation data associated with the first user from the third device.
17. The method of claim 15, wherein the first historical conversation data is stored in the first device, and wherein the second device obtains historical conversation data associated with the first user, comprising:
the second device obtains the historical dialog data associated with the first user from the first device.
18. The method of any of claims 15-17, wherein the second device, upon receiving a second instruction, determining that the second instruction is relevant to the first user comprises:
the second device detects one or more surrounding fourth devices when receiving the second instruction to determine a user related to the second instruction; the one or more fourth devices comprise a personal portable device of the first user.
19. The method of claim 18, wherein the second device obtaining historical conversation data associated with the first user comprises:
the second device obtains the historical conversation data associated with the first user based on the identification of the personal portable device of the first user.
20. The method of any of claims 15-19, wherein the first historical conversation data characterizes the first task as playing a first program, and the second instruction is used to instruct to continue playing the first program; the second device executes a second task according to the first historical conversation data and the second instruction, including:
and the second device continues the playing progress of the first device for the first program according to the first historical conversation data and the second instruction.
21. The method according to any of claims 15-19, wherein the first historical conversation data characterizes that the first task is playing a first program, and the second instruction is used for instructing switching of the played program; the second device performs a second task according to the first historical conversation data and the second instruction, comprising:
and the second device plays the next program of the first program played by the first device according to the first historical conversation data and the second instruction.
22. The method according to any of claims 15-21, wherein the second device is the same as the first device.
23. The method of any of claims 15-22, wherein the second instruction is a voice instruction or a text instruction.
24. The method of any of claims 15-23, wherein after the second device performs the second task, the method further comprises:
the second device generates second historical conversation data, the second historical conversation data being related to the first user.
25. The method of claim 24, wherein the second instruction is further associated with a third user.
26. A second device, characterized in that the second device comprises:
a memory for storing a computer program;
a transceiver for receiving or transmitting a radio signal;
a processor for executing the computer program such that the second device implements the method of any of claims 15-25.
27. A communication system, the communication system comprising: a first device and a second device, the communication system being adapted to implement the method of any of claims 1-14.
28. The communication system of claim 27, further comprising a third device.
29. A computer-readable storage medium, having stored thereon computer program code, which, when executed by a processing circuit, implements the method of any of claims 15-25.
30. A chip system, comprising a processing circuit and a storage medium, the storage medium having computer program code stored therein; the computer program code realizes the method of any of claims 15-25 when executed by the processing circuit.
31. A computer program product for running on a computer to implement the method of any one of claims 15-25.