CN111930229B - Man-machine interaction method and device and electronic equipment - Google Patents

Info

Publication number
CN111930229B
CN111930229B (application number CN202010714175.4A)
Authority
CN
China
Prior art keywords
information
question
multimedia
reply
question information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010714175.4A
Other languages
Chinese (zh)
Other versions
CN111930229A (en
Inventor
杨润润
胡新星
宋禹君
袁焕
林静芝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202010714175.4A priority Critical patent/CN111930229B/en
Publication of CN111930229A publication Critical patent/CN111930229A/en
Application granted granted Critical
Publication of CN111930229B publication Critical patent/CN111930229B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • G06F16/435Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • G06F16/438Presentation of query results
    • G06F16/4387Presentation of query results by the use of playlists
    • G06F16/4393Multimedia presentations, e.g. slide shows, multimedia albums

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the present disclosure disclose a man-machine interaction method, a man-machine interaction device, and an electronic device. One embodiment of the method comprises: outputting multimedia information of first question information based on the playing of a multimedia file, wherein the multimedia file comprises multimedia information of at least one piece of question information; receiving first reply information fed back by a user for the first question information; determining whether the first reply information matches the first question information; and executing a target operation based on whether the first reply information matches the first question information. This implementation provides a new method of man-machine question-and-answer interaction.

Description

Man-machine interaction method and device and electronic equipment
Technical Field
The embodiment of the disclosure relates to the technical field of computers, in particular to a human-computer interaction method, a human-computer interaction device and electronic equipment.
Background
With the development of computer technology, users can perform question-answer interaction through terminal equipment.
In the related art, a user may ask a question to a terminal device, and further, the terminal device may reply according to the question of the user.
Disclosure of Invention
This disclosure is provided to introduce concepts in a simplified form that are further described below in the detailed description. This disclosure is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The embodiment of the disclosure provides a man-machine interaction method, a man-machine interaction device and electronic equipment, and provides a new man-machine question-answer interaction method.
In a first aspect, an embodiment of the present disclosure provides a human-computer interaction method, the method comprising executing a first recording step: outputting multimedia information of first question information based on the playing of a multimedia file, wherein the multimedia file comprises multimedia information of at least one piece of question information; receiving first reply information fed back by a user for the first question information; determining whether the first reply information matches the first question information; and executing a target operation based on whether the first reply information matches the first question information.
In a second aspect, an embodiment of the present disclosure provides a human-computer interaction device, including: the output unit is used for outputting the multimedia information of the first question information based on the playing of a multimedia file, wherein the multimedia file comprises the multimedia information of at least one question information; the receiving unit is used for receiving first reply information fed back by a user aiming at the first question information; a determining unit configured to determine whether the first reply information and the first question information match; and the execution unit is used for executing the target operation based on whether the first reply information is matched with the first question information.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the human-computer interaction method as described in the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium, on which a computer program is stored, which when executed by a processor, implements the steps of the human-computer interaction method according to the first aspect.
The man-machine interaction method, the man-machine interaction device, and the electronic device can output the multimedia information of the first question information based on the playing of the multimedia file. In practice, the multimedia file contains multimedia information of at least one piece of question information. Further, first reply information fed back by the user for the first question information may be received. Still further, it may be determined whether the first reply information matches the first question information. Finally, the target operation may be executed based on whether the first reply information and the first question information match. This realizes question-and-answer interaction in which the terminal device initiates a question and the user replies, thereby providing a new method of man-machine question-and-answer interaction.
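The four steps summarized above (output a question, receive a reply, match, act) can be sketched as a minimal interaction loop. This is an illustrative sketch only: the function and callback names are hypothetical, and the matching function and target operations are placeholders for the concrete mechanisms described later in this document.

```python
def run_interaction(questions, get_user_reply, matches, on_match, on_mismatch):
    """Play each question, collect a reply, and branch on the match result."""
    for question in questions:            # output multimedia info of the question
        reply = get_user_reply(question)  # receive reply fed back by the user
        if matches(reply, question):      # determine whether reply and question match
            on_match(question, reply)     # e.g. store the pair in a predetermined file
        else:
            on_mismatch(question, reply)  # e.g. stop playback or re-ask
```

A caller would supply the playback, input, and matching behavior through the five callbacks, which keeps the loop itself independent of how the multimedia file is actually played.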
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
FIG. 1 is a flow diagram of some embodiments of a human-machine interaction method according to the present disclosure;
FIG. 2 is a schematic diagram of one application scenario of a human-computer interaction method according to the present disclosure;
FIG. 3 is a flow diagram of still further embodiments of human-computer interaction methods according to the present disclosure;
FIG. 4 is a schematic block diagram of some embodiments of a human-computer interaction device according to the present disclosure;
FIG. 5 is an exemplary system architecture to which the human-computer interaction methods of some embodiments of the present disclosure may be applied;
fig. 6 is a schematic diagram of a basic structure of an electronic device provided in accordance with some embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Referring to FIG. 1, a flow diagram of some embodiments of a human-machine interaction method according to the present disclosure is shown. As shown in fig. 1, the man-machine interaction method includes the following steps:
step 101, outputting the multimedia information of the first question information based on the playing of the multimedia file.
In this embodiment, an execution subject of the human-computer interaction method (e.g., terminal devices 501 and 502 shown in fig. 5) may output the multimedia information of the first question information based on the playing of the multimedia file.
The multimedia file comprises multimedia information of at least one question message. In practice, the multimedia file may be an audio file or a video file.
It is understood that the question information may be information for asking a question to the user.
In some scenarios, the execution subject may play the multimedia file in response to a play operation performed by the user on the multimedia file. During the playing of the multimedia file, the execution subject may output the multimedia information of the first question information.
And 102, receiving first reply information fed back by the user aiming at the first question information.
In this embodiment, after outputting the multimedia information of the first question information, the user may feed back the first reply information with respect to the first question information.
It is understood that reply information is information with which the user replies to question information. Accordingly, the first reply information is the information with which the user replies to the first question information.
In some scenarios, the reply information may be text entered by the user. The execution subject may therefore receive the text entered by the user for the first question information as the first reply information fed back by the user.
In this embodiment, the execution subject may receive the first reply information fed back by the user for the first question information.
Step 103, determining whether the first reply information and the first question information are matched.
In this embodiment, the execution subject may determine whether the first reply information and the first question information match.
In some scenarios, the execution subject may extract keywords from the first reply information. Further, the execution subject may determine the similarity between the extracted keywords and the first question information. In response to the similarity being greater than or equal to a preset threshold, the execution subject may determine that the first reply information matches the first question information; otherwise, the execution subject may determine that the first reply information and the first question information do not match.
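The keyword-and-threshold check described above can be sketched as follows. This is a minimal stand-in, not the patent's actual model: `token_set` is a crude substitute for real keyword extraction, token-overlap (Jaccard) similarity substitutes for whatever similarity measure an implementation uses, and the threshold value is arbitrary.

```python
def token_set(text):
    """Lowercase word tokens; a crude stand-in for real keyword extraction."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def replies_match(reply, question, threshold=0.1):
    """Compare token-overlap (Jaccard) similarity against a preset threshold."""
    r, q = token_set(reply), token_set(question)
    if not r or not q:
        return False
    return len(r & q) / len(r | q) >= threshold
```

For example, `replies_match("My name is Mary", "What is your name")` exceeds the threshold via the shared tokens, while an unrelated reply does not.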
And 104, executing the target operation based on whether the first reply information is matched with the first question information.
In this embodiment, the execution subject may execute the target operation based on whether the first reply information and the first question information match.
In some scenarios, the execution subject may stop playing the multimedia file in response to the first reply information not matching the first question information. That is, the output of the multimedia information of the question information is stopped.
In some scenarios, the execution subject may store the first question information and the first reply information in a predetermined file in response to the first reply information matching the first question information.
As described in the Background section, in the related art the user first asks a question to the terminal device, and the terminal device then replies according to the user's question. That is, the related art realizes only question-and-answer interaction in which the user initiates the question and the terminal device replies, so the flexibility of man-machine question-and-answer interaction is, to a certain extent, low.
In this embodiment, the multimedia information of the first question information is output based on the playing of the multimedia file; first reply information fed back by the user for the first question information is then received; whether the first reply information matches the first question information is determined; and finally the target operation is executed based on whether they match. This realizes question-and-answer interaction in which the terminal device initiates a question and the user replies, providing a new method of man-machine question-and-answer interaction.
In some embodiments, the execution subject may determine whether the first reply information and the first question information match in the following manner.
First, the reference reply information of the first question information is queried from a preset database according to a keyword set for the first question information.
The keyword may be a word for determining a matching reply message. In practice, the keywords may be preset for the questioning information, or may be extracted from the questioning information.
The database is used for storing the reference reply information. It will be understood that the reference reply information may be used as a reference for determining whether the reply information matches the question information.
In some scenarios, in a preset database, keywords set for the question information and reference reply information of the question information are stored in association. Therefore, the execution subject may query the reference reply information stored in association with the first question information from the database.
Second, whether the first reply information matches the first question information is determined based on the queried reference reply information.
In some scenarios, the execution subject may determine whether the queried reference reply information includes the first reply information. In response to the queried reference reply information including the first reply information, the execution subject may determine that the first reply information matches the first question information; otherwise, it may determine that they do not match.
In some scenarios, the execution subject may determine whether the queried reference reply information contains any entry whose similarity to the first reply information is greater than or equal to a preset similarity threshold. If such reference reply information exists, the execution subject may determine that the first reply information matches the first question information; if not, it may determine that the first reply information and the first question information do not match.
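Both matching strategies just described (exact containment in the reference replies, or similarity above a preset threshold) can be sketched against a toy in-memory "database". Everything here is hypothetical: the dictionary stands in for the preset database keyed by question keywords, and the character-set similarity is a crude placeholder for a real similarity measure.

```python
# Hypothetical in-memory stand-in for the preset database: keyword -> reference replies.
REFERENCE_REPLIES = {"name": ["My name is Mary", "I am Mary"]}

def char_similarity(a, b):
    """Crude character-set similarity; a placeholder for a real similarity model."""
    sa, sb = set(a.lower()), set(b.lower())
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 0.0

def matches_reference(reply, keyword, exact=True, threshold=0.8):
    """Exact containment check, or similarity-vs-threshold check, per the two scenarios."""
    refs = REFERENCE_REPLIES.get(keyword, [])
    if exact:
        return reply in refs
    return any(char_similarity(reply, ref) >= threshold for ref in refs)
```

The `exact` flag selects between the two scenarios; a real implementation would query a persistent store in which keywords and reference replies are stored in association.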
In these embodiments, whether the first reply information fed back by the user matches the first question information is determined through the reference reply information queried from the preset database.
In some embodiments, the execution subject may execute the target operation as follows.
Specifically, in response to the first reply information not matching the first question information, the multimedia information of the first question information is output again based on the playing of the multimedia file, so that the user can feed back first reply information for the first question information again.
It is understood that the first reply information not matching the first question information means that the reply fed back by the user is incorrect.
In some scenarios, the execution subject may re-output the multimedia information of the first question information in response to a play operation performed by the user on the multimedia file. The user may then feed back first reply information for the first question information again.
Therefore, if the first reply information fed back does not match the first question information, the user can feed back first reply information for the first question information again.
In some embodiments, the execution subject may execute the target operation as follows.
Specifically, in response to that the first reply information is matched with the first question information and the multimedia file is not played completely, the multimedia information of the second question information is output based on the playing of the multimedia file, so that the user feeds back the second reply information of the second question information.
In practice, while receiving the reply information fed back by the user, the execution subject may pause playing the multimedia file. That is, the presentation of question information is suspended while the user feeds back reply information, so the man-machine question-and-answer interaction can proceed in an orderly fashion.
In some scenarios, the execution subject may continue to play the multimedia file in response to a play operation performed by the user on the multimedia file. During the playing of the multimedia file, the execution subject may output the multimedia information of the second question information, and the user may then feed back second reply information for the second question information.
Therefore, if the first reply information fed back by the user matches the first question information and the multimedia file has not finished playing, the user can feed back second reply information for the second question information, realizing another round of man-machine question-and-answer interaction.
In some embodiments, the execution subject may further display the first prompt message.
The first prompt information may characterize at least one of the following: that the reply information fed back by the user does not match the question information given by the predetermined character; and that the user should feed back reply information for the question information given by the predetermined character again.
Therefore, displaying the first prompt information prompts the user that the fed-back reply information does not match the question information, and/or prompts the user to feed back reply information again.
In some embodiments, the execution subject may further display a second prompt message.
The second prompt information is used for prompting the reference reply information of the question information.
Therefore, displaying the second prompt information prompts the user to feed back reply information that matches the question information. It is understood that when the user does not know how to reply, presenting the second prompt information guides the user to reply, which improves the user experience of the question-and-answer interaction.
In some embodiments, the execution subject may present the second prompt message as follows.
First, the target language for displaying the second prompt information is determined according to a language selection operation executed by the user.
The language selection operation may be an operation of selecting a language for determining a target language. The target language may be a language for presenting the second prompt message. In practice, the target language may be various languages such as english, chinese, japanese, etc.
In some scenarios, the language selection operation may be a trigger operation performed by the user on a control for selecting a language. The execution subject may then take the language indicated by the triggered control as the target language.
Second, the second prompt information is displayed in the target language.
In some scenarios, the execution subject may select the second prompt information in the target language from second prompt information prepared in various languages, and then display it.
In these embodiments, the target language of the second prompt information is determined according to the language selection operation performed by the user, and the second prompt information is displayed in that language, which improves the flexibility of displaying the second prompt information.
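The language selection just described amounts to a lookup from a per-language prompt table, as sketched below. The table contents and function names are hypothetical; the languages shown are the examples named in the text (English, Chinese, Japanese), and a default language is assumed as a fallback.

```python
# Hypothetical per-language table of second prompt information.
SECOND_PROMPTS = {
    "en": "Try answering: My name is ...",
    "zh": "试着回答：My name is ...",
    "ja": "答えてみましょう：My name is ...",
}

def second_prompt(target_language, default="en"):
    """Return the second prompt information in the user-selected target language."""
    return SECOND_PROMPTS.get(target_language, SECOND_PROMPTS[default])
```

The `target_language` argument would come from the control the user triggered; unsupported selections fall back to the default rather than failing.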
Referring to fig. 2, an application scenario of a human-computer interaction method according to an embodiment of the present disclosure is shown. As shown in fig. 2, the terminal device 201 may output the first question information 203 to the user based on the playing of the multimedia file 202. In practice, the multimedia file 202 contains multimedia information of at least one piece of question information. Further, the terminal device 201 may receive the first reply information 204 fed back by the user for the first question information 203. Still further, the terminal device 201 may determine whether the first reply information 204 and the first question information 203 match. Finally, the terminal device 201 may execute the target operation 205 based on whether the first reply information 204 and the first question information 203 match. Question-and-answer interaction between the user and the terminal device 201 is thereby realized.
Continuing to refer to FIG. 3, a flow chart of still further embodiments of human-machine interaction methods according to the present disclosure is shown. As shown in fig. 3, the man-machine interaction method includes the following steps:
step 301, outputting the multimedia information of the first question information based on the playing of the multimedia file.
In this embodiment, the multimedia information contained in the multimedia file indicates a predetermined character to give question information. That is, the multimedia information contained in the multimedia file may be multimedia information in which a predetermined character gives question information.
The predetermined roles may be various roles defined in advance. For example, the predetermined character may be a human or an animal.
Step 302, receiving first reply information fed back by the user aiming at the first question information.
Step 303, determining whether the first reply message and the first question message match.
And step 304, executing the target operation based on whether the first reply information is matched with the first question information.
Step 301, step 302, step 303, and step 304 may be performed in a similar manner as step 101, step 102, step 103, and step 104 in the embodiment shown in fig. 1, and the above description for step 101, step 102, step 103, and step 104 also applies to step 301, step 302, step 303, and step 304, and is not repeated here.
Step 305, in the process of outputting the multimedia information, recording first multimedia information in which the predetermined character gives the question information.
In this embodiment, while outputting the multimedia information, the execution subject of the man-machine interaction method (e.g., terminal devices 501 and 502 shown in fig. 5) may record the first multimedia information in which the predetermined character gives the question information.
It is understood that the first multimedia information is the multimedia information in which the predetermined character gives the question information.
It is understood that recording the first multimedia information means recording the process of the predetermined character giving the question information.
Step 306, in the process of the user feeding back reply information, recording second multimedia information in which the user feeds back the reply information.
In this embodiment, in the process of the user feeding back the reply information, the execution subject may record the second multimedia information in which the user feeds back the reply information.
It is understood that the second multimedia information is the multimedia information in which the user feeds back the reply information.
It is understood that recording the second multimedia information means recording the process of the user feeding back the reply information.
In this embodiment, when the reply information fed back by the user does not match the question information given by the predetermined character, the second recording step is re-executed, so that the second voice and the second picture of the user feeding back reply information can be re-recorded. This ensures that, in the finally recorded target, the reply information fed back by the user matches the question information given by the predetermined character.
In some embodiments, the first multimedia message includes a first voice and a first picture of the question information given by the predetermined character, and the second multimedia message includes a second voice and a second picture of the user feedback reply information.
It is understood that the first voice may be a voice in which a predetermined character gives question information. The first screen may be a screen in which a predetermined character gives question information. For example, the question information may be "What's your name", the first voice may be a voice in which the predetermined character gives "What's your name", and the first screen may be a screen in which the predetermined character gives "What's your name".
It is understood that the second voice may be a voice in which the user feeds back the reply information. The second screen may be a screen on which the user feeds back the reply information. For example, the reply information may be "My name is Mary", the second voice may be a voice of the user feedback "My name is Mary", and the second screen may be a screen of the user feedback "My name is Mary".
In practice, recording the first multimedia information means recording the first picture and the first voice of the predetermined character giving the question information. Recording the second multimedia information means recording the second picture and the second voice of the user feeding back the reply information.
In some embodiments, the first frame is displayed in a first display area of the screen, and the second frame is displayed in a second display area of the screen.
Thus, a screen in which a predetermined character gives question information can be presented in the first presentation area in the screen. And, the picture of the user feedback reply information can be displayed in a second display area in the screen.
In some embodiments, the recording of the first picture and the recording of the second picture are done by screen recording. That is, the first picture and the second picture are recorded by screen recording.
As described above, the first display area and the second display area in the screen display the first picture and the second picture respectively. Therefore, recording the first picture by screen recording also records the second picture shown on the screen, and likewise recording the second picture by screen recording also records the first picture displayed on the screen. The finally recorded target video can therefore simultaneously display the first picture of the predetermined character giving the question information and the second picture of the user feeding back the reply information.
In some embodiments, the second picture is captured by a camera and the second voice is captured by a microphone.
In these embodiments, the second picture of the user feeding back the reply information is captured by a communicatively connected camera, and the second voice of the user feeding back the reply information is captured by a communicatively connected microphone. The recording of the user's feedback of the reply information is thus completed by recording the second picture captured by the camera and the second voice captured by the microphone.
In some embodiments, the execution body may further perform the following step: in response to the end of playing of the multimedia file, generating the target multimedia file based on the recording of the first multimedia information and the recording of the second multimedia information.
In some scenarios, the execution body may use the multimedia file in which the first multimedia information and the second multimedia information are recorded as the target multimedia file.
It is understood that generating the target multimedia file means generating a multimedia file that contains the question-answer interaction between the predetermined character and the user.
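As a rough sketch of this step, the recorded segments can be accumulated and then assembled when playback ends. The class name `InteractionRecorder` is illustrative, and the byte concatenation is a stand-in for real audio/video muxing, which the disclosure does not specify.

```python
# Illustrative sketch: segments recorded during the interaction (the
# predetermined character giving questions, the user feeding back replies)
# are assembled into one target multimedia file when playback ends.
# Treating segments as opaque bytes is an assumption made for brevity; a
# real implementation would mux audio and video streams.
class InteractionRecorder:
    def __init__(self) -> None:
        self.segments: list[bytes] = []

    def record_first_multimedia(self, data: bytes) -> None:
        """Record a segment of the predetermined character giving question information."""
        self.segments.append(data)

    def record_second_multimedia(self, data: bytes) -> None:
        """Record a segment of the user feeding back reply information."""
        self.segments.append(data)

    def on_playback_end(self) -> bytes:
        """Generate the target multimedia file from all recorded segments."""
        return b"".join(self.segments)
```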
With further reference to fig. 4, as an implementation of the methods shown in the above-mentioned figures, the present disclosure provides some embodiments of a human-computer interaction device, which correspond to the method embodiment shown in fig. 1, and which can be applied in various electronic devices.
As shown in fig. 4, the human-computer interaction device of the present embodiment includes: an output unit 401, a receiving unit 402, a determining unit 403, and an executing unit 404. The output unit 401 is configured to output the multimedia information of the first question information based on the playing of a multimedia file, wherein the multimedia file includes the multimedia information of at least one piece of question information. The receiving unit 402 is configured to receive first reply information fed back by the user for the first question information. The determining unit 403 is configured to determine whether the first reply information and the first question information match. The executing unit 404 is configured to execute a target operation based on whether the first reply information and the first question information match.
In this embodiment, specific processes of the output unit 401, the receiving unit 402, the determining unit 403, and the executing unit 404 of the human-computer interaction device and technical effects thereof may refer to the related descriptions of step 101, step 102, step 103, and step 104 in the corresponding embodiment of fig. 1, which are not described herein again.
In some embodiments, the multimedia information contained in the multimedia file indicates that a predetermined character presents the question information. The human-computer interaction device may comprise a recording unit (not shown in fig. 4). The recording unit is configured to: record the first multimedia information of the predetermined character presenting the question information while the multimedia information is being output; and record the second multimedia information of the user feeding back the reply information while the user is feeding back the reply information.
In some embodiments, the first multimedia information includes a first voice and a first picture of the predetermined character presenting the question information, and the second multimedia information includes a second voice and a second picture of the user feeding back the reply information.
In some embodiments, the first picture is displayed in a first display area of the screen, and the second picture is displayed in a second display area of the screen.
In some embodiments, the recording of the first picture and the recording of the second picture are done by screen recording.
In some embodiments, the second picture is captured by a camera and the second voice is captured by a microphone.
In some embodiments, the human-computer interaction device may comprise a generating unit (not shown in fig. 4). The generating unit is configured to: in response to the end of playing of the multimedia file, generate the target multimedia file based on the recording of the first multimedia information and the recording of the second multimedia information.
In some embodiments, the determining unit 403 is further configured to: query the reference reply information of the first question information from a preset database according to the keyword set for the first question information; and determine, based on the queried reference reply information, whether the first reply information matches the first question information.
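A minimal sketch of this matching step, assuming an in-memory stand-in for the preset database. The keyword scheme, the substring test, and the names `REFERENCE_REPLIES`, `query_reference_replies`, and `match_reply` are illustrative assumptions, not details given in the disclosure.

```python
# Hypothetical preset "database": keyword set for each question mapped to
# that question's reference reply information.
REFERENCE_REPLIES = {
    "name": ["my name is", "i am"],  # e.g. for "What's your name"
    "age": ["years old"],            # e.g. for "How old are you"
}

def query_reference_replies(keyword: str) -> list[str]:
    """Query the reference reply information of a question by its keyword."""
    return REFERENCE_REPLIES.get(keyword, [])

def match_reply(reply_info: str, keyword: str) -> bool:
    """Determine whether the reply information matches the question
    information, based on the queried reference reply information."""
    normalized = reply_info.lower().strip()
    return any(ref in normalized for ref in query_reference_replies(keyword))
```

A production system would likely use speech recognition plus fuzzy or semantic matching rather than a plain substring test, but the query-then-compare structure is the same.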
In some embodiments, the execution unit 404 is further configured to: in response to the first reply information not matching the first question information, output the multimedia information of the first question information again based on the playing of the multimedia file, so that the user feeds back first reply information for the first question information again.
In some embodiments, the execution unit 404 is further configured to: in response to the first reply information matching the first question information and the playing of the multimedia file not having ended, output the multimedia information of second question information based on the playing of the multimedia file, so that the user feeds back second reply information for the second question information.
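Putting the two target operations together, the control flow can be sketched as a loop. This is a simplified illustration; `get_user_reply` and `matches` are assumed callbacks standing in for the receiving and determining steps described above.

```python
def run_interaction(questions, get_user_reply, matches):
    """Play each question in turn: if the reply does not match, the same
    question is output again; if it matches and the questions are not
    exhausted, the next question is output."""
    transcript = []
    i = 0
    while i < len(questions):
        question = questions[i]
        reply = get_user_reply(question)  # receive reply information fed back by the user
        if matches(reply, question):      # matched: move on to the next question
            transcript.append((question, reply))
            i += 1
        # not matched: the loop repeats, outputting the same question again
    return transcript
```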
In some embodiments, the human-computer interaction device may include a first presentation unit (not shown in fig. 4). The first presentation unit is configured to display first prompt information, wherein the first prompt information indicates at least one of the following: the reply information fed back by the user does not match the question information given by the predetermined character; and reply information is to be fed back again for the question information given by the predetermined character.
In some embodiments, the human-computer interaction device may comprise a second presentation unit (not shown in fig. 4). The second presentation unit is configured to display second prompt information, wherein the second prompt information is used to prompt the reference reply information of the question information.
In some embodiments, the second presentation unit is further configured to: determine, according to a language-selection operation performed by the user, a target language for displaying the second prompt information; and display the second prompt information in the target language.
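The language-selection behavior can be sketched as a lookup over localized prompt texts. The language codes, prompt strings, and the names `SECOND_PROMPTS` and `show_second_prompt` are illustrative assumptions.

```python
# Hypothetical localized versions of the second prompt information, which
# prompts the user with the reference reply of the current question.
SECOND_PROMPTS = {
    "en": "Reference reply: My name is Mary",
    "zh": "参考回答：My name is Mary",
}

def show_second_prompt(selected_language: str, default_language: str = "en") -> str:
    """Display the second prompt information in the target language chosen
    by the user's language-selection operation, falling back to a default
    language when no localized text is available."""
    return SECOND_PROMPTS.get(selected_language, SECOND_PROMPTS[default_language])
```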
With further reference to fig. 5, fig. 5 illustrates an exemplary system architecture to which the human-machine interaction methods of some embodiments of the present disclosure may be applied.
As shown in fig. 5, the system architecture may include terminal devices 501, 502, a network 503, and a server 504. The network 503 is the medium used to provide communication links between the terminal devices 501, 502 and the server 504. The network 503 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
The terminal devices 501, 502 may interact with a server 504 via a network 503. Various client applications may be installed on the terminal devices 501, 502. For example, the terminal devices 501 and 502 may have a question-answering interaction application, a video recording application, and the like installed thereon. In some scenarios, the terminal device 501, 502 may output the multimedia information of the first question information based on the playing of the multimedia file. In practice, the multimedia file contains multimedia information of at least one question message. Further, the terminal devices 501 and 502 may receive first reply information fed back by the user with respect to the first question information. Still further, the terminal device 501, 502 may determine whether the first reply information and the first question information match. Finally, the terminal device 501, 502 may execute the target operation based on whether the first reply information and the first question information match.
The terminal devices 501 and 502 may be hardware or software. When the terminal devices 501 and 502 are hardware, they may be various electronic devices having a display screen and supporting information interaction, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like. When the terminal devices 501 and 502 are software, they can be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. This is not particularly limited herein.
The server 504 may be a server that provides various services. In some scenarios, the server 504 may distribute the target video recorded by the terminal devices 501, 502.
The server 504 may be hardware or software. When the server 504 is hardware, it can be implemented as a distributed server cluster composed of a plurality of servers, or as a single server. When the server 504 is software, it may be implemented as multiple pieces of software or software modules (e.g., multiple pieces of software or software modules used to provide distributed services), or as a single piece of software or software module. And is not particularly limited herein.
It should be noted that the human-computer interaction method provided by the embodiment of the present disclosure may be executed by the terminal devices 501 and 502, and accordingly, the human-computer interaction apparatus may be disposed in the terminal devices 501 and 502.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to fig. 6, shown is a schematic diagram of an electronic device (e.g., the terminal device of fig. 5) suitable for use in implementing some embodiments of the present disclosure. The terminal device in some embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle-mounted terminal (e.g., a car navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 6, the electronic device may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 601, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage means 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for the operation of the electronic apparatus are also stored. The processing device 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 607 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 608 including, for example, tape, hard disk, etc.; and a communication device 609. The communication means 609 may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While fig. 6 illustrates an electronic device having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided. Each block shown in fig. 6 may represent one device or may represent multiple devices as desired.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 609, or may be installed from the storage means 608, or may be installed from the ROM 602. The computer program, when executed by the processing device 601, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium described in some embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with any form or medium of digital data communication (e.g., a communications network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be included in the electronic device or may exist separately without being incorporated in the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: outputting multimedia information of first question information based on the playing of a multimedia file, wherein the multimedia file comprises the multimedia information of at least one question information; receiving first reply information fed back by a user aiming at the first question information; determining whether the first reply information and the first question information are matched; and executing the target operation based on whether the first reply information is matched with the first question information.
Computer program code for carrying out operations for embodiments of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object oriented programming languages such as Java, Smalltalk, C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by software, and may also be implemented by hardware. The names of these units do not, in some cases, limit the units themselves; for example, the receiving unit may also be described as "a unit that receives the first reply information fed back by the user for the first question information".
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description covers only the preferred embodiments of the present disclosure and illustrates the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to the particular combination of the above-described features, and also encompasses other embodiments formed by any combination of the above-described features or their equivalents without departing from the scope of the present disclosure, for example, embodiments formed by interchanging the above features with features of similar functions disclosed in the present disclosure (but not limited thereto).

Claims (14)

1. A human-computer interaction method, comprising:
playing multimedia information of the first question information based on the playing of a multimedia file, wherein the multimedia file comprises the multimedia information of at least one question information; the multimedia information contained in the multimedia file indicates a preset role to give out question information;
receiving first reply information fed back by a user aiming at the first question information;
determining whether the first reply information and the first question information match;
executing target operation based on whether the first reply information is matched with the first question information;
the method further comprises the following steps: displaying first prompt information, wherein the first prompt information represents at least one of the following items: the reply information fed back by the user is not matched with the question information given by the preset role; feeding back reply information aiming at the question information given by the preset role again; the predetermined roles are various roles defined in advance;
the method further comprises the following steps: recording first multimedia information of the question information given by the preset role in the process of playing the multimedia information; recording second multimedia information of the user feedback reply information in the user feedback reply information process;
the method further comprises the following steps: and generating a multimedia file containing the predetermined role and the question-answer interaction process of the user.
2. The method according to claim 1, wherein the first multimedia message comprises a first voice and a first picture of the question information given by the predetermined character, and the second multimedia message comprises a second voice and a second picture of the user feedback reply information.
3. The method of claim 2, wherein the first picture is displayed in a first display area of a screen and the second picture is displayed in a second display area of the screen.
4. The method of claim 3, wherein the recording of the first picture and the recording of the second picture are performed by screen recording.
5. The method of claim 2, wherein the second picture is captured by a camera and the second voice is captured by a microphone.
6. The method of claim 1, further comprising:
and responding to the end of the playing of the multimedia file, and generating a target multimedia file based on the recording of the first multimedia information and the recording of the second multimedia information.
7. The method of claim 1, wherein the determining whether the first reply information and the first question information match comprises:
inquiring the reference reply information of the first question information from a preset database according to the keyword set for the first question information;
and determining whether the first reply information is matched with the first question information or not based on the inquired reference reply information.
8. The method of claim 1, wherein performing a target operation based on whether the first reply information and the first question information match comprises:
and in response to the first reply information not matching with the first question information, based on the playing of the multimedia file, re-playing the multimedia information of the first question information, so that the user can feed back the first reply information of the first question information again.
9. The method of claim 1, wherein performing a target operation based on whether the first reply information and the first question information match comprises:
and responding to the matching of the first reply information and the first question information and the non-playing end of the multimedia file, and playing the multimedia information of the second question information based on the playing of the multimedia file so as to enable the user to feed back the second reply information of the second question information.
10. The method of claim 1, further comprising:
and displaying second prompt information, wherein the second prompt information is used for prompting the benchmark reply information of the questioning information.
11. The method of claim 10, wherein presenting the second prompting message comprises:
determining a target language for displaying the second prompt message according to the language selection operation executed by the user;
and displaying the second prompt message according to the target language.
12. A human-computer interaction device, comprising:
the playing unit is used for playing the multimedia information of the first question information based on the playing of a multimedia file, wherein the multimedia file comprises the multimedia information of at least one question information; the multimedia information contained in the multimedia file indicates a preset role to give out question information;
the receiving unit is used for receiving first reply information fed back by the user aiming at the first question information;
a determining unit, configured to determine whether the first reply information and the first question information match;
an execution unit, configured to execute a target operation based on whether the first reply information and the first question information are matched;
The device further comprises: the first display unit is used for displaying first prompt information, wherein the first prompt information represents at least one of the following items: the reply information fed back by the user is not matched with the question information given by the preset role; feeding back reply information aiming at the question information given by the preset role again; the predetermined roles are various roles defined in advance;
the device further comprises: the recording unit is used for recording first multimedia information of the question information given by the preset role in the process of playing the multimedia information; recording second multimedia information of the user feedback reply information in the user feedback reply information process;
the device further comprises: and the generating unit is used for generating a multimedia file containing the predetermined role and the question-answer interactive process of the user.
13. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-11.
14. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-11.
CN202010714175.4A 2020-07-22 2020-07-22 Man-machine interaction method and device and electronic equipment Active CN111930229B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010714175.4A CN111930229B (en) 2020-07-22 2020-07-22 Man-machine interaction method and device and electronic equipment


Publications (2)

Publication Number Publication Date
CN111930229A CN111930229A (en) 2020-11-13
CN111930229B true CN111930229B (en) 2021-09-03

Family

ID=73315664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010714175.4A Active CN111930229B (en) 2020-07-22 2020-07-22 Man-machine interaction method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111930229B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109739350A (en) * 2018-12-24 2019-05-10 武汉西山艺创文化有限公司 AI intelligent assistant equipment and its exchange method based on transparent liquid crystal display
CN110728256A (en) * 2019-10-22 2020-01-24 上海商汤智能科技有限公司 Interaction method and device based on vehicle-mounted digital person and storage medium
CN111135579A (en) * 2019-12-25 2020-05-12 米哈游科技(上海)有限公司 Game software interaction method and device, terminal equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751806A (en) * 2008-12-18 2010-06-23 鸿富锦精密工业(深圳)有限公司 Audio frequency play device with interactive function and interactive method thereof
CN107222788A (en) * 2016-08-31 2017-09-29 北京正阳天马信息技术有限公司 A kind of interaction question answering system implementation method based on video display process
CN109215640B (en) * 2017-06-30 2021-06-01 深圳大森智能科技有限公司 Speech recognition method, intelligent terminal and computer readable storage medium
US10943497B2 (en) * 2018-04-27 2021-03-09 Adobe Inc. Personalized e-learning using a deep-learning-based knowledge tracing and hint-taking propensity model
CN109545212A (en) * 2018-12-11 2019-03-29 百度在线网络技术(北京)有限公司 Exchange method, smart machine and storage medium
CN110035246B (en) * 2019-02-19 2021-09-28 创新先进技术有限公司 Audio and video data generation method and device
CN110324478B (en) * 2019-07-26 2021-07-02 江西省亿维电子商务有限公司 Mobile phone unlocking method based on question-answer interaction form of playing video


Also Published As

Publication number Publication date
CN111930229A (en) 2020-11-13

Similar Documents

Publication Publication Date Title
US11240050B2 (en) Online document sharing method and apparatus, electronic device, and storage medium
CN110969012B (en) Text error correction method and device, storage medium and electronic equipment
CN110781373B (en) List updating method and device, readable medium and electronic equipment
JP7050857B2 (en) Summary generation method and equipment
CN111459364A (en) Icon updating method and device and electronic equipment
CN112035030A (en) Information display method and device and electronic equipment
CN113867593A (en) Interaction method, device, electronic equipment and storage medium
CN111597107A (en) Information output method and device and electronic equipment
CN110083768B (en) Information sharing method, device, equipment and medium
CN112182255A (en) Method and apparatus for storing media files and for retrieving media files
US20240103802A1 (en) Method, apparatus, device and medium for multimedia processing
CN114038465B (en) Voice processing method and device and electronic equipment
CN113722491A (en) Method and device for determining text plot type, readable medium and electronic equipment
CN112148744A (en) Page display method and device, electronic equipment and computer readable medium
US20230319325A1 (en) Information interaction method, apparatus and device
WO2023134558A1 (en) Interaction method and apparatus, electronic device, storage medium, and program product
US11960703B2 (en) Template selection method, electronic device and non-transitory computer-readable storage medium
CN112307393A (en) Information issuing method and device and electronic equipment
CN111930229B (en) Man-machine interaction method and device and electronic equipment
CN113360704A (en) Voice playing method and device and electronic equipment
CN112287171A (en) Information processing method and device and electronic equipment
CN111343149B (en) Comment method and device, electronic equipment and computer readable medium
CN113838488B (en) Audio playing packet generation method and device and audio playing method and device
CN111291199B (en) Information query method and device
CN113778387A (en) Method and apparatus for generating code

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant