CN117111741A - Interaction method, terminal equipment and storage medium

Interaction method, terminal equipment and storage medium

Info

Publication number
CN117111741A
CN117111741A
Authority
CN
China
Prior art keywords
real-time
data
interaction
answer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202311088180.9A
Other languages
Chinese (zh)
Inventor
许雅淇
余颂伟
林羽赫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Socratic Education Technology Co ltd
Original Assignee
Shenzhen Socratic Education Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Socratic Education Technology Co ltd
Priority to CN202311088180.9A
Publication of CN117111741A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3343 Query execution using phonetics
    • G06F 16/3344 Query execution using natural language analysis
    • G06F 16/338 Presentation of query results
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/26 Speech to text systems
    • G10L 2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention discloses an interaction method. An initial counter-question is generated from target interaction content and put to the user; based on the user's answering behavior and answer result, a heuristic follow-up question is asked; paraphrase feedback is then given and the user is asked to restate or evaluate the answer. The user answer data and user behavior data collected during this ask-inspire-evaluate-explain process are used to train a deep learning model, so that the trained model's output for interaction data comes closer to the individual user's way of thinking and behavioral habits, and an interactive robot built on the deep learning model can interact with the outside world on the user's behalf. The invention improves the intelligence level of the interactive robot and the question-and-answer interaction experience.

Description

Interaction method, terminal equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence interaction technologies, and in particular, to an interaction method, a terminal device, and a storage medium.
Background
With the rapid development of artificial intelligence and the maturing of machine learning, these technologies have been widely applied to human-computer interaction. On the technology side, graphical user interfaces, speech recognition, gesture control, virtual reality, augmented reality, biometrics and the like have gradually matured; on the application side, fields such as smart home, healthcare, education and training, entertainment and games, industrial manufacturing, aerospace and transportation have all come to depend on human-computer interaction. However, most current interaction relies on machine learning to translate semantics, compare answers and return positive or negative feedback. This mode only suits question answering in simple scenarios: the interaction is stiff, the level of intelligence is low, and the question-and-answer experience falls short of expectations.
Disclosure of Invention
The main purpose of the present invention is to provide an interaction method, a terminal device and a computer-readable storage medium, aiming to solve the prior-art problems that interactive robots interact in a simplistic and stiff manner, with a low level of intelligence and a poor interaction experience.
To achieve the above object, the present invention provides an interaction method that interacts through an interactive robot trained by deep learning, wherein the training method of the interactive robot comprises:
acquiring target interaction content in a specified field;
performing keyword segmentation and classification on the target interaction content, and extracting the constituent parts of the target interaction content;
matching the constituent parts against a preset initial question library to generate an initial counter-question, and outputting the initial counter-question;
acquiring the user's first answer data and first behavior data for the initial counter-question, wherein the first behavior data comprises one or more of expression data, voice data, action data, mouth-shape data and eye-movement data;
taking the initial counter-question as input and the first answer data and first behavior data as output, and feeding them into a neural network for training to obtain a first deep learning model;
performing a first answer comparison between the first answer data and its preset answer, and a first behavior comparison and matching between the first behavior data and preset models in a preset model library, wherein the preset model library comprises one or more of a preset expression library, a preset voice library, a preset action library, a preset mouth-shape library and a preset eye-movement library;
based on the first answer comparison result, retrieving from a preset heuristic question library the heuristic follow-up question associated with the initial counter-question, and outputting it;
acquiring the user's second answer data and second behavior data for the heuristic follow-up question;
taking the heuristic follow-up question as input and the second answer data and second behavior data as output, and feeding them into the first deep learning model for training to obtain a second deep learning model;
performing a second answer comparison between the second answer data and its preset answer, and a second comparison and matching between the second behavior data and the preset models in the preset model library;
generating and outputting paraphrase feedback information based on the second answer comparison result, and prompting the user to restate the initial counter-question and its corresponding answer;
acquiring the user's third answer data and third behavior data during the restatement;
and taking the initial counter-question as input and the third answer data and third behavior data as output, and feeding them into the second deep learning model for training to obtain a third deep learning model.
Optionally, the method of interacting through the interactive robot trained by deep learning comprises:
acquiring a real-time interaction initial question;
inputting the real-time interaction initial question into the third deep learning model for processing, to obtain first real-time answer data and first real-time behavior data;
using the first real-time answer data as the reply content and controlling the reply behavior of the interactive robot according to the first real-time behavior data, so that the interactive robot gives a first reply;
performing a first real-time answer comparison between the first real-time answer data and its preset answer, and a first real-time comparison and matching between the first real-time behavior data and the preset models in the preset model library;
based on the first real-time answer comparison result, retrieving and outputting the real-time heuristic follow-up question for the real-time interaction initial question from the preset heuristic question library;
inputting the real-time heuristic follow-up question into the third deep learning model for processing, to obtain second real-time answer data and second real-time behavior data;
using the second real-time answer data as the reply content and controlling the reply behavior of the interactive robot according to the second real-time behavior data, so that the interactive robot gives a second reply;
performing a second real-time answer comparison between the second real-time answer data and its preset answer, and a second real-time comparison and matching between the second real-time behavior data and the preset models in the preset model library;
and generating and outputting real-time paraphrase feedback information based on the second real-time answer comparison result.
Optionally, there are two or more interactive robots and third deep learning models, each corresponding to a different individual user, and acquiring the real-time interaction initial question comprises:
acquiring interaction information transmitted remotely by the hosting live end;
performing keyword segmentation and classification on the interaction information, and extracting the constituent parts of the interaction information;
and matching the constituent parts against the preset initial question library to generate the real-time interaction initial question, which is then input into each third deep learning model for processing.
Optionally, the interaction method is applied to an interaction system comprising a central control system, a hosting live end and two or more interactive user ends; each interactive user end and the hosting live end are connected to the central control system; each interactive user end is provided with its own third deep learning model; after the real-time interaction initial question is generated, the method comprises the following steps (a minimal sketch of this fan-out is given after this list):
sending the real-time interaction initial question to each interactive user end, so that each interactive user end inputs it into its own third deep learning model for processing;
receiving the processing result sent back by each interactive user end;
generating the interaction reply picture of each corresponding interactive robot from each processing result;
and distributing every interaction reply picture to all interactive user ends participating in the interaction.
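By way of non-limiting illustration (not part of the claimed method), the following Python sketch models the central control system fanning a real-time initial question out to several interactive user ends and broadcasting the rendered reply pictures back. Every class and method name here is hypothetical, invented only to make the message flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ReplyFrame:
    """A rendered interaction-reply picture for one robot (hypothetical structure)."""
    user_id: str
    answer_text: str
    behavior: Dict[str, str]  # e.g. {"expression": "smile", "action": "nod"}

class CentralControl:
    """Minimal sketch: one hosting live end, two or more interactive
    user ends, each holding its own third deep learning model."""

    def __init__(self) -> None:
        # user_id -> callable that runs that user's third deep learning model
        self.user_ends: Dict[str, Callable[[str], ReplyFrame]] = {}

    def register(self, user_id: str, model_fn: Callable[[str], ReplyFrame]) -> None:
        self.user_ends[user_id] = model_fn

    def dispatch(self, initial_question: str) -> List[ReplyFrame]:
        # 1) send the real-time initial question to every interactive user end
        frames = [model_fn(initial_question) for model_fn in self.user_ends.values()]
        # 2) broadcast every reply picture to all participating user ends
        for user_id in self.user_ends:
            self.broadcast(user_id, frames)
        return frames

    def broadcast(self, user_id: str, frames: List[ReplyFrame]) -> None:
        for f in frames:
            print(f"[to {user_id}] {f.user_id} answered: {f.answer_text} ({f.behavior})")
```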
Optionally, generating and outputting real-time paraphrase feedback information based on the second real-time answer comparison result comprises:
prompting the interactive robot to restate the real-time interaction initial question and its corresponding answer;
acquiring third real-time answer data and third real-time behavior data during the interactive robot's restatement;
and training the third deep learning model on the real-time interaction initial question, the real-time heuristic follow-up question, the first real-time answer data, the first real-time behavior data, the second real-time answer data, the second real-time behavior data, the third real-time answer data and the third real-time behavior data, to obtain an updated third deep learning model.
Optionally, the interaction method further comprises:
acquiring the individual user's correction information for one or more of the first real-time answer data, the first real-time behavior data, the second real-time answer data, the second real-time behavior data, the third real-time answer data and the third real-time behavior data;
and training the third deep learning model with the correction information to obtain an updated third deep learning model.
Optionally, acquiring the target interaction content in the specified field comprises:
collecting the user's voice input information;
performing speech recognition on the voice input information based on natural language processing, to obtain the text conversion information corresponding to the voice input information;
and acquiring the target interaction content in the specified field based on the text conversion information.
Optionally, the interaction method further comprises:
acquiring the display avatar of the interactive robot selected by the individual user;
acquiring the user's personal image;
and fusing the selected display avatar of the interactive robot with the user's personal image based on AI fusion technology, to obtain a fused image that serves as the final display avatar of the interactive robot.
To achieve the above object, the present invention also provides a terminal device comprising a memory and a processor, the memory storing a computer program executable by the processor, the computer program implementing the method steps described above when executed by the processor.
Furthermore, to achieve the above object, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method steps described above.
With the interaction method, terminal device and computer-readable storage medium of the present invention, an initial counter-question is generated by acquiring target interaction content and extracting its constituent parts, and the question is put to the user; a heuristic follow-up question is then asked based on the user's answering behavior and answer result; paraphrase feedback is given and the user is asked to restate or evaluate the answer. The user answer data and user behavior data collected during this ask-inspire-evaluate-explain process are used to train a deep learning model, so that the trained model's output for interaction data comes closer to the individual user's way of thinking and behavioral habits. An interactive robot built on the deep learning model can thus interact with the outside world on the user's behalf, raising the intelligence level of the interactive robot, suiting a wider range of application scenarios, giving answers that better match the user's thinking and behavioral habits, and improving the question-and-answer interaction experience in the learning field.
Drawings
FIG. 1 is a schematic diagram of the system architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a flowchart of the interactive robot training method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an online teaching application scenario of the present invention.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the prior art, interactive robots interact in a simplistic and stiff manner, with a low level of intelligence and a poor interaction experience.
To solve these technical problems, the present invention provides a training method for an interactive robot. In this method, an initial counter-question is generated by acquiring target interaction content and extracting its constituent parts, and the question is put to the user; a heuristic follow-up question is asked based on the user's answering behavior and answer result; paraphrase feedback is then given and the user is asked to restate or evaluate the answer. The user answer data and user behavior data collected during this ask-inspire-evaluate-explain process are used to train a deep learning model, so that the trained model's output for interaction data comes closer to the individual user's way of thinking and behavioral habits. An interactive robot built on the deep learning model can thus interact with the outside world on the user's behalf, raising the intelligence level of the interactive robot, suiting a wider range of application scenarios, giving answers that better match the user's thinking and behavioral habits, and improving the question-and-answer interaction experience in the learning field.
Referring to fig. 1, fig. 1 is a schematic system architecture diagram of a hardware running environment according to an embodiment of the present invention.
The terminal in the embodiments of the present invention may be a terminal device with computing capability, such as a PC, or a mobile terminal device with a display function such as a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player or a portable computer.
As shown in FIG. 1, the terminal may include: a processor 1001 such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002, where the communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard), and may optionally further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g. a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory; optionally, it may also be a storage device separate from the processor 1001.
Optionally, the terminal may also include a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a WiFi module, and so on. The sensors include, for example, light sensors and motion sensors. Specifically, the light sensors may include an ambient light sensor, which can adjust the brightness of the display screen according to ambient light, and a proximity sensor, which can turn off the display screen and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally along three axes) and, when stationary, the magnitude and direction of gravity; it can be used for applications that recognize the posture of the mobile terminal (such as landscape/portrait switching, related games, or magnetometer pose calibration) and for vibration-recognition functions (such as pedometers and tap detection). The mobile terminal may of course also be equipped with other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer and an infrared sensor, which are not described here.
It will be appreciated by those skilled in the art that the terminal structure shown in FIG. 1 does not limit the terminal, which may include more or fewer components than shown, combine certain components, or arrange components differently.
As shown in FIG. 1, the memory 1005, as a kind of computer storage medium, may include an operating system, a network communication module, a user interface module and a training/interaction program.
In the terminal shown in FIG. 1, the network interface 1004 is mainly used to connect to a backend server and communicate with it; the user interface 1003 is mainly used to connect to a client (user side) and communicate with it; and the processor 1001 may be configured to invoke the training/interaction program stored in the memory 1005 and perform the following operations:
acquiring target interaction content in a specified field;
performing keyword segmentation and classification on the target interaction content, and extracting the constituent parts of the target interaction content;
matching the constituent parts against a preset initial question library to generate an initial counter-question, and outputting the initial counter-question;
acquiring the user's first answer data and first behavior data for the initial counter-question, wherein the first behavior data comprises one or more of expression data, voice data, action data, mouth-shape data and eye-movement data;
taking the initial counter-question as input and the first answer data and first behavior data as output, and feeding them into a neural network for training to obtain a first deep learning model;
performing a first answer comparison between the first answer data and its preset answer, and a first behavior comparison and matching between the first behavior data and preset models in a preset model library, wherein the preset model library comprises one or more of a preset expression library, a preset voice library, a preset action library, a preset mouth-shape library and a preset eye-movement library;
based on the first answer comparison result, retrieving from a preset heuristic question library the heuristic follow-up question associated with the initial counter-question, and outputting it;
acquiring the user's second answer data and second behavior data for the heuristic follow-up question;
taking the heuristic follow-up question as input and the second answer data and second behavior data as output, and feeding them into the first deep learning model for training to obtain a second deep learning model;
performing a second answer comparison between the second answer data and its preset answer, and a second comparison and matching between the second behavior data and the preset models in the preset model library;
generating and outputting paraphrase feedback information based on the second answer comparison result, and prompting the user to restate the initial counter-question and its corresponding answer;
acquiring the user's third answer data and third behavior data during the restatement;
and taking the initial counter-question as input and the third answer data and third behavior data as output, and feeding them into the second deep learning model for training to obtain a third deep learning model.
Further, the processor 1001 may invoke the interaction program stored in the memory 1005 and further perform the following operations:
acquiring a real-time interaction initial question;
inputting the real-time interaction initial question into the third deep learning model for processing, to obtain first real-time answer data and first real-time behavior data;
using the first real-time answer data as the reply content and controlling the reply behavior of the interactive robot according to the first real-time behavior data, so that the interactive robot gives a first reply;
performing a first real-time answer comparison between the first real-time answer data and its preset answer, and a first real-time comparison and matching between the first real-time behavior data and the preset models in the preset model library;
based on the first real-time answer comparison result, retrieving and outputting the real-time heuristic follow-up question for the real-time interaction initial question from the preset heuristic question library;
inputting the real-time heuristic follow-up question into the third deep learning model for processing, to obtain second real-time answer data and second real-time behavior data;
using the second real-time answer data as the reply content and controlling the reply behavior of the interactive robot according to the second real-time behavior data, so that the interactive robot gives a second reply;
performing a second real-time answer comparison between the second real-time answer data and its preset answer, and a second real-time comparison and matching between the second real-time behavior data and the preset models in the preset model library;
and generating and outputting real-time paraphrase feedback information based on the second real-time answer comparison result.
Further, the processor 1001 may invoke the interaction program stored in the memory 1005 and further perform the following operations:
acquiring interaction information transmitted remotely by the hosting live end;
performing keyword segmentation and classification on the interaction information, and extracting the constituent parts of the interaction information;
and matching the constituent parts against the preset initial question library to generate the real-time interaction initial question, which is then input into each third deep learning model for processing.
Further, the processor 1001 may invoke the interaction program stored in the memory 1005 and further perform the following operations:
sending the real-time interaction initial question to each interactive user end, so that each interactive user end inputs it into its own third deep learning model for processing;
receiving the processing result sent back by each interactive user end;
generating the interaction reply picture of each corresponding interactive robot from each processing result;
and distributing every interaction reply picture to all interactive user ends participating in the interaction.
Further, the processor 1001 may invoke the interaction program stored in the memory 1005 and further perform the following operations:
prompting the interactive robot to restate the real-time interaction initial question and its corresponding answer;
acquiring third real-time answer data and third real-time behavior data during the interactive robot's restatement;
and training the third deep learning model on the real-time interaction initial question, the real-time heuristic follow-up question, the first real-time answer data, the first real-time behavior data, the second real-time answer data, the second real-time behavior data, the third real-time answer data and the third real-time behavior data, to obtain an updated third deep learning model.
Further, the processor 1001 may invoke the interaction program stored in the memory 1005 and further perform the following operations:
acquiring the individual user's correction information for one or more of the first real-time answer data, the first real-time behavior data, the second real-time answer data, the second real-time behavior data, the third real-time answer data and the third real-time behavior data;
and training the third deep learning model with the correction information to obtain an updated third deep learning model.
Further, the processor 1001 may invoke the interaction program stored in the memory 1005 and further perform the following operations:
collecting the user's voice input information;
performing speech recognition on the voice input information based on natural language processing, to obtain the text conversion information corresponding to the voice input information;
and acquiring the target interaction content in the specified field based on the text conversion information.
Further, the processor 1001 may invoke the interaction program stored in the memory 1005 and further perform the following operations:
acquiring the display avatar of the interactive robot selected by the individual user;
acquiring the user's personal image;
and fusing the selected display avatar of the interactive robot with the user's personal image based on AI fusion technology, to obtain a fused image that serves as the final display avatar of the interactive robot.
Referring to FIG. 2, FIG. 2 is a flowchart of an embodiment of the interactive robot training method of the present invention. The interactive robot and its training method may be applied to education and training, healthcare, entertainment and games, and other fields; the embodiments of the present invention are described mainly in the context of education and training. The interactive robot of the present invention mainly learns the thinking and behavioral habits of an individual user, so as to interact with the outside world on that user's behalf; it performs data processing mainly through a deep learning model trained with deep learning techniques. The training method of the interactive robot comprises the following steps:
Step S1: acquiring target interaction content in a specified field.
The target interaction content refers to interaction content in a specific field selected by the individual user. In implementation, several interaction fields may be preset with corresponding interaction content pre-stored, and for each interaction field a corresponding deep learning model can be trained starting from step S1. Specifically, an interaction content module may be preset so that the individual user can select a virtual robot avatar in the system; when the individual user's selection instruction is received, the selected virtual robot avatar is used as that user's interactive robot. The system then receives the user's voice input, or text input or file import of content on the terminal device (the specific content is set within a defined practical field, such as Chinese-language learning content for the compulsory education stage, e-commerce livestream content, or online game content). After receiving the input, the system converts it into text or a data format and imports it, so as to determine the category of the interaction content. If the input is speech, the system converts it into text for data storage through speech recognition and Natural Language Processing (NLP).
In some embodiments, step S1 comprises:
Step S101: collecting the user's voice input information;
Step S102: performing speech recognition on the voice input information based on natural language processing, to obtain the text conversion information corresponding to the voice input information;
Step S103: acquiring the target interaction content in the specified field based on the text conversion information.
Specifically, the user can specify the target interaction content of the designated field by voice input. The voice input information is the user's speech data. For example, the user may say "Ode to the Goose" (the Tang poem); after the user's speech is collected, it is analyzed with natural language processing to obtain the corresponding text conversion information, i.e. the text "Ode to the Goose". The interaction scope is then determined to be within the text of "Ode to the Goose", which is taken as the target interaction content.
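As one possible, non-authoritative realization of steps S101 to S103 (the patent does not mandate a particular speech stack), the open-source SpeechRecognition package could supply the voice-to-text step; the content-index lookup is a hypothetical helper added for illustration.

```python
import speech_recognition as sr  # assumed third-party dependency

def acquire_target_content(content_index: dict) -> str:
    """Sketch of steps S101-S103: record speech, convert it to text,
    then look up the matching target interaction content."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:        # S101: collect voice input
        audio = recognizer.listen(source)
    # S102: speech recognition; the language code assumes Mandarin input
    text = recognizer.recognize_google(audio, language="zh-CN")
    # S103: use the transcript (e.g. "Ode to the Goose") to select content
    return content_index.get(text, "")
```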
Step S2: performing keyword segmentation and classification on the target interaction content, and extracting the constituent parts of the target interaction content.
After the target interaction content is obtained, it is segmented by keyword and classified, and its constituent parts are extracted, i.e. the subjects, predicates, objects, attributives, complements and other constituents of the sentences of the target interaction content.
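For Chinese input, step S2 could be approximated with the jieba segmenter's part-of-speech mode; treating nouns and verbs as stand-ins for subject, predicate and object is a deliberate simplification of full constituent parsing, used here only as a sketch.

```python
import jieba.posseg as pseg  # assumed dependency; ships with the jieba package

def extract_components(content: str) -> dict:
    """Rough sketch of step S2: segment the text and bucket words by
    part of speech as a proxy for sentence constituents."""
    buckets = {"nouns": [], "verbs": [], "others": []}
    for word, flag in pseg.cut(content):
        if flag.startswith("n"):       # noun-like: candidate subjects/objects
            buckets["nouns"].append(word)
        elif flag.startswith("v"):     # verb-like: candidate predicates
            buckets["verbs"].append(word)
        else:
            buckets["others"].append(word)
    return buckets
```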
Step S3: matching the constituent parts against a preset initial question library to generate an initial counter-question, and outputting it.
The initial counter-question of the present invention is a question that the system puts to the individual user or to that user's interactive robot; it does not refer to a question raised by the user. The preset initial question library is a pre-built database of questions used to put initial counter-questions to the individual user or to the user's interactive robot. The system of the present solution includes a set of question libraries keyed to the keywords of the imported content. These may comprise an active question library and a passive question library: the active question library holds questions actively asked by the user, while the passive question library is built into the system and holds questions related to the keywords. Correspondingly there are two questioning modes, one in which the user actively initiates question interaction and one in which the system actively initiates question interaction with the user; the former draws on the active question library and the latter on the passive question library. For questions from the user, the system does not answer directly; instead, matching against the question library triggers the counter-question mechanism, and the initial question is evaluated and explained through counter-questions and prompts.
In some embodiments, the preset initial question library may pre-store a certain number of preset initial questions, which may be generated in advance with an initial-question generation model. After the constituent parts are obtained, they are keyword-matched against the preset initial question library, and the matching questions are extracted from the pre-stored preset initial questions as the generated initial counter-question and output.
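A minimal sketch of the keyword-matching variant just described; the question-bank layout (keyword mapped to pre-stored counter-questions) is an assumption made for illustration, not the patent's prescribed data structure.

```python
from typing import Dict, List

def match_initial_question(components: List[str],
                           question_bank: Dict[str, List[str]]) -> List[str]:
    """Return pre-stored initial counter-questions whose keyword
    matches one of the extracted constituent parts."""
    matched: List[str] = []
    for keyword, questions in question_bank.items():
        if keyword in components:
            matched.extend(questions)
    return matched

# Usage: a tiny bank keyed by a poem title
bank = {"Ode to the Goose": ["In which period of the Tang dynasty was this poem written?"]}
print(match_initial_question(["Ode to the Goose", "goose"], bank))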
In some embodiments, the preset initial question library may itself be an initial-question generation model: rather than pre-storing preset initial questions, the model generates an initial question directly whenever one is needed. Specifically, for the present target content "Ode to the Goose", questions are set along the following lines (a 5W2H template sketch is given after this list):
where 5W2H stands for: who/which, when, where, why, what, how, how much;
through the question generation model, the full poem "Ode to the Goose" can yield 300 to 3000 questions, from which, with reference to our teaching objectives, 30 to 100 can be manually selected.
For example, we might choose:
Q1: This poem was written in the Tang dynasty. Was it written in the early Tang, the middle Tang, or the late Tang?
Q2, a chain of questions: Q2-1: Reading this poem, do you feel the goose is happy or sad?
Q2-2: Can you use a line or a word from the poem to show that the goose is happy or sad?
Q2-3: Can you say why the goose is happy?
Q3: Do you know which animal is the ancestor of the goose?
Q4: Do you know why geese like water and like swimming?
……
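The 5W2H scheme referenced above can be written down as a toy template expansion; the templates themselves are invented examples for this sketch, not the patent's actual question set.

```python
FIVE_W_TWO_H = {
    "who":      "Who wrote {title}?",
    "when":     "When was {title} written?",
    "where":    "Where is the scene of {title} set?",
    "why":      "Why does the goose in {title} sing?",
    "what":     "What animal does {title} describe?",
    "how":      "How does {title} describe the goose's movement?",
    "how much": "How many lines does {title} have?",
}

def generate_questions(title: str) -> list:
    """Expand every 5W2H template for one piece of content."""
    return [tpl.format(title=title) for tpl in FIVE_W_TWO_H.values()]

for q in generate_questions("Ode to the Goose"):
    print(q)
```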
In some embodiments, the initial counter-question may be output as text on a display device, or as speech through a loudspeaker or the like; this is not limited here.
Step S4: acquiring the user's first answer data and first behavior data for the initial counter-question, wherein the first behavior data includes one or more of expression data, voice data, action data, mouth-shape data and eye-movement data.
The first answer data is the answer data for the initial counter-question, and may be text answer data, voice answer data, or similar. The first behavior data is the behavior data of the individual user while answering the initial counter-question, including one or more of expression data, voice data, action data, mouth-shape data and eye-movement data.
Following the above steps, once the initial counter-question is generated, the system outputs it as speech via TTS (Text To Speech) and puts the question to the interactor (the individual user or his or her interactive robot). The data processing center module is then started: it collects the answer data, and during the course of the user's answer it also collects the user's expression data, voice data, action data, mouth-shape data and eye-movement data.
Step S5: taking the initial counter-question as input and the first answer data and first behavior data as output, and feeding them into a neural network for training to obtain a first deep learning model.
The interactive robot can act as the individual user's proxy and interact with the outside world in the user's place; it processes interaction data mainly through the trained deep learning model. Step S5 is the step that trains the interactive robot's deep learning model.
The initial counter-question is taken as input and the first answer data and first behavior data as output, fed into a preset neural network for iterative training, and the model parameters are adjusted to obtain the first deep learning model. When the first deep learning model later processes external interaction questions, it can then output answer data and behavior data that fit the individual user's way of thinking, and drive the interactive robot's avatar to answer in a behavioral style and mindset close to the individual user's.
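The patent does not fix a network architecture, so the following is only a hedged illustration of step S5: a small PyTorch model (an assumed framework choice) fitted on encoded counter-questions as inputs and concatenated answer-plus-behavior feature vectors as targets. The encodings are placeholders; real data would come from step S4.

```python
import torch
import torch.nn as nn

class ReplyModel(nn.Module):
    """Toy stand-in for the first deep learning model: maps an encoded
    question to a concatenated answer + behavior feature vector."""
    def __init__(self, q_dim: int = 64, out_dim: int = 32) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(q_dim, 128), nn.ReLU(), nn.Linear(128, out_dim)
        )

    def forward(self, q: torch.Tensor) -> torch.Tensor:
        return self.net(q)

def train_first_model(q_vecs: torch.Tensor, target_vecs: torch.Tensor,
                      epochs: int = 100) -> ReplyModel:
    """Step S5 sketch: question as input, answer+behavior data as target."""
    model = ReplyModel(q_vecs.shape[1], target_vecs.shape[1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(q_vecs), target_vecs)
        loss.backward()
        opt.step()
    return model

# Usage with random placeholder encodings
model = train_first_model(torch.randn(8, 64), torch.randn(8, 32))
```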
Step S6: performing a first answer comparison between the first answer data and its preset answer, and a first behavior comparison and matching between the first behavior data and the preset models in a preset model library, wherein the preset model library includes one or more of a preset expression library, a preset voice library, a preset action library, a preset mouth-shape library and a preset eye-movement library.
It should be understood that step S5 and step S6 may be performed synchronously or separately, in either order.
The preset model library is a pre-built model database containing a number of preset models.
In the concrete implementation of steps S5 and S6, the system data is synchronously split into two data processing logics, A and B. Logic A feeds the user data to the selected counter-question robot, so that in later interactions on the same questions its voice, actions, micro-expressions and so on approach, or even match, the real individual user; logic A is the counter-question robot's learning of the user's individual mindset, a training of the individual as in reality. After the user's interactive answer, the data processing center module completes data collection and further processes the data, matching and comparing it respectively against an expression library (101 expression models), a voice library (voice models: Mandarin, dialects, English, other foreign languages, etc.) and an action library (256 action models); the voice is converted to text through speech recognition and Natural Language Processing (NLP) and semantically compared against a corpus (i.e. Q/A models for each industry and field).
In logic B, the data is compared and matched against each of the databases above. The first answer data contains the answer the individual user gave to the initial counter-question and can be compared with the preset answers in a preset answer database to judge whether the user's answer is correct. The first answer comparison between the first answer data and the preset answer has three possible results: correct, incorrect, and unmatchable.
The first behavior comparison and matching between the first behavior data and the preset models in the preset model library builds a personalized behavior model of the individual user from the user's behavior data, so that when the interactive robot later interacts in the user's place, the robot's behavior can be compared and matched against this personalized behavior model to judge whether the robot's behavior matches the individual user's.
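One plausible reading of step S6, with both comparisons reduced to cosine similarity over feature vectors; the threshold value and the three-way outcome labels follow the correct / incorrect / unmatchable results named in the text, but the vector encoding itself is an assumption of this sketch.

```python
import math
from typing import Dict, List

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def compare_answer(answer_vec: List[float], preset_vec: List[float],
                   threshold: float = 0.85) -> str:
    """First answer comparison: correct / incorrect / unmatchable."""
    if not answer_vec:            # silence or unrecognizable input
        return "unmatchable"
    return "correct" if cosine(answer_vec, preset_vec) >= threshold else "incorrect"

def match_behavior(behavior_vec: List[float],
                   model_library: Dict[str, List[float]]) -> str:
    """First behavior matching: nearest preset model (expression, action, ...)."""
    return max(model_library, key=lambda k: cosine(behavior_vec, model_library[k]))
```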
Step S7: based on the first answer comparison result, retrieving and outputting the heuristic follow-up question for the initial counter-question from the preset heuristic question library.
The preset heuristic question library stores heuristic follow-up questions for the initial counter-questions; a heuristic follow-up question is meant to inspire the user to think further in settling on an answer to the initial counter-question.
Following the above steps, the first answer comparison between the first answer data and the preset answer yields one of three results: correct, incorrect, and unmatchable. When the result is incorrect or unmatchable, the corresponding heuristic follow-up question can be retrieved from the preset heuristic question library entry for the initial counter-question and output to the user by voice or on-screen text. It should be understood that the unmatchable result covers cases where the answer cannot be recognized or where the user does not respond within a preset time.
If the first answer comparison result is correct, the correct answer and/or evaluation information praising the user's correct answer can be output directly.
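The branch described in step S7 and the surrounding text can be written down directly; the heuristic-bank lookup helper and the praise string are hypothetical stand-ins for the system's actual output.

```python
def handle_first_comparison(result: str, initial_q: str,
                            heuristic_bank: dict) -> str:
    """If the first answer is correct, praise the user; otherwise retrieve
    the heuristic follow-up question stored for this initial counter-question."""
    if result == "correct":
        return "You answered correctly, well done!"
    # incorrect, silent beyond the time limit, or unrecognizable:
    return heuristic_bank.get(initial_q, "Here is a hint; think it over again.")

# Usage
bank = {"early/middle/late Tang?": "The poet Luo Binwang wrote this as a child."}
print(handle_first_comparison("incorrect", "early/middle/late Tang?", bank))
```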
Step S8: acquiring the user's second answer data and second behavior data for the heuristic follow-up question.
The second answer data is the user's answer data for the heuristic follow-up question, and the second behavior data is the user's behavior data while answering it.
Following the above steps, after the system outputs the heuristic follow-up question, the second answer data and second behavior data produced while the user answers can be collected; the acquisition follows the same approach as for the first answer data and first behavior data and is not repeated here.
Step S9: taking the heuristic follow-up question as input and the second answer data and second behavior data as output, and feeding them into the first deep learning model for training to obtain a second deep learning model.
After the second answer data and second behavior data are obtained, the heuristic follow-up question can be used as the input of the first deep learning model and the second answer data and second behavior data as its output, and the first deep learning model is trained further to obtain the second deep learning model. When the second deep learning model later processes similar heuristic follow-up questions, it can output answer data and behavior data still closer to the individual user's way of thinking and behavioral habits.
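Step S9 (and likewise step S13 below) continues training from the previous model rather than starting fresh. A hedged sketch, reusing the hypothetical ReplyModel from the step S5 example; the smaller learning rate is a common fine-tuning convention, not a requirement of the patent.

```python
import torch
import torch.nn as nn

def fine_tune(model: nn.Module, q_vecs: torch.Tensor,
              target_vecs: torch.Tensor, epochs: int = 50) -> nn.Module:
    """Steps S9/S13 sketch: keep the previous weights and continue training
    on the new round's (question, answer+behavior) pairs."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # smaller LR for fine-tuning
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(q_vecs), target_vecs)
        loss.backward()
        opt.step()
    return model  # the "second" (or, after S13, "third") deep learning model
```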
Step S10: performing a second answer comparison between the second answer data and its preset answer, and a second comparison and matching between the second behavior data and the preset models in the preset model library.
A corresponding preset answer, i.e. the preset answer of the second answer data, may be set for the heuristic follow-up question. The second answer comparison compares the second answer data with this preset answer; the comparison result again covers the correct, incorrect and unmatchable cases.
The type and scope of the second behavior data may match the first behavior data, including action data, expression data and so on. Comparing and matching the second behavior data against the preset models in the preset model library further refines the modeling of the individual user's answering behavior.
Step S11: generating and outputting paraphrase feedback information based on the second answer comparison result, and prompting the user to restate the initial counter-question and its corresponding answer.
Following the above steps, the second answer comparison yields one of three results: correct, incorrect, and unmatchable. If the result is incorrect or unmatchable, the system generates and outputs paraphrase feedback information, which contains the correct answer information, as on-screen text or speech. At the same time, the user is prompted by text or voice to restate the initial counter-question and the answer given in the paraphrase feedback information.
If the second answer comparison result is correct, evaluation information praising the user's correct answer can be output directly.
Step S12: acquiring the user's third answer data and third behavior data during the restatement.
Following the above steps, after the user is prompted to restate the initial counter-question and its corresponding answer, the third answer data and third behavior data produced while the user restates are collected: the third answer data comprises the initial counter-question as restated by the user together with the restated answer data, and the third behavior data is the user's behavior data during the restatement.
Step S13: taking the initial counter-question as input and the third answer data and third behavior data as output, and feeding them into the second deep learning model for training to obtain a third deep learning model.
With the initial counter-question as input and the third answer data and third behavior data as output, the second deep learning model is trained further to obtain the third deep learning model, so that the answer data and behavior data the third deep learning model outputs when processing interaction data come still closer to the user's thinking and behavior.
For ease of understanding, the following description is given by way of specific examples.
Taking the "Ode to the Goose" example above, the initial counter-question is Q1: this poem was written in the Tang dynasty; was it written in the early Tang, the middle Tang, or the late Tang? This question can be pre-matched with correct answers as the preset answer of the initial counter-question. For example: "the early Tang", "I think it is the early Tang", "the first years of the Tang dynasty", "Tang Chu", "just after the Tang dynasty was founded" and so on are all treated as correct answers.
After the user's first answer data is collected and matched against the answer library, the first answer comparison is made. If the answer is correct, an evaluation is given, for example: "You answered correctly, well done!", and feedback information may then be generated. If the answer is wrong, or the user does not know, or the user stays silent for more than 10 seconds, the flow moves to the inspiration step: a heuristic prompt is given for the incorrect first answer, for instance a hint about the life of the poet Luo Binwang. After the hint, the user gives a second round of answers and a simple evaluation is made (answered correctly, answered incorrectly, or did not answer), and the answering ends. Then comes the explanation: the system explains the background of the question, namely that Luo Binwang wrote this household-known poem in his early years, grew up to be a gifted and upright poet, was deeply dissatisfied when Wu Zetian declared herself emperor, joined the army opposing Wu, and was killed in the end; then "you studied very attentively, let us continue with the next question." Chained questions follow essentially the same flow, except that they fall under a multi-question dialogue model, whereas a single question uses a single-question dialogue model. Throughout this process, the user's answer data and behavior data are fed into the neural network for training, gradually building and refining a deep learning model that approximates the answer data and behavior data of the user's thinking and habits, which serves as the interactive robot's data processing model. Through the above steps, the whole ask-answer-evaluate-inspire-explain cycle for a given topic is completed.
Referring to FIG. 3, FIG. 3 is a schematic diagram of an online teaching application scenario of the present invention. In the specific scenario shown in FIG. 3, the system imports the text content of a primary school Chinese textbook and the corresponding question-and-answer library. First, the student selects the avatar of his or her own counter-question robot in the system, for example a cute piglet. Second, there are (at least) two ways of raising questions: in the first, the user puts a counter-question to the piglet robot, such as "why does the little horse cross the river?"; in the second, the question is imported externally, for example the user taps, on the terminal screen, Grade 2 Chinese, second volume, Unit 5, Lesson 14, "The Little Horse Crosses the River", which triggers the system's counter-question mechanism. This mechanism includes, but is not limited to, a question generation mechanism that applies the 5W2H method to the text title, or a weighting mechanism that screens questions from the system's Chinese question-and-answer library.
When the user answers the questions, two groups of data are generated: Data_Voice_1, the voice data of the user's answer, and Data_Action_1, the behavior data collected during the process by the terminal device's camera, eye tracker and the like.
Further, the Data_Voice_1 data undergoes voice-to-text conversion and semantic analysis in the system and is used to compare and match the answer result. Once the result is returned, it is evaluated and displayed in real time; if the answer does not match or cannot be recognized, the system feeds back the prompt voice corresponding to the question as inspiration.
After receiving the heuristic prompt, the user answers a second time, producing voice data Data_Voice_2 and Data_Action_2; Data_Voice_2 goes through the same answer comparison and matching steps. If the answer is wrong, the answer is explained and displayed and the user is prompted to restate the answer result, which produces Data_Voice_3 and Data_Action_3.
While the above steps are carried out, the system performs machine learning and neural network learning on the piglet counter-question robot with the {Data_Voice_1, Data_Voice_2, Data_Voice_3} and {Data_Action_1, Data_Action_2, Data_Action_3} data, so that the piglet approaches the user in tone, accent and the corresponding behavior of its voice feedback.
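The three-round data flow of this scenario can be summarized as below; the round labels and sample layout are assumptions of this sketch, and the string placeholders stand in for real voice and behavior recordings that would feed the training sketches given earlier.

```python
ROUNDS = ("initial counter-question", "heuristic follow-up", "restatement")

def collect_rounds(voice_rounds: list, action_rounds: list) -> list:
    """Pair {Data_Voice_1..3} with {Data_Action_1..3}, one sample per round."""
    return [
        {"round": name, "voice": v, "action": a}
        for name, v, a in zip(ROUNDS, voice_rounds, action_rounds)
    ]

samples = collect_rounds(
    ["Data_Voice_1", "Data_Voice_2", "Data_Voice_3"],
    ["Data_Action_1", "Data_Action_2", "Data_Action_3"],
)
for s in samples:
    print(s["round"], "->", s["voice"], "+", s["action"])
```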
In the above embodiment, the target interactive content is acquired and divided into components, an initial question-back is generated and posed to the user, a heuristic question is posed based on the user's answering behavior and answer result, paraphrase feedback is given and the user is asked to reiterate or evaluate, and the user answer data and user behavior data collected during this ask-inspire-evaluate-explain process are used to train the deep learning model. The trained model's processing of interactive data thereby comes closer to the individual user's way of thinking and behavior habits, so that an interactive robot based on the deep learning model can interact with the outside world in the user's place. This raises the intelligence level of the interactive robot, makes it applicable to a wider range of scenarios, yields answers closer to the user's thinking and behavior habits, and improves the question-answer interactive experience in the learning field.
In some embodiments, when the third deep learning model trained by the above method is used for interaction in an actual interaction scene, the interaction method includes:
Step S14, acquiring a real-time interaction initial question.
Specifically, the real-time interaction initial question refers to the initial question posed by the system while the interactive robot performs real-time interaction based on the trained third deep learning model. With reference to the above embodiment, after determining the real-time interactive content in the specified field, the system performs keyword segmentation and classification on the real-time interactive content, extracts its division components, and matches the division components against a preset question library to generate and output the real-time interaction initial question. The output modes include text display and voice output.
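A naive sketch of the segmentation-and-matching just described, assuming a question library keyed by keyword phrases; a production system would use a proper tokenizer and classifier rather than the toy regex below.

```python
import re

def segment(content: str) -> set[str]:
    """Toy keyword segmentation; stands in for real tokenization/classification."""
    return set(re.findall(r"\w+", content.lower()))

def match_initial_question(content: str, question_library: dict[str, str]) -> str | None:
    """Pick the library entry whose keyword key overlaps the content most."""
    words = segment(content)
    best = max(question_library, key=lambda key: len(words & segment(key)), default=None)
    return question_library[best] if best else None

library = {"little horse river": "Why does the little horse cross the river?",
           "old cow": "What does the old cow tell the little horse?"}
print(match_initial_question("The little horse wants to cross the river", library))
```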
And S15, inputting the real-time interaction initial question into the third deep learning model for processing to obtain first real-time reply data and first real-time behavior data.
Different individual users correspond to differently trained third deep learning models. In the real-time interaction process, after the real-time interaction initial question is acquired, it is input into the third deep learning model of the interactive robot corresponding to the individual user currently participating in the interaction; after processing, the model outputs the first real-time reply data and the first real-time behavior data. The third deep learning model is trained in advance on the individual user's reply data and behavior data; the first real-time reply data is the answer the model gives, in a way of thinking close to the individual user's, during real-time interaction, and the first real-time behavior data is the behavior data the model generates, in a manner close to the user's behavior habits, during the real-time reply.
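The per-user model lookup of step S15 might look like the following sketch; the `ThirdModel` class, its `infer` method, and the output shape are assumptions, since the disclosure does not fix a network architecture or API.

```python
from typing import NamedTuple

class ModelOutput(NamedTuple):
    reply_data: str       # first real-time reply data (the answer)
    behavior_data: dict   # first real-time behavior data (expression, gaze, ...)

class ThirdModel:
    """Stand-in for one individual user's trained third deep learning model."""
    def __init__(self, user_id: str):
        self.user_id = user_id

    def infer(self, question: str) -> ModelOutput:
        # placeholder forward pass; a trained network would run here
        return ModelOutput(f"[{self.user_id}] answer to: {question}",
                           {"expression": "neutral", "gaze": "forward"})

models = {"student_1": ThirdModel("student_1")}   # one model per individual user

def answer_real_time(user_id: str, question: str) -> ModelOutput:
    return models[user_id].infer(question)

print(answer_real_time("student_1", "Why does the little horse cross the river?"))
```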
And S16, taking the first real-time reply data as reply content, controlling the reply behavior of the interactive robot according to the first real-time behavior data, and carrying out first reply by using the interactive robot.
Specifically, the reply content refers to the interactive robot's reply answer; the reply behavior refers to the robot's actions, expressions, sounds, and similar behaviors during real-time interaction; and the first reply refers to the interactive robot's reply to the real-time interaction initial question.
The interactive robot displays, on the system's display device, the display image preset or selected by the individual user. When replying to the real-time interaction initial question for the first time, the robot's reply behaviors, such as actions, expressions, and eyeball movement, are controlled according to the first real-time behavior data. The output modes of the first real-time reply data include text display and voice output. In the voice output mode, the first real-time behavior data further includes a sound element, such as sound frequency, according to which the first real-time reply can be controlled to approach the individual user's voice (a rendering sketch follows below). It will be appreciated that during steps S14 to S22 of the real-time interaction, the individual user may take part as a spectator, learning or acquiring the required knowledge during the interaction, and may also correct the interactive robot's answers or answering behavior.
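As a rough illustration of the two output modes and the sound element just described, the sketch below renders a first reply in text or voice mode, assuming a hypothetical TTS call and a `sound_frequency` field; none of these names come from this disclosure.

```python
def render_reply(reply: str, behavior: dict, mode: str = "voice") -> None:
    """Render reply content in text or voice mode, driving the robot's
    display behavior from the real-time behavior data."""
    if mode == "text":
        print(reply)                                   # text display mode
        return
    pitch = behavior.get("sound_frequency", 1.0)       # sound element of the behavior data
    # A real system would call a TTS engine here; we only log the control values.
    print(f"TTS(reply={reply!r}, pitch={pitch}, "
          f"expression={behavior.get('expression')}, gaze={behavior.get('gaze')})")

render_reply("The little horse crosses to deliver grain.",
             {"sound_frequency": 0.9, "expression": "confident", "gaze": "forward"})
```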
Step S17, performing first real-time answer comparison between the first real-time reply data and its preset answer, and performing first real-time comparison matching between the first real-time behavior data and a preset model in a preset model library.
The preset answer of the first real-time reply data is the preset answer of the real-time interaction initial question. After the interactive robot's first real-time reply data and first real-time behavior data are obtained, the first real-time reply data is compared with the preset answer of the real-time interaction initial question; the comparison results include matched, unmatched, and unrecognizable. The first real-time behavior data is compared with a preset model in a preset model library. Based on the above embodiment, the user's behavior is modeled while the third deep learning model is trained, so the first real-time behavior data generated during real-time interaction can be compared and matched with the pre-generated user behavior model in the preset model library; the comparison result is either matched or unmatched. The behavior comparison result can be used to evaluate whether the behavior exhibited by the interactive robot matches the individual user's behavior.
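Under assumed data shapes, the two comparisons of step S17 could be sketched as follows; the disclosure names only the comparison outcomes, so the equality test and the agreement threshold are illustrative.

```python
def compare_answer(reply: str | None, preset_answer: str) -> str:
    """First real-time answer comparison: matched / unmatched / unrecognizable."""
    if not reply:
        return "unrecognizable"
    return "matched" if reply.strip() == preset_answer.strip() else "unmatched"

def compare_behavior(behavior: dict, user_model: dict, threshold: float = 0.8) -> bool:
    """First real-time comparison matching against the pre-built user behavior model."""
    shared = set(behavior) & set(user_model)
    if not shared:
        return False
    agreement = sum(behavior[k] == user_model[k] for k in shared) / len(shared)
    return agreement >= threshold

print(compare_answer("To deliver grain", "To deliver grain"))       # matched
print(compare_behavior({"gaze": "forward", "expression": "calm"},
                       {"gaze": "forward", "expression": "calm"}))  # True
```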
And S18, based on the first real-time answer comparison result, calling and outputting the real-time heuristic question for the real-time interaction initial question from the preset question library.
Referring to the above embodiment, when the interactive robot's first real-time answer comparison result is unmatched or unrecognizable, the real-time heuristic question for the real-time interaction initial question is called from the preset heuristic question library and output by voice or text display.
And S19, inputting the real-time heuristic question back into the third deep learning model for processing to obtain second real-time reply data and second real-time behavior data.
The real-time heuristic question is input into the corresponding third deep learning model for processing to obtain second real-time reply data and second real-time behavior data. The second real-time reply data is the reply, close to the individual user's way of thinking, that the interactive robot obtains by processing the heuristic question based on the third deep learning model, and the second real-time behavior data is the behavior data, close to the individual user's behavior habits, obtained in the same way.
And S20, taking the second real-time reply data as reply content, controlling the reply behaviors of the preset robot according to the second real-time behavior data, and carrying out a second reply by using the preset robot.
And controlling the action, expression, eyeball rotation and other reply actions of the robot according to the second real-time behavior data. The output modes of the second real-time reply data comprise a text display mode and a voice output mode. If the output mode is a voice output mode, the second real-time behavior data further includes a sound element, for example, a sound frequency, etc., and the second real-time reply data can be controlled to be close to the voice of the individual user according to the sound element.
And S21, performing second real-time answer comparison between the second real-time reply data and its preset answer, and performing second real-time comparison matching between the second real-time behavior data and a preset model in a preset model library.
The preset answer of the second real-time reply data is the preset answer of the heuristic question. After the second real-time reply data and second real-time behavior data are obtained, the second answer comparison is carried out between the second real-time reply data and the preset answer of the heuristic question; the comparison results include matched, unmatched, and unrecognizable.
And the second real-time behavior data is compared and matched a second time with a preset model in the preset model library, so as to judge whether the interactive robot's behavior when replying to the heuristic question is consistent with the individual user's own behavior when answering questions.
Step S22, generating and outputting real-time paraphrasing feedback information based on the second real-time answer comparison result; and generating and outputting behavior matching evaluation information based on the result of the first real-time comparison and matching and the result of the second real-time comparison and matching.
Based on step S21, when the second real-time answer comparison result is incorrect or cannot be matched, paraphrase feedback information is generated and output, wherein the paraphrase feedback information includes the preset answer information of the heuristic question. If the second real-time answer comparison result is correct, evaluation information indicating a correct answer is generated and output.
The first real-time comparison and matching result refers to the result of comparing the first real-time behavior data with the preset model, and the second real-time comparison and matching result refers to the result of comparing the second real-time behavior data with the preset model. From these results it can be judged whether the interactive robot's exhibited behavior matches the individual user's behavior, and corresponding behavior matching evaluation information is generated; based on this evaluation information, it can further be decided whether the third deep learning model needs further training and refinement.
In the above interaction method, the trained third deep learning model serves as the data processing model of the interactive robot, so the interactive robot can take the individual user's place and interact with the outside world in a way of thinking and with behavior habits close to the user's. This raises the intelligence level of the interaction, brings it closer to everyday human communication, and improves the interactive experience.
In some embodiments, there are two or more interactive robots and third deep learning models, each corresponding to a different individual user.
This embodiment is described with the specific field of live remote education. Based on the above embodiment, the interaction method can be applied to the technical field of live remote education. Each student, as an individual user, can select a corresponding interactive robot and train it based on the above method to obtain a corresponding third deep learning model and interactive robot, and the interactive robot participates in the online course.
Step S14 includes:
Step S141, acquiring interaction information transmitted remotely by the hosting live terminal.
The hosting live terminal is the terminal corresponding to the teacher in the online remote course, and the interaction information is the teaching content information the teacher transmits remotely to the system through the live terminal. As shown in fig. 3, in a live broadcast scene, when the teacher teaches remotely in the same interactive environment, the system can receive the remote teacher's audio information through a program module and, meanwhile, use 3DMAX, Unity3D, or similar technology to project the different students' question-back robots onto the interactive device terminal interface for display, so that both the online students and the live teacher can see them on the large interactive terminal interface.
And step S142, keyword segmentation and classification are carried out on the interaction information, and the division components of the interaction information are extracted.
After the system acquires the interaction information, keyword segmentation and classification are performed on it with reference to the method described above, and the division components of the interaction information are extracted.
When the teacher transmits information at the live broadcast end, the system program receives the teacher's audio and other information and converts it into text input for the system, then transmits it independently to, and collects feedback from, each question-back robot in the current system. After each robot receives the transmitted data, it can perform the automatic question-back feedback of the second mode described above, so that the live interface realizes interaction between the teacher and multiple question-back robots on the interaction-end interface.
And step S143, matching the division components with a preset initial question library to generate real-time interaction initial questions, and respectively inputting the real-time interaction initial questions into each third deep learning model for processing.
After the division components are obtained, they are matched against the preset initial question library to generate the real-time interaction initial question, which is then input into the third deep learning models corresponding to the interactive robots of the student individual users participating in the online course. Each interactive robot answers the question according to its third deep learning model's output, and the subsequent inspire-explain-evaluate stages proceed according to the answering situation; a fan-out sketch follows.
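The fan-out just mentioned reduces to mapping one question over every participating student's model, as in this sketch; the callable model interface and stand-in models are assumptions.

```python
def fan_out(question: str, student_models: dict) -> dict:
    """Feed the same real-time initial question to each student's own third
    deep learning model and collect each robot's (reply, behavior) result."""
    return {uid: model(question) for uid, model in student_models.items()}

# Illustration with trivial stand-in models:
models = {f"student_{i}": (lambda q, i=i: (f"answer {i} to {q!r}", {"gaze": "forward"}))
          for i in (1, 2)}
print(fan_out("Why does the little horse cross the river?", models))
```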
In the implementation scenario shown in fig. 3, students interact with the live or remote teacher through their interactive robots in the system according to the above method. On the one hand, a student does not have to show their real face to the surrounding students; at the same time, the robot represents the student in voice and behavior habits, which adds interest. For students who dare not answer, the interactive robot calls up the student's usual behavior features and voice features from the student's behavior database and answers on their behalf directly, easing the cold atmosphere or embarrassment of online teaching. And when the interactive robot answers, if the real individual it represents is not satisfied with the robot's answer, the result can be checked and corrected at the interaction terminal.
In some embodiments, the interaction method is applied to an interaction system, and the interaction system comprises a central control system, a hosting live broadcast end and more than two interaction user ends; each interactive user terminal and each hosting live terminal are respectively connected with the central control system; and each interactive user terminal is provided with a corresponding third deep learning model.
Specifically, taking live remote education as an example, the central control system receives and distributes the interaction data of the hosting live terminal and the interactive user terminals, and generates questions based on information sent by the hosting live terminal. The hosting live terminal is the terminal corresponding to the teacher and is used for inputting interactive content in the specified field to the central control system. The interactive user terminal is the terminal of a student individual user, and a third deep learning model of the corresponding individual user's interactive robot is configured on each interactive user terminal.
The step S14 is followed by:
And S23, sending the real-time interaction initial question to each interactive user terminal, so that each interactive user terminal inputs the real-time interaction initial question into its own third deep learning model for processing.
The central control system generates the real-time interaction initial question based on the interaction information sent by the hosting live terminal and sends it to each interactive user terminal participating in the interaction. After receiving the question, each interactive user terminal inputs it into its own third deep learning model for processing and obtains its own processing result, namely the reply data and behavior data for the real-time interaction initial question.
Step S24, receiving processing results sent by each interactive user terminal.
After each interactive user terminal generates reply data and behavior data aiming at the real-time interactive initial problem, the reply data and the behavior data are used as processing results and sent to a central control system, and the central control system receives the processing results sent by each interactive user terminal.
And S25, generating an interaction reply picture of each corresponding interaction robot according to each processing result.
After receiving the reply data and behavior data sent by each interactive user terminal, the central control system takes the reply data as the reply content of the corresponding interactive robot, controls the reply behavior of the corresponding interactive robot's display image, including actions, expressions, and so on, and generates each corresponding interactive robot's reply picture.
And step S26, distributing each interaction reply picture to all interaction user terminals participating in the interaction.
After the central control system generates each interactive robot's reply picture, it distributes each interaction reply picture to the interactive user terminals of all individual users participating in the interaction, and each interactive user terminal can play the received reply pictures, so that the students in the live course can see the other students' reply pictures.
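A minimal sketch of the steps S23 to S26 round trip, modeled as plain function calls; a real deployment would use network transport between the central control system and the clients, and the client interface and frame format here are purely illustrative.

```python
from dataclasses import dataclass

@dataclass
class InteractiveClient:
    user_id: str

    def process(self, question: str) -> tuple[str, dict]:
        """S23: run the client's local third deep learning model (stubbed)."""
        return (f"{self.user_id}'s answer to {question!r}", {"expression": "calm"})

def central_round(question: str, clients: list[InteractiveClient]) -> dict[str, str]:
    # S23/S24: distribute the question and receive each client's processing result
    results = {c.user_id: c.process(question) for c in clients}
    # S25: build each corresponding robot's interaction reply picture (stub string)
    frames = {uid: f"frame<{uid}: {reply}>" for uid, (reply, _behavior) in results.items()}
    # S26: distribute every reply picture to all participating clients
    for c in clients:
        print(f"-> {c.user_id} receives {len(frames)} reply pictures")
    return frames

central_round("Why does the little horse cross the river?",
              [InteractiveClient("student_1"), InteractiveClient("student_2")])
```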
In the above interaction method, the interaction method is applied to an interaction system comprising a central control system, a hosting live terminal, and two or more interactive user terminals; each interactive user terminal and the hosting live terminal are respectively connected to the central control system, and each interactive user terminal is configured with a corresponding third deep learning model. The real-time interaction initial question is sent to each interactive user terminal so that each terminal inputs it into its own third deep learning model for processing; the processing results sent by each interactive user terminal are received; an interaction reply picture of each corresponding interactive robot is generated from each processing result; and each interaction reply picture is distributed to all interactive user terminals participating in the interaction. In this way, multiple individual users can take part in online remote education and in the interaction through their interactive robots via the interactive user terminals, improving the interactive experience.
In some embodiments, step S22 is followed by:
step S27, prompting a preset robot to reiterate the real-time interaction initial question and the corresponding answer.
In the implementation of the present invention, the third deep learning model is generally obtained by pre-training. Further, during interaction using the third deep learning model, the interaction data generated in the process can be used to further train and refine the model, so that its output comes still closer to the individual user's way of thinking and behavior habits.
Specifically, after the real-time paraphrase feedback information is generated in the interaction process, the interactive robot can be prompted to reiterate the real-time interaction initial question and the corresponding answer contained in the real-time paraphrase feedback information.
Step S28, acquiring third real-time reply data and third real-time behavior data during the interactive robot's reiteration.
When each interactive robot reiterates, the corresponding third real-time reply data and third real-time behavior data are acquired. The third real-time reply data refers to the interactive robot's answer data, and the third real-time behavior data refers to the interactive robot's behavior data during the reiteration, including actions, expressions, and so on.
And step S29, training the third deep learning model based on the real-time interaction initial question, the real-time heuristic question, the first real-time reply data, the first real-time behavior data, the second real-time reply data, the second real-time behavior data, the third real-time reply data, and the third real-time behavior data to obtain an updated third deep learning model.
When training the third deep learning model with data generated during real-time interaction, the real-time interaction initial question is taken as input with the corresponding first real-time reply data and first real-time behavior data as output; the real-time heuristic question is taken as input with the second real-time reply data and second real-time behavior data as output; and the real-time interaction initial question is taken as input with the third real-time reply data and third real-time behavior data as output. These three groups of data are input into the third deep learning model for training to obtain the updated third deep learning model.
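The three input-output pairs just listed can be gathered into one retraining batch, as in this sketch; the batch layout and the commented `fit` call are assumed interfaces, not part of this disclosure.

```python
def update_batch(initial_q, heuristic_q,
                 reply1, behavior1, reply2, behavior2, reply3, behavior3):
    """Assemble the three (input -> output) pairs for retraining the third model."""
    return [
        (initial_q,   (reply1, behavior1)),   # initial question -> first reply/behavior
        (heuristic_q, (reply2, behavior2)),   # heuristic question -> second reply/behavior
        (initial_q,   (reply3, behavior3)),   # reiteration -> third reply/behavior
    ]

batch = update_batch("Q?", "Hint?", "A1", {"gaze": "down"},
                     "A2", {"gaze": "up"}, "A3", {"gaze": "forward"})
# model.fit(batch) would run here to obtain the updated third deep learning model
print(len(batch))
```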
In the interaction method, the third deep learning model is trained by utilizing real-time data generated by the interaction robot in the real-time interaction process, so that the third deep learning model which is more similar to the thinking mode and the behavior habit of the individual user is obtained.
In some embodiments, the interaction method further comprises:
step S30, obtaining correction information of the individual user on one or more of the first real-time reply data, the first real-time behavior data, the second real-time reply data, the second real-time behavior data, the third real-time reply data, and the third real-time behavior data.
In the interactive robot's real-time interaction with the outside world, the robot answers questions in place of the individual user, and the actual reply data or real-time reply behavior may deviate from the individual user's thinking habits or behavior patterns. When the individual user notices such a deviation, they can correct one or more items of the reply data or real-time behavior data during the interaction, and this correction information given during the robot's interaction is acquired.
And step S31, training the third deep learning model according to the correction information to obtain an updated third deep learning model.
After the correction information of the individual user is obtained, training the third deep learning model according to the correction information to obtain an updated third deep learning model. For example, if the user corrects the first real-time reply data to obtain correction information, replacing the corresponding part in the first real-time reply data with the correction information to obtain corrected first real-time reply data, then taking the real-time interaction initial question as input, inputting the corrected first real-time reply data as output into the third deep learning model for training, and obtaining an updated third deep learning model, so that the updated third deep learning model accords with the thinking mode and behavior habit of the corresponding individual user.
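A sketch of the correction-then-retrain flow of steps S30 and S31, assuming each round's data is kept as a dictionary whose fields the user may overwrite; all field names are illustrative.

```python
def apply_correction(round_data: dict, correction: dict) -> dict:
    """Overwrite only the fields the individual user corrected, e.g.
    {'reply': '...'} or {'behavior': {...}}."""
    corrected = dict(round_data)
    corrected.update(correction)
    return corrected

round1 = {"prompt": "Q?", "reply": "wrong answer", "behavior": {"gaze": "down"}}
fixed = apply_correction(round1, {"reply": "the user's own answer"})
# The corrected (prompt -> reply/behavior) pair is then fed back into training
print(fixed["reply"])
```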
In some embodiments, the interaction method further includes:
and step S32, acquiring the display image of the interactive robot selected by the individual user.
It will be appreciated that the individual user may select the display image of the interactive robot before the training step or before the interaction step; step S32 may be performed before step S14 or before step S1. The display image is the image shown when the individual user interacts with the outside world through the interactive robot, and it is chosen freely by the user; it may, for example, be a cartoon image.
Step S33, acquiring a personal image of a user;
The personal image of the user may be a self-photograph uploaded by the user, of the whole body or the upper body.
And step S34, carrying out fusion processing on the display image of the interactive robot selected by the user and the personal image of the user based on an AI fusion technology to obtain a fusion image serving as the final display image of the interactive robot.
Features such as the user's face, clothing, hair ornaments, and daily motion state can be extracted from the personal image and fused into the interactive robot's display image based on AI fusion technology. The fused image serves as the robot's final external display image, so that the robot's display image more closely resembles the individual user, improving the vividness and realism of the display image and the interactive experience.
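A deliberately simplified sketch of the step S34 fusion: a real AI fusion pipeline would run face and feature extraction networks, whereas here the features are plain dictionaries and the fusion is a merge, purely to show the data flow; every function is a stand-in.

```python
def extract_features(personal_image: bytes) -> dict:
    """Stand-in for feature extraction from the user's uploaded photo."""
    # a real system would detect face, clothing, hair ornaments, motion state
    return {"face": "round", "clothing": "school uniform", "hair": "short"}

def fuse_avatar(display_image: dict, personal_image: bytes) -> dict:
    """Blend the extracted personal traits into the selected display image."""
    fused = dict(display_image)
    fused.update(extract_features(personal_image))
    return fused   # final external display image of the interactive robot

avatar = {"base": "cartoon piglet", "face": "generic"}
print(fuse_avatar(avatar, b"<photo bytes>"))
```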
In addition, the present invention also provides a terminal device.
The terminal device of the present invention comprises a memory and a processor, the memory storing a computer program executable by the processor, the computer program implementing the method steps described above when executed by the processor.
For the specific implementation of the terminal device of the present invention, reference may be made to the various embodiments of the above method of the present invention, which are not repeated here.
In addition, the embodiment of the invention also provides a computer readable storage medium.
The computer-readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the steps of the interaction method as described above.
For the method implemented when the program is executed by the processor, reference may be made to the various embodiments of the method of the present invention, which are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises that element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general hardware platform, and of course also by hardware, though in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk), comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods according to the embodiments of the present invention.
The foregoing description covers only the preferred embodiments of the present invention and is not intended to limit the scope of the invention; any equivalent structural or process transformation made using the contents of this specification, applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present invention.

Claims (10)

1. An interaction method, characterized in that the interaction method comprises interacting by using an interactive robot based on deep learning training, and the training method of the interactive robot comprises the following steps:
acquiring target interaction content in a designated field;
performing keyword segmentation and classification on the target interactive content, and extracting the division components of the target interactive content;
matching the division components with a preset initial question library, generating an initial question-back and outputting the initial question-back;
acquiring first reply data and first behavior data of the user for the initial question-back; wherein the first behavior data comprises one or more than two of expression data, sound data, action data, mouth shape data and eyeball data;
taking the initial question-back as input and the first reply data and the first behavior data as output, and inputting them into a neural network for training to obtain a first deep learning model;
performing first answer comparison between the first reply data and its preset answer, and performing first behavior comparison matching between the first behavior data and a preset model in a preset model library, wherein the preset model library comprises one or more than two of a preset expression library, a preset sound library, a preset action library, a preset mouth shape library and a preset eyeball library;
based on the first answer comparison result, invoking the heuristic question-back for the initial question-back from a preset heuristic question library, and outputting the heuristic question-back;
acquiring second reply data and second behavior data of the user for the heuristic question-back;
taking the heuristic question-back as input and the second reply data and the second behavior data as output, and inputting them into the first deep learning model for training to obtain a second deep learning model;
performing second answer comparison between the second reply data and its preset answer, and performing second comparison matching between the second behavior data and a preset model in the preset model library;
generating and outputting paraphrase feedback information based on the second answer comparison result, and prompting the user to reiterate the initial question-back and the corresponding answer;
acquiring third reply data and third behavior data when the user reiterates;
and taking the initial question-back as input and the third reply data and the third behavior data as output, and inputting them into the second deep learning model for training to obtain a third deep learning model.
2. The interaction method of claim 1, wherein the method of interacting with the deep learning training-based interactive robot comprises:
acquiring a real-time interaction initial question;
inputting the real-time interaction initial question into the third deep learning model for processing to obtain first real-time reply data and first real-time behavior data;
the first real-time reply data is used as reply content, the reply behavior of the interactive robot is controlled according to the first real-time behavior data, and the interactive robot is used for carrying out first reply;
performing first real-time answer comparison between the first real-time reply data and its preset answer, and performing first real-time comparison matching between the first real-time behavior data and a preset model in a preset model library;
based on the first real-time answer comparison result, calling and outputting the real-time heuristic question for the real-time interaction initial question from a preset question library;
inputting the real-time heuristic question back into the third deep learning model for processing to obtain second real-time reply data and second real-time behavior data;
the second real-time reply data is used as reply content, the reply behavior of the preset robot is controlled according to the second real-time behavior data, and the preset robot is utilized to reply for the second time;
performing second real-time answer comparison between the second real-time reply data and its preset answer, and performing second real-time comparison matching between the second real-time behavior data and a preset model in the preset model library;
And generating and outputting real-time paraphrase feedback information based on the comparison result of the second real-time answer.
3. The interaction method of claim 2, wherein the number of the interaction robot and the third deep learning model is more than two, corresponding to different individual users, respectively, and the acquiring the real-time interaction initial question comprises:
acquiring interaction information of remote transmission of a hosting live terminal;
performing keyword segmentation and classification on the interaction information, and extracting the division components of the interaction information;
and matching the division components with a preset initial question library to generate real-time interaction initial questions, and respectively inputting the real-time interaction initial questions into each third deep learning model for processing.
4. The interaction method according to claim 3, wherein the interaction method is applied to an interaction system, and the interaction system comprises a central control system, a hosting live terminal and more than two interactive user terminals; each interactive user terminal and the hosting live terminal are respectively connected with the central control system; each interactive user terminal is provided with a corresponding third deep learning model; and after the step of generating the real-time interaction initial question, the interaction method further comprises:
sending the real-time interaction initial questions to each interactive user terminal, so that each interactive user terminal respectively inputs the real-time interaction initial questions into its own third deep learning model for processing;
receiving processing results sent by each interactive user terminal;
generating an interaction reply picture of each corresponding interaction robot according to each processing result;
and distributing each interaction reply picture to all interaction user terminals participating in the interaction.
5. The interaction method of claim 2, wherein after the step of generating and outputting real-time paraphrase feedback information based on the second real-time answer comparison result, the interaction method further comprises:
prompting a preset robot to reiterate the real-time interaction initial questions and the corresponding answers;
acquiring third real-time reply data and third real-time behavior data when the interactive robot is reiterated;
and training the third deep learning model based on the real-time interaction initial question, the real-time heuristic question, the first real-time reply data, the first real-time behavior data, the second real-time reply data, the second real-time behavior data, the third real-time reply data and the third real-time behavior data to obtain an updated third deep learning model.
6. The interaction method of claim 5, wherein the interaction method further comprises:
acquiring correction information of an individual user on one or more of the first real-time reply data, the first real-time behavior data, the second real-time reply data, the second real-time behavior data, the third real-time reply data and the third real-time behavior data;
and training the third deep learning model according to the correction information to obtain an updated third deep learning model.
7. The interactive method according to claim 1, wherein the step of acquiring the target interactive contents of the specified domain comprises:
collecting voice input information of a user;
performing voice recognition on the voice input information based on a natural language processing technology to obtain text conversion information corresponding to the voice input information;
and acquiring target interactive contents in the appointed field based on the text conversion information.
8. The interaction method of claim 2, wherein the interaction method further comprises:
acquiring a display image of the interactive robot selected by the individual user;
acquiring a personal image of a user;
and carrying out fusion processing on the display image of the interactive robot selected by the user and the personal image of the user based on an AI fusion technology to obtain a fusion image serving as the final display image of the interactive robot.
9. A terminal device comprising a memory and a processor, the memory storing a computer program executable by the processor, the computer program implementing the method steps of any one of claims 1-8 when executed by the processor.
10. A computer-readable storage medium, characterized in that a computer program is stored, which, when being executed by a processor, carries out the method steps of any of claims 1-8.