CN116127006A - Intelligent interaction method, language ability classification model training method and device

Info

Publication number
CN116127006A
Authority
CN
China
Prior art keywords
language
target user
capability
target
information
Prior art date
Legal status
Pending
Application number
CN202211319755.9A
Other languages
Chinese (zh)
Inventor
白安琪
蒋宁
吴海英
肖冰
Current Assignee
Mashang Xiaofei Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date
Application filed by Mashang Xiaofei Finance Co Ltd
Priority to CN202211319755.9A
Publication of CN116127006A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/3331 - Query processing
    • G06F16/334 - Query execution
    • G06F16/3344 - Query execution using natural language analysis
    • G06F16/35 - Clustering; Classification
    • G06F16/355 - Class or cluster creation or modification
    • G06F16/36 - Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 - Ontology
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of this application disclose an intelligent interaction method, a language capability classification model training method, and corresponding devices. The intelligent interaction method comprises the following steps: acquiring a target language knowledge graph corresponding to a target user, where the target language knowledge graph is created based on the language capability category of the target user, and the language capability category is determined based on the language capability features of the target user; determining, according to the target language knowledge graph, target knowledge information matched with the language capability category of the target user; acquiring behavior information of the target user, the behavior information including at least one of action information, language information, and emotion information; and determining an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information, and interacting with the target user based on the interaction strategy. This technical scheme achieves a personalized human-computer interaction effect matched with the user's language capability category.

Description

Intelligent interaction method, language ability classification model training method and device
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to an intelligent interaction method, a language capability classification model training method and a device.
Background
Individual language development guidance is usually provided by guardians or teachers based on personal experience. To make such guidance intelligent, related art provides intelligent companion robots that mainly include a voice input component, a touch input component, a sound output component, and a display component. The voice input component extracts the user's voice information from the sound of the surrounding environment; the touch input component receives the user's touch operations through an interactive interface; the sound output component outputs reply voice responding to the voice information or feedback voice responding to the touch operation; and the display component outputs the interactive interface or expression images matched with the voice information and the reply voice. An intelligent companion robot can thus intelligently accompany a user, saving the energy and time that people (such as parents and guardians) spend on companionship. However, existing intelligent companion robots are limited to saving the effort and time spent on human companionship; no effective solution has yet been proposed for guiding the development of individual language ability in a more professional way.
Disclosure of Invention
Embodiments of this application aim to provide an intelligent interaction method, a language capability classification model training method, and corresponding devices, so as to solve the problem that the prior art lacks intelligent guidance for individual language development.
To solve the above technical problem, the embodiments of this application are implemented as follows:
in one aspect, an embodiment of the present application provides an intelligent interaction method, including:
acquiring a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on the language capability class of the target user, and the language capability class of the target user is determined based on the language capability features of the target user;
determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph;
acquiring behavior information of the target user; the behavior information includes at least one of: action information, language information, emotion information;
and determining an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information, and interacting with the target user based on the interaction strategy.
By adopting the technical scheme of the embodiment of the application, the target language knowledge graph corresponding to the target user is obtained, the target knowledge information matched with the language capability category of the target user is determined according to the target language knowledge graph, the behavior information of the target user is obtained, and then the interaction strategy corresponding to the target user is determined according to the target knowledge information and the behavior information of the target user. Because the interaction strategy is determined according to the target knowledge information matched with the language capability type of the target user and the behavior information of the target user, and the target knowledge information is determined according to the target language knowledge graph created based on the language capability type of the target user, when the interaction strategy is used for interacting with the target user, the interaction process can be ensured to be matched with the language capability type of the target user, and the situation that interaction barriers (such as language, action and the like in the interaction process are difficult to understand) are caused when the interaction strategy is not matched with the language capability type of the target user is avoided, so that the target user is guided accurately in the human-computer interaction process. And the interaction strategy can be formulated by comprehensively considering the action information, the language information and/or the emotion information of the target user, so that the intellectualization and individuation of human-computer interaction are improved. In addition, because the target language knowledge graph and the target user have correspondence, namely, the target language knowledge graph and the target user have respective corresponding language knowledge graphs for different users, the technical scheme can realize customized language knowledge graphs for different users on the basis of realizing intelligent interaction with the users, and further realize personalized and customized man-machine interaction effects.
In another aspect, an embodiment of the present application provides a language capability classification model training method, including:
acquiring sample language capability features and a sample language capability category of a sample user; the sample language capability features include object response features and/or sound response features;
inputting the sample language capability features into a language capability classification model to be trained, and classifying the language capability of the sample user to obtain a classification result;
and according to the classification result and the sample language ability category, adjusting model parameters of the language ability classification model to be trained.
By adopting the technical scheme of this embodiment, the sample language capability features and the sample language capability category of a sample user are acquired, the sample language capability features are input into the language capability classification model to be trained to classify the language capability of the sample user and obtain a classification result, and the model parameters of the language capability classification model to be trained are then adjusted according to the classification result and the sample language capability category, yielding a trained language capability classification model. Because the sample language capability features of the sample user include object response features and/or sound response features, the language capability classification model trained on these features can both analyze a user's object response features and/or sound response features and classify the user's language capability according to those features. Furthermore, during intelligent human-computer interaction, the language capability classification model can accurately analyze the language capability category of the target user, providing strong data support for the subsequent interaction process.
In still another aspect, an embodiment of the present application provides an intelligent interaction device, including:
the first acquisition module is used for acquiring a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on the language capability class of the target user, and the language capability class of the target user is determined based on the language capability features of the target user;
the first determining module is used for determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph;
the second acquisition module is used for acquiring the behavior information of the target user; the behavior information includes at least one of: action information, language information, emotion information;
and the second determining module is used for determining an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information and interacting with the target user based on the interaction strategy.
In yet another aspect, an embodiment of the present application provides an electronic device, including a processor and a memory electrically connected to the processor, where the memory stores a computer program, and the processor is configured to call and execute the computer program from the memory to implement the intelligent interaction method of the above aspect, or call and execute the computer program from the memory to implement the language capability classification model training method of the above aspect.
In yet another aspect, embodiments of the present application provide a storage medium storing a computer program executable by a processor to implement the intelligent interaction method of one aspect described above, or executable by a processor to implement the language capability classification model training method of another aspect described above.
Drawings
In order to more clearly illustrate one or more embodiments of the present specification or the prior art, the drawings that are required for the description of the embodiments or the prior art will be briefly described, and it is apparent that the drawings in the following description are only some embodiments described in one or more embodiments of the present specification, and other drawings may be obtained according to these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic flow chart of an intelligent interaction method according to an embodiment of the present application;
FIG. 2 is a schematic block diagram of an initial language knowledge graph according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a target language knowledge graph according to an embodiment of the present application;
FIG. 4 is a schematic application diagram of a language capability classification model according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of an intelligent interaction method according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of an intelligent interaction method according to another embodiment of the present application;
FIG. 7 is a schematic flow chart of a language capability classification model training method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a language capability classification model training method according to an embodiment of the present application;
FIG. 9 is a schematic block diagram of an intelligent interaction device according to an embodiment of the present application;
FIG. 10 is a schematic block diagram of a language capability classification model training device according to an embodiment of the present application;
FIG. 11 is a schematic block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides an intelligent interaction method, a language ability classification model training method and a device, which are used for solving the problem that intelligent guidance for individual language development is lacking in the prior art.
In order to better understand the technical solutions in the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
In terms of individual language development guidance, related technologies provide intelligent companion robots and systems, such as child companion robots and child companion systems, which mainly include a voice input component, a touch input component, a sound output component, and a display component. The voice input component extracts the user's voice information from the sound of the surrounding environment; the touch input component receives the user's touch operations through an interactive interface; the sound output component outputs reply voice responding to the voice information or feedback voice responding to the touch operation; and the display component outputs the interactive interface or expression images matched with the voice information and the reply voice. Taking the child companion robot as an example, during human-computer interaction the child is required to actively make a sound or to operate the interactive interface, for example by triggering a designated button, so that the robot can acquire the sound made by the child or the operation instruction issued through the interface, and then perform the corresponding operation according to the acquired information or instruction. The child companion robot can thus intelligently accompany the child, saving the energy and time parents spend on companionship. However, it is limited to saving the effort and time spent on human companionship and does not focus on how to guide the development of individual language abilities more professionally. According to the intelligent interaction method of the embodiments of this application, the target language knowledge graph corresponding to the target user is acquired, the target knowledge information matched with the language capability category of the target user is determined according to the target language knowledge graph, the behavior information of the target user is acquired, and the interaction strategy corresponding to the target user is then determined according to the target knowledge information and the behavior information. Because the interaction strategy is determined from target knowledge information matched with the target user's language capability category together with the target user's behavior information, and the target knowledge information is determined from a target language knowledge graph created based on that category, interacting according to this strategy ensures that the interaction process matches the target user's language capability category and avoids interaction barriers (such as language or actions that are difficult to understand) that arise when the two are mismatched, thereby guiding the target user accurately during human-computer interaction.
In addition, when the above child companion robot interacts with a child, it acts according to the collected information (such as the sound made by the child or operation instructions issued through the interactive interface) and a built-in universal interaction mode, where "universal" means that the same interaction mode is used for all users. For example, a robot with a built-in dialogue algorithm converses with a child by collecting the sound the child makes and performing voice and semantic analysis on it. Clearly, such a robot does not consider the child's current language development ability when interacting, so the human-computer interaction effect is poor. For example, if the child's language development is still at the simple sentence stage, i.e. the child can only converse in simple sentences, and the robot sends the child a relatively complex long sentence based on its universal interaction mode, the child can hardly understand the robot's meaning and cannot carry out the next interaction. According to the intelligent interaction method of the embodiments of this application, a personalized target language knowledge graph can be customized for the target user, and the interaction strategy can be formulated by comprehensively considering the target knowledge information matched with the target user's language capability category together with the target user's action information, language information, and/or emotion information, so that the target user's current language development ability is fully considered during interaction and the human-computer interaction process becomes more personalized and customized.
The intelligent interaction method and the language capability classification model training method provided by the embodiment of the application can be executed by the electronic equipment or software installed in the electronic equipment, and specifically, the electronic equipment can be terminal equipment or service end equipment. In the embodiment of the application, the electronic device may be an intelligent robot with an interaction function.
FIG. 1 is a schematic flow chart of an intelligent interaction method according to an embodiment of the present application, as shown in FIG. 1, the method includes the following steps S102-S108:
s102, acquiring a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on language capability categories of the target user, which are determined based on language capability features of the target user.
The language capability category is used to represent the user's language capability at the current stage. Optionally, according to the development stage of language capability, the following language capability categories may be distinguished: language preparation period, language completion period, language intelligence period, dialogue period, writing period, echo period, call period, speech center, writing center, visual language center, and so on. The development stages of language capability can also be divided along a finer dimension; for example, the language preparation period comprises a single-word stage, a double-word stage, and a simple sentence stage, and the language completion period comprises a compound sentence stage and a question stage; and so on.
The target language knowledge graph comprises a plurality of knowledge nodes and the knowledge information corresponding to each knowledge node, and each knowledge node corresponds to one language capability category. That is, in the target language knowledge graph, each language capability category has one corresponding knowledge node. For example, the target language knowledge graph includes a knowledge node corresponding to the language preparation period, and the next-level knowledge nodes of that node are the knowledge nodes corresponding respectively to the single-word stage, the double-word stage, and the simple sentence stage.
Alternatively, the language capability class of the target user may be determined based on a pre-trained language capability classification model that is trained based on sample language capability features of a plurality of sample users and sample language capability classes. The specific training method of the language capability classification model will be described in detail in the following embodiments, which are not repeated here. Alternatively, the language capability class of the target user may be determined based on other means, not limited to a pre-trained language capability classification model. For example, the corresponding relation between the different language capability features and the language capability categories is preset, so that the language capability category corresponding to the language capability feature of the target user is determined according to the corresponding relation.
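As an illustration of the preset-correspondence alternative just mentioned, the following minimal Python sketch maps response sensitivities to a language capability category. The thresholds, the category names chosen, and the averaging rule are illustrative assumptions, not values given in this application.

```python
# Hypothetical rule-based mapping from language capability features to a
# category. Thresholds and the averaging rule are assumptions for illustration.

def classify_by_rule(object_sensitivity: float, sound_sensitivity: float) -> str:
    """Map normalized response sensitivities in [0, 1] to a capability category."""
    score = (object_sensitivity + sound_sensitivity) / 2  # simple average (assumption)
    if score < 0.3:                       # illustrative threshold
        return "language preparation period"
    if score < 0.7:                       # illustrative threshold
        return "language completion period"
    return "language intelligence period"

print(classify_by_rule(0.9, 0.1))  # (0.9 + 0.1) / 2 = 0.5 -> "language completion period"
```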
S104, determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph.
The target knowledge information matched with the language capability category of the target user can be understood as the language information that the target user can learn at the stage corresponding to his or her language capability category, including speech (such as characters, words, and sentences), semantics, grammar, language skills (such as single words and compound sentences), and the like.
S106, obtaining behavior information of a target user; the behavior information includes at least one of: action information, language information, emotion information.
The action information is an action performed by the target user, such as raising a hand or taking food; the language information is the sound made by the target user; the emotion information is information capable of representing the target user's emotion, including moods (such as crying and smiling) and facial expressions (such as squinting).
S108, determining an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information, and interacting with the target user based on the interaction strategy.
The interaction strategy may include interaction modes, interaction content corresponding to each interaction mode, and the like, and the interaction modes may include language interaction and/or action interaction.
Optionally, the interaction mode for the target user can be determined according to the behavior information of the target user, and the interaction content corresponding to the interaction mode can be further determined according to the target knowledge information matched with the language capability type of the target user.
For example, if, in the target knowledge information matched with the language capability category of the target user, the sentence form that the target user can learn is the single word, and the behavior information of the target user is looking at an apple in the room, the interaction strategy corresponding to the target user can be determined as follows: the interaction modes include language interaction and action interaction; the interaction content corresponding to the language interaction mode is uttering the single-word speech "apple"; and the interaction content corresponding to the action interaction mode is taking the apple to the target user and presenting the image "apple" on a screen pre-configured to be visible to the target user.
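The following sketch illustrates one possible shape of this strategy-determination logic for the apple example above. The data structures, field names such as "looking_at", and the selection rules are assumptions for illustration, not a definitive implementation of S108.

```python
# Hypothetical sketch of step S108: behavior information selects the
# interaction modes, and target knowledge information supplies the content.

def build_interaction_strategy(target_knowledge: dict, behavior: dict) -> dict:
    strategy = {"modes": [], "content": {}}
    # Language interaction: utter a word the user can learn at this stage.
    if target_knowledge.get("sentence_form") == "single word" and behavior.get("looking_at"):
        strategy["modes"].append("language")
        strategy["content"]["language"] = behavior["looking_at"]
    # Action interaction: fetch and display the object the user is attending to.
    if behavior.get("looking_at"):
        strategy["modes"].append("action")
        strategy["content"]["action"] = f"bring {behavior['looking_at']} and show its image"
    return strategy

knowledge = {"sentence_form": "single word", "words": ["apple"]}
behavior = {"looking_at": "apple"}
print(build_interaction_strategy(knowledge, behavior))
```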
By adopting the technical scheme of the embodiment of the application, the target language knowledge graph corresponding to the target user is obtained, the target knowledge information matched with the language capability category of the target user is determined according to the target language knowledge graph, the behavior information of the target user is obtained, and then the interaction strategy corresponding to the target user is determined according to the target knowledge information and the behavior information of the target user. Because the interaction strategy is determined according to the target knowledge information matched with the language capability type of the target user and the behavior information of the target user, and the target knowledge information is determined according to the target language knowledge graph created based on the language capability type of the target user, when the interaction strategy is used for interacting with the target user, the interaction process can be ensured to be matched with the language capability type of the target user, and the situation that interaction barriers (such as language, action and the like in the interaction process are difficult to understand) are caused when the interaction strategy is not matched with the language capability type of the target user is avoided, so that the target user is guided accurately in the human-computer interaction process. And the interaction strategy can be formulated by comprehensively considering the action information, the language information and/or the emotion information of the target user, so that the intellectualization and individuation of human-computer interaction are improved. In addition, because the target language knowledge graph and the target user have correspondence, namely, the target language knowledge graph and the target user have respective corresponding language knowledge graphs for different users, the technical scheme can realize customized language knowledge graphs for different users on the basis of realizing intelligent interaction with the users, and further realize personalized and customized man-machine interaction effects.
In one embodiment, the method for obtaining (or generating) the target language knowledge graph may include the following steps A1-A5:
a1, acquiring a pre-established initial language knowledge graph; the initial language knowledge graph includes: the system comprises a plurality of knowledge nodes, and user information and knowledge information corresponding to each knowledge node.
The initial language knowledge graph can be generated according to classical language development theory. Classical language development theory, also called "language acquisition theory", explains how children acquire the ability to speak and listen in their native spoken language; the composition of language includes the three components of speech, grammar, and semantics. For language to serve as an effective interaction tool, both speaker and listener must master a range of skills and rules, namely language skills. In the course of development, children gradually master speech, grammar, semantics, and language skills, thereby acquiring the ability to learn and understand their native language.
Optionally, firstly, based on classical language development theory, all entities in the language development theory are obtained through entity recognition technology, including the speech, grammar, semantics and language skills required to be learned in each language development stage, and the relationships among all the entities are obtained through relationship extraction technology and attribute extraction technology, then, based on the obtained entities and the relationships among the entities, each language development stage is taken as a knowledge node, and the knowledge required to be learned in each language development stage (including the speech, grammar, semantics and language skills) is taken as knowledge information associated with the corresponding knowledge node, so that a complete initial language knowledge graph is constructed. The entity recognition technology, the relationship extraction technology and the attribute extraction technology all belong to the prior art, so that the description thereof is omitted.
Optionally, an arrow between two adjacent knowledge nodes represents the development direction between the language development stages corresponding to those nodes. For different dimensions of dividing language capability, the graph can be constructed using parent-child nodes. For example, the child nodes of the knowledge node "language preparation period" include the knowledge nodes corresponding respectively to the single-word stage, the double-word stage, and the simple sentence stage.
Each knowledge node corresponds to respective user information. The user information may include at least one of a user age group, user identity information, and the like. The age of a user may be represented by a specific age range, such as including a 0-1 year old stage, a 1-2 year old stage, and so on. The user identity information may include infants, preschool children, pupils, and the like.
Fig. 2 is a schematic block diagram of an initial language knowledge graph according to an embodiment of the present application. Limited by figure size, Fig. 2 shows only some knowledge nodes of the initial language knowledge graph; knowledge nodes and child nodes not shown are represented by ellipses. An arrow between two adjacent knowledge nodes represents the development direction between the corresponding language development stages. For example, there is an arrow between the knowledge node "single-word stage" and the knowledge node "double-word stage" pointing to the "double-word stage", indicating that the language development direction is from the single-word stage to the double-word stage. It should be noted that between two adjacent knowledge nodes of the same level, or among the child nodes of the same knowledge node, an arrow may or may not exist; if no arrow exists between two knowledge nodes, there is no connection between them.
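A minimal sketch of the graph structure just described, assuming a simple in-memory representation: one node per development stage, carrying user information and knowledge information, with parent-child links for finer division dimensions and directed links for the arrows. The concrete stage names, field shapes, and age range are illustrative.

```python
# Hypothetical in-memory shape of the initial language knowledge graph.

from dataclasses import dataclass, field

@dataclass
class KnowledgeNode:
    name: str                                       # e.g. "single-word stage"
    user_info: dict = field(default_factory=dict)   # e.g. age range, identity
    knowledge: dict = field(default_factory=dict)   # speech, grammar, semantics, skills
    children: list = field(default_factory=list)    # finer-grained sub-stages
    next_nodes: list = field(default_factory=list)  # arrows: development direction

prep = KnowledgeNode("language preparation period", user_info={"age": "0-1"})
single = KnowledgeNode("single-word stage", knowledge={"skills": ["single words"]})
double = KnowledgeNode("double-word stage", knowledge={"skills": ["two-word phrases"]})
single.next_nodes.append(double)        # arrow: single-word -> double-word stage
prep.children.extend([single, double])  # parent-child links for the finer dimension
```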
And step A2, determining knowledge information matched with the user information of the target user according to the initial language knowledge graph and the user information of the target user.
Wherein the user information of the target user may include at least one of user age, user identity information, and the like of the target user. By comparing the user information of the target user with the user information corresponding to each knowledge node in the initial language knowledge graph, the knowledge node matched with the user information of the target user can be determined, and further the knowledge information corresponding to the matched knowledge node is determined, namely the knowledge information matched with the user information of the target user. The knowledge information matched with the user information of the target user is the knowledge information which the target user should learn at the present stage, and can also be called as priori knowledge information.
A3, collecting language capability features of the target user based on the knowledge information matched with the user information of the target user; the language capability features include object response features and/or sound response features.
Optionally, when collecting the object response features of the target user, an object-related action matched with the target user's prior knowledge information may be performed toward the target user, such as moving the food "apple" from some location to in front of the target user's eyes, and the target user's response information to that action is then collected. When collecting the sound response features of the target user, voice information matched with the target user's prior knowledge information may be played to the user; for example, if the target user can learn single words at the current stage, the speech "apple" can be played to the target user, and the target user's response information to that speech is then collected.
Optionally, an object response feature may be characterized by the response sensitivity to objects, and a sound response feature by the response sensitivity to sound. When collecting the target user's response information to an action or speech, the target user's brain signals can be collected using existing brain signal acquisition technology, where the brain signals include at least one of brain wave data and brain language partition brightness. By analyzing the brain signals, the target user's response information, such as response sensitivity, can be determined. Brain wave data reflect the electrical fluctuations produced by brain activity, so the larger and faster the wave changes in the brain wave data, the higher the target user's response sensitivity; conversely, the smaller and slower the wave changes, the lower the response sensitivity. The brightness of the brain language partition reflects its activity level, so the higher the brightness, the higher the target user's response sensitivity; conversely, the lower the brightness, the lower the response sensitivity.
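The following hedged sketch turns the qualitative rules above into numbers: larger and faster brain wave changes, or a brighter language partition, yield a higher response sensitivity. The scaling constants and the equal weighting of amplitude and speed are assumptions; the application does not specify a formula.

```python
# Hypothetical conversion of collected brain signals into a response
# sensitivity in [0, 1]. Constants are illustrative assumptions.

import numpy as np

def sensitivity_from_brainwaves(samples: np.ndarray, max_amplitude: float = 100.0) -> float:
    """Estimate response sensitivity from a series of brain-wave samples."""
    amplitude = samples.max() - samples.min()   # how large the wave change is
    speed = np.abs(np.diff(samples)).mean()     # how fast the wave changes
    raw = 0.5 * (amplitude / max_amplitude) + 0.5 * (speed / max_amplitude)
    return float(min(raw, 1.0))

def sensitivity_from_brightness(brightness: float, max_brightness: float = 255.0) -> float:
    """Higher language-partition brightness implies higher response sensitivity."""
    return min(brightness / max_brightness, 1.0)
```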
And step A4, determining the language capability class of the target user according to the language capability characteristics.
And step A5, generating a target language knowledge graph according to the initial language knowledge graph and the language capability class of the target user.
In this embodiment, each knowledge node corresponds to a language capability class. Based on the above, when the target language knowledge graph is generated according to the initial language knowledge graph and the language capability class of the target user, the language capability class of the target user and each knowledge node are respectively matched, and a first knowledge node matched with the language capability class of the target user is determined according to the matching result. And then generating a target language knowledge graph according to the first knowledge node. The target language knowledge graph comprises a first knowledge node and knowledge information corresponding to the first knowledge node. Therefore, when determining the target knowledge information matched with the language capability class of the target user according to the target language knowledge graph, the knowledge information corresponding to the first knowledge node can be determined as the target knowledge information.
In addition, in order to make the target language knowledge graph have language guiding function, the target language knowledge graph can be generated according to the first knowledge node and the second knowledge node. The target language knowledge graph comprises: the first knowledge node, the second knowledge node, knowledge information corresponding to the first knowledge node and knowledge information corresponding to the second knowledge node. The second knowledge node is at least one knowledge node adjacent to the first knowledge node. Optionally, the second knowledge node is the next knowledge node adjacent to the first knowledge node, so that the language capability type of the next stage of the target user and knowledge information to be learned by the next stage of the target user can be obtained according to the target language knowledge graph. Based on this, when determining the target knowledge information matching the language capability class of the target user according to the target language knowledge graph, the target knowledge information may be determined according to the knowledge information corresponding to the first knowledge node and/or the second knowledge node, for example, the first knowledge node includes a plurality of sub-nodes, and an arrow for representing the stage development direction is provided between every two adjacent sub-nodes, and if the target user currently corresponds to the first sub-node of the first knowledge node (i.e., the sub-node of the earliest stage), the knowledge information corresponding to the first knowledge node may be determined as the target knowledge information, or the knowledge information corresponding to the first sub-node of the first knowledge node may be determined as the target knowledge information. If the target user currently corresponds to the last child node (i.e., the child node of the latest stage) of the first knowledge node, since the next stage of the last child node will be the second knowledge node, in order to gradually guide the language capability development of the target user, knowledge information corresponding to the first knowledge node and knowledge information corresponding to the second knowledge node may be determined together as target knowledge information, or knowledge information corresponding to the last child node of the first knowledge node and knowledge information corresponding to the second knowledge node may be determined together as target knowledge information.
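A minimal sketch of step A5 under a hypothetical flat node shape: locate the first knowledge node whose category matches the target user's language capability class, and take the adjacent next-stage node as the second knowledge node.

```python
# Hypothetical sketch of generating the target language knowledge graph from
# the initial graph and the user's language capability category.

def build_target_graph(nodes: list, user_class: str) -> dict:
    """nodes: list of dicts {"category", "knowledge", "next"} (assumed shape)."""
    first = next(n for n in nodes if n["category"] == user_class)
    second = first.get("next")  # adjacent node: the next development stage, if any
    return {"first": first, "second": second}

nodes = [
    {"category": "single-word stage", "knowledge": {"words": ["apple"]}, "next": None},
    {"category": "double-word stage", "knowledge": {"phrases": ["eat apple"]}, "next": None},
]
nodes[0]["next"] = nodes[1]
print(build_target_graph(nodes, "single-word stage")["second"]["category"])  # double-word stage
```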
In one embodiment, the knowledge nodes included in the target language knowledge graph may be understood as knowledge nodes displayed on the front-end interface, while the other knowledge nodes that are included in the initial language knowledge graph but do not belong to the target language knowledge graph may be understood as knowledge nodes hidden on the front-end interface. Specifically, in response to a display instruction for the target language knowledge graph, the knowledge nodes included in the target language knowledge graph, such as the first knowledge node, or the first and second knowledge nodes, are displayed on the front-end interface, while the other knowledge nodes included in the initial language knowledge graph but not belonging to the target language knowledge graph are hidden. In addition, the front-end interface may provide an input port for display instructions for the other knowledge nodes, and when a display instruction input through this port is received, the hidden knowledge nodes are displayed.
Fig. 3 is a schematic structural diagram of a target language knowledge graph according to an embodiment of the present application. In this embodiment, assuming that the language capability category corresponding to the target user's current stage is the single-word stage and the category corresponding to the next stage is the double-word stage, the target language knowledge graph shown in Fig. 3 only includes the knowledge nodes corresponding to the single-word stage and the double-word stage.
In this embodiment, by hiding the other knowledge nodes that are included in the initial language knowledge graph but do not belong to the target language knowledge graph, only the knowledge nodes included in the target language knowledge graph are displayed, so that the user can clearly know the target user's language capability category at the current stage. Moreover, when the target language knowledge graph includes the second knowledge node, the user can also clearly know the target user's language capability category at the next stage. In addition, human-computer interaction with the target user can be carried out based on the knowledge information corresponding to the second knowledge node included in the target language knowledge graph, so that the target user is guided through the interaction toward the content to be learned in the next stage, which improves the target user's interaction experience and can accelerate the development of the target user's language capability.
In one embodiment, when performing step A4, the language capability class of the target user may be determined according to the language capability features of the target user and the pre-trained language capability classification model. Specifically, the language capability features of the target user may be input into a pre-trained language capability classification model to obtain a language capability class of the target user, where the language capability classification model is obtained based on sample language capability features of the sample user and sample language capability class training.
The language capability features may include language capability feature values, and when there are multiple language capability features, each corresponds to a respective weight. Optionally, the language capability features include an object response feature characterized by the response sensitivity to objects and a sound response feature characterized by the response sensitivity to sound. Therefore, the response sensitivity to objects can be used as the object response feature value, and the response sensitivity to sound as the sound response feature value.
Optionally, to facilitate model calculation, the linguistic capability feature value may be normalized to obtain a normalized linguistic capability feature value. The normalized language ability characteristic value is distributed between 0 and 1. The normalized linguistic capability feature values are then input into a linguistic capability classification model.
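The application does not name the normalization method; the following sketch shows ordinary min-max scaling into [0, 1] as one plausible choice.

```python
# Min-max normalization of a language capability feature value into [0, 1];
# the method and the bounds here are assumptions, not specified by the patent.

def normalize(value: float, lo: float, hi: float) -> float:
    """Scale a raw feature value into [0, 1] given its observed range."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)

print(normalize(45.0, 0.0, 50.0))  # 0.9, e.g. a normalized object response value
```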
In one embodiment, the linguistic capability features may include linguistic capability feature values, and the pre-trained linguistic capability classification model includes a score calculation layer and a classification layer. When determining the language capability class of the target user according to the language capability features and the pre-trained language capability classification model, firstly inputting the language capability feature values into the pre-trained language capability classification model, and then executing the following steps B1-B2 by utilizing the language capability classification model:
And B1, calculating the language capability score of the target user according to the language capability feature value corresponding to each language capability feature and the weight corresponding to each language capability feature through a score calculation layer.
In this step, the language capability feature values can be weighted and summed according to the weights respectively corresponding to the language capability features, yielding the language capability score of the target user. For example, if the object response feature value has a weight of 0.6 and the sound response feature value a weight of 0.4, and the target user's object response feature value is determined to be 0.9 and sound response feature value 0.1, the target user's language capability score is 0.9 × 0.6 + 0.1 × 0.4 = 0.58.
And B2, determining the language capability category of the target user according to the language capability score of the target user and the preset mapping relation between the language capability score and the language capability category through a classification layer.
In the preset mapping relationship between language capability scores and language capability categories, the higher the score, the more advanced the language development stage of the corresponding category.
FIG. 4 is a schematic application diagram of a linguistic capability classification model, shown according to one embodiment of the present application. As shown in fig. 4, the language capability feature value is input into a language capability classification model, specifically, a score calculation layer of the language capability classification model is input, the language capability score of the target user is calculated through the score calculation layer, then the language capability score is input into a classification layer of the language capability classification model, and the classification layer classifies the language capability of the target user according to a preset mapping relationship between the language capability score and the language capability class, so that the language capability class of the target user is output.
For example, the language capability score of the target user is calculated to be 0.58. If, in the preset mapping relationship between language capability scores and language capability categories, the category corresponding to 0.58 is the language completion period, the language capability category of the target user can be determined to be the language completion period.
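A sketch of the two layers, reproducing the worked example above (weights 0.6 and 0.4, feature values 0.9 and 0.1, score 0.58). The score-to-category boundaries are illustrative assumptions; the application only states that higher scores map to later development stages.

```python
# Hypothetical score calculation layer and classification layer of the
# language capability classification model. Boundary values are assumptions.

def score_layer(features: dict, weights: dict) -> float:
    """Weighted sum of normalized language capability feature values."""
    return sum(features[name] * weights[name] for name in features)

def classification_layer(score: float) -> str:
    """Map the language capability score to a category via a preset mapping."""
    mapping = [(0.4, "language preparation period"),
               (0.8, "language completion period"),
               (1.01, "language intelligence period")]
    for upper_bound, category in mapping:
        if score < upper_bound:
            return category
    return mapping[-1][1]

features = {"object_response": 0.9, "sound_response": 0.1}
weights = {"object_response": 0.6, "sound_response": 0.4}
score = score_layer(features, weights)      # 0.9*0.6 + 0.1*0.4 = 0.58
print(score, classification_layer(score))   # 0.58 "language completion period"
```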
In one embodiment, during human-computer interaction with the target user, the interaction feature information of the target user in the interaction process can be acquired, where the interaction feature information includes object response features and/or sound response features. It is then judged whether the interaction feature information matches the language capability category of the target user. If not, the language capability features of the target user are re-determined according to the acquired interaction feature information, yielding updated language capability features. The language capability category of the target user is then determined again according to the updated features, yielding the language capability category to be updated, and the target language knowledge graph is updated according to that category.
Because the target language knowledge graph can include the first knowledge node, the second knowledge node, and the knowledge information corresponding to each, when guiding the language capability development of the target user through the target language knowledge graph, the knowledge information corresponding to the second knowledge node (that is, taking the second knowledge node as the target knowledge node) can be used for human-computer interaction with the target user. If, according to the interaction feature information collected during the interaction, the language capability category of the target user is determined to match the second knowledge node, the target user's language capability is considered to have developed to the stage corresponding to the second knowledge node. In this case, the second knowledge node can be updated to be the new first knowledge node in the target language knowledge graph.
In this embodiment, when the language capability category of the target user is re-determined according to the interaction feature information, the process is similar to determining the language capability category through steps A3-A4 in the above embodiment and is not repeated here.
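A minimal sketch of this update logic, under the same hypothetical node shape as the earlier sketches: if the re-classified category matches the second knowledge node, that node is promoted to the new first knowledge node.

```python
# Hypothetical update step for the target language knowledge graph.

def maybe_update_target_graph(target: dict, interaction_features: dict, classify) -> dict:
    """target: {"first": node, "second": node}; classify: features -> category."""
    new_class = classify(interaction_features)
    if new_class == target["first"]["category"]:
        return target  # still matching the current stage: no update needed
    if target["second"] and new_class == target["second"]["category"]:
        # Language ability has developed to the next stage: promote the node.
        return {"first": target["second"], "second": target["second"].get("next")}
    return target
```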
In one embodiment, the human-computer interaction process may be actively triggered by the electronic device, or the electronic device may monitor behavior information of the target user, and trigger the human-computer interaction process when the behavior information is monitored.
If the human-computer interaction process is actively triggered by the electronic device, then when the behavior information of the target user is acquired (i.e., when S106 is executed), the target user can be triggered to perform a behavior event according to the target knowledge information matched with the language capability category of the target user, and the behavior information corresponding to that behavior event is then acquired. The behavior information corresponding to the behavior event may include at least one of action information, language information, and emotion information.
Optionally, when the behavior information corresponding to the behavior event is acquired, multimedia data, such as audio/video data, of the behavior event executed by the target user may be acquired first, then the multimedia data is processed to obtain audio data and image data of the behavior event executed by the target user, and finally the behavior information corresponding to the behavior event may be obtained by analyzing the audio data and the image data.
If the electronic device monitors the behavior information of the target user and triggers the human-computer interaction process when behavior information is detected, the behavior information may be monitored in real time or according to a preset period, where the behavior information may include at least one of action information, language information, and emotion information. When behavior information is detected, the human-computer interaction process is triggered and S108 of the above embodiment is executed, that is, the interaction strategy corresponding to the target user is determined and interaction with the target user is performed based on that strategy.
The following specific embodiments describe in detail how human-computer interaction with a target user is performed under the two different triggering manners of the human-computer interaction process. In the specific embodiments shown in Fig. 5 and Fig. 6, a target language knowledge graph created in advance for the target user is deployed in an intelligent robot, and the robot has a display interface that can display the target user's target language knowledge graph and present graphic and text information during human-computer interaction.
Optionally, at least one set of login information may be deployed in the intelligent robot in advance, each set including one login account and a corresponding login password. Each set of login information corresponds to one target user, and for each set, the target language knowledge graph of the corresponding target user is stored in association with it. Before the intelligent robot is used for human-computer interaction, a user needs to log in with login information; the robot determines, according to the currently logged-in login information, the target language knowledge graph stored in association with it, and carries out human-computer interaction based on the determined graph.
By deploying at least one set of login information in the intelligent robot, the same robot can accompany multiple target users, and the security of each target user's target language knowledge graph can be ensured. Specifically, if the login information provided when logging in to the intelligent robot is wrong (e.g., the login account and password do not match), the robot does not display the internally stored target language knowledge graph, which prevents others from learning the target user's graph and ensures its security.
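A minimal sketch of the login scheme, assuming a simple in-memory account table; the account names are hypothetical, and storing plain-text passwords is for illustration only.

```python
# Hypothetical association between login information and per-user graphs.

accounts = {
    "user_a": {"password": "secret_a", "graph": {"first": "single-word stage"}},
    "user_b": {"password": "secret_b", "graph": {"first": "compound sentence stage"}},
}

def login_and_load_graph(account: str, password: str):
    entry = accounts.get(account)
    if entry is None or entry["password"] != password:
        return None  # wrong credentials: do not reveal any stored graph
    return entry["graph"]

print(login_and_load_graph("user_a", "secret_a"))  # the user's own graph
print(login_and_load_graph("user_a", "wrong"))     # None
```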
Fig. 5 is a schematic flow chart diagram illustrating an intelligent interaction method according to an embodiment of the present application. In this embodiment, the intelligent robot actively triggers the man-machine interaction process, as shown in fig. 5, the intelligent interaction method includes the following steps S501-S505:
S501, determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph of the target user.
The target language knowledge graph includes the knowledge nodes corresponding to the language capability category of the target user at the current stage, and the knowledge information corresponding to those knowledge nodes. Therefore, the knowledge information corresponding to the knowledge nodes included in the target language knowledge graph can be determined as the target knowledge information matched with the language capability category of the target user. The target knowledge information may include speech (e.g., words, sentences), semantics, grammar, speech skills (e.g., one-word sentences, complex sentences), and the like.
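The following sketch illustrates one possible way to read the target knowledge information out of such a graph; the dict layout and field names are assumptions for illustration, not the application's storage format.

```python
# Assumed in-memory representation of a target language knowledge graph.
target_graph = {
    "language_capability_category": "phrase stage",
    "nodes": [
        {
            "node_id": "n1",
            "capability_category": "phrase stage",
            "knowledge": {"speech": ["apple", "ball"],
                          "skills": ["one-word sentence"]},
        },
    ],
}

def target_knowledge_for(graph: dict) -> list:
    """Collect the knowledge information of every node whose capability
    category matches the target user's current-stage category."""
    wanted = graph["language_capability_category"]
    return [node["knowledge"] for node in graph["nodes"]
            if node["capability_category"] == wanted]
```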
S502, triggering the target user to execute the behavior event according to the target knowledge information.
Wherein the behavioral events may include voice events and/or action events.
Alternatively, the electronic device may trigger the target user to perform the behavior event by performing a behavior trigger event for the target user. The behavior trigger event may include a speech output event, i.e., sending speech information matched with the target knowledge information to the target user, and/or an action execution event. Assuming that the current stage of the target user is the phrase stage, the target knowledge information corresponding to the target user includes the speech skills of phrases, so phrase speech information can be sent to the target user; for example, the voice "apple" may be output using a sound familiar to the target user (such as the voice of a family member).
The action execution event is performing, for the target user, an action matched with the target knowledge information. Assuming that the current stage of the target user is the phrase stage, the target knowledge information corresponding to the target user includes the speech skills of phrases, which means that the target user at the current stage can only understand simple expressions that each refer to a single thing; therefore, actions related to a single thing can be performed for the user, such as moving an apple from a certain position to in front of the target user's eyes.
S503, capturing audio and video data of the behavior event executed by the target user through a pre-installed camera device, and acquiring brain signals of the target user.
In this embodiment, the audio and video data include audio data and/or image data. The camera device may be installed in advance at a position around the target user and is used to capture the behavior of the target user during man-machine interaction. Optionally, the camera device is started after the man-machine interaction process is triggered, and then captures the behavior of the target user. The brain signals may include brain wave data, brain language partition brightness, and the like.
S504, analyzing the captured audio and video data and brain signals to obtain behavior information corresponding to the behavior event.
The behavior information corresponding to the behavior event may include at least one of action information, language information, and emotion information.
The camera device is in communication connection with the intelligent robot and can transmit the captured audio and video data to the intelligent robot over this connection. Optionally, if the behavior event of the target user is a voice event, the intelligent robot may obtain the voice information of the target user directly through its built-in voice acquisition device, without going through the camera device.
Optionally, a waiting duration after the intelligent robot executes the behavior trigger event may be preset, and the behavior information corresponding to the behavior event executed by the target user is acquired within this waiting duration. If no behavior information of the target user is acquired within the waiting duration, the current man-machine interaction is considered to have failed. In case of failure, the process may return to S502, and the behavior trigger event is performed for the target user again.
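A possible shape of this wait-and-retry logic is sketched below; the waiting duration, retry count, and both callables are hypothetical stand-ins.

```python
import time

def interact_once(trigger_event, collect_behavior,
                  wait_seconds=10.0, max_retries=3):
    """Execute the behavior trigger event, then poll for behavior
    information within the preset waiting duration; on failure,
    re-trigger (returning to S502), up to max_retries attempts."""
    for _ in range(max_retries):
        trigger_event()                      # S502: speech and/or action
        deadline = time.monotonic() + wait_seconds
        while time.monotonic() < deadline:
            behavior = collect_behavior()    # returns None if nothing yet
            if behavior is not None:
                return behavior
            time.sleep(0.2)
    return None  # interaction considered to have failed
```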
When the behavior information corresponding to the behavior event is obtained by analyzing the audio and video data, the audio data and the image data can first be separated from the audio and video data and then analyzed respectively. The audio data can be recognized using an existing speech recognition algorithm, and the recognition result converted into text content carrying semantic information. The image data can be split into frames using a computer vision library (such as OpenCV), and image recognition is then performed on the resulting frames to identify behavior information of the target user, such as whether the target user reacted, what action the target user performed, and what expression the target user showed.
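For the image branch, a minimal OpenCV framing sketch might look as follows; the sampling step is an illustrative choice, and the subsequent image recognition and speech recognition stages are left out.

```python
import cv2  # OpenCV

def frames_from_video(video_path: str, step: int = 5):
    """Split the image stream into frames for subsequent image
    recognition; keeps every `step`-th frame to bound the workload."""
    capture = cv2.VideoCapture(video_path)
    frames, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:          # end of stream or read error
            break
        if index % step == 0:
            frames.append(frame)
        index += 1
    capture.release()
    return frames
```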
The brain signals, including brain wave data and/or brain language partition brightness, may be analyzed based on a brain wave analysis algorithm to estimate the response sensitivity of the target user. Specifically, brain wave data reflect the electrical changes of the user's brain activity: the larger and faster the amplitude changes in the brain wave data, the higher the response sensitivity of the target user; conversely, the smaller and slower the amplitude changes, the lower the response sensitivity. The brightness of a brain language partition reflects its activity level: the higher the brightness, the higher the response sensitivity of the target user; conversely, the lower the brightness, the lower the response sensitivity.
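A toy heuristic in the spirit of this description is sketched below; the full-scale constant and the equal weighting of the two cues are assumptions for illustration, not values given by the application.

```python
def response_sensitivity(wave_samples, partition_brightness=None):
    """Map brain cues onto a sensitivity score in [0, 1]: a larger
    amplitude swing in the brain wave data and a brighter language
    partition both raise the score."""
    # Amplitude swing of the wave, normalized by an assumed full scale.
    swing = (max(wave_samples) - min(wave_samples)) / 100.0
    score = min(swing, 1.0)
    if partition_brightness is not None:   # brightness assumed in [0, 1]
        score = 0.5 * score + 0.5 * partition_brightness
    return score
```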
In general, the behavior information of the target user may be analyzed comprehensively by combining the audio and video data with the brain signals: the audio and video data are mainly used to analyze whether the target user reacted and, if so, what action was performed and what speech was produced, while the brain signals are mainly used to analyze the response sensitivity of the target user. For example, analysis of the audio and video data may determine that the target user performed a turning motion, while analysis of the brain signals determines the sensitivity with which the turning motion was performed.
S505, obtaining the interaction characteristic information of the target user during the interaction by analyzing the captured audio and video data and brain signals, and storing this interaction characteristic information locally.
The locally stored interaction characteristic information can be used to update the target language knowledge graph later. For example, if an update period is preset, then when the update period arrives, the language development category of the target user at the current stage is analyzed based on the locally stored interaction characteristic information, and the target language knowledge graph is updated according to that category.
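A minimal sketch of this periodic update, with the update-period check and the classifier both as hypothetical callables:

```python
def maybe_update_graph(graph, stored_features, classify, period_due):
    """When the preset update period arrives, re-analyze the target
    user's current-stage language development category from the locally
    stored interaction features, then refresh the graph accordingly."""
    if not period_due():
        return graph
    new_category = classify(stored_features)
    graph["language_capability_category"] = new_category
    return graph
```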
In the embodiment shown in fig. 5, the target user may be any user who needs to use the intelligent robot, such as an infant who needs companionship or an elderly person with limited mobility.
Fig. 6 is a schematic flow chart diagram illustrating an intelligent interaction method according to another embodiment of the present application. In this embodiment, the intelligent robot triggers the man-machine interaction process by monitoring the behavior information of the target user. As shown in fig. 6, the intelligent interaction method includes the following steps S601-S605:
S601, determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph of the target user.
The target language knowledge graph includes the knowledge nodes corresponding to the language capability category of the target user at the current stage, and the knowledge information corresponding to those knowledge nodes. Therefore, the knowledge information corresponding to the knowledge nodes included in the target language knowledge graph can be determined as the target knowledge information matched with the language capability category of the target user. The target knowledge information may include speech (e.g., words, sentences), semantics, grammar, speech skills (e.g., one-word sentences, complex sentences), and the like.
S602, monitoring behavior information of the target user, wherein the behavior information can comprise at least one of action information, language information and emotion information.
In this embodiment, a camera device may be installed in advance at a position around the target user and used to monitor the behavior information of the target user. Alternatively, the language information of the target user can be monitored through the voice acquisition device built into the intelligent robot.
S603, when the behavior information is monitored, determining an interaction strategy corresponding to the target user according to the target knowledge information matched with the language capability category of the target user and according to the behavior information, and interacting with the target user based on the interaction strategy.
S604, during the interaction, capturing audio and video data of the behavior event executed by the target user through the pre-installed camera device, and acquiring brain signals of the target user.
The audio and video data include audio data and/or image data. The camera device may be installed in advance at a position around the target user and is used to capture the behavior of the target user during man-machine interaction. Optionally, the camera device is started after the man-machine interaction process is triggered, and then captures the behavior of the target user. The brain signals may include brain wave data, brain language partition brightness, and the like.
S605, obtaining the interaction characteristic information of the target user during the interaction by analyzing the captured audio and video data and brain signals, and storing this interaction characteristic information locally.
The analysis methods for the audio and video data and the brain signals are similar to those in the above embodiment and are not repeated here. The locally stored interaction characteristic information can be used to update the target language knowledge graph later. For example, if an update period is preset, then when the update period arrives, the language development category of the target user at the current stage is analyzed based on the locally stored interaction characteristic information, and the target language knowledge graph is updated according to that category.
In the embodiment shown in fig. 6, the target user may be any user who needs to use the intelligent robot, such as an infant who needs companionship or an elderly person with limited mobility.
According to the above embodiments, whether the electronic device (such as an intelligent robot) actively triggers the man-machine interaction process or triggers it upon monitoring the behavior information of the target user, a personalized, customized interaction strategy can be determined for the target user based on the target knowledge information matched with the language capability category of the target user and on the behavior information of the target user. When interaction is performed based on this strategy, the interaction process is therefore guaranteed to match the language capability category of the target user, avoiding the interaction barriers (e.g., language or actions in the interaction being hard to understand) that arise when an interaction strategy does not match the user's language capability category, and achieving accurate guidance of the target user during man-machine interaction. In addition, the action information, language information, and/or emotion information of the target user can be considered comprehensively when formulating the interaction strategy, improving the intelligence and personalization of man-machine interaction.
FIG. 7 is a schematic flow chart diagram of a language capability classification model training method according to an embodiment of the application. As shown in FIG. 7, the method includes:
S702, acquiring sample language capability features and sample language capability categories of sample users; the sample language capability features include transaction response features and/or sound response features.
The sample language capability category is used to characterize the current-stage language capability of the sample user. Optionally, according to the development stage of language capability, the following language capability categories may be defined: language preparation period, language completion period, language intelligence period, dialogue period, writing period, echo period, call period, talk center, writing center, visual language center, and the like. The development stages of language capability can also be divided along a finer dimension; for example, the language preparation period includes a one-word sentence stage, a two-word phrase stage, and a simple sentence stage, and the language completion period includes a compound sentence stage and a question stage, and so on.
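As an illustration only, the coarse periods and finer sub-stages named above could be modeled as enumerations like the following; this is an illustrative subset, not an authoritative taxonomy.

```python
from enum import Enum

class LanguageCapabilityCategory(Enum):
    # Coarse development periods named above (illustrative subset).
    LANGUAGE_PREPARATION = "language preparation period"
    LANGUAGE_COMPLETION = "language completion period"
    LANGUAGE_INTELLIGENCE = "language intelligence period"
    DIALOGUE = "dialogue period"
    WRITING = "writing period"

class PreparationSubStage(Enum):
    # Finer stages within the language preparation period.
    ONE_WORD_SENTENCE = "one-word sentence stage"
    TWO_WORD_PHRASE = "two-word phrase stage"
    SIMPLE_SENTENCE = "simple sentence stage"
```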
Optionally, the sample language capability features of a sample user may be collected in advance. For each sample user, when collecting the transaction response features, a transaction-related action matched with the sample user's prior knowledge information can be performed for the sample user, such as moving a food item (an apple) from some position to in front of the sample user's eyes, and the sample user's response information to the action is then collected. When collecting the sound response features, speech information matched with the sample user's prior knowledge information can be sent to the sample user; for example, if the sample user should be learning one-word sentences at the current stage, the voice "apple" can be played to the sample user, and the sample user's response information to the speech is then collected.
The transaction response feature may be characterized by the response sensitivity to a transaction, and the sound response feature by the response sensitivity to sound. When collecting the sample user's response information to an action or speech, the brain signals of the sample user can be collected using existing brain signal acquisition techniques, the brain signals including at least one of brain wave data and brain language partition brightness. By analyzing the brain signals, the response information of the sample user, such as the response sensitivity, can be determined. Brain wave data reflect the electrical changes of the sample user's brain activity: the larger and faster the amplitude changes in the brain wave data, the higher the response sensitivity of the sample user; conversely, the smaller and slower the amplitude changes, the lower the response sensitivity. The brightness of a brain language partition reflects its activity level: the higher the brightness, the higher the response sensitivity of the sample user; conversely, the lower the brightness, the lower the response sensitivity.
Alternatively, when the sample language capability category of the sample user is known, the sample language capability features may also be estimated from that category. For example, if the sample user is an adult whose language is mature, the response sensitivity of the sample user can be considered high; if response sensitivity is expressed as a value between 0 and 1, the sample user's response sensitivity can be set to 0.9 (or another high value).
S704, inputting the sample language capability features into a language capability classification model to be trained, and classifying the language capability of the sample user to obtain a classification result.
S706, adjusting model parameters of the language capability classification model to be trained according to the classification result and the sample language capability category.
The sample language capability category serves as the label data, i.e., the standard output data, of the language capability classification model to be trained. By comparing the classification result with the sample language capability category, the difference between them can be determined and used to decide whether to continue adjusting the model parameters.
With the technical solution of this embodiment of the application, the sample language capability features and sample language capability category of a sample user are acquired, the sample language capability features are input into the language capability classification model to be trained, the language capability of the sample user is classified to obtain a classification result, and the model parameters of the model are then adjusted according to the classification result and the sample language capability category, yielding the trained language capability classification model. Because the sample language capability features include transaction response features and/or sound response features, the language capability classification model trained on them not only can analyze a user's transaction response features and/or sound response features, but also can classify the user's language capability according to the user's language capability features. Furthermore, during intelligent man-machine interaction, the language capability classification model can accurately analyze the language capability category of the target user, providing strong data support for the subsequent intelligent man-machine interaction process.
In one embodiment, the sample language capability features include sample language capability feature values, and when there are multiple sample language capability features, each corresponds to a respective weight. Optionally, the sample language capability features include a transaction response feature, characterized by the response sensitivity to a transaction, and a sound response feature, characterized by the response sensitivity to sound. The response sensitivity to a transaction can therefore be used as the transaction response feature value, and the response sensitivity to sound as the sound response feature value.
Optionally, to facilitate model calculation, the sample language capability feature values may be normalized, so that the normalized values are distributed between 0 and 1. The normalized sample language capability feature values are then input into the language capability classification model.
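A standard min-max normalization is one way to map raw feature values onto [0, 1]; the constant-feature fallback below is an illustrative choice.

```python
def min_max_normalize(values):
    """Min-max normalize raw sample feature values onto [0, 1], as
    assumed by the score calculation layer described below."""
    lo, hi = min(values), max(values)
    if hi == lo:                 # constant feature: map to the midpoint
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```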
As shown in fig. 8, the language capability classification model to be trained includes a parameter adjustment layer, a score calculation layer, a classification layer, and a fully connected layer. When the sample language capability features are input into the model to classify the language capability of the sample user, specifically the sample language capability feature values are input into the model, and the model then executes the following steps C1-C3:
Step C1, determining, through the parameter adjustment layer, the weight corresponding to each language capability feature in the current iteration.
Step C2, calculating, through the score calculation layer, the sample language capability score of the sample user according to the sample language capability feature value corresponding to each language capability feature and the weight corresponding to each feature.
In this step, the sample language capability feature values can be weighted and summed according to the weights corresponding to the respective features to obtain the sample language capability score of the sample user. For example, if the weight of the transaction response feature is 0.6 and the weight of the sound response feature is 0.4, and the sample user's transaction response feature value is 0.9 and sound response feature value is 0.1, then the sample language capability score of the sample user is 0.9 × 0.6 + 0.1 × 0.4 = 0.58.
Step C3, classifying, through the classification layer, the language capability of the sample user according to the sample language capability score and the preset mapping relationship between language capability scores and language capability categories.
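Steps C2 and C3 can be sketched as a weighted sum followed by a lookup in the preset score-to-category mapping; the mapping bounds below are illustrative assumptions, while the feature values and weights reproduce the worked example above.

```python
def score_and_classify(feature_values, weights, score_to_category):
    """Weighted sum of normalized feature values (step C2), then a
    lookup in a preset mapping given as (low, high, category) triples
    (step C3)."""
    score = sum(f * w for f, w in zip(feature_values, weights))
    for low, high, category in score_to_category:
        if low <= score < high:
            return score, category
    return score, None

# Illustrative mapping bounds, not values given by the application.
mapping = [(0.0, 0.3, "language preparation period"),
           (0.3, 0.7, "language completion period"),
           (0.7, 1.01, "language intelligence period")]

# Worked example from the text: 0.9 * 0.6 + 0.1 * 0.4 = 0.58
print(score_and_classify([0.9, 0.1], [0.6, 0.4], mapping))
```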
After the language capability classification result of the sample user is obtained, the classification result is compared with the sample language capability category through the fully connected layer of the model, to determine whether the model parameters need further adjustment, i.e., whether a preset iteration termination condition is met. The iteration termination condition may include at least one of: the accuracy of the classification results is greater than or equal to a preset accuracy threshold; the probability value of the sample user belonging to the sample language capability category is greater than or equal to a preset probability threshold; and the number of iterations reaches a preset threshold.
Optionally, the classification result may include a probability value of the sample user belonging to each language capability category, and the probability value of the sample user belonging to the sample language capability category after the current iteration can be determined by comparing the classification result with the sample language capability category. If this probability value is greater than or equal to the preset probability threshold, iteration terminates and the trained language capability classification model is obtained; if it is smaller, the fully connected layer passes the classification result back to the parameter adjustment layer, which updates the model parameters based on this iteration's result, and the next iteration proceeds with the updated parameters.
Optionally, the classification result may include a first language capability category of the sample user, and comparing the first language capability category with the sample language capability category determines whether the classification of the sample user after the current iteration is correct. When there are multiple sample users, each has its own classification result, and the accuracy over all sample users' classification results can be computed. If the accuracy is greater than or equal to the preset accuracy threshold, iteration terminates and the trained language capability classification model is obtained; if it is smaller, the fully connected layer passes the classification results back to the parameter adjustment layer, which updates the model parameters based on this iteration's results, and the next iteration proceeds with the updated parameters.
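The overall iteration described in this and the preceding paragraphs might be skeletonized as follows; the model object and its two methods are hypothetical stand-ins for the parameter adjustment, score calculation, classification, and fully connected layers.

```python
def train(model, samples, labels, prob_threshold=0.9, max_iters=100):
    """Iterate until the probability of each sample's true category
    reaches the preset threshold or the iteration count hits its
    limit; otherwise feed results back to the parameter adjustment
    layer for the next round."""
    for _ in range(max_iters):
        done = True
        for x, y in zip(samples, labels):
            probs = model.classify(x)           # category -> probability
            if probs.get(y, 0.0) < prob_threshold:
                model.adjust_parameters(x, y, probs)  # back to step C1
                done = False
        if done:                                # termination condition met
            break
    return model
```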
In summary, particular embodiments of the present subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may be advantageous.
Based on the same idea as the intelligent interaction method and the language capability classification model training method provided by the above embodiments of the application, embodiments of the application further provide an intelligent interaction device and a language capability classification model training device.
FIG. 9 is a schematic block diagram of an intelligent interaction device according to an embodiment of the present application, as shown in FIG. 9, the device includes:
a first obtaining module 91, configured to obtain a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on the language capability class of the target user, and the language capability class of the target user is determined based on the language capability features of the target user;
A first determining module 92, configured to determine, according to the target language knowledge graph, target knowledge information that matches the language capability class of the target user;
a second obtaining module 93, configured to obtain behavior information of the target user; the behavior information includes at least one of: action information, language information, emotion information;
a second determining module 94, configured to determine an interaction policy corresponding to the target user according to the target knowledge information and the behavior information, and interact with the target user based on the interaction policy.
In one embodiment, the first obtaining module 91 includes:
the first acquisition unit is used for acquiring an initial language knowledge graph which is created in advance; the initial language knowledge graph includes: a plurality of knowledge nodes, and user information and knowledge information corresponding to each knowledge node;
a first determining unit, configured to determine knowledge information matched with user information of the target user according to the initial language knowledge graph and the user information of the target user;
the acquisition unit is used for acquiring language capability characteristics of the target user based on knowledge information matched with the user information of the target user; the linguistic capability features include transaction response features and/or sound response features;
A second determining unit, configured to determine a language capability class of the target user according to the language capability feature;
and the generating unit is used for generating the target language knowledge graph according to the initial language knowledge graph and the language capability category of the target user.
In one embodiment, each knowledge node corresponds to a language capability class;
the generating unit is used for:
respectively matching the language capability category of the target user with each knowledge node, and determining a first knowledge node matched with the language capability category of the target user according to a matching result;
generating the target language knowledge graph according to the first knowledge node and the second knowledge node; the target language knowledge graph comprises: the first knowledge node, the second knowledge node, knowledge information corresponding to the first knowledge node and knowledge information corresponding to the second knowledge node; the second knowledge node is at least one knowledge node adjacent to the first knowledge node.
In one embodiment, the second determining unit is configured to:
inputting the language capability features into a pre-trained language capability classification model to obtain the language capability category of the target user; the language capability classification model is trained based on sample language capability features and sample language capability categories of sample users.
In one embodiment, the linguistic capability feature comprises a linguistic capability feature value; the language ability classification model comprises a score calculation layer and a classification layer;
the second determining unit is configured to:
calculating the language ability score of the target user according to the language ability characteristic value corresponding to each language ability characteristic and the weight corresponding to each language ability characteristic through the score calculating layer;
and determining the language capability category of the target user according to the language capability score of the target user and the preset mapping relation between the language capability score and the language capability category through the classification layer.
In one embodiment, the second obtaining module 93 includes:
the execution unit is used for triggering the target user to execute a behavior event according to the target knowledge information;
and the second acquisition unit is used for acquiring the behavior information corresponding to the behavior event.
In one embodiment, the second acquisition unit is further configured to:
acquiring multimedia data of the target user executing the behavior event;
processing the multimedia data to obtain audio data and/or image data of the behavior event executed by the target user;
And analyzing the audio data and/or the image data to obtain the behavior information corresponding to the behavior event.
In one embodiment, the apparatus further comprises:
the second acquisition module is used for acquiring the interaction characteristic information of the target user in the interaction process; the interactive feature information comprises a transaction response feature and/or a sound response feature;
the judging module is used for judging whether the interactive characteristic information is matched with the language capability characteristics of the target user or not;
the third determining module is used for determining the language capability features of the target user again according to the interaction feature information if not, so as to obtain updated language capability features;
a fourth determining module, configured to redetermine, according to the updated language capability feature, a language capability class of the target user, to obtain a language capability class to be updated;
and the updating module is used for updating the target language knowledge graph according to the language capability category to be updated.
With the device of this embodiment of the application, the target language knowledge graph corresponding to the target user is acquired, the target knowledge information matched with the language capability category of the target user is determined according to the graph, and the behavior information of the target user is acquired, so that the interaction strategy corresponding to the target user is determined according to the target knowledge information and the behavior information. Because the interaction strategy is determined from the target knowledge information matched with the language capability category of the target user and from the target user's behavior information, and the target knowledge information is determined from a target language knowledge graph created based on that category, interaction based on this strategy is guaranteed to match the language capability category of the target user. This avoids the interaction barriers (e.g., language or actions in the interaction being hard to understand) that arise when an interaction strategy does not match the user's language capability category, and achieves accurate guidance of the target user during man-machine interaction. The action information, language information, and/or emotion information of the target user can also be considered comprehensively when formulating the interaction strategy, improving the intelligence and personalization of man-machine interaction. In addition, because the target language knowledge graph corresponds to the target user, i.e., the device maintains a respective language knowledge graph for each user, the device can provide customized language knowledge graphs for different users on top of intelligent interaction, achieving a personalized, customized man-machine interaction effect.
It should be understood by those skilled in the art that the intelligent interaction device in fig. 9 can be used to implement the intelligent interaction method described above; its detailed description is similar to that of the method above and, to avoid repetition, is not repeated here.
FIG. 10 is a schematic block diagram of a linguistic capability classification model training device, as shown in FIG. 10, according to one embodiment of the present application, comprising:
a third obtaining module 101, configured to obtain a sample language capability feature and a sample language capability class of a sample user; the sample language capability features include transaction response features and/or sound response features;
the classification module 102 is configured to input the sample language capability features into a language capability classification model to be trained, and classify the language capability of the sample user to obtain a classification result;
and the parameter adjustment module 103 is configured to adjust model parameters of the language capability classification model to be trained according to the classification result and the sample language capability class.
In one embodiment, the sample language capability feature comprises a sample language capability feature value; the language ability classification model to be trained comprises a parameter adjustment layer, a score calculation layer and a classification layer;
The parameter adjusting layer is used for determining the weight corresponding to each language capability feature in the current iteration process;
the score calculating layer is used for calculating the sample language ability score of the sample user according to the sample language ability characteristic value corresponding to each language ability characteristic and the weight corresponding to each language ability characteristic;
the classification layer is used for classifying the language capability of the sample user according to the sample language capability score and the preset mapping relation between the language capability score and the language capability category.
With the device of this embodiment of the application, the sample language capability features and sample language capability category of a sample user are acquired, the sample language capability features are input into the language capability classification model to be trained, the language capability of the sample user is classified to obtain a classification result, and the model parameters of the model are then adjusted according to the classification result and the sample language capability category, yielding the trained language capability classification model. Because the sample language capability features include transaction response features and/or sound response features, the language capability classification model trained on them not only can analyze a user's transaction response features and/or sound response features, but also can classify the user's language capability according to the user's language capability features. Furthermore, during intelligent man-machine interaction, the language capability classification model can accurately analyze the language capability category of the target user, providing strong data support for the subsequent intelligent man-machine interaction process.
It should be understood by those skilled in the art that the language capability classification model training device in fig. 10 can be used to implement the language capability classification model training method described above; its detailed description is similar to that of the method section above and, to avoid repetition, is not repeated here.
Based on the same idea, an embodiment of the application further provides an electronic device, as shown in fig. 11. Electronic devices may vary considerably in configuration or performance and may include one or more processors 1101 and a memory 1102, where the memory 1102 may store one or more applications or data. The memory 1102 may be transient or persistent storage. The application programs stored in the memory 1102 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the electronic device. Further, the processor 1101 may be arranged to communicate with the memory 1102 and execute, on the electronic device, the series of computer-executable instructions in the memory 1102. The electronic device may also include one or more power supplies 1103, one or more wired or wireless network interfaces 1104, one or more input/output interfaces 1105, and one or more keyboards 1106.
In particular, in this embodiment, an electronic device includes a memory, and one or more programs, where the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer-executable instructions for the electronic device, and the one or more programs configured to be executed by one or more processors include instructions for:
acquiring a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on the language capability class of the target user, and the language capability class of the target user is determined based on the language capability characteristics of the target user;
determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph;
acquiring behavior information of the target user; the behavior information includes at least one of: action information, language information, emotion information;
and determining an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information, and interacting with the target user based on the interaction strategy.
In particular, in another embodiment, an electronic device includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer-executable instructions for the electronic device, and the one or more programs configured to be executed by one or more processors comprise instructions for:
acquiring sample language capability characteristics and sample language capability categories of a sample user; the sample language capability features include transaction response features and/or sound response features;
inputting the sample language capability features into a language capability classification model to be trained, and classifying the language capability of the sample user to obtain a classification result;
and according to the classification result and the sample language ability category, adjusting model parameters of the language ability classification model to be trained.
The embodiments of the present application also provide a storage medium storing one or more computer programs, where the one or more computer programs include instructions, which when executed by an electronic device including a plurality of application programs, enable the electronic device to perform the various processes of the above-described intelligent interaction method embodiments, and are specifically configured to perform:
Acquiring a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on the language capability class of the target user, and the language capability class of the target user is determined based on the language capability characteristics of the target user;
determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph;

acquiring behavior information of the target user; the behavior information includes at least one of: action information, language information, emotion information;
and determining an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information, and interacting with the target user based on the interaction strategy.
The embodiments also provide a storage medium storing one or more computer programs, where the one or more computer programs include instructions, which when executed by an electronic device that includes a plurality of application programs, enable the electronic device to perform the respective processes of the language capability classification model training method embodiments described above, and specifically are configured to perform:
acquiring sample language capability characteristics and sample language capability categories of a sample user; the sample language capability features include transaction response features and/or sound response features;
Inputting the sample language capability features into a language capability classification model to be trained, and classifying the language capability of the sample user to obtain a classification result;
and according to the classification result and the sample language ability category, adjusting model parameters of the language ability classification model to be trained.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units. Of course, when the present application is implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in computer-readable media, random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments can be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant parts, refer to the description of the method embodiments.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (13)

1. An intelligent interaction method is characterized by comprising the following steps:
acquiring a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on the language capability class of the target user, and the language capability class of the target user is determined based on the language capability features of the target user;
determining target knowledge information matched with the language capability category of the target user according to the target language knowledge graph;
Acquiring behavior information of the target user; the behavior information includes at least one of: action information, language information, emotion information;
and determining an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information, and interacting with the target user based on the interaction strategy.
2. The method according to claim 1, wherein the obtaining a target language knowledge graph corresponding to a target user includes:
acquiring a pre-established initial language knowledge graph; the initial language knowledge graph includes: a plurality of knowledge nodes, and user information and knowledge information corresponding to each knowledge node;
determining knowledge information matched with the user information of the target user according to the initial language knowledge graph and the user information of the target user;
collecting language capability features of the target user based on knowledge information matched with user information of the target user; the linguistic capability features include transaction response features and/or sound response features;
determining the language capability category of the target user according to the language capability features;
And generating the target language knowledge graph according to the initial language knowledge graph and the language capability class of the target user.
3. The method of claim 2, wherein each of the knowledge nodes corresponds to a respective language capability class;
the generating the target language knowledge graph according to the initial language knowledge graph and the language capability class of the target user comprises the following steps:
respectively matching the language capability category of the target user with each knowledge node, and determining a first knowledge node matched with the language capability category of the target user according to a matching result;
generating the target language knowledge graph according to the first knowledge node and the second knowledge node; the target language knowledge graph comprises: the first knowledge node, the second knowledge node, knowledge information corresponding to the first knowledge node and knowledge information corresponding to the second knowledge node; the second knowledge node is at least one knowledge node adjacent to the first knowledge node.
4. The method of claim 2, wherein said determining the language capability class of the target user based on the language capability features comprises:
Inputting the language capability features into a pre-trained language capability classification model to obtain the language capability category of the target user; the language capability classification model is trained based on sample language capability features and sample language capability categories of sample users.
5. The method of claim 4, wherein the language capability feature comprises a language capability feature value; the language ability classification model comprises a score calculation layer and a classification layer;
inputting the language capability features into a pre-trained language capability classification model to obtain the language capability category of the target user, wherein the method comprises the following steps:
calculating the language ability score of the target user according to the language ability characteristic value corresponding to each language ability characteristic and the weight corresponding to each language ability characteristic through the score calculating layer;
and determining the language capability category of the target user according to the language capability score of the target user and the preset mapping relation between the language capability score and the language capability category through the classification layer.
6. The method of claim 1, wherein the obtaining behavior information of the target user comprises:
Triggering the target user to execute a behavior event according to the target knowledge information;
and acquiring the behavior information corresponding to the behavior event.
7. The method of claim 6, wherein the obtaining the behavior information corresponding to the behavior event comprises:
acquiring multimedia data of the target user executing the behavior event;
processing the multimedia data to obtain audio data and/or image data of the behavior event executed by the target user;
and analyzing the audio data and/or the image data to obtain the behavior information corresponding to the behavior event.
8. The method according to claim 1, wherein the method further comprises:
acquiring interaction characteristic information of the target user in the interaction process; the interactive feature information comprises a transaction response feature and/or a sound response feature;
judging whether the interactive feature information is matched with the language capability feature of the target user or not;
if not, re-determining the language capability features of the target user according to the interaction feature information to obtain updated language capability features;
according to the updated language capability characteristics, determining the language capability category of the target user again to obtain the language capability category to be updated;
And updating the target language knowledge graph according to the language capability category to be updated.
9. A language capability classification model training method, comprising:
acquiring sample language capability features and sample language capability categories of a sample user; the sample language capability features comprise a transaction response feature and/or a sound response feature;
inputting the sample language capability features into a language capability classification model to be trained, and classifying the language capability of the sample user to obtain a classification result;
and adjusting model parameters of the language capability classification model to be trained according to the classification result and the sample language capability category.
10. The method of claim 9, wherein the sample language capability feature comprises a sample language capability feature value, and the language capability classification model to be trained comprises a parameter adjustment layer, a score calculation layer and a classification layer;
the inputting the sample language capability features into the language capability classification model to be trained and classifying the language capability of the sample user comprises:
determining, by the parameter adjustment layer, the weight corresponding to each language capability feature in a current iteration;
calculating, by the score calculation layer, a sample language capability score of the sample user according to the sample language capability feature value corresponding to each language capability feature and the weight corresponding to each language capability feature;
and classifying, by the classification layer, the language capability of the sample user according to the sample language capability score and a preset mapping relationship between language capability scores and language capability categories.
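Claims 9 and 10 together describe iterative weight adjustment. In the sketch below, a perceptron-style update stands in for the unspecified parameter adjustment rule, and each sample category is mapped to an invented target score; none of these constants come from the patent.

```python
# Sketch of the training loop in claims 9-10; the update rule and the
# category-to-score targets are illustrative assumptions.
CATEGORY_TO_TARGET_SCORE = {"beginner": 0.25, "intermediate": 0.65, "advanced": 0.9}

def train(samples, weights, learning_rate=0.05, epochs=50):
    """samples: list of (feature_value_dict, sample_category) pairs."""
    for _ in range(epochs):
        for features, category in samples:
            score = sum(weights[n] * v for n, v in features.items())  # score layer
            error = CATEGORY_TO_TARGET_SCORE[category] - score        # vs. sample label
            for name, value in features.items():                      # parameter adjustment
                weights[name] += learning_rate * error * value
    return weights

samples = [({"transaction_response": 0.9, "sound_response": 0.8}, "advanced"),
           ({"transaction_response": 0.3, "sound_response": 0.2}, "beginner")]
print(train(samples, {"transaction_response": 0.5, "sound_response": 0.5}))
```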
11. An intelligent interaction device, comprising:
a first acquisition module configured to acquire a target language knowledge graph corresponding to a target user; the target language knowledge graph is created based on the language capability category of the target user, and the language capability category of the target user is determined based on language capability features of the target user;
a first determining module configured to determine, according to the target language knowledge graph, target knowledge information matched with the language capability category of the target user;
a second acquisition module configured to acquire behavior information of the target user; the behavior information comprises at least one of: action information, language information, and emotion information;
and a second determining module configured to determine an interaction strategy corresponding to the target user according to the target knowledge information and the behavior information, and to interact with the target user based on the interaction strategy.
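The apparatus of claim 11 mirrors the method claims as four modules. The wiring below is one possible composition, reusing build_target_graph from the first sketch; module internals and the strategy rule are illustrative assumptions, not the patent's implementation.

```python
# One possible composition of claim 11's four modules (toy internals).
class IntelligentInteractionDevice:
    def __init__(self, user_profile: dict):
        self.user = user_profile

    def first_acquisition(self) -> dict:
        """Acquire the target language knowledge graph for the user."""
        return build_target_graph(self.user["category"])

    def first_determining(self, graph: dict) -> list:
        """Determine target knowledge information from the graph."""
        return list(graph.values())

    def second_acquisition(self) -> dict:
        """Acquire behavior information (stubbed for the sketch)."""
        return {"action": "points", "language": "short phrase", "emotion": "calm"}

    def second_determining(self, knowledge: list, behavior: dict) -> str:
        """Pick an interaction strategy from knowledge plus behavior (toy rule)."""
        if behavior["language"] == "short phrase" and knowledge:
            return "simplify prompts using: " + knowledge[0]
        return "extend prompts"

device = IntelligentInteractionDevice({"category": "beginner"})
graph = device.first_acquisition()
print(device.second_determining(device.first_determining(graph),
                                device.second_acquisition()))
```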
12. An electronic device, comprising a processor and a memory electrically connected to the processor, the memory storing a computer program, and the processor being configured to call and execute the computer program from the memory to implement the intelligent interaction method of any one of claims 1-8 or the language capability classification model training method of any one of claims 9-10.
13. A storage medium storing a computer program, the computer program being executable by a processor to implement the intelligent interaction method of any one of claims 1-8 or the language capability classification model training method of any one of claims 9-10.
CN202211319755.9A 2022-10-26 2022-10-26 Intelligent interaction method, language ability classification model training method and device Pending CN116127006A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211319755.9A CN116127006A (en) 2022-10-26 2022-10-26 Intelligent interaction method, language ability classification model training method and device

Publications (1)

Publication Number Publication Date
CN116127006A (en)

Family

ID=86298007

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211319755.9A Pending CN116127006A (en) 2022-10-26 2022-10-26 Intelligent interaction method, language ability classification model training method and device

Country Status (1)

Country Link
CN (1) CN116127006A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117275319A (en) * 2023-11-20 2023-12-22 首都医科大学附属北京儿童医院 Device for training language emphasis ability
CN117275319B (en) * 2023-11-20 2024-01-26 首都医科大学附属北京儿童医院 Device for training language emphasis ability

Similar Documents

Publication Publication Date Title
EP3602543B1 (en) Automated assistants that accommodate multiple age groups and/or vocabulary levels
US11769492B2 (en) Voice conversation analysis method and apparatus using artificial intelligence
US10402501B2 (en) Multi-lingual virtual personal assistant
EP4006902A1 (en) Inter-channel feature extraction method, audio separation method and apparatus, and computing device
Lakomkin et al. On the robustness of speech emotion recognition for human-robot interaction with deep neural networks
Rossi et al. An extensible architecture for robust multimodal human-robot communication
WO2018152014A1 (en) Intelligent assistant with intent-based information resolution
US11423884B2 (en) Device with convolutional neural network for acquiring multiple intent words, and method thereof
CN112292724A (en) Dynamic and/or context-specific hotwords for invoking automated assistants
CN110598576A (en) Sign language interaction method and device and computer medium
KR102529262B1 (en) Electronic device and controlling method thereof
US11216497B2 (en) Method for processing language information and electronic device therefor
CN112528004A (en) Voice interaction method, voice interaction device, electronic equipment, medium and computer program product
CN116127006A (en) Intelligent interaction method, language ability classification model training method and device
KR20200115695A (en) Electronic device and method for controlling the electronic device thereof
Siegert et al. Admitting the addressee detection faultiness of voice assistants to improve the activation performance using a continuous learning framework
WO2020167860A1 (en) Techniques for generating digital personas
Mruthyunjaya et al. Human-Augmented robotic intelligence (HARI) for human-robot interaction
D’Ulizia Exploring multimodal input fusion strategies
WO2019143170A1 (en) Method for generating conversation template for conversation-understanding ai service system having predetermined goal, and computer readable recording medium
CN114519347A (en) Method and device for generating conversation content for language and vocabulary learning training
Cooke et al. Exploiting a 'gaze-Lombard effect' to improve ASR performance in acoustically noisy settings
KR102553509B1 (en) Method for monitoring emotion and behavior during conversations for user in need of protection
Kashioka et al. Multimodal dialog system for kyoto sightseeing guide
EP3686753A1 (en) Electronic device and method for controlling the electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination