CN108536802B - Interaction method and device based on child emotion

Info

Publication number
CN108536802B
CN108536802B
Authority
CN
China
Prior art keywords
emotion
child
determining
voice
child user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810290987.3A
Other languages
Chinese (zh)
Other versions
CN108536802A (en)
Inventor
黄鸣夏
钱隽夫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Shanghai Xiaodu Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810290987.3A priority Critical patent/CN108536802B/en
Publication of CN108536802A publication Critical patent/CN108536802A/en
Application granted granted Critical
Publication of CN108536802B publication Critical patent/CN108536802B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides an interaction method and device based on child emotion, an electronic device, and a readable storage medium. The method receives voice interaction information of a child user; determines the interactive content in the voice interaction information; determines the voice characteristics of the child user in the voice interaction information; then determines the emotional characteristics of the child user according to the interactive content; determines a degree value of the emotional characteristics according to the voice characteristics; and determines a response strategy according to the emotional characteristics and their degree value. The method enables an accurate grasp of the child's emotion, makes communication with the child more harmonious and fluent, and helps guide the child's emotion in a positive direction.

Description

Interaction method and device based on child emotion
Technical Field
The present invention relates to emotion recognition technologies, and in particular, to an interaction method and device based on children's emotion, and an electronic device and a readable storage medium using the method.
Background
With the rise of artificial intelligence, man-machine interaction equipment is more and more widely applied to daily life of people.
Taking a children's story machine as an example, the human-machine interaction mode commonly adopted in the prior art is to return a corresponding result according to the literal intention of the child's utterance, that is, a simple question-and-answer interaction. For example, the story machine asks the child, "What story would you like to hear?"; the child answers, "I want to listen to Little Red Riding Hood"; and the story machine finds and plays the story resource corresponding to the keyword "Little Red Riding Hood" in its database.
On the one hand, because of children's limited ability to express themselves, a child may not be able to state accurately the name of the story he or she wants to hear, so the story machine randomly selects a story to play, which is not necessarily the story the child wants. On the other hand, children generally like to share their moods with a story machine that can talk, treating it as a friend, but story machines always answer with relatively fixed utterances; they cannot give an appropriate answer that matches the child's mood, and their questions or answers may even affect the child's mood adversely.
Therefore, the interaction mode of existing children's story machines is mechanical and rigid, so that children are not well understood, which is not conducive to cultivating the communication skills of children at the language-formation stage.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an interaction method and device based on child emotion, an electronic device, and a readable storage medium, which can accurately determine the emotion of a child user by analyzing the content and the voice characteristics of the child user's voice interaction information, and provide a corresponding response strategy according to that emotion. The method enables an accurate grasp of the child's emotion, makes communication with the child more harmonious and fluent, and helps guide the child's emotion in a positive direction.
In a first aspect, an embodiment of the present invention provides an interaction method based on a child emotion, including:
receiving voice interaction information of a child user;
determining interactive content in the voice interactive information;
determining the voice characteristics of the child user in the voice interaction information;
determining emotional characteristics of the child user according to the interactive content;
determining a degree value of the emotion characteristic according to the voice characteristic;
and determining a response strategy according to the emotional characteristics and the degree value of the emotional characteristics.
Optionally, the determining interactive content in the voice interaction information includes:
converting the voice interaction information into text information;
performing semantic analysis on the text information, and extracting subject words and/or emotion words from the text information to obtain interactive content;
correspondingly, the determining the emotional characteristics of the child user according to the interactive content includes:
and determining the emotional characteristics of the child user according to the theme words and/or the emotional words based on a preset identification rule.
Optionally, the method further comprises:
establishing a child dictionary based on the child language;
wherein the child language includes at least: children's reduplicated-word (word-stacking) expressions and anthropomorphic expressions for animals; the child dictionary includes: paraphrases of the child language, paraphrases of the child language combined with context, and emotion identifiers of the child language;
the emotion marks of the children language are used for identifying a positive emotion category, a negative emotion category and a neutral emotion category.
Optionally, the determining, based on a preset identification rule, an emotional feature of the child user according to the theme words and/or the emotion words includes:
determining an emotion identifier corresponding to each subject word and/or each emotion word in the child dictionary;
if the determined emotion marks do not contain the negative emotion types and contain the positive emotion types, determining the emotional characteristics of the child user as positive emotional characteristics;
if the determined emotion marks do not contain the positive emotion types and contain the negative emotion types, determining the emotional characteristics of the child user as negative emotional characteristics;
if the determined emotion marks only contain neutral emotion types, determining the emotion characteristics of the child user as neutral emotion characteristics;
and if the determined emotion identifications comprise both positive emotion types and negative emotion types, determining the emotional characteristics of the child user based on the context semanteme according to the word order of the theme words and/or the emotion words in the voice interaction information.
Optionally, the method further comprises:
acquiring emotion marking data of the children, and training to obtain a child emotion recognition model;
correspondingly, the determining the emotional characteristics of the child user according to the theme words and/or the emotion words based on the preset identification rules includes:
and inputting the theme words and/or the emotion words into the child emotion recognition model, and recognizing to obtain the emotional characteristics of the child user.
Optionally, the determining the voice characteristics of the child user in the voice interaction information includes:
and determining at least one of the following voice characteristics in the voice interaction information: voice intensity, speech speed, and intonation.
Optionally, determining a degree value of the emotional feature according to the voice feature includes:
determining the average value of the voice features corresponding to the voice interaction information by taking the whole voice interaction information as a statistical object;
and determining the degree value of the emotional characteristic according to the average value of the voice characteristic.
Optionally, the determining the degree value of the emotional feature according to the voice feature includes:
determining the voice features of each subject word and/or emotion word in the voice interaction information;
according to the weighted values of different parts of speech, carrying out weighted calculation on the voice features of the voice interaction information to obtain a weighted average value of the voice features corresponding to the voice interaction information;
and determining the degree value of the emotional characteristic according to the weighted average value of the voice characteristic.
Optionally, the response policy includes: a conversation negotiation answering mode and/or an audio resource playing mode; the determining a response strategy according to the emotional characteristics and the degree value of the emotional characteristics comprises the following steps:
if the emotional characteristic is a negative emotional characteristic and the degree value of the emotional characteristic exceeds a preset threshold value, determining that the response strategy is the conversation negotiation response mode; or determining that the response strategy is to respond in the dialogue negotiation response mode first and then respond in the audio resource playing mode.
Optionally, the method further comprises:
determining a user portrait of the child user; the user portrait comprises at least one of the following characteristics: attribute information of the child user, historical interaction records of the child user, habitual expressions of the child user, the daily routine of the child user, audio resources preferred by the child user, and the association between geographic location and the child user;
and optimizing the response strategy according to the determined user portrait of the child user.
Optionally, the method further comprises:
acquiring time information and/or place information of the received voice interaction information;
according to the time information and/or the place information, determining the current scene of the child user based on the user portrait;
and optimizing the response strategy according to the current scene.
Optionally, the method further comprises:
and generating an emotion analysis report of the child user according to a preset period.
In a second aspect, an embodiment of the present invention provides an interaction apparatus based on a child emotion, including:
the receiving module is used for receiving voice interaction information of a child user;
the determining module is used for determining interactive contents in the voice interactive information; determining the voice characteristics of the child user in the voice interaction information; determining emotional characteristics of the child user according to the interactive content; determining a degree value of the emotion characteristic according to the voice characteristic; and determining a response strategy according to the emotional characteristics and the degree value of the emotional characteristics.
Optionally, the determining module includes:
the conversion submodule is used for converting the voice interaction information into text information;
the analysis submodule is used for performing semantic analysis on the text information;
the extraction submodule is used for extracting the theme words and/or the emotion words from the text information to obtain interactive contents;
and the emotion characteristic determination submodule is used for determining the emotion characteristics of the child user according to the theme words and/or the emotion words based on a preset identification rule.
Optionally, the method further comprises:
the dictionary module is used for establishing a child dictionary based on the child language;
wherein the child language includes at least: children's reduplicated-word (word-stacking) expressions and anthropomorphic expressions for animals; the child dictionary includes: paraphrases of the child language, paraphrases of the child language combined with context, and emotion identifiers of the child language;
the emotion marks of the children language are used for identifying a positive emotion category, a negative emotion category and a neutral emotion category.
Optionally, the emotion feature determination submodule is specifically configured to determine an emotion identifier corresponding to each topic word and/or each emotion word in the child dictionary;
when the determined emotion marks do not contain negative emotion categories and contain positive emotion categories, determining the emotional characteristics of the child user as positive emotional characteristics;
when the determined emotion marks do not contain the positive emotion types and contain the negative emotion types, determining the emotional characteristics of the child user as negative emotional characteristics;
when the determined emotion marks only contain neutral emotion categories, determining the emotional characteristics of the child user as neutral emotional characteristics;
and when each determined emotion mark comprises a positive emotion category and a negative emotion category, determining the emotional characteristics of the child user based on the context and the semanteme according to the word sequence of the theme words and/or the emotional words in the voice interaction information.
Optionally, the method further comprises:
the recognition model module is used for acquiring emotion marking data of the children and training to obtain an emotion recognition model of the children;
correspondingly, the emotion feature determination submodule is specifically configured to input the theme words and/or the emotion words into the child emotion recognition model of the recognition model module, and recognize the theme words and/or the emotion words to obtain the emotion features of the child user.
Optionally, the determining module includes:
and the voice characteristic determining submodule is used for determining at least one of the following voice characteristics in the voice interaction information: voice intensity, speech speed, and intonation.
Optionally, the determining module includes:
the first degree value determining submodule is used for determining the average value of the voice features corresponding to the voice interaction information by taking the whole voice interaction information as a statistical object; and determining the degree value of the emotional characteristic according to the average value of the voice characteristic.
Optionally, the determining module includes:
a second degree value determining submodule, configured to determine the voice feature of each topic word and/or each emotion word in the voice interaction information; according to the weighted values of different parts of speech, carrying out weighted calculation on the voice features of the voice interaction information to obtain a weighted average value of the voice features corresponding to the voice interaction information; and determining the degree value of the emotional characteristic according to the weighted average value of the voice characteristic.
Optionally, the response policy includes: a conversation negotiation answering mode and/or an audio resource playing mode; the determining module includes:
the first determining sub-module is used for determining that the response strategy is the conversation negotiation response mode when the emotional characteristic is a negative emotional characteristic and the degree value of the emotional characteristic exceeds a preset threshold value; or determining that the response strategy is to respond in the dialogue negotiation response mode first and then respond in the audio resource playing mode.
Optionally, the determining module further includes:
a user portrait determination sub-module for determining a user portrait of the child user; the user portrait comprises at least one of the following characteristics: attribute information of the child user, historical interaction records of the child user, habitual expressions of the child user, the daily routine of the child user, audio resources preferred by the child user, and the association between geographic location and the child user;
and the optimization sub-module is used for optimizing the response strategy according to the user portrait of the child user determined by the user portrait determination sub-module.
Optionally, the apparatus further comprises:
the acquisition module is used for acquiring time information and/or place information of the received voice interaction information;
the determining module further comprises:
the scene determining submodule is used for determining the current scene of the child user based on the user portrait according to the time information and/or the place information;
and the optimization submodule is also used for optimizing the response strategy according to the current scene.
Optionally, the method further comprises:
and the generating module is used for generating the emotion analysis report of the child user according to a preset period.
In a third aspect, an embodiment of the present invention provides an electronic device, including:
a processor; a memory; and a program; wherein the program is stored in the memory and configured to be executed by the processor, the program comprising instructions for performing the method of the first aspect.
In a fourth aspect, an embodiment of the present invention provides an electronic device readable storage medium, where the electronic device readable storage medium stores a program, and the program causes an electronic device to execute the method according to the first aspect.
According to the interaction method and device based on child emotion, the electronic device, and the readable storage medium, voice interaction information of a child user is received; the interactive content in the voice interaction information is determined; the voice characteristics of the child user in the voice interaction information are determined; then the emotional characteristics of the child user are determined according to the interactive content; a degree value of the emotional characteristics is determined according to the voice characteristics; and a response strategy is determined according to the emotional characteristics and their degree value. The method enables an accurate grasp of the child's emotion, makes communication with the child more harmonious and fluent, and helps guide the child's emotion in a positive direction.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart illustrating a method of interaction based on child emotions of the present invention in an exemplary embodiment;
FIG. 2 is a flow chart illustrating a method of child emotion-based interaction of the present invention in accordance with another exemplary embodiment;
FIG. 3 is a schematic diagram of a structure of a child emotion-based interaction device according to an exemplary embodiment of the present invention;
FIG. 4 is a schematic diagram of a structure of a child emotion-based interaction device according to another exemplary embodiment of the present invention;
FIG. 5a is a schematic diagram of an electronic device of the present invention according to an exemplary embodiment;
fig. 5b is a schematic structural diagram of an electronic device according to another exemplary embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terms "first," "second," "third," and "fourth," if any, in the description and claims of the invention and in the above-described figures are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The technical means of the present invention will be described in detail below with reference to specific examples. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 1 is a flowchart illustrating a child emotion-based interaction method according to an exemplary embodiment of the present invention. As shown in fig. 1, an embodiment of the present invention provides a child emotion-based interaction method, which can be performed by any device capable of executing the method; the device can be implemented by software and/or hardware. In this embodiment, the device executing the child emotion-based interaction method may be integrated in the user's electronic device, or may be integrated in a cloud server that exchanges network data with the user's electronic device. Electronic devices include, but are not limited to, child interaction devices (story machines, smart watches, interactive robots), smartphones, tablets, portable computers, desktop computers, and the like. By receiving the child's voice interaction information and analyzing and recognizing it with the child emotion-based interaction method, a response strategy that matches the child user's current emotion is obtained, so that smooth communication with the child is achieved in a way that better understands the child's emotion. The cloud server may comprise one or more servers that perform operations such as computation, analysis, and storage. Taking a story machine as the device that executes the method: the story machine receives the child's voice interaction information and sends it to the cloud server; after receiving it, the cloud server analyzes and recognizes the voice interaction information based on the child emotion-based interaction method, obtains a response strategy matching the child user's current emotion, and feeds the response strategy back to the story machine over the network, so that the story machine can communicate smoothly with the child in a way that better understands the child's emotion. The executing subject in each of the following embodiments is described by taking a story machine as an example. The child emotion-based interaction method in this embodiment includes:
step 101, receiving voice interaction information of a child user.
In this embodiment, an audio receiving device in the story machine, such as a microphone, may receive the voice interaction information of the child user, that is, the child's voice as sensed by the story machine. The voice interaction information may include instructional language spoken by the child to the story machine, for example telling the story machine "I want to listen to the story of Princess Sofia" or "Let's have a chat"; it may also include only sounds made by the child, such as crying, laughing, or exclamations.
And step 102, determining interactive contents in the voice interactive information.
In this embodiment, the interactive content is the intention recognized in the voice interaction information. For example, when a child user wants to hear the story of Princess Sofia, the intention may not be expressed as precisely as an adult would express it; the child's expression may contain pauses, drawn-out sounds, reduplicated words, and other interference, so the utterance may come out haltingly as "I… want… Sofia… meow-meow… to listen to…". The story machine therefore needs to determine the interactive content from the voice interaction information. The interactive content may be content that can be expressed in text form; according to preset recognition rules, such as removing interjections, simplifying reduplicated words, and adjusting word order, the story machine can determine interactive content such as "I want to listen to Sofia" from the voice interaction information. In this way, the story machine can determine the interactive content from the voice interaction information and recognize its semantics in order to determine the child's intention. Prior-art techniques such as semantic recognition and word segmentation may be used to obtain the interactive content from the voice interaction information, and this embodiment does not specifically limit them.
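As a minimal sketch of how such preset recognition rules might look in code (the interjection list, the reduplication rule, and the function name are illustrative assumptions, not prescribed by this disclosure):

```python
INTERJECTIONS = {"um", "uh", "wow", "meow"}  # assumed example list

def normalize_utterance(tokens):
    """Apply simple preset recognition rules to obtain interactive content:
    drop interjections/pauses and collapse immediately repeated (reduplicated)
    words. Word-order adjustment would be a further rule layered on top."""
    cleaned = []
    for tok in tokens:
        word = tok.strip(".-… ").lower()
        if not word or word in INTERJECTIONS:
            continue                      # rule: remove interjections and pauses
        if cleaned and word == cleaned[-1]:
            continue                      # rule: simplify reduplicated words
        cleaned.append(word)
    return " ".join(cleaned)

# A halting utterance with pauses and reduplication:
print(normalize_utterance(["I", "…", "want", "want", "to", "listen", "to",
                           "Sofia", "meow", "meow"]))
# -> "i want to listen to sofia"
```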
And 103, determining the voice characteristics of the child user in the voice interaction information.
In the present embodiment, the voice characteristics are features used to evaluate and describe sound. They may include sound intensity, which reflects the energy of the sound; loudness, which reflects how strong the sound is perceived to be (for a given frequency, the greater the intensity, the greater the loudness); and pitch, which reflects the auditory system's perception of sound frequency. Features such as pitch period and signal-to-noise ratio, which can be used to identify the speaker's gender, may also be included. The feature dimensions included in the voice characteristics may be chosen by those skilled in the art according to the recognition requirements, which this embodiment does not specifically limit. By analyzing the voice characteristics, attribute information of the child user can be determined, for example whether the child is a boy or a girl, or the approximate age of the child based on the vocal traits reflected in the voice characteristics, thereby providing predictive data for building the user portrait. In addition, the voice characteristics can reflect the speaker's emotional state when the same sentence is spoken with different intensity. For example, when a child calmly says "I want to listen to the story of Princess Sofia", it can be inferred that the child's mood is calm and that the child really wants to hear the story. When a child says the same sentence in a hoarse, angry voice, the child may have just quarreled with someone and be very angry; although the child says he or she wants to hear a story, the child may not really want to, but may simply be venting emotion at the story machine. Playing a story at that moment would not be very appropriate; instead, another way of communicating should be adopted, according to the child's current emotion, to ease the anger. Therefore, by determining the child's voice characteristics from the voice interaction information, the child's emotional state can be grasped more accurately.
And step 104, determining the emotional characteristics of the child user according to the interactive content.
In this embodiment, speech is usually the most direct way to express emotion. Emotions can be roughly grouped into broad categories such as happiness, anger, sadness, and joy; by recognizing the words spoken, the speaker's emotion can be grasped more accurately. For example, a child user expressing the emotion of "liking" might say "I am really happy today"; the interactive content "happy" determined in step 102 then allows a preliminary determination that the child's emotional characteristic tends toward happiness. When expressing "anger", the child might say "I hate you"; the interactive content "hate" determined in step 102 then allows a preliminary determination that the child's emotional characteristic tends toward resentment or anger. The emotional characteristics can also be classified more finely than happiness, anger, sadness, and joy: for example, happiness can be further divided into joy and love, anger into rage and loathing, and emotional characteristics such as fear may also be included. Subdividing the emotional characteristics into categories or subcategories improves the accuracy of emotion recognition.
And 105, determining the degree value of the emotion characteristics according to the voice characteristics.
In this embodiment, as described in step 103, features such as pitch, tone, and speech speed can reflect the speaker's emotion to some extent. For example (example a), when a mother gently says "eat a little faster" to her child, the situation is not very urgent; it may merely be an expression habit reminding the child to concentrate on eating. However (example b), if the mother roars "Hurry! Up! And! Eat!!!" at the child, it may reflect that the mother is angry, or that time really is very pressing. Thus, a degree value can be assigned to the emotional characteristic determined in step 104, such as "urgency", based on the voice characteristics: for example a above, the degree value of "urgency" is smaller than in example b; that is, example a reflects "somewhat urgent" and "time is not very tight", while example b reflects "very urgent" and "very intense".
And step 106, determining a response strategy according to the emotional characteristics and the degree value of the emotional characteristics.
In this embodiment, according to the emotional characteristic determined from the interactive content and the degree value of that characteristic determined from the voice characteristics, a response strategy corresponding to the interactive content and matching the child's emotion is determined. For example, suppose the received voice interaction information is the child crying and saying "I'm not happy today, wah—"; the interactive content is determined to be "I'm not happy today". From the interactive content, the child's emotional characteristics are determined to be "unhappy" or "sad". If the voice characteristics are determined, from the crying carried in the interaction information, to be sound features that identify crying (e.g., the intensity and duration of the crying), then the degree of the "unhappy" or "sad" emotional characteristic is determined from those crying features to be moderate. The story machine may then adopt a response strategy of talking with the child, asking "What's wrong? Why are you crying?", or playing the child's favourite story, to help relieve the sad mood.
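Putting steps 101 to 106 together, the overall flow can be sketched as below. The helpers are deliberately simplified (a tiny emotion lexicon, loudness as the only voice characteristic, a single threshold), and all names and values are illustrative assumptions rather than the actual implementation:

```python
NEGATIVE_WORDS = {"unhappy", "sad", "hate"}          # assumed tiny lexicon
POSITIVE_WORDS = {"happy", "like", "love"}

def determine_emotion(content):
    """Step 104: emotional characteristic from the interactive content."""
    words = set(content.lower().split())
    if words & NEGATIVE_WORDS:
        return "negative"
    if words & POSITIVE_WORDS:
        return "positive"
    return "neutral"

def determine_degree(loudness_db):
    """Step 105: degree value from a voice characteristic (loudness only here)."""
    return min(1.0, loudness_db / 80.0)

def determine_response(emotion, degree, threshold=0.6):
    """Step 106: response strategy from emotion and degree."""
    if emotion == "negative" and degree > threshold:
        return "dialogue"                 # talk with the child first
    return "play_audio"                   # otherwise play a matching resource

content = "I am unhappy today"            # interactive content (step 102)
loudness = 70                             # voice characteristic (step 103)
emotion = determine_emotion(content)      # step 104 -> "negative"
degree = determine_degree(loudness)       # step 105 -> 0.875
print(determine_response(emotion, degree))  # step 106 -> "dialogue"
```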
According to the child emotion-based interaction method provided by the embodiment of the invention, voice interaction information of a child user is received; the interactive content in the voice interaction information is determined; the voice characteristics of the child user in the voice interaction information are determined; then the emotional characteristics of the child user are determined according to the interactive content; a degree value of the emotional characteristics is determined according to the voice characteristics; and a response strategy is determined according to the emotional characteristics and their degree value. The method enables an accurate grasp of the child's emotion, makes communication with the child more harmonious and fluent, and helps guide the child's emotion in a positive direction.
Fig. 2 is a flowchart illustrating a child emotion-based interaction method according to another exemplary embodiment of the present invention. On the basis of the embodiment shown in fig. 1, the child emotion-based interaction method of this embodiment specifically includes:
step 201, receiving voice interaction information of a child user.
Step 202, converting the voice interaction information into text information.
In this embodiment, prior-art techniques for extracting the words carrying semantic information from the voice interaction information may be used to form the text information. Because text generally occupies less storage space and fewer processor resources during computation, converting the voice interaction information into corresponding text helps improve the processing efficiency of the emotion-determination process, and higher processing efficiency helps ensure the accuracy of the emotion analysis. For example, if the received voice interaction information is the child saying "I went to the kindergarten today, haha—" with various modal particles, the converted text information keeps those particles. For continuous laughter such as "ha—", the conversion follows the principle of keeping the language elements that can be accurately recognized; for example, one or more clearly recognized "ha" syllables in the laughter are taken as its text, and the corresponding number of "ha" characters is retained in the text information.
And 203, performing semantic analysis on the text information, and extracting subject words and/or emotion words from the text information to obtain interactive content.
In this embodiment, semantic analysis is performed on the text information, and elements with analytical value, such as topic words and emotion words, are extracted from the converted text. Prior-art word segmentation techniques may be used for the semantic analysis, for example segmenting the text at one or more of the following granularities: single characters, single words, and phrases; then, according to part of speech, grammar, and other features of the segments, topic words that express the subject of the sentence and/or emotion words that express the speaker's emotional tendency are extracted. In the example of step 202, the child says "I went to the kindergarten today, haha"; the extracted topic word may be "kindergarten", and the emotion words may be the modal particles and the laughter "haha". Through the topic word "kindergarten", content related to the kindergarten can be included in the response strategy; through emotion words such as "haha", the child's current mood can be judged to be happy, so cheerful content is matched in the response strategy.
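A sketch of extracting topic words and emotion words from the converted text using a part-of-speech filter; the toy tagged sentence and the mapping from tags to word roles are assumptions for illustration, since this disclosure does not prescribe a particular segmentation or tagging tool:

```python
# A toy "segmented + POS-tagged" sentence; in practice a word segmenter /
# POS tagger for the child's language would produce this structure.
tagged = [("I", "PRON"), ("went", "VERB"), ("to", "ADP"),
          ("the", "DET"), ("kindergarten", "NOUN"),
          ("today", "NOUN"), ("haha", "INTJ")]

TOPIC_TAGS = {"NOUN", "PROPN"}      # assumed mapping: nouns carry the topic
EMOTION_TAGS = {"INTJ", "ADJ"}      # interjections / sentiment-bearing adjectives

topic_words = [w for w, t in tagged if t in TOPIC_TAGS]
emotion_words = [w for w, t in tagged if t in EMOTION_TAGS]

print(topic_words)    # ['kindergarten', 'today']
print(emotion_words)  # ['haha']
```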
And 204, determining the emotional characteristics of the child user according to the theme words and/or the emotional words based on the preset identification rules.
In this embodiment, the preset recognition rule may be based on a dictionary of children, and the corresponding emotional features of the topic words and/or the emotional words in the dictionary are searched. The child dictionary is created based on a child language, that is, the child dictionary is formed for language habits in the child language.
Wherein the child language may include at least: children's reduplicated-word (word-stacking) expressions, anthropomorphic expressions for animals, and the like. The child dictionary may include: paraphrases of the child language, paraphrases of the child language combined with context, emotion identifiers of the child language, and the like; the emotion identifier of the child language can be used to mark a positive emotion category, a negative emotion category, or a neutral emotion category in the language the child expresses.
For example, children like to use reduplicated words, such as "hao-hao" ("good-good") or "eat guo-guo" ("fruit-fruit"); besides the plain paraphrase "good", the child dictionary can record the meaning such a term carries in child speech, for instance "very good" or "really like". The paraphrase of child language combined with context exists because children's expression is generally less rigorous than adults': positive and negative words may be mixed, and word order may be reversed, so a meaning obtained purely from the literal result of word segmentation may not be the child's intention, and terms can instead be interpreted based on context. For example, if a child says he is sitting on the "fa-sha", the term "fa-sha" may have no meaning in the child dictionary by itself, but from the preceding word "sitting" and the context it can be judged that the word order of "sha-fa" (sofa) has been reversed, and "fa-sha" can be correctly interpreted as "sofa". As another example, a child says "I hao-bu-xihuan (literally 'good-not-like') you, dear little kitten": from the word "not" alone one might read dislike, but the phrase "hao-bu-xihuan" can have multiple paraphrases in the child dictionary (either "like" or "dislike"), and from the context "dear little kitten" it can be determined that the sentence actually expresses "like" and "happy". Therefore, such a phrase can be given emotion identifiers such as "like" and "happy" in the child dictionary, indicating that the child's current emotion belongs to the positive category. For the positive, negative, neutral, and other emotion categories, those skilled in the art can make finer subdivisions based on statistical data, so as to build a richer and more accurate child dictionary and improve the accuracy of emotion recognition.
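One possible way to represent such a child dictionary is sketched below: each entry carries a plain paraphrase, optional context-dependent paraphrases, and an emotion identifier (positive, negative, or neutral). The concrete entries are invented examples:

```python
from dataclasses import dataclass, field

@dataclass
class DictEntry:
    paraphrase: str                       # plain meaning of the child-language term
    emotion: str                          # "positive" | "negative" | "neutral"
    context_paraphrases: dict = field(default_factory=dict)  # context cue -> meaning

child_dictionary = {
    "meow-meow":     DictEntry("cat", "neutral"),
    "hao-hao":       DictEntry("good", "positive",
                               context_paraphrases={"default": "very good / really like"}),
    "fa-sha":        DictEntry("(no standalone meaning)", "neutral",
                               context_paraphrases={"sitting": "sofa (word order reversed)"}),
    "hao-bu-xihuan": DictEntry("dislike", "negative",
                               context_paraphrases={"dear little kitten": "like"}),
}

entry = child_dictionary["fa-sha"]
print(entry.context_paraphrases.get("sitting", entry.paraphrase))  # sofa (word order reversed)
```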
Based on the children dictionary, emotion identifications corresponding to each subject word and/or each emotion word can be determined in the children dictionary;
A) if the determined emotion marks do not contain the negative emotion types and contain the positive emotion types, determining the emotion characteristics of the child user as positive emotion characteristics;
specifically, there may be a plurality of topic words and emotion words extracted from the voice interaction information, and the emotional characteristics of each word are determined by querying in the child dictionary. If it is determined that the emotional characteristics expressed by each word are positive, or positive or neutral, the emotion expressed in the whole word can be determined to be a positive emotion. For example, if a child says "cat is beautiful today", the extracted subject words may be "cat" or "today"; the emotional words may be "good", "beautiful"; the emotion marks based on the children dictionary "cat" and "today" can be neutral emotion features, while the emotion marks of "good" and "beautiful" can be positive emotion features, and therefore, the emotion features expressed in the whole sentence are determined to be positive emotion features.
B) If the determined emotion marks do not contain the positive emotion types and contain the negative emotion types, determining the emotion characteristics of the child user as negative emotion characteristics;
For example, if the child says "I hate the cat", the extracted topic word may be "cat"; depending on the topic-word extraction rules, "I" may either be taken as a topic word or be recognized as a personal pronoun and ignored. The emotion word in the sentence is "hate". Based on the child dictionary, the emotion identifiers of "cat" and "I" may be neutral, while the emotion identifier of "hate" may be negative; therefore the emotional characteristic expressed by the whole sentence is determined to be a negative emotional characteristic.
C) If the determined emotion marks only contain neutral emotion types, determining the emotion characteristics of the child user as neutral emotion characteristics;
for example, if a child says "i go to a kindergarten today", the extracted subject term may be "today" or "kindergarten", or may be "i", "today", "go" or "kindergarten", and the sentence does not contain emotional terms, the terms may be identified as neutral emotional features based on a child dictionary, and the emotional features expressed by the whole sentence are determined as neutral emotional features.
D) And if the determined emotion identifications comprise both positive emotion types and negative emotion types, determining the emotional characteristics of the child user based on the context semanteme according to the word order of the theme words and/or the emotion words in the voice interaction information.
For example, if a child says "I really hate the cat", the extracted topic word may be "cat", and the emotion words may be "really" (literally "good", used as an intensifier) and "hate". Based on the child dictionary, the emotion identifier of "cat" may be neutral, that of "good" positive, and that of "hate" negative; therefore, according to the word order of "good" and "hate" in the original sentence, and based on the context and semantics, the emotional characteristic is determined to be negative. It should be noted that for the recognition of "really hate", the negative emotion may be determined according to the recognition mode in D), or, according to children's expression habits, the whole phrase may be recorded among the child dictionary's context-dependent paraphrases as expressing the emotional characteristic "dislike".
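Rules A) to D) can be written down almost directly; in the sketch below the context-based resolution of rule D) is reduced to letting the last non-neutral emotion word decide, which is only one possible reading of "word order plus context":

```python
def classify(emotion_tags):
    """emotion_tags: list of (word, tag) pairs in sentence order,
    tag in {'positive', 'negative', 'neutral'}."""
    tags = [t for _, t in emotion_tags]
    has_pos, has_neg = "positive" in tags, "negative" in tags
    if has_pos and not has_neg:
        return "positive"                             # rule A
    if has_neg and not has_pos:
        return "negative"                             # rule B
    if not has_pos and not has_neg:
        return "neutral"                              # rule C
    # rule D: both present -> fall back to word order / context;
    # here, simply let the last non-neutral word decide.
    for _, tag in reversed(emotion_tags):
        if tag != "neutral":
            return tag

print(classify([("cat", "neutral"), ("good", "positive"), ("beautiful", "positive")]))  # positive
print(classify([("good", "positive"), ("hate", "negative"), ("cat", "neutral")]))       # negative
```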
The preset recognition rule can search corresponding emotion characteristics of the theme words and/or the emotion words in the dictionary based on the child dictionary, and can also recognize emotion in the language expression of the child based on the child emotion recognition model.
Firstly, a child emotion recognition model needs to be established, and the child emotion recognition model is obtained through training by acquiring child emotion marking data; and then inputting the theme words and/or the emotion words into the child emotion recognition model, and recognizing and obtaining the emotional characteristics of the child user.
The former approach, based on the child dictionary, is a dictionary-lookup rule built on a large amount of statistical data. Keeping such a dictionary up to date is a considerable challenge, because language keeps changing and new popular expressions appear all the time. Therefore, children's habitual expressions can also be trained in large volumes with algorithms such as neural networks, so that by continuously learning how children speak, the model becomes a child emotion recognition model capable of recognizing the emotion expressed in children's language. This enables accurate recognition of children's emotions, and the model can be updated through continuous learning and training, improving recognition accuracy.
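A hedged sketch of this learned alternative, training a small text classifier on labelled child utterances; scikit-learn is used purely for illustration, and the four training samples are invented stand-ins for real annotated child-emotion data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set standing in for real annotated child speech.
texts = ["I am so happy today", "I really hate the cat",
         "I went to kindergarten", "I do not want to eat"]
labels = ["positive", "negative", "neutral", "negative"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Topic / emotion words (or the whole utterance) go in, an emotion comes out.
print(model.predict(["hate hate cat"]))   # likely ['negative'] on this toy data
```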
Step 205, determining at least one of the following voice characteristics in the voice interaction information: voice intensity, speech speed, and intonation.
In this embodiment, the speech intensity may include sound features such as sound intensity, loudness, pitch, and the like;
the speech speed, measured by the number of words spoken per unit time, can reflect emotional expressions of the child such as tension, urgency, cheerfulness, and excitement;
intonation is the rise and fall of the voice in speech. The intonation of a sentence can indicate the speaker's attitude or mood, and the same sentence can carry different meanings with different intonation. For example, if the child says "I'm going to do my homework today" in a flat or subdued tone, it may indicate reluctance and an unhappy mood; if a rising tone is used, the sentence instead has the effect of a question: "I have to do my homework today?"
Therefore, the emotional tendency of the child user is clarified by recognizing the voice intensity, the voice speed, the intonation, and the like in the voice interaction information.
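As a concrete illustration, two of these voice characteristics can be estimated with elementary signal processing. The conventions below (RMS intensity in decibels, words per second as speech speed) and the synthetic one-second waveform are assumptions, not definitions prescribed by this disclosure:

```python
import numpy as np

def speech_intensity_db(samples):
    """Root-mean-square intensity of the waveform, in dB relative to full scale."""
    rms = np.sqrt(np.mean(np.asarray(samples, dtype=np.float64) ** 2))
    return 20.0 * np.log10(max(rms, 1e-12))

def speech_rate(num_words, duration_seconds):
    """Words per second over the utterance."""
    return num_words / duration_seconds

audio = 0.1 * np.sin(2 * np.pi * 220 * np.linspace(0, 1.0, 16000))  # 1 s synthetic audio
print(round(speech_intensity_db(audio), 1))   # ~ -23.0 dBFS
print(speech_rate(num_words=6, duration_seconds=1.5))  # 4.0 words per second
```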
And step 206, determining the degree value of the emotion characteristics according to the voice characteristics.
In this embodiment, when determining the degree value of the emotion feature, statistics may be performed based on the speech feature of the whole sentence, or statistics may be performed based on the speech feature of each word in the whole sentence. In particular, the method comprises the following steps of,
in the first mode, the voice interaction information is used as a whole statistical object, and the average value of the voice characteristics corresponding to the voice interaction information is determined; and determining the degree value of the emotional feature according to the average value of the voice feature.
Determining the voice characteristics of each topic word and/or emotional word in the voice interaction information in a second mode; according to the weighted values of different parts of speech, carrying out weighted calculation on the voice features of the voice interaction information to obtain a weighted average value of the voice features corresponding to the voice interaction information; and determining the degree value of the emotional feature according to the weighted average value of the voice feature.
The first mode obtains the average value of the voice characteristics from the expression of the whole sentence, for example from the speech waveform, and the average can be compared with a preset threshold to determine the degree value of the emotional characteristic it reflects. This mode involves little computation and a simple algorithm, and effectively improves the efficiency of determining the degree value. However, because the average is taken over the whole sentence, a case where only an individual word's intonation or intensity changes strongly may be weakened and averaged out. The second mode extracts the voice characteristics of each topic word and/or emotion word in the sentence, assigns different weight values according to the part of speech of each word, weights each word, and takes a weighted average over the whole sentence to obtain the degree value. The degree determined this way is more accurate, but because the voice characteristics of every word must be determined and weighted, the amount of computation is larger than in the first mode, in exchange for higher accuracy. For example, if a child says "I do NOT!!! like eating radish", raising the pitch sharply on the word "not", then "not", as a negation word, is given a larger weight value, and after the weighted average calculation the degree value of the negative emotion expressed by the sentence is strengthened.
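Both ways of turning voice characteristics into a degree value can be sketched as follows; the part-of-speech weight table (negation words weighted highest) and the normalized pitch values are invented for illustration:

```python
def degree_simple(feature_values):
    """Mode 1: average the feature over the whole utterance."""
    return sum(feature_values) / len(feature_values)

POS_WEIGHTS = {"negation": 3.0, "emotion": 2.0, "topic": 1.0, "other": 0.5}  # assumed

def degree_weighted(word_features):
    """Mode 2: weight each word's feature by its part of speech.
    word_features: list of (feature_value, pos_tag)."""
    num = sum(v * POS_WEIGHTS.get(pos, 1.0) for v, pos in word_features)
    den = sum(POS_WEIGHTS.get(pos, 1.0) for _, pos in word_features)
    return num / den

# normalized pitch (0-1) per word of "I / not / like / eat / radish"
words = [(0.3, "other"), (0.9, "negation"), (0.4, "emotion"),
         (0.3, "other"), (0.3, "topic")]
print(round(degree_simple([v for v, _ in words]), 2))   # 0.44
print(round(degree_weighted(words), 2))                 # 0.59 - negation raises the degree
```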
And step 207, determining a response strategy according to the emotional characteristics and the degree value of the emotional characteristics.
In this embodiment, the response policy may include: a conversation negotiation answering mode and/or an audio resource playing mode; if the emotional characteristic is a negative emotional characteristic and the degree value of the emotional characteristic exceeds a preset threshold value, determining that the response strategy is a conversation negotiation response mode; or determining the response strategy as that the response is carried out in a dialogue negotiation response mode firstly and then the response is carried out in an audio resource playing mode.
That is, when the child is in a low mood, the device should respond in a more targeted way, such as talking with the child to understand the cause of the low mood; if the device responds directly with music or a story, the child may feel that his or her mood is being ignored and become even more downcast. Therefore, for negative emotional characteristics, the response strategy may be the dialogue negotiation response mode, or may be to talk with the child first and then, according to the change in mood, insert audio resources (such as songs or stories the child likes) to ease the mood.
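The response-strategy rule is essentially a small decision table; a sketch with an assumed threshold and an assumed ordering of the two response modes:

```python
def response_plan(emotion, degree, threshold=0.6):
    """Return an ordered list of response modes.
    For a strong negative emotion, talk with the child first and optionally
    follow up with audio once the mood eases. The threshold and the ordering
    are illustrative choices, not prescribed values."""
    if emotion == "negative" and degree > threshold:
        return ["dialogue_negotiation", "audio_playback"]   # talk first, soothe later
    if emotion == "negative":
        return ["dialogue_negotiation"]
    return ["audio_playback"]

print(response_plan("negative", 0.8))   # ['dialogue_negotiation', 'audio_playback']
print(response_plan("positive", 0.8))   # ['audio_playback']
```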
And step 208, optimizing the response strategy according to the determined user portrait of the child user and/or according to the current scene.
In this embodiment, when determining the response policy, in addition to determining the response policy by analyzing the voice interaction information of the child in steps 201 to 207, the response policy may be determined by combining the user image of the child (e.g., boy or girl, what is liked, what is disliked, etc.). With respect to the manner in which the user representation is determined, those skilled in the art may employ prior art methods of determining a user representation, such as analyzing past records of interaction by the user. This embodiment is not particularly limited.
The user portrait can comprise at least one of the following characteristics: attribute information of the child user, historical interaction records of the child user, habitual expressions of the child user, the daily routine of the child user, audio resources preferred by the child user, the association between geographic location and the child user, and the like; the response strategy is thereby optimized according to the determined user portrait of the child user.
In addition, the story machine can also acquire time information and/or place information of the received voice interaction information; determining the current scene of the child user based on the user portrait according to the time information and/or the location information; and optimizing the response strategy according to the current scene.
For example, suppose that according to steps 201 to 207 the child's current emotion is determined to be happy, but the location information obtained at this time is the child's home and the time information is 9 p.m., and the child's user portrait shows that the child usually goes to sleep at about 9:30 p.m. The response strategy determined in step 207 might be to play a piece of cheerful music; however, considering that the current scene is just before the child falls asleep, a relatively relaxing song can be selected from the cheerful music to help the child fall asleep at around 9:30. Thus, by combining multi-dimensional considerations (e.g., user portrait, time, and location), the response strategy can be optimized to match the child user's mood while helping guide the mood in a more positive direction.
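The scene-aware optimization can be sketched as a post-processing step over the already-determined response: the user portrait supplies the usual bedtime, and the current time and location decide whether cheerful content should be swapped for something calmer. The portrait fields and the one-hour bedtime window are invented for illustration:

```python
from datetime import datetime, time

user_portrait = {"bedtime": time(21, 30), "favourite_genre": "cheerful"}  # assumed fields

def optimize_response(base_choice, now, location, portrait):
    """Adjust the already-determined response to the current scene."""
    seconds_to_bedtime = abs(
        (datetime.combine(now.date(), portrait["bedtime"]) - now).total_seconds())
    near_bedtime = location == "home" and seconds_to_bedtime < 3600
    if base_choice == "cheerful_music" and near_bedtime:
        return "relaxing_song"            # still positive, but helps the child fall asleep
    return base_choice

now = datetime(2018, 4, 1, 21, 0)         # 9 p.m., at home
print(optimize_response("cheerful_music", now, "home", user_portrait))  # relaxing_song
```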
Optionally, an emotion analysis report of the child user can be generated according to a preset period, so that parents can know their children more.
Fig. 3 is a schematic structural diagram illustrating an exemplary embodiment of the child emotion-based interaction device of the present invention. The device may be any of various electronic devices supporting information communication, including but not limited to child interaction devices (story machines, smart watches, interactive robots), smartphones, tablets, portable computers, desktop computers, and the like. The child emotion-based interaction device may also be provided in a cloud server; the cloud server exchanges network data with electronic devices such as child interaction devices (story machines, smart watches, interactive robots), smartphones, tablets, portable computers, and desktop computers, thereby providing the service of responding to children's emotions with a response strategy. The cloud server may comprise one or more servers that perform operations such as computation, analysis, and storage. Taking a child interaction device as an example: the story machine receives the child's voice interaction information and sends it to the cloud server; the cloud server receives the voice interaction information, analyzes and recognizes it based on the child emotion-based interaction method of the preceding embodiments, obtains a response strategy matching the child user's current emotion, and feeds the strategy back to the story machine over the network, so that the story machine communicates smoothly with the child in a way that better understands the child's emotion. The child emotion-based interaction device can be implemented by software, hardware, or a combination of the two. As shown in fig. 3, the device includes:
A receiving module 31, configured to receive the voice interaction information of a child user.
A determining module 32, configured to determine the interactive content in the voice interaction information; determine the voice characteristics of the child user in the voice interaction information; determine the emotional characteristics of the child user according to the interactive content; determine a degree value of the emotional characteristics according to the voice characteristics; and determine a response strategy according to the emotional characteristics and the degree value of the emotional characteristics.
The interaction device based on the emotion of the child provided by this embodiment may execute the method embodiment shown in fig. 1, and the implementation principle and the technical effect are similar, which are not described herein again.
The interaction apparatus based on child emotion provided by this embodiment receives the voice interaction information of a child user; determines the interactive content in the voice interaction information; determines the voice characteristics of the child user in the voice interaction information; then determines the emotional characteristics of the child user according to the interactive content; determines a degree value of the emotional characteristics according to the voice characteristics; and determines a response strategy according to the emotional characteristics and the degree value of the emotional characteristics. The apparatus thereby achieves accurate control of the child's emotion, improves the harmony and smoothness of communication with the child, and ensures good guidance of the child's emotion.
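To illustrate the division of labor between the story machine and the cloud server described for Fig. 3, the sketch below models the two roles as plain Python objects; in a real deployment the exchange would travel over a network API, and all class and method names here are assumptions rather than the actual implementation:

```python
class CloudEmotionService:
    """Stands in for the cloud server: analyzes voice interaction information
    and returns a response strategy. The analysis here is a stub; a real
    deployment would run the full method of the earlier embodiments."""

    def analyze(self, voice_interaction_info: dict) -> dict:
        emotion = voice_interaction_info.get("emotion_hint", "neutral")  # placeholder analysis
        if emotion == "negative":
            return {"mode": "dialogue_negotiation"}
        return {"mode": "play_audio", "category": "cheerful"}


class StoryMachine:
    """Stands in for the local child-interaction device."""

    def __init__(self, service: CloudEmotionService):
        self.service = service  # in practice, a network client for the cloud server

    def on_voice_input(self, voice_interaction_info: dict) -> dict:
        # Forward the captured interaction to the cloud and act on the reply.
        strategy = self.service.analyze(voice_interaction_info)
        return strategy  # a real device would now start a dialogue or play audio


machine = StoryMachine(CloudEmotionService())
print(machine.on_voice_input({"text": "I don't want to sleep", "emotion_hint": "negative"}))
# {'mode': 'dialogue_negotiation'}
```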
Fig. 4 is a schematic structural diagram of an interaction apparatus based on child emotion according to another exemplary embodiment. As shown in Fig. 4, on the basis of the above embodiment, the apparatus is further refined as follows.
The determining module 32 includes:
A conversion submodule 321, configured to convert the voice interaction information into text information.
An analysis submodule 322, configured to perform semantic analysis on the text information.
An extraction submodule 323, configured to extract theme words and/or emotion words from the text information to obtain the interactive content.
An emotional characteristic determination submodule 324, configured to determine the emotional characteristics of the child user according to the theme words and/or emotion words based on preset recognition rules.
Optionally, the apparatus further includes:
A dictionary module 33, configured to establish a child dictionary based on child language.
The child language includes at least expressions using children's reduplicated (stacked) words and expressions referring to anthropomorphized animals. The child dictionary includes paraphrases of child-language expressions, context-dependent paraphrases of child-language expressions, and emotion identifiers of child-language expressions.
The emotion identifier of a child-language expression marks it as belonging to a positive emotion category, a negative emotion category, or a neutral emotion category.
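A minimal sketch of what such a child dictionary could look like; the entries, paraphrases, and field names are invented examples rather than the actual dictionary of the disclosure:

```python
# Each entry maps a child-language expression to a paraphrase, an optional
# context-dependent paraphrase, and an emotion identifier
# ("positive" / "negative" / "neutral").
CHILD_DICTIONARY = {
    "饭饭": {"paraphrase": "food (reduplicated child word)", "emotion": "neutral"},
    "汪汪": {"paraphrase": "dog (anthropomorphic / onomatopoeic)", "emotion": "neutral"},
    "开心": {"paraphrase": "happy", "emotion": "positive"},
    "讨厌": {"paraphrase": "dislike, annoying",
             "context_paraphrase": "can be playful rather than hostile in child speech",
             "emotion": "negative"},
}


def emotion_of(word):
    """Return the emotion identifier of a word, defaulting to neutral."""
    return CHILD_DICTIONARY.get(word, {}).get("emotion", "neutral")


print(emotion_of("开心"))  # positive
```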
Optionally, the emotional characteristic determination submodule 324 is specifically configured to determine the emotion identifier corresponding to each theme word and/or each emotion word in the child dictionary, and then:
when the determined emotion identifiers contain a positive emotion category and no negative emotion category, determine the emotional characteristics of the child user as positive emotional characteristics;
when the determined emotion identifiers contain a negative emotion category and no positive emotion category, determine the emotional characteristics of the child user as negative emotional characteristics;
when the determined emotion identifiers contain only neutral emotion categories, determine the emotional characteristics of the child user as neutral emotional characteristics; and
when the determined emotion identifiers contain both a positive emotion category and a negative emotion category, determine the emotional characteristics of the child user based on the contextual semantics according to the word order of the theme words and/or emotion words in the voice interaction information.
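The four branches above can be written compactly. The sketch below assumes that each theme word and/or emotion word has already been mapped to its emotion identifier via the child dictionary; the fall-back to contextual analysis is left as an optional callback:

```python
def classify_emotion(emotion_labels, resolve_by_context=None):
    """Apply the preset recognition rules to the emotion identifiers of the
    extracted theme words and/or emotion words."""
    has_pos = "positive" in emotion_labels
    has_neg = "negative" in emotion_labels
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    if not has_pos and not has_neg:
        return "neutral"  # only neutral identifiers were found (or none at all)
    # Both positive and negative identifiers: fall back to word order / context.
    return resolve_by_context(emotion_labels) if resolve_by_context else "neutral"


print(classify_emotion(["positive", "neutral"]))  # positive
```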
Optionally, the apparatus further includes:
A recognition model module 34, configured to acquire child emotion annotation data and train a child emotion recognition model.
Correspondingly, the emotional characteristic determination submodule 324 is specifically configured to input the theme words and/or emotion words into the child emotion recognition model of the recognition model module, and to recognize the emotional characteristics of the child user from the model output.
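As one possible illustration of training such a child emotion recognition model, the sketch below uses scikit-learn with a tiny invented set of annotated utterances; the disclosure does not prescribe any particular toolkit, model type, or training data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder child emotion annotation data (utterance, emotion label).
texts = ["我好开心呀", "我不要睡觉 讨厌", "给我讲个故事"]
labels = ["positive", "negative", "neutral"]

# Character n-grams work reasonably for unsegmented Chinese child speech.
model = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(texts, labels)

print(model.predict(["今天玩得真开心"]))  # expected to lean towards "positive"
```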
Optionally, the determining module 32 includes:
A voice feature determination submodule 325, configured to determine at least one of the following voice features in the voice interaction information: voice strength, speech speed, and intonation.
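A rough sketch of extracting two of these voice features from raw audio samples, with voice strength approximated by RMS energy and speech speed by words per second; intonation would additionally require a pitch tracker and is omitted here:

```python
import numpy as np


def basic_voice_features(samples, sample_rate, word_count):
    """Very rough stand-ins for the voice features named above."""
    samples = np.asarray(samples, dtype=np.float64)
    duration_s = len(samples) / float(sample_rate)
    return {
        "strength": float(np.sqrt(np.mean(samples ** 2))),            # RMS level
        "speed": word_count / duration_s if duration_s > 0 else 0.0,  # words per second
    }


# One second of synthetic audio and a five-word utterance.
fake_audio = np.random.uniform(-0.2, 0.2, 16000)
print(basic_voice_features(fake_audio, 16000, word_count=5))
```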
Optionally, the determining module 32 includes:
A first degree value determination submodule 326, configured to determine an average value of the voice features corresponding to the voice interaction information, taking the entire voice interaction information as the statistical object, and to determine the degree value of the emotional characteristics according to that average value.
Optionally, the determining module 32 includes:
A second degree value determination submodule 327, configured to determine the voice feature of each theme word and/or each emotion word in the voice interaction information; perform a weighted calculation on the voice features of the voice interaction information according to the weight values of different parts of speech to obtain a weighted average value of the voice features corresponding to the voice interaction information; and determine the degree value of the emotional characteristics according to the weighted average value of the voice features.
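A sketch of both degree-value computations, the whole-utterance average of submodule 326 and the part-of-speech-weighted average of submodule 327; the weight values are illustrative assumptions, since the disclosure does not specify them:

```python
def degree_value_simple(feature_values):
    """Whole-utterance average of one voice feature (e.g. voice strength)."""
    return sum(feature_values) / len(feature_values)


# Illustrative part-of-speech weights.
POS_WEIGHTS = {"emotion_word": 2.0, "theme_word": 1.5, "other": 1.0}


def degree_value_weighted(word_features):
    """Part-of-speech-weighted average over per-word voice features.

    `word_features` is a list of (part_of_speech, feature_value) pairs for the
    theme words and/or emotion words in the utterance."""
    total_weight = sum(POS_WEIGHTS.get(pos, 1.0) for pos, _ in word_features)
    weighted_sum = sum(POS_WEIGHTS.get(pos, 1.0) * value for pos, value in word_features)
    return weighted_sum / total_weight


print(degree_value_simple([0.8, 0.5, 0.3]))
print(degree_value_weighted([("emotion_word", 0.8), ("theme_word", 0.5), ("other", 0.3)]))
```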
Optionally, the response strategy includes a dialogue negotiation response mode and/or an audio resource playing mode; the determining module 32 includes:
A first determination submodule 328, configured to determine that the response strategy is the dialogue negotiation response mode when the emotional characteristics are negative emotional characteristics and the degree value of the emotional characteristics exceeds a preset threshold; or to determine that the response strategy is to respond first in the dialogue negotiation response mode and then in the audio resource playing mode.
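A sketch of this response-strategy selection; the threshold value and the strategy labels are illustrative assumptions, and the non-negative branch simply defaults to audio playback, which the disclosure leaves to the general strategy:

```python
NEGATIVE_DEGREE_THRESHOLD = 0.7  # illustrative; the disclosure leaves the threshold unspecified


def choose_response_strategy(emotion, degree):
    """Pick the response strategy from the emotional characteristic and its degree value."""
    if emotion == "negative" and degree > NEGATIVE_DEGREE_THRESHOLD:
        # Soothe via dialogue negotiation first, then optionally play calming audio.
        return ["dialogue_negotiation", "play_audio"]
    return ["play_audio"]  # default assumption for the other cases


print(choose_response_strategy("negative", 0.9))  # ['dialogue_negotiation', 'play_audio']
```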
Optionally, the determining module 32 further includes:
A user portrait determination submodule 329, configured to determine the user portrait of the child user; the user portrait includes at least one of the following characteristics: attribute information of the child user, historical interaction records of the child user, habitual expressions of the child user, work and rest rules of the child user, audio resources preferred by the child user, and the association between geographic locations and the child user.
An optimization submodule 330, configured to optimize the response strategy according to the user portrait of the child user determined by the user portrait determination submodule.
Optionally, the apparatus further includes:
An obtaining module 35, configured to obtain the time information and/or location information of the received voice interaction information.
The determination module 32 further includes:
A scene determination submodule 331, configured to determine the current scene of the child user based on the user portrait according to the time information and/or location information.
The optimization submodule 330 is further configured to optimize the response strategy according to the current scene.
Optionally, the apparatus further includes:
A generating module 36, configured to generate an emotion analysis report of the child user according to a preset period.
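A sketch of generating such a periodic emotion analysis report by aggregating per-interaction emotion records over the preset period (here, one week); the record layout is an assumption:

```python
from collections import Counter
from datetime import datetime, timedelta


def weekly_emotion_report(records, now=None):
    """Summarise the last seven days of (timestamp, emotion) records into a
    simple distribution that parents can review."""
    now = now or datetime.now()
    window_start = now - timedelta(days=7)
    recent = [emotion for ts, emotion in records if ts >= window_start]
    counts = Counter(recent)
    total = sum(counts.values()) or 1
    return {emotion: round(count / total, 2) for emotion, count in counts.items()}


print(weekly_emotion_report([
    (datetime.now(), "positive"),
    (datetime.now() - timedelta(days=2), "negative"),
    (datetime.now() - timedelta(days=10), "negative"),  # outside the window, ignored
]))
```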
The interaction apparatus based on child emotion provided in this embodiment may execute the method embodiment shown in Fig. 2; the implementation principle and technical effect are similar and are not described here again.
Fig. 5a is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present invention. The electronic device 500 includes a processing unit 502 and a communication unit 503. The processing unit 502 is configured to control and manage the actions of the electronic device 500; for example, the processing unit 502 is configured to support the electronic device 500 in performing steps 102 to 106 of Fig. 1, steps 202 to 208 of Fig. 2, and/or other processes of the techniques described herein. The communication unit 503 is used for communication between the electronic device 500 and other network entities, and may also be used to support the electronic device 500 in performing step 101 of Fig. 1 or step 201 of Fig. 2. The electronic device 500 may further include a storage unit 501 for storing program code and data of the electronic device 500.
The processing unit 502 may be a processor or a controller, such as a CPU, a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor may also be a combination of computing components, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor. The communication unit 503 may be a communication interface, a transceiver circuit, or the like, where "communication interface" is a generic term that may include one or more interfaces. The storage unit 501 may be a memory.
When the processing unit 502 is a processor, the communication unit 503 is a communication interface, and the storage unit 501 is a memory, the electronic device according to the present invention may be the electronic device 510 shown in fig. 5 b.
Referring to fig. 5b, the electronic device 510 includes: a processor 512, a communication interface 513, and a memory 511. Optionally, the electronic device 510 may also include a bus 514. Wherein, the communication interface 513, the processor 512 and the memory 511 may be connected to each other by a bus 514; the bus 514 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus 514 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 5b, but this does not indicate only one bus or one type of bus.
Further, the memory 511 stores a program that includes instructions for performing the method according to any of the embodiments described above and that is configured to be executed by the processor 512.
An embodiment of the present invention further provides an electronic-device-readable storage medium, which stores a program that causes an electronic device to execute the interaction method based on child emotion provided in any of the foregoing embodiments. The readable storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (20)

1. A method of interaction based on the mood of a child, comprising:
receiving voice interaction information of a child user;
determining interactive content in the voice interactive information;
determining the voice characteristics of the child user in the voice interaction information;
determining emotional characteristics of the child user according to the interactive content;
determining a degree value of the emotional characteristic of the child user according to the voice characteristic of the child user;
determining a response strategy according to the emotional characteristics of the child user and the degree value of the emotional characteristics of the child user;
wherein the determining interactive content in the voice interaction information includes:
converting the voice interaction information into text information;
performing semantic analysis on the text information, and extracting theme words and/or emotion words from the text information to obtain the interactive content;
correspondingly, the determining the emotional characteristics of the child user according to the interactive content includes:
determining emotional characteristics of the child user according to the theme words and/or the emotional words based on a preset recognition rule, wherein the preset recognition rule comprises that corresponding emotional characteristics of the theme words and/or the emotional words in a dictionary are searched based on a child dictionary, or based on a child emotion recognition model, the emotional characteristics in the language expression of the child are recognized according to the theme words and/or the emotional words, and the child emotion recognition model is obtained by training according to child emotion marking data in advance;
the method further comprises the following steps:
establishing a child dictionary based on the child language;
wherein the child language includes at least: expressions using children's reduplicated (stacked) words and expressions referring to anthropomorphized animals; the child dictionary includes: paraphrases of child language, paraphrases of child language combined with context, and emotion identifiers of child language;
the emotion identification of the child language is used for identifying a positive emotion category, a negative emotion category and a neutral emotion category;
wherein, the determining the emotional characteristics of the child user according to the theme words and/or the emotion words based on the preset identification rules comprises:
determining an emotion identifier corresponding to each theme word and/or each emotion word in the child dictionary;
if the determined emotion identifiers do not contain a negative emotion category and contain a positive emotion category, determining the emotional characteristics of the child user as positive emotional characteristics;
if the determined emotion identifiers do not contain a positive emotion category and contain a negative emotion category, determining the emotional characteristics of the child user as negative emotional characteristics;
if the determined emotion identifiers contain only neutral emotion categories, determining the emotional characteristics of the child user as neutral emotional characteristics;
and if the determined emotion identifiers contain both a positive emotion category and a negative emotion category, determining the emotional characteristics of the child user based on the contextual semantics according to the word order of the theme words and/or emotion words in the voice interaction information.
2. The method of claim 1, further comprising:
acquiring emotion marking data of the children, and training to obtain a child emotion recognition model;
correspondingly, the determining the emotional characteristics of the child user according to the theme words and/or the emotion words based on the preset identification rules includes:
and inputting the theme words and/or the emotion words into the child emotion recognition model, and recognizing to obtain the emotional characteristics of the child user.
3. The method of claim 1, wherein determining the voice characteristics of the child user in the voice interaction information comprises:
and determining at least one of the following voice characteristics in the voice interaction information: voice intensity, speech speed, and intonation.
4. The method of claim 1, wherein determining the degree value of the emotional characteristic of the child user based on the voice characteristic of the child user comprises:
determining the average value of the voice features corresponding to the voice interaction information by taking the whole voice interaction information as a statistical object;
and determining the degree value of the emotional characteristic of the child user according to the average value of the voice characteristic.
5. The method of claim 1, wherein determining the level value of the emotional characteristic of the child user based on the speech characteristic comprises:
determining voice features corresponding to the voice interaction information, wherein the voice features corresponding to the voice interaction information comprise the voice features of each theme word and/or each emotion word in the voice interaction information;
according to the weighted values of different parts of speech, carrying out weighted calculation on the voice features corresponding to the voice interaction information to obtain a weighted average value of the voice features corresponding to the voice interaction information;
and determining the degree value of the emotional characteristic of the child user according to the weighted average value of the voice characteristic corresponding to the voice interaction information.
6. The method of claim 1, wherein the response policy comprises: a conversation negotiation answering mode and/or an audio resource playing mode; determining a response strategy according to the emotional characteristics of the child user and the degree value of the emotional characteristics of the child user, comprising the following steps:
if the emotional characteristics of the child user are negative emotional characteristics and the degree value of the emotional characteristics of the child user exceeds a preset threshold value, determining that a response strategy is the conversation negotiation response mode; or determining that the response strategy is to respond in the dialogue negotiation response mode first and then respond in the audio resource playing mode.
7. The method of claim 1, further comprising:
determining a user representation of the child user; the user portrait of the child user comprises at least one of the following characteristics, namely attribute information of the child user, historical interaction records of the child user, habitual expressions of the child user, work and rest rules of the child user, audio resources preferred by the child user, and an association relationship between a geographic position and the child user;
and optimizing the response strategy according to the determined user image of the child user.
8. The method of claim 7, further comprising:
acquiring time information and/or place information of the received voice interaction information;
according to the time information and/or the place information, determining the current scene of the child user based on the user portrait;
and optimizing the response strategy according to the current scene.
9. The method of claim 1, further comprising:
and generating an emotion analysis report of the child user according to a preset period.
10. An interaction device based on emotion of a child, comprising:
the receiving module is used for receiving voice interaction information of a child user;
the determining module is used for determining interactive contents in the voice interactive information; determining the voice characteristics of the child user in the voice interaction information; determining emotional characteristics of the child user according to the interactive content; determining a degree value of the emotional characteristic of the child user according to the voice characteristic; determining a response strategy according to the emotional characteristics of the child user and the degree value of the emotional characteristics of the child user;
wherein the determining module comprises:
the conversion submodule is used for converting the voice interaction information into text information;
the analysis submodule is used for performing semantic analysis on the text information;
the extraction submodule is used for extracting the theme words and/or the emotion words from the text information to obtain interactive contents;
the emotion feature determination submodule is used for determining emotion features of the child user according to the theme words and/or the emotion words based on preset recognition rules, wherein the preset recognition rules comprise that corresponding emotion features of the theme words and/or the emotion words in a dictionary are searched based on a child dictionary, or emotion features in language expression of the child are recognized according to the theme words and/or the emotion words based on a child emotion recognition model, and the child emotion recognition model is obtained by training according to child emotion marking data in advance;
the device further comprises:
the dictionary module is used for establishing a child dictionary based on the child language;
wherein the child language includes at least: expressions using children's reduplicated (stacked) words and expressions referring to anthropomorphized animals; the child dictionary includes: paraphrases of child language, paraphrases of child language combined with context, and emotion identifiers of child language;
the emotion identification of the child language is used for identifying a positive emotion category, a negative emotion category and a neutral emotion category;
the emotional characteristic determination submodule is specifically configured to determine the emotion identifier corresponding to each theme word and/or each emotion word in the child dictionary;
when the determined emotion identifiers do not contain a negative emotion category and contain a positive emotion category, determine the emotional characteristics of the child user as positive emotional characteristics;
when the determined emotion identifiers do not contain a positive emotion category and contain a negative emotion category, determine the emotional characteristics of the child user as negative emotional characteristics;
when the determined emotion identifiers contain only neutral emotion categories, determine the emotional characteristics of the child user as neutral emotional characteristics;
and when the determined emotion identifiers contain both a positive emotion category and a negative emotion category, determine the emotional characteristics of the child user based on the contextual semantics according to the word order of the theme words and/or emotion words in the voice interaction information.
11. The apparatus of claim 10, further comprising:
the recognition model module is used for acquiring emotion marking data of the children and training to obtain an emotion recognition model of the children;
correspondingly, the emotion feature determination submodule is specifically configured to input the theme words and/or the emotion words into the child emotion recognition model of the recognition model module, and recognize the theme words and/or the emotion words to obtain the emotion features of the child user.
12. The apparatus of claim 10, wherein the determining module comprises:
a voice feature determination submodule, configured to determine at least one of the following voice characteristics in the voice interaction information: voice strength, speech speed, and intonation.
13. The apparatus of claim 10, wherein the determining module comprises:
the first degree value determining submodule is used for determining the average value of the voice features corresponding to the voice interaction information by taking the whole voice interaction information as a statistical object; and determining the degree value of the emotional characteristic of the child user according to the average value of the voice characteristic.
14. The apparatus of claim 10, wherein the determining module comprises:
the second degree value determining submodule is used for determining the voice characteristics corresponding to the voice interaction information, and the voice characteristics corresponding to the voice interaction information comprise the voice characteristics of each theme word and/or each emotion word in the voice interaction information; according to the weighted values of different parts of speech, carrying out weighted calculation on the voice features corresponding to the voice interaction information to obtain a weighted average value of the voice features corresponding to the voice interaction information; and determining the degree value of the emotional characteristic of the child user according to the weighted average value of the voice characteristic corresponding to the voice interaction information.
15. The apparatus of claim 10, wherein the response policy comprises: a conversation negotiation answering mode and/or an audio resource playing mode; the determining module includes:
the first determining sub-module is used for determining that the response strategy is the conversation negotiation response mode when the emotional characteristics of the child user are negative emotional characteristics and the degree value of the emotional characteristics of the child user exceeds a preset threshold value; or determining that the response strategy is to respond in the dialogue negotiation response mode first and then respond in the audio resource playing mode.
16. The apparatus of claim 10, wherein the determining module further comprises:
a user representation determination sub-module for determining a user representation of the child user; the user portrait comprises at least one of the following characteristics, namely attribute information of a child user, historical interaction records of the child user, habitual expressions of the child user, work and rest rules of the child user, audio resources preferred by the child user, and an association relationship between a geographic position and the child user;
and the optimization sub-module is used for optimizing the response strategy according to the user portrait of the child user determined by the user portrait determination sub-module.
17. The apparatus of claim 16, further comprising:
the acquisition module is used for acquiring time information and/or place information of the received voice interaction information;
the determining module further comprises:
the scene determining submodule is used for determining the current scene of the child user based on the user portrait according to the time information and/or the place information;
and the optimization submodule is also used for optimizing the response strategy according to the current scene.
18. The apparatus of claim 10, further comprising:
and the generating module is used for generating the emotion analysis report of the child user according to a preset period.
19. An electronic device, comprising: a processor; a memory; and a program; wherein the program is stored in the memory and configured to be executed by the processor, the program comprising instructions for performing the method of any of claims 1-9.
20. An electronic-device-readable storage medium characterized in that the electronic-device-readable storage medium stores a program that causes an electronic device to execute the method of any one of claims 1 to 9.
CN201810290987.3A 2018-03-30 2018-03-30 Interaction method and device based on child emotion Active CN108536802B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810290987.3A CN108536802B (en) 2018-03-30 2018-03-30 Interaction method and device based on child emotion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810290987.3A CN108536802B (en) 2018-03-30 2018-03-30 Interaction method and device based on child emotion

Publications (2)

Publication Number Publication Date
CN108536802A CN108536802A (en) 2018-09-14
CN108536802B true CN108536802B (en) 2020-01-14

Family

ID=63482290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810290987.3A Active CN108536802B (en) 2018-03-30 2018-03-30 Interaction method and device based on child emotion

Country Status (1)

Country Link
CN (1) CN108536802B (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344229A (en) * 2018-09-18 2019-02-15 深圳壹账通智能科技有限公司 Method, apparatus, computer equipment and the storage medium of dialog analysis evaluation
CN109243582A (en) * 2018-09-19 2019-01-18 江苏金惠甫山软件科技有限公司 The human-computer interaction motion management method and system of knowledge based graphical spectrum technology
CN109408708A (en) * 2018-09-25 2019-03-01 平安科技(深圳)有限公司 Method, apparatus, computer equipment and the storage medium that user recommends
CN109670166A (en) * 2018-09-26 2019-04-23 平安科技(深圳)有限公司 Collection householder method, device, equipment and storage medium based on speech recognition
CN111048075A (en) * 2018-10-11 2020-04-21 上海智臻智能网络科技股份有限公司 Intelligent customer service system and intelligent customer service robot
CN109473122A (en) * 2018-11-12 2019-03-15 平安科技(深圳)有限公司 Mood analysis method, device and terminal device based on detection model
CN109547332B (en) * 2018-11-22 2022-05-13 腾讯科技(深圳)有限公司 Communication session interaction method and device, and computer equipment
CN109933782B (en) * 2018-12-03 2023-11-28 创新先进技术有限公司 User emotion prediction method and device
CN109935242A (en) * 2019-01-10 2019-06-25 上海言通网络科技有限公司 Formula speech processing system and method can be interrupted
CN109871807B (en) * 2019-02-21 2023-02-10 百度在线网络技术(北京)有限公司 Face image processing method and device
CN109754810A (en) * 2019-02-21 2019-05-14 珠海格力电器股份有限公司 Voice control method and device, storage medium and air conditioner
CN109885277A (en) * 2019-02-26 2019-06-14 百度在线网络技术(北京)有限公司 Human-computer interaction device, mthods, systems and devices
JP7439826B2 (en) * 2019-04-16 2024-02-28 ソニーグループ株式会社 Information processing device, information processing method, and program
CN110264791A (en) * 2019-05-30 2019-09-20 合肥阿拉丁智能科技有限公司 Wrist-watch robot automtion autonomous operation system
CN110444229A (en) * 2019-06-17 2019-11-12 深圳壹账通智能科技有限公司 Communication service method, device, computer equipment and storage medium based on speech recognition
CN110502609B (en) * 2019-07-11 2022-03-08 江苏心涧康科技有限公司 Method and device for adjusting emotion and accompanying robot
CN110246519A (en) * 2019-07-25 2019-09-17 深圳智慧林网络科技有限公司 Emotion identification method, equipment and computer readable storage medium
CN112329431B (en) * 2019-08-01 2023-07-04 中国移动通信集团上海有限公司 Audio and video data processing method, equipment and storage medium
CN111143529A (en) * 2019-12-24 2020-05-12 北京赤金智娱科技有限公司 Method and equipment for carrying out conversation with conversation robot
CN111696556B (en) * 2020-07-13 2023-05-16 上海茂声智能科技有限公司 Method, system, equipment and storage medium for analyzing user dialogue emotion
CN112017668B (en) * 2020-10-30 2021-09-24 北京淇瑀信息科技有限公司 Intelligent voice conversation method, device and system based on real-time emotion detection
CN112463108B (en) * 2020-12-14 2023-03-31 美的集团股份有限公司 Voice interaction processing method and device, electronic equipment and storage medium
CN113693600A (en) * 2021-08-27 2021-11-26 安徽淘云科技股份有限公司 Robot and method for psychology analysis of children based on behavior psychology
CN114566145A (en) * 2022-03-04 2022-05-31 河南云迹智能技术有限公司 Data interaction method, system and medium
CN115460317A (en) * 2022-09-05 2022-12-09 西安万像电子科技有限公司 Emotion recognition and voice feedback method, device, medium and electronic equipment
CN118410797B (en) * 2024-07-02 2024-09-17 临沂大学 Child language and gas vocabulary recognition system and method based on corpus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103456314A (en) * 2013-09-03 2013-12-18 广州创维平面显示科技有限公司 Emotion recognition method and device
CN103593054A (en) * 2013-11-25 2014-02-19 北京光年无限科技有限公司 Question-answering system combining emotion recognition and output
CN104635574A (en) * 2014-12-15 2015-05-20 山东大学 Infant-oriented early-education accompanying and tending robot system
CN106683672A (en) * 2016-12-21 2017-05-17 竹间智能科技(上海)有限公司 Intelligent dialogue method and system based on emotion and semantics

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7983910B2 (en) * 2006-03-03 2011-07-19 International Business Machines Corporation Communicating across voice and text channels with emotion preservation

Also Published As

Publication number Publication date
CN108536802A (en) 2018-09-14

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20210518

Address after: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing

Patentee after: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

Patentee after: Shanghai Xiaodu Technology Co.,Ltd.

Address before: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing

Patentee before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.