CN111897977A - Intelligent voice entertainment system and method carried on child seat - Google Patents

Intelligent voice entertainment system and method carried on child seat

Info

Publication number
CN111897977A
Authority
CN
China
Prior art keywords
children
voice
child
entertainment
electrically connected
Prior art date
Legal status
Pending
Application number
CN202010519432.9A
Other languages
Chinese (zh)
Inventor
倪旭春
彭登富
黄守义
刘莹莹
Current Assignee
Huizhou Desay SV Automotive Co Ltd
Original Assignee
Huizhou Desay SV Automotive Co Ltd
Priority date
Filing date
Publication date
Application filed by Huizhou Desay SV Automotive Co Ltd
Priority to CN202010519432.9A
Publication of CN111897977A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/40 - Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F 16/43 - Querying
    • G06F 16/435 - Filtering based on additional data, e.g. user or group profiles
    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B 5/00 - Electrically-operated educational appliances
    • G09B 5/06 - Electrically-operated educational appliances with both visual and audible presentation of the material to be studied

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to an intelligent voice entertainment system carried on a child seat, which comprises a voice input end, a processing end electrically connected with the voice input end, a voice output end electrically connected with the processing end, a power supply end electrically connected with the processing end, and a server end wirelessly connected with the processing end. The voice input end comprises a microphone for receiving the child's sound information; the voice output end comprises a loudspeaker for playing voice information; the power supply end comprises a power socket providing the input of an external power supply. The processing end performs child voiceprint recognition, child user portrait and child preference analysis on the sound information from the microphone, generates entertainment content, requests program sources from the server end, amplifies the sound and controls the loudspeaker to play the entertainment program, or chats and communicates with the child by voice. The scheme provides rich and varied voice entertainment content tailored to each child's personality and preferences, can hold the child's attention for a long time, and reduces the child's interference with driving during the ride.

Description

Intelligent voice entertainment system and method carried on child seat
Technical Field
The invention relates to the field of children entertainment products, in particular to an intelligent voice entertainment system and method carried on a child seat.
Background
A child seat is designed to give a young child a safe riding position inside the car. Because young children are physically small, the seating space of a child seat is generally small and the child feels a strong sense of confinement; in addition, young children are active and full of energy, and being fixed in a child seat for a long time makes them restless, which is unfavorable to driving safety.
In order to reduce the influence of the child on the driving safety and improve the riding experience for the child, a plurality of entertainment products for the child sitting in the child seat are available on the market.
Products of this type fall roughly into two categories: products with video, which can play content such as animation, and audio products, which play preset content such as music and stories.
The existing child seat entertainment product has the following defects:
(1) products with video cannot be watched stably while the vehicle is running, and long-term use can adversely affect the child's visual development;
(2) audio products come with preset voice content of a single type; after long playing the content repeats heavily, the child quickly loses interest, and the goal of attracting the child's attention is not achieved;
(3) video entertainment products cannot be effectively and reasonably installed on the child seat and cannot be bundled with the child seat, which affects sales promotion;
(4) audio products mostly require additional operations such as pressing keys to switch on and off or change content, which increases the difficulty of use for children.
In order to solve the problems, the invention provides an intelligent voice entertainment system and method carried on a child seat.
Disclosure of Invention
The invention aims to solve the problems of existing entertainment products: video products cannot be watched stably while the vehicle is running, affect children's visual development after long use and cannot be effectively fixed on the child seat; audio products have a single type of preset voice content, the content repeats heavily after long playing, children quickly lose interest and their attention is not attracted, and additional operations are required, which increases the difficulty of use for children. The concrete solution is as follows:
an intelligent voice entertainment system carried on a child seat comprises a voice input end, a processing end electrically connected with the voice input end, a voice output end electrically connected with the processing end, a power supply end electrically connected with the processing end, and a server end wirelessly connected with the processing end. The voice input end comprises a microphone for receiving the child's sound information; the voice output end comprises a loudspeaker for playing voice information; the power supply end comprises a power socket providing the input of an external power supply. The processing end performs child voiceprint recognition, child user portrait and child preference analysis on the sound information from the microphone, generates entertainment content, requests program sources from the server end, amplifies the sound and controls the loudspeaker to play the entertainment program, or chats and communicates with the child by voice.
Further, the processing end comprises:
the first processor is used for controlling and processing each module of the whole system;
the power amplification module is electrically connected with the first processor and is used for amplifying the sound signal and electrically connected to the loudspeaker to make sound;
the first wireless communication module is electrically connected with the first processor and is used for wirelessly connecting the server end;
the memory is electrically connected with the first processor and is used for storing a system program, user characteristic information and a user use record;
the voiceprint recognition module is electrically connected with the first processor and is used for recognizing voiceprints, establishing voiceprint characteristics of a new user or loading voiceprint characteristics of an original user;
the user portrait module is electrically connected with the first processor and is used for portraying the characters and the preferences of the children as the user portrait so as to adopt corresponding strategies according to different types of the user portrait;
the voice conversion module is electrically connected with the first processor and is used for carrying out analog/digital or digital/analog conversion on the voice signal;
the voice instruction module is electrically connected with the first processor and used for converting the received voice into a corresponding control instruction and guiding the system operation;
the content generating module is electrically connected with the first processor and used for providing corresponding content strategies according to different children characters;
and the background service module is electrically connected with the first processor and is used for child character modeling and training, question and answer content operation and dialogue content operation.
Further, the server side includes:
the second processor is used for controlling the server to work and cooperating with the first processor to work;
the second wireless communication module is electrically connected with the second processor and is used for wireless communication connection with the first wireless communication module;
a program source library electrically connected to the second processor for providing entertainment content of the system.
Furthermore, the microphone is arranged on the side of the middle-upper portion of the backrest of the child seat, close to the child's mouth; the loudspeakers are respectively arranged on the two sides of the headrest of the child seat; the processing end is arranged at the bottom of the seat cushion of the child seat; and the power socket is arranged on one side of the seat cushion of the child seat.
The intelligent voice entertainment method based on the intelligent voice entertainment system carried on the child seat comprises the following steps:
step 1, a system is started through voice;
step 2, the system carries out simple greeting dialogue with children through voice;
step 3, the system collects the child's voiceprint in the conversation and checks whether this voiceprint already exists in the system; if yes, step 7 is executed, if no, the next step is executed;
step 4, if the system has no record of the child's voiceprint in the conversation, the child's identity information is asked for;
step 5, the system generates some dialogue contents required for depicting the portrait of the child user, and judges which personality type the child belongs to by using the personality model according to the contents;
step 6, after the system judges the character type of the child, entertainment content is generated according to the character type;
step 7, if user information which is consistent with the voiceprint of the child in the conversation exists in the system, generating entertainment content according to the user information;
step 8, the system automatically carries out statistical analysis on the preference of the children according to preset entertainment contents;
step 9, the system automatically collects the dialogue information in the steps 5, 6, 7 and 8, and updates the portrait of the child user;
step 10, the system automatically updates entertainment content according to the preference of children;
and step 11, in the starting or entertainment process, automatically recognizing the voice command to carry out system control.
The entertainment content includes any of the following: telling the child stories, providing correct answers to intellectual questions posed by the child, or having a normal conversation with the child.
Further, the calculation formula of the character model in step 5 includes a character calculation formula and a character result error calculation formula.
Further, the character calculation formula is: logit = [C]_(L×H) × [W]_(K×H)^T
wherein C represents the words spoken by the child, W represents the weight matrix, logit is a matrix with L rows and K columns, H is the number of layers of the model, and T denotes matrix transposition.
Further, the calculation formula of the character result error combines loss_start and loss_end into the total error loss (the exact expression is shown only as an image in the original document);
wherein loss is the total error of the character result, loss_start is the error between the beginning of the real character description and the operation result, and loss_end is the error between the end of the real character description and the operation result.
Further, the calculation method of the character model comprises the following steps:
step 1, according to the character calculation formula, starting the calculation with an initial weight matrix [W] and obtaining the result;
step 2, calculating the error according to the character result error formula, and repeating the calculation by a deep learning method until the weight matrix [W] with the minimum loss is generated; then
Step 3, obtaining a final output character model logit;
and 4, establishing three character classifications according to the specific application range of the system: listening type characters, question type characters, dialogue type characters.
Further, the statistical analysis of the preference of the children in step 8 is calculated according to the following preference formula:
P = w_t*t(x) + w_q*q(x) + w_s*s(x)
where P is the preference, t(x) is a statistical function of the number of accesses of the entertainment content, q(x) is a statistical function of the time of first access of the entertainment content after system start-up, s(x) is a statistical function of the access frequency of the entertainment content, and w_t, w_q and w_s are the weights of the corresponding terms.
In summary, the technical scheme of the invention has the following beneficial effects:
The invention solves the problems of existing entertainment products: video products cannot be watched stably while the vehicle is running, affect children's visual development after long use and cannot be effectively fixed on the child seat; audio products have a single type of preset voice content, the content repeats heavily after long playing, children quickly lose interest and their attention is not attracted, and additional operations are required, which increases the difficulty of use for children. This scheme applies artificial intelligence voice dialogue technology to a children's entertainment product and carries it on a child seat. Unlike common child seat entertainment systems, this scheme provides rich and varied voice entertainment content tailored to each child's personality and preferences, can hold the child's attention for a long time, improves the utilization rate of the product, and reduces the child's interference with driving during the ride. The scheme has the following advantages:
(1) the entertainment content is voice, so the child does not need to strain the eyes for long periods, and eyesight is not adversely affected.
(2) The entertainment content is customized for each child and essentially never repeats, so the child's sense of freshness is maintained, the time and interest the child devotes to it increase greatly, and the child's attention is strongly attracted.
(3) The system is controlled by natural language voice instructions, with no additional key or switch operations and no learning cost, which is very suitable for children.
(4) While the system is in use the child's attention is concentrated on participating in the entertainment, which relieves the discomfort of sitting in the child seat; at the same time the child does not disturb the driver during the ride, which improves driving safety.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments of the present invention will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a block diagram of a child seat of the present invention;
FIG. 2 is a block diagram of an intelligent audio entertainment system carried on a child seat in accordance with the present invention;
fig. 3 is a block diagram of a content generation module of the present invention.
Description of reference numerals:
1-voice input end, 2-processing end, 3-voice output end, 4-power end, 5-server end, 6-child seat, 7-child, 10-microphone, 20-first processor, 21-power amplifier module, 22-first wireless communication module, 23-memory, 24-voiceprint recognition module, 25-user portrait module, 26-voice conversion module, 27-voice instruction module, 28-content generation module, 29-background service module, 30-loudspeaker, 40-power socket, 50-second processor, 51-second wireless communication module, 52-program source library, 60-seat cushion, 61-backrest, 62-headrest.
Detailed Description
The technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, 2 and 3, an intelligent voice entertainment system carried on a child seat comprises a voice input end 1, a processing end 2 electrically connected with the voice input end 1, a voice output end 3 electrically connected with the processing end 2, a power supply end 4 electrically connected with the processing end 2, and a server end 5 wirelessly connected with the processing end 2. The voice input end 1 includes a microphone 10 for receiving the child's sound information (including speech information); the voice output end 3 includes a loudspeaker 30 for playing voice information; the power supply end 4 includes a power socket 40 providing the input of an external power supply (i.e., the car's direct current, which powers the processing end 2). The processing end 2 performs child voiceprint recognition, child user portrait and child preference analysis on the sound information (including speech information) from the microphone 10, generates entertainment content, requests program sources from the server end 5, amplifies the sound and controls the loudspeaker 30 to play the entertainment program (including the entertainment content), or chats and communicates with the child by voice. Preferably, the loudspeaker 30 is a dual loudspeaker in this embodiment.
Further, the processing end 2 includes:
the first processor 20 is used for controlling and processing each module of the whole system; (the first processor 20 is a high-performance single-chip microcomputer.)
The power amplification module 21 is electrically connected with the first processor 20, is used for amplifying the sound signal, and is electrically connected to the loudspeaker 30 to make sound;
a first wireless communication module 22 electrically connected to the first processor 20, for wirelessly connecting to the server 5;
a memory 23 electrically connected to the first processor 20 for storing system programs, user characteristic information (including user voiceprint, character type, preference, name, gender, age, hobbies, etc.), user usage records (including previously listened to entertainment content, length of entertainment time, questions asked, etc.);
a voiceprint recognition module 24 electrically connected to the first processor 20 for recognizing voiceprints, creating new user voiceprint features or loading original user voiceprint features; (the voiceprint recognition technology belongs to the prior art, and the specific working principle and working process thereof are not described in detail here.)
A user representation module 25 electrically connected to the first processor 20 for representing the child's personality, child's preferences as a user representation so as to employ corresponding strategies based on the different types of user representations;
(The portrait content includes the child's personality: each child has a different personality, some are lively, some are quiet, some love thinking, and so on, and traits such as liking to ask questions, liking to listen to stories or liking to chat continuously show up in the conversation. Portraying the child therefore provides accurate communication characteristics for each type of child and firmly holds the child's attention. The portrait content also includes the child's preferences, so that suitable content can be provided for different types of children.)
A voice conversion module 26 electrically connected to the first processor 20 for performing analog/digital or digital/analog conversion on the voice signal;
a voice instruction module 27 electrically connected to the first processor 20, for converting the received voice into a corresponding control instruction, and guiding the system operation; (the system does not need a remote controller and a key switch, and all uses sound control operation, and the control instruction comprises starting up, shutting down, turning up the volume, turning down the volume, and the like.)
A content generation module 28 electrically connected to the first processor 20 for providing a corresponding content (i.e., entertainment content) policy based on the different child personality;
(The structure of the content generation module 28 is shown in fig. 3 and mainly comprises voice information input → language understanding → state tracking → reply decision → language generation → voice information output. The system provides strategies corresponding to the three character types. (1) Listening type character strategy: a large number of entertaining stories with an early education function are prepared for the child; the type of story the child wants to hear is judged from the keywords extracted from the child's request, and the story with the highest matching degree is selected; if no content matches the request, or the content has been replayed too many times, new content is requested from the server end 5 through the first wireless communication module 22 and the second wireless communication module 51. (2) Questioning type character strategy: according to the knowledge question asked by the child, the correct answer is requested from the server end 5 through the first wireless communication module 22 and the second wireless communication module 51. (3) Dialogue type character strategy: according to the child's current sentence, an appropriate reply sentence is requested from the server end 5 through the first wireless communication module 22 and the second wireless communication module 51. A minimal illustrative sketch of this strategy dispatch appears after the background service module description below.)
The background service module 29 is electrically connected with the first processor 20 and is used for child character modeling and training, question-and-answer content operation and dialogue content operation. The background service module has three functions:
(1) character model modeling and training: sufficient corpus data are prepared for training according to the character analysis modeling method, a model with high accuracy is obtained, and the model is provided to the content generation module for real-time operation. At the same time, the children's dialogue sentences collected in actual operation are added to the training corpus for model iteration.
(2) knowledge question-and-answer content operation: a sufficiently large knowledge base (included in the program source library 52) is prepared in advance; upon receiving a question request, the knowledge base is searched and the correct answer is returned; if the question is not in the system, no answer is returned and the question is recorded for subsequent addition.
(3) dialogue content operation: a sufficiently large corpus of everyday dialogues (included in the program source library 52) is prepared in advance; after a dialogue-initiating sentence is received, the model starts the operation and returns the best matching dialogue response sentence.
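For illustration only, the following minimal sketch shows how the three character strategies and the background knowledge question-and-answer operation could be wired together; the in-memory dictionaries standing in for the program source library 52, the keyword matching and the function names are assumptions made for this sketch, not details specified in the patent.

```python
# Minimal sketch of the content-generation strategy dispatch; all data and matching
# logic below are illustrative stand-ins, not taken from the patent.
STORY_LIBRARY = {"dinosaur": "Once upon a time, a little dinosaur ...",
                 "princess": "In a faraway kingdom ..."}
KNOWLEDGE_BASE = {"why is the sky blue": "Because air scatters blue light more than red light."}
DIALOGUE_CORPUS = {"hello": "Hello! What would you like to do today?"}
UNANSWERED_QUESTIONS = []  # questions recorded for later addition, as the background service describes


def request_new_content_from_server(request: str) -> str:
    """Placeholder for the wireless request to the server end (5)."""
    return f"[new content fetched from the program source library for: {request}]"


def listening_strategy(request: str) -> str:
    """Pick the story whose keyword best matches the child's request."""
    for keyword, story in STORY_LIBRARY.items():
        if keyword in request.lower():
            return story
    return request_new_content_from_server(request)  # no match: ask the server end


def questioning_strategy(question: str) -> str:
    """Look the question up in the knowledge base; record it if unknown."""
    answer = KNOWLEDGE_BASE.get(question.lower().strip("?"))
    if answer is None:
        UNANSWERED_QUESTIONS.append(question)
        return request_new_content_from_server(question)
    return answer


def dialogue_strategy(sentence: str) -> str:
    """Return the best matching reply from the everyday-dialogue corpus."""
    reply = DIALOGUE_CORPUS.get(sentence.lower())
    return reply if reply is not None else request_new_content_from_server(sentence)


def generate_content(character_type: str, utterance: str) -> str:
    """Dispatch on the three character classifications defined by the system."""
    strategies = {"listening": listening_strategy,
                  "questioning": questioning_strategy,
                  "dialogue": dialogue_strategy}
    return strategies[character_type](utterance)


if __name__ == "__main__":
    print(generate_content("listening", "Tell me a dinosaur story"))
    print(generate_content("questioning", "Why is the sky blue?"))
    print(generate_content("dialogue", "hello"))
```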
Further, the server 5 includes:
a second processor 50 for controlling the operation of the server 5 and cooperating with the operation of the first processor 20;
a second wireless communication module 51 electrically connected to the second processor 50 for wireless communication connection with the first wireless communication module 22;
a program source library 52 electrically connected to the second processor 50 for providing entertainment content of the system.
Further, the microphone 10 is arranged on the side of the middle-upper portion of the backrest 61 of the child seat 6, close to the mouth of the child 7; the loudspeakers 30 are respectively arranged on the two sides of the headrest 62 of the child seat 6; the processing end 2 is arranged at the bottom of the seat cushion 60 of the child seat 6; and the power socket 40 is arranged on one side of the seat cushion 60 of the child seat 6.
The intelligent voice entertainment method based on the intelligent voice entertainment system carried on the child seat comprises the following steps:
step 1, a system is started through voice; (or awaken by voice)
Step 2, the system carries out simple greeting dialogue with children through voice;
step 3, the system collects the child's voiceprint in the conversation and checks whether this voiceprint already exists in the system;
if yes, step 7 is executed, if no, the next step is executed;
step 4, if the system has no record of the child's voiceprint in the conversation, inquiring the child's identity information (such as the child's name, nickname, age, where the child lives, and the like);
step 5, the system generates some dialogue content required for depicting the child user portrait, and judges which character type the child belongs to by using the character model according to that content; (the specific character model calculation is detailed in the explanation below)
Step 6, after the system judges the character type of the child (see the related explanation below), entertainment content is generated according to the character type;
step 7, if user information (namely user characteristic information including user voiceprints, character types, favorability, names, sexes, ages, hobbies and the like) which is consistent with the voiceprints of the children in the conversation exists in the system, generating entertainment content according to the user information;
step 8, the system automatically performs statistical analysis of the child's preferences according to the preset entertainment content (see the relevant description below);
step 9, the system automatically collects the dialogue information in steps 5, 6, 7 and 8 (including the delight, praise, exclamations and the like expressed by the child while listening to the entertainment content), and updates the child user portrait;
step 10, the system automatically updates entertainment content according to the preference of children;
and step 11, in the starting or entertainment process, automatically recognizing the voice command to carry out system control.
The entertainment content includes any of the following: telling the child stories, providing correct answers to intellectual questions posed by the child, or having a normal conversation with the child.
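For illustration only, the following minimal sketch strings steps 1-11 together in one interactive loop; every helper and data structure in it is a hypothetical stand-in for the modules of the processing end 2 (voiceprint recognition, character model, content generation), not an implementation given in the patent.

```python
# Illustrative end-to-end flow of the method (steps 1-11); all helpers below are
# hypothetical stand-ins, not an API from the patent.
users = {}  # voiceprint id -> {"name": ..., "character_type": ..., "preferences": {...}}


def recognize_or_enroll(voiceprint_id: str) -> dict:
    """Steps 3-7: reuse a stored profile, or build a new one from a short dialogue."""
    if voiceprint_id in users:                            # step 3: voiceprint already in the system
        return users[voiceprint_id]                       # step 7: use the stored user information
    name = input("What is your name? ")                   # step 4: ask for identity information
    answer = input("Do you like stories, questions or chatting? ")  # step 5: portrait dialogue
    character_type = ("listening" if "stor" in answer.lower()
                      else "questioning" if "question" in answer.lower()
                      else "dialogue")                    # stand-in for the character model
    users[voiceprint_id] = {"name": name, "character_type": character_type, "preferences": {}}
    return users[voiceprint_id]                           # step 6: content follows the character type


def session(voiceprint_id: str, rounds: int = 3) -> None:
    print("Hello! Nice to ride with you.")                # steps 1-2: wake-up and greeting by voice
    profile = recognize_or_enroll(voiceprint_id)
    for _ in range(rounds):
        content = f"[{profile['character_type']} content for {profile['name']}]"  # steps 6/7/10
        print(content)
        reaction = input("Child says: ")                  # step 9: collect dialogue information
        profile["preferences"][content] = profile["preferences"].get(content, 0) + 1  # step 8
        if "stop" in reaction.lower():                    # step 11: a voice command controls the system
            break


if __name__ == "__main__":
    session("voiceprint-001")
```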
The statistical analysis of the preference of the children is calculated according to the following preference degree formula:
P = w_t*t(x) + w_q*q(x) + w_s*s(x)
where P is the preference, t(x) is a statistical function of the number of accesses of the entertainment content, q(x) is a statistical function of the time of first access of the entertainment content after system start-up, s(x) is a statistical function of the access frequency of the entertainment content, and w_t, w_q and w_s are the weights of the corresponding terms.
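For illustration only, the following sketch evaluates the preference formula for one piece of entertainment content; the weight values and the example statistics are invented for the example, since the patent only defines the form of the formula.

```python
def preference(access_count: float, first_access_time: float, access_frequency: float,
               w_t: float = 0.5, w_q: float = 0.2, w_s: float = 0.3) -> float:
    """Compute P = w_t*t(x) + w_q*q(x) + w_s*s(x) for one piece of entertainment content.

    t(x), q(x) and s(x) are assumed to be already-computed statistics; the default
    weights are illustrative values, not taken from the patent.
    """
    return w_t * access_count + w_q * first_access_time + w_s * access_frequency


# Example: content requested 8 times, first requested 2 minutes after start-up,
# requested on 6 of the last 10 rides (all numbers invented for illustration).
p = preference(access_count=8, first_access_time=2, access_frequency=0.6)
print(f"preference score P = {p:.2f}")   # 0.5*8 + 0.2*2 + 0.3*0.6 = 4.58
```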
The calculation formula of the character model comprises a character calculation formula and a character result error calculation formula.
The character calculation formula is: logit = [C]_(L×H) × [W]_(K×H)^T
wherein C represents the words spoken by the child, W represents the weight matrix, logit is a matrix with L rows and K columns, H is the number of layers of the model, and T denotes matrix transposition. The value of K is 2.
The character result error combines loss_start and loss_end into the total error loss (the exact formula is shown only as an image in the original document);
wherein loss is the total error of the character result, loss_start is the error between the beginning of the real character description and the operation result, and loss_end is the error between the end of the real character description and the operation result.
The character model calculation method comprises the following steps:
step 1, according to the character calculation formula, starting the calculation with an initial weight matrix [W] and obtaining the result;
step 2, calculating the error according to the character result error formula, and repeating the calculation by a deep learning method (such as gradient descent training) until the weight matrix [W] with the minimum loss is generated; then
Step 3, obtaining a final output character model logit;
and 4, establishing three character classifications according to the specific application range of the system: listening type characters, question type characters, dialogue type characters.
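For illustration only, the following sketch implements the character calculation formula logit = [C]_(L×H) × [W]_(K×H)^T with K = 2 and a gradient descent loop; the use of cross-entropy for loss_start and loss_end and their averaging into the total loss are assumptions, since the error formula appears only as an image in the original document, and all numbers are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

L, H, K = 6, 4, 2                      # L utterance tokens, H model layers/features, K = 2 (per the patent)
C = rng.normal(size=(L, H))            # stand-in features of the words spoken by the child
start_target, end_target = 1, 4        # positions of the "real character description" span (invented)


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())
    return e / e.sum()


def forward(W: np.ndarray) -> np.ndarray:
    """Character calculation formula: logit = [C]_(LxH) @ [W]_(KxH)^T, an L x K matrix."""
    return C @ W.T


def loss_and_grad(W: np.ndarray):
    """Assumed error: cross-entropy on the start column and the end column, averaged."""
    logit = forward(W)                                   # L x K
    p_start, p_end = softmax(logit[:, 0]), softmax(logit[:, 1])
    loss_start = -np.log(p_start[start_target])
    loss_end = -np.log(p_end[end_target])
    loss = 0.5 * (loss_start + loss_end)                 # assumption: simple average of the two errors
    g = np.zeros_like(logit)                             # d loss / d logit
    g[:, 0] = 0.5 * p_start
    g[start_target, 0] -= 0.5
    g[:, 1] = 0.5 * p_end
    g[end_target, 1] -= 0.5
    return loss, g.T @ C                                 # d loss / d W  (K x H)


# Step 1: start from an initial weight matrix [W]; steps 2-3: gradient descent until loss is minimal.
W = rng.normal(size=(K, H))
for step in range(200):
    loss, grad = loss_and_grad(W)
    W -= 0.1 * grad
print(f"final loss = {loss:.4f}")        # the trained logit = C @ W.T is the output character model
```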
For a listening type character the system tells stories; for a questioning type character the system provides correct answers to the intellectual questions posed by the child; for a dialogue type character the system has normal conversations with the child. During long periods of entertainment the content keeps changing and new entertainment content is continuously generated, so the child's attention is held firmly, the utilization rate of the product is improved, and the child's interference with driving during the ride is reduced.
In summary, the technical scheme of the invention has the following beneficial effects:
The invention solves the problems of existing entertainment products: video products cannot be watched stably while the vehicle is running, affect children's visual development after long use and cannot be effectively fixed on the child seat; audio products have a single type of preset voice content, the content repeats heavily after long playing, children quickly lose interest and their attention is not attracted, and additional operations are required, which increases the difficulty of use for children. This scheme applies artificial intelligence voice dialogue technology to a children's entertainment product and carries it on a child seat. Unlike common child seat entertainment systems, this scheme provides rich and varied voice entertainment content tailored to each child's personality and preferences, can hold the child's attention for a long time, improves the utilization rate of the product, and reduces the child's interference with driving during the ride. The scheme has the following advantages:
(1) the entertainment content is voice, so the child does not need to strain the eyes for long periods, and eyesight is not adversely affected.
(2) The entertainment content is customized for each child and essentially never repeats, so the child's sense of freshness is maintained, the time and interest the child devotes to it increase greatly, and the child's attention is strongly attracted.
(3) The system is controlled by natural language voice instructions, with no additional key or switch operations and no learning cost, which is very suitable for children.
(4) While the system is in use the child's attention is concentrated on participating in the entertainment, which relieves the discomfort of sitting in the child seat; at the same time the child does not disturb the driver during the ride, which improves driving safety.
The above-described embodiments do not limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the above-described embodiments should be included in the protection scope of the technical solution.

Claims (10)

1. An intelligent voice entertainment system carried on a child seat, characterized in that: it comprises a voice input end (1), a processing end (2) electrically connected with the voice input end (1), a voice output end (3) electrically connected with the processing end (2), a power supply end (4) electrically connected with the processing end (2), and a server end (5) wirelessly connected with the processing end (2); the voice input end (1) comprises a microphone (10) for receiving the child's sound information, the voice output end (3) comprises a loudspeaker (30) for playing voice information, the power supply end (4) comprises a power socket (40) for providing the input of an external power supply, and the processing end (2) performs child voiceprint recognition, child user portrait and child preference analysis on the sound information from the microphone (10), generates entertainment content, requests program sources from the server end (5), amplifies the sound and controls the loudspeaker (30) to play the entertainment program, or chats and communicates with the child by voice.
2. The intelligent voice entertainment system carried on a child seat according to claim 1, wherein the processing terminal (2) comprises:
the first processor (20) is used for controlling and processing each module of the whole system;
the power amplification module (21) is electrically connected with the first processor (20) and is used for amplifying the sound signal and electrically connected to the loudspeaker (30) to make sound;
a first wireless communication module (22) electrically connected with the first processor (20) and used for wirelessly connecting the server end (5);
a memory (23) electrically connected to the first processor (20) for storing system programs, user characteristic information, user usage records;
a voiceprint recognition module (24) electrically connected to the first processor (20) for recognizing voiceprints, creating new user voiceprint features, or loading original user voiceprint features;
a user representation module (25) electrically connected to the first processor (20) for representing the child's personality, child's preferences as a user representation so as to employ corresponding strategies based on the different types of user representations;
a voice conversion module (26) electrically connected to the first processor (20) for performing analog/digital or digital/analog conversion on the voice signal;
the voice instruction module (27) is electrically connected with the first processor (20) and is used for converting the received voice into a corresponding control instruction and guiding the system operation;
a content generation module (28) electrically connected to the first processor (20) for providing corresponding content policies based on different child traits;
and the background service module (29) is electrically connected with the first processor (20) and is used for child character modeling and training, question and answer content operation and dialogue content operation.
3. The intelligent voice entertainment system carried on a child seat according to claim 2, wherein the server (5) comprises:
the second processor (50) is used for controlling the operation of the server end (5) and cooperating with the operation of the first processor (20);
a second wireless communication module (51) electrically connected to the second processor (50) for wireless communication connection with the first wireless communication module (22);
a program source library (52) electrically connected to the second processor (50) for providing entertainment content of the system.
4. The intelligent voice entertainment system of claim 3, wherein: the microphone (10) is arranged on the side of the middle-upper portion of the backrest (61) of the child seat (6), close to the mouth of the child (7); the loudspeakers (30) are respectively arranged on the two sides of the headrest (62) of the child seat (6); the processing end (2) is arranged at the bottom of the seat cushion (60) of the child seat (6); and the power socket (40) is arranged on one side of the seat cushion (60) of the child seat (6).
5. The intelligent voice entertainment method based on the intelligent voice entertainment system carried on the child seat as claimed in claim 4, is characterized by comprising the following steps:
step 1, a system is started through voice;
step 2, the system carries out simple greeting dialogue with children through voice;
step 3, the system collects the child's voiceprint in the conversation and checks whether this voiceprint already exists in the system; if yes, step 7 is executed, if no, the next step is executed;
step 4, inquiring the identity information of the child if the system does not have the voiceprint of the child in the conversation;
step 5, the system generates some dialogue contents required for depicting the portrait of the child user, and judges which personality type the child belongs to by using the personality model according to the contents;
step 6, after the system judges the character type of the child, entertainment content is generated according to the character type;
step 7, if user information which is consistent with the voiceprint of the child in the conversation exists in the system, generating entertainment content according to the user information;
step 8, the system automatically carries out statistical analysis on the preference of the children according to preset entertainment contents;
step 9, the system automatically collects the dialogue information in the steps 5, 6, 7 and 8, and updates the portrait of the child user;
step 10, the system automatically updates entertainment content according to the preference of children;
step 11, in the starting or entertainment process, automatically recognizing a voice command to carry out system control;
the entertainment content includes any of the following: telling the child stories, providing correct answers to intellectual questions posed by the child, or having a normal conversation with the child.
6. The intelligent voice entertainment method of claim 5, wherein: the calculation formula of the character model in step 5 comprises a character calculation formula and a character result error calculation formula.
7. The intelligent voice entertainment method of claim 6, wherein: the character calculation formula is: logit = [C]_(L×H) × [W]_(K×H)^T,
wherein C represents the words spoken by the child, W represents the weight matrix, logit is a matrix with L rows and K columns, H is the number of layers of the model, and T denotes matrix transposition.
8. The intelligent voice entertainment method of claim 7, wherein: the character result error combines loss_start and loss_end into the total error loss (the exact formula is shown only as an image in the original document);
wherein loss is the total error of the character result, loss_start is the error between the beginning of the real character description and the operation result, and loss_end is the error between the end of the real character description and the operation result.
9. The intelligent voice entertainment method of claim 8, wherein: the character model calculation method comprises the following steps:
step 1, according to the character calculation formula, starting the calculation with an initial weight matrix [W] and obtaining the result;
step 2, calculating the error according to the character result error formula, and repeating the calculation by a deep learning method until the weight matrix [W] with the minimum loss is generated; then
Step 3, obtaining a final output character model logit;
and 4, establishing three character classifications according to the specific application range of the system: listening type characters, question type characters, dialogue type characters.
10. The intelligent voice entertainment method according to claim 5, wherein the statistical analysis of the preference of the children in step 8 is calculated according to the following preference formula:
P = w_t*t(x) + w_q*q(x) + w_s*s(x)
where P is the preference, t(x) is a statistical function of the number of accesses of the entertainment content, q(x) is a statistical function of the time of first access of the entertainment content after system start-up, s(x) is a statistical function of the access frequency of the entertainment content, and w_t, w_q and w_s are the weights of the corresponding terms.
CN202010519432.9A 2020-06-09 2020-06-09 Intelligent voice entertainment system and method carried on child seat Pending CN111897977A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010519432.9A CN111897977A (en) 2020-06-09 2020-06-09 Intelligent voice entertainment system and method carried on child seat

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010519432.9A CN111897977A (en) 2020-06-09 2020-06-09 Intelligent voice entertainment system and method carried on child seat

Publications (1)

Publication Number Publication Date
CN111897977A true CN111897977A (en) 2020-11-06

Family

ID=73207292

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010519432.9A Pending CN111897977A (en) 2020-06-09 2020-06-09 Intelligent voice entertainment system and method carried on child seat

Country Status (1)

Country Link
CN (1) CN111897977A (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010031233A1 (en) * 2008-09-22 2010-03-25 Li Lili An intelligent toy and a using method thereof
CN103310664A (en) * 2013-05-15 2013-09-18 无锡北斗星通信息科技有限公司 Multifunctional talkback early education machine for children
CN105126355A (en) * 2015-08-06 2015-12-09 上海元趣信息技术有限公司 Child companion robot and child companioning system
CN108134876A (en) * 2017-12-21 2018-06-08 广东欧珀移动通信有限公司 Dialog analysis method, apparatus, storage medium and mobile terminal
CN109145204A (en) * 2018-07-27 2019-01-04 苏州思必驰信息科技有限公司 The generation of portrait label and application method and system
CN109167843A (en) * 2018-10-31 2019-01-08 贵州长江汽车有限公司 A kind of vehicle-mounted roars of laughter baby system and method based on big data
CN109719743A (en) * 2019-01-31 2019-05-07 广东星美灿照明科技股份有限公司 A kind of children education robot having home control function
CN209888704U (en) * 2019-05-06 2020-01-03 江苏欧思诺智能科技有限公司 Intelligent child automobile safety seat
KR20200051172A (en) * 2018-11-05 2020-05-13 글로벌사이버대학교 산학협력단 Emotion-based personalized news recommender system using artificial intelligence speakers
CN111179940A (en) * 2018-11-12 2020-05-19 阿里巴巴集团控股有限公司 Voice recognition method and device and computing equipment


Similar Documents

Publication Publication Date Title
CN109036388A (en) A kind of intelligent sound exchange method based on conversational device
CN105247609B (en) The method and device responded to language is synthesized using speech
Arimoto et al. Naturalistic emotional speech collection paradigm with online game and its psychological and acoustical assessment
CN101357269A (en) Intelligent toy and use method thereof
CN112735423B (en) Voice interaction method and device, electronic equipment and storage medium
CN109643550A (en) Talk with robot and conversational system and dialogue program
CN108986785B (en) Text recomposition method and device
CN116009748A (en) Picture information interaction method and device in children interaction story
CN112463108B (en) Voice interaction processing method and device, electronic equipment and storage medium
WO2020070923A1 (en) Dialogue device, method therefor, and program
CN117932012B (en) Application method of dialog system based on large language model of human being in industrial scene
CN206045390U (en) A kind of intelligent machine is accompanied and attended to Teddy bear
KR101967849B1 (en) Foreign language acquisition practice method through the combination of shadowing and speed listening based on the processes of mother language acquisition, apparatus and computer readable program medium thereof
CN114283820A (en) Multi-character voice interaction method, electronic equipment and storage medium
CN111897977A (en) Intelligent voice entertainment system and method carried on child seat
Lorenzetti et al. Going to “The Land of Drama”: Behavior management techniques in a kindergarten sociodramatic play residency
Utami et al. Speech Errors Produced by EFL Learners of Islamic Boarding School in Telling English Story
JP3958253B2 (en) Dialog system
US7359859B2 (en) Computer-based training system and method for enhancing language listening comprehension
Matsui et al. Music recommendation system driven by interaction between user and personified agent using speech recognition, synthesized voice and facial expression
Hahn Indications for direct, nondirect, and indirect methods in speech correction
CN213030397U (en) Intelligent pillow
CN201257294Y (en) Intelligent toy
JP2023053442A (en) Dialogue system, control method for dialogue system and computer program
Brueggeman et al. Speaker Trait Enhancement for Cochlear Implant Users: A Case Study for Speaker Emotion Perception.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination