CN109979455A - A kind of dialect phonetic AI control method, device and terminal - Google Patents

A kind of dialect phonetic AI control method, device and terminal

Info

Publication number
CN109979455A
CN109979455A (application CN201910268128.9A)
Authority
CN
China
Prior art keywords
control
speech
intertranslation
module
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910268128.9A
Other languages
Chinese (zh)
Inventor
刘吉林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shangke Decoration Technology Co Ltd
Original Assignee
Shenzhen Shangke Decoration Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shangke Decoration Technology Co Ltd filed Critical Shenzhen Shangke Decoration Technology Co Ltd
Priority to CN201910268128.9A, published as CN109979455A
Publication of CN109979455A
Priority to CN201910625720.XA, published as CN110379421A
Legal status: Pending (current)


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16: Sound input; Sound output
    • G06F 3/167: Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00: Speech recognition
    • G10L 15/005: Language recognition
    • G10L 15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063: Training
    • G10L 15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00
    • G10L 25/48: Speech or voice analysis techniques specially adapted for particular use
    • G10L 25/51: Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L 2015/223: Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to the field of intelligent voice control and discloses a dialect speech AI control method, device and terminal. The dialect speech AI control method comprises: obtaining a wake-up word and a control instruction; entering a language learning state according to the control instruction; performing speech reproduction from voice information acquired multiple times; analyzing the reproduced language information and inter-translating it with a mainstream language; and controlling a smart product according to the resulting translation entries. The method enables a dialect to be learned, and smart products to be controlled, without a network connection.

Description

A kind of dialect phonetic AI control method, device and terminal
Technical field
The present invention relates to the field of intelligent voice control, and in particular to a dialect speech AI control method, device and terminal.
Background art
Artificial intelligence (AI) is a new technical science that studies and develops theories, methods, techniques and application systems for simulating, extending and expanding human intelligence. As a branch of computer science, it attempts to understand the essence of intelligence and to produce new intelligent machines that can respond in ways similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing and expert systems.
Speech recognition is the process of converting language and voice into information that can be processed. There are currently about 6,000 to 10,000 languages in the world, and most dialects have no written form, which makes it difficult to record such speech completely. The prior art has the following defects: AI voice intelligence requires standard Mandarin for control and cannot be used when the Mandarin is non-standard, so it is easily dismissed as pseudo-intelligence; and some speech recognition devices depend on big data from a networked cloud speech database and become unusable once the network connection is lost.
To solve the above problems, the present invention provides a dialect speech AI control method, device and terminal that learn a dialect and control smart products without a network connection.
Summary of the invention
The technical problem solved by the present invention is to provide a dialect speech AI control method, device and terminal. The dialect speech AI control method, device and terminal learn a dialect and control smart products without a network connection.
In order to solve the above technical problem, the present invention provides the following technical solution:
A dialect speech AI control method, comprising:
obtaining a wake-up word and a control instruction;
entering a language learning state according to the control instruction;
performing speech reproduction from voice information acquired multiple times;
analyzing the reproduced language information and inter-translating it with a mainstream language;
controlling a smart product according to translation entries.
Preferably, when acquiring the control instruction, if no valid instruction is input within a specified time, the system automatically returns to the waiting-for-wake-up state. Likewise, if a wake-up word is obtained but no control instruction follows, the system automatically returns to the waiting-for-wake-up state. This saves energy while keeping the interaction automatic and intelligent.
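The paragraph above describes a simple two-state flow: wait for the wake-up word, then wait a limited time for a control instruction, otherwise fall back. A minimal sketch of that timeout logic is given below; the callable names (`listen_for_wake_word`, `listen_for_command`, `execute`) and the 10-second window are hypothetical placeholders, not part of the patent.

```python
import time

COMMAND_TIMEOUT_S = 10  # assumed value; the patent only says "a specified time"

def control_loop(listen_for_wake_word, listen_for_command, execute):
    """Waiting-for-wake-up state machine sketched from the description above."""
    while True:
        listen_for_wake_word()                 # block until the wake-up word is heard
        deadline = time.monotonic() + COMMAND_TIMEOUT_S
        while time.monotonic() < deadline:
            command = listen_for_command(timeout=deadline - time.monotonic())
            if command:                        # a valid instruction arrived in time
                execute(command)
                break
        # no valid instruction: fall through and return to waiting for the wake-up word
```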
Preferably, the speech reproduction is a complete recording of the user's speech by means of intelligent voice technology. Voice technology is a key technology in the computer field, covering automatic speech recognition and speech synthesis, and enabling computers to listen, see, speak and perceive. In the present invention, after the system acquires speech through a microphone, the voice information is recorded completely according to this voice technology.
Preferably, after the user's speech is recorded, the speech is analyzed; specifically, the phonetic structure, the syntactic structure, and the sound changes and tone sandhi of continuous speech are analyzed.
Preferably, the inter-translation is performed according to the speech analysis result and the correspondence between basic meaning units and the mainstream language. The advantage of translating on the basis of basic meaning units and the analysis result is that the translation result is not limited to a single syntactic expression, so the user has more freedom in phrasing a control instruction, and the instruction is recognized more reliably.
Preferably, a translation entry is generated when the inter-translation is completed. The translation entry is matched against control instructions, and the matched control instruction is called to control the smart product. A translation entry is the result of converting speech into text; once the text has been matched with a control instruction, the matched control instruction is called to control the smart product.
Preferably, the control mode includes three kinds: a sales-floor mode, a home mode and a kitchen mode. In the sales-floor mode, steady-state noise and dynamic noise are filtered out after the copied voice information is obtained. The control mode is divided into three cases to account for the different requirements of voice acquisition in different environments. In the sales-floor mode, the voice pickup distance is required to be short, and more complex noise processing is applied to the speech, including filtering out most steady-state noise, such as motors and fans, and filtering out dynamic household noise, such as running water and slamming doors.
In a sales-floor environment this scheme filters out surrounding speech, music and the like, so that the acquired instruction is more accurate. In the home mode, the environment is assumed to be quieter and the acquired voice instruction contains only simple noise, so only simple noise processing is performed, which speeds up recognition, while the required control range is larger.
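As a rough illustration of the three control modes, the sketch below models each mode as a noise-filtering profile; the filter lists are assumptions for illustration, apart from the examples the description names (motors, fans, running water, door slams) and the statement that the home mode uses only simple processing.

```python
from dataclasses import dataclass, field

@dataclass
class ModeProfile:
    """Noise handling applied after the copied voice information is obtained."""
    name: str
    steady_state_filters: list = field(default_factory=list)  # e.g. motors, fans
    dynamic_filters: list = field(default_factory=list)       # e.g. water, door slams

CONTROL_MODES = {
    "sales_floor": ModeProfile(
        "sales_floor",
        steady_state_filters=["motor", "fan"],
        dynamic_filters=["running_water", "door_slam", "music", "background_speech"],
    ),
    "home": ModeProfile("home", steady_state_filters=["fan"]),  # simple processing only
    "kitchen": ModeProfile("kitchen"),  # the patent names this mode but gives no parameters
}
```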
A dialect speech AI control device, comprising:
an instruction acquisition module, used to obtain a wake-up word and a control instruction;
a selection module, used to enter a language learning state according to the control instruction;
a speech reproduction module, used to perform speech reproduction from voice information acquired multiple times;
an inter-translation module, used to analyze the reproduced language information and inter-translate it with a mainstream language;
a control module, used to control a smart product according to translation entries.
Preferably, the instruction acquisition module is used to obtain the wake-up word and the control instruction. The wake-up word is used to switch on the control system, and the control word is used to input a control instruction into the system.
Preferably, the selection module is used to enter the language learning state according to the control instruction; language learning can be carried out after this state has been entered.
Preferably, the speech reproduction module is used to perform speech reproduction from the voice information acquired multiple times. The language information is duplicated by means of intelligent voice technology, and the speech is usually duplicated three times.
Preferably, the inter-translation module is used to analyze the reproduced language information and inter-translate it with the mainstream language. The speech analysis process is: after the user's speech is recorded, the speech is analyzed, specifically the phonetic structure, the syntactic structure, and the sound changes and tone sandhi of continuous speech. The inter-translation is performed according to the speech analysis result and the correspondence between basic meaning units and the mainstream language. The advantage of translating on the basis of basic meaning units and the analysis result is that the translation result is not limited to a single syntactic expression, so the user has more freedom in phrasing a control instruction, and the instruction is recognized more reliably.
Preferably, the control module is used to control the smart product according to the translation entries. A translation entry is the text generated by the inter-translation; the smart product is controlled by matching this text with a control instruction.
The present invention also provides a computer-readable storage medium storing computer program instructions adapted to be loaded by a processor and to execute the above dialect speech AI control method.
The present invention also provides a mobile terminal comprising a processor and a memory, the processor being configured to execute a program stored in the memory so as to implement the above dialect speech AI control method.
Compared with the prior art, the present invention has the following beneficial effects. The dialect speech AI control method learns a dialect and controls smart products without a network connection. Specifically, by duplicating and analyzing the dialect and inter-translating it with a mainstream language, the dialect is learned; once the dialect has been learned without a network, appliances and lights can be controlled in that dialect. No language needs to be recorded in advance; the local dialect is learned on the spot and the device can then be operated. This solves dialect-based control for people who do not speak standard Mandarin, for example local elderly people and children. The control method described herein requires no network, no app and no auxiliary terminal or platform tool; the product can be operated locally. Modularizing the product also lowers the development and production cost of adding AI voice to any product. No network is needed as a platform, so the method is suitable for stand-alone operation. The device can be used while it is still learning, which makes the intelligence more user-friendly.
Description of the drawings
The present invention will be further explained below with reference to the accompanying drawings and embodiments.
Fig. 1 is a flow diagram of the dialect speech AI control method of the present invention;
Fig. 2 is a structural diagram of the dialect speech AI control device of the present invention.
Detailed description of the embodiments
The present invention is further explained in detail below in conjunction with the accompanying drawings. The drawings are simplified schematic diagrams that only illustrate the basic flow of the invention, and therefore only show processes related to the present invention.
Embodiment 1
As shown in Fig. 1, the present invention provides a dialect speech AI control method, which specifically comprises:
S1. obtaining a wake-up word and a control instruction;
S2. entering a language learning state according to the control instruction;
S3. performing speech reproduction from voice information acquired multiple times;
S4. analyzing the reproduced language information and inter-translating it with a mainstream language;
S5. controlling a smart product according to translation entries.
In step S1, a wake-up word and a control instruction are obtained. The voice control unit accepts three forms of instruction in total:
First: a wake-up word followed by a control instruction. This form is known as One Shot and is the usual and most convenient form of voice control on smart speech devices, for example the wake-up word followed by "turn on the light".
Second: a control instruction alone, for example "light off"; after the system has been woken up, only the control instruction needs to be spoken.
Third: a wake-up instruction alone, for example the wake-up word spoken twice, which wakes up the system; this is generally used as the wake-up mode.
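A minimal sketch of how the three instruction forms could be told apart once an utterance has been turned into text; the wake-up word value and the string-based splitting rule are illustrative assumptions, not the patent's implementation.

```python
WAKE_WORD = "xiaoke"  # hypothetical wake-up word, used only for illustration

def classify_utterance(text: str):
    """Return (woken, command) for the three instruction forms described above."""
    words = text.strip().lower()
    woken = False
    while words.startswith(WAKE_WORD):          # handles the wake-up word spoken once or twice
        woken = True
        words = words[len(WAKE_WORD):].strip(" ,")
    if woken and words:
        return True, words       # form 1: One Shot, wake-up word plus instruction
    if woken:
        return True, None        # form 3: wake-up word alone, just wake the system
    return False, words or None  # form 2: instruction alone (only valid after wake-up)
```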
In step S2, the language learning state is entered according to the control instruction. After the language learning state has been entered, the following requirements apply:
During dialect learning, the surrounding environment should be kept quiet; the mouth should preferably be about 1 meter from the microphone; the speaking rate should be slowed down, slightly slower than normal; and the words should be pronounced clearly and fully. During learning training, if the system plays "The learning instruction is not standard, please repeat the x-th learning pass of instruction xx", this indicates that the recording quality was not good enough or that the spoken content was shorter than three characters; the training speech content must be three characters or more. After learning training, if the recognition effect is poor, the training needs to be deleted and then relearned once.
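The requirements above translate into a simple validity check on each training utterance. The sketch below enforces only the length rule the description states (at least three characters), paraphrases the prompt wording, and leaves the recording-quality check as a caller-supplied flag; all names are placeholders.

```python
MIN_TRAINING_CHARS = 3

def accept_training_utterance(pass_no: int, instruction_name: str,
                              transcript: str, quality_ok: bool = True) -> bool:
    """Return True if the recording is kept, otherwise prompt the user to repeat the pass."""
    if quality_ok and len(transcript.replace(" ", "")) >= MIN_TRAINING_CHARS:
        return True
    print(f"The learning instruction is not standard, "
          f"please repeat pass {pass_no} of instruction {instruction_name}.")
    return False
```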
In step S3, speech reproduction is performed from the voice information acquired multiple times.
Preferably, the speech reproduction is a complete recording of the user's speech by means of intelligent voice technology. Voice technology is a key technology in the computer field, covering automatic speech recognition and speech synthesis, and enabling computers to listen, see, speak and perceive. In the present invention, after the system acquires speech through a microphone, the voice information is recorded completely according to this voice technology.
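One way to realise the "complete recording" step on a small device is to capture raw audio from the microphone several times. The sketch below uses the third-party `sounddevice` library, a 16 kHz sample rate, five-second takes and three repetitions; all of these are assumptions (the patent names no library, and the three repetitions are mentioned only in the device description).

```python
import sounddevice as sd  # third-party library, assumed available on the device

SAMPLE_RATE = 16_000   # Hz, a common rate for speech; not specified in the patent
TAKE_SECONDS = 5       # assumed length of one training utterance
N_TAKES = 3            # the device description suggests three repetitions

def record_takes():
    """Record the user's utterance several times and return the raw waveforms."""
    takes = []
    for i in range(N_TAKES):
        print(f"Recording take {i + 1} of {N_TAKES}...")
        audio = sd.rec(int(TAKE_SECONDS * SAMPLE_RATE),
                       samplerate=SAMPLE_RATE, channels=1)
        sd.wait()                      # block until the take is finished
        takes.append(audio.squeeze())  # 1-D array of samples
    return takes
```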
In step S4, the reproduced language information is analyzed and inter-translated with the mainstream language.
The speech analysis is as follows: preferably, after the user's speech is recorded, the speech is analyzed, specifically the phonetic structure, the syntactic structure, and the sound changes and tone sandhi of continuous speech. The phonetic structure includes the initial, the final and the tone of each syllable.
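For the phonetic-structure part of the analysis, the toy sketch below splits a romanised syllable into initial, final and tone using a fixed list of initials. It assumes the dialect syllables have already been romanised with a trailing tone digit, which the patent does not specify; it only illustrates the initial/final/tone decomposition named above.

```python
# Longest initials first so that "zh", "ch", "sh" match before "z", "c", "s".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]

def split_syllable(syllable: str):
    """Split e.g. 'guang3' into (initial, final, tone) = ('g', 'uang', 3)."""
    tone = int(syllable[-1]) if syllable[-1].isdigit() else 0
    body = syllable.rstrip("012345")
    for initial in INITIALS:
        if body.startswith(initial):
            return initial, body[len(initial):], tone
    return "", body, tone          # syllable with a zero initial, e.g. 'an4'
```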
The inter-translation is performed according to the speech analysis result and the correspondence between basic meaning units and the mainstream language. The advantage of translating on the basis of basic meaning units and the analysis result is that the translation result is not limited to a single syntactic expression, so the user has more freedom in phrasing a control instruction, and the instruction is recognized more reliably.
For example, the user may say any of many phrasings, with or without the wake-up word, such as "turn off all the lights", "switch all the lights off", "close all the lights", "lights all off", "turn every light off", "switch off every lamp", "all lamps off" and so on; all of these expressions are recognized as the instruction to turn off all the lights.
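A very small sketch of how many surface phrasings can collapse onto one canonical command once basic meaning units ("light", "all", "off") have been extracted. The keyword sets are invented for illustration and stand in for the dialect-to-mainstream correspondence built during learning.

```python
# Hypothetical meaning units extracted by the analysis step.
MEANING_UNITS = {
    "light": {"light", "lights", "lamp", "lamps"},
    "all":   {"all", "every", "whole"},
    "off":   {"off", "close", "closed", "shut"},
}

def normalise(utterance: str):
    """Map any phrasing containing the right meaning units to one canonical command."""
    text = utterance.lower()
    found = {unit for unit, words in MEANING_UNITS.items()
             if any(w in text for w in words)}
    if {"light", "all", "off"} <= found:
        return "TURN_OFF_ALL_LIGHTS"
    return None

# All of these collapse onto the same canonical command:
for phrase in ["turn off all the lights", "switch every lamp off", "lights all closed"]:
    assert normalise(phrase) == "TURN_OFF_ALL_LIGHTS"
```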
In step S5, the smart product is controlled according to the translation entries.
A translation entry is generated when the inter-translation is completed. The translation entry is matched against control instructions, and the matched control instruction is called to control the smart product. A translation entry is the result of converting speech into text; once the text has been matched with a control instruction, the matched control instruction is called to control the smart product.
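The matching step itself can be as simple as a lookup from translation entry (the text produced by the inter-translation) to a stored control instruction. The sketch below illustrates that lookup; the table contents and the actuator function are placeholders for whatever actually drives the smart product.

```python
def turn_off_all_lights():          # placeholder actuator
    print("All lights off.")

# Translation entry (text) -> control instruction (callable); filled in during learning.
INSTRUCTION_TABLE = {
    "TURN_OFF_ALL_LIGHTS": turn_off_all_lights,
}

def control_product(translation_entry: str) -> bool:
    """Call the control instruction matched by the translation entry, if any."""
    action = INSTRUCTION_TABLE.get(translation_entry)
    if action is None:
        return False                # no match: nothing is controlled
    action()
    return True
```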
Embodiment 2
The process of exiting learning is as follows:
If the system plays "Please say the x-th learning pass of instruction xxx" and the user does not speak, then after 30 seconds the system plays "Learning timed out, it will take effect after system restart" and automatically exits the learning mode. All successfully learned dialect instructions are retained and can still be recognized in the dialect.
Embodiment 3
Technical parameters of the system device:
The system device is powered through a 5 V USB port and has a microphone and a loudspeaker plugged in. About 8 seconds after power-on, the loudspeaker plays "xxx is ready".
The control mode includes three kinds: a sales-floor mode, a home mode and a kitchen mode. In the sales-floor mode, steady-state noise and dynamic noise are filtered out after the copied voice information is obtained. When the control mode is set to the home mode, the ambient noise should not exceed 60 decibels and the recognition distance should not exceed 5 meters; speech that stands out clearly above the noise can be recognized effectively. The user needs to face the microphone and must not lower the head or turn toward another direction.
When the control mode is set to the sales-floor mode, the recognition distance should not exceed 1 meter; speech that stands out clearly above the noise can be recognized effectively. The mouth should preferably face the microphone, without lowering the head or turning toward another direction.
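Collecting the stated operating limits in one place: the home mode allows up to 60 dB of ambient noise and a 5 m recognition distance, while the sales-floor mode is limited to 1 m. The sales-floor noise limit and all kitchen-mode values are not given in the description and are marked as unknown in the sketch below.

```python
# Operating limits per control mode, taken from the description where stated.
MODE_LIMITS = {
    #            max ambient noise (dB), max recognition distance (m)
    "home":        {"max_noise_db": 60,   "max_distance_m": 5},
    "sales_floor": {"max_noise_db": None, "max_distance_m": 1},    # noise limit not stated
    "kitchen":     {"max_noise_db": None, "max_distance_m": None}, # not specified
}

def within_limits(mode: str, noise_db: float, distance_m: float) -> bool:
    """Check measured conditions against the stated limits (None means unspecified)."""
    limits = MODE_LIMITS[mode]
    noise_ok = limits["max_noise_db"] is None or noise_db <= limits["max_noise_db"]
    distance_ok = limits["max_distance_m"] is None or distance_m <= limits["max_distance_m"]
    return noise_ok and distance_ok
```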
As shown in Fig. 2, the present invention provides a dialect speech AI control device, comprising:
Instruction acquisition module 1: used to obtain a wake-up word and a control instruction;
Selection module 2: used to enter a language learning state according to the control instruction;
Speech reproduction module 3: used to perform speech reproduction from voice information acquired multiple times;
Inter-translation module 4: used to analyze the reproduced language information and inter-translate it with a mainstream language;
Control module 5: used to control a smart product according to translation entries.
The instruction acquisition module 1 is used to obtain the wake-up word and the control instruction. The wake-up word is used to switch on the control system, and the control word is used to input a control instruction into the system.
The selection module 2 is used to enter the language learning state according to the control instruction; language learning can be carried out after this state has been entered.
The speech reproduction module 3 is used to perform speech reproduction from the voice information acquired multiple times. The language information is duplicated by means of intelligent voice technology, and the speech is usually duplicated three times.
The inter-translation module 4 is used to analyze the reproduced language information and inter-translate it with the mainstream language. The speech analysis process is: after the user's speech is recorded, the speech is analyzed, specifically the phonetic structure, the syntactic structure, and the sound changes and tone sandhi of continuous speech. Based on the analysis result and the correspondence between basic meaning units and the mainstream language, the inter-translation is performed between any text or speech of the dialect and the mainstream language. The advantage of translating on the basis of basic meaning units and the analysis result is that the translation result is not limited to a single syntactic expression, so the user has more freedom in phrasing a control instruction, and the instruction is recognized more reliably.
The control module 5 is used to control the smart product according to the translation entries. A translation entry is the language text generated by the inter-translation; the smart product is controlled by matching this text with a control instruction.
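To summarise the device of Fig. 2, the sketch below wires the five modules into one pipeline. Every class name, method name and placeholder return value is illustrative; it is a minimal sketch of the module structure described above, not the patent's implementation.

```python
class InstructionAcquisitionModule:                     # module 1
    def get(self):
        return True, "enter learning"                   # placeholder: (woken, instruction)

class SelectionModule:                                  # module 2
    def should_learn(self, instruction):
        return instruction == "enter learning"

class SpeechReproductionModule:                         # module 3
    def reproduce(self):
        return ["take-1", "take-2", "take-3"]           # placeholder recordings (S3)

class InterTranslationModule:                           # module 4
    def translate(self, recordings):
        return "TURN_OFF_ALL_LIGHTS"                    # placeholder translation entry (S4)

class ControlModule:                                    # module 5
    def control(self, entry):
        print(f"Executing control instruction matched by entry: {entry}")  # S5

class DialectControlDevice:
    """Wires the five modules of Fig. 2 into one pipeline (illustrative only)."""
    def __init__(self):
        self.acquire = InstructionAcquisitionModule()
        self.select = SelectionModule()
        self.reproducer = SpeechReproductionModule()
        self.translator = InterTranslationModule()
        self.controller = ControlModule()

    def handle(self):
        woken, instruction = self.acquire.get()            # S1
        if not woken or instruction is None:
            return                                         # back to waiting for the wake-up word
        if self.select.should_learn(instruction):          # S2
            recordings = self.reproducer.reproduce()       # S3
            entry = self.translator.translate(recordings)  # S4
        else:
            entry = instruction                            # an already-learned instruction
        self.controller.control(entry)                     # S5

DialectControlDevice().handle()
```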
The above detailed description illustrates possible embodiments of the present invention. The above embodiments do not limit the patent scope of the invention; all equivalent implementations or changes that do not depart from the present invention are intended to be included within the patent scope of this case.

Claims (11)

1. A dialect speech AI control method, characterized by comprising:
obtaining a wake-up word and a control instruction;
entering a language learning state according to the control instruction;
performing speech reproduction from voice information acquired multiple times;
analyzing the reproduced language information and inter-translating it with a mainstream language;
controlling a smart product according to translation entries.
2. The dialect speech AI control method according to claim 1, characterized in that when the control instruction is acquired, if no valid instruction is input within a specified time, the system automatically returns to the waiting-for-wake-up state.
3. The dialect speech AI control method according to claim 1, characterized in that the speech reproduction is a complete recording of the user's speech by means of intelligent voice technology.
4. The dialect speech AI control method according to claim 3, characterized in that after the user's speech is recorded, the speech is analyzed, specifically the phonetic structure, the syntactic structure, and the sound changes and tone sandhi of continuous speech, to generate a speech analysis result.
5. The dialect speech AI control method according to claim 1, characterized in that the inter-translation is performed according to the speech analysis result and the correspondence between basic meaning units and the mainstream language.
6. The dialect speech AI control method according to claim 5, characterized in that a translation entry is generated when the inter-translation is completed, the translation entry is matched against control instructions, and the matched control instruction is called to control the smart product.
7. The dialect speech AI control method according to claim 1, characterized in that the control mode includes three kinds: a sales-floor mode, a home mode and a kitchen mode.
8. The dialect speech AI control method according to claim 7, characterized in that in the sales-floor mode, steady-state noise and dynamic noise are filtered out after the copied voice information is obtained.
9. A dialect speech AI control device, comprising:
an instruction acquisition module, used to obtain a wake-up word and a control instruction;
a selection module, used to enter a language learning state according to the control instruction;
a speech reproduction module, used to perform speech reproduction from voice information acquired multiple times;
an inter-translation module, used to analyze the reproduced language information and inter-translate it with a mainstream language;
a control module, used to control a smart product according to translation entries.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer program instructions adapted to be loaded by a processor and to perform the method according to any one of claims 1 to 8.
11. A mobile terminal, characterized by comprising a processor and a memory, the processor being configured to execute a program stored in the memory so as to implement the method according to any one of claims 1 to 8.
CN201910268128.9A 2019-04-03 2019-04-03 A kind of dialect phonetic AI control method, device and terminal Pending CN109979455A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910268128.9A CN109979455A (en) 2019-04-03 2019-04-03 A kind of dialect phonetic AI control method, device and terminal
CN201910625720.XA CN110379421A (en) 2019-04-03 2019-07-11 A kind of dialect phonetic AI control method, device and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910268128.9A CN109979455A (en) 2019-04-03 2019-04-03 A kind of dialect phonetic AI control method, device and terminal

Publications (1)

Publication Number Publication Date
CN109979455A (en) 2019-07-05

Family

ID=67082943

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201910268128.9A Pending CN109979455A (en) 2019-04-03 2019-04-03 A kind of dialect phonetic AI control method, device and terminal
CN201910625720.XA Pending CN110379421A (en) 2019-04-03 2019-07-11 A kind of dialect phonetic AI control method, device and terminal

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201910625720.XA Pending CN110379421A (en) 2019-04-03 2019-07-11 A kind of dialect phonetic AI control method, device and terminal

Country Status (1)

Country Link
CN (2) CN109979455A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111354360A (en) * 2020-03-17 2020-06-30 北京百度网讯科技有限公司 Voice interaction processing method and device and electronic equipment

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1645363A (en) * 2005-01-04 2005-07-27 华南理工大学 Portable realtime dialect inter-translationing device and method thereof
CN101269638A (en) * 2008-04-10 2008-09-24 清华大学 Vehicle electrical apparatus sound control method based on command word list
US9697824B1 (en) * 2015-12-30 2017-07-04 Thunder Power New Energy Vehicle Development Company Limited Voice control system with dialect recognition
KR101836430B1 (en) * 2017-08-07 2018-03-08 고현선 Voice recognition and translation method and, apparatus and server therefor
CN107657950B (en) * 2017-08-22 2021-07-13 广州小鹏汽车科技有限公司 Automobile voice control method, system and device based on cloud and multi-command words
CN108172223A (en) * 2017-12-14 2018-06-15 深圳市欧瑞博科技有限公司 Voice instruction recognition method, device and server and computer readable storage medium
CN109410936A (en) * 2018-11-14 2019-03-01 广东美的制冷设备有限公司 Air-conditioning equipment sound control method and device based on scene
CN109360563B (en) * 2018-12-10 2021-03-02 珠海格力电器股份有限公司 Voice control method and device, storage medium and air conditioner


Also Published As

Publication number Publication date
CN110379421A (en) 2019-10-25

Similar Documents

Publication Publication Date Title
US11854527B2 (en) Electronic device and method of controlling speech recognition by electronic device
Gerosa et al. A review of ASR technologies for children's speech
RU2653283C2 (en) Method for dialogue between machine, such as humanoid robot, and human interlocutor, computer program product and humanoid robot for implementing such method
Kandali et al. Emotion recognition from Assamese speeches using MFCC features and GMM classifier
CN110148427A (en) Audio-frequency processing method, device, system, storage medium, terminal and server
CN107972028B (en) Man-machine interaction method and device and electronic equipment
CN106874265A (en) A kind of content outputting method matched with user emotion, electronic equipment and server
CN109036395A (en) Personalized speaker control method, system, intelligent sound box and storage medium
Michael Automated Speech Recognition in language learning: Potential models, benefits and impact
CN101357269A (en) Intelligent toy and use method thereof
US7222076B2 (en) Speech output apparatus
CN112837401B (en) Information processing method, device, computer equipment and storage medium
Catania et al. CORK: A COnversational agent framewoRK exploiting both rational and emotional intelligence
CN110379421A (en) A kind of dialect phonetic AI control method, device and terminal
KR20020060975A (en) System and method of templating specific human voices
CN1494053A (en) Speaking person standarding method and speech identifying apparatus using the same
CN112965603A (en) Method and system for realizing man-machine interaction
Kuljic et al. Mobile robot controlled by voice
CN112242134A (en) Speech synthesis method and device
WO2022057759A1 (en) Voice conversion method and related device
US20220230626A1 (en) Creative work systems and methods thereof
Cavalcante et al. Proof-of-concept evaluation of the mobile and personal speech assistant for the recognition of disordered speech
JP2001188788A (en) Device and method for processing conversation and recording medium
JP2009500679A (en) Communication method and communication device
CN115132204B (en) Voice processing method, equipment, storage medium and computer program product

Legal Events

Code Description
PB01: Publication
SE01: Entry into force of request for substantive examination
WD01: Invention patent application deemed withdrawn after publication (application publication date: 20190705)