CN109872726A - Pronunciation evaluating method, device, electronic equipment and medium - Google Patents

Info

Publication number
CN109872726A
Authority
CN
China
Prior art keywords
keyword
assessment
voice
pronunciation
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910234740.4A
Other languages
Chinese (zh)
Inventor
曾慧
宋征轩
徐燃
雷宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Rubo Technology Co Ltd
Original Assignee
Beijing Rubo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Rubo Technology Co Ltd filed Critical Beijing Rubo Technology Co Ltd
Priority to CN201910234740.4A priority Critical patent/CN109872726A/en
Publication of CN109872726A publication Critical patent/CN109872726A/en
Pending legal-status Critical Current

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

Embodiments of the invention disclose a pronunciation evaluation method, device, electronic equipment and medium. The method comprises: obtaining, in real time, the speech uttered by an assessment object; if a target keyword is recognized in the speech obtained in real time, matching the recognized target keyword against the standard keywords in an assessment sentence; if the value of the matching result is less than a confidence threshold, determining the currently recognized target keyword to be an invalid keyword, and continuing to recognize the target keywords that follow the invalid keyword in the speech obtained in real time; if the value of the matching result is greater than or equal to the confidence threshold, determining the currently recognized target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for that keyword. The embodiments address the lack of objectivity in the results that existing pronunciation evaluation methods produce for children, improve the objectivity of the pronunciation evaluation result, and make the assessment interaction more flexible.

Description

Pronunciation evaluating method, device, electronic equipment and medium
Technical field
Embodiments of the present invention relate to the technical field of intelligent education, and in particular to a pronunciation evaluation method, device, electronic equipment and medium.
Background art
Assessing a child's pronunciation gives a better understanding of the child's language ability, which plays an important role during the language-learning stage. For example, pronunciation evaluation can show whether a child pronounces words correctly, how well the child understands the language, and how well the child copes with complex language.
At present, in a speech assessment interaction, the user starts speaking according to the on-screen prompt after hearing the start tone played by the pronunciation evaluation product. When the assessment system detects the end point of the user's speech, or when speech acquisition times out, it stops collecting the user's speech. The collected speech is then compared, as a whole, with the assessment sentence template of the assessment system, and the user's pronunciation evaluation result is returned according to the comparison.
Because children's cognitive ability and self-control are relatively weak, they often cannot complete an assessment strictly as the assessment product requires, and various random events occur: for example, a child may skip unfamiliar words in the assessment sentence or say the words of the assessment sentence out of order. If the above scheme is still used, the assessment result for the child's pronunciation lacks objectivity. Moreover, given the characteristics of children, the assessment interaction under the above scheme lacks flexibility.
Summary of the invention
Embodiments of the present invention provide a pronunciation evaluation method, device, electronic equipment and medium, so as to improve the objectivity and accuracy of pronunciation evaluation results and make the assessment interaction more flexible.
In a first aspect, an embodiment of the invention provides a pronunciation evaluation method, the method comprising:
Obtaining, in real time, the speech uttered by an assessment object;
If a target keyword is recognized in the speech obtained in real time, matching the recognized target keyword against the standard keywords in an assessment sentence;
If the value of the matching result is less than a confidence threshold, determining the currently recognized target keyword to be an invalid keyword, and continuing to recognize the target keywords that follow the invalid keyword in the speech obtained in real time;
If the value of the matching result is greater than or equal to the confidence threshold, determining the currently recognized target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
In a second aspect, an embodiment of the invention further provides a pronunciation evaluation device, the device comprising:
A voice acquisition module, configured to obtain, in real time, the speech uttered by an assessment object;
A keyword matching module, configured to, if a target keyword is recognized in the speech obtained in real time, match the recognized target keyword against the standard keywords in an assessment sentence;
An invalid keyword determining module, configured to, if the value of the matching result is less than a confidence threshold, determine the currently recognized target keyword to be an invalid keyword, and continue recognizing the target keywords that follow the invalid keyword in the speech obtained in real time;
An effective keyword evaluation module, configured to, if the value of the matching result is greater than or equal to the confidence threshold, determine the currently recognized target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
In a third aspect, an embodiment of the invention further provides electronic equipment, comprising:
One or more processors;
A storage device for storing one or more programs,
Wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the pronunciation evaluation method described in any embodiment of the present invention.
In a fourth aspect, an embodiment of the invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the pronunciation evaluation method described in any embodiment of the present invention.
Embodiments of the present invention obtain the speech uttered by the assessment object in real time and recognize it in real time. As soon as a target keyword is recognized, it is matched against the standard keywords in the assessment sentence, and the assessment result is then determined from the matching result and the assessment object's pronunciation features. This amounts to looping recognition and matching keyword by keyword. It solves the problem that existing pronunciation evaluation methods produce results lacking objectivity for assessment objects with relatively weak self-control, improves the objectivity and accuracy of the pronunciation evaluation result, and makes the assessment interaction more flexible.
Brief description of the drawings
Fig. 1 is a flowchart of the pronunciation evaluation method provided by Embodiment One of the present invention;
Fig. 2 is a flowchart of another pronunciation evaluation method provided by Embodiment One of the present invention;
Fig. 3 is a flowchart of the pronunciation evaluation method provided by Embodiment Two of the present invention;
Fig. 4 is a structural schematic diagram of the pronunciation evaluation device provided by Embodiment Three of the present invention;
Fig. 5 is a structural schematic diagram of the electronic equipment provided by Embodiment Four of the present invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1 is a flowchart of the pronunciation evaluation method provided by Embodiment One of the present invention. This embodiment is applicable to pronunciation evaluation of assessment objects with relatively weak self-control, such as children. The method can be executed by a pronunciation evaluation device, which can be implemented in software and/or hardware and can be integrated into electronic equipment such as mobile terminals, smart home appliances and intelligent education products.
As shown in Figure 1, pronunciation evaluating method provided in this embodiment may include:
S110. Obtain, in real time, the speech uttered by the assessment object.
After the pronunciation evaluation function of the electronic equipment is activated, a voice acquisition device on the equipment, such as a microphone, collects the speech uttered by the assessment object in real time. Typically, during pronunciation evaluation the equipment displays the standard assessment sentence on its screen, and the assessment object pronounces the displayed content, for example by reading along.
S120. If a target keyword is recognized in the speech obtained in real time, match the recognized target keyword against the standard keywords in the assessment sentence.
Using speech recognition technology, the electronic equipment recognizes the incoming speech in real time and extracts the target keywords from it. The standard keywords are obtained by splitting the assessment sentence. In this embodiment, target keywords and standard keywords are words made up of at least one minimum language element, and what counts as a minimum language element depends on the language. In Chinese, the minimum language element is a single Chinese character, so a target keyword or standard keyword may be one character or a phrase of at least two characters. In English, the minimum language element is a single English word, so a target keyword or standard keyword may be one word or a phrase of at least two words. At least two target keywords, or at least two standard keywords, can form a complete sentence.
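The splitting of an assessment sentence into standard keywords can be sketched as follows. This is a minimal illustration rather than the method claimed in the source: the function name `split_into_standard_keywords` is hypothetical, and it assumes the simplest case in which every standard keyword is exactly one minimum language element (one Chinese character or one English word). Phrase-level keywords, which the source also allows, would need a segmentation step that is not shown here.

```python
import re


def split_into_standard_keywords(sentence):
    """Split an assessment sentence into minimum language elements.

    Assumption: each CJK ideograph is one element, each run of Latin
    letters (with apostrophes) is one element, punctuation is dropped.
    """
    # Either a single CJK ideograph or a run of Latin letters/apostrophes.
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z']+", sentence)


english_keywords = split_into_standard_keywords("You are beautiful")
chinese_keywords = split_into_standard_keywords("好习惯")
print(english_keywords, chinese_keywords)
```

The resulting list would be stored once per assessment sentence and reused for every matching step.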
For example, take a Chinese sentence meaning "Children should form good habits from an early age" as the standard assessment sentence for a child's pronunciation evaluation. After the child utters the first Chinese character (literally "small"), the electronic equipment obtains that speech, recognizes the character in it, and matches it against the standard keywords in the assessment sentence. As the child continues speaking and utters the second character (literally "friend"), the equipment again obtains the speech, recognizes the character, and matches it against the standard keywords. Recognition and matching continue for each character the child says, until the child has finished every word of the assessment sentence. Character-by-character recognition and matching is used here only as an illustration and should not be construed as a specific limitation of this embodiment. The child's pronunciation can also be assessed by recognizing and matching a mix of phrases and single characters. For example, after the child finishes the word "child", the equipment recognizes the word "child" in the current speech in real time and matches it against the standard keywords; as the child continues and finishes the next character (literally "want"), the equipment again recognizes that character and matches it against the standard keywords; the child can then go on to utter "from", "small", "form", "good" and "habit", and the equipment obtains and recognizes each utterance in real time. The child may utter speech containing the target keywords in any order and is not limited to the keyword pronunciation order defined by the normal word order of the assessment sentence.
Whether the target keyword obtained from a given piece of speech consists of one minimum language element or of at least two can be determined from the pause that follows each utterance of the assessment object. The electronic equipment has high recognition sensitivity and can distinguish such pauses.
If no target keyword is recognized in the speech obtained in real time, for example because the assessment object coughed or made another sound with no actual meaning, the current speech can be discarded and the next segment of the assessment object's speech obtained.
S130. If the value of the matching result is less than the confidence threshold, determine the currently recognized target keyword to be an invalid keyword, and continue recognizing the target keywords that follow the invalid keyword in the speech obtained in real time.
The matching result between a target keyword and a standard keyword is used to determine whether the assessment object is pronouncing the content of the assessment sentence, and can be expressed as a specific numerical value. A matching result below the confidence threshold indicates that the currently recognized target keyword does not belong to the content of the assessment sentence, i.e. it is an invalid keyword. The confidence threshold can be set according to the required matching precision.
Specifically, a target keyword can be matched against every standard keyword in the assessment sentence. Determining the currently recognized target keyword to be an invalid keyword when the matching result is below the confidence threshold may therefore proceed in either of two ways. In the first, the matching result between the currently recognized target keyword and each standard keyword is determined, and if every matching result is below the confidence threshold, the target keyword is determined to be an invalid keyword. In the second, the matching result between the currently recognized target keyword and each standard keyword is determined, the maximum of these results is found, and if that maximum is below the confidence threshold, the target keyword is determined to be an invalid keyword. If the current target keyword is an invalid keyword, it is ignored, and the equipment continues to obtain and recognize the assessment object's speech in real time, i.e. the operations of S110 to S120 are executed again. This prevents content that is not part of the assessment sentence (interfering speech) from disturbing the pronunciation evaluation, and makes the assessment interaction more flexible and user-friendly.
For example, as shown in Fig. 2, the complete assessment sentence is "You are beautiful". The electronic equipment splits this sentence into words, giving three standard keywords, "You", "are" and "beautiful", and stores them. After the pronunciation evaluation function is activated, the equipment obtains the speech "Apple" uttered by the assessment object and recognizes the target keyword "Apple" in real time. It matches "Apple" against the three standard keywords, and all three matching results are below the confidence threshold, so "Apple" is determined to be an invalid keyword. It is ignored, i.e. the word is not considered in the pronunciation evaluation, and the equipment continues to obtain the speech the assessment object utters after "Apple". Invalid keywords also include keywords that the equipment recognizes from speech uttered by anyone other than the assessment object, and keywords it recognizes from ambient sound.
In addition, if the currently recognized target keyword is determined to be an invalid keyword, i.e. it does not belong to the content of the assessment sentence, a pre-stored speech library can also be called. Based on the standard pronunciation features stored in the library for that target keyword, the assessment result of the target keyword is determined, and the assessment object is prompted that the current target keyword is not part of the assessment sentence. The assessment result can be presented as a numerical score, a pronunciation grade or a similar form; this embodiment does not specifically limit it.
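The maximum-based decision rule, applied to the "Apple" example, might look like the sketch below. The source does not say how the matching result is computed, so `match_score` here is a stand-in that uses text similarity from Python's `difflib`; the threshold value `0.8` and the function names are likewise assumptions made only for illustration.

```python
from __future__ import annotations

from difflib import SequenceMatcher

# Hypothetical value; the source only says the threshold is configurable.
CONFIDENCE_THRESHOLD = 0.8


def match_score(target: str, standard: str) -> float:
    """Stand-in matching result in [0, 1], based on text similarity."""
    return SequenceMatcher(None, target.lower(), standard.lower()).ratio()


def classify(target: str, standard_keywords: list[str],
             threshold: float = CONFIDENCE_THRESHOLD) -> tuple[bool, str | None]:
    """Return (is_effective, best_matching_standard_keyword)."""
    # Take the maximum matching result over all standard keywords.
    best = max(standard_keywords, key=lambda s: match_score(target, s))
    if match_score(target, best) < threshold:
        return False, None  # invalid keyword: ignore it and keep listening
    return True, best       # effective keyword: go on to score pronunciation


standard = ["You", "are", "beautiful"]
print(classify("Apple", standard))
print(classify("you", standard))
```

In a real system the matching result would more likely come from the speech recognizer's acoustic or language-model confidence than from string similarity; the comparison against the threshold would stay the same.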
S140. If the value of the matching result is greater than or equal to the confidence threshold, determine the currently recognized target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
A matching result greater than or equal to the confidence threshold indicates that the currently recognized target keyword belongs to the content of the assessment sentence, i.e. it is an effective keyword. Effective keywords can be determined in a way similar to invalid keywords: either by comparing the matching result between the currently recognized target keyword and each standard keyword with the confidence threshold, or by comparing the maximum of those matching results with the confidence threshold. If the current target keyword is an effective keyword, the assessment object's pronunciation features for that keyword can be compared with the standard pronunciation features of the keyword in the assessment sentence to determine its assessment result, i.e. the quality of the assessment object's pronunciation of the keyword, such as whether the pronunciation is accurate. While the assessment result of the effective keyword is being determined, the equipment can continue to obtain and recognize the assessment object's speech, i.e. the operations of S110 to S120 are executed again.
For example, continuing with Fig. 2, after the electronic equipment obtains the assessment object's pronunciation of "You", it recognizes the target keyword "You", determines by matching that "You" is an effective keyword, and then determines the assessment result of the word from the assessment object's pronunciation features for "You". When the assessment object has finished saying "You" and goes on to say "beautiful", the equipment, having recognized "You", continues to recognize the target keyword "beautiful" and determines whether "beautiful" is an effective keyword. This process is executed in a loop until the pronunciation evaluation ends.
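The loop over S110 to S140 can be sketched as a function that consumes recognized keywords one at a time. This is only an illustration under stated assumptions: the name `evaluate_stream` is hypothetical, the recognizer is replaced by a plain list of `(keyword_text, pronunciation_score)` pairs, and the score stands in for the comparison of pronunciation features, which the source does not specify. The matching rule is passed in as `is_match` so that any threshold-based matcher can be used.

```python
from __future__ import annotations

from typing import Callable, Iterable


def evaluate_stream(recognized: Iterable[tuple[str, float]],
                    standard_keywords: list[str],
                    is_match: Callable[[str, str], bool]) -> dict[str, float]:
    """Process recognized keywords in arrival order and collect results.

    `recognized` yields (keyword_text, pronunciation_score) pairs; the
    score is a placeholder for the pronunciation-feature comparison.
    """
    results: dict[str, float] = {}
    for text, score in recognized:
        # Find the standard keyword this target keyword matches, if any.
        hit = next((s for s in standard_keywords if is_match(text, s)), None)
        if hit is None:
            continue          # invalid keyword: ignore, keep listening
        results[hit] = score  # effective keyword: record its result
    return results


stream = [("Apple", 0.9), ("You", 0.85), ("beautiful", 0.7)]
results = evaluate_stream(stream, ["You", "are", "beautiful"],
                          lambda a, b: a.lower() == b.lower())
print(results)
```

Because each keyword is handled independently, the order in which "You" and "beautiful" arrive does not change the per-keyword results.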
It should be noted that matching the previous target keyword against the standard keywords does not affect the equipment's acquisition of the assessment object's current speech or its recognition of the current target keyword. As long as the assessment object's speech can be obtained in real time, real-time recognition and matching of target keywords continue. The assessment results of the individual target keywords can be fed back to the assessment object together after the assessment ends, or each result can be fed back during the assessment as soon as it is determined.
Optionally, the method further includes: if the number of effective keywords recognized is detected to equal the number of standard keywords in the assessment sentence, stopping the acquisition of the assessment object's speech. Using the number of effective keywords to decide when to stop acquiring speech, i.e. as one termination condition for the pronunciation evaluation, makes it possible to obtain the assessment object's pronunciation of every standard keyword and thus determine the pronunciation quality for each one. Compared with the prior art, this avoids results that lack objectivity because the assessment object did not finish the assessment sentence within the specified time, and it removes the time limit on the assessment object's speaking, which makes the assessment more flexible. This embodiment can also decide when to stop acquiring the assessment object's speech according to a preset assessment time, which can be set flexibly according to factors such as the assessment object's speaking rate and behavior during the assessment.
The technical solution of this embodiment obtains the speech uttered by the assessment object in real time and recognizes it in real time; as soon as a target keyword is recognized, it is matched against the standard keywords in the assessment sentence. If the current target keyword is determined to be an invalid keyword, it is ignored, and recognition continues with the target keywords that follow it in the speech obtained in real time. This effectively removes content that is not part of the assessment sentence, filters out the assessment object's valid speech input, and prevents such extraneous content from affecting the accuracy of the assessment result. If the current target keyword is determined to be an effective keyword, its assessment result is determined while acquisition and recognition of the assessment object's speech continue, until the assessment ends. This solves the problem that existing pronunciation evaluation methods produce results lacking objectivity for assessment objects with relatively weak self-control, reduces the influence that the pronunciation order of the keywords in the assessment sentence has on the result, broadens the applicability of the pronunciation evaluation method, improves the objectivity and accuracy of the pronunciation evaluation result, and makes the assessment interaction more flexible.
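The two termination conditions described above (keyword count reached, or a preset assessment time elapsed) can be combined in one check. The sketch below is an assumption-laden illustration: the function name `should_stop` and its parameters are hypothetical, and it counts distinct keywords, which is one possible reading of "the number of effective keywords" when a sentence repeats a word.

```python
from __future__ import annotations


def should_stop(effective_keywords: set[str], standard_keywords: list[str],
                elapsed_s: float, max_duration_s: float | None = None) -> bool:
    """Decide whether to stop capturing the assessment object's speech."""
    # Condition 1: every (distinct) standard keyword has been spoken.
    if len(effective_keywords) == len(set(standard_keywords)):
        return True
    # Condition 2 (optional): the preset assessment time has run out.
    return max_duration_s is not None and elapsed_s >= max_duration_s


standard = ["You", "are", "beautiful"]
print(should_stop({"You", "are"}, standard, elapsed_s=3.0))
print(should_stop({"You", "are", "beautiful"}, standard, elapsed_s=3.0))
```

Leaving `max_duration_s` as `None` corresponds to the case with no time limit on the assessment object's speaking.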
Embodiment two
Fig. 3 is a flowchart of the pronunciation evaluation method provided by Embodiment Two of the present invention. This embodiment further optimizes and extends the above embodiment. As shown in Fig. 3, the method may include:
S210. In a dual-mode (full-duplex) state based on echo cancellation technology, obtain, in real time, the speech uttered by the assessment object. Dual-mode refers to the mode in which the voice acquisition device and the assessment system prompt tone playing device work at the same time; the prompt tone playing device is used during the assessment interaction to prompt the assessment object to speak.
S220. If a target keyword is recognized in the speech obtained in real time, match the recognized target keyword against the standard keywords in the assessment sentence.
S230. If the value of the matching result is less than the confidence threshold, determine the currently recognized target keyword to be an invalid keyword, and continue recognizing the target keywords that follow the invalid keyword in the speech obtained in real time, i.e. execute the operations of S210 to S220 again.
S240. If the value of the matching result is greater than or equal to the confidence threshold, determine the currently recognized target keyword to be an effective keyword, and determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword. While the assessment result of the effective keyword is being determined, the equipment can continue to obtain and recognize the assessment object's speech, i.e. execute the operations of S210 to S220 again.
In this embodiment the assessment object's speech is obtained in real time in dual-mode. In effect, once the pronunciation evaluation function of the electronic equipment is activated, the voice acquisition device can collect the assessment object's speech at any moment, and the assessment object does not need to wait until the prompt tone playing device has played the prompt tone before starting to speak. This relaxes the constraint that the pronunciation evaluation places on when the assessment object starts speaking. This matters especially for children: even if a child cannot follow the prompt tone strictly and starts speaking before the prompt tone is played (jumping in early), the speech collected by the equipment will not be missing the part uttered before the prompt tone, so jumping in early does not leave the collected speech incomplete.
Likewise, because the voice acquisition device can collect speech at any moment, this embodiment places no specific limit on how long the assessment object speaks during the pronunciation evaluation. The point at which the assessment object finishes speaking is therefore also flexible, unlike the prior art, which requires the assessment object to finish within a set time in order to guarantee the accuracy of the result. In the prior art, if the assessment object pauses for a long time or hesitates during the evaluation, this not only wastes the allotted speaking time and affects the objectivity of the result, but the equipment may also mistake it for the end of the pronunciation and terminate the evaluation. Because this embodiment does not limit the speaking duration, it gives the assessment object ample time to speak; hesitations and pauses fall within the tolerance of the assessment and do not affect the accuracy or objectivity of the final pronunciation evaluation result.
In addition, based on the above dual-mode, echo cancellation signal processing can be introduced into the pronunciation evaluation process. With echo cancellation, system sounds such as prompt tones and guidance sounds emitted by the prompt tone playing device, once picked up by the voice acquisition device, are adaptively cancelled against the reference loop of the prompt tone playing device, so they do not become interfering sounds that affect the accuracy of the pronunciation evaluation result.
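The source does not name a specific echo cancellation algorithm. As one common way to realize "adaptive cancellation against a reference loop", the sketch below uses a normalized least-mean-squares (NLMS) adaptive filter; this is a swapped-in standard technique chosen for illustration, not the patent's stated method. The function name `nlms_echo_cancel`, the tap count, the step size and the synthetic two-tap echo path are all assumptions.

```python
import random


def nlms_echo_cancel(mic, ref, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the played-back echo from the mic signal."""
    w = [0.0] * taps  # adaptive filter weights (estimated echo path)
    out = []
    for n in range(len(mic)):
        # Most recent `taps` reference samples, newest first (zero-padded).
        x = [ref[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        echo_est = sum(wi * xi for wi, xi in zip(w, x))
        e = mic[n] - echo_est  # residual: mic minus estimated echo
        norm = sum(xi * xi for xi in x) + eps
        # NLMS update: step along the reference vector, scaled by its power.
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        out.append(e)
    return out


rng = random.Random(0)
# Reference loop: the prompt sound sent to the loudspeaker (synthetic noise here).
ref = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
# Microphone picks up the prompt through a hypothetical two-tap echo path.
mic = [0.6 * ref[n] + (0.3 * ref[n - 1] if n else 0.0) for n in range(len(ref))]
residual = nlms_echo_cancel(mic, ref)

tail_in = sum(m * m for m in mic[-200:])
tail_out = sum(e * e for e in residual[-200:])
print(tail_out < 0.01 * tail_in)
```

When the assessment object speaks over the prompt, the speech is not correlated with the reference signal, so it would remain in the residual while the prompt is suppressed; production systems usually add double-talk handling that this sketch omits.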
Optionally, the method further includes: determining a comprehensive pronunciation evaluation result for the assessment object according to the assessment result of each effective keyword, where the comprehensive result includes a comparison between the order in which the effective keywords were pronounced and the order of the corresponding standard keywords.
In this embodiment, once the assessment sentence is given, there is no strict requirement on the order in which the assessment object pronounces its keywords. The assessment object can pronounce the keywords in their order in the assessment sentence, pronounce them out of order, or skip an unfamiliar keyword and go straight to the keywords after it. Because this embodiment evaluates pronunciation keyword by keyword, as soon as the assessment object says a keyword, the pronunciation evaluation result for that keyword can be obtained, and the result for the full assessment sentence, i.e. the comprehensive pronunciation evaluation result, is then derived from the per-keyword results. Compared with prior-art methods that evaluate a whole sentence as a unit, this embodiment reduces the influence of keyword order on the assessment object's result, which improves the objectivity and accuracy of the result, makes it a more useful reference for understanding the assessment object's language ability, and improves the applicability and effectiveness of the evaluation. For children in particular, whose cognitive ability and self-control are relatively weak, this embodiment removes the constraints that existing methods impose: children can follow their nature and complete the pronunciation evaluation in any order, which makes the assessment interaction more flexible.
The technical solution of this embodiment first obtains the assessment object's speech in real time in a dual-mode state based on echo cancellation technology, which relaxes the constraint on when the assessment object starts speaking and makes the assessment interaction more flexible. It then recognizes the obtained speech in real time; as soon as a target keyword is recognized, it is matched against the standard keywords in the assessment sentence, and the assessment result is determined from the matching result and the assessment object's pronunciation features. This amounts to looping recognition and matching keyword by keyword. It solves the problem that existing pronunciation evaluation methods produce results lacking objectivity for assessment objects with relatively weak self-control, reduces the influence of keyword pronunciation order on the result, improves the objectivity and accuracy of the pronunciation evaluation result, and further increases the flexibility of the assessment interaction.
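A comprehensive result that combines per-keyword results with an order comparison could be assembled as in the sketch below. The source does not define how the per-keyword results are aggregated, so the mean score, the field names and the function name `comprehensive_result` are assumptions for illustration; the sketch also assumes each effective keyword is spoken once.

```python
from __future__ import annotations


def comprehensive_result(spoken: list[tuple[str, float]],
                         standard_keywords: list[str]) -> dict:
    """Combine per-keyword results into a sentence-level result.

    `spoken` lists (effective_keyword, score) in the order uttered.
    """
    scores = dict(spoken)
    spoken_order = [kw for kw, _ in spoken]
    # Order the same keywords would have in the assessment sentence.
    expected_order = [kw for kw in standard_keywords if kw in scores]
    return {
        "mean_score": sum(scores.values()) / len(scores) if scores else 0.0,
        "missing": [kw for kw in standard_keywords if kw not in scores],
        "spoken_order": spoken_order,
        "order_matches": spoken_order == expected_order,
    }


report = comprehensive_result([("beautiful", 0.5), ("You", 1.0)],
                              ["You", "are", "beautiful"])
print(report)
```

Here the skipped word "are" is reported as missing and the out-of-order pronunciation is flagged, while the per-keyword scores are unaffected by either.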
Embodiment three
Fig. 4 is a schematic structural diagram of the pronunciation evaluation device provided by Embodiment three of the present invention. This embodiment is applicable to pronunciation evaluation of assessment objects with relatively weak self-control, such as children. The device can be implemented in software and/or hardware and can be integrated into an electronic device such as an intelligent education product, for example an intelligent robot.
As shown in Fig. 4, the pronunciation evaluation device provided by this embodiment may include a voice acquisition module 310, a keyword matching module 320, an invalid keyword determining module 330, and an effective keyword evaluation module 340, in which:
The voice acquisition module 310 is used to acquire, in real time, the voice uttered by the assessment object;
The keyword matching module 320 is used to match a recognized target keyword against the standard keywords in the assessment sentence if a target keyword is recognized in the voice acquired in real time;
The invalid keyword determining module 330 is used to determine the currently recognized target keyword to be an invalid keyword if the value of the matching result is less than the confidence threshold, and to continue recognizing, based on the voice acquired in real time, the target keyword following the invalid keyword;
The effective keyword evaluation module 340 is used to determine the currently recognized target keyword to be an effective keyword if the value of the matching result is greater than or equal to the confidence threshold, and to determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
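As a minimal sketch of how the keyword matching, invalid keyword determination, and effective keyword evaluation modules might divide the work: the source specifies neither the matching measure nor the threshold value, so the string similarity from Python's `difflib`, the `CONFIDENCE_THRESHOLD` value of 0.8, and the function names below are all illustrative assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

# Assumed value for illustration; the source does not state a threshold.
CONFIDENCE_THRESHOLD = 0.8


def match_keyword(target: str, standard_keywords: List[str]) -> Tuple[str, float]:
    """Return the closest standard keyword and its match value in [0, 1].

    String similarity stands in for whatever matching measure the recognizer
    actually produces (an assumption, not the source's method).
    """
    def similarity(keyword: str) -> float:
        return SequenceMatcher(None, target, keyword).ratio()

    best = max(standard_keywords, key=similarity)
    return best, similarity(best)


def classify_keyword(
    target: str, standard_keywords: List[str]
) -> Tuple[str, Optional[str]]:
    """Label a recognized target keyword as 'effective' or 'invalid'."""
    best, value = match_keyword(target, standard_keywords)
    if value < CONFIDENCE_THRESHOLD:
        # Invalid keyword: discard it and keep recognizing later speech.
        return "invalid", None
    # Effective keyword: hand it on for pronunciation-feature scoring.
    return "effective", best
```

Under these assumptions, a slightly misrecognized word such as `"aple"` still maps to the standard keyword `"apple"`, while an unrelated word falls below the threshold and is ignored.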
Optionally, the device further includes:
A voice acquisition stopping module, used to stop acquiring the voice uttered by the assessment object if it is detected that the number of recognized effective keywords equals the number of standard keywords included in the assessment sentence.
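The stopping condition can be sketched as a loop over recognition results. This is only an illustration: the input format (pairs of matched standard keyword and match value), the default threshold, and the choice to count each standard keyword once are assumptions that the source does not spell out, and the standard keywords are assumed to be distinct.

```python
from typing import Iterable, List, Tuple


def collect_effective_keywords(
    recognized: Iterable[Tuple[str, float]],
    standard_keywords: List[str],
    confidence_threshold: float = 0.8,  # assumed value
) -> List[str]:
    """Consume (matched standard keyword, match value) pairs until every
    standard keyword has been recognized as an effective keyword."""
    effective: List[str] = []
    for keyword, value in recognized:
        if value < confidence_threshold:
            continue  # invalid keyword: keep listening
        if keyword in standard_keywords and keyword not in effective:
            effective.append(keyword)  # assumption: count each keyword once
        if len(effective) == len(standard_keywords):
            break  # all standard keywords covered: stop acquiring voice
    return effective
```

Because the loop breaks as soon as the counts agree, any speech that arrives after the last effective keyword is never consumed.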
Optionally, the voice acquisition module 310 is specifically used to:
Acquire, in real time, the voice uttered by the assessment object in a duplex mode based on echo cancellation, where the duplex mode is a mode in which the voice acquisition device and the assessment system prompt tone playing device work at the same time, and the assessment system prompt tone playing device is used to prompt the assessment object to speak during the assessment interaction.
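The source does not say which echo cancellation algorithm is used. As one common possibility, a normalized least-mean-squares (NLMS) adaptive filter can remove the prompt tone from the microphone signal while both devices run at once. The sketch below is an assumption-laden illustration in pure Python: the function name, tap count, and step size are hypothetical, and the test signal is simulated.

```python
import random
from typing import List, Sequence


def nlms_echo_cancel(
    mic: Sequence[float],
    prompt: Sequence[float],
    taps: int = 8,
    step: float = 0.5,
    eps: float = 1e-8,
) -> List[float]:
    """Subtract an adaptive estimate of the prompt-tone echo from the mic signal."""
    weights = [0.0] * taps  # estimated echo path
    history = [0.0] * taps  # most recent prompt samples, newest first
    residual = []
    for mic_sample, prompt_sample in zip(mic, prompt):
        history = [prompt_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        # What remains after subtraction: near-end speech plus unmodelled echo.
        error = mic_sample - echo_estimate
        energy = sum(x * x for x in history) + eps
        weights = [w + step * error * x / energy for w, x in zip(weights, history)]
        residual.append(error)
    return residual


# Simulated check: the microphone hears only a delayed, attenuated prompt tone.
rng = random.Random(0)
prompt = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [0.0, 0.0] + [0.6 * x for x in prompt[:-2]]
cleaned = nlms_echo_cancel(mic, prompt)
```

In this simulation the residual shrinks toward zero once the filter has adapted, so speech from the assessment object captured in the same signal would be what remains for the recognizer.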
Optionally, the device further includes:
A comprehensive assessment result determining module, used to determine the comprehensive pronunciation evaluation result of the assessment object according to the assessment result of each effective keyword, where the comprehensive pronunciation evaluation result includes a comparison of the pronunciation order of each effective keyword with that of the corresponding standard keyword.
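One way to picture the comprehensive result is as a summary that pairs an overall score with a position-by-position order comparison. The source does not define how per-keyword results are combined, so the plain average, the dictionary layout, and the function name below are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple


def comprehensive_result(
    standard_keywords: List[str],
    effective_results: List[Tuple[str, float]],
) -> Dict[str, object]:
    """Combine per-keyword results, given in spoken order, into one summary."""
    scores = dict(effective_results)
    spoken_order = [keyword for keyword, _ in effective_results]
    order_comparison = []
    for standard_position, keyword in enumerate(standard_keywords):
        # None marks a standard keyword the assessment object never said.
        spoken_position: Optional[int] = (
            spoken_order.index(keyword) if keyword in scores else None
        )
        order_comparison.append(
            {
                "keyword": keyword,
                "standard_position": standard_position,
                "spoken_position": spoken_position,
            }
        )
    # Assumption: the overall score is the mean of the per-keyword scores.
    overall = sum(scores.values()) / len(scores) if scores else 0.0
    return {"overall_score": overall, "order_comparison": order_comparison}
```

For example, if the assessment sentence has the keywords "I", "like", "apples" and the assessment object says "apples" and then "I", the summary records that "like" was skipped and that the other two were spoken in reverse order, without the order affecting the overall score.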
The pronunciation evaluation device provided by this embodiment of the present invention can perform the pronunciation evaluation method provided by any embodiment of the present invention, and has the functional modules and beneficial effects corresponding to that method. For details not described in this embodiment, refer to the description in any method embodiment of the present invention.
Embodiment four
Fig. 5 is a schematic structural diagram of an electronic device provided by Embodiment four of the present invention. Fig. 5 shows a block diagram of an exemplary electronic device 412 suitable for implementing embodiments of the present invention. The electronic device 412 shown in Fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 5, electronic device 412 takes the form of a general-purpose electronic device. Its components may include, but are not limited to: one or more processors 416, a storage device 428, a voice collection device 450, a sound playing device 452, and a bus 418 connecting the different system components (including storage device 428, processor 416, voice collection device 450, and sound playing device 452). Voice collection device 450 includes a microphone and is used to collect, in real time, the voice uttered by the assessment object; sound playing device 452 includes a loudspeaker and is used to play system prompt tones, such as a prompt tone asking the assessment object to speak.
Bus 418 represents one or more of several types of bus structures, including a storage device bus or storage device controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 412 typically includes a variety of computer-system-readable media. These media can be any available media that electronic device 412 can access, including volatile and non-volatile media and removable and non-removable media.
Storage device 428 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 430 and/or cache memory 432. Electronic device 412 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 434 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Fig. 5, commonly referred to as a "hard disk drive"). Although not shown in Fig. 5, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as may an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a Compact Disc Read-Only Memory (CD-ROM), a Digital Video Disc Read-Only Memory (DVD-ROM), or other optical media). In these cases, each drive may be connected to bus 418 through one or more data media interfaces. Storage device 428 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 440 having a set of (at least one) program modules 442 may be stored in, for example, storage device 428. Such program modules 442 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. Program modules 442 generally perform the functions and/or methods of the embodiments described in the present invention.
Electronic device 412 may also communicate with one or more external devices 414 (such as a keyboard, a pointing terminal, or display 424), with one or more terminals that enable a user to interact with electronic device 412, and/or with any terminal (such as a network card or modem) that enables electronic device 412 to communicate with one or more other computing terminals. Such communication may take place through input/output (I/O) interface 422. Electronic device 412 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through network adapter 420. As shown in Fig. 5, network adapter 420 communicates with the other modules of electronic device 412 through bus 418. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with electronic device 412, including but not limited to: microcode, terminal drivers, redundant processors, external disk drive arrays, Redundant Arrays of Independent Disks (RAID) systems, tape drives, and data backup storage systems.
Processor 416 runs the programs stored in storage device 428 to execute various functional applications and data processing, for example to implement the pronunciation evaluation method provided by any embodiment of the present invention. The method may include:
Acquiring, in real time, the voice uttered by the assessment object;
If a target keyword is recognized in the voice acquired in real time, matching the recognized target keyword against the standard keywords in the assessment sentence;
If the value of the matching result is less than the confidence threshold, determining the currently recognized target keyword to be an invalid keyword, and continuing to recognize, based on the voice acquired in real time, the target keyword following the invalid keyword;
If the value of the matching result is greater than or equal to the confidence threshold, determining the currently recognized target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
Embodiment five
Embodiment five of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program implements the pronunciation evaluation method provided by any embodiment of the present invention. The method may include:
Acquiring, in real time, the voice uttered by the assessment object;
If a target keyword is recognized in the voice acquired in real time, matching the recognized target keyword against the standard keywords in the assessment sentence;
If the value of the matching result is less than the confidence threshold, determining the currently recognized target keyword to be an invalid keyword, and continuing to recognize, based on the voice acquired in real time, the target keyword following the invalid keyword;
If the value of the matching result is greater than or equal to the confidence threshold, determining the currently recognized target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
The computer storage medium of the embodiments of the present invention may use any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or terminal. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to those embodiments; it may include further equivalent embodiments without departing from the inventive concept, and its scope is determined by the scope of the appended claims.

Claims (10)

1. A pronunciation evaluation method, characterized by comprising:
acquiring, in real time, the voice uttered by an assessment object;
if a target keyword is recognized in the voice acquired in real time, matching the recognized target keyword against the standard keywords in an assessment sentence;
if the value of the matching result is less than a confidence threshold, determining the currently recognized target keyword to be an invalid keyword, and continuing to recognize, based on the voice acquired in real time, the target keyword following the invalid keyword;
if the value of the matching result is greater than or equal to the confidence threshold, determining the currently recognized target keyword to be an effective keyword, and determining the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
2. The method according to claim 1, characterized in that the method further comprises:
if it is detected that the number of recognized effective keywords equals the number of standard keywords included in the assessment sentence, stopping acquiring the voice uttered by the assessment object.
3. The method according to claim 1 or 2, characterized in that acquiring, in real time, the voice uttered by the assessment object comprises:
acquiring, in real time, the voice uttered by the assessment object in a duplex mode based on echo cancellation, wherein the duplex mode is a mode in which a voice acquisition device and an assessment system prompt tone playing device work at the same time, and the assessment system prompt tone playing device is used to prompt the assessment object to speak during the assessment interaction.
4. The method according to claim 1, characterized in that the method further comprises:
determining the comprehensive pronunciation evaluation result of the assessment object according to the assessment result of each effective keyword, wherein the comprehensive pronunciation evaluation result includes a comparison of the pronunciation order of each effective keyword with that of the corresponding standard keyword.
5. A pronunciation evaluation device, characterized by comprising:
a voice acquisition module, used to acquire, in real time, the voice uttered by an assessment object;
a keyword matching module, used to match a recognized target keyword against the standard keywords in an assessment sentence if a target keyword is recognized in the voice acquired in real time;
an invalid keyword determining module, used to determine the currently recognized target keyword to be an invalid keyword if the value of the matching result is less than a confidence threshold, and to continue recognizing, based on the voice acquired in real time, the target keyword following the invalid keyword;
an effective keyword evaluation module, used to determine the currently recognized target keyword to be an effective keyword if the value of the matching result is greater than or equal to the confidence threshold, and to determine the assessment result of the effective keyword according to the assessment object's pronunciation features for the effective keyword.
6. The device according to claim 5, characterized in that the device further comprises:
a voice acquisition stopping module, used to stop acquiring the voice uttered by the assessment object if it is detected that the number of recognized effective keywords equals the number of standard keywords included in the assessment sentence.
7. The device according to claim 5 or 6, characterized in that the voice acquisition module is specifically used to:
acquire, in real time, the voice uttered by the assessment object in a duplex mode based on echo cancellation, wherein the duplex mode is a mode in which a voice acquisition device and an assessment system prompt tone playing device work at the same time, and the assessment system prompt tone playing device is used to prompt the assessment object to speak during the assessment interaction.
8. The device according to claim 5, characterized in that the device further comprises:
a comprehensive assessment result determining module, used to determine the comprehensive pronunciation evaluation result of the assessment object according to the assessment result of each effective keyword, wherein the comprehensive pronunciation evaluation result includes a comparison of the pronunciation order of each effective keyword with that of the corresponding standard keyword.
9. An electronic device, characterized by comprising:
one or more processors;
a storage device, used to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the pronunciation evaluation method according to any one of claims 1-4.
10. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the program implements the pronunciation evaluation method according to any one of claims 1-4.
CN201910234740.4A 2019-03-26 2019-03-26 Pronunciation evaluating method, device, electronic equipment and medium Pending CN109872726A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910234740.4A CN109872726A (en) 2019-03-26 2019-03-26 Pronunciation evaluating method, device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN109872726A true CN109872726A (en) 2019-06-11

Family

ID=66921325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910234740.4A Pending CN109872726A (en) 2019-03-26 2019-03-26 Pronunciation evaluating method, device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN109872726A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0619911B1 (en) * 1992-11-04 1997-06-04 The Secretary Of State For Defence In Her Britannic Majesty's Government Of The United Kingdom Of Great Britain And Children's speech training aid
CN1750121A (en) * 2004-09-16 2006-03-22 北京中科信利技术有限公司 A kind of pronunciation evaluating method based on speech recognition and speech analysis
CN102194454A (en) * 2010-03-05 2011-09-21 富士通株式会社 Equipment and method for detecting key word in continuous speech
CN103035244A (en) * 2012-11-24 2013-04-10 安徽科大讯飞信息科技股份有限公司 Voice tracking method capable of feeding back loud-reading progress of user in real time
CN104143328A (en) * 2013-08-15 2014-11-12 腾讯科技(深圳)有限公司 Method and device for detecting keywords
CN103680505A (en) * 2013-09-03 2014-03-26 安徽科大讯飞信息科技股份有限公司 Voice recognition method and voice recognition system
CN109273004A (en) * 2018-12-10 2019-01-25 苏州思必驰信息科技有限公司 Predictive audio recognition method and device based on big data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
朱志祥等: "《IP网络多媒体通信技术及应用》", 30 November 2007, 西安电子科技大学出版社 *
王勇: "基于点过程模型的连续语音关键词检测技术研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111370029A (en) * 2020-02-28 2020-07-03 北京一起教育信息咨询有限责任公司 Voice data processing method and device, storage medium and electronic equipment
CN111402895A (en) * 2020-06-08 2020-07-10 腾讯科技(深圳)有限公司 Voice processing method, voice evaluating method, voice processing device, voice evaluating device, computer equipment and storage medium
CN111863022A (en) * 2020-07-23 2020-10-30 中国科学技术大学 Children sound feature detection method based on special-shaped double-microphone array
CN111863022B (en) * 2020-07-23 2022-09-30 中国科学技术大学 Children sound feature detection method based on special-shaped double-microphone array
CN113421587A (en) * 2021-06-02 2021-09-21 网易有道信息技术(北京)有限公司 Voice evaluation method and device, computing equipment and storage medium
CN113421587B (en) * 2021-06-02 2023-10-13 网易有道信息技术(北京)有限公司 Voice evaluation method, device, computing equipment and storage medium
CN115691497A (en) * 2023-01-04 2023-02-03 深圳市大晶光电科技有限公司 Voice control method, device, equipment and medium

Similar Documents

Publication Publication Date Title
CN109872726A (en) Pronunciation evaluating method, device, electronic equipment and medium
US10152971B2 (en) System and method for advanced turn-taking for interactive spoken dialog systems
Barker et al. The PASCAL CHiME speech separation and recognition challenge
CN106057206B (en) Sound-groove model training method, method for recognizing sound-groove and device
CN105575386B (en) Audio recognition method and device
CN103021409B (en) A kind of vice activation camera system
Lee Prologue: talking organisation
US9361589B2 (en) System and a method for providing a dialog with a user
CN109102824B (en) Voice error correction method and device based on man-machine interaction
CN110600013A (en) Training method and device for non-parallel corpus voice conversion data enhancement model
EP2879062A2 (en) A system and a method for providing a dialog with a user
CN110175242B (en) Human-computer interaction association method, device and medium based on knowledge graph
CN109697981A (en) A kind of voice interactive method, device, equipment and storage medium
JP7060106B2 (en) Dialogue device, its method, and program
Williams et al. Demonstration of AT&T “Let's Go”: A production-grade statistical spoken dialog system
CN116821290A (en) Multitasking dialogue-oriented large language model training method and interaction method
CN109859773A (en) A kind of method for recording of sound, device, storage medium and electronic equipment
CN110164020A (en) Ballot creation method, device, computer equipment and computer readable storage medium
Cumbal et al. Detection of listener uncertainty in robot-led second language conversation practice
Möller et al. A corpus analysis of spoken smart-home interactions with older users
CN109712443A (en) A kind of content is with reading method, apparatus, storage medium and electronic equipment
CN113707128B (en) Test method and system for full duplex voice interaction system
CN109147419A (en) Language learner system based on incorrect pronunciations detection
Clift Discovering order
CN109255988A (en) Interactive learning methods based on incorrect pronunciations detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190611