Voice error correction method, terminal, and storage medium for voice song requests
Technical field
The embodiments of the present invention relate to music service technology, and more particularly to a voice error correction method, terminal, and storage medium for voice song requests.
Background art
As speech recognition technology continues to improve, its applications are becoming more widespread, and requesting songs by voice suits the usage scenarios of modern households. Compared with requesting songs on a touch screen, voice song requests remove the constraint of an interface and let users retrieve and play songs entirely according to personal habit. However, the complexity of human language greatly increases the difficulty of song requests: requests made in natural language must be handled flexibly and broadly in order to satisfy the differing speech patterns and request habits of various users.
Voice song requests depend on recognizing speech, and the accuracy of speech recognition strongly affects whether the song finally played meets the user's expectations. Existing voice error correction is typically carried out during speech recognition. For example, for a natural-language speech recognition result, the sentence is assessed, checked for errors, and corrected by analyzing in depth the syntactic information of its words (position, recognition stability), its semantic information (the intended meaning of the sentence), and its pragmatic information (contextual coherence), and an optimized sentence is finally output. This is voice error correction in the broad sense: it requires syntactic, semantic, and pragmatic analysis, is complex and time-consuming, and is not suited to voice song requests. At present, no voice error correction method has been proposed specifically for the voice song request process.
Summary of the invention
The present invention provides a voice error correction method, terminal, and storage medium for voice song requests, which can quickly correct errors in the speech recognition result of a voice song request, avoid music resource retrieval errors caused by speech recognition errors, and improve the success rate of the music service.
In a first aspect, an embodiment of the present invention provides a voice error correction method for voice song requests, comprising:
matching a speech recognition result against the information in a preset music dictionary, wherein the preset music dictionary stores attribute information of music resources and the correspondences between the attribute information;
obtaining, from the preset music dictionary, attribute information that matches the song information in the speech recognition result;
judging, according to the matched attribute information, whether the song information contains an error; and
if an error exists, correcting the song information according to the matched attribute information.
Further, matching the speech recognition result against the information in the preset music dictionary comprises:
receiving voice information input by a user;
performing speech recognition on the voice information to obtain the speech recognition result;
performing word segmentation on the speech recognition result to obtain at least one word; and
matching the at least one word against the information in the preset music dictionary.
Further, obtaining, from the preset music dictionary, the attribute information that matches the song information in the speech recognition result comprises:
obtaining, according to the text and the pinyin of the song information, the attribute information that matches the song information from the preset music dictionary.
Further, judging, according to the matched attribute information, whether the song information contains an error comprises:
when there is only one piece of song information, judging whether the matched attribute information includes information whose text exactly matches the song information;
if so, determining that the song information was recognized correctly; and
if not, determining that the song information contains an error.
Further, correcting the song information according to the matched attribute information comprises:
when there is only one piece of song information,
if there are multiple pieces of matched attribute information and none of them is an exact text match, calculating the similarity between each piece of matched attribute information and the song information, and correcting the song information to the piece with the highest similarity; and
if there is only one piece of matched attribute information and it is not an exact text match, correcting the song information to that matched attribute information.
Further, judging, according to the matched attribute information, whether the song information contains an error comprises:
when there are multiple pieces of song information, judging, for the current song information and according to the preset music dictionary, whether the attribute information matched by the current song information has a correspondence with the other song information already identified as correct;
if so, determining that the current song information was recognized correctly; and
if not, determining that the current song information contains an error.
Further, correcting the song information according to the matched attribute information comprises:
when there are multiple pieces of song information, correcting the erroneous song information according to the song information identified as correct, the attribute information matched by each piece of song information, and the correspondences between the attribute information.
Further, the method also comprises: updating the preset music dictionary according to updated music resources.
Further, after the song information is corrected according to the matched attribute information, the method also comprises: searching for the corresponding song according to the corrected song information.
In a second aspect, an embodiment of the present invention also provides a terminal, the terminal comprising:
one or more processors; and
a memory for storing one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the voice error correction method for voice song requests according to any embodiment of the present invention.
In a third aspect, an embodiment of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the voice error correction method for voice song requests according to any embodiment of the present invention.
By means of the preset music dictionary, the present invention can quickly correct errors in the speech recognition result of a voice song request, avoid music resource retrieval errors caused by speech recognition errors, and improve the success rate of the music service.
Brief description of the drawings
Fig. 1 is a flow chart of the voice error correction method for voice song requests provided by Embodiment one of the present invention;
Fig. 2 is a structural schematic diagram of a terminal provided by Embodiment three of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment one
Fig. 1 is a flow chart of the voice error correction method for voice song requests provided by Embodiment one of the present invention. This embodiment is applicable to voice song requests, and the method can be executed by a terminal with a data processing function. As shown in Fig. 1, the method comprises the following steps:
Step 110: match the speech recognition result against the information in the preset music dictionary, wherein the preset music dictionary stores attribute information of music resources and the correspondences between the attribute information.
The attribute information of a music resource may be the singer name, song title, album name, style, language, and so on. Correspondences exist between these pieces of attribute information, for example between a song and the singer who performs it, the album it belongs to, its style, and its language. The preset music dictionary therefore also stores the correspondences between the attribute information, and these correspondences can be consulted during error correction to obtain a more accurate result. The preset music dictionary can be built by integrating all existing music resources and their attribute information; the richer the dictionary, the more accurately the user's intention is understood. The preset music dictionary may be stored in the terminal or on a server.
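One possible in-memory layout for such a dictionary is sketched below. The embodiment does not prescribe a storage format, so the names `MusicResource` and `MusicDictionary`, the chosen fields, and the sample entries are all illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicResource:
    """One music resource and its attribute information (hypothetical schema)."""
    title: str
    singers: tuple
    album: str = ""
    style: str = ""
    language: str = ""


class MusicDictionary:
    """Stores attribute information and the correspondences between attributes."""

    def __init__(self, resources):
        self.resources = list(resources)
        # Index each attribute value back to the resources that carry it, so
        # correspondences such as singer <-> title can be looked up directly.
        self.by_attribute = defaultdict(list)
        for res in self.resources:
            for value in (res.title, res.album, *res.singers):
                if value:
                    self.by_attribute[value].append(res)

    def titles_by_singer(self, singer):
        """Return the sorted song titles recorded for a singer name."""
        found = self.by_attribute.get(singer, [])
        return sorted({r.title for r in found if singer in r.singers})


# Sample data for illustration only.
DICTIONARY = MusicDictionary([
    MusicResource("Dong Feng Po", ("Zhou Jielun",), language="Mandarin"),
    MusicResource("Liang Liang", ("Yang Zongwei", "Zhang Bichen"), language="Mandarin"),
])
```

Indexing by attribute value is one design choice among several; it keeps the lookup of correspondences cheap, at the cost of rebuilding the index whenever the resources change.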
Step 120: obtain, from the preset music dictionary, the attribute information that matches the song information in the speech recognition result.
Song information refers to information related to music resources and may be attribute information of a music resource, for example a song title, singer name, or album name. The matched attribute information may include exact matches and partial matches.
Step 130: judge, according to the matched attribute information, whether the song information contains an error.
Briefly, if the matched attribute information does not include information whose text exactly matches the song information, it can be determined that the song information contains an error. Error types include missing text, extra text, and text that has the same pinyin but different characters.
Step 140: if an error exists, correct the song information according to the matched attribute information.
If the song information contains no error, the corresponding song can be searched for directly according to the song information. If the song information contains an error, the corresponding song can be searched for according to the corrected song information once the correction is made, thereby completing the voice song request.
With the preset music dictionary, the technical solution of this embodiment can quickly correct errors in the speech recognition result of a voice song request, avoid music resource retrieval errors caused by speech recognition errors, and improve the success rate of the music service.
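Steps 110 to 140 can be sketched end to end as follows. This is a simplified illustration under stated assumptions: the recognition result is already segmented, the dictionary is a flat list of attribute strings, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever similarity measure a real system would use. The function names and the sample entries are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical preset music dictionary holding attribute information only.
MUSIC_DICTIONARY = ["Zhou Jielun", "Dong Feng Po", "Peninsula Iron Box"]


def similarity(a, b):
    """Text similarity in [0, 1]; difflib is a stand-in for any real measure."""
    return SequenceMatcher(None, a, b).ratio()


def correct_request(words, dictionary=MUSIC_DICTIONARY, threshold=0.7):
    """Run steps 110-140 on an already segmented recognition result."""
    corrected = []
    for word in words:
        # Steps 110/120: collect the attribute information matching this word.
        matches = [e for e in dictionary if similarity(word, e) >= threshold]
        if not matches:
            continue  # not song information (for example "play")
        # Step 130: the word is in error if no match is textually identical.
        if word in matches:
            corrected.append(word)
        else:
            # Step 140: correct to the most similar attribute information.
            corrected.append(max(matches, key=lambda e: similarity(word, e)))
    return corrected
```

Calling `correct_request(["play", "Zhou Jielun", "Dong Feng"])` drops the non-song word, keeps the exact match, and completes the truncated title from the dictionary.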
On the basis of the above technical solution, matching the speech recognition result against the information in the preset music dictionary in step 110 may preferably comprise: receiving voice information input by the user; performing speech recognition on the voice information to obtain the speech recognition result; performing word segmentation on the speech recognition result to obtain at least one word; and matching the at least one word against the information in the preset music dictionary.
The speech recognition result may be text. Speech recognition can be performed with an existing method, for example an algorithm based on dynamic time warping, a hidden Markov model method based on a parametric model, a vector quantization method based on a non-parametric model, or an algorithm based on artificial neural networks; the embodiments of the present invention do not describe the speech recognition process in detail. Word segmentation can likewise use an existing algorithm, for example a mechanical segmentation algorithm based on string matching, a segmentation algorithm based on understanding, or a segmentation algorithm based on statistics; the embodiments of the present invention do not describe the segmentation process in detail. For example, if the user says "I want to listen to Zhou Jielun's Dong Feng Po" ("East Wind Breaks"), the words obtained after segmentation may be: "I", "want to listen", "Zhou Jielun", "Dong Feng Po".
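As an illustration of mechanical segmentation based on string matching, the sketch below implements forward maximum matching, one common variant of that family. The embodiment does not say which variant is used, so this choice is an assumption. The vocabulary is a toy example, and the Chinese sentence is a presumed original of the example above rather than text taken from the source.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Segment text by greedily taking the longest vocabulary word at each position."""
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical vocabulary; a real system would load a full lexicon together
# with the attribute information from the preset music dictionary.
VOCABULARY = {"我", "想听", "周杰伦", "的", "东风破"}
WORDS = forward_max_match("我想听周杰伦的东风破", VOCABULARY)
```

Adding the dictionary's singer names and song titles to the vocabulary keeps them from being split into single characters, which matters for the matching in the later steps.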
Preferably, the song information in the speech recognition result can be determined as follows: judge whether the preset music dictionary contains attribute information whose text similarity to the current word is higher than a preset value; if so, the current word is determined to be song information; if not, the current word is not song information. The preset value can be set according to the actual situation, for example 0.7. Word similarity can be calculated with an existing method. One example is a word similarity algorithm based on a semantic dictionary (such as WordNet or HowNet), in which all words are organized into a tree structure and the path length between nodes is calculated as the distance between words. Another example is a word similarity algorithm based on corpus statistics, which uses a word vector space model: a set of feature words is selected in advance, the correlation between this set of feature words and each word is calculated (generally measured by how often the feature words appear in the context of the word in a large real-world corpus), so that each word obtains a feature vector of correlations, and the similarity between two such vectors (generally calculated as the cosine of the angle between them) is taken as the similarity of the two words. The present invention does not describe the similarity calculation in detail. By treating only words whose text similarity is higher than the preset value as song information, this preferred embodiment excludes interference from other words. For example, if the user says "Play Zhou Jielun's Dong Feng", segmentation yields "play", "Zhou Jielun", and "Dong Feng", and the comparison determines that the song information is "Zhou Jielun" and "Dong Feng". The purpose of determining the song information is to exclude interfering text in the speech, such as "play" or "I want to listen".
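The corpus-statistics approach described above can be sketched as follows. This is a toy illustration: the corpus, the feature words, and the function names are assumptions, and a real system would use a large corpus and a more careful definition of context than "same sentence".

```python
import math
from collections import Counter


def feature_vector(word, corpus, feature_words):
    """Count how often each feature word shares a sentence with `word`."""
    counts = Counter()
    for sentence in corpus:
        if word in sentence:
            for feature in feature_words:
                if feature in sentence:
                    counts[feature] += 1
    return [counts[f] for f in feature_words]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_song_info(word, attributes, corpus, feature_words, preset_value=0.7):
    """True if some attribute's similarity to the word exceeds the preset value."""
    vec = feature_vector(word, corpus, feature_words)
    return any(
        cosine_similarity(vec, feature_vector(attr, corpus, feature_words)) > preset_value
        for attr in attributes
    )


# Toy corpus of tokenized sentences, for illustration only.
CORPUS = [
    ["play", "song", "ballad"],
    ["ballad", "singer", "album"],
    ["weather", "rain", "today"],
    ["song", "album", "singer"],
]
FEATURE_WORDS = ["singer", "album", "rain"]
```

With this data, "ballad" and "song" share the same context features and score well above the 0.7 preset value, while "weather" shares none and is rejected.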
On the basis of the above technical solution, step 120 may preferably obtain the attribute information that matches the song information from the preset music dictionary according to the text and the pinyin of the song information. Comparing both the text and the pinyin of the song information guards against errors that arise in speech recognition from missing text, extra text, and homophones.
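A minimal sketch of matching on both text and pinyin is given below. It assumes that each dictionary entry carries a precomputed pinyin string and that a partial match means identical pinyin or a pinyin prefix in either direction; the embodiment does not define partial matching precisely, so both points are assumptions, as are the English stand-in titles.

```python
# Hypothetical entries: (text, pinyin) pairs computed when the dictionary is built.
ENTRIES = [
    ("Peninsula Iron Box", "bandaotiehe"),
    ("Companion Island", "bandao"),
    ("Christmas Knot", "shengdanjie"),
]


def match_attributes(text, pinyin, entries=ENTRIES):
    """Return (exact, partial) matches found by comparing text and pinyin."""
    exact, partial = [], []
    for entry_text, entry_pinyin in entries:
        if entry_text == text:
            exact.append(entry_text)
        elif (entry_pinyin == pinyin
              or entry_pinyin.startswith(pinyin)
              or pinyin.startswith(entry_pinyin)):
            # Same sound with different text, or text missing or extra at the end.
            partial.append(entry_text)
    return exact, partial
```

The prefix test covers truncated and over-long titles, and the equality test covers homophones. Errors in the middle of a title would need a looser comparison, such as edit distance on the pinyin.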
In addition, because music resources may be added at any time, the embodiments of the present invention can update the preset music dictionary according to updated music resources, which keeps the preset music dictionary timely and accurate and in turn ensures that speech recognition errors can be remedied promptly. Preferably, the preset music dictionary can be updated periodically at a preset time interval. The update of the preset music dictionary can be performed while the dictionary is not in use.
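One way to realize a periodic update that does not disturb lookups is sketched below. The class name `RefreshingDictionary` and the `loader` callback are hypothetical; the sketch builds the new copy first and then swaps it in under a lock, which is one possible reading of updating the dictionary while it is not in use.

```python
import threading
import time


class RefreshingDictionary:
    """Holds dictionary entries and reloads them after a preset interval."""

    def __init__(self, loader, interval_seconds):
        self._loader = loader              # callable returning fresh entries
        self._interval = interval_seconds  # preset time interval
        self._lock = threading.Lock()
        self._entries = loader()
        self._loaded_at = time.monotonic()

    def entries(self):
        """Return the current entries."""
        with self._lock:
            return self._entries

    def refresh_if_due(self, now=None):
        """Reload if the interval has elapsed; return True if a reload happened."""
        now = time.monotonic() if now is None else now
        if now - self._loaded_at < self._interval:
            return False
        fresh = self._loader()  # build the new copy outside the lock
        with self._lock:
            self._entries, self._loaded_at = fresh, now
        return True
```

A scheduler or background thread could call `refresh_if_due()` at regular intervals. Lookups only ever see a complete old copy or a complete new copy.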
Embodiment two
On the basis of Embodiment one above, this embodiment provides ways of judging whether song information contains an error and of correcting the song information accordingly, which are described separately below.
(1) When there is only one piece of song information, judge whether the matched attribute information includes information whose text exactly matches the song information; if so, determine that the song information was recognized correctly; if not, determine that the song information contains an error.
If, besides the exact text match, there is also other attribute information that sounds the same but is written differently, or attribute information that is similar, this attribute information can also be output and the user prompted to choose.
For example, the user says "Black Sweater" and the speech recognition result is also "Black Sweater". Its text and pinyin are matched against the preset music dictionary, and the matched attribute information is "Black Sweater" (an exact match), so the speech recognition result is determined to be correct. If the matched attribute information also includes "Grey Sweater" (a partial match) in addition to "Black Sweater", it can still be determined that "Black Sweater" was recognized correctly; alternatively, "Black Sweater" and "Grey Sweater" can both be presented to the user, who is prompted to choose. Specifically, the matched attribute information can be output by voice, for example "1" for "Black Sweater" and "2" for "Grey Sweater", and the user replies "1" or "2" by voice. The matched attribute information can also be shown on a display screen, in which case the user can choose by pressing a key or by replying "1" or "2" by voice.
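The judgment for a single piece of song information, together with the optional numbered prompt, can be sketched as follows. The function name `judge_single` and the shape of its return value are illustrative assumptions.

```python
def judge_single(song_info, matched):
    """Return (is_correct, options) for a request with one piece of song information."""
    # The song information is correct if an exact text match is present.
    is_correct = song_info in matched
    # Other matches (homophones or near matches) can be offered as numbered
    # options, with the exact match listed first.
    options = {}
    if is_correct and len(matched) > 1:
        ordered = [song_info] + [m for m in matched if m != song_info]
        options = {str(i): m for i, m in enumerate(ordered, start=1)}
    return is_correct, options
```

The returned options map the spoken or keyed reply ("1", "2", and so on) to the attribute information it selects. An empty map means no prompt is needed.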
When there is only one piece of song information, the song information is corrected according to the matched attribute information as follows:
1) If there are multiple pieces of matched attribute information and none of them is an exact text match, calculate the similarity between each piece of matched attribute information and the song information, and correct the song information to the piece with the highest similarity. The similarity can be calculated with an existing method, as described in Embodiment one, and is not repeated here.
For example, the speech recognition result and the song information are "Peninsula". Matching on "Peninsula" and its pinyin "bandao" finds the attribute information "Peninsula Iron Box" and "Companion Island" in the preset music dictionary, neither of which is an exact text match. The similarity of each to "Peninsula" is then calculated, for example with the word similarity algorithm based on corpus statistics; "Peninsula Iron Box" has the highest similarity, so "Peninsula" is corrected to "Peninsula Iron Box". This is a case of missing text.
2) If there is only one piece of matched attribute information and it is not an exact text match, correct the song information to that matched attribute information.
For example, the speech recognition result and the song information are "Peninsula". Matching on "Peninsula" and its pinyin "bandao" finds the attribute information "Peninsula Iron Box" in the preset music dictionary; the result is unique and is not an exact text match, so "Peninsula" is corrected to "Peninsula Iron Box".
As another example, the speech recognition result and the song information are "Don't Want to Grow Up Ya", with a trailing particle. Matching on this text and its pinyin "buxiangzhangdaya" finds the partial match "Don't Want to Grow Up" in the preset music dictionary; the result is unique and is not an exact text match, so "Don't Want to Grow Up Ya" is corrected to "Don't Want to Grow Up". This is a case of extra text.
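The two correction rules for a single piece of song information can be sketched as one function. `difflib.SequenceMatcher` from the Python standard library is used here as a stand-in similarity measure; it is an assumption and not the corpus-statistics algorithm mentioned in the example. The function name and sample titles are likewise illustrative.

```python
from difflib import SequenceMatcher


def correct_single(song_info, matched):
    """Correct one piece of song information from its matched attribute information."""
    if not matched or song_info in matched:
        return song_info  # nothing to correct, or already an exact match
    if len(matched) == 1:
        return matched[0]  # rule 2): a unique, non-exact match
    # Rule 1): several non-exact matches, so take the most similar one.
    return max(matched, key=lambda m: SequenceMatcher(None, song_info, m).ratio())
```

Both the missing-text case and the extra-text case go through the same function, because the decision depends only on how many matches there are and how similar each one is.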
(2) When there are multiple pieces of song information, whether a piece of song information contains an error is judged as follows: for the current song information, judge according to the preset music dictionary whether the attribute information it matches has a correspondence with the other song information already identified as correct; if so, determine that the current song information was recognized correctly; if not, determine that the current song information contains an error.
For example, the speech recognition result is "I want to listen to Good sung by Yang Zongwei and Zhang Bichen", and the song information is "Yang Zongwei", "Zhang Bichen", and "Good". Matching against the preset music dictionary determines that "Yang Zongwei" and "Zhang Bichen" are correctly recognized song information. "Good" and its pinyin "liangliang" are then each matched against the preset music dictionary, which finds the attribute information "Good" and "Cool" (both pronounced "liangliang"). Judging by the correspondences with Yang Zongwei and Zhang Bichen, it can be determined that the song information "Good" contains an error. This is an error caused by homophones.
When there are multiple pieces of song information, the song information is corrected according to the matched attribute information as follows: the erroneous song information is corrected according to the song information identified as correct, the attribute information matched by each piece of song information, and the correspondences between the attribute information.
For example, the speech recognition result is "I want to listen to Good sung by Yang Zongwei and Zhang Bichen", and the song information is determined to be "Yang Zongwei", "Zhang Bichen", and "Good". "Good" and its pinyin "liangliang" are each matched against the preset music dictionary, which finds the following attribute information and correspondences: "Good", performed by a different singer, and "Cool", sung by Yang Zongwei and Zhang Bichen. From the singer names it can be determined that the user wants to listen to "Cool", so "Good" is corrected to "Cool".
As another example, the speech recognition result is "I want to listen to Chen Yixun's Christmas Day", and the song information is "Chen Yixun" and "Christmas Day". Matching on "Christmas Day" and its pinyin "shengdanjie" finds the attribute information "Christmas Knot" in the preset music dictionary, whose singer is Chen Yixun. From the singer name it can be determined that "Christmas Day" was misrecognized, and "Christmas Day" is corrected to "Christmas Knot". This is an error caused by homophones.
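The judgment and correction for multiple pieces of song information can be sketched as follows, assuming the only correspondence consulted is song title to singers. The function names, the mapping `SINGERS_BY_TITLE`, and the placeholder "Another Singer" are illustrative assumptions.

```python
def judge_with_context(title, confirmed_singers, singers_by_title):
    """A title is consistent if every confirmed singer is recorded for it."""
    return set(confirmed_singers) <= singers_by_title.get(title, set())


def correct_with_context(recognized, candidates, confirmed_singers, singers_by_title):
    """Keep the recognized title if consistent; otherwise pick a consistent candidate."""
    if judge_with_context(recognized, confirmed_singers, singers_by_title):
        return recognized
    for title in candidates:
        if judge_with_context(title, confirmed_singers, singers_by_title):
            return title
    return recognized  # no candidate fits; leave the recognition result unchanged


# Hypothetical correspondences (song title -> singers), for illustration only.
SINGERS_BY_TITLE = {
    "Good": {"Another Singer"},
    "Cool": {"Yang Zongwei", "Zhang Bichen"},
    "Christmas Knot": {"Chen Yixun"},
}
```

Here the singers already judged correct act as the context. A fuller version could consult album, style, or language correspondences in the same way.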
(3) If there is no matched attribute information at all, either output a prompt telling the user that the voice input is wrong, or still perform song retrieval according to the speech recognition result and output the retrieval result.
In summary, error correction falls into three cases: content supplementation, content removal, and wrong-character correction. Content supplementation automatically completes a resource name that the user did not state in full; content removal deletes the extra part of a resource name that the user stated; and wrong-character correction fixes text errors in which speech recognition produced different characters with the same sound. Error correction thus guards against missing text, extra text, and wrong text in speech recognition, and reduces the resource retrieval failures caused by incomplete resource names, extra text in resource names, or text recognition errors, thereby improving the success rate of the music service.
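For logging or evaluation, a correction can be labeled with one of the three cases by comparing the recognized text with the corrected text. The sketch below uses simple substring tests, which is an assumption made for illustration; the embodiment does not describe such a classifier.

```python
def classify_correction(recognized, corrected):
    """Label a correction as one of the three cases summarized above."""
    if recognized == corrected:
        return "none"
    if recognized in corrected:
        return "content supplementation"  # the user stated only part of the name
    if corrected in recognized:
        return "content removal"          # the user stated extra text
    return "wrong-character correction"   # same sound, different characters
```

For instance, `classify_correction("Peninsula", "Peninsula Iron Box")` falls under content supplementation, while a homophone fix such as "Christmas Day" to "Christmas Knot" falls under wrong-character correction.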
Embodiment three
Fig. 2 is a structural schematic diagram of a terminal provided by Embodiment three of the present invention. As shown in Fig. 2, the terminal includes a processor 210, a memory 220, an input device 230, and an output device 240. The terminal may have one or more processors 210; Fig. 2 takes one processor 210 as an example. The processor 210, memory 220, input device 230, and output device 240 in the terminal may be connected by a bus or in other ways; Fig. 2 takes a bus connection as an example.
As a computer-readable storage medium, the memory 220 can store software programs, computer-executable programs, and modules, such as the program instructions corresponding to the voice error correction method for voice song requests in the embodiments of the present invention. By running the software programs, instructions, and modules stored in the memory 220, the processor 210 executes the various functional applications and data processing of the terminal, that is, implements the above voice error correction method for voice song requests.
The memory 220 may mainly include a program storage area and a data storage area. The program storage area can store the operating system and the application program required by at least one function; the data storage area can store data created through use of the terminal, and so on. In addition, the memory 220 may include high-speed random access memory and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device. In some examples, the memory 220 may further include memory located remotely from the processor 210, and this remote memory can be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The input device 230 can receive input voice information and character information and generate key signal inputs related to the user settings and function control of the terminal; for example, the input device 230 may be a microphone, a keyboard, or a display screen. The output device 240 may include devices such as a loudspeaker and a display screen, where the loudspeaker plays voice and songs and the display screen shows songs and related information.
Embodiment four
Embodiment four of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs a voice error correction method for voice song requests, the method comprising:
matching a speech recognition result against the information in a preset music dictionary, wherein the preset music dictionary stores attribute information of music resources and the correspondences between the attribute information;
obtaining, from the preset music dictionary, attribute information that matches the song information in the speech recognition result;
judging, according to the matched attribute information, whether the song information contains an error; and
if an error exists, correcting the song information according to the matched attribute information.
Of course, in the computer-readable storage medium provided by the embodiments of the present invention, the stored computer program (also called computer-executable instructions) is not limited to the method operations described above and can also perform the related operations of the voice error correction method for voice song requests provided by any embodiment of the present invention.
From the above description of the embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software together with the necessary general-purpose hardware, or of course by hardware alone, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), flash memory (FLASH), hard disk, or optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, or a network device) to execute the methods described in the embodiments of the present invention.
Note that the above are only preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the protection scope of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, it is not limited to them; it may include further equivalent embodiments without departing from the inventive concept, and its scope is determined by the appended claims.