CN109410915A - Voice quality assessment method and device, and computer-readable storage medium

Info

Publication number: CN109410915A
Application number: CN201710698522.7A
Authority: CN
Inventors: 赵奕晨; 何成林; 刘启飞; 丁芹; 曹艳艳
Original assignee: China Mobile Communications Group Co Ltd
Current assignee: China Mobile Communications Group Co Ltd
Priority date: 2017-08-15
Filing date: 2017-08-15
Publication date: 2019-03-01
Anticipated expiration: 2037-08-15
Also published as: CN109410915B

Abstract

The invention discloses a voice quality assessment method and device, and a computer-readable storage medium. The voice quality assessment method includes: acquiring original voice data according to preselected voice content; performing call processing on the original voice data to obtain voice data to be assessed; converting the voice data to be assessed into speech text to be assessed; splitting the preselected voice content into M keywords and searching the speech text to be assessed with each of the M keywords to obtain, for each keyword, the number of occurrences that were not correctly restored; summing these per-keyword numbers to obtain a first total quantity of keywords that were not correctly restored; calculating, according to the duration of the original voice data, a second total quantity of keywords corresponding to the original voice data; and assessing the content completeness of the voice data to be assessed according to the first total quantity and the second total quantity. The voice quality assessment method and device of the embodiments of the present invention make it possible to assess the completeness of voice call content.

Description

Voice quality assessment method and device, and computer-readable storage medium

Technical field

The present invention relates to the technical field of voice communication, and in particular to a voice quality assessment method and device, and a computer-readable storage medium.

Background

From the early fixed-line telephone to today's mobile terminals, tools for voice calls have developed rapidly, and making voice calls has become one of the basic needs of daily life. To convey accurately to the other party what one party to a call means, the completeness of the voice call content needs to be guaranteed.

However, voice quality assessment methods in the prior art mainly assess the distortion of a voice call in terms of timbre and tone. For example, an auditory model is built in an input-output manner, and the distortion between the received voice signal and the original voice signal is calculated; alternatively, in an output-based manner, the distortion of the received voice signal is calculated from IP network impairment parameters or audio stream parameters. Because these prior-art methods do not assess the completeness of voice call content, a new assessment method that does so is needed.

Summary of the invention

Embodiments of the present invention provide a voice quality assessment method and device, and a computer-readable storage medium, which can assess the completeness of voice call content.

In a first aspect, an embodiment of the present invention provides a voice quality assessment method, which includes:

acquiring original voice data according to preselected voice content;

performing call processing on the original voice data to obtain voice data to be assessed;

converting the voice data to be assessed into speech text to be assessed;

splitting the preselected voice content into M keywords, and searching the speech text to be assessed with each of the M keywords to obtain, for each keyword, the number of occurrences not correctly restored, M being a positive integer;

summing the per-keyword numbers of occurrences not correctly restored to obtain a first total quantity of keywords not correctly restored;

calculating, according to the duration of the original voice data, a second total quantity of keywords corresponding to the original voice data;

assessing the content completeness of the voice data to be assessed according to the first total quantity and the second total quantity.

In some embodiments of the first aspect, assessing the content completeness of the voice data to be assessed according to the first total quantity and the second total quantity includes:

calculating the ratio of the first total quantity to the second total quantity;

assessing the content completeness of the voice data to be assessed according to the ratio.

In some embodiments of the first aspect, the preselected voice content is configured to cover the frequently used sounds of any specified language and/or the basic pronunciations that make up the specified language.

In some embodiments of the first aspect, the preselected voice content is further configured so that the M keywords into which it is split satisfy at least one of the following conditions: they differ in meaning, they are not repeated, no keyword contains or is contained by another, and there are no homophones.

In some embodiments of the first aspect, acquiring original voice data according to preselected voice content includes: acquiring original voice data of a male voice or a female voice according to the preselected voice content.

In some embodiments of the first aspect, calculating, according to the duration of the original voice data, the second total quantity of keywords corresponding to the original voice data includes:

calculating, according to the duration of the original voice data, the number of times N that the preselected voice content needs to be repeated for one assessment, N being a positive integer;

calculating the product of M and N as the second total quantity of keywords corresponding to the original voice data.

In some embodiments of the first aspect, calculating, according to the duration of the original voice data, the number of times N that the preselected voice content needs to be repeated for one assessment includes:

obtaining the number of characters contained in the preselected voice content;

calculating the product of the number of characters contained in the preselected voice content and the speech rate of the voice call, to obtain the duration needed to repeat the original voice content once;

calculating the ratio of the duration of the original voice data to the duration needed to repeat the original voice content once, as the number of times N that the preselected voice content needs to be repeated for one assessment of the original voice data.

In some embodiments of the first aspect, a blank interval is left in the original voice data between adjacent repetitions of the preselected voice content.

In some embodiments of the first aspect, the duration of the original voice data is required to be greater than a duration threshold, the duration threshold being obtained from the relative transmission speed of the call channel, the transmission frequency of the call channel, and the speech rate of the original voice data.

In some embodiments of the first aspect, the duration threshold of the original voice data is determined using the following formula:

T = 100 × α × max(c/νf, s)

where T is the duration threshold of the original voice data, α is a constant, c is the speed of light, ν is the relative transmission speed of the call channel, f is the transmission frequency of the call channel, and s is the speech rate of the original voice data.

In a second aspect, an embodiment of the present invention provides a voice quality assessment device, which includes:

an acquisition module, configured to acquire original voice data according to preselected voice content;

a processing module, configured to perform call processing on the original voice data to obtain voice data to be assessed;

a conversion module, configured to convert the voice data to be assessed into speech text to be assessed;

a retrieval module, configured to split the preselected voice content into M keywords and to search the speech text to be assessed with each of the M keywords to obtain, for each keyword, the number of occurrences not correctly restored, M being a positive integer;

a first computing module, configured to sum the per-keyword numbers of occurrences not correctly restored to obtain a first total quantity of keywords not correctly restored;

a second computing module, configured to calculate, according to the duration of the original voice data, a second total quantity of keywords corresponding to the original voice data;

an evaluation module, configured to assess the content completeness of the voice data to be assessed according to the first total quantity and the second total quantity.

In a third aspect, an embodiment of the present invention provides a voice quality assessment device, including a memory, a processor, and a program stored in the memory and executable on the processor, wherein the processor implements the voice quality assessment method described above when executing the program.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium on which a program is stored, wherein the program implements the voice quality assessment method described above when executed by a processor.

According to an embodiment of the present invention, the voice data to be assessed is converted into speech text to be assessed, the preselected voice content is split into M keywords, and the speech text to be assessed is searched with each of the M keywords, which yields, for each keyword, the number of occurrences not correctly restored. Summing these per-keyword numbers gives the total number of keywords not correctly restored, which serves as the number of swallowed words in this voice assessment. Because the embodiment can obtain the number of swallowed words in a voice assessment, the content completeness of the voice data to be assessed can be assessed once a relationship is established between the number of swallowed words and the total number of keywords contained in the voice assessment data.

Description of the drawings

The present invention may be better understood from the following description of specific embodiments taken together with the accompanying drawings, in which the same or similar reference numerals denote the same or similar features.

Fig. 1 is a flow diagram of a voice quality assessment method provided by one embodiment of the present invention;

Fig. 2 is a flow diagram of a voice quality assessment method provided by another embodiment of the present invention;

Fig. 3 is a flow diagram of a voice quality assessment method provided by yet another embodiment of the present invention;

Fig. 4 is a structural diagram of a voice quality assessment device provided by an embodiment of the present invention;

Fig. 5 is a hardware structure diagram of a voice quality assessment device provided by an embodiment of the present invention.

Detailed description of the embodiments

The features and exemplary embodiments of various aspects of the present invention are described in detail below. Many specific details are set out in the following description to provide a thorough understanding of the present invention. It will be apparent to those skilled in the art, however, that the present invention can be practiced without some of these details. The description of the embodiments below is intended only to give a better understanding of the present invention by showing examples of it. The present invention is in no way limited to any specific configuration or algorithm set out below, but covers any modification, replacement, and improvement of elements, components, and algorithms that does not depart from the spirit of the present invention. Well-known structures and techniques are not shown in the drawings or the following description, to avoid obscuring the present invention unnecessarily.

Embodiments of the present invention provide a voice quality assessment method and device, and a computer-readable storage medium. With these embodiments, the completeness of voice call content can be assessed, so that during a call what one party means can be conveyed accurately to the other party.

Fig. 1 is a flow diagram of a voice quality assessment method provided by one embodiment of the present invention. As shown in Fig. 1, the assessment method includes steps 101 to 107.

In step 101, original voice data is acquired according to preselected voice content.

The preselected voice content is voice content chosen in advance for the assessment test. It may take the form of text or of a segment of speech, which is not limited here.

To obtain a more accurate assessment result, the selection of the preselected voice content needs to satisfy certain conditions.

In one example, the preselected voice content is configured to cover the frequently used sounds of any specified language and/or the basic pronunciations that make up the specified language. The selection rules for the preselected voice content are illustrated below, taking Chinese as an example.

In Chinese, frequently used words may be pronouns, such as "you" and "I"; nouns, such as "family", "friend", and "weather"; or modal particles, such as "good", "uh", and "may".

The basic pronunciations that make up Chinese comprise 21 initials, namely b, p, m, f, d, t, n, l, g, k, h, j, q, x, zh, ch, sh, r, z, c, and s, and 24 finals, of which the single finals are a, o, e, i, u, and ü, and the compound finals are ai, ei, ui, ao, ou, iu, ie, üe, er, an, en, in, un, ün, ang, eng, ing, and ong.
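The coverage rule above can be checked mechanically. The following is a minimal sketch, assuming the preselected voice content is already available as a list of toneless pinyin syllables; the function names `initial_of` and `missing_initials` are hypothetical and do not come from the patent.

```python
# The 21 initials listed in the text. Two-letter initials come first so that
# "zh" is not mistaken for "z" during prefix matching.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]


def initial_of(syllable: str) -> str:
    """Return the initial of a toneless pinyin syllable, or "" if it has none."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial
    return ""


def missing_initials(syllables):
    """List the initials that no syllable of the preselected content covers."""
    covered = {initial_of(s) for s in syllables}
    return [initial for initial in INITIALS if initial not in covered]
```

A candidate text whose `missing_initials` result is empty covers every initial; a similar table could be built for the 24 finals.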

In another example, the preselected voice content is further configured so that the M keywords into which it is split satisfy at least one of the following conditions: they differ in meaning, they are not repeated, no keyword contains or is contained by another, and there are no homophones. Each of these conditions is explained below.

Differing in meaning means that the words express different things; for example, "banana" and "desk lamp" are two words with entirely different meanings. Not repeated means that the same word does not appear twice in the preselected voice content. No containing or contained relationship means that there is no obvious subordinate relationship between keywords; for example, "banana" is subordinate to "fruit". No homophones means that no two words have the same or similar pronunciation; for example, the Chinese words for "stay" and "flow down" are pronounced identically.
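Some of these conditions lend themselves to an automatic check. The sketch below is an illustration only: `check_keywords` and its optional `pronounce` callback are hypothetical names, and substring containment is used as a crude textual stand-in for the semantic "banana is a fruit" relationship described above, which a string comparison cannot detect.

```python
from itertools import combinations


def check_keywords(keywords, pronounce=None):
    """Return (problem, keyword_a, keyword_b) tuples for a candidate keyword list.

    pronounce is an optional callable mapping a keyword to its pronunciation;
    when it is omitted, the homophone condition is not checked.
    """
    problems = []
    seen = set()
    for keyword in keywords:
        if keyword in seen:
            problems.append(("repeated", keyword, keyword))
        seen.add(keyword)

    unique = list(dict.fromkeys(keywords))  # drop repeats, keep order
    for a, b in combinations(unique, 2):
        if a in b or b in a:
            # Textual containment: a search for the shorter keyword would
            # also hit every occurrence of the longer one.
            problems.append(("contained", a, b))
        elif pronounce is not None and pronounce(a) == pronounce(b):
            problems.append(("homophone", a, b))
    return problems
```

An empty result means none of the checked conditions is violated; the "differ in meaning" condition still needs human review.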

Optionally, an embodiment of the present invention may acquire original voice data of a male voice or a female voice according to the preselected voice content. Because male and female voices differ markedly in timbre, tone, and audio frequency, their voice quality assessment results will differ accordingly. The speech input for the original voice data can therefore be extended to both male and female voices, so that the voice quality assessment result is more comprehensive. The speech input can further be extended to children's voices, elderly voices, and so on, which is not limited here.

It should be noted that, depending on the actual voice assessment test environment, the selection rules above may be satisfied fully or only partly when the preselected voice content is determined; the voice quality assessment result is most accurate when all of them are satisfied.

In step 102, call processing is performed on the original voice data to obtain voice data to be assessed.

Here, call processing refers to simulating the communication environment of a voice call through a call channel during the voice assessment test. Specifically, the original voice data can be fed into one end of the call channel, and the voice data that has undergone transmission loss is received at the other end as the voice data to be assessed. For example, if A and B are on a call, the sound uttered by A can be understood as the original voice data; B hears that sound after transmission, and the sound B hears can be understood as the voice data to be assessed.

When the call channel is used to simulate the communication environment of a voice call, the duration of the original voice data is required to be greater than a duration threshold, which is the shortest duration of original voice data that must be acquired for one quality assessment. For example, the duration threshold can be obtained from the relative transmission speed of the call channel, the transmission frequency of the call channel, and the speech rate of the original voice data.

In one example, the duration threshold of the original voice data can be determined using the following formula:

T = 100 × α × max(c/νf, s) (1)

where T is the duration threshold of the original voice data, α is a constant, c is the speed of light, ν is the relative transmission speed of the call channel, f is the transmission frequency of the call channel, and s is the speech rate of the original voice data, in seconds per character.
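Formula (1) can be evaluated directly, as in the sketch below. Two points are assumptions rather than statements of the patent: the term c/νf is read as c divided by the product of ν and f, and the units of ν and f are left to the caller because the text does not give them. The function name `duration_threshold` is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def duration_threshold(alpha, nu, f, s, c=SPEED_OF_LIGHT):
    """Evaluate T = 100 * alpha * max(c / (nu * f), s).

    alpha: the constant from formula (1)
    nu:    relative transmission speed of the call channel
    f:     transmission frequency of the call channel
    s:     speech rate of the original voice data, seconds per character
    The grouping c / (nu * f) is an assumed reading of "c/νf".
    """
    return 100.0 * alpha * max(c / (nu * f), s)
```

The recording of original voice data would then be accepted for assessment only if its duration exceeds the returned value.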

In step 103, the voice data to be assessed is converted into speech text to be assessed. For example, the conversion can be done with speech recognition technology. In one example, once the voice data to be assessed has been obtained, it can be converted into speech text to be assessed automatically.

In step 104, the preselected voice content is split into M keywords, and the speech text to be assessed is searched with each of the M keywords to obtain, for each keyword, the number of occurrences not correctly restored, M being a positive integer. In one example, the speech text to be assessed can be searched automatically with retrieval technology.
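The retrieval in step 104 can be as simple as counting substring hits. This is a minimal sketch, assuming the recognized text is a plain string; `count_keywords` is a hypothetical name, and a real system might use a dedicated retrieval engine instead.

```python
def count_keywords(text, keywords):
    """Count the non-overlapping occurrences of each keyword in the text."""
    return {keyword: text.count(keyword) for keyword in keywords}
```

Because `str.count` matches substrings, a keyword that is contained in another keyword would be counted for both, which is one practical reason for the keyword selection conditions described earlier.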

In step 105, the per-keyword numbers of occurrences not correctly restored are summed to obtain the first total quantity of keywords not correctly restored.

In step 106, the second total quantity of keywords corresponding to the original voice data is calculated according to the duration of the original voice data.

In step 107, the content completeness of the voice data to be assessed is assessed according to the first total quantity and the second total quantity.

In addition, because the embodiment of the present invention splits the preselected voice content into M keywords and searches the speech text to be assessed with each keyword, it can pinpoint the words that were not correctly restored, whereas prior-art schemes can only reflect the overall distortion of the call voice.

Moreover, the embodiment of the present invention can use speech recognition technology to convert the voice call content to text automatically and use retrieval technology to search for the keywords automatically. This saves considerable labor and time and avoids the subjective influence of human assessors.

Preferably, according to an embodiment of the present invention, the ratio of the first total quantity to the second total quantity can be calculated, and the content completeness of the voice data to be assessed can be assessed according to that ratio.

Here, the first total quantity is the total number of keywords not correctly restored, and the second total quantity is the total number of keywords corresponding to the original voice data. The first total quantity can be taken as the number of swallowed words in this voice assessment, so the ratio of the first total quantity to the second total quantity can be understood as the swallowed-word rate of this voice assessment.

In one example, the voice data to be assessed is first recognized as text to be assessed. The speech text to be assessed is then searched with each of the M keywords of the preselected voice content, and the number of hits for each keyword is recorded as $q_0, q_1, \ldots, q_{M-1}$, from which the number of occurrences of each keyword not correctly restored, $p_0, p_1, \ldots, p_{M-1}$, is obtained. These per-keyword numbers are then summed to give $\sum_{i=0}^{M-1} p_i$, which is recorded as the number of swallowed words in this voice quality assessment; the ratio of $\sum_{i=0}^{M-1} p_i$ to the total number of keywords corresponding to the original voice data is the swallowed-word rate.
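The calculation in this example can be sketched as follows. The text does not state how each p value is derived from the corresponding q value; the sketch assumes that every keyword should appear once per repetition, so that the shortfall is N minus the hit count, floored at zero. That rule and the function name `swallowed_word_rate` are assumptions for illustration.

```python
def swallowed_word_rate(text, keywords, repetitions):
    """Return (first_total, second_total, rate) for one assessment.

    text:        speech text to be assessed
    keywords:    the M keywords of the preselected voice content
    repetitions: N, how many times the content was repeated
    """
    hits = [text.count(keyword) for keyword in keywords]      # q_0 .. q_{M-1}
    # Assumed rule: each keyword is expected N times, so p_i = N - q_i.
    missing = [max(repetitions - q, 0) for q in hits]         # p_0 .. p_{M-1}
    first_total = sum(missing)                                # swallowed words
    second_total = len(keywords) * repetitions                # M * N
    return first_total, second_total, first_total / second_total
```

With three keywords repeated three times, a text that loses one "lamp" and every "river" gives 4 swallowed words out of 9 expected keywords.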

It should be understood that a higher swallowed-word rate means that the original voice data is restored less completely and that the voice call quality is poorer. The technical solution of the embodiment of the present invention can therefore reproduce more faithfully the scenario in which a user makes a voice call, and assess the restoration completeness of the voice call more quickly and reliably.

In addition, because the assessment method of the embodiment assesses voice quality by calculating the swallowed-word rate and does not build a speech model, the assessment result is not affected by changes in speech model parameters. The assessment method of the embodiment is therefore also highly stable.

Fig. 2 is a flow diagram of a voice quality assessment method provided by another embodiment of the present invention. Fig. 2 differs from Fig. 1 in that step 106 in Fig. 1 can be refined into steps 1061 and 1062 in Fig. 2.

In step 1061, the number of times N that the preselected voice content needs to be repeated for one assessment is calculated according to the duration of the original voice data, N being a positive integer.

In step 1062, the product of M and N is calculated as the second total quantity of keywords corresponding to the original voice data.

Fig. 3 is a flow diagram of a voice quality assessment method provided by yet another embodiment of the present invention. Fig. 3 relates to Fig. 2 in that step 1061 in Fig. 2 can be refined into steps 10611 to 10613 in Fig. 3.

In step 10611, the number of characters contained in the preselected voice content is obtained. It should be noted that, taking Chinese as an example, this number is counted by individual Chinese characters and not by keywords.

In step 10612, the product of the number of characters contained in the preselected voice content and the speech rate of the voice call is calculated, giving the duration needed to repeat the original voice content once. The unit of the speech rate is seconds per character.

In step 10613, the ratio of the duration of the original voice data to the duration needed to repeat the original voice content once is calculated, as the number of times N that the preselected voice content needs to be repeated for one assessment of the original voice data.
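Steps 10611 to 10613, together with the product of M and N from step 1062, amount to a few lines of arithmetic. The sketch below rounds the ratio down so that N is a positive integer; the rounding rule is an assumption, since the text only says that N is the ratio, and the function names are hypothetical. Any blank interval between repetitions is ignored here.

```python
import math


def repetitions_needed(total_duration_s, char_count, speech_rate_s_per_char):
    """Return N, the number of whole readings that fit in the recording."""
    once_s = char_count * speech_rate_s_per_char   # step 10612
    n = math.floor(total_duration_s / once_s)      # step 10613, rounded down (assumed)
    if n < 1:
        raise ValueError("recording is shorter than one reading of the content")
    return n


def second_total(keyword_count, repetitions):
    """Return M * N, the total number of keywords in the original voice data."""
    return keyword_count * repetitions
```

For example, a 60-second recording of a 20-character text read at 0.25 seconds per character holds 12 readings, so a text with 8 keywords yields a second total quantity of 96.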

According to an embodiment of the present invention, so that the multiple segments of preselected voice content in the voice data to be assessed, and their corresponding keywords, can be identified accurately, the starting position of each segment of preselected voice content can be set to be the same. In one example, a blank interval can be left in the original voice data between adjacent repetitions of the preselected voice content, that is, k seconds of silence are added before each segment of preselected voice content for synchronization, k being a positive integer.
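The effect of the blank interval on the recording layout can be illustrated with a short sketch. It assumes every repetition takes the same time to read; `segment_starts` is a hypothetical helper, not part of the patent.

```python
def segment_starts(repetitions, once_duration_s, blank_s):
    """Return the start time, in seconds, of each repetition of the content.

    Each repetition is preceded by blank_s seconds of silence, so one full
    period lasts blank_s + once_duration_s seconds.
    """
    period = blank_s + once_duration_s
    return [i * period + blank_s for i in range(repetitions)]
```

With a 5-second reading and k = 2, the three repetitions start at 2, 9, and 16 seconds, which gives the recognizer a predictable boundary for each segment.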

Fig. 4 is a structural diagram of a voice quality assessment device provided by an embodiment of the present invention. The voice quality assessment device in Fig. 4 includes an acquisition module 401, a processing module 402, a conversion module 403, a retrieval module 404, a first computing module 405, a second computing module 406, and an evaluation module 407.

The acquisition module 401 is configured to acquire original voice data according to preselected voice content;

the processing module 402 is configured to perform call processing on the original voice data to obtain voice data to be assessed;

the conversion module 403 is configured to convert the voice data to be assessed into speech text to be assessed;

the retrieval module 404 is configured to split the preselected voice content into M keywords and to search the speech text to be assessed with each of the M keywords to obtain, for each keyword, the number of occurrences not correctly restored, M being a positive integer;

the first computing module 405 is configured to sum the per-keyword numbers of occurrences not correctly restored to obtain a first total quantity of keywords not correctly restored;

the second computing module 406 is configured to calculate, according to the duration of the original voice data, a second total quantity of keywords corresponding to the original voice data;

the evaluation module 407 is configured to assess the content completeness of the voice data to be assessed according to the first total quantity and the second total quantity.
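How the modules fit together can be sketched as a small pipeline. Everything here is illustrative: the class name `VoiceQualityAssessor` is hypothetical, the call channel and the speech recognizer are injected as plain callables because the patent names no concrete implementation, N is passed in directly, and reporting completeness as one minus the swallowed-word rate is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class VoiceQualityAssessor:
    transmit: Callable[[bytes], bytes]   # stands in for processing module 402
    recognize: Callable[[bytes], str]    # stands in for conversion module 403

    def assess(self, original_audio: bytes, keywords: Sequence[str],
               repetitions: int) -> float:
        """Return the content completeness as a value between 0 and 1."""
        degraded = self.transmit(original_audio)
        text = self.recognize(degraded)
        # Retrieval module 404 and first computing module 405: shortfall per
        # keyword, assuming each keyword is expected once per repetition.
        first_total = sum(max(repetitions - text.count(k), 0) for k in keywords)
        # Second computing module 406: M * N.
        second_total = len(keywords) * repetitions
        # Evaluation module 407: completeness as 1 - swallowed-word rate (assumed).
        return 1.0 - first_total / second_total
```

A test harness can pass stub callables, for instance a `transmit` that deletes part of the input and a `recognize` that decodes bytes to text, to exercise the arithmetic without real audio.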

According to an embodiment of the present invention, the conversion module 403 converts the voice data to be assessed into speech text to be assessed, and the retrieval module 404 splits the preselected voice content into M keywords and searches the speech text to be assessed with each of them, which yields, for each keyword, the number of occurrences not correctly restored. The first computing module 405 then sums these per-keyword numbers to obtain the total number of keywords not correctly restored, which serves as the number of swallowed words in this voice assessment. Because the embodiment can obtain the number of swallowed words in a voice assessment, the evaluation module 407 can assess the content completeness of the voice data to be assessed once a relationship is established between the number of swallowed words and the total number of keywords contained in the voice assessment data.

Fig. 5 is a hardware structure diagram of a voice quality assessment device provided by an embodiment of the present invention. As shown in Fig. 5, the voice quality assessment device in the embodiment includes a processor 501, a memory 502, a communication interface 503, and a bus 510. The processor 501, the memory 502, and the communication interface 503 are connected through the bus 510 and communicate with one another over it.

Specifically, the processor 501 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present invention.

The memory 502 may include mass storage for data or instructions. By way of example and not limitation, the memory 502 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a universal serial bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 502 may include removable or non-removable (or fixed) media. Where appropriate, the memory 502 may be internal or external to the resource interface device. In a particular embodiment, the memory 502 is non-volatile solid-state memory. In a particular embodiment, the memory 502 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.

The communication interface 503 is mainly used for communication between the modules, devices, units, and/or equipment in the embodiments of the present invention.

That is, the voice quality assessment device may be implemented to include the processor 501, the memory 502, the communication interface 503, and the bus 510. The processor 501, the memory 502, and the communication interface 503 are connected through the bus 510 and communicate with one another over it. The memory 502 stores program code; the processor 501 reads the executable program code stored in the memory 502 and runs the corresponding program to execute the voice quality assessment method described above, thereby implementing the voice quality assessment method and device described with reference to Fig. 1 to Fig. 4.

It should be clear that the embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on what distinguishes it from the others. For the device embodiments, the relevant parts may be found in the description of the method embodiments. The present invention is not limited to the particular steps and structures described above and shown in the figures. Those skilled in the art can, after grasping the spirit of the present invention, make various changes, modifications, and additions, or change the order of the steps. For brevity, detailed descriptions of known methods and techniques are omitted here.

It should also be made clear that the present invention is not limited to the specific configurations and processing described above and shown in the figures. For brevity, detailed descriptions of known methods and techniques are omitted here. In the embodiments above, several specific steps are described and shown as examples. The method flow of the present invention is not limited to those specific steps, however; those skilled in the art can, after grasping the spirit of the present invention, make various changes, modifications, and additions, or change the order of the steps.

The present invention can be implemented in other specific forms without departing from its spirit and essential characteristics. For example, the algorithms described in particular embodiments can be modified while the system architecture remains within the basic spirit of the present invention. The present embodiments are therefore to be regarded in all respects as illustrative and not restrictive; the scope of the present invention is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are included within the scope of the present invention.

Claims

1. A voice quality assessment method, characterized by comprising:

acquiring original voice data according to preselected voice content;

performing call processing on the original voice data to obtain voice data to be assessed;

The voice data to be assessed is converted into speech text to be assessed；

The pre-selection voice content is split into M keyword, is utilized respectively the M keyword to the voice text to be assessed This is retrieved, and obtains the quantity for each keyword not restored correctly, and M is positive integer；

The quantity for each keyword not restored correctly is added, the first sum of the keyword not restored correctly is obtained Amount；

According to the duration of the primary voice data, calculate keyword corresponding with the primary voice data second is total Quantity；

According to first total quantity and second total quantity, the content intact degree of the voice data to be assessed is assessed.
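As an illustrative sketch only, and not the claimed implementation, the retrieval and counting steps of claim 1 could be expressed as follows. The function names `count_unrestored` and `assess_counts` are hypothetical. The sketch assumes that each keyword is expected once per repetition of the pre-selected voice content, that a keyword is "not correctly restored" whenever an expected occurrence is absent from the speech text, and that the second total quantity equals M multiplied by the number of repetitions; none of these assumptions is stated in the claim itself.

```python
from typing import Dict, List, Tuple


def count_unrestored(keywords: List[str], recognized_text: str,
                     repetitions: int) -> Dict[str, int]:
    """Return, per keyword, how many expected occurrences are missing.

    Assumption: each keyword should appear `repetitions` times, once per
    repetition of the pre-selected voice content.
    """
    missing = {}
    for keyword in keywords:
        # str.count returns the number of non-overlapping substring matches.
        found = recognized_text.count(keyword)
        missing[keyword] = max(repetitions - found, 0)
    return missing


def assess_counts(keywords: List[str], recognized_text: str,
                  repetitions: int) -> Tuple[int, int, float]:
    """Return (first total, second total, ratio of first to second)."""
    if not keywords or repetitions < 1:
        raise ValueError("need at least one keyword and one repetition")
    per_keyword = count_unrestored(keywords, recognized_text, repetitions)
    first_total = sum(per_keyword.values())
    # Assumed: M keywords, each expected once per repetition.
    second_total = len(keywords) * repetitions
    return first_total, second_total, first_total / second_total
```

For example, with the keywords "apple", "river" and "window", two repetitions, and the recognized text "apple river window apple window", one occurrence of "river" is missing, so the first total quantity is 1 out of a second total quantity of 6. Plain substring counting is only reliable when no keyword contains another.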

2. The assessment method according to claim 1, characterized in that the assessing the content integrity of the voice data to be assessed according to the first total quantity and the second total quantity comprises:

Calculating the ratio of the first total quantity to the second total quantity;

3. The assessment method according to claim 1, characterized in that the pre-selected voice content is configured to cover the tones with a high frequency of use in any specified language and/or the basic pronunciations constituting the specified language.

4. The assessment method according to claim 3, characterized in that the pre-selected voice content is further configured such that the M keywords into which it is split satisfy at least one of the following conditions: the keywords differ semantically, the keywords are not repeated, no keyword contains or is contained in another keyword, and no homophones exist among the keywords.
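As a hedged illustration, two of the conditions in claim 4 (no repetition and no containing or contained relationship) can be checked mechanically, as sketched below. The function name `check_keywords` is hypothetical. The semantic and homophone conditions would require language resources that are not described here, so the sketch does not attempt them.

```python
from typing import List


def check_keywords(keywords: List[str]) -> List[str]:
    """Return a list of human-readable problems found in the keyword set.

    Only the 'not repeated' and 'no containment' conditions are checked.
    """
    problems = []
    if len(set(keywords)) != len(keywords):
        problems.append("duplicate keyword")
    # Compare distinct keywords only, so duplicates are reported once above.
    unique = sorted(set(keywords))
    for inner in unique:
        for outer in unique:
            if inner != outer and inner in outer:
                problems.append(f"{inner!r} is contained in {outer!r}")
    return problems
```

An empty result means the keyword set passes both checks. A keyword set that passes the containment check can be searched by simple substring matching without one keyword being miscounted as part of another.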

5. The assessment method according to claim 1, characterized in that the acquiring primary voice data according to pre-selected voice content comprises:

Acquiring primary voice data of a male voice or a female voice according to the pre-selected voice content.

6. The assessment method according to claim 1, characterized in that the calculating, according to the duration of the primary voice data, a second total quantity of keywords corresponding to the primary voice data comprises:

Calculating, according to the duration of the primary voice data, the number of times N that the pre-selected voice content needs to be repeated to satisfy one assessment, N being a positive integer;

7. The assessment method according to claim 6, characterized in that the calculating, according to the duration of the primary voice data, the number of times N that the pre-selected voice content needs to be repeated to satisfy one assessment comprises:

Obtaining the number of words contained in the pre-selected voice content;

Calculating the product of the number of words contained in the pre-selected voice content and the speech rate of the voice call to obtain the duration required to repeat the original voice content once;

Calculating the ratio of the duration of the primary voice data to the duration required to repeat the original voice content once, as the number of times N that the pre-selected voice content needs to be repeated to satisfy one assessment of the primary voice data.
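The calculation of N in claim 7 may be sketched as follows, purely as an example. The function name `repetitions_needed` and its parameter names are hypothetical. The sketch assumes that the speech rate is expressed as seconds per word, so that its product with the word count is a duration, and that the ratio is rounded down to an integer. The optional `gap_seconds` parameter, an assumption prompted by the blank time of claim 8, adds a pause to each repetition.

```python
def repetitions_needed(total_duration_s: float, word_count: int,
                       seconds_per_word: float,
                       gap_seconds: float = 0.0) -> int:
    """Return N, the number of repetitions that fit in the given duration.

    Assumptions: the speech rate is given in seconds per word, and the
    ratio is rounded down so that N is an integer.
    """
    if word_count < 1 or seconds_per_word <= 0 or gap_seconds < 0:
        raise ValueError("invalid content length, speech rate or gap")
    # Duration of one repetition: word count times speech rate, plus pause.
    once_s = word_count * seconds_per_word + gap_seconds
    n = int(total_duration_s // once_s)
    if n < 1:
        raise ValueError("duration is too short for a single repetition")
    return n
```

For instance, 20 words at 0.25 seconds per word take 5 seconds per repetition, so 60 seconds of primary voice data accommodate 12 repetitions, or 10 repetitions if a 1-second pause follows each one.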

8. The assessment method according to claim 6, characterized in that, in the primary voice data, a period of blank time exists between adjacent repetitions of the pre-selected voice content.

9. The assessment method according to claim 1, characterized in that the duration of the primary voice data is required to be greater than a duration threshold, the duration threshold being obtained based on the relative transmission speed of the call channel, the transmission frequency of the call channel, and the speech rate of the primary voice data.

10. The assessment method according to claim 9, characterized in that the duration threshold of the primary voice data is determined using the following formula:

T = 100 × α × max(c/νf, s)

Wherein T is the duration threshold of the primary voice data, α is a constant, c is the speed of light, ν is the relative transmission speed of the call channel, f is the transmission frequency of the call channel, and s is the speech rate of the primary voice data.
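By way of a hedged example, the threshold formula of claim 10 could be evaluated as sketched below. The function name `duration_threshold` is hypothetical. The sketch reads the term c/νf as c divided by the product of ν and f, which is an assumed grouping, and it leaves the units of ν, f and s to the caller because they are not specified here.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def duration_threshold(alpha: float, nu: float, f: float, s: float,
                       c: float = SPEED_OF_LIGHT_M_PER_S) -> float:
    """Evaluate T = 100 * alpha * max(c / (nu * f), s).

    Assumption: 'c/νf' is read as c / (nu * f); units are the caller's
    responsibility.
    """
    if nu <= 0 or f <= 0:
        raise ValueError("nu and f must be positive")
    return 100.0 * alpha * max(c / (nu * f), s)
```

The `max` term means that whichever is larger, the channel-dependent quantity or the speech rate, determines the threshold. The duration check of claim 9 then reduces to comparing the duration of the primary voice data with the returned value.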

11. An assessment device of voice quality, characterized by comprising:

A retrieval module, configured to split the pre-selected voice content into M keywords, and to search the speech text to be assessed with each of the M keywords to obtain the quantity of each keyword not correctly restored, M being a positive integer;

A first computing module, configured to add the quantities of the keywords not correctly restored to obtain a first total quantity of keywords not correctly restored;

A second computing module, configured to calculate, according to the duration of the primary voice data, a second total quantity of keywords corresponding to the primary voice data;

An evaluation module, configured to assess the content integrity of the voice data to be assessed according to the first total quantity and the second total quantity.

12. An assessment device of voice quality, comprising a memory, a processor, and a program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the assessment method of voice quality according to any one of claims 1 to 10.

13. A computer readable storage medium having a program stored thereon, characterized in that the program, when executed by a processor, implements the assessment method of voice quality according to any one of claims 1 to 10.