CN107342079A - A system for collecting real voice over the internet - Google Patents
A system for collecting real voice over the internet
- Publication number
- CN107342079A CN107342079A CN201710543472.5A CN201710543472A CN107342079A CN 107342079 A CN107342079 A CN 107342079A CN 201710543472 A CN201710543472 A CN 201710543472A CN 107342079 A CN107342079 A CN 107342079A
- Authority
- CN
- China
- Prior art keywords
- speech data
- server
- speech
- client
- statement text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G10L15/063 — Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/01 — Assessment or evaluation of speech recognition systems
- G10L15/07 — Adaptation to the speaker
- G10L15/16 — Speech classification or search using artificial neural networks
- G10L15/30 — Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
- G10L25/30 — Speech or voice analysis techniques characterised by the use of neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Signal Processing (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
A system for collecting real voice over the internet, comprising a server and a client connected over a network. The server: splits pre-stored text material into sentence texts whose length is suitable for deep neural network training; sends a sentence text to be read aloud to any non-specific user who has obtained access to the client; and receives the speech data corresponding to the sentence text. The client: sends a read-aloud request for a sentence text to the server and receives the sentence text to be read; records the user reading the sentence text aloud and sends the collected speech data to the server. Compared with existing voice-collection methods, this collection method requires no large amount of post-processing and proofreading of the recorded audio files, and each collected audio file corresponds to the text material that was read. In addition, by evaluating the collected speech data, high-quality collection of real voice is achieved.
Description
Technical field
The present invention relates to the technical field of voice collection, and in particular to a system for collecting real voice over the internet.
Background art
Speech recognition based on deep neural networks has developed rapidly in recent years and is widely applied. This technology requires pre-labeled speech data (data with word-to-voice correspondence) to be fed into a neural network for training. The quality and quantity of the labeled speech data are crucial to the recognition result: the more labeled data there is, the better the training; and the higher the quality of the labeled speech data, i.e. the closer it is to real human speech, the better the trained deep neural network recognizes real human speech.
At present, the labeled speech data sets used in deep learning mainly come from the following sources:
a. recruiting personnel to read text material aloud and recording them, so as to collect voice samples;
b. using audio files in the public domain to obtain voice samples;
c. developing a voice input method and collecting users' voice samples through it, such as the iFLYTEK voice input method;
d. providing the voice assistant of an operating system and collecting voice samples as users interact with it through a client, such as Cortana in Microsoft's desktop Windows 10 and Apple's Siri;
e. synthesizing speech directly from text material using speech synthesis technology.
The above voice-collection techniques have the following problems:
1. With recruited readers, the collected audio files must later be split into small files of about 10 seconds, and each split file must be matched to the corresponding split of the text material; this requires a large amount of post-processing and proofreading. The collection scope is also small, and only a limited number of samples can be collected each time.
2. With public-domain audio files, the files usually lack corresponding text material and are generally too large, requiring a large amount of later transcription, segmentation and proofreading.
3. With voice-input-method collection, the collected voice samples cannot be guaranteed to correspond to accurate text. The samples also vary widely in length and are mixed with many useless samples, so sample quality cannot be guaranteed and a large amount of post-processing and proofreading is needed.
4. Voice-assistant collection has the same shortcomings as voice-input-method collection.
5. With speech synthesis, the synthesized voice differs considerably from real voice, which is unfavorable for a deep neural network learning real speech.
Summary of the invention
The present application provides a system for collecting real voice over the internet, comprising a server and a client connected over a network.
The server performs:
splitting pre-stored text material into sentence texts whose length is suitable for deep neural network training;
sending a sentence text to be read aloud to a non-specific user who has obtained access to the client;
receiving speech data corresponding to the sentence text.
The client performs:
sending a read-aloud request for a sentence text to the server and receiving the sentence text to be read aloud;
recording the user reading the sentence text aloud and sending the collected speech data to the server.
In one embodiment, the system further comprises a speech evaluation module that evaluates the speech data.
In one embodiment, the speech evaluation module evaluates the speech data by computing a score for the speech data from the noise level and from whether the user read according to the sentence text.
In one embodiment, the speech evaluation module is integrated in the server, and the server further performs: evaluating the speech data; marking speech data that passes the evaluation as valid and saving it in the memory bank together with its corresponding sentence text; otherwise, marking speech data that fails the evaluation as invalid.
In one embodiment, the speech evaluation module is integrated in the client, and the client further performs: evaluating the speech data; marking speech data that passes the evaluation as valid and sending it to the server; otherwise, marking speech data that fails the evaluation as invalid.
In one embodiment, the system further comprises a third-party detection platform connected over the network to both the client and the server.
The client performs:
sending the collected speech data to the third-party detection platform.
The third-party detection platform has the speech evaluation module built in and performs:
evaluating the speech data; marking speech data that passes the evaluation as valid and forwarding it to the server; otherwise, marking speech data that fails the evaluation as invalid.
In one embodiment, the server integrates a spot-check module, and the server further performs: randomly selecting the saved valid speech data for manual spot checks.
In one embodiment, the program executed by the client runs on at least one of: a smart device, a personal computer, and a browser web page.
According to the collection system of the above embodiments, the pre-stored text material is split into sentence texts whose length is suitable for deep neural network training, a sentence text to be read aloud is sent in response to a user's read-aloud request, and the user's voice reading the sentence text is recorded. Compared with existing voice-collection methods, this example needs no later segmentation of the collected audio files and no large amount of post-processing and proofreading; moreover, each collected audio file corresponds to the text material that was read. In addition, by evaluating the collected speech data and storing and marking as valid the data that passes the evaluation, high-quality collection of real voice is achieved.
Brief description of the drawings
Fig. 1 is a working diagram of the collection system of embodiment one;
Fig. 2 is a working diagram of the collection system of embodiment two;
Fig. 3 is a working diagram of the collection system of embodiment three.
Detailed description of the embodiments
The present invention is described in further detail below through embodiments in conjunction with the accompanying drawings.
The embodiments of the present invention address the current problems that labeled real-voice data samples for training deep neural networks are scarce and that collecting such samples is expensive. This example provides a system for collecting real voice over the internet; after simple processing, the speech data it collects can be used for training, validating and testing deep-learning neural networks.
Embodiment one:
The internet-based real-voice collection system of this example comprises a server 1 and a client 2; its working diagram is shown in Fig. 1. The server 1 and the client 2 establish a network connection, and over the internet the client 2 records a non-specific user reading a sentence text aloud and sends the recording to the server 1, thereby collecting real voice. Here a non-specific user means any user: any user can register with the server 1 and request to read sentence texts aloud, which broadens the range of voice sampling.
Specifically, to avoid having to split the collected audio files and to make them better suited as samples for deep neural network speech recognition, the server 1 splits the large amount of pre-stored text material into sentence texts whose length is suitable for deep neural network training, e.g. sentence texts that take roughly 10 seconds to read aloud.
After a non-specific user registers an account through the client 2, he or she becomes a specific user who may read sentence texts aloud. For example, once the non-specific user follows the registration prompts of the client 2 and accepts its terms of use, that user obtains access to the client 2. The user can then send a read-aloud request to the server 1 through the client 2; the client 2 receives the sentence text to be read, and as the user reads it aloud, the recording hardware attached to the client 2 is triggered and records the user's real voice. When the user finishes reading, the client 2 sends the collected speech data to the server 1, which receives and stores it; the server 1 thus collects voice information matched to its corresponding text.
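The request/record/upload exchange described above can be sketched with in-memory stand-ins; class and method names are illustrative, and real audio capture and networking are replaced by placeholders:

```python
class Server:
    """In-memory stand-in for server 1 (no real networking)."""
    def __init__(self, sentence_texts):
        self.sentence_texts = list(sentence_texts)
        self._next = 0
        self.received = []  # (sentence_text, speech_data) pairs

    def assign_sentence(self):
        # Respond to a read-aloud request with the next sentence text.
        text = self.sentence_texts[self._next % len(self.sentence_texts)]
        self._next += 1
        return text

    def receive(self, sentence_text, speech_data):
        # Store the recording together with the sentence text it corresponds to.
        self.received.append((sentence_text, speech_data))


class Client:
    """Stand-in for client 2; record() is a placeholder for microphone capture."""
    def __init__(self, server):
        self.server = server

    def record(self, sentence_text):
        return f"<audio: {sentence_text}>"  # no real recording hardware here

    def read_aloud_session(self):
        sentence = self.server.assign_sentence()  # read-aloud request
        audio = self.record(sentence)             # user reads, recording triggered
        self.server.receive(sentence, audio)      # upload the collected speech data
```

Because the server hands out the text and receives the recording in the same session, every stored audio sample arrives already paired with its text, which is the point of the scheme.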
To ensure the validity of the collected voice information and to avoid useless samples, each piece of collected speech data is automatically checked and given an evaluation score; only speech data whose score exceeds a preset threshold is stored on the server 1. This example therefore also includes a speech evaluation module 3, which evaluates the speech data and decides from the evaluation score whether the collected speech data meets the requirements. The factors evaluated are the noise level and whether the user read according to the sentence text, so the speech evaluation module 3 computes the score of the speech data from the noise level and from whether the user read according to the sentence text.
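A minimal sketch of such a scoring rule, assuming both factors have already been normalized to [0, 1]; the weights and threshold are illustrative choices, not taken from the patent:

```python
def evaluate_speech(noise_level, text_similarity,
                    noise_weight=0.4, similarity_weight=0.6, threshold=0.6):
    """Score a recording from the two factors named in the text.

    noise_level: 0.0 (clean) to 1.0 (all noise), assumed pre-normalized.
    text_similarity: 0.0 to 1.0, how closely the recording matches the
    sentence text. Returns (score, passed).
    """
    score = noise_weight * (1.0 - noise_level) + similarity_weight * text_similarity
    return score, score >= threshold
```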
To evaluate whether the user read according to the sentence text, the speech evaluation module 3 uses a reference voice corresponding to that sentence text; the reference voice can be an existing collected labeled voice or a synthesized artificial voice. The speech evaluation module 3 compares the similarity between the reference voice and the collected voice and scores according to the similarity.
The similarity between the reference voice and the collected voice is computed as follows: a dynamic time warping algorithm first finds the best alignment between the features of the collected utterance and those of the reference utterance; the Levenshtein distance algorithm then computes the distance between the two sequences; the similarity between the two utterances is obtained from this distance, and the score is given according to the similarity.
In this example, the speech evaluation module 3 is integrated in the server 1. After the server 1 receives the speech data corresponding to a sentence text, it evaluates the data through the speech evaluation module 3; speech data that passes the evaluation is marked as valid and saved in the memory bank 4 together with its corresponding sentence text, while speech data that fails is marked as invalid. The server 1 may save the invalidly marked speech data in the memory bank 4 or discard it.
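The marking-and-storage step can be sketched as follows; class and field names are illustrative, not from the patent:

```python
from dataclasses import dataclass

@dataclass
class SpeechRecord:
    sentence_text: str
    speech_data: str   # audio payload (stand-in type)
    valid: bool        # valid/invalid mark from the evaluation

class MemoryBank:
    """Sketch of memory bank 4: each recording is kept together with its
    sentence text and its mark."""
    def __init__(self):
        self.records = []

    def save(self, sentence_text, speech_data, passed_evaluation):
        self.records.append(SpeechRecord(sentence_text, speech_data, passed_evaluation))

    def valid_records(self):
        return [r for r in self.records if r.valid]
```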
Further, to guarantee the quality of the uploaded validly marked speech data, the server 1 integrates a spot-check module, which randomly selects validly marked speech data stored in the memory bank 4 for manual spot checks. The information so obtained is used to adjust the scoring criteria for the speech data, and invalid samples that automatic checking cannot detect can be rejected, further improving the quality of the real-voice sample set.
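A sketch of the random draw for manual spot checks; the sampling fraction is an assumption, since the text only says the selection is random:

```python
import random

def spot_check_sample(valid_records, fraction=0.05, seed=None):
    """Draw a random subset of the stored valid speech data for manual
    review; fraction is an illustrative parameter."""
    if not valid_records:
        return []
    k = max(1, int(len(valid_records) * fraction))
    return random.Random(seed).sample(valid_records, k)
```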
It should be noted that, to broaden the sample range of the real-voice sample set, the program executed by the client 2 of this example can take the form of a stand-alone application on a mobile phone, tablet or personal computer, a functional module integrated in another application, a browser web-page application, or an executable program on customized dedicated hardware; that is, the program executed by the client runs on at least one of: a smart device, a personal computer, and a browser web page. Smart devices include but are not limited to smartphones, tablets, smart watches, game consoles, dedicated recorders and smart home-appliance controllers. Correspondingly, the program executed by the server 1 can be deployed on a dedicated server or in the cloud, i.e. the server 1 can be a cloud server or an ordinary server.
Embodiment two:
Based on embodiment one, this example differs from embodiment one in that the speech evaluation module 3 is integrated in the client 2; its working diagram is shown in Fig. 2. After the client 2 records the user reading a sentence text aloud, it evaluates the speech data through the speech evaluation module 3; speech data that passes the evaluation is marked as valid and sent to the server 1, which saves it in the memory bank 4 together with its corresponding sentence text; speech data that fails the evaluation is marked as invalid, and the client 2 can either discard it directly or transmit it to be saved in the memory bank 4 on the server 1.
Embodiment three:
Based on embodiment one, this example differs from embodiment one in that it also includes a third-party detection platform 5, connected over the network to both the client 2 and the server 1; its working diagram is shown in Fig. 3. The client 2 sends the collected speech data directly to the third-party detection platform 5, which has the speech evaluation module 3 built in and evaluates the speech data through it. Speech data that passes the evaluation is marked as valid and forwarded to the server 1, which saves it in the memory bank 4 together with its corresponding sentence text; speech data that fails the evaluation is marked as invalid, and the third-party detection platform 5 can either discard it directly or transmit it to be saved in the memory bank 4 on the server 1.
The specific examples above are used to illustrate the present invention; they are only intended to aid understanding and do not limit the present invention. For those skilled in the art, simple deductions, variations or substitutions can also be made according to the idea of the present invention.
Claims (8)
- 1. A system for collecting real voice over the internet, characterized in that it comprises: a server and a client; the server and the client are connected over a network; the server performs: splitting pre-stored text material into sentence texts whose length is suitable for deep neural network training; sending a sentence text to be read aloud to a non-specific user who has obtained access to the client; receiving speech data corresponding to the sentence text; the client performs: sending a read-aloud request for a sentence text to the server and receiving the sentence text to be read aloud; recording the user reading the sentence text aloud and sending the collected speech data to the server.
- 2. The collection system of claim 1, characterized in that it further comprises a speech evaluation module, and the speech evaluation module evaluates the speech data.
- 3. The collection system of claim 2, characterized in that the speech evaluation module evaluates the speech data by computing a score for the speech data from the noise level and from whether the user read according to the sentence text.
- 4. The collection system of claim 2, characterized in that the speech evaluation module is integrated in the server, and the server further performs: evaluating the speech data; marking speech data that passes the evaluation as valid and saving it in a memory bank together with its corresponding sentence text; and otherwise marking speech data that fails the evaluation as invalid.
- 5. The collection system of claim 2, characterized in that the speech evaluation module is integrated in the client, and the client further performs: evaluating the speech data; marking speech data that passes the evaluation as valid and sending it to the server; and otherwise marking speech data that fails the evaluation as invalid.
- 6. The collection system of claim 2, characterized in that it further comprises a third-party detection platform connected over the network to the client and to the server respectively; the client performs: sending the collected speech data to the third-party detection platform; the third-party detection platform has the speech evaluation module built in and performs: evaluating the speech data; marking speech data that passes the evaluation as valid and forwarding it to the server; and otherwise marking speech data that fails the evaluation as invalid.
- 7. The collection system of any one of claims 4-6, characterized in that the server integrates a spot-check module and further performs: randomly selecting the saved valid speech data for manual spot checks.
- 8. The collection system of claim 1, characterized in that the program executed by the client runs on at least one of: a smart device, a personal computer, and a browser web page.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710543472.5A CN107342079A (en) | 2017-07-05 | 2017-07-05 | A kind of acquisition system of the true voice based on internet |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107342079A true CN107342079A (en) | 2017-11-10 |
Family
ID=60218438
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710543472.5A Pending CN107342079A (en) | 2017-07-05 | 2017-07-05 | A kind of acquisition system of the true voice based on internet |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107342079A (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1940915A (en) * | 2005-09-29 | 2007-04-04 | 国际商业机器公司 | Corpus expansion system and method |
US20070168578A1 (en) * | 2005-10-27 | 2007-07-19 | International Business Machines Corporation | System and method for data collection interface creation and data collection administration |
CN105873050A (en) * | 2010-10-14 | 2016-08-17 | 阿里巴巴集团控股有限公司 | Wireless service identity authentication, server and system |
CN102176222A (en) * | 2011-03-18 | 2011-09-07 | 北京科技大学 | Multi-sensor information collection analyzing system and autism children monitoring auxiliary system |
CN103198828A (en) * | 2013-04-03 | 2013-07-10 | 中金数据系统有限公司 | Method and system of construction of voice corpus |
CN104754112A (en) * | 2013-12-31 | 2015-07-01 | 中兴通讯股份有限公司 | User information obtaining method and mobile terminal |
CN104732977A (en) * | 2015-03-09 | 2015-06-24 | 广东外语外贸大学 | On-line spoken language pronunciation quality evaluation method and system |
CN105825856A (en) * | 2016-05-16 | 2016-08-03 | 四川长虹电器股份有限公司 | Independent learning method for vehicle-mounted speech recognition module |
CN106023991A (en) * | 2016-05-23 | 2016-10-12 | 丽水学院 | Handheld voice interaction device and interaction method orienting to multi-task interaction |
Non-Patent Citations (1)
Title |
---|
IAN LANE等: "Tools for collecting speech corpora via Mechanical-Turk", 《PROCEEDINGS OF THE NAACL HLT 2010 WORKSHOP ON CREATING SPEECH AND LANGUAGE DATA WITH AMAZON’S MECHANICAL TURK》 * |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108010539A (en) * | 2017-12-05 | 2018-05-08 | 广州势必可赢网络科技有限公司 | Voice quality evaluation method and device based on voice activation detection |
CN107818797A (en) * | 2017-12-07 | 2018-03-20 | 苏州科达科技股份有限公司 | Voice quality assessment method, apparatus and its system |
CN107818797B (en) * | 2017-12-07 | 2021-07-06 | 苏州科达科技股份有限公司 | Voice quality evaluation method, device and system |
CN108696622A (en) * | 2018-05-28 | 2018-10-23 | 成都昊铭科技有限公司 | Voice without interface wakes up test device, system and method |
CN110797001A (en) * | 2018-07-17 | 2020-02-14 | 广州阿里巴巴文学信息技术有限公司 | Method and device for generating voice audio of electronic book and readable storage medium |
CN110797001B (en) * | 2018-07-17 | 2022-04-12 | 阿里巴巴(中国)有限公司 | Method and device for generating voice audio of electronic book and readable storage medium |
CN111210826A (en) * | 2019-12-26 | 2020-05-29 | 深圳市优必选科技股份有限公司 | Voice information processing method and device, storage medium and intelligent terminal |
CN111210826B (en) * | 2019-12-26 | 2022-08-05 | 深圳市优必选科技股份有限公司 | Voice information processing method and device, storage medium and intelligent terminal |
CN113422825A (en) * | 2021-06-22 | 2021-09-21 | 读书郎教育科技有限公司 | System and method for assisting in culturing reading interests |
CN113422825B (en) * | 2021-06-22 | 2022-11-08 | 读书郎教育科技有限公司 | System and method for assisting in culturing reading interests |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107342079A (en) | A kind of acquisition system of the true voice based on internet | |
Massucci et al. | Measuring the academic reputation through citation networks via PageRank | |
CN110457432B (en) | Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium | |
US20200073996A1 (en) | Methods and Systems for Domain-Specific Disambiguation of Acronyms or Homonyms | |
CN109783631B (en) | Community question-answer data verification method and device, computer equipment and storage medium | |
CN105956053A (en) | Network information-based search method and apparatus | |
US11182605B2 (en) | Search device, search method, search program, and recording medium | |
CN111524578A (en) | Psychological assessment device, method and system based on electronic psychological sand table | |
CN111967261B (en) | Cancer stage information processing method, device and storage medium | |
CN111916110B (en) | Voice quality inspection method and device | |
Bianchi et al. | Exploring the potentialities of automatic extraction of university webometric information | |
Kim et al. | Speech intelligibility estimation using multi-resolution spectral features for speakers undergoing cancer treatment | |
Heeringa et al. | Computational dialectology | |
Sandhya et al. | Smart attendance system using speech recognition | |
CN110738032B (en) | Method and device for generating judge paperwork thinking section | |
CN109272262A (en) | A kind of analysis method of natural language feature | |
Yang et al. | Person authentication using finger snapping—a new biometric trait | |
CN108898439A (en) | A kind of information recommendation method based on sight spot | |
KR101838089B1 (en) | Sentimetal opinion extracting/evaluating system based on big data context for finding welfare service and method thereof | |
Lei et al. | Robust scream sound detection via sound event partitioning | |
Volkova et al. | Light CNN architecture enhancement for different types spoofing attack detection | |
Heidari et al. | Investigation of the natural frequency of the structure and earthquake frequencies in the frequency domain using a discrete wavelet | |
Chen et al. | Music Feature Extraction Method Based on Internet of Things Technology and Its Application | |
Nerbonne et al. | Some further dialectometrical steps | |
SHAFIEE et al. | Research self-efficacy and career decision-making self-efficacy of students at shiraz university of medical sciences: an explanatory model |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20171110 |