CN114067787B - Voice speech speed self-adaptive recognition system - Google Patents

Voice speech speed self-adaptive recognition system

Info

Publication number
CN114067787B
Authority
CN
China
Prior art keywords
voice
character
sentence
interval
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111547185.4A
Other languages
Chinese (zh)
Other versions
CN114067787A (en)
Inventor
邹月荣
李权
汪张龙
郭清霞
李艳
许东生
杜平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Xunfei Qiming Technology Development Co ltd
Original Assignee
Guangdong Xunfei Qiming Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Xunfei Qiming Technology Development Co ltd filed Critical Guangdong Xunfei Qiming Technology Development Co ltd
Priority to CN202111547185.4A
Publication of CN114067787A
Application granted
Publication of CN114067787B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 - Adaptation
    • G10L15/07 - Adaptation to the speaker
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/04 - Segmentation; Word boundary detection
    • G10L15/05 - Word boundary detection
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/26 - Speech to text systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a voice speech-rate adaptive recognition system comprising a user input module and an adaptive processing module. The user input module receives the voice information input by a user. The adaptive processing module comprises a voice conversion unit, a character dividing unit, an analysis unit, and an adaptive processing unit: the voice conversion unit converts the voice information input by the user into text information; the character dividing unit divides the converted text information into individual characters; and the analysis unit analyzes the divided characters to obtain the parameters of the divided text information. The invention adapts recognition to the speech rates of different users, improving the overall effectiveness of voice conversion for different users and addressing the poor speech-rate adaptation of existing voice recognition.

Description

Voice speech speed self-adaptive recognition system
Technical Field
The invention relates to the technical field of speech rate recognition, and in particular to a voice speech-rate adaptive recognition system.
Background
Speech recognition is a cross-disciplinary field. Over the last two decades, speech recognition technology has made significant progress and has begun to move from the laboratory to the market. Over the next ten years, it is expected to enter fields such as industry, home appliances, communications, automotive electronics, medical treatment, home services, and consumer electronics. Speech recognition has been likened to "the auditory system of the machine": it is a technology that lets machines convert speech signals into corresponding text or commands through recognition and understanding. Speech recognition technology mainly comprises three aspects: feature extraction, pattern matching criteria, and model training.
In the prior art, because each person's speaking habits differ, the pause points within sentences and the speaking rate vary from person to person. Existing voice recognition systems cannot perform accurate recognition and conversion based on these characteristics, so characters may be dropped during recognition and conversion, and the final semantics of the text produced from the voice may be wrong.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention aims to provide a voice speech-rate adaptive recognition system that can adapt recognition to the speech rates of different users, thereby improving the overall effectiveness of voice conversion for different users and solving the problem that existing voice recognition adapts poorly to speech rate.
In order to achieve this purpose, the invention is realized by the following technical scheme: a voice speech-rate adaptive recognition system comprises a user input module and an adaptive processing module; the user input module is used for a user to input voice information;
the adaptive processing module comprises a voice conversion unit, a character dividing unit, an analysis unit, and an adaptive processing unit;
the voice conversion unit is used to convert the voice information input by the user into text information;
the character dividing unit is used to divide the converted text information into individual characters;
the analysis unit analyzes the divided characters to obtain the parameters of the divided text information;
and the adaptive processing unit performs adaptive recognition processing on the user's speech rate according to the parameters of the divided text information.
Further, the character dividing unit is configured with a character division strategy, which comprises: labeling the characters in the text information in sequence; demarcating the duration of the voice input from the input start time to the input end time; mapping the labeled characters in sequence onto this duration; taking each character's mapped time as its time mark; and using the time marks as the boundaries between characters.
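The following Python sketch illustrates this character division strategy. It is a minimal illustration only: the function name is invented, and it assumes characters are spread evenly across the input span, whereas a real system would take per-character timestamps from the voice conversion unit.

# Minimal sketch of the character division strategy described above.
# Assumption: characters are spread evenly across the input time span; a
# real system would use per-character timestamps from the recognizer.
def time_mark_characters(text, input_start, input_end):
    """Assign each character a time mark inside [input_start, input_end]."""
    step = (input_end - input_start) / len(text)
    # a character's time mark also serves as its boundary with the next one
    return [(ch, input_start + (i + 1) * step) for i, ch in enumerate(text)]

if __name__ == "__main__":
    for ch, mark in time_mark_characters("今天天气很好", 0.0, 3.0):
        print(f"{ch}  time mark = {mark:.2f} s")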
Further, the analysis unit is configured with a speech rate analysis strategy, which comprises: obtaining the interval duration between every two adjacent characters, where the interval duration is the time mark of the later character minus the time mark of the earlier character, and placing the interval durations into a time interval set;
placing interval durations less than or equal to the interval threshold into a first duration set, interval durations greater than the interval threshold and less than or equal to twice the interval threshold into a second duration set, and interval durations greater than twice the interval threshold into a third duration set;
selecting whichever of the first duration set, the second duration set, and the third duration set contains the most interval durations as the speech rate identification set;
and substituting the interval durations in the speech rate identification set into a speech rate formula to obtain a speech rate value.
Further, the speech rate formula is configured as:
Vys = a1 × (T1 + T2 + … + Tn) / n;
where Vys is the speech rate value, T1 to Tn are the interval durations in the speech rate identification set, n is the number of interval durations in the speech rate identification set, a1 is the conversion ratio of the speech rate value, and a1 is greater than zero.
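A minimal Python sketch of the speech rate analysis strategy follows. It assumes the formula reconstruction above (the original formula image is not reproduced in this text); the bucketing follows the one-times/two-times threshold rule of the strategy, and the function name and example time marks are invented for illustration.

# Sketch of the speech rate analysis strategy, assuming the reconstructed
# formula Vys = a1 * (T1 + ... + Tn) / n. The interval threshold yjg is an
# input here; its own formula appears further below.
def speech_rate_value(time_marks, yjg, a1=1.0):
    # interval duration = later character's time mark minus the earlier one's
    intervals = [b - a for a, b in zip(time_marks, time_marks[1:])]
    first, second, third = [], [], []
    for t in intervals:
        if t <= yjg:
            first.append(t)
        elif t <= 2 * yjg:
            second.append(t)
        else:
            third.append(t)
    # the set holding the most interval durations is the identification set
    ident = max((first, second, third), key=len)
    vys = a1 * sum(ident) / len(ident)
    return vys, third  # the third set later marks sentence breaks

if __name__ == "__main__":
    marks = [0.3, 0.6, 0.9, 1.2, 2.6, 2.9, 3.2]
    vys, breaks = speech_rate_value(marks, yjg=0.4)
    print(f"Vys = {vys:.2f}")  # five 0.3 s intervals -> Vys = 0.30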
Further, the interval threshold is calculated by an interval threshold formula, which is configured as:
Yjg = b1 × Tz / Sz;
where Yjg is the interval threshold, Sz is the number of characters in the text information, Tz is the duration of the voice information, b1 is the interval threshold coefficient, and b1 is greater than zero.
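As a worked example of this reconstruction: a 10-second voice input that converts into 20 characters, with b1 = 1, gives an interval threshold of Yjg = 1 × 10 / 20 = 0.5 seconds, so character intervals of up to 0.5 s fall into the first duration set.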
Further, the analysis unit is also configured with a sentence habit analysis strategy, which comprises: obtaining the interval durations in the third duration set, using them as the separation points between sentences, dividing the text information into sentences at those points, and counting the number of characters in each sentence;
placing sentences whose character count is less than or equal to the character threshold into a first character count set;
placing sentences whose character count is greater than the character threshold and less than or equal to twice the character threshold into a second character count set;
placing sentences whose character count is greater than twice the character threshold into a third character count set;
selecting whichever of the first character count set, the second character count set, and the third character count set contains the most data as the sentence habit reference set; and substituting the character counts in the sentence habit reference set into a sentence habit formula to obtain a sentence habit value.
Further, the sentence habit formula is configured as:
Pyx = c1 × (Sl1 + Sl2 + … + Slm) / m;
where Pyx is the sentence habit value, Sl1 to Slm are the character counts in the sentence habit reference set, m is the number of data in the sentence habit reference set, c1 is the sentence conversion reference value, and c1 is greater than zero.
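The sketch below illustrates the sentence habit analysis strategy under the same assumptions, taking the sentence habit value as c1 times the mean character count of the fullest set (a plausible reading, since the formula image is not reproduced here); the example sentence lengths and threshold are invented.

# Sketch of the sentence habit analysis strategy, assuming the reconstructed
# formula Pyx = c1 * (Sl1 + ... + Slm) / m. Sentence lengths are character
# counts obtained after breaking the text at third-duration-set intervals.
def sentence_habit_value(sentence_lengths, ywz, c1=1.0):
    first, second, third = [], [], []
    for count in sentence_lengths:
        if count <= ywz:
            first.append(count)
        elif count <= 2 * ywz:
            second.append(count)
        else:
            third.append(count)
    # the set with the most data is the sentence habit reference set
    reference = max((first, second, third), key=len)
    return c1 * sum(reference) / len(reference)

if __name__ == "__main__":
    print(sentence_habit_value([5, 7, 6, 14, 6], ywz=7.6))  # -> 6.0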
Further, the character threshold is calculated by a character threshold formula, which is configured as:
Ywz = d1 × Wz / Yz;
where Ywz is the character threshold, Wz is the total number of characters in the text information, Yz is the total number of sentences in the text information after sentence breaking, d1 is the character threshold conversion ratio, and d1 is greater than zero.
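For example, under this reconstruction, text information of 38 characters that breaks into 5 sentences, with d1 = 1, gives a character threshold of Ywz = 1 × 38 / 5 = 7.6 characters per sentence.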
Further, the adaptive processing unit is configured with an adaptive processing strategy, which comprises: substituting the user's speech rate value and sentence habit value into an adaptive processing formula to obtain a voice input similarity value;
marking the input voice as fast speech when its voice input similarity value is greater than twice the voice threshold;
marking the input voice as slow speech when its voice input similarity value is less than the voice threshold;
and marking the input voice as normal recognition speech when its voice input similarity value is greater than or equal to the voice threshold and less than or equal to twice the voice threshold.
Further, the adaptive processing formula is configured as: Pxs = k1 × Vys + k2 × Pyx; where Pxs is the voice input similarity value, k1 is the speech rate conversion value, k2 is the sentence habit conversion value, and k1 and k2 are both greater than zero.
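A short Python sketch of the adaptive processing strategy follows. The linear formula is stated in the text, but the weights k1 and k2 and the voice threshold used in the example are placeholder assumptions; the patent does not specify how they are chosen.

# Sketch of the adaptive processing strategy. Pxs = k1*Vys + k2*Pyx is from
# the text; the weights and voice threshold below are placeholder values.
def classify_speech(vys, pyx, k1, k2, voice_threshold):
    pxs = k1 * vys + k2 * pyx  # voice input similarity value
    if pxs > 2 * voice_threshold:
        label = "fast speech"
    elif pxs < voice_threshold:
        label = "slow speech"
    else:
        label = "normal recognition speech"
    return pxs, label

if __name__ == "__main__":
    pxs, label = classify_speech(vys=0.3, pyx=6.0, k1=0.5, k2=0.2,
                                 voice_threshold=1.0)
    print(f"Pxs = {pxs:.2f} -> {label}")  # 1.35 -> normal recognition speech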
The invention has the following beneficial effects: the adaptive processing module comprises a voice conversion unit, a character dividing unit, an analysis unit, and an adaptive processing unit. The voice conversion unit first converts the voice information input by the user into text information; the character dividing unit divides the converted text information into individual characters; the analysis unit analyzes the divided characters to obtain the parameters of the divided text information; and finally the adaptive processing unit performs adaptive recognition processing on the user's speech rate according to those parameters. Voice recognition can therefore follow each user's speech rate and sentence habits, improving recognition accuracy and ensuring the accuracy of voice-to-semantics conversion.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a schematic diagram of the connection between the system of the present invention and the user end;
FIG. 2 is a block diagram of the modules of the present invention.
In the figure: 1. an identification system; 11. a user input module; 12. an adaptive processing module; 121. a voice conversion unit; 122. a character dividing unit; 123. an analysis unit; 124. an adaptive processing unit.
Detailed Description
To make the technical means, creative features, objectives, and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
Referring to FIG. 1 and FIG. 2, in a voice speech-rate adaptive recognition system, the recognition system 1 comprises a user input module 11 and an adaptive processing module 12; the user input module 11 is used for a user to input voice information; the user input module 11 is communicatively connected to the user end 2, and the user inputs the voice information through the user end 2;
the adaptive processing module 12 includes a voice conversion unit 121, a text division unit 122, an analysis unit 123, and an adaptive processing unit 124;
the voice conversion unit 121 is configured to convert voice information input by a user into text information;
the text dividing unit 122 is configured to divide the converted text information into independent texts; the word segmentation unit 122 is configured with a word segmentation policy, which includes: the method comprises the steps of calibrating characters in character information in sequence, carrying out time length demarcation on voice input from input starting time to input ending time, sequentially corresponding the calibrated characters to the time length of the input voice, taking the time length as a time mark of the characters, and taking the time mark as a demarcation limit of each character. By corresponding the characters to the time tracks, the subsequent character duration division processing can be facilitated.
The analysis unit 123 analyzes the divided characters to obtain the parameters of the divided text information. The analysis unit 123 is configured with a speech rate analysis strategy, which comprises: obtaining the interval duration between every two adjacent characters, where the interval duration is the time mark of the later character minus the time mark of the earlier character, and placing the interval durations into a time interval set;
placing interval durations less than or equal to the interval threshold into a first duration set, interval durations greater than the interval threshold and less than or equal to twice the interval threshold into a second duration set, and interval durations greater than twice the interval threshold into a third duration set;
selecting whichever of the first duration set, the second duration set, and the third duration set contains the most interval durations as the speech rate identification set. Because this set contains the most interval durations, it best represents the interval between two characters that the user connects normally, so calculating the speech rate value from its data is the most reasonable choice.
The interval durations in the speech rate identification set are then substituted into a speech rate formula to obtain a speech rate value.
The speech rate formula is configured as:
Vys = a1 × (T1 + T2 + … + Tn) / n;
where Vys is the speech rate value, T1 to Tn are the interval durations in the speech rate identification set, n is the number of interval durations in the speech rate identification set, a1 is the conversion ratio of the speech rate value, and a1 is greater than zero. A speech rate value calculated from the interval durations in the speech rate identification set more accurately represents the user's speech rate, namely the typical duration between two characters the user connects normally.
The interval threshold is calculated by an interval threshold formula, which is configured as:
Yjg = b1 × Tz / Sz;
where Yjg is the interval threshold, Sz is the number of characters in the text information, Tz is the duration of the voice information, b1 is the interval threshold coefficient, and b1 is greater than zero. Because the interval threshold is calculated from the number of characters in the user's text information and the duration of the voice information, it is not a fixed value but is derived from each user's own characteristics, so dividing with this threshold better highlights each user's speech rate characteristics.
The analysis unit 123 is further configured with a sentence habit analysis strategy, which comprises: obtaining the interval durations in the third duration set, using them as the separation points between sentences, dividing the text information into sentences at those points, and counting the number of characters in each sentence;
placing sentences whose character count is less than or equal to the character threshold into a first character count set;
placing sentences whose character count is greater than the character threshold and less than or equal to twice the character threshold into a second character count set;
placing sentences whose character count is greater than twice the character threshold into a third character count set;
and selecting whichever of the first character count set, the second character count set, and the third character count set contains the most data as the sentence habit reference set. The data in the sentence habit reference set more accurately represent the user's speaking habits: some users pause only after a long sentence, while others habitually pause after just a few characters, so this feature is important reference data for recognizing the user's speech. The character counts in the sentence habit reference set are then substituted into a sentence habit formula to obtain a sentence habit value.
The sentence habit formula is configured as:
Pyx = c1 × (Sl1 + Sl2 + … + Slm) / m;
where Pyx is the sentence habit value, Sl1 to Slm are the character counts in the sentence habit reference set, m is the number of data in the sentence habit reference set, c1 is the sentence conversion reference value, and c1 is greater than zero. Calculating from the character counts in the sentence habit reference set more accurately represents the user's sentence habits.
The character threshold is calculated by a character threshold formula, which is configured as:
Ywz = d1 × Wz / Yz;
where Ywz is the character threshold, Wz is the total number of characters in the text information, Yz is the total number of sentences in the text information after sentence breaking, d1 is the character threshold conversion ratio, and d1 is greater than zero. Because the character threshold is calculated from the total number of characters in the text information and the total number of sentences after sentence breaking, it is not a fixed value but is derived from each user's sentence habits, so dividing with this threshold better ensures the accuracy of the division.
The adaptive processing unit 124 performs adaptive recognition processing on the user's speech rate according to the parameters of the divided text information. The adaptive processing unit 124 is configured with an adaptive processing strategy, which comprises: substituting the user's speech rate value and sentence habit value into an adaptive processing formula to obtain a voice input similarity value;
marking the input voice as fast speech when its voice input similarity value is greater than twice the voice threshold;
marking the input voice as slow speech when its voice input similarity value is less than the voice threshold;
and marking the input voice as normal recognition speech when its voice input similarity value is greater than or equal to the voice threshold and less than or equal to twice the voice threshold.
The adaptive processing formula is configured as: Pxs = k1 × Vys + k2 × Pyx; where Pxs is the voice input similarity value, k1 is the speech rate conversion value, k2 is the sentence habit conversion value, and k1 and k2 are both greater than zero. The voice input similarity value is calculated from the user's speech rate value and sentence habit value, where k1 is the weight of the speech rate value and k2 is the weight of the sentence habit value in the similarity value.
The working principle is as follows: the user inputs voice through the user input module 11, and the input voice is transmitted to the adaptive processing module 12 for processing. The voice conversion unit 121 first converts the input voice information into text information; the character dividing unit 122 divides the converted text information into individual characters; the analysis unit 123 analyzes the divided characters to obtain the parameters of the divided text information; and finally the adaptive processing unit 124 performs adaptive recognition processing on the user's speech rate according to those parameters, so that voice recognition follows each user's speech rate and sentence habits and recognition accuracy is improved.
Finally, it should be noted that the above embodiments are only specific embodiments of the invention, used to illustrate rather than limit its technical solutions, and the protection scope of the invention is not limited to them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may still modify the technical solutions described in the foregoing embodiments, or substitute equivalents for some of their technical features, within the technical scope of the present disclosure; such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the invention and should be construed as falling within it. Therefore, the protection scope of the invention shall be subject to the protection scope of the appended claims.

Claims (5)

1. A voice speech-rate adaptive recognition system, characterized in that said recognition system (1) comprises a user input module (11) and an adaptive processing module (12); the user input module (11) is used for a user to input voice information;
the adaptive processing module (12) comprises a voice conversion unit (121), a character dividing unit (122), an analysis unit (123), and an adaptive processing unit (124);
the voice conversion unit (121) is used to convert the voice information input by the user into text information;
the character dividing unit (122) is used to divide the converted text information into individual characters;
the analysis unit (123) analyzes the divided characters to obtain the parameters of the divided text information;
the adaptive processing unit (124) performs adaptive recognition processing on the user's speech rate according to the parameters of the divided text information;
the character dividing unit (122) is configured with a character division strategy, which comprises: labeling the characters in the text information in sequence; demarcating the duration of the voice input from the input start time to the input end time; mapping the labeled characters in sequence onto this duration; taking each character's mapped time as its time mark; and using the time marks as the boundaries between characters;
the analysis unit (123) is configured with a speech rate analysis strategy, which comprises: obtaining the interval duration between every two adjacent characters, where the interval duration is the time mark of the later character minus the time mark of the earlier character, and placing the interval durations into a time interval set;
placing interval durations less than or equal to the interval threshold into a first duration set, interval durations greater than the interval threshold and less than or equal to twice the interval threshold into a second duration set, and interval durations greater than twice the interval threshold into a third duration set;
selecting whichever of the first duration set, the second duration set, and the third duration set contains the most interval durations as the speech rate identification set;
substituting the interval durations in the speech rate identification set into a speech rate formula to obtain a speech rate value;
the analysis unit (123) is further configured with a sentence habit analysis strategy, which comprises: obtaining the interval durations in the third duration set, using them as the separation points between sentences, dividing the text information into sentences at those points, and counting the number of characters in each sentence;
placing sentences whose character count is less than or equal to the character threshold into a first character count set;
placing sentences whose character count is greater than the character threshold and less than or equal to twice the character threshold into a second character count set;
placing sentences whose character count is greater than twice the character threshold into a third character count set;
selecting whichever of the first character count set, the second character count set, and the third character count set contains the most data as the sentence habit reference set; and substituting the character counts in the sentence habit reference set into a sentence habit formula to obtain a sentence habit value;
the character threshold is calculated by a character threshold formula, which is configured as:
Ywz = d1 × Wz / Yz;
where Ywz is the character threshold, Wz is the total number of characters in the text information, Yz is the total number of sentences in the text information after sentence breaking, d1 is the character threshold conversion ratio, and d1 is greater than zero;
the adaptive processing unit (124) is configured with an adaptive processing strategy, which comprises: substituting the user's speech rate value and sentence habit value into an adaptive processing formula to obtain a voice input similarity value;
marking the input voice as fast speech when its voice input similarity value is greater than twice the voice threshold;
marking the input voice as slow speech when its voice input similarity value is less than the voice threshold;
and marking the input voice as normal recognition speech when its voice input similarity value is greater than or equal to the voice threshold and less than or equal to twice the voice threshold.
2. The voice speech-rate adaptive recognition system according to claim 1, wherein the speech rate formula is configured as:
Vys = a1 × (T1 + T2 + … + Tn) / n;
where Vys is the speech rate value, T1 to Tn are the interval durations in the speech rate identification set, n is the number of interval durations in the speech rate identification set, a1 is the conversion ratio of the speech rate value, and a1 is greater than zero.
3. The voice speech-rate adaptive recognition system according to claim 1, wherein the interval threshold is calculated by an interval threshold formula, which is configured as:
Yjg = b1 × Tz / Sz;
where Yjg is the interval threshold, Sz is the number of characters in the text information, Tz is the duration of the voice information, b1 is the interval threshold coefficient, and b1 is greater than zero.
4. The voice speech-rate adaptive recognition system according to claim 1, wherein the sentence habit formula is configured as:
Pyx = c1 × (Sl1 + Sl2 + … + Slm) / m;
where Pyx is the sentence habit value, Sl1 to Slm are the character counts in the sentence habit reference set, m is the number of data in the sentence habit reference set, c1 is the sentence conversion reference value, and c1 is greater than zero.
5. The voice speech-rate adaptive recognition system according to claim 1, wherein the adaptive processing formula is configured as: Pxs = k1 × Vys + k2 × Pyx; where Pxs is the voice input similarity value, k1 is the speech rate conversion value, k2 is the sentence habit conversion value, and k1 and k2 are both greater than zero.
CN202111547185.4A 2021-12-17 2021-12-17 Voice speech speed self-adaptive recognition system Active CN114067787B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111547185.4A CN114067787B (en) 2021-12-17 2021-12-17 Voice speech speed self-adaptive recognition system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111547185.4A CN114067787B (en) 2021-12-17 2021-12-17 Voice speech speed self-adaptive recognition system

Publications (2)

Publication Number Publication Date
CN114067787A CN114067787A (en) 2022-02-18
CN114067787B (en) 2022-07-05

Family

ID=80229756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111547185.4A Active CN114067787B (en) 2021-12-17 2021-12-17 Voice speech speed self-adaptive recognition system

Country Status (1)

Country Link
CN (1) CN114067787B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110060665A (en) * 2019-03-15 2019-07-26 上海拍拍贷金融信息服务有限公司 Word speed detection method and device, and readable storage medium
CN110164420A (en) * 2018-08-02 2019-08-23 腾讯科技(深圳)有限公司 A kind of method and device of the method for speech recognition, voice punctuate

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001184100A (en) * 1999-12-24 2001-07-06 Anritsu Corp Speaking speed converting device
JP2003066991A (en) * 2001-08-22 2003-03-05 Seiko Epson Corp Method and apparatus for outputting voice recognition result and recording medium with program for outputting and processing voice recognition result recorded thereon
US7412378B2 (en) * 2004-04-01 2008-08-12 International Business Machines Corporation Method and system of dynamically adjusting a speech output rate to match a speech input rate
CN1841496A (en) * 2005-03-31 2006-10-04 株式会社东芝 Method and apparatus for measuring speech speed and recording apparatus therefor
CN102543063B (en) * 2011-12-07 2013-07-24 华南理工大学 Method for estimating speech speed of multiple speakers based on segmentation and clustering of speakers
CN103400580A (en) * 2013-07-23 2013-11-20 华南理工大学 Method for estimating importance degree of speaker in multiuser session voice
US9311932B2 (en) * 2014-01-23 2016-04-12 International Business Machines Corporation Adaptive pause detection in speech recognition
KR102072235B1 (en) * 2016-12-08 2020-02-03 한국전자통신연구원 Automatic speaking rate classification method and speech recognition system using thereof
CN111179910A (en) * 2019-12-17 2020-05-19 深圳追一科技有限公司 Speech rate recognition method and apparatus, server, and computer-readable storage medium
CN112466332A (en) * 2020-11-13 2021-03-09 阳光保险集团股份有限公司 Method and device for scoring speed, electronic equipment and storage medium
CN112599152B (en) * 2021-03-05 2021-06-08 北京智慧星光信息技术有限公司 Voice data labeling method, system, electronic equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110164420A (en) * 2018-08-02 2019-08-23 腾讯科技(深圳)有限公司 A kind of method and device of the method for speech recognition, voice punctuate
CN110060665A (en) * 2019-03-15 2019-07-26 上海拍拍贷金融信息服务有限公司 Word speed detection method and device, and readable storage medium

Also Published As

Publication number Publication date
CN114067787A (en) 2022-02-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant