CN111339757A - Error correction method for speech recognition results in a debt collection scenario - Google Patents
- Publication number
- CN111339757A CN111339757A CN202010089898.XA CN202010089898A CN111339757A CN 111339757 A CN111339757 A CN 111339757A CN 202010089898 A CN202010089898 A CN 202010089898A CN 111339757 A CN111339757 A CN 111339757A
- Authority
- CN
- China
- Prior art keywords
- text
- corrected
- field
- word
- collection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses an error correction method for speech recognition results in a debt collection scenario, comprising the following steps: step 1, generating a dictionary specific to the debt collection domain; step 2, training an HMM on the call corpus between collection agents and customers in the debt collection domain: the call corpus is manually labeled and organized to a certain extent, then used as training samples to compute the initial state probabilities, transition probabilities, and emission probabilities; step 3, generating the candidate texts to be corrected; step 4, generating the set of corrected candidate texts; and step 5, error correction, i.e. text screening: each determined candidate text to be corrected is first replaced with its corresponding set of corrected candidates, then the trained HMM is decoded with a decoding algorithm to compute the final corrected text. The invention corrects collection agents' speech recognition results well and facilitates large-scale production use of speech recognition products.
Description
Technical Field
The invention relates to the technical field of speech recognition, and in particular to an error correction method for speech recognition results in a debt collection scenario.
Background
With the spread of deep learning, great breakthroughs have been made in computer vision, speech recognition, natural language processing, and related areas. Taking speech recognition as an example, its accuracy has reached roughly 97%, so its range of applications keeps broadening. In the field of financial debt collection, collection agents and customers conduct a large volume of phone calls. The call audio must be converted into call transcripts by automatic speech recognition (ASR); on the one hand, the transcripts undergo quality-inspection analysis to ensure compliance, and on the other hand, they are analyzed and mined as text, laying a solid foundation for subsequently improving collection effectiveness.
In real voice interaction, the speech recognition error rate is high due to factors such as users' non-standard Mandarin, background noise, and missing domain-specific vocabulary. The prior art focuses on improving recognition accuracy but lacks means of correcting the recognition results, which greatly hinders large-scale production use of speech recognition products.
Disclosure of Invention
The aim of the present invention is to provide an error correction method for speech recognition results in a debt collection scenario, so as to solve the problems raised in the Background section.
To achieve this aim, the invention provides the following technical scheme: an error correction method for speech recognition results in a debt collection scenario, comprising the following steps:
step 1, generate the debt-collection-domain dictionary: gather statistics over the call corpus between collection agents and customers in the debt collection domain, with a certain amount of manual labeling, organizing, and desensitization; the corpus serves as training material and, after processing of the organized corpus text and curation by domain service specialists, forms the debt-collection-domain dictionary;
step 2, train an HMM on the call corpus between collection agents and customers in the debt collection domain: the call corpus is manually labeled and organized to a certain extent, then used as training samples to compute the initial state probabilities, transition probabilities, and emission probabilities;
step 3, generate the candidate texts to be corrected: segment the speech recognition transcript into words, and check whether each word appears in the user dictionary to decide whether it becomes a candidate text to be corrected;
step 4, generate the set of corrected candidate texts: first convert each candidate text to be corrected into pinyin, then use an edit distance algorithm to compute the edit distance between that pinyin and the pinyin of each word in the debt-collection-domain dictionary, in order to decide whether the original candidate text is retained;
and step 5, error correction, i.e. text screening: first replace each determined candidate text to be corrected with its corresponding set of corrected candidates, then decode with the trained HMM combined with a decoding algorithm to compute the final corrected text.
Preferably, the rule for deciding whether a word becomes a candidate text to be corrected is: if the word is in the debt-collection-domain dictionary, no correction is performed; otherwise, the word is taken as a candidate text to be corrected.
Preferably, the rule for deciding whether to retain an original candidate text to be corrected is: if the edit distance is smaller than the threshold, the matched dictionary word is used as a corrected candidate; if the edit distance is larger than the threshold, the original candidate text to be corrected is retained for the time being.
Preferably, processing the organized corpus text includes text word segmentation, part-of-speech tagging, word-frequency statistics, pinyin annotation, and similar-word retrieval.
Preferably, the decoding algorithm used for decoding is the Viterbi algorithm.
Compared with the prior art, the invention has the following beneficial effect:
the invention corrects collection agents' speech recognition results well and facilitates large-scale production use of speech recognition products.
Drawings
FIG. 1 is a flow chart of the invention;
FIG. 2 is a flow chart of constructing the debt-collection-domain dictionary according to the invention;
FIG. 3 is the flow of screening the final corrected text.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIGS. 1-3, the invention provides the following technical scheme: an error correction method for speech recognition results in a debt collection scenario, comprising the following steps:
step 1, generate the debt-collection-domain dictionary: gather statistics over the call corpus between collection agents and customers in the debt collection domain, with a certain amount of manual labeling, organizing, and desensitization; the corpus serves as training material and, after processing of the organized corpus text and curation by domain service specialists, forms the debt-collection-domain dictionary;
step 2, train an HMM (hidden Markov model) on the call corpus between collection agents and customers in the debt collection domain: the call corpus is manually labeled and organized to a certain extent, then used as training samples to compute the initial state probabilities, transition probabilities, and emission probabilities;
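The three parameter sets in step 2 can be estimated by simple frequency counting over the labeled corpus. The following is a minimal Python sketch; the representation of the corpus as (recognized word, intended word) pairs is an illustrative assumption, not the patent's actual annotation format.

```python
from collections import Counter, defaultdict

def train_hmm(labeled_sentences):
    """Estimate HMM parameters by frequency counting.

    labeled_sentences: list of sentences, each a list of
    (observed word, hidden/intended word) pairs -- an assumed format.
    """
    init = Counter()              # counts of sentence-initial states
    trans = defaultdict(Counter)  # counts of state -> next state
    emit = defaultdict(Counter)   # counts of state -> observed word
    for sent in labeled_sentences:
        states = [s for _, s in sent]
        init[states[0]] += 1
        for prev, nxt in zip(states, states[1:]):
            trans[prev][nxt] += 1
        for obs, state in sent:
            emit[state][obs] += 1

    def norm(counter):
        total = sum(counter.values())
        return {k: v / total for k, v in counter.items()}

    return (norm(init),
            {s: norm(c) for s, c in trans.items()},
            {s: norm(c) for s, c in emit.items()})

# Toy corpus: two annotated "calls" (assumed data, for illustration only)
corpus = [[("qian", "money"), ("huan", "repay")],
          [("qian", "money"), ("kuan", "loan")]]
pi, A, B = train_hmm(corpus)
```

In practice the counts would be smoothed (e.g. add-one smoothing) so that transitions unseen in the training corpus do not receive zero probability.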
step 3, generate the candidate texts to be corrected: segment the speech recognition transcript into words, and check whether each word appears in the user dictionary to decide whether it becomes a candidate text to be corrected;
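Step 3 reduces to a membership test after segmentation. A minimal sketch, assuming the transcript has already been segmented into a word list (in practice a Chinese word segmenter would produce it); the dictionary and transcript values are illustrative assumptions:

```python
def find_candidates(words, domain_dict):
    """Words absent from the domain dictionary become
    candidate texts to be corrected."""
    return [w for w in words if w not in domain_dict]

# Illustrative dictionary and segmented transcript (assumed values)
domain_dict = {"overdue", "repayment", "installment"}
segmented = ["overdue", "repaymant", "installment"]
candidates = find_candidates(segmented, domain_dict)
```

Here "repaymant" is the only word outside the dictionary, so it alone becomes a candidate for correction.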
step 4, generate the set of corrected candidate texts: first convert each candidate text to be corrected into pinyin, then use an edit distance algorithm to compute the edit distance between that pinyin and the pinyin of each word in the debt-collection-domain dictionary, in order to decide whether the original candidate text is retained;
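The edit distance in step 4 is the standard Levenshtein distance; a sketch over pinyin strings:

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming (two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]
```

For example, the pinyin strings "huankuan" and "huankuang" differ by a single inserted letter, so their distance is 1.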
and step 5, error correction, i.e. text screening: first replace each determined candidate text to be corrected with its corresponding set of corrected candidates, then decode with the trained HMM (hidden Markov model) combined with a decoding algorithm to compute the final corrected text.
Specifically, the rule for deciding whether a word becomes a candidate text to be corrected is: if the word is in the debt-collection-domain dictionary, no correction is performed; if it is not, the word is taken as a candidate text to be corrected.
Specifically, the rule for deciding whether to retain an original candidate text to be corrected is as follows: if the edit distance is smaller than the threshold, the matched dictionary word is used as a corrected candidate; if it is larger, the original candidate text to be corrected is retained for the time being. Concretely, the edit distances of both the full pinyin and the initial-letter pinyin are computed; a dictionary word whose two distances are each smaller than the respective threshold is taken as a corrected candidate, and otherwise the dictionary words whose edit distances rank in the top three are taken as corrected candidates. After this step, each candidate text to be corrected has one or more corrected candidates, which must be screened further.
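The retention rule above can be sketched as follows; the threshold values, the lexicon layout (word mapped to full pinyin and initial-letter pinyin), and all names are illustrative assumptions rather than the patent's concrete settings:

```python
def levenshtein(a, b):
    """Standard Levenshtein edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def corrected_candidates(cand_full, cand_init, lexicon, t_full=2, t_init=1):
    """lexicon: word -> (full pinyin, initial-letter pinyin).
    Dictionary words within both thresholds join the corrected set;
    otherwise the three closest words are used as a fallback."""
    scored = [(levenshtein(cand_full, full), levenshtein(cand_init, init), w)
              for w, (full, init) in lexicon.items()]
    hits = [w for d_full, d_init, w in scored
            if d_full < t_full and d_init < t_init]
    if hits:
        return hits
    return [w for _, _, w in sorted(scored)[:3]]  # top three by distance

# Illustrative lexicon (assumed words and pinyin)
lexicon = {"huankuan": ("huankuan", "hk"),
           "daikuan": ("daikuan", "dk"),
           "yuqi": ("yuqi", "yq")}
```

With this lexicon, a misrecognized "huankuang" matches "huankuan" (full-pinyin distance 1, initial-letter distance 0), so the corrected candidate set is ["huankuan"].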
Specifically, processing the organized corpus text includes text word segmentation, part-of-speech tagging, word-frequency statistics, pinyin annotation, and similar-word retrieval: the text is segmented into words, stop words are removed, the words are sorted by frequency in descending order, and the part of speech and pinyin of each word are recorded.
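The frequency-sorting part of this pipeline (drop stop words, count frequencies, sort in descending order) can be sketched as; the stop-word list and input are illustrative assumptions:

```python
from collections import Counter

STOP_WORDS = {"the", "a", "of"}  # illustrative stop-word list

def build_domain_dict(tokenized_calls):
    """Drop stop words, count word frequencies over all calls,
    and return the words sorted by descending frequency."""
    freq = Counter(word
                   for call in tokenized_calls
                   for word in call
                   if word not in STOP_WORDS)
    return [word for word, _ in freq.most_common()]
```

Part-of-speech tags and pinyin would then be attached to each surviving word, e.g. with an NLP toolkit's tagger and a pinyin conversion library.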
Specifically, the decoding algorithm used for decoding is the Viterbi algorithm, which searches efficiently.
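A compact sketch of standard Viterbi decoding over HMM parameters of the kind produced in step 2; the tiny model below is assumed purely for illustration:

```python
def viterbi(obs, states, pi, A, B):
    """Return the most likely hidden-state sequence for obs."""
    # Probability of the best path ending in each state, first observation
    V = [{s: pi.get(s, 0.0) * B.get(s, {}).get(obs[0], 0.0) for s in states}]
    path = {s: [s] for s in states}
    for o in obs[1:]:
        V.append({})
        new_path = {}
        for s in states:
            # Best predecessor for state s under observation o
            p, prev = max((V[-2][ps] * A.get(ps, {}).get(s, 0.0)
                           * B.get(s, {}).get(o, 0.0), ps) for ps in states)
            V[-1][s] = p
            new_path[s] = path[prev] + [s]
        path = new_path
    best = max(V[-1], key=V[-1].get)
    return path[best]

# Tiny assumed model: "qian" is emitted by state "money", "huan" by "repay"
states = ["money", "repay"]
pi = {"money": 1.0}
A = {"money": {"repay": 1.0}}
B = {"money": {"qian": 1.0}, "repay": {"huan": 1.0}}
best_path = viterbi(["qian", "huan"], states, pi, A, B)
```

Real implementations work in log space, since products of many small probabilities underflow on long sentences.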
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (5)
1. An error correction method for speech recognition results in a debt collection scenario, characterized by comprising the following steps:
step 1, generate the debt-collection-domain dictionary: gather statistics over the call corpus between collection agents and customers in the debt collection domain, with a certain amount of manual labeling, organizing, and desensitization; the corpus serves as training material and, after processing of the organized corpus text and curation by domain service specialists, forms the debt-collection-domain dictionary;
step 2, train an HMM on the call corpus between collection agents and customers in the debt collection domain: the call corpus is manually labeled and organized to a certain extent, then used as training samples to compute the initial state probabilities, transition probabilities, and emission probabilities;
step 3, generate the candidate texts to be corrected: segment the speech recognition transcript into words, and check whether each word appears in the user dictionary to decide whether it becomes a candidate text to be corrected;
step 4, generate the set of corrected candidate texts: first convert each candidate text to be corrected into pinyin, then use an edit distance algorithm to compute the edit distance between that pinyin and the pinyin of each word in the debt-collection-domain dictionary, in order to decide whether the original candidate text is retained;
and step 5, error correction, i.e. text screening: first replace each determined candidate text to be corrected with its corresponding set of corrected candidates, then decode with the trained HMM combined with a decoding algorithm to compute the final corrected text.
2. The method according to claim 1, characterized in that the rule for deciding whether to add a candidate text to be corrected is: if a word is in the debt-collection-domain dictionary, no correction is performed; if it is not, the word is taken as a candidate text to be corrected.
3. The method according to claim 2, characterized in that: if the edit distance is smaller than the threshold, the matched dictionary word is used as a corrected candidate; if the edit distance is greater than the threshold, the original candidate text to be corrected is retained first.
4. The method according to claim 3, characterized in that processing the organized corpus text includes text word segmentation, part-of-speech tagging, word-frequency statistics, pinyin annotation, and similar-word retrieval.
5. The method according to claim 2, characterized in that the decoding algorithm used for decoding is the Viterbi algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010089898.XA CN111339757A (en) | 2020-02-13 | 2020-02-13 | Error correction method for voice recognition result in collection scene |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010089898.XA CN111339757A (en) | 2020-02-13 | 2020-02-13 | Error correction method for voice recognition result in collection scene |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111339757A true CN111339757A (en) | 2020-06-26 |
Family
ID=71183440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010089898.XA Pending CN111339757A (en) | 2020-02-13 | 2020-02-13 | Error correction method for voice recognition result in collection scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111339757A (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6018708A (en) * | 1997-08-26 | 2000-01-25 | Nortel Networks Corporation | Method and apparatus for performing speech recognition utilizing a supplementary lexicon of frequently used orthographies |
CN101650886A (en) * | 2008-12-26 | 2010-02-17 | 中国科学院声学研究所 | Method for automatically detecting reading errors of language learners |
CN105975625A (en) * | 2016-05-26 | 2016-09-28 | 同方知网数字出版技术股份有限公司 | Chinglish inquiring correcting method and system oriented to English search engine |
CN105976818A (en) * | 2016-04-26 | 2016-09-28 | Tcl集团股份有限公司 | Instruction identification processing method and apparatus thereof |
CN106959977A (en) * | 2016-01-12 | 2017-07-18 | 广州市动景计算机科技有限公司 | Candidate collection computational methods and device, word error correction method and device in word input |
CN107729321A (en) * | 2017-10-23 | 2018-02-23 | 上海百芝龙网络科技有限公司 | A kind of method for correcting error of voice identification result |
CN108304385A (en) * | 2018-02-09 | 2018-07-20 | 叶伟 | A kind of speech recognition text error correction method and device |
CN109670148A (en) * | 2018-09-26 | 2019-04-23 | 平安科技(深圳)有限公司 | Collection householder method, device, equipment and storage medium based on speech recognition |
CN109933778A (en) * | 2017-12-18 | 2019-06-25 | 北京京东尚科信息技术有限公司 | Segmenting method, device and computer readable storage medium |
CN110210029A (en) * | 2019-05-30 | 2019-09-06 | 浙江远传信息技术股份有限公司 | Speech text error correction method, system, equipment and medium based on vertical field |
2020-02-13: application CN202010089898.XA filed in China; publication CN111339757A, status active (Pending)
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111985213A (en) * | 2020-09-07 | 2020-11-24 | 科大讯飞华南人工智能研究院(广州)有限公司 | Method and device for correcting voice customer service text |
CN111985213B (en) * | 2020-09-07 | 2024-05-28 | 科大讯飞华南人工智能研究院(广州)有限公司 | Voice customer service text error correction method and device |
CN112382289A (en) * | 2020-11-13 | 2021-02-19 | 北京百度网讯科技有限公司 | Method and device for processing voice recognition result, electronic equipment and storage medium |
CN112382289B (en) * | 2020-11-13 | 2024-03-22 | 北京百度网讯科技有限公司 | Speech recognition result processing method and device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107315737B (en) | Semantic logic processing method and system | |
CN102760436B (en) | Voice lexicon screening method | |
CN111209401A (en) | System and method for classifying and processing sentiment polarity of online public opinion text information | |
CN110853649A (en) | Label extraction method, system, device and medium based on intelligent voice technology | |
CN111489765A (en) | Telephone traffic service quality inspection method based on intelligent voice technology | |
CN112259083B (en) | Audio processing method and device | |
CN110413998B (en) | Self-adaptive Chinese word segmentation method oriented to power industry, system and medium thereof | |
CN112712349A (en) | Intelligent paperless conference data information processing method based on artificial intelligence and big data analysis | |
CN111339757A (en) | Error correction method for voice recognition result in collection scene | |
CN112002328A (en) | Subtitle generating method and device, computer storage medium and electronic equipment | |
CN113221542A (en) | Chinese text automatic proofreading method based on multi-granularity fusion and Bert screening | |
CN111737424A (en) | Question matching method, device, equipment and storage medium | |
CN112967710B (en) | Low-resource customer dialect point identification method | |
CN114239579A (en) | Electric power searchable document extraction method and device based on regular expression and CRF model | |
CN113037934A (en) | Hot word analysis system based on call recording of call center | |
Behre et al. | Streaming punctuation for long-form dictation with transformers | |
CN112231440A (en) | Voice search method based on artificial intelligence | |
CN117292680A (en) | Voice recognition method for power transmission operation detection based on small sample synthesis | |
CN116665674A (en) | Internet intelligent recruitment publishing method based on voice and pre-training model | |
CN111785236A (en) | Automatic composition method based on motivational extraction model and neural network | |
CN114254628A (en) | Method and device for quickly extracting hot words by combining user text in voice transcription, electronic equipment and storage medium | |
CN114707515A (en) | Method and device for judging dialect, electronic equipment and storage medium | |
CN110858268B (en) | Method and system for detecting unsmooth phenomenon in voice translation system | |
CN112488593A (en) | Auxiliary bid evaluation system and method for bidding | |
CN113011183A (en) | Unstructured text data processing method and system in electric power regulation and control field |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200626 |