CN111339757A - Error correction method for voice recognition result in collection scene - Google Patents

Error correction method for voice recognition result in collection scene

Publication number
CN111339757A
Authority
CN
China
Prior art keywords
text
corrected
field
word
collection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010089898.XA
Other languages
Chinese (zh)
Inventor
鲁进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Kaian Information Technology Co ltd
Original Assignee
Shanghai Kaian Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Kaian Information Technology Co ltd
Priority to CN202010089898.XA
Publication of CN111339757A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method for correcting errors in speech recognition results in a collection scenario, comprising the following steps: step 1, generating a dictionary specific to the collection domain; step 2, training an HMM on the corpus of calls between collectors and customers in the collection domain: the call corpus is manually labeled and sorted, then used as training samples to calculate the initial state probabilities, transition probabilities, and emission probabilities; step 3, generating the text to be corrected; step 4, generating the set of candidate corrected texts; and step 5, screening the corrected text: each identified candidate text to be corrected is first replaced with its corresponding set of candidate corrected texts, and the trained HMM is then decoded with a decoding algorithm to compute the final corrected text. The invention corrects speech recognition results for collection calls well and facilitates large-scale production use of speech recognition products.

Description

Error correction method for voice recognition result in collection scene
Technical Field
The invention relates to the technical field of speech recognition, and in particular to a method for correcting errors in speech recognition results in a collection scenario.
Background
With the popularization of deep learning, major breakthroughs have been made in computer vision, speech recognition, natural language processing, and related areas. Taking speech recognition as an example, accuracy has reached 97%, so its range of applications keeps widening. In the field of financial collection, collectors and customers communicate extensively by telephone. The call audio must be converted into call text by automatic speech recognition (ASR). On the one hand, quality inspection is performed on the call text to ensure compliance; on the other hand, text analysis and mining are performed on the call text, laying a solid foundation for subsequently improving collection results.
In actual voice interaction, however, the speech recognition error rate is high owing to factors such as users' non-standard Mandarin, noise, and missing domain-specific vocabulary. The prior art focuses on improving recognition accuracy but lacks a means of correcting errors in the recognition result, which greatly hinders large-scale production use of speech recognition products.
Disclosure of Invention
The present invention aims to provide a method for correcting errors in speech recognition results in a collection scenario, so as to solve the problems raised in the Background.
To achieve this purpose, the invention provides the following technical solution: a method for correcting errors in speech recognition results in a collection scenario, comprising the following steps:
step 1, generating a dictionary specific to the collection domain: the corpus of calls between collectors and customers in the collection domain is gathered, manually labeled, sorted, and desensitized, and then used as the training corpus; the sorted corpus text is processed and then organized by dedicated business personnel to form the collection-domain dictionary;
step 2, training an HMM on the corpus of calls between collectors and customers in the collection domain: the call corpus is manually labeled and sorted, and then used as training samples to calculate the initial state probabilities, transition probabilities, and emission probabilities;
step 3, generating the text to be corrected: the speech recognition text is segmented into words, and each word is checked against the collection-domain dictionary to decide whether it is added as a candidate text to be corrected;
step 4, generating the set of candidate corrected texts: the candidate text to be corrected is first converted into pinyin, and an edit-distance algorithm is then used to calculate the edit distance between its pinyin and the pinyin of each word in the collection-domain dictionary, in order to decide whether the original candidate text to be corrected is retained;
and step 5, screening the corrected text: each identified candidate text to be corrected is first replaced with its corresponding set of candidate corrected texts, and the trained HMM is then decoded with a decoding algorithm to compute the final corrected text.
Preferably, whether a word is added as a candidate text to be corrected is decided as follows: if the word is in the collection-domain dictionary, no correction is performed; otherwise, the word is taken as a candidate text to be corrected.
Preferably, whether the original candidate text to be corrected is retained is decided as follows: if the edit distance is smaller than the threshold, the dictionary word is added to the set of candidate corrected texts; if the edit distance is larger than the threshold, the original candidate text to be corrected is retained for the time being.
Preferably, processing the sorted corpus text comprises word segmentation, part-of-speech tagging, word-frequency statistics, pinyin annotation, and similar-word retrieval.
Preferably, the decoding algorithm used for decoding is the Viterbi algorithm.
Compared with the prior art, the invention has the following beneficial effect:
The invention corrects speech recognition results for collection calls well and facilitates large-scale production use of speech recognition products.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of the construction of the collection-domain dictionary according to the present invention;
FIG. 3 is a flow chart of the final screening of the corrected text.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIGS. 1-3, the present invention provides a technical solution: a method for correcting errors in speech recognition results in a collection scenario, comprising the following steps:
step 1, generating a dictionary specific to the collection domain: the corpus of calls between collectors and customers in the collection domain is gathered, manually labeled, sorted, and desensitized, and then used as the training corpus; the sorted corpus text is processed and then organized by dedicated business personnel to form the collection-domain dictionary;
step 2, training an HMM (hidden Markov model) on the corpus of calls between collectors and customers in the collection domain: the call corpus is manually labeled and sorted, and then used as training samples to calculate the initial state probabilities, transition probabilities, and emission probabilities;
step 3, generating the text to be corrected: the speech recognition text is segmented into words, and each word is checked against the collection-domain dictionary to decide whether it is added as a candidate text to be corrected;
step 4, generating the set of candidate corrected texts: the candidate text to be corrected is first converted into pinyin, and an edit-distance algorithm is then used to calculate the edit distance between its pinyin and the pinyin of each word in the collection-domain dictionary, in order to decide whether the original candidate text to be corrected is retained;
and step 5, screening the corrected text: each identified candidate text to be corrected is first replaced with its corresponding set of candidate corrected texts, and the trained HMM (hidden Markov model) is then decoded with a decoding algorithm to compute the final corrected text.
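As an illustration of step 2, the three probability tables could be estimated by relative-frequency counting over the labeled calls. The following is only a minimal sketch under assumptions the source does not state: hidden states are taken to be the manually corrected words, observations are the words output by the recognizer, and the function name `train_hmm` and the input layout are hypothetical.

```python
from collections import Counter, defaultdict


def train_hmm(labeled_sentences):
    """Estimate HMM tables by relative frequency (no smoothing).

    labeled_sentences: list of sentences, each a list of
    (recognized_word, correct_word) pairs. This layout is an assumption;
    the source only says the corpus is manually labeled.
    """
    initial = Counter()                # counts of the first state per sentence
    transition = defaultdict(Counter)  # counts of state -> next state
    emission = defaultdict(Counter)    # counts of state -> recognized word

    for sentence in labeled_sentences:
        previous = None
        for recognized, correct in sentence:
            if previous is None:
                initial[correct] += 1
            else:
                transition[previous][correct] += 1
            emission[correct][recognized] += 1
            previous = correct

    def normalize(counter):
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    return (
        normalize(initial),
        {state: normalize(c) for state, c in transition.items()},
        {state: normalize(c) for state, c in emission.items()},
    )
```

In practice the counts would usually be smoothed so that unseen transitions do not receive zero probability; that choice is outside what the source describes.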
Specifically, whether a word is added as a candidate text to be corrected is decided as follows: if the word is in the collection-domain dictionary, no correction is performed; if it is not, the word is taken as a candidate text to be corrected.
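The membership check can be sketched in a few lines. This example assumes the recognized text has already been segmented into words (the source does not name a segmenter), and the function name `find_candidates` is hypothetical.

```python
def find_candidates(words, domain_dictionary):
    """Return (position, word) for every segmented word that is missing
    from the domain dictionary; these are the candidates to be corrected.

    words: list of already-segmented words (segmentation is assumed done).
    domain_dictionary: set of words in the collection-domain dictionary.
    """
    return [
        (position, word)
        for position, word in enumerate(words)
        if word not in domain_dictionary
    ]
```

Keeping the position alongside the word lets a later step substitute the candidate corrected texts back into the right place in the sentence.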
Specifically, whether the original candidate text to be corrected is retained is decided as follows: if the edit distance is smaller than the threshold, the dictionary word is taken as a candidate corrected text; if the edit distance is larger than the threshold, the original candidate text to be corrected is retained for the time being. The edit distances of the full pinyin and of the pinyin initials are both calculated: if they are smaller than their respective thresholds, the dictionary word is taken as a candidate corrected text; if they are larger than the thresholds, the three dictionary words with the smallest edit distances are taken as candidate corrected texts. If, after this step, a candidate text to be corrected has one or more candidate corrected texts, those candidates must be screened further.
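The candidate generation described above can be sketched as follows. The pinyin of each word is assumed to be supplied as a list of syllables (the source does not say how pinyin is obtained), the threshold values are placeholders chosen for illustration, and the names `edit_distance` and `corrected_candidates` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def corrected_candidates(candidate_pinyin, dictionary_pinyin,
                         full_threshold=2, initials_threshold=1, top_k=3):
    """Return dictionary words that may replace the candidate.

    candidate_pinyin: syllables of the candidate, e.g. ["huan", "kuan"].
    dictionary_pinyin: dict mapping each dictionary word to its syllables.
    Threshold values are illustrative assumptions, not from the source.
    """
    def distances(syllables):
        full = edit_distance("".join(candidate_pinyin), "".join(syllables))
        initials = edit_distance(
            "".join(s[0] for s in candidate_pinyin),
            "".join(s[0] for s in syllables),
        )
        return full, initials

    scored = {word: distances(p) for word, p in dictionary_pinyin.items()}
    close = [
        word for word, (full, initials) in scored.items()
        if full < full_threshold and initials < initials_threshold
    ]
    if close:
        return sorted(close, key=lambda word: scored[word])
    # No word is within both thresholds: fall back to the top_k nearest words.
    return sorted(scored, key=lambda word: scored[word])[:top_k]
```

Comparing pinyin rather than characters means homophones such as a misrecognized word with the same pronunciation get distance zero, which is the situation the method targets.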
Specifically, processing the sorted corpus text comprises word segmentation, part-of-speech tagging, word-frequency statistics, pinyin annotation, and similar-word retrieval: the text is segmented into words, stop words are removed, the remaining words are sorted in descending order of frequency, and the part of speech and pinyin of each word are identified.
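A minimal sketch of this processing is shown below. It assumes the corpus is already segmented and that part-of-speech tags and pinyin come from lookup tables supplied by the caller; the function name `build_domain_dictionary` and the entry layout are hypothetical, and the manual review by business personnel is not modeled.

```python
from collections import Counter


def build_domain_dictionary(segmented_sentences, stop_words, pos_of, pinyin_of):
    """Build frequency-sorted dictionary entries from a segmented corpus.

    segmented_sentences: list of sentences, each a list of words.
    stop_words: set of words to drop.
    pos_of, pinyin_of: lookup dicts standing in for a POS tagger and a
    pinyin converter (both are assumptions; the source names no tools).
    """
    counts = Counter(
        word
        for sentence in segmented_sentences
        for word in sentence
        if word not in stop_words
    )
    # most_common() yields words in descending order of frequency.
    return [
        {
            "word": word,
            "freq": freq,
            "pos": pos_of.get(word),
            "pinyin": pinyin_of.get(word),
        }
        for word, freq in counts.most_common()
    ]
```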
Specifically, the decoding algorithm used for decoding is the Viterbi algorithm, which searches efficiently.
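The decoding in step 5 can be sketched as a Viterbi search over a lattice in which each position holds its set of candidate words. This is an illustrative sketch, not the patented implementation: the table layout, the probability floor for unseen events, and the function name `viterbi` are assumptions.

```python
import math


def viterbi(candidate_sets, observed, initial, transition, emission, floor=1e-8):
    """Pick the most probable word sequence from per-position candidates.

    candidate_sets[t]: candidate words at position t.
    observed[t]: the word the recognizer produced at position t.
    initial / transition / emission: HMM tables as (nested) dicts.
    floor: assumed minimum probability for unseen events.
    """
    def log_p(probability):
        return math.log(max(probability, floor))

    # best[t][word] = (log-probability of the best path ending in word, backpointer)
    best = [{
        word: (
            log_p(initial.get(word, 0.0))
            + log_p(emission.get(word, {}).get(observed[0], 0.0)),
            None,
        )
        for word in candidate_sets[0]
    }]
    for t in range(1, len(candidate_sets)):
        column = {}
        for word in candidate_sets[t]:
            emit = log_p(emission.get(word, {}).get(observed[t], 0.0))
            prev_word, score = max(
                (
                    (prev, s + log_p(transition.get(prev, {}).get(word, 0.0)))
                    for prev, (s, _) in best[t - 1].items()
                ),
                key=lambda item: item[1],
            )
            column[word] = (score + emit, prev_word)
        best.append(column)

    # Backtrack from the best final word.
    word = max(best[-1], key=lambda w: best[-1][w][0])
    path = [word]
    for t in range(len(best) - 1, 0, -1):
        word = best[t][word][1]
        path.append(word)
    return path[::-1]
```

Because only the best predecessor is kept for each word at each position, the cost grows with the sentence length times the square of the candidate-set size, rather than with the number of possible sentences.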
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (5)

1. A method for correcting errors in speech recognition results in a collection scenario, characterized by comprising the following steps:
step 1, generating a dictionary specific to the collection domain: the corpus of calls between collectors and customers in the collection domain is gathered, manually labeled, sorted, and desensitized, and then used as the training corpus; the sorted corpus text is processed and then organized by dedicated business personnel to form the collection-domain dictionary;
step 2, training an HMM on the corpus of calls between collectors and customers in the collection domain: the call corpus is manually labeled and sorted, and then used as training samples to calculate the initial state probabilities, transition probabilities, and emission probabilities;
step 3, generating the text to be corrected: the speech recognition text is segmented into words, and each word is checked against the collection-domain dictionary to decide whether it is added as a candidate text to be corrected;
step 4, generating the set of candidate corrected texts: the candidate text to be corrected is first converted into pinyin, and an edit-distance algorithm is then used to calculate the edit distance between its pinyin and the pinyin of each word in the collection-domain dictionary, in order to decide whether the original candidate text to be corrected is retained;
and step 5, screening the corrected text: each identified candidate text to be corrected is first replaced with its corresponding set of candidate corrected texts, and the trained HMM is then decoded with a decoding algorithm to compute the final corrected text.
2. The method according to claim 1, wherein whether a word is added as a candidate text to be corrected is decided as follows: if the word is in the collection-domain dictionary, no correction is performed; if it is not, the word is taken as a candidate text to be corrected.
3. The method according to claim 2, wherein, if the edit distance is greater than the threshold, the original candidate text to be corrected is retained for the time being.
4. The method according to claim 3, wherein processing the sorted corpus text comprises word segmentation, part-of-speech tagging, word-frequency statistics, pinyin annotation, and similar-word retrieval.
5. The method according to claim 2, wherein the decoding algorithm used for decoding is the Viterbi algorithm.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010089898.XA CN111339757A (en) 2020-02-13 2020-02-13 Error correction method for voice recognition result in collection scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010089898.XA CN111339757A (en) 2020-02-13 2020-02-13 Error correction method for voice recognition result in collection scene

Publications (1)

Publication Number Publication Date
CN111339757A (en) 2020-06-26

Family

ID=71183440

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010089898.XA Pending CN111339757A (en) 2020-02-13 2020-02-13 Error correction method for voice recognition result in collection scene

Country Status (1)

Country Link
CN (1) CN111339757A (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6018708A (en) * 1997-08-26 2000-01-25 Nortel Networks Corporation Method and apparatus for performing speech recognition utilizing a supplementary lexicon of frequently used orthographies
CN101650886A (en) * 2008-12-26 2010-02-17 中国科学院声学研究所 Method for automatically detecting reading errors of language learners
CN106959977A (en) * 2016-01-12 2017-07-18 广州市动景计算机科技有限公司 Candidate collection computational methods and device, word error correction method and device in word input
CN105976818A (en) * 2016-04-26 2016-09-28 Tcl集团股份有限公司 Instruction identification processing method and apparatus thereof
CN105975625A (en) * 2016-05-26 2016-09-28 同方知网数字出版技术股份有限公司 Chinglish inquiring correcting method and system oriented to English search engine
CN107729321A (en) * 2017-10-23 2018-02-23 上海百芝龙网络科技有限公司 A kind of method for correcting error of voice identification result
CN109933778A (en) * 2017-12-18 2019-06-25 北京京东尚科信息技术有限公司 Segmenting method, device and computer readable storage medium
CN108304385A (en) * 2018-02-09 2018-07-20 叶伟 A kind of speech recognition text error correction method and device
CN109670148A (en) * 2018-09-26 2019-04-23 平安科技(深圳)有限公司 Collection householder method, device, equipment and storage medium based on speech recognition
CN110210029A (en) * 2019-05-30 2019-09-06 浙江远传信息技术股份有限公司 Speech text error correction method, system, equipment and medium based on vertical field

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985213A (en) * 2020-09-07 2020-11-24 科大讯飞华南人工智能研究院(广州)有限公司 Method and device for correcting voice customer service text
CN111985213B (en) * 2020-09-07 2024-05-28 科大讯飞华南人工智能研究院(广州)有限公司 Voice customer service text error correction method and device
CN112382289A (en) * 2020-11-13 2021-02-19 北京百度网讯科技有限公司 Method and device for processing voice recognition result, electronic equipment and storage medium
CN112382289B (en) * 2020-11-13 2024-03-22 北京百度网讯科技有限公司 Speech recognition result processing method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107315737B (en) Semantic logic processing method and system
CN102760436B (en) Voice lexicon screening method
CN111209401A (en) System and method for classifying and processing sentiment polarity of online public opinion text information
CN110853649A (en) Label extraction method, system, device and medium based on intelligent voice technology
CN111489765A (en) Telephone traffic service quality inspection method based on intelligent voice technology
CN112259083B (en) Audio processing method and device
CN110413998B (en) Self-adaptive Chinese word segmentation method oriented to power industry, system and medium thereof
CN112712349A (en) Intelligent paperless conference data information processing method based on artificial intelligence and big data analysis
CN111339757A (en) Error correction method for voice recognition result in collection scene
CN112002328A (en) Subtitle generating method and device, computer storage medium and electronic equipment
CN113221542A (en) Chinese text automatic proofreading method based on multi-granularity fusion and Bert screening
CN111737424A (en) Question matching method, device, equipment and storage medium
CN112967710B (en) Low-resource customer dialect point identification method
CN114239579A (en) Electric power searchable document extraction method and device based on regular expression and CRF model
CN113037934A (en) Hot word analysis system based on call recording of call center
Behre et al. Streaming punctuation for long-form dictation with transformers
CN112231440A (en) Voice search method based on artificial intelligence
CN117292680A (en) Voice recognition method for power transmission operation detection based on small sample synthesis
CN116665674A (en) Internet intelligent recruitment publishing method based on voice and pre-training model
CN111785236A (en) Automatic composition method based on motivational extraction model and neural network
CN114254628A (en) Method and device for quickly extracting hot words by combining user text in voice transcription, electronic equipment and storage medium
CN114707515A (en) Method and device for judging dialect, electronic equipment and storage medium
CN110858268B (en) Method and system for detecting unsmooth phenomenon in voice translation system
CN112488593A (en) Auxiliary bid evaluation system and method for bidding
CN113011183A (en) Unstructured text data processing method and system in electric power regulation and control field

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200626
