CN113096667A - Wrongly-written character recognition detection method and system - Google Patents

Wrongly-written character recognition detection method and system

Info

Publication number
CN113096667A
Authority
CN
China
Prior art keywords
text
recognized
word
confusion degree
word segmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110417158.9A
Other languages
Chinese (zh)
Inventor
王珏
史文华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Yunshen Intelligent Technology Co ltd
Original Assignee
Shanghai Yunshen Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Yunshen Intelligent Technology Co ltd
Priority to CN202110417158.9A
Publication of CN113096667A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method and a system for recognizing and detecting wrongly written characters. The method comprises the following steps: processing an original voice signal and converting it into a text to be recognized; inputting the text to be recognized into a pre-trained word segmentation model, and outputting a word vector sequence of the text to be recognized, the word vector sequence being arranged according to the character order of the text to be recognized; inputting the word vectors of the word vector sequence, in order, into a pre-trained statistical language model, and outputting the final confusion degree (perplexity) of the text to be recognized; and when the final confusion degree is greater than a preset confusion degree, determining that wrongly written characters exist in the text to be recognized. The invention improves the accuracy and efficiency of recognizing wrongly written characters in text.

Description

Wrongly-written character recognition detection method and system
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and a system for identifying and detecting wrongly written characters.
Background
With the rapid development of science and technology, voice input has gradually entered everyday use and may replace traditional keyboard input as the standard input method of a new generation. During voice input, errors are commonly caused by a noisy environment or by homophones and near-homophones.
Automatic Speech Recognition (ASR), a technology for converting human speech into text, is being applied more and more widely. In practice, because of background noise or the speaker's pronunciation (dialect, accent, fast speech, word-usage habits and the like), substitution, insertion and deletion errors inevitably occur in ASR recognition results. These recognition errors can give the recognized sentences improper word order, improper collocation, unclear semantics or faulty logic, producing wrong sentences. Such errors not only make understanding and analysis difficult, but also greatly hinder subsequent Natural Language Processing (NLP) applications.
Therefore, identifying whether a sentence is correct is of practical significance and necessity.
Disclosure of Invention
The invention aims to provide a method and a system for identifying and detecting wrongly written characters, which can improve the identification accuracy and efficiency of wrongly written characters in texts.
The technical scheme provided by the invention is as follows:
The invention provides a wrongly written character recognition and detection method, which comprises the following steps:
processing an original voice signal and converting the processed original voice signal into a text to be recognized;
inputting the text to be recognized into a pre-trained word segmentation model, and outputting a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character order of the text to be recognized;
inputting the word vectors of the word vector sequence, in order, into a pre-trained statistical language model, and outputting the final confusion degree of the text to be recognized;
and when the final confusion degree is greater than the preset confusion degree, determining that wrongly written characters exist in the text to be recognized.
Further, the processing and converting the original voice signal into the text to be recognized includes the steps of:
carrying out spectrum analysis on the original voice signal, and cutting an analysis result corresponding to the invalid voice signal to obtain a target voice signal;
and performing voice recognition on the target voice signal to obtain corresponding voice characteristics, and inputting the voice characteristics into a voice conversion model to obtain the text to be recognized.
Further, before inputting the text to be recognized into the pre-trained word segmentation model, the method comprises the following steps:
performing word segmentation processing on each corpus sentence in the acquired corpus data set to obtain a word segmentation sample set;
training according to the word segmentation sample set to obtain the word segmentation model; the word segmentation sample set comprises a plurality of word segmentation sample vectors.
Further, the sequentially inputting the word vectors into a pre-trained statistical language model and outputting the final confusion degree of the text to be recognized comprises the following steps:
respectively and sequentially outputting the occurrence probability of each word vector in the text to be recognized through the statistical language model;
and calculating to obtain the final confusion degree of the text to be recognized according to the occurrence probability of each word vector in the text to be recognized.
Further, after it is determined that wrongly written characters exist in the text to be recognized because the final confusion degree is greater than the preset confusion degree, the method comprises the following steps:
replacing wrongly written characters in the text to be recognized to obtain candidate text sentences, and calculating the final confusion degree of each candidate text sentence;
and outputting, as the final recognition text, the candidate text sentence whose final confusion degree is smaller than the preset confusion degree and is the smallest among the candidates.
The invention also provides a wrongly written character recognition and detection system, which comprises:
the voice processing module is used for processing and converting the original voice signal into a text to be recognized;
the word segmentation processing module is used for inputting the text to be recognized into a pre-trained word segmentation model and outputting a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character order of the text to be recognized;
the statistical processing module is used for inputting the word vectors of the word vector sequence, in order, into a pre-trained statistical language model and outputting the final confusion degree of the text to be recognized;
and the analysis module is used for determining that wrongly written characters exist in the text to be recognized when the final confusion degree is greater than the preset confusion degree.
Further, the voice processing module comprises:
the processing unit is used for carrying out spectrum analysis on the original voice signal and cutting an analysis result corresponding to the invalid voice signal to obtain a target voice signal;
and the conversion unit is used for carrying out voice recognition on the target voice signal to obtain corresponding voice characteristics, and inputting the voice characteristics into a voice conversion model to obtain the text to be recognized.
Further, the word segmentation processing module further comprises:
the sample acquisition unit is used for performing word segmentation processing on each corpus sentence in an acquired corpus data set to obtain a word segmentation sample set;
the training unit is used for training according to the word segmentation sample set to obtain the word segmentation model; the word segmentation sample set comprises a plurality of word segmentation sample vectors.
Further, the statistical processing module comprises:
the probability calculation unit is used for respectively and sequentially outputting the occurrence probability of each word vector in the text to be recognized through the statistical language model;
and the confusion degree calculating unit is used for calculating the final confusion degree of the text to be recognized according to the occurrence probability of each word vector in the text to be recognized.
Further, the system also comprises:
the replacement processing module is used for replacing wrongly written characters in the text to be recognized to obtain candidate text sentences and calculating the final confusion degree of each candidate text sentence;
and the output module is used for outputting, as the final recognition text, the candidate text sentence whose final confusion degree is smaller than the preset confusion degree and is the smallest among the candidates.
By the method and the system for identifying and detecting the wrongly written characters, the identification accuracy and efficiency of the wrongly written characters in the text can be improved.
Drawings
The above features, technical features, advantages and implementations of the method and system for recognizing and detecting wrongly written characters are further described below, in a clearly understandable manner, through preferred embodiments and with reference to the accompanying drawings.
FIG. 1 is a flow chart of one embodiment of a method for recognizing and detecting wrongly written characters of the present invention;
FIG. 2 is a flow chart of another embodiment of a method for recognizing and detecting wrongly written characters of the present invention;
FIG. 3 is a flow chart of another embodiment of a method for recognizing and detecting wrongly written characters of the present invention;
FIG. 4 is a schematic structural diagram of an embodiment of a system for recognizing and detecting wrongly written characters of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
For the sake of simplicity, the drawings only schematically show the parts relevant to the present invention, and they do not represent the actual structure as a product. In addition, in order to make the drawings concise and understandable, components having the same structure or function in some of the drawings are only schematically illustrated or only labeled. In this document, "one" means not only "only one" but also a case of "more than one".
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
In addition, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not intended to indicate or imply relative importance.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description will be made with reference to the accompanying drawings. It is obvious that the drawings in the following description are only some examples of the invention, and that for a person skilled in the art, other drawings and embodiments can be derived from them without inventive effort.
In an embodiment of the present invention, as shown in fig. 1, a method for identifying and detecting a wrongly written word includes:
S100, processing an original voice signal and converting the processed original voice signal into a text to be recognized;
Specifically, the original voice signal is a voice audio signal collected by a microphone; it can be recognized using the speech recognition technology of iFLYTEK to obtain the corresponding text to be recognized.
S200, inputting the text to be recognized into a pre-trained word segmentation model, and outputting a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character order of the text to be recognized;
S300, inputting the word vectors of the word vector sequence, in order, into a pre-trained statistical language model, and outputting the final confusion degree of the text to be recognized;
and S400, when the final confusion degree is greater than the preset confusion degree, determining that wrongly written characters exist in the text to be recognized.
Specifically, the word segmentation model includes a Bert model, the Jieba word segmentation tool, or the like. The word segmentation model may also be GPT2 (a pre-trained model proposed by OpenAI) or an N-gram model (a statistical language model based on sequences of byte fragments of length N). After the intelligent mobile device, the mobile terminal or the server obtains the text to be recognized in the above manner, it inputs the text to be recognized into the word segmentation model, which segments the input text and outputs the word vector sequence. The word vector sequence comprises a plurality of word vectors arranged according to the character order of the text to be recognized. Then, the intelligent mobile device, the mobile terminal or the server inputs the word vectors of the word vector sequence, in order, into the statistical language model, which outputs the final confusion degree of the text to be recognized. Finally, the intelligent mobile device, the mobile terminal or the server compares the final confusion degree with the preset confusion degree: if the final confusion degree is smaller than the preset confusion degree, it determines that the text to be recognized contains no wrongly written characters; otherwise, if the final confusion degree is greater than the preset confusion degree, it determines that wrongly written characters exist in the text to be recognized.
The method works directly from the original voice signal: whether the text to be recognized converted from the original voice signal contains wrongly written characters is determined solely from the calculated confusion degree, which improves the accuracy and efficiency of wrongly written character recognition.
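The flow of steps S200 to S400 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the function name `detect_wrong_characters`, the pluggable `segment` and `cond_prob` callables, and the toy character-level model are all hypothetical stand-ins for the pre-trained word segmentation model and the statistical language model.

```python
import math
from typing import Callable, List, Sequence


def detect_wrong_characters(
    text: str,
    segment: Callable[[str], List[str]],
    cond_prob: Callable[[Sequence[str], str], float],
    preset_perplexity: float,
) -> bool:
    """Return True when the final confusion degree exceeds the preset one."""
    tokens = segment(text)  # S200: tokens kept in the character order of the text
    if not tokens:
        return False
    log_sum = 0.0
    for i, token in enumerate(tokens):  # S300: feed the tokens in order
        log_sum += math.log(cond_prob(tokens[:i], token))
    final_perplexity = math.exp(-log_sum / len(tokens))
    return final_perplexity > preset_perplexity  # S400: threshold comparison


# Toy stand-ins (assumptions): one token per character, and a "model" that
# finds the characters 'a' and 'b' likely and every other character unlikely.
def char_segment(text: str) -> List[str]:
    return list(text)


def toy_prob(history: Sequence[str], token: str) -> float:
    return 0.5 if token in "ab" else 0.01


flag_ok = detect_wrong_characters("abab", char_segment, toy_prob, 3.0)
flag_bad = detect_wrong_characters("abzb", char_segment, toy_prob, 3.0)
```

In this toy setting the first text has a confusion degree of 2 and is accepted, while the unlikely character in the second text pushes the confusion degree above the threshold of 3.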
Preferably, the microphone may be disposed on an intelligent mobile device such as a robot or an unmanned vehicle, or on a mobile terminal such as a mobile phone, a computer, a tablet or a smart watch; the mobile terminal or the intelligent mobile device collects the original voice signal of the user and recognizes it. Of course, after collecting the original voice signal, the mobile terminal or the intelligent mobile device may instead forward it to the server, and the server recognizes the original voice signal of the user. The invention does not limit the execution subject of the recognition and detection of wrongly written characters.
In an embodiment of the present invention, as shown in fig. 2, a method for identifying and detecting a wrongly written word includes:
S010, performing word segmentation processing on each corpus sentence in the acquired corpus data set to obtain a word segmentation sample set;
S020, training according to the word segmentation sample set to obtain a word segmentation model; the word segmentation sample set comprises a plurality of word segmentation sample vectors;
Specifically, the core of the word segmentation model is to solve for the conditional probability of the occurrence of words. The intelligent mobile device, the mobile terminal or the server collects corpus sentences from documents or data in each field to obtain a corpus data set, and then segments each corpus sentence using existing techniques to obtain a word segmentation result for each corpus sentence. The word segmentation results of the corpus sentences are then compiled into word segmentation sample vectors, each of which includes a word vector together with its label, its position (the position of the word segmentation result in the corpus sentence), and its matrix relationship (the relationship between the word segmentation result and the other word segmentation results in the corpus sentence). A word vector does not necessarily correspond to a single character; for example, it may correspond to a word such as "we" or "milk".
The intelligent mobile device, the mobile terminal or the server randomly extracts a part of the word segmentation sample set as a training set and uses the rest as a verification set, the number of word segmentation sample vectors in the training set being larger than that in the verification set. A candidate word segmentation model is then trained on the word segmentation sample vectors in the training set and verified with the word segmentation sample vectors in the verification set, and the parameters of the candidate word segmentation model are adjusted according to the verification result until its recognition accuracy is greater than a preset threshold, giving the final word segmentation model.
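The random split and the train-until-accurate loop described above might look like the sketch below. The names `split_samples`, `fit_until_accurate`, `train_once` and `evaluate`, the 80/20 split ratio and the round limit are assumptions made for illustration; the source only requires that the training set be larger than the verification set and that training continue until the accuracy exceeds a preset threshold.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_samples(
    samples: Sequence[T], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Randomly split samples into a larger training set and a verification set."""
    if not 0.5 < train_fraction < 1.0:
        raise ValueError("the training set must be the larger part")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # reproducible random extraction
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_until_accurate(
    train_once: Callable[[], None],
    evaluate: Callable[[], float],
    threshold: float,
    max_rounds: int = 50,
) -> Optional[int]:
    """Repeat training and verification until accuracy exceeds the threshold.

    Returns the number of rounds used, or None if the limit is reached first.
    """
    for round_no in range(1, max_rounds + 1):
        train_once()               # hypothetical: one training/adjustment pass
        if evaluate() > threshold:  # hypothetical: accuracy on the verification set
            return round_no
    return None


train_set, verification_set = split_samples(range(10))

# Toy model whose accuracy grows by 0.3 per training round (an assumption).
state = {"accuracy": 0.0}


def toy_train() -> None:
    state["accuracy"] += 0.3


rounds_used = fit_until_accurate(toy_train, lambda: state["accuracy"], threshold=0.8)
```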
S110, carrying out spectrum analysis on the original voice signal, and cutting an analysis result corresponding to the invalid voice signal to obtain a target voice signal;
S120, performing voice recognition on the target voice signal to obtain corresponding voice characteristics, and inputting the voice characteristics into a voice conversion model to obtain a text to be recognized;
Specifically, since the original voice signal often includes invalid voice, the intelligent mobile device, the mobile terminal or the server needs to perform spectrum analysis on the original voice signal to obtain an analysis result, which may include a valid voice recognition result or an invalid voice recognition result. The invalid voice comprises voice corresponding to blank (silent) regions or voice corresponding to preset invalid vocabulary. The preset invalid vocabulary comprises words such as Chinese modal particles, which add nothing to the semantics of the original voice signal, and can be preset according to actual requirements. The intelligent mobile device, the mobile terminal or the server therefore cuts the analysis result corresponding to the invalid voice signal to obtain the target voice signal.
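As a rough illustration of cutting invalid voice, the sketch below drops silent frames using short-time energy and removes preset invalid words from a token list. Both are simplifications chosen for brevity: the source describes spectrum analysis on the signal itself, whereas this sketch substitutes a plain energy threshold and applies the vocabulary filter at the text level. The function names, the frame length and the threshold value are assumptions.

```python
from typing import Iterable, List, Sequence


def trim_silent_frames(
    samples: Sequence[float], frame_len: int = 4, energy_threshold: float = 0.01
) -> List[float]:
    """Keep only the frames whose mean energy reaches the threshold."""
    kept: List[float] = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy >= energy_threshold:  # blank regions fall below the threshold
            kept.extend(frame)
    return kept


def drop_invalid_words(tokens: Iterable[str], invalid_words: Iterable[str]) -> List[str]:
    """Remove preset invalid vocabulary (for example filler particles)."""
    invalid = set(invalid_words)
    return [token for token in tokens if token not in invalid]


# A silent frame, a voiced frame, then another silent frame.
signal = [0.0] * 4 + [0.5, -0.5, 0.4, -0.4] + [0.0] * 4
target_signal = trim_silent_frames(signal)

cleaned = drop_invalid_words(["um", "we", "drink", "uh", "milk"], {"um", "uh"})
```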
Then, the intelligent mobile device, the mobile terminal or the server performs voice recognition on the target voice signal. Because the voiceprint features (tone, etc.) of different users differ, if a plurality of voiceprint features are found in the original voice signal after it is obtained, voice recognition is performed separately on the voice signals with different voiceprint features to obtain the corresponding voice features, and all voice features corresponding to the same voiceprint feature are input into the voice conversion model, which outputs the corresponding text to be recognized. Converting voice into text is prior art and is not described in detail here.
S200, inputting the text to be recognized into a pre-trained word segmentation model, and outputting a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character order of the text to be recognized;
S310, outputting, in order, the occurrence probability of each word vector in the text to be recognized through the statistical language model;
S320, calculating the final confusion degree of the text to be recognized according to the occurrence probability of each word vector in the text to be recognized;
Specifically, a unified standard framework is constructed from pre-training, fine-tuning, a language model and the confusion degree. For any field, unsupervised corpus sentences of that field are prepared, and the word segmentation model (e.g., Google's Bert model) is fine-tuned on these corpus sentences. To detect wrongly written characters in the text to be recognized corresponding to an original voice signal, a vector representation of each character in the text, namely a word vector, is first obtained; the occurrence probability of each word vector within the whole text to be recognized is then calculated; the confusion degree of each word vector is calculated from its occurrence probability; finally, the final confusion degree of the text to be recognized is obtained from the confusion degrees of its word vectors, and whether wrongly written characters exist in the text is judged based on this final confusion degree.
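One simple way to obtain per-token occurrence probabilities is a bigram statistical language model, sketched below. This is an illustrative assumption rather than the model used by the invention: the class name `BigramModel`, the `<s>` start marker, the add-one smoothing and the two-sentence toy corpus are all hypothetical, and a bigram model conditions each token only on the single preceding token.

```python
from collections import Counter
from typing import List, Sequence


class BigramModel:
    """Bigram language model with add-one smoothing (illustrative only)."""

    def __init__(self, corpus: Sequence[Sequence[str]]) -> None:
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        self.vocab = set()
        for sentence in corpus:
            padded = ["<s>"] + list(sentence)
            self.vocab.update(sentence)
            for prev, cur in zip(padded, padded[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def prob(self, prev: str, cur: str) -> float:
        """Smoothed estimate of P(cur | prev)."""
        vocab_size = len(self.vocab) + 1  # one extra slot for unseen tokens
        return (self.bigram_counts[(prev, cur)] + 1) / (
            self.context_counts[prev] + vocab_size
        )

    def token_probs(self, tokens: Sequence[str]) -> List[float]:
        """Occurrence probability of each token, in order (step S310)."""
        padded = ["<s>"] + list(tokens)
        return [self.prob(prev, cur) for prev, cur in zip(padded, padded[1:])]


corpus = [["we", "drink", "milk"], ["we", "drink", "water"]]
model = BigramModel(corpus)
probs = model.token_probs(["we", "drink", "milk"])
unseen = model.prob("milk", "water")
```

With this toy corpus the vocabulary has four tokens, so each smoothed denominator is the context count plus five.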
Suppose that the word vector sequence obtained by performing speech recognition on an original voice signal and segmenting the resulting text W to be recognized is:
$$S = \{w_1, w_2, w_3, \ldots, w_n\}$$
$$P(S) = P(w_1, w_2, w_3, \ldots, w_n) = P(w_1)\,P(w_2 \mid w_1) \cdots P(w_n \mid w_1, w_2, \ldots, w_{n-1})$$
$$PP(S) = P(w_1, w_2, \ldots, w_n)^{-\frac{1}{n}} = \sqrt[n]{\prod_{i=1}^{n} \frac{1}{P(w_i \mid w_1, w_2, \ldots, w_{i-1})}}$$
where $S$ is the word vector sequence, $n$ is the number of word vectors in the word vector sequence, $w_n$ is the n-th word vector, $P(S)$ is the probability that the character string formed by the word vectors in their arrangement order is a sentence, $P(w_n \mid w_1, w_2, \ldots, w_{n-1})$ is the probability of the current word vector $w_n$ given that the preceding n-1 word vectors are $(w_1, w_2, \ldots, w_{n-1})$, and $PP(S)$ is the final confusion degree of the word vector sequence $S$. In natural language processing, the quality of a language model is generally measured by the confusion degree: the lower the confusion degree, the less the statistical language model is confused by the sentence, and the better the statistical language model.
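The confusion degree defined above can be computed from the per-token conditional probabilities as in the following sketch. The function name `perplexity` and the example probabilities are assumptions; the computation is done in log space, a common choice that avoids numerical underflow on long sentences.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """PP(S) = (product of P(w_i | w_1..w_{i-1})) ** (-1/n), computed in log space."""
    if not token_probs:
        raise ValueError("at least one token probability is required")
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(-log_sum / len(token_probs))


# Worked example: the product of the probabilities is 1/64, so PP = 64 ** (1/4).
pp_example = perplexity([0.5, 0.25, 0.5, 0.25])

# When every token has probability p, the confusion degree is simply 1/p.
pp_uniform = perplexity([0.1] * 5)
```

A sentence whose tokens are all assigned high probability therefore has a confusion degree close to 1, while a single very unlikely token raises it noticeably.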
And S400, when the final confusion degree is greater than the preset confusion degree, determining that wrongly written characters exist in the text to be recognized.
Specifically, for the parts of this embodiment that are the same as in the above embodiment, refer to the above embodiment; they are not described again here.
The invention realizes a universal wrongly written character detection method. The text to be recognized is modeled through the word segmentation model to obtain word vectors; the statistical language model outputs the occurrence probability of each word vector in the text to be recognized; the confusion degree of each word vector is calculated from these probabilities and used to calculate the confusion degree of the text to be recognized; and a preset confusion degree is set as a threshold and compared with the final confusion degree to judge whether wrongly written characters currently exist in the text to be recognized.
In this embodiment, the text to be recognized is modeled through the pre-trained word segmentation model, the final confusion degree of the text to be recognized is calculated from the occurrence probabilities of its word vectors, and whether wrongly written characters exist in the current text to be recognized is judged against the preset confusion degree. In addition, removing the preset invalid vocabulary does not affect the correctness of the detection result and simplifies the final recognition text corresponding to the original voice signal, thereby improving the efficiency and accuracy of recognizing the voice signal.
In an embodiment of the present invention, as shown in fig. 3, a method for identifying and detecting a wrongly written word includes:
S110, carrying out spectrum analysis on the original voice signal, and cutting an analysis result corresponding to the invalid voice signal to obtain a target voice signal;
S120, performing voice recognition on the target voice signal to obtain corresponding voice characteristics, and inputting the voice characteristics into a voice conversion model to obtain a text to be recognized;
S200, inputting the text to be recognized into a pre-trained word segmentation model, and outputting a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character order of the text to be recognized;
S310, outputting, in order, the occurrence probability of each word vector in the text to be recognized through the statistical language model;
S320, calculating the final confusion degree of the text to be recognized according to the occurrence probability of each word vector in the text to be recognized;
S400, when the final confusion degree is larger than the preset confusion degree, determining that wrongly written characters exist in the text to be recognized;
S500, replacing wrongly written characters in the text to be recognized to obtain candidate text sentences, and calculating the final confusion degree of each candidate text sentence;
and S600, outputting the candidate text sentence with the final confusion degree smaller than the preset confusion degree and the minimum final confusion degree as a final recognition text.
Specifically, corrected characters corresponding to wrongly written characters are determined according to pronunciation or pinyin, and an association relationship between each wrongly written character and its corrected characters is established. Candidate corrected characters for the wrongly written characters in the text to be recognized are found according to the association relationship, and the wrongly written characters in the text to be recognized are replaced with the candidate corrected characters to obtain a plurality of candidate text sentences. The final confusion degree of each candidate text sentence is then determined through the statistical language model, and the candidate text sentence whose final confusion degree is smaller than the preset confusion degree and is the smallest among all candidates is selected as the final recognition text. The confusion degree (perplexity) is an index used in the Natural Language Processing (NLP) field to measure the quality of a statistical language model. The smaller the confusion degree of a candidate text sentence, the greater the probability the model assigns to it, and the more likely the candidate text sentence is correct.
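The candidate replacement and selection of steps S500 and S600 can be sketched as below. The names `best_candidate`, `confusion_map` and `wrong_index` are hypothetical; in particular, the source does not say how the position of the wrongly written character is located, so the sketch takes that position as an input. The toy probability function stands in for the statistical language model.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))


def best_candidate(
    tokens: Sequence[str],
    wrong_index: int,
    confusion_map: Dict[str, List[str]],
    token_probs: Callable[[Sequence[str]], List[float]],
    preset_perplexity: float,
) -> Optional[List[str]]:
    """Return the candidate with the lowest confusion degree below the preset one."""
    best: Optional[List[str]] = None
    best_pp = preset_perplexity
    for replacement in confusion_map.get(tokens[wrong_index], []):
        candidate = list(tokens)
        candidate[wrong_index] = replacement  # S500: build a candidate sentence
        pp = perplexity(token_probs(candidate))
        if pp < best_pp:                      # S600: below the preset and smallest
            best, best_pp = candidate, pp
    return best


# Toy language model (assumption): known words are likely, all others unlikely.
KNOWN = {"I", "see", "you"}


def toy_token_probs(tokens: Sequence[str]) -> List[float]:
    return [0.5 if token in KNOWN else 0.01 for token in tokens]


# Association between a wrongly written token and similar-sounding corrections.
sound_alike = {"sea": ["see", "C"]}

corrected = best_candidate(["I", "sea", "you"], 1, sound_alike, toy_token_probs, 5.0)
rejected = best_candidate(["I", "sea", "you"], 1, sound_alike, toy_token_probs, 1.5)
```

When no candidate falls below the preset confusion degree, the sketch returns `None` so that the caller can decide how to proceed.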
When a wrongly written character exists in the text to be recognized, the text to be recognized is the text to be corrected. The wrongly written character in the text to be corrected can be replaced with corrected characters according to the association relationship between wrongly written characters and corrected characters to obtain candidate text sentences; the confusion degree of each candidate text sentence is then calculated through the statistical language model, and the candidate text sentence whose final confusion degree is smaller than the preset confusion degree and is the smallest is selected as the corrected sentence for the current wrongly written character. When only one wrongly written character exists, this corrected sentence is the final recognition text corresponding to the original voice signal.
Preferably, when a plurality of wrongly written characters exist in the text to be recognized, the corrected sentence obtained for the previous wrongly written character is used as the new text to be corrected, and a new wrongly written character is selected from it and replaced in the manner described above to obtain a new corrected sentence. This is repeated until no wrongly written character remains in the corrected sentence, which is then the final recognition text corresponding to the original voice signal.
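The repeated correction of several wrongly written characters could be organized as a greedy loop, as in the sketch below. This is one possible reading under stated assumptions: each round tries every position that has an entry in the hypothetical `confusion_map`, keeps the single replacement that lowers the confusion degree the most, and stops once the confusion degree no longer exceeds the preset value. The round limit and the toy probability function are also assumptions.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    return math.exp(-sum(math.log(p) for p in token_probs) / len(token_probs))


def correct_iteratively(
    tokens: Sequence[str],
    confusion_map: Dict[str, List[str]],
    token_probs: Callable[[Sequence[str]], List[float]],
    preset_perplexity: float,
    max_rounds: int = 10,
) -> List[str]:
    """Fix one token per round until the sentence is no longer flagged."""
    current = list(tokens)
    for _ in range(max_rounds):
        current_pp = perplexity(token_probs(current))
        if current_pp <= preset_perplexity:
            break  # no wrongly written characters are detected any more
        best: Optional[List[str]] = None
        best_pp = current_pp
        for i, token in enumerate(current):
            for replacement in confusion_map.get(token, []):
                candidate = current[:i] + [replacement] + current[i + 1:]
                pp = perplexity(token_probs(candidate))
                if pp < best_pp:
                    best, best_pp = candidate, pp
        if best is None:
            break  # no replacement improves the sentence
        current = best  # the corrected sentence becomes the new text to correct
    return current


KNOWN = {"I", "see", "you"}


def toy_token_probs(tokens: Sequence[str]) -> List[float]:
    return [0.5 if token in KNOWN else 0.01 for token in tokens]


final_tokens = correct_iteratively(
    ["eye", "sea", "you"], {"eye": ["I"], "sea": ["see"]}, toy_token_probs, 5.0
)
```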
Illustratively, a search engine triggers acquisition of the user's original voice signal through its built-in voice control and obtains the corresponding final recognized text in the manner of this embodiment. The search engine can then search according to the final recognized text obtained after the original voice signal has been recognized and corrected, so that the user does not need to type manually, which improves search efficiency and the user's search experience.
The invention corrects and replaces wrongly written characters by means of the confusion degree rather than by homophone or near-homophone replacement alone, so that wrongly written characters can be handled in a wide range of scenarios. It can not only detect wrongly written characters in the text to be recognized obtained by voice recognition of the original voice signal, but also correct them, so that the intention corresponding to the original voice signal can be determined more accurately and conveniently.
In one embodiment of the present invention, as shown in fig. 4, a system for detecting a wrongly written word includes:
the voice processing module 10 is configured to process an original voice signal and convert the processed original voice signal into a text to be recognized;
the word segmentation processing module 20 is configured to input the text to be recognized to a pre-trained word segmentation model, and output a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character sequence of the text to be recognized;
the statistical processing module 30 is configured to sequentially input each word vector in the word vector sequence into a pre-trained statistical language model, and output a final confusion degree of the text to be recognized;
and the analysis module 40 is used for determining that wrongly written characters exist in the text to be recognized when the final confusion degree is greater than a preset confusion degree.
Specifically, the original voice signal refers to a voice audio signal acquired by a microphone, and voice recognition can be performed on the voice audio signal by iFlytek voice recognition technology to obtain the corresponding text to be recognized.
The word segmentation model includes a BERT model, a word segmentation tool, or the like. The statistical language model includes an n-gram model. After the intelligent mobile device, the mobile terminal or the server obtains the text to be recognized in the above manner, it inputs the text to be recognized into the word segmentation model, which recognizes the input text and outputs the corresponding word vector sequence. The word vector sequence comprises a plurality of word vectors arranged according to the character order of the text to be recognized. The intelligent mobile device, the mobile terminal or the server then inputs the word vectors in the word vector sequence into the statistical language model in order and outputs the final confusion degree of the text to be recognized. Finally, it compares the final confusion degree with the preset confusion degree: if the final confusion degree is smaller than the preset confusion degree, it determines that the text to be recognized contains no wrongly written characters; otherwise, if the final confusion degree is greater than the preset confusion degree, it determines that wrongly written characters exist in the text to be recognized.
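As a rough illustration of the detection flow above, the following sketch trains a tiny bigram model and compares the confusion degree (perplexity) of two sentences with a preset value. Everything specific here is an assumption for the example: the `BigramModel` class, add-one smoothing, the three-sentence toy corpus, plain tokens in place of word vectors, and the preset value of 4.5. Out-of-vocabulary words are simply treated as unseen bigrams.

```python
import math
from collections import Counter
from typing import Sequence


class BigramModel:
    """Bigram language model with add-one smoothing (illustrative only)."""

    def __init__(self, corpus: Sequence[Sequence[str]]) -> None:
        self.context_counts: Counter = Counter()
        self.bigram_counts: Counter = Counter()
        vocabulary = {"</s>"}
        for sentence in corpus:
            padded = ["<s>", *sentence, "</s>"]
            vocabulary.update(sentence)
            self.context_counts.update(padded[:-1])
            self.bigram_counts.update(zip(padded, padded[1:]))
        self.vocab_size = len(vocabulary)

    def prob(self, previous: str, word: str) -> float:
        """Smoothed probability of `word` following `previous`."""
        return ((self.bigram_counts[(previous, word)] + 1)
                / (self.context_counts[previous] + self.vocab_size))

    def confusion_degree(self, sentence: Sequence[str]) -> float:
        """Perplexity: exp of the negative mean log-probability per token."""
        padded = ["<s>", *sentence, "</s>"]
        pairs = list(zip(padded, padded[1:]))
        log_sum = sum(math.log(self.prob(p, w)) for p, w in pairs)
        return math.exp(-log_sum / len(pairs))


# Hypothetical toy corpus and threshold.
corpus = [
    ["i", "like", "their", "house"],
    ["we", "like", "their", "garden"],
    ["i", "like", "the", "house"],
]
model = BigramModel(corpus)
PRESET_CONFUSION_DEGREE = 4.5

good = model.confusion_degree(["i", "like", "their", "house"])
bad = model.confusion_degree(["i", "like", "there", "house"])
print(good > PRESET_CONFUSION_DEGREE, bad > PRESET_CONFUSION_DEGREE)
```

In practice the preset value would be tuned on held-out data; the number used here only separates the two toy sentences.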
The invention can recognize the original voice signal directly and can determine, from the calculated confusion degree alone, whether the text to be recognized converted from the original voice signal contains wrongly written characters, which improves the accuracy and efficiency of wrongly written character recognition.
Preferably, the microphone may be disposed on an intelligent mobile device such as a robot or an unmanned vehicle, or on a mobile terminal such as a mobile phone, a computer, a tablet, or a smart watch, in which case the mobile terminal or intelligent mobile device both collects and recognizes the user's original voice signal. Alternatively, after collecting the original voice signal, the mobile terminal or intelligent mobile device may forward it to the server, and the server recognizes the user's original voice signal. The invention does not limit the execution subject of the recognition and detection of wrongly written characters.
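One possible way to wire modules 10, 20, 30 and 40 together is sketched below. The class name, field names, and the stub functions are hypothetical; the stubs only stand in for a real speech recognizer, word segmentation model, and statistical language model, and tokens are used in place of word vectors. Because each stage is an injected callable, the same structure could run on a device or on a server, in line with the execution subject not being limited.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class TypoDetectionSystem:
    speech_to_text: Callable[[bytes], str]               # voice processing module 10
    segment: Callable[[str], List[str]]                  # word segmentation module 20
    confusion_degree: Callable[[Sequence[str]], float]   # statistical processing module 30
    preset_confusion_degree: float                       # threshold used by analysis module 40

    def has_typo(self, audio: bytes) -> bool:
        """Run the stages in order and compare the score with the preset value."""
        text = self.speech_to_text(audio)
        tokens = self.segment(text)
        return self.confusion_degree(tokens) > self.preset_confusion_degree


# Hypothetical stubs: the "audio" is just UTF-8 text, and the score is a
# fixed number that depends on whether a known-bad token is present.
def stub_speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")


def stub_confusion_degree(tokens: Sequence[str]) -> float:
    return 80.0 if "hose" in tokens else 10.0


system = TypoDetectionSystem(
    speech_to_text=stub_speech_to_text,
    segment=str.split,
    confusion_degree=stub_confusion_degree,
    preset_confusion_degree=50.0,
)
print(system.has_typo(b"i like their hose"))
```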
Based on the foregoing embodiments, the speech processing module 10 includes:
the processing unit is used for carrying out spectrum analysis on the original voice signal and cutting out the invalid voice signal according to the analysis result to obtain a target voice signal;
and the conversion unit is used for carrying out voice recognition on the target voice signal to obtain corresponding voice characteristics, and inputting the voice characteristics into a voice conversion model to obtain the text to be recognized.
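The cutting of invalid voice signal can be illustrated with a much simpler stand-in for spectrum analysis: short-time energy thresholding in the time domain. The sketch below is not the spectrum analysis the embodiment refers to; the function name, frame size, and energy threshold are assumptions chosen for a small example.

```python
from typing import List, Sequence


def trim_invalid_frames(samples: Sequence[float],
                        frame_size: int = 4,
                        energy_threshold: float = 0.01) -> List[float]:
    """Drop leading and trailing frames whose mean energy is below the threshold."""
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples), frame_size)]
    energies = [sum(x * x for x in frame) / len(frame) for frame in frames]
    voiced = [i for i, energy in enumerate(energies) if energy >= energy_threshold]
    if not voiced:
        return []  # the whole signal is treated as invalid
    start = voiced[0] * frame_size
    end = min((voiced[-1] + 1) * frame_size, len(samples))
    return list(samples[start:end])


# Hypothetical toy signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [0.5, -0.5, 0.4, -0.4] + [0.0] * 4
target = trim_invalid_frames(signal)
print(target)
```

A real front end would typically work on spectral features per frame and also handle pauses inside the utterance; this sketch only trims the edges.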
Based on the foregoing embodiment, the system for recognizing and detecting wrongly written words further includes:
the sample acquisition unit is used for carrying out word segmentation processing on each corpus sentence in an acquired corpus data set to obtain a word segmentation sample set;
the training unit is used for training according to the word segmentation sample set to obtain the word segmentation model; the word segmentation sample set comprises a plurality of word segmentation sample vectors.
Specifically, this embodiment is a system embodiment corresponding to the above method embodiment, and specific effects refer to the above method embodiment, which is not described in detail herein.
Based on the foregoing embodiment, sequentially inputting the word vectors into the pre-trained statistical language model and outputting the final confusion degree of the text to be recognized includes the steps of:
respectively and sequentially outputting the occurrence probability of each word vector in the text to be recognized through the statistical language model;
and calculating to obtain the final confusion degree of the text to be recognized according to the occurrence probability of each word vector in the text to be recognized.
Specifically, this embodiment is a system embodiment corresponding to the above method embodiment, and specific effects refer to the above method embodiment, which is not described in detail herein.
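The source does not spell out how the final confusion degree is computed from the occurrence probabilities. Assuming the usual definition of perplexity, PP = (p_1 * p_2 * ... * p_N)^(-1/N), where p_i is the probability the model assigns to the i-th word given its context, the calculation can be sketched as follows. The function name and the example probabilities are hypothetical.

```python
import math
from typing import Sequence


def confusion_degree_from_probs(probs: Sequence[float]) -> float:
    """Perplexity of a sentence from its per-word occurrence probabilities."""
    if not probs or any(p <= 0.0 for p in probs):
        raise ValueError("probabilities must be non-empty and positive")
    # Summing logs avoids underflow when many small probabilities are multiplied.
    log_sum = sum(math.log(p) for p in probs)
    return math.exp(-log_sum / len(probs))


# Four made-up word probabilities; their product is 2**-6, so the result
# equals 2**(6/4).
score = confusion_degree_from_probs([0.5, 0.25, 0.5, 0.25])
print(score)
```

Lower word probabilities raise the result, which is why a wrongly written character tends to push the confusion degree above the preset value.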
Based on the foregoing embodiment, the system for recognizing and detecting wrongly written words further includes:
the replacement processing module is used for replacing wrongly written characters in the text to be recognized to obtain candidate text sentences and calculating the final confusion degree of each candidate text sentence;
and the output module is used for outputting the candidate text sentence with the final confusion degree smaller than the preset confusion degree and the final confusion degree being the minimum as the final recognition text.
Specifically, this embodiment is a system embodiment corresponding to the above method embodiment, and specific effects refer to the above method embodiment, which is not described in detail herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of program modules is illustrated, and in practical applications, the above-described distribution of functions may be performed by different program modules, that is, the internal structure of the apparatus may be divided into different program units or modules to perform all or part of the above-described functions. Each program module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one processing unit, and the integrated unit may be implemented in a form of hardware, or may be implemented in a form of software program unit. In addition, the specific names of the program modules are only used for distinguishing the program modules from one another, and are not used for limiting the protection scope of the application.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or recited in detail in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention, and it should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for identifying and detecting wrongly written characters is characterized by comprising the following steps:
processing an original voice signal and converting the processed original voice signal into a text to be recognized;
inputting the text to be recognized into a pre-trained word segmentation model, and outputting a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character sequence of the text to be recognized;
sequentially inputting the word vectors in the word vector sequence into a pre-trained statistical language model, and outputting the final confusion degree of the text to be recognized;
and when the final confusion degree is greater than the preset confusion degree, determining that wrongly written characters exist in the text to be recognized.
2. The method for detecting the recognition of the wrongly written words as claimed in claim 1, wherein the step of processing and converting the original speech signal into the text to be recognized comprises the steps of:
carrying out spectrum analysis on the original voice signal, and cutting an invalid voice signal according to an analysis result to obtain a target voice signal;
and performing voice recognition on the target voice signal to obtain corresponding voice characteristics, and inputting the voice characteristics into a voice conversion model to obtain the text to be recognized.
3. The method for detecting the recognition of the wrongly written characters according to claim 1, wherein the step of inputting the text to be recognized into the pre-trained segmentation model comprises the steps of:
performing word segmentation processing on each corpus sentence in the acquired corpus data set to obtain a word segmentation sample set;
training according to the word segmentation sample set to obtain the word segmentation model; the word segmentation sample set comprises a plurality of word segmentation sample vectors.
4. The method for recognizing and detecting wrongly written characters according to claim 1, wherein the steps of sequentially inputting the word vectors into a pre-trained statistical language model and outputting the final confusion degree of the text to be recognized comprise:
respectively and sequentially outputting the occurrence probability of each word vector in the text to be recognized through the statistical language model;
and calculating to obtain the final confusion degree of the text to be recognized according to the occurrence probability of each word vector in the text to be recognized.
5. The method for detecting the recognition of the wrongly written words according to any one of claims 1-4, wherein when the final confusion degree is larger than a preset confusion degree, the method comprises the following steps after determining that the wrongly written words exist in the text to be recognized:
replacing wrongly written characters in the text to be recognized to obtain candidate text sentences, and calculating the final confusion degree of each candidate text sentence;
and outputting the candidate text sentence with the final confusion degree smaller than the preset confusion degree and the final confusion degree minimum as a final recognition text.
6. A system for identifying and detecting wrongly written characters, comprising:
the voice processing module is used for processing and converting the original voice signal into a text to be recognized;
the word segmentation processing module is used for inputting the text to be recognized into a pre-trained word segmentation model and outputting a word vector sequence of the text to be recognized; the word vector sequence is arranged according to the character sequence of the text to be recognized;
the statistical processing module is used for sequentially inputting the word vectors in the word vector sequence into a pre-trained statistical language model and outputting the final confusion degree of the text to be recognized;
and the analysis module is used for determining that wrongly written characters exist in the text to be recognized when the final confusion degree is greater than the preset confusion degree.
7. The system according to claim 6, wherein the speech processing module comprises:
the processing unit is used for carrying out spectrum analysis on the original voice signal and cutting out the invalid voice signal according to the analysis result to obtain a target voice signal;
and the conversion unit is used for carrying out voice recognition on the target voice signal to obtain corresponding voice characteristics, and inputting the voice characteristics into a voice conversion model to obtain the text to be recognized.
8. The system according to claim 6, wherein the segmentation processing module further comprises:
the sample acquisition unit is used for carrying out word segmentation processing on each corpus sentence in an acquired corpus data set to obtain a word segmentation sample set;
the training unit is used for training according to the word segmentation sample set to obtain the word segmentation model; the word segmentation sample set comprises a plurality of word segmentation sample vectors.
9. The system according to claim 6, wherein the statistical processing module comprises:
the probability calculation unit is used for respectively and sequentially outputting the occurrence probability of each word vector in the text to be recognized through the statistical language model;
and the confusion degree calculating unit is used for calculating the final confusion degree of the text to be recognized according to the occurrence probability of each word vector in the text to be recognized.
10. The system for detecting the recognition of a wrongly written word as claimed in any one of claims 6 to 9, further comprising:
the replacement processing module is used for replacing wrongly written characters in the text to be recognized to obtain candidate text sentences and calculating the final confusion degree of each candidate text sentence;
and the output module is used for outputting the candidate text sentence with the final confusion degree smaller than the preset confusion degree and the final confusion degree minimum as the final recognition text.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110417158.9A CN113096667A (en) 2021-04-19 2021-04-19 Wrongly-written character recognition detection method and system

Publications (1)

Publication Number Publication Date
CN113096667A (en) 2021-07-09

Family

ID=76678859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110417158.9A Pending CN113096667A (en) 2021-04-19 2021-04-19 Wrongly-written character recognition detection method and system

Country Status (1)

Country Link
CN (1) CN113096667A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723096A (en) * 2021-07-23 2021-11-30 智慧芽信息科技(苏州)有限公司 Text recognition method and device, computer-readable storage medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107222614A (en) * 2017-05-08 2017-09-29 惠州Tcl移动通信有限公司 Automatically hang up method, readable storage medium storing program for executing and the mobile terminal of call
WO2018120889A1 (en) * 2016-12-28 2018-07-05 平安科技(深圳)有限公司 Input sentence error correction method and device, electronic device, and medium
CN110211571A (en) * 2019-04-26 2019-09-06 平安科技(深圳)有限公司 Wrong sentence detection method, device and computer readable storage medium
CN111651978A (en) * 2020-07-13 2020-09-11 深圳市智搜信息技术有限公司 Entity-based lexical examination method and device, computer equipment and storage medium
CN112149406A (en) * 2020-09-25 2020-12-29 中国电子科技集团公司第十五研究所 Chinese text error correction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210709
