CN111489746B - Power grid dispatching voice recognition language model construction method based on BERT - Google Patents
- Publication number
- CN111489746B (application CN202010148584.2A)
- Authority
- CN
- China
- Prior art keywords
- word
- power grid
- grid dispatching
- named entity
- bert
- Prior art date
- Legal status: Active (assumed status; not a legal conclusion)
Classifications
- G10L15/02 — Feature extraction for speech recognition; Selection of recognition unit
- G10L15/063 — Training (under G10L15/06 — Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice)
- G10L15/18 — Speech classification or search using natural language modelling
- G10L15/183 — Speech classification or search using natural language modelling using context dependencies, e.g. language models

All codes fall under G (Physics) → G10L (Speech analysis techniques or speech synthesis; speech recognition; speech or voice processing techniques; speech or audio coding or decoding) → G10L15/00 (Speech recognition).
Abstract
The invention relates to the field of power grid dispatching speech recognition, and in particular to a BERT-based power grid dispatching speech recognition language model construction method comprising the following steps: extracting character-granularity semantic features of power grid dispatching sentences; extracting keyword features of power grid dispatching sentences; extracting named entity features of power grid dispatching sentences; and segmenting the power grid dispatching sentences input into the original BERT model character by character to extract position features, then training the original BERT model on the semantic, keyword, named entity and position features to obtain the power grid dispatching speech recognition language model. The beneficial effects of the invention are: according to the characteristics of the power grid dispatching language and the dispatching speech recognition application scenario, the method improves the input feature vectors and the output probability prediction of the BERT model for dispatching sentences, judges the reasonableness of power grid dispatching sentences in light of the dispatching language's characteristics, and achieves higher accuracy in power grid dispatching speech recognition than other commonly used language models.
Description
Technical Field
The invention relates to the field of power grid dispatching speech recognition, and in particular to a BERT-based power grid dispatching speech recognition language model construction method.
Background
With the expansion of power distribution networks and the advance of informatization, the information involved in distribution network command keeps growing, and dispatchers must perform large volumes of repetitive work every day, such as issuing orders, receiving orders and checking; this creates demand for intelligent virtual dispatchers to replace the repetitive manual labor. The speech recognition link determines how accurately a virtual dispatcher understands the reports of field personnel, and is the basis for correctly processing and sending dispatching instructions. As the two core modules of a speech recognition system, the acoustic model and the language model reconstruct the input speech into text from the perspectives of pronunciation and semantics respectively; the main function of the language model is to give the probability that an input sentence is a reasonable sentence, i.e. to measure its semantic plausibility. Since language models often involve semantic understanding of a specific field, they must be designed around the linguistic features of the application domain to improve accuracy.
At present, there is little research on speech recognition language models in the power domain. Some work builds power speech recognition systems but designs mainly for the acoustic model: on the language-model side only the choice of training corpus is considered and the model structure is left unchanged. Some work adds grammar rules when applying a power dispatching language model to help judge the reasonableness of dispatching language, but the correctness of content involving power grid terminology, named entities and the like is hard to establish through grammar rules. Some work considers power terminology and proposes a dynamic language model optimization method that can add domain words in real time, improving power speech recognition accuracy, but does not fully address fuzzy matching of inaccurate pronunciation. Moreover, the language models adopted in these studies are all statistical language models; neural network language models, which offer better accuracy and generalization, are not used.
Disclosure of Invention
To solve the above problems, the invention provides a BERT-based power grid dispatching speech recognition language model construction method.
A BERT-based power grid dispatching speech recognition language model construction method comprises the following steps:
extracting character-granularity semantic features of power grid dispatching sentences;
extracting keyword features of power grid dispatching sentences;
extracting named entity features of power grid dispatching sentences;
segmenting the power grid dispatching sentences input into the original BERT model character by character to extract position features, and training the original BERT model on the semantic features, keyword features, named entity features and position features to obtain the power grid dispatching speech recognition language model.
Preferably, extracting the character-granularity semantic features of the power grid dispatching sentences comprises:
segmenting the dispatching sentences character by character, the semantic feature vector of each character being generated with the skip-gram model of word2vec.
Preferably, extracting the keyword features of the power grid dispatching sentences comprises:
for each character in the power grid dispatching sentence, splitting its pinyin into an initial, a final and a tone, a whole-syllable recognized syllable being split directly into initial and final, a compound final not being split further, and a character lacking an initial or a tone recording that part as a null value;
calculating the similarity between each character in the power grid dispatching sentence and each keyword;
for each character in the power grid dispatching sentence, taking the semantic feature vector of the keyword with the highest similarity and multiplying it by the similarity to obtain the keyword feature vector of the character.
Preferably, calculating the similarity between each character in the power grid dispatching sentence and each keyword comprises:
the calculation formula is as follows:
in the formula: sim_sheng takes 1 when the two initials are the same, 0.5 when they differ but form a corresponding flat-tongue/warped-tongue pair, and 0 otherwise; sim_yun takes 1 when the two finals are the same, 0.5 when they differ but form a corresponding front-nasal/back-nasal pair, and 0 otherwise; sim_diao takes 1 when the two tones are the same, and 0 otherwise.
Preferably, extracting the named entity features of the power grid dispatching sentences comprises:
constructing a named entity dictionary from the power grid ledger information, and counting the numbers of characters in the shortest and the longest named entity in the dictionary, denoted c and d respectively;
for each character in the power grid dispatching sentence, extracting all character sequences of length q (q = c, c+1, …, d) that contain the character, and calculating the similarity between each character sequence of length q and each named entity of length q in the dictionary;
calculating the named entity feature of each character in the power grid dispatching sentence from the similarity between each character sequence of length q and each named entity of length q in the dictionary.
Preferably, calculating the similarity between each character sequence of length q and each named entity of length q in the named entity dictionary comprises:
the calculation formula is as follows:
in the formula: sim_zi(r) denotes the similarity between the r-th character of the character sequence and the r-th character of the named entity.
Preferably, calculating the named entity feature of each character in the power grid dispatching sentence from the similarity between each character sequence of length q and each named entity of length q in the dictionary comprises:
for each character, letting it have e corresponding character sequences, denoting the maximum similarity between the s-th character sequence (s = 1, 2, …, e) and the named entities as msim_xu(s), giving e similarity maxima in total; letting the largest of these be msim_xu(t), the t-th character sequence is called the matching character sequence of the character, and the named entity feature vector of the character is calculated:
in the formula: f(u) denotes the value of the named entity feature vector in the u-th dimension; g·msim_xu(t) denotes the probability that the matching character sequence is a misrecognition, g being 0 when the matching character sequence is identical to the named entity and 1 otherwise; pos denotes the position of the character within the matching character sequence; len denotes the length of the matching character sequence; dim denotes the dimension of the named entity feature vector.
Preferably, training the original BERT model on the semantic features, keyword features, named entity features and position features to obtain the power grid dispatching speech recognition language model comprises:
performing unsupervised pre-training of the original BERT model on the MLM task;
training the pre-trained BERT model based on the reasonable-sentence probability of dispatching sentences.
Preferably, the unsupervised pre-training of the original BERT model on the MLM task comprises:
the MLM task randomly masking the input of some segmentation units, attaching a softmax layer to the corresponding output representation vectors to predict the masked characters, and training the parameters of the original BERT model over repeated predictions.
Preferably, the training based on the reasonable-sentence probability of dispatching sentences comprises:
for a power grid dispatching sentence containing j characters, masking the input of the k-th character (k = 1, 2, …, j) in turn, predicting with the MLM-pre-trained original BERT model and a softmax layer the probability pro_k that the corresponding output is that character, and finally calculating the probability that the power grid dispatching sentence is a reasonable sentence:
the invention has the beneficial effects that:
according to the method for constructing the power grid scheduling speech recognition language model based on the BERT, the input feature vector and output probability prediction method of the scheduling statement of the BERT model can be improved according to the power grid scheduling language characteristics and the scheduling speech recognition application scene, the rationality judgment of the power grid scheduling statement in combination with the scheduling language characteristics is realized, and the method has higher accuracy in the aspect of power grid scheduling speech recognition compared with other commonly used language models.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic flowchart of a method for building a speech recognition language model for power grid dispatching based on BERT according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be further described below with reference to the accompanying drawings, but the present invention is not limited to these embodiments.
The basic idea of the invention is to combine the characteristics of power grid dispatching sentences, proposing extraction methods for dispatching semantic features, keyword features and named entity features so as to generate multiple classes of feature vectors for the model's input sentences, and to adjust the training procedure of BERT to the characteristics of the power grid dispatching speech recognition task, so that this neural-network-based language model can judge the reasonableness of dispatching sentences without supervision.
Analysis shows that power grid dispatching sentences have the following characteristics: 1) they contain a large number of named entities; for example, field personnel reporting on operating equipment may mention substation names, line names, pole names, switch names and so on, which a general language model usually has difficulty identifying accurately for lack of the corresponding prior knowledge; 2) the wording of power grid dispatching instructions follows the relevant specifications of the power field, and some power terminology has relatively fixed naming patterns: for example, substations are named 'place name + station', lines 'place name + number + line', and poles 'number + pole' or 'place name + number + pole'; 3) because field personnel speak Mandarin with accents and outdoor environments introduce noise, acoustic model recognition of field voice input may yield sentences that differ from the correctly pronounced ones, for example 'construction branch line' recognized as 'deadline branch line'; so when the language model judges a recognition result, the possible difference between its input sentences and the actual sentences must be fully considered.
Based on the above ideas, the invention provides a BERT-based power grid dispatching speech recognition language model construction method, as shown in Fig. 1, comprising the following steps:
s1: and extracting word granularity semantic features of the power grid dispatching sentences.
Whether for a statistics-based language model such as the n-gram or for a neural network language model, sentences are usually segmented at word granularity. However, power grid dispatching sentences contain many named entities specific to the power field and may additionally be distorted by inaccurate pronunciation, so segmenting the dispatching text into words in advance can deviate badly from the actual meaning, for example splitting the substation name in 'Hucheng substation A555 line' across word boundaries. Even generating multiple candidate segmentations often fails to cover the correct one. To avoid the impact of segmentation errors on feature extraction accuracy, the dispatching sentences are segmented directly at character granularity, and the semantic feature vector of each character is generated with the skip-gram model of word2vec. In this character-granularity distributed representation, a piece of power grid dispatching text containing a characters is converted into a b-dimensional vectors, where the p-th vector (p = 1, 2, …, a) represents the semantic features of the p-th character of the text and b is the dimension of each character's feature vector.
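To make the character-granularity representation concrete, the following sketch trains such character vectors with the gensim library; the corpus sentences, vector dimension and window size are illustrative assumptions, not values fixed by the patent:

```python
from gensim.models import Word2Vec

# Character-granularity corpus: each dispatching sentence becomes a list of
# single characters, sidestepping word-segmentation errors. The two sentences
# are illustrative stand-ins, not corpus entries from the patent.
corpus = [
    list("南洋T649线重合闸由信号改跳闸"),
    list("虎成变A555线停电检修"),
]

# sg=1 selects the skip-gram architecture named in the text; vector_size is
# the dimension b of each character's semantic feature vector (100 here is
# an arbitrary choice).
model = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1, sg=1)

semantic_vector = model.wv["线"]  # the b-dimensional semantic feature of one character
```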
S2: and extracting the keyword characteristics of the power grid dispatching statements.
Although the power grid dispatching language belongs to natural language, the professional terms it contains still follow the specifications of the power field. Certain fixed keywords of power terminology effectively delimit the semantic units before and after them; for example, the keywords for 'substation' and 'line' make it possible to identify the substation name field and the line name field in 'Hongyu Nanyang T649 line reclosing changed from signal to trip'. Therefore, for the language model to grasp the true meaning of the power grid dispatching language more accurately, keyword features must be extracted; the specific keywords are listed in Table 1.
Table 1: Keywords of the power grid dispatching language
Since power grid dispatching information is entered by field personnel via voice, the keyword features in a dispatching sentence are extracted in terms of how its characters are pronounced, and a similarity calculation method based on pinyin features is therefore proposed. For each character in the dispatching sentence, its pinyin is first split into three parts: initial, final and tone. A whole-syllable recognized syllable is split directly into initial and final, e.g. 'yin' into 'y' and 'in'; a compound final is not split further, e.g. 'uan', formed from the finals 'u' and 'an', is treated as a single new final; a character without an initial (e.g. 'an') or without a tone records that part as a null value. Then the similarity between each character and each keyword is calculated with the following formula:
in the formula: sim sheng Taking 1 when the two letters are the same, taking 0.5 when the letters are different but are respectively corresponding flat tongue and warped tongue sounds (such as 'z' and 'zh'), and taking 0 in the other cases; sim yun Taking 1 when the two rhymes are the same, taking 0.5 when the rhymes are different but are respectively corresponding anterior nasal sounds and posterior nasal sounds (such as 'an' and 'ang'), and taking 0 in the other cases; sim diao The phase and voice of two words are simultaneously taken as 1, otherwise, 0 is taken. Finally, for each word in the scheduling information, the semantic feature vector of the keyword with the highest similarity is taken (if a plurality of keywords with the highest similarity exist, the average value of the corresponding semantic feature vectors is taken),and multiplying the similarity to obtain the keyword feature vector of the word.
S3: and extracting the named entity characteristics of the power grid dispatching sentences.
Most named entities in the power grid dispatching language, such as substation names and line names, are not common Chinese vocabulary. They therefore occur only sparsely in the power dispatching text corpus and offer very limited usable context, so in practical applications the correctness of a recognized named entity is hard to establish from context alone. Power grid ledger information is therefore introduced to construct the named entity features of the power grid dispatching language and assist in judging whether named entities have been recognized correctly.
To this end, a named entity dictionary is first constructed from the power grid ledger information, which includes the names of the individual power stations, devices and so on. At the same time, the numbers of characters in the shortest and the longest named entity in the dictionary are counted and denoted c and d respectively.
Then, for each character in the power grid dispatching sentence, all character sequences of length q (q = c, c+1, …, d) containing the character are taken, and the similarity between each character sequence of length q and each named entity of length q in the dictionary is computed. This similarity, too, is defined in terms of the pronunciation of the characters, with the following formula:
in the formula: sim zi(r) And (3) representing the similarity between the r th word of the word sequence and the r th word of the named entity, wherein the similarity is calculated according to the formula (1).
Finally, the named entity feature of each character in the power grid dispatching sentence is formed. Each character has e corresponding character sequences; the maximum similarity between the s-th sequence (s = 1, 2, …, e) and the named entities is denoted msim_xu(s), giving e similarity maxima in total. Let the largest of these be msim_xu(t) (i.e. the maximum similarity of the t-th sequence); the t-th sequence is then called the matching character sequence of the character, and the named entity feature vector of the character is computed according to formula (3):
In the formula: f(u) denotes the value of the named entity feature vector in the u-th dimension. g·msim_xu(t) denotes the probability that the matching character sequence is a misrecognition: the higher the similarity msim_xu(t) between the matching character sequence and a named entity, the more likely the sequence is a misrecognized rendering of that entity (for example, 'Huifu Station' misrecognized as the identically pronounced 'recovery station'), so g·msim_xu(t) grows with the similarity; but a matching character sequence that is exactly identical to the named entity is considered correct, i.e. its misrecognition probability is 0, which is achieved by setting g to 0 in that case (and to 1 otherwise) so that g·msim_xu(t) = 0. pos denotes the position of the character within the matching character sequence; len denotes the length of the matching character sequence; dim denotes the dimension of the named entity feature vector.
S4: and segmenting the power grid dispatching sentences input into the BERT original model by taking words as units to extract position features, and training the BERT original model based on semantic features, keyword features, named entity features and position features to obtain a power grid dispatching voice recognition language model.
The power grid dispatching sentences input into BERT are segmented character by character. In the original BERT model structure, three types of features are extracted for each segmentation unit: semantic features, segment features and position features. The semantic feature vector reflects the semantic information of each segmentation unit; the segment feature vector marks which sentence each segmentation unit belongs to when two sentences are input into BERT together; the position feature vector represents the position of each segmentation unit in the sentence. In the power grid dispatching speech recognition language model, the semantic feature vector of each character of a dispatching sentence is generated in step S1; because power grid dispatching instructions appear as single sentences, segment features need not be added to the power grid dispatching language model; and the position feature vectors are learned automatically during model training, following the BERT approach. In addition, to reflect the characteristics of the power grid dispatching language, the keyword feature vectors of step S2 and the named entity feature vectors of step S3 are added, improving the language model's understanding of the power grid dispatching language. Each character of the final dispatching sentence thus carries four categories of features: semantic, position, keyword and named entity features.
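How the four per-character feature vectors enter the model input is not spelled out beyond the BERT analogy; a minimal sketch, assuming the element-wise summation used by the original BERT embeddings, would be:

```python
import numpy as np

def input_embedding(semantic, position, keyword, named_entity):
    # all four vectors share one embedding dimension; the position vectors
    # are learned during training, per the BERT recipe the text follows
    return (np.asarray(semantic) + np.asarray(position)
            + np.asarray(keyword) + np.asarray(named_entity))
```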
During unsupervised pre-training, the original BERT model uses two training tasks: the Masked Language Model (MLM) and Next Sentence Prediction (NSP). The MLM task randomly masks the input of some segmentation units, attaches a softmax layer to the corresponding output representation vectors to predict the masked words or characters, and trains BERT's parameters over many such predictions; the NSP task inputs two sentences at once and trains BERT to predict whether they are consecutive sentences in an actual document. Again because power grid dispatching instructions appear as single sentences, NSP pre-training is unnecessary when constructing the power grid dispatching language model, and only MLM pre-training is performed.
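A minimal MLM pre-training step can be sketched with the HuggingFace transformers library; the checkpoint name is illustrative, and the patent's additional keyword and named entity input vectors are omitted for brevity:

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForMaskedLM.from_pretrained("bert-base-chinese")

inputs = tokenizer("南洋T649线重合闸由信号改跳闸", return_tensors="pt")
labels = inputs.input_ids.clone()

# Randomly mask ~15% of positions, the standard MLM rate (special tokens
# are not excluded in this sketch).
mask = torch.rand(labels.shape) < 0.15
inputs.input_ids[mask] = tokenizer.mask_token_id
labels[~mask] = -100  # only masked positions contribute to the loss

loss = model(**inputs, labels=labels).loss
loss.backward()  # an optimizer step would follow in a real training loop
```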
After unsupervised pre-training, the original BERT model normally undergoes supervised fine-tuning for a specific natural language processing task, but fine-tuning consumes considerable manual effort on data annotation. Matching the task of the power grid dispatching language model, i.e. judging the reasonableness of dispatching sentences, the invention proposes a method for computing the reasonable-sentence probability of a dispatching sentence. For a dispatching sentence containing j characters, the input of the k-th character (k = 1, 2, …, j) is masked in turn, and the BERT model pre-trained on the MLM task, together with its softmax layer, predicts the probability pro_k that the corresponding output is that character; the probability that the dispatching sentence is a reasonable sentence is then obtained as:
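Masking each character in turn and reading off the model's probability for the true character can be sketched as below, building on the previous sketch's model and tokenizer. Since the closing equation is not reproduced in this text, combining the per-character probabilities pro_k by geometric mean is an assumption:

```python
import torch

def sentence_probability(model, tokenizer, sentence):
    """Mask each character in turn and read off pro_k, the model's
    probability for the true character at position k."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids[0]
    log_probs = []
    for k in range(1, ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[k] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits
        pro_k = torch.softmax(logits[0, k], dim=-1)[ids[k]]
        log_probs.append(torch.log(pro_k))
    # assumed combination: geometric mean of the pro_k values
    return torch.exp(torch.stack(log_probs).mean())
```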
the method can fully utilize the pre-training result of the model on the MLM task on one hand, and does not need to add extra labeled data on the other hand, thereby effectively reducing the model training threshold.
Those skilled in the art may make various modifications or additions to the described embodiments, or substitute alternatives, without departing from the spirit or scope of the invention as defined in the appended claims.
Claims (8)
1. A power grid dispatching speech recognition language model construction method based on BERT, characterized by comprising the following steps:
extracting character-granularity semantic features of power grid dispatching sentences;
extracting keyword features of power grid dispatching sentences;
extracting named entity features of power grid dispatching sentences;
segmenting the power grid dispatching sentences input into the original BERT model character by character to extract position features, and training the original BERT model on the semantic features, keyword features, named entity features and position features to obtain the power grid dispatching speech recognition language model;
wherein extracting the named entity features of the power grid dispatching sentences comprises:
constructing a named entity dictionary from the power grid ledger information, and counting the numbers of characters in the shortest and the longest named entity in the dictionary, denoted c and d respectively;
for each character in the power grid dispatching sentence, extracting all character sequences of length q (q = c, c+1, …, d) that contain the character, and calculating the similarity between each character sequence of length q and each named entity of length q in the dictionary;
calculating the named entity feature of each character in the power grid dispatching sentence from the similarity between each character sequence of length q and each named entity of length q in the dictionary;
and wherein calculating the named entity feature of each character from these similarities comprises:
for each character, letting it have e corresponding character sequences, denoting the maximum similarity between the s-th character sequence (s = 1, 2, …, e) and the named entities as msim_xu(s), giving e similarity maxima in total; letting the largest of these be msim_xu(t), the t-th character sequence is called the matching character sequence of the character, and the named entity feature vector of the character is calculated:
in the formula: f(u) denotes the value of the named entity feature vector in the u-th dimension; g·msim_xu(t) denotes the probability that the matching character sequence is a misrecognition, g being 0 when the matching character sequence is identical to the named entity and 1 otherwise; pos denotes the position of the character within the matching character sequence; len denotes the length of the matching character sequence; dim denotes the dimension of the named entity feature vector.
2. The method according to claim 1, characterized in that extracting the character-granularity semantic features of the power grid dispatching sentences comprises:
segmenting the dispatching sentences character by character, the semantic feature vector of each character being generated with the skip-gram model of word2vec.
3. The method according to claim 1, characterized in that extracting the keyword features of the power grid dispatching sentences comprises:
for each character in the power grid dispatching sentence, splitting its pinyin into an initial, a final and a tone, a whole-syllable recognized syllable being split directly into initial and final, a compound final not being split further, and a character lacking an initial or a tone recording that part as a null value;
calculating the similarity between each character in the power grid dispatching sentence and each keyword;
for each character in the power grid dispatching sentence, taking the semantic feature vector of the keyword with the highest similarity and multiplying it by the similarity to obtain the keyword feature vector of the character.
4. The method according to claim 3, characterized in that calculating the similarity between each character in the power grid dispatching sentence and each keyword comprises:
the calculation formula is as follows:
in the formula: sim_sheng takes 1 when the two initials are the same, 0.5 when they differ but form a corresponding flat-tongue/warped-tongue pair, and 0 otherwise; sim_yun takes 1 when the two finals are the same, 0.5 when they differ but form a corresponding front-nasal/back-nasal pair, and 0 otherwise; sim_diao takes 1 when the two tones are the same, and 0 otherwise.
5. The method according to claim 1, characterized in that calculating the similarity between each character sequence of length q and each named entity of length q in the named entity dictionary comprises:
the calculation formula is as follows:
in the formula: sim_zi(r) denotes the similarity between the r-th character of the character sequence and the r-th character of the named entity.
6. The method according to claim 1, characterized in that training the original BERT model on the semantic features, keyword features, named entity features and position features to obtain the power grid dispatching speech recognition language model comprises:
performing unsupervised pre-training of the original BERT model on the MLM task;
training the pre-trained BERT model based on the reasonable-sentence probability of dispatching sentences.
7. The method according to claim 6, characterized in that the unsupervised pre-training of the original BERT model on the MLM task comprises:
the MLM task randomly masking the input of some segmentation units, attaching a softmax layer to the corresponding output representation vectors to predict the masked characters, and training the parameters of the original BERT model over repeated predictions.
8. The method according to claim 7, characterized in that the training based on the reasonable-sentence probability of dispatching sentences comprises:
for a power grid dispatching sentence containing j characters, masking the input of the k-th character (k = 1, 2, …, j) in turn, predicting with the MLM-pre-trained original BERT model and a softmax layer the probability pro_k that the corresponding output is that character, and finally calculating the probability that the power grid dispatching sentence is a reasonable sentence:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202010148584.2A (granted as CN111489746B) | 2020-03-05 | 2020-03-05 | Power grid dispatching voice recognition language model construction method based on BERT
Publications (2)
Publication Number | Publication Date
---|---
CN111489746A | 2020-08-04
CN111489746B | 2022-07-26
Family
ID=71794395
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN202010148584.2A (CN111489746B, active) | Power grid dispatching voice recognition language model construction method based on BERT | 2020-03-05 | 2020-03-05
Country Status (1)
Country | Link
---|---
CN | CN111489746B (en)
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112420042A (en) * | 2020-11-19 | 2021-02-26 | 国网北京市电力公司 | Control method and device of power system |
CN113342585A (en) * | 2021-06-28 | 2021-09-03 | 沈阳工业大学 | PCB wiring fracture detection and identification method based on language semantic judgment |
CN113591475B (en) * | 2021-08-03 | 2023-07-21 | 美的集团(上海)有限公司 | Method and device for unsupervised interpretable word segmentation and electronic equipment |
CN113488061B (en) * | 2021-08-05 | 2024-02-23 | 国网江苏省电力有限公司 | Distribution network dispatcher identity verification method and system based on improved Synth2Aug |
CN113688210B (en) * | 2021-09-06 | 2024-02-09 | 北京科东电力控制系统有限责任公司 | Power grid dispatching intention recognition method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106980620A (en) * | 2016-01-18 | 2017-07-25 | 阿里巴巴集团控股有限公司 | A kind of method and device matched to Chinese character string |
CN109800437A (en) * | 2019-01-31 | 2019-05-24 | 北京工业大学 | A kind of name entity recognition method based on Fusion Features |
CN110083831A (en) * | 2019-04-16 | 2019-08-02 | 武汉大学 | A kind of Chinese name entity recognition method based on BERT-BiGRU-CRF |
CN110263182A (en) * | 2019-06-18 | 2019-09-20 | 京东方科技集团股份有限公司 | Paintings recommended method and system, terminal device, computer equipment and medium |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3507708A4 (en) * | 2016-10-10 | 2020-04-29 | Microsoft Technology Licensing, LLC | Combo of language understanding and information retrieval |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |