WO2023000725A1 - Named entity recognition method, apparatus and computer device for power metering - Google Patents

Named entity recognition method, apparatus and computer device for power metering

Info

Publication number
WO2023000725A1
WO2023000725A1 PCT/CN2022/087120 CN2022087120W WO2023000725A1 WO 2023000725 A1 WO2023000725 A1 WO 2023000725A1 CN 2022087120 W CN2022087120 W CN 2022087120W WO 2023000725 A1 WO2023000725 A1 WO 2023000725A1
Authority
WO
WIPO (PCT)
Prior art keywords
word vector
reference feature
word
feature set
corpus
Prior art date
Application number
PCT/CN2022/087120
Other languages
English (en)
French (fr)
Inventor
梁洪浩
伍少成
姜和芳
陈晓伟
Original Assignee
深圳供电局有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳供电局有限公司 filed Critical 深圳供电局有限公司
Publication of WO2023000725A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Definitions

  • the present application relates to the technical field of named entity recognition, and in particular to a named entity recognition method, device, computer equipment and storage medium for power metering.
  • existing deep learning models do not fully account for the name overlap of named entities in power metering.
  • a named entity recognition method for electric power metering, comprising:
  • the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set; each element in the first reference feature set is a word vector corresponding to a single word, each element in the second reference feature set is the word vectors corresponding to two adjacent words, and each element in the third reference feature set is the word vectors corresponding to three adjacent words;
  • a plurality of reference feature sets are input into the trained word vector feature extraction model, so that the model determines the word vector features corresponding to each reference feature set based on the elements in that set;
  • the word vector features include the part-of-speech feature corresponding to each word vector, the relevance feature corresponding to two adjacent word vectors, and the relevance feature corresponding to three adjacent word vectors;
  • the word vector features corresponding to each of the multiple reference feature sets are input into a preset conditional random field, and the named entities in the corpus to be recognized are determined according to the labeling results output by the conditional random field.
  • the determination of the word vector feature corresponding to the reference feature set based on each element in the same reference feature set includes:
  • the similarity is input into a pre-trained single-layer neural network, and the similarity is enlarged or reduced by the single-layer neural network to obtain an adjusted similarity;
  • the determination of the word vector feature corresponding to the reference feature set according to the adjusted similarity includes:
  • the attention feature corresponding to the attention coefficient is determined according to a preset mapping relationship, and multiple attention features are input into the forward neural network to obtain the word vector features corresponding to the reference feature set; in the mapping relationship:
  • h_i is the attention feature;
  • K is the number of attention heads.
  • the word vectors are combined to obtain multiple reference feature sets, including:
  • the arrangement order of each of the multiple word vectors corresponds to the arrangement order of the corresponding words in the corpus to be recognized;
  • a first reference feature set is generated.
  • the word vectors are combined to obtain multiple reference feature sets, including:
  • a second reference feature set is generated according to multiple groups of word vector pairs.
  • it also includes:
  • the power metering corpus includes multiple pieces of corpus for describing power metering information;
  • the initialized word vector model is trained using the obtained words to obtain a trained word vector model, and the trained word vector model is used to identify the word vector corresponding to each word in the power metering corpus.
  • it also includes:
  • the label comprises the power metering named entities in the sample corpus and the entity category corresponding to each named entity;
  • the sample feature sets include a first sample feature set, a second sample feature set and a third sample feature set; each element in the first sample feature set is a word vector corresponding to a single word, each element in the second sample feature set is the word vectors corresponding to two adjacent words, and each element in the third sample feature set is the word vectors corresponding to three adjacent words;
  • each sample feature set is input into the machine translation model to be trained so that the self-attention layer in the machine translation model determines the word vector features corresponding to the sample feature set; multiple word vector features are input into the preset conditional random field, and the predicted named entities in the sample corpus are determined according to the prediction results output by the conditional random field;
  • a named entity recognition device for electric power metering, comprising:
  • the word vector acquisition module is used to acquire word vectors corresponding to multiple words in the corpus to be recognized for describing the power metering information
  • a reference feature set acquisition module configured to combine the word vectors to obtain a plurality of reference feature sets;
  • the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set; each element in the first reference feature set is a word vector corresponding to a single word, each element in the second reference feature set is the word vectors corresponding to two adjacent words, and each element in the third reference feature set is the word vectors corresponding to three adjacent words;
  • the word vector feature acquisition module is used to input multiple reference feature sets into the trained word vector feature extraction model, so that the model determines the word vector features corresponding to each reference feature set based on the elements in that set; the word vector features include the part-of-speech feature corresponding to each word vector, the relevance feature corresponding to two adjacent word vectors and the relevance feature corresponding to three adjacent word vectors;
  • the named entity determination module is used to input the word vector features corresponding to the multiple reference feature sets into the preset conditional random field, and to determine the named entities in the corpus to be recognized according to the labeling results output by the conditional random field.
  • a computer device includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of any one of the above methods when executing the computer program.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of any one of the above methods are implemented.
  • the named entity recognition method, device, computer equipment and storage medium for power metering described above obtain word vectors corresponding to multiple words in a corpus to be recognized that describes power metering information, and combine the word vectors into multiple reference feature sets, where the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set: each element in the first reference feature set is the word vector of a single word, each element in the second is the word vectors of two adjacent words, and each element in the third is the word vectors of three adjacent words. The multiple reference feature sets are then input into the trained word vector feature extraction model, which determines the word vector features of each reference feature set based on the elements in that set; the word vector features of the multiple reference feature sets are input into a preset conditional random field, and the named entities in the corpus to be recognized are determined from the labeling results output by the conditional random field.
  • because the word vector features of the corpus to be recognized are extracted by comparing single word vectors, pairs of adjacent word vectors and triples of adjacent word vectors, the relationship between adjacent words can be identified and it can be judged whether multiple words as a whole constitute the same named entity. This avoids erroneously splitting the same named entity, solves the problem of overlapping named entity names in power metering, reduces the impact of pre-segmentation and effectively improves the accuracy of named entity recognition.
  • Fig. 1 is a schematic flow chart of a named entity recognition method for electric power metering in an embodiment
  • Fig. 2 is a schematic flow chart of a named entity recognition method for electric power metering in another embodiment
  • Fig. 3 is a structural block diagram of a named entity recognition device for electric power metering in an embodiment
  • Fig. 4 is an internal block diagram of a computer device in one embodiment.
  • the knowledge graph describes concepts, entities and their relationships in the objective world in a structured form.
  • the unit in the knowledge graph is the "entity-relationship-entity" triplet, and the relationship between entities is organized into a networked knowledge structure.
  • knowledge graphs can be used to solidify scheduling knowledge and provide knowledge and data support for grid operation monitoring and decision-making.
  • the purpose of named entity recognition of power metering is to identify power metering entities and their categories in specific fields, and provide a basis for establishing and analyzing knowledge graphs of power metering.
  • existing deep learning models do not fully account for the name overlap of named entities in power metering.
  • for example, the named entity "current loss" should be regarded as a single named entity, but when obtaining named entities it may be wrongly divided into "current" and "loss", with "current" treated as an isolated named entity.
  • a method for identifying a named entity for electric power metering is provided.
  • This embodiment uses an example in which the method is applied to a server for illustration; it can be understood that the method can also be applied to a terminal, or to a system including a terminal and a server, implemented through interaction between the terminal and the server.
  • the method may include the following steps:
  • Step 101: word vectors corresponding to multiple words in the corpus to be recognized for describing power metering information are obtained.
  • the word vector may be a vector obtained by mapping words to real numbers.
  • the server can obtain the corpus to be recognized for describing the power metering information, and after performing word segmentation on the corpus to be recognized, can determine word vectors corresponding to each of the multiple words in the corpus to be recognized.
  • Step 102: combine the word vectors to obtain multiple reference feature sets.
  • the reference feature set includes a first reference feature set, a second reference feature set and a third reference feature set
  • each element in the first reference feature set is a word vector corresponding to a word
  • each element in the second reference feature set The elements are word vectors corresponding to two adjacent words
  • each element in the third reference feature set is a word vector corresponding to three adjacent words.
  • for example, "electric quantity difference anomaly" can be determined as one named entity, but it may also be divided into the two named entities "electric quantity" and "difference anomaly";
  • "electric energy meter replacement" may be divided into "electric energy meter" and "replacement" by the word segmentation module; that is, some adjacent words may be marked as the same named entity, or may be split into isolated named entities.
  • multiple word vectors can therefore be combined to obtain multiple reference feature sets. Specifically, the word vectors can be traversed one by one, with each individual word vector determined as an element, to obtain the first reference feature set; the word vectors can also be traversed in pairs, with each element generated from two adjacent word vectors, to obtain a second reference feature set containing multiple word vector pairs; likewise, each element can be generated from three adjacent word vectors to obtain the third reference feature set.
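The combination step described above can be sketched in plain Python; the function and variable names are illustrative and not taken from the patent:

```python
def build_reference_feature_sets(word_vectors):
    """Combine word vectors into the three reference feature sets.

    Each element of the first set is a single word vector; each element of
    the second set is a pair of adjacent word vectors; each element of the
    third set is a triple of adjacent word vectors.
    """
    first = [(v,) for v in word_vectors]
    second = [(word_vectors[i], word_vectors[i + 1])
              for i in range(len(word_vectors) - 1)]
    third = [(word_vectors[i], word_vectors[i + 1], word_vectors[i + 2])
             for i in range(len(word_vectors) - 2)]
    return first, second, third
```

For a corpus segmented into four words, this yields four unigram elements, three adjacent pairs and two adjacent triples, preserving the arrangement order of the words.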
  • Step 103: input multiple reference feature sets into the trained word vector feature extraction model, so as to determine, through the model, the word vector features corresponding to each reference feature set based on the elements in that set.
  • the word vector features include the part-of-speech features corresponding to each word vector, the relevance features corresponding to two adjacent word vectors, and the relevance features corresponding to three adjacent word vectors.
  • the part-of-speech feature can be determined based on each element in the first reference feature set, and may include Chinese and/or English parts of speech, such as nouns, verbs, adjectives, adverbs, predicates, attributives and adverbials.
  • the multiple reference feature sets can be input into the trained word vector feature extraction model.
  • the word vector feature extraction model can determine the word vector features corresponding to the reference feature set based on each element in the same reference feature set.
  • the word vector feature extraction model can be composed of three processing modules, and each processing module can include one or more self-attention layers.
  • the three processing modules respectively receive the first reference feature set, the second reference feature set and the third reference feature set; the self-attention layer in each processing module can determine the word vector features corresponding to the received reference feature set based on its elements.
  • Step 104: input the word vector features corresponding to the multiple reference feature sets into a preset conditional random field, and determine the named entities in the corpus to be recognized according to the labeling results output by the conditional random field.
  • the word vector features corresponding to the multiple reference feature sets can be input into the preset conditional random field; based on the multiple input word vector features, the conditional random field outputs the tagging result corresponding to each word, and the named entities in the corpus to be recognized can then be determined according to the tagging results. After the named entities are determined, the corpus to be recognized can be marked and saved based on the recognized named entities.
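As an illustration of how per-word labeling results from a conditional random field can be turned into named entities, the sketch below assumes a BIO tagging scheme ("B-" begins an entity, "I-" continues it, "O" is outside); the patent does not specify its tag format, so the scheme and category codes here are assumptions:

```python
def decode_bio(words, tags):
    """Group words into named entities from BIO tags such as 'B-PHE', 'I-PHE', 'O'.

    Returns a list of (entity_text, category) tuples.
    """
    entities, current, category = [], [], None
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            # A new entity begins; flush any entity in progress.
            if current:
                entities.append(("".join(current), category))
            current, category = [word], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == category:
            # Continuation of the current entity.
            current.append(word)
        else:
            # 'O' tag or inconsistent continuation: flush.
            if current:
                entities.append(("".join(current), category))
            current, category = [], None
    if current:
        entities.append(("".join(current), category))
    return entities
```

With this decoding, adjacent words such as "电流" and "消失" tagged B-PHE/I-PHE are kept together as one phenomenon entity instead of being split.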
  • in this way, word vectors corresponding to multiple words in the corpus to be recognized, which describes power metering information, are obtained and combined into multiple reference feature sets: a first reference feature set whose elements are the word vectors of single words, a second whose elements are the word vectors of two adjacent words, and a third whose elements are the word vectors of three adjacent words. The reference feature sets are input into the trained word vector feature extraction model, which determines the word vector features of each set based on its elements; these features are input into the preset conditional random field, and the named entities in the corpus are determined from its labeling results.
  • since the word vector features are extracted by comparing single word vectors, pairs of adjacent word vectors and triples of adjacent word vectors, the relationship between adjacent words can be identified and it can be judged whether multiple words as a whole constitute the same named entity. This avoids erroneously splitting the same named entity, solves the problem of overlapping named entity names in power metering, reduces the impact of pre-segmentation and effectively improves the accuracy of named entity recognition.
  • the method may also include the following steps:
  • the power metering corpus includes multiple pieces of corpus used to describe power metering information, and the trained word vector model is used to identify word vectors corresponding to each word in the power metering corpus.
  • the entity types in the general domain are generally people, places, organizations, etc., and the naming format is relatively standardized.
  • many named entity datasets in the general domain have been made public and used for model training.
  • the power metering corpus can be constructed in advance. Specifically, a large amount of corpus related to power metering exists in the power system; for example, it can be obtained from an existing power metering information processing system, or from subject information such as business reports and power metering statistics; for an English corpus, power-metering-related corpus can be obtained from English knowledge bases. After a large amount of corpus related to power metering is obtained, data cleaning can be performed to remove irrelevant information, yielding a power metering corpus that includes both English and Chinese corpus.
  • the preset word segmentation model can be used to segment the corpus in the power metering corpus to obtain multiple words used to describe the power metering information; for example, the sentences in the corpus can be structurally divided based on punctuation marks.
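The punctuation-based structural division mentioned above can be sketched as follows; the punctuation set and function name are illustrative, and the patent's actual word segmentation model is more sophisticated:

```python
import re

# Split a corpus line into clause-level segments on common Chinese and
# English punctuation; the punctuation set here is illustrative.
PUNCT = r"[，。；：、！？,.;:!?]"

def split_sentences(text):
    """Structurally divide text on punctuation, dropping empty segments."""
    return [segment for segment in re.split(PUNCT, text) if segment]
```

Each resulting segment can then be passed to a finer-grained word segmentation step.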
  • the currently obtained words can be used to train the initialized word vector model to obtain a trained word vector model, for example a Word2Vec model, so that the model can be used to map words describing power metering information into word vectors.
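The text names Word2Vec as the word vector model. As a dependency-free stand-in (not the patent's training procedure), the sketch below maps each word to a simple co-occurrence count vector, which illustrates the underlying idea of representing words as real-valued vectors:

```python
from collections import defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a vector of neighbour counts.

    A dependency-free stand-in for a trained Word2Vec model: each word's
    vector counts its co-occurrences with every vocabulary word within
    the given window.
    """
    vocab = sorted({w for sent in sentences for w in sent})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = defaultdict(lambda: [0] * len(vocab))
    for sent in sentences:
        for i, word in enumerate(sent):
            for j in range(max(0, i - window), min(len(sent), i + window + 1)):
                if i != j:
                    vectors[word][index[sent[j]]] += 1
    return dict(vectors), vocab
```

A real system would train Word2Vec (or a similar model) on the segmented power metering corpus instead of counting co-occurrences directly.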
  • the method may also include the following steps:
  • the label includes the named entity of power metering in the sample corpus and the entity category corresponding to the named entity.
  • the entity category corresponding to each named entity can be any of the following: power metering indicators, power metering objects, power metering phenomena, and power metering behaviors.
  • the boundaries between different named entities are relatively fuzzy.
  • terms in statistical power consumption data, such as "power consumption", "meter reading rate" and "current", can be marked and classified as power metering indicator entities.
  • power metering object entities such as “electric energy meter”, “Guangzhou Power Supply Bureau”, etc.
  • the phenomenon generated by a specific subject in the power metering process is identified as a power metering phenomenon entity, such as “power meter stop”, “current loss”, “current imbalance”, etc.
  • Power metering operations for specific actions can be marked as power metering behavior entities, such as "meter reading” and "abnormal maintenance”.
  • the indicators and objects of power metering entities are mostly nouns
  • power metering phenomena are mostly a combination of nouns and verbs
  • power metering behaviors are mostly verbs.
  • the sample feature sets include a first sample feature set, a second sample feature set and a third sample feature set; each element in the first sample feature set is a word vector corresponding to a single word, each element in the second sample feature set is the word vectors corresponding to two adjacent words, and each element in the third sample feature set is the word vectors corresponding to three adjacent words.
  • sample corpus and their corresponding labels can be obtained.
  • the word segmentation model can be used to segment the sample corpus to obtain multiple sample words, and the word vector corresponding to each sample word can then be obtained through the trained word vector model.
  • the multiple word vectors can be combined to obtain multiple sample feature sets corresponding to the word vectors.
  • the method of obtaining the sample feature sets is similar to that of the reference feature sets; for details, refer to the method for obtaining the reference feature sets, which will not be repeated in this embodiment.
  • each sample feature set can be input to the machine translation model to be trained.
  • after the machine translation model has obtained the multiple sample feature sets, it can obtain the corresponding word vector features through the multiple self-attention layers in the machine translation model.
  • the machine translation model may be composed of three processing modules, and the machine translation model including the three processing modules may be called a third-order machine translation model.
  • each processing module may include one or more self-attention layers.
  • the self-attention layer in the processing module can determine the word vector features corresponding to the sample feature set based on each element in the corresponding sample feature set.
  • multiple word vector features can be input into the preset conditional random field, the predicted named entities in the sample corpus can be determined according to the prediction results output by the conditional random field, and the model parameters of the machine translation model can be adjusted according to the predicted named entities and the labels. The above training process is repeated until the end-of-training condition is met, yielding the word vector feature extraction model.
  • in this way, the sample corpus and its corresponding labels are obtained, the word segmentation model is used to obtain multiple sample words from the sample corpus, the word vectors corresponding to the sample words are obtained through the trained word vector model, and the word vectors are combined into multiple sample feature sets. Each sample feature set is input into the machine translation model to be trained so that the self-attention layer in the model determines the word vector features corresponding to the sample feature set; the multiple word vector features are input into the preset conditional random field, the predicted named entities in the sample corpus are determined according to the prediction results output by the conditional random field, the model parameters of the machine translation model are adjusted according to the predicted named entities and the labels, and the training process is repeated until the end-of-training condition is met. The resulting word vector feature extraction model provides a basis for accurately identifying named entities in power metering.
  • said determining the word vector feature corresponding to the reference feature set based on each element in the same reference feature set may include the following steps:
  • each processing module in the word vector feature extraction model can process its corresponding reference feature set; for each reference feature set, the processing module can determine the similarity between adjacent elements in the same reference feature set.
  • the word vector feature extraction model may include one or more self-attention layers, and each self-attention layer consists of three parts: a neural cosine similarity function, an attention coefficient and an attention feature.
  • the cosine function in the self-attention layer can determine the similarity between adjacent elements in the same reference feature set; cosine similarity measures how similar two input vectors are by the angle between them. Specifically, it can be determined by cos(f_i, f_j) = (f_i · f_j) / (‖f_i‖ ‖f_j‖), where f_i represents the vector corresponding to the i-th element.
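The cosine similarity between adjacent element vectors can be computed directly:

```python
import math

def cosine_similarity(f_i, f_j):
    """Cosine of the angle between two element vectors f_i and f_j."""
    dot = sum(a * b for a, b in zip(f_i, f_j))
    norm_i = math.sqrt(sum(a * a for a in f_i))
    norm_j = math.sqrt(sum(b * b for b in f_j))
    return dot / (norm_i * norm_j)
```

Orthogonal vectors score 0 and parallel vectors score 1, regardless of their magnitudes.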
  • the similarity can then be input into the pre-trained single-layer neural network in the word vector feature extraction model, which scales the similarity up or down to obtain an adjusted similarity.
  • the adjusted similarity can more accurately reflect the difference between adjacent word vectors, so that the word vector features corresponding to the reference feature set can be determined according to the adjusted similarity.
  • the single-layer neural network can be expressed as a function Neural of the weighted adjacent elements (the formula appears as an image in the original publication and is not reproduced here), where Similarity_ij is the adjusted similarity, Wf_i and Wf_j are the pre-trained weightings of the two adjacent elements, V is the network parameter of the single-layer neural network, and Neural denotes the single-layer neural network; the ReLU activation function can be used to prevent the similarity gradient from vanishing.
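The exact Neural(·) formula is not reproduced in this text; the sketch below assumes an additive single-layer form, v · ReLU(W·f_i + W·f_j), which uses the stated ingredients (a shared weighting of the two adjacent elements, a parameter vector V, and ReLU) but is only an assumption about the patent's formula:

```python
def relu(xs):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in xs]

def adjusted_similarity(f_i, f_j, W, v):
    """Hypothetical single-layer network: v . ReLU(W f_i + W f_j).

    W is a shared (hidden x dim) weight matrix applied to both adjacent
    element vectors and v is the output parameter vector; this additive
    form is an assumption, not the patent's exact formula.
    """
    def matvec(M, x):
        return [sum(m * xi for m, xi in zip(row, x)) for row in M]
    hidden = [a + b for a, b in zip(matvec(W, f_i), matvec(W, f_j))]
    return sum(p * h for p, h in zip(v, relu(hidden)))
```

The scalar output plays the role of the adjusted similarity that is later normalized into an attention coefficient.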
  • determining the word vector feature corresponding to the reference feature set according to the adjusted similarity may include the following steps:
  • the attention coefficient corresponding to the adjusted similarity is obtained; the attention feature corresponding to the attention coefficient is determined according to the preset mapping relationship, and multiple attention features are input into the forward neural network to obtain the word vector features corresponding to the reference feature set. In the mapping relationship:
  • h_i is the attention feature
  • K is the number of attention heads
  • sigmoid is the activation function
  • the attention coefficient corresponding to the adjusted similarity can be determined from the adjusted similarities by a preset formula (the formula appears as an image in the original publication and is not reproduced here).
  • after the attention coefficient is determined, it can be substituted into the preset mapping relationship to determine the corresponding attention feature, and the obtained attention features are input into the forward neural network of the processing module to obtain the word vector features corresponding to the reference feature set. Specifically, after the multiple attention features are obtained, they can be subjected to feature fusion (add) and normalization and then input into the forward neural network; after the network processes the attention features, its output is again subjected to feature fusion and normalization, and the processing module then outputs the word vector features corresponding to the reference feature set.
  • in this way, the attention feature corresponding to each attention coefficient is determined, and multiple attention features are input into the forward neural network to obtain the word vector features corresponding to the reference feature set, providing a basis for accurately identifying named entities in power metering.
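The attention-coefficient and mapping-relationship formulas appear as images in the original publication. As an assumption about their general shape, a common construction normalizes the adjusted similarities with a softmax and forms the attention feature as a coefficient-weighted sum of the element vectors:

```python
import math

def softmax(scores):
    """Normalize scores into attention coefficients that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_feature(element_vectors, similarities):
    """Hypothetical attention step: softmax-normalized similarities weight
    the element vectors into one attention feature h_i. The patent's actual
    coefficient and mapping formulas are not reproduced in the text, so
    this construction is an assumption.
    """
    coeffs = softmax(similarities)
    dim = len(element_vectors[0])
    return [sum(c * vec[d] for c, vec in zip(coeffs, element_vectors))
            for d in range(dim)]
```

With K attention heads, K such features would be produced and combined (the text mentions a sigmoid activation in the mapping relationship) before the feed-forward network.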
  • the word vectors are combined to obtain multiple reference feature sets, including:
  • determining the arrangement order corresponding to each of the multiple word vectors, and generating a first reference feature set based on each word vector and its corresponding order.
  • the sequence corresponding to each word vector can be obtained, wherein the sequence corresponding to each of the multiple word vectors corresponds to the sequence corresponding to multiple words in the corpus to be recognized.
  • the first reference feature set may be generated based on each word vector and its corresponding arrangement order. Specifically, after the arrangement order corresponding to each word vector is determined, each word vector can be determined as an element in turn, and by traversing the multiple word vectors, a first reference feature set containing multiple elements can be generated; each element in the first reference feature set corresponds to one word vector.
  • the part-of-speech feature corresponding to each word can be determined based on the arrangement sequence corresponding to each word vector in the first reference feature set.
  • a word vector is an expression that maps a word into vector form; when multiple word vectors are arranged in order, they represent the segmented corpus to be recognized in vector form. Because different words occupy different positions and contexts in the corpus to be recognized, the part-of-speech feature corresponding to each word vector can be obtained by analyzing the arrangement order of the corresponding word vectors.
  • the first reference feature set is generated, which can provide a basis for determining the similarity between adjacent word vectors.
  • the combination of the word vectors to obtain a plurality of reference feature sets may include the following steps:
  • the order of arrangement corresponding to each word vector can be obtained, and the order of arrangement corresponding to each of the plurality of word vectors corresponds to the order of arrangement corresponding to the plurality of words in the corpus to be recognized.
  • multiple sets of adjacent word vectors can be obtained to obtain multiple sets of word vector pairs. Specifically, after determining the arrangement order corresponding to each word vector, multiple word vectors can be traversed in pairs to extract two adjacent word vectors. After obtaining multiple sets of word vector pairs, each set of word vector pairs can be determined as an element, and a second reference feature set can be generated based on the multiple elements. In the subsequent processing, the word vector feature extraction model can perform combined analysis on two adjacent word vectors to obtain the combined features corresponding to the two adjacent words.
  • the combination of the word vectors to obtain a plurality of reference feature sets may include the following steps:
  • the sequence corresponding to each word vector can be obtained. After obtaining the sequence corresponding to each word vector, based on each word vector and its corresponding sequence, three adjacent word vectors can be determined as a combination of word vectors to obtain multiple combinations of word vectors. After obtaining multiple combinations of word vectors, each combination of word vectors can be determined as an element, and a third reference feature set can be generated based on the multiple elements. In the subsequent processing, the word vector feature extraction model can perform combined analysis on the three adjacent word vectors to obtain the combined features corresponding to the three adjacent words.
  • the third reference feature set is generated, which can provide a basis for obtaining the combined features between three adjacent word vectors, and reduce the influence of the word segmentation model on the incorrect word segmentation of the corpus to be recognized.
  • for the corpus to be recognized that describes power metering information, the preset word segmentation model can be used to segment it, and the word vector corresponding to each word can be obtained through the trained word vector model. After the word vectors are obtained, multiple word vectors can be combined to obtain the first reference feature set, the second reference feature set and the third reference feature set, which are input into the trained word vector feature extraction model.
  • the word vector feature extraction model can process them respectively through the first processing module, the second processing module and the third processing module.
  • in the first processing module, after the first reference feature set is received, the position codes corresponding to its elements can be determined, and the neural cosine multi-head attention mechanism processes the elements of the first reference feature set: the similarity between adjacent elements is obtained, enlarged or reduced through the trained single-layer neural network, and the attention features are determined from the adjusted similarity.
  • after the multiple attention features are obtained, they are fused and normalized, the result is input into the feed-forward neural network, and after further fusion and normalization, the part-of-speech features corresponding to the first reference feature set are obtained as word vector features.
  • the second reference feature set can be processed by the second processing module to obtain the correlation features corresponding to two adjacent word vectors as word vector features.
  • the third reference feature set can be processed by the third processing module to obtain the relevance features corresponding to three adjacent word vectors as word vector features.
  • the processing procedures of the second processing module and the third processing module are the same as the processing procedures of the first processing module, which will not be repeated in this embodiment.
  • conditional random field can be used to determine the corresponding labeling results of the corpus to be recognized based on the feature function, and determine the named entities in the corpus to be recognized.
  • although the steps in the flowchart of FIG. 1 are displayed sequentially as indicated by the arrows, they are not necessarily executed in that order. Unless otherwise specified herein, there is no strict order restriction on their execution, and they can be executed in other orders. Moreover, at least some of the steps in FIG. 1 may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be executed at different times; their execution order is also not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
  • a named entity recognition device for electric power metering includes:
  • a word vector acquisition module 301, configured to acquire word vectors corresponding to multiple words in the corpus to be recognized that describes power metering information;
  • a reference feature set acquisition module 302, configured to combine the word vectors to obtain a plurality of reference feature sets;
  • the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set; each element in the first reference feature set is a word vector corresponding to one word, each element in the second reference feature set is a word vector corresponding to two adjacent words, and each element in the third reference feature set is a word vector corresponding to three adjacent words;
  • a word vector feature acquisition module 303, configured to input the multiple reference feature sets into the trained word vector feature extraction model, so that the model determines, based on each element in the same reference feature set, the word vector feature corresponding to that reference feature set;
  • the word vector features include the part-of-speech feature corresponding to each word vector, the relevance feature corresponding to two adjacent word vectors and the relevance feature corresponding to three adjacent word vectors;
  • a named entity determination module 304, configured to input the word vector features corresponding to each of the multiple reference feature sets into the preset conditional random field, and determine the named entities in the corpus to be recognized according to the labeling results output by the conditional random field.
  • the word vector feature acquisition module 303 includes:
  • a similarity determination submodule is used to determine the similarity corresponding to adjacent elements in the same reference feature set
  • the similarity adjustment submodule is used to input the similarity into the pre-trained single-layer neural network, and enlarge or reduce the similarity through the single-layer neural network to obtain the adjusted similarity;
  • the word vector feature determination submodule is used to determine the word vector feature corresponding to the reference feature set according to the adjusted similarity.
  • the word vector feature determination submodule is specifically used to:
  • obtain the attention coefficient corresponding to the adjusted similarity;
  • determine, according to a preset mapping relationship, the attention feature corresponding to the attention coefficient, and input a plurality of attention features into the feed-forward neural network to obtain the word vector feature corresponding to the reference feature set; the mapping relationship is:
  • h_i = sigmoid( (1/K) · Σ_{k=1..K} Σ_j α_ij^k · W^k · f_j )
  • where h_i is the attention feature, α_ij^k is the attention coefficient, and K is the number of attention heads.
  • the reference feature set acquisition module 302 includes:
  • the first sequence determination submodule is used to determine the corresponding order of arrangement of multiple word vectors; the order of arrangement corresponding to each of the plurality of word vectors corresponds to the order of arrangement corresponding to the plurality of words in the corpus to be identified;
  • the first reference feature set generating submodule is configured to generate the first reference feature set based on each word vector and its corresponding arrangement order.
  • the word vectors are combined to obtain multiple reference feature sets, including:
  • the second sequence determination submodule is used to determine the arrangement order corresponding to each of the multiple word vectors;
  • the combination sub-module is used to obtain multiple groups of adjacent word vectors based on the corresponding arrangement order of multiple word vectors, and obtain multiple groups of word vector pairs;
  • the second reference feature set generation sub-module is used to generate the second reference feature set according to multiple groups of word vector pairs.
  • the device also includes:
  • a corpus acquisition module configured to acquire a pre-built power metering corpus; the power metering corpus includes multiple pieces of corpus for describing power metering information;
  • a word segmentation module configured to use a preset word segmentation model to segment the corpus of the power metering corpus to obtain a plurality of words used to describe the power metering information;
  • the first training module is used to train the initialized word vector model by using the obtained multiple words to obtain a trained word vector model, and the trained word vector model is used to identify the word vector corresponding to each word in the power measurement corpus.
  • the device also includes:
  • a sample corpus acquisition module configured to acquire a sample corpus and its corresponding labels; the labels include named entities of power metering in the sample corpus and entity categories corresponding to the named entities;
  • the sample word vector acquisition module is used to obtain a plurality of sample words corresponding to the sample corpus by using the word segmentation model, and to obtain the word vectors corresponding to the sample words through the trained word vector model;
  • a sample feature set acquisition module configured to acquire a plurality of sample feature sets corresponding to the word vectors;
  • the sample feature sets include a first sample feature set, a second sample feature set and a third sample feature set; each element in the first sample feature set is a word vector corresponding to one word, each element in the second sample feature set is a word vector corresponding to two adjacent words, and each element in the third sample feature set is a word vector corresponding to three adjacent words;
  • the second training module is used to input each sample feature set into the machine translation model to be trained, so as to determine the word vector features corresponding to the sample feature sets through the self-attention layer in the machine translation model, input the multiple word vector features into the preset conditional random field, and determine the predicted named entities in the sample corpus according to the prediction results output by the conditional random field;
  • the parameter adjustment module is used to adjust the model parameters of the machine translation model according to the predicted named entity and the label, and repeat the training process until the training end condition is met to obtain a word vector feature extraction model.
  • Each module in the above-mentioned named entity recognition device for electric power metering can be fully or partially realized by software, hardware and a combination thereof.
  • the above-mentioned modules can be embedded in or independent of the processor in the computer device in the form of hardware, and can also be stored in the memory of the computer device in the form of software, so that the processor can invoke and execute the corresponding operations of the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure may be as shown in FIG. 4 .
  • the computer device includes a processor, memory and a network interface connected by a system bus. Wherein, the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer programs and databases.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer device is used to store word vectors.
  • the network interface of the computer device is used to communicate with an external terminal via a network connection.
  • FIG. 4 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation to the computer equipment on which the solution of the application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor, a computer program is stored in the memory, and the processor implements the following steps when executing the computer program:
  • the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set; each element in the first reference feature set is a word vector corresponding to one word, each element in the second reference feature set is a word vector corresponding to two adjacent words, and each element in the third reference feature set is a word vector corresponding to three adjacent words;
  • a plurality of reference feature sets are input into the trained word vector feature extraction model, so as to determine the word vector feature corresponding to the reference feature set based on each element in the reference feature set through the word vector feature extraction model;
  • the word vector features include the part-of-speech feature corresponding to each word vector, the relevance feature corresponding to two adjacent word vectors, and the relevance feature corresponding to three adjacent word vectors;
  • the word vector features corresponding to each of the multiple reference feature sets are input into a preset conditional random field, and the named entities in the corpus to be recognized are determined according to the labeling results output by the conditional random field.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set; each element in the first reference feature set is a word vector corresponding to one word, each element in the second reference feature set is a word vector corresponding to two adjacent words, and each element in the third reference feature set is a word vector corresponding to three adjacent words;
  • a plurality of reference feature sets are input into the trained word vector feature extraction model, so as to determine the word vector feature corresponding to the reference feature set based on each element in the reference feature set through the word vector feature extraction model;
  • the word vector features include the part-of-speech feature corresponding to each word vector, the relevance feature corresponding to two adjacent word vectors, and the relevance feature corresponding to three adjacent word vectors;
  • the word vector features corresponding to each of the multiple reference feature sets are input into a preset conditional random field, and the named entities in the corpus to be recognized are determined according to the labeling results output by the conditional random field.
  • Non-volatile memory can include read-only memory (Read-Only Memory, ROM), magnetic tape, floppy disk, flash memory or optical memory, etc.
  • Volatile memory can include Random Access Memory (RAM) or external cache memory.
  • RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).


Abstract

A named entity recognition method, apparatus, computer device and storage medium for power metering. The method includes: acquiring word vectors corresponding to multiple words in a corpus to be recognized that describes power metering information (101); combining the word vectors to obtain multiple reference feature sets (102); inputting the multiple reference feature sets into a trained word vector feature extraction model, so that the model determines, based on the elements in each reference feature set, the word vector features corresponding to that set (103); inputting the word vector features corresponding to each of the multiple reference feature sets into a preset conditional random field, and determining the named entities in the corpus to be recognized according to the labeling results output by the conditional random field (104). The method avoids incorrectly splitting a single named entity, solves the problem of overlapping named-entity names in power metering, mitigates the impact of pre-segmentation, and effectively improves the accuracy of named entity recognition.

Description

Named entity recognition method, apparatus and computer device for power metering — Technical Field
This application relates to the technical field of named entities, and in particular to a named entity recognition method, apparatus, computer device and storage medium for power metering.
Background
With the growing popularity of knowledge graphs, the demand for them keeps increasing. Before a knowledge graph for power metering can be built, the named entities of power metering usually need to be accurately identified and extracted from power metering texts. In traditional techniques, deep learning models have been widely applied in power metering, for example using long short-term memory networks (LSTM) or convolutional neural networks to recognize named entities of power metering.
However, existing deep learning models do not fully account for the overlapping names of named entities in power metering: when segmenting the corpus, what should be identified as a single named entity may be split into several parts, causing recognition errors and lowering the recognition accuracy of named entities.
Summary
In view of this, it is necessary to address the above technical problems by providing a named entity recognition method, apparatus, computer device and storage medium for power metering.
A named entity recognition method for power metering, the method comprising:
acquiring word vectors corresponding to multiple words in a corpus to be recognized that describes power metering information;
combining the word vectors to obtain multiple reference feature sets; the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set, where each element of the first set is the word vector of one word, each element of the second set is the word vectors of two adjacent words, and each element of the third set is the word vectors of three adjacent words;
inputting the multiple reference feature sets into a trained word vector feature extraction model, so that the model determines, based on the elements in the same reference feature set, the word vector features corresponding to that set; the word vector features include the part-of-speech feature of each word vector, the relevance feature of two adjacent word vectors and the relevance feature of three adjacent word vectors;
inputting the word vector features of the multiple reference feature sets into a preset conditional random field, and determining the named entities in the corpus to be recognized according to the labeling results output by the conditional random field.
In one embodiment, determining the word vector features corresponding to a reference feature set based on the elements in that set includes:
determining the similarity between adjacent elements in the same reference feature set;
inputting the similarity into a pre-trained single-layer neural network, which enlarges or reduces the similarity to obtain an adjusted similarity;
determining the word vector features of the reference feature set from the adjusted similarity.
In one embodiment, determining the word vector features of the reference feature set from the adjusted similarity includes:
obtaining the attention coefficient corresponding to the adjusted similarity;
determining, according to a preset mapping relationship, the attention feature corresponding to the attention coefficient, and inputting multiple attention features into a feed-forward neural network to obtain the word vector features of the reference feature set; the mapping relationship is:
h_i = sigmoid( (1/K) · Σ_{k=1..K} Σ_j α_ij^k · W^k · f_j )
where h_i is the attention feature, α_ij^k is the attention coefficient, and K is the number of attention heads.
In one embodiment, combining the word vectors to obtain multiple reference feature sets includes:
determining the arrangement order corresponding to each of the multiple word vectors, which corresponds to the arrangement order of the multiple words in the corpus to be recognized;
generating the first reference feature set based on each word vector and its corresponding order.
In one embodiment, combining the word vectors to obtain multiple reference feature sets includes:
determining the arrangement order corresponding to each of the multiple word vectors;
obtaining multiple groups of adjacent word vectors based on that order to form multiple word vector pairs;
generating the second reference feature set from the multiple word vector pairs.
In one embodiment, the method further includes:
obtaining a pre-built power metering corpus, which contains multiple corpus entries describing power metering information;
segmenting the entries of the power metering corpus with a preset word segmentation model to obtain multiple words describing power metering information;
training an initialized word vector model with the obtained words to obtain a trained word vector model, which is used to identify the word vector corresponding to each word in the power metering corpus.
In one embodiment, the method further includes:
obtaining a sample corpus and its corresponding labels; the labels include the power metering named entities in the sample corpus and the entity category corresponding to each named entity;
obtaining multiple sample words corresponding to the sample corpus with the word segmentation model, and obtaining the word vectors of the sample words through the trained word vector model;
obtaining multiple sample feature sets corresponding to the word vectors; the sample feature sets include a first sample feature set, a second sample feature set and a third sample feature set, where each element of the first set is the word vector of one word, each element of the second set is the word vectors of two adjacent words, and each element of the third set is the word vectors of three adjacent words;
inputting each sample feature set into the machine translation model to be trained, determining the word vector features of the sample feature sets through the self-attention layer of the machine translation model, inputting the multiple word vector features into the preset conditional random field, and determining the predicted named entities in the sample corpus from the prediction results output by the conditional random field;
adjusting the model parameters of the machine translation model according to the predicted named entities and the labels, and repeating the training process until the end-of-training condition is met, to obtain the word vector feature extraction model.
A named entity recognition apparatus for power metering, the apparatus comprising:
a word vector acquisition module, configured to acquire word vectors corresponding to multiple words in a corpus to be recognized that describes power metering information;
a reference feature set acquisition module, configured to combine the word vectors to obtain multiple reference feature sets; the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set, where each element of the first set is the word vector of one word, each element of the second set is the word vectors of two adjacent words, and each element of the third set is the word vectors of three adjacent words;
a word vector feature acquisition module, configured to input the multiple reference feature sets into a trained word vector feature extraction model, so that the model determines, based on the elements in the same reference feature set, the word vector features of that set; the word vector features include the part-of-speech feature of each word vector, the relevance feature of two adjacent word vectors and the relevance feature of three adjacent word vectors;
a named entity determination module, configured to input the word vector features of the multiple reference feature sets into a preset conditional random field, and determine the named entities in the corpus to be recognized according to the labeling results output by the conditional random field.
A computer device, comprising a memory and a processor, the memory storing a computer program, where the processor implements the steps of any of the above methods when executing the computer program.
A computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any of the above methods.
With the above named entity recognition method, apparatus, computer device and storage medium for power metering, word vectors corresponding to multiple words in the corpus to be recognized can be acquired and combined into multiple reference feature sets (a first set whose elements are single-word vectors, a second set whose elements are the word vectors of two adjacent words, and a third set whose elements are the word vectors of three adjacent words); the sets are input into a trained word vector feature extraction model, which determines the word vector features of each set from its elements; those features are input into a preset conditional random field, and the named entities in the corpus are determined from its labeling results. By obtaining reference feature sets combined in several different ways, the scheme extracts the word vector features of the corpus by comparing single word vectors, pairs of adjacent word vectors and triples of adjacent word vectors, recognizes the relations between adjacent words after segmentation, judges as a whole whether several words form the same named entity, avoids splitting a single named entity incorrectly, solves the problem of overlapping entity names in power metering, mitigates the impact of pre-segmentation, and effectively improves the accuracy of named entity recognition.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a named entity recognition method for power metering in one embodiment;
FIG. 2 is a schematic flowchart of a named entity recognition method for power metering in another embodiment;
FIG. 3 is a structural block diagram of a named entity recognition apparatus for power metering in one embodiment;
FIG. 4 is an internal structure diagram of a computer device in one embodiment.
Detailed Description
To make the purposes, technical solutions and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the application and do not limit it.
With the development of smart grids, the requirements for analyzing and processing power big data keep rising. In practice, isolated power subsystems contain large amounts of power metering information, but it is difficult to obtain effective decision-support data from such scattered information. How to integrate scattered power metering information and build a power metering knowledge graph has become a pressing problem.
A knowledge graph describes the concepts, entities and relations of the objective world in structured form. Its unit is the "entity-relation-entity" triple, and the relations between entities are organized into a networked knowledge structure. For smart grids, a knowledge graph can solidify dispatching knowledge and provide knowledge and data support for grid operation monitoring and decision-making. Before building a power metering knowledge graph, the named entities of power metering must be identified and extracted from power metering texts. The purpose of named entity recognition for power metering is to identify domain-specific power metering entities and their categories, providing a basis for building and analyzing the power metering knowledge graph.
In traditional techniques, early entity recognition methods for power metering fall into dictionary- or rule-based methods and statistical machine learning methods. In recent years, deep learning models have been widely applied in power metering, for example recognizing named entities with long short-term memory networks (LSTM) or convolutional neural networks.
However, existing deep learning models do not fully account for the overlapping names of power metering entities: during segmentation, what should be a single named entity may be split into several parts, causing recognition errors and lowering accuracy. For example, "电流损耗" (current loss) can be regarded as one named entity, but it may be wrongly split into "电流" (current) and "损耗" (loss), with "电流" treated as a separate named entity.
In one embodiment, as shown in FIG. 1, a named entity recognition method for power metering is provided. This embodiment takes application of the method to a server as an example; it can be understood that the method can also be applied to a terminal, or to a system comprising a terminal and a server, implemented through their interaction. In this embodiment, the method may include the following steps:
Step 101: acquire word vectors corresponding to multiple words in a corpus to be recognized that describes power metering information.
A word vector is a vector obtained by mapping a word to real numbers.
In practice, the server may obtain the corpus to be recognized that describes power metering information and, after segmenting it, determine the word vectors corresponding to the multiple words in the corpus.
Step 102: combine the word vectors to obtain multiple reference feature sets.
The reference feature sets include a first, second and third reference feature set; each element of the first set is the word vector of one word, each element of the second set is the word vectors of two adjacent words, and each element of the third set is the word vectors of three adjacent words.
Named entities in power metering may partially overlap in name. For example, "电量差异异常" may be identified as one named entity or split into the two entities "电量" and "差异异常"; likewise, "电能表替换" may be segmented into "电能表" and "替换". That is, some adjacent words may belong to the same named entity, or may each be marked as an isolated named entity.
On this basis, after the word vectors of the multiple words are obtained, they can be combined to obtain multiple reference feature sets. Specifically, each single word vector can be taken as one element while traversing the word vectors, yielding the first reference feature set; the word vectors can also be traversed in pairs, generating one element from each two adjacent word vectors to obtain the second reference feature set containing multiple word vector pairs; similarly, one element can be generated from each three adjacent word vectors to produce the third reference feature set.
Step 103: input the multiple reference feature sets into the trained word vector feature extraction model, so that the model determines, based on the elements in the same reference feature set, the word vector features corresponding to that set.
The word vector features include the part-of-speech feature of each word vector, the relevance feature of two adjacent word vectors and the relevance feature of three adjacent word vectors. Specifically, the part-of-speech feature can be determined from the elements of the first reference feature set and may include at least one of Chinese and English parts of speech, such as noun, verb, adjective, adverb, predicative, attributive or adverbial.
After the multiple reference feature sets are obtained, they can be input into the trained word vector feature extraction model, which determines the word vector features of each set from the elements of that set.
Specifically, the word vector feature extraction model may consist of three processing modules, each containing one or more self-attention layers. When extracting word vector features, the three modules respectively receive the first, second and third reference feature sets, and the self-attention layers in each module determine the word vector features corresponding to the received set from its elements.
Step 104: input the word vector features of the multiple reference feature sets into a preset conditional random field, and determine the named entities in the corpus to be recognized according to the labeling results output by the conditional random field.
After the word vector features of the multiple reference feature sets are obtained, they can be input into the preset conditional random field, which predicts the labeling result of each word from the input features; the named entities in the corpus can then be determined from the labeling results. Once the entities are determined, the corpus can be annotated and stored based on the recognized entities.
In this embodiment, by obtaining reference feature sets combined in different ways, the word vector features of the corpus are extracted by comparing single word vectors, pairs of adjacent word vectors and triples of adjacent word vectors; the relations between adjacent words after segmentation can be recognized, and whether several words form the same named entity is judged as a whole, avoiding wrong splits of a single entity, solving the name-overlap problem in power metering, mitigating the impact of pre-segmentation, and effectively improving the accuracy of named entity recognition.
In one embodiment, the method may further include the following steps:
obtaining a pre-built power metering corpus; segmenting its entries with a preset word segmentation model to obtain multiple words describing power metering information; and training an initialized word vector model with the obtained words to obtain a trained word vector model.
The power metering corpus contains multiple entries describing power metering information, and the trained word vector model is used to identify the word vector corresponding to each word in the power metering corpus.
In concrete implementations, entity types in the general domain are usually people, places, organizations and so on, with relatively standardized naming; correspondingly, many general-domain named entity datasets have been opened for model training. In the power metering domain, however, there is no public dataset that can be used directly for machine learning training.
For this reason, a power metering corpus can be built in advance. Power systems contain large amounts of metering-related text: it can be obtained from existing power metering information systems, or from business reports, metering statistics and other primary information of power enterprises; for English corpora, metering-related text can be obtained from English knowledge bases. After collecting a large amount of metering-related text, data cleaning can remove irrelevant information, yielding a power metering corpus containing both English and Chinese entries.
After the corpus is obtained, a preset word segmentation model can segment its entries into multiple words describing power metering information, for example by structurally splitting the corpus's sentences on various punctuation marks. After segmentation, the resulting words can be used to train an initialized word vector model, such as a Word2Vec model, so that the model can later map words describing power metering information to word vectors.
In this embodiment, building the power metering corpus and the metering-related word vectors avoids blurred boundaries of power metering named entities and lays a foundation for subsequently recognizing the named entities of power metering accurately.
In one embodiment, the method may further include the following steps:
obtaining a sample corpus and its labels; obtaining multiple sample words of the sample corpus with the word segmentation model, and their word vectors through the trained word vector model; obtaining multiple sample feature sets of the word vectors; inputting each sample feature set into the machine translation model to be trained, determining the word vector features of the sample feature sets through the self-attention layer of the machine translation model, inputting the multiple word vector features into the preset conditional random field, and determining the predicted named entities in the sample corpus from its prediction results; adjusting the model parameters of the machine translation model according to the predicted named entities and the labels, and repeating the training process until the end-of-training condition is met, to obtain the word vector feature extraction model.
The labels include the power metering named entities in the sample corpus and the entity category of each named entity; each category may be any of the following: power metering indicator, power metering object, power metering phenomenon and power metering behavior. In power metering, the boundaries between different named entities are relatively blurred; by introducing these entity categories, the corresponding category can be determined while the entity is recognized, improving recognition efficiency.
For example, statistical electricity consumption data such as "用电量" (electricity consumption), "抄表率" (meter reading rate) and "电流" (current) can be labeled as power metering indicator entities. Objects, personnel, regions and institutions related to power metering are marked as power metering object entities, such as "电能表" (electricity meter) and "广州供电局" (Guangzhou Power Supply Bureau). Phenomena produced by specific subjects during metering are marked as power metering phenomenon entities, such as "电能表停止" (meter stoppage), "电流损耗" (current loss) and "电流不平衡" (current imbalance). Metering operations of specific actions can be marked as power metering behavior entities, such as "抄表" (meter reading) and "异常维修" (fault repair). Indicators and objects of power metering entities are mostly nouns, phenomena are mostly noun-verb combinations, and behaviors are mostly verbs.
The sample feature sets include a first, second and third sample feature set; each element of the first set is the word vector of one word, each element of the second set is the word vectors of two adjacent words, and each element of the third set is the word vectors of three adjacent words.
In practice, after the sample corpus and its labels are obtained, the corpus can be segmented with the word segmentation model to obtain the sample words, whose word vectors are then obtained through the trained word vector model.
After the word vectors of the sample words are obtained, they can be combined to obtain the multiple sample feature sets; the way these sets are obtained is similar to that of the reference feature sets described later, and is not repeated here.
After the multiple sample feature sets are obtained, each can be input into the machine translation model to be trained. The machine translation model may consist of three processing modules; a machine translation model containing three processing modules can be called a third-order machine translation model. Each module may contain one or more self-attention layers. When extracting word vector features, the three modules respectively receive the first, second and third sample feature sets, and the self-attention layers in each module determine the word vector features corresponding to the received set from its elements.
After the word vector features of the sample feature sets are obtained, the multiple word vector features can be input into the preset conditional random field; the predicted named entities in the sample corpus are determined from its prediction results, and the model parameters of the machine translation model are adjusted according to the predicted entities and the labels. Repeating this training process until the end-of-training condition is met yields the word vector feature extraction model.
In this embodiment, this training procedure provides a basis for accurately recognizing the named entities of power metering.
In one embodiment, determining the word vector features of a reference feature set from its elements may include the following steps:
determining the similarity between adjacent elements in the same reference feature set; inputting the similarity into a pre-trained single-layer neural network, which enlarges or reduces it to obtain an adjusted similarity; determining the word vector features of the set from the adjusted similarity.
In concrete implementations, after the reference feature sets are obtained, each processing module of the word vector feature extraction model processes its corresponding set; for each set, the module can determine the similarity between adjacent elements of that set. Specifically, the model may contain one or more self-attention layers, each composed of three parts: a neural cosine similarity function, attention coefficients and attention features. When a processing module receives its reference feature set, the cosine function in the self-attention layer determines the similarity between adjacent elements; cosine similarity measures the similarity of two input vectors by the cosine of the angle between them, which can be determined by:
cosine(Wf_i, Wf_j) = (Wf_i · Wf_j) / (‖Wf_i‖ · ‖Wf_j‖)
where i denotes the vector corresponding to the i-th element.
In practice, several power metering named entities may be highly similar yet belong to different entity categories, and comparing similar word vectors alone can easily cause classification errors, for example when adjacent word vectors are very similar. Therefore, to account for the influence of similarity between different word vectors, the obtained similarity can be input into the single-layer neural network pre-trained inside the word vector feature extraction model, which enlarges or reduces it to give the adjusted similarity. The adjusted similarity reflects the differences between adjacent word vectors more accurately, so the word vector features of the set can be determined from it. The single-layer neural network can be:
Similarity_ij = Neural(Wf_i, Wf_j) = Neural(V × cosine(Wf_i, Wf_j))
where Similarity_ij is the adjusted similarity, Wf_i and Wf_j are the pre-trained weights of the single-layer network for the two adjacent elements, V is the network parameter, and Neural is the single-layer network, which can use the ReLU activation function to prevent the similarity gradient from vanishing.
In this embodiment, by accurately adjusting the similarity of similar named entities in power metering, the differences between similar entities are reflected effectively, similar words are not wrongly merged into the same named entity, and the recognition accuracy of named entities improves.
In one embodiment, determining the word vector features of the reference feature set from the adjusted similarity may include the following steps:
obtaining the attention coefficient corresponding to the adjusted similarity; determining, according to a preset mapping relationship, the attention feature corresponding to the attention coefficient, and inputting multiple attention features into the feed-forward neural network to obtain the word vector features of the set.
When the self-attention layer uses a multi-head attention mechanism, the mapping relationship is:
h_i = sigmoid( (1/K) · Σ_{k=1..K} Σ_j α_ij^k · W^k · f_j )
where h_i is the attention feature, α_ij^k is the attention coefficient, K is the number of attention heads, and sigmoid is the activation function.
When the self-attention layer uses a single-head attention mechanism, the mapping relationship is:
h_i = sigmoid( Σ_j α_ij · W · f_j )
Specifically, after the similarity is adjusted, the corresponding attention coefficient can be obtained, which can be determined by:
α_ij = exp(Similarity_ij) / Σ_k exp(Similarity_ik)
After the attention coefficient is determined, it can be substituted into the preset mapping relationship to determine the corresponding attention feature, and the resulting multiple attention features are input into the feed-forward neural network of the processing module to obtain the word vector features of the set. Specifically, after the multiple attention features are obtained, they can be fused (add) and normalized before being input into the feed-forward network; the network analyzes the fused and normalized attention features and outputs a result, and after that output is again fused and normalized, the processing module outputs the word vector features corresponding to the reference feature set.
In this embodiment, obtaining the attention coefficient of the adjusted similarity, determining the corresponding attention feature through the preset mapping relationship, and feeding multiple attention features into the feed-forward network to obtain the word vector features of the set provide a discriminative basis for accurately recognizing the named entities of power metering.
In one embodiment, combining the word vectors to obtain the multiple reference feature sets includes:
determining the arrangement order corresponding to each word vector, and generating the first reference feature set based on each word vector and its corresponding order.
In concrete implementations, after the multiple word vectors are obtained, the arrangement order of each can be obtained; the order of the word vectors corresponds to the order of the multiple words in the corpus to be recognized.
After the order of each word vector is obtained, the first reference feature set can be generated from each word vector and its order. Specifically, after determining the order of each word vector, each can be taken in turn as one element; traversing the multiple word vectors produces a first reference feature set containing multiple elements, in which each element corresponds to one word vector.
With the first reference feature set, the part-of-speech feature of each word can be determined from the arrangement order of the word vectors in the set. A word vector is an expression that maps a word into vector form; when multiple word vectors are arranged in order, they represent the segmented corpus to be recognized in vector form. Since different words occupy different positions and contexts in the corpus, analyzing the arrangement order of the corresponding word vectors yields the part-of-speech feature of each word vector.
In this embodiment, determining the arrangement order of the word vectors and generating the first reference feature set from them provides a basis for determining the similarity between adjacent word vectors.
In one embodiment, combining the word vectors to obtain the multiple reference feature sets may include the following steps:
determining the arrangement order corresponding to each word vector; obtaining multiple groups of adjacent word vectors based on that order to form multiple word vector pairs; and generating the second reference feature set from the multiple word vector pairs.
In practice, after the multiple word vectors are obtained, the order of each can be obtained, which corresponds to the order of the words in the corpus to be recognized.
After the order of each word vector is obtained, multiple groups of adjacent word vectors can be obtained to form multiple word vector pairs. Specifically, after determining the order of each word vector, the word vectors can be traversed in pairs to extract each two adjacent word vectors. Each pair is then taken as one element, and the second reference feature set is generated from these elements. In subsequent processing, the word vector feature extraction model can analyze the two adjacent word vectors jointly to obtain the combined features of the two adjacent words.
In this embodiment, generating the second reference feature set from word vector pairs provides a basis for obtaining the combined features between two adjacent word vectors and reduces the impact of wrong segmentation of the corpus by the word segmentation model.
In one embodiment, combining the word vectors to obtain the multiple reference feature sets may include the following steps:
determining the arrangement order corresponding to each word vector; based on that order, taking each three adjacent word vectors as one word vector combination to obtain multiple combinations; and generating the third reference feature set from the multiple combinations.
In practice, after the word vectors and their order are obtained, each three adjacent word vectors can be taken as one combination based on that order, yielding multiple combinations. Each combination is then taken as one element, and the third reference feature set is generated from these elements. In subsequent processing, the word vector feature extraction model can analyze the three adjacent word vectors jointly to obtain the combined features of the three adjacent words.
In this embodiment, generating the third reference feature set provides a basis for obtaining the combined features between three adjacent word vectors and reduces the impact of wrong segmentation of the corpus by the word segmentation model.
To help those skilled in the art better understand the above steps, the embodiments of this application are illustrated below by an example; it should be understood that the embodiments are not limited to it.
As shown in FIG. 2, for a corpus to be recognized that describes power metering information, the preset word segmentation model can segment it, and the trained word vector model can obtain the word vector of each word. The word vectors are then combined into the first, second and third reference feature sets, which are input into the trained word vector feature extraction model.
After receiving the multiple reference feature sets, the model processes them with the first, second and third processing modules respectively. Taking the first module as an example: after receiving the first reference feature set, it can determine the position codes of the set's elements and process them with the neural cosine multi-head attention mechanism, which obtains the similarity between adjacent elements, enlarges or reduces it through the trained single-layer neural network, and determines the attention features from the adjusted similarity. The multiple attention features are fused and normalized, the result is input into the feed-forward neural network, and after further fusion and normalization, the part-of-speech features of the first reference feature set are obtained as word vector features. The second reference feature set is processed by the second module to obtain the relevance features of two adjacent word vectors as word vector features, and the third set by the third module to obtain the relevance features of three adjacent word vectors as word vector features; the processing of the second and third modules is the same as that of the first and is not repeated here.
After the word vector features output by each processing module are obtained, the conditional random field can determine the labeling results of the corpus based on the feature functions and determine the named entities in the corpus to be recognized.
It should be understood that although the steps in the flowchart of FIG. 1 are shown sequentially as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, their execution is not strictly ordered and they may be executed in other orders. Moreover, at least some of the steps in FIG. 1 may comprise multiple sub-steps or stages that are not necessarily completed at the same moment but may be executed at different times; their execution order is also not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 3, a named entity recognition apparatus for power metering is provided, the apparatus comprising:
a word vector acquisition module 301, configured to acquire word vectors corresponding to multiple words in a corpus to be recognized that describes power metering information;
a reference feature set acquisition module 302, configured to combine the word vectors to obtain multiple reference feature sets; the reference feature sets include a first reference feature set, a second reference feature set and a third reference feature set, where each element of the first set is the word vector of one word, each element of the second set is the word vectors of two adjacent words, and each element of the third set is the word vectors of three adjacent words;
a word vector feature acquisition module 303, configured to input the multiple reference feature sets into a trained word vector feature extraction model, so that the model determines, based on the elements in the same reference feature set, the word vector features of that set; the word vector features include the part-of-speech feature of each word vector, the relevance feature of two adjacent word vectors and the relevance feature of three adjacent word vectors;
a named entity determination module 304, configured to input the word vector features of the multiple reference feature sets into a preset conditional random field, and determine the named entities in the corpus to be recognized according to the labeling results output by the conditional random field.
In one embodiment, the word vector feature acquisition module 303 includes:
a similarity determination submodule, configured to determine the similarity between adjacent elements in the same reference feature set;
a similarity adjustment submodule, configured to input the similarity into a pre-trained single-layer neural network, which enlarges or reduces it to obtain the adjusted similarity;
a word vector feature determination submodule, configured to determine the word vector features of the set from the adjusted similarity.
In one embodiment, the word vector feature determination submodule is specifically configured to:
obtain the attention coefficient corresponding to the adjusted similarity;
determine, according to a preset mapping relationship, the attention feature corresponding to the attention coefficient, and input multiple attention features into the feed-forward neural network to obtain the word vector features of the set; the mapping relationship is:
h_i = sigmoid( (1/K) · Σ_{k=1..K} Σ_j α_ij^k · W^k · f_j )
where h_i is the attention feature, α_ij^k is the attention coefficient, and K is the number of attention heads.
In one embodiment, the reference feature set acquisition module 302 includes:
a first sequence determination submodule, configured to determine the arrangement order corresponding to each word vector, which corresponds to the order of the multiple words in the corpus to be recognized;
a first reference feature set generation submodule, configured to generate the first reference feature set from each word vector and its corresponding order.
In one embodiment, the reference feature set acquisition module 302 further includes:
a second sequence determination submodule, configured to determine the arrangement order corresponding to each word vector;
a combination submodule, configured to obtain multiple groups of adjacent word vectors based on that order, yielding multiple word vector pairs;
a second reference feature set generation submodule, configured to generate the second reference feature set from the multiple word vector pairs.
In one embodiment, the apparatus further includes:
a corpus acquisition module, configured to obtain a pre-built power metering corpus containing multiple entries describing power metering information;
a word segmentation module, configured to segment the entries of the power metering corpus with a preset word segmentation model into multiple words describing power metering information;
a first training module, configured to train an initialized word vector model with the obtained words, yielding a trained word vector model used to identify the word vector of each word in the power metering corpus.
In one embodiment, the apparatus further includes:
a sample corpus acquisition module, configured to obtain a sample corpus and its labels, which include the power metering named entities in the sample corpus and the entity categories of those named entities;
a sample word vector acquisition module, configured to obtain multiple sample words of the sample corpus with the word segmentation model, and their word vectors through the trained word vector model;
a sample feature set acquisition module, configured to obtain the multiple sample feature sets corresponding to the word vectors; the sample feature sets include a first sample feature set, a second sample feature set and a third sample feature set, where each element of the first set is the word vector of one word, each element of the second set is the word vectors of two adjacent words, and each element of the third set is the word vectors of three adjacent words;
a second training module, configured to input each sample feature set into the machine translation model to be trained, determine the word vector features of the sample feature sets through the self-attention layer of the machine translation model, input the multiple word vector features into the preset conditional random field, and determine the predicted named entities in the sample corpus from its prediction results;
a parameter adjustment module, configured to adjust the model parameters of the machine translation model according to the predicted named entities and the labels, and repeat the training process until the end-of-training condition is met, yielding the word vector feature extraction model.
For the specific limitations of the named entity recognition apparatus for power metering, refer to the limitations of the method above, which are not repeated here. Each module of the apparatus may be implemented in whole or in part by software, hardware or a combination thereof. The modules may be embedded in or independent of the processor of a computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server whose internal structure may be as shown in FIG. 4. The computer device includes a processor, a memory and a network interface connected through a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system, a computer program and a database, and the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store word vectors. The network interface communicates with external terminals over a network. When executed by the processor, the computer program implements a named entity recognition method for power metering.
Those skilled in the art will understand that the structure shown in FIG. 4 is only a block diagram of part of the structure related to this application's solution and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
在一个实施例中,提供了一种计算机设备,包括存储器和处理器,存储器中存储有计算机程序,该处理器执行计算机程序时实现以下步骤:
获取用于描述电力计量信息的待识别语料中多个词语各自对应的词向量;
对所述词向量进行组合,获取多个参考特征集合;所述参考特征集合包括第一参考特征集合、第二参 考特征集合和第三参考特征集合,所述第一参考特征集合中的每个元素为对应词语的词向量,所述第二参考特征集合中的每个元素为两个相邻词语对应的词向量,所述第三参考特征集合中的每个元素为三个相邻词语对应的词向量;
将多个参考特征集合输入到训练好的词向量特征提取模型,以通过所述词向量特征提取模型基于参考特征集合中的各个元素,确定该参考特征集合对应的词向量特征;所述词向量特征包括每个词向量对应的词性特征、相邻两个词向量对应的关联性特征和相邻三个词向量对应的关联性特征;
将多个参考特征集合各自对应的词向量特征输入到预设的条件随机场,并根据所述条件随机场输出的标注结果,确定所述待识别语料中的命名实体。
在一个实施例中,处理器执行计算机程序时还实现上述其他实施例中的步骤。
在一个实施例中,提供了一种计算机可读存储介质,其上存储有计算机程序,计算机程序被处理器执行时实现以下步骤:
获取用于描述电力计量信息的待识别语料中多个词语各自对应的词向量;
对所述词向量进行组合,获取多个参考特征集合;所述参考特征集合包括第一参考特征集合、第二参考特征集合和第三参考特征集合,所述第一参考特征集合中的每个元素为对应词语的词向量,所述第二参考特征集合中的每个元素为两个相邻词语对应的词向量,所述第三参考特征集合中的每个元素为三个相邻词语对应的词向量;
将多个参考特征集合输入到训练好的词向量特征提取模型,以通过所述词向量特征提取模型基于参考特征集合中的各个元素,确定该参考特征集合对应的词向量特征;所述词向量特征包括每个词向量对应的词性特征、相邻两个词向量对应的关联性特征和相邻三个词向量对应的关联性特征;
将多个参考特征集合各自对应的词向量特征输入到预设的条件随机场,并根据所述条件随机场输出的标注结果,确定所述待识别语料中的命名实体。
在一个实施例中,计算机程序被处理器执行时还实现上述其他实施例中的步骤。
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机程序来指令相关的硬件来完成,所述的计算机程序可存储于一非易失性计算机可读取存储介质中,该计算机程序在执行时,可包括如上述各方法的实施例的流程。其中,本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用,均可包括非易失性和易失性存储器中的至少一种。非易失性存储器 可包括只读存储器(Read-Only Memory,ROM)、磁带、软盘、闪存或光存储器等。易失性存储器可包括随机存取存储器(Random Access Memory,RAM)或外部高速缓冲存储器。作为说明而非局限,RAM可以是多种形式,比如静态随机存取存储器(Static Random Access Memory,SRAM)或动态随机存取存储器(Dynamic Random Access Memory,DRAM)等。
以上实施例的各技术特征可以进行任意的组合,为使描述简洁,未对上述实施例中的各个技术特征所有可能的组合都进行描述,然而,只要这些技术特征的组合不存在矛盾,都应当认为是本说明书记载的范围。
以上所述实施例仅表达了本申请的几种实施方式,其描述较为具体和详细,但并不能因此而理解为对发明专利范围的限制。应当指出的是,对于本领域的普通技术人员来说,在不脱离本申请构思的前提下,还可以做出若干变形和改进,这些都属于本申请的保护范围。因此,本申请专利的保护范围应以所附权利要求为准。

Claims (10)

  1. 一种电力计量的命名实体识别方法,其特征在于,所述方法包括:
    获取用于描述电力计量信息的待识别语料中多个词语各自对应的词向量;
    对所述词向量进行组合,获取多个参考特征集合;所述参考特征集合包括第一参考特征集合、第二参考特征集合和第三参考特征集合,所述第一参考特征集合中的每个元素为对应词语的词向量,所述第二参考特征集合中的每个元素为两个相邻词语对应的词向量,所述第三参考特征集合中的每个元素为三个相邻词语对应的词向量;
    将多个参考特征集合输入到训练好的词向量特征提取模型,以通过所述词向量特征提取模型基于同一参考特征集合中的各个元素,确定该参考特征集合对应的词向量特征;所述词向量特征包括每个词向量对应的词性特征、相邻两个词向量对应的关联性特征和相邻三个词向量对应的关联性特征;
    将多个参考特征集合各自对应的词向量特征输入到预设的条件随机场,并根据所述条件随机场输出的标注结果,确定所述待识别语料中的命名实体。
  2. 根据权利要求1所述的方法,其特征在于,所述基于同一参考特征集合中的各个元素,确定该参考特征集合对应的词向量特征,包括:
    确定同一参考特征集合中相邻元素对应的相似度;
    将所述相似度输入到预先训练的单层神经网络,通过所述单层神经网络放大或缩小所述相似度,得到调整后的相似度;
    根据调整后的相似度确定该参考特征集合对应的词向量特征。
  3. 根据权利要求2所述的方法,其特征在于,所述根据调整后的相似度确定该参考特征集合对应的词向量特征,包括:
    获取调整后的相似度对应的注意力系数;
    根据预设的映射关系,确定所述注意力系数对应的注意力特征,并将多个注意力特征输入到前向神经网络,得到该参考特征集合对应的词向量特征;所述映射关系为:
    Figure PCTCN2022087120-appb-100001
    其中,h i为注意力特征,
    Figure PCTCN2022087120-appb-100002
    为注意力系数,K为注意力头的数量。
  4. 根据权利要求1所述的方法,其特征在于,所述对所述词向量进行组合,获取多个参考特征集合,包括:
    确定多个词向量各自对应的排列顺序;多个词向量各自对应的排列顺序与所述待识别语料中多个词语 对应的排列顺序对应;
    基于各个词向量及其对应的排列顺序,生成第一参考特征集合。
  5. 根据权利要求1所述的方法,其特征在于,所述对所述词向量进行组合,获取多个参考特征集合,包括:
    确定多个词向量各自对应的排列顺序;
    基于多个词向量对应的排列顺序,获取多组相邻的词向量,得到多组词向量对;
    根据多组词向量对,生成第二参考特征集合。
  6. 根据权利要求1所述的方法,其特征在于,还包括:
    获取预先构建的电力计量语料库;所述电力计量语料库包括多条用于描述电力计量信息的语料;
    采用预设的分词模型对所述电力计量语料库的语料进行分词,得到多个用于描述电力计量信息的词语;
    采用得到的多个词语训练初始化的词向量模型,得到训练好的词向量模型,所述训练好的词向量模型用于识别电力计量语料中各个词语对应的词向量。
  7. 根据权利要求6所述的方法,其特征在于,还包括:
    获取样本语料及其对应的标签;所述标签包括所述样本语料中电力计量的命名实体和所述命名实体对应的实体类别;
    采用所述分词模型获取样本预料对应的多个样本词语,并通过训练好的词向量模型获取所述样本词语对应的词向量;
    获取所述词向量对应的多个样本特征集合;所述样本特征集合包括第一样本特征集合、第二样本特征集合和第三样本特征集合,所述第一样本特征集合中的每个元素为对应词语的词向量,所述第二样本特征集合中的每个元素为两个相邻词语对应的词向量,所述第三样本特征集合中的每个元素为三个相邻词语对应的词向量;
    将各个样本特征集合输入到待训练的机器翻译模型,以通过所述机器翻译模型中的自注意力层确定样本特征集合对应的词向量特征,并将多个词向量特征输入到预设的条件随机场,根据所述条件随机场输出的预测结果,确定所述样本语料中的预测命名实体;
    根据所述预测命名实体和所述标签,调整所述机器翻译模型的模型参数,重复训练过程,直到满足训练结束条件,得到词向量特征提取模型。
  8. 一种电力计量的命名实体识别装置,其特征在于,所述装置包括:
    词向量获取模块,用于获取用于描述电力计量信息的待识别语料中多个词语各自对应的词向量;
    参考特征集合获取模块,用于对所述词向量进行组合,获取多个参考特征集合;所述参考特征集合包括第一参考特征集合、第二参考特征集合和第三参考特征集合,所述第一参考特征集合中的每个元素为对应词语的词向量,所述第二参考特征集合中的每个元素为两个相邻词语对应的词向量,所述第三参考特征集合中的每个元素为三个相邻词语对应的词向量;
    词向量特征获取模块,用于将多个参考特征集合输入到训练好的词向量特征提取模型,以通过所述词向量特征提取模型基于同一参考特征集合中的各个元素,确定该参考特征集合对应的词向量特征;所述词向量特征包括每个词向量对应的词性特征、相邻两个词向量对应的关联性特征和相邻三个词向量对应的关联性特征;
    命名实体确定模块,用于将多个参考特征集合各自对应的词向量特征输入到预设的条件随机场,并根据所述条件随机场输出的标注结果,确定所述待识别语料中的命名实体。
  9. 一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,其特征在于,所述处理器执行所述计算机程序时实现权利要求1至7中任一项所述方法的步骤。
  10. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1至7中任一项所述方法的步骤。
PCT/CN2022/087120 2021-07-23 2022-04-15 电力计量的命名实体识别方法、装置和计算机设备 WO2023000725A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110839145.0 2021-07-23
CN202110839145.0A CN113591480B (zh) 2021-07-23 2021-07-23 电力计量的命名实体识别方法、装置和计算机设备

Publications (1)

Publication Number Publication Date
WO2023000725A1 true WO2023000725A1 (zh) 2023-01-26

Family

ID=78249527

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/087120 WO2023000725A1 (zh) 2021-07-23 2022-04-15 电力计量的命名实体识别方法、装置和计算机设备

Country Status (2)

Country Link
CN (1) CN113591480B (zh)
WO (1) WO2023000725A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591480B (zh) * 2021-07-23 2023-07-25 深圳供电局有限公司 电力计量的命名实体识别方法、装置和计算机设备

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101481A (zh) * 2018-06-25 2018-12-28 北京奇艺世纪科技有限公司 一种命名实体识别方法、装置及电子设备
CN109145303A (zh) * 2018-09-06 2019-01-04 腾讯科技(深圳)有限公司 命名实体识别方法、装置、介质以及设备
WO2020133039A1 (zh) * 2018-12-27 2020-07-02 深圳市优必选科技有限公司 对话语料中实体的识别方法、装置和计算机设备
CN112052684A (zh) * 2020-09-07 2020-12-08 南方电网数字电网研究院有限公司 电力计量的命名实体识别方法、装置、设备和存储介质
EP3767516A1 (en) * 2019-07-18 2021-01-20 Ricoh Company, Ltd. Named entity recognition method, apparatus, and computer-readable recording medium
CN113065349A (zh) * 2021-03-15 2021-07-02 国网河北省电力有限公司 基于条件随机场的命名实体识别方法
CN113591480A (zh) * 2021-07-23 2021-11-02 深圳供电局有限公司 电力计量的命名实体识别方法、装置和计算机设备

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112949311A (zh) * 2021-03-05 2021-06-11 北京工业大学 一种融合字形信息的命名实体识别方法

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101481A (zh) * 2018-06-25 2018-12-28 北京奇艺世纪科技有限公司 一种命名实体识别方法、装置及电子设备
CN109145303A (zh) * 2018-09-06 2019-01-04 腾讯科技(深圳)有限公司 命名实体识别方法、装置、介质以及设备
WO2020133039A1 (zh) * 2018-12-27 2020-07-02 深圳市优必选科技有限公司 对话语料中实体的识别方法、装置和计算机设备
EP3767516A1 (en) * 2019-07-18 2021-01-20 Ricoh Company, Ltd. Named entity recognition method, apparatus, and computer-readable recording medium
CN112052684A (zh) * 2020-09-07 2020-12-08 南方电网数字电网研究院有限公司 电力计量的命名实体识别方法、装置、设备和存储介质
CN113065349A (zh) * 2021-03-15 2021-07-02 国网河北省电力有限公司 基于条件随机场的命名实体识别方法
CN113591480A (zh) * 2021-07-23 2021-11-02 深圳供电局有限公司 电力计量的命名实体识别方法、装置和计算机设备

Also Published As

Publication number Publication date
CN113591480B (zh) 2023-07-25
CN113591480A (zh) 2021-11-02

Similar Documents

Publication Publication Date Title
CN109472033B (zh) 文本中的实体关系抽取方法及系统、存储介质、电子设备
WO2021253904A1 (zh) 测试案例集生成方法、装置、设备及计算机可读存储介质
WO2022105115A1 (zh) 问答对匹配方法、装置、电子设备及存储介质
US10796104B1 (en) Systems and methods for constructing an artificially diverse corpus of training data samples for training a contextually-biased model for a machine learning-based dialogue system
CN112819023B (zh) 样本集的获取方法、装置、计算机设备和存储介质
WO2020211720A1 (zh) 数据处理方法和代词消解神经网络训练方法
WO2021073390A1 (zh) 数据筛选方法、装置、设备及计算机可读存储介质
CN112052684A (zh) 电力计量的命名实体识别方法、装置、设备和存储介质
CN108959305A (zh) 一种基于互联网大数据的事件抽取方法及系统
CN110968725B (zh) 图像内容描述信息生成方法、电子设备及存储介质
CN110162771A (zh) 事件触发词的识别方法、装置、电子设备
CN110209743B (zh) 知识管理系统及方法
CN112181490A (zh) 功能点评估法中功能类别的识别方法、装置、设备及介质
WO2023000725A1 (zh) 电力计量的命名实体识别方法、装置和计算机设备
CN108304568B (zh) 一种房地产公众预期大数据处理方法及系统
CN110069558A (zh) 基于深度学习的数据分析方法及终端设备
RU2715024C1 (ru) Способ отладки обученной рекуррентной нейронной сети
CN111950646A (zh) 电磁图像的层次化知识模型构建方法及目标识别方法
WO2022141838A1 (zh) 模型置信度分析方法、装置、电子设备及计算机存储介质
US11514311B2 (en) Automated data slicing based on an artificial neural network
JP2017188025A (ja) データ分析システム、その制御方法、プログラム、及び、記録媒体
CN112215006A (zh) 机构命名实体归一化方法和系统
US20240028828A1 (en) Machine learning model architecture and user interface to indicate impact of text ngrams
CN116089586B (zh) 基于文本的问题生成方法及问题生成模型的训练方法
CN117520209B (zh) 代码评审方法、装置、计算机设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22844897

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE