CN110287495A - Method and system for recognizing power marketing professional words - Google Patents

Publication number
CN110287495A
Authority
CN
China
Prior art keywords
word
power marketing
professional
identification model
module
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910584443.2A
Other languages
Chinese (zh)
Inventor
邹云峰
邓君华
徐超
季梦黎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Jiangsu Electric Power Co Ltd
Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Jiangsu Electric Power Co Ltd
Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-07-01
Filing date
2019-07-01
Publication date
2019-09-27
Application filed by State Grid Corp of China SGCC, State Grid Jiangsu Electric Power Co Ltd, Electric Power Research Institute of State Grid Jiangsu Electric Power Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for recognizing power marketing professional words, comprising: Step 1, training an initial recognition model on power marketing external data, extracting the professional words from the power marketing data, and building an initial professional word dictionary; Step 2, labeling the words in the power marketing data using the latest professional word dictionary and recognition model, and training a new recognition model on the labeled power marketing data; Step 3, if the number of iterations is below a threshold, recognizing the power marketing data with the new recognition model, building a new professional word dictionary, and returning to Step 2; otherwise, outputting the new recognition model for professional word recognition. A corresponding system is also disclosed. Without requiring a large background corpus, the method feeds the recognition model's predictions back into the annotation, which raises the recognition accuracy for professional words in power marketing data and improves the working efficiency of power personnel.

Description

Method and system for recognizing power marketing professional words
Technical field
The present invention relates to a method and system for recognizing power marketing professional words, and belongs to the technical field of Chinese information processing.
Background
With the development of electronic information technology, the volume of electric power data has grown rapidly, and power marketing data contains a large amount of important information. Recognizing power marketing professional words is a basic and critical task in the power domain: extracting the professional words from marketing data reveals information hidden behind the data, such as fault detection and prevention and the state of market operations.
At present, professional words in the power domain are extracted mainly in two ways. The first is statistics-based: it extracts terms by computing the correlation between words, but it favors frequently occurring terms and its accuracy is low. The second uses deep learning algorithms, which require a large amount of manually labeled data, so the workload for staff is heavy and efficiency is low.
Summary of the invention
The present invention provides a method and system for recognizing power marketing professional words that solve the above problems of existing recognition techniques.
To solve the above technical problems, the present invention adopts the following technical solution:
A method for recognizing power marketing professional words, comprising the following steps:
Step 1: train an initial recognition model on power marketing external data; extract the professional words from the power marketing data and build an initial professional word dictionary.
Step 2: label the words in the power marketing data using the latest professional word dictionary and recognition model, and train a new recognition model on the labeled power marketing data.
Step 3: if the number of iterations is below a threshold, recognize the power marketing data with the new recognition model, build a new professional word dictionary, and return to Step 2; otherwise, output the new recognition model for professional word recognition.
The recognition model is a BILSTM-CRF model with an added self-attention mechanism.
The initial professional word dictionary is built as follows:
Professional words are extracted from the power marketing data using left and right information entropy and the K-Means clustering method.
The results extracted by all methods are merged with the domain dictionary provided by external power business personnel, and the initial professional word dictionary is obtained after manual filtering.
The professional words in the power marketing data are labeled as follows:
The recognition model is applied to the power marketing data, and the words in the power marketing data are labeled.
Each word labeled as a non-entity is looked up in the professional word dictionary; if it is found, the word is relabeled as a professional word.
A Trie tree is built from the professional word dictionary, and the words labeled as non-entities are looked up in the Trie tree; each node of the Trie tree represents one character of a word in the professional word dictionary, and the root node stores no character.
A system for recognizing power marketing professional words, comprising:
Construction module: trains an initial recognition model on power marketing external data; extracts the professional words from the power marketing data and builds an initial professional word dictionary.
Annotation training module: labels the words in the power marketing data using the latest professional word dictionary and recognition model, and trains a new recognition model on the labeled power marketing data.
Re-annotation module: if the number of iterations is below a threshold, recognizes the power marketing data with the new recognition model, builds a new professional word dictionary, and hands control back to the annotation training module; otherwise, outputs the new recognition model for professional word recognition.
The recognition model is a BILSTM-CRF model with an added self-attention mechanism.
The construction module includes an initial professional word dictionary construction module, which in turn includes a professional word extraction module and a merging and filtering module.
Professional word extraction module: extracts professional words from the power marketing data using left and right information entropy and the K-Means clustering method.
Merging and filtering module: merges the results extracted by all methods with the domain dictionary provided by external power business personnel; the initial professional word dictionary is obtained after manual filtering.
The annotation training module includes an annotation module, which includes a preliminary annotation module and a correction module.
Preliminary annotation module: applies the recognition model to the power marketing data and labels the words in the power marketing data.
Correction module: looks up each word labeled as a non-entity in the professional word dictionary; if it is found, relabels the word as a professional word.
The correction module includes a lookup module.
Lookup module: builds a Trie tree from the professional word dictionary and looks up the words labeled as non-entities in the Trie tree; each node of the Trie tree represents one character of a word in the professional word dictionary, and the root node stores no character.
Beneficial effects of the invention: without requiring a large background corpus, the method feeds the recognition model's predictions back into the annotation, which raises the recognition accuracy for professional words in power marketing data and improves the working efficiency of power personnel.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a structural diagram of the Trie tree;
Fig. 3 is a diagram of the BILSTM-CRF model with the self-attention mechanism.
Detailed description
The invention is further described below with reference to the accompanying drawings. The following embodiments serve only to illustrate the technical solution of the invention more clearly and are not intended to limit its scope of protection.
As shown in Fig. 1, a method for recognizing power marketing professional words comprises the following steps:
Step 1: train an initial recognition model on power marketing external data; extract the professional words from the power marketing data and build an initial professional word dictionary.
The recognition model is a BILSTM-CRF model with an added self-attention mechanism. This model is trained on the power marketing external data to obtain the initial recognition model. The power marketing external data includes MSRA data, Money Data, and an external professional word dictionary for power marketing, the latter provided by power marketing business personnel.
The initial professional word dictionary is built as follows:
11) Professional words are extracted from the power marketing data using left and right information entropy and the K-Means clustering method.
The power marketing data is clustered with the K-Means clustering method. Within each cluster, keywords are extracted with TF-IDF and the 5 top-ranked words of each cluster are retained to build the term dictionary Cluster_dict (cluster dictionary), as shown in Table 1.
Table 1. Cluster_dict
Cluster name | Candidate professional word | Score
Cluster 1 | Electricity meter | 0.0274005
Cluster 2 | Electricity meter | 0.0151421
Cluster 3 | Power restoration | 0.0162745
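The keyword selection within clusters can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the documents are already word-segmented and that cluster labels come from some K-Means run, and the function name top_terms_per_cluster and the particular TF-IDF weighting (relative term frequency times log inverse document frequency, summed over a cluster's documents) are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def top_terms_per_cluster(docs, labels, top_k=5):
    """Return {cluster_id: [(term, score), ...]} with the top_k TF-IDF terms.

    docs   -- list of token lists (already word-segmented text)
    labels -- cluster id of each document, e.g. produced by a K-Means run
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    scores = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        if not doc:
            continue
        for term, count in Counter(doc).items():
            idf = math.log(n_docs / df[term])
            # Accumulate the term's TF-IDF over all documents of the cluster.
            scores[label][term] += (count / len(doc)) * idf
    return {label: ctr.most_common(top_k) for label, ctr in scores.items()}
```

With top_k=5 each cluster contributes its five highest-scoring words, which would then be collected into a dictionary such as Cluster_dict.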
The left and right information entropy of each word is then calculated. Taking a word $w$ as the center, let the set of words appearing to the left of $w$ be $\alpha = (a_1, a_2, \ldots, a_s)$ and the set of words appearing to the right of $w$ be $\beta = (b_1, b_2, \ldots, b_s)$. The entropies are calculated as follows:
$$L\text{-}E(w) = -\sum_{i=1}^{s} \frac{C(a_i, w)}{n_1} \log \frac{C(a_i, w)}{n_1}$$
$$R\text{-}E(w) = -\sum_{i=1}^{s} \frac{C(w, b_i)}{n_1} \log \frac{C(w, b_i)}{n_1}$$
where $n_1$ is the number of times $w$ occurs in the corpus, $C(a_i, w)$ is the number of times $a_i$ and $w$ occur together in the corpus, $C(w, b_i)$ is the number of times $w$ and $b_i$ occur together in the corpus, $L\text{-}E(w)$ is the left information entropy, and $R\text{-}E(w)$ is the right information entropy.
The side with the smaller of the two entropies is selected and combined with $w$ to form a new phrase, from which the term dictionary Entropy_dict (entropy dictionary) is built, as shown in Table 2.
Table 2. Entropy_dict
Candidate professional word | Score
Electricity payment | 0.288829687518
Payment method | 0.035226661968
Payment reminder notice | 0.224742098999
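As a rough illustration of the left and right information entropy, the sketch below counts the immediate left and right neighbours of a word in a token sequence and computes the entropy of each neighbour distribution, dividing by the number of occurrences of the word as in the definitions of n1, C(a_i, w) and C(w, b_i). The function name branch_entropy and the use of the natural logarithm are assumptions; the source does not specify the logarithm base.

```python
import math
from collections import Counter


def branch_entropy(tokens, word):
    """Return (left_entropy, right_entropy) of `word` in a token sequence."""
    left, right = Counter(), Counter()
    n1 = 0  # number of occurrences of `word`
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        n1 += 1
        if i > 0:
            left[tokens[i - 1]] += 1       # C(a_i, w)
        if i + 1 < len(tokens):
            right[tokens[i + 1]] += 1      # C(w, b_i)
    return _entropy(left, n1), _entropy(right, n1)


def _entropy(neighbours, n1):
    # -sum(p * log p) with p = count / n1; adding 0.0 normalises -0.0 to 0.0.
    return -sum((c / n1) * math.log(c / n1) for c in neighbours.values()) + 0.0
```

A low entropy on one side means the neighbour on that side is highly predictable, which is the situation in which the word and that neighbour are combined into a candidate phrase.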
12) The results extracted by all methods are merged with the domain dictionary provided by external power business personnel, and the initial professional word dictionary is obtained after manual filtering. That is, Cluster_dict, Entropy_dict and the domain dictionary provided by external power business personnel are merged, then manually reviewed and deduplicated to obtain the initial professional word dictionary, for example: electricity meter, utility pole, indicator light, payment reminder notice, account cancellation, and so on.
Step 2: label the words in the power marketing data using the latest professional word dictionary and recognition model, and train a new recognition model on the labeled power marketing data.
The professional words in the power marketing data are labeled as follows:
21) The recognition model is applied to the power marketing data, and the words in the power marketing data are labeled.
The labels fall into five classes: person names, place names, organization names, power professional words, and words that are not entities. Person names are labeled PER, place names LOC, organization names ORG, power professional words ELECT, and non-entity words O.
22) Each word labeled as a non-entity is looked up in the professional word dictionary; if it is found, the word is relabeled as a professional word.
To speed up lookup, as shown in Fig. 2, a Trie tree is built from the professional word dictionary and the words labeled as non-entities are looked up in it; each node of the Trie tree represents one character of a word in the professional word dictionary, and the root node stores no character. Take the sentence "Xiao Ming received the payment reminder slip issued by a staff member on November 20", in which both "payment reminder" and "payment reminder slip" are power marketing professional words: a lookup in the Trie tree quickly and efficiently selects the longest matching phrase, "payment reminder slip", for labeling.
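The Trie lookup with longest-match selection can be sketched as follows. This is an illustrative sketch only: the class and function names (Trie, longest_match, find_terms), the nested-dictionary representation, and the "$end" marker are assumptions, not details given in the source.

```python
class Trie:
    """Character-level prefix tree; the root stores no character."""

    def __init__(self, words=()):
        self.root = {}
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$end"] = True  # marks that a dictionary word ends at this node

    def longest_match(self, text, start):
        """Length of the longest dictionary word beginning at text[start], or 0."""
        node, best = self.root, 0
        for offset, ch in enumerate(text[start:], 1):
            if ch not in node:
                break
            node = node[ch]
            if "$end" in node:
                best = offset
        return best


def find_terms(trie, text):
    """Scan left to right and return (start, end) spans of longest matches."""
    spans, i = [], 0
    while i < len(text):
        n = trie.longest_match(text, i)
        if n:
            spans.append((i, i + n))
            i += n
        else:
            i += 1
    return spans
```

For a dictionary containing both "payment reminder" and "payment reminder slip", find_terms returns the span of the longer entry when both match at the same position; spans found inside text that the model labeled O would then be relabeled as professional words.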
Raw data: "A client called to report that Xiao Ming applied for time-of-use service at the Biancang Town power supply station, but the T system shows that it has not yet been activated. The client objects to this and asks the relevant department of the power supply company to verify the matter as soon as possible and reply to the client."
Labeled data: "A client called to report that <PER>Xiao Ming</PER> applied for <ELECT>time-of-use</ELECT> service at the <ORG>Biancang Town power supply station</ORG>, but the T system shows that it has not yet been activated. The client objects to this and asks the relevant department of the power supply company to verify the matter as soon as possible and reply to the client."
Of the labeled power marketing data, 80% is used as the training set and 20% as the validation set to train the BILSTM-CRF model with the added self-attention mechanism, yielding the new recognition model. In the input format, the first column is the character feature of the input, the second column is the part-of-speech feature (explained in Table 3), and the third column is the annotation label, where B marks the beginning of an entity, I the middle, E the end, and S a single-character entity.
Table 3. Explanation of part-of-speech features
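The B/I/E/S positions combined with the entity labels can be illustrated with a short sketch that turns entity spans into per-character tags. The function name bioes_tags and the combined tag strings such as "B-PER" are assumed conventions for the example; the source only states the position markers and the label names.

```python
def bioes_tags(length, spans):
    """Character-level BIOES tags for a sentence of `length` characters.

    spans -- list of (start, end, label) with `end` exclusive and no overlaps.
    """
    tags = ["O"] * length
    for start, end, label in spans:
        if end - start == 1:
            tags[start] = f"S-{label}"      # single-character entity
            continue
        tags[start] = f"B-{label}"          # beginning
        for i in range(start + 1, end - 1):
            tags[i] = f"I-{label}"          # middle
        tags[end - 1] = f"E-{label}"        # end
    return tags
```

For example, a three-character organization name at the start of a four-character sentence yields B-ORG, I-ORG, E-ORG, O.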
As shown in Fig. 3, the model consists mainly of 6 layers: an input layer, a lookup layer, a BILSTM layer, a self-attention layer, a fully connected layer and a CRF layer. The details are as follows:
(1) Input layer: the training set and validation set are processed and the data is labeled with the BIOES scheme. The character feature $Char = [c_1, c_2, \ldots, c_n]$ and the part-of-speech feature $pos = [p_1, p_2, \ldots, p_n]$ are used as model inputs. The character feature captures the basic features of the text, the part-of-speech feature captures the semantic features of a sentence in different contexts, and $n$ is the number of features.
(2) Lookup layer: the character features and part-of-speech features are converted into the corresponding character vectors and part-of-speech vectors, and the resulting vectors are concatenated, i.e. $X_j = [char_j, pos_j]$, where $X_j$ denotes the $j$-th sentence and $j \in [1, n]$.
(3) BILSTM layer: the BILSTM captures the contextual information of the sentence, and the forward and backward contextual information it learns is concatenated to give the feature vector of the BILSTM layer. Suppose the output of the BILSTM hidden layer is
$$h_i = [\overrightarrow{h_i}; \overleftarrow{h_i}]$$
where $h_i$ is the feature vector output by the BILSTM layer for the $i$-th character of the sentence, $\overrightarrow{h_i}$ is the hidden vector output by the forward LSTM for the $i$-th character, and $\overleftarrow{h_i}$ is the hidden vector output by the backward LSTM for the $i$-th character.
(4) Self-attention layer: a self-attention mechanism captures the correlations within the sequence itself. Attention weights are computed for the feature vectors $h_i$ output by the BILSTM layer to give the output of the self-attention layer. The self-attention mechanism is computed as:
$$Q_t = f(W h_i + b)$$
$$K_t = f(W h_i + b)$$
$$V_t = f(W h_i + b)$$
$$self\_attention = \mathrm{softmax}\left(\frac{Q_t K_t^{T}}{\sqrt{d}}\right) V_t$$
where $Q_t$, $K_t$ and $V_t$ are the outputs of fully connected layers, $d$ is the dimension of $Q_t$, the softmax is taken over the $m$ positions of the sentence, $m$ is the sentence length, $self\_attention$ is the self-attention value of the $i$-th character, $W$ is the weight matrix to be trained, $b$ is the bias term, and $f$ is the ReLU activation function.
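The attention computation can be sketched in plain Python as below. The sketch assumes the standard scaled dot-product form, softmax(Q K^T / sqrt(d)) V, and takes the projections Q, K and V as given rather than computing them from h_i; the function names softmax and self_attention are illustrative.

```python
import math


def softmax(row):
    """Numerically stable softmax of a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    Q, K -- lists of m row vectors (one per character), each of length d.
    V    -- list of m row vectors holding the values to be mixed.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        # Similarity of this position to every position, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each output row is a convex combination of the rows of V, so a query that is equally similar to all keys simply returns the average of the value vectors.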
(5) Fully connected layer: the output of the self-attention layer is mapped to the label space. The output of the fully connected layer is calculated as:
$$M = W \cdot self\_attention + b$$
where $M$ is the output of the fully connected layer.
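(6) CRF layer: the output $M$ of the fully connected layer is passed to the CRF layer. For an input sentence $X = (x_1, x_2, \ldots, x_m)$, the score of its actual label sequence $y = (y_1, y_2, \ldots, y_m)$ is:
$$Score(X, y) = \sum_{i=1}^{m} M_{i, y_i} + \sum_{i=1}^{m-1} A_{y_i, y_{i+1}}$$
where $Score(X, y)$ is the score of the actual label sequence $y$ of sentence $X$; $M_{i, y_i}$ is the state score, i.e. the score of character $x_i$ being labeled $y_i$; and $A_{y_i, y_{i+1}}$ is the entry of the transition matrix giving the score of moving from label $y_i$ to label $y_{i+1}$.
The scores $Score(X, y)$ of all possible label sequences are normalized to give the probability $P(y \mid X)$:
$$P(y \mid X) = \frac{\exp(Score(X, y))}{\sum_{\tilde{y} \in Y_X} \exp(Score(X, \tilde{y}))}$$
where $Y_X$ is the set of possible output label sequences for $X$ and $\tilde{y}$ is a label sequence in $Y_X$.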
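The objective function of model training is the log-likelihood of the actual label sequence, which is maximized:
$$\log P(y \mid X) = Score(X, y) - \log \sum_{\tilde{y} \in Y_X} \exp(Score(X, \tilde{y}))$$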
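The CRF quantities can be made concrete with a small sketch. It assumes a linear-chain CRF whose sequence score is the sum of the state scores M[i][y_i] and the transition scores A[y_i][y_{i+1}], with no separate start or stop transitions, and it assumes the usual log-likelihood training objective; the function names are illustrative. The normalizer is computed with the forward algorithm instead of enumerating every label sequence.

```python
import math


def sequence_score(M, A, y):
    """Score(X, y): state scores M[i][y_i] plus transitions A[y_i][y_{i+1}]."""
    emit = sum(M[i][tag] for i, tag in enumerate(y))
    trans = sum(A[a][b] for a, b in zip(y, y[1:]))
    return emit + trans


def _logsumexp(values):
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def log_partition(M, A):
    """log of the sum of exp(Score) over all label sequences (forward algorithm)."""
    n_tags = len(A)
    alpha = list(M[0])  # log-scores of all length-1 prefixes, by last label
    for i in range(1, len(M)):
        alpha = [
            M[i][t] + _logsumexp([alpha[s] + A[s][t] for s in range(n_tags)])
            for t in range(n_tags)
        ]
    return _logsumexp(alpha)


def log_likelihood(M, A, y):
    """log P(y | X) = Score(X, y) - log sum over sequences of exp(Score)."""
    return sequence_score(M, A, y) - log_partition(M, A)
```

Training would adjust the network parameters and A so that log_likelihood is as large as possible for the annotated label sequences; at prediction time the highest-scoring sequence is typically found with Viterbi decoding, which is not shown here.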
Step 3: if the number of iterations is below a threshold (the threshold is usually set to 3), recognize the power marketing data with the new recognition model, build a new professional word dictionary, and return to Step 2; otherwise, output the new recognition model for professional word recognition.
To build the new professional word dictionary, the newly recognized professional words are simply added to the professional word dictionary used in the previous iteration; the new professional word dictionary is obtained after manual review and deduplication.
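The control flow of Steps 2 and 3 can be sketched as a loop. Everything named here is hypothetical: annotate, train, recognise and review stand in for the labeling step, model training, recognition with the new model, and manual review with deduplication. How iterations are counted against the threshold is also an assumption, since the source only says the loop continues while the count is below the threshold.

```python
def iterative_training(initial_model, initial_dict, corpus,
                       annotate, train, recognise, review, max_iterations=3):
    """Alternate labeling/training (Step 2) with dictionary updates (Step 3)."""
    model, dictionary = initial_model, set(initial_dict)
    iterations = 0
    while True:
        # Step 2: label the corpus with the latest model and dictionary,
        # then train a new model on the labeled data.
        labelled = annotate(model, dictionary, corpus)
        model = train(labelled)
        iterations += 1
        # Step 3: stop once the iteration threshold is reached ...
        if iterations >= max_iterations:
            return model, dictionary
        # ... otherwise add newly recognised terms to the previous dictionary
        # and pass the result through review before the next round.
        dictionary = review(dictionary | set(recognise(model, corpus)))
```

Passing the four steps in as callables keeps the loop independent of any particular model or review procedure.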
The recognition results of the above method were compared with those of the two existing approaches, using TF-IDF as the statistics-based model and BILSTM-CRF as the deep learning model. The comparison is shown in Table 4.
Table 4. Comparison of power marketing professional word recognition results
As Table 4 shows, without requiring a large background corpus, the above method uses semi-automatic annotation and feeds the recognition model's predictions back into the annotation, which raises the recognition accuracy for professional words in power marketing data and improves the working efficiency of power personnel.
A system for recognizing power marketing professional words, comprising:
Construction module: trains an initial recognition model on power marketing external data; extracts the professional words from the power marketing data and builds an initial professional word dictionary.
The recognition model is a BILSTM-CRF model with an added self-attention mechanism.
The construction module includes an initial professional word dictionary construction module, which in turn includes a professional word extraction module and a merging and filtering module.
Professional word extraction module: extracts professional words from the power marketing data using left and right information entropy and the K-Means clustering method.
Merging and filtering module: merges the results extracted by all methods with the domain dictionary provided by external power business personnel; the initial professional word dictionary is obtained after manual filtering.
Annotation training module: labels the words in the power marketing data using the latest professional word dictionary and recognition model, and trains a new recognition model on the labeled power marketing data.
The annotation training module includes an annotation module, which includes a preliminary annotation module and a correction module.
Preliminary annotation module: applies the recognition model to the power marketing data and labels the words in the power marketing data.
Correction module: looks up each word labeled as a non-entity in the professional word dictionary; if it is found, relabels the word as a professional word.
The correction module includes a lookup module. Lookup module: builds a Trie tree from the professional word dictionary and looks up the words labeled as non-entities in the Trie tree; each node of the Trie tree represents one character of a word in the professional word dictionary, and the root node stores no character.
Re-annotation module: if the number of iterations is below a threshold, recognizes the power marketing data with the new recognition model, builds a new professional word dictionary, and hands control back to the annotation training module; otherwise, outputs the new recognition model for professional word recognition.
A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to perform the method for recognizing power marketing professional words.
A computing device comprising one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs comprise instructions for performing the method for recognizing power marketing professional words.
Those skilled in the art will appreciate that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are merely embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention falls within the scope of the pending claims of the present invention.

Claims (10)

1. A method for recognizing power marketing professional words, characterized by comprising the following steps:
Step 1: train an initial recognition model on power marketing external data; extract the professional words from the power marketing data and build an initial professional word dictionary;
Step 2: label the words in the power marketing data using the latest professional word dictionary and recognition model, and train a new recognition model on the labeled power marketing data;
Step 3: if the number of iterations is below a threshold, recognize the power marketing data with the new recognition model, build a new professional word dictionary, and return to Step 2; otherwise, output the new recognition model for professional word recognition.
2. The method for recognizing power marketing professional words according to claim 1, characterized in that the recognition model is a BILSTM-CRF model with an added self-attention mechanism.
3. The method for recognizing power marketing professional words according to claim 1, characterized in that the initial professional word dictionary is built as follows:
professional words are extracted from the power marketing data using left and right information entropy and the K-Means clustering method;
the results extracted by all methods are merged with the domain dictionary provided by external power business personnel, and the initial professional word dictionary is obtained after manual filtering.
4. The method for recognizing power marketing professional words according to claim 1, characterized in that the professional words in the power marketing data are labeled as follows:
the recognition model is applied to the power marketing data, and the words in the power marketing data are labeled;
each word labeled as a non-entity is looked up in the professional word dictionary; if it is found, the word is relabeled as a professional word.
5. The method for recognizing power marketing professional words according to claim 4, characterized in that a Trie tree is built from the professional word dictionary and the words labeled as non-entities are looked up in the Trie tree; each node of the Trie tree represents one character of a word in the professional word dictionary, and the root node stores no character.
6. A system for recognizing power marketing professional words, characterized by comprising:
a construction module, which trains an initial recognition model on power marketing external data, extracts the professional words from the power marketing data, and builds an initial professional word dictionary;
an annotation training module, which labels the words in the power marketing data using the latest professional word dictionary and recognition model, and trains a new recognition model on the labeled power marketing data;
a re-annotation module, which, if the number of iterations is below a threshold, recognizes the power marketing data with the new recognition model, builds a new professional word dictionary, and hands control back to the annotation training module, and otherwise outputs the new recognition model for professional word recognition.
7. The system for recognizing power marketing professional words according to claim 6, characterized in that the recognition model is a BILSTM-CRF model with an added self-attention mechanism.
8. The system for recognizing power marketing professional words according to claim 6, characterized in that the construction module includes an initial professional word dictionary construction module, which in turn includes a professional word extraction module and a merging and filtering module;
the professional word extraction module extracts professional words from the power marketing data using left and right information entropy and the K-Means clustering method;
the merging and filtering module merges the results extracted by all methods with the domain dictionary provided by external power business personnel, the initial professional word dictionary being obtained after manual filtering.
9. The system for recognizing power marketing professional words according to claim 6, characterized in that the annotation training module includes an annotation module, which includes a preliminary annotation module and a correction module;
the preliminary annotation module applies the recognition model to the power marketing data and labels the words in the power marketing data;
the correction module looks up each word labeled as a non-entity in the professional word dictionary and, if it is found, relabels the word as a professional word.
10. The system for recognizing power marketing professional words according to claim 9, characterized in that the correction module includes a lookup module;
the lookup module builds a Trie tree from the professional word dictionary and looks up the words labeled as non-entities in the Trie tree; each node of the Trie tree represents one character of a word in the professional word dictionary, and the root node stores no character.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910584443.2A CN110287495A (en) 2019-07-01 2019-07-01 A kind of power marketing profession word recognition method and system

Publications (1)

Publication Number Publication Date
CN110287495A true CN110287495A (en) 2019-09-27

Family

ID=68021484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910584443.2A Pending CN110287495A (en) 2019-07-01 2019-07-01 A kind of power marketing profession word recognition method and system

Country Status (1)

Country Link
CN (1) CN110287495A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572758A (en) * 2013-10-24 2015-04-29 山东大学 Method and system for automatically extracting power field specialized vocabularies
CN107133220A (en) * 2017-06-07 2017-09-05 东南大学 Name entity recognition method in a kind of Geography field
CN107527073A (en) * 2017-09-05 2017-12-29 中南大学 The recognition methods of entity is named in electronic health record
CN109710947A (en) * 2019-01-22 2019-05-03 福建亿榕信息技术有限公司 Power specialty word stock generating method and device
CN109710926A (en) * 2018-12-12 2019-05-03 内蒙古电力(集团)有限责任公司电力调度控制分公司 Dispatching of power netwoks professional language semantic relation extraction method, apparatus and electronic equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110738054A (en) * 2019-10-14 2020-01-31 携程计算机技术(上海)有限公司 Method, system, electronic device and storage medium for identifying hotel information in mail
CN111339268A (en) * 2020-02-19 2020-06-26 北京百度网讯科技有限公司 Entity word recognition method and device
CN111339268B (en) * 2020-02-19 2023-08-15 北京百度网讯科技有限公司 Entity word recognition method and device
CN113762716A (en) * 2021-07-30 2021-12-07 国网山东省电力公司营销服务中心(计量中心) Method and system for evaluating running state of transformer area based on deep learning and attention

Similar Documents

Publication Publication Date Title
Jung Semantic vector learning for natural language understanding
CN109635117B (en) Method and device for recognizing user intention based on knowledge graph
CN109145294B (en) Text entity identification method and device, electronic equipment and storage medium
CN105244029B (en) Voice recognition post-processing method and system
Xie et al. Detecting duplicate bug reports with convolutional neural networks
CN110457676B (en) Evaluation information extraction method and device, storage medium and computer equipment
CN107861951A (en) Session subject identifying method in intelligent customer service
CN106776538A (en) The information extracting method of enterprise's noncanonical format document
CN109598517B (en) Commodity clearance processing, object processing and category prediction method and device thereof
CN112434535B (en) Element extraction method, device, equipment and storage medium based on multiple models
CN111899090B (en) Enterprise associated risk early warning method and system
Yuan-jie et al. Web service classification based on automatic semantic annotation and ensemble learning
CN110287495A (en) A kind of power marketing profession word recognition method and system
CN112818093A (en) Evidence document retrieval method, system and storage medium based on semantic matching
CN113672718B (en) Dialogue intention recognition method and system based on feature matching and field self-adaption
CN113761218A (en) Entity linking method, device, equipment and storage medium
CN110334186A (en) Data query method, apparatus, computer equipment and computer readable storage medium
CN110222192A (en) Corpus method for building up and device
Tripathi et al. SimNER–an accurate and faster algorithm for named entity recognition
KR20230163983A (en) Similar patent extraction methods using neural network model and device for the method
CN117422074A (en) Method, device, equipment and medium for standardizing clinical information text
Thuy et al. Leveraging foreign language labeled data for aspect-based opinion mining
CN110287396A (en) Text matching technique and device
Spichakova et al. Using machine learning for automated assessment of misclassification of goods for fraud detection
CN114708073A (en) Intelligent detection method and device for surrounding mark and serial mark, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190927)