CN107622049A - A kind of special word stock generating method of electric service - Google Patents

A kind of special word stock generating method of electric service

Info

Publication number
CN107622049A
Authority
CN
China
Prior art keywords
extraction
participle
electric service
steps
special
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710797109.6A
Other languages
Chinese (zh)
Inventor
左松林
倪妍妍
李直
袁加梅
黄华胜
张甜
张莉莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, Electric Power Research Institute of State Grid Anhui Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN201710797109.6A priority Critical patent/CN107622049A/en
Publication of CN107622049A publication Critical patent/CN107622049A/en
Pending legal-status Critical Current

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a method for generating a dedicated lexicon for power supply services. The method comprises the following steps: text preprocessing; extraction of English and digits; extraction of Chinese numerals and measure words; extraction of proper nouns; secondary word-segmentation extraction, with an extraction threshold and weights set; matching and weight marking of the content obtained by secondary segmentation, and extraction of the corresponding segments by model; ambiguity processing to form the final segmentation result; manual addition of grid-specific nouns; and combining the final segmentation result with the grid-specific nouns to form the dedicated power supply service lexicon. The invention uses domain-specific fields of the power system as lexicon phrases for segmentation, which screens proper nouns more effectively and facilitates splitting entries and extracting hot words. It combines several hot-word extraction methods and applies association matching and weight marking to produce the final segmentation lexicon. The result better fits the hot-word extraction needs of power supply services and improves the accuracy and efficiency of hot-word extraction.

Description

A kind of special word stock generating method of electric service
Technical field
The invention belongs to the field of power supply services, and in particular relates to a method for generating a dedicated lexicon for power supply services.
Background art
As power system reform deepens and the upgrading of the economic structure continues, the standard of service expected by electricity customers keeps rising. Customer service work orders are a valuable resource for power supply enterprises: management and leadership can learn from them where the enterprise's service work falls short and what problems exist, and customer complaints can be turned into constructive external pressure that strengthens internal management. With the continued development of the smart grid, marketing service data is growing to massive scale. The various types of service work orders are still queried, counted and analysed manually, which is inefficient and cannot reflect the underlying service problems.
The intent is to study in depth the techniques for extracting, classifying, associating and analysing hot words across all 95598 business work orders and 186-associated work orders, to build a "Baidu"-style hot-word search engine, and to establish work-order search, display and statistics mechanisms on that search platform. The association between hot words and work orders is studied so that customer service work orders can be traced to their source and work orders can be tracked by hot word.
Accurate and efficient hot-word extraction requires a dedicated power supply service lexicon specific to the grid.
Summary of the invention
The object of the invention is to overcome the above problems in the prior art by providing a method for generating a dedicated lexicon for power supply services that facilitates accurate and efficient hot-word extraction.
To achieve the above technical object and technical effect, the invention is implemented through the following technical solution:
A method for generating a dedicated lexicon for power supply services, the method comprising the following steps:
S1: Text preprocessing: the input text is segmented at punctuation marks;
S2: English and digit extraction: words consisting of English, digits, or a mix of English and digits are extracted;
S3: Chinese numeral and measure word extraction: numeral-measure-word compounds are extracted using a measure-word dictionary and Chinese numerals;
S4: Proper noun extraction: proper nouns in the text are extracted by matching against power system terminology;
S5: Secondary word-segmentation extraction is performed on the content extracted in steps S1 to S4, and an extraction threshold and weights are set;
S6: The content obtained by secondary segmentation is matched and weight-marked, and the corresponding segments are extracted by model;
S7: Ambiguity processing is applied to the segments extracted in step S6 to form the final segmentation result;
S8: Grid-specific nouns are added manually;
S9: The final segmentation result of step S7 and the grid-specific nouns of step S8 are combined to form the dedicated power supply service lexicon.
Further, the power system terminology in step S4 includes department divisions, pole and tower arrangements, private types, detailed classifications, user classifications, employee names and device names.
Further, after the dedicated power supply service lexicon of step S9 is formed, association statistics are computed on part of speech and word frequency.
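The statistics on part of speech and word frequency mentioned for step S9 can be illustrated with a minimal sketch. The source does not specify how the statistics are computed, so everything here is an assumption for illustration: the function name `lexicon_statistics`, the sample lexicon entries, and the part-of-speech tags ("n", "v") are hypothetical, and the texts are assumed to have been segmented already.

```python
from collections import Counter


def lexicon_statistics(segmented_texts, lexicon):
    """Count word frequency and part-of-speech frequency for lexicon entries.

    segmented_texts: list of token lists (texts assumed to be segmented already).
    lexicon: dict mapping each lexicon word to a part-of-speech tag (hypothetical tags).
    """
    word_freq = Counter()
    pos_freq = Counter()
    for tokens in segmented_texts:
        for token in tokens:
            if token in lexicon:
                word_freq[token] += 1
                pos_freq[lexicon[token]] += 1
    return word_freq, pos_freq


# Hypothetical lexicon and work-order tokens, for illustration only.
lexicon = {"停电": "v", "电费": "n", "电表": "n"}
work_orders = [
    ["客户", "反映", "停电"],
    ["电费", "异常", "停电"],
]
word_freq, pos_freq = lexicon_statistics(work_orders, lexicon)
```

Tokens that are not in the lexicon are ignored, so the counts describe only lexicon entries, which matches the stated purpose of counting the word frequency of vocabulary in the lexicon.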
The beneficial effects of the invention are as follows:
The invention uses domain-specific fields of the power system as lexicon phrases for segmentation, which screens proper nouns more effectively and facilitates splitting entries and extracting hot words. It combines several hot-word extraction methods and applies association matching and weight marking to produce the final segmentation lexicon. The result better fits the hot-word extraction needs of power supply services and improves the accuracy and efficiency of hot-word extraction.
Brief description of the drawings
The accompanying drawing described here provides a further understanding of the invention and forms part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawing:
Fig. 1 is a schematic flow chart of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawing. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
As shown in Fig. 1, a method for generating a dedicated lexicon for power supply services comprises the following steps:
S1: Text preprocessing: the input text is segmented at punctuation marks, i.e., a whole passage is split into sentences by punctuation;
S2: English and digit extraction: English and digits are generally easy to distinguish. An existing segmentation algorithm is used to extract words consisting of English, digits, or a mix of the two, covering English words, continuous numeric codes and mixed English-digit terms such as Book, 20162010332 and D3000; the sample content is thereby further refined and split, which eases subsequent segmentation;
S3: Chinese numeral and measure word extraction: a measure-word dictionary and Chinese numerals are used to extract numeral-measure-word compounds; numerals and measure words, such as "one" and "five", are extracted from each short sentence, and the sample content is further refined and split, which eases subsequent segmentation;
S4: Proper noun extraction: power system terminology is matched against the text; if a word in a segmented short sentence is identical to a power system term, it is extracted. The power system terms include department divisions, pole and tower arrangements, private types, detailed classifications, user classifications, employee names and device names;
S5: Secondary word-segmentation extraction is performed on the content extracted in steps S1 to S4, and an extraction threshold and weights are set;
The secondary word-segmentation extraction includes the following three processing modes:
1. Using a forward-iteration, longest-string matching algorithm, the content extracted in steps S1 to S4 is segmented further; an extraction threshold and weights must be set;
2. Using an existing specialised dictionary and a segmentation extraction algorithm, the content extracted in steps S1 to S4 is segmented; an extraction threshold and weights must be set;
3. Using machine learning methods such as RNN and LSTM, the content extracted in steps S1 to S4 is segmented;
S6: The content obtained by secondary segmentation is matched and weight-marked, and the corresponding segments are extracted by the corresponding model;
S7: Ambiguity processing is applied to the segments extracted in step S6; for example, "I do not like" is ambiguously segmented as "I | like | do not like" and, after ambiguity processing, becomes "I | or not like"; this forms the final segmentation result;
S8: The latest grid-specific nouns, such as Palm Electric Power and Electric E Treasure, are added manually;
S9: The final segmentation result of step S7 and the grid-specific nouns of step S8 are combined to form the dedicated power supply service lexicon. After the lexicon is formed, association statistics are computed on part of speech and word frequency, i.e., the word frequency of lexicon entries is counted over the relevant text content.
The basic principles, main features and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited to the above embodiments; the embodiments and the description merely illustrate the principles of the invention. Various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the scope of the claimed invention.
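As a further illustration of the embodiment, processing mode 1 of step S5 (forward iteration with longest-string matching) resembles the technique commonly known as forward maximum matching. The sketch below shows that general technique against a small term set, which could equally stand in for the power system terminology of step S4. It is an assumption that the patent's algorithm behaves this way; the function name, the sample terms and the single-character fallback are hypothetical, and the extraction threshold and weights are omitted because the source does not define them.

```python
def forward_max_match(sentence, lexicon, max_len=None):
    """Segment a sentence by repeatedly taking the longest lexicon entry
    that starts at the current position, scanning left to right.
    Characters that match no entry are emitted as single-character tokens.
    """
    if max_len is None:
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate first, then shrink until there is a hit.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical domain terms, for illustration only.
terms = {"电力", "电力公司", "停电", "通知"}
tokens = forward_max_match("电力公司停电通知", terms)
```

Because the longest candidate is tried first, "电力公司" is preferred over its prefix "电力". This greedy choice is also what can produce the kind of ambiguous segmentation that step S7 is meant to resolve.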

Claims (3)

  1. A method for generating a dedicated lexicon for power supply services, characterized in that the method comprises the following steps:
    S1: Text preprocessing: the input text is segmented at punctuation marks;
    S2: English and digit extraction: words consisting of English, digits, or a mix of English and digits are extracted;
    S3: Chinese numeral and measure word extraction: numeral-measure-word compounds are extracted using a measure-word dictionary and Chinese numerals;
    S4: Proper noun extraction: proper nouns in the text are extracted by matching against power system terminology;
    S5: Secondary word-segmentation extraction is performed on the content extracted in steps S1 to S4, and an extraction threshold and weights are set;
    S6: The content obtained by secondary segmentation is matched and weight-marked, and the corresponding segments are extracted by model;
    S7: Ambiguity processing is applied to the segments extracted in step S6 to form the final segmentation result;
    S8: Grid-specific nouns are added manually;
    S9: The final segmentation result of step S7 and the grid-specific nouns of step S8 are combined to form the dedicated power supply service lexicon.
  2. The method for generating a dedicated lexicon for power supply services according to claim 1, characterized in that the power system terminology in step S4 includes department divisions, pole and tower arrangements, private types, detailed classifications, user classifications, employee names and device names.
  3. The method for generating a dedicated lexicon for power supply services according to claim 1, characterized in that, after the dedicated power supply service lexicon of step S9 is formed, association statistics are computed on part of speech and word frequency.
CN201710797109.6A 2017-09-06 2017-09-06 A kind of special word stock generating method of electric service Pending CN107622049A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710797109.6A CN107622049A (en) 2017-09-06 2017-09-06 A kind of special word stock generating method of electric service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710797109.6A CN107622049A (en) 2017-09-06 2017-09-06 A kind of special word stock generating method of electric service

Publications (1)

Publication Number Publication Date
CN107622049A true CN107622049A (en) 2018-01-23

Family

ID=61089325

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710797109.6A Pending CN107622049A (en) 2017-09-06 2017-09-06 A kind of special word stock generating method of electric service

Country Status (1)

Country Link
CN (1) CN107622049A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104899190A (en) * 2015-06-04 2015-09-09 百度在线网络技术(北京)有限公司 Generation method and device for word segmentation dictionary and word segmentation processing method and device
CN105005556A (en) * 2015-07-29 2015-10-28 成都理工大学 Index keyword extraction method and system based on big geological data
CN106250372A (en) * 2016-08-17 2016-12-21 国网上海市电力公司 A kind of Chinese electric power data text mining method for power system
CN106447346A (en) * 2016-08-29 2017-02-22 北京中电普华信息技术有限公司 Method and system for construction of intelligent electric power customer service system
CN106951410A (en) * 2017-03-21 2017-07-14 北京三快在线科技有限公司 Generation method, device and the electronic equipment of dictionary

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109388806A (en) * 2018-10-26 2019-02-26 北京布本智能科技有限公司 A kind of Chinese word cutting method based on deep learning and forgetting algorithm
CN109388806B (en) * 2018-10-26 2023-06-27 北京布本智能科技有限公司 Chinese word segmentation method based on deep learning and forgetting algorithm
CN111783438A (en) * 2020-05-22 2020-10-16 贵州电网有限责任公司 Hot word detection method for realizing work order analysis

Similar Documents

Publication Publication Date Title
CN104899304B (en) Name entity recognition method and device
CN110598203B (en) Method and device for extracting entity information of military design document combined with dictionary
CN106919673B (en) Text mood analysis system based on deep learning
CN104391942B (en) Short essay eigen extended method based on semantic collection of illustrative plates
CN107168945B (en) Bidirectional cyclic neural network fine-grained opinion mining method integrating multiple features
CN107168955B (en) Utilize the Chinese word cutting method of the word insertion and neural network of word-based context
CN110807328B (en) Named entity identification method and system for legal document multi-strategy fusion
CN106570179B (en) A kind of kernel entity recognition methods and device towards evaluation property text
CN103336766B (en) Short text garbage identification and modeling method and device
CN106529804A (en) Client complaint early-warning monitoring analyzing method based on text mining technology
CN107908716A (en) 95598 work order text mining method and apparatus of word-based vector model
CN106530127A (en) Complaint early warning and monitoring analysis system based on text mining
WO2019228466A1 (en) Named entity recognition method, device and apparatus, and storage medium
CN111274814B (en) Novel semi-supervised text entity information extraction method
CN101127042A (en) Sensibility classification method based on language model
CN112101028A (en) Multi-feature bidirectional gating field expert entity extraction method and system
CN109002473A (en) A kind of sentiment analysis method based on term vector and part of speech
CN107562726A (en) A kind of electric service search engine based on hot word
CN112069312B (en) Text classification method based on entity recognition and electronic device
CN107145573A (en) The problem of artificial intelligence customer service robot, answers method and system
CN115858758A (en) Intelligent customer service knowledge graph system with multiple unstructured data identification
CN111259153A (en) Attribute-level emotion analysis method of complete attention mechanism
CN113360582B (en) Relation classification method and system based on BERT model fusion multi-entity information
CN101876975A (en) Identification method of Chinese place name
CN109657039A (en) A kind of track record information extraction method based on the double-deck BiLSTM-CRF

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180123
