CN107622049A - Method for generating a dedicated lexicon for power supply services - Google Patents
Method for generating a dedicated lexicon for power supply services
- Publication number: CN107622049A
- Application number: CN201710797109.6A
- Authority: CN (China)
- Prior art keywords: extraction, word segmentation, power supply service, steps, dedicated
- Prior art date
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Machine Translation (AREA)
Abstract
The invention discloses a method for generating a dedicated lexicon for power supply services, comprising the following steps: text preprocessing; extraction of English words and digits; extraction of Chinese numerals and measure words; proper noun extraction; secondary word-segmentation extraction, with an extraction threshold and weights set; matching and weight-marking the content extracted by secondary segmentation, and extracting the corresponding segmented words by model; ambiguity processing to form the final segmentation result; manual addition of grid-exclusive nouns; and combining the final segmentation result with the grid-exclusive nouns to form the dedicated power supply service lexicon. The invention uses domain-specific fields of the power system as lexicon phrases for segmentation, which screens proper nouns more effectively and facilitates entry segmentation and hot-word extraction. Several hot-word extraction methods are combined, with association matching and weight marking, to obtain the final segmentation lexicon. The method better fits the hot-word extraction needs of power supply services and improves the accuracy and efficiency of hot-word extraction.
Description
Technical field
The invention belongs to the field of power supply services, and in particular relates to a method for generating a dedicated lexicon for power supply services.
Background
As power system reform deepens and economic restructuring and upgrading continue to advance, customers' standards for electricity service keep rising. Customer service work order information is a valuable resource for a power supply enterprise: from it, management and leadership can learn the shortcomings and existing problems of the enterprise's service work, and customer complaints can be converted into positive external pressure for strengthening internal management. As the smart grid continues to improve, marketing service data is growing to massive scale. The various types of service work orders, however, are still handled by traditional, inefficient manual query, statistics, and data analysis, which cannot reveal the essence of current service problems.
It is proposed to study in depth the hot-word extraction, classification, association, and analysis techniques for all 95598 service work orders and associated 186 work orders, to build a Baidu-style hot-word search engine, and to establish work order search, display, and statistics mechanisms on that search platform. The association between hot words and work orders is studied so that customer service work orders can be traced to their source and tracked by hot-topic vocabulary.
To extract hot words accurately and efficiently, a dedicated lexicon for grid-specific power supply services needs to be generated.
Summary of the invention
The object of the invention is to overcome the above problems in the prior art and to provide a method for generating a dedicated lexicon for power supply services that facilitates accurate and efficient hot-word extraction.
To achieve the above technical purpose and technical effect, the invention is realized through the following technical solution:
A method for generating a dedicated lexicon for power supply services, comprising the following steps:
S1: Text preprocessing: the input text is segmented by punctuation;
S2: English and digit extraction: words consisting of English letters, digits, or a mix of both are extracted as segments;
S3: Chinese numeral and measure word extraction: numeral-measure-word compounds are extracted using a measure-word dictionary and Chinese numerals;
S4: Proper noun extraction: proper nouns in the text are extracted by matching against power system special terms;
S5: Secondary segmentation extraction is performed on the content extracted in steps S1 to S4, and an extraction threshold and weights are set;
S6: The content extracted by secondary segmentation is matched and weight-marked, and the corresponding segmented words are extracted by model;
S7: Ambiguity processing is performed on the segmented words extracted in step S6, forming the final segmentation result;
S8: Grid-exclusive nouns are added manually;
S9: The final segmentation result of step S7 and the grid-exclusive nouns of step S8 are combined to form the dedicated power supply service lexicon.
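Steps S1 to S3 can be illustrated with a minimal Python sketch. The patent does not specify the punctuation set, the measure-word dictionary, or the exact matching rules, so the regular expressions, the `MEASURE_WORDS` tuple, and all function names below are assumptions chosen for illustration only.

```python
import re

# S1: punctuation marks treated as short-sentence boundaries (assumed set).
SENTENCE_BREAKS = re.compile(r"[，。！？；：、,.!?;:\s]+")
# S2: runs of ASCII letters and digits, covering English words,
# continuous numeric codes, and mixed alphanumeric tokens.
ALNUM_TOKEN = re.compile(r"[A-Za-z0-9]+")
# S3: Chinese numerals followed by one measure word.
# MEASURE_WORDS is a tiny hypothetical stand-in for a measure-word dictionary.
CN_NUMERALS = "零一二三四五六七八九十百千万两"
MEASURE_WORDS = ("个", "条", "台", "次", "户")
NUMERAL_MEASURE = re.compile(
    "[" + CN_NUMERALS + "]+(?:" + "|".join(MEASURE_WORDS) + ")"
)


def split_short_sentences(text):
    """S1: split a passage into short sentences at punctuation marks."""
    return [part for part in SENTENCE_BREAKS.split(text) if part]


def extract_alnum_tokens(sentence):
    """S2: extract English, numeric, and mixed alphanumeric tokens."""
    return ALNUM_TOKEN.findall(sentence)


def extract_numeral_measure(sentence):
    """S3: extract Chinese numeral + measure word compounds."""
    return NUMERAL_MEASURE.findall(sentence)


if __name__ == "__main__":
    # Hypothetical work order text, not taken from the patent.
    sample = "用户报修D3000终端，工单号20162010332。现场有三台变压器，五条线路停电！"
    for short_sentence in split_short_sentences(sample):
        print(short_sentence,
              extract_alnum_tokens(short_sentence),
              extract_numeral_measure(short_sentence))
```

Regular expressions are used here only because they keep the sketch short; the patent itself refers to "existing segmentation algorithms" for S2 without naming one.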
Further, the power system special terms in step S4 include department division, pole and tower arrangement, private type, detailed classification, user classification, employee names, and device names.
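The term matching of step S4 can be sketched as a dictionary lookup over each short sentence. The patent only states that a word identical to a power system special term is extracted; the longest-match-first scan, the `SPECIAL_TERMS` entries, and the function name below are assumptions for illustration.

```python
# Hypothetical special-term dictionary: term -> category.
# The categories follow the list in the text; the entries are invented examples.
SPECIAL_TERMS = {
    "营销部": "department division",
    "居民用户": "user classification",
    "配电变压器": "device name",
}


def extract_special_terms(sentence, terms):
    """Scan left to right and extract every substring that equals a known term.

    At each position the longest possible term is tried first, so a longer
    term is preferred over a shorter term that shares the same prefix.
    """
    max_len = max(map(len, terms), default=0)
    found = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if candidate in terms:
                found.append((candidate, terms[candidate]))
                i += size  # continue after the matched term
                break
        else:
            i += 1  # no term starts here; move one character forward
    return found


if __name__ == "__main__":
    print(extract_special_terms("营销部检查居民用户的配电变压器", SPECIAL_TERMS))
```

Keeping the category alongside each term makes it straightforward to report which kind of special term (department, device, and so on) was matched.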
Further, after the dedicated power supply service lexicon is formed in step S9, association statistics are compiled on part of speech and word frequency.
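The part-of-speech and word-frequency statistics can be sketched with a simple counter. The patent does not describe the data layout, so the lexicon format (word mapped to a part-of-speech tag), the tag values, and the function name are assumptions.

```python
from collections import Counter


def count_lexicon_frequencies(segmented_docs, lexicon):
    """Count how often each lexicon word occurs in already-segmented texts.

    segmented_docs: iterable of token lists (one list per text).
    lexicon: dict mapping word -> part-of-speech tag (assumed layout).
    Returns a dict mapping word -> (part of speech, frequency).
    """
    counts = Counter()
    for tokens in segmented_docs:
        counts.update(token for token in tokens if token in lexicon)
    return {word: (pos, counts[word]) for word, pos in lexicon.items()}


if __name__ == "__main__":
    # Hypothetical lexicon and segmented work orders.
    lexicon = {"停电": "v", "变压器": "n"}
    docs = [["变压器", "故障", "停电"], ["停电", "投诉"]]
    print(count_lexicon_frequencies(docs, lexicon))
```

Words that never occur keep a frequency of zero, which makes it easy to spot lexicon entries that are unused in a given batch of work orders.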
The beneficial effects of the invention are as follows:
The invention uses domain-specific fields of the power system as lexicon phrases for segmentation, which screens proper nouns more effectively and facilitates entry segmentation and hot-word extraction. Several hot-word extraction methods are combined, with association matching and weight marking, to obtain the final segmentation lexicon. The method better fits the hot-word extraction needs of power supply services and improves the accuracy and efficiency of hot-word extraction.
Brief description of the drawings
The accompanying drawing described here provides a further understanding of the invention and forms part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic flow chart of the invention.
Embodiment
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawing. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative work fall within the scope of protection of the invention.
As shown in Fig. 1, a method for generating a dedicated lexicon for power supply services comprises the following steps:
S1: Text preprocessing: the input text is segmented by punctuation, i.e., a whole passage is split into short sentences at its punctuation marks;
S2: English and digit extraction: English and digits are broadly distinguished and, using an existing segmentation algorithm, words consisting of English letters, digits, or a mix of both are extracted. This covers English words, continuous numeric codes, and mixed alphanumeric words, such as Book, 20162010332, and D3000. The sample content is thereby further refined and segmented, which facilitates subsequent segmentation processing;
S3: Chinese numeral and measure word extraction: numeral-measure-word compounds are extracted using a measure-word dictionary and Chinese numerals, i.e., numerals and measure words such as "one" and "five" are extracted from a short sentence. The sample content is thereby further refined and segmented, which facilitates subsequent segmentation processing;
S4: Proper noun extraction: matching against power system special terms is used; if a word in a segmented short sentence is identical to a power system special term, it is extracted. The power system special terms include department division, pole and tower arrangement, private type, detailed classification, user classification, employee names, and device names;
S5: Secondary segmentation extraction is performed on the content extracted in steps S1 to S4, and an extraction threshold and weights are set;
The secondary segmentation extraction includes the following three processing modes:
1. Using a forward-iteration, longest-string-matching algorithm, the content extracted in steps S1 to S4 is further segmented and extracted; an extraction threshold and weights need to be set;
2. Using an existing specialized dictionary and a segmentation extraction algorithm, the content extracted in steps S1 to S4 is segmented and extracted; an extraction threshold and weights need to be set;
3. Using machine learning methods such as RNN and LSTM, the content extracted in steps S1 to S4 is segmented and extracted;
S6: The content extracted by secondary segmentation is matched and weight-marked, and the corresponding segmented words are extracted by the corresponding model;
S7: Ambiguity processing is performed on the segmented words extracted in step S6. For example, "I do not like" is ambiguously segmented as "I | like | do not like"; after ambiguity processing it becomes "I | do not like". This forms the final segmentation result;
S8: The latest grid-exclusive nouns, such as "Palm Electric Power" and "Electric E-Treasure", are added manually;
S9: The final segmentation result of step S7 and the grid-exclusive nouns of step S8 are combined to form the dedicated power supply service lexicon. After the lexicon is formed, association statistics are compiled on part of speech and word frequency, i.e., the frequency of lexicon words in the relevant text content is counted.
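The ambiguity processing of step S7 and the merge of step S9 can be illustrated with a short Python sketch. The patent does not state how ambiguities are resolved, so the rule used here (prefer higher weight, then longer words, and drop any candidate that overlaps an already accepted one) is an assumption, as are the `Candidate` type and the function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    word: str
    start: int     # index of the first character in the short sentence
    end: int       # index one past the last character
    weight: float  # weight assigned during the S6 weight-marking step


def resolve_ambiguity(candidates):
    """Keep a non-overlapping subset of weighted candidates (assumed rule).

    Candidates are ranked by weight, then by length, and accepted greedily
    as long as they do not overlap a candidate that was accepted earlier.
    """
    ranked = sorted(
        candidates,
        key=lambda c: (-c.weight, -(c.end - c.start), c.start),
    )
    chosen = []
    for cand in ranked:
        if all(cand.end <= kept.start or cand.start >= kept.end for kept in chosen):
            chosen.append(cand)
    return [c.word for c in sorted(chosen, key=lambda c: c.start)]


def build_lexicon(final_words, manual_terms):
    """S9: merge the final segmentation result with manually added terms."""
    return sorted(set(final_words) | set(manual_terms))


if __name__ == "__main__":
    # Overlapping candidates for the sentence "我不喜欢" ("I do not like").
    candidates = [
        Candidate("我", 0, 1, 1.0),
        Candidate("喜欢", 2, 4, 1.0),
        Candidate("不喜欢", 1, 4, 1.0),
    ]
    final_words = resolve_ambiguity(candidates)
    # "示例术语" is a hypothetical manually added term.
    print(build_lexicon(final_words, ["示例术语"]))
```

With equal weights, the longer candidate "不喜欢" is accepted first and the overlapping "喜欢" is discarded, which mirrors the "I | do not like" example in step S7.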
The basic principles, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the invention is not limited to the above embodiments; the embodiments and the description merely illustrate the principle of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention.
Claims (3)
- 1. A method for generating a dedicated lexicon for power supply services, characterized in that the method comprises the following steps: S1: text preprocessing: the input text is segmented by punctuation; S2: English and digit extraction: words consisting of English letters, digits, or a mix of both are extracted as segments; S3: Chinese numeral and measure word extraction: numeral-measure-word compounds are extracted using a measure-word dictionary and Chinese numerals; S4: proper noun extraction: proper nouns in the text are extracted by matching against power system special terms; S5: secondary segmentation extraction is performed on the content extracted in steps S1 to S4, and an extraction threshold and weights are set; S6: the content extracted by secondary segmentation is matched and weight-marked, and the corresponding segmented words are extracted by model; S7: ambiguity processing is performed on the segmented words extracted in step S6, forming the final segmentation result; S8: grid-exclusive nouns are added manually; S9: the final segmentation result of step S7 and the grid-exclusive nouns of step S8 are combined to form the dedicated power supply service lexicon.
- 2. The method for generating a dedicated lexicon for power supply services according to claim 1, characterized in that the power system special terms in step S4 include department division, pole and tower arrangement, private type, detailed classification, user classification, employee names, and device names.
- 3. The method for generating a dedicated lexicon for power supply services according to claim 1, characterized in that, after the dedicated power supply service lexicon is formed in step S9, association statistics are compiled on part of speech and word frequency.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710797109.6A CN107622049A (en) | 2017-09-06 | 2017-09-06 | A kind of special word stock generating method of electric service |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710797109.6A CN107622049A (en) | 2017-09-06 | 2017-09-06 | A kind of special word stock generating method of electric service |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107622049A (en) | 2018-01-23 |
Family
ID=61089325
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710797109.6A Pending CN107622049A (en) | 2017-09-06 | 2017-09-06 | A kind of special word stock generating method of electric service |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107622049A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104899190A (en) * | 2015-06-04 | 2015-09-09 | 百度在线网络技术(北京)有限公司 | Generation method and device for word segmentation dictionary and word segmentation processing method and device |
CN105005556A (en) * | 2015-07-29 | 2015-10-28 | 成都理工大学 | Index keyword extraction method and system based on big geological data |
CN106250372A (en) * | 2016-08-17 | 2016-12-21 | 国网上海市电力公司 | A kind of Chinese electric power data text mining method for power system |
CN106447346A (en) * | 2016-08-29 | 2017-02-22 | 北京中电普华信息技术有限公司 | Method and system for construction of intelligent electric power customer service system |
CN106951410A (en) * | 2017-03-21 | 2017-07-14 | 北京三快在线科技有限公司 | Generation method, device and the electronic equipment of dictionary |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109388806A (en) * | 2018-10-26 | 2019-02-26 | 北京布本智能科技有限公司 | A kind of Chinese word cutting method based on deep learning and forgetting algorithm |
CN109388806B (en) * | 2018-10-26 | 2023-06-27 | 北京布本智能科技有限公司 | Chinese word segmentation method based on deep learning and forgetting algorithm |
CN111783438A (en) * | 2020-05-22 | 2020-10-16 | 贵州电网有限责任公司 | Hot word detection method for realizing work order analysis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104899304B (en) | Name entity recognition method and device | |
CN110598203B (en) | Method and device for extracting entity information of military design document combined with dictionary | |
CN106919673B (en) | Text mood analysis system based on deep learning | |
CN104391942B (en) | Short essay eigen extended method based on semantic collection of illustrative plates | |
CN107168945B (en) | Bidirectional cyclic neural network fine-grained opinion mining method integrating multiple features | |
CN107168955B (en) | Utilize the Chinese word cutting method of the word insertion and neural network of word-based context | |
CN110807328B (en) | Named entity identification method and system for legal document multi-strategy fusion | |
CN106570179B (en) | A kind of kernel entity recognition methods and device towards evaluation property text | |
CN103336766B (en) | Short text garbage identification and modeling method and device | |
CN106529804A (en) | Client complaint early-warning monitoring analyzing method based on text mining technology | |
CN107908716A (en) | 95598 work order text mining method and apparatus of word-based vector model | |
CN106530127A (en) | Complaint early warning and monitoring analysis system based on text mining | |
WO2019228466A1 (en) | Named entity recognition method, device and apparatus, and storage medium | |
CN111274814B (en) | Novel semi-supervised text entity information extraction method | |
CN101127042A (en) | Sensibility classification method based on language model | |
CN112101028A (en) | Multi-feature bidirectional gating field expert entity extraction method and system | |
CN109002473A (en) | A kind of sentiment analysis method based on term vector and part of speech | |
CN107562726A (en) | A kind of electric service search engine based on hot word | |
CN112069312B (en) | Text classification method based on entity recognition and electronic device | |
CN107145573A (en) | The problem of artificial intelligence customer service robot, answers method and system | |
CN115858758A (en) | Intelligent customer service knowledge graph system with multiple unstructured data identification | |
CN111259153A (en) | Attribute-level emotion analysis method of complete attention mechanism | |
CN113360582B (en) | Relation classification method and system based on BERT model fusion multi-entity information | |
CN101876975A (en) | Identification method of Chinese place name | |
CN109657039A (en) | A kind of track record information extraction method based on the double-deck BiLSTM-CRF |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180123 |