CN109726283A - Method for identifying electric power service customer demands based on text similarity measurement - Google Patents

Method for identifying electric power service customer demands based on text similarity measurement

Info

Publication number
CN109726283A
Authority
CN
China
Prior art keywords
text
work order
client
demand
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811463322.4A
Other languages
Chinese (zh)
Inventor
卜晓阳
王宗伟
金鹏
赵郭燚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Co Ltd Customer Service Center
Original Assignee
State Grid Co Ltd Customer Service Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Co Ltd Customer Service Center
Priority to CN201811463322.4A
Publication of CN109726283A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method for identifying electric power service customer demands based on text similarity measurement. The method proceeds as follows. Establish a customer demand hotspot system table. Text preprocessing: perform word segmentation and text quantification on the text in each work order, split long text into segments, and remove stop words; stop words are words that have no effect on text analysis, namely modal particles, frequent but uninformative words, and punctuation marks. Automated text classification: finally, based on the identified topics and their corresponding dictionaries, apply a classification algorithm to classify the full set of customer service work orders automatically. The advantage of the invention is that cosine similarity can automatically and accurately identify multiple topics in a text; the invention therefore combines text similarity measurement with work order data to accurately identify all of a customer's demands in each work order.

Description

Method for identifying electric power service customer demands based on text similarity measurement
Technical field:
The present invention relates to a method specially suited to electric power public service departments, and in particular to a method for identifying electric power service customer demands based on text similarity measurement.
Background art:
With the rapid development of information technologies such as Internet Plus, big data, and cloud computing, most information has moved from paper carriers to electronic carriers, and much of this information is unstructured or semi-structured text. How to effectively manage, mine, and analyze the information contained in massive unstructured data has become another challenge in the big data field. Among unstructured data, text data plays an important role. For an enterprise that holds a large amount of text data, how effectively it uses this data resource determines its future development. For the data of a power industry customer service center, how to process work order data so as to accurately identify the customer demands in each work order, and even to mine implicit demands and promptly discover sudden surges of new demands, is critical to improving service quality and customer satisfaction.
There are two common methods for mining the information in text data using text similarity. One is the SimHash algorithm. The other is the cosine similarity algorithm, also called cosine distance, which uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals. The closer the cosine value is to 1, the closer the angle is to 0 degrees, that is, the more similar the two vectors are. This approach reduces the processing of text content to vector operations in a vector space, and it has the advantages of accurate results and suitability for short text.
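The cosine similarity measure described above can be illustrated with a minimal Python sketch. It assumes each text has already been segmented into a list of tokens and represents a text as a term-frequency vector; the function name `cosine_similarity` and the sample tokens are hypothetical and do not come from the patent.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two term-frequency vectors."""
    vec_a, vec_b = Counter(tokens_a), Counter(tokens_b)
    # Dot product over the terms that the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction; treat it as dissimilar
    return dot / (norm_a * norm_b)


# Hypothetical, already-segmented work order texts.
a = ["multi-household outage", "report", "verify"]
b = ["multi-household outage", "report", "request"]
print(cosine_similarity(a, a))  # identical texts give a value of about 1
print(cosine_similarity(a, b))  # two of three terms shared gives about 2/3
```

A value near 1 means the two token lists point in nearly the same direction in the vector space, and a value of 0 means they share no terms.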
In the work orders of power customers, the demand recorded in a work order is not always a single demand, so accurately identifying all of the demands in every work order is especially important. In text classification based on machine learning, the classification algorithm can identify only a single demand and cannot handle a work order that contains several demands. Work order records of customer demands are transcribed by customer service staff: the text is long and is not well condensed or standardized, a single work order may contain several demands, and the same demand may be recorded in different ways.
Summary of the invention:
The present invention identifies demands in the text data of electric power service customer work orders, mainly on the basis of text similarity measurement. The cosine similarity algorithm is used to mine and analyze the preprocessed text data and identify all of a customer's demands in a work order, so that each customer's electricity-related problems can be located accurately. The specific technical solution is as follows:
A method for identifying electric power service customer demands based on text similarity measurement, comprising the following steps:
Step 0: establish a customer demand hotspot system table: randomly select N samples from the full sample set as training samples and test samples, identify the customer demands contained in the work orders using the cosine similarity algorithm, define the business meaning of each topic by combining professional knowledge and logic, and form the customer demand hotspot system table;
Step 1: text preprocessing: perform word segmentation and text quantification on the text in the work orders, split long text into segments, and remove stop words; stop words are words that have no effect on text analysis, namely modal particles, frequent but uninformative words, and punctuation marks;
Step 2: automated text classification: finally, based on the identified topics and their corresponding dictionaries, apply a classification algorithm to classify the full set of customer service work orders automatically.
Preferably, in Step 0, N is 10000.
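The text preprocessing of Step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the stop-word list is hypothetical (the patent does not publish one), long text is cut at sentence punctuation as one possible splitting rule, and whitespace splitting stands in for a real word segmentation tool.

```python
import re

# Hypothetical stop-word list; the patent does not publish its own.
STOP_WORDS = {"the", "a", "please", "ah", "oh", "very"}


def split_clauses(text):
    """Cut a long work order text into clauses at sentence punctuation."""
    parts = re.split(r"[,.;!?，。；！？]", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(clause):
    """Stand-in for a real word segmenter: lowercase and split on whitespace."""
    return clause.lower().split()


def preprocess(text):
    """Return one list of stop-word-free tokens per clause."""
    result = []
    for clause in split_clauses(text):
        tokens = [t for t in tokenize(clause) if t not in STOP_WORDS]
        if tokens:
            result.append(tokens)
    return result


order = "The customer reports a power outage, please verify the meter reading."
print(preprocess(order))
```

Punctuation is discarded by the split itself, so only modal particles and other uninformative words need to appear in the stop-word list.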
The advantages of the present invention are as follows:
(1) The method applies the cosine similarity algorithm to the full set of customer service work orders to accurately identify customer demands, so that text data is fully mined and put to use in actual work.
(2) Cosine similarity can automatically and accurately identify multiple topics in a text; the invention therefore combines text similarity measurement with work order data to accurately identify all of a customer's demands in each work order.
Specific embodiment:
Embodiment:
A method for identifying electric power service customer demands based on text similarity measurement, comprising the following steps:
Step 0: establish a customer demand hotspot system table: randomly select 10,000 samples from the full sample set as training samples and test samples, identify the customer demands contained in the work orders using the cosine similarity algorithm, define the business meaning of each topic by combining professional knowledge and logic, and form the customer demand hotspot system table;
Step 1: perform word segmentation and text quantification on the text in the work orders, mainly by splitting long text according to certain rules and removing stop words. Stop words are words that have no effect on text analysis, such as modal particles, frequent but uninformative words, and punctuation marks. Text preprocessing yields a professional dictionary and a synonym dictionary, which improve the accuracy and validity of word segmentation on new data. The word segmentation tool is invoked through a jar package bundled with this project: a Java program developed in the package calls the ICTCLAS word segmentation tool. To ensure that the segmentation results are accurate and valid, a power industry professional dictionary and a synonym dictionary are added. For example, the professional terms 'three-phase imbalance', 'three-phase load', and 'three-phase balance' are uniformly defined as the synonym 'three-phase problem', and the terms 'the time should not be so long', 'overlong time', 'time span is long', 'time is too long', and 'time is long' are uniformly defined as the synonym 'overlong time'. After refinement, the final dictionaries contain 2835 power industry professional terms and 1305 synonyms;
Step 2: automated text classification: finally, based on the identified topics and their corresponding dictionaries, apply a classification algorithm to classify the full set of customer service work orders automatically. For example, the dictionary corresponding to the multi-household outage demand topic contains 'handle', 'cause', 'telephone', 'multi-household outage', 'report', 'verify', 'incoming call', and 'request'; the dictionary can be enriched with other work orders that contain the multi-household outage demand topic, so that each demand topic ends up with its own dictionary. The classification algorithm is then applied to classify the full set of customer service work orders automatically, and when new work order data is generated, the same classification algorithm can classify the new work orders to identify customer demands.
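The synonym normalization of Step 1 and the dictionary-based classification of Step 2 can be combined in a short Python sketch. The patent does not name its classification algorithm, so this sketch makes an assumption: a work order is assigned every topic whose dictionary reaches a cosine similarity threshold against the work order's tokens. The threshold value, the function names, and the second topic dictionary are hypothetical; the synonym pairs and the words in the first dictionary are taken from the embodiment.

```python
import math
from collections import Counter

# Synonym groups quoted from the embodiment: each variant maps to one canonical term.
SYNONYMS = {
    "three-phase imbalance": "three-phase problem",
    "three-phase load": "three-phase problem",
    "three-phase balance": "three-phase problem",
    "time is too long": "overlong time",
    "time is long": "overlong time",
}

# Topic dictionaries. The first uses a subset of the words listed in the
# embodiment; the second is a hypothetical example added for illustration.
TOPIC_DICTIONARIES = {
    "multi-household outage": [
        "multi-household outage", "report", "verify", "incoming call", "handle",
    ],
    "three-phase problem": ["three-phase problem", "voltage", "report"],
}


def normalize(tokens):
    """Replace each synonym variant with its canonical term."""
    return [SYNONYMS.get(t, t) for t in tokens]


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two term-frequency vectors."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0


def identify_demands(tokens, threshold=0.3):
    """Return every topic whose dictionary is similar enough to the work order.

    The threshold of 0.3 is an assumed value, not one given in the patent.
    """
    tokens = normalize(tokens)
    return sorted(
        topic
        for topic, words in TOPIC_DICTIONARIES.items()
        if cosine(tokens, words) >= threshold
    )


# Hypothetical segmented work order that mentions two demands.
order = ["customer", "incoming call", "report",
         "multi-household outage", "three-phase imbalance"]
print(identify_demands(order))
```

Because each topic is scored independently, a single work order can receive several topics, which matches the goal of identifying all demands in a work order rather than only one.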

Claims (2)

1. A method for identifying electric power service customer demands based on text similarity measurement, characterized by comprising the following steps:
Step 0: establish a customer demand hotspot system table: randomly select N samples from the full sample set as training samples and test samples, identify the customer demands contained in the work orders using the cosine similarity algorithm, define the business meaning of each topic by combining professional knowledge and logic, and form the customer demand hotspot system table;
Step 1: text preprocessing: perform word segmentation and text quantification on the text in the work orders, split long text into segments, and remove stop words; stop words are words that have no effect on text analysis, namely modal particles, frequent but uninformative words, and punctuation marks;
Step 2: automated text classification: finally, based on the identified topics and their corresponding dictionaries, apply a classification algorithm to classify the full set of customer service work orders automatically.
2. The method for identifying electric power service customer demands based on text similarity measurement according to claim 1, characterized in that, in Step 0, N is 10000.
CN201811463322.4A 2018-12-03 2018-12-03 Method for identifying electric power service customer demands based on text similarity measurement Pending CN109726283A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811463322.4A CN109726283A (en) 2018-12-03 2018-12-03 Method for identifying electric power service customer demands based on text similarity measurement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811463322.4A CN109726283A (en) 2018-12-03 2018-12-03 Method for identifying electric power service customer demands based on text similarity measurement

Publications (1)

Publication Number Publication Date
CN109726283A true CN109726283A (en) 2019-05-07

Family

ID=66295531

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811463322.4A Pending CN109726283A (en) 2018-12-03 2018-12-03 Method for identifying electric power service customer demands based on text similarity measurement

Country Status (1)

Country Link
CN (1) CN109726283A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112116173A (en) * 2019-06-19 2020-12-22 中国石油化工股份有限公司 Invalid operation reduction system
CN112667812A (en) * 2020-12-30 2021-04-16 云南电网有限责任公司 Method for identifying power supply service customer electricity quantity and electricity charge demand

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107861942A (en) * 2017-10-11 2018-03-30 国网浙江省电力公司电力科学研究院 Method for identifying suspected electric power complaint work orders based on deep learning

Similar Documents

Publication Publication Date Title
WO2018000269A1 (en) Data annotation method and system based on data mining and crowdsourcing
CN109389418A (en) Electric service client's demand recognition methods based on LDA model
CN109726283A (en) Method for identifying electric power service customer demands based on text similarity measurement
CN107766560B (en) Method and system for evaluating customer service flow
CN107516370A (en) The automatic test and evaluation method of a kind of bank slip recognition
Müller et al. Comparison of preprocessing approaches for text data in digital shop floor management systems
US11741318B2 (en) Open information extraction from low resource languages
CN111221873A (en) Inter-enterprise homonym identification method and system based on associated network
CN107704529A (en) The recognition methods of information uniqueness, application server, system and storage medium
Pham et al. A hybrid approach to vietnamese word segmentation using part of speech tags
CN110909162B (en) Text quality inspection method, storage medium and electronic equipment
CN109388804A (en) Report core views extracting method and device are ground using the security of deep learning model
CN110866394A (en) Company name identification method and device, computer equipment and readable storage medium
CN110826991B (en) Electronic receipt processing system and method
CN107886233B (en) Service quality evaluation method and system for customer service
CN116340172A (en) Data collection method and device based on test scene and test case detection method
CN115618264A (en) Method, apparatus, device and medium for topic classification of data assets
CN113778875B (en) System test defect classification method, device, equipment and storage medium
CN112328951B (en) Processing method of experimental data of analysis sample
CN108255887B (en) Method and device for verifying industry text
CN110083807B (en) Contract modification influence automatic prediction method, device, medium and electronic equipment
CN114564391A (en) Method and device for determining test case, storage medium and electronic equipment
CN110826330B (en) Name recognition method and device, computer equipment and readable storage medium
Park et al. Identify the failure mode of weapon system (or equipment) using machine learning
CN106779396A (en) A kind of system for recognizing business standing degree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190507
