CN110472017A - Method and system for sales script analysis and topic point identification and matching - Google Patents

Info

Publication number
CN110472017A
Authority
CN
China
Prior art keywords
word
client
sales script
topic point
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910771751.6A
Other languages
Chinese (zh)
Inventor
杨钊
姜磊
赖招展
叶彩园
杨嘉文
朱振航
何慧
沈广盈
屈吕杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Brilliant Data Analytics Inc
Original Assignee
Brilliant Data Analytics Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Brilliant Data Analytics Inc
Priority to CN201910771751.6A
Publication of CN110472017A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Databases & Information Systems (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to natural language processing technology and provides a method and system for sales script analysis and topic point identification and matching. The method first establishes a marketing script strategy analysis label library, builds a customer label system and an agent label system for customers and agents respectively, and accurately identifies customer intent. It then analyzes the marketing script strategy, extracts scripts automatically, and recommends scripts automatically by combining customer labels with a library of excellent scripts. Finally, it generates a marketing script analysis report according to the agents' use of the excellent scripts. The invention covers the path from script analysis to topic point identification, analyzes customer topic points fully automatically and supplies excellent scripts, replacing the previous practice of extracting excellent scripts manually and saving substantial manpower and resources.

Description

Method and system for sales script analysis and topic point identification and matching
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a method and system for sales script analysis and topic point identification and matching.
Background art
Outbound telemarketing is a means by which an enterprise contacts customers, recommends products to them, and guides them to purchase; an outbound telemarketing system is a platform that assists customer service agents in carrying out outbound telemarketing. However, most existing outbound telemarketing systems offer only simple functions: one-click dialing, automatic recording, and manual recording of customers who show purchase intent. This leads to the following problems:
(1) Session recordings are used only to evaluate employees; the information hidden in the sessions is not mined, which wastes resources.
(2) Manually recording customers who show purchase intent places high demands on the experience and ability of employees.
As speech-to-text technology and natural language processing technology mature, historical session recordings can be converted to text and analyzed to extract the information hidden in them, reducing the demands that outbound telemarketing places on employee experience and ability.
Excellent script extraction, that is, extracting a set of commonly used marketing routines from the recordings of historically successful outbound telemarketing sessions, is an effective means of improving the outbound telemarketing success rate. At present, excellent scripts in outbound telemarketing are extracted in a rather limited way: customer service staff who market successfully usually summarize them directly from experience, or keywords are simply extracted. Neither approach can recommend excellent scripts promptly as the customer's topic changes.
Summary of the invention
In view of the above problems, the present invention provides a method and system for sales script analysis and topic point identification and matching. Covering the path from script analysis to topic point identification, it analyzes customer topic points fully automatically and supplies excellent scripts, replacing the previous practice of extracting excellent scripts manually and saving substantial manpower and resources.
The method of the present invention is realized by the following technical solution: a method for sales script analysis and topic point identification and matching, comprising the following steps:
a. establishing a marketing script strategy analysis label library, establishing a customer label system and an agent label system for customers and agents respectively, and accurately identifying customer intent;
b. analyzing the marketing script strategy, completing automatic script extraction, and completing automatic script recommendation by combining customer labels with the excellent script library;
c. generating a marketing script analysis report according to the agents' use of the excellent scripts.
In a preferred embodiment, the customer label system in step a is established as follows: the session text is explored in depth to gain insight into the session data, understand how customer intent changes, and accurately identify customer intent; combined with the existing structured customer information, an analysis window is established, the basic-information labels and purchase-intent labels that customers may carry are discovered, the reasonableness of the labels is assessed, and the customer label system is constructed.
Preferably, the accurate identification of customer intent in step a is realized by a topic point word clustering method, comprising the following steps:
(1) extracting session text from the database for the specified tables, agent levels, and target attainment rates as training sample data;
(2) analyzing the training sample data to discover the users' industry-specific terms;
(3) setting a word window and computing the word co-occurrence matrix over the training sample data to obtain a word frequency information table;
(4) analyzing the word frequency information table, extracting basic words, and screening out the keywords among them;
(5) clustering the screened keywords by means of the word co-occurrence matrix; classifying the topic points according to the clustering results in combination with business knowledge and creating a topic point rule table; and sorting out the keywords corresponding to each topic point in combination with business knowledge;
(6) optimizing the topic point rule table: on the basis of the existing topic point rule table, iteratively searching for other keywords under the same topic point, and calculating the weight of each keyword after the iteration is complete.
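The text does not spell out how the finished topic point rule table is applied to a new utterance. One plausible reading, shown here only as a minimal sketch, is to sum the weights of the rule-table keywords found in the segmented utterance and pick the highest-scoring topic point. The table contents, the function name `match_topic`, and the scoring rule are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical rule table: topic point -> {keyword: weight}.
# Real tables would come from steps (5) and (6); these values are invented.
TOPIC_RULES: Dict[str, Dict[str, float]] = {
    "price": {"price": 1.0, "discount": 0.8, "expensive": 0.6},
    "network": {"signal": 1.0, "speed": 0.7},
}

def match_topic(tokens: List[str],
                rules: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Return the topic point whose weighted keywords best cover the tokens."""
    scores = {
        topic: sum(weights.get(tok, 0.0) for tok in tokens)
        for topic, weights in rules.items()
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)
    # No keyword hit at all means no topic point is recognized.
    return best if scores[best] > 0 else None
```

For instance, `match_topic(["the", "signal", "is", "bad"], TOPIC_RULES)` selects the `network` topic point, while an utterance containing none of the keywords yields `None`.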
Preferably, the automatic script extraction process in step b comprises:
periodically obtaining the calls of excellent agents, labeling the topic points of those calls that resulted in a closed deal, and storing the call id and the identified topic point information in the database as training data for topic point recommendation; and
tracking customer information: at a scheduled time each day, obtaining the previous day's call text data, identifying the customer labels and topic point information present in each of the customer's calls, and storing the identification results in the database.
The topic point information is identified using a keyword extraction algorithm, comprising the following steps:
(1) Chinese word segmentation: segmenting the customer's text into complete words;
(2) stop word filtering: removing stop words according to a stop word list;
(3) part-of-speech tagging: determining the grammatical category of each word in a given sentence and tagging its part of speech; after tagging, only words of the specified parts of speech are retained as preliminary candidate keywords;
(4) setting a window size and calculating the co-occurrence between words;
(5) constructing a candidate keyword graph G = (V, E) with the candidate keywords as nodes and the co-occurrence degree as edges, where V is the node set, composed of the preliminary candidate keywords, and E is the set of edges between nodes; the edge between any two nodes is constructed from the co-occurrence calculated in step (4), and an edge exists between two nodes if and only if their corresponding words co-occur within a window of length K, where K denotes the window size; the weight of each node is propagated iteratively until convergence, updating the weights of the keyword graph;
(6) ranking and screening: after the keyword graph weights have been updated, sorting the weights of all nodes in descending order to obtain the T most important words as the final candidate keywords;
(7) combining key phrases: marking the T most important words in the original session text and, where they form adjacent phrases, combining them into multi-word keywords.
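Step (5) propagates node weights iteratively but the text here does not reproduce the update rule. For orientation only, the standard TextRank update for a weighted graph is shown below; whether the patent uses exactly this form is an assumption. Here WS(V_i) is the score of node V_i, d is a damping factor (commonly set to 0.85), and w_{ji} is the weight of the edge from V_j to V_i, which in this setting would be the co-occurrence count. For an undirected co-occurrence graph, In(V_i) and Out(V_i) are both simply the neighbors of V_i.

```latex
WS(V_i) = (1 - d) + d \sum_{V_j \in \mathrm{In}(V_i)}
          \frac{w_{ji}}{\sum_{V_k \in \mathrm{Out}(V_j)} w_{jk}} \, WS(V_j)
```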
The system of the present invention is realized by the following technical solution: a system for sales script analysis and topic point identification and matching, comprising:
a marketing script strategy analysis label library, used to establish a customer label system and an agent label system for customers and agents respectively, and to accurately identify customer intent;
a marketing script strategy analysis platform, used to analyze the marketing script strategy, complete automatic script extraction, complete automatic script recommendation by combining customer labels with the excellent script library, and present script statistics;
a marketing script analysis report generation module, used to generate a marketing script analysis report according to the agents' use of the excellent scripts.
The customer label system is established as follows: the session text is studied in depth to gain insight into the session data, understand how customer intent changes, and accurately identify customer intent; combined with the existing structured customer information, an analysis window is established, the basic-information labels and purchase-intent labels that customers may carry are discovered, the reasonableness of the labels is assessed, and the customer label system is constructed.
The accurate identification of customer intent is realized by the topic point word clustering method, comprising the steps of:
(1) extracting session text from the database for the specified tables, agent levels, and target attainment rates as training sample data;
(2) analyzing the training sample data to discover the users' industry-specific terms;
(3) setting a word window and computing the word co-occurrence matrix over the training sample data to obtain a word frequency information table;
(4) analyzing the word frequency information table, extracting basic words, and screening out the keywords among them;
(5) clustering the screened keywords by means of the word co-occurrence matrix; classifying the topic points according to the clustering results in combination with business knowledge and creating a topic point rule table; and sorting out the keywords corresponding to each topic point in combination with business knowledge;
(6) optimizing the topic point rule table: on the basis of the existing topic point rule table, iteratively searching for other keywords under the same topic point, and calculating the weight of each keyword after the iteration is complete.
The automatic script extraction process comprises:
periodically obtaining the calls of excellent agents, labeling the topic points of those calls that resulted in a closed deal, and storing the call id and the identified topic point information in the database as training data for topic point recommendation; and
tracking customer information: at a scheduled time each day, obtaining the previous day's call text data, identifying the customer labels and topic point information present in each of the customer's calls, and storing the identification results in the database.
The topic point information is identified using a keyword extraction algorithm, comprising the following steps:
(1) Chinese word segmentation: segmenting the customer's text into complete words;
(2) stop word filtering: removing stop words according to a stop word list;
(3) part-of-speech tagging: determining the grammatical category of each word in a given sentence and tagging its part of speech; after tagging, only words of the specified parts of speech are retained as preliminary candidate keywords;
(4) setting a window size and calculating the co-occurrence between words;
(5) constructing a candidate keyword graph G = (V, E) with the candidate keywords as nodes and the co-occurrence degree as edges, where V is the node set, composed of the preliminary candidate keywords, and E is the set of edges between nodes; the edge between any two nodes is constructed from the co-occurrence between words, and an edge exists between two nodes if and only if their corresponding words co-occur within a window of length K, where K denotes the window size; the weight of each node is propagated iteratively until convergence, updating the weights of the keyword graph;
(6) ranking and screening: after the keyword graph weights have been updated, sorting the weights of all nodes in descending order to obtain the T most important words as the final candidate keywords;
(7) combining key phrases: marking the T most important words in the original session text and, where they form adjacent phrases, combining them into multi-word keywords.
During a session with a customer, the present invention can automatically identify the customer's points of interest and recommend relevant marketing scripts, which not only enables new employees to take up their posts quickly and reduces training pressure and cost, but also effectively raises the product deal rate. Compared with the prior art, the technical effects achieved by the present invention mainly include:
1. Reducing the waste of outbound call resources caused by mismatched information. The traditional outbound telemarketing model is blind marketing: most of the customers called are not target customers, the customer resources are imprecise, and the success rate is low, which also dampens employee motivation. In excellent script extraction, supported by the script strategy analysis label library, the present invention can quickly analyze customer characteristics according to the customer's purchase-intent labels, capture customer intent, understand how customer intent changes, and accurately identify customer intent, thereby reducing the waste of outbound call resources.
2. Reducing the training cost of new employees. In conventional outbound telemarketing, newly recruited outbound agents need a certain amount of training before taking up their posts, covering professional knowledge, sales techniques, and so on, while experienced veteran agents need to extract and summarize excellent marketing scripts from historical closed-deal sessions. Both waste cost and time. The marketing script strategy analysis platform used in the present invention's excellent script extraction intelligently provides automatic script extraction and unified script management, reducing the time veteran agents spend extracting excellent scripts; the automatic script recommendation function reduces the cost of training new agents and lowers the demands on the professional skill and experience of new employees.
3. Mining customer demand and increasing product sales. In script analysis, the present invention uses mature text mining techniques (natural language processing, information extraction, information retrieval, knowledge discovery, etc.) combined with sales business knowledge to produce a set of marketing script strategies; combined with topic point identification, it predicts the topic points of the next session with the customer, so that agents can open the conversation from the corresponding topic point and uncover customer demand.
Brief description of the drawings
Fig. 1 is a flow chart of excellent script extraction according to the present invention;
Fig. 2 is a flow chart of topic point recommendation according to the present invention;
Fig. 3 is a flow chart of identifying customer intent using the topic point word clustering method.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to Figs. 1 and 2, in the present embodiment the method for sales script analysis and topic point identification and matching comprises the following steps:
S1. Text preprocessing
The session text data is preprocessed in three parts: Chinese word segmentation, text cleaning, and text denoising. Chinese word segmentation divides a continuous, undelimited Chinese character sequence into individual words according to a certain specification; it is the basis of text mining, and successfully segmenting a passage of input Chinese allows a computer to recognize the meaning of the sentences automatically. Text cleaning mainly unifies and standardizes the text to facilitate subsequent analysis and mining; it mainly comprises the following processing steps: case conversion, full-width/half-width conversion, traditional/simplified Chinese conversion, and special character conversion. Text denoising mainly removes stop words, that is, it deletes words that contribute no clear meaning to understanding the text, such as modal particles, adverbs, prepositions, conjunctions, and other high-frequency words that carry little meaning.
Preprocessing the session text effectively removes unimportant information and standardizes the text while retaining its semantic information, providing uniform input text for the analysis and mining models built subsequently.
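The cleaning and denoising steps of S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the text has already been segmented into tokens by some Chinese word segmenter, uses Unicode NFKC normalization as a stand-in for full-width/half-width conversion, and relies on a tiny hypothetical stop word list. Traditional/simplified conversion is omitted because it needs a mapping table that the standard library does not provide.

```python
import unicodedata
from typing import List

# Hypothetical stop word list; a real system would load a full list.
STOP_WORDS = {"的", "了", "吗", "呢", "啊", "和"}

def normalize(token: str) -> str:
    """Fold full-width characters to half-width (NFKC) and lower-case."""
    return unicodedata.normalize("NFKC", token).lower().strip()

def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop empty tokens and tokens found in the stop word list."""
    return [t for t in tokens if t and t not in STOP_WORDS]

def preprocess(tokens: List[str]) -> List[str]:
    """Text cleaning followed by text denoising on pre-segmented tokens."""
    return remove_stop_words([normalize(t) for t in tokens])
```

Given the tokens `["ＡＰＰ", "的", "流量", "套餐", "了"]`, the full-width letters are folded to `app` and the two particles are dropped, leaving three content tokens.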
S2. Script analysis and topic point identification and matching
Using the processed structured data, customers and agents are strictly partitioned, and the time window of the sessions to be analyzed and the definition of a successful script are determined. The session text is divided into two classes, "closed deal" and "no deal", and the analysis focuses on the content of the "closed deal" sessions. Excellent scripts are refined from the "closed deal" session text and then divided more finely at the customer and agent level and at the script content level, and second-level script labels are set for both levels. Features are extracted from the excellent scripts under each level of script label to obtain representative keywords. A recommendation algorithm links customer labels with script labels, so that customers in a given scenario are identified and the corresponding excellent scripts are pushed together with their features and recommendation indices. Visual charts present agent hot words (marketing scripts, product names, technical terms, etc.) and agent script recommendation statistics (which categories of scripts are used most), and finally an analysis report is produced with reasonable suggestions.
To realize script analysis and topic point identification and matching, the present embodiment establishes a fully functional system suited to script analysis and topic point matching and identification. Its overall framework comprises the marketing script strategy analysis label library, the marketing script strategy analysis platform, and the marketing script analysis report generation module. Script analysis based on this agent assistance management system comprises the following steps:
S21. Establish the marketing script strategy analysis label library, and establish script label systems for customers and agents respectively.
On the customer side, the session text is studied in depth to gain insight into the session data, capture customer intent, understand how it changes, and identify it accurately; combined with the existing structured customer information, an analysis window is established, the basic-information labels and purchase-intent labels that customers may carry are discovered, the reasonableness of the labels is assessed, and the customer label system is constructed. This allows the enterprise to build customer profiles quickly and manage them uniformly, to analyze customer characteristics rapidly and gain insight into customer needs, and it gives the next stage of marketing work a specific direction, avoiding blind marketing.
Accurate identification of customer intent is realized by the topic point word clustering method. This method uses feature selection techniques to choose suitable candidate words and then applies complex network analysis: with candidate words as nodes and co-occurrence degree as edges, the words are linked into one large word network. The GN (Girvan-Newman) algorithm then removes edges from the word network so that it finally breaks into clusters of highly interdependent words. As shown in Fig. 3, identifying customer intent with the topic point word clustering method comprises the following steps:
(1) Training corpus extraction: extract session text from the database for the specified tables, agent levels, and target attainment rates as training sample data;
(2) User-specific term discovery: analyze the training sample data to discover the users' industry-specific terms;
(3) Co-occurrence matrix calculation: set a word window and compute the word co-occurrence matrix over the training sample data to obtain a word frequency information table;
(4) Keyword extraction: analyze the word frequency information table obtained from the training session samples, extract basic words, and extract the keywords among them by manual screening;
(5) Topic point word clustering: cluster the screened keywords by means of the word co-occurrence matrix; according to the clustering results and in combination with business knowledge, classify the topic points and create a topic point rule table; in combination with business knowledge, sort out the keywords corresponding to each topic point;
(6) Topic point rule table optimization: on the basis of the existing topic point rule table, iteratively search for other keywords under the same topic point, and calculate the weight of each keyword after the iteration is complete.
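Steps (3) and (5) can be illustrated with a small sketch. Note the substitution: instead of the GN algorithm named in the text, which removes edges by betweenness, this sketch simply drops edges whose co-occurrence count is below a threshold and reads the word clusters off as connected components. That is a much cruder stand-in chosen to keep the example short. The function names, the `min_count` threshold, and the sample data are assumptions.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

def cooccurrence(sentences: List[List[str]], window: int) -> Counter:
    """Count unordered word pairs whose positions are less than `window` apart."""
    counts: Counter = Counter()
    for tokens in sentences:
        for i, w in enumerate(tokens):
            for v in tokens[i + 1:i + window]:
                if v != w:
                    counts[tuple(sorted((w, v)))] += 1
    return counts

def cluster(counts: Dict[Tuple[str, str], int], min_count: int) -> List[Set[str]]:
    """Keep edges with at least `min_count` co-occurrences; return components."""
    adj: Dict[str, Set[str]] = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
    seen: Set[str] = set()
    clusters: List[Set[str]] = []
    for node in sorted(adj):
        if node in seen:
            continue
        component: Set[str] = set()
        stack = [node]
        while stack:  # depth-first traversal of one component
            cur = stack.pop()
            if cur not in component:
                component.add(cur)
                stack.extend(adj[cur] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

Each resulting cluster is a candidate topic point whose member words would then be reviewed against business knowledge, as step (5) describes.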
On the agent side, the agent label system is built mainly from the agents' basic information, communication information, and sales performance, combined with structured information such as agent level, using label construction methods based on statistics and on rules. It reflects an agent's overall ability and allows agents whose abilities fall short to be offered appropriate training in time.
(1) Statistical labels: these labels can be computed from existing structured data and are the most basic label type, for example gender, city, age, and spending amount.
(2) Rule-based labels: these labels are generated from user behavior or other defined rules; in actual development they are determined jointly by business staff, drawing on business experience, and data staff. For example, the "active user" label is defined as users with a purchase record in the last 3 months.
For statistical and rule-based labels, existing structured data such as the business category table and product description table can be consulted and, together with the business matters arising in the sessions and the expertise of business staff, used to build a preliminary classification system along the business and product dimensions.
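The two label types can be sketched as follows. This is an illustration under stated assumptions: the profile field names are hypothetical, and "the last 3 months" is approximated as 90 days, which the text does not specify.

```python
from datetime import date, timedelta
from typing import Dict, List

# Hypothetical field names for the structured customer record.
STAT_FIELDS = ("gender", "city", "age", "spend_total")

def statistical_labels(profile: Dict[str, object]) -> Dict[str, object]:
    """Statistical labels are read straight from existing structured data."""
    return {k: profile[k] for k in STAT_FIELDS if k in profile}

def is_active_user(purchase_dates: List[date], today: date,
                   days: int = 90) -> bool:
    """Rule-based label: at least one purchase in the last `days` days.

    Assumption: 3 months is approximated as 90 days.
    """
    cutoff = today - timedelta(days=days)
    return any(cutoff <= d <= today for d in purchase_dates)
```

Keeping the rule in one small function makes it easy for business and data staff to adjust the definition, for example by changing `days`, without touching the statistical labels.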
S22, based on marketing words art analysis of strategies platform to marketing words art strategy analyze, including words art extract automatically, Words art recommends automatically, talks about three parts of art statistics displaying.
Words art extracts a part automatically periodically to obtain the outstanding call attended a banquet, and talks about to the call of Cheng Dan in them The mark for inscribing point is stored to the id of call in database with the topic point information recognized, in case the training number that topic point is recommended According to;Another part is then tracking customer information, i.e., text data of conversing daily timing acquisition yesterday, and identification client often takes on the telephone tool Some client's labels, topic point information, and in the database the storage of the result of identification.Wherein, topic point information identification uses Keyword extraction algorithm, common keyword extraction algorithm have TF-IDF, TextRank, Rake, Topic-Model etc..This reality The extraction of topic point in example is applied using TextRank keyword extraction algorithms.TextRank algorithm be it is a kind of for text based on The sort algorithm of figure, by the way that text segmentation at word and is established graph model, using voting mechanism to the important component in text It is ranked up and keyword extraction, digest can be realized.The specific stream of topic point information identification is carried out using keyword extraction algorithm Journey is as follows:
(1) Chinese word segmentation: split the customer's text into complete words.
(2) Stop word filtering: remove stop words according to a stop word list.
(3) Part-of-speech tagging: part-of-speech tagging is the process of determining the grammatical category of each word in a given sentence and labeling the word with its part of speech; it is the basis for tasks such as syntactic analysis and information extraction. The keywords in a passage are usually nouns, verbs and adjectives, whereas conjunctions, prepositions and the like carry much less importance, so tagging each word after segmentation aids keyword extraction. Part-of-speech tagging also helps to resolve word ambiguity, strengthen word-based features and remove stop words effectively. Therefore, after tagging, only words with the specified parts of speech (such as nouns, verbs and adjectives) are retained, and the retained words serve as preliminary candidate keywords.
(4) Set the window size and compute the co-occurrence between words: co-occurrence is the number of times two words appear together within a window of the specified size. For example, with the window size set to k and a sentence consisting of the words w1, w2, w3, ..., wn, the windows are (w1, w2, ..., wk), (w2, w3, ..., wk+1), (w3, w4, ..., wk+2), and so on. It is generally held that the more often two words appear together in the same document, the more closely they are related. Accordingly, counting how often each pair of words in a set of documents appears in the same document yields a co-occurrence network formed by the associated word pairs, and the distance between network nodes reflects how closely the words are related.
(5) Build the candidate keyword graph G = (V, E) with the candidate keywords as nodes and co-occurrence as edges, where V is the node set, composed of the preliminary candidate keywords generated in (3), and E is the set of edges between nodes, constructed between any two nodes from the co-occurrence computed in (4). An edge exists between two nodes if and only if their corresponding words co-occur within a window of length K, where K is the window size, i.e. at most K words co-occur. The weight of each node is propagated iteratively according to the TextRank weight-update formula until convergence, updating the weights of the keyword graph.
(6) Ranking and screening: after the keyword graph weights have been updated, the larger the edge weight between two words, the more important the words are. The weights of all nodes are therefore sorted in descending order to obtain the T most important words, which serve as the final candidate keywords.
(7) Combine keyword phrases: mark the T most important words obtained in step (6) in the original session text; if they form adjacent phrases, combine them into multi-word keywords. Because the Chinese word segmentation stage may have split a multi-word keyword apart, this step looks the words up in the original text and recombines them.
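Steps (2) to (7) above can be sketched in Python as follows. This is a minimal illustration and not the implementation of this embodiment: it assumes the text has already been segmented and tagged into (word, part-of-speech) pairs, the function name textrank_keywords, the tag set ("n", "v", "a"), the damping factor 0.85 and the convergence tolerance are all assumptions, and the update rule is the commonly published TextRank formula, which the text above refers to but does not reproduce.

```python
from collections import defaultdict


def textrank_keywords(tagged_tokens, stop_words, keep_pos=("n", "v", "a"),
                      window=3, top_t=5, damping=0.85, tol=1e-6, max_iter=200):
    """Return (top-T words, merged phrases) for one segmented, tagged text.

    tagged_tokens: (word, pos) pairs, i.e. the output of segmentation and tagging.
    All parameter names and defaults are illustrative assumptions.
    """
    # Steps (2)-(3): drop stop words and keep only the specified parts of speech.
    candidates = [w for w, pos in tagged_tokens
                  if w not in stop_words and pos in keep_pos]

    # Step (4): two words co-occur when they are fewer than `window` positions apart.
    weight = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(candidates):
        for u in candidates[i + 1:i + window]:
            if u != w:
                weight[w][u] += 1.0
                weight[u][w] += 1.0
    if not weight:
        return [], []

    # Step (5): propagate node weights over the graph until convergence:
    # WS(i) = (1 - d) + d * sum_j (w_ji / sum_k w_jk) * WS(j)
    out_sum = {w: sum(nbrs.values()) for w, nbrs in weight.items()}
    score = {w: 1.0 for w in weight}
    for _ in range(max_iter):
        new = {w: (1 - damping) + damping * sum(
                   weight[u][w] / out_sum[u] * score[u] for u in weight[w])
               for w in weight}
        delta = max(abs(new[w] - score[w]) for w in weight)
        score = new
        if delta < tol:
            break

    # Step (6): sort node weights in descending order and keep the top T words
    # (ties are broken alphabetically so the result is deterministic).
    ranked = sorted(score, key=lambda w: (-score[w], w))[:top_t]

    # Step (7): merge top words that are adjacent in the original text.
    # For Chinese text the words would be joined without a space.
    top, phrases, run = set(ranked), [], []
    for w, _ in list(tagged_tokens) + [(None, None)]:  # sentinel flushes the last run
        if w in top:
            run.append(w)
            continue
        if len(run) > 1 and " ".join(run) not in phrases:
            phrases.append(" ".join(run))
        run = []
    return ranked, phrases


# Hypothetical usage on an already segmented and tagged utterance.
tokens = [("customer", "n"), ("wants", "v"), ("weight", "n"), ("loss", "n"),
          ("and", "c"), ("asks", "v"), ("about", "p"), ("weight", "n"),
          ("loss", "n"), ("product", "n")]
top_words, phrases = textrank_keywords(tokens, stop_words={"and", "about"}, top_t=3)
print(top_words, phrases)
```

The co-occurrence count doubles as the edge weight, so a pair of words that appears together more often passes more of its score along that edge.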
The topic point identification process above essentially identifies the customer's concerns during the conversation and determines the topics of concern. Associating these concerns helps to offer the customer corresponding products, services or discounts, and also helps agents use the scripts suited to a particular scenario. An example interaction for concern identification is as follows:
When identifying the customer labels present in each call, the interaction at the label recognition node is required. An example interaction at the label recognition node is as follows:
Agent asks | Customer answers | Label
No dryness or feeling hot either, right? | Mm, right. | None
Anyway, you just feel your throat is a bit dry and have no other reaction; there is some dryness, right? | Mm. | Present
But there is dryness, right? | Mm. | Present
Automatic script recommendation combines the excellent script library: the identified customer labels and topic point information are fed into the topic point recommendation algorithm, which predicts which topic points to raise in the next call with the customer, and the recommendation results are stored in the database. The next time the customer calls in or the agent makes a follow-up call, the agent can conveniently open the conversation from the corresponding topic points. During the call, the customer's topic points are also identified in real time with the keyword extraction algorithm described above, and scripts are dynamically recommended on the customer service system page according to those topic points for the agent to choose from. This greatly lowers the demands on agents, so that novices can also start work quickly. An example of automatic script recommendation (also called automatic topic recommendation) is as follows:
Information from the previous call: first call, learning the basic situation; product introduction and usage, possible reactions, dietary precautions, learning the customer's living habits; the customer intends to lose weight;
Topic points recommended for this call: weight-loss principle, explanation of constitution, asking whether the family is supportive, stimulating the customer's desire, product usage method, product reactions.
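The topic point recommendation algorithm itself is not specified here. Purely as an illustration of the data flow (topic points annotated on closed-deal calls go in, topic points for the next call come out), the following sketch swaps in a simple frequency-based predictor. The functions fit_transitions and recommend_topics and the shape of the training data are hypothetical, and the sketch ignores customer labels for brevity.

```python
from collections import Counter, defaultdict


def fit_transitions(call_histories):
    """Count which topic points followed which in consecutive closed-deal calls.

    call_histories: one list per customer, each a chronological list of calls,
    each call being the list of topic points annotated on it (assumed format).
    """
    transitions = defaultdict(Counter)
    for calls in call_histories:
        for prev_call, next_call in zip(calls, calls[1:]):
            for prev_topic in set(prev_call):
                transitions[prev_topic].update(set(next_call))
    return transitions


def recommend_topics(transitions, last_call_topics, top_n=3):
    """Vote for next-call topic points based on the topic points of the last call."""
    votes = Counter()
    for topic in set(last_call_topics):
        votes.update(transitions.get(topic, {}))
    # Highest vote count first; ties broken alphabetically for determinism.
    return sorted(votes, key=lambda t: (-votes[t], t))[:top_n]


# Hypothetical training data: two customers, two calls each.
histories = [
    [["basic situation", "diet"], ["weight-loss principle", "product usage"]],
    [["diet"], ["weight-loss principle"]],
]
model = fit_transitions(histories)
print(recommend_topics(model, ["diet", "basic situation"], top_n=2))
```

A production recommender would presumably also condition on the customer labels, for example by keeping a separate transition table per label.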
The script statistics display presents, in the form of visual charts, the hot words used by agents in customer sessions, including marketing scripts, product names and some technical terms; it also compiles statistics by category on the recommended scripts and outputs the usage of each category of script, which facilitates unified script management.
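The counts behind such charts can be sketched as below. This is only an assumed shape for the statistics: the function names, the vocabulary of tracked words and the (category, was_used) event format are hypothetical, and chart rendering is left out.

```python
from collections import Counter


def hot_word_counts(sessions, vocabulary):
    """Count tracked words (scripts, product names, terms) across segmented sessions."""
    counts = Counter()
    for tokens in sessions:
        counts.update(t for t in tokens if t in vocabulary)
    return counts


def script_usage_by_category(events):
    """Summarize how often recommended scripts of each category were actually used.

    events: iterable of (category, was_used) pairs, one per recommended script.
    """
    recommended, used = Counter(), Counter()
    for category, was_used in events:
        recommended[category] += 1
        if was_used:
            used[category] += 1
    return {c: {"recommended": recommended[c],
                "used": used[c],
                "rate": used[c] / recommended[c]}
            for c in recommended}


# Hypothetical usage.
sessions = [["weight", "loss", "hello"], ["loss", "product"]]
print(hot_word_counts(sessions, {"weight", "loss", "product"}).most_common(2))
print(script_usage_by_category([("opening", True), ("opening", False), ("product", True)]))
```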
S23. Generate the marketing script analysis report with the marketing script analysis report generation module: analyze typical cases and, combining the customer and agent labels in the marketing script strategy analysis tag library with the agents' citation of excellent scripts in the marketing script strategy analysis platform, generate a marketing script analysis report that provides marketing strategy suggestions, training suggestions and the like. This helps in planning and implementing the product's marketing strategy and in deciding how to train staff.
As can be seen from the above, the excellent-script extraction method of the present invention is fully intelligent, from script analysis through topic point identification: it analyzes customers' topic points and provides excellent scripts, breaking away from the former practice of extracting excellent scripts manually and saving a great deal of manpower and material resources.
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and is included within the scope of protection of the present invention.

Claims (10)

1. A method for script analysis and topic point identification and matching, characterized by comprising the following steps:
a. establishing a marketing script strategy analysis tag library, establishing a customer label system and an agent label system for customers and agents respectively, and accurately identifying customer intent;
b. analyzing the marketing script strategy, completing automatic script extraction, and completing automatic script recommendation in combination with the customer labels and the excellent script library;
c. generating a marketing script analysis report according to the agents' citation of excellent scripts.
2. The method for script analysis and topic point identification and matching according to claim 1, characterized in that the process of establishing the customer label system in step a is:
studying the session text in depth, gaining insight into the session data, understanding the trend of changes in customer intent, accurately identifying customer intent, and, in combination with the existing structured customer information data, establishing an analysis window, discovering the basic information labels and consumption intent labels that customers may have, assessing the reasonableness of the labels, and constructing the customer label system.
3. The method for script analysis and topic point identification and matching according to claim 1, characterized in that the accurate identification of customer intent in step a is realized by a topic point word clustering method; the topic point word clustering method selects suitable candidate words by feature selection and, using network analysis techniques, forms a word network with the candidate words as nodes and co-occurrence links as edges; the GN algorithm is then used to remove edges from the word network, so that the word network finally becomes individual clusters of highly interrelated words.
4. The method for script analysis and topic point identification and matching according to claim 3, characterized in that the process of accurately identifying customer intent with the topic point word clustering method comprises the following steps:
(1) extracting, from the database, session text of the specified table, agent rank and compliance rate as training sample data;
(2) analyzing the training sample data to discover the industry-specific terms used by users;
(3) setting a word window and computing the word co-occurrence matrix over the training sample data to obtain a word frequency information table;
(4) analyzing the word frequency information table, extracting basic words, and screening the keywords from among them;
(5) clustering the screened keywords by means of the word co-occurrence matrix, classifying the topic points into categories according to the clustering result in combination with the business, and creating a topic point rule table; sorting out the keywords corresponding to each topic point in combination with the business;
(6) optimizing the topic point rule table: on the basis of the existing topic point rule table, iteratively searching for other keywords under the same topic point, and calculating the weight corresponding to each keyword after the iteration is complete.
5. The method for script analysis and topic point identification and matching according to claim 1, characterized in that, when the agent label system is established in step a, the agent label system is constructed from the agents' basic information, communication information and sales performance, in combination with the agents' structured information, using statistics-based and rule-based label construction methods; the agents' structured information includes the agents' rank.
6. The method for script analysis and topic point identification and matching according to claim 1, characterized in that the automatic script extraction process in step b comprises:
periodically retrieving the calls of excellent agents, annotating the topic points in those calls that ended in a closed deal, and storing the call id and the recognized topic point information in the database as training data for topic point recommendation; and
tracking customer information: every day, at a scheduled time, retrieving the previous day's call text, identifying the customer labels and topic point information present in each of the customer's calls, and storing the recognition results in the database.
7. The method for script analysis and topic point identification and matching according to claim 6, characterized in that the identification of the topic point information is realized with a keyword extraction algorithm and comprises the following steps:
(1) Chinese word segmentation: splitting the customer's text into complete words;
(2) stop word filtering: removing stop words according to a stop word list;
(3) part-of-speech tagging: determining the grammatical category of each word in a given sentence and labeling the word with its part of speech; after tagging, retaining only words with the specified parts of speech as preliminary candidate keywords;
(4) setting the window size and computing the co-occurrence between words;
(5) building the candidate keyword graph G = (V, E) with the candidate keywords as nodes and co-occurrence as edges, where V is the node set, composed of the preliminary candidate keywords, and E is the set of edges between nodes, constructed between any two nodes from the co-occurrence between words computed in step (4); an edge exists between two nodes if and only if their corresponding words co-occur within a window of length K, where K is the window size; iteratively propagating the weight of each node until convergence and updating the weights of the keyword graph;
(6) ranking and screening: after the keyword graph weights have been updated, sorting the weights of all nodes in descending order to obtain the T most important words as the final candidate keywords;
(7) combining keyword phrases: marking the T most important words in the original session text and, if they form adjacent phrases, combining them into multi-word keywords.
8. The method for script analysis and topic point identification and matching according to claim 7, characterized in that the automatic script recommendation process in step b is: in combination with the excellent script library, feeding the identified customer labels and topic point information into the topic point recommendation algorithm, which predicts which topic points to raise in the next call with the customer, and storing the recommendation results in the database, so that the next time the customer calls in or the agent makes a follow-up call, the agent can conveniently open the conversation from the corresponding topic points; and, during the call, identifying the customer's topic point information in real time with the keyword extraction algorithm and dynamically recommending scripts on the customer service system page according to the topic point information for the agent to choose from.
9. The method for script analysis and topic point identification and matching according to claim 1, characterized in that the method further preprocesses the session text before step a, the preprocessing comprising Chinese word segmentation, text cleaning and text denoising;
the analysis of the marketing script strategy in step b further comprises a script statistics display, which presents the hot words used by agents in customer sessions in the form of visual charts, compiles statistics by category on the recommended scripts, and outputs the usage of each category of script.
10. A system for script analysis and topic point identification and matching, characterized by comprising:
a marketing script strategy analysis tag library, configured to establish a customer label system and an agent label system for customers and agents respectively, and to accurately identify customer intent;
a marketing script strategy analysis platform, configured to analyze the marketing script strategy, complete automatic script extraction, complete automatic script recommendation in combination with the customer labels and the excellent script library, and display script statistics;
a marketing script analysis report generation module, configured to generate a marketing script analysis report according to the agents' citation of excellent scripts;
the process of establishing the customer label system is: studying the session text in depth, gaining insight into the session data, understanding the trend of changes in customer intent, accurately identifying customer intent, and, in combination with the existing structured customer information data, establishing an analysis window, discovering the basic information labels and consumption intent labels that customers may have, assessing the reasonableness of the labels, and constructing the customer label system;
the accurate identification of customer intent is realized by a topic point word clustering method, comprising the steps of:
(1) extracting, from the database, session text of the specified table, agent rank and compliance rate as training sample data;
(2) analyzing the training sample data to discover the industry-specific terms used by users;
(3) setting a word window and computing the word co-occurrence matrix over the training sample data to obtain a word frequency information table;
(4) analyzing the word frequency information table, extracting basic words, and screening the keywords from among them;
(5) clustering the screened keywords by means of the word co-occurrence matrix, classifying the topic points into categories according to the clustering result in combination with the business, and creating a topic point rule table; sorting out the keywords corresponding to each topic point in combination with the business;
(6) optimizing the topic point rule table: on the basis of the existing topic point rule table, iteratively searching for other keywords under the same topic point, and calculating the weight corresponding to each keyword after the iteration is complete;
the automatic script extraction process comprises:
periodically retrieving the calls of excellent agents, annotating the topic points in those calls that ended in a closed deal, and storing the call id and the recognized topic point information in the database as training data for topic point recommendation; and
tracking customer information: every day, at a scheduled time, retrieving the previous day's call text, identifying the customer labels and topic point information present in each of the customer's calls, and storing the recognition results in the database;
the identification of the topic point information is realized with a keyword extraction algorithm and comprises the following steps:
(1) Chinese word segmentation: splitting the customer's text into complete words;
(2) stop word filtering: removing stop words according to a stop word list;
(3) part-of-speech tagging: determining the grammatical category of each word in a given sentence and labeling the word with its part of speech; after tagging, retaining only words with the specified parts of speech as preliminary candidate keywords;
(4) setting the window size and computing the co-occurrence between words;
(5) building the candidate keyword graph G = (V, E) with the candidate keywords as nodes and co-occurrence as edges, where V is the node set, composed of the preliminary candidate keywords, and E is the set of edges between nodes, constructed between any two nodes from the co-occurrence between words; an edge exists between two nodes if and only if their corresponding words co-occur within a window of length K, where K is the window size; iteratively propagating the weight of each node until convergence and updating the weights of the keyword graph;
(6) ranking and screening: after the keyword graph weights have been updated, sorting the weights of all nodes in descending order to obtain the T most important words as the final candidate keywords;
(7) combining keyword phrases: marking the T most important words in the original session text and, if they form adjacent phrases, combining them into multi-word keywords.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910771751.6A CN110472017A (en) 2019-08-21 2019-08-21 A kind of analysis of words art and topic point identify matched method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910771751.6A CN110472017A (en) 2019-08-21 2019-08-21 A kind of analysis of words art and topic point identify matched method and system

Publications (1)

Publication Number Publication Date
CN110472017A true CN110472017A (en) 2019-11-19

Family

ID=68513102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910771751.6A Pending CN110472017A (en) 2019-08-21 2019-08-21 A kind of analysis of words art and topic point identify matched method and system

Country Status (1)

Country Link
CN (1) CN110472017A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170206270A1 (en) * 2016-01-19 2017-07-20 International Business Machines Corporation Cognitive System Comparison and Recommendation Engine
CN109190652A (en) * 2018-07-06 2019-01-11 中国平安人寿保险股份有限公司 It attends a banquet sort management method, device, computer equipment and storage medium
CN109242706A (en) * 2018-08-20 2019-01-18 中国平安人寿保险股份有限公司 Method, apparatus, computer equipment and the storage medium for assisting seat personnel to link up
CN109829810A (en) * 2018-12-13 2019-05-31 平安科技(深圳)有限公司 Business recommended method, apparatus, computer equipment and storage medium
CN109949071A (en) * 2019-01-31 2019-06-28 平安科技(深圳)有限公司 Products Show method, apparatus, equipment and medium based on voice mood analysis
CN110046230A (en) * 2018-12-18 2019-07-23 阿里巴巴集团控股有限公司 Generate the method for recommending words art set, the method and apparatus for recommending words art
CN110110038A (en) * 2018-08-17 2019-08-09 平安科技(深圳)有限公司 Traffic predicting method, device, server and storage medium

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110879868A (en) * 2019-11-21 2020-03-13 中国工商银行股份有限公司 Consultant scheme generation method, device, system, electronic equipment and medium
CN111028007A (en) * 2019-12-06 2020-04-17 中国银行股份有限公司 User portrait information prompting method, device and system
CN111028007B (en) * 2019-12-06 2024-05-28 中国银行股份有限公司 User portrait information prompting method, device and system
CN111160778A (en) * 2019-12-30 2020-05-15 佰聆数据股份有限公司 Outbound project auditing and evaluating method and system based on big data and computer equipment
CN111581977A (en) * 2020-03-31 2020-08-25 西安电子科技大学 Text information conversion method, system, storage medium, computer program, and terminal
CN111475634B (en) * 2020-04-10 2023-04-28 复旦大学 Representative speaking segment extraction device and method based on seat voice segmentation
CN111475634A (en) * 2020-04-10 2020-07-31 复旦大学 Representative speech segment extraction device and method based on seat speech segmentation
CN111553701A (en) * 2020-05-14 2020-08-18 支付宝(杭州)信息技术有限公司 Session-based risk transaction determination method and device
CN112036572A (en) * 2020-08-28 2020-12-04 上海冰鉴信息科技有限公司 Text list-based user feature extraction method and device
CN112036572B (en) * 2020-08-28 2024-03-12 上海冰鉴信息科技有限公司 Text list-based user feature extraction method and device
CN112084318A (en) * 2020-09-25 2020-12-15 支付宝(杭州)信息技术有限公司 Conversation auxiliary method, system and device
CN112084318B (en) * 2020-09-25 2024-02-20 支付宝(杭州)信息技术有限公司 Dialogue assistance method, system and device
CN112328781A (en) * 2020-11-30 2021-02-05 北京博瑞彤芸科技股份有限公司 Message recommendation method and system and electronic equipment
CN112685547A (en) * 2020-12-29 2021-04-20 平安普惠企业管理有限公司 Method and device for assessing dialect template, electronic equipment and storage medium
CN112686448A (en) * 2020-12-31 2021-04-20 重庆富民银行股份有限公司 Loss early warning method and system based on attribute data
CN112686448B (en) * 2020-12-31 2024-02-13 重庆富民银行股份有限公司 Loss early warning method and system based on attribute data
CN112633992A (en) * 2021-01-11 2021-04-09 上海明略人工智能(集团)有限公司 Sales management method and system based on voice recognition
CN112800269A (en) * 2021-01-20 2021-05-14 上海明略人工智能(集团)有限公司 Conference record generation method and device
CN112883160A (en) * 2021-02-25 2021-06-01 南昌鑫轩科技有限公司 Capture method and auxiliary system for result transfer conversion
CN112883160B (en) * 2021-02-25 2023-04-07 江西知本位科技创业发展有限公司 Capture method and auxiliary system for result transfer conversion
CN113254621A (en) * 2021-06-21 2021-08-13 中国平安人寿保险股份有限公司 Seat call prompting method and device, computer equipment and storage medium
CN113434670A (en) * 2021-06-22 2021-09-24 未鲲(上海)科技服务有限公司 Method and device for generating dialogistic text, computer equipment and storage medium
CN113488051A (en) * 2021-07-20 2021-10-08 北京明略昭辉科技有限公司 Retail industry sales process analysis method, system, computer and storage medium
CN113626573B (en) * 2021-08-11 2022-09-27 北京深维智信科技有限公司 Sales session objection and response extraction method and system
CN113626573A (en) * 2021-08-11 2021-11-09 北京深维智信科技有限公司 Sales session objection and response extraction method and system
CN113781129B (en) * 2021-11-15 2022-02-15 百融至信(北京)征信有限公司 Intelligent marketing strategy generation method and system
CN113781129A (en) * 2021-11-15 2021-12-10 百融至信(北京)征信有限公司 Intelligent marketing strategy generation method and system
CN115905502A (en) * 2022-07-19 2023-04-04 北京中关村科金技术有限公司 Method, device and storage medium for mining and recommending dialect
CN115905502B (en) * 2022-07-19 2024-01-05 北京中关村科金技术有限公司 Speaking skill mining and recommending method, device and storage medium
CN115563250A (en) * 2022-10-10 2023-01-03 江苏国光信息产业股份有限公司 Medical self-service voice service equipment and method
CN117119106B (en) * 2023-10-17 2024-01-26 北京铁力山科技股份有限公司 Multifunctional intelligent control seat cooperation system
CN117119106A (en) * 2023-10-17 2023-11-24 北京铁力山科技股份有限公司 Multifunctional intelligent control seat cooperation system

Similar Documents

Publication Publication Date Title
CN110472017A (en) A kind of analysis of words art and topic point identify matched method and system
CN108052583B (en) E-commerce ontology construction method
US11823074B2 (en) Intelligent communication manager and summarizer
CN101420313B (en) Method and system for clustering customer terminal user group
CN110825882A (en) Knowledge graph-based information system management method
CN110825858A (en) Intelligent interaction robot system applied to customer service center
CN107958091A (en) A kind of NLP artificial intelligence approaches and interactive system based on financial vertical knowledge mapping
CN113704451B (en) Power user appeal screening method and system, electronic device and storage medium
CN110032630A (en) Talk about art recommendation apparatus, method and model training equipment
CN106022708A (en) Method for predicting employee resignation
KR20150096295A (en) System and method for buinding q&as database, and search system and method using the same
US20230394247A1 (en) Human-machine collaborative conversation interaction system and method
CN107330627A (en) A kind of big data processing method, server and system for innovating intention
CN109947934A (en) For the data digging method and system of short text
CN113032552B (en) Text abstract-based policy key point extraction method and system
CN110929007A (en) Electric power marketing knowledge system platform and application method
CN113723853A (en) Method and device for processing post competence demand data
KR20190103504A (en) Continuous Conversation Method and Its System by Automating Conversation Scenario Collection
CN113672698A (en) Intelligent interviewing method, system, equipment and storage medium based on expression analysis
Shi et al. StarSum: A star architecture based model for extractive summarization
CN110377706A (en) Search statement method for digging and equipment based on deep learning
CN115269771A (en) Big data analysis system based on semantics
Wang et al. Rom: A requirement opinions mining method preliminary try based on software review data
Im et al. A study on brand identity and image utilizing SNA
CN111949781B (en) Intelligent interaction method and device based on natural sentence syntactic analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191119