CN108897888B - Man-machine sparring method under voice customer service training scene - Google Patents


Info

Publication number: CN108897888B (application number CN201810750420.XA)
Authority: CN (China)
Prior art keywords: ginfo, training, information, data, sentence
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN108897888A
Inventors: 毛力, 向业锋, 谭毅, 刘鹏
Current Assignee: Sichuan Taojin Niwo Information Technology Co ltd
Original Assignee: Sichuan Taojin Niwo Information Technology Co ltd
Application filed by Sichuan Taojin Niwo Information Technology Co ltd
Priority to CN201810750420.XA
Publication of CN108897888A
Application granted
Publication of CN108897888B

Abstract

To improve the intelligence and accuracy of man-machine conversation, the invention provides a man-machine sparring method for a voice customer service training scene, which comprises the following steps: (A) performing text processing on the voice information; (B) classifying the text-form dialogue data O for training according to context; (C) detecting the favorability information Ginfo, the similar-sentence repetition information Iinfo and the dialogue duration information Linfo, and training the data O. The calculation process is fast, and the self-learning efficiency after SVM training is greatly improved.

Description

Man-machine sparring method under voice customer service training scene
Technical Field
The invention relates to the technical field of artificial intelligence, and in particular to a man-machine sparring method under a voice customer service training scene.
Background
In existing man-machine question-answering systems, identifying the intent behind the user's question is the core of the whole system. If the intent is recognized correctly but with low accuracy, too many candidate answers remain and the optimal one cannot be selected when replying to the user; if the intent is recognized incorrectly, the user's meaning is misunderstood, so the user receives an unwanted answer or no answer at all. Existing question-answering systems are realized mainly through fixed algorithmic logic, with three basic stages: question analysis, information retrieval, and answer extraction. An error in any one of these stages prevents the user from obtaining a correct result. More importantly, because such systems are poorly adjustable, the user's questions cannot be exploited to make the system more intelligent: when the user inputs the same question again, the same logic still fails to produce a correct result unless the algorithmic logic of the question-answering system is modified. The adjustability of the question-answering system is therefore a key factor affecting its accuracy and timeliness.
Existing intent-recognition methods train and predict on large amounts of manually labeled corpora. Because so much manual labeling is required, many uncontrollable factors arise: different annotators understand language differently and produce inconsistent results, the same question receives duplicate labels, and the same corpus is assigned to different classification labels. When a new intent class must be added, the relevant staff must first discuss and define it and then train the annotators before labeling can begin, so the machine cannot add new classes automatically. Training the model in this way consumes substantial manpower and material resources, and the speed and progress of training are affected by many uncontrollable factors.
Disclosure of Invention
To improve the intelligence and accuracy of man-machine conversation, the invention provides a man-machine sparring method under a voice customer service training scene, which comprises the following steps:
(A) performing text processing on the voice information;
(B) classifying the text-form dialogue data O for training according to context;
(C) detecting the favorability information Ginfo, the similar-sentence repetition information Iinfo and the dialogue duration information Linfo, and training the data O.
Further, the contexts include three contexts: pre-sale, in-sale and after-sale, each having a different predetermined weight.
Further, the favorability information Ginfo includes the count information Ginfo_wordnum of polite wording used, the word content information Ginfo_wordcontent, the count information Ginfo_facenum of emoticons used, and the ASCII codes Ginfo_facecontent corresponding to the emoticons.
Further, the similar-sentence repetition information Iinfo includes the repeated-sentence count information Iinfo_num and the word content information Iinfo_content.
Further, training the data O includes:
splitting the text-form dialogue data for training into different words according to semantics;
for the g-th and (g+1)-th sentences, performing a similarity convolution on the words corresponding to different semantics, defining the word with the largest convolution value as the maximum word and the word with the smallest convolution value as the minimum word, g = 1, 2, …, Num1-1, where Num1 is the number of sentences in the text-form dialogue data for training;
deleting the minimum word from the (g+1)-th sentence, and likewise deleting the minimum word from each subsequent sentence of the text-form dialogue data for training while keeping the first sentence whole, thereby obtaining intermediate dialogue data R formed by combining, in time order, the first sentence and the sentences obtained after deletion;
taking the sample training set TRAIN = {R, Ginfo_wordcontent, Ginfo_facecontent, Iinfo_content}; replacing each element of TRAIN with its occurrence count as a substitute identifier and filling each vacant position with the remainder of dividing the arithmetic mean of Ginfo_wordnum, Ginfo_facenum and Iinfo_num by 4, forming matrix A1; replacing each element of TRAIN with its occurrence count as a substitute identifier and filling each vacant position with the remainder of dividing the geometric mean of Ginfo_wordnum, Ginfo_facenum and Iinfo_num by 4, forming matrix A2;
calculating the eigenvalue CH1 of matrix A1 and the eigenvalue CH2 of matrix A2, and multiplying CH1 and CH2 by the predetermined weights of the pre-sale, in-sale and after-sale contexts; taking the iteration count Iter as the ceiling of the geometric mean of CH1 and CH2; with the maximum words as the initial solution, iterating (L(i-1)·CH1 + L(i+1)·CH2)/(L(i-1)·CH2 + L(i+1)·CH1) over the range of the data O, and taking the ceiling ⌈M⌉ of the final iteration value M; the data O is then subjected to ⌈M⌉ rounds of SVM training, where i = 1, 2, ….
The calculation process of the method is simple and fast, and the self-learning efficiency after SVM training is greatly improved.
Detailed Description
The invention provides a man-machine sparring method under a voice customer service training scene, which comprises the following steps:
(A) performing text processing on the voice information;
(B) classifying the text-form dialogue data O for training according to context;
(C) detecting the favorability information Ginfo, the similar-sentence repetition information Iinfo and the dialogue duration information Linfo, and training the data O.
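As a minimal illustration, steps (A)-(C) can be sketched in Python. The function names, the keyword cues used for context classification, and the stubbed speech-to-text stage are assumptions for illustration only; they are not part of the patented method, which does not disclose these implementation details.

```python
# Minimal sketch of steps (A)-(C); all names and cue lists are illustrative.

def speech_to_text(audio_utterances):
    """(A) Stub: a real system would run an ASR engine on each utterance."""
    return [u["transcript"] for u in audio_utterances]

def classify_by_context(sentences):
    """(B) Assign each sentence a pre-sale / in-sale / after-sale context
    using simple keyword cues (a stand-in for the patent's classifier)."""
    cues = {
        "pre-sale": ["price", "discount", "stock"],
        "in-sale": ["order", "payment", "invoice"],
        "after-sale": ["refund", "return", "repair"],
    }
    labeled = []
    for s in sentences:
        context = next(
            (c for c, words in cues.items() if any(w in s.lower() for w in words)),
            "in-sale",  # default context when no cue matches
        )
        labeled.append((context, s))
    return labeled

def detect_signals(sentences):
    """(C) Collect the three signal groups the method trains on."""
    ginfo = {"wordnum": 0, "facenum": 0}   # favorability counters (Ginfo)
    iinfo = {"num": 0}                     # repeated-sentence counter (Iinfo)
    linfo = {"duration": len(sentences)}   # dialogue length proxy (Linfo)
    seen = set()
    for s in sentences:
        if s in seen:
            iinfo["num"] += 1
        seen.add(s)
        ginfo["wordnum"] += sum(s.lower().count(w) for w in ("please", "thanks"))
        ginfo["facenum"] += s.count(":)")
    return ginfo, iinfo, linfo

audio = [{"transcript": "Hello, what is the price?"},
         {"transcript": "Thanks, I want a refund :)"},
         {"transcript": "Thanks, I want a refund :)"}]
O = speech_to_text(audio)
labeled = classify_by_context(O)
ginfo, iinfo, linfo = detect_signals(O)
```

The three stages are kept as separate functions so that each could be swapped for a real component (an ASR engine, a trained classifier, a proper signal extractor) without touching the others.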
Preferably, the contexts include three contexts: pre-sale, in-sale and after-sale, each having a different predetermined weight.
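The patent does not disclose the actual weight values, so a configuration sketch with hypothetical numbers might look like:

```python
# Hypothetical predetermined weights for the three contexts; the patent does
# not disclose the real values, so these are illustrative placeholders only.
CONTEXT_WEIGHTS = {
    "pre-sale": 0.8,
    "in-sale": 1.0,
    "after-sale": 1.2,
}

def weight_for(context):
    """Look up the predetermined weight for a context label."""
    return CONTEXT_WEIGHTS[context]
```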
Preferably, the favorability information Ginfo includes the count information Ginfo_wordnum of polite wording used, the word content information Ginfo_wordcontent, the count information Ginfo_facenum of emoticons used, and the ASCII codes Ginfo_facecontent corresponding to the emoticons.
Preferably, the similar-sentence repetition information Iinfo includes the repeated-sentence count information Iinfo_num and the word content information Iinfo_content.
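The Ginfo and Iinfo fields defined above can be grouped into simple records. The field names mirror the patent's identifiers; the Python types are assumptions:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Ginfo:
    """Favorability information: polite-wording and emoticon usage."""
    wordnum: int = 0                                       # Ginfo_wordnum
    wordcontent: List[str] = field(default_factory=list)   # Ginfo_wordcontent
    facenum: int = 0                                       # Ginfo_facenum
    facecontent: List[str] = field(default_factory=list)   # Ginfo_facecontent (ASCII codes)

@dataclass
class Iinfo:
    """Similar-sentence repetition information."""
    num: int = 0                                           # Iinfo_num
    content: List[str] = field(default_factory=list)       # Iinfo_content

g = Ginfo(wordnum=2, wordcontent=["please", "thanks"], facenum=1, facecontent=[":)"])
i = Iinfo(num=1, content=["I want a refund"])
```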
Preferably, training the data O comprises:
splitting the text-form dialogue data for training into different words according to semantics;
for the g-th and (g+1)-th sentences, performing a similarity convolution on the words corresponding to different semantics, defining the word with the largest convolution value as the maximum word and the word with the smallest convolution value as the minimum word, g = 1, 2, …, Num1-1, where Num1 is the number of sentences in the text-form dialogue data for training;
deleting the minimum word from the (g+1)-th sentence, and likewise deleting the minimum word from each subsequent sentence of the text-form dialogue data for training while keeping the first sentence whole, thereby obtaining intermediate dialogue data R formed by combining, in time order, the first sentence and the sentences obtained after deletion;
taking the sample training set TRAIN = {R, Ginfo_wordcontent, Ginfo_facecontent, Iinfo_content}; replacing each element of TRAIN with its occurrence count as a substitute identifier and filling each vacant position with the remainder of dividing the arithmetic mean of Ginfo_wordnum, Ginfo_facenum and Iinfo_num by 4, forming matrix A1; replacing each element of TRAIN with its occurrence count as a substitute identifier and filling each vacant position with the remainder of dividing the geometric mean of Ginfo_wordnum, Ginfo_facenum and Iinfo_num by 4, forming matrix A2;
calculating the eigenvalue CH1 of matrix A1 and the eigenvalue CH2 of matrix A2, and multiplying CH1 and CH2 by the predetermined weights of the pre-sale, in-sale and after-sale contexts; taking the iteration count Iter as the ceiling of the geometric mean of CH1 and CH2; with the maximum words as the initial solution, iterating (L(i-1)·CH1 + L(i+1)·CH2)/(L(i-1)·CH2 + L(i+1)·CH1) over the range of the data O, and taking the ceiling ⌈M⌉ of the final iteration value M; the data O is then subjected to ⌈M⌉ rounds of SVM training, where i = 1, 2, ….
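Under stated assumptions, the training steps above can be sketched end to end. The patent does not define its "similarity convolution", matrix "eigenvalue", or the L values precisely, so this sketch assumes: character-overlap similarity between words, the largest absolute eigenvalue of each matrix, per-sentence word counts as the duration proxy L, a context weight of 1.0, and a no-op stub in place of the SVM rounds. It requires NumPy.

```python
import math
import numpy as np

def similarity(w1, w2):
    """Stand-in for the 'similarity convolution': size of the character-
    multiset overlap of two words (an assumed definition)."""
    return sum(min(w1.count(c), w2.count(c)) for c in set(w1))

def max_min_words(prev_sent, sent):
    """Return (maximum word, minimum word) of `sent`, scoring each word by
    its best similarity against the previous sentence's words."""
    scores = [(max(similarity(w, v) for v in prev_sent), w) for w in sent]
    return max(scores)[1], min(scores)[1]

def build_R(sentences):
    """Keep the first sentence whole; delete the minimum word from every
    later sentence; recombine in time order into intermediate data R."""
    R = [list(sentences[0])]
    for g in range(len(sentences) - 1):
        _, min_w = max_min_words(sentences[g], sentences[g + 1])
        R.append([w for w in sentences[g + 1] if w != min_w])
    return R

def build_matrix(elems, fill, size=4):
    """Replace each element by its occurrence count (the 'substitute
    identifier'), fill vacant positions with `fill`, shape to size x size."""
    counts = {}
    for e in elems:
        counts[e] = counts.get(e, 0) + 1
    flat = [counts[e] for e in elems][: size * size]
    flat += [fill] * (size * size - len(flat))
    return np.array(flat, dtype=float).reshape(size, size)

# Toy tokenized dialogue data O and assumed signal counts.
O = [["hello", "price", "please"],
     ["refund", "please", "now"],
     ["refund", "please", "again"]]
g_wordnum, g_facenum, i_num = 3, 1, 2

R = build_R(O)
elems = [w for s in R for w in s]
arith_fill = ((g_wordnum + g_facenum + i_num) / 3) % 4        # arithmetic mean mod 4
geom_fill = ((g_wordnum * g_facenum * i_num) ** (1 / 3)) % 4  # geometric mean mod 4
A1 = build_matrix(elems, arith_fill)
A2 = build_matrix(elems, geom_fill)

# 'Eigenvalue' of each matrix, taken here as the largest absolute eigenvalue.
CH1 = float(max(abs(np.linalg.eigvals(A1))))
CH2 = float(max(abs(np.linalg.eigvals(A2))))
weight = 1.0                              # assumed predetermined context weight
CH1, CH2 = CH1 * weight, CH2 * weight

Iter = math.ceil(math.sqrt(CH1 * CH2))    # iteration count: ceil of geometric mean
L = [len(s) for s in O]                   # per-sentence duration proxy for Linfo
M = CH1 / CH2
for i in range(1, len(L) - 1):            # iterate the ratio over the range of O
    M = (L[i - 1] * CH1 + L[i + 1] * CH2) / (L[i - 1] * CH2 + L[i + 1] * CH1)
rounds = math.ceil(M)                     # ceil(M) rounds of SVM training
for _ in range(rounds):
    pass                                  # each round would fit an SVM here
```

Note how the minimum-word deletion progressively shortens every sentence after the first, so R carries the first sentence intact plus compressed versions of the rest, which is what the matrices A1 and A2 are built from.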
The above embodiments are intended to further illustrate the objects, technical solutions and advantages of the present invention. It should be understood that they are only exemplary embodiments and are not intended to limit the invention; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (1)

1. A man-machine sparring method under a voice customer service training scene, comprising the following steps:
(A) performing text processing on the voice information;
(B) classifying the text-form dialogue data O for training according to context;
(C) detecting the favorability information Ginfo, the similar-sentence repetition information Iinfo and the dialogue duration information Linfo, and training the data O;
wherein the contexts comprise three contexts of pre-sale, in-sale and after-sale, each having a different predetermined weight;
the favorability information Ginfo comprises the count information Ginfo_wordnum of polite wording used, the word content information Ginfo_wordcontent, the count information Ginfo_facenum of emoticons used, and the ASCII codes Ginfo_facecontent corresponding to the emoticons;
the similar-sentence repetition information Iinfo comprises the repeated-sentence count information Iinfo_num and the word content information Iinfo_content;
wherein training the data O comprises:
splitting the text-form dialogue data for training into different words according to semantics;
for the g-th and (g+1)-th sentences, performing a similarity convolution on the words corresponding to different semantics, defining the word with the largest convolution value as the maximum word and the word with the smallest convolution value as the minimum word, g = 1, 2, …, Num1-1, where Num1 is the number of sentences in the text-form dialogue data for training;
deleting the minimum word from the (g+1)-th sentence, and likewise deleting the minimum word from each subsequent sentence of the text-form dialogue data for training while keeping the first sentence whole, thereby obtaining intermediate dialogue data R formed by combining, in time order, the first sentence and the sentences obtained after deletion;
taking the sample training set TRAIN = {R, Ginfo_wordcontent, Ginfo_facecontent, Iinfo_content}; replacing each element of TRAIN with its occurrence count as a substitute identifier and filling each vacant position with the remainder of dividing the arithmetic mean of Ginfo_wordnum, Ginfo_facenum and Iinfo_num by 4, forming matrix A1; replacing each element of TRAIN with its occurrence count as a substitute identifier and filling each vacant position with the remainder of dividing the geometric mean of Ginfo_wordnum, Ginfo_facenum and Iinfo_num by 4, forming matrix A2;
calculating the eigenvalue CH1 of matrix A1 and the eigenvalue CH2 of matrix A2, and multiplying CH1 and CH2 by the predetermined weights of the pre-sale, in-sale and after-sale contexts; taking the iteration count Iter as the ceiling of the geometric mean of CH1 and CH2; with the maximum words as the initial solution, iterating (L(i-1)·CH1 + L(i+1)·CH2)/(L(i-1)·CH2 + L(i+1)·CH1) over the range of the data O, and taking the ceiling ⌈M⌉ of the final iteration value M; the data O is then subjected to ⌈M⌉ rounds of SVM training, where i = 1, 2, ….
CN201810750420.XA 2018-07-10 2018-07-10 Man-machine sparring method under voice customer service training scene Active CN108897888B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810750420.XA CN108897888B (en) 2018-07-10 2018-07-10 Man-machine sparring method under voice customer service training scene


Publications (2)

Publication Number Publication Date
CN108897888A (en) 2018-11-27
CN108897888B (en) 2021-08-24

Family

ID=64348921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810750420.XA Active CN108897888B (en) 2018-07-10 2018-07-10 Man-machine sparring method under voice customer service training scene

Country Status (1)

Country Link
CN (1) CN108897888B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109272790A (en) * 2018-12-04 2019-01-25 曾文华 Online customer service training method, system and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103279528A (en) * 2013-05-31 2013-09-04 俞志晨 Question-answering system and question-answering method based on man-machine integration
CN104301554A (en) * 2013-07-18 2015-01-21 中兴通讯股份有限公司 Device and method for detecting the service quality of customer service staff
CN107870896A (en) * 2016-09-23 2018-04-03 苏宁云商集团股份有限公司 Dialogue analysis method and device
CN107943860A (en) * 2017-11-08 2018-04-20 北京奇艺世纪科技有限公司 Model training method, and text intention recognition method and device
CN108170739A (en) * 2017-12-18 2018-06-15 深圳前海微众银行股份有限公司 Question matching method, terminal and computer-readable storage medium
CN108959275A (en) * 2018-07-10 2018-12-07 四川淘金你我信息技术有限公司 Man-machine sparring system based on online language translation




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant