CN111061852A - Data processing method, device and system - Google Patents

Info

Publication number
CN111061852A
Authority
CN
China
Prior art keywords
data
question
answer
model
customer service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911279836.9A
Other languages
Chinese (zh)
Inventor
何鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhongtongji Network Technology Co Ltd
Original Assignee
Shanghai Zhongtongji Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhongtongji Network Technology Co Ltd
Priority to CN201911279836.9A
Publication of CN111061852A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/01 Customer relationship services

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Artificial Intelligence (AREA)
  • Accounting & Taxation (AREA)
  • Databases & Information Systems (AREA)
  • Economics (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a data processing method, device and system. The data processing method comprises: obtaining original question-answer data; processing the original question-answer data to generate training data and test data, wherein the processing comprises data cleaning and/or data screening; establishing a virtual customer service question-answering model and inputting the training data into it; and inputting the test data into the trained virtual customer service question-answering model to perform a question-answering test. The method improves the initial training effect of the virtual customer service question-answering model under cold start. Training the model on generated training data reduces the manual effort required and makes the model easier to popularize, and testing the model with the test data allows its question-answering effect to be evaluated, which improves answer accuracy and user experience.

Description

Data processing method, device and system
Technical Field
The present application relates to the field of machine learning technologies, and in particular, to a data processing method, apparatus, and system.
Background
With the development of the internet and the mobile internet, on one hand, the number of people using the internet is continuously increased, and more people can consult business problems through the internet; on the other hand, the neural network algorithm is widely applied, and the natural language processing field also makes breakthrough progress.
To cope with the rapid growth in business consultation volume, many companies have begun to use Natural Language Processing (NLP) models to build virtual robots that handle user questions, thereby reducing the labor cost of customer service. When an ordinary enterprise first prepares an NLP model, it often lacks sufficient business data to train the model, so the model has to be run in cold-start mode. Cold start means that a machine learning model cannot obtain sufficient training data in the initial stage of start-up, and a specific method must be used to continuously correct the model until the expected training effect is reached. At present, the training data collected during cold start is generated by manually simulating user questions and answers. This approach consumes a large amount of manpower, and the labor cost cannot be estimated. Because the training data relies on manual experience, every customer service system has to invest manpower again to train its own customer service robot, so the model is difficult to popularize. In addition, because the amount of training data is limited, the model is not tested sufficiently, so the virtual robot may give wrong or irrelevant answers to user questions, which harms the user experience.
Disclosure of Invention
The present application provides a data processing method, device and system to overcome, at least to some extent, the following problems: the training data collected during model cold start is generated by manually simulating user questions and answers, which consumes a large amount of manpower and makes the model difficult to popularize; and the limited amount of training data leaves the model insufficiently tested, which harms the user experience.
In a first aspect, the present application provides a data processing method, including:
acquiring original question and answer data;
processing the original question-answer data to generate training data and test data, wherein the processing operation comprises data cleaning and/or data screening;
establishing a virtual customer service question-answering model, and inputting the training data into the virtual customer service question-answering model;
and inputting the test data into the trained virtual customer service question-answering model to perform question-answering test.
Further, the raw question-answer data includes:
internal data and external data, the internal data including historical question-answer data already existing in the system; the external data including conversation data recorded by human customer service.
Further, the data cleaning includes deleting specific characters from the original question-answer data, where the specific characters include one or more of symbols, numbers, letters, interjections and sensitive words.
Further, the data screening includes screening sentences conforming to business logic according to preset sentence lengths, where the sentences include question sentences and answer sentences.
Further, before generating the training data and the test data, the method further includes:
presetting standard question-answer data;
and labeling the original question-answer data to establish mapping of the original question-answer data and the standard question-answer data.
Further, the method further comprises:
auditing the labeling result of the original question-answer data.
Further, the virtual customer service question-answering model is a natural language processing model established by using a convolutional neural network algorithm.
Further, the method further comprises:
and evaluating the virtual customer service question-and-answer model, wherein the evaluating of the virtual customer service question-and-answer model comprises judging whether the virtual customer service question-and-answer model meets an online condition.
In a second aspect, the present application provides a data processing apparatus comprising:
the acquisition module is used for acquiring original question and answer data;
the generating module is used for processing the original question and answer data to generate training data and test data, and the processing operation comprises data cleaning and/or data screening;
the model establishing module is used for establishing a virtual customer service question-answering model and inputting the training data into the virtual customer service question-answering model;
and the model test module is used for inputting the test data into the trained virtual customer service question-answering model to perform question-answering test.
In a third aspect, the present application provides a data processing system comprising:
a memory for storing a computer program and a processor for executing the computer program to implement the data processing method of the first aspect.
The technical scheme provided by the embodiment of the application can have the following beneficial effects:
according to the method, training data and test data are generated by processing the original question-answer data; the training data is input into the virtual customer service question-answering model, and the test data is input into the trained model for a question-answering test. This improves the initial training effect of the model under cold start. Training the model on generated training data reduces the manual effort required and makes the model easier to popularize, and testing the model with the test data allows its question-answering effect to be evaluated, which improves answer accuracy and user experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present application.
Fig. 2 is a flowchart of a data processing method according to another embodiment of the present application.
Fig. 3 is a flowchart of another data processing method according to an embodiment of the present application.
Fig. 4 is a functional block diagram of a data processing apparatus according to an embodiment of the present application.
Fig. 5 is a block diagram of a data processing system according to an embodiment of the present application.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
Fig. 1 is a flowchart of a data processing method according to an embodiment of the present application.
As shown in fig. 1, the data processing method provided in this embodiment includes:
S11: acquiring original question-answer data;
S12: processing the original question-answer data to generate training data and test data, wherein the processing operation comprises data cleaning and/or data screening;
S13: establishing a virtual customer service question-answering model, and inputting the training data into the virtual customer service question-answering model;
S14: inputting the test data into the trained virtual customer service question-answering model to perform a question-answering test.
In the initial stage of a conventional NLP model, the training data collected during cold start is generated by manually simulating user questions and answers, which leads to an insufficient training corpus and a lack of test data. An insufficient training corpus lowers the answer accuracy of the question-answering model, and without test data the actual question-answering effect of the model cannot be evaluated effectively. Moreover, because a standardized data processing flow and a model evaluation method are lacking, the manpower required cannot be estimated. Finally, because the amount of training data is limited, the model is not tested sufficiently, so the virtual robot may give irrelevant answers to user questions, which harms the user experience.
In this embodiment, training data and test data are generated by processing the original question-answer data; the training data is input into the virtual customer service question-answering model, and the test data is input into the trained model for a question-answering test. This improves the initial training effect of the model under cold start. Training the model on generated training data reduces the manual effort required and makes the model easier to popularize, and testing the model with the test data allows its question-answering effect to be evaluated, which improves answer accuracy and user experience.
Fig. 2 is a flowchart of a data processing method according to another embodiment of the present application. As shown in fig. 2, the data processing method of this embodiment includes:
S21: acquiring original question-answer data;
as an optional implementation of the present invention, the original question-answer data includes internal data and external data.
The internal data includes historical question-answer data already existing in the system, that is, the question-answer records kept by the existing question-answering service system. These records can be used directly as original question-answer data, or they can first be screened for records with a low question-answer score. For each question raised by a user, the question-answering service system evaluates the relevance between the question and every answer in the knowledge base; the relevance scores sum to one, and the higher the relevance score of an answer, the more confident the system is that the answer matches the user's question. After each exchange the user is also invited to rate the question-answering effect. Finally, the records with a question-answer score below 0.7 and the records that the user marked as unsatisfactory are selected as original question-answer data. A low question-answer score or an unsatisfied user indicates that the question-answering robot of the original service system answered the question incorrectly, so using these records as original question-answer data helps to update and improve the knowledge base and to raise user satisfaction.
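The preliminary screening of internal records described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `QARecord` fields and the sample logs are hypothetical, and the sketch assumes that a record is selected when either condition holds (score below 0.7, or user marked as unsatisfied).

```python
from dataclasses import dataclass

@dataclass
class QARecord:
    # Hypothetical record layout; the patent does not specify one.
    question: str
    answer: str
    score: float      # relevance score of the returned answer, in [0, 1]
    satisfied: bool   # user rating collected after the exchange

SCORE_THRESHOLD = 0.7  # value given in the text

def select_raw_records(records, threshold=SCORE_THRESHOLD):
    """Keep records the existing system probably answered badly.

    Assumption: low score OR unsatisfied user is enough to select a record.
    """
    return [r for r in records if r.score < threshold or not r.satisfied]

logs = [
    QARecord("where is my parcel", "Track it in the app.", 0.92, True),
    QARecord("parcel broken what now", "Our hours are 9 to 18.", 0.41, False),
    QARecord("change delivery address", "Contact the courier.", 0.81, False),
]
raw = select_raw_records(logs)
```

Here the first record is dropped because it scored well and the user was satisfied; the other two are kept for relabeling.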
The external data includes conversation data recorded by human customer service. External data can be acquired in ways that include, but are not limited to, the following:
Mode one: question-answer data is obtained by searching the existing knowledge base with keywords. One standard question corresponds to several keywords, and searching with some of those keywords retrieves the matching standard question and its answer from the knowledge base. Joining several keywords into the form of a question then produces additional question-answer samples. It should be noted that the keywords may be joined into a sentence by a Python script, or the joining may be done in other ways.
Mode two: recordings of human telephone customer service are converted into text, or the conversation records of human online customer service are obtained directly, and used as original question-answer data. It should be noted that the present invention does not limit how recordings are converted into text or how the conversation records of human online customer service are obtained; those skilled in the art can choose an acquisition method according to actual needs.
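The keyword joining mentioned above could look like the following Python sketch. The function name, the choice of joining every subset of at least two keywords, and the space separator are all assumptions made for illustration; the text only says that several keywords are strung together into a question.

```python
from itertools import combinations

def augment_from_keywords(standard_question, keywords, min_size=2, sep=" "):
    """Build extra (question, standard question) samples from keyword subsets."""
    samples = []
    for size in range(min_size, len(keywords) + 1):
        for combo in combinations(keywords, size):
            # Each joined keyword string is mapped to the same standard question.
            samples.append((sep.join(combo), standard_question))
    return samples

samples = augment_from_keywords(
    "How do I track my parcel?",
    ["parcel", "track", "status"],
)
```

For Chinese text the separator would more plausibly be an empty string, since keywords are written without spaces.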
S22: processing the original question-answer data to generate training data and test data, wherein the processing operation comprises data cleaning and/or data screening;
in some embodiments, the data cleaning includes deleting specific characters from the original question-answer data, the specific characters including one or more of symbols, numbers, letters, interjections and sensitive words.
The original question-answer data contains a large number of symbols, numbers and letters. These characters do not help model training, so they are deleted from the original question-answer data by the data cleaning operation.
It should be noted that the data cleaning operation may be performed with a data cleaning tool or in any other manner that those skilled in the art can implement; the present invention does not limit the method of data cleaning.
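One possible form of the cleaning step is sketched below. It assumes Chinese customer service text, which is why deleting Latin letters is harmless: everything outside the basic CJK ideograph range is dropped, then listed interjections and sensitive words are removed. The word lists are hypothetical placeholders, and a production list would be maintained by the business side.

```python
import re

INTERJECTIONS = ["啊", "呢", "哦"]   # hypothetical interjection list
SENSITIVE_WORDS = ["傻瓜"]           # hypothetical sensitive-word list

# Matches any character that is not a basic CJK unified ideograph,
# which covers symbols, digits, Latin letters and whitespace.
NON_CJK = re.compile(r"[^\u4e00-\u9fff]")

def clean(text):
    """Delete symbols, numbers, letters, interjections and sensitive words."""
    text = NON_CJK.sub("", text)
    for word in SENSITIVE_WORDS + INTERJECTIONS:
        text = text.replace(word, "")
    return text

cleaned = clean("我的快递ABC123到哪了啊？？？")
```

Removing single-character interjections by plain substring replacement can also hit characters inside longer words, so a real pipeline would probably segment the sentence into words first.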
In some embodiments, the data filtering includes filtering the sentences conforming to the business logic according to a preset sentence length, and the sentences include question sentences and answer sentences.
For example, the sentence screening threshold is set to 3 words: a sentence of no more than 3 words cannot fully express the business logic, so only sentences of more than 3 words are retained. It should be noted that the sentence screening threshold can be set according to the actual requirements of the user or the system.
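The length-based screening can be sketched as below. Two points are assumptions rather than statements from the text: length is counted in whitespace-separated words (for Chinese data it would more likely be counted in characters), and a question-answer pair is kept only when both sentences exceed the threshold.

```python
MIN_LENGTH = 3  # threshold used in the example in the text

def long_enough(sentence, min_length=MIN_LENGTH):
    """True when the sentence has more than min_length words."""
    return len(sentence.split()) > min_length

def screen_pairs(pairs, min_length=MIN_LENGTH):
    """Keep pairs whose question and answer both pass the length check."""
    return [
        (q, a) for q, a in pairs
        if long_enough(q, min_length) and long_enough(a, min_length)
    ]

pairs = [
    ("where is my parcel now", "You can track it in the app."),
    ("hello there", "Hello, how can I help you today?"),
]
kept = screen_pairs(pairs)
```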
After the original question-answering data is processed, such as data cleaning and/or data screening, the processed data is respectively used as training data and test data according to a distribution proportion, for example, 80% of the processed data is used as training data, 20% of the processed data is used as test data, the training data is used for model training, and the test data is used for model testing. It should be noted that the data distribution ratio may be set according to actual model requirements.
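A minimal sketch of the 80/20 distribution follows. Shuffling before the split and the fixed seed are assumptions added so that the split is unbiased and reproducible; the text only specifies the ratio.

```python
import random

def split_data(samples, train_ratio=0.8, seed=0):
    """Shuffle the processed samples and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train, test = split_data(range(100))
```

Changing `train_ratio` covers the case where the model calls for a different distribution ratio.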
S23: presetting standard question-answer data, and labeling the original question-answer data to establish mapping between the original question-answer data and the standard question-answer data;
the labeling can be done automatically or manually. If manual labeling is adopted, the large volume of user questions is distributed to the knowledge base maintenance personnel, who decide which item or items of the preset standard question-answer data each user question corresponds to, and the result is used for robot training. Because the amount of original question-answer data is known in advance, the workload of the knowledge base maintenance personnel can be estimated, and so the manpower consumed in the initial stage of building the NLP model can be predicted. For example, if 10000 pieces of original question-answer data are to be labeled and each person labels 1250 pieces per day, 8 people working at the same time can finish the labeling in one day.
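The workload estimate in the example above is a simple division, sketched here with a hypothetical helper name. Rounding up is an assumption for the case where the numbers do not divide evenly.

```python
import math

def annotators_needed(total_items, items_per_person_per_day, days=1):
    """Number of people required to label total_items within the given days."""
    return math.ceil(total_items / (items_per_person_per_day * days))

people_one_day = annotators_needed(10000, 1250)           # figures from the text
people_two_days = annotators_needed(10000, 1250, days=2)  # same job over two days
```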
S24: auditing the original question-answer data labeling result;
because different people understand the same question differently, several people are generally required to label the data at the initial stage of labeling. For example, two people label the same batch of data, and the labels are audited after the labeling is completed.
Auditing the labeling result not only ensures that the labels are correct but also helps managers to compile statistics, for example the labeling progress (individual progress = amount labeled by the individual / number of user questions; overall progress = total amount labeled / number of user questions) and the difference rate (labeling difference rate = number of differing labels / total number of labels in the current batch), and to improve the labeling flow according to these statistics. When the difference rate between the two annotators falls below 15%, labeling can be switched to a single annotator to further raise efficiency. In this way the labeling flow is continuously optimized and problems encountered during labeling are discovered in time.
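The audit statistics above can be computed as in the following sketch. The function names and the sample labels are hypothetical; the formulas and the 15% threshold follow the text.

```python
SINGLE_ANNOTATOR_THRESHOLD = 0.15  # 15%, from the text

def difference_rate(labels_a, labels_b):
    """Share of items in the batch on which two annotators disagree."""
    if not labels_a or len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same non-empty batch")
    differing = sum(1 for a, b in zip(labels_a, labels_b) if a != b)
    return differing / len(labels_a)

def labeling_progress(labeled_count, question_count):
    """Labeled amount divided by the number of user questions."""
    return labeled_count / question_count

annotator_1 = ["track", "refund", "address", "track", "refund",
               "track", "address", "track", "refund", "track"]
annotator_2 = ["track", "refund", "address", "track", "complaint",
               "track", "address", "track", "refund", "track"]

rate = difference_rate(annotator_1, annotator_2)   # one disagreement in ten labels
switch_to_single = rate < SINGLE_ANNOTATOR_THRESHOLD
overall = labeling_progress(len(annotator_1), 40)  # 10 of 40 questions labeled
```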
S25: establishing a virtual customer service question-answering model, and inputting training data into the virtual customer service question-answering model;
the virtual customer service question-answering model is a natural language processing model established by using a convolutional neural network algorithm. By calling the convolutional neural network algorithm, the training data is used to train a question-answering model with generalization ability, which serves as the core of the whole question-answering robot. It should be noted that establishing a natural language processing model with a convolutional neural network algorithm is prior art, and the present invention does not limit the method used.
S26: inputting test data into the trained virtual customer service question-answering model to perform question-answering test;
through the above steps the virtual customer service question-answering model has been trained and the question-answering robot is able to answer questions, so its actual question-answering effect needs to be checked. At this point the test data is input into the trained virtual customer service question-answering model for a question-answering test to measure the question-answering accuracy (question-answering accuracy = number of questions the question-answering service answers correctly / total number of user questions).
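The accuracy measure can be sketched as follows. The trained model is represented by any callable that maps a question to an answer identifier; the dictionary lookup used here is only a stand-in for the trained model, and the answer identifiers are hypothetical.

```python
def qa_accuracy(answer_fn, test_pairs):
    """Correctly answered questions divided by the total number of questions."""
    if not test_pairs:
        raise ValueError("test data must not be empty")
    correct = sum(1 for q, expected in test_pairs if answer_fn(q) == expected)
    return correct / len(test_pairs)

# Stand-in for the trained model: a plain lookup table (assumption).
knowledge = {
    "track parcel": "A1",
    "refund": "A2",
    "change address": "A3",
    "opening hours": "A4",
}
test_pairs = [
    ("track parcel", "A1"),
    ("refund", "A2"),
    ("change address", "A3"),
    ("opening hours", "A9"),   # model returns A4, so this counts as wrong
    ("lost parcel", "A5"),     # model has no answer, so this counts as wrong
]
accuracy = qa_accuracy(knowledge.get, test_pairs)
```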
S27: and evaluating the virtual customer service question-answer model, wherein the evaluation of the virtual customer service question-answer model comprises judging whether the virtual customer service question-answer model meets an online condition.
For example, the threshold of the question-answering accuracy is set to 80%, and the online condition is met when the accuracy is not lower than 80%. It should be noted that the threshold can be determined and adjusted by the business side. If the accuracy does not reach the business expectation, the above steps are repeated one or more times until the question-answering effect meets the business expectation.
It should be noted that questions answered incorrectly during testing are added to the internal data of step S21 and used in the next round of iterative training, which helps the model to improve continuously.
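The online check and the feedback of wrongly answered questions can be sketched together. As before, the lookup table stands in for the trained model and all names are hypothetical; only the 80% threshold and the idea of returning failed questions to the internal data come from the text.

```python
ONLINE_THRESHOLD = 0.8  # 80%, adjustable by the business side

def evaluate_for_launch(answer_fn, test_pairs, threshold=ONLINE_THRESHOLD):
    """Return (meets online condition, accuracy, wrongly answered pairs)."""
    wrong = [(q, expected) for q, expected in test_pairs
             if answer_fn(q) != expected]
    accuracy = (len(test_pairs) - len(wrong)) / len(test_pairs)
    return accuracy >= threshold, accuracy, wrong

stand_in_model = {
    "track parcel": "A1",
    "refund": "A2",
    "change address": "A3",
    "opening hours": "A4",
}
test_pairs = [
    ("track parcel", "A1"),
    ("refund", "A2"),
    ("change address", "A3"),
    ("opening hours", "A4"),
    ("lost parcel", "A5"),   # not covered by the stand-in model
]

internal_data = []  # pool of internal data for the next iteration of S21
ready, accuracy, wrong = evaluate_for_launch(stand_in_model.get, test_pairs)
internal_data.extend(wrong)
```

With four of five questions answered correctly the accuracy is exactly 80%, which meets the "not lower than 80%" condition, while the one failed question is still queued for the next training round.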
In some embodiments, referring to fig. 3, the nodes of the data processing flow comprise:
external log -> cleaning and screening -> data import -> labeling task assignment -> labeling audit -> model training;
and/or,
internal log -> screening -> data import -> labeling task assignment -> labeling audit -> model training.
The external log node performs operations such as combining keywords into sentences, converting telephone speech into text, and collecting human customer service records. The cleaning and screening node has a fixed module that runs a Node/Python script to perform the data cleaning operation. The screening node selects data that users marked as unsatisfactory, data with a question-answer score below 0.7, and the like. The labeling task assignment node distributes the same batch of logs to several people for labeling and switches to single-person labeling once the difference rate among the annotators falls below 15%. The labeling audit node compiles statistics such as the difference rate of multi-person labeling, the individual and overall labeling progress, the knowledge points covered by newly added labels, and the change compared with the previous day's labeling.
In this embodiment, labeling and auditing the original question-answer data ensures the accuracy of the training data and allows the manpower required at the initial stage of the model to be estimated. It also improves data processing efficiency, continuously optimizes the labeling flow, and reveals problems encountered during labeling in time.
Fig. 4 is a functional block diagram of a data processing apparatus according to an embodiment of the present application.
As shown in fig. 4, the data processing apparatus provided in this embodiment includes:
an obtaining module 41, configured to obtain original question-answering data;
a generating module 42, configured to perform processing operations on the original question and answer data to generate training data and test data, where the processing operations include data cleaning and/or data screening;
a model establishing module 43, configured to establish a virtual customer service question-and-answer model, and input training data into the virtual customer service question-and-answer model;
a model test module 44, configured to input test data into the trained virtual customer service question-and-answer model to perform question-and-answer test;
a labeling module 45, configured to label the original question-answer data to establish a mapping between the original question-answer data and the standard question-answer data;
an auditing module 46, configured to audit the labeling result of the original question-answer data;
and an evaluation module 47, configured to evaluate the virtual customer service question-and-answer model, where evaluating the virtual customer service question-and-answer model includes determining whether the virtual customer service question-and-answer model meets an online condition.
In this embodiment, the obtaining module obtains the original question-answer data; the generating module processes the original question-answer data to generate training data and test data; the model establishing module establishes a virtual customer service question-answering model and inputs the training data into it; and the model test module inputs the test data into the trained model to perform a question-answering test. This improves the initial training effect of the model under cold start. Training the model on generated training data reduces the manual effort required and makes the model easier to popularize, and testing the model with the test data allows its question-answering effect to be evaluated, which improves answer accuracy and user experience. Furthermore, the labeling module labels the original question-answer data to map it to the standard question-answer data, the auditing module audits the labeling result, and the evaluation module evaluates the virtual customer service question-answering model by judging whether it meets the online condition. As a result, the manpower required at the initial stage of the model can be estimated, and model training follows a standard and can be evaluated.
Fig. 5 is a block diagram of a data processing system according to an embodiment of the present application. As shown in fig. 5, the data processing system includes:
a memory 51 for storing a program; and
a processor 52 for executing the program stored in the memory 51 to perform the data processing method of any one of the above embodiments.
With regard to the data processing system in the above-described embodiment, the specific manner in which the processor 52 executes the program in the memory 51 has been described in detail in the embodiment related to the method, and will not be described in detail here.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present application, the meaning of "a plurality" means at least two unless otherwise specified.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and the scope of the preferred embodiments of the present application includes other implementations in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, each unit may exist alone physically, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. The integrated module, if implemented as a software functional module and sold or used as a stand-alone product, may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.
It should be noted that the present invention is not limited to the above preferred embodiments. Those skilled in the art may derive products in various other forms in light of the present invention; however, regardless of any change in shape or structure, any technical solution that is the same as or similar to that of the present application falls within the scope of protection of the present invention.

Claims (10)

1. A data processing method, comprising:
acquiring original question-answer data;
processing the original question-answer data to generate training data and test data, wherein the processing operation comprises data cleaning and/or data screening;
establishing a virtual customer service question-answering model, and inputting the training data into the virtual customer service question-answering model;
and inputting the test data into the trained virtual customer service question-answering model to perform a question-answering test.
2. The data processing method of claim 1, wherein the original question-answer data comprises:
internal data and external data, wherein the internal data comprises historical question-answer data already existing in a system, and the external data comprises conversation data recorded and saved by human customer service agents.
3. The data processing method of claim 1, wherein the data cleaning comprises deleting specific characters in the original question-answer data, wherein the specific characters comprise one or more of symbols, numbers, letters, interjections and sensitive words.
4. The data processing method of claim 1, wherein the data screening comprises selecting, according to a preset sentence length, sentences that conform to business logic, the sentences comprising question sentences and answer sentences.
5. The data processing method of claim 1, further comprising, prior to generating the training data and test data:
presetting standard question-answer data;
and labeling the original question-answer data to establish mapping of the original question-answer data and the standard question-answer data.
6. The data processing method of claim 5, further comprising:
auditing the labeling result of the original question-answer data.
7. The data processing method of claim 1, wherein the virtual customer service question-answering model is a natural language processing model established using a convolutional neural network algorithm.
8. The data processing method of claim 1, further comprising:
evaluating the virtual customer service question-answering model, wherein the evaluating comprises judging whether the virtual customer service question-answering model meets a condition for going online.
9. A data processing apparatus, comprising:
an acquisition module configured to acquire original question-answer data;
a generating module configured to process the original question-answer data to generate training data and test data, wherein the processing operation comprises data cleaning and/or data screening;
a model establishing module configured to establish a virtual customer service question-answering model and input the training data into the virtual customer service question-answering model;
and a model test module configured to input the test data into the trained virtual customer service question-answering model to perform a question-answering test.
10. A data processing system, comprising:
a memory for storing a computer program and a processor for executing the computer program to implement the data processing method of any one of claims 1 to 8.
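
As an illustration only, the data cleaning, data screening and training/test split recited in claims 1, 3 and 4 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names, the two word lists, the length bounds of 2 and 50 characters, the 80/20 split ratio and the sample sentence are all hypothetical and do not appear in the application.

```python
import random
import re

# Hypothetical word lists: the claims name these categories (interjections,
# i.e. exclamation words, and sensitive words) but not their contents.
INTERJECTIONS = ("啊", "哦", "嗯")
SENSITIVE_WORDS = ("违禁词",)

# ASCII digits, ASCII letters and underscore, plus any character that is
# neither a word character nor whitespace (punctuation and other symbols).
_SPECIFIC_CHARS = re.compile(r"[0-9A-Za-z_]|[^\w\s]")


def clean(text: str) -> str:
    """Data cleaning: delete the 'specific characters' from one sentence."""
    # Naive substring removal; a real system would likely tokenize first.
    for word in INTERJECTIONS + SENSITIVE_WORDS:
        text = text.replace(word, "")
    text = _SPECIFIC_CHARS.sub("", text)
    return " ".join(text.split())  # collapse leftover whitespace


def screen(pairs, min_len=2, max_len=50):
    """Data screening: keep pairs whose question and answer both fall
    within an assumed preset sentence-length range."""
    return [
        (q, a)
        for q, a in pairs
        if min_len <= len(q) <= max_len and min_len <= len(a) <= max_len
    ]


def build_datasets(raw_pairs, test_ratio=0.2, seed=0):
    """Clean, screen, shuffle and split (question, answer) pairs.

    Returns (training_data, test_data). The split ratio and the seeded
    shuffle are assumptions made for reproducibility of the sketch.
    """
    cleaned = [(clean(q), clean(a)) for q, a in raw_pairs]
    kept = screen(cleaned)
    random.Random(seed).shuffle(kept)
    n_test = int(len(kept) * test_ratio)
    return kept[n_test:], kept[:n_test]


if __name__ == "__main__":
    # Invented sample sentence, for demonstration only.
    print(clean("您好!!我的快递123到哪了啊?"))  # 您好我的快递到哪了
```

The training data returned by `build_datasets` would then be fed to whatever question-answering model is established, and the held-out test data used for the question-answering test; the model itself is outside the scope of this sketch.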
CN201911279836.9A 2019-12-13 2019-12-13 Data processing method, device and system Pending CN111061852A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911279836.9A CN111061852A (en) 2019-12-13 2019-12-13 Data processing method, device and system

Publications (1)

Publication Number Publication Date
CN111061852A true CN111061852A (en) 2020-04-24

Family

ID=70300953

Country Status (1)

Country Link
CN (1) CN111061852A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117436551A (en) * 2023-12-18 2024-01-23 杭州宇谷科技股份有限公司 Training method and system for intelligent customer service model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280095A (en) * 2017-01-06 2018-07-13 南通使爱智能科技有限公司 Intelligent virtual customer service system
CN109558952A (en) * 2018-11-27 2019-04-02 北京旷视科技有限公司 Data processing method, system, equipment and storage medium
CN109829375A (en) * 2018-12-27 2019-05-31 深圳云天励飞技术有限公司 A kind of machine learning method, device, equipment and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yang Wuqi et al.: "事业单位内部控制" (Internal Control of Public Institutions), China Economic Publishing House, pages: 159 - 161 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200424