CN117709969A - Customer service scene-oriented generation matching type large model construction method, medium and equipment - Google Patents


Info

Publication number: CN117709969A
Application number: CN202311760197.4A
Authority: CN (China)
Prior art keywords: customer service, large model, text, dialogue, data
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 张通, 邓忠易, 陈俊龙
Current and original assignees: Guangdong Provincial Laboratory of Artificial Intelligence and Digital Economy (Guangzhou); South China University of Technology (SCUT) (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Guangdong Provincial Laboratory of Artificial Intelligence and Digital Economy (Guangzhou) and South China University of Technology (SCUT)
Priority to CN202311760197.4A
Publication of CN117709969A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides a method, medium, and device for constructing a generation-matching large model oriented to customer service scenarios. The method comprises a model setup stage, a pre-training stage, a domain migration stage, and a downstream fine-tuning stage, executed in sequence. The pre-training stage refers to: pre-training the large-model base of the intelligent customer service large model on texts from a cross-domain Chinese corpus as samples. The domain migration stage refers to: taking customer service scenario data as samples and performing weakly supervised training of the large-model base of the intelligent customer service large model. The downstream fine-tuning stage refers to: taking manually annotated customer service data as samples and training the intelligent customer service large model so that it learns the knowledge relevant to new business. By realizing and optimizing the functions of the large model step by step in stages, the method gains the capability to deeply mine knowledge from large-scale customer service text data, while also migrating accurately and expanding rapidly to newly added business demands and changed business content.

Description

Customer service scene-oriented generation matching type large model construction method, medium and equipment
Technical Field
The invention relates to the technical field of computer model construction, and in particular to a method, medium, and device for constructing a generation-matching large model oriented to customer service scenarios.
Background
An intelligent customer service large model is an application for the customer service industry built on large-scale knowledge processing. In existing customer service scenarios, the core function of intelligent customer service is human-machine dialogue, currently realized mainly through generative natural-language large models. After the large model is pre-trained on corpora from different domains, the system converts the received human language into sequence data and feeds it into the large model, captures the information in the dialogue through a series of working mechanisms, applies judgment logic derived from human experience and business rules to filter out invalid information, and finally converts the output back into natural-language form to realize the human-machine dialogue.
However, existing intelligent customer service large models suffer from the following technical problems:
(I) Conventional large-model modeling schemes struggle to meet engineering needs:
Conventional large-model modeling schemes place extremely high demands on the scale and quality of annotated data. Moreover, adapting to different downstream tasks requires extending the original model architecture and fine-tuning the original parameters, and this secondary development often brings a non-negligible extra workload. Intelligent customer service scenarios pose still more complex practical situations: in a dialogue between an incoming caller and customer service, the problems raised by a customer usually span a great variety of services, and the service content is continuously and iteratively updated, so the conventional pre-training-plus-fine-tuning scheme can hardly keep up with the update frequency of a specific service.
(II) Pseudo-label generation schemes based on deep learning are limited in customer service scenarios:
Customer service dialogues involve a wide variety of complicated business problems, so annotating dialogue texts well requires customer service personnel with a certain level of expertise and working experience, and pre-training a large model to a satisfactory effect often requires large-scale annotated data. Existing deep-learning-based pseudo-label generation schemes need to build a separate deep learning model, which only generates well after being trained on annotated data of a certain scale; this inevitably brings additional development and annotation cost. For a large model that relies on massive training data to guarantee high accuracy, how to make full use of manual annotation information, reduce the annotation workload, and fully combine human experience with learned rules becomes a critical problem to be solved.
(III) The large amount of noise contained in pseudo-label data affects the training of the intelligent customer service large model:
Although automatic annotation by generating pseudo-label data with a deep learning model has become the mainstream data-enhancement method at the present stage, compared with manual annotation it is difficult to guarantee the quality of the labeled data, and a certain amount of noise is usually present. A data set in a real scenario can therefore be divided into a small amount of manually annotated fine-label data and a large amount of noisy pseudo-label data; if the whole data set is used for large-model training, simply minimizing the overall loss function cannot achieve the expected effect. How to prevent the model from learning incorrect knowledge from large amounts of noisy data remains a major concern.
(IV) Existing intelligent customer service large models have a single function and can hardly meet rich business demands:
In existing intelligent customer service large-model schemes, the core function is limited to human-machine dialogue, and functions such as attribution analysis that could genuinely assist customer service staff in handling incoming consultations are not covered. The technical architecture usually combines a generative model with various rules to reply to user questions; the final output is still dialogue text, leaving a gap between the output and the actual business demands.
Disclosure of Invention
To overcome the defects and shortcomings of the prior art, the invention aims to provide a method, medium, and device for constructing a generation-matching large model oriented to customer service scenarios. By realizing and optimizing the functions of the large model step by step in stages, the method gains the capability to deeply mine knowledge from large-scale customer service text data, while also migrating accurately and expanding rapidly to newly added business demands and changed business content. It can adapt to different computing resources and data scales, strengthen managers' control over the algorithm development process, and provide targeted solutions for continuously updated business demands.
To achieve the above purpose, the invention is realized by the following technical scheme: a method for constructing a generation-matching large model oriented to customer service scenarios comprises a model setup stage, a pre-training stage, a domain migration stage, and a downstream fine-tuning stage, executed in sequence;
the model setup stage refers to: setting up an intelligent customer service large model based on the Transformer architecture; the intelligent customer service large model comprises a large-model base and a mapping module connected to the output of the large-model base;
the pre-training stage refers to: pre-training the large-model base of the intelligent customer service large model on texts from a cross-domain Chinese corpus as samples, so that the model acquires strong generalization capability;
the domain migration stage refers to: taking customer service scenario data as samples, and performing weakly supervised training of the large-model base using the fine-label data among the customer service scenario data together with pseudo-label data obtained through automatic annotation, so that the model acquires strong domain knowledge;
the downstream fine-tuning stage refers to: taking manually annotated customer service data as samples; splicing each sample with the corresponding business-demand prompt template according to the business demand to obtain task data; and training the intelligent customer service large model with the task data so that it learns the knowledge relevant to the new business.
Preferably, the domain migration stage comprises the following steps:
X1, collecting customer service scenario data; the customer service scenario data are classified into user information one, product information, and customer service dialogues; the customer service dialogues comprise dialogue texts one transcribed from speech;
the dialogue texts one comprise labeled dialogue texts one and unlabeled dialogue texts one, with the unlabeled dialogue texts outnumbering the labeled ones;
X2, when a dialogue text one is labeled, forming fine-label data in the form "dialogue text - label";
when a dialogue text one is unlabeled, acquiring the user information one corresponding to it; converting the user information one into a text description according to the context template and splicing it with the dialogue text one to obtain a dialogue spliced text one; automatically annotating the dialogue spliced text to obtain the summary corresponding to the dialogue text; combining the dialogue text one with the summary to form pseudo-label data in the form "dialogue text - summary", thereby achieving data enhancement;
X3, feeding the fine-label data and the pseudo-label data respectively into the large-model base of the intelligent customer service large model; during forward propagation, computing the loss gradients of the fine-label data and the pseudo-label data and comparing their degree of similarity:
if the loss gradient directions are consistent, the training is judged effective, and backward propagation is carried out to complete the weakly supervised training;
otherwise, the loss gradient of the pseudo-label data is set to zero before backward propagation is carried out to complete the weakly supervised training.
Preferably, in step X2, automatically annotating the dialogue spliced text to obtain the summary corresponding to the dialogue text comprises the following sub-steps:
x21, initializing a summary set;
X22, splitting the dialogue spliced text into n sentences;
X23, splicing each sentence with all sentences in the summary set to obtain summary spliced sentences;
X24, calculating, for each summary spliced sentence, the longest common subsequence length L with the remaining sentences of the dialogue spliced text; removing the sentence corresponding to the maximum L (Lmax) from the dialogue spliced text and storing it in the summary set;
X25, judging whether the number of sentences s in the summary set has reached the set number: if so, taking the current summary set as the summary of dialogue text one; otherwise, jumping back to step X22 to continue increasing the number of sentences s in the summary set.
Preferably, the context template comprises the content of user information one and the content of dialogue text one; the content of user information one comprises the phone balance, package name, package tariff, and package data allowance.
Preferably, the downstream fine-tuning stage comprises the following sub-steps:
Y1, collecting manually annotated customer service data; the manually annotated data comprise labeled dialogue texts two; acquiring the user information two corresponding to each dialogue text two; setting a business-demand prompt template according to the business demand; converting the user information two into a text description according to the context template, splicing it with the dialogue text two, and further splicing the business-demand prompt template to obtain the task data;
Y2, setting the candidate labels of a candidate label list;
Y3, feeding the task data and the candidate labels respectively into the large-model base of the intelligent customer service large model to obtain a generated-information sentence vector and candidate-label sentence vectors; computing, through a matching algorithm, the similarity between the generated-information sentence vector and each candidate-label sentence vector; taking the candidate label whose sentence vector has the highest similarity as the predicted label;
Y4, computing a loss function from the predicted label and the actual label of dialogue text two; when the value of the loss function converges or the number of iterations reaches the set count, ending the downstream fine-tuning stage; otherwise, adjusting the parameters of the intelligent customer service large model and returning to sub-step Y3 to continue training.
Preferably, the execution frequency of the pre-training phase < the execution frequency of the domain migration phase < the execution frequency of the downstream fine tuning phase.
Preferably, in the intelligent customer service large model, the large-model base comprises an encoder and a decoder, and the mapping module comprises a plurality of sequentially connected fully connected layers.
A readable storage medium, wherein the storage medium stores a computer program which, when executed by a processor, causes the processor to perform the above method for constructing a generation-matching large model oriented to customer service scenarios.
A computer device comprising a processor and a memory storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the above method for constructing a generation-matching large model oriented to customer service scenarios.
Compared with the prior art, the invention has the following advantages and beneficial effects:
First, according to the real working content and data characteristics of customer service, the whole modeling flow of the intelligent customer service large model is decomposed into four stages: model setup, pre-training, domain migration, and downstream fine-tuning. The functions of the large model are realized and optimized step by step in stages, so that the model gains the capability to deeply mine knowledge from large-scale customer service text data while migrating accurately and expanding rapidly to newly added business demands and changed business content. In this multi-stage modeling method, the pre-training stage effectively solves the cold-start problem of customer service dialogue by learning from open-domain public corpora; the domain migration stage mines a large amount of effective information from unlabeled dialogue texts using only a small amount of labeled data; and the downstream fine-tuning stage migrates the capability of the large model quickly and accurately to a specific business through a matching algorithm, avoiding the resource consumption of secondary development. This staged modeling method accurately mirrors the whole process of customer service work; it can adapt to different computing resources and data scales, strengthen managers' control over the algorithm development process, and provide targeted solutions for continuously updated business demands.
Second, an automatic annotation algorithm oriented to dialogue text extracts summary information from each dialogue to construct a corpus in the form "original text - summary", generating the pseudo-label data required for the domain migration training of the large model from massive original dialogue data. Compared with building a separate deep learning model to generate pseudo-label data, the automatic annotation algorithm focuses on the characteristics of the data itself: it mines the language patterns of the specific scenario through statistical analysis of the distribution of real data and requires no additional annotated data, greatly reducing development workload and computing cost. It can fully exploit the existing massive original dialogue data to generate a large amount of pseudo-label dialogue data in a short time, greatly reducing the time and labor costs of annotation work.
Third, a weakly supervised pre-learning method based on data enhancement combines the large amount of pseudo-annotated data generated by the automatic annotation algorithm with a small amount of manually annotated data, realizing data enhancement in customer service scenarios in a weakly supervised manner. The manually annotated dialogue texts contain language patterns unique to customer service scenarios and can be regarded as strongly supervised data; the automatic annotation algorithm can obtain large-scale pseudo-label data at low cost, but its quality is lower than that of manual annotation, so it can be regarded as noisy, weakly supervised data. Under a training set of uneven quality, the proposed weakly supervised training method takes the larger amount of noisy data as the model's main learning object and, guided by the small amount of strongly supervised data, mines effective information from the large amount of noisy data, thereby improving the model's effect.
Fourth, a matching algorithm based on the generative large model completes the fine-tuning of the intelligent customer service large model with the annotated data of a specific business scenario; on that basis, the business data and the dialogue text are fed into the large model together and encoded, and the functions required by the specific business scenario are finally realized through feature matching. The matching algorithm trains and guides the generation capability of the large model under different application scenarios by setting different prompt templates, and connects the model's generation capability with business demands through feature matching, so it adapts well to continuously iterated business content and newly added business scenarios while avoiding the cost of secondary development.
Drawings
FIG. 1 is a flow chart of the method of the invention for constructing a generation-matching large model oriented to customer service scenarios;
FIG. 2 is a schematic diagram of the pre-training stage of the method;
FIG. 3 is a flow chart of the automatic annotation algorithm of the method;
FIG. 4 is a flow chart of the weakly supervised training of the method;
FIG. 5 is a schematic diagram of the encoder and decoder in the downstream fine-tuning stage of the method;
FIG. 6 is a schematic diagram of the sentence-vector mapping of the method.
Detailed Description
The invention is described in further detail below with reference to the drawings and the detailed description.
Example 1
This embodiment of the method for constructing a generation-matching large model oriented to customer service scenarios comprises a model setup stage, a pre-training stage, a domain migration stage, and a downstream fine-tuning stage, executed in sequence, as shown in fig. 1.
The prevailing application paradigm of large models is fine-tuning on the basis of pre-training, but in real-life application scenarios human dialogue is complex and highly ambiguous, and simply applying this paradigm cannot achieve satisfactory results; the invention therefore derives a modeling method suited to intelligent customer service scenarios by analyzing the working content and data characteristics of those scenarios.
The model setup stage refers to: setting up an intelligent customer service large model based on the Transformer architecture; the intelligent customer service large model comprises a large-model base and a mapping module consisting of a plurality of fully connected layers; the large-model base comprises an encoder and a decoder.
The pre-training stage refers to: pre-training the large-model base of the intelligent customer service large model on texts from a cross-domain Chinese corpus as samples, so that the model acquires strong generalization capability. The pre-training stage is domain- and task-independent; the training of the large-model base is completed with an open-domain Chinese corpus.
In the pre-training stage, the encoder converts the input text into sequence data, extracts high-order semantic information such as grammar, part of speech, and entities from it, and maps this information into feature vectors; the decoder remaps the feature vectors output by the encoder into a data sequence, and the model parameters are continuously optimized through the supervision signals provided by the data labels. On the basis of massive Chinese corpora, the data distribution characteristics of the Chinese language are learned through rich pre-training tasks such as masked-word restoration, topic classification, and summary extraction, and a Chinese semantic space is constructed. This requires collecting and cleaning data from multiple sources, constructing a cross-domain Chinese corpus, and converting the data according to the different task templates. The intelligent customer service large model output by the pre-training stage has strong Chinese semantic understanding and generalization capability. The training method of the pre-training stage can be implemented with the prior art.
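As an illustration of converting raw corpus text according to a task template, the following sketch builds one training pair for the masked-word-restoration task. The character-level masking, the 15% default rate, and the `[MASK]` token are assumptions for illustration, not details fixed by the patent.

```python
import random

def make_masked_sample(text, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Build one masked-word-restoration training pair: the input with some
    characters replaced by the mask token, and a dict mapping each masked
    position back to the original character (the supervision signal)."""
    rng = random.Random(seed)
    chars = list(text)
    labels = {}
    for i, ch in enumerate(chars):
        if rng.random() < mask_rate:   # mask this character
            labels[i] = ch
            chars[i] = mask_token
    return "".join(chars), labels

# mask_rate=1.0 only to make the example deterministic: every character masked.
masked, labels = make_masked_sample("今天天气很好", mask_rate=1.0)
```

The model would then be trained to restore `labels` from `masked`, which is how the data labels provide the supervision signal described above.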
The domain migration stage refers to: taking customer service scenario data as samples, and performing weakly supervised training of the large-model base of the intelligent customer service large model using the fine-label data among the customer service scenario data together with pseudo-label data obtained through automatic annotation, so that the model acquires strong domain knowledge.
The downstream fine-tuning stage refers to: taking manually annotated customer service data as samples; splicing each sample with the corresponding business-demand prompt template according to the business demand to obtain task data; and training the intelligent customer service large model with the task data so that it learns the knowledge relevant to the new business.
Specifically, in the pre-training stage, the encoder converts the input text into sequence data, extracts high-order semantic information such as grammar, part of speech, and entities, and maps it into feature vectors; the decoder remaps the feature vectors output by the encoder into a data sequence, and the model parameters are continuously optimized through the supervision signals provided by the data labels, as shown in fig. 2.
The core of the domain migration stage is to use the data of the customer service scenario to achieve knowledge migration of the large model into the vertical domain.
The domain migration stage comprises the following steps:
X1, collecting customer service scenario data; the customer service scenario data are classified into user information one, product information, and customer service dialogues.
The user information one comprises the phone balance, package name, package tariff, package data allowance, and the like, together with records of recent self-service operations, incoming consultations, and complaints; the product information comprises the detailed content of the various packages and services, for example a campus package with its tariff, free call duration, directed data, and other package content; the customer service dialogues comprise dialogue texts one transcribed from speech, together with periodic manual spot-check records (each dialogue text one corresponds to one manually written summary sentence).
The user information one and product information can provide key information for recognizing the intent of a dialogue, and the customer service dialogue reflects the flow of analyzing and solving problems on the basis of the user information one and product information, so together they constitute the main learning objects of the intelligent customer service large model.
The dialogue texts one comprise labeled dialogue texts one and unlabeled dialogue texts one, with the unlabeled texts outnumbering the labeled ones. The labeled dialogue texts one are taken as fine-label data; the unlabeled dialogue texts one are automatically annotated to convert them into pseudo-label data.
X2, when a dialogue text one is labeled, fine-label data are formed in the form "dialogue text - label";
when a dialogue text one is unlabeled, each dialogue text needs to be matched with user information one in order to enrich the context of the current dialogue: the user information one corresponding to the dialogue text one is acquired, converted into a text description according to the context template, and spliced with the dialogue text one to obtain the dialogue spliced text one. The context template comprises the content of user information one and the content of dialogue text one; the content of user information one comprises the phone balance, package name, package tariff, and package data allowance. Assume the context template is as shown in Table 1.
Table 1 Context template

Field: Content
Phone balance: 20 yuan
Package name: Campus super package
Package tariff: 29 yuan/month
Package data: 30 G domestic general data, 50 G directed data
Dialogue content: Hello, my package does not have enough data; is there a package with more data, preferably …
The dialogue spliced text is then: "The current user's phone balance is (20 yuan), the package in use is (Campus super package), the package tariff is (29 yuan/month), the current dialogue content is (Hello, my package does not have enough data; is there a package with more data, preferably …), and the package includes data of (30 G domestic general data, 50 G directed data)."
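The splicing of the context template with the dialogue text can be sketched as simple template filling; the field names and English wording below follow the worked example above and are illustrative rather than fixed by the patent.

```python
def splice_context(user_info, dialogue):
    """Step X2: fill the context template with user information one and
    splice it with dialogue text one to obtain dialogue spliced text one."""
    return (
        "The current user's phone balance is ({balance}), "
        "the package in use is ({package}), "
        "the package tariff is ({tariff}), "
        "the current dialogue content is ({dialogue}), "
        "and the package includes data of ({data})."
    ).format(dialogue=dialogue, **user_info)

text = splice_context(
    {"balance": "20 yuan", "package": "Campus super package",
     "tariff": "29 yuan/month",
     "data": "30 G domestic general data, 50 G directed data"},
    "Hello, my package does not have enough data; is there a package "
    "with more data, preferably ...",
)
```

The resulting string reproduces the dialogue spliced text of the example above.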
The dialogue spliced text is automatically annotated to obtain the summary corresponding to the dialogue text; the dialogue text one and the summary are combined to form pseudo-label data in the form "dialogue text - summary", thereby achieving data enhancement. Specifically, as shown in fig. 3, the process comprises the following sub-steps:
x21, initializing a summary set;
X22, splitting the dialogue spliced text into n sentences;
X23, splicing each sentence with all sentences in the summary set to obtain summary spliced sentences;
X24, calculating, for each summary spliced sentence, the longest common subsequence length L with the remaining sentences of the dialogue spliced text; removing the sentence corresponding to the maximum L (Lmax) from the dialogue spliced text and storing it in the summary set;
X25, judging whether the number of sentences s in the summary set has reached the set number: if so, taking the current summary set as the summary of dialogue text one; otherwise, jumping back to step X22 to continue increasing the number of sentences s in the summary set.
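Sub-steps X21 to X25 can be sketched as the following greedy extractive procedure. The character-level longest-common-subsequence scoring follows the description above, while the sentence splitting (a pre-split list) and tie-breaking (first maximum) are assumptions of this sketch.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two strings (rolling DP)."""
    dp = [0] * (len(b) + 1)
    for ca in a:
        prev = 0
        for j, cb in enumerate(b, 1):
            cur = dp[j]
            dp[j] = prev + 1 if ca == cb else max(dp[j], dp[j - 1])
            prev = cur
    return dp[-1]

def extract_summary(sentences, k):
    """Steps X21-X25: repeatedly move into the summary set the sentence whose
    summary-spliced form shares the longest common subsequence with the
    remaining sentences of the dialogue spliced text."""
    summary = []                        # X21: initialise the summary set
    remaining = list(sentences)         # X22: sentences of the spliced text
    while len(summary) < k and remaining:
        scores = []
        for s in remaining:
            spliced = "".join(summary) + s                   # X23
            rest = "".join(x for x in remaining if x is not s)
            scores.append(lcs_len(spliced, rest))            # X24: length L
        best = scores.index(max(scores))
        summary.append(remaining.pop(best))  # remove the Lmax sentence
    return summary                           # X25: stop at k sentences
```

The intuition is that a sentence overlapping heavily with the rest of the dialogue carries its central content, so it is a good extractive summary candidate.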
X3, the fine-label data and the pseudo-label data are fed respectively into the large-model base of the intelligent customer service large model for weakly supervised training. Weakly supervised training refers to the following, as shown in fig. 4: during one forward propagation, the loss gradients of the fine-label data and the pseudo-label data are computed and their degree of similarity is compared:
if the loss gradient directions are consistent, the optimization direction of the training data is judged to agree with that of the clean data and the training is effective; backward propagation is then carried out to complete the weakly supervised training;
otherwise, the noise is judged to interfere with training on the clean data, the loss gradient of the pseudo-label data is set to zero, and backward propagation is then carried out to complete the weakly supervised training.
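A minimal sketch of this gradient check, assuming the two loss gradients are flattened into plain vectors and using the sign of their inner product as the direction-consistency test (the patent speaks only of comparing gradient directions, so the exact similarity measure is an assumption of this sketch):

```python
def gated_gradient(grad_fine, grad_pseudo):
    """Step X3: keep the pseudo-label loss gradient only if it points in the
    same direction as the fine-label loss gradient; otherwise zero it out
    before backward propagation."""
    dot = sum(f * p for f, p in zip(grad_fine, grad_pseudo))
    if dot > 0:   # directions consistent: pseudo-label training is effective
        return [f + p for f, p in zip(grad_fine, grad_pseudo)]
    return list(grad_fine)   # pseudo-label gradient set to zero

# Consistent directions: both gradients contribute to the update.
g_keep = gated_gradient([1.0, 0.0], [0.5, 0.5])
# Opposed directions: the noisy pseudo-label gradient is dropped.
g_drop = gated_gradient([1.0, 0.0], [-0.5, 0.5])
```

In a real training loop this gating would be applied per step between the forward and backward passes, so that noisy pseudo-label batches cannot pull the model away from the optimization direction of the clean data.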
The goal of the downstream fine-tuning stage is to provide solutions for different business demands: the intelligent customer service large model learns the knowledge relevant to a new business from a small amount of manually annotated data, thereby improving performance on specific tasks.
The downstream fine-tuning stage comprises the following sub-steps:
Y1, collecting manually annotated customer service data; the manually annotated data comprise labeled dialogue texts two; acquiring the user information two corresponding to each dialogue text two; setting a business-demand prompt template according to the business demand; converting the user information two into a text description according to the context template, splicing it with the dialogue text two, and further splicing the business-demand prompt template to obtain the task data;
User dialogue attribution analysis is an important function in intelligent customer service scenarios: it provides key analytical results for enterprises, helps customer service staff discover and solve problems in time, and ultimately improves user satisfaction. This embodiment takes user dialogue attribution analysis as an example to describe the work of the downstream fine tuning stage. Attribution analysis means that customer service staff, according to a user's incoming call and dialogue content and in combination with existing information, classify the current dialogue under a specific service or product. For example, to analyze which products drew the most attention in user calls and dialogues over a given period, a product attribution prompt template can be constructed; to locate the causes of user complaints and product faults, a complaint attribution prompt template can be constructed. For other business requirements, different types of prompt templates can be constructed according to the characteristics of the business data, providing auxiliary information for the input data of the intelligent customer service large model.
When the service demand is product attribution, the content of the service demand prompt template is: "What product does the current conversation belong to?"; when the service demand is complaint attribution, the content of the service demand prompt template is: "What is the complaint content of the current conversation?"
Taking complaint attribution as an example, the task data is:
"The current user's telephone charge balance is (20 yuan), the package in use is (campus super package), the package charge is (29 yuan per month), the current dialogue content is (Hello, the traffic in my package is insufficient; is there a package with more traffic, preferably …), and the package traffic is (30 GB of domestic general traffic, 50 GB of directional traffic). What is the complaint content of the current conversation?"
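The task-data construction in Y1 is plain template splicing. A sketch, where the dictionary keys, field order and template wording are illustrative rather than fixed by the method:

```python
TEMPLATES = {
    # service demand prompt templates quoted in the description
    "product_attribution": "What product does the current conversation belong to?",
    "complaint_attribution": "What is the complaint content of the current conversation?",
}

def build_task_data(user_info: dict, dialogue: str, demand: str) -> str:
    """Convert the user information into a text description via a context
    template, splice the dialogue text, then splice the service demand
    prompt template (step Y1)."""
    context = (
        f"The current user's telephone charge balance is ({user_info['balance']}), "
        f"the package in use is ({user_info['package']}), "
        f"the package charge is ({user_info['charge']}), "
        f"the package traffic is ({user_info['traffic']})."
    )
    return f"{context} The current dialogue content is ({dialogue}). {TEMPLATES[demand]}"
```

The resulting string is fed to the large model base as a single input sequence.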
Y2, querying the service system for related products and activities according to keywords, taking them as candidate labels and storing them in a candidate label list;
Y3, inputting the task data into the encoder of the intelligent customer service large model, with the decoder of the intelligent customer service large model outputting the decoder generated information; the decoder generated information is input into the mapping module of the intelligent customer service large model and mapped through several fully connected layers of the mapping module to obtain the generated information sentence vector, as shown in fig. 5; the candidate labels are likewise input into the intelligent customer service large model to obtain the candidate label sentence vectors;
The encoder, based on the attention mechanism, deeply mines key information of the input text such as part of speech, grammar, contextual association and relative position, and encodes it into implicit feature vectors that serve as the input of the decoder. The decoder is responsible for reconstructing the implicit feature vectors into a label for the input text. Through this encoding-decoding workflow, language patterns and contextual knowledge in customer service scenarios are mined from large-scale "text-label" sample pairs.
The aim of sentence vector mapping is to feed the task data and the candidate label texts into the large model for encoding at the same time and to perform similarity matching, thereby achieving efficient matching between task data and labels.
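A minimal sketch of the mapping module: a stack of fully connected layers that projects the decoder generated information onto a fixed-size sentence vector. The layer sizes, weights and ReLU activations here are illustrative assumptions; a real implementation would use a deep-learning framework with learned parameters.

```python
def linear(x, W, b):
    # one fully connected layer: y = W x + b (plain-Python matvec)
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, u) for u in v]

def mapping_module(decoder_info, layers):
    """Map the decoder's generated information onto a sentence vector
    through a stack of fully connected layers (hidden layers use ReLU)."""
    h = decoder_info
    for W, b in layers[:-1]:
        h = relu(linear(h, W, b))
    W, b = layers[-1]
    return linear(h, W, b)  # last layer outputs the sentence vector
```

The same module is applied to both the task data's generated information and each candidate label, so both land in one comparable vector space.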
A matching algorithm computes the cosine similarity between the generated information sentence vector and each candidate label sentence vector; the candidate label corresponding to the candidate label sentence vector with the highest similarity is taken as the predicted label, as shown in fig. 6.
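The cosine-matching step can be sketched as follows, assuming sentence vectors are plain lists of floats; the candidate label names and vectors are illustrative:

```python
import math

def cosine(u, v):
    # cosine similarity between two sentence vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def match_label(gen_vec, candidate_vecs):
    """Return the candidate label whose sentence vector has the highest
    cosine similarity with the generated information sentence vector."""
    return max(candidate_vecs, key=lambda label: cosine(gen_vec, candidate_vecs[label]))
```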
Y4, calculating a loss function from the predicted label and the actual label of the second dialogue text; when the value of the loss function converges or the number of iterations reaches the set count, the downstream fine tuning stage ends; otherwise, the parameters of the intelligent customer service large model are adjusted and training continues from substep Y3.
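The Y4 control flow (stop on loss convergence or on reaching the iteration cap) can be sketched as below; `step_fn` stands in for one Y3 forward pass plus a parameter adjustment, and the convergence tolerance is an assumption of this sketch:

```python
def fine_tune(step_fn, max_iters=100, tol=1e-4):
    """Run fine-tuning steps until the loss converges (successive losses
    differ by less than tol) or the iteration cap is reached; returns the
    iteration count and the final loss."""
    prev = float("inf")
    for it in range(1, max_iters + 1):
        loss = step_fn()  # one Y3 pass plus a parameter adjustment
        if abs(prev - loss) < tol:
            return it, loss  # loss has converged: end the stage
        prev = loss
    return max_iters, prev  # iteration budget exhausted
```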
Embodiment Two
The readable storage medium of this embodiment stores a computer program which, when executed by a processor, causes the processor to perform the customer service scene-oriented generation matching type large model construction method of Embodiment One.
Embodiment Three
The computer device of this embodiment comprises a processor and a memory for storing a program executable by the processor; when the processor executes the program stored in the memory, the customer service scene-oriented generation matching type large model construction method of Embodiment One is implemented.
The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto; any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and is included within the protection scope of the present invention.

Claims (9)

1. A customer service scene-oriented generation matching type large model construction method, characterized in that it comprises a model setting stage, a pre-training stage, a domain migration stage and a downstream fine tuning stage which are executed in sequence;
the model setting stage refers to: setting an intelligent customer service large model based on the Transformer architecture, the intelligent customer service large model comprising a large model base and a mapping module connected to the output end of the large model base;
the pre-training stage refers to: using texts from a cross-domain Chinese corpus as samples to pre-train the large model base of the intelligent customer service large model, so that the intelligent customer service large model has strong generalization capability;
the domain migration stage refers to: adopting customer service scene data as samples, and performing weak supervision training on the large model base of the intelligent customer service large model by using the fine label data in the customer service scene data and the pseudo label data obtained through automatic labeling, so that the intelligent customer service large model acquires strong domain knowledge;
the downstream fine tuning stage refers to: adopting manually annotated customer service scene data as samples, splicing each sample with the corresponding service demand prompt template according to the business requirement to obtain task data, and training the intelligent customer service large model with the task data so that it learns the knowledge relevant to the new business.
2. The customer service scene-oriented generation matching type large model construction method as claimed in claim 1, wherein the domain migration stage comprises the following steps:
X1, collecting customer service scene data; classifying the customer service scene data into first user information, product information and customer service dialogues; the customer service dialogues comprise first dialogue texts transcribed from speech;
the first dialogue texts comprise labeled first dialogue texts and unlabeled first dialogue texts; the number of labeled first dialogue texts is greater than the number of unlabeled first dialogue texts;
X2, when a first dialogue text is a labeled first dialogue text, forming fine label data in the form "dialogue text-label";
when a first dialogue text is an unlabeled first dialogue text, acquiring the first user information corresponding to it; converting the first user information into a text description according to the context template and splicing it with the first dialogue text to obtain a first dialogue spliced text; performing automatic labeling processing on the dialogue spliced text to obtain the abstract corresponding to the first dialogue text; and combining the first dialogue text with the abstract to form pseudo label data in the form "dialogue text-abstract", thereby realizing data enhancement;
X3, inputting the fine label data and the pseudo label data respectively into the large model base of the intelligent customer service large model; in the forward propagation process, calculating the loss gradients of the label data and of the pseudo label data, and comparing the similarity of the loss gradients:
if the loss gradient directions are consistent, the training is judged to be effective, and backward propagation is then performed to complete the weak supervision training;
otherwise, the loss gradient of the pseudo label data is set to zero, and backward propagation is then performed to complete the weak supervision training.
3. The customer service scene-oriented generation matching type large model construction method as claimed in claim 2, wherein in step X2, performing automatic labeling processing on the dialogue spliced text to obtain the abstract corresponding to the first dialogue text comprises the following sub-steps:
X21, initializing an abstract set;
X22, splitting the dialogue spliced text into n sentences;
X23, splicing each sentence with all sentences in the abstract set to obtain an abstract spliced sentence;
X24, calculating the longest common subsequence length L between each abstract spliced sentence and the remaining sentences of the dialogue spliced text; removing the sentence corresponding to the maximum L from the dialogue spliced text and storing it into the abstract set;
X25, judging whether the number of sentences s in the abstract set has reached the set number of sentences: if so, taking the current abstract set as the abstract of the first dialogue text; otherwise, returning to step X22 to continue increasing the number of sentences s in the abstract set.
4. The customer service scene-oriented generation matching type large model construction method as claimed in claim 2, wherein the context template comprises the content of the first user information and the content of the first dialogue text; the content of the first user information comprises the telephone charge balance, the package name, the package charge and the package traffic.
5. The customer service scene-oriented generation matching type large model construction method as claimed in claim 2, wherein the downstream fine tuning stage comprises the following sub-steps:
Y1, collecting manually annotated data of the customer service scene, the manually annotated data comprising labeled second dialogue texts; acquiring the second user information corresponding to each second dialogue text; setting a service demand prompt template according to the business requirement; converting the second user information into a text description according to the context template, splicing it with the second dialogue text, and then splicing the service demand prompt template to obtain the task data;
Y2, setting the candidate labels of a candidate label list;
Y3, inputting the task data and the candidate labels respectively into the large model base of the intelligent customer service large model to obtain the generated information sentence vector and the candidate label sentence vectors; calculating, by a matching algorithm, the similarity between the generated information sentence vector and each candidate label sentence vector; taking the candidate label corresponding to the candidate label sentence vector with the highest similarity as the predicted label;
Y4, calculating a loss function from the predicted label and the actual label of the second dialogue text; when the value of the loss function converges or the number of iterations reaches the set count, ending the downstream fine tuning stage; otherwise, adjusting the parameters of the intelligent customer service large model and returning to substep Y3 to continue training.
6. The customer service scene-oriented generation matching type large model construction method as claimed in claim 1, wherein the execution frequency of the pre-training stage is less than that of the domain migration stage, which is less than that of the downstream fine tuning stage.
7. The customer service scene-oriented generation matching type large model construction method as claimed in claim 1, wherein, in the intelligent customer service large model, the large model base comprises an encoder and a decoder, and the mapping module comprises several fully connected layers connected in sequence.
8. A readable storage medium, wherein the storage medium has stored thereon a computer program which, when executed by a processor, causes the processor to perform the customer service scene oriented generation matching large model construction method of any of claims 1-7.
9. A computer device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the customer service scene oriented generation matching large model construction method of any one of claims 1-7.
CN202311760197.4A 2023-12-20 2023-12-20 Customer service scene-oriented generation matching type large model construction method, medium and equipment Pending CN117709969A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311760197.4A CN117709969A (en) 2023-12-20 2023-12-20 Customer service scene-oriented generation matching type large model construction method, medium and equipment

Publications (1)

Publication Number Publication Date
CN117709969A true CN117709969A (en) 2024-03-15

Family

ID=90153034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311760197.4A Pending CN117709969A (en) 2023-12-20 2023-12-20 Customer service scene-oriented generation matching type large model construction method, medium and equipment

Country Status (1)

Country Link
CN (1) CN117709969A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
WO2022088444A1 (en) * 2020-11-02 2022-05-05 之江实验室 Multi-task language model-oriented meta-knowledge fine tuning method and platform
CN114756658A (en) * 2022-05-10 2022-07-15 北京明略昭辉科技有限公司 Method and device for training model, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DONG Bin: "Systematic Construction of Operators' Intelligent Customer Service", Telecommunications Science, no. 07, 31 December 2020 (2020-12-31) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination