CN117709969A - Customer service scene-oriented generation matching type large model construction method, medium and equipment - Google Patents


Info

Publication number: CN117709969A
Application number: CN202311760197.4A
Authority: CN (China)
Prior art keywords: customer service, large model, text, dialogue, data
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventors: 张通, 邓忠易, 陈俊龙
Current and original assignees: Guangdong Provincial Laboratory of Artificial Intelligence and Digital Economy (Guangzhou); South China University of Technology (SCUT) (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Application filed by Guangdong Provincial Laboratory of Artificial Intelligence and Digital Economy (Guangzhou) and South China University of Technology (SCUT)
Priority to CN202311760197.4A
Publication of CN117709969A

Landscapes

  • Machine Translation (AREA)

Abstract

The invention provides a method, medium, and device for constructing a generation-matching large model oriented to customer service scenarios. The method comprises a model setup stage, a pre-training stage, a domain migration stage, and a downstream fine-tuning stage, executed in sequence. The pre-training stage refers to: pre-training the large-model base of the intelligent customer service large model on texts from a cross-domain Chinese corpus as samples. The domain migration stage refers to: taking customer service scenario data as samples and performing weakly supervised training of the large-model base of the intelligent customer service large model. The downstream fine-tuning stage refers to: taking manually annotated customer service data as samples and training the intelligent customer service large model so that it learns the knowledge relevant to new business. By realizing and optimizing the functions of the large model step by step in stages, the method gains the capability to deeply mine knowledge from large-scale customer service text data, while also migrating accurately and expanding rapidly to newly added business demands and changed business content.

Description

Customer service scene-oriented generation matching type large model construction method, medium and equipment
Technical Field
The invention relates to the technical field of computer model construction, and in particular to a method, medium, and device for constructing a generation-matching large model oriented to customer service scenarios.
Background
An intelligent customer service large model is an application for the customer service industry built on large-scale knowledge processing. In existing customer service scenarios, the core function of intelligent customer service is human-machine dialogue, currently realized mainly through generative natural-language large models. After the large model is pre-trained on corpora from different domains, the system converts the received human language into sequence data and feeds it into the large model, captures the information in the dialogue through a series of working mechanisms, applies judgment logic derived from human experience and business rules to filter out invalid information, and finally converts the output back into natural-language form to realize the human-machine dialogue.
However, existing intelligent customer service large models suffer from the following technical problems:
(I) Conventional large-model modeling schemes struggle to meet engineering needs:
Conventional large-model modeling schemes place extremely high demands on the scale and quality of annotated data. Moreover, adapting to different downstream tasks requires extending the original model architecture and fine-tuning the original parameters, and this secondary development often brings a non-negligible extra workload. Intelligent customer service scenarios pose still more complex practical situations: in a dialogue between an incoming caller and customer service, the problems raised by a customer usually span a great variety of services, and the service content is continuously and iteratively updated, so the conventional pre-training-plus-fine-tuning scheme can hardly keep up with the update frequency of a specific service.
(II) Pseudo-label generation schemes based on deep learning are limited in customer service scenarios:
Customer service dialogues involve a wide variety of complicated business problems, so annotating dialogue texts well requires customer service personnel with a certain level of expertise and working experience, and pre-training a large model to a satisfactory effect often requires large-scale annotated data. Existing deep-learning-based pseudo-label generation schemes need to build a separate deep learning model, which only generates well after being trained on annotated data of a certain scale; this inevitably brings additional development and annotation cost. For a large model that relies on massive training data to guarantee high accuracy, how to make full use of manual annotation information, reduce the annotation workload, and fully combine human experience with learned rules becomes a critical problem to be solved.
(III) The large amount of noise contained in pseudo-label data affects the training of the intelligent customer service large model:
Although automatic annotation by generating pseudo-label data with a deep learning model has become the mainstream data-enhancement method at the present stage, compared with manual annotation it is difficult to guarantee the quality of the labeled data, and a certain amount of noise is usually present. A data set in a real scenario can therefore be divided into a small amount of manually annotated fine-label data and a large amount of noisy pseudo-label data; if the whole data set is used for large-model training, simply minimizing the overall loss function cannot achieve the expected effect. How to prevent the model from learning incorrect knowledge from large amounts of noisy data remains a major concern.
(IV) Existing intelligent customer service large models have a single function and can hardly meet rich business demands:
In existing intelligent customer service large-model schemes, the core function is limited to human-machine dialogue, and functions such as attribution analysis that could genuinely assist customer service staff in handling incoming consultations are not covered. The technical architecture usually combines a generative model with various rules to reply to user questions; the final output is still dialogue text, leaving a gap between the output and the actual business demands.
Disclosure of Invention
To overcome the defects and shortcomings of the prior art, the invention aims to provide a method, medium, and device for constructing a generation-matching large model oriented to customer service scenarios. By realizing and optimizing the functions of the large model step by step in stages, the method gains the capability to deeply mine knowledge from large-scale customer service text data, while also migrating accurately and expanding rapidly to newly added business demands and changed business content. It can adapt to different computing resources and data scales, strengthen managers' control over the algorithm development process, and provide targeted solutions for continuously updated business demands.
To achieve the above purpose, the invention is realized by the following technical scheme: a method for constructing a generation-matching large model oriented to customer service scenarios comprises a model setup stage, a pre-training stage, a domain migration stage, and a downstream fine-tuning stage, executed in sequence;
the model setup stage refers to: setting up an intelligent customer service large model based on the Transformer architecture; the intelligent customer service large model comprises a large-model base and a mapping module connected to the output of the large-model base;
the pre-training stage refers to: pre-training the large-model base of the intelligent customer service large model on texts from a cross-domain Chinese corpus as samples, so that the model acquires strong generalization capability;
the domain migration stage refers to: taking customer service scenario data as samples, and performing weakly supervised training of the large-model base using the fine-label data among the customer service scenario data together with pseudo-label data obtained through automatic annotation, so that the model acquires strong domain knowledge;
the downstream fine-tuning stage refers to: taking manually annotated customer service data as samples; splicing each sample with the corresponding business-demand prompt template according to the business demand to obtain task data; and training the intelligent customer service large model with the task data so that it learns the knowledge relevant to the new business.
Preferably, the domain migration stage comprises the following steps:
X1, collecting customer service scenario data; the customer service scenario data are classified into user information one, product information, and customer service dialogues; the customer service dialogues comprise dialogue texts one transcribed from speech;
the dialogue texts one comprise labeled dialogue texts one and unlabeled dialogue texts one, with the unlabeled dialogue texts outnumbering the labeled ones;
X2, when a dialogue text one is labeled, forming fine-label data in the form "dialogue text - label";
when a dialogue text one is unlabeled, acquiring the user information one corresponding to it; converting the user information one into a text description according to the context template and splicing it with the dialogue text one to obtain a dialogue spliced text one; automatically annotating the dialogue spliced text to obtain the summary corresponding to the dialogue text; combining the dialogue text one with the summary to form pseudo-label data in the form "dialogue text - summary", thereby achieving data enhancement;
X3, feeding the fine-label data and the pseudo-label data respectively into the large-model base of the intelligent customer service large model; during forward propagation, computing the loss gradients of the fine-label data and the pseudo-label data and comparing their degree of similarity:
if the loss gradient directions are consistent, the training is judged effective, and backward propagation is carried out to complete the weakly supervised training;
otherwise, the loss gradient of the pseudo-label data is set to zero before backward propagation is carried out to complete the weakly supervised training.
Preferably, in step X2, automatically annotating the dialogue spliced text to obtain the summary corresponding to the dialogue text comprises the following sub-steps:
x21, initializing a summary set;
X22, splitting the dialogue spliced text into n sentences;
X23, splicing each sentence with all sentences in the summary set to obtain summary spliced sentences;
X24, calculating, for each summary spliced sentence, the longest common subsequence length L with the remaining sentences of the dialogue spliced text; removing the sentence corresponding to the maximum L (Lmax) from the dialogue spliced text and storing it in the summary set;
X25, judging whether the number of sentences s in the summary set has reached the set number: if so, taking the current summary set as the summary of dialogue text one; otherwise, jumping back to step X22 to continue increasing the number of sentences s in the summary set.
Preferably, the context template comprises the content of user information one and the content of dialogue text one; the content of user information one comprises the phone balance, package name, package tariff, and package data allowance.
Preferably, the downstream fine-tuning stage comprises the following sub-steps:
Y1, collecting manually annotated customer service data; the manually annotated data comprise labeled dialogue texts two; acquiring the user information two corresponding to each dialogue text two; setting a business-demand prompt template according to the business demand; converting the user information two into a text description according to the context template, splicing it with the dialogue text two, and further splicing the business-demand prompt template to obtain the task data;
Y2, setting the candidate labels of a candidate label list;
Y3, feeding the task data and the candidate labels respectively into the large-model base of the intelligent customer service large model to obtain a generated-information sentence vector and candidate-label sentence vectors; computing, through a matching algorithm, the similarity between the generated-information sentence vector and each candidate-label sentence vector; taking the candidate label whose sentence vector has the highest similarity as the predicted label;
Y4, computing a loss function from the predicted label and the actual label of dialogue text two; when the value of the loss function converges or the number of iterations reaches the set count, ending the downstream fine-tuning stage; otherwise, adjusting the parameters of the intelligent customer service large model and returning to sub-step Y3 to continue training.
Preferably, the execution frequency of the pre-training phase < the execution frequency of the domain migration phase < the execution frequency of the downstream fine tuning phase.
Preferably, in the intelligent customer service large model, the large-model base comprises an encoder and a decoder, and the mapping module comprises a plurality of sequentially connected fully connected layers.
A readable storage medium, wherein the storage medium stores a computer program which, when executed by a processor, causes the processor to perform the above method for constructing a generation-matching large model oriented to customer service scenarios.
A computer device comprising a processor and a memory storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the above method for constructing a generation-matching large model oriented to customer service scenarios.
Compared with the prior art, the invention has the following advantages and beneficial effects:
First, according to the real working content and data characteristics of customer service, the whole modeling flow of the intelligent customer service large model is decomposed into four stages: model setup, pre-training, domain migration, and downstream fine-tuning. The functions of the large model are realized and optimized step by step in stages, so that the model gains the capability to deeply mine knowledge from large-scale customer service text data while migrating accurately and expanding rapidly to newly added business demands and changed business content. In this multi-stage modeling method, the pre-training stage effectively solves the cold-start problem of customer service dialogue by learning from open-domain public corpora; the domain migration stage mines a large amount of effective information from unlabeled dialogue texts using only a small amount of labeled data; and the downstream fine-tuning stage migrates the capability of the large model quickly and accurately to a specific business through a matching algorithm, avoiding the resource consumption of secondary development. This staged modeling method accurately mirrors the whole process of customer service work; it can adapt to different computing resources and data scales, strengthen managers' control over the algorithm development process, and provide targeted solutions for continuously updated business demands.
Second, an automatic annotation algorithm oriented to dialogue text extracts summary information from each dialogue to construct a corpus in the form "original text - summary", generating the pseudo-label data required for the domain migration training of the large model from massive original dialogue data. Compared with building a separate deep learning model to generate pseudo-label data, the automatic annotation algorithm focuses on the characteristics of the data itself: it mines the language patterns of the specific scenario through statistical analysis of the distribution of real data and requires no additional annotated data, greatly reducing development workload and computing cost. It can fully exploit the existing massive original dialogue data to generate a large amount of pseudo-label dialogue data in a short time, greatly reducing the time and labor costs of annotation work.
Third, a weakly supervised pre-learning method based on data enhancement combines the large amount of pseudo-annotated data generated by the automatic annotation algorithm with a small amount of manually annotated data, realizing data enhancement in customer service scenarios in a weakly supervised manner. The manually annotated dialogue texts contain language patterns unique to customer service scenarios and can be regarded as strongly supervised data; the automatic annotation algorithm can obtain large-scale pseudo-label data at low cost, but its quality is lower than that of manual annotation, so it can be regarded as noisy, weakly supervised data. Under a training set of uneven quality, the proposed weakly supervised training method takes the larger amount of noisy data as the model's main learning object and, guided by the small amount of strongly supervised data, mines effective information from the large amount of noisy data, thereby improving the model's effect.
Fourth, a matching algorithm based on the generative large model completes the fine-tuning of the intelligent customer service large model with the annotated data of a specific business scenario; on that basis, the business data and the dialogue text are fed into the large model together and encoded, and the functions required by the specific business scenario are finally realized through feature matching. The matching algorithm trains and guides the generation capability of the large model under different application scenarios by setting different prompt templates, and connects the model's generation capability with business demands through feature matching, so it adapts well to continuously iterated business content and newly added business scenarios while avoiding the cost of secondary development.
Drawings
FIG. 1 is a flow chart of the method of the invention for constructing a generation-matching large model oriented to customer service scenarios;
FIG. 2 is a schematic diagram of the pre-training stage of the method;
FIG. 3 is a flow chart of the automatic annotation algorithm of the method;
FIG. 4 is a flow chart of the weakly supervised training of the method;
FIG. 5 is a schematic diagram of the encoder and decoder in the downstream fine-tuning stage of the method;
FIG. 6 is a schematic diagram of the sentence-vector mapping of the method.
Detailed Description
The invention is described in further detail below with reference to the drawings and the detailed description.
Example 1
This embodiment of the method for constructing a generation-matching large model oriented to customer service scenarios comprises a model setup stage, a pre-training stage, a domain migration stage, and a downstream fine-tuning stage, executed in sequence, as shown in fig. 1.
The prevailing application paradigm of large models is fine-tuning on the basis of pre-training, but in real-life application scenarios human dialogue is complex and highly ambiguous, and simply applying this paradigm cannot achieve satisfactory results; the invention therefore derives a modeling method suited to intelligent customer service scenarios by analyzing the working content and data characteristics of those scenarios.
The model setup stage refers to: setting up an intelligent customer service large model based on the Transformer architecture; the intelligent customer service large model comprises a large-model base and a mapping module consisting of a plurality of fully connected layers; the large-model base comprises an encoder and a decoder.
The pre-training stage refers to: pre-training the large-model base of the intelligent customer service large model on texts from a cross-domain Chinese corpus as samples, so that the model acquires strong generalization capability. The pre-training stage is domain- and task-independent; the training of the large-model base is completed with an open-domain Chinese corpus.
In the pre-training stage, the encoder converts the input text into sequence data, extracts high-order semantic information such as grammar, part of speech, and entities from it, and maps this information into feature vectors; the decoder remaps the feature vectors output by the encoder into a data sequence, and the model parameters are continuously optimized through the supervision signals provided by the data labels. On the basis of massive Chinese corpora, the data distribution characteristics of the Chinese language are learned through rich pre-training tasks such as masked-word restoration, topic classification, and summary extraction, and a Chinese semantic space is constructed. This requires collecting and cleaning data from multiple sources, constructing a cross-domain Chinese corpus, and converting the data according to the different task templates. The intelligent customer service large model output by the pre-training stage has strong Chinese semantic understanding and generalization capability. The training method of the pre-training stage can be implemented with the prior art.
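As an illustration of converting raw corpus text according to a task template, the following sketch builds one training pair for the masked-word-restoration task. The character-level masking, the 15% default rate, and the `[MASK]` token are assumptions for illustration, not details fixed by the patent.

```python
import random

def make_masked_sample(text, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Build one masked-word-restoration training pair: the input with some
    characters replaced by the mask token, and a dict mapping each masked
    position back to the original character (the supervision signal)."""
    rng = random.Random(seed)
    chars = list(text)
    labels = {}
    for i, ch in enumerate(chars):
        if rng.random() < mask_rate:   # mask this character
            labels[i] = ch
            chars[i] = mask_token
    return "".join(chars), labels

# mask_rate=1.0 only to make the example deterministic: every character masked.
masked, labels = make_masked_sample("今天天气很好", mask_rate=1.0)
```

The model would then be trained to restore `labels` from `masked`, which is how the data labels provide the supervision signal described above.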
The domain migration stage refers to: taking customer service scenario data as samples, and performing weakly supervised training of the large-model base of the intelligent customer service large model using the fine-label data among the customer service scenario data together with pseudo-label data obtained through automatic annotation, so that the model acquires strong domain knowledge.
The downstream fine-tuning stage refers to: taking manually annotated customer service data as samples; splicing each sample with the corresponding business-demand prompt template according to the business demand to obtain task data; and training the intelligent customer service large model with the task data so that it learns the knowledge relevant to the new business.
Specifically, in the pre-training stage, the encoder converts the input text into sequence data, extracts high-order semantic information such as grammar, part of speech, and entities, and maps it into feature vectors; the decoder remaps the feature vectors output by the encoder into a data sequence, and the model parameters are continuously optimized through the supervision signals provided by the data labels, as shown in fig. 2.
The core of the domain migration stage is to use the data of the customer service scenario to achieve knowledge migration of the large model into the vertical domain.
The domain migration stage comprises the following steps:
X1, collecting customer service scenario data; the customer service scenario data are classified into user information one, product information, and customer service dialogues.
The user information one comprises the phone balance, package name, package tariff, package data allowance, and the like, together with records of recent self-service operations, incoming consultations, and complaints; the product information comprises the detailed content of the various packages and services, for example a campus package with its tariff, free call duration, directed data, and other package content; the customer service dialogues comprise dialogue texts one transcribed from speech, together with periodic manual spot-check records (each dialogue text one corresponds to one manually written summary sentence).
The user information one and product information can provide key information for recognizing the intent of a dialogue, and the customer service dialogue reflects the flow of analyzing and solving problems on the basis of the user information one and product information, so together they constitute the main learning objects of the intelligent customer service large model.
The dialogue texts one comprise labeled dialogue texts one and unlabeled dialogue texts one, with the unlabeled texts outnumbering the labeled ones. The labeled dialogue texts one are taken as fine-label data; the unlabeled dialogue texts one are automatically annotated to convert them into pseudo-label data.
X2, when a dialogue text one is labeled, fine-label data are formed in the form "dialogue text - label";
when a dialogue text one is unlabeled, each dialogue text needs to be matched with user information one in order to enrich the context of the current dialogue: the user information one corresponding to the dialogue text one is acquired, converted into a text description according to the context template, and spliced with the dialogue text one to obtain the dialogue spliced text one. The context template comprises the content of user information one and the content of dialogue text one; the content of user information one comprises the phone balance, package name, package tariff, and package data allowance. Assume the context template is as shown in Table 1.
Table 1 Context template

Field: Content
Phone balance: 20 yuan
Package name: Campus super package
Package tariff: 29 yuan/month
Package data: 30 G domestic general data, 50 G directed data
Dialogue content: Hello, my package does not have enough data; is there a package with more data, preferably …
The dialogue spliced text is then: "The current user's phone balance is (20 yuan), the package in use is (Campus super package), the package tariff is (29 yuan/month), the current dialogue content is (Hello, my package does not have enough data; is there a package with more data, preferably …), and the package includes data of (30 G domestic general data, 50 G directed data)."
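The splicing of the context template with the dialogue text can be sketched as simple template filling; the field names and English wording below follow the worked example above and are illustrative rather than fixed by the patent.

```python
def splice_context(user_info, dialogue):
    """Step X2: fill the context template with user information one and
    splice it with dialogue text one to obtain dialogue spliced text one."""
    return (
        "The current user's phone balance is ({balance}), "
        "the package in use is ({package}), "
        "the package tariff is ({tariff}), "
        "the current dialogue content is ({dialogue}), "
        "and the package includes data of ({data})."
    ).format(dialogue=dialogue, **user_info)

text = splice_context(
    {"balance": "20 yuan", "package": "Campus super package",
     "tariff": "29 yuan/month",
     "data": "30 G domestic general data, 50 G directed data"},
    "Hello, my package does not have enough data; is there a package "
    "with more data, preferably ...",
)
```

The resulting string reproduces the dialogue spliced text of the example above.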
The dialogue spliced text is automatically annotated to obtain the summary corresponding to the dialogue text; the dialogue text one and the summary are combined to form pseudo-label data in the form "dialogue text - summary", thereby achieving data enhancement. Specifically, as shown in fig. 3, the process comprises the following sub-steps:
x21, initializing a summary set;
X22, splitting the dialogue spliced text into n sentences;
X23, splicing each sentence with all sentences in the summary set to obtain summary spliced sentences;
X24, calculating, for each summary spliced sentence, the longest common subsequence length L with the remaining sentences of the dialogue spliced text; removing the sentence corresponding to the maximum L (Lmax) from the dialogue spliced text and storing it in the summary set;
X25, judging whether the number of sentences s in the summary set has reached the set number: if so, taking the current summary set as the summary of dialogue text one; otherwise, jumping back to step X22 to continue increasing the number of sentences s in the summary set.
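Sub-steps X21 to X25 can be sketched as the following greedy extractive procedure. The character-level longest-common-subsequence scoring follows the description above, while the sentence splitting (a pre-split list) and tie-breaking (first maximum) are assumptions of this sketch.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two strings (rolling DP)."""
    dp = [0] * (len(b) + 1)
    for ca in a:
        prev = 0
        for j, cb in enumerate(b, 1):
            cur = dp[j]
            dp[j] = prev + 1 if ca == cb else max(dp[j], dp[j - 1])
            prev = cur
    return dp[-1]

def extract_summary(sentences, k):
    """Steps X21-X25: repeatedly move into the summary set the sentence whose
    summary-spliced form shares the longest common subsequence with the
    remaining sentences of the dialogue spliced text."""
    summary = []                        # X21: initialise the summary set
    remaining = list(sentences)         # X22: sentences of the spliced text
    while len(summary) < k and remaining:
        scores = []
        for s in remaining:
            spliced = "".join(summary) + s                   # X23
            rest = "".join(x for x in remaining if x is not s)
            scores.append(lcs_len(spliced, rest))            # X24: length L
        best = scores.index(max(scores))
        summary.append(remaining.pop(best))  # remove the Lmax sentence
    return summary                           # X25: stop at k sentences
```

The intuition is that a sentence overlapping heavily with the rest of the dialogue carries its central content, so it is a good extractive summary candidate.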
X3, the fine-label data and the pseudo-label data are fed respectively into the large-model base of the intelligent customer service large model for weakly supervised training. Weakly supervised training refers to the following, as shown in fig. 4: during one forward propagation, the loss gradients of the fine-label data and the pseudo-label data are computed and their degree of similarity is compared:
if the loss gradient directions are consistent, the optimization direction of the training data is judged to agree with that of the clean data and the training is effective; backward propagation is then carried out to complete the weakly supervised training;
otherwise, the noise is judged to interfere with training on the clean data, the loss gradient of the pseudo-label data is set to zero, and backward propagation is then carried out to complete the weakly supervised training.
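A minimal sketch of this gradient check, assuming the two loss gradients are flattened into plain vectors and using the sign of their inner product as the direction-consistency test (the patent speaks only of comparing gradient directions, so the exact similarity measure is an assumption of this sketch):

```python
def gated_gradient(grad_fine, grad_pseudo):
    """Step X3: keep the pseudo-label loss gradient only if it points in the
    same direction as the fine-label loss gradient; otherwise zero it out
    before backward propagation."""
    dot = sum(f * p for f, p in zip(grad_fine, grad_pseudo))
    if dot > 0:   # directions consistent: pseudo-label training is effective
        return [f + p for f, p in zip(grad_fine, grad_pseudo)]
    return list(grad_fine)   # pseudo-label gradient set to zero

# Consistent directions: both gradients contribute to the update.
g_keep = gated_gradient([1.0, 0.0], [0.5, 0.5])
# Opposed directions: the noisy pseudo-label gradient is dropped.
g_drop = gated_gradient([1.0, 0.0], [-0.5, 0.5])
```

In a real training loop this gating would be applied per step between the forward and backward passes, so that noisy pseudo-label batches cannot pull the model away from the optimization direction of the clean data.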
The goal of the downstream fine-tuning stage is to provide solutions for different business demands: the intelligent customer service large model learns the knowledge relevant to a new business from a small amount of manually annotated data, thereby improving performance on specific tasks.
The downstream fine-tuning stage comprises the following sub-steps:
Y1, collecting manually annotated customer service data; the manually annotated data comprise labeled dialogue texts two; acquiring the user information two corresponding to each dialogue text two; setting a business-demand prompt template according to the business demand; converting the user information two into a text description according to the context template, splicing it with the dialogue text two, and further splicing the business-demand prompt template to obtain the task data;
User dialogue attribution analysis is an important function in intelligent customer service scenarios: it provides key analytical results for enterprises, helps customer service staff discover and solve problems in time, and ultimately improves user satisfaction. This embodiment takes user dialogue attribution analysis as an example to describe the work of the downstream fine tuning stage. Attribution analysis means that customer service staff, according to a user's incoming call and dialogue content and in combination with existing information, classify the current dialogue under a specific service or product. For example, to analyze which products drew the most attention in user calls and dialogues over a given period, a product attribution prompt template can be constructed; to locate the causes of user complaints and product faults, a complaint attribution prompt template can be constructed. For other business requirements, different types of prompt templates can be constructed according to the characteristics of the business data, providing auxiliary information for the input data of the intelligent customer service large model.
When the service demand is product attribution, the content of the service demand prompt template is: "What product does the current conversation belong to?"; when the service demand is complaint attribution, the content of the service demand prompt template is: "What is the complaint content of the current conversation?"
Taking complaint attribution as an example, the task data is:
"The current user's telephone charge balance is (20 yuan), the package in use is (campus super package), the package charge is (29 yuan per month), the current dialogue content is (Hello, the traffic in my package is insufficient; is there a package with more traffic, preferably …), and the package traffic is (30 GB of domestic general traffic, 50 GB of directional traffic). What is the complaint content of the current conversation?"
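The task-data construction in Y1 is plain template splicing. A sketch, where the dictionary keys, field order and template wording are illustrative rather than fixed by the method:

```python
TEMPLATES = {
    # service demand prompt templates quoted in the description
    "product_attribution": "What product does the current conversation belong to?",
    "complaint_attribution": "What is the complaint content of the current conversation?",
}

def build_task_data(user_info: dict, dialogue: str, demand: str) -> str:
    """Convert the user information into a text description via a context
    template, splice the dialogue text, then splice the service demand
    prompt template (step Y1)."""
    context = (
        f"The current user's telephone charge balance is ({user_info['balance']}), "
        f"the package in use is ({user_info['package']}), "
        f"the package charge is ({user_info['charge']}), "
        f"the package traffic is ({user_info['traffic']})."
    )
    return f"{context} The current dialogue content is ({dialogue}). {TEMPLATES[demand]}"
```

The resulting string is fed to the large model base as a single input sequence.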
Y2, querying the service system for related products and activities according to keywords, taking them as candidate labels and storing them in a candidate label list;
Y3, inputting the task data into the encoder of the intelligent customer service large model, with the decoder of the intelligent customer service large model outputting the decoder generated information; the decoder generated information is input into the mapping module of the intelligent customer service large model and mapped through several fully connected layers of the mapping module to obtain the generated information sentence vector, as shown in fig. 5; the candidate labels are likewise input into the intelligent customer service large model to obtain the candidate label sentence vectors;
The encoder, based on the attention mechanism, deeply mines key information of the input text such as part of speech, grammar, contextual association and relative position, and encodes it into implicit feature vectors that serve as the input of the decoder. The decoder is responsible for reconstructing the implicit feature vectors into a label for the input text. Through this encoding-decoding workflow, language patterns and contextual knowledge in customer service scenarios are mined from large-scale "text-label" sample pairs.
The aim of sentence vector mapping is to feed the task data and the candidate label texts into the large model for encoding at the same time and to perform similarity matching, thereby achieving efficient matching between task data and labels.
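A minimal sketch of the mapping module: a stack of fully connected layers that projects the decoder generated information onto a fixed-size sentence vector. The layer sizes, weights and ReLU activations here are illustrative assumptions; a real implementation would use a deep-learning framework with learned parameters.

```python
def linear(x, W, b):
    # one fully connected layer: y = W x + b (plain-Python matvec)
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(v):
    return [max(0.0, u) for u in v]

def mapping_module(decoder_info, layers):
    """Map the decoder's generated information onto a sentence vector
    through a stack of fully connected layers (hidden layers use ReLU)."""
    h = decoder_info
    for W, b in layers[:-1]:
        h = relu(linear(h, W, b))
    W, b = layers[-1]
    return linear(h, W, b)  # last layer outputs the sentence vector
```

The same module is applied to both the task data's generated information and each candidate label, so both land in one comparable vector space.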
A matching algorithm computes the cosine similarity between the generated information sentence vector and each candidate label sentence vector; the candidate label corresponding to the candidate label sentence vector with the highest similarity is taken as the predicted label, as shown in fig. 6.
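The cosine-matching step can be sketched as follows, assuming sentence vectors are plain lists of floats; the candidate label names and vectors are illustrative:

```python
import math

def cosine(u, v):
    # cosine similarity between two sentence vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def match_label(gen_vec, candidate_vecs):
    """Return the candidate label whose sentence vector has the highest
    cosine similarity with the generated information sentence vector."""
    return max(candidate_vecs, key=lambda label: cosine(gen_vec, candidate_vecs[label]))
```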
Y4, calculating a loss function from the predicted label and the actual label of the second dialogue text; when the value of the loss function converges or the number of iterations reaches the set count, the downstream fine tuning stage ends; otherwise, the parameters of the intelligent customer service large model are adjusted and training continues from substep Y3.
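The Y4 control flow (stop on loss convergence or on reaching the iteration cap) can be sketched as below; `step_fn` stands in for one Y3 forward pass plus a parameter adjustment, and the convergence tolerance is an assumption of this sketch:

```python
def fine_tune(step_fn, max_iters=100, tol=1e-4):
    """Run fine-tuning steps until the loss converges (successive losses
    differ by less than tol) or the iteration cap is reached; returns the
    iteration count and the final loss."""
    prev = float("inf")
    for it in range(1, max_iters + 1):
        loss = step_fn()  # one Y3 pass plus a parameter adjustment
        if abs(prev - loss) < tol:
            return it, loss  # loss has converged: end the stage
        prev = loss
    return max_iters, prev  # iteration budget exhausted
```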
Embodiment Two
The readable storage medium of this embodiment stores a computer program which, when executed by a processor, causes the processor to perform the customer service scene-oriented generation matching type large model construction method of Embodiment One.
Embodiment Three
The computer device of this embodiment comprises a processor and a memory for storing a program executable by the processor; when the processor executes the program stored in the memory, the customer service scene-oriented generation matching type large model construction method of Embodiment One is implemented.
The above embodiments are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto; any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be an equivalent replacement and is included within the protection scope of the present invention.

Claims (9)

1. A customer service scene-oriented generation matching type large model construction method, characterized in that it comprises a model setting stage, a pre-training stage, a domain migration stage and a downstream fine tuning stage which are executed in sequence;
the model setting stage refers to: setting an intelligent customer service large model based on the Transformer architecture, the intelligent customer service large model comprising a large model base and a mapping module connected to the output end of the large model base;
the pre-training stage refers to: using texts from a cross-domain Chinese corpus as samples to pre-train the large model base of the intelligent customer service large model, so that the intelligent customer service large model has strong generalization capability;
the domain migration stage refers to: adopting customer service scene data as samples, and performing weak supervision training on the large model base of the intelligent customer service large model by using the fine label data in the customer service scene data and the pseudo label data obtained through automatic labeling, so that the intelligent customer service large model acquires strong domain knowledge;
the downstream fine tuning stage refers to: adopting manually annotated customer service scene data as samples, splicing each sample with the corresponding service demand prompt template according to the business requirement to obtain task data, and training the intelligent customer service large model with the task data so that it learns the knowledge relevant to the new business.
2. The customer service scene-oriented generation matching type large model construction method as claimed in claim 1, wherein the domain migration stage comprises the following steps:
X1, collecting customer service scene data; classifying the customer service scene data into first user information, product information and customer service dialogues; the customer service dialogues comprise first dialogue texts transcribed from speech;
the first dialogue texts comprise labeled first dialogue texts and unlabeled first dialogue texts; the number of labeled first dialogue texts is greater than the number of unlabeled first dialogue texts;
X2, when a first dialogue text is a labeled first dialogue text, forming fine label data in the form "dialogue text-label";
when a first dialogue text is an unlabeled first dialogue text, acquiring the first user information corresponding to it; converting the first user information into a text description according to the context template and splicing it with the first dialogue text to obtain a first dialogue spliced text; performing automatic labeling processing on the dialogue spliced text to obtain the abstract corresponding to the first dialogue text; and combining the first dialogue text with the abstract to form pseudo label data in the form "dialogue text-abstract", thereby realizing data enhancement;
X3, inputting the fine label data and the pseudo label data respectively into the large model base of the intelligent customer service large model; in the forward propagation process, calculating the loss gradients of the label data and of the pseudo label data, and comparing the similarity of the loss gradients:
if the loss gradient directions are consistent, the training is judged to be effective, and backward propagation is then performed to complete the weak supervision training;
otherwise, the loss gradient of the pseudo label data is set to zero, and backward propagation is then performed to complete the weak supervision training.
3. The customer service scene-oriented generation matching type large model construction method as claimed in claim 2, wherein in step X2, performing automatic labeling processing on the dialogue spliced text to obtain the abstract corresponding to the first dialogue text comprises the following sub-steps:
X21, initializing an abstract set;
X22, splitting the dialogue spliced text into n sentences;
X23, splicing each sentence with all sentences in the abstract set to obtain an abstract spliced sentence;
X24, calculating the longest common subsequence length L between each abstract spliced sentence and the remaining sentences of the dialogue spliced text; removing the sentence corresponding to the maximum L from the dialogue spliced text and storing it into the abstract set;
X25, judging whether the number of sentences s in the abstract set has reached the set number of sentences: if so, taking the current abstract set as the abstract of the first dialogue text; otherwise, returning to step X22 to continue increasing the number of sentences s in the abstract set.
4. The customer service scene-oriented generation matching type large model construction method as claimed in claim 2, wherein the context template comprises the content of the first user information and the content of the first dialogue text; the content of the first user information comprises the telephone charge balance, the package name, the package charge and the package traffic.
5. The customer service scene-oriented generation matching type large model construction method as claimed in claim 2, wherein the downstream fine tuning stage comprises the following sub-steps:
Y1, collecting manually annotated data of the customer service scene, the manually annotated data comprising labeled second dialogue texts; acquiring the second user information corresponding to each second dialogue text; setting a service demand prompt template according to the business requirement; converting the second user information into a text description according to the context template, splicing it with the second dialogue text, and then splicing the service demand prompt template to obtain the task data;
Y2, setting the candidate labels of a candidate label list;
Y3, inputting the task data and the candidate labels respectively into the large model base of the intelligent customer service large model to obtain the generated information sentence vector and the candidate label sentence vectors; calculating, by a matching algorithm, the similarity between the generated information sentence vector and each candidate label sentence vector; taking the candidate label corresponding to the candidate label sentence vector with the highest similarity as the predicted label;
Y4, calculating a loss function from the predicted label and the actual label of the second dialogue text; when the value of the loss function converges or the number of iterations reaches the set count, ending the downstream fine tuning stage; otherwise, adjusting the parameters of the intelligent customer service large model and returning to substep Y3 to continue training.
6. The customer service scene-oriented generation matching type large model construction method as claimed in claim 1, wherein the execution frequency of the pre-training stage is less than that of the domain migration stage, which is less than that of the downstream fine tuning stage.
7. The customer service scene-oriented generation matching type large model construction method as claimed in claim 1, wherein, in the intelligent customer service large model, the large model base comprises an encoder and a decoder, and the mapping module comprises several fully connected layers connected in sequence.
8. A readable storage medium, wherein the storage medium has stored thereon a computer program which, when executed by a processor, causes the processor to perform the customer service scene oriented generation matching large model construction method of any of claims 1-7.
9. A computer device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the customer service scene oriented generation matching large model construction method of any one of claims 1-7.
CN202311760197.4A 2023-12-20 2023-12-20 Customer service scene-oriented generation matching type large model construction method, medium and equipment Pending CN117709969A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311760197.4A CN117709969A (en) 2023-12-20 2023-12-20 Customer service scene-oriented generation matching type large model construction method, medium and equipment

Publications (1)

Publication Number Publication Date
CN117709969A true CN117709969A (en) 2024-03-15

Family

ID=90153034

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311760197.4A Pending CN117709969A (en) 2023-12-20 2023-12-20 Customer service scene-oriented generation matching type large model construction method, medium and equipment

Country Status (1)

Country Link
CN (1) CN117709969A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111444721A (en) * 2020-05-27 2020-07-24 南京大学 Chinese text key information extraction method based on pre-training language model
WO2022088444A1 (en) * 2020-11-02 2022-05-05 之江实验室 Multi-task language model-oriented meta-knowledge fine tuning method and platform
CN114756658A (en) * 2022-05-10 2022-07-15 北京明略昭辉科技有限公司 Method and device for training model, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DONG Bin: "Systematic Construction of Operators' Intelligent Customer Service", Telecommunications Science, no. 07, 31 December 2020 (2020-12-31) *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination