CN111221963A - Intelligent customer service data training model field migration method - Google Patents

Intelligent customer service data training model field migration method

Info

Publication number
CN111221963A
Authority
CN
China
Prior art keywords
model
field
data set
training
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911133457.9A
Other languages
Chinese (zh)
Other versions
CN111221963B (en)
Inventor
张翀
江岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Xiaoduo Technology Co Ltd
Original Assignee
Chengdu Xiaoduo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Xiaoduo Technology Co Ltd
Priority to CN201911133457.9A
Publication of CN111221963A
Application granted
Publication of CN111221963B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35: Clustering; Classification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a field migration method for intelligent customer service data training models, which comprises the following steps: training an initial network model on the complete data set to obtain a general model, and training the initial network model on the data set of each field to obtain a plurality of field models; inputting a target data set into the general model and the corresponding field model for calculation, taking the intermediate output of the general model as a general sentence representation and the intermediate output of the field model as a field sentence representation; splicing the field sentence representation onto the tail of the general sentence representation to obtain a spliced sentence representation; and inputting the spliced sentence representation into an initial network model for training to obtain a target model. Because the target model is trained on spliced sentence representations formed by splicing the general and field sentence representations, it inherits both the general knowledge learned by the general model and the field knowledge learned by the field model, and can be fully adapted to the field of the target data.

Description

Intelligent customer service data training model field migration method
Technical Field
The invention belongs to the technical field of neural network data processing, and in particular relates to a field migration method for intelligent customer service data training models.
Background
Transfer learning in deep learning has been widely applied in the NLP field. Existing deep learning transfer methods include the following (both approaches are sketched in code after the list):
1. Parameter-based (the early transfer learning approach), i.e. the parameters of the pre-trained model are reused. The input of the target model is the numerically converted text, and the parameters of the pre-trained model are used directly as the initialization parameters of the target model.
Disadvantage of the parameter-based approach: the target model has the same large parameter scale as the pre-trained model, so the computational complexity is high and industrial application requirements cannot be met.
2. Representation-based: the text is numerically converted and input into the pre-trained model, and the intermediate output of the pre-trained model is taken as the input of the target model.
Advantage of the representation-based approach: the pre-trained model is run only once to compute the representation and is not iterated repeatedly, so the target model can use a small parameter scale; speed is greatly improved while recognition accuracy is comparable to the parameter-based approach.
Disadvantage of the representation-based approach: field mismatch. For example, when a pre-trained model built on questions from the clothing field is applied directly to the mobile phone field, the fields do not match and recognition accuracy hits a bottleneck.
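For concreteness, the two existing strategies can be sketched in a Keras-style program as follows; the toy model, the layer sizes and the layer name "sentence_repr" are illustrative assumptions rather than details taken from this patent.

import tensorflow as tf

num_classes = 20            # illustrative number of semantic categories
vocab_size, seq_len = 6000, 35

# A toy stand-in for the large pre-trained model.
inputs = tf.keras.Input(shape=(seq_len,))
embedded = tf.keras.layers.Embedding(vocab_size, 128)(inputs)
sentence_repr = tf.keras.layers.GlobalAveragePooling1D(name="sentence_repr")(embedded)
outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(sentence_repr)
pretrained = tf.keras.Model(inputs, outputs)

# 1. Parameter-based transfer: the target model reuses the pre-trained parameters directly,
#    so it keeps the full (large) parameter scale of the pre-trained model.
param_based_target = tf.keras.models.clone_model(pretrained)
param_based_target.set_weights(pretrained.get_weights())

# 2. Representation-based transfer: the pre-trained model is run once, its intermediate
#    output is taken as a fixed sentence representation, and only a small target model
#    is trained on top of that representation.
repr_extractor = tf.keras.Model(pretrained.input, pretrained.get_layer("sentence_repr").output)
small_target = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])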
Disclosure of Invention
To address the defects of the prior art, the invention provides a field migration method for intelligent customer service data training models. A general model suitable for all fields and a plurality of field models, each suitable for one field, are pre-trained. When a target data set is processed, the general model and the field model corresponding to the target data set compute and output a general sentence representation and a field sentence representation; the general model learns general knowledge, and the field model learns knowledge specific to its field. A target model is then trained on spliced sentence representations formed by splicing the general sentence representation with the field sentence representation. Because the spliced sentence representation combines the general knowledge learned by the general model with the field knowledge learned by the field model, the trained target model also learns both kinds of knowledge, can be fully adapted to the field of the target data, and improves the accuracy of semantic recognition. Moreover, because the general model and the field models are mature, fully trained models, the amount of target data needed to train the target model is greatly reduced, which improves model training efficiency.
To achieve the above purpose, the solution adopted by the invention is as follows. A field migration method for an intelligent customer service data training model comprises the following steps:
s1: training an initial network model on a semantic classification task using the complete data set to obtain a general model, and training the initial network model on the semantic classification task using the data set of each field to obtain a plurality of field models, wherein the complete data set comprises the data sets of all fields, each field comprises a plurality of data sets, and the data in the data sets are labeled with semantic categories in advance;
s2: inputting a target data set into the general model and the corresponding field model for calculation, taking the intermediate output of the general model as a general sentence representation and the intermediate output of the field model as a field sentence representation, wherein the target data set belongs to one of the field data sets, contains less data than that field data set, and its data are labeled with semantic categories in advance;
s3: splicing the field sentence representation onto the tail of the general sentence representation to obtain a spliced sentence representation;
s4: inputting the spliced sentence representation into an initial network model for training to obtain a target model, where the target model belongs to the same field as the target data set used.
The migration method further comprises: training an initialization model on the semantic classification task using the complete data set to obtain an additional model; inputting the target data set into the additional model for calculation and taking the intermediate output of the additional model as an additional sentence representation; splicing the field sentence representation onto the tail of the general sentence representation and the additional sentence representation onto the tail of the field sentence representation to obtain the spliced sentence representation; and inputting the spliced sentence representation into the initial network model for training to obtain the target model. The additional model computes the target data set to produce the additional sentence representation; the spliced sentence representation that includes it carries more knowledge and describes the text from one more angle, so the target model trained on it learns more, which improves its semantic recognition accuracy and its adaptability to the field of the target data.
Training the initial network model using the complete data set to obtain the general model comprises the following steps:
s111: performing numerical conversion on each sentence of each data set in the complete data set to obtain a vector of a specified length; first, a mapping table from Chinese characters to numbers is defined and generated so that each distinct Chinese character corresponds to a unique number; each sentence is then converted into a vector of the specified length according to the mapping table, padding with 0 where the sentence is too short to reach the specified length;
s112: forming the vectors obtained from the same data set into a matrix of that data set; since a data set contains a plurality of sentences, this yields a multi-dimensional matrix;
s113: inputting the matrix of the data set into the initial network model for iterative computation;
s114: calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s115: repeating S113-S114 until the value of the loss function no longer decreases or a preset number of iterations has been performed, the network with the parameters obtained from the last adjustment being the general model.
Training the initial network model using the data set of each field to obtain a plurality of field models comprises the following steps:
s121: performing numerical conversion on each sentence of each data set in the data sets of the same field to obtain a vector of a specified length, the field data sets being processed in the same way as the complete data set;
s122: forming the vectors obtained from the same data set into a matrix of that data set;
s123: inputting the matrix of the data set into the initial network model and performing iterative computation with a field-preference objective function added;
s124: calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s125: repeating S123-S124 until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed, the network with the parameters obtained from the last adjustment being the field model.
Training the initialization model using the complete data set to obtain the additional model comprises the following steps:
s131: performing numerical conversion on each sentence of each data set in the complete data set to obtain a vector of a specified length;
s132: forming the vectors obtained from the same data set into a matrix of that data set;
s133: inputting the matrix of the data set into the initial network model and performing iterative computation with a field-preference objective function that differs from the one used to train the field models;
s134: calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s135: repeating S133-S134 until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed, the network with the parameters obtained from the last adjustment being the additional model.
The field-preference objective function is a distance measurement function or an included-angle measurement function.
Inputting the target data set into the general model and the field model for calculation comprises the following steps:
s201: performing numerical conversion on each sentence of each data set in the target data set to obtain a vector of a specified length;
s202: forming the vectors obtained from the same data set into a matrix of that data set;
s203: inputting the matrix of the data set into the general model and computing once, the intermediate output being taken as the general sentence representation;
s204: inputting the matrix of the data set into the corresponding field model and computing once, the intermediate output being taken as the field sentence representation.
Inputting the spliced sentence representation into the initial network model for training to obtain the target model comprises the following steps: inputting the spliced sentence representation into the initial network model for iterative computation; calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment; and repeating until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed, the network with the parameters obtained from the last adjustment being the target model.
The invention has the following beneficial effects:
(1) The method first pre-trains a general model suitable for all fields and a plurality of field models, each suitable for one field. When a target data set is processed, the general model and the field model corresponding to the target data set compute and output a general sentence representation and a field sentence representation; the general model has learned general knowledge and the field model has learned knowledge specific to its field. A target model is trained on the spliced sentence representation formed by splicing the general sentence representation with the field sentence representation. Because the spliced sentence representation combines the general knowledge learned by the general model with the field knowledge learned by the field model, the trained target model learns both the general knowledge and the field-specific knowledge, can be fully adapted to the field of the target data, and improves the accuracy of semantic recognition.
(2) Meanwhile, because the general model and the field models are mature, fully trained models, the amount of target data used to train the target model is greatly reduced, which improves model training efficiency.
Drawings
FIG. 1 is a diagram of the data training model field migration method according to the first embodiment of the present invention;
FIG. 2 is a diagram of the data training model field migration method according to the second embodiment of the present invention;
FIG. 3 is a flow chart of the universal model training of the present invention;
FIG. 4 is a flow chart of the present invention domain model training;
FIG. 5 is a flow chart of additional model training in accordance with the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
As shown in fig. 1, a field migration method for an intelligent customer service data training model includes the following steps:
S1: train the initial network model on the semantic classification task using the complete data set to obtain a fully trained general model, and train the initial network model on the semantic classification task using the data set of each field to obtain a plurality of fully trained field models. The complete data set comprises the data sets of a plurality of fields, each field comprises a plurality of data sets, and the data in the data sets are labeled with semantic categories in advance. The general model is unique, while the field models cover the many industry fields of customer service conversation scenarios, including dozens of consumer fields such as electrical appliances, clothing, shoes and bags, food, daily necessities, beauty and cosmetics, and accessories. The semantic categories are the pre-defined categories of questions that customers ask customer service in an e-commerce conversation scenario, such as "asking about the delivery time" or "asking whether there is a gift". During labeling, user chat corpora are assigned to the corresponding semantics; for example, "when will it ship" and "it has been so long, why has it not shipped" are both assigned to the semantic "asking about the delivery time". Under that semantic, many different questions all express "asking about the delivery time", and the other semantics are labeled in the same way. During training the model learns the different questions and their corresponding semantics, so that at prediction time the questions seen during training, or similar questions, are classified into the correct semantics. The robot reply content corresponding to each semantic is configured in advance, which realizes the robot's automatic response process.
S2: input the target data set into the general model and the corresponding field model for calculation, taking the intermediate output of the general model as the general sentence representation and the intermediate output of the field model as the field sentence representation. The target data set belongs to one of the field data sets, contains less data than that field data set, and its data are labeled with semantic categories in advance.
S3: splice the field sentence representation onto the tail of the general sentence representation to obtain the spliced sentence representation. The intermediate-layer output of both the general model and the field model is a 500-dimensional vector; for example, if the general sentence representation of a sentence is [1, …, 500] and its field sentence representation is [501, …, 1000], the spliced vector [1, …, 500, 501, …, 1000] has 1000 dimensions (a code sketch of this splicing follows S4).
S4: input the spliced sentence representation into an initial network model for training to obtain the target model; the target model belongs to the same field as the target data set used.
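A minimal NumPy sketch of the splicing in S3, assuming the two 500-dimensional intermediate outputs are already available as arrays with one row per sentence (the random values are placeholders):

import numpy as np

general_repr = np.random.rand(3, 500)   # intermediate output of the general model, one row per sentence
field_repr = np.random.rand(3, 500)     # intermediate output of the field model for the same sentences

# Splice the field sentence representation onto the tail of the general sentence representation:
# each pair of 500-dimensional vectors becomes one 1000-dimensional spliced sentence representation.
spliced_repr = np.concatenate([general_repr, field_repr], axis=-1)
assert spliced_repr.shape == (3, 1000)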
In another preferred embodiment, as shown in fig. 2, the migration method further includes: training the initialization model on the semantic classification task using the complete data set to obtain an additional model; inputting the target data set into the additional model for calculation and taking the intermediate output of the additional model as an additional sentence representation; splicing the field sentence representation onto the tail of the general sentence representation and the additional sentence representation onto the tail of the field sentence representation to obtain the spliced sentence representation; and inputting the spliced sentence representation into the initial network model for training to obtain the target model. The additional model computes the target data set to produce the additional sentence representation; the spliced sentence representation that includes it carries more knowledge, so the target model trained on it learns more and is better adapted to the field of the target data.
As shown in fig. 3, training the initial network model using the complete data set to obtain the general model includes the following steps (a code sketch follows S115):
s111: perform numerical conversion on each sentence of each data set in the complete data set to obtain a vector of a specified length. First, a mapping table from Chinese characters to numbers is defined and generated so that each distinct Chinese character corresponds to a unique number, for example "在" -> 1 and "吗" -> 2, so that the sentence "在吗" ("are you there?") becomes [1, 2]. The specified length of 35 is set according to message-length statistics from e-commerce customer service chats, i.e. at most 35 characters are processed to obtain a vector of length 35, and shorter sentences are left-padded with 0, so "在吗" becomes [0, 0, 0, 0, 0, …, 1, 2]. The semantic category labeled on each sentence also needs numerical conversion: the established semantic categories are mapped to numbers, so with n semantics each semantic corresponds to a number from 0 to n-1;
s112: form the vectors obtained from the same data set into a matrix of that data set; since a data set contains a plurality of sentences, this yields a multi-dimensional matrix. For example, if the data set contains only two occurrences of "在吗", the resulting matrix is the two-dimensional matrix [[0, …, 1, 2], [0, …, 1, 2]];
s113: input the matrix of the data set into the initial network model for iterative computation;
s114: calculate the value of the loss function of the initial network model and adjust the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s115: repeat S113-S114 until the value of the loss function no longer decreases or a preset number of iterations has been performed; the network with the parameters obtained from the last adjustment is the general model.
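A minimal sketch of S111-S115, assuming a Keras-style network stands in for the initial network model; the character-mapping excerpt, the toy label set, the layer sizes and the optimizer are illustrative assumptions, while the vector length of 35, the 500-dimensional intermediate layer and stopping once the loss no longer decreases follow the description:

import numpy as np
import tensorflow as tf

SEQ_LEN = 35                                          # specified vector length (S111)
char_to_id = {"在": 1, "吗": 2}                        # mapping table from Chinese characters to numbers (toy excerpt)
label_to_id = {"询问在不在": 0, "询问发货时间": 1}       # semantic categories mapped to numbers 0..n-1

def sentence_to_vector(sentence):
    # S111: convert a sentence to a fixed-length vector, left-padding with 0.
    ids = [char_to_id.get(ch, 0) for ch in sentence][:SEQ_LEN]
    return [0] * (SEQ_LEN - len(ids)) + ids

# S112: stack the vectors of one data set into a matrix.
sentences = ["在吗", "在吗"]
labels = ["询问在不在", "询问在不在"]
X = np.array([sentence_to_vector(s) for s in sentences])
y = np.array([label_to_id[l] for l in labels])

# S113-S115: iterate, compute the loss, adjust the undetermined parameters, and stop once
# the loss no longer decreases or after a preset number of iterations.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(6000, 128),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(500, activation="relu", name="sentence_repr"),  # 500-dimensional intermediate output
    tf.keras.layers.Dense(len(label_to_id), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
stop_when_flat = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=3)
model.fit(X, y, epochs=100, callbacks=[stop_when_flat], verbose=0)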
As shown in fig. 4, the training of the initial network model using the data sets of the respective domains to obtain a plurality of domain models includes the following steps;
s121: perform numerical conversion on each sentence of each data set in the data sets of the same field to obtain a vector of a specified length; the field data sets are processed in the same way as the complete data set;
s122: form the vectors obtained from the same data set into a matrix of that data set;
s123: input the matrix of the data set into the initial network model and perform iterative computation with a field-preference objective function added. During training, a distance measurement function measures the distance between the intermediate-layer output of the general model and the intermediate-layer output of the field model, where the intermediate layer is a layer between the model's input layer and output layer and both intermediate-layer outputs are 500-dimensional floating-point vectors [x_0, …, x_500]. The optimization goal of the field-preference objective function is to make the distance between the two intermediate-layer output vectors large enough, so that the intermediate-layer outputs of the general model and the field model lie farther apart in their spatial distribution (a sketch of such an objective follows S125);
s124: calculate the value of the loss function of the initial network model and adjust the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s125: repeat S123-S124 until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed; the network with the parameters obtained from the last adjustment is the field model. During field model training, choosing the sample content of the field data set well increases the field model's ability to learn field-related knowledge, which gives the target model better field adaptability.
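The field-preference objective of S123 can be sketched as a custom training step. The description only requires that the distance between the two intermediate-layer outputs be driven large; how that preference term is combined with the classification loss (the subtraction and the weight lam below) and the model-building details are assumptions made for illustration:

import tensorflow as tf

def build_model(vocab_size=6000, seq_len=35, repr_dim=500, num_classes=20):
    # Initial network model exposing both its 500-dimensional intermediate output and its prediction.
    inp = tf.keras.Input(shape=(seq_len,))
    x = tf.keras.layers.Embedding(vocab_size, 128)(inp)
    x = tf.keras.layers.GlobalAveragePooling1D()(x)
    repr_ = tf.keras.layers.Dense(repr_dim, activation="relu", name="sentence_repr")(x)
    out = tf.keras.layers.Dense(num_classes, activation="softmax")(repr_)
    return tf.keras.Model(inp, [repr_, out])

general_model = build_model()   # assumed already trained on the complete data set; kept frozen here
field_model = build_model()     # the field model being trained

cross_entropy = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam()

def field_train_step(x_batch, y_batch, lam=0.1):
    general_repr, _ = general_model(x_batch, training=False)
    with tf.GradientTape() as tape:
        field_repr, probs = field_model(x_batch, training=True)
        classification_loss = cross_entropy(y_batch, probs)
        # Field preference: Euclidean distance between the two intermediate-layer outputs
        # (an included-angle / cosine measure could be used instead).
        separation = tf.reduce_mean(tf.norm(general_repr - field_repr, axis=-1))
        # Subtracting the separation pushes the field representation away from the general one;
        # the weight lam is an illustrative choice, not specified in the description.
        loss = classification_loss - lam * separation
    grads = tape.gradient(loss, field_model.trainable_variables)
    optimizer.apply_gradients(zip(grads, field_model.trainable_variables))
    return loss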
As shown in fig. 5, training the initialization model using the full dataset to obtain additional models comprises the following steps:
s131: perform numerical conversion on each sentence of each data set in the complete data set to obtain a vector of a specified length;
s132: form the vectors obtained from the same data set into a matrix of that data set;
s133: input the matrix of the data set into the initial network model and perform iterative computation with a field-preference objective function that differs from the one used to train the field models; when training the additional pre-trained model, this field-preference objective function is used to enlarge the spatial separation between the intermediate-layer representation of the general model and the intermediate-layer representation of the additional pre-trained model;
s134: calculate the value of the loss function of the initial network model and adjust the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s135: repeat S133-S134 until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed; the network with the parameters obtained from the last adjustment is the additional model.
The field-preference objective function is a distance measurement function or an included-angle measurement function, as sketched below.
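The two candidate measures can be written, for example, as the following plain NumPy functions; the exact formulas are not fixed by the description, so these are illustrative:

import numpy as np

def distance_measure(u, v):
    # Euclidean distance between two intermediate-layer output vectors.
    return np.linalg.norm(u - v)

def included_angle_measure(u, v):
    # Included angle (in radians) between two intermediate-layer output vectors.
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(cos, -1.0, 1.0))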
Inputting the target data set into the general model and the field model for calculation comprises the following steps (a code sketch follows S204):
s201: perform numerical conversion on each sentence of each data set in the target data set to obtain a vector of a specified length;
s202: form the vectors obtained from the same data set into a matrix of that data set;
s203: input the matrix of the data set into the general model and compute once; the intermediate output is taken as the general sentence representation;
s204: input the matrix of the data set into the corresponding field model and compute once; the intermediate output is taken as the field sentence representation.
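A sketch of S203-S204, assuming the trained general model and field model each expose a named 500-dimensional intermediate layer as in the earlier sketches, and that X_target is the matrix already built from the target data set in S201-S202:

import tensorflow as tf

def sentence_representations(general_model, field_model, X_target, layer_name="sentence_repr"):
    # Wrap each trained model so that a single forward pass yields its intermediate-layer output.
    general_repr_model = tf.keras.Model(general_model.input,
                                        general_model.get_layer(layer_name).output)
    field_repr_model = tf.keras.Model(field_model.input,
                                      field_model.get_layer(layer_name).output)
    general_repr = general_repr_model.predict(X_target)   # S203: general sentence representation
    field_repr = field_repr_model.predict(X_target)       # S204: field sentence representation
    return general_repr, field_repr                       # spliced afterwards as in S3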
Inputting the spliced sentence representation into the initial network model for training to obtain the target model comprises the following steps: input the spliced sentence representation into the initial network model for iterative computation; calculate the value of the loss function of the initial network model and adjust the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment; repeat until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed, and the network with the parameters obtained from the last adjustment is the target model. As the samples are continuously updated, retraining the pre-trained models brings continuous improvement to the target model.
Example one
Assume a data set contains n training samples, one of which is the sentence "在吗" ("are you there?") labeled with the semantic of asking whether the agent is present; the data set also contains many other training samples, such as "when will it ship" labeled with the semantic "asking about the delivery time", and so on. Taking "在吗" as an example: 1. Numerical conversion: "在吗" is converted into the length-35 vector [0, …, 1, 2], and its semantic label is converted into the number 0. 2. Pre-trained representation: the vector from step 1 is passed through the general model and the field model to obtain two 500-dimensional vectors [1, …, 500] and [501, …, 1000], which are spliced into a 1000-dimensional vector [1, …, 1000]; the other n-1 training samples in the data set are converted in the same way, each yielding a 1000-dimensional vector. During the target model's computation, one batch of 1000-dimensional vectors is input at a time, fixed at 200 vectors per batch, corresponding to 200 training samples. The model predicts the semantic number of each vector, and the loss value is computed by the objective function, which evaluates the error between the predicted semantic number and the actual number; the more accurate the prediction, the smaller the loss. After each batch is computed, a loss value is obtained and the model parameters are optimized by gradient descent, then the next batch is input and its loss computed. This cycle repeats until the loss no longer decreases, at which point computation stops, the target model parameters are no longer adjusted, and the customer service robot model that predicts buyer semantics is obtained.
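The embodiment can be sketched as follows; the 1000-dimensional spliced vectors, the batch size of 200 and stopping once the loss no longer decreases come from the description above, while the random stand-in data, the hidden-layer size and the optimizer are illustrative assumptions:

import numpy as np
import tensorflow as tf

NUM_SEMANTICS = 20    # illustrative number of semantic categories, numbered 0..n-1
n = 1000              # illustrative number of training samples in the target data set

# Spliced 1000-dimensional sentence representations and their semantic numbers
# (random stand-ins here; in practice they come from the general and field models).
X_spliced = np.random.rand(n, 1000).astype("float32")
y = np.random.randint(0, NUM_SEMANTICS, size=n)

target_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1000,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(NUM_SEMANTICS, activation="softmax"),
])
target_model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# 200 spliced vectors (200 training samples) per batch; gradient descent after every batch;
# training stops once the loss no longer decreases.
stop_when_flat = tf.keras.callbacks.EarlyStopping(monitor="loss", patience=2)
target_model.fit(X_spliced, y, batch_size=200, epochs=100, callbacks=[stop_when_flat], verbose=0)

# The finished customer service robot model predicts the semantic number of a buyer message
# from its spliced representation.
predicted_semantic = int(np.argmax(target_model.predict(X_spliced[:1]), axis=-1)[0])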
The above embodiments only express specific implementations of the present invention, and although their description is specific and detailed, they are not to be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of the present invention.

Claims (10)

1. An intelligent customer service data training model field migration method, characterized by comprising the following steps:
s1: training an initial network model using the complete data set to obtain a general model, and training the initial network model using the data set of each field to obtain a plurality of field models;
s2: inputting a target data set into the general model and the corresponding field model for calculation, taking the intermediate output of the general model as a general sentence representation and the intermediate output of the field model as a field sentence representation;
s3: splicing the field sentence representation onto the tail of the general sentence representation to obtain a spliced sentence representation;
s4: inputting the spliced sentence representation into an initial network model for training to obtain a target model.
2. The intelligent customer service data training model field migration method of claim 1, wherein the migration method further comprises: training an initialization model using the complete data set to obtain an additional model; inputting the target data set into the additional model for calculation and taking the intermediate output of the additional model as an additional sentence representation; splicing the field sentence representation onto the tail of the general sentence representation and the additional sentence representation onto the tail of the field sentence representation to obtain the spliced sentence representation; and inputting the spliced sentence representation into the initial network model for training to obtain the target model.
3. The intelligent customer service data training model field migration method according to claim 1 or 2, characterized in that training the initial network model using the complete data set to obtain the general model comprises the following steps:
s111: performing numerical conversion on each sentence of each data set in the complete data set to obtain a vector of a specified length;
s112: forming the vectors obtained from the same data set into a matrix of that data set;
s113: inputting the matrix of the data set into the initial network model for iterative computation;
s114: calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s115: repeating S113-S114 until the value of the loss function no longer decreases or a preset number of iterations has been performed, the network with the parameters obtained from the last adjustment being the general model.
4. The intelligent customer service data training model field migration method according to claim 1 or 2, characterized in that training the initial network model using the data set of each field to obtain a plurality of field models comprises the following steps:
s121: performing numerical conversion on each sentence of each data set in the data sets of the same field to obtain a vector of a specified length;
s122: forming the vectors obtained from the same data set into a matrix of that data set;
s123: inputting the matrix of the data set into the initial network model and performing iterative computation with a field-preference objective function added;
s124: calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s125: repeating S123-S124 until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed, the network with the parameters obtained from the last adjustment being the field model.
5. The intelligent customer service data training model field migration method of claim 2, wherein training the initialization model using the complete data set to obtain the additional model comprises the following steps:
s131: performing numerical conversion on each sentence of each data set in the complete data set to obtain a vector of a specified length;
s132: forming the vectors obtained from the same data set into a matrix of that data set;
s133: inputting the matrix of the data set into the initial network model and performing iterative computation with a field-preference objective function that differs from the one used to train the field models;
s134: calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment;
s135: repeating S133-S134 until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed, the network with the parameters obtained from the last adjustment being the additional model.
6. The intelligent customer service data training model field migration method of claim 5, wherein the field-preference objective function is a distance measurement function or an included-angle measurement function.
7. The intelligent customer service data training model field migration method of claim 1, wherein inputting the target data set into the general model and the field model for calculation comprises the following steps:
s201: performing numerical conversion on each sentence of each data set in the target data set to obtain a vector of a specified length;
s202: forming the vectors obtained from the same data set into a matrix of that data set;
s203: inputting the matrix of the data set into the general model and computing once, the intermediate output being taken as the general sentence representation;
s204: inputting the matrix of the data set into the corresponding field model and computing once, the intermediate output being taken as the field sentence representation.
8. The intelligent customer service data training model field migration method of claim 1, wherein inputting the spliced sentence representation into the initial network model for training to obtain the target model comprises the following steps: inputting the spliced sentence representation into the initial network model for iterative computation; calculating the value of the loss function of the initial network model and adjusting the undetermined parameters in each layer of the initial network model so that the average value of the loss function decreases after the adjustment; and repeating until the value of the loss function no longer decreases or a preset number of consecutive iterations has been performed, the network with the parameters obtained from the last adjustment being the target model.
9. The intelligent customer service data training model field migration method of claim 1, wherein training the initial network model using the complete data set to obtain the general model specifically means training the initial network model on a semantic classification task using the complete data set to obtain the general model; training the initial network model using the data set of each field to obtain a plurality of field models specifically means training the initial network model on the semantic classification task using the data set of each field to obtain a plurality of field models; and the data of all data sets are labeled with semantic categories in advance.
10. The intelligent customer service data training model field migration method of claim 1, wherein the target data set belongs to one of the field data sets, contains less data than that field data set, and its data are labeled with semantic categories in advance.
CN201911133457.9A 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method Active CN111221963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911133457.9A CN111221963B (en) 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911133457.9A CN111221963B (en) 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method

Publications (2)

Publication Number Publication Date
CN111221963A true CN111221963A (en) 2020-06-02
CN111221963B CN111221963B (en) 2023-05-12

Family

ID=70810181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911133457.9A Active CN111221963B (en) 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method

Country Status (1)

Country Link
CN (1) CN111221963B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460455A (en) * 2018-02-01 2018-08-28 成都小多科技有限公司 Model treatment method and device
US20190333199A1 (en) * 2018-04-26 2019-10-31 The Regents Of The University Of California Systems and methods for deep learning microscopy
CN109711529A (en) * 2018-11-13 2019-05-03 中山大学 A kind of cross-cutting federal learning model and method based on value iterative network
CN110046248A (en) * 2019-03-08 2019-07-23 阿里巴巴集团控股有限公司 Model training method, file classification method and device for text analyzing
CN110309267A (en) * 2019-07-08 2019-10-08 哈尔滨工业大学 Semantic retrieving method and system based on pre-training model
CN110399492A (en) * 2019-07-22 2019-11-01 阿里巴巴集团控股有限公司 The training method and device of disaggregated model aiming at the problem that user's question sentence
CN110442684A (en) * 2019-08-14 2019-11-12 山东大学 A kind of class case recommended method based on content of text

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘文洁; 林磊; 孙承杰: "Semantic reasoning network based on transfer learning" (基于迁移学习的语义推理网络) *

Also Published As

Publication number Publication date
CN111221963B (en) 2023-05-12

Similar Documents

Publication Publication Date Title
Zou et al. Logistic regression model optimization and case analysis
CN109493166B (en) Construction method for task type dialogue system aiming at e-commerce shopping guide scene
CN111143540A (en) Intelligent question and answer method, device, equipment and storage medium
CN110781409B (en) Article recommendation method based on collaborative filtering
CN108334891A (en) A kind of Task intent classifier method and device
CN109582956A (en) text representation method and device applied to sentence embedding
CN111353033B (en) Method and system for training text similarity model
CN110334190A (en) A kind of reply automatic generation method towards open field conversational system
CN110209926A (en) Merchant recommendation method, device, electronic equipment and readable storage medium storing program for executing
CN108509573A (en) Book recommendation method based on matrix decomposition collaborative filtering and system
CN112182362A (en) Method and device for training model for online click rate prediction and recommendation system
CN114186084B (en) Online multi-mode Hash retrieval method, system, storage medium and equipment
CN111046170A (en) Method and apparatus for outputting information
CN112529151A (en) Data processing method and device
CN112906393A (en) Meta learning-based few-sample entity identification method
CN112884552A (en) Lightweight multimode recommendation method based on generation countermeasure and knowledge distillation
CN116956116A (en) Text processing method and device, storage medium and electronic equipment
CN117009650A (en) Recommendation method and device
CN114579640A (en) Financial time sequence prediction system and method based on generating type countermeasure network
CN111221963B (en) Intelligent customer service data training model field migration method
CN108197702B (en) Product design method based on evaluation network and recurrent neural network
CN110956528B (en) Recommendation method and system for e-commerce platform
CN114862514A (en) User preference commodity recommendation method based on meta-learning
CN110796195B (en) Image classification method including online small sample excitation
CN114764469A (en) Content recommendation method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant