CN111221963B - Intelligent customer service data training model field migration method - Google Patents

Intelligent customer service data training model field migration method

Info

Publication number
CN111221963B
CN111221963B CN201911133457.9A
Authority
CN
China
Prior art keywords
model
training
data set
network model
initial network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911133457.9A
Other languages
Chinese (zh)
Other versions
CN111221963A (en)
Inventor
张翀
江岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Xiaoduo Technology Co ltd
Original Assignee
Chengdu Xiaoduo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Xiaoduo Technology Co ltd filed Critical Chengdu Xiaoduo Technology Co ltd
Priority to CN201911133457.9A priority Critical patent/CN111221963B/en
Publication of CN111221963A publication Critical patent/CN111221963A/en
Application granted granted Critical
Publication of CN111221963B publication Critical patent/CN111221963B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a domain migration method for intelligent customer service data training models, which comprises the following steps: training an initial network model with all data sets to obtain a general model, and training the initial network model with the data sets of each domain to obtain a plurality of domain models; inputting the target data set into the general model and the corresponding domain model for calculation, taking the intermediate output of the general model as the general sentence representation and the intermediate output of the domain model as the domain sentence representation; splicing the domain sentence representation onto the tail of the general sentence representation to obtain a spliced sentence representation; and inputting the spliced sentence representation into an initial network model for training to obtain a target model. In this method, the target model is trained on the spliced sentence representation formed by splicing the general sentence representation and the domain sentence representation; because the spliced representation inherits both the general knowledge learned by the general model and the domain knowledge learned by the domain model, the target model can fully adapt to the target data domain.

Description

Intelligent customer service data training model field migration method
Technical Field
The invention belongs to the technical field of neural network data processing, and particularly relates to a domain migration method for intelligent customer service data training models.
Background
Transfer learning in deep learning has been widely used in the NLP field. Existing deep learning transfer learning methods include the following:
1. Parameter-based (an early transfer learning approach), i.e. reusing the parameters of the pre-trained model. The input of the target model is the numerically converted text, and the parameters of the pre-trained model are used directly as the initialization parameters of the target model.
Drawback of the parameter-based approach: the target model must match the large parameter scale of the pre-trained model, so the computational complexity is high and industrial application requirements cannot be met.
2. Representation-based: the text is numerically converted and input into the pre-trained model, and the intermediate output of the pre-trained model is used as the input of the target model.
Advantage of the representation-based approach: because the pre-trained model computes the representation only once instead of iterating many times, the target model can use a small parameter scale, so the speed is greatly improved while the recognition accuracy is comparable to the parameter-based approach.
Drawback of the representation-based approach: a domain mismatch problem arises when, for example, a model pre-trained on clothing-domain questions is applied directly to the mobile phone domain, so the recognition accuracy hits a bottleneck.
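As a concrete illustration of the representation-based approach (a minimal sketch in PyTorch; the module and function names are illustrative assumptions, not part of the prior-art methods being described), the pre-trained model is frozen, computes the sentence representation once, and only a small target classifier is trained on top of that representation:

    import torch
    import torch.nn as nn

    class SmallTargetClassifier(nn.Module):
        """Small target model that consumes fixed sentence representations."""
        def __init__(self, repr_dim, num_classes):
            super().__init__()
            self.head = nn.Sequential(
                nn.Linear(repr_dim, 128),
                nn.ReLU(),
                nn.Linear(128, num_classes),
            )

        def forward(self, sentence_repr):
            return self.head(sentence_repr)

    def representation_features(pretrained_encoder, numericalized_batch):
        """Run the frozen pre-trained model once; its intermediate output is the feature."""
        pretrained_encoder.eval()
        with torch.no_grad():  # the representation is computed once, never fine-tuned
            return pretrained_encoder(numericalized_batch)

Only the small classifier head is trained, which is where the speed advantage comes from; the domain-mismatch drawback noted above is what the method of the invention addresses.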
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides a domain migration method for an intelligent customer service data training model. First, a general model suitable for all domains and a plurality of domain models, each suitable for one domain, are pre-trained. When a target data set is processed, the general model and the domain model corresponding to the target data set compute and output a general sentence representation and a domain sentence representation; through training, the general model has learned general knowledge and the domain model has learned knowledge specific to its domain. The spliced sentence representation formed by splicing the general sentence representation and the domain sentence representation is then used to train a target model. Because the spliced sentence representation carries both the general knowledge learned by the general model and the domain knowledge learned by the domain model, the trained target model also learns the general knowledge and the domain-specific knowledge, can fully adapt to the target data domain, and improves the accuracy of semantic recognition.
In order to achieve the above object, the present invention adopts the following solution. The intelligent customer service data training model field migration method comprises the following steps:
S1: training an initial network model for a semantic classification task using all data sets to obtain a general model, and training the initial network model for the semantic classification task using the data sets of each domain to obtain a plurality of domain models, wherein all data sets comprise the data sets of every domain, each domain contains a plurality of data sets, and the data in the data sets are labeled with semantic categories in advance;
S2: inputting a target data set into the general model and the corresponding domain model for calculation, taking the intermediate output of the general model as the general sentence representation and the intermediate output of the domain model as the domain sentence representation, wherein the target data set belongs to one of the domains, its data quantity is smaller than that of the domain data set, and all data in the target data set are labeled with semantic categories in advance;
s3: splicing the domain sentence representation at the tail of the general sentence representation to obtain a spliced sentence representation;
S4: inputting the spliced sentence representation into an initial network model for training to obtain a target model; the obtained target model belongs to the same domain as the target data set used for training.
The migration method further comprises: training an initialization model for the semantic classification task using all data sets to obtain an additional model; inputting the target data set into the additional model for calculation and taking the intermediate output of the additional model as an additional sentence representation; splicing the domain sentence representation onto the tail of the general sentence representation and the additional sentence representation onto the tail of the domain sentence representation to obtain the spliced sentence representation; and inputting the spliced sentence representation into the initial network model for training to obtain the target model. The additional model computes an additional sentence representation from the target data set, so the spliced sentence representation carries more knowledge and describes the text from one or more extra angles; the target model trained on this spliced representation therefore learns more, its semantic recognition accuracy improves, and its suitability for the target data domain increases.
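For illustration only (a sketch that assumes the sentence representations are PyTorch tensors; the function name is hypothetical and the 500-dimensional size is taken from the embodiment described later), the splicing of the general, domain, and optional additional sentence representations is a tail-to-tail concatenation along the feature axis:

    import torch

    def splice_representations(general_repr, domain_repr, additional_repr=None):
        """Concatenate sentence representations tail-to-tail (S3 and its extended variant).

        general_repr, domain_repr: (batch, 500) intermediate outputs of the two models.
        additional_repr: optional (batch, 500) intermediate output of the additional model.
        Returns a (batch, 1000) or (batch, 1500) spliced sentence representation.
        """
        parts = [general_repr, domain_repr]
        if additional_repr is not None:
            parts.append(additional_repr)
        return torch.cat(parts, dim=-1)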
Training the initial network model with all data sets to obtain the general model comprises the following steps:
S111: performing numerical conversion on each sentence of each data set in all data sets to obtain a vector of a specified length; first, a mapping table from Chinese characters to numbers is defined so that each distinct character corresponds to a unique number; then each sentence is converted into a vector of the specified length according to the mapping table, and sentences that are too short are padded with 0 until the specified length is reached;
s112: a plurality of vectors obtained by processing the same data set form a matrix of the data set, and one data set comprises a plurality of sentences to form a multi-dimensional matrix;
s113: inputting a matrix of the data set into an initial network model for iterative computation;
s114: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s115: and repeating S113-S114 until the value of the loss function is not reduced any more or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is a general model.
Training the initial network model with the data sets of each domain to obtain a plurality of domain models comprises the following steps:
s121: performing numerical conversion on each statement of each data set in the same field data set to obtain a vector with a specified length, wherein the steps of the field data set are the same as the processing steps of all the data sets;
s122: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s123: inputting a matrix of the data set into an initial network model, and performing iterative computation by adding an objective function with field preference;
s124: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s125: repeating S123-S124 until the value of the loss function is not reduced or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is the field model.
Training the initialization model with all data sets to obtain the additional model comprises the following steps:
s131: performing numerical conversion on each statement of each data set in all data sets to obtain a vector with a specified length;
s132: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s133: inputting a matrix of the data set into an initial network model, and performing iterative computation by adding an objective function with different domain preference from the domain training model;
s134: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s135: repeating S133-S134 until the value of the loss function is not reduced or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is an additional model.
The objective function of the field preference is a distance measurement function or an included angle measurement function.
Inputting the target data set into the general model and the domain model for calculation comprises:
s201: performing numerical conversion on each statement of each data set in the target data set to obtain a vector with a specified length;
s202: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s203: inputting a matrix of the dataset into a general model to calculate once to obtain an intermediate output which is used as general sentence representation;
s204: and inputting the matrix of the data set into a corresponding domain model to calculate once to obtain an intermediate output which is used as domain sentence representation.
Inputting the spliced sentence representation into the initial network model for training to obtain the target model comprises: inputting the spliced sentence representation into the initial network model for iterative computation; calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced; until the value of the loss function is not reduced any more or until the iteration calculation is continuously performed for preset times, the target structure of the parameter to be determined after the last adjustment is the target model.
The beneficial effects of the invention are as follows:
(1) A general model suitable for all domains and a plurality of domain models, each suitable for one domain, are pre-trained. When the target data set is processed, the general model and the domain model corresponding to the target data set compute and output the general sentence representation and the domain sentence representation; the general model has learned general knowledge and the domain model has learned knowledge specific to its domain. The target model is obtained by training on the spliced sentence representation formed by splicing the general sentence representation and the domain sentence representation. Because this spliced representation carries both the general knowledge learned by the general model and the domain knowledge learned by the domain model, the trained target model also learns the general knowledge and the domain-specific knowledge.
(2) Meanwhile, because the general model and the domain model are mature, already-trained models, the amount of target data needed to train the target model is greatly reduced, which improves model training efficiency.
Drawings
FIG. 1 is a diagram of a method for migrating a domain of a data training model according to a first embodiment of the present invention;
FIG. 2 is a diagram of a data training model field migration method in a second embodiment of the present invention;
FIG. 3 is a flow chart of the generic model training of the present invention;
FIG. 4 is a flow chart of the model training in the field of the present invention;
FIG. 5 is a flow chart of additional model training according to the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
As shown in FIG. 1, the intelligent customer service data training model field migration method includes the following steps:
S1: training an initial network model for the semantic classification task using all data sets to obtain a fully trained general model, and training the initial network model for the semantic classification task using the data sets of each domain to obtain a plurality of fully trained domain models. All data sets comprise the data sets of a plurality of domains, each domain contains a plurality of data sets, and the data in the data sets are pre-labeled with semantic categories. The general model is unique, while the domain models cover the many industry domains of customer service dialogue scenes, including dozens of consumer domains such as electrical appliances, clothing/shoes/bags, food, daily necessities, cosmetics, and jewelry. A semantic category is a predetermined question category encountered by customer service in an e-commerce dialogue scene, for example: inquiring about the shipping time, inquiring whether there is a gift, and so on. When labeling, each user chat utterance is assigned to its corresponding semantic; for example, questions such as "when will it ship" and "has it shipped yet" are assigned to the "query delivery time" semantic. The "query delivery time" semantic is thus expressed by a rich variety of question forms, and other semantics are labeled in the same way. During training the model learns the different question forms and their corresponding semantics, so that during prediction a question form seen in training, or a similar one, is classified under the correct semantic; robot reply content corresponding to each semantic is configured in advance, which realizes the automatic robot reply process.
S2: inputting the target data set into the general model and the corresponding domain model for calculation, taking the intermediate output of the general model as the general sentence representation and the intermediate output of the domain model as the domain sentence representation, wherein the target data set belongs to one of the domains, its data quantity is smaller than that of the domain data set, and all data in the target data set are pre-labeled with semantic categories.
S3: splicing the domain sentence representation onto the tail of the general sentence representation to obtain a spliced sentence representation. The intermediate-layer outputs of the general model and the domain model are both 500-dimensional vectors; for example, if the general sentence representation of a sentence is [1, …, 500] and the domain sentence representation is [501, …, 1000], the spliced vector becomes 1000-dimensional: [1, …, 500, 501, …, 1000].
S4: inputting the spliced sentence representation into an initial network model for training to obtain a target model; the obtained target model belongs to the same domain as the target data set used for training.
In another preferred embodiment, as shown in FIG. 2, the migration method further includes training an initialization model for the semantic classification task using all data sets to obtain an additional model; inputting the target data set into the additional model for calculation and taking the intermediate output of the additional model as an additional sentence representation; splicing the domain sentence representation onto the tail of the general sentence representation and the additional sentence representation onto the tail of the domain sentence representation to obtain the spliced sentence representation; and inputting the spliced sentence representation into the initial network model for training to obtain the target model. The additional model computes an additional sentence representation from the target data set, so the spliced sentence representation carries more knowledge; the target model trained on it learns more and can better fit the target data domain.
As shown in FIG. 3, training the initial network model with all data sets to obtain the general model includes the following steps:
S111: performing numerical conversion on each sentence of each data set in all data sets to obtain a vector of a specified length; first, a mapping table from Chinese characters to numbers is defined so that each distinct character corresponds to a unique number; each sentence is then converted into a vector according to the mapping table, and sentences shorter than the specified length are padded with 0 up to that length. For example, if the two characters of the greeting "在吗" ("are you there?") are mapped to 1 and 2, the sentence becomes [1, 2]. A specified length of 35 is set according to the average message length counted in e-commerce customer service chats, i.e. at most 35 characters are processed to obtain a 35-dimensional vector, and shorter vectors are padded with 0 up to length 35, so here "在吗" becomes [0, …, 1, 2]. The semantic category labeling each sentence also needs to be converted into a numerical value: the established semantic categories are mapped to numerical labels, so if there are n semantics, each semantic corresponds to a number from 0 to n-1 (a code sketch of this conversion follows step S115);
S112: the vectors obtained by processing the same data set form the matrix of that data set; since one data set comprises a plurality of sentences, a multi-dimensional matrix is formed. For example, if a data set contained only the sentence "在吗" twice, the resulting matrix would be the two-dimensional matrix [[0, …, 1, 2], [0, …, 1, 2]];
s113: inputting a matrix of the data set into an initial network model for iterative computation;
s114: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s115: and repeating S113-S114 until the value of the loss function is not reduced any more or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is a general model.
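The conversion described in S111 and S112 could be sketched as follows (an assumption-level illustration in Python; the helper names are hypothetical, and the left-padding that places the character numbers at the end mirrors the [0, …, 1, 2] example above rather than being mandated by the patent):

    import torch

    def build_char_map(sentences):
        """Assign a unique positive number to every distinct character; 0 is reserved for padding."""
        char_map = {}
        for sentence in sentences:
            for ch in sentence:
                if ch not in char_map:
                    char_map[ch] = len(char_map) + 1
        return char_map

    def build_label_map(semantic_names):
        """Map the n established semantic categories to the numbers 0 .. n-1."""
        return {name: idx for idx, name in enumerate(semantic_names)}

    def numericalize(sentence, char_map, max_len=35):
        """Convert one sentence to a fixed-length vector, padding with 0 to the specified length."""
        ids = [char_map[ch] for ch in sentence][:max_len]
        return [0] * (max_len - len(ids)) + ids

    def dataset_matrix(sentences, char_map, max_len=35):
        """Stack the vectors of one data set into its matrix (S112)."""
        return torch.tensor([numericalize(s, char_map, max_len) for s in sentences],
                            dtype=torch.long)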
As shown in FIG. 4, training the initial network model with the data sets of each domain to obtain a plurality of domain models includes the following steps:
s121: performing numerical conversion on each statement of each data set in the same field data set to obtain a vector with a specified length, wherein the steps of the field data set are the same as the processing steps of all the data sets;
s122: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
S123: inputting the matrix of the data set into the initial network model and performing iterative computation with an added domain-preference objective function. During training, a distance metric function measures the distance between the middle-layer output of the general model and the middle-layer output of the domain model, where the middle layer is a layer between the model's input layer and output layer, and both middle-layer outputs are 500-dimensional floating-point vectors [x_1, …, x_500]. The optimization objective of the domain-preference objective function is to make the distance between the two middle-layer output vectors large enough, so that the middle-layer output of the general model and that of the domain model lie farther apart in their spatial distribution (a sketch of such a loss term follows step S125);
s124: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s125: repeating S123-S124 until the value of the loss function is not reduced or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is the field model. In the training process of the domain model, the learning ability of the domain model to domain related knowledge can be increased by reasonably selecting sample content of the domain data set, so that the domain suitability of the target model is better.
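The domain-preference objective described in S123 could take a form like the following (a sketch; the hinge shape, margin, and weight are assumptions, since the patent only requires the distance between the two middle-layer outputs to become large enough):

    import torch
    import torch.nn.functional as F

    def domain_preference_loss(domain_logits, labels, domain_hidden, general_hidden,
                               margin=10.0, weight=0.1):
        """Classification loss plus a term that pushes the domain model's middle-layer
        output away from the frozen general model's middle-layer output."""
        cls_loss = F.cross_entropy(domain_logits, labels)
        # Euclidean distance between the two 500-dimensional middle-layer outputs
        distance = torch.norm(domain_hidden - general_hidden.detach(), dim=-1).mean()
        separation_penalty = F.relu(margin - distance)  # vanishes once the distance exceeds the margin
        return cls_loss + weight * separation_penalty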
As shown in FIG. 5, training the initialization model with all data sets to obtain the additional model includes the following steps:
s131: performing numerical conversion on each statement of each data set in all data sets to obtain a vector with a specified length;
s132: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
S133: inputting the matrix of the data set into the initial network model and performing iterative computation with an added domain-preference objective function different from the one used for the domain model; the purpose of adopting a domain-preference objective function when training this additional pre-trained model is to enlarge the spatial separation between the middle-layer representation of the general model and the middle-layer representation of the additional pre-trained model;
s134: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s135: repeating S133-S134 until the value of the loss function is not reduced or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is an additional model.
The objective function of the field preference is a distance measurement function or an included angle measurement function.
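For concreteness, the two metric families mentioned here could be realized as follows (illustrative only; the patent does not fix particular formulas):

    import torch
    import torch.nn.functional as F

    def distance_metric(a, b):
        """Euclidean distance between two batches of middle-layer outputs."""
        return torch.norm(a - b, dim=-1)

    def included_angle_metric(a, b):
        """Included angle (in radians) between two batches of middle-layer outputs."""
        cos = F.cosine_similarity(a, b, dim=-1).clamp(-1.0, 1.0)
        return torch.acos(cos)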
Inputting the target data set into the general model and the domain model for calculation comprises the following steps (a sketch of the representation extraction follows step S204):
s201: performing numerical conversion on each statement of each data set in the target data set to obtain a vector with a specified length;
s202: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s203: inputting a matrix of the dataset into a general model to calculate once to obtain an intermediate output which is used as general sentence representation;
s204: and inputting the matrix of the data set into a corresponding domain model to calculate once to obtain an intermediate output which is used as domain sentence representation.
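One practical way to obtain the intermediate outputs of S203 and S204 (a sketch that assumes the models are PyTorch modules and that a particular layer has been chosen as the middle layer; hook-based extraction is an implementation choice, not something the patent prescribes):

    import torch

    def middle_layer_output(model, middle_layer, matrix):
        """Run the data set matrix through the model once and capture the middle layer's output."""
        captured = {}

        def hook(module, inputs, output):
            captured["repr"] = output.detach()

        handle = middle_layer.register_forward_hook(hook)
        model.eval()
        with torch.no_grad():
            model(matrix)  # "calculate once" as in S203/S204
        handle.remove()
        return captured["repr"]  # (num_sentences, 500) sentence representations

    # general_repr = middle_layer_output(general_model, chosen_general_layer, target_matrix)
    # domain_repr  = middle_layer_output(domain_model, chosen_domain_layer, target_matrix)
    # spliced      = torch.cat([general_repr, domain_repr], dim=-1)  # (num_sentences, 1000)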
Inputting the spliced sentence representation into the initial network model for training to obtain the target model comprises: inputting the spliced sentence representation into the initial network model for iterative computation; calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced; until the value of the loss function is not reduced any more or until the iteration calculation is continuously performed for preset times, the target structure of the parameter to be determined after the last adjustment is the target model. As the samples are continuously updated, retraining the pre-trained models continues to improve the target model.
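A compact sketch of this training loop (the batch size of 200 follows Example 1 below; the network shape, optimizer, and single-plateau stopping rule are assumptions made for illustration):

    import torch
    import torch.nn as nn

    def train_target_model(spliced_reprs, labels, num_classes,
                           batch_size=200, max_epochs=100, lr=1e-3):
        """Train the target model on spliced 1000-dimensional sentence representations (S4)."""
        model = nn.Sequential(nn.Linear(spliced_reprs.size(1), 256),
                              nn.ReLU(),
                              nn.Linear(256, num_classes))
        optimizer = torch.optim.SGD(model.parameters(), lr=lr)  # gradient descent as in Example 1
        loss_fn = nn.CrossEntropyLoss()
        best_loss = float("inf")

        for epoch in range(max_epochs):
            epoch_loss = 0.0
            for start in range(0, spliced_reprs.size(0), batch_size):
                x = spliced_reprs[start:start + batch_size]
                y = labels[start:start + batch_size]
                loss = loss_fn(model(x), y)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                epoch_loss += loss.item()
            if epoch_loss >= best_loss:  # stop once the loss no longer decreases
                break
            best_loss = epoch_loss
        return model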
Example 1
Assume that one data set contains n training samples, one of which is the query "在吗" ("are you there?"); the data set also contains many other training samples, corresponding to semantics such as "query delivery time", and so on. Taking "在吗" as an example: 1. Numerical conversion: "在吗" becomes the length-35 vector [0, …, 1, 2], and its corresponding semantic is converted to the numerical label 0. 2. Pre-trained representation: the vector obtained in the first step is converted by the general model and the domain model into two 500-dimensional vectors, [1, …, 500] and [501, …, 1000], which are spliced into one 1000-dimensional vector [1, …, 1000]; the other n-1 training samples in the data set are converted in the same way into 1000-dimensional vectors. 3. Target model calculation: one batch of 1000-dimensional vectors is input at a time, fixed at 200 vectors per batch and corresponding to 200 training samples. The model predicts the semantic number of each vector, and a loss value is calculated by the objective function, which evaluates the error between the semantic numbers predicted by the model and the actual numbers; the lower the loss, the more accurate the prediction. After each batch of data is calculated, a loss value is obtained, the model parameters are optimized by gradient descent, and the loss is calculated again; this cycle repeats until the loss no longer decreases, at which point the calculation stops and the parameters are no longer adjusted. The result is the final target model used by the customer service robot for prediction.
The foregoing examples merely illustrate specific embodiments of the invention in greater detail and are not to be construed as limiting its scope. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, and all such variations and modifications fall within the scope of the invention.

Claims (9)

1. An intelligent customer service data training model field migration method, characterized in that the method comprises the following steps:
s1: training an initial network model by using all data sets to obtain a general model, and training the initial network model by using the data sets of all the fields to obtain a plurality of field models;
s2: inputting the target data set into a general model and corresponding domain model calculation, taking the intermediate output of the general model as general sentence representation, and taking the intermediate output of the domain model as domain sentence representation;
s3: splicing the domain sentence representation at the tail of the general sentence representation to obtain a spliced sentence representation;
s4: inputting the spliced sentence representation into an initial network model for training to obtain a target model, which specifically comprises the following steps: inputting the spliced sentence representation into an initial network model for iterative computation; calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced; until the value of the loss function is not reduced any more or until the iteration calculation is continuously performed for preset times, the target structure of the parameter to be determined after the last adjustment is the target model.
2. The intelligent customer service data training model field migration method according to claim 1, wherein: the migration method further comprises the steps of training an initialization model by using all data sets to obtain an additional model; inputting the target data set into an additional model for calculation, and taking the intermediate output of the additional model as an additional sentence representation; splicing the domain sentence representation at the tail of the general sentence representation, and splicing the additional sentence representation at the tail of the domain sentence representation to obtain a spliced sentence representation; and inputting the spliced sentence representation into an initial network model for training to obtain a target model.
3. The intelligent customer service data training model field migration method according to claim 1 or 2, wherein: the training of the initial network model to obtain the universal model by using all the data sets comprises the following steps:
s111: performing numerical conversion on each statement of each data set in all data sets to obtain a vector with a specified length;
s112: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s113: inputting a matrix of the data set into an initial network model for iterative computation;
s114: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s115: and repeating S113-S114 until the value of the loss function is not reduced any more or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is a general model.
4. The intelligent customer service data training model field migration method according to claim 1 or 2, wherein: the training of the initial network model by using the data sets of all the fields to obtain a plurality of field models comprises the following steps:
s121: performing numerical conversion on each statement of each data set in the same field of data sets to obtain a vector with a specified length;
s122: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s123: inputting a matrix of the data set into an initial network model, and performing iterative computation by adding an objective function with field preference;
s124: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s125: repeating S123-S124 until the value of the loss function is not reduced or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is the field model.
5. The intelligent customer service data training model field migration method as claimed in claim 2, wherein: the training the initialization model to obtain the additional model by using all the data sets comprises the following steps:
s131: performing numerical conversion on each statement of each data set in all data sets to obtain a vector with a specified length;
s132: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s133: inputting a matrix of the data set into an initial network model, and performing iterative computation by adding an objective function with different domain preference from the domain training model;
s134: calculating the value of the loss function of the initial network model, and adjusting the parameters to be determined in each layer structure of the initial network model so that the average value of the loss function of the network model after the parameters are adjusted is reduced;
s135: repeating S133-S134 until the value of the loss function is not reduced or until the iteration calculation is continuously performed for preset times, wherein the target structure of the parameter to be determined after the last adjustment is an additional model.
6. The intelligent customer service data training model field migration method as claimed in claim 5, wherein: the objective function of the field preference is a distance measurement function or an included angle measurement function.
7. The intelligent customer service data training model field migration method according to claim 1, wherein: the inputting of the target data set into the general model and the domain model calculation comprises:
s201: performing numerical conversion on each statement of each data set in the target data set to obtain a vector with a specified length;
s202: a plurality of vectors obtained by processing the same data set form a matrix of the data set;
s203: inputting a matrix of the dataset into a general model to calculate once to obtain an intermediate output which is used as general sentence representation;
s204: and inputting the matrix of the data set into a corresponding domain model to calculate once to obtain an intermediate output which is used as domain sentence representation.
8. The intelligent customer service data training model field migration method according to claim 1, wherein: training the initial network model by using all data sets to obtain a general model, specifically training the initial network model by using all data sets aiming at semantic classification tasks to obtain the general model; training an initial network model by using the data sets of each field to obtain a plurality of field models, specifically training the initial network model by using the data sets of each field for semantic classification tasks to obtain a plurality of field models; the data of all the data sets are labeled with semantic categories in advance.
9. The intelligent customer service data training model field migration method according to claim 1, wherein: the target data set belongs to any field data set, the data quantity is smaller than that of the field data set, and the data in the target data set are labeled with semantic categories in advance.
CN201911133457.9A 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method Active CN111221963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911133457.9A CN111221963B (en) 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911133457.9A CN111221963B (en) 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method

Publications (2)

Publication Number Publication Date
CN111221963A CN111221963A (en) 2020-06-02
CN111221963B true CN111221963B (en) 2023-05-12

Family

ID=70810181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911133457.9A Active CN111221963B (en) 2019-11-19 2019-11-19 Intelligent customer service data training model field migration method

Country Status (1)

Country Link
CN (1) CN111221963B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460455A (en) * 2018-02-01 2018-08-28 成都小多科技有限公司 Model treatment method and device
CN109711529A (en) * 2018-11-13 2019-05-03 中山大学 A kind of cross-cutting federal learning model and method based on value iterative network
CN110046248A (en) * 2019-03-08 2019-07-23 阿里巴巴集团控股有限公司 Model training method, file classification method and device for text analyzing
CN110309267A (en) * 2019-07-08 2019-10-08 哈尔滨工业大学 Semantic retrieving method and system based on pre-training model
CN110399492A (en) * 2019-07-22 2019-11-01 阿里巴巴集团控股有限公司 The training method and device of disaggregated model aiming at the problem that user's question sentence
CN110442684A (en) * 2019-08-14 2019-11-12 山东大学 A kind of class case recommended method based on content of text

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11222415B2 (en) * 2018-04-26 2022-01-11 The Regents Of The University Of California Systems and methods for deep learning microscopy

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108460455A (en) * 2018-02-01 2018-08-28 成都小多科技有限公司 Model treatment method and device
CN109711529A (en) * 2018-11-13 2019-05-03 中山大学 A kind of cross-cutting federal learning model and method based on value iterative network
CN110046248A (en) * 2019-03-08 2019-07-23 阿里巴巴集团控股有限公司 Model training method, file classification method and device for text analyzing
CN110309267A (en) * 2019-07-08 2019-10-08 哈尔滨工业大学 Semantic retrieving method and system based on pre-training model
CN110399492A (en) * 2019-07-22 2019-11-01 阿里巴巴集团控股有限公司 The training method and device of disaggregated model aiming at the problem that user's question sentence
CN110442684A (en) * 2019-08-14 2019-11-12 山东大学 A kind of class case recommended method based on content of text

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘文洁; 林磊; 孙承杰. 基于迁移学习的语义推理网络 [Semantic reasoning network based on transfer learning]. 智能计算机与应用 [Intelligent Computer and Applications], 2018, Vol. 8, No. 06, 195-198. *

Also Published As

Publication number Publication date
CN111221963A (en) 2020-06-02

Similar Documents

Publication Publication Date Title
Zou et al. Logistic regression model optimization and case analysis
CN108334891A (en) A kind of Task intent classifier method and device
CN111639679B (en) Small sample learning method based on multi-scale metric learning
CN111931513B (en) Text intention recognition method and device
JP2020537777A (en) Methods and devices for identifying the user's intent of speech
CN110348075A (en) A kind of grinding surface roughness prediction technique based on improvement algorithm of support vector machine
CN111353033B (en) Method and system for training text similarity model
US20220121934A1 (en) Identifying neural networks that generate disentangled representations
CN111666427A (en) Entity relationship joint extraction method, device, equipment and medium
WO2023137911A1 (en) Intention classification method and apparatus based on small-sample corpus, and computer device
CN114186084B (en) Online multi-mode Hash retrieval method, system, storage medium and equipment
CN110502757B (en) Natural language emotion analysis method
CN111046170A (en) Method and apparatus for outputting information
CN112906393A (en) Meta learning-based few-sample entity identification method
CN112529151A (en) Data processing method and device
CN109635294A (en) Based on single semantic unregistered word processing method, intelligent answer method and device
CN116308754A (en) Bank credit risk early warning system and method thereof
CN111221963B (en) Intelligent customer service data training model field migration method
Ferreira et al. Adversarial bandit for online interactive active learning of zero-shot spoken language understanding
CN107562714A (en) A kind of statement similarity computational methods and device
CN113722439B (en) Cross-domain emotion classification method and system based on antagonism class alignment network
CN115758145A (en) Model training method, text recognition method, electronic device and readable storage device
CN109740163A (en) Semantic expressiveness resource generation method and device applied to deep learning model
Riid et al. Interpretability of fuzzy systems and its application to process control
CN113139382A (en) Named entity identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant