WO2020073531A1 - Update training method, apparatus, and device for text classification model - Google Patents

Update training method, apparatus, and device for text classification model (Download PDF)

Info

Publication number
WO2020073531A1
WO2020073531A1 (PCT/CN2018/125250; CN2018125250W)
Authority
WO
WIPO (PCT)
Prior art keywords
classification
text
sample
sample text
label
Prior art date
Application number
PCT/CN2018/125250
Other languages
English (en)
French (fr)
Inventor
许开河
杨坤
王少军
肖京
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020073531A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Definitions

  • the present disclosure relates to the field of artificial intelligence technology, and in particular, to an update training method, device, and equipment for text classification models.
  • In the existing customer service robot question-and-answer system, after knowledge points related to new products or to hot issues are added to the knowledge base of the customer service robot, the text classification model needs to be retrained.
  • Retraining a text classification model takes a long time, so the text classification model is not updated in time, and the customer service robot cannot answer questions related to the new knowledge points.
  • the present disclosure provides an update training method and device for a text classification model.
  • In the update training method, the text classification model includes a semantic extraction layer and a classification layer. Before a knowledge point is added, training of the semantic extraction layer and the classification layer is completed according to sample data of original knowledge points. The update training method of the text classification model includes:
  • An update training apparatus for a text classification model, where the text classification model includes a semantic extraction layer and a classification layer. Before a knowledge point is added, training of the semantic extraction layer and the classification layer is completed according to sample data of original knowledge points. The update training apparatus of the text classification model includes:
  • the obtaining module is configured to obtain the sample text corresponding to the newly added knowledge point and the labeling tag for marking the sample text;
  • the feature vector construction module is configured to construct a feature vector of the sample text through the semantic extraction layer that completes training based on the sample data;
  • the update training module is configured to perform update training of the classification layer according to the feature vector of the sample text and the labeling tag corresponding to the sample text, so as to implement update training of the text classification model.
  • the feature vector building module includes:
  • the word segmentation unit is configured to: perform word segmentation on the sample text through the semantic extraction layer that completes training based on sample data of original knowledge points;
  • the feature vector construction unit is configured to construct a feature vector of the sample text according to the encoding corresponding to each word in the sample text and the semantic weight of each word.
  • the device further includes:
  • the classification label supplement module is configured to: supplement the classification label of the classification layer according to the labeling label corresponding to the sample text;
  • the classification label set update module is configured to update the classification label set of the classification layer according to the supplemented classification label.
  • the update training module includes:
  • the classification label prediction unit is configured to: use the classification layer to predict the classification label corresponding to the sample text according to the feature vector of the sample text;
  • the judgment unit is configured to: perform consistency judgment on the obtained classification label and the labeling label corresponding to the sample text;
  • the adjusting unit is configured to adjust the parameters of the classification layer until the obtained classification label is consistent with the labeling label if they are inconsistent.
  • the classification label prediction unit includes:
  • the probability prediction unit is configured to: use the classification layer to predict, according to the feature vector, the probability that the feature vector corresponds to each category label in the updated category label set;
  • the classification label determining unit is configured to: traverse the probability of each classification label, and use the classification label corresponding to the maximum probability value as the classification label corresponding to the sample text.
  • the device further includes:
  • a classification test module configured to classify several test samples through the updated text classification model;
  • a classification accuracy calculation module configured to: calculate, according to the classification result, the classification accuracy of the updated training text classification model on the several test samples;
  • the update training end module is configured to: if the classification accuracy reaches the specified accuracy, end the update training of the text classification model.
  • An update training device for a text classification model, including:
  • a memory for storing processor-executable instructions; and
  • a processor configured to perform the method described above.
  • Update training can greatly shorten the training time of the text classification model and realize the timely update of the text classification model.
  • customer service robots in the field of artificial intelligence technology can be used to respond to questions related to new knowledge points in a timely manner.
  • FIG. 1 is a schematic diagram of an implementation environment involved in this disclosure
  • Fig. 2 is a block diagram of a server according to an exemplary embodiment
  • Fig. 3 is a flowchart of a method for updating and training a text classification model according to an exemplary embodiment
  • FIG. 5 is a flowchart of steps before step S150 of the embodiment shown in FIG. 3;
  • FIG. 6 is a flowchart of step S150 of the embodiment shown in FIG. 3;
  • FIG. 7 is a flowchart of step S151 of the embodiment shown in FIG. 6;
  • FIG. 8 is a flowchart of steps after step S150 of the embodiment shown in FIG. 3;
  • Fig. 9 is a block diagram of an apparatus for updating and training a text classification model according to an exemplary embodiment
  • Fig. 10 is a block diagram of an update training device for a text classification model according to an exemplary embodiment.
  • FIG. 1 is a schematic diagram of an implementation environment involved in this disclosure.
  • the implementation environment includes: a server 200 and at least one terminal 100.
  • the terminal 100 may be a smart phone, a tablet computer, a notebook computer, a desktop computer, and other electronic devices that can establish a network connection with the server 200 and can run a client, which is not specifically limited herein.
  • a wireless or wired network connection is established between the terminal 100 and the server 200 in advance, so that the terminal 100 and the server 200 interact through the client running on the terminal 100.
  • the server 200 can obtain the sample text input by the user on the terminal 100, and then construct a feature vector of the sample text, perform classification prediction on the feature vector, and implement update training of the text classification model, etc. .
  • the terminal 100 may receive the classification label for the sample text returned by the server 200.
  • the text classification method of the present disclosure is not limited to deploying corresponding processing logic in the server 200, it may also be processing logic deployed in other machines.
  • processing logic for updating and training a text classification model is deployed in a terminal device with computing capabilities.
  • Fig. 2 is a block diagram of a server according to an exemplary embodiment.
  • the server with this hardware structure can be used to update the text classification model and be deployed in the implementation environment shown in FIG. 1.
  • The server is only an example adapted to the present disclosure, and it cannot be considered as limiting the scope of use of the present disclosure.
  • Nor can the server be interpreted as needing to depend on, or as having to include, one or more components of the exemplary server 200 shown in FIG. 2.
  • The server 200 includes: a power supply 210, an interface 230, at least one memory 250, and at least one central processing unit (CPU) 270.
  • the power supply 210 is used to provide an operating voltage for each hardware device on the server 200.
  • the interface 230 includes at least one wired or wireless network interface 231, at least one serial-parallel conversion interface 233, at least one input-output interface 235, and at least one USB interface 237, etc., for communicating with external devices, such as data transmission with the terminal 100.
  • the memory 250 may be a read-only memory, a random access memory, a magnetic disk, or an optical disk.
  • The resources stored on the memory 250 include an operating system 251, application programs 253, and data 255.
  • The operating system 251 is used to manage and control the hardware devices and application programs 253 on the server 200 to realize the calculation and processing of the massive data 255 by the central processor 270, and may be Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
  • The application program 253 is a computer program that completes at least one specific job based on the operating system 251, and may include at least one module (not shown in FIG. 2), and each module may separately contain a series of computer-readable instructions for the server 200.
  • the data 255 may be sample data stored in the magnetic disk and the like.
  • the central processor 270 may include one or more processors, and is configured to communicate with the memory 250 through a bus for computing and processing the massive data 255 in the memory 250.
  • the server 200 to which the present disclosure is applied will complete the update training of the text classification model by the central processor 270 reading a series of computer-readable instructions stored in the memory 250.
  • Alternatively, the server 200 may implement the update training method described below through one or more application-specific integrated circuits (ASICs), digital signal processors, digital signal processing devices, programmable logic devices, field-programmable gate arrays, controllers, microcontrollers, microprocessors, or other electronic components. Therefore, the implementation of the present disclosure is not limited to any specific hardware circuit, software, or a combination of both.
  • Fig. 3 is a flow chart of a method for updating and training a text classification model according to an exemplary embodiment.
  • the update training method of the text classification model may be executed by the server 200 of the implementation environment shown in FIG. 1.
  • the text classification model includes a semantic extraction layer and a classification layer.
  • the semantic extraction layer and the classification layer are trained according to the sample data of the original knowledge point.
  • the update training method includes the following steps:
  • Step S110 Obtain the sample text corresponding to the newly added knowledge point and the label label for labeling the sample text.
  • The text classification model of this application is constructed from a neural network. It may be built from a convolutional neural network (CNN), a recurrent neural network (RNN), or any other neural network capable of text classification, or from a combination of several types of neural networks, which is not specifically limited here.
  • After training of the semantic extraction layer and the classification layer is completed according to the sample data of the original knowledge points, the parameters of the semantic extraction layer and the classification layer are determined, so that the text classification model can classify questions related to the original knowledge points:
  • the semantic extraction layer can construct the feature vector of the text
  • the classification layer can classify the text based on the feature vector of the text.
  • the sample data of the original knowledge points constitute the database of the text classification model.
  • the sample data is different, and the databases of the corresponding text classification models are also different.
  • the new knowledge point may be a knowledge point not included in the database of the text classification model, or a knowledge point modified for the knowledge point in the original database, which is not specifically limited here.
  • the new knowledge point may be the developed new insurance business.
  • The sample text of the new knowledge point consists of questions related to the new insurance business, such as the insurance handling process, handling materials, handling conditions, claims process, and other related questions;
  • The new knowledge point can also be a change to the original insurance claims process, in which case the sample text of the new knowledge point consists of questions related to the changed claims process. Therefore, after the update training of the text classification model, the customer service robot can classify questions about the new knowledge points raised by the user, search for answers according to the classification results, and present the found answers to the user.
  • the labeling label of the sample text is a label obtained by manually classifying the sample text.
  • the labeling label can be obtained by manually labeling the sample text, and the labeled labeling label is saved.
  • Step S130 Construct a feature vector of the sample text through the semantic extraction layer that has completed training based on the sample data.
  • After the training, the parameters of the semantic extraction layer are determined, and in step S130 a feature vector of the sample text is constructed by the semantic extraction layer with those determined parameters.
  • the update training of the semantic extraction layer is not performed.
  • After the training performed before going online, the semantic extraction layer of the text classification model is already well able to construct feature vectors for text. Therefore, when knowledge points are added, the semantic extraction layer can still construct the feature vector of the sample text.
  • step S130 includes:
  • step S131 the sample text is segmented by the semantic extraction layer that completes the training based on the sample data of the original knowledge points.
  • Step S132 Construct a feature vector of the sample text according to the encoding corresponding to each word in the sample text and the semantic weight of each word.
  • the word segmentation of the sample text means that the sample text is divided into several sequentially arranged phrases.
  • Word segmentation can be performed using a word segmentation algorithm, for example, a word segmentation algorithm based on string matching, a word segmentation algorithm based on understanding, or a word segmentation algorithm based on statistics, etc., which is not specifically limited herein.
  • a database of the text classification model is constructed from the sample data.
  • The database includes a dictionary constructed from the sample data; the dictionary contains the encoding corresponding to each word in the sample data and the semantic weight corresponding to each word.
  • The semantic weight corresponding to a word characterizes the degree to which the word contributes to the semantics of the sample text. For example, for the text "办理平安车主卡的流程有哪些" ("What is the process for applying for a Ping An car-owner card"), the word segmentation obtained in step S131 is "办理/平安/车主卡/的/流程/有/哪些". The three words "的", "有", and "哪些" contribute little to the semantics of the text, so their semantic weights in the text are small, while the four words "办理", "平安", "车主卡", and "流程" contribute more to the semantics, so their semantic weights in the sample text are relatively large.
  • the coding and semantic weights corresponding to each word are determined after training.
  • the semantic extraction layer is trained based on the sample data of the original knowledge points.
  • The larger the amount of sample data, the more complete the dictionary, the more complete the encodings and semantic weights of the words in the dictionary, and thus the better the semantic extraction layer can construct feature vectors for text.
  • the feature vector of the sample text can be constructed according to the encoding corresponding to each word and the semantic weight corresponding to each word.
  • numbers are generally used to represent codes corresponding to words, and real numbers are used to represent weights corresponding to words, so that the constructed feature vector of the input text is a real number vector.
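The construction in steps S131-S132 can be sketched as follows. The dictionary entries, the example words, and the way a word's code is combined with its semantic weight (a simple product here) are all illustrative assumptions; the patent does not fix a concrete encoding scheme.

```python
# Hypothetical dictionary learned during the original training: each word maps
# to an integer code and a real-valued semantic weight (values are invented).
DICTIONARY = {
    "办理": (3, 0.9),     # "apply for": high semantic contribution
    "平安": (7, 0.8),     # "Ping An"
    "车主卡": (12, 0.95),  # "car-owner card"
    "的": (1, 0.05),      # function word: low semantic contribution
    "流程": (9, 0.85),    # "process"
    "有": (2, 0.05),
    "哪些": (4, 0.1),
}

def build_feature_vector(words):
    """Combine each word's code with its semantic weight into a real vector."""
    return [code * weight for code, weight in (DICTIONARY[w] for w in words)]

# Segmentation result assumed from step S131
segmented = ["办理", "平安", "车主卡", "的", "流程", "有", "哪些"]
vector = build_feature_vector(segmented)
```

The resulting vector is a real-number vector, one component per segmented word, matching the statement above that codes are numbers and weights are real numbers.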
  • Step S150 Perform update training of the classification layer according to the feature vectors of the sample text and the labeling tags corresponding to the sample text, so as to implement the update training of the text classification model.
  • the update training of the classification layer is to adjust the parameters of the classification layer during the update training process.
  • After the update training, the text classification model can output the classification labels corresponding to text related to the new knowledge points; that is, the update training of the text classification model is completed.
  • Because the semantic extraction layer's ability to construct feature vectors of text is already well developed, the original semantic extraction layer is used to construct the feature vector of the sample text without itself being update-trained, which can greatly shorten the update training time of the text classification model and achieve a timely update of the text classification model.
  • The sample data of a customer service robot's text classification model often amounts to hundreds of thousands of items; the sample data is large and training takes a long time, so the semantic extraction layer's ability to construct feature vectors of text is already very well developed. Therefore, when the text classification model needs to be updated, only the classification layer is update-trained and the semantic extraction layer is not, which greatly shortens the update training time of the text classification model while still ensuring the classification accuracy of the updated model on texts related to both the original and the newly added knowledge points.
  • Especially when the number of new knowledge points is small compared with the original knowledge points and update training of the text classification model is required, the technical solution of this application can update the text classification model in a timely manner while also guaranteeing its classification accuracy.
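The core idea above, keeping the semantic extraction layer frozen and adjusting only the classification-layer parameters, can be sketched as a minimal softmax classification layer trained on fixed feature vectors. The feature vectors, label indices, learning rate, and epoch count are all invented for illustration and do not come from the patent.

```python
import math

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

# Feature vectors produced by the (frozen) semantic extraction layer,
# paired with labeling labels from the updated label set (illustrative).
samples = [([1.0, 0.1], 0), ([0.9, 0.2], 0), ([0.1, 1.0], 1), ([0.2, 0.8], 1)]

# Classification-layer parameters: one weight row and one bias per label.
# Only these are adjusted during update training; the extractor is untouched.
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

def train(epochs=200, lr=0.5):
    for _ in range(epochs):
        for x, y in samples:
            probs = softmax([sum(wi * xi for wi, xi in zip(W[k], x)) + b[k]
                             for k in range(len(W))])
            for k in range(len(W)):
                # Cross-entropy gradient with respect to logit k
                grad = probs[k] - (1.0 if k == y else 0.0)
                for i in range(len(x)):
                    W[k][i] -= lr * grad * x[i]
                b[k] -= lr * grad

def predict(x):
    scores = [sum(wi * xi for wi, xi in zip(W[k], x)) + b[k]
              for k in range(len(W))]
    return scores.index(max(scores))

train()
```

Because only `W` and `b` are updated, each training step is cheap compared with backpropagating through a full neural semantic extraction layer, which is the shortening-of-training-time argument made above.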
  • the method further includes:
  • Step S010 supplementing the classification label of the classification layer according to the labeling label corresponding to the sample text.
  • Step S030 Update the classification label set of the classification layer according to the supplemented classification labels.
  • the classification label set includes all the classification labels that can be output by the classification layer.
  • a label label corresponds to a classification label of the classification layer.
  • Because the original knowledge points do not include the sample text of the new knowledge points, the classification layer naturally cannot correctly classify the sample text of the new knowledge points.
  • After the classification labels of the classification layer are supplemented according to the labeling labels corresponding to the sample text and the classification label set of the classification layer is updated, the classification label of the sample text can be determined from the updated classification label set when update training of the classification layer is performed according to the sample text.
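Steps S010 and S030 can be sketched as follows, assuming the classification layer keeps one weight row and one bias per classification label; the label names, feature dimension, and parameter values are hypothetical.

```python
# Existing classification-layer state before a knowledge point is added
# (all names and numbers are illustrative).
label_set = ["claims_process", "card_application"]
weights = {"claims_process": [0.3, -0.1], "card_application": [-0.2, 0.4]}
biases = {"claims_process": 0.0, "card_application": 0.1}

def supplement_labels(annotation_labels, feature_dim=2):
    """Add a classification label (with freshly initialised parameters) for
    every labeling label that is not yet in the classification label set."""
    for label in annotation_labels:
        if label not in label_set:
            label_set.append(label)
            weights[label] = [0.0] * feature_dim  # new, untrained row
            biases[label] = 0.0

# Labeling labels of sample texts for a newly added knowledge point
supplement_labels(["new_insurance_faq", "claims_process"])
```

Only genuinely new labeling labels extend the set; labels already present ("claims_process" here) are left untouched, so the subsequent update training can select the sample text's label from the updated set.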
  • step S150 includes:
  • Step S151 Use the classification layer to predict the classification label corresponding to the sample text according to the feature vector of the sample text.
  • Step S152 Perform consistency judgment between the obtained classification label and the label label corresponding to the sample text.
  • Step S153 If they are inconsistent, adjust the parameters of the classification layer until the obtained classification label is consistent with the labeling label.
  • the training of the text classification model is to adjust the parameters of the text classification model during the training process so that the classification label output by the text classification model is consistent with the label label manually annotated. If the two are consistent, there is no need to adjust the parameters of the text classification model. If they are not consistent, the parameters of the text classification model are adjusted until the two are consistent. In the technical solution of the present application, when the training is updated by the text classification model, the parameters of the classification layer are adjusted so that the classification label of the sample text is consistent with the labeling label.
  • the next sample text is used to update the text classification model.
  • In the training before going online, both the semantic extraction layer and the classification layer are trained; that is, during that training process, if the classification label of a sample text output by the classification layer is inconsistent with the labeling label of the sample text, the parameters of both the semantic extraction layer and the classification layer are adjusted until they are consistent.
  • the neural network structure of the semantic extraction layer is more complicated, the calculation process is more complicated, and the amount of calculation is larger.
  • After each parameter adjustment, the semantic extraction layer needs to reconstruct the feature vector of the text according to the adjusted parameters, so training the whole text classification model takes a long time.
  • step S151 includes:
  • Step S210 Use the classification layer to predict, according to the feature vector, the probability that the feature vector corresponds to each classification label in the updated classification label set.
  • Step S230 traverse the probability of each classification label, and use the classification label corresponding to the maximum probability value as the classification label corresponding to the sample text.
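Steps S210 and S230 reduce to traversing the predicted probabilities over the updated label set and taking the label with the maximum probability; the probability values and label names below are hypothetical.

```python
# Hypothetical probabilities output by the classification layer for one
# sample text, over the updated classification label set (step S210).
probabilities = {
    "claims_process": 0.12,
    "card_application": 0.08,
    "new_insurance_faq": 0.80,
}

# Step S230: traverse the probability of each classification label and take
# the label with the maximum probability as the label of the sample text.
predicted_label = max(probabilities, key=probabilities.get)
```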
  • step S150 the method further includes:
  • Step S171 classify several test samples through the updated text classification model.
  • Step S172 The classification accuracy of the update-trained text classification model on the several test samples is calculated according to the classification results.
  • step S173 if the classification accuracy reaches the specified accuracy, the update training of the text classification model is ended.
  • Steps S171-S173 are used to test the classification accuracy of the updated text classification model.
  • The test samples may include text related to the original knowledge points and/or text related to the newly added knowledge points (preferably both), and each test sample is labeled.
  • The classification label output by the text classification model for each test sample is compared with the labeling label of that test sample; if the two are consistent, the classification is considered accurate, and if they are inconsistent, the classification is considered incorrect.
  • The ratio of the number of accurately classified test samples to the total number of test samples is the classification accuracy of the updated text classification model on the several test samples.
  • If the classification accuracy reaches the specified accuracy, the update training of the text classification model is ended; if it does not, steps S110, S130, and S150 are repeated to continue the update training of the text classification model.
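Steps S171 to S173 can be sketched as follows; the specified accuracy threshold and the sample predictions are assumptions made for illustration.

```python
def classification_accuracy(predictions, annotations):
    """Ratio of accurately classified test samples to total test samples."""
    correct = sum(1 for p, a in zip(predictions, annotations) if p == a)
    return correct / len(annotations)

SPECIFIED_ACCURACY = 0.9  # assumed threshold; the patent leaves it open

# Classification labels output by the updated model for several labeled
# test samples, and the corresponding labeling labels (illustrative).
preds = ["a", "b", "b", "a", "c", "c", "a", "b", "c", "a"]
marks = ["a", "b", "b", "a", "c", "c", "a", "b", "c", "b"]

acc = classification_accuracy(preds, marks)
# Step S173: end update training only once the accuracy is reached;
# otherwise steps S110, S130, and S150 would be repeated.
training_done = acc >= SPECIFIED_ACCURACY
```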
  • The following is an apparatus embodiment of the present disclosure, which can be used to execute the embodiments of the update training method for a text classification model performed by the server 200 of the present disclosure.
  • For details not disclosed in the apparatus embodiment, please refer to the embodiments of the update training method for a text classification model of the present disclosure.
  • Fig. 9 is a block diagram of a device for updating and training a text classification model according to an exemplary embodiment.
  • The update training device for a text classification model may be used in the server 200 of the implementation environment shown in Fig. 1 to execute all or part of the update training method for a text classification model in any of the above embodiments.
  • The update training device of the text classification model includes, but is not limited to: an acquisition module 110, a feature vector construction module 130, and an update training module 150. The text classification model includes a semantic extraction layer and a classification layer; before a knowledge point is added, training of the semantic extraction layer and the classification layer is completed according to the sample data of the original knowledge points.
  • the device includes:
  • the obtaining module 110 is configured to obtain the sample text corresponding to the newly added knowledge point and the labeling tag for labeling the sample text.
  • the feature vector construction module 130 which is connected to the acquisition module 110, is configured to construct a feature vector of the sample text through a semantic extraction layer that completes training based on the sample data.
  • the update training module 150 which is connected to the feature vector construction module 130, is configured to perform update training of the classification layer according to the feature vector of the sample text and the label tags corresponding to the sample text, so as to implement the update training of the text classification model.
  • the feature vector construction module 130 includes:
  • the word segmentation unit is configured to segment the sample text through a semantic extraction layer that completes training based on the sample data of the original knowledge points.
  • the feature vector construction unit is configured to construct a feature vector of the sample text according to the encoding corresponding to each word in the sample text and the semantic weight of each word.
  • the device for updating and training the text classification model further includes:
  • the classification label supplement module is configured to supplement the classification label of the classification layer according to the labeling label corresponding to the sample text.
  • the classification label set update module is configured to update the classification label set of the classification layer according to the supplemented classification label.
  • the update training module 150 includes:
  • the classification label prediction unit is configured to use the classification layer to predict the classification label corresponding to the sample text according to the feature vector of the sample text, and the updated classification label set includes the classification label corresponding to the sample text.
  • the judging unit is configured to judge the consistency of the obtained classification label and the labeling label corresponding to the sample text.
  • the adjustment unit is configured to adjust the parameters of the classification layer until the obtained classification labels are consistent with the labeling labels if they are inconsistent.
  • the classification label prediction unit includes:
  • The probability prediction unit is configured to use the classification layer to predict, according to the feature vector, the probability that the feature vector corresponds to each classification label in the updated classification label set.
  • the classification label determining unit is configured to: traverse the probability of each classification label, and use the classification label corresponding to the maximum probability value as the classification label corresponding to the sample text.
  • the device for updating and training the text classification model further includes:
  • the classification test module is configured to classify several test samples through the updated text classification model.
  • the classification accuracy calculation module is configured to calculate, according to the classification result, the classification accuracy of the updated training text classification model for the several test samples.
  • The update training end module is configured to end the update training of the text classification model if the classification accuracy reaches the specified accuracy.
  • modules / units can be implemented by hardware, software, or a combination of both.
  • these modules may be implemented as one or more hardware modules, such as one or more application specific integrated circuits.
  • these modules may be implemented as one or more computer programs executed on one or more processors, such as the programs stored in the memory 250 executed by the central processor 270 of FIG. 2.
  • the present disclosure also provides an update training device for a text classification model.
  • The update training device may be the server 200 in the implementation environment shown in FIG. 1, executing all or part of the steps of the update training method for a text classification model in any of the above embodiments.
  • the update training device of the text classification model includes:
  • the processor 1001 is configured to perform all or some of the steps in any embodiment of the above update training method for the text classification model.
  • the executable instructions may be computer-readable instructions.
  • when executing, the processor 1001 reads the computer-readable instructions from the memory 1002 via the communication bus/data line 1003.
  • a computer-readable storage medium is also provided; it may be, for example, a transitory or non-transitory computer-readable storage medium including instructions.
  • the storage medium may be the memory 250 including instructions that can be executed by the central processing unit 270 of the server 200 to complete the update training method of the text classification model.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An update training method, apparatus, and device for a text classification model. The text classification model comprises a semantic extraction layer and a classification layer; before new knowledge points are added, both layers are trained on sample data of the original knowledge points. The method comprises: obtaining sample text corresponding to the newly added knowledge points and annotation labels annotating the sample text (S110); constructing feature vectors of the sample text through the semantic extraction layer already trained on the sample data (S130); and performing update training of the classification layer according to the feature vectors of the sample text and the corresponding annotation labels, thereby achieving update training of the text classification model (S150). When the text classification model needs update training, only the classification layer is retrained, which greatly shortens the update-training time and allows the model to be updated promptly.

Description

Update training method, apparatus, and device for a text classification model
Technical Field
This application claims priority to Chinese patent application CN201811192187.4, filed on October 12, 2018 and entitled "Update training method, apparatus, and device for a text classification model", the entire contents of which are incorporated herein by reference.
The present disclosure relates to the field of artificial intelligence technology, and in particular to an update training method, apparatus, and device for a text classification model.
Background
In existing customer-service robot question-answering systems, after knowledge points for a new product or a trending issue are added to the robot's knowledge base, the text classification model must be retrained. Retraining a text classification model generally takes a long time, so the model is not updated promptly and the robot cannot answer questions about the new knowledge points.
The problem that long training times prevent prompt updating of the text classification model therefore remains to be solved.
Technical Problem
To solve the problems in the related art, the present disclosure provides an update training method and apparatus for a text classification model.
Technical Solution
An update training method for a text classification model, the model comprising a semantic extraction layer and a classification layer, the two layers having been trained on sample data of the original knowledge points before new knowledge points are added, the method comprising:
obtaining sample text corresponding to the newly added knowledge points and annotation labels annotating the sample text;
constructing feature vectors of the sample text through the semantic extraction layer trained on the sample data;
performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model.
An update training apparatus for a text classification model, the model comprising a semantic extraction layer and a classification layer, the two layers having been trained on sample data of the original knowledge points before new knowledge points are added, the apparatus comprising:
an acquisition module configured to obtain sample text corresponding to the newly added knowledge points and annotation labels annotating the sample text;
a feature vector construction module configured to construct feature vectors of the sample text through the semantic extraction layer trained on the sample data;
an update training module configured to perform update training of the classification layer according to the feature vectors of the sample text and the corresponding annotation labels, thereby achieving update training of the text classification model.
In an embodiment, the feature vector construction module includes:
a word segmentation unit configured to segment the sample text through the semantic extraction layer trained on the sample data of the original knowledge points;
a feature vector construction unit configured to construct the feature vector of the sample text from the code corresponding to each word in the sample text and the semantic weight of each word.
In an embodiment, the apparatus further includes:
a classification label supplementing module configured to supplement the classification labels of the classification layer according to the annotation labels corresponding to the sample text;
a classification label set updating module configured to update the classification label set of the classification layer according to the supplemented classification labels.
In an embodiment, the update training module includes:
a classification label prediction unit configured to use the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text;
a judgment unit configured to judge whether the obtained classification label is consistent with the annotation label corresponding to the sample text;
an adjustment unit configured to adjust, if they are inconsistent, the parameters of the classification layer until the obtained classification label is consistent with the annotation label.
In an embodiment, the classification label prediction unit includes:
a probability prediction unit configured to use the classification layer to predict, from the feature vector, the probability of the feature vector corresponding to each classification label in the updated classification label set;
a classification label determining unit configured to traverse the probability of each classification label and take the classification label with the maximum probability value as the classification label corresponding to the sample text.
In an embodiment, the apparatus further includes:
a classification test module configured to classify a number of test samples through the updated text classification model;
a classification accuracy calculation module configured to calculate, from the classification results, the classification accuracy of the update-trained text classification model on the test samples;
an update training ending module configured to end the update training of the text classification model if the classification accuracy reaches the specified accuracy.
An update training device for a text classification model, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to perform the method described above.
A computer-readable storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing the method described above.
Beneficial Effects
With the technical solution of this application, on the basis of a text classification model already trained on the sample data of the original knowledge points, only the classification layer is update-trained when the model needs update training. This greatly shortens the update-training time of the text classification model and allows it to be updated promptly, so that customer-service robots and other applications in the artificial intelligence field can be used in time to answer questions about the newly added knowledge points.
It should be understood that the general description above and the detailed description below are merely exemplary and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated into and form part of this specification, illustrate embodiments consistent with this application and, together with the specification, serve to explain the principles of this application.
FIG. 1 is a schematic diagram of an implementation environment according to the present disclosure;
FIG. 2 is a block diagram of a server according to an exemplary embodiment;
FIG. 3 is a flowchart of an update training method for a text classification model according to an exemplary embodiment;
FIG. 4 is a flowchart of step S130 of the embodiment shown in FIG. 3;
FIG. 5 is a flowchart of steps preceding step S150 of the embodiment shown in FIG. 3;
FIG. 6 is a flowchart of step S150 of the embodiment shown in FIG. 3;
FIG. 7 is a flowchart of step S151 of the embodiment shown in FIG. 6;
FIG. 8 is a flowchart of steps following step S150 of the embodiment shown in FIG. 3;
FIG. 9 is a block diagram of an update training apparatus for a text classification model according to an exemplary embodiment;
FIG. 10 is a block diagram of an update training device for a text classification model according to an exemplary embodiment.
本发明的实施方式
这里将详细地对示例性实施例执行说明,其示例表示在附图中。下面的描述涉及附图时,除非另有表示,不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本申请相一致的所有实施方式。相反,它们仅是与如所附权利要求书中所详述的、本申请的一些方面相一致的装置和方法的例子。
图1是根据本公开所涉及的实施环境的示意图。该实施环境包括:服务器200和至少一个终端100。
其中终端100可以是智能手机、平板电脑、笔记本电脑、台式电脑等可以与服务器200建立网络连接且可以运行客户端的电子设备,在此不进行具体限定。终端100与服务器200之间预先建立了无线或者有线的网络连接,从而,通过在终端100上运行的客户端实现终端100与服务器200进行交互。
基于服务器200与终端100之间的交互,服务器200便可以获取到用户在终端100上输入的样本文本,然后构建该样本文本的特征向量、对特征向量进行分类预测实现文本分类模型的更新训练等。终端100可以接收服务器200所返回的针对样本文本的分类标签。
应当说明的是,本公开文本分类方法,不限于在服务器200中部署相应的处理逻辑,其也可以是部署于其它机器中的处理逻辑。例如,在具备计算能力的终端设备中部署进行文本分类模型的更新训练的处理逻辑等。
FIG. 2 is a block diagram of a server according to an exemplary embodiment. A server with this hardware structure can be deployed in the implementation environment shown in FIG. 1 to perform update training of the text classification model.
It should be noted that this server is merely an example adapted to the present disclosure and must not be regarded as limiting its scope of use in any way. Nor should the server be interpreted as needing to depend on, or necessarily having, one or more components of the exemplary server 200 shown in FIG. 2.
The hardware structure of the server may vary considerably depending on configuration or performance. As shown in FIG. 2, the server 200 includes a power supply 210, interfaces 230, at least one memory 250, and at least one central processing unit (CPU) 270.
The power supply 210 provides the operating voltage for the hardware devices on the server 200.
The interfaces 230 include at least one wired or wireless network interface 231, at least one serial-parallel conversion interface 233, at least one input-output interface 235, and at least one USB interface 237, for communicating with external devices, for example for data transmission with the terminal 100.
The memory 250, as a carrier of resource storage, may be a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like. The resources stored on it include an operating system 251, application programs 253, and data 255, stored transiently or persistently. The operating system 251 manages and controls the hardware devices and application programs 253 on the server 200 so that the central processing unit 270 can compute and process the massive data 255; it may be Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like. An application program 253 is a computer program that performs at least one specific task on top of the operating system 251; it may include at least one module (not shown in FIG. 2), each of which may contain a series of computer-readable instructions for the server 200. The data 255 may be, for example, sample data stored on disk.
The central processing unit 270 may include one or more processors and is arranged to communicate with the memory 250 via a bus for computing and processing the massive data 255 in the memory 250.
As described in detail above, a server 200 to which the present disclosure applies completes update training of the text classification model by having the central processing unit 270 read a series of computer-readable instructions stored in the memory 250.
In an exemplary embodiment, the server 200 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors, digital signal processing devices, programmable logic devices, field-programmable gate arrays, controllers, microcontrollers, microprocessors, or other electronic components for performing the text classification method described below. Implementations of the present disclosure are therefore not limited to any particular hardware circuit, software, or combination of the two.
FIG. 3 is a flowchart of an update training method for a text classification model according to an exemplary embodiment. The method may be executed by the server 200 of the implementation environment shown in FIG. 1. In the embodiment shown in FIG. 3, the text classification model includes a semantic extraction layer and a classification layer; before new knowledge points are added, both layers are trained on sample data of the original knowledge points. The update training method includes the following steps:
Step S110: obtain sample text corresponding to the newly added knowledge points and the annotation labels annotating the sample text.
The text classification model of this application is built with neural networks. It may be built with any neural network capable of text classification, such as a convolutional neural network (CNN) or a recurrent neural network (RNN), or with a combination of several types of neural networks, which is not specifically limited here.
After the semantic extraction layer and the classification layer are trained on the sample data of the original knowledge points, their parameters are determined, so the text classification model can classify questions related to the original knowledge points: the semantic extraction layer can construct feature vectors of text, and the classification layer can classify text based on those feature vectors.
Once the text classification model has been trained on the sample data of the original knowledge points, that sample data constitutes the model's database. For text classification models in different application scenarios, the sample data differs, and so do the corresponding databases.
A newly added knowledge point may be a knowledge point not contained in the model's database, or a modification of a knowledge point in the original database, which is not specifically limited here. For newly added knowledge points, update training of the text classification model uses the sample text of the new knowledge points and the annotation labels annotating that sample text.
For example, for a text classification model in a customer-service robot for the insurance field, a newly added knowledge point may be a newly developed insurance product; the corresponding sample text then consists of questions related to that product, such as its application process, required materials, eligibility conditions, and claims process. A newly added knowledge point may also be a change to an existing claims process, with the corresponding sample text being questions about the changed process. After update training, the customer-service robot can classify the questions users ask about the new knowledge points, search for answers according to the classification results, and present the found answers to the users.
The annotation label of a sample text is the label obtained by manually classifying that sample text. In a specific embodiment, annotation labels can be obtained by manual annotation of the sample text and then saved.
Step S130: construct feature vectors of the sample text through the semantic extraction layer trained on the sample data.
As described above, once the semantic extraction layer has been trained on the sample data, its parameters are fixed; in step S130 the feature vector of the sample text is constructed by the semantic extraction layer with those fixed parameters. In the subsequent steps the parameters of the semantic extraction layer no longer need adjusting, that is, the semantic extraction layer undergoes no update training.
This holds especially for customer-service robots in the artificial intelligence field: because a large amount of sample data is used to train the robot's text classification model before it goes into online service, the ability of the semantic extraction layer to construct feature vectors of text is well developed. Therefore, after the pre-service training, the semantic extraction layer can also construct feature vectors for text of newly added knowledge points.
In an exemplary embodiment, as shown in FIG. 4, step S130 includes:
Step S131: segment the sample text into words through the semantic extraction layer trained on the sample data of the original knowledge points.
Step S132: construct the feature vector of the sample text from the code corresponding to each word in the sample text and each word's semantic weight.
Segmenting the sample text means splitting it into a number of sequentially ordered words. Segmentation may use a word segmentation algorithm, such as one based on string matching, on understanding, or on statistics, which is not specifically limited here.
After the text classification model is trained on the sample data of the original knowledge points, the sample data forms the model's database, which includes a dictionary built from the sample data. The dictionary contains the code corresponding to each word appearing in the sample data, together with each word's semantic weight.
A word's semantic weight characterizes how much the word contributes to the meaning of the sample text. For example, for the text "办理平安车主卡的流程有哪些" ("What is the process for applying for a Ping An car-owner card?"), the segmentation result of step S131 is "办理^平安^车主卡^的^流程^有^哪些". The words "的", "有", and "哪些" contribute little to the meaning of this text, so their semantic weights in this text are smaller, while "办理", "平安", "车主卡", and "流程" contribute more and therefore carry relatively larger semantic weights. Each word's code and semantic weight are determined after training, in this application after the semantic extraction layer is trained on the sample data of the original knowledge points. The larger the amount of sample data, the more complete the dictionary with its codes and semantic weights, and hence the better the semantic extraction layer can construct feature vectors of text.
After segmentation, the feature vector of the sample text can be constructed from each word's code and semantic weight. In a specific embodiment, a word's code is generally represented by a number and its weight by a real number, so the constructed feature vector of the input text is a vector of real numbers.
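The construction in steps S131 and S132 can be illustrated with a minimal Python sketch. The dictionary entries (word to code and semantic weight) and the way code and weight are fused into one number are assumptions for illustration only; the patent states only that codes are numbers and weights are real numbers determined by training.

```python
# Hypothetical dictionary built during training on the original knowledge
# points: word -> (code, semantic weight).  All values are made up.
DICTIONARY = {
    "办理": (101, 0.90), "平安": (102, 0.80), "车主卡": (103, 0.95),
    "的": (104, 0.10), "流程": (105, 0.85), "有": (106, 0.10),
    "哪些": (107, 0.15),
}

def build_feature_vector(tokens):
    """Fuse each word's code with its semantic weight into one real number,
    yielding a real-number vector for the whole sample text."""
    vector = []
    for word in tokens:
        code, weight = DICTIONARY.get(word, (0, 0.0))  # 0 marks out-of-vocabulary
        vector.append(code * weight)  # one simple (assumed) fusion rule
    return vector

# Segmentation result of step S131 for "办理平安车主卡的流程有哪些"
tokens = ["办理", "平安", "车主卡", "的", "流程", "有", "哪些"]
features = build_feature_vector(tokens)
```

High-weight content words dominate the resulting vector, while function words contribute little, matching the weighting intuition in the example above.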
Step S150: perform update training of the classification layer according to the feature vectors of the sample text and the corresponding annotation labels, thereby achieving update training of the text classification model.
Update training of the classification layer means adjusting the parameters of the classification layer during the update-training process. After the classification layer is update-trained on the feature vectors of the sample text and the corresponding annotation labels, the text classification model can output the classification label for text related to the newly added knowledge points; that is, update training of the text classification model is achieved.
With the technical solution of this application, on the basis of a text classification model fully trained on the sample data of the original knowledge points, the semantic extraction layer's ability to construct feature vectors of text is well developed. When the model needs update training, only the classification layer is update-trained; the original semantic extraction layer constructs the feature vectors of the sample text without itself being retrained. This greatly shortens the update-training time and allows the text classification model to be updated promptly.
This matters especially for customer-service robots in the artificial intelligence field. Before a robot goes into online service, its text classification model is often trained on hundreds of thousands of samples; with that much data the training takes a long time, and the semantic extraction layer's ability to construct feature vectors becomes very well developed. When the model then needs update training, only the classification layer is retrained, which greatly shortens the update-training time while preserving the model's classification accuracy on text related to both the original and the newly added knowledge points. In particular, when the newly added knowledge points are few relative to the original ones yet the model must be update-trained, this technical solution enables prompt updating while maintaining classification accuracy.
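The division of labor described above (frozen semantic extraction layer, trainable classification layer) can be sketched as a toy example. Everything here is illustrative rather than the patent's implementation: the fixed 2-dimensional feature vectors stand in for the frozen semantic layer's outputs, and the classification layer is assumed to be a linear softmax layer trained by stochastic gradient descent. Only `W` and `b` are ever updated.

```python
import math

# Outputs of the frozen semantic extraction layer (illustrative 2-dim features).
features = [(1.0, 0.2), (0.9, 0.1), (0.2, 1.0), (0.1, 0.9), (1.1, 0.0), (0.0, 1.1)]
labels = [0, 0, 1, 1, 0, 1]  # annotation labels for the six sample texts

# Trainable classification layer: weights and biases for two classes.
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]

def predict_probs(x):
    """Softmax over the classification layer's scores for one feature vector."""
    scores = [W[c][0] * x[0] + W[c][1] * x[1] + b[c] for c in (0, 1)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mean_loss():
    """Mean cross-entropy of the current classification layer on the samples."""
    return -sum(math.log(predict_probs(x)[y])
                for x, y in zip(features, labels)) / len(labels)

loss_before = mean_loss()
for _ in range(100):  # update ONLY the classification layer's W and b
    for x, y in zip(features, labels):
        p = predict_probs(x)
        for c in (0, 1):
            g = p[c] - (1.0 if c == y else 0.0)  # softmax cross-entropy gradient
            W[c][0] -= 0.1 * g * x[0]
            W[c][1] -= 0.1 * g * x[1]
            b[c] -= 0.1 * g
loss_after = mean_loss()

predictions = [max((0, 1), key=lambda c: predict_probs(x)[c]) for x in features]
```

On clearly separable toy data like this, a few dozen SGD epochs over the classification layer alone suffice, which is the intuition behind the shortened update-training time: the expensive semantic layer is never touched.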
In an exemplary embodiment, as shown in FIG. 5, the method further includes, before step S150:
Step S010: supplement the classification labels of the classification layer according to the annotation labels corresponding to the sample text.
Step S030: update the classification label set of the classification layer according to the supplemented classification labels.
The classification label set contains all classification labels the classification layer can output. Each annotation label corresponds to one classification label of the classification layer. When knowledge points are added, the original knowledge points do not include the sample text of the new ones, so that sample text cannot yet be classified correctly. After the classification labels of the classification layer are supplemented according to the annotation labels of the sample text and the label set is updated accordingly, the classification label of the sample text can be determined from the updated label set during update training of the classification layer.
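Steps S010 and S030 amount to growing the set of labels the classification layer may output. A minimal sketch follows; the label names are hypothetical:

```python
def supplement_labels(label_set, annotation_labels):
    """Append every annotation label the classification layer cannot yet
    output, preserving the order of the original label set (steps S010/S030)."""
    for label in annotation_labels:
        if label not in label_set:
            label_set.append(label)
    return label_set

original_labels = ["claims_process", "application_materials"]
# Annotation labels of the newly added knowledge points; one already exists,
# so only the genuinely new label is appended.
updated_labels = supplement_labels(original_labels, ["new_card_faq", "claims_process"])
```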
In an exemplary embodiment, as shown in FIG. 6, step S150 includes:
Step S151: use the classification layer to predict the classification label of the sample text from its feature vector.
Step S152: judge whether the obtained classification label is consistent with the annotation label of the sample text.
Step S153: if they are inconsistent, adjust the parameters of the classification layer until the obtained classification label is consistent with the annotation label.
Training a text classification model means adjusting its parameters during training so that the classification label it outputs matches the manually assigned annotation label. If the two match, the model's parameters need no adjustment; if not, the parameters are adjusted until they do. In the technical solution of this application, during update training only the parameters of the classification layer are adjusted to make the classification label of the sample text match its annotation label.
In a specific embodiment, if the classification label obtained through the classification layer is consistent with the annotation label of the sample text, the next sample text is used for update training of the model.
In the prior art, both initial training and update training of a text classification model train the semantic extraction layer and the classification layer together: during training, if the classification label output for a sample text differs from its annotation label, the parameters of both layers are adjusted until the two agree.
Because the neural network structure of the semantic extraction layer is more complex, its computation more involved, and its computational load larger, the semantic extraction layer must recompute the feature vectors of the text after its parameters are adjusted, so training the whole text classification model takes a long time.
In this application, by contrast, only the parameters of the classification layer are adjusted, which amounts to update-training only the classification layer and thus greatly shortens the update-training time of the text classification model.
In actual tests on four datasets (ag_news, Dbpedia, Yahoo!Answer, and the Ping An Bank FAQ knowledge base), experimental comparison showed that the update training method of this application reduces training time to one tenth of the time needed to retrain the full text classification model.
In an exemplary embodiment, as shown in FIG. 7, step S151 includes:
Step S210: use the classification layer to predict, from the feature vector, the probability of the feature vector corresponding to each classification label in the updated classification label set.
Step S230: traverse the probability of each classification label and take the classification label with the maximum probability value as the classification label of the sample text.
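Steps S210 and S230 describe a softmax-then-argmax rule: turn the classifier's scores into one probability per label in the updated label set, then take the label with the maximum probability. A small sketch with made-up scores and label names:

```python
import math

label_set = ["claims_process", "application_materials", "new_card_faq"]
scores = [1.2, 3.4, 0.5]  # illustrative raw classification-layer scores

# Step S210: probability of the feature vector for each label (softmax,
# shifted by the maximum score for numerical stability).
m = max(scores)
exps = [math.exp(s - m) for s in scores]
total = sum(exps)
probs = [e / total for e in exps]

# Step S230: traverse the probabilities; the maximum decides the label.
predicted_label = label_set[probs.index(max(probs))]
```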
In an exemplary embodiment, as shown in FIG. 8, the method further includes, after step S150:
Step S171: classify a number of test samples through the updated text classification model.
Step S172: calculate, from the classification results, the classification accuracy of the update-trained text classification model on the test samples.
Step S173: if the classification accuracy reaches the specified accuracy, end the update training of the text classification model.
Steps S171 to S173 test the classification accuracy of the update-trained model. The test samples may include text related to the original knowledge points and/or text related to the newly added knowledge points, preferably both, and the test samples are annotated. In step S172, the classification label output by the model for each test sample is compared with that sample's annotation: if they match, the classification counts as correct; otherwise as wrong. The proportion of correctly classified test samples among all test samples is the classification accuracy of the updated model on those samples.
If the classification accuracy reaches the specified accuracy, the update training ends; otherwise, steps S110, S130, and S150 are repeated to continue the update training of the text classification model.
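The test loop of steps S171 to S173 reduces to comparing predicted labels with test-sample annotations and stopping once the ratio of correct predictions reaches the specified accuracy. All values below are illustrative:

```python
predictions = ["A", "B", "B", "C", "A"]  # labels output by the updated model
annotations = ["A", "B", "C", "C", "A"]  # manual annotations of the test samples

# Step S172: accuracy = correctly classified samples / all test samples.
correct = sum(p == a for p, a in zip(predictions, annotations))
accuracy = correct / len(annotations)

# Step S173: stop update training once the specified accuracy is reached;
# otherwise steps S110, S130, and S150 would be repeated.
SPECIFIED_ACCURACY = 0.75  # hypothetical threshold
update_training_finished = accuracy >= SPECIFIED_ACCURACY
```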
The following are apparatus embodiments of the present disclosure, which may be used to execute the above embodiments of the update training method performed by the server 200. For details not disclosed in the apparatus embodiments, refer to the method embodiments above.
FIG. 9 is a block diagram of an update training apparatus for a text classification model according to an exemplary embodiment. The apparatus may be used in the server 200 of the implementation environment shown in FIG. 1 to execute all or some of the steps of the update training method in any of the embodiments above. As shown in FIG. 9, the apparatus includes, but is not limited to, an acquisition module 110, a feature vector construction module 130, and an update training module 150. The text classification model includes a semantic extraction layer and a classification layer; before new knowledge points are added, both layers are trained on sample data of the original knowledge points. The apparatus includes:
the acquisition module 110, configured to obtain sample text corresponding to the newly added knowledge points and the annotation labels annotating the sample text;
the feature vector construction module 130, connected to the acquisition module 110 and configured to construct the feature vectors of the sample text through the semantic extraction layer trained on the sample data;
the update training module 150, connected to the feature vector construction module 130 and configured to perform update training of the classification layer according to the feature vectors of the sample text and the corresponding annotation labels, thereby achieving update training of the text classification model.
In an embodiment, the feature vector construction module 130 includes:
a word segmentation unit configured to segment the sample text through the semantic extraction layer trained on the sample data of the original knowledge points;
a feature vector construction unit configured to construct the feature vector of the sample text from the code corresponding to each word in the sample text and each word's semantic weight.
In an embodiment, the update training apparatus further includes:
a classification label supplementing module configured to supplement the classification labels of the classification layer according to the annotation labels of the sample text;
a classification label set updating module configured to update the classification label set of the classification layer according to the supplemented classification labels.
In an embodiment, the update training module 150 includes:
a classification label prediction unit configured to use the classification layer to predict the classification label of the sample text from its feature vector, the updated classification label set including the classification label of the sample text;
a judgment unit configured to judge whether the obtained classification label is consistent with the annotation label of the sample text;
an adjustment unit configured to adjust, if they are inconsistent, the parameters of the classification layer until the obtained classification label is consistent with the annotation label.
In an embodiment, the classification label prediction unit includes:
a probability prediction unit configured to use the classification layer to predict, from the feature vector, the probability of the feature vector corresponding to each classification label in the updated classification label set;
a classification label determining unit configured to traverse the probability of each classification label and take the classification label with the maximum probability value as the classification label of the sample text.
In an embodiment, the update training apparatus further includes:
a classification test module configured to classify a number of test samples through the updated text classification model;
a classification accuracy calculation module configured to calculate, from the classification results, the classification accuracy of the update-trained text classification model on the test samples;
an update training ending module configured to end the update training of the text classification model if the classification accuracy reaches the specified accuracy.
The implementation of the functions of each module/unit in the above apparatus is detailed in the implementation of the corresponding steps of the above update training method and is not repeated here.
It will be understood that these modules/units may be implemented in hardware, in software, or in a combination of the two. When implemented in hardware, they may be one or more hardware modules, such as one or more application-specific integrated circuits. When implemented in software, they may be one or more computer programs executed on one or more processors, such as programs stored in the memory 250 and executed by the central processing unit 270 of FIG. 2.
Optionally, the present disclosure further provides an update training device for a text classification model. The device may be the server 200 of the implementation environment shown in FIG. 1 and executes all or some of the steps of the above update training method embodiments. As shown in FIG. 10, the device includes:
a processor 1001;
a memory 1002 for storing instructions executable by the processor 1001;
wherein the processor 1001 is configured to perform all or some of the steps in any embodiment of the above update training method. The executable instructions may be computer-readable instructions, which the processor 1001 reads from the memory 1002 via the communication bus/data line 1003 when executing.
The specific manner in which the processor of the device in this embodiment performs operations has been described in detail in the embodiments of the update training method and is not elaborated here.
In an exemplary embodiment, a computer-readable storage medium is also provided, which may be, for example, a transitory or non-transitory computer-readable storage medium including instructions. The storage medium may be the memory 250 including instructions executable by the central processing unit 270 of the server 200 to complete the above update training method.
It should be understood that the present disclosure is not limited to the precise constructions described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (24)

  1. An update training method for a text classification model, the text classification model comprising a semantic extraction layer and a classification layer, the semantic extraction layer and the classification layer having been trained on sample data of original knowledge points before new knowledge points are added, the method comprising:
    obtaining sample text corresponding to the newly added knowledge points and annotation labels annotating the sample text;
    constructing feature vectors of the sample text through the semantic extraction layer trained on the sample data;
    performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model.
  2. The method according to claim 1, wherein the constructing feature vectors of the sample text through the semantic extraction layer trained on the sample data comprises:
    segmenting the sample text through the semantic extraction layer trained on the sample data of the original knowledge points;
    constructing the feature vector of the sample text from the code corresponding to each word in the sample text and the semantic weight of each word.
  3. The method according to claim 1 or 2, wherein before the performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model, the method further comprises:
    supplementing the classification labels of the classification layer according to the annotation labels corresponding to the sample text;
    updating the classification label set of the classification layer according to the supplemented classification labels.
  4. The method according to claim 3, wherein the performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model, comprises:
    using the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text;
    judging whether the obtained classification label is consistent with the annotation label corresponding to the sample text;
    if they are inconsistent, adjusting the parameters of the classification layer until the obtained classification label is consistent with the annotation label.
  5. The method according to claim 4, wherein the using the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text comprises:
    using the classification layer to predict, from the feature vector, the probability of the feature vector corresponding to each classification label in the updated classification label set;
    traversing the probability of each classification label and taking the classification label with the maximum probability value as the classification label corresponding to the sample text.
  6. The method according to any one of claims 1 to 5, wherein after the performing update training of the classification layer according to the feature vectors of the newly added samples and the annotations corresponding to the newly added samples, the method further comprises:
    classifying a number of test samples through the updated text classification model;
    calculating, from the classification results, the classification accuracy of the update-trained text classification model on the test samples;
    ending the update training of the text classification model if the classification accuracy reaches a specified accuracy.
  7. An update training apparatus for a text classification model, the text classification model comprising a semantic extraction layer and a classification layer, the semantic extraction layer and the classification layer having been trained on sample data of original knowledge points before new knowledge points are added, the apparatus comprising:
    an acquisition module configured to obtain sample text corresponding to the newly added knowledge points and annotation labels annotating the sample text;
    a feature vector construction module configured to construct feature vectors of the sample text through the semantic extraction layer trained on the sample data;
    an update training module configured to perform update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model.
  8. The apparatus according to claim 7, wherein the feature vector construction module comprises:
    a word segmentation unit configured to segment the sample text through the semantic extraction layer trained on the sample data of the original knowledge points;
    a feature vector construction unit configured to construct the feature vector of the sample text from the code corresponding to each word in the sample text and the semantic weight of each word.
  9. The apparatus according to claim 7 or 8, further comprising:
    a classification label supplementing module configured to supplement the classification labels of the classification layer according to the annotation labels corresponding to the sample text;
    a classification label set updating module configured to update the classification label set of the classification layer according to the supplemented classification labels.
  10. The apparatus according to claim 9, wherein the update training module comprises:
    a classification label prediction unit configured to use the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text;
    a judgment unit configured to judge whether the obtained classification label is consistent with the annotation label corresponding to the sample text;
    an adjustment unit configured to adjust, if the judgment unit judges that the obtained classification label is inconsistent with the annotation label corresponding to the sample text, the parameters of the classification layer until the obtained classification label is consistent with the annotation label.
  11. The apparatus according to claim 10, wherein the classification label prediction unit comprises:
    a probability prediction unit configured to use the classification layer to predict, from the feature vector, the probability of the feature vector corresponding to each classification label in the updated classification label set;
    a classification label determining unit configured to traverse the probability of each classification label and take the classification label with the maximum probability value as the classification label corresponding to the sample text.
  12. The apparatus according to any one of claims 7 to 11, further comprising:
    a classification test module configured to classify a number of test samples through the updated text classification model;
    a classification accuracy calculation module configured to calculate, from the classification results, the classification accuracy of the update-trained text classification model on the test samples;
    an update training ending module configured to end the update training of the text classification model if the classification accuracy reaches a specified accuracy.
  13. An update training device for a text classification model, the text classification model comprising a semantic extraction layer and a classification layer, the semantic extraction layer and the classification layer having been trained on sample data of original knowledge points before new knowledge points are added, the device comprising:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform the following steps:
    obtaining sample text corresponding to the newly added knowledge points and annotation labels annotating the sample text;
    constructing feature vectors of the sample text through the semantic extraction layer trained on the sample data;
    performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model.
  14. The device according to claim 13, wherein in the step of constructing feature vectors of the sample text through the semantic extraction layer trained on the sample data, the processor performs the following steps:
    segmenting the sample text through the semantic extraction layer trained on the sample data of the original knowledge points;
    constructing the feature vector of the sample text from the code corresponding to each word in the sample text and the semantic weight of each word.
  15. The device according to claim 13 or 14, wherein before the step of performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model, the processor further performs the following steps:
    supplementing the classification labels of the classification layer according to the annotation labels corresponding to the sample text;
    updating the classification label set of the classification layer according to the supplemented classification labels.
  16. The device according to claim 15, wherein in the step of performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model, the processor performs the following steps:
    using the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text;
    judging whether the obtained classification label is consistent with the annotation label corresponding to the sample text;
    if they are inconsistent, adjusting the parameters of the classification layer until the obtained classification label is consistent with the annotation label.
  17. The device according to claim 16, wherein in the step of using the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text, the processor performs the following steps:
    using the classification layer to predict, from the feature vector, the probability of the feature vector corresponding to each classification label in the updated classification label set;
    traversing the probability of each classification label and taking the classification label with the maximum probability value as the classification label corresponding to the sample text.
  18. The device according to any one of claims 13 to 17, wherein after the step of performing update training of the classification layer according to the feature vectors of the newly added samples and the annotations corresponding to the newly added samples, the processor further performs the following steps:
    classifying a number of test samples through the updated text classification model;
    calculating, from the classification results, the classification accuracy of the update-trained text classification model on the test samples;
    ending the update training of the text classification model if the classification accuracy reaches a specified accuracy.
  19. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the following steps:
    obtaining sample text corresponding to newly added knowledge points and annotation labels annotating the sample text;
    constructing feature vectors of the sample text through the semantic extraction layer trained on the sample data;
    performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model;
    wherein the text classification model comprises a semantic extraction layer and a classification layer, and the semantic extraction layer and the classification layer have been trained on sample data of original knowledge points before new knowledge points are added.
  20. The computer-readable storage medium according to claim 19, wherein in the step of constructing feature vectors of the sample text through the semantic extraction layer trained on the sample data, the processor performs the following steps:
    segmenting the sample text through the semantic extraction layer trained on the sample data of the original knowledge points;
    constructing the feature vector of the sample text from the code corresponding to each word in the sample text and the semantic weight of each word.
  21. The computer-readable storage medium according to claim 19 or 20, wherein before the step of performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model, the processor further performs the following steps:
    supplementing the classification labels of the classification layer according to the annotation labels corresponding to the sample text;
    updating the classification label set of the classification layer according to the supplemented classification labels.
  22. The computer-readable storage medium according to claim 21, wherein in the step of performing update training of the classification layer according to the feature vectors of the sample text and the annotation labels corresponding to the sample text, so as to achieve update training of the text classification model, the processor performs the following steps:
    using the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text;
    judging whether the obtained classification label is consistent with the annotation label corresponding to the sample text;
    if they are inconsistent, adjusting the parameters of the classification layer until the obtained classification label is consistent with the annotation label.
  23. The computer-readable storage medium according to claim 22, wherein in the step of using the classification layer to predict the classification label corresponding to the sample text from the feature vector of the sample text, the processor performs the following steps:
    using the classification layer to predict, from the feature vector, the probability of the feature vector corresponding to each classification label in the updated classification label set;
    traversing the probability of each classification label and taking the classification label with the maximum probability value as the classification label corresponding to the sample text.
  24. The computer-readable storage medium according to any one of claims 19 to 23, wherein after the step of performing update training of the classification layer according to the feature vectors of the newly added samples and the annotations corresponding to the newly added samples, the processor further performs the following steps:
    classifying a number of test samples through the updated text classification model;
    calculating, from the classification results, the classification accuracy of the update-trained text classification model on the test samples;
    ending the update training of the text classification model if the classification accuracy reaches a specified accuracy.
PCT/CN2018/125250 2018-10-12 2018-12-29 Update training method, apparatus, and device for a text classification model WO2020073531A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811192187.4 2018-10-12
CN201811192187.4A CN109241288A (zh) 2018-10-12 2018-10-12 Update training method, apparatus, and device for a text classification model

Publications (1)

Publication Number Publication Date
WO2020073531A1 true WO2020073531A1 (zh) 2020-04-16

Family

ID=65052732

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/125250 WO2020073531A1 (zh) 2018-10-12 2018-12-29 Update training method, apparatus, and device for a text classification model

Country Status (2)

Country Link
CN (1) CN109241288A (zh)
WO (1) WO2020073531A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186780A (zh) Subject knowledge point classification model training method, system, storage medium, and device

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109994103A (zh) Training method for an intelligent semantic matching model
CN110472665A (zh) Model training method, text classification method, and related apparatus
CN110717023B (zh) Method and apparatus for classifying interview answer text, electronic device, and storage medium
CN110851546B (zh) Verification, model training, and model sharing method, system, and medium
CN110633476B (zh) Method and apparatus for acquiring knowledge annotation information
CN114424186A (zh) Text classification model training method, text classification method, apparatus, and electronic device
CN111522570B (zh) Target library updating method and apparatus, electronic device, and machine-readable storage medium
CN111737472A (zh) Text classification model updating method and system, electronic device, and storage medium
CN112148874A (zh) Intent recognition method and system capable of automatically adding latent user intents
CN116881464B (zh) Method for model training based on newly added labels, and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104166706A (zh) Multi-label classifier construction method based on cost-sensitive active learning
US20160078349A1 (en) * 2014-09-17 2016-03-17 International Business Machines Corporation Method for Identifying Verifiable Statements in Text
CN108062331A (zh) Incremental naive Bayes text classification method based on lifelong learning
CN108509484A (zh) Classifier construction and intelligent question-answering method, apparatus, terminal, and readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009589A (zh) Sample data processing method, apparatus, and computer-readable storage medium
CN108090178B (zh) Text data analysis method, apparatus, server, and storage medium
CN108520030B (zh) Text classification method, text classification system, and computer apparatus

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104166706A (zh) Multi-label classifier construction method based on cost-sensitive active learning
US20160078349A1 (en) * 2014-09-17 2016-03-17 International Business Machines Corporation Method for Identifying Verifiable Statements in Text
CN108062331A (zh) Incremental naive Bayes text classification method based on lifelong learning
CN108509484A (zh) Classifier construction and intelligent question-answering method, apparatus, terminal, and readable storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186780A (zh) Subject knowledge point classification model training method, system, storage medium, and device

Also Published As

Publication number Publication date
CN109241288A (zh) 2019-01-18

Similar Documents

Publication Publication Date Title
WO2020073531A1 (zh) Update training method, apparatus, and device for a text classification model
US20200134506A1 (en) Model training method, data identification method and data identification device
CN110147456B (zh) Image classification method and apparatus, readable storage medium, and terminal device
JP7014100B2 (ja) Augmentation device, augmentation method, and augmentation program
US20210209410A1 (en) Method and apparatus for classification of wafer defect patterns as well as storage medium and electronic device
US20210374542A1 (en) Method and apparatus for updating parameter of multi-task model, and storage medium
US10540573B1 (en) Story cycle time anomaly prediction and root cause identification in an agile development environment
CN109564575A (zh) Classifying images using machine learning models
US20210201152A1 (en) Domain adaptation of deep neural networks
CN111259647A (zh) Artificial-intelligence-based question-answer text matching method, apparatus, medium, and electronic device
US11741956B2 (en) Methods and apparatus for intent recognition
JP7364709B2 (ja) 機械学習および自然言語処理を利用したワクチン接種データの抽出および確認
CN113657483A (zh) Model training method, target detection method, apparatus, device, and storage medium
CN110377733A (zh) Text-based emotion recognition method, terminal device, and medium
WO2024001806A1 (zh) Federated-learning-based data value assessment method and related devices
US20240177006A1 (en) Data processing method and apparatus, program product, computer device, and medium
CN113434683A (zh) Text classification method, apparatus, medium, and electronic device
CN112420125A (zh) Molecular property prediction method, apparatus, intelligent device, and terminal
US10885385B2 (en) Image search and training system
WO2021174814A1 (zh) Answer verification method and apparatus for crowdsourcing tasks, computer device, and storage medium
US20220092406A1 (en) Meta-feature training models for machine learning algorithms
CN109376243A (zh) Text classification method and apparatus
WO2023142417A1 (zh) Web page identification method and apparatus, electronic device, and medium
JP6900724B2 (ja) Learning program, learning method, and learning device
CN116229196A (zh) Noise sample identification method and apparatus

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18936540

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18936540

Country of ref document: EP

Kind code of ref document: A1