CN111104514B - Training method and device for document tag model - Google Patents

Training method and device for document tag model

Info

Publication number
CN111104514B
CN111104514B (application CN201911338269.XA)
Authority
CN
China
Prior art keywords
model
sub
recall
document
scene
Prior art date
Legal status
Active
Application number
CN201911338269.XA
Other languages
Chinese (zh)
Other versions
CN111104514A
Inventor
刘呈祥 (Liu Chengxiang)
何伯磊 (He Bolei)
肖欣延 (Xiao Xinyan)
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201911338269.XA
Publication of CN111104514A
Application granted
Publication of CN111104514B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification

Abstract

The application discloses a training method and device for a document tag model, relating to the technical field of document tag prediction. The scheme is as follows: obtain a pre-trained document tag model, the model having been pre-trained on general training data from all application scenarios; acquire scenario training data for the target application scenario, the scenario training data including a plurality of documents in that scenario and their corresponding tag information; obtain the sub-model of the document tag model that is relevant to the target scenario; and train that sub-model on the scenario training data to obtain a trained document tag model. This reduces the training data needed to train the document tag model in the target application scenario, lowering training cost while preserving the accuracy of the model.

Description

Training method and device for document tag model
Technical Field
The present application relates to the field of data processing technology, in particular to document tag prediction, and provides a training method and device for a document tag model.
Background
Tag prediction for documents is currently an important task in understanding document content. For a new document tag prediction scenario, there are two main approaches. The first is to train a general document tag model: the model is trained without regard to differences between scenarios, and the same general model is used in every scenario. The second is to train a document tag model separately: training data is prepared specifically for the new scenario.
With the first method, the trained model lacks scenario or domain specificity, so its prediction accuracy in any single scenario is low. With the second method, a large amount of training data must be prepared, so the training cost is high.
Disclosure of Invention
According to the training method and device for a document tag model provided by this application, the sub-model of a pre-trained document tag model that is relevant to the target application scenario is trained on scenario training data for that scenario, reducing the training cost of the document tag model in the target scenario while preserving its accuracy.
In one aspect, an embodiment of the present application provides a training method for a document tag model, including:
obtaining a pre-trained document tag model, wherein the document tag model is pre-trained on general training data from all application scenarios;
acquiring scenario training data for the target application scenario, wherein the scenario training data includes: a plurality of documents in the target application scenario and their corresponding tag information;
acquiring the sub-model of the document tag model that is relevant to the target application scenario; and
training the sub-model on the scenario training data to obtain a trained document tag model.
In one embodiment of the present application, the document tag model includes: a preprocessing layer, a candidate recall layer, a coarse ranking layer and a fine ranking layer;
the candidate recall layer includes, in parallel: a keyword recall sub-model, a multi-label classification recall sub-model, an explicit recall sub-model and an implicit recall sub-model;
the coarse ranking layer includes, in parallel: a rule sub-model and a semantic matching sub-model;
the sub-model relevant to the target application scenario includes the semantic matching sub-model and any one or more of: the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model.
In one embodiment of the present application, when the sub-model relevant to the target application scenario includes the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model, training the sub-model on the scenario training data to obtain a trained document tag model includes:
for each document in the scenario training data, inputting the document into the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model respectively, and merging their outputs to obtain a candidate tag result;
inputting the document and the candidate tag result into the semantic matching sub-model to obtain the relevance of the document to each candidate tag in the candidate tag result;
adjusting coefficients of the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model according to the relevance of the document to each candidate tag in the candidate tag result and the tag information corresponding to the document, to obtain the trained document tag model.
In one embodiment of the present application, the scenario training data further includes a tag set containing the tags the document tag model is allowed to predict, so that the document tag model predicts tags for the documents in the scenario training data in combination with the tag set.
In one embodiment of the present application, before training the sub-model on the scenario training data to obtain a trained document tag model, the method further includes:
initializing coefficients of the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model in the document tag model.
With the training method of the document tag model of this application, a pre-trained document tag model is obtained, the model having been pre-trained on general training data from all application scenarios; scenario training data for the target application scenario is acquired, including a plurality of documents in that scenario and their corresponding tag information; the sub-model of the document tag model relevant to the target scenario is obtained; and that sub-model is trained on the scenario training data to obtain a trained document tag model. This reduces the training data needed to train the document tag model in the target application scenario, lowering training cost while preserving the accuracy of the model.
In another aspect, an embodiment of the present application provides a training device for a document tag model, including:
an acquisition module configured to acquire a pre-trained document tag model, wherein the document tag model is pre-trained on general training data from all application scenarios;
the acquisition module being further configured to acquire scenario training data for the target application scenario, wherein the scenario training data includes: a plurality of documents in the target application scenario and their corresponding tag information;
the acquisition module being further configured to acquire the sub-model of the document tag model that is relevant to the target application scenario; and
a training module configured to train the sub-model on the scenario training data to obtain a trained document tag model.
In one embodiment of the present application, the document tag model includes: a preprocessing layer, a candidate recall layer, a coarse ranking layer and a fine ranking layer;
the candidate recall layer includes, in parallel: a keyword recall sub-model, a multi-label classification recall sub-model, an explicit recall sub-model and an implicit recall sub-model;
the coarse ranking layer includes, in parallel: a rule sub-model and a semantic matching sub-model;
the sub-model relevant to the target application scenario includes the semantic matching sub-model and any one or more of: the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model.
In one embodiment of the present application, when the sub-model relevant to the target application scenario includes the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model, the training module is specifically configured to:
for each document in the scenario training data, input the document into the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model respectively, and merge their outputs to obtain a candidate tag result;
input the document and the candidate tag result into the semantic matching sub-model to obtain the relevance of the document to each candidate tag in the candidate tag result;
adjust coefficients of the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model according to the relevance of the document to each candidate tag in the candidate tag result and the tag information corresponding to the document, to obtain the trained document tag model.
In one embodiment of the present application, the scenario training data further includes a tag set containing the tags the document tag model is allowed to predict, so that the document tag model predicts tags for the documents in the scenario training data in combination with the tag set.
In one embodiment of the present application, the device further includes: an initialization module configured to initialize coefficients of the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model in the document tag model.
With the training device for the document tag model of this application, a pre-trained document tag model is obtained, the model having been pre-trained on general training data from all application scenarios; scenario training data for the target application scenario is acquired, including a plurality of documents in that scenario and their corresponding tag information; the sub-model of the document tag model relevant to the target scenario is obtained; and that sub-model is trained on the scenario training data to obtain a trained document tag model. This reduces the training data needed to train the document tag model in the target application scenario, lowering training cost while preserving the accuracy of the model.
Another embodiment of the present application provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the training method of the document tag model according to embodiments of this application.
Another aspect of the present application provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the training method of the document tag model of the embodiments of this application.
Other effects of the above alternatives are described below in connection with specific embodiments.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present application;
FIG. 2 is a schematic diagram of a document tag model structure;
FIG. 3 is a schematic diagram according to a second embodiment of the present application;
FIG. 4 is a schematic diagram according to a third embodiment of the present application;
FIG. 5 is a block diagram of an electronic device for implementing a training method for a document tag model of an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The following describes a training method and device for a document tag model according to an embodiment of the present application with reference to the accompanying drawings.
Fig. 1 is a schematic diagram according to a first embodiment of the present application. It should be noted that the execution body of the training method of the document tag model provided in this embodiment is a training device for the document tag model; the device may be implemented in software and/or hardware and may be configured in a terminal device or a server, which this embodiment does not specifically limit.
As shown in fig. 1, the training method of the document tag model may include:
step 101, obtaining a pre-trained document tag model, wherein the document tag model is obtained by pre-training universal training data of each application scene.
In this application, the structure of the document tag model may be as shown in fig. 2. In fig. 2, the document tag model includes: a preprocessing layer, a candidate recall layer, a coarse ranking layer and a fine ranking layer. The preprocessing layer performs paragraph segmentation, sentence segmentation, word segmentation, part-of-speech (POS) tagging, named entity recognition (NER) and similar processing on the document to obtain a preprocessing result, which includes: the paragraph segmentation result, the sentence segmentation result, the word segmentation result, the POS tagging result and the NER result.
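For illustration only, a minimal Python sketch of such a preprocessing layer follows; the tokenizer, POS tagger and NER callables are placeholder assumptions, not components prescribed by the patent:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class PreprocessResult:
    paragraphs: List[str]            # paragraph segmentation result
    sentences: List[str]             # sentence segmentation result
    tokens: List[str]                # word segmentation result
    pos_tags: List[Tuple[str, str]]  # (token, part-of-speech) pairs
    entities: List[Tuple[str, str]]  # (span, entity-type) pairs from NER

def preprocess(document: str,
               tokenize: Callable[[str], List[str]] = str.split,
               pos_tag=lambda toks: [(t, "X") for t in toks],
               ner=lambda text: []) -> PreprocessResult:
    """Preprocessing layer: paragraph/sentence/word segmentation, POS, NER.
    The default callables are trivial stand-ins, not real NLP components."""
    paragraphs = [p.strip() for p in document.split("\n") if p.strip()]
    sentences = [s.strip() for p in paragraphs
                 for s in p.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    tokens = tokenize(document)
    return PreprocessResult(paragraphs, sentences, tokens,
                            pos_tag(tokens), ner(document))
```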
The candidate recall layer includes, in parallel: a keyword recall sub-model, a multi-label classification recall sub-model, an explicit recall sub-model and an implicit recall sub-model. Each of the four recall sub-models takes as input the document and its preprocessing result, and outputs a number of candidate tags; the outputs of the four sub-models are merged to obtain the candidate tag result. The keyword recall sub-model determines candidate tags by analyzing the semantic structure and statistical features of the document. The multi-label classification recall sub-model determines candidate tags by neural-network multi-label classification. The explicit recall sub-model determines candidate tags by literal matching and frequency screening. The implicit recall sub-model determines candidate tags by analyzing the document's primary and secondary components.
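A hedged sketch of this recall-and-merge step, with the four sub-models abstracted as callables (their internals are not specified here):

```python
from typing import Callable, Dict, Iterable, List

RecallSubModel = Callable[[str, "PreprocessResult"], Iterable[str]]

def recall_candidates(document: str,
                      pre: "PreprocessResult",
                      recall_models: Dict[str, RecallSubModel]) -> List[str]:
    """Candidate recall layer: run each recall sub-model on the document plus
    its preprocessing result and merge the outputs (an order-preserving
    union) into the candidate tag result."""
    merged: List[str] = []
    seen = set()
    for name, model in recall_models.items():
        for tag in model(document, pre):
            if tag not in seen:
                seen.add(tag)
                merged.append(tag)
    return merged

# Hypothetical usage, with the four sub-models supplied as callables:
# candidates = recall_candidates(doc, pre, {
#     "keyword": keyword_recall, "multi_label": multilabel_recall,
#     "explicit": explicit_recall, "implicit": implicit_recall})
```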
The coarse ranking layer includes, in parallel: a rule sub-model and a semantic matching sub-model. The rule sub-model determines, according to preset rules, which candidate tags in the candidate tag result should be filtered out. The semantic matching sub-model determines the text relevance between the document and each candidate tag in the candidate tag result, and determines candidate tags to filter out according to that relevance. The candidate tags to be filtered are removed from the candidate tag result, yielding the filtered candidate tag result. Text relevance here means the semantic-level similarity between the text and a candidate tag.
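The coarse ranking step might be sketched as follows; the relevance threshold is an assumption, since the patent only says candidates are filtered according to text relevance:

```python
from typing import Callable, List, Tuple

def coarse_rank(document: str,
                candidates: List[str],
                rule_rejects: Callable[[str, str], bool],
                relevance: Callable[[str, str], float],
                min_relevance: float = 0.5) -> List[Tuple[str, float]]:
    """Coarse ranking layer: drop candidates that the rule sub-model rejects
    or whose semantic-matching relevance falls below a threshold."""
    kept: List[Tuple[str, float]] = []
    for tag in candidates:
        if rule_rejects(document, tag):   # rule sub-model marks tag for filtering
            continue
        score = relevance(document, tag)  # semantic matching sub-model
        if score >= min_relevance:
            kept.append((tag, score))
    return kept
```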
The fine ranking layer ranks the candidate tags in the filtered candidate tag result according to their text relevance, tag heat and tag granularity, and predicts the tag information corresponding to the document from the ranking result. Tag heat refers to how much user attention a candidate tag receives, for example its search popularity. Tag granularity is computed from the types and length of the words that make up a candidate tag: the more specific a candidate tag's content, the finer (smaller) its granularity. For example, ordered by tag granularity: Baidu -> Baidu Alliance Summit; entertainment -> entertainment star.
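One plausible reading of the fine ranking layer is a weighted combination of the three signals; the linear mix, the weights and the sign given to granularity are illustrative assumptions, not prescribed by the patent:

```python
from typing import Dict, List, Tuple

def fine_rank(scored_candidates: List[Tuple[str, float]],
              heat: Dict[str, float],
              granularity: Dict[str, float],
              w_rel: float = 0.6, w_heat: float = 0.3, w_gran: float = 0.1,
              top_k: int = 5) -> List[str]:
    """Fine ranking layer: order the filtered candidates by a weighted mix of
    text relevance, tag heat and tag granularity, then keep the top ones as
    the document's predicted tag information."""
    def score(item: Tuple[str, float]) -> float:
        tag, relevance = item
        # Smaller granularity means a more specific tag; rewarding specificity
        # here is one possible, scenario-dependent choice.
        return (w_rel * relevance
                + w_heat * heat.get(tag, 0.0)
                - w_gran * granularity.get(tag, 1.0))
    ranked = sorted(scored_candidates, key=score, reverse=True)
    return [tag for tag, _ in ranked[:top_k]]
```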
In this application, example application scenarios include entity-focused tag prediction for long documents, precision-focused tag prediction for question-and-answer content, recall-focused tag prediction for user-generated content, and so on. The prediction object may include long documents, questions and answers, user-generated content, etc.; the prediction requirement may emphasize recall, precision, entities, classification, commercial value, etc.
In this application, the general training data from all application scenarios may, for example, be training data obtained by combining the training data of the individual application scenarios. Before the target application scenario is determined, a large amount of such general training data can be used to pre-train the initial document tag model, so that once the target application scenario is determined, far less training data is needed in that scenario.
Step 102, acquiring scenario training data for the target application scenario, wherein the scenario training data includes: a plurality of documents in the target application scenario and their corresponding tag information.
Step 103, acquiring the sub-model of the document tag model that is relevant to the target application scenario.
In this application, the sub-model relevant to the target application scenario includes the semantic matching sub-model and any one or more of: the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model. Which of these sub-models to retrain or fine-tune can be chosen according to the specific target scenario, as sketched below.
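A sketch of scenario-driven sub-model selection; the mapping below is hypothetical, since the patent requires only that the semantic matching sub-model plus one or more recall sub-models be chosen:

```python
# Illustrative only: which sub-models to retrain or fine-tune for a few
# hypothetical prediction requirements.
SCENARIO_SUBMODELS = {
    "recall_focused":    ["semantic_matching", "multi_label_recall", "implicit_recall"],
    "entity_focused":    ["semantic_matching", "explicit_recall"],
    "precision_focused": ["semantic_matching", "multi_label_recall"],
}

def submodels_to_finetune(scenario: str) -> list:
    # The semantic matching sub-model is always included; which recall
    # sub-models join it depends on the target scenario.
    return SCENARIO_SUBMODELS.get(scenario, ["semantic_matching"])
```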
Step 104, training the sub-models on the scenario training data to obtain a trained document tag model.
In this application, when the sub-model relevant to the target application scenario includes the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model, the training device may carry out step 104 as follows: for each document in the scenario training data, input the document into the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model respectively, and merge their outputs to obtain the candidate tag result; input the document and the candidate tag result into the semantic matching sub-model to obtain the relevance of the document to each candidate tag in the candidate tag result; and adjust the coefficients of the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model according to the relevance of the document to each candidate tag and the tag information corresponding to the document, thereby obtaining the trained document tag model.
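A minimal PyTorch sketch of one such training step, under stated assumptions: each recall sub-model is a torch module mapping a document vector to a dict {tag_id: confidence in [0, 1]} with differentiable confidences, the semantic matching sub-model maps (doc_vec, tag_ids) to per-candidate relevance in [0, 1], and binary cross-entropy is used as the loss (the patent does not fix a loss function):

```python
import torch
import torch.nn.functional as F

def finetune_step(doc_vec, gold_tag_ids, recall_models, matcher, optimizer):
    """One training step for step 104 under the assumptions stated above."""
    # 1. Candidate recall: run the recall sub-models and merge (union) their tags.
    confidences = {}
    for model in recall_models:
        for tag_id, conf in model(doc_vec).items():   # hypothetical output format
            confidences.setdefault(tag_id, []).append(conf)
    candidates = list(confidences)

    # 2. Semantic matching: relevance of the document to each candidate tag.
    relevance = matcher(doc_vec, torch.tensor(candidates))

    # 3. Adjust coefficients: candidates appearing in the document's tag
    #    information are positives, the rest negatives.
    targets = torch.tensor([1.0 if t in gold_tag_ids else 0.0 for t in candidates])
    loss = F.binary_cross_entropy(relevance, targets)
    for i, tag_id in enumerate(candidates):
        conf = torch.stack(confidences[tag_id])   # one confidence per sub-model
        loss = loss + F.binary_cross_entropy(conf, targets[i].expand_as(conf))

    optimizer.zero_grad()
    loss.backward()    # gradients reach the matcher and the recall sub-models
    optimizer.step()
    return loss.item()
```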
In this application, to improve the accuracy of the trained document tag model, the scenario training data may further include a tag set containing the tags the document tag model is allowed to predict, so that the model predicts tags for the documents in the scenario training data in combination with the tag set.
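Combining predictions with the tag set could be as simple as a filter; the exact combination mechanism is left open by the patent, so this is only one straightforward reading:

```python
def restrict_to_tag_set(candidate_tags, scenario_tag_set):
    """Keep only candidates that the scenario's tag set allows."""
    return [tag for tag in candidate_tags if tag in scenario_tag_set]
```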
In this application, before step 104, the method may further include: initializing the coefficients of the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model in the document tag model. This prevents the pre-trained coefficients of those sub-models from interfering with training in the target application scenario, further improving the accuracy of the document tag model in that scenario.
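A sketch of such re-initialization for torch-based sub-models; the attribute names are assumptions about how the sub-models might be registered on the model:

```python
import torch.nn as nn

def reinit_recall_submodels(model: nn.Module,
                            names=("multi_label_recall",
                                   "explicit_recall",
                                   "implicit_recall")) -> None:
    """Re-initialize the coefficients of the named recall sub-models before
    scenario fine-tuning."""
    for name in names:
        for layer in getattr(model, name).modules():
            if hasattr(layer, "reset_parameters"):
                layer.reset_parameters()   # built-in torch re-initialization hook
```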
With the training method of the document tag model of this application, a pre-trained document tag model is obtained, the model having been pre-trained on general training data from all application scenarios; scenario training data for the target application scenario is acquired, including a plurality of documents in that scenario and their corresponding tag information; the sub-model of the document tag model relevant to the target scenario is obtained; and that sub-model is trained on the scenario training data to obtain a trained document tag model. This reduces the training data needed to train the document tag model in the target application scenario, lowering training cost while preserving the accuracy of the model.
To implement the above embodiments, an embodiment of the present application further provides a training device for a document tag model.
Fig. 3 is a schematic diagram according to a second embodiment of the present application. As shown in fig. 3, the training apparatus 100 of the document tag model includes:
an acquisition module 110 configured to acquire a pre-trained document tag model, wherein the document tag model is pre-trained on general training data from all application scenarios;
the acquisition module 110 being further configured to acquire scenario training data for the target application scenario, wherein the scenario training data includes: a plurality of documents in the target application scenario and their corresponding tag information;
the acquisition module 110 being further configured to acquire the sub-model of the document tag model that is relevant to the target application scenario; and
a training module 120 configured to train the sub-model on the scenario training data to obtain a trained document tag model.
In one embodiment of the present application, the document tag model includes: a preprocessing layer, a candidate recall layer, a coarse ranking layer and a fine ranking layer;
the candidate recall layer includes, in parallel: a keyword recall sub-model, a multi-label classification recall sub-model, an explicit recall sub-model and an implicit recall sub-model;
the coarse ranking layer includes, in parallel: a rule sub-model and a semantic matching sub-model;
the sub-model relevant to the target application scenario includes the semantic matching sub-model and any one or more of: the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model.
In one embodiment of the present application, when the sub-model relevant to the target application scenario includes the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model, the training module 120 is specifically configured to:
for each document in the scenario training data, input the document into the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model respectively, and merge their outputs to obtain a candidate tag result;
input the document and the candidate tag result into the semantic matching sub-model to obtain the relevance of the document to each candidate tag in the candidate tag result;
adjust coefficients of the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model according to the relevance of the document to each candidate tag in the candidate tag result and the tag information corresponding to the document, to obtain the trained document tag model.
In one embodiment of the present application, the scenario training data further includes a tag set containing the tags the document tag model is allowed to predict, so that the document tag model predicts tags for the documents in the scenario training data in combination with the tag set.
In one embodiment of the present application, referring also to fig. 4, the device further includes: an initialization module 130 configured to initialize coefficients of the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model in the document tag model.
It should be noted that the foregoing explanation of the training method of the document tag model also applies to the training device of this embodiment and is not repeated here.
With the training device for the document tag model of this application, a pre-trained document tag model is obtained, the model having been pre-trained on general training data from all application scenarios; scenario training data for the target application scenario is acquired, including a plurality of documents in that scenario and their corresponding tag information; the sub-model of the document tag model relevant to the target scenario is obtained; and that sub-model is trained on the scenario training data to obtain a trained document tag model. This reduces the training data needed to train the document tag model in the target application scenario, lowering training cost while preserving the accuracy of the model.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device for the training method of a document tag model according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 5, the electronic device includes: one or more processors 301, memory 302, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executing within the electronic device, including instructions stored in or on memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). A single processor 301 is illustrated in fig. 5.
Memory 302 is a non-transitory computer-readable storage medium provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the training method for the document tag model provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the training method of the document tag model provided by the present application.
The memory 302, as a non-transitory computer-readable storage medium, stores non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the training method of the document tag model in the embodiments of the present application (e.g., the acquisition module 110, the training module 120 and the initialization module 130 shown in fig. 3 and fig. 4). By running the non-transitory software programs, instructions and modules stored in the memory 302, the processor 301 executes the various functional applications and data processing of the server, that is, implements the training method of the document tag model in the above method embodiments.
Memory 302 may include a program storage area, which may store an operating system and an application program required by at least one function, and a data storage area, which may store data created by the use of the electronic device for training the document tag model, and the like. In addition, memory 302 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 302 optionally includes memory located remotely from processor 301; such remote memory may be connected over a network to the electronic device for training the document tag model. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device for the training method of the document tag model may further include: an input device 303 and an output device 304. The processor 301, the memory 302, the input device 303 and the output device 304 may be connected by a bus or in other ways; connection by a bus is illustrated in fig. 5.
The input device 303 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, for example a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball or a joystick. The output device 304 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose and can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved, and are not limited herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (10)

1. A method for training a document tag model, comprising:
obtaining a pre-trained document tag model, wherein the document tag model is pre-trained on general training data from all application scenarios;
acquiring scenario training data for the target application scenario, wherein the scenario training data includes: a plurality of documents in the target application scenario and their corresponding tag information;
acquiring the sub-model of the document tag model that is relevant to the target application scenario;
training the sub-model on the scenario training data to obtain a trained document tag model;
wherein, when the sub-model relevant to the target application scenario includes the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model, training the sub-model on the scenario training data to obtain a trained document tag model comprises:
for each document in the scenario training data, inputting the document into the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model respectively, and merging their outputs to obtain a candidate tag result;
inputting the document and the candidate tag result into the semantic matching sub-model to obtain the relevance of the document to each candidate tag in the candidate tag result;
adjusting coefficients of the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model according to the relevance of the document to each candidate tag in the candidate tag result and the tag information corresponding to the document, to obtain the trained document tag model.
2. The method of claim 1, wherein the document tag model comprises: a preprocessing layer, a candidate recall layer, a coarse ranking layer and a fine ranking layer;
the candidate recall layer comprises, in parallel: a keyword recall sub-model, a multi-label classification recall sub-model, an explicit recall sub-model and an implicit recall sub-model;
the coarse ranking layer comprises, in parallel: a rule sub-model and a semantic matching sub-model;
the sub-model relevant to the target application scenario comprises the semantic matching sub-model and any one or more of: the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model.
3. The method of claim 1, wherein the scenario training data further comprises: a tag set containing the tags the document tag model is allowed to predict, so that the document tag model predicts tags for the documents in the scenario training data in combination with the tag set.
4. The method of claim 1, wherein before training the sub-model on the scenario training data to obtain a trained document tag model, the method further comprises:
initializing coefficients of the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model in the document tag model.
5. A training device for a document tag model, comprising:
an acquisition module configured to acquire a pre-trained document tag model, wherein the document tag model is pre-trained on general training data from all application scenarios;
the acquisition module being further configured to acquire scenario training data for the target application scenario, wherein the scenario training data includes: a plurality of documents in the target application scenario and their corresponding tag information;
the acquisition module being further configured to acquire the sub-model of the document tag model that is relevant to the target application scenario;
a training module configured to train the sub-model on the scenario training data to obtain a trained document tag model;
wherein, when the sub-model relevant to the target application scenario includes the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model, the training module is specifically configured to:
for each document in the scenario training data, input the document into the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model respectively, and merge their outputs to obtain a candidate tag result;
input the document and the candidate tag result into the semantic matching sub-model to obtain the relevance of the document to each candidate tag in the candidate tag result;
adjust coefficients of the semantic matching sub-model, the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model according to the relevance of the document to each candidate tag in the candidate tag result and the tag information corresponding to the document, to obtain the trained document tag model.
6. The device of claim 5, wherein the document tag model comprises: a preprocessing layer, a candidate recall layer, a coarse ranking layer and a fine ranking layer;
the candidate recall layer comprises, in parallel: a keyword recall sub-model, a multi-label classification recall sub-model, an explicit recall sub-model and an implicit recall sub-model;
the coarse ranking layer comprises, in parallel: a rule sub-model and a semantic matching sub-model;
the sub-model relevant to the target application scenario comprises the semantic matching sub-model and any one or more of: the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model.
7. The device of claim 5, wherein the scenario training data further comprises: a tag set containing the tags the document tag model is allowed to predict, so that the document tag model predicts tags for the documents in the scenario training data in combination with the tag set.
8. The device of claim 5, further comprising: an initialization module configured to initialize coefficients of the multi-label classification recall sub-model, the explicit recall sub-model and the implicit recall sub-model in the document tag model.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-4.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-4.
CN201911338269.XA 2019-12-23 2019-12-23 Training method and device for document tag model Active CN111104514B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911338269.XA CN111104514B (en) 2019-12-23 2019-12-23 Training method and device for document tag model


Publications (2)

Publication Number Publication Date
CN111104514A CN111104514A (en) 2020-05-05
CN111104514B 2023-04-25

Family

Family ID: 70423892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911338269.XA Active CN111104514B (en) 2019-12-23 2019-12-23 Training method and device for document tag model

Country Status (1)

Country Link
CN (1) CN111104514B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581545B (en) * 2020-05-12 2023-09-19 腾讯科技(深圳)有限公司 Method for sorting recall documents and related equipment
CN111783448B (en) * 2020-06-23 2024-03-15 北京百度网讯科技有限公司 Document dynamic adjustment method, device, equipment and readable storage medium
CN111782949A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Method and apparatus for generating information
CN111858895B (en) * 2020-07-30 2024-04-05 阳光保险集团股份有限公司 Sequencing model determining method, sequencing device and electronic equipment
CN112149733B (en) * 2020-09-23 2024-04-05 北京金山云网络技术有限公司 Model training method, model quality determining method, model training device, model quality determining device, electronic equipment and storage medium
CN112560402A (en) * 2020-12-28 2021-03-26 北京百度网讯科技有限公司 Model training method and device and electronic equipment
CN112784033B (en) * 2021-01-29 2023-11-03 北京百度网讯科技有限公司 Aging grade identification model training and application method and electronic equipment
CN113011490B (en) * 2021-03-16 2024-03-08 北京百度网讯科技有限公司 Model training method and device and electronic equipment
CN113239128B (en) * 2021-06-01 2022-03-18 平安科技(深圳)有限公司 Data pair classification method, device, equipment and storage medium based on implicit characteristics
CN117456416A (en) * 2023-11-03 2024-01-26 北京饼干科技有限公司 Method and system for intelligently generating material labels


Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US9836450B2 (en) * 2014-12-09 2017-12-05 Sansa AI Inc. Methods and systems for providing universal portability in machine learning
US20160162569A1 (en) * 2014-12-09 2016-06-09 Idibon, Inc. Methods and systems for improving machine learning performance

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
WO2015187155A1 (en) * 2014-06-04 2015-12-10 Waterline Data Science, Inc. Systems and methods for management of data platforms
CN108304439A (en) * 2017-10-30 2018-07-20 腾讯科技(深圳)有限公司 A kind of semantic model optimization method, device and smart machine, storage medium
CN108153856A (en) * 2017-12-22 2018-06-12 北京百度网讯科技有限公司 For the method and apparatus of output information
CN108733779A (en) * 2018-05-04 2018-11-02 百度在线网络技术(北京)有限公司 The method and apparatus of text figure
CN109376222A (en) * 2018-09-27 2019-02-22 国信优易数据有限公司 Question and answer matching degree calculation method, question and answer automatic matching method and device

Non-Patent Citations (1)

Title
Xie Chenyang. Research on Multi-label Document Classification Based on Hierarchical Supervision. China Master's Theses Full-text Database, Information Science and Technology. 2019, full text. *

Also Published As

Publication number Publication date
CN111104514A (en) 2020-05-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant