CN113240088A - Training method of text intention recognition model - Google Patents
- Publication number: CN113240088A
- Application number: CN202110534484.8A
- Authority
- CN
- China
- Prior art keywords
- training
- recognition model
- intention recognition
- text intention
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06N3/04 — Architecture, e.g. interconnection topology (G06N3/00 computing arrangements based on biological models; G06N3/02 neural networks)
- G06F40/30 — Semantic analysis (G06F40/00 handling natural language data)
- G06N3/08 — Learning methods (G06N3/00 computing arrangements based on biological models; G06N3/02 neural networks)
Abstract
The application relates to a training method for a text intention recognition model, comprising the following steps: acquiring a business objective and a business scenario; determining the algorithm structure of a deep learning model according to the business objective and the business scenario; obtaining structural parameters, and determining a pre-training model according to the structural parameters and the algorithm structure; and acquiring training data, and training the pre-training model with the training data to obtain a text intention recognition model. Business staff can therefore have the pre-training model assembled automatically simply by specifying the business objective and scenario and entering the structural parameters when the model is created. No communication between developers and business staff is required, manual script writing and manual parameter tuning are avoided, substantial labor is saved, and training efficiency is effectively improved. Because the algorithm structure is determined from the business objective and scenario, models for multiple business scenarios can be switched simply at different times without involving operations staff, improving application efficiency.
Description
Technical Field
The application relates to the technical field of natural language processing, and in particular to a training method for a text intention recognition model.
Background
With the development of natural language processing (NLP) technology, the algorithms used in industrial scenarios such as text classification and text labeling have become modular. Development typically consists of combining different algorithm models and tuning their parameters; integrating, training, and testing on business text data to generate a deep learning model; adjusting and optimizing the training data according to the test results to keep improving the model; and finally deploying it online to meet a specific business objective.
In most existing schemes, a developer writes the code and parameters, manually writes test scripts, corrects and retrains on mislabeled data after testing until the expected model performance is reached, and then contacts operations staff for deployment. This approach has three drawbacks. First, manually written test scripts are inefficient and make it hard to present test results visually. Second, optimizing the training-data labels depends on understanding the business; it requires substantial labor, and because developers rarely know the business well enough, that labor is used inefficiently. Third, model deployment requires operations staff, and updating or rolling back model versions needs extra personnel, which is slow.
Disclosure of Invention
In view of the above, an object of the present application is to overcome these shortcomings of the prior art and provide a training method for a text intention recognition model.
To this end, the application adopts the following technical scheme:
The application provides a training method for a text intention recognition model, comprising the following steps:
acquiring a business objective and a business scenario;
determining an algorithm structure of a deep learning model according to the business objective and the business scenario;
obtaining structural parameters, and determining a pre-training model according to the structural parameters and the algorithm structure;
and acquiring training data, and training the pre-training model with the training data to obtain a text intention recognition model.
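The four steps above can be sketched as follows. The function names and the lookup-table entries are illustrative assumptions, not part of the application:

```python
def select_algorithm_structure(business_objective: str, business_scenario: str) -> str:
    """Step 2: map a business objective + scenario to an algorithm structure."""
    # Hypothetical lookup table; a real system would hold many more entries.
    table = {
        ("intent-classification", "express-delivery"): "bert-classifier",
        ("emotion-recognition", "general"): "lstm-classifier",
    }
    return table.get((business_objective, business_scenario), "bert-classifier")

def build_pretrained_model(structure: str, params: dict) -> dict:
    """Step 3: assemble a pre-training model from a structure and parameters."""
    return {"structure": structure, **params}

def train(model: dict, training_data: list) -> dict:
    """Step 4: 'train' the assembled model (placeholder for a real loop)."""
    model = dict(model)
    model["trained_on"] = len(training_data)
    return model

# Step 1 inputs, then steps 2-4:
structure = select_algorithm_structure("intent-classification", "express-delivery")
model = train(build_pretrained_model(structure, {"learning_rate": 2e-5}),
              ["urge the delivery", "check the parcel"])
```

The point of the sketch is the data flow: business staff supply only the objective, scenario, and parameters; no code is written per model.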
Optionally, after the text intention recognition model is obtained, the method further includes:
obtaining test data, testing the text intention recognition model with the test data, and judging whether the model meets the standard;
if the model meets the standard, ending the test and performing the corresponding online operation for the model;
if not, issuing a prompt message.
Optionally, after the prompt message is issued, the method further includes:
detecting whether adjustment information is received;
and if adjustment information is received, continuing to train and test the text intention recognition model according to the adjustment information until the model meets the standard.
Optionally, the obtaining test data, testing the text intention recognition model with the test data, and judging whether the model meets the standard includes:
loading the text intention recognition model and building a test system;
acquiring test data, and inputting it into the test system to obtain a test result;
and judging, according to the test result, whether the text intention recognition model meets the standard.
Optionally, the issuing a prompt message includes:
displaying text prompt information in a pop-up window, with the prompt carrying the test result.
The technical scheme provided by the application can have the following beneficial effects:
With this scheme, the algorithm structure of the required deep learning model can be determined from the acquired business objective and scenario, laying a foundation for applying text intention recognition models to different business scenarios and objectives; the pre-training model can be determined from the obtained structural parameters and the determined algorithm structure; and the pre-training model can then be trained with the acquired training data to obtain the text intention recognition model. Developers therefore need not take part in training: business staff only have to specify the business objective and scenario when creating the model and enter the structural parameters, and the pre-training model is assembled automatically. No communication between developers and business staff is required, manual script writing and manual parameter tuning are avoided, substantial labor is saved, and training efficiency is effectively improved. Because the algorithm structure is determined from the business objective and scenario, models for multiple business scenarios can be switched simply at different times without involving operations staff, improving application efficiency.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed to describe them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a training method for a text intention recognition model according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a training apparatus for a text intention recognition model according to another embodiment of the present application.
Fig. 3 is a schematic structural diagram of a training device for a text intention recognition model according to another embodiment of the present application.
Detailed Description
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions are described in detail below. It should be understood that the embodiments described are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort fall within the protection scope of the present application.
Fig. 1 is a flowchart of a training method for a text intention recognition model according to an embodiment of the present application. This embodiment provides a training method for a text intention recognition model which, as shown in the figure, includes at least the following steps:
and 11, acquiring a service target and a service scene.
And step 12, determining an algorithm structure of the deep learning model according to the service target and the service scene.
And step 13, obtaining the structural parameters, and determining a pre-training model according to the structural parameters and the algorithm structure.
In implementation, the structural parameters may include the model's coding-layer structure (BERT, LSTM, and the like), learning rate, text length, training batch size, number of training iterations, and so on. Together, the structural parameters define the model structure, and different model structures yield different computational efficiency and model performance.
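As a rough illustration of how such structural parameters might be captured, the following sketch groups them into one validated configuration object; the field names, defaults, and set of supported encoders are assumptions for illustration only:

```python
from dataclasses import dataclass, asdict

@dataclass
class StructuralParams:
    encoder: str = "bert"        # coding-layer structure: "bert", "lstm", ...
    learning_rate: float = 2e-5
    max_text_length: int = 128
    batch_size: int = 32         # training batch size
    num_epochs: int = 3          # number of training iterations

    def validate(self) -> None:
        # Reject combinations the (hypothetical) platform does not support.
        if self.encoder not in ("bert", "lstm"):
            raise ValueError(f"unsupported encoder: {self.encoder}")
        if self.learning_rate <= 0 or self.max_text_length <= 0:
            raise ValueError("learning rate and text length must be positive")

params = StructuralParams()
params.validate()
```

Validating on entry matters here because, in the described workflow, these values come from business staff rather than developers.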
Step 14: acquire training data, and train the pre-training model with the training data to obtain the text intention recognition model.
The training data need only be similar to the target data that is actually to be predicted; it does not have to match it exactly. In implementation, training data can be selected according to actual requirements: the more training data there is and the more uniform its distribution, the better the trained model performs.
In this embodiment, the acquired business objective and scenario determine the algorithm structure of the deep learning model to be used, laying a foundation for applying text intention recognition models to different business scenarios and objectives; the obtained structural parameters together with that structure determine the pre-training model; and training it on the acquired training data yields the text intention recognition model. Developers thus need not take part in training: business staff only specify the business objective and scenario when creating the model and enter the structural parameters, and the pre-training model is assembled automatically, avoiding developer-business communication, manual script writing, and manual parameter tuning, saving substantial labor and effectively improving training efficiency. Determining the algorithm structure from the business objective and scenario also enables simple model switching across business scenarios at different times without operations staff, improving application efficiency.
In a specific implementation, developers can build the pre-training model's combination structures in advance, i.e., provide multiple algorithm structures and structural parameters. On this basis, once business staff have identified the business objective and scenario — for example, classifying urge-delivery versus check-parcel intentions in the express-delivery domain, or recognizing anger and sadness in the general domain — different algorithm structures and structural parameters can be used to determine the pre-training model to use.
After the text intention recognition model is obtained in step 14, to ensure that the trained model meets usage requirements, the training method may further include: obtaining test data, testing the model with it, and judging whether it meets the standard; if the model meets the standard, ending the test and performing the corresponding online operation for it; if not, issuing a prompt message.
In implementation, after the trained text intention recognition model is obtained, the model can be loaded automatically and exposed through HTTP and gRPC interfaces, which completes its deployment.
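A minimal sketch of the HTTP side of such a deployment (the gRPC interface is omitted). The JSON payload shape and the keyword rule standing in for the loaded model are both assumptions:

```python
import json
from http.server import BaseHTTPRequestHandler

def predict(text: str) -> str:
    # Placeholder standing in for the loaded intent model.
    return "urge" if "hurry" in text else "check"

class PredictHandler(BaseHTTPRequestHandler):
    """Answers POST requests like {"text": "..."} with {"intent": "..."}."""
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps({"intent": predict(payload.get("text", ""))}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)
```

In the described workflow this loading and exposure happens automatically, so no operations staff are involved.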
After deployment, a test system is built. Test data prepared in advance by the business staff can then be input into the test system for testing, finally producing a test result. The result shows the overall test performance, including the recognition results and the data-labeling results, from which it is judged whether the current text intention recognition model meets the standard.
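The pass/fail judgment against labelled test data can be sketched as follows; the accuracy metric and the 0.9 threshold are assumptions — the application does not fix a particular standard:

```python
def evaluate(predictions: list, labels: list, threshold: float = 0.9) -> dict:
    """Compare model predictions with staff-labelled test data and decide
    whether the model 'meets the standard' against a threshold."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must align")
    correct = sum(p == l for p, l in zip(predictions, labels))
    accuracy = correct / len(labels) if labels else 0.0
    return {"accuracy": accuracy, "passed": accuracy >= threshold}
```

The returned dictionary corresponds to the "test result" carried by the prompt message when the model falls short.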
If the text intention recognition model does not meet the standard, the cause (for example, erroneous or confusing training data, overly similar labels, or structural parameters that need adjusting) can be identified from the test result, and a prompt message is issued to inform the business staff of the cause or of what needs to be adjusted.
In implementation, the prompt can be issued as text in a pop-up window, with the prompt carrying the test result.
After the prompt is issued, the training method may further include: detecting whether adjustment information is received; and, if adjustment information is received, continuing to train and test the text intention recognition model according to it until the model meets the standard. This simplifies the steps of creating and training the model, puts the tasks of testing and optimizing it in the hands of the business department that actually uses it, saves communication time, and reduces labor cost.
The adjustment information consists of the training-data or structural-parameter changes the business staff make to the text intention recognition model in response to the prompt. In implementation, after the adjustment information is obtained, the model can be retrained accordingly, and the training and testing steps repeat until the model meets the standard.
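The train-test-adjust cycle can be sketched as a loop. Here `train_fn`, `test_fn`, and `get_adjustment` are injected placeholders for the real training, testing, and staff-input steps, and the `max_rounds` guard is our own addition to keep the sketch from looping forever:

```python
def train_until_passing(train_fn, test_fn, get_adjustment, max_rounds=10):
    """Train, test, and retrain with staff adjustments until the model
    meets the standard (or the round budget is exhausted)."""
    model = train_fn(None)                  # initial training run
    for _ in range(max_rounds):
        if test_fn(model):                  # model meets the standard
            return model
        adjustment = get_adjustment()       # staff data/parameter changes
        if adjustment is None:              # no adjustment received: stop
            break
        model = train_fn(adjustment)        # retrain with the adjustment
    raise RuntimeError("model did not reach the standard")
```

Injecting the three callables keeps the loop testable without any real model.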
Once a model that meets the standard is obtained, the corresponding online operation can be performed on it. When the model goes online, its URL can be provided to the business system for external services, so no separate server needs to be requested and no operations staff need to be involved; multiple versions can be deployed simultaneously and switched at any time.
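Multi-version deployment with switch-at-any-time, as described above, might look like the following registry sketch; the class, method names, and URLs are hypothetical:

```python
class ModelRegistry:
    """Track deployed model versions by URL and which one is live."""
    def __init__(self):
        self._versions = {}
        self._live = None

    def publish(self, version: str, url: str) -> None:
        self._versions[version] = url
        if self._live is None:      # first published version goes live
            self._live = version

    def switch(self, version: str) -> None:
        if version not in self._versions:
            raise KeyError(f"unknown version: {version}")
        self._live = version        # switch (or roll back) at any time

    def live_url(self) -> str:
        return self._versions[self._live]

reg = ModelRegistry()
reg.publish("v1", "http://models.internal/intent/v1")  # hypothetical URLs
reg.publish("v2", "http://models.internal/intent/v2")
reg.switch("v2")
```

Because the business system only holds the live URL, rollback is a `switch` call rather than a redeployment.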
Based on the same technical concept, this embodiment provides a training apparatus for a text intention recognition model. As shown in Fig. 2, the apparatus may include: an obtaining module 201, configured to acquire a business objective and a business scenario; a first determining module 202, configured to determine the algorithm structure of the deep learning model according to the business objective and the business scenario; a second determining module 203, configured to obtain structural parameters and determine a pre-training model according to the structural parameters and the algorithm structure; and a training module 204, configured to acquire training data and train the pre-training model with it to obtain a text intention recognition model.
Optionally, the training apparatus may further include a testing module configured to: obtain test data, test the text intention recognition model with the test data, and judge whether the model meets the standard; if the model meets the standard, end the test and perform the corresponding online operation for it; if not, issue a prompt message.
Optionally, after issuing the prompt message, the testing module may be further configured to: detect whether adjustment information is received; and, if adjustment information is received, continue training and testing the text intention recognition model according to it until the model meets the standard.
Optionally, when obtaining test data, testing the text intention recognition model with it, and judging whether the model meets the standard, the testing module may specifically be configured to: load the text intention recognition model and build a test system; acquire test data and input it into the test system to obtain a test result; and judge, from the test result, whether the model meets the standard.
Optionally, when issuing the prompt message, the testing module may specifically be configured to display text prompt information in a pop-up window, with the prompt carrying the test result.
For a specific implementation of the training apparatus for a text intention recognition model provided in this embodiment, refer to the implementation of the training method described in any of the above embodiments; details are not repeated here.
An embodiment of the present application further provides a training device. As shown in Fig. 3, the device may include: a processor 301 and a memory 302 connected to the processor 301; the memory 302 is used to store a computer program, and the processor 301 is configured to call and execute the computer program in the memory 302 to perform the training method of the text intention recognition model described in any of the above embodiments.
For a specific implementation of the training device for a text intention recognition model provided in this embodiment, refer to the implementation of the training method described in any of the above embodiments; details are not repeated here.
Embodiments of the present application also provide a storage medium storing a computer program which, when executed by a processor, implements the steps of the training method of the text intention recognition model according to any of the above embodiments.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar parts in other embodiments may be referred to for the content which is not described in detail in some embodiments.
It should be noted that in the description of the present application, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance. Furthermore, unless otherwise specified, "a plurality" means at least two.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present application also includes implementations in which functions are executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functionality involved, as will be understood by those skilled in the art.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.
Claims (5)
1. A training method of a text intention recognition model, characterized by comprising the following steps:
acquiring a business objective and a business scenario;
determining an algorithm structure of a deep learning model according to the business objective and the business scenario;
obtaining structural parameters, and determining a pre-training model according to the structural parameters and the algorithm structure;
and acquiring training data, and training the pre-training model with the training data to obtain a text intention recognition model.
2. The training method of claim 1, wherein after the text intention recognition model is obtained, the method further comprises:
obtaining test data, testing the text intention recognition model with the test data, and judging whether the model meets the standard;
if the text intention recognition model meets the standard, ending the test and performing the corresponding online operation for the model;
if not, issuing a prompt message.
3. The training method of claim 2, wherein after the prompt message is issued, the method further comprises:
detecting whether adjustment information is received;
and if adjustment information is received, continuing to train and test the text intention recognition model according to the adjustment information until the model meets the standard.
4. The training method of claim 2, wherein the obtaining test data, testing the text intention recognition model with the test data, and judging whether the model meets the standard comprises:
loading the text intention recognition model and building a test system;
acquiring test data, and inputting it into the test system to obtain a test result;
and judging, according to the test result, whether the text intention recognition model meets the standard.
5. The training method of claim 4, wherein the issuing a prompt message comprises:
displaying text prompt information in a pop-up window, with the prompt carrying the test result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110534484.8A CN113240088A (en) | 2021-05-17 | 2021-05-17 | Training method of text intention recognition model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113240088A true CN113240088A (en) | 2021-08-10 |
Family
ID=77134710
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110534484.8A Pending CN113240088A (en) | 2021-05-17 | 2021-05-17 | Training method of text intention recognition model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113240088A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN114268661A (granted as CN114268661B on 2024-04-30) | 2021-11-19 | 2022-04-01 | iFLYTEK Co., Ltd. | Service scheme deployment method, device, system and equipment
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102017229B1 (en) * | 2019-04-15 | 2019-09-02 | 미디어젠(주) | A text sentence automatic generating system based deep learning for improving infinity of speech pattern |
CN111783881A (en) * | 2020-07-01 | 2020-10-16 | 上海天壤智能科技有限公司 | Scene adaptation learning method and system based on pre-training model |
CN112036550A (en) * | 2020-09-04 | 2020-12-04 | 平安科技(深圳)有限公司 | Client intention identification method and device based on artificial intelligence and computer equipment |
CN112070086A (en) * | 2020-09-09 | 2020-12-11 | 平安科技(深圳)有限公司 | Method for optimizing text recognition system, computer device, and storage medium |
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination