CN112632271B - Text classification service deployment method, device, equipment and computer storage medium

Info

Publication number: CN112632271B
Authority: CN (China)
Prior art keywords: training, model, classification, text data, fine
Prior art date: 2019-10-08
Legal status: Active
Application number: CN201910948777.3A
Other languages: Chinese (zh)
Other versions: CN112632271A (en)
Inventor: 储晶星, 齐希, 施文驰, 朱骏, 傅一平
Current Assignee: China Mobile Zhejiang Innovation Research Institute Co ltd, China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd
Original Assignee: China Mobile Communications Group Co Ltd, China Mobile Group Zhejiang Co Ltd
Priority date: 2019-10-08
Filing date: 2019-10-08
Application filed by China Mobile Communications Group Co Ltd and China Mobile Group Zhejiang Co Ltd
Priority to CN201910948777.3A
Publication of CN112632271A
Application granted
Publication of CN112632271B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the invention relates to the technical field of business support, and discloses a text classification service deployment method, device, equipment and computer storage medium, wherein the method comprises the following steps: acquiring training text data; obtaining a fine-tuning pre-training model according to the training text data; acquiring marked text data of each scene type; obtaining semantic vector data of each scene type according to the marked text data of each scene type and the fine-tuning pre-training model; respectively training a classification neural network model according to the semantic vector data of each scene type to obtain a classification model of each scene type; and classifying the texts of each scene type according to the classification model of that scene type and the fine-tuning pre-training model. In this manner, the embodiment of the invention accelerates the deployment of classification models.

Description

Text classification service deployment method, device, equipment and computer storage medium
Technical Field
The embodiment of the invention relates to the technical field of business support, and in particular to a text classification service deployment method, device, equipment and computer storage medium.
Background
Text classification refers to selecting a suitable classification method and auxiliary algorithm according to the application scenario of the text, selecting feature values for the text, and training a classification model on the text to be classified and its feature words, so as to recognize and classify the target text.
In carrying out embodiments of the present invention, the inventors found that current classification models are generally custom-built for a specific business scenario, for example classifying customer complaint content or classifying web page comments. Each scenario needs to establish its own set of models, covering steps such as Chinese word segmentation, word embedding, model training and online service deployment; that is, every new service must complete all of these steps before it has a usable classification model, which is inefficient.
Disclosure of Invention
In view of the foregoing, embodiments of the present invention provide a text classification service deployment method, apparatus, device, and computer storage medium, which overcome or at least partially solve the foregoing problems.
According to an aspect of an embodiment of the present invention, there is provided a text classification service deployment method, the method including: acquiring training text data; obtaining a fine tuning pre-training model according to the training text data; acquiring marked text data of each scene type; obtaining semantic vector data of each scene type according to the marked text data of each scene type and the fine tuning pre-training model; respectively training a classification neural network model according to the semantic vector data of each scene type to obtain a classification model of each scene type; and classifying the texts of the scene types according to the classification model of the scene types and the fine tuning pre-training model.
In an optional manner, the obtaining a fine-tuning pre-training model according to the training text data specifically includes: training a pre-training neural network model according to the training text data to obtain a pre-training model; and combining the pre-training model with a preset corpus to obtain the fine-tuning pre-training model.
In an optional manner, the obtaining a fine-tuning pre-training model according to the training text data specifically includes: training a pre-training neural network model according to the training text data to obtain a pre-training model; obtaining training semantic vector data according to the training text data and the pre-training model; training a dimension reduction neural network model according to the training semantic vector data to obtain a dimension reduction model; and combining the pre-training model and the dimension reduction model to obtain the fine-tuning pre-training model.
In an optional manner, the classifying the texts of the scene types according to the classification model of the scene types and the fine-tuning pre-training model specifically includes: respectively packaging the classification model and the fine-tuning pre-training model of each scene type in a Docker container to obtain the Docker container of each scene type; and respectively associating the Docker containers of the scene types with application programs of different scene types, so as to respectively acquire text data of the application programs of the scene types and classify the text data.
According to another aspect of the embodiment of the present invention, there is provided a text classification service deployment apparatus, including: the system comprises a pre-training layer and a service layer, wherein the service layer comprises a plurality of Docker containers, and each Docker container is packaged with a classification model and a fine tuning pre-training model; the pre-training layer is used for acquiring training text data and obtaining the fine-tuning pre-training model according to the training text data; the fine tuning pre-training model in each Docker container is respectively used for acquiring the marked text data of each scene type to obtain semantic vector data of each scene type; the semantic vector data of each scene type are respectively used for training a classification neural network model to obtain classification models in each Docker container; the classification model and the fine-tuning pre-training model in each Docker container are used for classifying texts of different scene types.
In an alternative manner, the pre-training layer includes a pre-training model and a pre-set corpus; the pre-training model is obtained by training a pre-training neural network model through the training text data; the preset corpus is used for combining with the pre-training model to obtain the fine-tuning pre-training model.
In an alternative manner, the pre-training layer includes a pre-training model and a dimension-reduction model; the pre-training model is obtained by training a pre-training neural network model through the training text data; the pre-training model is used for obtaining training semantic vector data according to the training text data; the training semantic vector data is used for training a dimension reduction neural network model to obtain the dimension reduction model, and the dimension reduction model is used for combining with the pre-training model to obtain the fine-tuning pre-training model.
In an alternative manner, the apparatus further comprises an application layer, the application layer comprising application programs of each scene type; and each application program is respectively used for associating each Docker container and sending text data of the corresponding scene type to the Docker container, so that the classification model and the fine-tuning pre-training model in the Docker container classify the text data.
According to another aspect of the embodiment of the present invention, there is provided a text classification service deployment apparatus, including: the device comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete communication with each other through the communication bus; the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the text classification service deployment method.
According to still another aspect of the embodiments of the present invention, there is provided a computer storage medium having at least one executable instruction stored therein, where the executable instruction causes the processor to perform operations corresponding to the above-mentioned text classification service deployment method.
According to the embodiment of the invention, a neural network model is trained with training text data to obtain a fine-tuning pre-training model; marked text data of different scene types are then input into the fine-tuning pre-training model in turn to obtain semantic vector data of each scene type, and supervised training is performed on different classification neural network models with the semantic vector data of the different scene types to obtain a classification model for each scene type. Finally, when texts are classified, texts of different scene types can be input into the same fine-tuning pre-training model to obtain semantic vector data of the different scene types, and the semantic vector data of each scene type can then be input into the classification model of that scene type to obtain the corresponding classification result. Compared with the prior art, the embodiment of the invention separates out the word embedding step of building a classification model, so that the classification models of different scene types share the same fine-tuning pre-training model; the word embedding step, that is, the process of establishing the fine-tuning pre-training model, does not need to be repeated when classification models of different scene types are built, which accelerates the deployment of classification models. In addition, the developer of a model no longer needs to take part in the word embedding step of establishing a classification model, but only in the generation of the classification model itself, which reduces the developer's workload.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means of the embodiments can be understood more clearly and implemented according to the content of the specification, specific embodiments of the present invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
Fig. 1 shows a flowchart of a text classification service deployment method according to an embodiment of the present invention;
Fig. 2 shows a flowchart of the sub-steps of determining a fine-tuning pre-training model in another embodiment of the present invention;
Fig. 3 shows a flowchart of the sub-steps of determining a fine-tuning pre-training model in a further embodiment of the present invention;
Fig. 4 shows a flowchart of the sub-steps of classifying text of each scene type in an embodiment of the present invention;
Fig. 5 shows a schematic structural diagram of a text classification service deployment device according to an embodiment of the present invention;
Fig. 6 shows a schematic structural diagram of a text classification service deployment device according to another embodiment of the present invention;
Fig. 7 shows a schematic structural diagram of a text classification service deployment device according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The current classification models for classifying text are generally custom-built for a specific service scenario, and the establishment process comprises several steps such as word segmentation, word embedding, model training and online service deployment. The word embedding step refers to the process of converting Chinese text into semantic vectors. In the technical field of natural language processing, before natural language is handed to a machine learning algorithm for processing, it must first be expressed mathematically, and a semantic vector is one way to express natural language mathematically. Early approaches to representing natural language mathematically were based mainly on shallow machine learning and statistics, and used one-hot encoding (also called one-of-V, V being the size of the dictionary) or distributed approaches (e.g., bag-of-words combined with word frequency, co-occurrence information, TF-IDF, or entropy) to give a mathematical representation of sentences. The main disadvantages of such representations are that they cannot express the semantics of the language units in a sentence (such as characters, words, or n-gram phrases) or the relations between them (for example, the vector inner product of any two different words is 0), and they easily suffer from high-dimensional sparsity. Therefore, word embedding, i.e., training a mathematical representation of words through a neural network, is now commonly employed. The main idea of word embedding is to map words into continuous d-dimensional real vectors that carry semantic information. Existing research has demonstrated that word embedding characterizes the grammar and semantics of text better, can be combined with deep neural networks, and further improves the accuracy of model classification. However, while improving accuracy, the word embedding step also requires a large amount of training data to train the neural network model. At present, the word segmentation and word embedding steps must be executed anew every time a classification model for a new service scenario is established, which is cumbersome and inefficient and hinders the rapid deployment of text classification models for multiple service scenarios. Therefore, the embodiment of the invention provides a text classification service deployment method that can establish multiple classification models for different business scenarios more efficiently and quickly, so as to better provide text classification services for these scenarios.
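For ease of understanding, the contrast between the two kinds of representation can be sketched in a few lines of Python; the vocabulary, the dimension d and the framework (PyTorch) below are purely illustrative and are not part of the claimed method:

```python
import torch
import torch.nn as nn

# Hypothetical 5-word vocabulary, invented for illustration only.
vocab = {"world": 0, "hello": 1, "complaint": 2, "severe": 3, "comment": 4}

# One-hot (one-of-V) representation: dimension V equals the dictionary size,
# and the inner product of any two different words is 0, i.e. no semantics.
one_hot = torch.eye(len(vocab))
print(one_hot[vocab["world"]] @ one_hot[vocab["hello"]])  # tensor(0.)

# Word embedding: each word is mapped to a continuous d-dimensional real
# vector carrying semantic information once the table is trained as part
# of a neural network (here the weights are still random).
d = 4
embedding = nn.Embedding(num_embeddings=len(vocab), embedding_dim=d)
vectors = embedding(torch.tensor([vocab["world"], vocab["hello"]]))
print(vectors.shape)  # torch.Size([2, 4])
```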
Embodiments of the present invention will be described below with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 shows a flowchart of a text classification service deployment method according to an embodiment of the present invention, where the method includes the following steps:
step S110: training text data is acquired.
Step S120: and obtaining a fine tuning pre-training model according to the training text data.
The training text data is the training data used to train a neural network model in the word embedding step. The neural network model can be a general BERT model; during training, the general BERT model can be trained in a seq2seq (sequence-to-sequence) manner with an attention mechanism, so that each character of the training text data is read in sequentially without word segmentation when text is trained and converted. This removes the word segmentation step from the original procedure for establishing a classification model and improves efficiency. After the BERT model is trained with the training text data, the weight values in the BERT model are determined, yielding the fine-tuning pre-training model. Text data input into the fine-tuning pre-training model then produces accurate semantic vectors, that is, mathematical representations of natural language. For example, if the text data input into the fine-tuning pre-training model is "world hello", which consists of the two words "world" and "hello", the fine-tuning pre-training model outputs a two-dimensional semantic vector [0.1253 0.2536].
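For illustration, the word-embedding step described above might look as follows; the Hugging Face transformers library, the bert-base-chinese checkpoint and the mean-pooling of token vectors into a single sentence vector are all assumptions, since the embodiment only specifies "a general BERT model":

```python
import torch
from transformers import BertTokenizer, BertModel

# Checkpoint name is an assumption; any general BERT model would do.
tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")
model.eval()

text = "世界你好"  # "world hello"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Pool the per-token vectors into one fixed-size semantic vector, i.e.
# the mathematical representation of the natural-language input.
semantic_vector = outputs.last_hidden_state.mean(dim=1).squeeze(0)
print(semantic_vector.shape)  # torch.Size([768]) for bert-base models
```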
Furthermore, in order to improve the accuracy with which the fine-tuning pre-training model converts text, the amount of training text data required is generally large. The training process therefore takes a long time and consumes substantial computing resources, so it needs to be deployed on an independent GPU (Graphics Processing Unit) server: a BERT model is built on the independent GPU server and trained there to obtain the fine-tuning pre-training model.
Step S130: and acquiring the marked text data of each scene type.
Step S140: and obtaining semantic vector data of each scene type according to the marked text data of each scene type and the fine tuning pre-training model.
Step S150: and training a classification neural network model according to the semantic vector data of each scene type to obtain a classification model of each scene type.
After the fine-tuning pre-training model is trained, training of the classification neural network models can begin in order to obtain the classification models. Because the text data of different service scenarios differ greatly, text data of different scenarios cannot be classified by the same classification model; instead, a developer for the corresponding scenario must select a suitable neural network model and auxiliary algorithm and train them with sample data from that scenario. For example, in a business scenario that classifies customer complaint content, the complaint data need to be classified by the severity of the complaint. In a business scenario that classifies web page comments, the comments need to be classified by the number of sensitive words they contain. Clearly, words expressing the severity of complaints and sensitive words contained in comments have no relation to each other, so text data in these two scenarios cannot be classified by the same classification model.
In the process of establishing a classification model, business personnel can provide a batch of marked text data according to the requirements of the business scenario, and the marked text data can be input into the fine-tuning pre-training model to obtain the corresponding semantic vector data. Marked text data refers to text data carrying a label of the text type to which it belongs. For example, the labels of a customer complaint scenario may be "severe" or "not severe", and the labels of a web page comment scenario may be "sensitive" or "not sensitive". The semantic vector data are then used to train a classification neural network model, which, together with the auxiliary algorithm used for training, is chosen by the developer of the corresponding scenario. After training, the classification model of that business scenario is obtained. If classification models for other business scenarios are needed, the above steps are simply repeated with the marked text data of those scenarios. The same fine-tuning pre-training model can be reused when the steps are repeated, but the classification neural network model to be trained must be chosen anew by the developers of the other business scenarios.
Specifically, when training the classification neural network model for a scenario, the architecture of the classification neural network model is first constructed and its weights are initialized. The semantic vector data corresponding to that scenario are divided into several groups; one group is input into the classification neural network model, and a prediction result for each semantic vector in the group is obtained with the initialized weights. The prediction results are then compared against the text types marked in the marked texts corresponding to the semantic vectors, and a loss function value is calculated. Finally, the weights of the classification neural network model are adjusted according to the magnitude of the loss function value and another group of semantic vector data is input, until the loss function value reaches its minimum. The weights determined when the loss function value is smallest are taken as the weights of the classification neural network model, giving the final classification model.
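A minimal sketch of this per-scenario training loop is given below; the architecture, optimizer, learning rate, dimensions and number of classes are illustrative assumptions, since in the embodiment they are chosen by the developer of each scenario:

```python
import torch
import torch.nn as nn

class SceneClassifier(nn.Module):
    """Illustrative classification neural network for one scene type."""
    def __init__(self, in_dim=768, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, num_classes)
        )

    def forward(self, x):
        return self.net(x)

def train_scene_model(vector_groups, label_groups, in_dim=768, num_classes=2):
    """vector_groups/label_groups: the scene's semantic vector data and
    marked text types, divided into several groups as described above."""
    model = SceneClassifier(in_dim, num_classes)   # weights initialized here
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    best_loss, best_state = float("inf"), None
    for vectors, labels in zip(vector_groups, label_groups):
        logits = model(vectors)            # prediction for each semantic vector
        loss = loss_fn(logits, labels)     # compare against the marked types
        optimizer.zero_grad()
        loss.backward()                    # adjust weights by the loss value
        optimizer.step()
        if loss.item() < best_loss:        # keep the weights with minimal loss
            best_loss = loss.item()
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```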
Step S160: and classifying the texts of the scene types according to the classification model of the scene types and the fine tuning pre-training model.
After the final classification model is obtained, classification of text data with the classification model can begin. Real text data is first input into the fine-tuning pre-training model to obtain a semantic vector; the semantic vector is then input into the classification model corresponding to the scene type to which the real text data belongs, which yields the probability of each text type under that scene type; finally, the text type with the highest probability is taken as the text type of the real text data.
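The inference flow just described can be summarized in a short sketch; fine_tuned_model, scene_classifier and label_names are assumed stand-ins for the two trained models and the scene's text types, not names taken from the embodiment:

```python
import torch

def classify(text, fine_tuned_model, scene_classifier, label_names):
    semantic_vector = fine_tuned_model(text)    # word-embedding step
    logits = scene_classifier(semantic_vector)  # scene-specific model
    probs = torch.softmax(logits, dim=-1)       # probability per text type
    return label_names[int(probs.argmax())]     # highest-probability type
```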
For the above step S120, there may be a plurality of implementation manners, as shown in fig. 2, which shows a flowchart of sub-steps for determining a fine-tuning pre-training model in another embodiment of the present invention, the step S120 may be:
step S121: training a pre-training neural network model according to the training text data to obtain a pre-training model.
Step S122: and combining the pre-training model with a preset corpus to obtain the fine-tuning pre-training model.
The pre-training neural network model may be the BERT model of step S120 in the above embodiment, but in this embodiment the trained BERT model does not directly yield the fine-tuning pre-training model; a pre-training model is obtained first. The pre-training model is the same as the fine-tuning pre-training model of step S120 in the above embodiment, but to improve the accuracy of the classification model established later, this embodiment adds a preset corpus, which is combined with the pre-training model to form the final fine-tuning pre-training model. The preset corpus consists of a dedicated dictionary and recognition rules. When a classification model is subsequently generated with the fine-tuning pre-training model formed from the pre-training model and the preset corpus, marked text data input into the preset corpus can be matched against its dedicated dictionary, and corresponding feature vectors are generated according to the recognition rules. At the same time, the marked text data can be input into the pre-training model to obtain semantic vectors; in the subsequent step, the feature vectors and the semantic vectors are jointly input into the classification neural network model, and the final classification model is obtained by training it. Compared with the previous embodiment, the classification neural network model receives more input during training, and since the feature vector and the semantic vector both represent features of the input text, the finally trained classification model is more accurate.
It will be appreciated that the foregoing preset corpus may be customized according to the service requirements of the scene type to which the marked text data belongs, i.e., each scene type corresponds to its own fine-tuning pre-training model, comprising a different preset corpus but an identical pre-training model.
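One way such a preset corpus could work is sketched below; the dictionary entries, categories and hit-counting recognition rule are invented for illustration, and pretrained_model is an assumed callable returning the semantic vector of the pre-training model:

```python
import torch

# Illustrative stand-in for a dedicated dictionary: maps special words
# (here Chinese complaint terms) to a category of the business scenario.
SPECIAL_DICT = {"爆炸": "severe", "断网": "severe", "缓慢": "mild"}
CATEGORIES = ["severe", "mild"]

def corpus_feature_vector(text):
    """Recognition rule: one feature per category, counting how many of
    that category's dictionary words appear in the text."""
    counts = {c: 0.0 for c in CATEGORIES}
    for word, category in SPECIAL_DICT.items():
        if word in text:
            counts[category] += 1.0
    return torch.tensor([counts[c] for c in CATEGORIES])

def fine_tuned_vector(text, pretrained_model):
    """Joint input for the classification neural network: the semantic
    vector and the corpus feature vector, concatenated."""
    return torch.cat([pretrained_model(text), corpus_feature_vector(text)], dim=-1)
```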
In some embodiments, the step S120 may be implemented in another manner, as shown in fig. 3, which shows a flowchart of sub-steps for determining a fine-tuning pre-training model in still another embodiment of the present invention, where the step S120 may further be:
step S201: training a pre-training neural network model according to the training text data to obtain a pre-training model.
Step S202: and obtaining training semantic vector data according to the training text data and the pre-training model.
Step S203: and training the dimension reduction neural network model according to the training semantic vector data to obtain the dimension reduction model.
Step S204: and combining the pre-training model and the dimension reduction model to obtain the fine-tuning pre-training model.
As in the previous embodiments, this embodiment also obtains a pre-training model by training with the training text data. However, the semantic vectors produced by the pre-training model are usually of high dimension, which makes the computation excessive and inefficient when the classification neural network model is trained later; the semantic vectors therefore need dimension reduction, i.e., a dimension reduction model is added, which together with the pre-training model forms the fine-tuning pre-training model. Dimension reduction means reducing the dimension of a semantic vector, for example converting the three-dimensional semantic vector [0.1253 0.2536 0.22323] into the two-dimensional semantic vector [0.1253 0.2536]. However, the meaning of the text expressed by a semantic vector may drift during dimension reduction, so a dimension reduction neural network model needs to be trained by machine learning to obtain a dimension reduction model with high accuracy, i.e., one that does not change the principal meaning of the text expressed by a semantic vector while reducing its dimension. Specifically, after the pre-training model is obtained, the training text data is input into the pre-training model again to obtain training semantic vector data, and the training semantic vector data is used to train the dimension reduction neural network model, giving the dimension reduction model. The dimension reduction neural network model can be a Pooling/Dropout model.
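A minimal sketch of such a Pooling/Dropout-style dimension reduction module follows; the input/output dimensions, the dropout rate and the use of average pooling are assumptions, since the embodiment only names the model family:

```python
import torch.nn as nn

class DimReducer(nn.Module):
    """Reduces a semantic vector from in_dim to out_dim dimensions."""
    def __init__(self, in_dim=768, out_dim=128, p=0.1):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(out_dim)  # pooling shrinks the dimension
        self.dropout = nn.Dropout(p)               # regularizes during training

    def forward(self, x):                  # x: (batch, in_dim)
        x = self.pool(x.unsqueeze(1))      # (batch, 1, out_dim)
        return self.dropout(x.squeeze(1))  # (batch, out_dim)
```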
In this embodiment of the invention, the dimension reduction model is added and combined with the pre-training model to form the final fine-tuning pre-training model, so that the dimension-reduced semantic vector data can be used for training when classification models are built later, reducing the computation required during machine learning and improving efficiency.
For the above step S160, as shown in fig. 4, a flowchart of sub-steps for classifying the text of each scene type in the embodiment of the present invention is shown, where step S160 is specifically:
step S161: and respectively packaging the classification model and the fine tuning pre-training model of each scene type in a Docker container to obtain the Docker container of each scene type.
Step S162: and respectively associating the Docker containers of the scene types with application programs of different scene types to respectively acquire text data of the application programs of the scene types, and classifying the text data.
The Docker container is an open-source application container engine and provides a solution for the rapid, automated deployment of application programs. Multiple Docker containers can be deployed on one physical machine, each isolated from the others. Each business scenario will typically correspond to one Docker container, which encapsulates the classification model corresponding to that business scenario and the unified fine-tuning pre-training model. However, if the fine-tuning pre-training model includes a preset corpus, the preset corpus is adjusted per service scene type, and the fine-tuning pre-training model contained in the Docker container of each scene type then differs between service scenarios.
After the Docker containers are packaged for the scene types, the Docker container of each scene type is associated with an application program of a different scene type. The application programs are programs providing text classification services in different service scenarios, such as a sensitive comment auditing program or a WeChat article classification program. These programs use the resources inside their associated Docker container through RESTful (Representational State Transfer) interface calls. That is, when text is entered into these programs, they invoke the classification model and the fine-tuning pre-training model inside their associated Docker container to classify the entered text, and finally output the classification results.
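Inside a container, the RESTful entry point might be as simple as the following sketch; Flask, the /classify route and the two placeholder model functions are all assumptions made for illustration:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-ins: in a real container these would be the packaged
# fine-tuning pre-training model and the scene's classification model.
def fine_tuned_model(text):
    return [0.0] * 768            # would return the text's semantic vector

def scene_classifier(vector):
    return "not severe"           # would return the predicted text type

@app.route("/classify", methods=["POST"])
def classify():
    text = request.get_json()["text"]
    vector = fine_tuned_model(text)     # word-embedding step
    label = scene_classifier(vector)    # scene-specific classification
    return jsonify({"label": label})

if __name__ == "__main__":
    # One such service runs in each scene type's Docker container and is
    # called by the associated application over its RESTful interface.
    app.run(host="0.0.0.0", port=5000)
```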
It should be noted that when the traffic of an application program is large, it can be associated with multiple Docker containers, each encapsulating the same classification model and fine-tuning pre-training model, so that text entered into the application program can be distributed across the containers; this reduces the burden on any single Docker container and speeds up classification. In addition, after a Docker container is packaged, training data can continue to be fed to it to train its classification model, so that the classification model is continuously updated and its accuracy maintained.
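The fan-out to multiple identical containers could look like the sketch below on the application side; the endpoint URLs and the round-robin policy are illustrative assumptions:

```python
import itertools
import requests

# Several identical Docker containers associated with one high-traffic
# application; hostnames and port are invented for the example.
CONTAINER_ENDPOINTS = itertools.cycle([
    "http://classifier-1:5000/classify",
    "http://classifier-2:5000/classify",
    "http://classifier-3:5000/classify",
])

def classify_text(text, timeout=5):
    """Send the text to the next container in rotation, easing the burden
    on any single Docker container and speeding up classification."""
    resp = requests.post(next(CONTAINER_ENDPOINTS), json={"text": text}, timeout=timeout)
    resp.raise_for_status()
    return resp.json()["label"]
```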
Fig. 5 shows a schematic structural diagram of a text classification service deployment device according to an embodiment of the present invention. As shown in fig. 5, the apparatus 100 includes a pre-training layer 10 and a service layer 20. The service layer 20 includes a plurality of Docker containers 21, and each Docker container 21 encapsulates a classification model 211 and a fine-tuning pre-training model 212. And the pre-training layer 10 is configured to obtain training text data, and obtain a fine-tuning pre-training model 212 according to the training text data. The fine tuning pre-training model 212 in each Docker container 21 is used for obtaining the marked text data of each scene type to obtain semantic vector data of each scene type; the semantic vector data of each scene type is respectively used for training a classification neural network model to obtain a classification model 211 in each Docker container 21; the classification model 211 and the fine-tuning pre-training model 212 within each Docker container 21 are used to classify text of different scene types, respectively.
Specifically, the pre-training layer 10 may be deployed on a separate GPU server, which contains a BERT model. A large amount of training text data is uploaded to the GPU server and the BERT model is trained to adjust the weight values of the BERT model, thereby obtaining the pre-training model 11. The pre-training model may be uploaded separately to another server deployed with the service layer 20 and replicated in multiple copies, and packaged as fine-tuning pre-training models 212 in respective Docker containers 21 of the service layer 20. And the classifying neural network models corresponding to different scene types are also packaged in each Docker container 21, and the classifying neural network models of different scene types can be trained respectively by respectively inputting the marked text data of different scene types into the Docker containers 21 so as to adjust the weight of the classifying neural network models in each Docker container 21 and determine the classifying model 211 in each Docker container 21. Then, the text data to be classified can be input into the Docker container 21 corresponding to the scene type, and an accurate text classification result can be obtained. If the accuracy of the classification result is not high, the labeled text data may be continuously input to each Docker container 21, and the weight of each classification model 211 may be readjusted to update the classification model 211.
In some embodiments, with continued reference to Fig. 5, the pre-training layer 10 may further include a preset corpus 12, which consists of a dedicated dictionary and recognition rules, and which is uploaded to the server on which the service layer 20 is deployed along with the pre-training model 11 and packaged together with it as the fine-tuning pre-training model 212 in each Docker container 21 of the service layer 20. Because the preset corpus 12 may be customized according to the type of business scenario, only the pre-training model 11 is duplicated into multiple copies after being uploaded to the server on which the service layer 20 is deployed; each preset corpus 12 then forms, with the same pre-training model 11, one of multiple different fine-tuning pre-training models 212, which are packaged in different Docker containers 21.
In other embodiments, as shown in fig. 6, which shows a schematic structural diagram of a text classification service deployment apparatus according to another embodiment of the present invention, the pre-training layer 10 may further include a dimension reduction model 13. As in the previous embodiment, the pre-training layer 10 will first train a BERT model to obtain the pre-training model 11 from the large amount of training text data uploaded. These training text data are then again input into the trained pre-training model 11, resulting in training semantic vector data. These training semantic vector data will then train the dimensionality reduction neural network model built on the pre-training layer 10 to adjust its weights, resulting in the dimensionality reduction model 13. The trained pre-training model 11 and the dimension-reduction model 13 are then uploaded together to a server deployed with the service layer 20, the pre-training model 11 and the dimension-reduction model 13 together form a fine-tuning pre-training model 212, and the fine-tuning pre-training model 212 is replicated into multiple parts and packaged in the respective Docker containers 21.
It will be appreciated that: in other embodiments, the pre-training layer 10 may also include the pre-training model 11, the dimension-reduction model 13 and the pre-set corpus 12, and the fine-tuning pre-training model 212 is formed by the pre-training model 11, the dimension-reduction model 13 and the pre-set corpus 12.
With continued reference to Figs. 5 and 6, the apparatus 100 further includes an application layer 30, where the application layer 30 includes application programs for each scene type, such as a sensitive comment auditing program or a WeChat article classification program. Each of these applications is used to associate one Docker container 21, or a plurality of identical Docker containers 21, and to send text data of its corresponding scene type to the Docker container 21, causing the classification model 211 and the fine-tuning pre-training model 212 within the Docker container 21 to classify the text data. Specifically, each application program uploads the text to be classified to the server on which the service layer 20 is deployed and dispatches it to its associated Docker container 21; the classification model 211 and the fine-tuning pre-training model 212 in the Docker container 21 classify the text and obtain a classification result, which the service layer 20 then returns to the corresponding application program.
The embodiment of the invention divides the training and online deployment of the whole text classification model 211 into a three-layer architecture: a pre-training layer 10, a service layer 20 and an application layer 30. The pre-training layer 10 executes the word embedding step in the process of creating the classification models 211 of all scene types and provides a unified fine-tuning pre-training model 212 for the classification model 211 of each scene type. The service layer 20 deploys a plurality of Docker containers 21, each encapsulating the foregoing fine-tuning pre-training model 212 together with the classification model 211 corresponding to a different scene type; creating these classification models 211 does not require repeating the word embedding step, since the same fine-tuning pre-training model 212 can convert the text into a digitized representation for all of them. Finally, application programs of the various scene types are deployed on the application layer 30 and associated with the respective Docker containers 21 on the service layer 20, so that the application programs directly call the corresponding Docker containers 21 to perform text classification. Compared with the prior art, the embodiment of the invention separates out the word embedding step of establishing the classification models 211, so that the classification models 211 of different scene types share the same fine-tuning pre-training model 212; the word embedding step, i.e., the process of establishing the fine-tuning pre-training model 212, does not need to be repeated when classification models 211 of different scene types are established, which accelerates the deployment of the classification models 211. In addition, since the training and online deployment of the classification models 211 are divided into three layers, a developer only needs to work at the service layer 20 rather than across the whole training and online deployment process of a classification model 211, which reduces the developer's burden.
The embodiment of the invention provides a non-volatile computer storage medium, which stores at least one executable instruction, and the computer executable instruction can execute the text classification service deployment method in any of the method embodiments.
Fig. 7 is a schematic structural diagram of a text classification service deployment device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the text classification service deployment device.
As shown in fig. 7, the text classification service deployment apparatus may include: a processor 202, a communication interface (Communications Interface) 204, a memory 206, and a communication bus 208.
Wherein: processor 202, communication interface 204, and memory 206 communicate with each other via communication bus 208. A communication interface 204 for communicating with network elements of other devices, such as clients or other servers. The processor 202 is configured to execute the program 210, and may specifically perform relevant steps in the above-described text classification service deployment method embodiment.
In particular, program 210 may include program code including computer-operating instructions.
The processor 202 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The one or more processors included in the text classification service deployment device may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
A memory 206 for storing a program 210. The memory 206 may comprise high-speed RAM memory and may further comprise non-volatile memory, such as at least one disk memory.
The program 210 may be specifically operable to cause the processor 202 to:
acquiring training text data;
obtaining a fine tuning pre-training model according to the training text data;
acquiring marked text data of each scene type;
obtaining semantic vector data of each scene type according to the marked text data of each scene type and the fine tuning pre-training model;
respectively training a classification neural network model according to the semantic vector data of each scene type to obtain a classification model of each scene type;
and classifying the texts of the scene types according to the classification model of the scene types and the fine tuning pre-training model.
In an alternative, the program 210 may be specifically further configured to cause the processor 202 to:
training a pre-training neural network model according to the training text data to obtain a pre-training model;
and combining the pre-training model with a preset corpus to obtain the fine-tuning pre-training model.
In an alternative, the program 210 may be specifically further configured to cause the processor 202 to:
training a pre-training neural network model according to the training text data to obtain a pre-training model;
obtaining training semantic vector data according to the training text data and the pre-training model;
training a dimension reduction neural network model according to the training semantic vector data to obtain a dimension reduction model;
and combining the pre-training model and the dimension reduction model to obtain the fine-tuning pre-training model.
In an alternative, the program 210 may be specifically further configured to cause the processor 202 to:
respectively packaging the classification model and the fine tuning pre-training model of each scene type in a Docker container to obtain the Docker container of each scene type;
and respectively associating the Docker containers of the scene types with application programs of different scene types to respectively acquire text data of the application programs of the scene types, and classifying the text data.
The embodiment of the invention provides an executable program which can execute the text classification service deployment method in any of the method embodiments.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments herein include some features but not others included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting their order of execution unless specifically stated.

Claims (8)

1. A text classification service deployment method, comprising:
acquiring training text data;
obtaining a fine tuning pre-training model according to the training text data;
acquiring marked text data of each scene type;
obtaining semantic vector data of each scene type according to the marked text data of each scene type and the fine tuning pre-training model;
respectively training a classification neural network model according to the semantic vector data of each scene type to obtain a classification model of each scene type;
respectively packaging the classification model and the fine tuning pre-training model of each scene type in a Docker container to obtain the Docker container of each scene type;
respectively associating the Docker containers of the scene types with application programs of different scene types to respectively obtain text data of the application programs of the scene types, and classifying the text data to obtain classification results; the application program is a program for providing text classification service under different service scenes.
2. The method of claim 1, wherein the obtaining a fine-tuning pre-training model based on the training text data is specifically:
training a pre-training neural network model according to the training text data to obtain a pre-training model;
and combining the pre-training model with a preset corpus to obtain the fine-tuning pre-training model.
3. The method of claim 1, wherein the obtaining a fine-tuning pre-training model based on the training text data is specifically:
training a pre-training neural network model according to the training text data to obtain a pre-training model;
obtaining training semantic vector data according to the training text data and the pre-training model;
training a dimension reduction neural network model according to the training semantic vector data to obtain a dimension reduction model;
and combining the pre-training model and the dimension reduction model to obtain the fine-tuning pre-training model.
4. A text classification service deployment apparatus, comprising a pre-training layer and a service layer, wherein the service layer comprises a plurality of Docker containers, and each Docker container is packaged with a classification model and a fine-tuning pre-training model;
the pre-training layer is used for acquiring training text data and obtaining the fine-tuning pre-training model according to the training text data; the fine-tuning pre-training model in each Docker container is used for acquiring the marked text data of the corresponding scene type to obtain semantic vector data of that scene type; the semantic vector data of each scene type are used for training a classification neural network model to obtain the classification model in each Docker container; the classification model and the fine-tuning pre-training model in each Docker container are used for classifying texts of different scene types;
the apparatus further comprises an application layer, wherein the application layer comprises application programs of the various scene types; each application program is used for associating with a corresponding Docker container and sending text data of the corresponding scene type to that Docker container, so that the classification model and the fine-tuning pre-training model in the Docker container classify the text data to obtain classification results; each application program is a program providing a text classification service in a different service scene.
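Claim 4's service layer can be pictured as one small HTTP service per Docker container, loading the packaged pair of models and exposing a classification endpoint to the associated application program. A sketch assuming Flask; the model file name and encode() wrapper carry over from the earlier illustrative sketches and are not specified by the patent:

```python
# Sketch: the process that could run inside one scene type's container.
import joblib
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)
clf = joblib.load("complaints_classifier.joblib")  # scene-specific model

def encode(texts):
    """Hypothetical wrapper around the packaged fine-tuning
    pre-training model; returns one semantic vector per text."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(texts), 768))

@app.route("/classify", methods=["POST"])
def classify():
    texts = request.get_json()["texts"]           # text data from the app
    labels = clf.predict(encode(texts)).tolist()  # classification results
    return jsonify({"labels": labels})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # one such service per scene type
```

Under this reading, the application layer needs only the container's address, e.g. a POST of {"texts": [...]} to /classify.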
5. The apparatus of claim 4, wherein the pre-training layer comprises a pre-training model and a pre-set corpus; the pre-training model is obtained by training a pre-training neural network model through the training text data; the preset corpus is used for combining with the pre-training model to obtain the fine-tuning pre-training model.
6. The apparatus of claim 4, wherein the pre-training layer comprises a pre-training model and a dimension-reduction model; the pre-training model is obtained by training a pre-training neural network model through the training text data; the pre-training model is used for obtaining training semantic vector data according to the training text data; the training semantic vector data is used for training a dimension reduction neural network model to obtain the dimension reduction model, and the dimension reduction model is used for combining with the pre-training model to obtain the fine-tuning pre-training model.
7. A text classification service deployment apparatus, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with each other through the communication bus;
the memory is configured to store at least one executable instruction that causes the processor to perform the text classification service deployment method of any one of claims 1-3.
8. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform the text classification service deployment method of any one of claims 1-3.
CN201910948777.3A 2019-10-08 2019-10-08 Text classification service deployment method, device, equipment and computer storage medium Active CN112632271B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910948777.3A CN112632271B (en) 2019-10-08 2019-10-08 Text classification service deployment method, device, equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN112632271A (en) 2021-04-09
CN112632271B (en) 2023-04-25

Family

ID=75283107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910948777.3A Active CN112632271B (en) 2019-10-08 2019-10-08 Text classification service deployment method, device, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN112632271B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113590903B (en) * 2021-09-27 2022-01-25 广东电网有限责任公司 Management method and device of information data
CN114020922B (en) * 2022-01-06 2022-03-22 智者四海(北京)技术有限公司 Text classification method, device and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11663409B2 (en) * 2015-01-23 2023-05-30 Conversica, Inc. Systems and methods for training machine learning models using active learning
CN109284385A (en) * 2018-10-15 2019-01-29 平安科技(深圳)有限公司 File classification method and terminal device based on machine learning
CN109710770A (en) * 2019-01-31 2019-05-03 北京牡丹电子集团有限责任公司数字电视技术中心 A kind of file classification method and device based on transfer learning

Also Published As

Publication number Publication date
CN112632271A (en) 2021-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231211

Address after: No. 19, Jiefang East Road, Hangzhou, Zhejiang Province, 310000

Patentee after: CHINA MOBILE GROUP ZHEJIANG Co.,Ltd.

Patentee after: China Mobile (Zhejiang) Innovation Research Institute Co.,Ltd.

Patentee after: CHINA MOBILE COMMUNICATIONS GROUP Co.,Ltd.

Address before: No. 19, Jiefang East Road, Hangzhou, Zhejiang Province, 310016

Patentee before: CHINA MOBILE GROUP ZHEJIANG Co.,Ltd.

Patentee before: CHINA MOBILE COMMUNICATIONS GROUP Co.,Ltd.