WO2021243575A1 - Text information classification method, mobile terminal, and computer-readable storage medium - Google Patents

Text information classification method, mobile terminal, and computer-readable storage medium

Info

Publication number
WO2021243575A1
Authority
WO
WIPO (PCT)
Prior art keywords
text information
intention
classification
model
vertical
Prior art date
Application number
PCT/CN2020/093987
Other languages
French (fr)
Chinese (zh)
Inventor
林浩智
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司 filed Critical 深圳市欢太科技有限公司
Priority to PCT/CN2020/093987 priority Critical patent/WO2021243575A1/en
Priority to CN202080098028.7A priority patent/CN115605861A/en
Publication of WO2021243575A1 publication Critical patent/WO2021243575A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification

Definitions

  • This application relates to the technical field of text classification, and in particular to a method for classifying text information, a mobile terminal, and a computer-readable storage medium.
  • With the development of text classification technology, text classification is applied in online and offline industrial scenarios, such as polarity analysis of product reviews in e-commerce, automatic archiving and classification of documents, inspection of sensitive topics in forums, and online user intent recognition for voice assistants.
  • This application provides a method for classifying text information, a mobile terminal, and a computer-readable storage medium to solve the problem of relatively slow text information classification.
  • A first aspect of the embodiments of the present application provides a method for classifying text information, including: obtaining text information; determining the vertical domain intention of the text information; when the vertical domain intention of the text information meets a set vertical domain intention, recalling the text information and then performing intention classification on the text information; and when the vertical domain intention of the text information does not meet the set vertical domain intention, rejecting the text information.
  • A second aspect of the embodiments of the present application provides a mobile terminal, including: an acquisition module for acquiring text information; a determining module for determining the vertical domain intention of the text information; a recall module for recalling the text information when the vertical domain intention of the text information meets the set vertical domain intention; an intention classification module for performing intention classification on the text information after the recall module recalls it; and a rejection module for rejecting the text information when the vertical domain intention of the text information does not meet the set vertical domain intention.
  • A third aspect of the embodiments of the present application provides a mobile terminal, including: a processor, a memory, and a computer program stored in the memory and running on the processor, where the processor is used to execute the computer program to implement the method provided in the first aspect of the embodiments of the present application.
  • the fourth aspect of the embodiments of the present application provides a computer-readable storage medium that stores a computer program, and the computer program can be executed by a processor to implement the method provided in the first aspect of the embodiments of the present application.
  • The beneficial effects of this application are as follows. Unlike the prior art, this application first determines the vertical domain intention of text information whose complexity is uncertain; when the vertical domain intention of the text information satisfies the set vertical domain intention, the text information is recalled and then classified by intention. Different classification methods are used for text information of uncertain complexity; because these methods are simpler and faster than a single complex model, text information of different recognition difficulty can quickly obtain a discrimination result at different levels, thereby speeding up intention classification. Therefore, the above method effectively avoids wasting the resources of a complex classification model and accelerates the classification of text information, which improves the classification and recognition speed of text information and reduces the computing resources occupied.
  • Fig. 1 is a schematic flowchart of a first embodiment of a text information classification method of the present application
  • FIG. 2 is a schematic flowchart of a specific implementation of step S12 shown in FIG. 1;
  • FIG. 3 is a schematic flowchart of a specific implementation of step S13 shown in FIG. 1;
  • FIG. 4 is a schematic flowchart of a specific implementation of step S33 shown in FIG. 3;
  • FIG. 5 is a schematic flowchart of another specific embodiment shown in FIG. 3;
  • Fig. 6 is a schematic flowchart of a second embodiment of a text information classification method of the present application.
  • FIG. 7 is a schematic block diagram of an embodiment of a mobile terminal of the present application.
  • FIG. 8 is a schematic block diagram of another embodiment of a mobile terminal of the present application.
  • Fig. 9 is a schematic block diagram of a circuit of an embodiment of a computer-readable storage medium of the present application.
  • FIG. 10 is a schematic structural composition diagram of an embodiment of a mobile terminal device of the present application.
  • As used in this specification and the appended claims, the term “if” can be interpreted, depending on the context, as “when”, “once”, “in response to determining” or “in response to detecting”. Similarly, the phrase “if determined” or “if [the described condition or event] is detected” can be interpreted, depending on the context, as meaning “once determined”, “in response to determining”, “once [the described condition or event] is detected” or “in response to detecting [the described condition or event]”.
  • FIG. 1 is a schematic flowchart of a first embodiment of a text information classification method of the present application. The method provided in this embodiment specifically includes the following steps:
  • Generally speaking, many smart devices are equipped with a voice recognition function, and voice assistants have become a common daily application. Voice assistant applications both involve relatively complex text classification operations and need to interact with users in real time.
  • Taking a mobile terminal as an example, when the mobile terminal starts the voice assistant application, the user's voice input can be obtained in real time and converted by the mobile terminal into the corresponding text information, so that the mobile terminal can obtain text information in real time.
  • In addition, when the mobile terminal is provided with an audio jack, a headset can be plugged into it, and text information can be obtained by collecting sound from the headset. If voice input is performed through a wireless headset, text information can also be obtained by collecting sound from the wireless headset. Of course, those skilled in the art can obtain text information in other ways known in the art.
  • Usually, a voice input is a single text information input, that is, the obtained text information is generally a single text input. However, because the voice assistant supports many functions, in practice multiple models for different vertical domains process this one text input in parallel; for example, if one model specializes in emotional question answering and another in system operations, both need to respond to the same text input. These two models are only an example; in practice dozens or even hundreds of models may process one text input at the same time, which consumes considerable computing resources. It is therefore particularly important to determine quickly which vertical domain the acquired text information belongs to, that is, which vertical domain intention it carries.
  • Therefore, the vertical domain intention of the acquired text information can first be coarsely selected. If it is determined that the vertical domain intention of the text information belongs to a certain approximate vertical domain intention, step S13 is entered, so that the classification result is returned quickly, which greatly saves subsequent computing resources.
  • The mobile terminal is preset with a set vertical domain intention, which is used to judge whether the vertical domain intention of the text information satisfies the set vertical domain intention. Because this module is more fine-grained, it can directly classify the vertical domain intention of the input text information into a specific intention within the vertical domain. Since the task of judging whether the vertical domain intention of the text information satisfies the set vertical domain intention is more detailed, this module is more complex than the module that determines the vertical domain intention of the text information. This module also has rejection capability, that is, text information that hits the rejection category is returned with a rejection intention.
  • When the vertical domain intention of the text information satisfies the set vertical domain intention, the text information is recalled and then classified by intention. Further, after the set vertical domain intention is satisfied, a corresponding interrupt signal can be generated to notify the mobile terminal to perform a corresponding operation.
  • Through the above method, for text information of uncertain complexity, this application first determines the vertical domain intention of the text information; when the vertical domain intention satisfies the set vertical domain intention, the text information is recalled and classified by intention. Different classification methods are used for text information of uncertain complexity; because these methods are simpler and faster than a single complex model, text information of different recognition difficulty can quickly obtain a discrimination result at different levels, thereby speeding up intention classification. Therefore, the above method effectively avoids wasting the resources of a complex classification model and accelerates the classification of text information, improving the classification and recognition speed of text information and reducing the computing resources occupied.
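  • To make the flow of steps S11 to S14 concrete, the following is a minimal Python sketch of the layered decision described above; the function names and the trivial stand-in classifiers are illustrative assumptions, not an API defined by this application.

```python
# Minimal sketch of the layered flow in steps S11-S14; all names are illustrative.
REJECT = "reject"

def classify_text(text, determine_vertical_intent, set_vertical_intents, intent_classifier):
    # S11: `text` is the obtained text information.
    vertical = determine_vertical_intent(text)      # S12: coarse vertical domain intention
    if vertical not in set_vertical_intents:        # S14: early rejection saves later computation
        return REJECT
    return intent_classifier(text, vertical)        # S13: recall + finer intention classification

# Usage with trivial stand-ins for the real modules:
vertical_of = lambda t: "clock" if "alarm" in t else None
fine_classifier = lambda t, v: "open_alarm_clock"
print(classify_text("set an alarm for 7 am", vertical_of, {"clock"}, fine_classifier))  # open_alarm_clock
```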
  • FIG. 2 is a schematic flowchart of a specific implementation of step S12 shown in FIG. 1, which specifically includes the following steps:
  • S21: Perform keyword matching between the text information and multiple keyword groups to obtain multiple matching degrees; wherein, each keyword group corresponds to a vertical domain intention;
  • This step mainly performs a coarse selection of the acquired text information, so a coarse recall module can be used to process it. The mobile terminal is preset with a model formed by multiple skill groups, such as multiple keyword groups, where each keyword group corresponds to a vertical domain intention and each keyword group is a regular expression formed by multiple keywords; such an expression can be used to coarsely select the vertical domain intention of the acquired text information, for example a relatively simple regular expression. A typical scenario for simple regular expressions is a skill group dedicated to tasks such as alarm clocks, countdowns, and schedules; for this type of task, “time” in the input text is an important element, so a simple regular expression pattern such as “{2,4}年{1,2}月{1,2}日” (a year-month-day pattern) can be used to check whether the input contains a time element. If the input text contains a time element together with task keywords such as “alarm clock” or “timing”, it can be recalled by the coarse recall module and enter the subsequent flow.
  • Because this module processes all text information input, it has high requirements on speed and computational complexity; therefore, simple regular expressions are used in this module to recall text information. By processing text information with simple regular expressions, the mobile terminal can quickly filter out the text information that belongs to this vertical domain intention, while text information that does not belong to it directly returns a rejected classification result, which greatly saves subsequent computing resources.
  • Each skill group preset by the mobile terminal processes the acquired text information at the same time to obtain multiple matching degrees, and the multiple matching degrees corresponding to the multiple keyword groups are sorted by relevance to obtain different rough screening results.
  • Furthermore, each vertical domain skill group has a coarse recall module. Matching only keywords or simple regular expressions requires far less computation than statistical models or deep learning models. Testing with data from the real production environment shows that both the average latency and the peak latency are improved by orders of magnitude compared with a single model.
  • From the coarse screening results obtained by relevance sorting, the vertical domain intention corresponding to the keyword group with the highest matching degree can be obtained, so that the vertical domain intention of the text information can be determined.
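  • As an illustration of steps S21 to S23, the sketch below scores the text against each keyword group's regular expression, sorts the matching degrees, and takes the best vertical domain intention; the keyword groups and the scoring rule are assumptions made for the example.

```python
import re

# Illustrative keyword groups; each maps one vertical domain intention to a simple
# regular expression formed from its keywords.
KEYWORD_GROUPS = {
    "clock": re.compile(r"alarm|countdown|schedule|\d{1,2}:\d{2}"),
    "music": re.compile(r"play|song|music"),
}

def coarse_recall(text):
    # S21: one matching degree per keyword group (here simply the number of hits).
    degrees = {domain: len(p.findall(text)) for domain, p in KEYWORD_GROUPS.items()}
    # S22: sort the matching degrees by relevance.
    ranked = sorted(degrees.items(), key=lambda kv: kv[1], reverse=True)
    # S23: the vertical domain intention of the best-matching group, otherwise rejection.
    best_domain, best_degree = ranked[0]
    return best_domain if best_degree > 0 else "reject"

print(coarse_recall("set an alarm for 7:30"))  # clock
```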
  • FIG. 3 is a schematic flowchart of a specific implementation of step S13 shown in FIG. 1, which specifically includes the following steps:
  • For the classification of vertical domain tasks, only part of the text information passes through the fast coarse recall module and reaches step S31, where a regular expression recall module, that is, a regular recall model, can be used.
  • Like the coarse recall module, this module uses regular expressions, which are fast to compute. The difference is that the coarse recall module only distinguishes whether a task belongs to this vertical domain, while the regular recall model is more fine-grained and can directly classify the input text information into a specific intention within the vertical domain, that is, the set vertical domain intention. Because the task is more detailed, the regular expressions used in this module are more complex than those of the coarse recall module. However, this module also has rejection capability, that is, text information that hits the rejection category is returned with a rejection intention.
  • When the vertical domain intention of the text information satisfies the set vertical domain intention, it means that the text information belongs to the set vertical domain intention; the text information is then recalled and a first intention classification is performed on it. The first intention classification includes at least one intention class and one rejection class, which indicates that the first intention classification has the ability to reject.
  • For the recalled text information, the text information can be input into the regular recall model so that the regular recall model performs the first intention classification on it, where the first intention classification includes at least one intention class and one rejection class.
  • Specifically, the text information is serially matched against the regular expression database of the regular recall model, so that the regular recall model performs the first intention classification on the text information and outputs a first intention classification result, where the first intention classification result includes at least one intention-class result and one rejection-class result.
  • For example, input text information related to an alarm clock is assigned to the alarm-clock-related regular expressions for serial matching. Because the regular recall model mainly handles high-frequency simple text information and more complex text information that is difficult for a model to classify, there are not too many regular expressions to match serially, and using regular expression matching to classify high-frequency text information saves computing time and resources. Of course, the regular recall model also processes ordinary text information.
  • When the text information meets one of the at least one intention class of the first intention classification result, go to step S32; when the text information meets neither the at least one intention class nor the rejection class of the first intention classification result, go to step S33; and when the text information meets the rejection class of the first intention classification, the text information is rejected.
  • In addition, users can add, delete, and modify the regular expressions used for recall in the model according to business needs. This makes the regular expressions in the regular recall model convenient and controllable; they can be used to quickly repair the handling of a small amount of specific text information and to quickly control the output results.
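  • The following sketch illustrates the regular recall model described above: serial matching against per-intention regular expressions, with a rejection category. The patterns, intention names, and their ordering are assumptions; the actual regular expression database is configured according to business needs.

```python
import re

# Hypothetical regular expression database of the regular recall model.
INTENT_PATTERNS = [
    ("open_alarm_clock", re.compile(r"(open|set).*alarm")),
    ("start_countdown",  re.compile(r"countdown.*\d+\s*(minute|second)s?")),
]
REJECT_PATTERNS = [re.compile(r"third[- ]party alarm")]

def regular_recall(text):
    for pattern in REJECT_PATTERNS:           # text hitting the rejection category returns the rejection intention
        if pattern.search(text):
            return "reject"
    for intent, pattern in INTENT_PATTERNS:   # serial matching against the regular expression database
        if pattern.search(text):
            return intent                     # first intention classification: a specific intention class
    return None                               # undecided: handed to the model-based second classification (S33)

print(regular_recall("please set an alarm for 7 am"))  # open_alarm_clock
```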
  • When the text information satisfies one of the at least one intention class of the first intention classification, slot extraction is performed on the text information.
  • That is, when the text information is either high-frequency simple text information or a more complex text input that is difficult for a model to classify, slot extraction is performed on it; which slots are extracted can be determined according to business requirements, for example the content of a message, a person's name, an amount, and so on.
  • After the filtering of the preceding models, if the acquired text information satisfies neither the at least one intention class nor the rejection class of the first intention classification, the mobile terminal considers that the text information requires a second intention classification; that is, it has been confirmed that the input text information belongs to the vertical domain intention. This part accounts for only a small proportion of all inputs, so a deep neural network of greater complexity and better effect can be used for the second intention classification.
  • FIG. 4 is a schematic flowchart of a specific implementation of step S33 shown in FIG. 3, which specifically includes the following steps:
  • The second intention classification includes at least one intention class and one rejection class; the at least one intention class that is set can be used to determine whether the second intention classification of the text information satisfies one of the at least one intention class. Performing the second intention classification on the text information includes:
  • the text information is input to the intent classification model, so that the intent classification model performs a second intent classification on the text information, and outputs a second intent classification result; wherein the second intent classification result includes at least one intent class and one rejection class.
  • The second intention classification module can use a Text Convolutional Neural Network (TextCNN) model, that is, a method of classifying text with a convolutional neural network. The text information entering this module is classified into N+1 categories: N categories corresponding to task intentions of this vertical domain, and one category for text that does not belong to this vertical domain, which provides the rejection capability, where N is a positive integer greater than or equal to 1.
  • The TextCNN model performs better than rule matching. According to business needs, besides the TextCNN model, more complex models can also be used, such as the Long Short-Term Memory (LSTM) model, the Gated Recurrent Unit (GRU) model, the Transformer model, the Bidirectional Encoder Representations from Transformers (BERT) model, and the Text-to-Text Transfer Transformer (T5) model. The TextCNN model is not a required choice and is used here only as a practical example; if the business can tolerate a longer response time, a more complex model can be chosen.
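  • As an illustration of the N+1-way classification described above, a minimal PyTorch TextCNN sketch is given below; all layer sizes and hyperparameters are assumptions, and the last class index serves as the rejection class.

```python
import torch
import torch.nn as nn

# Minimal (N+1)-way TextCNN sketch; class index N acts as the rejection class.
class TextCNN(nn.Module):
    def __init__(self, vocab_size, num_intents, emb_dim=128, num_filters=64, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList([nn.Conv1d(emb_dim, num_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_intents + 1)  # N intents + 1 rejection

    def forward(self, token_ids):                      # token_ids: (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)  # (batch, emb_dim, seq_len)
        pooled = [torch.max(torch.relu(conv(x)), dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))       # logits over N+1 classes

model = TextCNN(vocab_size=5000, num_intents=8)
logits = model(torch.randint(0, 5000, (1, 16)))
print(logits.argmax(dim=1))  # a prediction of class 8 would mean rejection
```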
  • If the text information satisfies one of the intention classes in the second intention classification result, step S42 is entered; if the text information satisfies the rejection class in the second intention classification result, step S43 is entered.
  • After the input text information has been processed by the above modules, it already has high classification accuracy and corresponding slot results.
  • When the text information satisfies one of the at least one intention class of the second intention classification result, slot extraction is performed on the text information.
  • Slot extraction on the text information will be explained in detail below.
  • When the text information meets the rejection class in the second intention classification result, it means that the text information belongs to the rejection class, and the rejection intention is returned; that is, text information determined not to belong to this vertical domain intention is returned with a rejected classification result, thereby saving subsequent computing resources.
  • FIG. 5 is a schematic flowchart of another specific embodiment shown in FIG. 3, which specifically includes the following steps:
  • This step S51 is similar to step S31 in FIG. 3, and the details will not be repeated here.
  • S52: Determine whether the first intention classification of the text information meets at least one intention class and one rejection class of the preset first intention classification;
  • the mobile terminal can set a preset first intention classification, and the preset first intention classification is used to determine whether the first intention classification of the text information meets the preset first intention classification.
  • The preset first intention classification includes at least one intention class and one rejection class. If it is judged that the first intention classification of the text information satisfies the preset first intention classification, step S55 is entered; if it is judged that it is not satisfied, step S53 is entered.
  • Steps S52 and S55 are similar to step S32 in FIG. 3, and the details are not repeated here.
  • Specifically, if the text information satisfies one of the at least one intention class of the first intention classification, step S55 is entered; when the text information satisfies the rejection class of the first intention classification, the rejection intention can be returned directly.
  • In this case, the vertical domain intention of the text information is determined again.
  • Here an intention recall model can be used. Its task is the same as that of the coarse recall module, that is, to reclassify the input text information and determine whether it belongs to the task intention of this vertical domain or not. Because the processing capability of regular expressions is limited, part of the text information is difficult to classify in the first two modules; such text information therefore needs to be processed again with the help of an intention recall model that has the generalization capability of a neural network.
  • The text information is input to the intention recall model so that the intention recall model determines the vertical domain intention of the text information. A confidence threshold can be set in the intention recall model and used to distinguish whether the text information belongs to the vertical domain intention. The set confidence threshold is adjustable; by adjusting it, the recall rate of text information can be increased, where the recall rate is the proportion of actually positive samples that are classified as positive.
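  • For reference, the recall rate mentioned here follows the standard definition (the original text states it only in words):

```latex
\mathrm{Recall} = \frac{TP}{TP + FN}
```

  • Here TP is the number of actually positive samples classified as positive and FN is the number of actually positive samples classified as negative.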
  • the intention recall model can determine the vertical intention of the text information according to the set confidence threshold, including:
  • If the confidence output by the intention recall model does not reach the set confidence threshold, the text information is determined to be rejected.
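  • A minimal sketch of the threshold-based decision is given below; the threshold value and intention names are assumptions, and in practice the probabilities would come from the intention recall model (for example a FastText or small CNN classifier).

```python
# Threshold-based recall vs. rejection on the intention recall model's output.
CONFIDENCE_THRESHOLD = 0.6  # adjustable; changing it trades precision against recall rate

def recall_or_reject(probabilities):
    """probabilities: mapping from vertical domain intention to model confidence."""
    best_intent = max(probabilities, key=probabilities.get)
    if probabilities[best_intent] >= CONFIDENCE_THRESHOLD:
        return best_intent   # recalled: treated as belonging to this vertical domain intention
    return "reject"          # confidence below the set threshold: determined to be rejected

print(recall_or_reject({"clock": 0.83, "music": 0.10}))  # clock
print(recall_or_reject({"clock": 0.41, "music": 0.35}))  # reject
```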
  • The intention recall model can be a FastText neural network model, which is a method for learning word embeddings and performing text classification. The intention recall model therefore uses the generalization ability of the FastText neural network model so that it can process text information inputs that it has not encountered before.
  • the computational complexity of the intention recall model is greater than that of regular expressions, but the accuracy rate is higher than that of regular expressions.
  • The intention recall model can also be a Convolutional Neural Network (CNN) model, which is a feed-forward neural network model.
  • the CNN model here mainly refers to the CNN model with fewer parameters and the CNN model with an attention module.
  • The attention module here is similar to standard attention: the Q, K, and V in the attention module can be linearly projected to a low dimension, such as 32 dimensions, before the attention computation is performed, which reduces the computational complexity while still obtaining part of the attention capability.
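  • The low-dimensional attention idea can be sketched as follows; the input dimension, the 32-dimensional projection, and the single-head layout are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Q, K, V are projected to a small dimension (e.g. 32) before attention is computed,
# which lowers the cost of the attention step while keeping part of its ability.
class LowDimAttention(nn.Module):
    def __init__(self, in_dim=256, attn_dim=32):
        super().__init__()
        self.q = nn.Linear(in_dim, attn_dim)
        self.k = nn.Linear(in_dim, attn_dim)
        self.v = nn.Linear(in_dim, attn_dim)
        self.scale = attn_dim ** 0.5

    def forward(self, x):                                    # x: (batch, seq_len, in_dim)
        q, k, v = self.q(x), self.k(x), self.v(x)            # each: (batch, seq_len, attn_dim)
        weights = torch.softmax(q @ k.transpose(1, 2) / self.scale, dim=-1)
        return weights @ v                                   # (batch, seq_len, attn_dim)

out = LowDimAttention()(torch.randn(2, 10, 256))
print(out.shape)  # torch.Size([2, 10, 32])
```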
  • step S55 is similar to step S33 in FIG. 3, and the details will not be repeated here.
  • FIG. 6 is a schematic flowchart of a second embodiment of a text information classification method of the present application.
  • the method provided in this embodiment specifically includes the following steps:
  • steps S61, S62, S63, and S64 are respectively similar to S11, S12, S13, and S14 in FIG. 1, and the details are not repeated here.
  • In step S63, whether the vertical domain intention of the text information meets the set vertical domain intention can be discriminated by means of judgment; other methods may also be used, which is not specifically limited here.
  • To perform slot extraction on the text information, the text information is input to the slot extraction module, so that the slot extraction module performs slot extraction on the text information and outputs the slot extraction result.
  • The slot extraction module can perform extraction using a Bidirectional Long Short-Term Memory (Bi-LSTM) model together with a Conditional Random Field (CRF) model.
  • The Bi-LSTM model is a recurrent neural network over time, and the CRF model is a conditional probability distribution model.
  • A slot regular expression model can also be used to extract the slots of the text information.
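  • As one possible form of the slot extraction module, the sketch below uses slot regular expressions; the slot names and patterns are assumptions for illustration, and a Bi-LSTM plus CRF tagger could be substituted where generalization is needed.

```python
import re

# Hypothetical slot regular expressions; which slots exist depends on business needs.
SLOT_PATTERNS = {
    "time":   re.compile(r"\b\d{1,2}(:\d{2})?\s*(am|pm)?\b"),
    "person": re.compile(r"\bto ([A-Z][a-z]+)\b"),
    "amount": re.compile(r"\b\d+(\.\d+)?\s*(yuan|dollars)\b"),
}

def extract_slots(text):
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            slots[name] = match.group(0)
    return slots

print(extract_slots("set an alarm for 7:30 am"))  # {'time': '7:30 am'}
```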
  • After slot extraction is performed on the text information, the method further includes: verifying the text information using a verification rule library.
  • the mobile terminal is preset with a verification rule library, and by using the verification rule library, the slot of the text information can be verified.
  • The verification module uses rules to check, for each intention, the specific keywords that are required or must not appear, together with the corresponding slot results, and rejects the very small number of text inputs that pass the models but do not meet the definition or requirements. By configuring the verification rules under different intentions, the verification module can be modified quickly and take effect immediately.
  • The verification rule library is set manually based on actual online problem cases, and the verification rules can be refined with more detailed rules.
  • For example, the input text information “open the Small Alarm Clock” is classified into the intention “open the alarm clock”. Since “Small Alarm Clock” is a third-party application, “Small Alarm Clock” can be set as a rejection keyword for the intention “open the alarm clock”.
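  • The verification rule library can be sketched as per-intention required and forbidden keywords applied after model classification; the rule contents below follow the “Small Alarm Clock” example above, while the structure itself is an assumption.

```python
# Hypothetical verification rules: per-intention required / forbidden keywords
# checked after model classification; slot results could be checked the same way.
VERIFICATION_RULES = {
    "open_alarm_clock": {
        "required": ["alarm"],
        "forbidden": ["Small Alarm Clock"],  # third-party app name used as a rejection keyword
    },
}

def verify(intent, text):
    rule = VERIFICATION_RULES.get(intent, {})
    if any(word in text for word in rule.get("forbidden", [])):
        return "reject"
    if not all(word in text for word in rule.get("required", [])):
        return "reject"
    return intent

print(verify("open_alarm_clock", "open the Small Alarm Clock"))  # reject
print(verify("open_alarm_clock", "open the alarm clock"))        # open_alarm_clock
```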
  • Step S67 is similar to S15 in FIG. 1, and the details are not repeated here.
  • In practice, the coarse recall module can reject more than 90% of the inputs that do not belong to the task intention of this vertical domain, while keywords are added to ensure that the recall rate exceeds 99.9%;
  • With the regular expression module, more than 30% of the high-frequency vertical domain intention text information can be processed quickly and sent directly to the slot extraction module.
  • Compared with a single-layer or two-layer intention classification framework, the hierarchical framework proposed by this solution saves 0-50% of the time in actual online operation, and the average response drops from more than 10 ms to less than 10 ms, so it can save more than 50% of computing time and computing resources.
  • In addition, this solution splits the complex model into multiple relatively simple models and uses fast regular expressions at different levels, so that text information of different recognition difficulty can be discriminated earlier at different levels, which speeds up intention classification. Moreover, by using more levels and inserting regular expression modules at different levels, the final result is more controllable, and the output result can be modified quickly with a small change to the configuration file.
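  • The application does not prescribe a configuration format; as an illustration of how the output could be controlled by editing configuration rather than code, a hypothetical layered configuration might look like the following, where every key, intention, and pattern is an assumption.

```python
# Hypothetical layered configuration; editing these entries changes the pipeline's
# output without retraining any model.
PIPELINE_CONFIG = {
    "coarse_recall":  {"clock": [r"alarm|countdown|schedule"]},
    "regular_recall": {"open_alarm_clock": [r"(open|set).*alarm"]},
    "verification":   {"open_alarm_clock": {"forbidden": ["Small Alarm Clock"]}},
    "confidence_threshold": 0.6,
}
```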
  • FIG. 7 is a schematic structural diagram of a mobile terminal according to an embodiment of the present application.
  • the embodiment of the present application provides a mobile terminal 7, including:
  • the obtaining module 71 is used to obtain text information
  • the determining module 72 is used to determine the vertical intention of the text information
  • the recall module 73 is used to recall the text information when the vertical domain intention of the text information meets the set vertical domain intention;
  • the intention classification module 74 after the text information is recalled by the recall module, is used to classify the text information by intention;
  • the rejection module 75 is used to reject the text information when the vertical intention of the text information does not meet the set vertical intention.
  • Through the above modules, the determining module 72 first determines the vertical domain intention of the text information; when the vertical domain intention of the text information satisfies the set vertical domain intention, the recall module 73 recalls the text information, and the intention classification module 74 classifies the text information by intention. Different classification methods are used for text information of uncertain complexity; because these methods are faster, text information of different recognition difficulty can quickly obtain a discrimination result at different levels, thereby speeding up intention classification. Therefore, the above method effectively avoids wasting the resources of a complex classification model and accelerates the classification of text information, improving the classification and recognition speed of text information and reducing the computing resources occupied.
  • FIG. 8 is a schematic structural diagram of another mobile terminal according to an embodiment of the present application.
  • An embodiment of the present application provides a mobile terminal 8 including: a processor 81, a memory 82, and a computer program 821 stored in the memory and running on the processor.
  • The processor 81 is configured to execute the computer program 821 to implement the steps of the method provided in the first aspect of the embodiments of the present application, which will not be repeated here.
  • FIG. 9 is a schematic block diagram of a circuit of an embodiment of a computer-readable storage medium of the present application. If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in the computer-readable storage medium 100. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium and includes a number of instructions (program data 101) to enable a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the various implementations of the method of the present application.
  • The aforementioned readable storage media include: USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other such media, as well as computers, mobile phones, laptops, tablets, cameras, and other electronic devices equipped with the above readable storage media.
  • FIG. 10 is a schematic structural diagram of an embodiment of the mobile terminal device of the present application. The mobile terminal device may be a mobile phone, a tablet computer, a notebook computer, a wearable device, or the like; in this embodiment, a mobile phone is taken as an example.
  • The structure of the terminal device 900 may include a radio frequency (RF) circuit 910, a memory 920, an input unit 930, a display unit 940 (that is, the display screen assembly 600 in the foregoing embodiment), a sensor 950, an audio circuit 960, a WiFi (wireless fidelity) module 970, a processor 980, a power supply 990, and the like.
  • the RF circuit 910, the memory 920, the input unit 930, the display unit 940, the sensor 950, the audio circuit 960, and the WiFi module 970 are respectively connected to the processor 980; the power source 990 is used to provide power to the entire terminal device 900.
  • The RF circuit 910 is used for receiving and transmitting signals; the memory 920 is used for storing data and instruction information; the input unit 930 is used for inputting information and may specifically include a touch panel 931 and other input devices 932 such as operation buttons; the display unit 940 may include a display panel 941 and the like; the sensor 950 includes an infrared sensor, a laser sensor, and the like, and is used to detect user proximity signals, distance signals, and so on; a speaker 961 and a microphone 962 are connected to the processor 980 through the audio circuit 960 to send and receive sound signals; the WiFi module 970 is used to receive and transmit WiFi signals; and the processor 980 is used to process the data information of the mobile terminal device.

Abstract

A text information classification method, a mobile terminal, and a computer-readable storage medium. The classification method comprises: obtaining text information (S11); determining a vertical domain intention of the text information (S12); recalling the text information when the vertical domain intention of the text information meets a set vertical domain intention, and performing intention classification on the text information (S13); and refusing the text information when the vertical domain intention of the text information does not meet the set vertical domain intention (S14). By means of the method, the classification of the text information can be effectively accelerated, and computer occupied resources are reduced.

Description

Classification method of text information, mobile terminal and computer-readable storage medium

Technical field

This application relates to the technical field of text classification, and in particular to a method for classifying text information, a mobile terminal, and a computer-readable storage medium.

Background

With the development of text classification technology, text classification is applied in online and offline industrial scenarios, such as polarity analysis of product reviews in e-commerce, automatic archiving and classification of documents, inspection of sensitive topics in forums, and online user intent recognition for voice assistants.

When the text information is relatively complex, the requirements for timeliness are high, and reliable results often need to be returned in a short time with few resources. For this type of application, because there are many task categories and the intentions involved are complex and highly varied, a complex text classification model is often used to process the complex text information in a unified way.

At present, because complex text classification models are used for all complex text information, when the complexity of the input text information is uncertain, the resources of the complex classification model are often wasted and the computation time is prolonged, which slows down the classification and recognition of text information and in turn degrades the text classification output experience.
Summary of the invention

This application provides a method for classifying text information, a mobile terminal, and a computer-readable storage medium to solve the problem that text information classification is currently relatively slow.

A first aspect of the embodiments of the present application provides a method for classifying text information, including: obtaining text information; determining the vertical domain intention of the text information; when the vertical domain intention of the text information meets a set vertical domain intention, recalling the text information and then performing intention classification on the text information; and when the vertical domain intention of the text information does not meet the set vertical domain intention, rejecting the text information.

A second aspect of the embodiments of the present application provides a mobile terminal, including: an acquisition module for acquiring text information; a determining module for determining the vertical domain intention of the text information; a recall module for recalling the text information when the vertical domain intention of the text information meets the set vertical domain intention; an intention classification module for performing intention classification on the text information after the recall module recalls it; and a rejection module for rejecting the text information when the vertical domain intention of the text information does not meet the set vertical domain intention.

A third aspect of the embodiments of the present application provides a mobile terminal, including: a processor, a memory, and a computer program stored in the memory and running on the processor, where the processor is used to execute the computer program to implement the method provided in the first aspect of the embodiments of the present application.

A fourth aspect of the embodiments of the present application provides a computer-readable storage medium that stores a computer program, and the computer program can be executed by a processor to implement the method provided in the first aspect of the embodiments of the present application.

The beneficial effects of this application are as follows. Unlike the prior art, this application first determines the vertical domain intention of text information whose complexity is uncertain; when the vertical domain intention of the text information satisfies the set vertical domain intention, the text information is recalled and then classified by intention. Different classification methods are used for text information of uncertain complexity; because these methods are simpler and faster than a single complex model, text information of different recognition difficulty can quickly obtain a discrimination result at different levels, thereby speeding up intention classification. Therefore, the above method effectively avoids wasting the resources of a complex classification model and accelerates the classification of text information, which improves the classification and recognition speed of text information and reduces the computing resources occupied.
Description of the drawings

In order to explain the technical solutions in the embodiments of the present application more clearly, the following briefly introduces the drawings needed in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.

Fig. 1 is a schematic flowchart of a first embodiment of a text information classification method of the present application;

Fig. 2 is a schematic flowchart of a specific implementation of step S12 shown in Fig. 1;

Fig. 3 is a schematic flowchart of a specific implementation of step S13 shown in Fig. 1;

Fig. 4 is a schematic flowchart of a specific implementation of step S33 shown in Fig. 3;

Fig. 5 is a schematic flowchart of another specific embodiment shown in Fig. 3;

Fig. 6 is a schematic flowchart of a second embodiment of a text information classification method of the present application;

Fig. 7 is a schematic block diagram of an embodiment of a mobile terminal of the present application;

Fig. 8 is a schematic block diagram of another embodiment of a mobile terminal of the present application;

Fig. 9 is a schematic block diagram of a circuit of an embodiment of a computer-readable storage medium of the present application;

Fig. 10 is a schematic structural composition diagram of an embodiment of a mobile terminal device of the present application.
Detailed description

In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and technologies are set forth for a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary details do not obscure the description of this application.

It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the existence of the described features, wholes, steps, operations, elements and/or components, but does not exclude the existence or addition of one or more other features, wholes, steps, operations, elements, components and/or collections thereof.

It should also be understood that the terms used in the specification of this application are only for the purpose of describing specific embodiments and are not intended to limit the application. As used in the specification of this application and the appended claims, unless the context clearly indicates otherwise, the singular forms "a", "an" and "the" are intended to include the plural forms.

It should be further understood that the term "and/or" used in the specification and appended claims of this application refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations.

As used in this specification and the appended claims, the term "if" can be interpreted, depending on the context, as "when", "once", "in response to determining" or "in response to detecting". Similarly, the phrase "if determined" or "if [the described condition or event] is detected" can be interpreted, depending on the context, as meaning "once determined", "in response to determining", "once [the described condition or event] is detected" or "in response to detecting [the described condition or event]".

In order to illustrate the technical solutions described in the present application, specific embodiments are described below.
Please refer to Fig. 1, which is a schematic flowchart of a first embodiment of a text information classification method of the present application. The method provided in this embodiment specifically includes the following steps:

S11: Obtain text information.

Generally speaking, many smart devices are equipped with a voice recognition function, and voice assistants have become a common daily application. Voice assistant applications both involve relatively complex text classification operations and need to interact with users in real time. Taking a mobile terminal as an example, when the mobile terminal starts the voice assistant application, the user's voice input can be obtained in real time and converted by the mobile terminal into the corresponding text information, so that the mobile terminal can obtain text information in real time.

In addition, when the mobile terminal is provided with an audio jack, a headset can be plugged into it, and text information can be obtained by collecting sound from the headset. If voice input is performed through a wireless headset, text information can also be obtained by collecting sound from the wireless headset. Of course, those skilled in the art can obtain text information in other ways known in the art.

S12: Determine the vertical domain intention of the text information.

Usually, a voice input is a single text information input, that is, the obtained text information is generally a single text input. However, because the voice assistant supports many functions, in practice multiple models for different vertical domains process this one text input in parallel; for example, if one model specializes in emotional question answering and another in system operations, both need to respond to the same text input. These two models are only an example; in practice dozens or even hundreds of models may process one text input at the same time, which consumes considerable computing resources. It is therefore particularly important to determine quickly which vertical domain the acquired text information belongs to, that is, which vertical domain intention it carries.

Therefore, the vertical domain intention of the acquired text information can first be coarsely selected. If it is determined that the vertical domain intention of the text information belongs to a certain approximate vertical domain intention, step S13 is entered, so that the classification result is returned quickly, which greatly saves subsequent computing resources.
S13: When the vertical domain intention of the text information satisfies the set vertical domain intention, recall the text information and then perform intention classification on the text information.

The mobile terminal is preset with a set vertical domain intention, which is used to judge whether the vertical domain intention of the text information satisfies the set vertical domain intention. Because this module is more fine-grained, it can directly classify the vertical domain intention of the input text information into a specific intention within the vertical domain. Since the task of judging whether the vertical domain intention of the text information satisfies the set vertical domain intention is more detailed, this module is more complex than the module that determines the vertical domain intention of the text information. This module also has rejection capability, that is, text information that hits the rejection category is returned with a rejection intention.

For the classification of vertical domain tasks, only part of the text information passes through the fast module that determines the vertical domain intention and undergoes the judgment of whether its vertical domain intention satisfies the set vertical domain intention, because after screening by the preceding models there is already a high probability that the input text information belongs to the set vertical domain intention.

When the vertical domain intention of the text information satisfies the set vertical domain intention, the text information is recalled and then classified by intention. Further, after the set vertical domain intention is satisfied, a corresponding interrupt signal can be generated to notify the mobile terminal to perform a corresponding operation.

S14: When the vertical domain intention of the text information does not satisfy the set vertical domain intention, reject the text information.

Through the above method, for text information of uncertain complexity, this application first determines the vertical domain intention of the text information; when the vertical domain intention of the text information satisfies the set vertical domain intention, the text information is recalled and classified by intention. Different classification methods are used for text information of uncertain complexity; because these methods are simpler and faster than a single complex model, text information of different recognition difficulty can quickly obtain a discrimination result at different levels, thereby speeding up intention classification. Therefore, the above method effectively avoids wasting the resources of a complex classification model and accelerates the classification of text information, improving the classification and recognition speed of text information and reducing the computing resources occupied.
请参阅图2,图2是图1所示的步骤S12的一具体实施方式的流程示意图,具体包括以下步骤:Please refer to FIG. 2. FIG. 2 is a schematic flowchart of a specific implementation of step S12 shown in FIG. 1, which specifically includes the following steps:
S21:将文本信息与多个关键词组进行关键词匹配,以得到多个匹配度;其中,每个关键词组对应一个垂域意图;S21: Perform keyword matching between the text information and multiple keyword groups to obtain multiple matching degrees; wherein, each keyword group corresponds to a vertical domain intention;
在本步骤主要用于粗选获取的文本信息,所以可以采用粗召回模块 对获取的文本信息进行处理。由于移动终端预设有多个技能组形成的模型,比如多个关键词组,其中,每个关键词组对应一个垂域意图,且每个关键词组为多个关键词形成的正则表达式,则正则表达式可以用于粗选获取的文本信息的垂域意图,比如采用较为简单的正则表达式。对于简单的正则表达式的使用场景,如某一技能组专门用于处理闹钟、倒计时、日程等任务,对于这一类任务,“时间”的输入文本是一个重要元素,因此可以使用简单正则表达式。如“{2,4}年{1,2}月{1,2}日”来检验输入中是否包含时间元素,如果输入文本包含时间元素,同时包含任务关键词如“闹钟”“计时”等,则可以被粗召回模块召回进入后续流程。In this step, it is mainly used for rough selection of the acquired text information, so the rough recall module can be used to process the acquired text information. Since the mobile terminal presets a model formed by multiple skill groups, such as multiple keyword groups, where each keyword group corresponds to a vertical domain intention, and each keyword group is a regular expression formed by multiple keywords, then The expression can be used to roughly select the vertical intention of the obtained text information, for example, using a simpler regular expression. For simple regular expression usage scenarios, such as a certain skill group dedicated to processing tasks such as alarm clocks, countdowns, schedules, etc. For this type of tasks, the input text of "time" is an important element, so simple regular expressions can be used Mode. Such as "{2,4}year{1,2}month{1,2}day" to check whether the input contains a time element, if the input text contains a time element, it also contains task keywords such as "alarm clock", "timing", etc. , It can be recalled by the coarse recall module and enter the follow-up process.
因为本模块会处理全部的文本信息输入,因此对速度和计算复杂度的要求较高。因此在此模块内,简单的正则表达式被用于召回文本信息。通过使用简单的正则表达式对文本信息的处理,使得移动终端能够快速筛选出属于本垂域意图的文本信息,而不属于本垂域意图的文本信息直接返回拒绝的分类结果,从而较大地节省了后续计算资源。Because this module will process all text information input, it has higher requirements for speed and computational complexity. Therefore, in this module, simple regular expressions are used to recall textual information. By using simple regular expressions to process the text information, the mobile terminal can quickly filter out the text information that belongs to the local domain intention, and the text information that does not belong to the local domain intention directly returns the rejected classification result, thereby greatly saving For subsequent computing resources.
S22:将对应多个关键词组的多个匹配度进行相关性排序;S22: Sort the multiple matching degrees corresponding to multiple keyword groups by relevance;
移动终端预的各个技能组同时处理获取的文本信息,以便得到多个匹配度,将多个关键词组对应的多个匹配度进行相关性排序,可以得到不同的粗筛选结果。Each skill group preset by the mobile terminal processes the acquired text information at the same time to obtain multiple matching degrees, and the multiple matching degrees corresponding to the multiple keyword groups are sorted by relevance to obtain different rough screening results.
再者,一个垂域技能组有一个粗召回模块。只匹配关键词或使用简单正则表达式匹配相比统计学模型或深度学习模型计算量大大减少。通过使用真实生成环境的数据进行测试,平均耗时和峰值都比单一模型有数量级的提升。Furthermore, a vertical domain skill group has a rough recall module. Only matching keywords or using simple regular expression matching is greatly reduced compared to statistical models or deep learning models. By using the data of the real generation environment for testing, the average time and peak value are orders of magnitude higher than that of a single model.
S23:将匹配度最高的一个关键词组所对应的垂域意图,作为文本信息的垂域意图。S23: Use the vertical intent corresponding to the keyword group with the highest matching degree as the vertical intent of the text information.
From the coarse screening result obtained by the relevance sorting, the vertical domain intention corresponding to the keyword group with the highest matching degree can be obtained, and the vertical domain intention of the text information is thereby determined.
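A minimal sketch of steps S21 to S23 is given below: the text is scored against each keyword group, the scores are ranked, and the intention of the best-matching group is taken as the vertical domain intention. The keyword groups, scoring rule and intention names are illustrative assumptions, not taken from the patent.

```python
import re

# Hypothetical keyword groups, one per vertical domain intention.
KEYWORD_GROUPS = {
    "alarm":   [r"闹钟", r"计时", r"\d{1,2}点"],
    "weather": [r"天气", r"气温", r"下雨"],
}

def determine_vertical_intent(text: str):
    # S21: match the text against every keyword group to get a matching degree.
    scores = {
        intent: sum(1 for pattern in patterns if re.search(pattern, text))
        for intent, patterns in KEYWORD_GROUPS.items()
    }
    # S22: sort the matching degrees by relevance (here, by score).
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # S23: take the intention of the highest-scoring group; None means no match.
    best_intent, best_score = ranked[0]
    return best_intent if best_score > 0 else None
```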
Of course, under the technical teaching of this application, those skilled in the art can readily conceive of other ways of determining the vertical domain intention of text information according to actual needs.
请参阅图3,图3是图1所示的步骤S13的一具体实施方式的流程示意图,具体包括以下步骤:Please refer to FIG. 3. FIG. 3 is a schematic flowchart of a specific implementation of step S13 shown in FIG. 1, which specifically includes the following steps:
S31: When the vertical domain intention of the text information satisfies the set vertical domain intention, recall the text information and perform a first intention classification on it; wherein the first intention classification includes at least one intention class and one rejection class;
For the classification of vertical domain tasks, only part of the text information passes through the very fast coarse recall module and reaches step S31. Step S31 may use a regular-expression recall module, i.e. a regular recall model. Like the coarse recall module, this module uses regular expressions with fast computation; the difference is that the coarse recall module only distinguishes whether a task belongs to this vertical domain, whereas the regular recall model is more fine-grained and can classify the input text information directly into a specific intention within this vertical domain, i.e. a set vertical domain intention. Because the task is more fine-grained, the regular expressions used in this module are more complex than those of the coarse recall module. Nevertheless, this module also has rejection capability: text information that hits the rejection category is returned with a rejection intention.
When the vertical domain intention of the text information satisfies the set vertical domain intention, meaning that the text information belongs to the set vertical domain intention, the text information is recalled and a first intention classification is performed on it; the first intention classification includes at least one intention class and one rejection class, which indicates that the first intention classification has the ability to reject.
并且针对召回的文本信息,可以将文本信息输入至正则召回模型,以使正则召回模型对文本信息进行第一次意图分类,其中,第一次意图分类至少包括至少一个意图类和一个拒绝类。And for the recalled text information, the text information can be input into the regular recall model, so that the regular recall model classifies the text information for the first time, where the first intention classification includes at least one intention class and one rejection class.
Specifically, the text information is serially matched against the regular-expression database of the regular recall model, so that the regular recall model performs the first intention classification on the text information and outputs a first intention classification result; wherein the first intention classification result includes at least one intention-class result and one rejection-class result.
For example, if the coarse recall module hit the keyword "闹钟" (alarm clock) and the key text "明天" (tomorrow), the input text information is assigned to the alarm-clock-related regular expressions for serial matching. Since the regular recall model mainly handles high-frequency simple text information and text information that is relatively complex and hard to classify with a model, there are not many regular expressions in the serial chain. Using regular-expression matching to classify high-frequency text information saves computation time and resources. Of course, the regular recall model also processes ordinary text information.
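The serial matching inside the regular recall model might look like the sketch below, where alarm-clock regular expressions are tried one after another and a hit returns either a concrete intention or the rejection class. The patterns and intention names are assumptions for illustration only.

```python
import re

# Hypothetical serial rule list for the alarm-clock skill group:
# (pattern, intention) pairs tried in order; "reject" marks the rejection class.
ALARM_RULES = [
    (re.compile(r"设(一个|个)?.*闹钟"), "set_alarm"),
    (re.compile(r"(取消|删除).*闹钟"), "cancel_alarm"),
    (re.compile(r"闹钟.*(下载|商店)"), "reject"),
]

def regular_recall(text: str):
    for pattern, intention in ALARM_RULES:   # serial matching
        if pattern.search(text):
            return intention                 # first intention classification result
    return None                              # undecided: pass to the next module
```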
When the text information satisfies at least one intention class of the first intention classification result, proceed to step S32; when the text information satisfies neither the at least one intention class nor the rejection class of the first intention classification result, proceed to step S33; when the text information satisfies the rejection class of the first intention classification, the text information is rejected.
S32:在文本信息满足第一次意图分类的至少一个意图类中的一个时,对文本信息进行槽位提取;S32: When the text information satisfies one of the at least one intent class of the first intent classification, perform slot extraction on the text information;
Generally, users can add, delete or modify the regular expressions in the regular recall model according to business needs. This makes the regular expressions in the regular recall model convenient and controllable: they can be used to quickly fix problems, to adjust the handling of a small amount of specific text information, and to quickly control the output results.
The handling of high-frequency text information is illustrated with an example. In the alarm-clock scenario, "设一个明天早上八点的闹钟" ("set an alarm for eight o'clock tomorrow morning") is a relatively high-frequency utterance. If this utterance appears one hundred thousand times among ten million input texts, a single regular expression is enough to recall those one hundred thousand texts early, so that the text information satisfies one of the at least one intention class of the first intention classification. This saves the computing resources of one hundred thousand deep-model inferences, and at the same time improves the processing efficiency for a large amount of text information, so the benefit is high.
在文本信息满足第一次意图分类的至少一个意图类中的一个时,则对文本信息进行槽位提取。When the text information satisfies one of the at least one intent class of the first intent classification, the text information is subjected to slot extraction.
Specifically, when the text information is either high-frequency simple text information or text information that is relatively complex and hard to classify with a model, slot extraction is performed on the text information. The specific slots to extract can be determined by business requirements, such as the content of the text information, person names, amounts, and so on.
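For the business-driven slot extraction mentioned above, a very small regular-expression-based sketch is shown here (time and amount slots). A production system could instead use the Bi-LSTM + CRF or slot regular-expression models described later; all patterns below are assumptions.

```python
import re

# Hypothetical slot patterns configured per business requirement.
SLOT_PATTERNS = {
    "time":   re.compile(r"(今天|明天|后天)?\s*\d{1,2}点(\d{1,2}分)?"),
    "amount": re.compile(r"\d+(\.\d+)?元"),
}

def extract_slots(text: str) -> dict:
    """Return the first match for each configured slot, if any."""
    slots = {}
    for name, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            slots[name] = match.group(0)
    return slots

# extract_slots("设一个明天8点的闹钟") -> {"time": "明天8点"}
# (a real system would also normalise Chinese numerals such as "八点")
```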
S33:在文本信息不满足第一次意图分类的至少一个意图类和拒绝类时,对文本信息进行第二次意图分类。S33: When the text information does not satisfy at least one of the intent class and the rejection class of the first intent classification, perform the second intent classification on the text information.
If, after the screening of the preceding models, the text information satisfies neither the at least one intention class nor the rejection class of the first intention classification, the mobile terminal considers the text information to belong to the second intention classification; in other words, the input text information is confirmed as belonging to this vertical domain intention. This part already accounts for a small proportion of all inputs, so a deep neural network with higher complexity and better performance can be used for the second intention classification.
请参阅图4,图4是图3所示的步骤S33的一具体实施方式的流程示意图,具体包括以下步骤:Please refer to FIG. 4. FIG. 4 is a schematic flowchart of a specific implementation of step S33 shown in FIG. 3, which specifically includes the following steps:
S41:判断文本信息的第二意图分类是否满足第二次意图分类的至少一个意图类;S41: Determine whether the second intent classification of the text information satisfies at least one intent class of the second intent classification;
The second intention classification includes at least one intention class and one rejection class; setting at least one intention class makes it possible to determine whether the second intention classification of the text information satisfies one of the at least one intention class of the second intention classification. Performing the second intention classification on the text information specifically includes:
将文本信息输入至意图分类模型,以使意图分类模型对文本信息进行第二次意图分类,并输出第二意图分类结果;其中,第二意图分类结果包括至少一个意图类和一个拒绝类。The text information is input to the intent classification model, so that the intent classification model performs a second intent classification on the text information, and outputs a second intent classification result; wherein the second intent classification result includes at least one intent class and one rejection class.
The second intention classification module may use a Text Convolutional Neural Network (Text CNN) model, i.e. a method of classifying text with a convolutional neural network, and perform an (N+1)-way classification on the text information entering this module: N categories that belong to the task intentions of this vertical domain, plus one category that does not belong to this vertical domain, i.e. the rejection capability, where N is a positive integer greater than or equal to 1.
The Text CNN model performs better than rule matching. Depending on business needs, besides the Text CNN model one may also choose more complex models such as the Long Short-Term Memory (LSTM) model, the Gated Recurrent Unit (GRU) model, the Transformer model, the Bidirectional Encoder Representations from Transformers (BERT) model, or the Text-to-Text Transfer Transformer (T5) model. The Text CNN model is not a mandatory choice; it is only used here as a practical example. If the business can tolerate a longer response time, a more complex model can be chosen.
Model parameters can also be selected and tuned according to the actual business, and a model can be chosen by comparing resource consumption and performance gains under different parameters; this is not specifically limited here.
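As a concrete illustration of the (N+1)-way Text CNN classifier, the following PyTorch sketch uses the (2,256), (3,256), (4,256) feature-map configuration mentioned later in the description. It is only a sketch under assumed hyper-parameters, not the patent's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextCNN(nn.Module):
    """N intention classes of this vertical domain plus one rejection class."""

    def __init__(self, vocab_size, embed_dim=128, num_intents=5,
                 kernel_sizes=(2, 3, 4), num_filters=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One 1-D convolution per kernel size, e.g. (2,256), (3,256), (4,256).
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes]
        )
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_intents + 1)

    def forward(self, token_ids):                       # (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)   # (batch, embed_dim, seq_len)
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))        # logits over N + 1 classes
```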
若文本信息满足第二意图分类结果中的至少一个意图类,则进入步骤S42,若文本信息满足第二意图分类结果中的拒绝类,则进入步骤S43。If the text information satisfies at least one intent class in the second intention classification result, then step S42 is entered, and if the text information satisfies the rejection class in the second intention classification result, step S43 is entered.
S42:对文本信息进行槽位提取;S42: Perform slot extraction on text information;
输入文本信息在经过以上模块的处理之后,已经有较高的分类准确率,以及对应需求的槽位结果。在文本信息满足第二意图分类结果的至少一个意图类时,则对文本信息进行槽位提取,对于提取文本信息的槽位,在下文会进行详细解释。After the input text information has been processed by the above modules, it already has a higher classification accuracy and corresponding slot results. When the text information satisfies at least one intent category of the second intent classification result, the text information is subjected to slot extraction. The slots for extracting the text information will be explained in detail below.
S43:拒绝文本信息。S43: Reject the text message.
If the text information satisfies the rejection class in the second intention classification result, the text information belongs to the rejection class and the rejection intention is returned; that is, text information determined not to belong to this vertical domain intention returns a rejected classification result, thereby saving subsequent computing resources.
请参阅图5,图5是图3所示的另一具体实施方式的流程示意图,具体包括以下步骤:Please refer to FIG. 5. FIG. 5 is a schematic flowchart of another specific embodiment shown in FIG. 3, which specifically includes the following steps:
S51: When the vertical domain intention of the text information satisfies the set vertical domain intention, recall the text information and perform the first intention classification on it;
此步骤S51与图3中步骤S31相类似,具体此处不再赘述。This step S51 is similar to step S31 in FIG. 3, and the details will not be repeated here.
S52:判断文本信息的第一次意图分类是否满足预设的第一次意图分类的至少一个意图类和一个拒绝类;S52: Determine whether the first intention classification of the text information meets at least one intention class and one rejection class of the preset first intention classification;
Specifically, the mobile terminal can usually be provided with a preset first intention classification, which is used to determine whether the first intention classification of the text information satisfies the preset first intention classification. In this embodiment, the preset first intention classification includes at least one intention class and one rejection class. If it is determined that the first intention classification of the text information satisfies at least one intention class and one rejection class of the preset first intention classification, proceed to step S55; if it is determined that it is not satisfied, proceed to step S53.
其中步骤S52、S55与图3中步骤S32相类似,具体此处不再赘述。Steps S52 and S55 are similar to step S32 in FIG. 3, and the details are not repeated here.
In this embodiment, it is also possible that if the text information satisfies at least one intention class of the first intention classification, step S55 is entered, whereas if the text information satisfies the rejection class of the first intention classification, the rejected classification intention can be returned directly.
Of course, under the technical teaching of this application, those skilled in the art can readily conceive of other ways, set according to actual needs, of making the first intention classification of the text information satisfy the preset first intention classification condition.
S53:再次确定文本信息的垂域意图;S53: Re-determine the vertical intention of the text information;
After the filtering of the two preceding modules, the coarse recall model and the regular recall model, only part of the text information reaches the step of re-determining its vertical domain intention.
Before the second intention classification is performed on the text information, the vertical domain intention of the text information is determined again. An intention recall model can be used for this. The task of this model is the same as that of the coarse recall module, i.e. to classify the input text information again so as to determine whether it belongs to the task intentions of this vertical domain. Because regular expressions can only handle limited complexity, some text information is difficult to classify in the first two modules, so the text information needs to be processed again with an intention recall model that has the generalization ability of a neural network.
Specifically, the text information is input into the intention recall model, so that the intention recall model determines the vertical domain intention of the text information. A confidence threshold can be set in the intention recall model and used to discriminate the vertical domain intention of the text information; the set confidence threshold is adjustable. By adjusting the set confidence threshold, the recall rate of the text information can be increased, i.e. the proportion of samples whose true label is positive that are classified as positive.
其中,意图召回模型可以根据设定置信度阈值确定文本信息的垂域意图,包括:Among them, the intention recall model can determine the vertical intention of the text information according to the set confidence threshold, including:
判断文本信息的垂域意图是否满足设定置信度阈值;Determine whether the vertical domain intention of the text information meets the set confidence threshold;
若满足设定置信度阈值,则确定文本信息的垂域意图;If the set confidence threshold is met, the vertical intention of the text information is determined;
若不满足设定置信度阈值,则确定拒绝文本信息。If the set confidence threshold is not met, the text information is determined to be rejected.
Depending on business needs, the intention recall model can be a Fast Text neural network model, i.e. a method for learning word embeddings and classifying text; the intention recall model thus relies on the generalization ability of the Fast Text neural network model, so that the model can handle text information inputs it has never seen before. The computational complexity of the intention recall model is higher than that of regular expressions, but its accuracy is also higher.
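The confidence-threshold logic of the intention recall model can be sketched as below, here using the open-source fastText Python package as a stand-in for the Fast Text model (the patent names the approach but not a specific library); the training-file format, threshold value and labels are assumptions.

```python
import fasttext

# Hypothetical supervised training file with lines such as
# "__label__alarm 设一个明天早上八点的闹钟".
model = fasttext.train_supervised(input="vertical_domain_train.txt")

CONFIDENCE_THRESHOLD = 0.6   # adjustable; lowering it raises the recall rate

def intention_recall(text: str) -> str:
    labels, probabilities = model.predict(text, k=1)
    if probabilities[0] >= CONFIDENCE_THRESHOLD:
        return labels[0]     # belongs to this vertical domain: keep processing
    return "reject"          # below threshold: reject the text information
```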
In addition, depending on business needs, the intention recall model can also be a Convolutional Neural Network (CNN) model, which is a feed-forward neural network model. It is worth noting that the CNN model here mainly refers to a CNN model with relatively few parameters, or a CNN model with an attention module. For example, a Text CNN with a small number of feature maps can be used: where (2,256), (3,256), (4,256) would typically be used, a simplified scenario can use (2,32), (3,32), (4,32) to reduce computational complexity. The attention module is similar: the Q, K and V in the attention module can be linearly projected to a low dimension, such as 32, before the attention computation, so that part of the attention capability is obtained while the computational complexity is reduced.
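The low-dimensional attention variant described above might be sketched as follows: Q, K and V are linearly projected to a small dimension (for example 32) before the scaled dot-product attention is computed. The module name and sizes are illustrative assumptions.

```python
import math
import torch
import torch.nn as nn

class LowDimAttention(nn.Module):
    """Scaled dot-product attention with Q/K/V projected to a low dimension."""

    def __init__(self, embed_dim=128, attn_dim=32):
        super().__init__()
        self.q_proj = nn.Linear(embed_dim, attn_dim)
        self.k_proj = nn.Linear(embed_dim, attn_dim)
        self.v_proj = nn.Linear(embed_dim, attn_dim)

    def forward(self, x):                         # x: (batch, seq_len, embed_dim)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        scores = q @ k.transpose(1, 2) / math.sqrt(q.size(-1))
        weights = torch.softmax(scores, dim=-1)   # attention over the sequence
        return weights @ v                        # (batch, seq_len, attn_dim)
```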
S55:在文本信息的垂域意图满足设定垂域意图时,执行对文本信息进行第二次意图分类的步骤。S55: When the vertical domain intention of the text information satisfies the set vertical domain intention, the step of performing the second intention classification of the text information is performed.
此步骤S55与图3中步骤S33相类似,具体此处不再赘述。This step S55 is similar to step S33 in FIG. 3, and the details will not be repeated here.
请参阅图6,图6是本申请的文本信息的分类方法第二实施例的流程示意图。本实施例提供的方法具体包括以下步骤:Please refer to FIG. 6. FIG. 6 is a schematic flowchart of a second embodiment of a text information classification method of the present application. The method provided in this embodiment specifically includes the following steps:
S61:获取文本信息;S61: Obtain text information;
S62:确定文本信息的垂域意图;S62: Determine the vertical intention of the text information;
S63:判断文本信息的垂域意图是否满足设定垂域意图;S63: Determine whether the vertical domain intention of the text information satisfies the set vertical domain intention;
S64:若文本信息的垂域意图满足设定垂域意图,召回文本信息,再对文本信息进行意图分类;S64: If the vertical domain intention of the text information satisfies the set vertical domain intention, the text information is recalled, and the text information is classified by intention;
其中,步骤S61、S62、S63、S64分别与图1中的S11、S12、S13、S14相类似,具体此处不再赘述。Among them, steps S61, S62, S63, and S64 are respectively similar to S11, S12, S13, and S14 in FIG. 1, and the details are not repeated here.
其中,步骤S63可以通过判断的方式对文本信息的垂域意图是否满足设定垂域意图进行辨别,也可以采用其他的方式,此处不做具体限定。Wherein, in step S63, whether the vertical domain intention of the text information meets the set vertical domain intention can be discriminated by means of judgment, and other methods may also be used, which is not specifically limited here.
In addition, for slot extraction, the text information is input into a slot extraction module, so that the slot extraction module performs slot extraction on the text information and outputs a slot extraction result.
The slot extraction module may extract slots using a Bi-directional Long Short-Term Memory (Bi-LSTM) model combined with a Conditional Random Field (CRF) model, where the Bi-LSTM model is a recurrent neural network over time and the CRF model is a conditional probability distribution model.
此外,还可以采取槽位正则表达式模型来提取文本信息的槽位。In addition, the slot regular expression model can also be used to extract the slots of the text information.
S65:利用验证规则库,对文本信息的槽位进行验证。S65: Use the verification rule library to verify the slot of the text information.
After slot extraction is performed on the text information, the method further includes: verifying the text information using a verification rule library.
Specifically, the mobile terminal is preset with a verification rule library, which can be used to verify the slots of the text information. Using rules, the verification module checks the keywords that must, or must not, appear under each intention, together with the corresponding slot results, and rejects the very small amount of text information that passes the models but does not meet the definitions or requirements. The verification module can be modified quickly and take effect by configuring the verification rules under different intentions.
For example, the verification rule library is set manually based on actual online problem cases, and the verification rules can be made more fine-grained. For alarm-clock-related tasks, suppose the input text "打开小闹钟" ("open the little alarm clock") is classified as the "open alarm clock" intention, while "小闹钟" is in fact a third-party app; then "小闹钟" can be set as a reject keyword for the "open alarm clock" intention.
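A data-driven sketch of such verification rules is shown below: each intention is given required and reject keywords that can be changed without retraining any model. The rule contents follow the "小闹钟" example above, but the structure and names are assumptions.

```python
# Hypothetical per-intention verification rules.
VERIFICATION_RULES = {
    "open_alarm": {
        "reject_keywords": ["小闹钟"],     # third-party app, not the system alarm
        "required_keywords": ["闹钟"],
    },
}

def verify(intention: str, text: str) -> bool:
    """Return False to reject a result that passed the models but breaks a rule."""
    rule = VERIFICATION_RULES.get(intention)
    if rule is None:
        return True
    if any(keyword in text for keyword in rule.get("reject_keywords", [])):
        return False
    return all(keyword in text for keyword in rule.get("required_keywords", []))

# Slot-level checks (e.g. that a required time slot was actually filled) could
# be added to the same rule structure.
```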
S66:拒绝文本信息。S66: Reject the text message.
Step S66 is similar to S15 in FIG. 1, and details are not repeated here.
Therefore, in actual use for the common mobile-phone functions of the Breeno voice assistant, for example, the coarse recall module of this solution can reject more than 90% of the inputs that do not belong to the task intentions of this vertical domain, while keeping the recall rate above 99.9% by adding keywords; and in the regular-expression module, more than 30% of the high-frequency vertical domain intention text information can be processed quickly and sent directly to the slot extraction module. Overall, compared with a single-layer or two-layer intention classification framework, the hierarchical framework proposed by this solution saves 0-50% of the time under actual online conditions, with the average response dropping from more than 10 ms to within 10 ms, and can therefore save more than 50% of the computation time and computing resources.
另外,在不同层次使用正则表达式使得本方案的可控性更高,重新训练模型的频次更少。而使用多个简单模型的设计使得在针对特定类别的输入重新训练模型时,可以尽量减少对其他意图识别结果的改变。这使得在对错误进行修复,或者对垂域任务意图分类定义进行修改时更加简单,避免频繁对复杂模型进行更新,节省了大量的人力以及计算资源。In addition, the use of regular expressions at different levels makes the solution more controllable, and the retraining of the model is less frequent. The design of using multiple simple models makes it possible to minimize changes to other intent recognition results when retraining the model for specific types of input. This makes it easier to repair errors or modify vertical task intent classification definitions, avoid frequent updates to complex models, and save a lot of manpower and computing resources.
Therefore, by splitting the complex model into multiple relatively simple models and using faster regular expressions at different levels, this solution allows text information of different recognition difficulty to be given a decision early at different levels, speeding up intention classification. Moreover, by using more levels and inserting regular-expression modules at different levels, the final result is more controllable, and the output can be modified quickly with only small changes to a configuration file.
请参见图7,图7是本申请实施例一种移动终端的结构示意图。本 申请实施例提供了一种移动终端7,包括:Please refer to FIG. 7, which is a schematic structural diagram of a mobile terminal according to an embodiment of the present application. The embodiment of the present application provides a mobile terminal 7, including:
获取模块71,用于获取文本信息;The obtaining module 71 is used to obtain text information;
确定模块72,用于确定文本信息的垂域意图;The determining module 72 is used to determine the vertical intention of the text information;
召回模块73,在文本信息的垂域意图满足设定垂域意图时,用于召回文本信息;The recall module 73 is used to recall the text information when the vertical domain intention of the text information meets the set vertical domain intention;
意图分类模块74,在召回模块召回文本信息后,用于对文本信息进行意图分类;The intention classification module 74, after the text information is recalled by the recall module, is used to classify the text information by intention;
拒绝模块75,用于在文本信息的垂域意图不满足设定垂域意图时,拒绝文本信息。The rejection module 75 is used to reject the text information when the vertical intention of the text information does not meet the set vertical intention.
In the above manner, for text information of uncertain complexity, the determining module 72 first determines the vertical domain intention of the text information; when the vertical domain intention of the text information satisfies the set vertical domain intention, the recall module 73 recalls the text information and the intention classification module 74 performs intention classification on it. Different classification methods are used for text information of uncertain complexity, and because these methods are also faster, text information of different recognition difficulty can quickly obtain a decision at different classification levels, thereby accelerating intention classification. The above approach therefore effectively avoids the resource waste of a complex classification model and accelerates the classification of text information, improving the speed of classifying and recognizing text information and reducing the computing resources occupied.
Further, referring to FIG. 8, FIG. 8 is a schematic structural diagram of another mobile terminal according to an embodiment of the present application. An embodiment of the present application provides a mobile terminal 8, including a processor 81, a memory 82, and a computer program 821 stored in the memory and running on the processor; the processor 81 is configured to execute the computer program 821 to implement the steps of the method provided in the first aspect of the embodiments of the present application, which are not repeated here.
Referring to FIG. 9, FIG. 9 is a schematic circuit block diagram of an embodiment of the computer-readable storage medium of the present application. If implemented in the form of a software functional unit and sold or used as an independent product, the solution can be stored in the computer-readable storage medium 100. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a readable storage medium and includes several instructions (program data 101) for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the various embodiments of the present application. The aforementioned readable storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, as well as electronic devices such as computers, mobile phones, notebook computers, tablet computers and cameras equipped with such readable storage media.
Further, an embodiment of the present application also provides a mobile terminal device. Referring to FIG. 10, FIG. 10 is a schematic structural diagram of an embodiment of the mobile terminal device of the present invention. The mobile terminal device may be a mobile phone, a tablet computer, a notebook computer, a wearable device, or the like; this embodiment takes a mobile phone as an example. The structure of the terminal device 900 may include a radio frequency (RF) circuit 910, a memory 920, an input unit 930, a display unit 940 (i.e. the display screen assembly 600 in the foregoing embodiment), a sensor 950, an audio circuit 960, a WiFi (wireless fidelity) module 970, a processor 980, a power supply 990, and so on. The RF circuit 910, the memory 920, the input unit 930, the display unit 940, the sensor 950, the audio circuit 960 and the WiFi module 970 are respectively connected to the processor 980, and the power supply 990 supplies power to the entire terminal device 900.
Specifically, the RF circuit 910 is used for receiving and transmitting signals; the memory 920 is used for storing data and instruction information; the input unit 930 is used for inputting information and may specifically include a touch panel 931 and other input devices 932 such as operation keys; the display unit 940 may include a display panel 941 and the like; the sensor 950 includes an infrared sensor, a laser sensor and the like, for detecting user proximity signals, distance signals and so on; a speaker 961 and a microphone 962 are connected to the processor 980 through the audio circuit 960 for receiving and sending sound signals; the WiFi module 970 is used for receiving and transmitting WiFi signals; and the processor 980 is used for processing the data information of the mobile terminal device.
关于具有存储功能的装置中的程序数据的执行过程的阐述可以参照上述本申请移动终端的文本信息分类方法实施例中阐述,在此不再赘述。For the description of the execution process of the program data in the device with the storage function, reference may be made to the description in the embodiment of the text information classification method of the mobile terminal of the present application, which is not repeated here.
The above are only some of the embodiments of the present application and do not limit the scope of protection of the present application. Any equivalent device or equivalent process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims (20)

  1. 一种文本信息的分类方法,其特征在于,所述方法包括:A method for classifying text information, characterized in that the method includes:
    获取文本信息;Get text information;
    确定所述文本信息的垂域意图;Determine the vertical intention of the text information;
    在所述文本信息的垂域意图满足设定垂域意图时,召回所述文本信息,再对所述文本信息进行意图分类;When the vertical domain intention of the text information satisfies the vertical domain intention setting, recall the text information, and then classify the text information by intention;
    在所述文本信息的垂域意图不满足所述设定垂域意图时,拒绝所述文本信息。When the vertical domain intention of the text information does not satisfy the set vertical domain intention, the text information is rejected.
  2. 根据权利要求1所述的方法,其特征在于,The method of claim 1, wherein:
    所述确定所述文本信息的垂域意图,包括:The determining the vertical intention of the text information includes:
    将所述文本信息与多个关键词组进行关键词匹配,以得到多个匹配度;其中,每个关键词组对应一个垂域意图;Perform keyword matching on the text information with multiple keyword groups to obtain multiple matching degrees; wherein, each keyword group corresponds to a vertical domain intention;
    将匹配度最高的一个关键词组所对应的垂域意图,作为所述文本信息的垂域意图。The vertical intent corresponding to the keyword group with the highest matching degree is taken as the vertical intent of the text information.
  3. 根据权利要求2所述的方法,其特征在于,The method of claim 2, wherein:
    所述关键词组为多个关键词形成的正则表达式。The keyword group is a regular expression formed by multiple keywords.
  4. 根据权利要求1所述的方法,其特征在于,The method of claim 1, wherein:
    所述在所述文本信息的垂域意图满足设定垂域意图时,召回所述文本信息,并对所述文本信息进行意图分类,包括:The recalling the text information when the vertical domain intention of the text information satisfies the setting vertical domain intention and classifying the text information with intention includes:
    When the vertical domain intention of the text information satisfies the set vertical domain intention, recalling the text information and performing the first intention classification on the text information; wherein the first intention classification includes at least one intention class and one rejection class;
    在所述文本信息满足所述第一次意图分类的至少一个意图类中的一个时,对所述文本信息进行槽位提取;或When the text information satisfies one of the at least one intent class of the first intent classification, perform slot extraction on the text information; or
    在所述文本信息满足所述第一次意图分类的拒绝类时,则拒绝所述文本信息;或When the text information meets the rejection category of the first intention classification, reject the text information; or
    在所述文本信息不满足所述第一次意图分类的至少一个意图类和所述拒绝类时,对所述文本信息进行第二次意图分类。When the text information does not satisfy at least one intent class of the first intention classification and the rejection class, perform a second intention classification on the text information.
  5. 根据权利要求4所述的方法,其特征在于,The method of claim 4, wherein:
    所述对所述文本信息进行第一次意图分类,包括:The first intention classification of the text information includes:
    The text information is input into the regular recall model, so that the regular recall model performs the first intention classification on the text information and outputs a first intention classification result; wherein the first intention classification result includes at least one intention-class result and one rejection-class result.
  6. 根据权利要求5所述的方法,其特征在于,The method of claim 5, wherein:
    所述正则召回模型对所述文本信息进行第一次意图分类,包括:The regular recall model performs the first intention classification of the text information, including:
    将所述文本信息,与所述正则召回模型的正则数据库,进行串行匹配,以对所述文本信息进行第一次意图分类。The text information is serially matched with the regular database of the regular recall model to perform the first intention classification of the text information.
  7. 根据权利要求4所述的方法,其特征在于,The method of claim 4, wherein:
    所述对所述文本信息进行第二次意图分类,包括:The second intention classification of the text information includes:
    The text information is input into the intention classification model, so that the intention classification model performs a second intention classification on the text information and outputs a second intention classification result; wherein the second intention classification result includes at least one intention class and one rejection class.
  8. 根据权利要求7所述的方法,其特征在于,所述意图分类模型包括Text CNN模型、LSTM模型、GRU模型、Transformer模型、Bert模型或T5模型中的一种。The method according to claim 7, wherein the intent classification model includes one of a Text CNN model, an LSTM model, a GRU model, a Transformer model, a Bert model, or a T5 model.
  9. 根据权利要求7所述的方法,其特征在于,The method according to claim 7, wherein:
    所述将所述文本信息输入至意图分类模型,以使所述意图分类模型对所述文本信息进行第二次意图分类,并输出第二意图分类结果,包括:The inputting the text information into the intent classification model so that the intent classification model performs a second intent classification on the text information and outputting the second intent classification result includes:
    在所述文本信息满足所述第二意图分类结果中的至少一个意图类时,对所述文本信息进行槽位提取;或When the text information satisfies at least one intent category in the second intent classification result, perform slot extraction on the text information; or
    在所述文本信息满足所述第二意图分类结果中的拒绝类时,则拒绝所述文本信息。When the text information satisfies the rejection category in the second intention classification result, the text information is rejected.
  10. 根据权利要求4所述的方法,其特征在于,The method of claim 4, wherein:
    所述在所述对所述文本信息进行第二次意图分类之前,还包括:Before the second intent classification is performed on the text information, the method further includes:
    再次确定所述文本信息的垂域意图;Determine the vertical intention of the text information again;
    在所述文本信息的垂域意图满足所述设定垂域意图时,执行所述对所述文本信息进行第二次意图分类的步骤。When the vertical domain intention of the text information satisfies the set vertical domain intention, the step of performing the second intention classification of the text information is performed.
  11. 根据权利要求10所述的方法,其特征在于,The method of claim 10, wherein:
    所述再次确定所述文本信息的垂域意图,包括:The re-determining the vertical intention of the text information includes:
    The text information is input into the intention recall model, so that the intention recall model determines the vertical domain intention of the text information; wherein the intention recall model determines the vertical domain intention of the text information according to a set confidence threshold, and the set confidence threshold is adjustable.
  12. 根据权利要求11所述的方法,其特征在于,The method of claim 11, wherein:
    所述意图召回模型根据设定置信度阈值确定所述文本信息的垂域意图,包括:The intention recall model determines the vertical intention of the text information according to the set confidence threshold, including:
    若满足设定置信度阈值,则确定所述文本信息的垂域意图;If the set confidence threshold is satisfied, the vertical domain intention of the text information is determined;
    若不满足设定置信度阈值,则确定拒绝所述文本信息。If the set confidence threshold is not met, it is determined to reject the text information.
  13. 根据权利要求11所述的方法,其特征在于,所述意图召回模型为Fast Text神经网络模型或CNN模型。The method according to claim 11, wherein the intention recall model is a Fast Text neural network model or a CNN model.
  14. 根据权利要求1所述的方法,其特征在于,The method of claim 1, wherein:
    所述对所述文本信息进行意图分类之后,还包括:After the intention classification of the text information, the method further includes:
    利用验证规则库,对所述文本信息的槽位进行验证。The verification rule base is used to verify the slots of the text information.
  15. 根据权利要求4或9所述的方法,其特征在于,The method according to claim 4 or 9, characterized in that,
    所述对所述文本信息进行槽位提取,包括:The performing slot extraction on the text information includes:
    将所述文本信息输入至槽位提取模块,以使所述槽位提取模块对所述文本信息进行槽位提取,并输出槽位提取结果。The text information is input to the slot extraction module, so that the slot extraction module performs slot extraction on the text information, and outputs the slot extraction result.
  16. 根据权利要求15所述的方法,其特征在于,The method of claim 15, wherein:
    所述槽位提取模块包括Bi-LSTM模型和CRF模型。The slot extraction module includes a Bi-LSTM model and a CRF model.
  17. 根据权利要求15所述的方法,其特征在于,The method of claim 15, wherein:
    所述槽位提取模块为槽位正则表达式模型。The slot extraction module is a slot regular expression model.
  18. 一种移动终端,其特征在于,包括:A mobile terminal, characterized in that it comprises:
    获取模块,用于获取文本信息;Obtaining module for obtaining text information;
    确定模块,用于确定所述文本信息的垂域意图;The determination module is used to determine the vertical intention of the text information;
    召回模块,在所述文本信息的垂域意图满足设定垂域意图时,用于召回所述文本信息;The recall module is used to recall the text information when the vertical domain intention of the text information meets the set vertical domain intention;
    意图分类模块,在召回模块召回所述文本信息后,用于对所述文本 信息进行意图分类;The intention classification module is used to classify the text information by intention after the text information is recalled by the recall module;
    拒绝模块,用于在所述文本信息的垂域意图不满足所述设定垂域意图时,拒绝所述文本信息。The rejection module is used to reject the text information when the vertical intention of the text information does not satisfy the set vertical intention.
  19. A mobile terminal, characterized by comprising: a processor and a memory, wherein a computer program is stored in the memory, and the processor is configured to execute the computer program to implement the classification method according to any one of claims 1-17.
  20. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质存储有计算机程序,所述计算机程序能够被处理器执行以实现如权利要求1-17中任一项所述的分类方法。A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program can be executed by a processor to implement the classification method according to any one of claims 1-17 .
PCT/CN2020/093987 2020-06-02 2020-06-02 Text information classification method, mobile terminal, and computer-readable storage medium WO2021243575A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/093987 WO2021243575A1 (en) 2020-06-02 2020-06-02 Text information classification method, mobile terminal, and computer-readable storage medium
CN202080098028.7A CN115605861A (en) 2020-06-02 2020-06-02 Text information classification method, mobile terminal and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/093987 WO2021243575A1 (en) 2020-06-02 2020-06-02 Text information classification method, mobile terminal, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2021243575A1 true WO2021243575A1 (en) 2021-12-09

Family

ID=78831497

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093987 WO2021243575A1 (en) 2020-06-02 2020-06-02 Text information classification method, mobile terminal, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN115605861A (en)
WO (1) WO2021243575A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766371A (en) * 2016-08-19 2018-03-06 中兴通讯股份有限公司 A kind of text message sorting technique and its device
US20190236140A1 (en) * 2018-02-01 2019-08-01 International Business Machines Corporation Responding to an indirect utterance by a conversational system
CN109948017A (en) * 2018-04-26 2019-06-28 华为技术有限公司 A kind of information processing method and device
CN109871543A (en) * 2019-03-12 2019-06-11 广东小天才科技有限公司 A kind of intention acquisition methods and system
CN110162633A (en) * 2019-05-21 2019-08-23 深圳市珍爱云信息技术有限公司 Voice data is intended to determine method, apparatus, computer equipment and storage medium

Also Published As

Publication number Publication date
CN115605861A (en) 2023-01-13

Similar Documents

Publication Publication Date Title
CN109471938B (en) Text classification method and terminal
US20230222366A1 (en) Systems and methods for semantic analysis based on knowledge graph
CN109145299B (en) Text similarity determination method, device, equipment and storage medium
WO2021008015A1 (en) Intention recognition method, device and computer readable storage medium
CN107180084B (en) Word bank updating method and device
EP3115907A1 (en) Common data repository for improving transactional efficiencies of user interactions with a computing device
CN108920649B (en) Information recommendation method, device, equipment and medium
CN109271624B (en) Target word determination method, device and storage medium
CN111414471B (en) Method and device for outputting information
Song Sentiment analysis of Japanese text and vocabulary learning based on natural language processing and SVM
CN109101487A (en) Conversational character differentiating method, device, terminal device and storage medium
WO2021243575A1 (en) Text information classification method, mobile terminal, and computer-readable storage medium
CN110705282A (en) Keyword extraction method and device, storage medium and electronic equipment
CN116432664A (en) Dialogue intention classification method and system for high-quality data amplification
CN112966509B (en) Text quality evaluation method and device, storage medium and computer equipment
CN115329173A (en) Method and device for determining enterprise credit based on public opinion monitoring
CN112084780B (en) Coreference resolution method, device, equipment and medium in natural language processing
CN111625636B (en) Method, device, equipment and medium for rejecting man-machine conversation
CN114722832A (en) Abstract extraction method, device, equipment and storage medium
US20200073891A1 (en) Systems and methods for classifying data in high volume data streams
CN112712792A (en) Dialect recognition model training method, readable storage medium and terminal device
CN113239164B (en) Multi-round dialogue flow construction method and device, computer equipment and storage medium
Bei et al. Research on news text classification based on TextCNN
US11860919B2 (en) System and method for generating and obtaining remote classification of condensed large-scale text objects
CN112187768B (en) Method, device and equipment for detecting bad information website and readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20939144

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 11/05/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 20939144

Country of ref document: EP

Kind code of ref document: A1