CN113792146A - Text classification method and device based on artificial intelligence, electronic equipment and medium

Info

Publication number
CN113792146A
Authority
CN
China
Prior art keywords
text
enhancement
strategy
text classification
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111093400.8A
Other languages
Chinese (zh)
Inventor
孙金辉
马骏
王少军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202111093400.8A priority Critical patent/CN113792146A/en
Publication of CN113792146A publication Critical patent/CN113792146A/en
Priority to PCT/CN2022/071316 priority patent/WO2023040145A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention relates to the technical field of artificial intelligence, and provides a text classification method, a text classification device, electronic equipment and a text classification medium based on artificial intelligence, wherein the method comprises the following steps: constructing a search space; randomly selecting a target text enhancement strategy by adopting a preset search strategy; performing text enhancement on the original text set by using a target text enhancement strategy to obtain a first enhanced text set; calculating a verification passing rate according to the original text set and the first enhanced text set; determining a target text classification model and an optimal text enhancement strategy; and performing text enhancement on the text set to be classified by adopting an optimal text enhancement strategy to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into a target text classification model to obtain a text classification result. According to the method, the search space is constructed, the preset search strategy is adopted, the optimal text enhancement strategy is searched for each data set in a customized mode, and the accuracy of text classification is improved.

Description

Text classification method and device based on artificial intelligence, electronic equipment and medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a text classification method and device based on artificial intelligence, electronic equipment and a medium.
Background
The text classification task is one of the most important tasks in natural language processing. At present, deep learning models such as CNNs and RNNs are widely applied to text classification tasks, and text enhancement is performed after a large number of texts have been labeled.
However, in the prior art, labeling text consumes a large amount of labor and time. In addition, some hyper-parameters must be set manually when the text is enhanced. These hyper-parameters are obtained through manual experience and a large number of comparison experiments, so an optimal text enhancement strategy cannot be found quickly and accurately, and the accuracy and efficiency of the text classification result are low.
Therefore, it is necessary to provide a method that can classify text accurately.
Disclosure of Invention
In view of the above, it is necessary to provide a text classification method, device, electronic device and medium based on artificial intelligence, which are used to build a search space and adopt a preset search strategy to search an optimal text enhancement strategy for each data set in a customized manner, so as to improve the accuracy of text classification.
A first aspect of the present invention provides a text classification method based on artificial intelligence, the method comprising:
analyzing the received text classification request and constructing a search space, wherein the search space comprises a plurality of text enhancement strategies;
randomly selecting a text enhancement strategy from the search space by adopting a preset search strategy as a target text enhancement strategy, wherein the preset search strategy comprises a controller;
performing text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy to obtain a first enhanced text set;
inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;
inputting a verification set in the text classification request into the first text classification model for verification, and calculating a verification passing rate;
determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate;
and performing text enhancement on the text set to be classified in the text classification request by adopting the optimal text enhancement strategy to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
Optionally, the parsing the received text classification request to construct a search space includes:
analyzing the received text classification request to obtain four types of hyper-parameters: a category label, an operation type, a probability value of applying the operation type, and a proportion of the words in each text to which the operation is applied;
performing combined operation on the four types of hyper-parameters to obtain a plurality of text enhancement strategies, wherein each text enhancement strategy consists of the four types of hyper-parameters;
a search space is constructed based on the plurality of text enhancement strategies.
Optionally, the operation type includes one or more of the following combinations: synonym replacement, random insertion, random exchange, random deletion.
Optionally, the randomly selecting one text enhancement policy from the search space by using a preset search policy includes:
inputting the plurality of text enhancement strategies into a controller of the preset search strategy, wherein the controller randomly selects one of the hyper-parameter types from the plurality of text enhancement strategies as the input parameter of the current time step of the controller, inputs the input parameter of the current time step into the controller, and outputs an output value of the current time step;
the controller randomly selects one of the remaining hyper-parameter types in the plurality of text enhancement strategies as the input parameter of the next time step, uses the input parameter of the next time step together with the output value of the current time step as the target input parameters of the next time step, inputs the target input parameters of the next time step into the controller, and outputs the output value of the next time step;
and cyclically executing the selection of the hyper-parameters and the determination of the input parameters until the output value corresponding to each of the four types of hyper-parameters is obtained, and determining the four output values as the target text enhancement strategy.
Optionally, the text enhancement on each text in the original text set in the text classification request by using the target text enhancement policy to obtain a first enhanced text set includes:
identifying an output value corresponding to each hyper-parameter in the target text enhancement policy;
and performing text enhancement on each text in the original text set based on the output value corresponding to each hyper-parameter to obtain a first enhanced text set.
Optionally, the determining, according to the verification passing rate, a target text classification model and an optimal text enhancement policy corresponding to the text classification request includes:
and when the verification passing rate meets a preset convergence condition in the text classification request, determining the first text classification model as a target text classification model and determining the target text enhancement strategy as an optimal text enhancement strategy.
Optionally, the method further comprises:
when the verification passing rate does not meet the preset convergence condition in the text classification request, updating the model parameters in the controller based on the verification passing rate to obtain an updated controller;
randomly selecting a new text enhancement strategy from the search space by adopting the updated controller to serve as a new target text enhancement strategy, and performing text enhancement on the original text set by using the new target text enhancement strategy to obtain a second enhanced text set;
inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model, inputting the verification set in the text classification request into the second text classification model for verification, and calculating a verification passing rate;
and repeatedly executing the step of updating the model parameters in the controller according to the verification passing rate to reselect a new text enhancement strategy for text enhancement to obtain the verification passing rate until the verification passing rate meets the preset convergence condition corresponding to the controller, determining the text classification model corresponding to the verification passing rate as a target text classification model and determining the new target text enhancement strategy corresponding to the verification passing rate as an optimal text enhancement strategy.
A second aspect of the present invention provides an artificial intelligence based text classification apparatus, the apparatus comprising:
the analysis module is used for analyzing the received text classification request and constructing a search space, wherein the search space comprises a plurality of text enhancement strategies;
the selection module is used for randomly selecting a text enhancement strategy from the search space by adopting a preset search strategy as a target text enhancement strategy, wherein the preset search strategy comprises a controller;
the text enhancement module is used for performing text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy to obtain a first enhanced text set;
the first input module is used for inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;
the verification module is used for inputting a verification set in the text classification request into the first text classification model for verification and calculating a verification passing rate;
the determining module is used for determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate;
and the second input module is used for performing text enhancement on the text set to be classified in the text classification request by adopting the optimal text enhancement strategy to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
A third aspect of the invention provides an electronic device comprising a processor and a memory, the processor being configured to implement the artificial intelligence based text classification method when executing a computer program stored in the memory.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the artificial intelligence based text classification method.
In summary, according to the text classification method, apparatus, electronic device and medium based on artificial intelligence, on one hand, a search space is constructed from all the text enhancement strategies corresponding to the text classification request, which ensures the integrity of the text enhancement strategies in the search space and improves the accuracy of the optimal text enhancement strategy subsequently selected from it. On the other hand, a controller in a preset search strategy is adopted to randomly select a text enhancement strategy from the search space. Because the input of the next time step of the controller is determined by both the output value of the previous time step and the input parameter of the next time step, the previous output also influences the processing at the next step, which improves the relevance between steps, ensures the reliability of the output value of each hyper-parameter, and improves the accuracy of the randomly selected text enhancement strategy. Finally, each text in the original text set is enhanced with the randomly selected target text enhancement strategy, so no manual labeling is required and a large amount of labor and time is saved, which improves the efficiency and accuracy of text enhancement.
Drawings
Fig. 1 is a flowchart of a text classification method based on artificial intelligence according to an embodiment of the present invention.
Fig. 2 is a block diagram of an artificial intelligence based text classification apparatus according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
Example one
Fig. 1 is a flowchart of a text classification method based on artificial intelligence according to an embodiment of the present invention.
In this embodiment, the text classification method based on artificial intelligence can be applied to an electronic device, and for an electronic device that needs to perform text classification based on artificial intelligence, the function of text classification based on artificial intelligence provided by the method of the present invention can be directly integrated on the electronic device, or can be run in the electronic device in the form of a Software Development Kit (SDK).
The embodiment of the invention can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning, deep learning and the like.
As shown in FIG. 1, the text classification method based on artificial intelligence specifically includes the following steps, and the order of the steps in the flowchart can be changed and some steps can be omitted according to different requirements.
S11, analyzing the received text classification request, and constructing a search space, wherein the search space comprises a plurality of text enhancement strategies.
In this embodiment, when text classification is to be performed, a user initiates a text classification request to a server through a client. Specifically, the client may be a smartphone, an iPad, or another existing intelligent device, and the server may be a text classification subsystem. For example, during text classification the client sends the text classification request to the text classification subsystem, which receives the request, analyzes it, and constructs a search space according to the analysis result.
In an alternative embodiment, the parsing the received text classification request to construct a search space includes:
analyzing the received text classification request to obtain four types of hyper-parameters: a category label, an operation type, a probability value of applying the operation type, and a proportion of the words in each text to which the operation is applied;
performing combined operation on the four types of hyper-parameters to obtain a plurality of text enhancement strategies, wherein each text enhancement strategy consists of the four types of hyper-parameters;
a search space is constructed based on the plurality of text enhancement strategies.
In this embodiment, if the text classification request includes 5 category labels, 4 operation types, 11 probability values of applying the operation type, and 11 proportions of words to which the operation is applied, the constructed search space includes 5 × 4 × 11 × 11 = 2420 text enhancement strategies.
Specifically, a category label identifies texts of the same type.
Specifically, the operation types include one or more of the following modes in combination: synonym replacement, random insertion, random exchange, random deletion.
In this embodiment, the probability of applying the operation type refers to the probability that text enhancement is performed, and may be discretized into 11 values from 0 to 1 at intervals of 0.1; the proportion of words to which the operation is applied refers to the proportion of words selected from each text, and may be discretized into 11 values from 0 to 0.5 at intervals of 0.05.
In the embodiment, a search space is constructed by adopting all the text enhancement strategies corresponding to the text classification request, so that the integrity of the text enhancement strategies in the search space is ensured, and the accuracy of the optimal text enhancement strategy selected from the search space subsequently is improved.
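The construction of the search space described above can be sketched as a Cartesian product of the four hyper-parameter types. This is a minimal illustration, not the patented implementation: the label names, the operation identifiers, and the dictionary keys are hypothetical, and only the counts (5, 4, 11, 11) and the discretization intervals come from the text.

```python
from itertools import product

# Hypothetical values; the text only fixes the counts (5, 4, 11, 11).
CATEGORY_LABELS = ["A", "B", "C", "D", "E"]
OPERATION_TYPES = [
    "synonym_replacement",
    "random_insertion",
    "random_exchange",
    "random_deletion",
]
# 11 probabilities from 0 to 1 at intervals of 0.1.
APPLY_PROBABILITIES = [round(i * 0.1, 2) for i in range(11)]
# 11 word proportions from 0 to 0.5 at intervals of 0.05.
WORD_PROPORTIONS = [round(i * 0.05, 2) for i in range(11)]


def build_search_space():
    """Return every combination of the four hyper-parameter types."""
    return [
        {"label": label, "operation": op, "probability": p, "proportion": r}
        for label, op, p, r in product(
            CATEGORY_LABELS, OPERATION_TYPES, APPLY_PROBABILITIES, WORD_PROPORTIONS
        )
    ]


search_space = build_search_space()
print(len(search_space))  # 2420
```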
And S12, randomly selecting a text enhancement strategy from the search space by adopting a preset search strategy as a target text enhancement strategy, wherein the preset search strategy comprises a controller.
In this embodiment, the preset search strategy may be an ENAS (Efficient Neural Architecture Search) search strategy, which explores neural network model structures efficiently by sharing model parameters. Specifically, the preset search strategy uses a controller; the controller is an RNN model, and the RNN model determines the calculation type of each node and selects the activated edges.
In an optional embodiment, the randomly selecting one text enhancement policy from the search space by using a preset search policy includes:
inputting the plurality of text enhancement strategies into a controller of the preset search strategy, wherein the controller randomly selects one of the hyper-parameter types from the plurality of text enhancement strategies as the input parameter of the current time step of the controller, inputs the input parameter of the current time step into the controller, and outputs an output value of the current time step;
the controller randomly selects one of the remaining hyper-parameter types in the plurality of text enhancement strategies as the input parameter of the next time step, uses the input parameter of the next time step together with the output value of the current time step as the target input parameters of the next time step, inputs the target input parameters of the next time step into the controller, and outputs the output value of the next time step;
and cyclically executing the selection of the hyper-parameters and the determination of the input parameters until the output value corresponding to each of the four types of hyper-parameters is obtained, and determining the four output values as the target text enhancement strategy.
In this embodiment, the controller has one input at each time step. The embodiment includes four types of hyper-parameters, and each time step corresponds to one of the four. The output value of each time step is fed back into the controller, and the output value corresponding to each hyper-parameter is obtained through the Softmax layer of the controller. Because the input of the next time step is determined by both the output value of the previous time step and the input parameter of the next time step, the previous output also influences the processing at the next step, which improves the relevance between steps, ensures the reliability of the output value of each hyper-parameter, and improves the accuracy of the randomly selected text enhancement strategy.
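The step-by-step selection can be illustrated with a toy recurrent sampler written in plain Python. This is a sketch under stated assumptions rather than the controller of the embodiment: the hidden size, the random weight initialization, the fixed order of the four time steps, and the names `ToyController` and `CHOICES` are all hypothetical. What it does show is the dependency described above: the value chosen at one time step is fed forward as input to the next, and each choice is drawn from a softmax distribution.

```python
import math
import random

# Hypothetical choice sets for the four hyper-parameter types.
CHOICES = [
    ["A", "B", "C", "D", "E"],
    ["synonym_replacement", "random_insertion", "random_exchange", "random_deletion"],
    [round(i * 0.1, 2) for i in range(11)],
    [round(i * 0.05, 2) for i in range(11)],
]
HIDDEN = 8  # assumed hidden size


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class ToyController:
    """Tiny recurrent sampler: one time step per hyper-parameter type."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.w_hh = rand_matrix(HIDDEN, HIDDEN, self.rng)
        # One embedding table and one output head per time step.
        self.embed = [rand_matrix(len(c), HIDDEN, self.rng) for c in CHOICES]
        self.heads = [rand_matrix(len(c), HIDDEN, self.rng) for c in CHOICES]

    def sample(self):
        hidden = [0.0] * HIDDEN
        step_input = [0.0] * HIDDEN  # no previous output at the first step
        policy, log_prob = [], 0.0
        for step, options in enumerate(CHOICES):
            # The new state depends on the previous state and previous choice.
            recur = matvec(self.w_hh, hidden)
            hidden = [math.tanh(r + x) for r, x in zip(recur, step_input)]
            probs = softmax(matvec(self.heads[step], hidden))
            index = self.rng.choices(range(len(options)), weights=probs)[0]
            policy.append(options[index])
            log_prob += math.log(probs[index])
            step_input = self.embed[step][index]  # feed the choice forward
        return policy, log_prob


policy, log_prob = ToyController(seed=0).sample()
print(policy, log_prob)
```

The accumulated log-probability is returned because a controller of this kind is typically trained from a scalar feedback signal; how the embodiment actually updates the controller parameters is not specified beyond its use of the verification passing rate.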
S13, performing text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy to obtain a first enhanced text set.
In this embodiment, the text classification request further includes an original text set, where the original text set includes a plurality of texts.
In an optional embodiment, before the text enhancement is performed on each text in the original text set in the text classification request by using the target text enhancement policy to obtain the first enhanced text set, the method further includes:
and cleaning each text in the original text set according to a preset text cleaning strategy.
In this embodiment, a text cleaning policy may be preset. The preset text cleaning policy may be to clean texts whose display formats for time, date, numerical values, full-width and half-width characters and the like are inconsistent, texts whose content contains characters that should not be present, and texts whose content does not match what the field should contain.
In this embodiment, each text in the original text set is cleaned, which reduces interfering factors for subsequent text enhancement and improves the efficiency and accuracy of text enhancement.
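Two of the cleaning rules mentioned above, unifying full-width and half-width characters and removing characters that should not be present, can be sketched with the Python standard library. The function name `clean_text` and the decision to treat Unicode control and format characters as the unwanted ones are assumptions for illustration; the embodiment does not specify its cleaning rules at this level of detail.

```python
import unicodedata


def clean_text(text):
    """Normalize width variants and drop non-printing characters."""
    # NFKC maps full-width letters, digits and punctuation to half-width forms.
    normalized = unicodedata.normalize("NFKC", text)
    # Unicode categories starting with "C" cover control, format, private-use
    # and unassigned characters; ordinary whitespace is kept.
    kept = [
        ch
        for ch in normalized
        if ch.isspace() or not unicodedata.category(ch).startswith("C")
    ]
    # Collapse runs of whitespace into single spaces.
    return " ".join("".join(kept).split())


print(clean_text("ＡＢＣ１２３\u0000  text\u200b"))  # ABC123 text
```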
In an optional embodiment, the text enhancement on each text in the original text set in the text classification request by using the target text enhancement policy to obtain a first enhanced text set includes:
identifying an output value corresponding to each hyper-parameter in the target text enhancement policy;
and performing text enhancement on each text in the original text set based on the output value corresponding to each hyper-parameter to obtain a first enhanced text set.
Illustratively, suppose the target text enhancement strategy contains the following hyper-parameter values: the category label is class A, the operation type is random deletion, the proportion of words to which the operation is applied is 0.2, and the probability of applying the operation type is 1. For one text in the original text set, applying the target text enhancement strategy yields a first enhanced text such as "I am Chinese", "I Chinese" or "is Chinese".
Illustratively, suppose the target text enhancement strategy contains the following hyper-parameter values: the category label is class A, the operation type is random deletion, the proportion of words to which the operation is applied is 0.4, and the probability of applying the operation type is 1. For one text in the original text set, applying the target text enhancement strategy yields a first enhanced text such as "I am in", "I am Chinese" or "is Chinese".
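The way a strategy's four values drive enhancement can be sketched as follows. Several points are assumptions made for illustration: the five-token sample text is hypothetical, the category label is interpreted as restricting the strategy to texts of that class, the number of affected words is taken as the rounded product of proportion and text length, and at least one word is always kept. Only random deletion and random exchange are shown, because synonym replacement and random insertion need a synonym resource that the text does not describe.

```python
import random


def random_deletion(words, proportion, rng):
    """Delete round(proportion * len(words)) randomly chosen words."""
    count = round(proportion * len(words))
    # Keep at least one word so the text never becomes empty.
    count = max(0, min(count, len(words) - 1))
    drop = set(rng.sample(range(len(words)), count))
    return [w for i, w in enumerate(words) if i not in drop]


def random_exchange(words, proportion, rng):
    """Swap randomly chosen pairs of positions, one swap per selected word."""
    result = list(words)
    if len(result) < 2:
        return result
    for _ in range(round(proportion * len(result))):
        i, j = rng.sample(range(len(result)), 2)
        result[i], result[j] = result[j], result[i]
    return result


OPERATIONS = {"random_deletion": random_deletion, "random_exchange": random_exchange}


def enhance(words, label, policy, rng):
    """Apply the policy to one text if its label matches and the draw succeeds."""
    if label != policy["label"] or rng.random() >= policy["probability"]:
        return list(words)
    return OPERATIONS[policy["operation"]](words, policy["proportion"], rng)


rng = random.Random(0)
words = ["I", "am", "a", "Chinese", "person"]  # hypothetical sample text
policy = {"label": "A", "operation": "random_deletion",
          "probability": 1.0, "proportion": 0.2}
print(enhance(words, "A", policy, rng))  # four of the five words remain
```

With a proportion of 0.2 one of the five words is removed, and with 0.4 two are removed, which mirrors the two illustrations above.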
In this embodiment, data enhancement technology is widely used to make effective use of limited labeled corpora, improve model efficiency, and reduce the dependence on the amount of labeled data. This embodiment performs text enhancement on each text in the original text set using the randomly selected target text enhancement strategy, which ensures the diversity and integrity of the text set subsequently input into the preset neural network. In particular, for small-sample data sets and data sets with unbalanced categories, the text enhancement strategy can increase the data volume of a small-sample data set and bring a category-unbalanced data set into balance, which improves the effectiveness and robustness of the model subsequently trained on the enhanced data set. Moreover, because the enhancement uses the randomly selected target text enhancement strategy, no manual labeling is required and a large amount of labor and time is saved, which improves the efficiency and accuracy of text enhancement.
And S14, inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model.
In this embodiment, a neural network may be preset; the preset neural network may be an existing convolutional neural network or an inverse graph network. After the original text set and the first enhanced text set are obtained, a text classification model is trained based on the original text set and the first enhanced text set.
And S15, inputting the verification set in the text classification request into the first text classification model for verification, and calculating the verification passing rate.
In this embodiment, the text classification request further includes a verification set. After the first text classification model is trained, the verification passing rate of the first text classification model is calculated based on the verification set, and whether the first text classification model is stable can be determined according to the verification passing rate.
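The text does not define how the verification passing rate is computed. A minimal sketch, assuming it is the fraction of verification texts whose predicted label matches the true label, could look like this; the keyword-based `toy_predict` and the sample data are hypothetical stand-ins for a trained model and a real verification set.

```python
def verification_pass_rate(predict, verification_set):
    """Fraction of verification texts whose predicted label equals the true label."""
    if not verification_set:
        raise ValueError("verification set is empty")
    correct = sum(1 for text, label in verification_set if predict(text) == label)
    return correct / len(verification_set)


# Hypothetical stand-in for a trained model: a keyword rule.
def toy_predict(text):
    return "sports" if "match" in text else "finance"


verification = [
    ("the match ended in a draw", "sports"),
    ("stocks fell sharply", "finance"),
    ("the match was postponed", "sports"),
    ("a record match fee was paid", "finance"),
]
print(verification_pass_rate(toy_predict, verification))  # 0.75
```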
And S16, determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate.
In this embodiment, the target text classification model refers to the text classification model whose verification passing rate satisfies the preset convergence condition in the text classification request, and the optimal text enhancement strategy refers to the selected text enhancement strategy which, when used to enhance the text, yields a trained target text classification model whose verification passing rate reaches that condition. Specifically, the preset convergence condition is used to determine, according to the verification passing rate, whether the controller has converged; only when the controller has converged is the obtained text enhancement strategy determined to be the optimal text enhancement strategy. For example, the preset convergence condition may be that the verification passing rate is greater than or equal to a preset verification passing rate threshold, or that the verification passing rate achieved with the text enhancement strategy on the text classification model no longer increases.
In an optional embodiment, the determining, according to the verification passing rate, a target text classification model and an optimal text enhancement policy corresponding to the text classification request includes:
when the verification passing rate meets a preset convergence condition in the text classification request, determining the first text classification model as a target text classification model and determining the target text enhancement strategy as an optimal text enhancement strategy; or
When the verification passing rate does not meet the preset convergence condition in the text classification request, updating the model parameters in the controller based on the verification passing rate to obtain an updated controller; randomly selecting a new text enhancement strategy from the search space by adopting the updated controller as a new target text enhancement strategy; performing text enhancement on the original text set by using the new target text enhancement strategy to obtain a second enhanced text set; inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model; inputting the verification set in the text classification request into the second text classification model for verification, and calculating the verification passing rate; and repeatedly executing the steps of updating the model parameters in the controller according to the verification passing rate, reselecting a new text enhancement strategy, performing text enhancement, and obtaining a verification passing rate, until the verification passing rate meets the preset convergence condition corresponding to the controller, and then determining the text classification model corresponding to that verification passing rate as the target text classification model and determining the new target text enhancement strategy corresponding to that verification passing rate as the optimal text enhancement strategy.
In this embodiment, the second enhanced text set is obtained by updating the controller and then applying the new text enhancement strategy that the updated controller selects.
In the embodiment, the optimal text enhancement strategy is searched for each data set in a customized manner by constructing the search space and adopting the preset search strategy, so that the accuracy of text classification is improved.
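The iterative search described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the names `search_optimal_strategy`, `sample_strategy`, `update_controller`, and `train_and_validate` are hypothetical, the convergence condition is assumed to be the threshold variant, and the `max_rounds` safety cap is an added assumption that does not appear in the description.

```python
from typing import Callable, Optional, Tuple

# (category label, operation type, application probability, word proportion)
Strategy = Tuple[str, str, float, float]


def search_optimal_strategy(
    sample_strategy: Callable[[], Strategy],
    update_controller: Callable[[Strategy, float], None],
    train_and_validate: Callable[[Strategy], float],
    pass_rate_threshold: float,
    max_rounds: int = 50,  # assumption: a cap so the loop always terminates
) -> Optional[Tuple[Strategy, float]]:
    """Sample, train, verify, and update the controller until convergence."""
    best: Optional[Tuple[Strategy, float]] = None
    for _ in range(max_rounds):
        strategy = sample_strategy()             # controller picks a strategy
        pass_rate = train_and_validate(strategy)  # enhance, train, verify
        if best is None or pass_rate > best[1]:
            best = (strategy, pass_rate)
        if pass_rate >= pass_rate_threshold:      # preset convergence condition
            return strategy, pass_rate
        update_controller(strategy, pass_rate)    # feed the pass rate back
    return best  # assumption: fall back to the best strategy seen so far
```

In this sketch `train_and_validate` is expected to wrap the enhancement, training, and verification steps and return the verification passing rate for the sampled strategy.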
S17, performing text enhancement on the text set to be classified in the text classification request by adopting the optimal text enhancement strategy to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
In this embodiment, the optimal text enhancement strategy is determined by constructing a search space, searching a target text enhancement strategy by using a preset search strategy, and determining the optimal text enhancement strategy based on a verification pass rate of the target text enhancement strategy in the target text classification model, wherein the optimal text enhancement strategy is obtained by searching by using a search strategy instead of manual experience and a large number of comparison experiments after the hyper-parameters are manually set.
In the embodiment, an optimal text enhancement strategy and a target text classification model are determined by adopting a search strategy, the optimal text enhancement strategy is adopted to perform text enhancement on a text set to be classified, it is ensured that an obtained third enhanced text set is most stable, and meanwhile, the text set to be classified and the third enhanced text set are input into the target text classification model to perform text classification, so that the accuracy of text classification is improved.
In summary, in the text classification method based on artificial intelligence according to this embodiment, on one hand, a search space is constructed by using all text enhancement strategies corresponding to the text classification request, so that the integrity of the text enhancement strategies in the search space is ensured, and the accuracy of the optimal text enhancement strategy selected from the search space in the following process is improved; on the other hand, a controller in a preset search strategy is adopted to randomly select a text enhancement strategy from the search space, because the input parameter of the next time step of the controller is determined by the output value of the previous time step and the input parameter of the next time step, the previous output also influences the next text processing, the relevance of the text is improved, the reliability of the output value of each hyper-parameter is ensured, and the accuracy of the randomly selected text enhancement strategy is improved; finally, each text in the original text set is subjected to text enhancement by adopting a randomly selected target text enhancement strategy, manual marking is not needed, a large amount of manpower and time are not needed, and the efficiency and the accuracy of text enhancement are improved.
Example two
Fig. 2 is a block diagram of an artificial intelligence based text classification apparatus according to a second embodiment of the present invention.
In some embodiments, the artificial intelligence based text classification apparatus 20 may include a plurality of functional modules comprised of program code segments. Program code for various program segments of the artificial intelligence based text classification apparatus 20 may be stored in a memory of the electronic device and executed by the at least one processor to perform (see, e.g., fig. 1 for details) the functions of artificial intelligence based text classification.
In this embodiment, the artificial intelligence based text classification apparatus 20 may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: a parsing module 201, a selecting module 202, a text enhancement module 203, a first input module 204, a verification module 205, a determination module 206 and a second input module 207. A module referred to herein is a series of computer readable instruction segments that are stored in a memory, can be executed by at least one processor, and perform a fixed function. The functions of the modules are described in detail below.
The parsing module 201 is configured to parse the received text classification request to construct a search space, where the search space includes a plurality of text enhancement strategies.
In this embodiment, when performing text classification, a user initiates a text classification request to a server through a client. Specifically, the client may be a smartphone, an iPad, or another existing intelligent device, and the server may be a text classification subsystem. In the text classification process, for example, the client sends the text classification request to the text classification subsystem, which receives the request, parses it, and constructs a search space according to the parsing result.
In an alternative embodiment, the parsing module 201 parses the received text classification request to construct a search space, which includes:
analyzing the received text classification request to obtain four types of hyper-parameters: a category label, an operation type, a probability value of applying the operation type, and a proportion of words in each text to which the operation is applied;
performing combined operation on the four types of hyper-parameters to obtain a plurality of text enhancement strategies, wherein each text enhancement strategy consists of the four types of hyper-parameters;
a search space is constructed based on the plurality of text enhancement strategies.
In this embodiment, if the text classification request includes 5 category labels, 4 operation types, 11 probability values for applying the operation type, and 11 proportions of words to which the operation is applied, the constructed search space includes 5 × 4 × 11 × 11 = 2420 text enhancement strategies.
Specifically, each category label identifies texts of the same type.
Specifically, the operation types include one or more of the following modes in combination: synonym replacement, random insertion, random exchange, random deletion.
In this embodiment, the probability of applying the operation type refers to the probability that text enhancement is performed, and may be discretized into 11 values from 0 to 1 at intervals of 0.1; the proportion of words to which the operation is applied refers to the proportion of words selected from each text, and may be discretized into 11 values from 0 to 0.5 at intervals of 0.05.
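As an illustration of how such a search space could be enumerated, the following sketch takes the Cartesian product of the four hyper-parameter types. The five category label names are placeholders, since the description only states how many labels there are; the variable names are likewise assumptions.

```python
from itertools import product

# Placeholder label names; the description only says there are 5 labels.
category_labels = ["A", "B", "C", "D", "E"]
operation_types = [
    "synonym replacement",
    "random insertion",
    "random exchange",
    "random deletion",
]
# 11 values from 0 to 1 at intervals of 0.1.
probabilities = [round(i * 0.1, 1) for i in range(11)]
# 11 values from 0 to 0.5 at intervals of 0.05.
word_proportions = [round(i * 0.05, 2) for i in range(11)]

# Each strategy is one combination of the four hyper-parameter types.
search_space = list(
    product(category_labels, operation_types, probabilities, word_proportions)
)
print(len(search_space))  # 2420 = 5 * 4 * 11 * 11
```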
In the embodiment, a search space is constructed by adopting all the text enhancement strategies corresponding to the text classification request, so that the integrity of the text enhancement strategies in the search space is ensured, and the accuracy of the optimal text enhancement strategy selected from the search space subsequently is improved.
The selecting module 202 is configured to randomly select a text enhancement strategy from the search space by using a preset search strategy as a target text enhancement strategy, where the preset search strategy includes a controller.
In this embodiment, the preset search strategy may be an ENAS (Efficient Neural Architecture Search) search strategy, which efficiently explores neural network model structures by sharing model parameters. Specifically, the preset search strategy uses a controller; the controller is an RNN model, which determines the calculation type of each node and selects the activated edges.
In an optional embodiment, the selecting module 202 randomly selects a text enhancement policy from the search space by using a preset search policy, where the selecting as the target text enhancement policy includes:
inputting the plurality of text enhancement strategies into the controller of the preset search strategy, wherein the controller randomly selects, from the plurality of text enhancement strategies, one hyper-parameter of any one type as the input parameter of the current time step of the controller, inputs the input parameter of the current time step into the controller, and outputs an output value of the current time step;
the controller randomly selects one hyper-parameter from any remaining hyper-parameters in the plurality of text enhancement strategies as an input parameter of the next time step, uses the first input parameter of the next time step and the output value of the current time step as target input parameters of the next time step, inputs the target input parameters of the next time step into the controller, and outputs the output value of the next time step;
and circularly executing the selection of the four types of hyper-parameters and the determination of the input parameters until obtaining the output parameters corresponding to each hyper-parameter, and determining four output values corresponding to the four types of hyper-parameters as a target text enhancement strategy.
In this embodiment, the controller has one input at each time step. This embodiment includes four types of hyper-parameters, and each time step corresponds to one of them; the parameter at each time step is input into the controller, and the output value corresponding to each hyper-parameter is obtained through the Softmax layer of the controller. Because the target input of the next time step is determined by the output value of the previous time step together with the input parameter of the next time step, the previous output also influences the processing at the next step, which improves the relevance of the text, ensures the reliability of the output value of each hyper-parameter, and improves the accuracy of the randomly selected text enhancement strategy.
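The step-by-step selection can be illustrated with a deliberately simplified stand-in for the RNN controller. The sketch below is not the ENAS controller itself: `ToyController`, its scalar hidden state, and its weight tables are hypothetical and chosen only to show that each step draws from a softmax distribution that depends on what was chosen at the previous step.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyController:
    """Toy recurrent sampler with one time step per hyper-parameter type."""

    def __init__(self, choices_per_step, seed=0):
        self.rng = random.Random(seed)
        self.choices = choices_per_step
        # Hypothetical trainable parameters: a bias and a recurrent weight
        # for every candidate value at every time step.
        self.bias = [[0.0] * len(c) for c in choices_per_step]
        self.recurrent = [
            [self.rng.uniform(-0.1, 0.1) for _ in c] for c in choices_per_step
        ]

    def sample(self):
        hidden = 0.0  # carries the previous step's output forward
        picked = []
        for step, options in enumerate(self.choices):
            logits = [
                b + w * hidden
                for b, w in zip(self.bias[step], self.recurrent[step])
            ]
            probs = softmax(logits)
            index = self.rng.choices(range(len(options)), weights=probs)[0]
            picked.append(options[index])
            # The chosen index feeds into the state used by the next step.
            hidden = math.tanh(hidden + (index + 1) / len(options))
        return tuple(picked)
```

Calling `ToyController(choices).sample()` with four candidate lists returns one value per hyper-parameter type, which together form one candidate text enhancement strategy.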
The text enhancement module 203 is configured to perform text enhancement on each text in the original text set in the text classification request by using the target text enhancement policy to obtain a first enhanced text set.
In this embodiment, the text classification request further includes an original text set, where the original text set includes a plurality of texts.
In an optional embodiment, before the text enhancement module 203 performs text enhancement on each text in the original text set in the text classification request by using the target text enhancement policy to obtain the first enhanced text set, the method further includes:
and cleaning each text in the original text set according to a preset text cleaning strategy.
In this embodiment, a text cleaning strategy may be preset, where the preset text cleaning strategy may be to clean texts whose display formats are inconsistent (for example, in time, date, numerical value, or full-width/half-width characters), whose content contains characters that should not be present, or whose content is inconsistent with what the field should contain.
In this embodiment, each text in the original text set is cleaned, so that factors interfering with subsequent text enhancement are reduced, and the efficiency and accuracy of text enhancement are improved.
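One possible cleaning step is sketched below. The description does not fix the cleaning rules, so the concrete choices here (NFKC normalization for full-width and half-width forms, removal of control and format characters, whitespace collapsing) and the function name `clean_text` are assumptions made for illustration.

```python
import re
import unicodedata


def clean_text(text: str) -> str:
    """Normalize width variants, drop stray control characters, tidy spaces."""
    # NFKC folds full-width letters, digits and spaces into half-width forms.
    text = unicodedata.normalize("NFKC", text)
    # Drop characters in the Unicode "Other" categories (control, format,
    # unassigned, ...), keeping newline and tab so they collapse to spaces.
    text = "".join(
        ch for ch in text
        if unicodedata.category(ch)[0] != "C" or ch in "\n\t"
    )
    # Collapse runs of whitespace into a single space.
    return re.sub(r"\s+", " ", text).strip()
```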
In an optional embodiment, the text enhancement module 203 performs text enhancement on each text in the original text set in the text classification request by using the target text enhancement policy, and obtaining a first enhanced text set includes:
identifying an output value corresponding to each hyper-parameter in the target text enhancement policy;
and performing text enhancement on each text in the original text set based on the output value corresponding to each hyper-parameter to obtain a first enhanced text set.
Illustratively, suppose the target text enhancement strategy contains the following hyper-parameters: the category label is class A; the output value corresponding to the operation type is random deletion; the output value corresponding to the proportion of words to which the operation is applied is 0.2; and the probability of applying the operation type is 1. For one text in the original text set, applying the target text enhancement strategy yields a first enhanced text such as "I am Chinese", "I Chinese", or "is Chinese".
Illustratively, suppose the target text enhancement strategy contains the following hyper-parameters: the category label is class A; the output value corresponding to the operation type is random deletion; the output value corresponding to the proportion of words to which the operation is applied is 0.4; and the probability of applying the operation type is 1. For one text in the original text set, applying the target text enhancement strategy yields a first enhanced text such as "I am in", "I am Chinese", or "is Chinese".
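The random deletion operation used in these examples can be sketched as follows. The function name `random_deletion`, the rounding rule for the number of deleted tokens, and the decision to always keep at least one token are assumptions; the description only gives the proportion and the probability.

```python
import random


def random_deletion(tokens, word_proportion, apply_probability, rng):
    """Delete a proportion of tokens, applied with a given probability."""
    # With probability (1 - apply_probability) the text is left unchanged.
    if rng.random() >= apply_probability:
        return list(tokens)
    # Assumption: round the proportion to a whole number of tokens and
    # never delete every token.
    n_delete = min(round(len(tokens) * word_proportion), len(tokens) - 1)
    if n_delete <= 0:
        return list(tokens)
    drop = set(rng.sample(range(len(tokens)), n_delete))
    # The surviving tokens keep their original order.
    return [tok for i, tok in enumerate(tokens) if i not in drop]
```

For a five-token text, a proportion of 0.2 removes one token and a proportion of 0.4 removes two, which is why the same text can yield several different enhanced variants.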
In this embodiment, data enhancement is widely used to make effective use of limited labeled corpora, improving the efficiency of the model and reducing dependence on the volume of labeled data. This embodiment performs text enhancement on each text in the original text set by using the randomly selected target text enhancement strategy, which ensures the diversity and integrity of the text set subsequently input into the preset neural network. In particular, for small-sample data sets and class-imbalanced data sets, the text enhancement strategy can increase the data volume of a small-sample data set and bring a class-imbalanced data set into balance, improving the effectiveness and robustness of the model subsequently trained on the enhanced data set. Moreover, because each text in the original text set is enhanced by the randomly selected target text enhancement strategy, no manual labeling is needed and no large investment of manpower and time is required, which improves the efficiency and accuracy of text enhancement.
The first input module 204 is configured to input the original text set and the first enhanced text set into a preset neural network for training, so as to obtain a first text classification model.
In this embodiment, a neural network may be preset, where the preset neural network may be an existing convolutional neural network or an inverse graph network. After the original text set and the first enhanced text set are obtained, a text classification model is trained based on the original text set and the first enhanced text set.
The verification module 205 is configured to input the verification set in the text classification request into the first text classification model for verification, and calculate a verification passing rate.
In this embodiment, the text classification request further includes a verification set. After the first text classification model is trained, the verification passing rate of the first text classification model is calculated based on the verification set, and whether the first text classification model is stable can be determined according to the verification passing rate.
The determining module 206 is configured to determine a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate.
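A simple way to compute such a rate is sketched below, assuming that the verification passing rate is the fraction of verification samples whose predicted label matches the reference label. The description does not give a formula, so this definition and the names `verification_pass_rate` and `predict` are assumptions.

```python
def verification_pass_rate(predict, verification_set):
    """Fraction of (text, label) pairs that the model classifies correctly."""
    if not verification_set:
        raise ValueError("verification set must not be empty")
    passed = sum(1 for text, label in verification_set if predict(text) == label)
    return passed / len(verification_set)
```

Here `predict` stands for any callable that maps a text to a predicted category label, for example a thin wrapper around the trained first text classification model.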
In this embodiment, the target text classification model refers to the text classification model whose verification passing rate meets the preset convergence condition in the text classification request, and the optimal text enhancement strategy refers to the text enhancement strategy that, when used to enhance the texts, yields a trained text classification model whose verification passing rate meets that condition. Specifically, the preset convergence condition is used to determine, according to the verification passing rate, whether the controller has converged; only when the controller has converged is the obtained text enhancement strategy determined to be the optimal text enhancement strategy. For example, the preset convergence condition may be that the verification passing rate is greater than or equal to a preset verification passing rate threshold, or that the verification passing rate achieved by the text enhancement strategy on the text classification model no longer increases.
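The two example convergence conditions can be expressed as a single check over the history of verification passing rates. This is a sketch under stated assumptions: the `patience` window that decides when the rate "no longer increases" and the `min_delta` tolerance are hypothetical parameters not found in the description.

```python
def has_converged(pass_rates, threshold, patience=3, min_delta=0.0):
    """Return True if the latest rate reaches the threshold or has plateaued."""
    if not pass_rates:
        return False
    # Condition 1: the latest passing rate reaches the preset threshold.
    if pass_rates[-1] >= threshold:
        return True
    # Condition 2 (assumption): no improvement over the last `patience` rounds.
    if len(pass_rates) <= patience:
        return False
    best_before = max(pass_rates[:-patience])
    return max(pass_rates[-patience:]) <= best_before + min_delta
```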
In an optional embodiment, the determining module 206 determines the target text classification model and the optimal text enhancement policy corresponding to the text classification request according to the verification passing rate includes:
when the verification passing rate meets a preset convergence condition in the text classification request, determining the first text classification model as a target text classification model and determining the target text enhancement strategy as an optimal text enhancement strategy; or
when the verification passing rate does not meet the preset convergence condition in the text classification request, updating the model parameters in the controller based on the verification passing rate to obtain an updated controller; randomly selecting, by the updated controller, a new text enhancement strategy from the search space as a new target text enhancement strategy; performing text enhancement on the original text set by using the new target text enhancement strategy to obtain a second enhanced text set; inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model; inputting the verification set in the text classification request into the second text classification model for verification, and calculating the verification passing rate; and repeating the steps of updating the model parameters in the controller according to the verification passing rate, reselecting a new text enhancement strategy for text enhancement, and obtaining a verification passing rate, until the verification passing rate meets the preset convergence condition corresponding to the controller, whereupon the text classification model corresponding to that verification passing rate is determined as the target text classification model and the new target text enhancement strategy corresponding to that verification passing rate is determined as the optimal text enhancement strategy.
In this embodiment, the second enhanced text set is obtained by updating the controller and then applying the new text enhancement strategy selected by the updated controller.
In the embodiment, the optimal text enhancement strategy is searched for each data set in a customized manner by constructing the search space and adopting the preset search strategy, so that the accuracy of text classification is improved.
The second input module 207 is configured to perform text enhancement on the text set to be classified in the text classification request by using the optimal text enhancement strategy to obtain a third enhanced text set, and input the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
In this embodiment, the optimal text enhancement strategy is determined by constructing a search space, searching a target text enhancement strategy by using a preset search strategy, and determining the optimal text enhancement strategy based on a verification pass rate of the target text enhancement strategy in the target text classification model, wherein the optimal text enhancement strategy is obtained by searching by using a search strategy instead of manual experience and a large number of comparison experiments after the hyper-parameters are manually set.
In the embodiment, an optimal text enhancement strategy and a target text classification model are determined by adopting a search strategy, the optimal text enhancement strategy is adopted to perform text enhancement on a text set to be classified, it is ensured that an obtained third enhanced text set is most stable, and meanwhile, the text set to be classified and the third enhanced text set are input into the target text classification model to perform text classification, so that the accuracy of text classification is improved.
In summary, in the text classification device based on artificial intelligence of this embodiment, on one hand, a search space is constructed by using all the text enhancement strategies corresponding to the text classification request, so that the integrity of the text enhancement strategies in the search space is ensured, and the accuracy of the optimal text enhancement strategy selected from the search space in the following process is improved; on the other hand, a controller in a preset search strategy is adopted to randomly select a text enhancement strategy from the search space, because the input parameter of the next time step of the controller is determined by the output value of the previous time step and the input parameter of the next time step, the previous output also influences the next text processing, the relevance of the text is improved, the reliability of the output value of each hyper-parameter is ensured, and the accuracy of the randomly selected text enhancement strategy is improved; finally, each text in the original text set is subjected to text enhancement by adopting a randomly selected target text enhancement strategy, manual marking is not needed, a large amount of manpower and time are not needed, and the efficiency and the accuracy of text enhancement are improved.
Example three
Fig. 3 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention. In the preferred embodiment of the present invention, the electronic device 3 comprises a memory 31, at least one processor 32, at least one communication bus 33 and a transceiver 34.
It will be appreciated by those skilled in the art that the configuration of the electronic device shown in fig. 3 does not constitute a limitation of the embodiment of the present invention, and may be a bus-type configuration or a star-type configuration, and the electronic device 3 may include more or less other hardware or software than those shown, or a different arrangement of components.
In some embodiments, the electronic device 3 is an electronic device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance, and the hardware thereof includes but is not limited to a microprocessor, an application specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The electronic device 3 may also include a client device, which includes, but is not limited to, any electronic product that can interact with a client through a keyboard, a mouse, a remote controller, a touch pad, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a digital camera, and the like.
It should be noted that the electronic device 3 is only an example, and other existing or future electronic products, such as those that can be adapted to the present invention, should also be included in the scope of the present invention, and are included herein by reference.
In some embodiments, the memory 31 is used for storing program code and various data, such as the artificial intelligence based text classification apparatus 20 installed in the electronic device 3, and enables high-speed, automatic access to programs or data during the operation of the electronic device 3. The memory 31 includes a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, a magnetic disk memory, a tape memory, or any other computer-readable medium capable of carrying or storing data.
In some embodiments, the at least one processor 32 may be composed of an integrated circuit, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips. The at least one processor 32 is a Control Unit (Control Unit) of the electronic device 3, connects various components of the electronic device 3 by using various interfaces and lines, and executes various functions and processes data of the electronic device 3 by running or executing programs or modules stored in the memory 31 and calling data stored in the memory 31.
In some embodiments, the at least one communication bus 33 is arranged to enable connection communication between the memory 31 and the at least one processor 32 or the like.
Although not shown, the electronic device 3 may further include a power supply (such as a battery) for supplying power to each component, and optionally, the power supply may be logically connected to the at least one processor 32 through a power management device, so as to implement functions of managing charging, discharging, and power consumption through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 3 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, an electronic device, or a network device) or a processor (processor) to execute parts of the methods according to the embodiments of the present invention.
In a further embodiment, in conjunction with fig. 2, the at least one processor 32 may execute operating means of the electronic device 3 and installed various types of applications (such as the artificial intelligence based text classification apparatus 20), program code, and the like, such as the various modules described above.
The memory 31 has program code stored therein, and the at least one processor 32 can call the program code stored in the memory 31 to perform related functions. For example, the various modules illustrated in fig. 2 are program code stored in the memory 31 and executed by the at least one processor 32 to implement the functions of the various modules for the purpose of artificial intelligence based text classification.
Illustratively, the program code may be partitioned into one or more modules/units that are stored in the memory 31 and executed by the processor 32 to accomplish the present application. The one or more modules/units may be a series of computer readable instruction segments capable of performing certain functions, which are used for describing the execution process of the program code in the electronic device 3. For example, the program code may be partitioned into a parsing module 201, a selecting module 202, a text enhancement module 203, a first input module 204, a verification module 205, a determination module 206, and a second input module 207.
In one embodiment of the present invention, the memory 31 stores a plurality of computer readable instructions that are executed by the at least one processor 32 to implement artificial intelligence based text classification functionality.
Specifically, for the implementation of the above instructions by the at least one processor 32, reference may be made to the description of the relevant steps in the embodiment corresponding to fig. 1, and details are not repeated here.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or that the singular does not exclude the plural. A plurality of units or means recited in the present invention may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method for artificial intelligence based text classification, the method comprising:
analyzing the received text classification request and constructing a search space, wherein the search space comprises a plurality of text enhancement strategies;
randomly selecting a text enhancement strategy from the search space by adopting a preset search strategy as a target text enhancement strategy, wherein the preset search strategy comprises a controller;
performing text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy to obtain a first enhanced text set;
inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;
inputting a verification set in the text classification request into the first text classification model for verification, and calculating a verification passing rate;
determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate;
and performing text enhancement on the text set to be classified in the text classification request by adopting the optimal text enhancement strategy to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
2. The artificial intelligence based text classification method of claim 1, wherein analyzing the received text classification request and constructing a search space comprises:
analyzing the received text classification request to obtain four types of hyper-parameters: a category label, an operation type, a probability value of applying the operation type, and a proportion of words in each text to which the operation is applied;
performing combined operation on the four types of hyper-parameters to obtain a plurality of text enhancement strategies, wherein each text enhancement strategy consists of the four types of hyper-parameters;
a search space is constructed based on the plurality of text enhancement strategies.
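The combination step of claim 2 can be illustrated with a minimal sketch, assuming the search space is the Cartesian product of the candidate values of the four hyper-parameter types. The `Policy` type, the `build_search_space` function, and all candidate values below are hypothetical and are not taken from the patent.

```python
from itertools import product
from typing import NamedTuple


class Policy(NamedTuple):
    """One text enhancement strategy made of the four hyper-parameter types."""
    label: str          # category label the strategy targets
    operation: str      # operation type
    probability: float  # probability of applying the operation
    word_ratio: float   # proportion of words in each text that the operation touches


def build_search_space(labels, operations, probabilities, word_ratios):
    """Combine every candidate value of the four types into one strategy each."""
    return [Policy(*combo)
            for combo in product(labels, operations, probabilities, word_ratios)]


# Illustrative candidate values only (assumptions, not from the patent).
space = build_search_space(
    labels=["positive", "negative"],
    operations=["synonym_replacement", "random_insertion",
                "random_swap", "random_deletion"],
    probabilities=[0.1, 0.5, 0.9],
    word_ratios=[0.1, 0.2],
)
print(len(space))  # 2 * 4 * 3 * 2 = 48 strategies
```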
3. The artificial intelligence based text classification method according to claim 2, characterized in that the operation types comprise one or a combination of several of the following ways: synonym replacement, random insertion, random exchange, random deletion.
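The four operation types named in claim 3 could look roughly like the following sketch. The claims do not specify how each operation is implemented, so the function signatures, the caller-supplied `synonyms` dictionary, and the choice to operate on whitespace-split word lists are all assumptions made for illustration.

```python
import random


def synonym_replacement(words, n, synonyms, rng):
    """Replace up to n words that have an entry in the synonyms dict."""
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in synonyms]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(synonyms[out[i]])
    return out


def random_insertion(words, n, synonyms, rng):
    """Insert n synonyms of words from the text at random positions."""
    out = list(words)
    pool = [s for w in words for s in synonyms.get(w, [])]
    for _ in range(n):
        if not pool:
            break
        out.insert(rng.randint(0, len(out)), rng.choice(pool))
    return out


def random_swap(words, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, n, rng):
    """Delete up to n random words, always keeping at least one."""
    out = list(words)
    for _ in range(min(n, len(out) - 1)):
        del out[rng.randrange(len(out))]
    return out


rng = random.Random(0)
words = "the movie was good".split()
synonyms = {"good": ["great", "fine"], "movie": ["film"]}  # toy dictionary
print(synonym_replacement(words, 1, synonyms, rng))
print(random_swap(words, 1, rng))
```

Passing an explicit `random.Random` instance keeps the operations reproducible, which helps when comparing strategies against each other.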
4. The artificial intelligence based text classification method according to claim 2, wherein the randomly selecting a text enhancement strategy from the search space by using a preset search strategy as a target text enhancement strategy comprises:
inputting the plurality of text enhancement strategies into a controller of the preset search strategy, wherein the controller randomly selects one hyper-parameter of any one type from the plurality of text enhancement strategies as an input parameter of the current time step, inputs the input parameter of the current time step into the controller, and outputs an output value of the current time step;
the controller randomly selects one hyper-parameter of any remaining type from the plurality of text enhancement strategies as an input parameter of the next time step, uses the input parameter of the next time step and the output value of the current time step as target input parameters of the next time step, inputs the target input parameters of the next time step into the controller, and outputs the output value of the next time step;
and circularly executing the selection of the four types of hyper-parameters and the determination of the input parameters until the output value corresponding to each type of hyper-parameter is obtained, and determining the four output values corresponding to the four types of hyper-parameters as a target text enhancement strategy.
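The step-by-step selection in claim 4, where each time step receives the previous step's output, can be sketched as follows. The claims do not disclose the controller's architecture, so this sketch substitutes a simple table of logits conditioned on the previous choice for whatever network the controller actually uses, and it fixes the order of the four hyper-parameter types instead of choosing it at random. `TableController` and the candidate values are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class TableController:
    """Stand-in controller: logits[t][prev][k] scores option k at time
    step t, given the index prev chosen at the previous time step."""

    def __init__(self, choices):
        self.choices = choices  # one list of candidate values per hyper-parameter type
        self.logits = []
        prev_size = 1  # a single dummy "start" state before the first step
        for options in choices:
            self.logits.append([[0.0] * len(options) for _ in range(prev_size)])
            prev_size = len(options)

    def sample(self, rng):
        """Pick one value per type, each step conditioned on the last output."""
        prev, indices = 0, []
        for t, options in enumerate(self.choices):
            probs = softmax(self.logits[t][prev])
            k = rng.choices(range(len(options)), weights=probs)[0]
            indices.append(k)
            prev = k  # the output of this step feeds the next step
        values = [self.choices[t][k] for t, k in enumerate(indices)]
        return indices, values


controller = TableController([
    ["positive", "negative"],              # category label
    ["random_swap", "random_deletion"],    # operation type
    [0.1, 0.5, 0.9],                       # probability of applying the operation
    [0.1, 0.2],                            # proportion of words affected
])
indices, target_strategy = controller.sample(random.Random(0))
print(target_strategy)
```

With all logits at zero the sampler is uniform, which corresponds to the purely random selection before any controller update has taken place.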
5. The artificial intelligence based text classification method of claim 1, wherein the performing text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy to obtain a first enhanced text set comprises:
identifying an output value corresponding to each hyper-parameter in the target text enhancement strategy;
and performing text enhancement on each text in the original text set based on the output value corresponding to each hyper-parameter to obtain a first enhanced text set.
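A minimal sketch of how a target strategy's four output values might drive enhancement of a text set is given below. It assumes, without confirmation from the claims, that the category label restricts the strategy to texts of that label, that the probability value decides per text whether the operation is applied, and that the word proportion is rounded to a word count of at least one. Only two operations are included to keep the sketch short; `augment_set` and the dictionary keys are hypothetical.

```python
import random


def random_swap(words, n, rng):
    """Swap two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, n, rng):
    """Delete up to n random words, always keeping at least one."""
    out = list(words)
    for _ in range(min(n, len(out) - 1)):
        del out[rng.randrange(len(out))]
    return out


OPERATIONS = {"random_swap": random_swap, "random_deletion": random_deletion}


def augment_set(samples, strategy, rng):
    """samples: list of (text, label) pairs.
    strategy: dict holding the four output values of the target strategy."""
    operation = OPERATIONS[strategy["operation"]]
    enhanced = []
    for text, label in samples:
        if label != strategy["label"]:
            continue  # assumption: the strategy only targets its own label
        if rng.random() >= strategy["probability"]:
            continue  # operation not applied to this text
        words = text.split()
        n = max(1, round(strategy["word_ratio"] * len(words)))
        enhanced.append((" ".join(operation(words, n, rng)), label))
    return enhanced


original = [("the plot was slow and dull", "negative"),
            ("a warm and funny film", "positive")]
strategy = {"label": "negative", "operation": "random_swap",
            "probability": 1.0, "word_ratio": 0.2}
first_enhanced = augment_set(original, strategy, random.Random(0))
training_set = original + first_enhanced  # both sets go to training
print(first_enhanced)
```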
6. The artificial intelligence based text classification method according to claim 5, wherein the determining the target text classification model and the optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate comprises:
and when the verification passing rate meets a preset convergence condition in the text classification request, determining the first text classification model as a target text classification model and determining the target text enhancement strategy as an optimal text enhancement strategy.
7. The artificial intelligence based text classification method according to claim 6, characterized in that the method further comprises:
when the verification passing rate does not meet the preset convergence condition in the text classification request, updating the model parameters in the controller based on the verification passing rate to obtain an updated controller;
randomly selecting a new text enhancement strategy from the search space by adopting the updated controller to serve as a new target text enhancement strategy, and performing text enhancement on the original text set by using the new target text enhancement strategy to obtain a second enhanced text set;
inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model, inputting a verification set in the text classification request into the second text classification model for verification, and calculating a verification passing rate;
and repeatedly executing the step of updating the model parameters in the controller according to the verification passing rate to reselect a new text enhancement strategy for text enhancement to obtain the verification passing rate until the verification passing rate meets the preset convergence condition corresponding to the controller, determining the text classification model corresponding to the verification passing rate as a target text classification model and determining the new target text enhancement strategy corresponding to the verification passing rate as an optimal text enhancement strategy.
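The loop of claims 6 and 7 (sample a strategy, train, validate, update the controller, repeat until convergence) can be sketched as below. The claims only state that the controller's parameters are updated based on the verification passing rate; the REINFORCE-style update with a moving-average baseline used here is an assumed stand-in, as are the flat softmax controller, the threshold form of the convergence condition, and the caller-supplied `train_and_validate` callback.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def search_strategy(strategies, train_and_validate, target_rate,
                    max_rounds=50, lr=0.5, seed=0):
    """Return (strategy, model, pass_rate) once the pass rate reaches
    target_rate, or None if max_rounds is exhausted first.

    train_and_validate(strategy) is a hypothetical callback that trains a
    classifier on the original plus enhanced texts and returns
    (model, verification_pass_rate)."""
    rng = random.Random(seed)
    logits = [0.0] * len(strategies)  # controller parameters (assumed form)
    baseline = None
    for _ in range(max_rounds):
        probs = softmax(logits)
        k = rng.choices(range(len(strategies)), weights=probs)[0]
        model, pass_rate = train_and_validate(strategies[k])
        if pass_rate >= target_rate:  # assumed convergence condition
            return strategies[k], model, pass_rate
        if baseline is None:
            baseline = pass_rate
        advantage = pass_rate - baseline
        # Policy-gradient step: raise the chosen strategy's logit when its
        # pass rate beats the baseline, lower it otherwise.
        for i in range(len(logits)):
            grad = (1.0 if i == k else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        baseline = 0.9 * baseline + 0.1 * pass_rate
    return None


# Toy evaluator with made-up pass rates, standing in for real training.
fake_rates = {"swap": 0.95, "delete": 0.95}


def fake_train_and_validate(strategy):
    return "model-" + strategy, fake_rates[strategy]


result = search_strategy(["swap", "delete"], fake_train_and_validate,
                         target_rate=0.9)
print(result)
```

In a real run each call to `train_and_validate` is a full training job, so the number of rounds dominates the cost of the search.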
8. An apparatus for artificial intelligence based text classification, the apparatus comprising:
the analysis module is used for analyzing the received text classification request and constructing a search space, wherein the search space comprises a plurality of text enhancement strategies;
the selection module is used for randomly selecting a text enhancement strategy from the search space by adopting a preset search strategy as a target text enhancement strategy, wherein the preset search strategy comprises a controller;
the text enhancement module is used for performing text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy to obtain a first enhanced text set;
the first input module is used for inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;
the verification module is used for inputting a verification set in the text classification request into the first text classification model for verification and calculating a verification passing rate;
the determining module is used for determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate;
and the second input module is used for performing text enhancement on the text set to be classified in the text classification request by adopting the optimal text enhancement strategy to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory, the processor being configured to implement the artificial intelligence based text classification method according to any one of claims 1 to 7 when executing a computer program stored in the memory.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the artificial intelligence based text classification method according to any one of claims 1 to 7.
CN202111093400.8A 2021-09-17 2021-09-17 Text classification method and device based on artificial intelligence, electronic equipment and medium Pending CN113792146A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111093400.8A CN113792146A (en) 2021-09-17 2021-09-17 Text classification method and device based on artificial intelligence, electronic equipment and medium
PCT/CN2022/071316 WO2023040145A1 (en) 2021-09-17 2022-01-11 Artificial intelligence-based text classification method and apparatus, electronic device, and medium

Publications (1)

Publication Number Publication Date
CN113792146A true CN113792146A (en) 2021-12-14

Family

ID=78878790

Country Status (2)

Country Link
CN (1) CN113792146A (en)
WO (1) WO2023040145A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023040145A1 (en) * 2021-09-17 2023-03-23 平安科技(深圳)有限公司 Artificial intelligence-based text classification method and apparatus, electronic device, and medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113282747B (en) * 2021-04-28 2023-07-18 南京大学 Text classification method based on automatic machine learning algorithm selection

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110807109A (en) * 2019-11-08 2020-02-18 北京金山云网络技术有限公司 Data enhancement strategy generation method, data enhancement method and device
CN111582375A (en) * 2020-05-09 2020-08-25 北京百度网讯科技有限公司 Data enhancement strategy searching method, device, equipment and storage medium
CN113220883A (en) * 2021-05-17 2021-08-06 华南师范大学 Text classification model performance optimization method and device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220215209A1 (en) * 2019-04-25 2022-07-07 Google Llc Training machine learning models using unsupervised data augmentation
CN112487182B (en) * 2019-09-12 2024-04-12 华为技术有限公司 Training method of text processing model, text processing method and device
CN112101042A (en) * 2020-09-14 2020-12-18 平安科技(深圳)有限公司 Text emotion recognition method and device, terminal device and storage medium
CN113064973A (en) * 2021-04-12 2021-07-02 平安国际智慧城市科技股份有限公司 Text classification method, device, equipment and storage medium
CN113254599B (en) * 2021-06-28 2021-10-08 浙江大学 Multi-label microblog text classification method based on semi-supervised learning
CN113792146A (en) * 2021-09-17 2021-12-14 平安科技(深圳)有限公司 Text classification method and device based on artificial intelligence, electronic equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHUHUAI REN et al.: "Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification", vol. 2109, pages 1 - 15 *

Also Published As

Publication number Publication date
WO2023040145A1 (en) 2023-03-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination