WO2023040145A1

WO2023040145A1 - Artificial intelligence-based text classification method and apparatus, electronic device, and medium

Info

Publication number: WO2023040145A1
Application number: PCT/CN2022/071316
Authority: WO
Inventors: 孙金辉; 马骏; 王少军
Original assignee: 平安科技（深圳）有限公司
Priority date: 2021-09-17
Filing date: 2022-01-11
Publication date: 2023-03-23
Also published as: CN113792146A

Abstract

An artificial intelligence-based text classification method and apparatus, an electronic device, and a medium. The method comprises: constructing a search space; selecting a target text enhancement strategy by using a preset search strategy; performing text enhancement on an original text set to obtain a first enhanced text set; calculating a passing rate according to the original text set and the first enhanced text set; determining a target text classification model and an optimal text enhancement strategy; and performing, by using the optimal text enhancement strategy, text enhancement on a text set to be classified to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result. According to the method, an optimal text enhancement strategy is found for each data set in a customized manner by constructing a search space and using a preset search strategy, thereby improving the accuracy of text classification. The method further relates to blockchain technology, and the target text enhancement strategy is stored in a blockchain node.

Description

Text classification method, device, electronic equipment and medium based on artificial intelligence

This application claims priority to the Chinese patent application No. 202111093400.8, filed with the China Patent Office on September 17, 2021 and entitled "artificial intelligence-based text classification method, device, electronic equipment and medium", the entire content of which is incorporated in this application by reference.

Technical field

The present application relates to the technical field of artificial intelligence, and specifically relates to an artificial intelligence-based text classification method, device, electronic equipment and media.

Background technique

Text classification is one of the most important tasks in natural language processing. At present, deep learning models such as CNN and RNN are widely used for text classification; these models perform text enhancement after a large amount of text has been labeled.

However, the inventors found that labeling text in the prior art consumes a lot of manpower and time. In addition, some hyperparameters must be set manually when performing text enhancement, and these hyperparameters are obtained through manual experience and a large number of comparative experiments. As a result, the optimal text enhancement strategy cannot be found quickly and accurately during text enhancement, which leads to low accuracy and efficiency of the text classification results.

Therefore, it is necessary to propose a method that can accurately classify text.

Contents of the invention

The present application proposes an artificial intelligence-based text classification method, device, electronic equipment and medium.

The first aspect of the present application provides a text classification method based on artificial intelligence, the method comprising:

Analyzing the received text classification request to construct a search space, wherein the search space contains multiple text enhancement strategies;

Using a preset search strategy to randomly select a text enhancement strategy from the search space as a target text enhancement strategy, wherein the preset search strategy includes a controller;

Using the target text enhancement strategy to perform text enhancement on each text in the original text set in the text classification request to obtain a first enhanced text set;

Inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;

Input the verification set in the text classification request into the first text classification model for verification, and calculate the pass rate of verification;

determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate;

Using the optimal text enhancement strategy to perform text enhancement on the text set to be classified in the text classification request to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.

A second aspect of the present application provides an electronic device, the electronic device includes a memory and a processor, the memory is used to store at least one computer-readable instruction, and the processor is used to execute the at least one computer-readable instruction to Implement the following steps:

A third aspect of the present application provides a computer-readable storage medium, the computer-readable storage medium stores at least one computer-readable instruction, and when the at least one computer-readable instruction is executed by a processor, the following steps are implemented:

A fourth aspect of the present application provides an artificial intelligence-based text classification device, wherein the device includes:

A parsing module, configured to parse the received text classification request and construct a search space, wherein the search space includes multiple text enhancement strategies;

A selecting module, configured to randomly select a text enhancement strategy from the search space by using a preset search strategy as a target text enhancement strategy, wherein the preset search strategy includes a controller;

A text enhancement module, configured to use the target text enhancement strategy to perform text enhancement on each text in the original text set in the text classification request to obtain a first enhanced text set;

A first input module, configured to input the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;

A verification module, configured to input the verification set in the text classification request into the first text classification model for verification, and calculate the verification pass rate;

A determining module, configured to determine a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate;

A second input module, configured to use the optimal text enhancement strategy to perform text enhancement on the text set to be classified in the text classification request to obtain a third enhanced text set, and to input the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.

The artificial intelligence-based text classification method, device, electronic equipment and storage medium described in this application improve the accuracy of text classification.

Description of drawings

FIG. 1 is a flowchart of an artificial intelligence-based text classification method provided in Embodiment 1 of the present application.

FIG. 2 is a structural diagram of an artificial intelligence-based text classification device provided in Embodiment 2 of the present application.

FIG. 3 is a schematic structural diagram of an electronic device provided in Embodiment 3 of the present application.

Detailed description

In order to more clearly understand the above objects, features and advantages of the present application, the present application will be described in detail below in conjunction with the accompanying drawings and specific embodiments. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments can be combined with each other.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field to which this application belongs. The terms used herein in the specification of the application are only for the purpose of describing specific embodiments, and are not intended to limit the application.

Embodiment one

In this embodiment, the artificial intelligence-based text classification method can be applied to electronic devices. For an electronic device that needs to perform artificial intelligence-based text classification, the text classification function provided by the method of this application can be integrated directly on the electronic device, or can run on the electronic device in the form of a software development kit (Software Development Kit, SDK).

The embodiments of the present application may acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is a theory, method, technology and application system that uses digital computers or machines controlled by digital computers to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.

Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly includes several major directions such as computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, machine learning, and deep learning.

As shown in FIG. 1 , the artificial intelligence-based text classification method specifically includes the following steps. According to different requirements, the order of the steps in the flow chart can be changed, and some of them can be omitted.

S11. Parse the received text classification request to construct a search space, wherein the search space includes multiple text enhancement strategies.

In this embodiment, when a user performs text classification, the client initiates a text classification request to the server. Specifically, the client may be a smartphone, an iPad or another existing smart device, and the server may be a text classification subsystem. During text classification, the client sends a text classification request to the text classification subsystem, which receives the request, parses it, and constructs a search space based on the parsed result.

In an optional embodiment, said parsing the received text classification request and constructing a search space includes:

Parse the received text classification request to obtain four types of hyperparameters: category label, operation type, probability value of application type, and the proportion of words in each text to which operation is applied;

performing combined operations on the four types of hyperparameters to obtain multiple text enhancement strategies, wherein each of the text enhancement strategies is composed of the four types of hyperparameters;

A search space is constructed based on the plurality of text enhancement strategies.

It should be emphasized that, in order to further ensure the privacy and security of the above text enhancement strategy, the above text enhancement strategy can also be stored in a block chain node.

In this embodiment, if the text classification request contains 5 category labels, 4 operation types, 11 probability values of the application type, and 11 proportions of words to which the operation is applied, then the constructed search space contains 5×4×11×11=2420 text enhancement strategies.

Specifically, each category label identifies texts of the same type.

Specifically, the operation type includes one or a combination of the following methods: synonym replacement, random insertion, random exchange, and random deletion.

In this embodiment, the probability of the application type refers to the probability that text enhancement is applied. It can be discretized into 11 values in the range 0 to 1 with a step of 0.1. The proportion of words to which the operation is applied refers to the proportion of words selected from each text. It can be discretized into 11 values in the range 0 to 0.5 with a step of 0.05.

In this embodiment, the search space is constructed from all the text enhancement strategies corresponding to the text classification request. This ensures the integrity of the text enhancement strategies in the search space and improves the accuracy of the optimal text enhancement strategy subsequently selected from the search space.
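The construction of the search space described above can be sketched as a Cartesian product of the four hyperparameter types. This is a minimal illustration only: the label names, the operation identifiers and the function name `build_search_space` are assumptions made for the example; the source fixes only the counts (5, 4, 11, 11) and the discretization steps.

```python
from itertools import product

# Hypothetical values: the source only fixes how many of each there are.
CATEGORY_LABELS = ["A", "B", "C", "D", "E"]
OPERATION_TYPES = [
    "synonym_replacement",
    "random_insertion",
    "random_exchange",
    "random_deletion",
]
# Probability of applying the operation: 0.0, 0.1, ..., 1.0 (11 values).
APPLY_PROBS = [round(i * 0.1, 1) for i in range(11)]
# Proportion of words the operation touches: 0.00, 0.05, ..., 0.50 (11 values).
WORD_RATIOS = [round(i * 0.05, 2) for i in range(11)]


def build_search_space(labels, operations, probs, ratios):
    """Return every combination of the four hyperparameter types."""
    return [
        {"label": label, "operation": op, "prob": prob, "ratio": ratio}
        for label, op, prob, ratio in product(labels, operations, probs, ratios)
    ]


search_space = build_search_space(
    CATEGORY_LABELS, OPERATION_TYPES, APPLY_PROBS, WORD_RATIOS
)
print(len(search_space))  # 5 * 4 * 11 * 11 = 2420
```

Each element of `search_space` is one text enhancement strategy, that is, one value for each of the four hyperparameter types.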

S12. Using a preset search strategy to randomly select a text enhancement strategy from the search space as a target text enhancement strategy, wherein the preset search strategy includes a controller.

In this embodiment, the preset search strategy may be an ENAS (Efficient Neural Architecture Search) search strategy, which explores neural network model structures efficiently by sharing model parameters. Specifically, the preset search strategy uses a controller. The controller is an RNN model that determines the calculation type of each node and selects the active edges.

In an optional embodiment, said using a preset search strategy to randomly select a text enhancement strategy from the search space, as the target text enhancement strategy includes:

Inputting the plurality of text enhancement strategies into the controller of the preset search strategy, wherein the controller randomly selects, from the plurality of text enhancement strategies, a hyperparameter of any one type as the input parameter of the current time step of the controller, inputs the input parameter of the current time step into the controller, and outputs the output value of the current time step;

The controller randomly selects, from the plurality of text enhancement strategies, a hyperparameter of any one of the remaining types as the first input parameter of the next time step, uses the first input parameter of the next time step and the output value of the current time step together as the target input parameter of the next time step, inputs the target input parameter of the next time step into the controller, and outputs the output value of the next time step;

The selection of the four types of hyperparameters and the determination of the input parameters are performed cyclically until the output parameters corresponding to each of the hyperparameters are obtained, and the four output values corresponding to the four types of hyperparameters are determined as the target text enhancement strategy.

In this embodiment, the controller receives an input at every time step. There are four types of hyperparameters, and each time step corresponds to one of them. The parameters of each time step are input into the controller, and the output value corresponding to each hyperparameter is obtained through the Softmax layer of the controller. Because the input of the next time step is determined jointly by the output value of the previous time step and the input parameter of the next time step, the previous output also influences the processing of the next text. This improves the correlation of the text, ensures the reliability of the output value obtained for each hyperparameter, and improves the accuracy of the randomly selected text enhancement strategy.
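The time-step behavior of the controller can be illustrated with a toy recurrent sampler written in plain Python. This is a sketch under stated assumptions, not the controller of the application: the hidden size, the random weight initialization, the per-step embedding tables and the class name `Controller` are all hypothetical. It shows only the idea that each step's choice goes through a softmax and is fed back as part of the next step's input.

```python
import math
import random

# Number of candidate values per hyperparameter type, from the example in the text.
CHOICES_PER_STEP = [5, 4, 11, 11]
HIDDEN = 8  # hypothetical hidden size


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


class Controller:
    """Toy recurrent controller with one time step per hyperparameter type."""

    def __init__(self, choices_per_step, hidden, seed=0):
        self.rng = random.Random(seed)
        self.hidden = hidden
        self.w_h = rand_matrix(hidden, hidden, self.rng)
        # One embedding table and one softmax head per time step.
        self.embed = [rand_matrix(n, hidden, self.rng) for n in choices_per_step]
        self.heads = [rand_matrix(n, hidden, self.rng) for n in choices_per_step]

    def sample(self):
        """Sample one index per hyperparameter type with its probability."""
        h = [0.0] * self.hidden
        x = [0.0] * self.hidden  # there is no previous output at the first step
        indices, probs = [], []
        for step, head in enumerate(self.heads):
            # The new state mixes the previous state with the current input.
            pre = [a + b for a, b in zip(matvec(self.w_h, h), x)]
            h = [math.tanh(v) for v in pre]
            p = softmax(matvec(head, h))
            idx = self.rng.choices(range(len(p)), weights=p)[0]
            indices.append(idx)
            probs.append(p[idx])
            # The embedding of this step's choice becomes the next input.
            x = self.embed[step][idx]
        return indices, probs


controller = Controller(CHOICES_PER_STEP, HIDDEN)
indices, probs = controller.sample()
print(indices)  # four indices, one per hyperparameter type
```

The four sampled indices select one value from each hyperparameter type and together identify one strategy in the search space.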

S13. Perform text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy to obtain a first enhanced text set.

In this embodiment, the text classification request further includes an original text set, and the original text set includes multiple texts.

In an optional embodiment, before performing text enhancement on each text in the original text set in the text classification request using the target text enhancement strategy to obtain the first enhanced text set, the method further includes:

Each text in the original text set is cleaned according to a preset text cleaning strategy.

In this embodiment, a text cleaning strategy can be preset. The preset text cleaning strategy may include cleaning texts whose display formats are inconsistent (for example time, date, numeric values, and full-width or half-width characters), texts that contain characters that should not be present, and texts whose content does not match what the field should contain.

In this embodiment, by cleaning each text in the original text set, factors that interfere with subsequent text enhancement are reduced, and the efficiency and accuracy of text enhancement are improved.
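A cleaning pass of this kind might look like the following sketch. The concrete rules are assumptions chosen for illustration (the source does not specify them): full-width characters are folded to half-width with Unicode NFKC normalization, control and format characters are dropped, and runs of whitespace are collapsed. The function names `clean_text` and `clean_text_set` are hypothetical.

```python
import re
import unicodedata

# Hypothetical rule: collapse any run of whitespace into a single space.
_WHITESPACE = re.compile(r"\s+")


def clean_text(text):
    """Normalize width variants and drop characters that should not appear."""
    # NFKC folds full-width letters, digits and punctuation to half-width forms.
    text = unicodedata.normalize("NFKC", text)
    # Remove control and format characters (categories Cc and Cf) but keep whitespace.
    text = "".join(
        ch
        for ch in text
        if ch.isspace() or unicodedata.category(ch) not in ("Cc", "Cf")
    )
    return _WHITESPACE.sub(" ", text).strip()


def clean_text_set(texts):
    """Clean every text and discard the ones that end up empty."""
    cleaned = (clean_text(t) for t in texts)
    return [t for t in cleaned if t]


print(clean_text_set(["ＡＢＣ　１２３", "a\x00b", "   "]))
```

Normalizing date and time formats would need additional, domain-specific rules that are not shown here.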

In an optional embodiment, performing text enhancement on each text in the original text set in the text classification request by using the target text enhancement strategy, and obtaining the first enhanced text set includes:

identifying an output value corresponding to each hyperparameter in the target text enhancement strategy;

Performing text enhancement on each text in the original text set based on the output value corresponding to each of the hyperparameters to obtain the first enhanced text set.

Exemplarily, suppose the hyperparameters contained in the target text enhancement strategy are: category label: type A; output value corresponding to the operation type: random deletion; output value corresponding to the proportion of words to which the operation is applied: 0.2; probability of the application type: 1. For a text in the original text set, "I am Chinese", the first enhanced text obtained by using the target text enhancement strategy is: "I am China" or "I am Chinese" or "I am Chinese".

Exemplarily, suppose the hyperparameters contained in the target text enhancement strategy are: category label: type A; output value corresponding to the operation type: random deletion; output value corresponding to the proportion of words to which the operation is applied: 0.4; probability of the application type: 1. For a text in the original text set, "I am Chinese", the first enhanced text obtained by using the target text enhancement strategy is: "I am Chinese" or "I am Chinese" or "Chinese" or "Is China" or "is a Chinese" or "Chinese".
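The way the three numeric and categorical hyperparameters drive an enhancement can be sketched as follows. This is an illustrative simplification, not the implementation of the application: texts are assumed to be pre-split into tokens, the synonym table `SYNONYMS` is a hypothetical toy dictionary, random insertion is simplified to inserting a copy of an existing token, and the number of affected tokens is assumed to be the rounded product of the proportion and the text length.

```python
import random

# Hypothetical toy synonym table; a real system would use a thesaurus.
SYNONYMS = {"quick": ["fast"], "happy": ["glad"]}


def _positions(n_tokens, ratio, rng):
    """Pick round(ratio * n_tokens) distinct token positions."""
    k = min(n_tokens, round(ratio * n_tokens))
    return rng.sample(range(n_tokens), k)


def enhance(tokens, operation, prob, ratio, rng):
    """Apply one enhancement operation to a token list with probability `prob`."""
    tokens = list(tokens)
    if not tokens or rng.random() >= prob:
        return tokens  # the operation is not applied to this text
    chosen = _positions(len(tokens), ratio, rng)
    if operation == "random_deletion":
        drop = set(chosen)
        return [t for i, t in enumerate(tokens) if i not in drop]
    if operation == "random_exchange":
        # Swap each chosen token with a token at a random position.
        for i in chosen:
            j = rng.randrange(len(tokens))
            tokens[i], tokens[j] = tokens[j], tokens[i]
        return tokens
    if operation == "random_insertion":
        # Simplification: insert a copy of an existing token at a random position.
        for i in chosen:
            tokens.insert(rng.randrange(len(tokens) + 1), tokens[i])
        return tokens
    if operation == "synonym_replacement":
        for i in chosen:
            options = SYNONYMS.get(tokens[i])
            if options:
                tokens[i] = rng.choice(options)
        return tokens
    raise ValueError(f"unknown operation: {operation}")


rng = random.Random(0)
tokens = ["I", "am", "a", "Chinese", "person"]
print(enhance(tokens, "random_deletion", 1.0, 0.2, rng))  # one of five tokens removed
print(enhance(tokens, "random_deletion", 1.0, 0.4, rng))  # two of five tokens removed
```

With five tokens, a proportion of 0.2 removes one token and a proportion of 0.4 removes two, which mirrors the two random-deletion examples above.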

In this embodiment, data enhancement is widely used to make effective use of a limited annotated corpus, which improves the efficiency of the model and reduces its dependence on the amount of annotated data. Performing text enhancement on each text in the original text set with the randomly selected target text enhancement strategy ensures the diversity and integrity of the text set that is input into the preset neural network. This is especially useful for small-sample data sets and data sets with unbalanced categories: the text enhancement strategy can increase the amount of data in a small-sample data set and can balance an unbalanced data set, which improves the effectiveness and robustness of the model subsequently trained on the enhanced data set. At the same time, because each text in the original text set is enhanced with the randomly selected target text enhancement strategy, no manual labeling is required and a large amount of manpower and time is saved, which improves the efficiency and accuracy of text enhancement.

S14. Input the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model.

In this embodiment, a neural network can be preset, and the preset neural network can be an existing convolutional neural network or an inverse graph network. After the original text set and the first enhanced text set are obtained, a text classification model is trained based on the original text set and the first enhanced text set.

S15. Input the verification set in the text classification request into the first text classification model for verification, and calculate the pass rate of verification.

In this embodiment, the text classification request also includes a verification set. After the first text classification model is trained, the verification pass rate of the first text classification model is calculated based on the verification set, and whether the first text classification model is stable can be determined according to the verification pass rate.
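The source does not define the verification pass rate formally. A minimal sketch, assuming it is the fraction of verification texts for which the model's predicted label matches the reference label, could be written as follows. The function name, the `predict` callback and the sample data are hypothetical.

```python
def verification_pass_rate(predict, verification_set):
    """Fraction of verification texts whose predicted label matches the reference.

    Assumption: `predict` maps a text to a label, and `verification_set`
    is a list of (text, label) pairs.
    """
    if not verification_set:
        raise ValueError("verification set is empty")
    passed = sum(1 for text, label in verification_set if predict(text) == label)
    return passed / len(verification_set)


# Usage with a stand-in classifier that labels texts by length.
def toy_predict(text):
    return "long" if len(text) > 5 else "short"


verification_set = [
    ("hi", "short"),
    ("hello world", "long"),
    ("hey", "long"),
    ("good morning", "long"),
]
print(verification_pass_rate(toy_predict, verification_set))  # 3 of 4 pass: 0.75
```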

S16. Determine a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate.

In this embodiment, the target text classification model is the text classification model whose verification pass rate satisfies the preset convergence condition in the text classification request, and the optimal text enhancement strategy is the text enhancement strategy that was used to train that model. Specifically, the preset convergence condition is used to determine, according to the verification pass rate, whether the controller has converged. Only when the controller has converged is the obtained text enhancement strategy determined to be the optimal text enhancement strategy. For example, the preset convergence condition can be that the verification pass rate is greater than or equal to a preset verification pass rate threshold, or that the verification pass rate of the text enhancement strategy on the text classification model no longer improves.

In an optional embodiment, the determining the target text classification model and the optimal text enhancement strategy corresponding to the text classification request according to the verification passing rate includes:

When the verification pass rate satisfies the preset convergence condition in the text classification request, determining the first text classification model as a target text classification model and determining the target text enhancement strategy as an optimal text enhancement strategy; or

When the verification pass rate does not meet the preset convergence condition in the text classification request, updating the model parameters in the controller based on the verification pass rate to obtain an updated controller; using the updated controller to randomly select a new text enhancement strategy from the search space as a new target text enhancement strategy; using the new target text enhancement strategy to perform text enhancement on the original text set to obtain a second enhanced text set; inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model; inputting the verification set in the text classification request into the second text classification model for verification and calculating the verification pass rate; and repeating the steps of updating the model parameters in the controller according to the verification pass rate, selecting a new text enhancement strategy for text enhancement, and obtaining the verification pass rate, until the verification pass rate satisfies the preset convergence condition corresponding to the controller, at which point the text classification model corresponding to that verification pass rate is determined as the target text classification model and the new target text enhancement strategy corresponding to that verification pass rate is determined as the optimal text enhancement strategy.
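The two branches above amount to a search loop. The following sketch shows one possible shape of that loop under explicit assumptions: the controller is replaced by a table of sampling weights (one per strategy), the update rule is a simple reward-minus-baseline reweighting chosen for illustration (the source does not specify the update), the convergence condition is a pass rate threshold, and `train_and_validate` is a hypothetical callback that enhances the original text set, trains a classifier and returns the model with its verification pass rate.

```python
import math
import random


def search_strategy(search_space, train_and_validate, threshold,
                    max_rounds=50, lr=1.0, seed=0):
    """Sample strategies until the pass rate reaches `threshold` or rounds run out.

    Returns (model, strategy, pass_rate) for the best round seen.
    """
    rng = random.Random(seed)
    # Stand-in for the controller parameters: one sampling weight per strategy.
    weights = [1.0] * len(search_space)
    baseline = None  # moving average of past pass rates
    best = (None, None, -1.0)
    for _ in range(max_rounds):
        idx = rng.choices(range(len(search_space)), weights=weights)[0]
        strategy = search_space[idx]
        model, pass_rate = train_and_validate(strategy)
        if pass_rate > best[2]:
            best = (model, strategy, pass_rate)
        if pass_rate >= threshold:
            break  # preset convergence condition met
        # Assumed update: raise the weight of above-average strategies,
        # lower the weight of below-average ones.
        advantage = 0.0 if baseline is None else pass_rate - baseline
        weights[idx] *= math.exp(lr * advantage)
        baseline = pass_rate if baseline is None else 0.9 * baseline + 0.1 * pass_rate
    return best


# Usage with a stand-in evaluator: only one strategy reaches the threshold.
toy_space = [{"ratio": 0.1}, {"ratio": 0.2}, {"ratio": 0.3}]


def toy_train_and_validate(strategy):
    return "model", (0.9 if strategy["ratio"] == 0.2 else 0.5)


print(search_strategy(toy_space, toy_train_and_validate, threshold=0.9))
```

If the threshold is never reached within `max_rounds`, this sketch returns the best strategy seen so far; that fallback is a design choice of the example and is not described in the source.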

In this embodiment, the second enhanced text set is obtained by updating the text enhancement strategy in the controller and adopting a new text enhancement strategy.

In this embodiment, by constructing a search space and adopting a preset search strategy, an optimal text enhancement strategy is searched out for each data set, thereby improving the accuracy of text classification.

S17. Perform text enhancement on the text set to be classified in the text classification request by using the optimal text enhancement strategy to obtain a third enhanced text set, and input the third enhanced text set and the text set to be classified into In the target text classification model, a text classification result is obtained.

In this embodiment, the optimal text enhancement strategy is obtained by constructing a search space, using a preset search strategy to search for a target text enhancement strategy, and determining the optimal text enhancement strategy based on the verification pass rate that the target text enhancement strategy achieves on the target text classification model. The optimal text enhancement strategy is not obtained through manual experience and a large number of comparative experiments after manually setting hyperparameters, but through a search performed with the search strategy, which improves the accuracy and efficiency of determining the optimal text enhancement strategy.

In this embodiment, the optimal text enhancement strategy and the target text classification model are determined by using the search strategy. Performing text enhancement on the text set to be classified with the optimal text enhancement strategy ensures that the obtained third enhanced text set is the most stable, and inputting both the text set to be classified and the third enhanced text set into the target text classification model for text classification improves the accuracy of text classification.
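The source states that the text set to be classified and the third enhanced text set are both input into the target text classification model, but does not say how the outputs are combined. One plausible reading, shown here purely as an assumption, is a vote over each text and its enhanced variants, with ties resolved in favor of the label of the unmodified text. The function name, the `enhance_fn` and `predict` callbacks and the parameter `n_variants` are hypothetical.

```python
from collections import Counter


def classify_with_enhancement(texts, enhance_fn, predict, n_variants=3):
    """Classify each text together with enhanced variants and vote on the label.

    Assumption: the final label is the most frequent prediction over the
    original text and its enhanced variants.
    """
    results = []
    for text in texts:
        candidates = [text] + [enhance_fn(text) for _ in range(n_variants)]
        votes = Counter(predict(c) for c in candidates)
        top = max(votes.values())
        original_label = predict(text)
        # Assumed tie-break: prefer the label predicted for the unmodified text.
        if votes[original_label] == top:
            results.append(original_label)
        else:
            results.append(votes.most_common(1)[0][0])
    return results


# Usage with stand-ins: the "enhancement" drops the last character,
# and the "model" labels texts by length.
def toy_enhance(text):
    return text[:-1]


def toy_model(text):
    return "long" if len(text) > 3 else "short"


print(classify_with_enhancement(["abcdef", "abcd", "ab"], toy_enhance, toy_model))
```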

To sum up, in the artificial intelligence-based text classification method described in this embodiment, on the one hand, the search space is constructed from all the text enhancement strategies corresponding to the text classification request, which ensures the integrity of the text enhancement strategies in the search space and improves the accuracy of the optimal text enhancement strategy subsequently selected from the search space. On the other hand, the controller in the preset search strategy randomly selects a text enhancement strategy from the search space. Because the input of the next time step of the controller is determined jointly by the output value of the previous time step and the input parameter of the next time step, the previous output also influences the processing of the next text, which improves the correlation of the text, ensures the reliability of the output value obtained for each hyperparameter, and improves the accuracy of the randomly selected text enhancement strategy. Finally, each text in the original text set is enhanced with the randomly selected target text enhancement strategy, so no manual labeling is required and a large amount of manpower and time is saved, which improves the efficiency and accuracy of text enhancement.

Embodiment two

In some embodiments, the artificial intelligence-based text classification device 20 may include a plurality of functional modules composed of program code segments. The program code of each program segment in the artificial intelligence-based text classification device 20 can be stored in the memory of the electronic device and executed by the at least one processor to perform the artificial intelligence-based text classification function (see the description of FIG. 1 for details).

In this embodiment, the artificial intelligence-based text classification device 20 can be divided into multiple functional modules according to the functions it performs. The functional modules may include: a parsing module 201, a selection module 202, a text enhancement module 203, a first input module 204, a verification module 205, a determination module 206 and a second input module 207. A module referred to in this application is a series of computer-readable instruction segments that are stored in a memory, can be executed by at least one processor, and can complete a fixed function. In this embodiment, the functions of each module will be described in detail in subsequent embodiments.

The parsing module 201 is configured to parse the received text classification request and construct a search space, wherein the search space contains multiple text enhancement strategies.

The selection module 202 is configured to use a preset search strategy to randomly select a text enhancement strategy from the search space as a target text enhancement strategy, wherein the preset search strategy includes a controller.

The text enhancement module 203 is configured to use the target text enhancement policy to perform text enhancement on each text in the original text set in the text classification request to obtain a first enhanced text set.

The first input module 204 is configured to input the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model.

The verification module 205 is configured to input the verification set in the text classification request into the first text classification model for verification, and calculate a verification pass rate.

The determination module 206 is configured to determine a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate.

The second input module 207 is configured to use the optimal text enhancement strategy to perform text enhancement on the text set to be classified in the text classification request to obtain a third enhanced text set, and to input the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.

In the artificial intelligence-based text classification device described in this embodiment, on the one hand, the search space is constructed from all the text enhancement strategies corresponding to the text classification request, which ensures the integrity of the text enhancement strategies in the search space and improves the accuracy of the optimal text enhancement strategy subsequently selected from the search space. On the other hand, the controller in the preset search strategy randomly selects a text enhancement strategy from the search space. Because the input of the next time step of the controller is determined jointly by the output value of the previous time step and the input parameter of the next time step, the previous output also influences the processing of the next text, which improves the correlation of the text, ensures the reliability of the output value obtained for each hyperparameter, and improves the accuracy of the randomly selected text enhancement strategy. Finally, each text in the original text set is enhanced with the randomly selected target text enhancement strategy, so no manual labeling is required and a large amount of manpower and time is saved, which improves the efficiency and accuracy of text enhancement.

Embodiment Three

Referring to FIG. 3 , it is a schematic structural diagram of an electronic device provided by Embodiment 3 of the present application. In a preferred embodiment of the present application, the electronic device 3 includes a memory 31 , at least one processor 32 , at least one communication bus 33 and a transceiver 34 .

Those skilled in the art should understand that the structure of the electronic device shown in FIG. 3 does not limit the embodiments of the present application. The structure may be a bus structure or a star structure, and the electronic device 3 may include more or fewer hardware or software components than shown, or a different arrangement of components.

In some embodiments, the electronic device 3 is an electronic device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices. The electronic device 3 may also include a client device, which includes but is not limited to any electronic product that can interact with the client through a keyboard, mouse, remote control, touch pad, or voice-activated device, for example, personal computers, tablets, smartphones, and digital cameras.

It should be noted that the electronic device 3 is only an example. Other existing or future electronic products that can be adapted to this application should also fall within the scope of protection of this application and are incorporated herein by reference.

In some embodiments, the memory 31 is used to store program code and various data, such as the artificial intelligence-based text classification device 20 installed in the electronic device 3, and to provide high-speed, automatic access to programs or data. The memory 31 includes non-volatile memory and volatile memory, such as read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage, magnetic disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.

In some embodiments, the at least one processor 32 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or a combination of multiple central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and various control chips. The at least one processor 32 is the control unit of the electronic device 3. It connects the components of the entire electronic device 3 through various interfaces and lines, and it executes the functions of the electronic device 3 and processes data by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31.

In some embodiments, the at least one communication bus 33 is configured to realize connection and communication between the memory 31 and the at least one processor 32 and so on.

Although not shown, the electronic device 3 may also include a power supply (such as a battery) for supplying power to the various components. Optionally, the power supply may be logically connected to the at least one processor 32 through a power management device, so that charging, discharging, and power consumption are managed through the power management device. The power supply may also include any of one or more DC or AC power sources, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and other components. The electronic device 3 may also include various sensors, Bluetooth modules, Wi-Fi modules, and so on, which are not described further here.

It should be understood that the embodiments are for illustration only, and that the scope of the patent application is not limited by this structure.

The above-mentioned integrated units, when implemented in the form of software function modules, can be stored in a computer-readable storage medium. The software function modules are stored in a storage medium and include several instructions that cause a computer device (which may be a personal computer, an electronic device, a network device, or the like) or a processor to execute part of the methods described in the various embodiments of the present application.

In a further embodiment, with reference to FIG. 2, the at least one processor 32 can execute the operating means of the electronic device 3 as well as the various installed applications (such as the artificial intelligence-based text classification device 20), program code, and the like, for example, each of the above-mentioned modules.

Program code is stored in the memory 31, and the at least one processor 32 can invoke the program code stored in the memory 31 to perform related functions. For example, the modules described in FIG. 2 are program code stored in the memory 31 and executed by the at least one processor 32, so as to realize the functions of the modules and thereby achieve the purpose of artificial intelligence-based text classification.

In one embodiment of the present application, the memory 31 stores a plurality of computer-readable instructions, and the plurality of computer-readable instructions are executed by the at least one processor 32 to realize the function of text classification based on artificial intelligence.

Exemplarily, the program code may be divided into one or more modules/units, which are stored in the memory 31 and executed by the processor 32 to carry out the present application. The one or more modules/units may be a series of computer-readable instruction segments capable of accomplishing specific functions, and the instruction segments describe the execution process of the computer program in the electronic device 3. For example, the program code can be divided into the analysis module 201, selection module 202, text enhancement module 203, first input module 204, verification module 205, determination module 206, and second input module 207.

For the specific manner in which the at least one processor 32 implements the above instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, and details are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.

Further, the computer-readable storage medium may be non-volatile or volatile.

Further, the computer-readable storage medium may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function, etc.; the data storage area may store data created through use of the blockchain node, etc.

The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks linked to one another using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain can include the underlying blockchain platform, the platform product service layer, and the application service layer.
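
As a rough illustration of data blocks "associated with each other using cryptographic methods", the following Python sketch links blocks by storing the SHA-256 hash of each predecessor. The function names (`block_hash`, `make_block`, `build_chain`, `verify_chain`) and the block fields are illustrative assumptions and do not come from this application; consensus and point-to-point transmission are omitted.

```python
import hashlib
import json


def block_hash(block):
    # Hash a canonical JSON encoding of the block's contents.
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(data, prev_hash):
    # Each block records its payload and the hash of the previous block.
    return {"data": data, "prev_hash": prev_hash}


def build_chain(records):
    chain = []
    prev = "0" * 64  # placeholder hash for the first block (assumption)
    for record in records:
        block = make_block(record, prev)
        chain.append(block)
        prev = block_hash(block)
    return chain


def verify_chain(chain):
    # The chain is valid if every block points at the hash of its predecessor.
    prev = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        prev = block_hash(block)
    return True


chain = build_chain([
    {"strategy": "synonym_replacement"},
    {"strategy": "random_deletion"},
])
print(verify_chain(chain))  # True
chain[0]["data"]["strategy"] = "tampered"
print(verify_chain(chain))  # False: the second block no longer matches
```

Changing an earlier block alters its hash, so the link stored in the following block no longer matches. This is the anti-counterfeiting property the paragraph above refers to.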

The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, and may be located in one place or distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional module in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit. The above-mentioned integrated units can be implemented in the form of hardware, or in the form of hardware plus software function modules.

It will be apparent to those skilled in the art that the present application is not limited to the details of the exemplary embodiments described above, and that the present application can be implemented in other specific forms without departing from its spirit or essential characteristics. Therefore, the embodiments should be regarded in all respects as exemplary and not restrictive. The scope of the present application is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced in this application. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is clear that the word "comprising" does not exclude other elements, and the singular does not exclude the plural. A plurality of units or means stated in this application may also be implemented by one unit or means through software or hardware. The words first, second, etc. are used to denote names and do not imply any particular order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent replacements can be made to the technical solutions of the present application without departing from the spirit and scope of those technical solutions.

Claims

A text classification method based on artificial intelligence, wherein said method comprises:

Analyzing the received text classification request to construct a search space, wherein the search space contains multiple text enhancement strategies;

Using a preset search strategy to randomly select a text enhancement strategy from the search space as a target text enhancement strategy, wherein the preset search strategy includes a controller;

Using the target text enhancement strategy to perform text enhancement on each text in the original text set in the text classification request to obtain a first enhanced text set;

Inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;

Input the verification set in the text classification request into the first text classification model for verification, and calculate the pass rate of verification;

determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate;

Using the optimal text enhancement strategy to perform text enhancement on the text set to be classified in the text classification request to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
The text classification method based on artificial intelligence as claimed in claim 1, wherein parsing the received text classification request to construct a search space comprises:

Parse the received text classification request to obtain four types of hyperparameters: category label, operation type, probability value of application type, and the proportion of words in each text to which operation is applied;

performing combined operations on the four types of hyperparameters to obtain multiple text enhancement strategies, wherein each of the text enhancement strategies is composed of the four types of hyperparameters;

A search space is constructed based on the plurality of text enhancement strategies.
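
One way to read the "combined operations" above is as a Cartesian product over the candidate values of the four hyperparameter types. The sketch below follows that reading. The `Strategy` tuple, its field names, the function `build_search_space`, and the candidate values are all hypothetical and are not taken from this application.

```python
from itertools import product
from typing import NamedTuple


class Strategy(NamedTuple):
    category_label: str   # class the strategy applies to
    operation: str        # operation type
    apply_prob: float     # probability of applying the operation (assumed meaning)
    word_ratio: float     # proportion of words in each text that the operation touches


def build_search_space(labels, operations, probs, ratios):
    # One strategy per combination of the four hyperparameter types.
    return [Strategy(*combo) for combo in product(labels, operations, probs, ratios)]


space = build_search_space(
    labels=["positive", "negative"],
    operations=["synonym_replacement", "random_insertion",
                "random_swap", "random_deletion"],
    probs=[0.1, 0.5, 0.9],
    ratios=[0.1, 0.2],
)
print(len(space))  # 2 * 4 * 3 * 2 = 48
print(space[0])
```

Under this reading the size of the search space grows multiplicatively with the number of candidate values per hyperparameter type, which is one reason a learned search strategy can be preferable to exhaustive enumeration.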
The artificial intelligence-based text classification method according to claim 2, wherein the operation type includes one or a combination of the following methods: synonym replacement, random insertion, random exchange, and random deletion.
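
The four operation types named above can be sketched on a tokenized text as follows. This is a minimal illustration under stated assumptions: the `SYNONYMS` table is a toy stand-in for a real thesaurus, the function signatures are hypothetical, and the application does not specify how each operation is implemented.

```python
import random

# Toy synonym table (assumption); a real system would use a thesaurus.
SYNONYMS = {"quick": ["fast", "rapid"], "happy": ["glad", "joyful"]}


def synonym_replacement(words, n, rng):
    # Replace up to n words that have an entry in the synonym table.
    out = list(words)
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out


def random_insertion(words, n, rng):
    # Insert n synonyms of randomly chosen known words at random positions.
    out = list(words)
    known = [w for w in words if w in SYNONYMS]
    for _ in range(n):
        if not known:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(known)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out


def random_swap(words, n, rng):
    # Swap two randomly chosen positions, n times.
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(words, p, rng):
    # Drop each word with probability p, but never return an empty text.
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]


rng = random.Random(0)
words = "the quick dog is happy".split()
print(synonym_replacement(words, 1, rng))
print(random_insertion(words, 1, rng))
print(random_swap(words, 1, rng))
print(random_deletion(words, 0.2, rng))
```

A seeded `random.Random` instance is passed explicitly so that an enhanced text set can be reproduced for a given strategy.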
The text classification method based on artificial intelligence as claimed in claim 2, wherein randomly selecting a text enhancement strategy from the search space as the target text enhancement strategy by using the preset search strategy comprises:

Inputting the plurality of text enhancement strategies into the controller of the preset search strategy, wherein the controller randomly selects, from the plurality of text enhancement strategies, one hyperparameter of any one type of hyperparameter as the input parameter of the current time step of the controller, inputs the input parameter of the current time step into the controller, and outputs the output value of the current time step;

The controller randomly selects, from the plurality of text enhancement strategies, one hyperparameter of any one of the remaining types of hyperparameters as a first input parameter of the next time step, uses the first input parameter of the next time step together with the output value of the current time step as the target input parameter of the next time step, inputs the target input parameter of the next time step into the controller, and outputs the output value of the next time step;

The selection of the four types of hyperparameters and the determination of the input parameters are performed cyclically until the output parameters corresponding to each of the hyperparameters are obtained, and the four output values corresponding to the four types of hyperparameters are determined as the target text enhancement strategy.
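
The step-by-step selection above, in which each time step's output feeds into the next time step's input, can be sketched with a toy recurrent sampler. This is a simplified stand-in and not the controller of this application: the `Controller` class, its `bias` and `recur` parameters, the `tanh` update, and the candidate values in `CHOICES` are all assumptions made for illustration.

```python
import math
import random

# Hypothetical candidate values for the four hyperparameter types.
CHOICES = [
    ["positive", "negative"],                    # category label
    ["synonym_replacement", "random_insertion",
     "random_swap", "random_deletion"],          # operation type
    [0.1, 0.5, 0.9],                             # probability of applying the operation
    [0.1, 0.2, 0.3],                             # proportion of words affected
]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Controller:
    def __init__(self, choices, seed=0):
        self.rng = random.Random(seed)
        self.choices = choices
        # One bias and one recurrent weight per candidate value.
        self.bias = [[0.0] * len(c) for c in choices]
        self.recur = [[self.rng.uniform(-1, 1) for _ in c] for c in choices]

    def sample(self):
        prev_output = 0.0  # no previous output before the first time step
        picked = []
        for step, values in enumerate(self.choices):
            # The logits of this step depend on the previous step's output.
            logits = [b + r * prev_output
                      for b, r in zip(self.bias[step], self.recur[step])]
            probs = softmax(logits)
            index = self.rng.choices(range(len(values)), weights=probs)[0]
            picked.append(index)
            # The output of this step becomes part of the next step's input.
            prev_output = math.tanh(probs[index] + index)
        return picked, [self.choices[s][i] for s, i in enumerate(picked)]


controller = Controller(CHOICES)
indices, strategy = controller.sample()
print(strategy)
```

After four time steps, the four selected values together form one candidate target text enhancement strategy.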
The text classification method based on artificial intelligence as claimed in claim 1, wherein using the target text enhancement strategy to perform text enhancement on each text in the original text set in the text classification request to obtain the first enhanced text set comprises:

identifying an output value corresponding to each hyperparameter in the target text enhancement strategy;

Performing text enhancement on each text in the original text set based on the output value corresponding to each of the hyperparameters to obtain the first enhanced text set.
The text classification method based on artificial intelligence according to claim 5, wherein determining the target text classification model and the optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate comprises:

When the verification pass rate satisfies the preset convergence condition in the text classification request, the first text classification model is determined as the target text classification model and the target text enhancement strategy is determined as the optimal text enhancement strategy.
The text classification method based on artificial intelligence as claimed in claim 6, wherein the method further comprises:

When the verification pass rate does not meet the preset convergence condition in the text classification request, update the model parameters in the controller based on the verification pass rate to obtain an updated controller;

Using the updated controller to randomly select a new text enhancement strategy from the search space as a new target text enhancement strategy, and use the new target text enhancement strategy to perform text enhancement on the original text set, obtain the second enhanced text set;

Inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model, inputting the verification set in the text classification request into the second text classification model for verification, and calculating the verification pass rate;

Repeating the process of updating the model parameters in the controller based on the verification pass rate and reselecting a new text enhancement strategy for text enhancement to obtain a verification pass rate, until the verification pass rate satisfies the preset convergence condition corresponding to the controller, whereupon the text classification model corresponding to that verification pass rate is determined as the target text classification model and the new target text enhancement strategy corresponding to that verification pass rate is determined as the optimal text enhancement strategy.
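
The search loop described in the method claims above (select a strategy, enhance, train, verify, then either stop or update the controller) can be sketched as follows. Everything here is a hypothetical stand-in: `search`, `pass_rate`, `toy_augment`, and `toy_train` are illustrative names, the multiplicative weight update replaces whatever controller update the application actually uses, and `threshold` and `max_rounds` are assumed parameters.

```python
import random


def pass_rate(model, validation_set):
    # Fraction of validation examples whose predicted label is correct.
    correct = sum(1 for text, label in validation_set if model(text) == label)
    return correct / len(validation_set)


def search(strategies, augment, train, originals, validation_set,
           threshold=0.9, max_rounds=20, seed=0):
    rng = random.Random(seed)
    weights = [1.0] * len(strategies)  # stand-in for the controller's parameters
    best = (None, None, -1.0)
    for _ in range(max_rounds):
        idx = rng.choices(range(len(strategies)), weights=weights)[0]
        strategy = strategies[idx]
        enhanced = [augment(text, label, strategy) for text, label in originals]
        model = train(originals + enhanced)
        rate = pass_rate(model, validation_set)
        if rate > best[2]:
            best = (model, strategy, rate)
        if rate >= threshold:  # preset convergence condition met
            break
        # Otherwise use the pass rate as feedback to update the controller.
        weights[idx] *= 0.5 + rate
    return best


def toy_augment(text, label, strategy):
    # Hypothetical augmentation: drop the first word or reverse the word order.
    words = text.split()
    if strategy == "drop_first" and len(words) > 1:
        words = words[1:]
    elif strategy == "reverse":
        words = words[::-1]
    return " ".join(words), label


def toy_train(examples):
    # "Training" memorises the label each word was last seen with.
    vocab = {}
    for text, label in examples:
        for word in text.split():
            vocab[word] = label

    def model(text):
        votes = [vocab[w] for w in text.split() if w in vocab]
        return max(set(votes), key=votes.count) if votes else None
    return model


originals = [("great film", "pos"), ("awful film", "neg")]
validation = [("great", "pos"), ("awful", "neg")]
model, strategy, rate = search(["drop_first", "reverse"], toy_augment,
                               toy_train, originals, validation)
print(strategy, rate)
```

The sketch returns the best model seen so far if the round budget runs out before the convergence condition is met; that fallback is an addition for the example and is not described in the claims.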
An electronic device, wherein the electronic device includes a memory and a processor, the memory is used to store at least one computer-readable instruction, and the processor is used to execute the at least one computer-readable instruction to implement the following steps:

Analyzing the received text classification request to construct a search space, wherein the search space contains multiple text enhancement strategies;

Using a preset search strategy to randomly select a text enhancement strategy from the search space as a target text enhancement strategy, wherein the preset search strategy includes a controller;

Using the target text enhancement strategy to perform text enhancement on each text in the original text set in the text classification request to obtain a first enhanced text set;

Inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;

Input the verification set in the text classification request into the first text classification model for verification, and calculate the pass rate of verification;

determining a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate;

Using the optimal text enhancement strategy to perform text enhancement on the text set to be classified in the text classification request to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
The electronic device according to claim 8, wherein, when the processor executes the at least one computer-readable instruction to implement the parsing of the received text classification request to construct a search space, the steps specifically include:

Parse the received text classification request to obtain four types of hyperparameters: category label, operation type, probability value of application type, and the proportion of words in each text to which operation is applied;

performing combined operations on the four types of hyperparameters to obtain multiple text enhancement strategies, wherein each of the text enhancement strategies is composed of the four types of hyperparameters;

A search space is constructed based on the plurality of text enhancement strategies.
The electronic device according to claim 9, wherein the operation type includes one or a combination of the following methods: synonym replacement, random insertion, random exchange, and random deletion.
The electronic device according to claim 9, wherein, when the processor executes the at least one computer-readable instruction to implement the random selection of a text enhancement strategy from the search space as the target text enhancement strategy by using the preset search strategy, the steps specifically include:

Inputting the plurality of text enhancement strategies into the controller of the preset search strategy, wherein the controller randomly selects, from the plurality of text enhancement strategies, one hyperparameter of any one type of hyperparameter as the input parameter of the current time step of the controller, inputs the input parameter of the current time step into the controller, and outputs the output value of the current time step;

The controller randomly selects, from the plurality of text enhancement strategies, one hyperparameter of any one of the remaining types of hyperparameters as a first input parameter of the next time step, uses the first input parameter of the next time step together with the output value of the current time step as the target input parameter of the next time step, inputs the target input parameter of the next time step into the controller, and outputs the output value of the next time step;

The selection of the four types of hyperparameters and the determination of the input parameters are performed cyclically until the output parameters corresponding to each of the hyperparameters are obtained, and the four output values corresponding to the four types of hyperparameters are determined as the target text enhancement strategy.
The electronic device according to claim 8, wherein, when the processor executes the at least one computer-readable instruction to implement the text enhancement of each text in the original text set in the text classification request by using the target text enhancement strategy to obtain the first enhanced text set, the steps specifically include:

identifying an output value corresponding to each hyperparameter in the target text enhancement strategy;

Performing text enhancement on each text in the original text set based on the output value corresponding to each of the hyperparameters to obtain the first enhanced text set.
The electronic device according to claim 12, wherein, when the processor executes the at least one computer-readable instruction to implement the determining of the target text classification model and the optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate, the steps specifically include:

When the verification pass rate satisfies the preset convergence condition in the text classification request, determining the first text classification model as a target text classification model and determining the target text enhancement strategy as an optimal text enhancement strategy; or

When the verification pass rate does not meet the preset convergence condition in the text classification request, updating the model parameters in the controller based on the verification pass rate to obtain an updated controller; using the updated controller to randomly select a new text enhancement strategy from the search space as a new target text enhancement strategy, and using the new target text enhancement strategy to perform text enhancement on the original text set to obtain a second enhanced text set; inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model, inputting the verification set in the text classification request into the second text classification model for verification, and calculating the verification pass rate; and repeating the updating of the model parameters in the controller based on the verification pass rate and the reselecting of a new text enhancement strategy for text enhancement to obtain a verification pass rate, until the verification pass rate satisfies the preset convergence condition corresponding to the controller, whereupon the text classification model corresponding to that verification pass rate is determined as the target text classification model and the new target text enhancement strategy corresponding to that verification pass rate is determined as the optimal text enhancement strategy.
A computer-readable storage medium, wherein the computer-readable storage medium stores at least one computer-readable instruction, and when the at least one computer-readable instruction is executed by a processor, the following steps are implemented:

Analyzing the received text classification request to construct a search space, wherein the search space contains multiple text enhancement strategies;

Using a preset search strategy to randomly select a text enhancement strategy from the search space as a target text enhancement strategy, wherein the preset search strategy includes a controller;

Using the target text enhancement strategy to perform text enhancement on each text in the original text set in the text classification request to obtain a first enhanced text set;

Inputting the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;

Input the verification set in the text classification request into the first text classification model for verification, and calculate the pass rate of verification;

Determine the target text classification model and optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate;

Using the optimal text enhancement strategy to perform text enhancement on the text set to be classified in the text classification request to obtain a third enhanced text set, and inputting the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.
The medium according to claim 14, wherein, when the at least one computer-readable instruction is executed by the processor to implement the parsing of the received text classification request to construct a search space, the steps specifically include:

Parse the received text classification request to obtain four types of hyperparameters: category label, operation type, probability value of application type, and the proportion of words in each text to which operation is applied;

performing combined operations on the four types of hyperparameters to obtain multiple text enhancement strategies, wherein each of the text enhancement strategies is composed of the four types of hyperparameters;

A search space is constructed based on the plurality of text enhancement strategies.
The medium according to claim 15, wherein the operation type includes one or a combination of the following methods: synonym replacement, random insertion, random exchange, and random deletion.
The medium according to claim 15, wherein, when the at least one computer-readable instruction is executed by the processor to implement the random selection of a text enhancement strategy from the search space as the target text enhancement strategy by using the preset search strategy, the steps specifically include:

Inputting the plurality of text enhancement strategies into the controller of the preset search strategy, wherein the controller randomly selects, from the plurality of text enhancement strategies, one hyperparameter of any one type of hyperparameter as the input parameter of the current time step of the controller, inputs the input parameter of the current time step into the controller, and outputs the output value of the current time step;

The controller randomly selects, from the plurality of text enhancement strategies, one hyperparameter of any one of the remaining types of hyperparameters as a first input parameter of the next time step, uses the first input parameter of the next time step together with the output value of the current time step as the target input parameter of the next time step, inputs the target input parameter of the next time step into the controller, and outputs the output value of the next time step;

The selection of the four types of hyperparameters and the determination of the input parameters are performed cyclically until the output parameters corresponding to each of the hyperparameters are obtained, and the four output values corresponding to the four types of hyperparameters are determined as the target text enhancement strategy.
The medium according to claim 14, wherein, when the at least one computer-readable instruction is executed by the processor to implement the text enhancement of each text in the original text set in the text classification request by using the target text enhancement strategy to obtain the first enhanced text set, the steps specifically include:

identifying an output value corresponding to each hyperparameter in the target text enhancement strategy;

Performing text enhancement on each text in the original text set based on the output value corresponding to each of the hyperparameters to obtain the first enhanced text set.
The medium according to claim 18, wherein, when the at least one computer-readable instruction is executed by the processor to implement the determining of the target text classification model and the optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate, the steps specifically include:

When the verification pass rate satisfies the preset convergence condition in the text classification request, determining the first text classification model as a target text classification model and determining the target text enhancement strategy as an optimal text enhancement strategy; or

When the verification pass rate does not meet the preset convergence condition in the text classification request, updating the model parameters in the controller based on the verification pass rate to obtain an updated controller; using the updated controller to randomly select a new text enhancement strategy from the search space as a new target text enhancement strategy, and using the new target text enhancement strategy to perform text enhancement on the original text set to obtain a second enhanced text set; inputting the original text set and the second enhanced text set into the preset neural network for training to obtain a second text classification model, inputting the verification set in the text classification request into the second text classification model for verification, and calculating the verification pass rate; and repeating the updating of the model parameters in the controller based on the verification pass rate and the reselecting of a new text enhancement strategy for text enhancement to obtain a verification pass rate, until the verification pass rate satisfies the preset convergence condition corresponding to the controller, whereupon the text classification model corresponding to that verification pass rate is determined as the target text classification model and the new target text enhancement strategy corresponding to that verification pass rate is determined as the optimal text enhancement strategy.
A text classification device based on artificial intelligence, wherein said device comprises:

A parsing module, configured to parse the received text classification request and construct a search space, wherein the search space includes multiple text enhancement strategies;

A selecting module, configured to randomly select a text enhancement strategy from the search space by using a preset search strategy as a target text enhancement strategy, wherein the preset search strategy includes a controller;

A text enhancement module, configured to use the target text enhancement strategy to perform text enhancement on each text in the original text set in the text classification request to obtain a first enhanced text set;

A first input module, configured to input the original text set and the first enhanced text set into a preset neural network for training to obtain a first text classification model;

A verification module, configured to input the verification set in the text classification request into the first text classification model for verification, and calculate the verification pass rate;

A determining module, configured to determine a target text classification model and an optimal text enhancement strategy corresponding to the text classification request according to the verification pass rate;

A second input module, configured to use the optimal text enhancement strategy to perform text enhancement on the text set to be classified in the text classification request to obtain a third enhanced text set, and to input the third enhanced text set and the text set to be classified into the target text classification model to obtain a text classification result.