CN113535961A - System, method and device for realizing multi-language mixed short text classification processing based on small sample learning, memory and storage medium thereof - Google Patents


Info

Publication number
CN113535961A
CN113535961A
Authority
CN
China
Prior art keywords
text
model
data
word
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110886442.0A
Other languages
Chinese (zh)
Inventor
Wang Yongjian (王永剑)
Sun Yaru (孙亚茹)
Yang Ying (杨莹)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Research Institute of the Ministry of Public Security
Original Assignee
Third Research Institute of the Ministry of Public Security
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Third Research Institute of the Ministry of Public Security filed Critical Third Research Institute of the Ministry of Public Security
Priority to CN202110886442.0A priority Critical patent/CN113535961A/en
Publication of CN113535961A publication Critical patent/CN113535961A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Combinations of networks
    • G06N 3/047 Probabilistic or stochastic networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a system for realizing multi-language mixed short text classification processing based on small sample learning. The system comprises a data acquisition module for inputting a small number of preset label samples into the system; a data preprocessing module for preprocessing the data of the preset label samples; a model calculation processing module for extracting key features and generating a corresponding model accuracy calculation result; and a model generation and output module for predicting the model prediction result of the current text data and further updating and iterating the output model through sampling and auditing of the model prediction result. The invention also relates to a corresponding method, device, processor and storage medium. With the system, method, device, processor and storage medium, the mining of potential information in large-scale data is completed in a time-saving and labor-saving manner by means of small sample learning, and word formation information and word cross-correlation information are effectively obtained, which is highly innovative.

Description

System, method and device for realizing multi-language mixed short text classification processing based on small sample learning, memory and storage medium thereof
Technical Field
The invention relates to the technical field of deep learning, in particular to the field of natural language processing, and specifically to a system, a method, a device, a memory and a computer-readable storage medium for realizing multi-language mixed short text classification processing based on small sample learning.
Background
Text classification, the task of assigning labels to text, is one of the important and fundamental tasks in natural language processing and supports many downstream tasks such as sentiment classification and topic extraction. Text classification technology is also the key to mining valuable information from text platforms. Posts on such platforms are mostly short texts, characterized by short sentences, multiple languages, diverse content, informality, grammatical errors, colloquialisms, slang and the like, so an effective text classification technology is needed to solve the problem of classifying short texts that mix several languages.
Traditional text classification algorithms focus on linear representations of text, such as support vector machine models that take dictionary or n-gram word vectors as input. Research in recent years has shown that nonlinear models can capture text context information efficiently and produce more accurate predictions than linear models. The convolutional neural network is a typical nonlinear model: it converts local features of the data into low-dimensional vectors while retaining task-related information. This efficient mapping is superior to sequence models for short text representation.
A convolutional neural network acquires the feature information of a data region by max pooling, keeping only the feature with the maximum value in each region during calculation. As the number of convolution layers increases, position information related to the target is gradually lost. Text regions may express more complex concepts, but learning them by extracting only the most significant feature information of each region ignores other useful information. In addition, coupling connections between network layers may add redundancy to the model.
In addition to the performance of the model, the quality of the data features also has a large impact on the results of downstream tasks. For multi-language mixed short texts, existing multilingual models such as Multilingual BERT and LASER cannot represent the features of different languages well in the same feature space. As a result, representations of multiple languages cannot be computed in a common feature space, and semantic deviation occurs.
The attention mechanism is an effective method for focusing on key information in the model input data. An attention model not only attends to salient feature information during training, but also adjusts the parameters of the neural network according to different features, and can thus mine more hidden feature information.
Disclosure of Invention
The present invention is directed to overcoming the above drawbacks of the prior art by providing a system, method, apparatus, memory and computer-readable storage medium for realizing multi-language mixed short text classification processing based on small sample learning, which can effectively obtain word formation information and word cross-correlation information.
To achieve the above object, the system, method, apparatus, memory and computer-readable storage medium for realizing multi-language mixed short text classification processing based on small sample learning according to the present invention are as follows:
The system for realizing the classification processing of the multi-language mixed short text based on small sample learning is mainly characterized by comprising:
the data acquisition module is used for inputting a small number of preset label samples into the system;
the data preprocessing module is connected with the data acquisition module and is used for carrying out data set division, data cleaning and batch processing on the preset label sample;
the model calculation processing module is connected with the data preprocessing module and used for extracting key features according to the text data acquired after preprocessing and generating a corresponding model accuracy calculation result; and
the model generation and output module is connected with the model calculation processing module and used for predicting the model prediction result of the current text data according to the model accuracy calculation result and further updating and iterating the output model through sampling and auditing of the model prediction result.
Preferably, the model calculation processing module specifically includes:
the word information processing unit is connected with the data preprocessing module and is used for performing n-gram segmentation, word embedding and iterative processing of word sets on the small number of labeled text data samples obtained after batch processing;
the text characteristic embedding unit is connected with the word information processing unit and is used for combining the word information subjected to the iterative processing into a text integral characteristic to be used as the input of an effective convolution layer;
the text key region characteristic unit is connected with the text characteristic embedding unit and is used for acquiring text key characteristic information in the text overall characteristic;
the text type judging unit is connected with the text key area characteristic unit and is used for analyzing and calculating the classification type of the current input text; and
the model accuracy calculation unit is connected with the text type judgment unit and is used for calculating the model accuracy of the text information obtained after the text processing.
Preferably, the model generation and output module specifically includes:
the model prediction processing unit is used for inputting multi-language mixed short text data and performing model prediction;
the prediction result output unit is connected with the model prediction processing unit and is used for outputting a model prediction result; and
the sampling and auditing unit is connected with the prediction result output unit and is used for sampling and auditing the model prediction result so as to detect the accuracy of the prediction model.
Preferably, the sampling auditing unit judges whether to perform update calibration against a system preset threshold value according to the following rules:
if the sampling accuracy of the audited text data is greater than the threshold value, new label data are added to the data acquisition module to perform iterative update processing of the model;
if the sampling accuracy of the audited text data is not greater than the threshold value, new label data are added to the data acquisition module after calibration processing to perform iterative update processing of the model.
The method for realizing the classification processing of the multi-language mixed short text based on the small sample learning by using the system is mainly characterized by comprising the following steps:
(1) acquiring text sub-word information from a multi-language mixed short text;
(2) carrying out data set division, data cleaning and batch operation pretreatment on the text sub-word information;
(3) performing text characteristic embedding on the preprocessed text sub-word information to obtain input information of an effective convolution layer;
(4) acquiring adjacent word information and text key area information of the text sub-word information by adopting different kernel convolutions;
(5) judging the category of the text according to probability distribution;
(6) performing classification model prediction and mining new text data information according to the category information, and updating and iterating the model.
Preferably, the step (3) specifically includes the following steps:
(3.1) searching for the word; if it is not found, searching for special sub-words before segmentation and entering step (3.2); if it is found, taking the word as a whole and entering step (3.3);
(3.2) if a special sub-word is found, segmenting according to the special sub-word and segmenting the remaining part according to the n-gram; otherwise segmenting directly according to the n-gram; the formed segments constitute the corresponding sub-word library; entering step (3.3);
(3.3) affine transforming the sub-word library formed after segmentation to the word-level representation, adding the newly represented word as a special sub-word to the sub-word set, and calculating the higher-layer sub-word representation.
Preferably, the sub-word characterization of the higher layer is calculated according to the following formula:
u_{i|g} = Σ_{g∈θ_w} W_{gi} · v_g
wherein g is a sub-word, i is the i-th word in the sentence, W_{gi} is the data transformation matrix, θ_w is the sub-word set, v_g represents the characterization of the sub-word g, and u_{i|g} (1 ≤ i ≤ n) is the higher-layer representation of the sub-word.
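As a non-authoritative illustration of steps (3.1)-(3.3), the following Python sketch shows one way the word lookup, special sub-word search and n-gram segmentation could be organized; the function names and the 3-gram default are assumptions rather than part of the patent:

    def ngram_subwords(word, n=3):
        """Split '<word>' into character n-grams; '<' and '>' mark the affix ends."""
        marked = "<" + word + ">"
        return [marked[i:i + n] for i in range(len(marked) - n + 1)]

    def segment(word, vocab, special_subwords, n=3):
        if word in vocab:                        # word found: keep it whole
            return [word]
        for sp in special_subwords:              # search special sub-words first
            if sp in word:
                head, _, tail = word.partition(sp)
                parts = ngram_subwords(head, n) if head else []
                parts.append(sp)
                if tail:
                    parts += ngram_subwords(tail, n)
                return parts
        return ngram_subwords(word, n)           # otherwise plain n-gram segmentation

    print(ngram_subwords("Ting_h"))
    # ['<Ti', 'Tin', 'ing', 'ng_', 'g_h', '_h>']  -- the 6 sub-words of the example below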
Preferably, the step (4) specifically includes the following steps:
(4.1) combining the affine-transformed higher-layer sub-word representations u_{i|g} into the text overall feature U = (u_1, u_2, …, u_n) as the input of the effective convolution layer;
(4.2) performing one-dimensional convolution on the text features by adopting convolution kernels with different widths and different channel numbers to obtain global features containing different adjacent word information;
and (4.3) performing text global feature control by adopting a self-attention mechanism, thereby calculating and outputting text key region information features.
Preferably, the global characteristics of the different neighboring word information in step (4.2) are calculated according to the following formula:
V_{l+1}(U_l) = ReLU(Conv_{1×k}(U_l));
where k is the kernel width, l is the convolution layer index, ReLU is the activation function, U_l represents the input feature data of the l-th kernel convolution, Conv_{1×k} represents a convolution operation with kernel width k, and V_{l+1} represents the global features after the l-th convolution.
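For illustration, a minimal NumPy sketch of this convolution follows, assuming U_l is a d × seq_len feature matrix; the channel count and kernel widths mirror the embodiment example below but are otherwise arbitrary:

    import numpy as np

    def conv1d_relu(U, kernels):
        """V_{l+1} = ReLU(Conv_{1xk}(U)); kernels has shape (channels, d, k)."""
        channels, d, k = kernels.shape
        seq_len = U.shape[1]
        out = np.zeros((channels, seq_len - k + 1))
        for c in range(channels):
            for t in range(seq_len - k + 1):
                # multiply the window by the kernel element-wise and sum
                out[c, t] = np.sum(kernels[c] * U[:, t:t + k])
        return np.maximum(out, 0.0)  # ReLU activation

    U = np.random.randn(5, 10)                     # feature dim 5, sentence length 10
    V1 = conv1d_relu(U, np.random.randn(4, 5, 2))  # 4 channels, kernel width k=2
    V2 = conv1d_relu(U, np.random.randn(4, 5, 4))  # 4 channels, kernel width k=4
    print(V1.shape, V2.shape)                      # (4, 9) (4, 7)

The two widths yield global features over different numbers of adjacent words, which is the point of using unequal kernel widths.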
Preferably, the step (4.3) specifically includes the following steps:
(4.3.1) calculating the coupling coefficient c_{jm} according to the following formula as the salient weight of the text global features:
c_{jm} = exp(b_{jm}) / Σ_m exp(b_{jm});
wherein j is the j-th row of the convolved feature matrix, m is the m-th column of the convolved feature matrix, b_{jm} is a feature value inside the convolved text data, initialized to the pre-convolution feature value u′_{jm}, and c_{jm} is the attention value of the j-th row and m-th column of the convolved feature values;
(4.3.2) calculating the text key region information features v_m according to the global pooling formula, wherein u′_{m|j} is the feature of the j-th row before convolution:
v_m = Σ_j c_{jm} · u′_{m|j};
(4.3.3) iteratively updating the internal coefficients b_{jm} of the effective pooling according to the following formula:
b_{jm} = b_{jm} + v_m · u′_{m|j}.
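The three sub-steps can be read as a routing-style iteration; a minimal NumPy sketch follows, in which the number of iterations and the matrix orientation (rows j, columns m) are assumptions consistent with the formulas above:

    import numpy as np

    def coupled_pooling(u_prime, iters=3):
        """u_prime: (rows j, cols m) pre-convolution features u'_{m|j}."""
        b = u_prime.copy()                         # b_jm initialised to u'_jm
        for _ in range(iters):
            e = np.exp(b - b.max(axis=1, keepdims=True))
            c = e / e.sum(axis=1, keepdims=True)   # c_jm: softmax over m
            v = (c * u_prime).sum(axis=0)          # v_m = sum_j c_jm * u'_{m|j}
            b = b + v[np.newaxis, :] * u_prime     # b_jm += v_m * u'_{m|j}
        return v

    v = coupled_pooling(np.random.randn(9, 4))     # e.g. a 9 x 4 convolved feature map
    print(v.shape)                                 # (4,)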
Preferably, the step (5) specifically comprises the following steps:
(5.1) integrating the text key region features v^(1) and v^(2) output by the 2 kernel convolutions of different widths into a text information feature vector v through the splicing function f; the text information feature vector v is calculated according to the following formula:
v = f(v^(1), v^(2));
(5.2) inputting the text information feature vector v into a feed-forward neural network FFNN(·) to output text category features, and predicting the probability distribution ŷ of the multiple categories of the text with the softmax function; the probability distribution ŷ is calculated according to the following formula:
ŷ = softmax(FFNN(v)).
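The classification in step (5) thus reduces to concatenation, one linear layer and a softmax; a sketch in which the weight shapes and class count are illustrative assumptions:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def classify(v1, v2, W, b):
        v = np.concatenate([v1, v2])   # splicing function f
        return softmax(W @ v + b)      # y_hat = softmax(FFNN(v))

    rng = np.random.default_rng(0)
    v1, v2 = rng.standard_normal(4), rng.standard_normal(4)
    y_hat = classify(v1, v2, rng.standard_normal((4, 8)), np.zeros(4))
    print(round(float(y_hat.sum()), 6), int(y_hat.argmax()))  # probabilities sum to 1; argmax is the class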
Preferably, the step (6) specifically includes the following steps:
(6.1) carrying out category information mining on the unmarked multilingual mixed short text data by using the trained model and carrying out classification model prediction;
(6.2) detecting the accuracy rate of the output result of the model in a sampling auditing mode;
and (6.3) expanding the new category data as a label sample into the sample data set for updating and iterative processing of the model.
Preferably, the step (6.2) specifically comprises the following steps:
the accuracy of the model was checked using the following audit criteria:
(6.2.1) sampling 5% of the predicted data amount;
(6.2.2) manually judging, wherein the score is 0/1;
(6.2.3) calculating the sampling accuracy rate according to the following formula:
Figure BDA0003194308920000055
(6.2.4) setting a check threshold value for comparison.
Preferably, the step (6.2.4) is specifically:
if the sampling accuracy is higher than the threshold value, expanding the new category data as label samples into the sample data set; if the sampling accuracy is lower than the threshold value, enlarging the sampling proportion, calibrating the erroneous labels, and further training the model after they are expanded into the label sample data set.
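One possible shape of the sampling audit of steps (6.2.1)-(6.2.4), with the threshold value, data layout and helper names as assumptions:

    import random

    def audit(predictions, manual_score, sample_ratio=0.05, threshold=0.9):
        """Sample 5% of the predictions, score them 0/1, compare with a threshold."""
        n = max(1, int(len(predictions) * sample_ratio))
        sample = random.sample(predictions, n)
        accuracy = sum(manual_score(p) for p in sample) / n   # sampling accuracy P
        if accuracy >= threshold:
            return "extend"        # add new-category data to the label sample set
        return "recalibrate"       # enlarge the sample, fix wrong labels, retrain

    print(audit(list(range(200)), manual_score=lambda p: 1))  # -> 'extend'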
The device for realizing the classification processing of the multi-language mixed short text based on small sample learning is mainly characterized by comprising:
a processor configured to execute computer-executable instructions;
a memory storing one or more computer-executable instructions that, when executed by the processor, perform the steps of the above-described method for performing a multilingual hybrid short-text classification based on small-sample learning.
The processor for realizing the classification of the multi-language mixed short text based on the small sample learning is mainly characterized in that the processor is configured to execute computer executable instructions, and the computer executable instructions are executed by the processor to realize the steps of the method for realizing the classification of the multi-language mixed short text based on the small sample learning.
The computer-readable storage medium is mainly characterized by having a computer program stored thereon, wherein the computer program can be executed by a processor to implement the steps of the method for implementing the classification processing of the multi-language mixed short text based on the small sample learning.
By adopting the system, method, device, memory and computer-readable storage medium for realizing multi-language mixed short text classification processing based on small sample learning, and aiming at problems in the prior art such as the difficulty of representing out-of-vocabulary words, multi-language semantic deviation, and the time and labor cost of manual labeling, the invention provides a multi-language mixed short text classification model combining a convolutional neural network with an attention mechanism. The model can learn the features of multi-language mixed short texts from a small number of samples, and addresses the difficulty of representing out-of-vocabulary and rare words, the difficulty of capturing short text information, the semantic drift of multi-language features, the redundancy of model parameters and the like.

On one hand, starting from the internal structure and formation of words, the bottom-layer sub-word features are mapped to higher-layer features and shared, which alleviates the influence of out-of-vocabulary and rare words on the model and places multi-language word features in the same feature space.

On the other hand, the local convolutions of a deep convolutional neural network can effectively capture the associations between adjacent words, but its max pooling extracts only the maximum value of each feature region, so other useful information is ignored. The method uses multi-channel convolution kernels of unequal widths to capture information about different numbers of adjacent words, and at the same time adopts a coupling-coefficient calculation to extract the most significant information in the sentence without neglecting other related information.

New data are mined with a model trained on a small amount of labeled data, and the label data set and the trained model are updated through sampled auditing. The method thus uses small sample learning to complete the mining of potential information in large-scale data in a time-saving and labor-saving manner, effectively obtains word formation information and word cross-correlation information, and is highly innovative.
Drawings
FIG. 1 is a schematic diagram of a framework structure of a system for implementing a classification process of a multi-language mixed short text based on small sample learning according to the present invention.
FIG. 2 is a flow diagram of the method for realizing multi-language mixed short text classification processing based on small sample learning in an embodiment of the present invention.
Detailed Description
In order to more clearly describe the technical contents of the present invention, the following further description is given in conjunction with specific embodiments.
Before describing in detail embodiments that are in accordance with the present invention, it should be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, the system for implementing a classification process of a multi-language mixed short text based on small sample learning includes:
the data acquisition module is used for inputting a small number of preset label samples into the system;
the data preprocessing module is connected with the data acquisition module and is used for carrying out data set division, data cleaning and batch processing on the preset label sample;
the model calculation processing module is connected with the data preprocessing module and used for extracting key features according to the text data acquired after preprocessing and generating a corresponding model accuracy calculation result; and
the model generation and output module is connected with the model calculation processing module and used for predicting the model prediction result of the current text data according to the model accuracy calculation result and further updating and iterating the output model through sampling and auditing of the model prediction result.
As a preferred embodiment of the present invention, the model calculation processing module specifically includes:
the word information processing unit is connected with the data preprocessing module and is used for performing n-gram segmentation, word embedding and iterative processing of word sets on the small number of labeled text data samples obtained after batch processing;
the text characteristic embedding unit is connected with the word information processing unit and is used for combining the word information subjected to the iterative processing into a text integral characteristic to be used as the input of an effective convolution layer;
the text key region characteristic unit is connected with the text characteristic embedding unit and is used for acquiring text key characteristic information in the text overall characteristic;
the text type judging unit is connected with the text key area characteristic unit and is used for analyzing and calculating the classification type of the current input text; and
the model accuracy calculation unit is connected with the text type judgment unit and is used for calculating the model accuracy of the text information obtained after the text processing.
As a preferred embodiment of the present invention, the model generation and output module specifically includes:
the model prediction processing unit is used for inputting multi-language mixed short text data and performing model prediction;
the prediction result output unit is connected with the model prediction processing unit and is used for outputting a model prediction result; and
the sampling and auditing unit is connected with the prediction result output unit and is used for sampling and auditing the model prediction result so as to detect the accuracy of the prediction model.
As a preferred embodiment of the present invention, the sampling auditing unit judges whether to perform update calibration against a system preset threshold value according to the following rules:
if the sampling accuracy of the audited text data is greater than the threshold value, new label data are added to the data acquisition module to perform iterative update processing of the model;
if the sampling accuracy of the audited text data is not greater than the threshold value, new label data are added to the data acquisition module after calibration processing to perform iterative update processing of the model.
The method for realizing the classification processing of the multi-language mixed short texts based on the small sample learning by using the system comprises the following steps:
(1) acquiring text sub-word information from a multi-language mixed short text;
(2) carrying out data set division, data cleaning and batch operation pretreatment on the text sub-word information;
(3) performing text characteristic embedding on the preprocessed text sub-word information to obtain input information of an effective convolution layer;
(4) acquiring adjacent word information and text key area information of the text sub-word information by adopting different kernel convolutions;
(5) judging the category of the text according to probability distribution;
(6) performing classification model prediction and mining new text data information according to the category information, and updating and iterating the model.
As a preferred embodiment of the present invention, the step (3) specifically comprises the following steps:
(3.1) searching for the word; if it is not found, searching for special sub-words before segmentation and entering step (3.2); if it is found, taking the word as a whole and entering step (3.3);
(3.2) if a special sub-word is found, segmenting according to the special sub-word and segmenting the remaining part according to the n-gram; otherwise segmenting directly according to the n-gram; the formed segments constitute the corresponding sub-word library; entering step (3.3);
(3.3) affine transforming the sub-word library formed after segmentation to the word-level representation, adding the newly represented word as a special sub-word to the sub-word set, and calculating the higher-layer sub-word representation.
As a preferred embodiment of the present invention, the sub-word characterization of the upper layer is calculated according to the following formula:
u_{i|g} = Σ_{g∈θ_w} W_{gi} · v_g
wherein g is a sub-word, i is the i-th word in the sentence, W_{gi} is the data transformation matrix, θ_w is the sub-word set, v_g represents the characterization of the sub-word g, and u_{i|g} (1 ≤ i ≤ n) is the higher-layer representation of the sub-word.
As a preferred embodiment of the present invention, the step (4) specifically comprises the following steps:
(4.1) combining the affine-transformed higher-layer sub-word representations u_{i|g} into the text overall feature U = (u_1, u_2, …, u_n) as the input of the effective convolution layer;
(4.2) performing one-dimensional convolution on the text features by adopting convolution kernels with different widths and different channel numbers to obtain global features containing different adjacent word information;
and (4.3) performing text global feature control by adopting a self-attention mechanism, thereby calculating and outputting text key region information features.
As a preferred embodiment of the present invention, the global features of the different neighboring word information in step (4.2) are calculated according to the following formula:
V_{l+1}(U_l) = ReLU(Conv_{1×k}(U_l));
where k is the kernel width, l is the convolution layer index, ReLU is the activation function, U_l represents the input feature data of the l-th kernel convolution, Conv_{1×k} represents a convolution operation with kernel width k, and V_{l+1} represents the global features after the l-th convolution.
As a preferred embodiment of the present invention, the step (4.3) specifically comprises the following steps:
(4.3.1) calculating the coupling coefficient c_{jm} according to the following formula as the salient weight of the text global features:
c_{jm} = exp(b_{jm}) / Σ_m exp(b_{jm});
wherein j is the j-th row of the convolved feature matrix, m is the m-th column of the convolved feature matrix, b_{jm} is a feature value inside the convolved text data, initialized to the pre-convolution feature value u′_{jm}, and c_{jm} is the attention value of the j-th row and m-th column of the convolved feature values;
(4.3.2) calculating the text key region information features v_m according to the global pooling formula, wherein u′_{m|j} is the feature of the j-th row before convolution:
v_m = Σ_j c_{jm} · u′_{m|j};
(4.3.3) iteratively updating the internal coefficients b_{jm} of the effective pooling according to the following formula:
b_{jm} = b_{jm} + v_m · u′_{m|j}.
As a preferred embodiment of the present invention, the step (5) specifically comprises the following steps:
(5.1) integrating the text key region features v^(1) and v^(2) output by the 2 kernel convolutions of different widths into a text information feature vector v through the splicing function f; the text information feature vector v is calculated according to the following formula:
v = f(v^(1), v^(2));
(5.2) inputting the text information feature vector v into a feed-forward neural network FFNN(·) to output text category features, and predicting the probability distribution ŷ of the multiple categories of the text with the softmax function; the probability distribution ŷ is calculated according to the following formula:
ŷ = softmax(FFNN(v)).
As a preferred embodiment of the present invention, the step (6) specifically comprises the following steps:
(6.1) carrying out category information mining on the unmarked multilingual mixed short text data by using the trained model and carrying out classification model prediction;
(6.2) detecting the accuracy rate of the output result of the model in a sampling auditing mode;
and (6.3) expanding the new category data as a label sample into the sample data set for updating and iterative processing of the model.
As a preferred embodiment of the present invention, the step (6.2) specifically comprises the following steps:
the accuracy of the model is checked using the following audit criteria:
(6.2.1) sampling 5% of the predicted data amount;
(6.2.2) manually judging each sampled prediction, with a score of 0/1;
(6.2.3) calculating the sampling accuracy rate according to the following formula:
P = (number of sampled predictions scored 1) / (number of sampled predictions);
(6.2.4) setting a check threshold value for comparison.
As a preferred embodiment of the present invention, the step (6.2.4) is specifically:
if the sampling accuracy is higher than the threshold value, the new category data are expanded as label samples into the sample data set; if the sampling accuracy is lower than the threshold value, the sampling proportion is enlarged, the erroneous labels are calibrated, and the model is further trained after they are expanded into the label sample data set.
The device for realizing the classification processing of the multi-language mixed short text based on small sample learning comprises:
a processor configured to execute computer-executable instructions;
a memory storing one or more computer-executable instructions that, when executed by the processor, perform the steps of the above-described method for performing a multilingual hybrid short-text classification based on small-sample learning.
The processor for implementing the classification processing of the multi-language hybrid short text based on the small sample learning is configured to execute computer executable instructions, and the computer executable instructions, when executed by the processor, implement the steps of the method for implementing the classification processing of the multi-language hybrid short text based on the small sample learning.
The computer readable storage medium has a computer program stored thereon, the computer program being executable by a processor to perform the steps of the above method for performing a multi-lingual hybrid short text classification process based on small sample learning.
In a specific embodiment of the invention, the technical scheme starts from the internal structure and formation of words to construct a sub-word embedding network, which alleviates the influence of out-of-vocabulary words on the model while constructing a multi-language mixed feature space, so that words with the same semantics in different languages are close in the feature space. To solve the generalization problem of global pooling in deep convolutional neural networks, coupling coefficients are adopted to effectively compute the relevant features of local text regions. Multi-channel convolution kernels of unequal widths are used to capture information about different numbers of adjacent words, so that the model does not ignore other related information when extracting the main information in the sentence. The method comprises the following steps:
step one, text sub-word information is obtained. Embedding of the text depends on the sub-word information of each word, segmenting the words by adopting n-element grammar to form a sub-word library, and then carrying out affine transformation to the representation of the word level.
And step two, embedding text features. And combining the affine transformed word characteristics into the text overall characteristics to be used as the input of the effective convolution layer.
And step three, acquiring the information of the text key area. And the text key region characteristics are obtained by calculating different wide convolution kernels and coupling coefficients. And performing one-dimensional convolution on the text features by adopting convolution kernels with different widths and different channel numbers to obtain global features containing different adjacent word information. The convolution window slides on the input feature array in sequence, the data in the window and the data in the convolution kernel are multiplied and summed according to elements to obtain elements of corresponding output positions, and therefore adjacent word information of different distances is captured. And performing text global feature control on different text feature data after convolution by adopting an attention mechanism, and calculating a coupling coefficient to serve as a salient weight of the global feature. Then, the global pooling calculation outputs text key region information characteristics. Finally, effective pooling is performed to capture key information in the text without losing other related information, and the internal coefficients of the effective pooling are iteratively updated.
And step four, judging the type of the text. And integrating the information features of the text key regions output by the cores with different widths into a text information feature vector through a splicing function. And then, outputting text category characteristics to the text information characteristics through a layer of feedforward neural network, predicting the probability distribution of multiple categories of the text by adopting softmax, and calculating the category to which the text belongs through the probability distribution.
Step five: predicting and mining new text data information. Category information mining is carried out on the unmarked multi-language mixed short text data with the trained model, and the accuracy of the model output is detected through sampled auditing. The auditing standard is: (1) sample 5% of the amount of predicted data; (2) judge each sampled prediction manually, with a score of 0/1; (3) calculate the sampling accuracy,
P = (number of sampled predictions scored 1) / (number of sampled predictions);
(4) set a threshold value: if the sampling accuracy is higher than the threshold value, the new category data are expanded into the sample data set as label samples; if it is lower than the threshold value, the sampling proportion is enlarged, the erroneous labels are calibrated, and the model is further trained after they are expanded into the label sample data set; (5) update the iterative model.
Referring to fig. 2, in an embodiment of the present invention, taking the classification of Chinese-English multi-language mixed short texts as an example, the method for classifying multi-language mixed short texts of the present invention includes the following steps:
1. Data preparation. Text data are read from the multi-language mixed short texts. For example, a Chinese-English mixed short text sentence is read, roughly "the psychology course set up by Ting_h is so cool", followed by an emoticon; such sentences contain special symbols such as emoticons.
2. Data preprocessing. Denoising irrelevant to the text classification information, such as removing punctuation marks and emoticons, is carried out on the text. Chinese is separated from the other languages: English is space-delimited, and the Chinese is segmented as a whole. The result after segmentation is: { 'Ting_h' 'sets up' 'psychological' 'course' 'so' 'cool' }.
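A rough sketch of this denoising and language separation follows; the regular expressions are illustrative, the example sentence is a reconstruction, and a real system would use a dedicated Chinese word segmenter rather than keeping the Chinese run whole:

    import re

    def preprocess(text):
        # denoising: drop punctuation, emoticons and other classification-irrelevant marks
        text = re.sub(r"[^0-9A-Za-z_\u4e00-\u9fff ]+", " ", text)
        tokens = []
        for chunk in text.split():  # English is space-delimited
            # separate runs of Chinese characters from runs of other characters
            tokens.extend(re.findall(r"[\u4e00-\u9fff]+|[^\u4e00-\u9fff]+", chunk))
        return tokens

    print(preprocess("Ting_h开设的心理课程so cool :)"))
    # ['Ting_h', '开设的心理课程', 'so', 'cool']  -- a segmenter then splits the Chinese run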
3. Sub-word embedding to obtain the text sub-word information. The sub-word set is searched, and a word that is not found is segmented according to the n-gram. For example, the sub-word 'Ting_h' is segmented with a 3-gram. Before the 3-gram segmentation, the special characters '<' and '>' are added to the beginning and the end of the word to distinguish prefix and suffix sub-words. Before segmentation, special sub-words are searched for; if one is found, the word is segmented according to the special sub-word and the remaining part is segmented according to the n-gram; otherwise the word is segmented directly according to the n-gram. After segmentation, 6 sub-words are formed: { '<Ti' 'Tin' 'ing' 'ng_' 'g_h' '_h>' }. The newly characterized word Ting_h is then added as a special sub-word to the sub-word set θ_w, and the higher-layer sub-word representation is calculated according to formula (1), wherein v_g represents the characterization of the sub-word g and u_{i|g} (1 ≤ i ≤ n) is the higher-layer representation of the sub-word:

u_{i|g} = Σ_{g∈θ_w} W_{gi} · v_g    (1)
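Continuing the example, a small numerical illustration of formula (1) as reconstructed above; every value here is a random placeholder, and the summation form of the formula is itself an assumption:

    import numpy as np

    subs = ['<Ti', 'Tin', 'ing', 'ng_', 'g_h', '_h>']  # the 6 sub-words formed above
    rng = np.random.default_rng(1)
    v_g = {g: rng.standard_normal(5) for g in subs}    # sub-word characterizations v_g
    W = {g: rng.standard_normal((5, 5)) for g in subs} # transformation matrices W_gi
    u = sum(W[g] @ v_g[g] for g in subs)               # higher-layer u_{i|g} for 'Ting_h'
    print(u.shape)                                     # (5,) -- one word-level vector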
4. Different kernel convolutions to obtain adjacent word information. The affine-transformed word characterizations u_{i|g} are combined into the text overall feature U = (u_1, u_2, …, u_n) as the input of the effective convolution layer. Convolution kernels of different widths can be used to obtain the correlation of different numbers of adjacent words. One-dimensional convolutions with kernels of different widths k are applied to the text features according to formula (2) to obtain global features containing different adjacent word information:

V_{l+1}(U_l) = ReLU(Conv_{1×k}(U_l))    (2)

wherein U_l represents the input feature data of the l-th kernel convolution, Conv_{1×k} represents a convolution operation with kernel width k, and V_{l+1} represents the global features after the l-th convolution. Assume the sentence length is 10 and the feature dimension is 5, so the input dimension is (5 × 10). The number of channels is set to 4, and the convolution kernel widths are 2 and 4 respectively. Two text region features of different widths, V^(1) and V^(2), are thus each obtained by one convolution.
5. Effective pooling to obtain the text key region information. An attention mechanism performs text global feature control, and the coupling coefficients c^(1)_{jm} and c^(2)_{jm} are calculated according to formula (3) as the salient weights of the global features:

c_{jm} = exp(b_{jm}) / Σ_m exp(b_{jm})    (3)

wherein b_{jm} is a feature value inside the convolved data, initialized to u′_{jm}. The text key region information features v^(1)_m and v^(2)_m are output according to the global pooling formula (4):

v_m = Σ_j c_{jm} · u′_{m|j}    (4)

The internal coefficients b_{jm} of the effective pooling are iteratively updated according to formula (5):

b_{jm} = b_{jm} + v_m · u′_{m|j}    (5)
6. Classification calculation. The text key region features v^(1) and v^(2) output by the 2 kernel convolutions of different widths are integrated into the final text information feature vector v through the splicing function f:

v = f(v^(1), v^(2))

The text information features are then passed through a one-layer feed-forward neural network FFNN(·) to output text category features, and softmax predicts the probability distribution ŷ over the multiple categories of the text:

ŷ = softmax(FFNN(v))
7. Model prediction. The model here is a 4-class model, with the labels sports, education, entertainment and music. The probability distribution ŷ calculated and output is {0.01, 0.91, 0.067, 0.013}; the value for the label corresponding to the education category is the largest, i.e. the sentence category result output by the model is education.
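The final label choice amounts to an argmax over this distribution; as a quick check:

    labels = ["sports", "education", "entertainment", "music"]
    probs = [0.01, 0.91, 0.067, 0.013]
    print(labels[probs.index(max(probs))])  # -> education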
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by suitable instruction execution devices.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, and the program may be stored in a computer readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, references to the description of terms "an embodiment," "some embodiments," "an example," "a specific example," or "an embodiment," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
In this specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims (17)

1. A system for implementing multi-language hybrid short text classification processing based on small sample learning, the system comprising:
the data acquisition module is used for inputting a small number of preset label samples into the system;
the data preprocessing module is connected with the data acquisition module and is used for carrying out data set division, data cleaning and batch processing on the preset label sample;
the model calculation processing module is connected with the data preprocessing module and used for extracting key features according to the text data acquired after preprocessing and generating a corresponding model accuracy calculation result; and
the model generation and output module is connected with the model calculation processing module and used for predicting the model prediction result of the current text data according to the model accuracy calculation result and further updating and iterating the output model through sampling and auditing of the model prediction result.
2. The system for implementing classification of multi-lingual hybrid short text based on small sample learning as claimed in claim 1, wherein the model calculation processing module specifically comprises:
the word information processing unit is connected with the data preprocessing module and is used for performing n-gram segmentation, word embedding and iterative processing of word sets on the small number of labeled text data samples obtained after batch processing;
the text characteristic embedding unit is connected with the word information processing unit and is used for combining the word information subjected to the iterative processing into a text integral characteristic to be used as the input of an effective convolution layer;
the text key region characteristic unit is connected with the text characteristic embedding unit and is used for acquiring text key characteristic information in the text overall characteristic;
the text type judging unit is connected with the text key area characteristic unit and is used for analyzing and calculating the classification type of the current input text; and
the model accuracy calculation unit is connected with the text type judgment unit and is used for calculating the model accuracy of the text information obtained after the text processing.
3. The system for implementing multi-lingual hybrid short text classification processing based on small sample learning of claim 2, wherein the model generation and output module specifically comprises:
the model prediction processing unit is used for inputting multi-language mixed short text data and performing model prediction;
the prediction result output unit is connected with the model prediction processing unit and is used for outputting a model prediction result; and
the sampling and auditing unit is connected with the prediction result output unit and is used for sampling and auditing the model prediction result so as to detect the accuracy of the prediction model.
4. The system for implementing classification processing of mixed short text with multiple languages based on small sample learning as claimed in claim 3, wherein the sampling auditing unit determines whether to update and calibrate according to the following rules through a system preset threshold value:
if the sampling accuracy of the audited text data is greater than the threshold value, new label data are added to the data acquisition module to perform iterative update processing of the model;
if the sampling accuracy of the audited text data is not greater than the threshold value, new label data are added to the data acquisition module after calibration processing to perform iterative update processing of the model.
5. A method for implementing multi-language hybrid short text classification processing based on small sample learning by using the system of claim 4, wherein the method comprises the following steps:
(1) acquiring text sub-word information from a multi-language mixed short text;
(2) carrying out data set division, data cleaning and batch operation pretreatment on the text sub-word information;
(3) performing text characteristic embedding on the preprocessed text sub-word information to obtain input information of an effective convolution layer;
(4) acquiring adjacent word information and text key area information of the text sub-word information by adopting different kernel convolutions;
(5) judging the category of the text according to probability distribution;
(6) performing classification model prediction and mining new text data information according to the category information, and updating and iterating the model.
6. The method for implementing a classification process of a multi-lingual mixed short text based on small sample learning as claimed in claim 5, wherein the step (3) comprises the following steps:
(3.1) searching for the word; if it is not found, searching for special sub-words before segmentation and entering step (3.2); if it is found, taking the word as a whole and entering step (3.3);
(3.2) if a special sub-word is found, segmenting according to the special sub-word and segmenting the remaining part according to the n-gram; otherwise segmenting directly according to the n-gram; the formed segments constitute the corresponding sub-word library; entering step (3.3);
(3.3) affine transforming the sub-word library formed after segmentation to the word-level representation, adding the newly represented word as a special sub-word to the sub-word set, and calculating the higher-layer sub-word representation.
7. The method for implementing multi-language mixed short text classification processing based on small sample learning as claimed in claim 6, wherein the higher-layer sub-word characterization is calculated according to the following formula:

u_{i|g} = W_{gi}·û_g, 1 ≤ i ≤ n;

where g is a sub-word, i indexes the i-th word in the sentence, W_{gi} is the data transformation matrix, θ_w is the set of words and phrases, û_g denotes the characterization of sub-word g, and u_{i|g} (1 ≤ i ≤ n) is the higher-layer representation of the sub-word.
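Illustrative note (not part of the claims): a minimal numpy sketch of the affine lift u_{i|g} = W_{gi}·û_g. The dimensions and the final mean-pooling into one word vector are assumptions; the claim only fixes the affine form.

```python
import numpy as np

rng = np.random.default_rng(0)
d_sub, d_word = 64, 128
W = rng.normal(0.0, 0.02, size=(d_word, d_sub))  # data transformation matrix W_gi

subword_reps = rng.normal(size=(5, d_sub))       # û_g for the 5 sub-words of word i
u_i = np.stack([W @ u for u in subword_reps])    # higher-layer representations u_{i|g}
word_rep = u_i.mean(axis=0)                      # assumed pooling into one word-level vector
print(word_rep.shape)                            # (128,)
```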
8. The method for implementing multi-language mixed short text classification processing based on small sample learning according to claim 7, wherein the step (4) comprises the following steps:
(4.1) combining the affine-transformed higher-layer sub-word representations u_{i|g} into the overall text feature U = [u_{1|g}, u_{2|g}, …, u_{n|g}], which serves as the input of the active convolutional layer;
(4.2) performing one-dimensional convolution on the text features with convolution kernels of different widths and different channel numbers to obtain global features containing different adjacent-word information; and
(4.3) controlling the text global features with a self-attention mechanism, thereby calculating and outputting the text key-region information features.
9. The method for implementing multi-language mixed short text classification processing based on small sample learning as claimed in claim 8, wherein the global features containing different adjacent-word information in step (4.2) are calculated according to the following formula:

V_{l+1}(U_l) = ReLU(Conv_{1×k}(U_l));

where k is the kernel width, l is the index of the convolution layer, ReLU is the activation function, U_l denotes the input feature data of the l-th layer kernel convolution, Conv_{1×k} denotes a convolution operation with kernel width k, and V_{l+1} denotes the global features after the l-th layer convolution.
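Illustrative note (not part of the claims): a PyTorch sketch of V_{l+1}(U_l) = ReLU(Conv_{1×k}(U_l)) with two assumed kernel widths k = 3 and k = 5; the sequence length, model dimension and channel count are illustrative.

```python
import torch
import torch.nn as nn

n_tokens, d_model = 32, 128
U = torch.randn(1, d_model, n_tokens)      # (batch, channels, length): overall text feature

convs = nn.ModuleList(
    nn.Conv1d(d_model, 64, kernel_size=k, padding=k // 2) for k in (3, 5)
)
# One global-feature map per kernel width, each carrying different
# adjacent-word information.
global_feats = [torch.relu(conv(U)) for conv in convs]
for V in global_feats:
    print(V.shape)                         # torch.Size([1, 64, 32])
```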
10. The method for implementing multi-language mixed short text classification processing based on small sample learning as claimed in claim 9, wherein the step (4.3) comprises the following steps:
(4.3.1) calculating the coupling coefficient c_{jm} according to the following formula as the saliency weight of the text global features:

c_{jm} = exp(b_{jm}) / ∑_{j′} exp(b_{j′m});

where j is the j-th row and m the m-th column of the convolved feature matrix, b_{jm} is the internal coupling coefficient over the convolved feature values of the text data, u′_{jm} is the feature value before convolution at initialization, and c_{jm} is the attention value at row j, column m of the convolved feature values;
(4.3.2) calculating the text key-region information features v_m according to the following global pooling formula, where u′_{m|j} is the pre-convolution feature of the j-th row:

v_m = ∑_j c_{jm}·u′_{m|j};

(4.3.3) iteratively updating the internal pooling coefficients b_{jm} according to the following formula:

b_{jm} = b_{jm} + v_m·u′_{m|j}.
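Illustrative note (not part of the claims): a numpy sketch of steps (4.3.1) to (4.3.3). The softmax form of c_{jm} and the use of three routing iterations are assumptions consistent with the iterative update rule for b_{jm}.

```python
import numpy as np

def key_region_pooling(u_prime: np.ndarray, n_iter: int = 3) -> np.ndarray:
    """u_prime: (J, M) matrix of pre-convolution feature values u'_{m|j}."""
    J, M = u_prime.shape
    b = np.zeros((J, M))                                      # internal coefficients b_{jm}
    for _ in range(n_iter):
        c = np.exp(b) / np.exp(b).sum(axis=0, keepdims=True)  # attention weights c_{jm}
        v = (c * u_prime).sum(axis=0)                         # v_m = sum_j c_{jm} * u'_{m|j}
        b = b + v[None, :] * u_prime                          # b_{jm} += v_m * u'_{m|j}
    return v                                                  # text key-region features

print(key_region_pooling(np.random.default_rng(1).random((8, 16))).shape)  # (16,)
```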
11. The method for implementing multi-language mixed short text classification processing based on small sample learning as claimed in claim 10, wherein the step (5) comprises the following steps:
(5.1) integrating the text key-region features v^{(1)} and v^{(2)}, output by the two kernel convolutions of different widths, into the text information feature vector v through the splicing function f, calculated according to the following formula:

v = f(v^{(1)}, v^{(2)});

(5.2) inputting the text information feature vector v into a feed-forward neural network FFNN(·) to output the text category features, and predicting the multi-category probability distribution ŷ of the text with a softmax function, calculated according to the following formula:

ŷ = softmax(FFNN(v)).
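Illustrative note (not part of the claims): a PyTorch sketch of step (5), with concatenation as the splicing function f, a small feed-forward network as FFNN(·), and a softmax over an assumed five categories; all sizes are illustrative.

```python
import torch
import torch.nn as nn

v1, v2 = torch.randn(64), torch.randn(64)   # key-region features from the two kernel widths
v = torch.cat([v1, v2])                     # splicing function f: v = f(v1, v2)

ffnn = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 5))
probs = torch.softmax(ffnn(v), dim=-1)      # multi-category probability distribution ŷ
print(probs.sum().item())                   # ~1.0: the class probabilities sum to one
```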
12. The method for implementing multi-language mixed short text classification processing based on small sample learning as claimed in claim 11, wherein the step (6) comprises the following steps:
(6.1) mining category information from unlabeled multi-language mixed short text data with the trained model and performing classification model prediction;
(6.2) checking the accuracy of the model output results by sampling and auditing; and
(6.3) expanding the new category data into the sample data set as labeled samples for updating and iterating the model.
13. The method for implementing multi-language mixed short text classification processing based on small sample learning according to claim 12, wherein the step (6.2) checks the accuracy of the model using the following auditing criteria:
(6.2.1) sampling 5% of the predicted data;
(6.2.2) judging each sampled prediction manually, with a score of 0/1;
(6.2.3) calculating the sampling accuracy according to the following formula:

accuracy = (1/N)·∑_{i=1}^{N} s_i;

where s_i is the manual 0/1 score of the i-th sampled prediction and N is the number of sampled predictions;
(6.2.4) setting a check threshold for comparison.
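Illustrative note (not part of the claims): a Python sketch of the audit in steps (6.2.1) to (6.2.3): sample 5% of the predictions, score each sampled item 0/1 by hand, and average the scores; names are hypothetical.

```python
import random

def sample_for_audit(predictions: list, rate: float = 0.05) -> list:
    """Step (6.2.1): draw a 5% random sample of the predicted data."""
    k = max(1, int(len(predictions) * rate))
    return random.sample(predictions, k)

def sampling_accuracy(manual_scores: list[int]) -> float:
    """Step (6.2.3): mean of the manual 0/1 judgments from step (6.2.2)."""
    return sum(manual_scores) / len(manual_scores)

print(sampling_accuracy([1, 1, 0, 1]))  # 0.75
```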
14. The method for implementing multi-language mixed short text classification processing based on small sample learning as claimed in claim 13, wherein the step (6.2.4) is specifically as follows:
if the sampling accuracy is higher than the threshold, expanding the new category data into the sample data set as labeled samples; if the sampling accuracy is lower than the threshold, amplifying the sampling proportion, calibrating the erroneous labels, and further training the model after the calibrated data are expanded into the labeled sample data set.
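Illustrative note (not part of the claims): a sketch of the fallback in claim 14, where a failed audit amplifies the sampling proportion before the erroneous labels are calibrated; the doubling factor is an assumption.

```python
def next_audit_rate(accuracy: float, threshold: float, rate: float) -> float:
    """Amplify the sampling proportion when the sampled accuracy fails the check."""
    if accuracy > threshold:
        return rate              # audit passed: keep the default proportion
    return min(1.0, rate * 2.0)  # audit failed: sample twice as much for calibration

print(next_audit_rate(0.80, 0.90, 0.05))  # 0.1: audit twice as much data
print(next_audit_rate(0.95, 0.90, 0.05))  # 0.05: keep the 5% audit
```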
15. An apparatus for implementing multi-language mixed short text classification processing based on small sample learning, wherein the apparatus comprises:
a processor configured to execute computer-executable instructions; and
a memory storing one or more computer-executable instructions that, when executed by the processor, implement the steps of the method for implementing multi-language mixed short text classification processing based on small sample learning of any one of claims 5 to 14.
16. A processor for implementing multi-language mixed short text classification processing based on small sample learning, wherein the processor is configured to execute computer-executable instructions which, when executed by the processor, implement the steps of the method for implementing multi-language mixed short text classification processing based on small sample learning according to any one of claims 5 to 14.
17. A computer-readable storage medium having stored thereon a computer program executable by a processor to implement the steps of the method for implementing multi-language mixed short text classification processing based on small sample learning according to any one of claims 5 to 14.
Priority Applications (1)

Application Number: CN202110886442.0A
Priority Date: 2021-08-03; Filing Date: 2021-08-03
Title: System, method and device for realizing multi-language mixed short text classification processing based on small sample learning, memory and storage medium thereof
Legal Status: Pending

Publications (1)

Publication Number: CN113535961A
Publication Date: 2021-10-22

Family Applications (1)

Family ID: 78090291
CN202110886442.0A (the present application), filed 2021-08-03, status: Pending

Country Status (1)

CN: CN113535961A (en)

Citations (3)

* Cited by examiner, † Cited by third party

US20190034823A1 *, priority 2017-07-27, published 2019-01-31, Getgo, Inc.: Real time learning of text classification models for fast and efficient labeling of training data and customization
CN110134786A *, priority 2019-05-14, published 2019-08-16, 南京大学: A short text classification method based on topic word vectors and convolutional neural networks
CN111428026A *, priority 2020-02-20, published 2020-07-17, 西安电子科技大学: Multi-label text classification processing method and system and information data processing terminal


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination