WO2022095682A1 - Training method for text classification model, text classification method, apparatus, device, storage medium, and computer program product - Google Patents
Training method for text classification model, text classification method, apparatus, device, storage medium, and computer program product
- Publication number: WO2022095682A1 (PCT/CN2021/124335)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- classification model
- samples
- language
- text classification
- Prior art date
Classifications
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F16/35 — Clustering; Classification (unstructured textual data)
- G06F16/316 — Indexing structures
- G06F16/353 — Clustering; Classification into predefined classes
- G06F18/2431 — Classification techniques relating to the number of classes; Multiple classes
- G06F40/58 — Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
Definitions
- the embodiments of the present application are based on, and claim priority to, the Chinese patent application with application number 202011217057.9 filed on November 4, 2020, the entire content of which is incorporated into the embodiments of the present application by reference.
- the present application relates to artificial intelligence technology, and in particular, to a training method for a text classification model, a text classification method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program product.
- Artificial intelligence is a comprehensive technology of computer science. By studying the design principles and implementation methods of various intelligent machines, it gives machines the functions of perception, reasoning, and decision-making. Artificial intelligence technology is a comprehensive subject covering a wide range of fields, such as natural language processing and machine learning/deep learning. With the development of technology, artificial intelligence will be applied in more fields and play an increasingly important role.
- the text classification model is one of the important applications in the field of artificial intelligence.
- the text classification model can identify the category to which the text belongs.
- Text classification models are widely used in news recommendation, intent recognition systems, etc., that is, text classification models are the basic components of these complex systems.
- the text classification model in the related art is aimed at a certain language; when migrating to other languages, it faces a shortage of labeled samples in those languages and cannot smoothly perform text classification tasks in them.
- Embodiments of the present application provide a text classification model training method, text classification method, apparatus, electronic device, computer-readable storage medium, and computer program product, which can automatically obtain cross-language text samples and improve the accuracy of text classification.
- the embodiment of the present application provides a training method for a text classification model, including:
- performing machine translation processing on a plurality of first text samples in a first language to obtain a plurality of second text samples corresponding to the plurality of first text samples one-to-one; training a first text classification model for a second language based on a plurality of third text samples in the second language and their corresponding category labels; performing a confidence-based screening process on the plurality of second text samples through the trained first text classification model; and training a second text classification model for the second language based on the second text samples obtained by the screening process; wherein the network depth of the second text classification model is greater than the network depth of the first text classification model.
- the embodiment of the present application provides a text classification method, including:
- obtaining text to be classified, wherein the text to be classified adopts a second language different from the first language; encoding the text to be classified through a second text classification model whose network depth is greater than that of the first text classification model, to obtain an encoding vector of the text to be classified; and performing nonlinear mapping on the encoding vector to obtain the category corresponding to the text to be classified;
- wherein the second text classification model is obtained by training with text samples of the second language screened by the first text classification model, and the text samples of the second language are obtained by performing machine translation on the text samples of the first language.
- the embodiment of the present application provides a training device for a text classification model, including:
- a translation module configured to perform machine translation processing on a plurality of first text samples in a first language to obtain a plurality of second text samples corresponding to the plurality of first text samples one-to-one;
- a first training module configured to train a first text classification model for the second language based on a plurality of third text samples in the second language and their corresponding category labels;
- a screening module configured to perform a confidence-based screening process on the plurality of second text samples through the trained first text classification model
- a second training module configured to train a second text classification model for the second language based on the second text samples obtained by the screening process; wherein the network depth of the second text classification model is greater than the network depth of the first text classification model.
- An embodiment of the present application provides a text classification device, including:
- an obtaining module configured to obtain the text to be classified; wherein, the text to be classified adopts a second language different from the first language;
- a processing module configured to encode the text to be classified through a second text classification model whose network depth is greater than that of the first text classification model, to obtain an encoding vector of the text to be classified, and to perform nonlinear mapping on the encoding vector to obtain the category corresponding to the text to be classified; wherein the second text classification model is obtained by training with text samples of the second language screened by the first text classification model, and the text samples of the second language are obtained by machine-translating the text samples of the first language.
- An embodiment of the present application provides an electronic device for training a text classification model, the electronic device comprising: a memory configured to store executable instructions; and
- a processor configured to, when executing the executable instructions stored in the memory, implement the text classification model training method or the text classification method provided by the embodiments of the present application.
- Embodiments of the present application provide a computer-readable storage medium storing executable instructions for causing a processor to execute the training method for a text classification model or a text classification method provided by the embodiments of the present application.
- Embodiments of the present application provide a computer program product, including computer programs or instructions, which, when executed by a processor, implement the text classification model training method or text classification method provided by the embodiments of the present application.
- the embodiments of the present application obtain second text samples in a second language different from the first language through machine translation, and filter the second text samples through the first text classification model, so as to automatically acquire cross-language text samples and relieve the pressure caused by the lack of text samples;
- the second text classification model is trained with the high-quality text samples obtained by the screening, so that the second text classification model can perform accurate text classification, improving the accuracy of text classification.
- FIG. 1 is a schematic diagram of an application scenario of a text classification system provided by an embodiment of the present application
- FIG. 2 is a schematic structural diagram of an electronic device for text classification model training provided by an embodiment of the present application
- FIGS. 3-5 are schematic flowcharts of a training method for a text classification model provided by an embodiment of the present application;
- FIG. 6 is a schematic flowchart of an iterative training provided by an embodiment of the present application.
- FIG. 7 is a schematic diagram of a hierarchical softmax provided by an embodiment of the present application.
- FIG. 8 is a schematic diagram of a cascaded encoder provided by an embodiment of the present application.
- FIG. 9 is a schematic diagram of a text set A and a text set B provided by an embodiment of the present application.
- FIG. 10 is a schematic diagram of a text set B1 provided by an embodiment of the present application.
- FIG. 11 is a schematic flowchart of active learning provided by an embodiment of the present application.
- FIG. 12 is a schematic flowchart of reinforcement learning provided by an embodiment of the present application.
- the terms "first/second" involved herein are only used to distinguish similar objects and do not represent a specific ordering of objects. It is understood that, where permitted, the specific order or sequence of "first/second" may be interchanged, so that the embodiments of the application described herein can be practiced in an order other than that illustrated or described herein.
- Convolutional Neural Networks: a class of Feedforward Neural Networks (FNN) that include convolution computations and have a deep structure; one of the representative algorithms of deep learning.
- Convolutional neural networks have representation learning capabilities and can perform shift-invariant classification of input images according to their hierarchical structure.
- Cross-language few-shot text classification: when migrating from an A-language scene to a B-language scene with only a small budget for B-language sample annotation, only a small amount of annotated B-language text together with a large amount of annotated A-language text is needed to realize large-scale annotation of B-language text; a text classification model is then trained on the large-scale annotated B-language text to perform B-language text classification.
- Cross-language zero-shot text classification: when migrating from an A-language scene to a B-language scene without budget (no labor, or a tight product promotion schedule), it is impossible to annotate B-language samples; that is, large-scale annotation of B-language text is realized only with the help of a large number of annotated A-language texts, and a text classification model is trained on the large-scale annotated B-language text to perform B-language text classification.
- Text classification is widely used in content-related products, such as news classification, article classification, intent classification, information flow products, forums, communities, e-commerce, etc.
- text classification is usually for texts in a certain language, such as Chinese or English, but when a product needs to expand its business into other languages, it encounters the problem of insufficient labeled text in the early stage.
- for example, when a product is promoted from the Chinese market to the English market, news in the English domain needs to be labeled quickly; when performing positive/negative sentiment analysis on the comments of Chinese users, as the number of users increases or the product is launched in overseas markets, there will be many non-Chinese comments, and these comments also need to be labeled with the corresponding sentiment polarity.
- the embodiments of the present application provide a text classification model training method, text classification method, apparatus, electronic device, computer-readable storage medium, and computer program product, which can automatically obtain cross-language text samples and improve the accuracy of text classification.
- the training method and text classification method of the text classification model provided by the embodiments of the present application can be implemented by the terminal or server alone, or collaboratively by the terminal and the server; for example, the terminal alone undertakes the training method of the text classification model described below, or the terminal sends a text classification request for a certain language to the server, and the server executes the training method of the text classification model according to the received request and performs the text classification task of that language based on the trained text classification model.
- the electronic device used for text classification model training may be various types of terminal devices or servers, where the server may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides cloud computing services; the terminal can be a smartphone, tablet computer, notebook computer, desktop computer, smart speaker, smart watch, vehicle-mounted device, etc., but is not limited thereto.
- the terminal and the server may be directly or indirectly connected through wired or wireless communication, which is not limited in this application.
- for example, a server can be a server cluster deployed in the cloud to provide artificial intelligence cloud services (AIaaS, AI as a Service) to users: the AIaaS platform splits several types of common AI services and provides independent or packaged services in the cloud. This service model is similar to an AI-themed mall, where all users can access one or more of the artificial intelligence services provided by the AIaaS platform through application programming interfaces.
- one of the artificial intelligence cloud services may be a text classification model training service, that is, a server in the cloud encapsulates the text classification model training program provided by the embodiment of the present application.
- the user invokes the text classification model training service in the cloud service through a terminal (running a client, such as a news client or a reading client), so that the server deployed in the cloud invokes the encapsulated text classification model training program: second text samples in a second language different from the first language are obtained from the first text samples of the first language through a machine translation model, the second text samples are screened by the first text classification model, the screened second text samples are used to train the second text classification model, and the trained model is used for text classification in subsequent news applications, reading applications, etc.
- for example, when the text is English news, the trained second text classification model (for English news classification) determines the category of each news item to be recommended, such as entertainment news or sports news; when the text is a Chinese article, the category of each article to be recommended is determined by the trained second text classification model (for Chinese article classification), such as chicken-soup-for-the-soul articles, legal articles, or educational articles, so that the articles to be recommended are screened based on their categories to obtain articles for recommendation, and the selected articles are displayed to users to achieve targeted article recommendation.
- FIG. 1 is a schematic diagram of an application scenario of the text classification system 10 provided by the embodiment of the present application.
- the terminal 200 is connected to the server 100 through a network 300, and the network 300 may be a wide area network or a local area network, or a combination of the two.
- the terminal 200 (running a client, such as a news client) can be used to obtain text to be classified in a certain language. For example, a developer inputs text to be classified in a certain language through the terminal, and the terminal automatically obtains a text classification request for a certain language.
- a text classification model training plug-in may be embedded in the client running in the terminal, so as to implement the training method of the text classification model locally on the client. For example, after acquiring text to be classified in a second language different from the first language, the terminal 200 invokes the text classification model training plug-in to implement the training method of the text classification model: second text samples (in the second language) corresponding to the first text samples (in the first language) are obtained through a machine translation model, the second text samples are screened by the first text classification model, the second text classification model is trained with the screened second text samples, and the trained model is used for text classification in subsequent news applications, reading applications, etc.
- after the terminal 200 requests text classification in a certain language, it calls the text classification model training interface of the server 100 (which can be provided in the form of a cloud service, that is, a text classification model training service); the server 100 obtains, through a machine translation model, the second text samples (in the second language) corresponding to the first text samples (in the first language), screens the second text samples through the first text classification model, trains the second text classification model with the screened second text samples, and performs text classification based on the trained second text classification model for subsequent news applications, reading applications, and the like.
- FIG. 2 is a schematic structural diagram of the electronic device 500 for text classification model training provided by the embodiment of the present application.
- 500 is a server for illustration.
- the electronic device 500 for text classification model training shown in FIG. 2 includes: at least one processor 510 , memory 550 , at least one network interface 520 and user interface 530 .
- the various components in electronic device 500 are coupled together by bus system 540 .
- bus system 540 is used to implement the connection communication between these components.
- the bus system 540 also includes a power bus, a control bus and a status signal bus.
- the various buses are labeled as bus system 540 in FIG. 2 .
- the processor 510 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc., where a general-purpose processor may be a microprocessor or any conventional processor or the like.
- Memory 550 includes volatile memory or non-volatile memory, and may also include both volatile and non-volatile memory.
- the non-volatile memory may be a read-only memory (ROM, Read Only Memory), and the volatile memory may be a random access memory (RAM, Random Access Memory).
- the memory 550 described in the embodiments of the present application is intended to include any suitable type of memory.
- Memory 550 optionally includes one or more storage devices that are physically remote from processor 510 .
- memory 550 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
- the operating system 551 includes system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
- the apparatus for training the text classification model provided by the embodiments of the present application may be implemented in software, for example, as the text classification model training plug-in in the terminal described above, or as the text classification model training service in the server described above.
- the apparatus for training the text classification model provided by the embodiments of the present application may be provided as various software embodiments, including various forms such as application programs, software, software modules, scripts, or code.
- FIG. 2 shows a training apparatus 555 for a text classification model stored in the memory 550, which can be software in the form of programs and plug-ins, such as a text classification model training plug-in, and includes a series of modules: a translation module 5551, a first training module 5552, a screening module 5553, and a second training module 5554; these modules are used to implement the training functions of the text classification model provided by the embodiments of the present application.
- FIG. 3 is a schematic flowchart of a training method for a text classification model provided by an embodiment of the present application, which is described in conjunction with the steps shown in FIG. 3.
- the network depth of the second text classification model is greater than the network depth of the first text classification model, that is, the text classification capability of the second text classification model is stronger than that of the first text classification model; therefore, the number of text samples required for training the second text classification model is greater than the number required for training the first text classification model.
- the first text sample is in a first language
- the second text sample and the third text sample are in a second language different from the first language
- for example, the first text samples are Chinese samples, and the second text samples and the third text samples are English samples.
- step 101 a machine translation process is performed on a plurality of first text samples in a first language to obtain a plurality of second text samples corresponding to the plurality of first text samples one-to-one.
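As a minimal illustration of this translation step, the following Python sketch builds the second text samples from labeled first text samples; the `translate` helper is an assumption standing in for any machine translation model or service, and the category label of each source sample is carried over to its translation:

```python
# Sketch of step 101, assuming a hypothetical translate() helper
# backed by some machine translation model (not specified here).
from typing import Callable, List, Tuple

def build_second_text_samples(
    first_samples: List[Tuple[str, str]],   # (text in language A, category label)
    translate: Callable[[str], str],        # hypothetical: language A -> language B
) -> List[Tuple[str, str]]:
    """Translate each first-language sample one-to-one; the category
    label of the source sample is inherited by its translation."""
    return [(translate(text), label) for text, label in first_samples]
```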
- the terminal automatically obtains the text classification request for the second language and sends it to the server, and the server receives the text classification request in the second language.
- in step 102, a first text classification model for the second language is trained based on the plurality of third text samples in the second language and their corresponding category labels.
- there is no fixed order of execution between step 101 and step 102; they may be performed in any order.
- after the server receives the text classification request in the second language, it obtains a small number of labeled third text samples from the sample library and trains the first text classification model with the multiple third text samples and their corresponding category labels, so that the trained first text classification model can perform text classification in the second language.
- in some embodiments, training a first text classification model for the second language based on the plurality of third text samples in the second language and their corresponding category labels includes: performing the t-th training of the first text classification model based on the plurality of third text samples in the second language and the corresponding category labels; performing, by the t-th-trained first text classification model, the t-th confidence-based screening process on the multiple second text samples; performing the (t+1)-th training of the first text classification model based on the results of the previous t screening processes, the multiple third text samples, and the corresponding category labels; and using the T-th-trained first text classification model as the trained first text classification model; wherein t is a sequentially increasing positive integer with value range 1 ≤ t ≤ T-1, and T is an integer greater than 2 representing the total number of training iterations.
- in this way, the first text classification model is iteratively trained, so that more high-quality second text samples can be filtered out by the gradually optimized first text classification model for the subsequent augmented training of the second text classification model.
- for example, the first training of the first text classification model is performed, and the first-trained first text classification model performs the first confidence-based screening process on the second text samples; based on the first screening results, the multiple third text samples, and the corresponding category labels, the first text classification model is trained for the second time; the second-trained first text classification model then performs the second confidence-based screening process on the second text samples other than the first screening results; based on the first and second screening results, the third text samples, and the corresponding category labels, the first text classification model is trained for the third time; and the above training process is iterated until the T-th training is performed, with the T-th-trained first text classification model used as the trained first text classification model.
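A compact Python sketch of this iterative train-screen loop, under the assumption of hypothetical `train` and `predict_confidence` helpers and a confidence threshold `tau` (none of these names come from the patent):

```python
# Sketch of the iterative training described above. labeled_b holds the
# small set of (text, label) pairs in language B (third text samples);
# translated_b1 holds the machine-translated pairs (second text samples).

def iterative_training(labeled_b, translated_b1, T, tau, train, predict_confidence):
    selected, remaining = [], list(translated_b1)
    model = train(labeled_b)                       # 1st training
    for t in range(1, T):                          # t-th screening, (t+1)-th training
        kept = [(x, y) for x, y in remaining
                if predict_confidence(model, x, y) > tau]
        if not kept:                               # no more high-confidence samples
            break
        selected += kept
        remaining = [s for s in remaining if s not in kept]
        model = train(labeled_b + selected)        # retrain on the enlarged set
    return model, selected
```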
- FIG. 4 is an optional schematic flowchart of a training method for a text classification model provided by an embodiment of the present application.
- FIG. 4 shows that step 102 in FIG. 3 can be implemented by steps 1021 to 1023 shown in FIG. 4: in step 1021, prediction processing is performed on the plurality of third text samples in the second language through the first text classification model to obtain the confidences of the predicted categories corresponding to the third text samples; in step 1022, the loss function of the first text classification model is constructed based on the confidence of the predicted category and the category label of each third text sample; in step 1023, the parameters of the first text classification model are updated until the loss function converges, and the updated parameters at convergence are used as the parameters of the trained first text classification model.
- the error signal of the first text classification model is determined based on the loss function of the first text classification model, the error information is back-propagated in the first text classification model, and the model parameters of each layer are updated during the propagation.
- the training sample data is input into the input layer of the neural network model, passes through the hidden layer, and finally reaches the output layer and outputs the result.
- this is the forward propagation process of the neural network model. If there is an error between the output result and the actual result, the error is calculated and propagated back from the output layer through the hidden layers until it reaches the input layer; during back propagation, the values of the model parameters are adjusted according to the error, and the above process is iterated until convergence.
- the first text classification model belongs to a neural network model.
- in some embodiments, performing prediction processing on a plurality of third text samples in the second language by the first text classification model to obtain the confidences of the predicted categories corresponding to the third text samples includes: performing the following processing on any third text sample through the first text classification model: encoding the third text sample to obtain an encoding vector of the third text sample; fusing the encoding vector of the third text sample to obtain a fusion vector; and performing nonlinear mapping on the fusion vector to obtain the confidence of the predicted category corresponding to the third text sample.
- the first text classification model is a fast text classification model (fasttext).
- the first text classification model in this embodiment of the present application is not limited to fasttext.
- Fasttext includes an input layer, a hidden layer, and an output layer.
- a small number of third text samples can quickly train fasttext, enabling it to quickly perform text classification tasks in the second language.
- the third text sample is encoded through the input layer to obtain its encoding vector; the encoding vector is then fused through the hidden layer to obtain the fusion vector; finally, the fusion vector is nonlinearly mapped through the output layer, that is, mapped through an activation function (e.g., softmax), to obtain the confidence of the predicted category corresponding to the third text sample.
- in some embodiments, the first text classification model includes multiple cascaded activation layers; performing nonlinear mapping on the fusion vector to obtain the confidence of the predicted category corresponding to the third text sample includes: performing, through the first of the multiple cascaded activation layers, the mapping processing of the first activation layer on the fusion vector; outputting the mapping result of the first activation layer to the subsequent cascaded activation layers, which continue the mapping processing and mapping-result output until the last activation layer; and using the activation result output by the last activation layer as the confidence of the predicted category corresponding to the third text sample.
- the activation operation is performed through hierarchical softmax, which avoids obtaining the confidence of the predicted category in a single activation operation and instead uses multi-layer activation operations, thereby reducing the computational complexity.
- the hierarchical softmax includes T-layer activation layers, and each activation layer performs a hierarchical softmax operation.
- the fusion vector is mapped through the first activation layer to obtain the first mapping result; the first mapping result is mapped through the second activation layer to obtain the second mapping result; and so on, until the T-th activation layer is reached, whose activation result is used as the confidence of the predicted category corresponding to the third text sample, where T is the total number of activation layers.
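For reference, one common formulation of hierarchical softmax is the binary-tree variant used by fastText, in which a category's confidence is the product of sigmoid decisions along the path from the root to that category's leaf, so each layer of activation evaluates one node instead of the full softmax. The sketch below assumes the per-node vectors and tree paths are given:

```python
# Binary-tree hierarchical softmax sketch; node vectors and paths are assumptions.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaf_probability(h, path):
    """h: fusion vector; path: list of (node_vector, branch) pairs from the
    root to one category's leaf, with branch = +1 (left) or -1 (right).
    The category confidence is the product of the branch decisions."""
    p = 1.0
    for node_vec, branch in path:
        p *= sigmoid(branch * np.dot(node_vec, h))
    return p
```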
- in some embodiments, encoding the third text sample to obtain its encoding vector includes: performing window sliding processing on the third text sample to obtain multiple segment sequences, wherein the size of the window is N and N is a natural number; mapping the multiple segment sequences based on the vocabulary to obtain the sequence vectors corresponding to the segment sequences; and combining the sequence vectors to obtain the encoding vector of the third text sample.
- for example, window sliding processing is performed on the third text sample to obtain multiple segment sequences, including: performing the following processing on the i-th character of the third text sample: obtaining the i-th to (i+N-1)-th characters of the third text sample and combining them into one segment sequence, where 0 < i ≤ M-N+1 and M is the number of characters in the third text sample (a natural number). In this way, better encoding vectors can be generated for rare words: even if a word does not appear in the training corpus, its encoding vector can still be constructed from character-granularity windows. Such windows also allow the first text classification model to learn part of the local word-order information, so that word-order information is retained during training.
- in some embodiments, window sliding processing is performed on the third text sample to obtain multiple segment sequences, including: performing the following processing on the j-th word of the third text sample: obtaining the j-th to (j+N-1)-th words of the third text sample and combining them into one segment sequence, where 0 < j ≤ K-N+1 and K is the number of words in the third text sample (a natural number).
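A Python sketch of the size-N sliding window and the vocabulary-based encoding; the averaging combination, the vocabulary dictionary, and the embedding dimension are assumptions, not details fixed by the patent:

```python
# Sliding-window n-gram extraction and encoding sketch.
import numpy as np

def ngram_sequences(tokens, N):
    """All length-N windows; for M tokens this yields M - N + 1 sequences."""
    return [tuple(tokens[i:i + N]) for i in range(len(tokens) - N + 1)]

def encode(tokens, N, vocab, dim=100):
    """Map each window to a sequence vector via the vocabulary, then
    combine (here: average) the sequence vectors into one encoding vector."""
    vectors = [vocab.get(seq, np.zeros(dim)) for seq in ngram_sequences(tokens, N)]
    return np.mean(vectors, axis=0) if vectors else np.zeros(dim)
```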
- step 103 a confidence-based screening process is performed on the plurality of second text samples by using the trained first text classification model.
- the trained first text classification model can perform confidence-based screening on the multiple second text samples, so as to filter out high-quality second text samples that are then used to train the second text classification model.
- in some embodiments, performing confidence-based screening on a plurality of second text samples through the trained first text classification model includes: performing the following processing on any second text sample of the plurality of second text samples: performing prediction processing on the second text sample through the trained first text classification model to obtain the confidences of the multiple predicted categories corresponding to the second text sample; determining the category label of the first text sample corresponding to the second text sample as the category label of the second text sample; and, based on the confidences of the multiple predicted categories and the category label of the second text sample, using the second text samples that exceed the confidence threshold as the second text samples obtained by the screening process.
- for example, the second text sample is encoded by the trained first text classification model to obtain its encoding vector; the encoding vector is fused to obtain the fusion vector; the fusion vector is nonlinearly mapped to obtain the confidences of the multiple predicted categories corresponding to the second text sample; the predicted category that matches the category label of the second text sample is determined from the multiple predicted categories; and when the confidence of the matching predicted category exceeds the confidence threshold, the second text sample is used as a second text sample obtained by the screening process.
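A minimal sketch of this screening rule, assuming a hypothetical `predict_proba` method that returns a `{category: confidence}` mapping and an assumed threshold value:

```python
# Keep a translated sample only when the trained weak classifier is
# confident about the label it inherited from its source sample.

def screen(samples, model, threshold=0.9):    # threshold value is an assumption
    kept = []
    for text, label in samples:               # label inherited from the source sample
        confidences = model.predict_proba(text)
        if confidences.get(label, 0.0) > threshold:
            kept.append((text, label))
    return kept
```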
- step 104 a second text classification model for the second language is trained based on the second text samples obtained by the screening process.
- after the server selects a large number of high-quality second text samples through the trained first text classification model, automatic construction of cross-language text samples is realized (that is, each second text sample in the second language carries the category label of the corresponding first text sample, so no manual labeling is needed); the second text classification model is then trained on the large number of high-quality second text samples, so that the trained second text classification model can accurately perform text classification in the second language, improving the accuracy of text classification in the second language.
- it should be noted that the embodiments of the present application may also train the second text classification model using only the second text samples obtained by the screening process.
- after the training of the second text classification model is completed, text classification is performed on the text to be classified: the text to be classified is encoded by the trained second text classification model to obtain its encoding vector, and the encoding vector is nonlinearly mapped to obtain the category corresponding to the text to be classified; subsequent news applications, reading applications, etc. can then use the category corresponding to the text to be classified.
- FIG. 5 is an optional schematic flowchart of a training method for a text classification model provided by an embodiment of the present application.
- FIG. 5 shows that step 104 in FIG. 3 can be implemented through steps 1041 to 1043 shown in FIG. 5 .
- in step 1041, the distribution of the screened second text samples across multiple categories is determined; in step 1042, when the distribution of the screened second text samples across the multiple categories satisfies the distribution equilibrium condition and the number of samples in each category exceeds the corresponding category-number threshold, text samples corresponding to the category-number threshold are randomly selected from the samples of each category among the screened second text samples to construct a training set; in step 1043, a second text classification model for the second language is trained based on the training set.
- after the server obtains a large number of second text samples for training the second text classification model, the distribution of the screened second text samples across multiple categories is analyzed to determine whether the distribution equilibrium condition is satisfied, that is, the jitter of the sample counts of different categories is measured, for example, using the mean square error; the larger the jitter, the more uneven the distribution of text samples across categories.
- when the distribution of the screened second text samples across multiple categories satisfies the distribution equilibrium condition and the number of samples in each category exceeds the category-number threshold, text samples corresponding to the category-number threshold are extracted from the samples of each category to construct the training set, thereby improving the accuracy of text classification.
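The balance check and per-category sampling might look as follows; using the variance of per-category counts as the jitter measure follows the description above, while `max_jitter` and `class_threshold` are assumed hyperparameters:

```python
# Balance check and per-category sampling sketch.
import random
from collections import defaultdict

def build_training_set(samples, class_threshold, max_jitter):
    by_class = defaultdict(list)
    for text, label in samples:
        by_class[label].append((text, label))
    counts = [len(v) for v in by_class.values()]
    mean = sum(counts) / len(counts)
    jitter = sum((c - mean) ** 2 for c in counts) / len(counts)  # mean square error
    if jitter <= max_jitter and all(c > class_threshold for c in counts):
        train = []
        for items in by_class.values():       # same number drawn per category
            train += random.sample(items, class_threshold)
        return train
    return None                               # fall back to synonym expansion (below)
```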
- in some embodiments, training a second text classification model for the second language based on the second text samples obtained by the screening process includes: when the distribution of the screened second text samples across multiple categories does not satisfy the distribution equilibrium condition, performing synonym-based expansion on the second text samples of the under-represented categories, so that the distribution of the expanded second text samples across the categories satisfies the distribution equilibrium condition; constructing a training set based on the expanded second text samples; and training a second text classification model for the second language based on the training set.
- for example, when the number of samples in a category does not exceed the corresponding category-number threshold, synonym-based expansion is performed on the second text samples of that category, so that the number of expanded second text samples in each category exceeds the corresponding category-number threshold; a training set is then constructed based on the second text samples obtained by the expansion process.
- the specific expansion process is as follows: for any text sample among the multiple third text samples and the screened second text samples, match the words in the text sample against a synonym dictionary (containing the correspondences between synonyms) to obtain the matching words corresponding to the words in the text sample; replace the words in the text sample with the matching words to obtain a new text sample; and use the category label of the original text sample as the category label of the new text sample. Through synonym replacement, the text samples of the second language can be greatly expanded for training the second text classification model.
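A sketch of this synonym-based expansion, assuming a synonym dictionary that maps a word to a list of its synonyms; whitespace tokenization is an assumption suitable for languages such as English:

```python
# Synonym-replacement augmentation sketch; the new sample inherits the
# category label of the sample it was derived from.
import random

def expand_with_synonyms(samples, synonyms, copies=1):
    expanded = []
    for text, label in samples:
        for _ in range(copies):
            words = text.split()
            new_words = [random.choice(synonyms[w]) if w in synonyms else w
                         for w in words]
            expanded.append((" ".join(new_words), label))  # inherit category label
    return expanded
```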
- in some embodiments, training a second text classification model for the second language based on the second text samples obtained by the screening process includes: constructing a training set based on the plurality of third text samples and the screened second text samples, and training a second text classification model for the second language based on the training set.
- in some embodiments, constructing a training set based on the plurality of third text samples and the screened second text samples includes: traversing each category of the screened second text samples and performing the following processing: when the number of second text samples in a category is lower than the category-number threshold of that category, randomly selecting third text samples of that category from the plurality of third text samples to supplement the second text samples of that category, so as to update the screened second text samples; and constructing the training set based on the updated screened second text samples.
- when the number of second text samples in a category is lower than that category's category-number threshold, there are relatively few text samples of that category; third text samples of that category can be randomly selected from the multiple third text samples and supplemented into the second text samples of that category, so as to update the screened second text samples and make the samples of that category more sufficient.
- in some embodiments, the number of training samples can be matched to the computing power of the second text classification model: before training the second text classification model based on the screened second text samples, the target number of samples matching the available computing power is determined according to the correspondence between the computing power of the text classification model and the number of text samples that can be processed per unit time; text samples corresponding to the target number are then selected from the training set constructed from the screened second text samples and used as training samples for the second text classification model for the second language.
- in some embodiments, training a second text classification model for the second language based on the second text samples obtained by the screening process includes: performing prediction processing on the screened second text samples through the second text classification model to obtain the predicted categories corresponding to the screened second text samples; constructing the loss function of the second text classification model based on the predicted categories and the corresponding category labels; and updating the parameters of the second text classification model until the loss function converges, using the updated parameters at convergence as the parameters of the trained second text classification model.
- for example, after the value of the loss function of the second text classification model is determined based on the predicted categories of the screened second text samples and the corresponding category labels, it can be determined whether the value exceeds a preset threshold; when it does, the error signal of the second text classification model is determined based on the loss function, the error information is back-propagated in the second text classification model, and the model parameters of each layer are updated during the propagation.
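A minimal PyTorch-style sketch of this update-until-convergence loop; the data loader, learning rate, and convergence tolerance `eps` are assumptions, and the loader is assumed to yield tensors the model accepts:

```python
# Train the second text classification model until the loss converges.
import torch

def train_until_convergence(model, loader, lr=1e-4, eps=1e-4, max_epochs=50):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.CrossEntropyLoss()
    prev = float("inf")
    for _ in range(max_epochs):
        total = 0.0
        for texts, labels in loader:          # screened second text samples + labels
            optimizer.zero_grad()
            loss = criterion(model(texts), labels)
            loss.backward()                   # back-propagate the error signal
            optimizer.step()                  # update the parameters of each layer
            total += loss.item()
        if abs(prev - total) < eps:           # loss has converged
            break
        prev = total
    return model
```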
- in some embodiments, the second text classification model includes a plurality of cascaded encoders; performing prediction processing on the screened second text samples through the second text classification model to obtain their predicted categories includes: performing the following processing on any text sample among the screened second text samples: encoding the text sample through the first of the multiple cascaded encoders; outputting the encoding result of the first encoder to the subsequent cascaded encoders, which continue the encoding processing and encoding-result output until the last encoder; using the encoding result output by the last encoder as the encoding vector of the text sample; and performing nonlinear mapping on the encoding vector to obtain the predicted category corresponding to the text sample.
- by performing encoding operations with cascaded encoders, rich feature information of the text samples can be extracted. For example, the first encoder encodes the text sample to obtain the first encoding result; the first encoding result is output to the second encoder, which encodes it to obtain the second encoding result; and so on until the S-th encoder, where S is the total number of encoders. Finally, the encoding vector of the text sample is nonlinearly mapped to obtain the predicted category corresponding to the text sample. In general, the y-th encoder receives the output of the (y-1)-th encoder, where y is a sequentially increasing positive integer with value range 2 ≤ y ≤ H-1, and H is an integer greater than 2 representing the number of cascaded encoders.
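A sketch of a cascaded-encoder classifier in PyTorch, using stacked Transformer encoder layers as one possible instantiation (the patent names BERT as an example of the strong classifier); all layer sizes here are assumptions:

```python
# Cascaded-encoder classifier sketch: each encoder's output feeds the next.
import torch
import torch.nn as nn

class CascadedEncoderClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, d_model=256, nhead=4, num_encoders=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        # num_encoders cascaded encoders applied in sequence
        self.encoders = nn.TransformerEncoder(layer, num_layers=num_encoders)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, token_ids):              # token_ids: (batch, seq_len)
        x = self.encoders(self.embed(token_ids))
        pooled = x.mean(dim=1)                  # encoding vector of the text
        return self.classifier(pooled)          # logits; softmax applied in the loss
```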
- after training, text classification in the second language is performed through the trained second text classification model as follows: obtain the text to be classified, wherein the text to be classified adopts a second language different from the first language; encode the text to be classified through the second text classification model, whose network depth is greater than that of the first text classification model, to obtain the encoding vector of the text to be classified; and perform nonlinear mapping on the encoding vector to obtain the category corresponding to the text to be classified; wherein the second text classification model is obtained by training with the second-language text samples screened by the first text classification model, and the second-language text samples are obtained by machine-translating the first-language text samples.
- in some embodiments, the second text classification model includes a plurality of cascaded encoders, and the following processing is performed on the text to be classified: the text to be classified is encoded through the first of the multiple cascaded encoders; the encoding result of the first encoder is output to the subsequent cascaded encoders, which continue the encoding processing and encoding-result output until the last encoder; the encoding result output by the last encoder is used as the encoding vector of the text to be classified; and the encoding vector is nonlinearly mapped to obtain the category corresponding to the text to be classified.
- in this way, rich feature information of the text to be classified can be extracted. For example, the first encoder encodes the text to be classified to obtain the first encoding result; the second encoder encodes the first encoding result to obtain the second encoding result; and so on until the S-th encoder, where S is the total number of encoders. Finally, the encoding vector of the text to be classified is nonlinearly mapped to obtain its category. In general, the y-th encoder receives the output of the (y-1)-th encoder, where y is a sequentially increasing positive integer with value range 2 ≤ y ≤ H-1, and H is an integer greater than 2 representing the number of cascaded encoders.
- Text classification is widely used in content-related products, such as news classification, article classification, intent classification, information flow products, forums, communities, e-commerce, etc., so as to perform text recommendation and emotional guidance based on the categories of text classification.
- in the related art, text classification is for texts in a certain language, such as Chinese or English, but products need to expand business into other languages. For example, when a news reading product is promoted from the Chinese market to the English market, news can be recommended based on the labels of English news while users read, so as to recommend English news that matches users' interests; when performing positive/negative sentiment analysis on the comments of Chinese users, after the product is promoted to overseas markets, users can be appropriately guided based on the labels of English comments when they post, so as to avoid negative sentiment.
- the method includes three parts, namely A) data preparation, B) algorithm framework, and C) prediction:
- the embodiments of the present application are aimed at the situation where even a large number of unlabeled samples is unavailable, so it is impossible to train a large-scale pre-training model to extract text content.
- in data preparation, a large text set A (Text A, the text set in language A, containing the first text samples) and a small text set B (Text B, the text set in language B, containing the third text samples) are prepared, where both Text A and Text B are samples with category annotations; Text B has only a small number of annotations, so its proportion is very small.
- the algorithm framework in the embodiment of the present application includes: 1) sample enhancement, 2) active learning, and 3) enhanced training.
- the sample enhancement, active learning, and enhanced training are described in detail below:
- each text X_A in language A in Text A is converted into a text in language B, forming the corresponding text set B1 (Text B1, the text set in language B formed by translation).
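For illustration, the sample-enhancement step could be sketched as follows, using the open-source MarianMT checkpoint `Helsinki-NLP/opus-mt-zh-en` from Hugging Face `transformers` as a stand-in for the patent's machine translation model; the toy data and label handling are assumptions.

```python
# A sketch of translating labeled Text A (language A) into Text B1 (language B);
# each translated sample keeps the category label of its source text.
from transformers import MarianMTModel, MarianTokenizer

MODEL_NAME = "Helsinki-NLP/opus-mt-zh-en"  # illustrative zh -> en checkpoint
tokenizer = MarianTokenizer.from_pretrained(MODEL_NAME)
model = MarianMTModel.from_pretrained(MODEL_NAME)

def translate_batch(texts_a):
    """Translate each text X_A in Text A into language B."""
    inputs = tokenizer(texts_a, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(**inputs)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)

sources = ["这部电影很好看", "服务太差了"]            # toy Text A
labels = ["positive", "negative"]                    # category annotations of Text A
text_b1 = list(zip(translate_batch(sources), labels))  # toy Text B1
```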
- Step 1: a weak classifier (the first text classification model), for example a shallow classifier such as fastText, is trained based on Text B and applied to predict the samples in Text B1, so as to select samples whose confidence is higher than a specified confidence threshold.
- Step 2: these high-confidence, labeled samples form a new training sample set (text set B1', Text B1'). Based on Text B1' and Text B, the weak classifier continues to be trained; after the training is completed, Step 1 is repeated, and the weak classifier is applied to the remaining samples of Text B1 (the remaining samples refer to the text remaining after the high-confidence samples are selected from Text B1).
- Step 3: this is repeated until the confidence obtained by predicting the samples in Text B1 can no longer exceed the specified confidence threshold, that is, the remaining samples of Text B1 are all considered to be of poor quality, at which point the iterative training stops.
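A minimal sketch of Steps 1-3 follows, assuming scikit-learn's TF-IDF plus logistic regression as the shallow weak classifier in place of fastText; the confidence threshold and the use of predicted labels are illustrative (an implementation may instead inherit each translated sample's label from its source, as in the screening described later).

```python
# A self-training sketch: train the weak classifier, select high-confidence
# samples from Text B1, retrain, and stop when nothing qualifies.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

CONF_THRESHOLD = 0.9  # the "specified confidence threshold"; illustrative

def self_train(text_b, labels_b, text_b1):
    """text_b/labels_b: the small annotated Text B; text_b1: translated pool."""
    train_x, train_y = list(text_b), list(labels_b)
    pool = list(text_b1)
    selected = []                                   # accumulates Text B1'
    clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(train_x, train_y)                       # Step 1: train weak classifier
    while pool:
        probs = clf.predict_proba(pool)
        conf = probs.max(axis=1)
        keep = conf >= CONF_THRESHOLD               # high-confidence samples
        if not keep.any():                          # Step 3: stop the iteration
            break
        preds = clf.classes_[probs.argmax(axis=1)]
        for text, label in zip(np.asarray(pool)[keep], preds[keep]):
            selected.append((text, label))          # extend Text B1'
            train_x.append(text)
            train_y.append(label)
        pool = list(np.asarray(pool)[~keep])        # Step 2: keep the remainder
        clf.fit(train_x, train_y)                   # retrain on Text B1' + Text B
    return selected, clf
```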
- the Text B1' and Text B obtained in the above steps are mixed together, and a strong classifier (the second text classification model) is then trained, such as a deep network like BERT (Bidirectional Encoder Representations from Transformers).
- the trained strong classifier is used as the final text classification model for text classification of language B.
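For illustration, fine-tuning a BERT checkpoint as the strong classifier might look as follows; the checkpoint name, toy data, and hyperparameters are assumptions, not the patent's settings.

```python
# A sketch of training the strong classifier on the mixture of Text B1' and Text B.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

train_set = [("great movie", 1), ("terrible service", 0)]  # toy Text B1' + Text B
loader = DataLoader(train_set, batch_size=2, shuffle=True)

model.train()
for epoch in range(3):
    for texts, labels in loader:
        batch = tokenizer(list(texts), return_tensors="pt",
                          padding=True, truncation=True)
        out = model(**batch, labels=labels)  # cross-entropy loss computed internally
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```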
- the strong classifier obtained through training can quickly label English news, so that news recommendation can present English news matching the user's interests; similarly, when positive/negative sentiment analysis is performed on comments and the product is launched in an overseas market (language B), there will be many non-Chinese comments, that is, English comments, and the strong classifier obtained by training can quickly attach the corresponding emotional labels to these English comments; when a user comments, the user's emotions can then be properly guided based on the label of the English comment, so as to prevent the user from remaining in continuous negative emotions.
- the training method and text classification method of the text classification model in the embodiments of the present application obtain second text samples in language B different from language A through a machine translation model and screen the second text samples through a weak classifier; the automatic acquisition of cross-language text samples reduces the pressure caused by the lack of text samples, and the high-quality text samples obtained by screening are used to train a strong classifier, so that the strong classifier can perform accurate text classification and the accuracy of text classification is improved.
- each functional module in the training device for a text classification model may be implemented collaboratively by hardware resources of an electronic device (such as a terminal device, a server, or a server cluster), for example computing resources such as a processor, communication resources (for example, supporting communications in various ways such as optical cable and cellular), and memory.
- a training device 555 for a text classification model stored in the memory 550 may be software in the form of programs and plug-ins, for example, software modules designed in programming languages such as C/C++ or Java, application software designed in such programming languages, or dedicated software modules, application program interfaces, plug-ins, or cloud services in a large-scale software system. Examples of different implementations are described below.
- Example 1 The training device of the text classification model is a mobile application and module
- the training device 555 of the text classification model in the embodiment of the present application may be provided as a software module designed in a programming language such as C/C++ or Java and embedded in various mobile applications based on systems such as Android or iOS (the executable instructions are stored in the storage medium of the mobile terminal and executed by the processor of the mobile terminal), so that the computing resources of the mobile terminal are used directly to complete the relevant information recommendation tasks, and the processing results are transmitted to a remote server periodically or aperiodically through various network communication methods, or saved locally on the mobile terminal.
- Example 2 The training device of the text classification model is a server application and a platform
- the training device 555 of the text classification model in this embodiment of the present application may be provided as application software designed in programming languages such as C/C++ or Java, or as a dedicated software module in a large-scale software system, running on the server side (stored in the form of executable instructions in the storage medium on the server side and run by the processor on the server side), and the server uses its own computing resources to complete the related information recommendation tasks.
- the embodiments of the present application can also be provided as a distributed, parallel computing platform composed of multiple servers, equipped with a customized, easy-to-interact web (Web) interface or other user interface (UI, User Interface), so as to form an information recommendation platform (for recommendation lists) used by individuals, groups, or units.
- Example 3 The training device of the text classification model is a server-side application program interface (API, Application Program Interface) and a plug-in
- the text classification model training device 555 in this embodiment of the present application may be provided as a server-side API or plug-in for the user to invoke to execute the text classification model training method of the embodiment of the present application, and be embedded in various application programs.
- Example 4 The training device of the text classification model is the mobile device client API and plug-in
- the apparatus 555 for training the text classification model in the embodiment of the present application may be provided as an API or plug-in on the mobile device, for the user to call, so as to execute the training method of the text classification model in the embodiment of the present application.
- Example 5 The training device of the text classification model is an open cloud service
- the training device 555 of the text classification model in the embodiment of the present application may provide a cloud service for information recommendation developed for users, so that individuals, groups or units can obtain a recommendation list.
- the training device 555 of the text classification model includes a series of modules, including a translation module 5551 , a first training module 5552 , a screening module 5553 , and a second training module 5554 .
- the following continues to describe how the modules in the text classification model training device 555 provided by the embodiment of the present application cooperate to implement the training scheme of the text classification model.
- the translation module 5551 is configured to perform machine translation processing on a plurality of first text samples in the first language through a machine translation model to obtain a plurality of second text samples corresponding to the plurality of first text samples one-to-one;
- the plurality of second text samples are in a second language different from the first language;
- the first training module 5552 is configured to train a first text classification model for the second language based on the plurality of third text samples in the second language and the corresponding category labels;
- the screening module 5553 is configured to perform confidence-based screening processing on the plurality of second text samples through the trained first text classification model;
- the second training module 5554 is configured to train a second text classification model for the second language based on the second text samples obtained by the screening process; wherein the network depth of the second text classification model is greater than the network depth of the first text classification model.
- the first training module 5552 is further configured to perform the t-th training on the first text classification model based on a plurality of third text samples in the second language and the corresponding category labels;
- the t-th confidence-based screening process is performed on the plurality of second text samples through the first text classification model of the t-th training; based on the results of the previous t screenings, the plurality of third text samples, and the corresponding category labels, the (t+1)-th training is performed on the first text classification model;
- the first text classification model of the T-th training is used as the trained first text classification model; wherein t is a sequentially increasing positive integer whose value range satisfies 1 ≤ t ≤ T-1, and T is an integer greater than 2 representing the total number of training iterations.
- the second training module 5554 is further configured to determine the distribution of the second text samples obtained by the screening process over multiple categories; when the distribution of the second text samples obtained by the screening process over multiple categories satisfies the distribution balance condition and the number in each category exceeds the corresponding category-count threshold, construct a training set by randomly extracting, from the text samples of each category among the second text samples obtained by the screening process, text samples corresponding to the category-count threshold; and train a second text classification model for the second language based on the training set.
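A sketch of this balanced construction follows; the category-count threshold is an illustrative assumption.

```python
# Build a class-balanced training set: when every category exceeds the
# threshold, draw the same number of samples at random from each category.
import random
from collections import defaultdict

CATEGORY_THRESHOLD = 1000  # illustrative category-count threshold

def build_balanced_training_set(screened):        # screened: [(text, label), ...]
    by_category = defaultdict(list)
    for text, label in screened:
        by_category[label].append(text)
    if all(len(v) >= CATEGORY_THRESHOLD for v in by_category.values()):
        training_set = [(text, label)
                        for label, texts in by_category.items()
                        for text in random.sample(texts, CATEGORY_THRESHOLD)]
        random.shuffle(training_set)
        return training_set
    return None  # distribution not balanced: fall back to expansion (below)
```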
- the second training module 5554 is further configured to, when the distribution of the second text samples obtained by the screening process over multiple categories does not satisfy the distribution balance condition, perform synonym-based expansion processing on the second text samples of the under-represented categories; wherein the distribution over multiple categories of the second text samples obtained by the expansion processing satisfies the distribution balance condition; construct a training set based on the second text samples obtained by the expansion processing; and train a second text classification model for the second language based on the training set.
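A sketch of the synonym-based expansion follows; the synonym dictionary is a hypothetical stand-in for whatever synonym resource (a thesaurus, word-embedding neighbours, etc.) an implementation would use.

```python
# Expand under-represented categories with label-preserving synonym variants.
import random

SYNONYMS = {"good": ["great", "fine"], "bad": ["poor", "awful"]}  # illustrative

def synonym_variant(text):
    """Create one variant of a text by swapping words for synonyms."""
    return " ".join(random.choice(SYNONYMS[w]) if w in SYNONYMS else w
                    for w in text.split())

def rebalance(screened, target_count):
    """Expand each minority category until it reaches target_count samples."""
    by_category = {}
    for text, label in screened:
        by_category.setdefault(label, []).append(text)
    for label, texts in by_category.items():
        while len(texts) < target_count:
            texts.append(synonym_variant(random.choice(texts)))
    return [(text, label) for label, texts in by_category.items() for text in texts]
```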
- the second training module 5554 is further configured to construct a training set based on the plurality of third text samples and the second text samples obtained by the screening process, and to train a second text classification model for the second language based on the training set.
- the second training module 5554 is further configured to traverse each category of the second text samples obtained by the screening process and perform the following processing: when the number of second text samples in the category is below the category-count threshold of the category, randomly extract third text samples of the category from the plurality of third text samples to supplement the second text samples of the category, so as to update the second text samples obtained by the screening process; and construct a training set based on the updated second text samples obtained by the screening process.
- the second training module 5554 is further configured to determine, according to the correspondence between the computing power of the text classification model and the number of text samples that can be computed per unit time, a target sample number matching the computing power available for training the second text classification model; and to screen out, from the training set constructed based on the second text samples obtained by the screening process, text samples corresponding to the target sample number as the samples for training the second text classification model for the second language.
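A sketch of deriving the target sample number from the available computing power follows; all figures are illustrative assumptions.

```python
# Cap the training set by compute: from throughput (samples per unit time),
# the time budget, and the number of epochs, derive a target sample count.
import random

def target_sample_count(samples_per_second, budget_seconds, epochs):
    return int(samples_per_second * budget_seconds / epochs)

def cap_training_set(training_set, samples_per_second=200,
                     budget_seconds=3600, epochs=3):
    limit = target_sample_count(samples_per_second, budget_seconds, epochs)
    if len(training_set) <= limit:
        return training_set
    return random.sample(training_set, limit)   # random down-sampling
```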
- the first training module 5552 is further configured to perform prediction processing on the plurality of third text samples in the second language through the first text classification model to obtain the confidences of the predicted categories respectively corresponding to the plurality of third text samples; construct the loss function of the first text classification model based on the confidences of the predicted categories and the category labels of the third text samples; and update the parameters of the first text classification model until the loss function converges, taking the updated parameters of the first text classification model when the loss function converges as the parameters of the trained first text classification model.
- the first training module 5552 is further configured to perform the following processing for any third text sample among the plurality of third text samples, through the first text classification model: encode the third text sample to obtain the encoding vectors of the third text sample; fuse the encoding vectors of the third text sample to obtain a fusion vector; and perform nonlinear mapping processing on the fusion vector to obtain the confidence of the predicted category corresponding to the third text sample.
- the first text classification model includes a plurality of cascaded activation layers; the first training module 5552 is further configured to perform the mapping processing of the first activation layer of the plurality of cascaded activation layers on the fusion vector; output the mapping result of the first activation layer to the subsequent cascaded activation layers, which continue mapping processing and mapping-result output until the last activation layer is reached; and take the activation result output by the last activation layer as the confidence of the predicted category corresponding to the third text sample.
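For illustration, the cascaded activation layers could be sketched as a small feed-forward head whose last layer outputs per-category confidences; the layer sizes are assumptions.

```python
# The fusion vector passes through a chain of nonlinear mapping (activation)
# layers; the last layer's output is read as the per-category confidence.
import torch
import torch.nn as nn

confidence_head = nn.Sequential(
    nn.Linear(128, 64), nn.ReLU(),          # first activation layer
    nn.Linear(64, 32), nn.ReLU(),           # subsequent cascaded activation layer
    nn.Linear(32, 4), nn.Softmax(dim=-1),   # last layer: predicted-category confidences
)

fusion_vector = torch.randn(1, 128)           # stand-in fusion vector
confidences = confidence_head(fusion_vector)  # one confidence per category
```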
- the screening module 5553 is further configured to perform the following processing for any second text sample among the plurality of second text samples: perform prediction processing on the second text sample through the trained first text classification model to obtain the confidences of multiple predicted categories corresponding to the second text sample; determine the category label of the first text sample corresponding to the second text sample as the category label of the second text sample; and, based on the confidences of the multiple predicted categories corresponding to the second text sample and the category label of the second text sample, take the second text samples exceeding the confidence threshold as the second text samples obtained by the screening process.
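A sketch of this screening step follows; `predict_confidences` is a hypothetical hook standing in for the trained first text classification model, and reading the criterion as thresholding the confidence of the inherited label is one plausible interpretation.

```python
# Keep a translated sample when the weak classifier's confidence for the
# category label inherited from its source first-language sample is high enough.
CONFIDENCE_THRESHOLD = 0.9  # illustrative

def screen(second_samples, predict_confidences):
    """second_samples: [(text_b, label_inherited_from_first_sample), ...]
    predict_confidences(text) -> {category: confidence}."""
    kept = []
    for text, inherited_label in second_samples:
        confidences = predict_confidences(text)
        if confidences.get(inherited_label, 0.0) >= CONFIDENCE_THRESHOLD:
            kept.append((text, inherited_label))
    return kept
```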
- the second training module 5554 is further configured to perform prediction processing on the second text samples obtained by the screening process through the second text classification model to obtain the predicted categories corresponding to the second text samples obtained by the screening process; construct the loss function of the second text classification model based on the predicted categories corresponding to the second text samples obtained by the screening process and the corresponding category labels; and update the parameters of the second text classification model until the loss function converges, taking the updated parameters of the second text classification model when the loss function converges as the parameters of the trained second text classification model.
- the second text classification model includes a plurality of cascaded encoders; the second training module 5554 is further configured to perform the following processing for any text sample among the second text samples obtained by the screening process: perform the encoding processing of the first encoder of the plurality of cascaded encoders on the text sample; output the encoding result of the first encoder to the subsequent cascaded encoders, which continue encoding processing and encoding-result output until the last encoder is reached; take the encoding result output by the last encoder as the encoding vector corresponding to the text sample; and perform nonlinear mapping on the encoding vector of the text sample to obtain the predicted category corresponding to the text sample.
- the second training module 5554 is further configured to perform the following processing through the y-th encoder of the plurality of cascaded encoders: perform self-attention processing on the encoding result of the (y-1)-th encoder to obtain the y-th self-attention vector; perform residual connection processing on the y-th self-attention vector and the encoding result of the (y-1)-th encoder to obtain the y-th residual vector; perform nonlinear mapping processing on the y-th residual vector to obtain the y-th mapping vector; and perform residual connection processing on the y-th mapping vector and the y-th residual vector, taking the result of the residual connection as the encoding result of the y-th encoder, which is output to the (y+1)-th encoder.
- y is a sequentially increasing positive integer whose value range satisfies 2 ≤ y ≤ H-1, where H is an integer greater than 2 and represents the number of the plurality of cascaded encoders.
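A sketch of the y-th encoder's internal flow (self-attention, residual connection, nonlinear mapping, second residual connection) follows; the dimensions and head count are illustrative.

```python
# One encoder block: attend over the (y-1)-th encoder's output, add a residual,
# apply a nonlinear mapping, and add a second residual.
import torch
import torch.nn as nn

class EncoderBlock(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, prev):                       # prev: (y-1)-th encoding result
        attn_out, _ = self.attn(prev, prev, prev)  # y-th self-attention vector
        residual = prev + attn_out                 # y-th residual vector
        mapped = self.mlp(residual)                # y-th mapping vector
        return residual + mapped                   # y-th encoding result

block = EncoderBlock()
x = torch.randn(1, 10, 64)    # batch of 1, sequence of 10 token embeddings
y_out = block(x)              # passed on to the (y+1)-th encoder
```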
- an embodiment of the present application further provides a text classification device, and the text classification device includes a series of modules, including an acquisition module and a processing module.
- the obtaining module is configured to obtain the text to be classified; wherein, the text to be classified is in a second language different from the first language;
- the processing module is configured to perform encoding processing on the text to be classified through a second text classification model whose network depth is greater than that of the first text classification model, to obtain the encoding vector of the text to be classified; and to perform nonlinear mapping on the encoding vector of the text to be classified to obtain the category corresponding to the text to be classified; wherein the second text classification model is trained on text samples in the second language screened by the first text classification model, and the text samples in the second language are obtained by performing machine translation on text samples in the first language.
- Embodiments of the present application provide a computer program product or computer program, where the computer program product or computer program includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
- the processor of the electronic device reads the computer instruction from the computer-readable storage medium, and the processor executes the computer instruction, so that the electronic device executes the text classification model training method or text classification method described above in the embodiments of the present application.
- the embodiments of the present application provide a computer-readable storage medium storing executable instructions which, when executed by a processor, cause the processor to execute the training method of the text classification model provided by the embodiments of the present application, for example, the training method of the text classification model shown in Figures 3-5, or the text classification method.
- the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disc, or CD-ROM; it may also be any of various devices that include one of the foregoing memories or any combination thereof.
- executable instructions may take the form of programs, software, software modules, scripts, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
- executable instructions may, but do not necessarily, correspond to files in a file system, and may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple cooperating files (for example, files that store one or more modules, subroutines, or code sections).
- executable instructions may be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices distributed across multiple sites and interconnected by a communication network.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computational Linguistics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (20)
- A training method for a text classification model, comprising: performing machine translation processing on a plurality of first text samples in a first language to obtain a plurality of second text samples in one-to-one correspondence with the plurality of first text samples, wherein the plurality of second text samples are in a second language different from the first language; training a first text classification model for the second language based on a plurality of third text samples in the second language and their respective category labels; performing confidence-based screening processing on the plurality of second text samples through the trained first text classification model; and training a second text classification model for the second language based on the second text samples obtained by the screening processing, wherein the network depth of the second text classification model is greater than the network depth of the first text classification model.
- The method according to claim 1, wherein training the first text classification model for the second language based on the plurality of third text samples in the second language and their respective category labels comprises: performing a t-th training of the first text classification model based on the plurality of third text samples in the second language and their respective category labels; performing a t-th confidence-based screening processing on the plurality of second text samples through the first text classification model of the t-th training; performing a (t+1)-th training of the first text classification model based on the results of the first t screenings, the plurality of third text samples, and their respective category labels; and taking the first text classification model of the T-th training as the trained first text classification model, wherein t is a sequentially increasing positive integer whose value range satisfies 1 ≤ t ≤ T-1, and T is an integer greater than 2 representing the total number of training iterations.
- The method according to claim 1, wherein training the second text classification model for the second language based on the second text samples obtained by the screening processing comprises: determining the distribution of the second text samples obtained by the screening processing over multiple categories; when the distribution of the second text samples obtained by the screening processing over multiple categories satisfies a distribution balance condition and the number in each category exceeds the corresponding category-count threshold, constructing a training set from the text samples of each category among the second text samples obtained by the screening processing by randomly extracting text samples corresponding to the category-count threshold; and training the second text classification model for the second language based on the training set.
- The method according to claim 1, wherein training the second text classification model for the second language based on the second text samples obtained by the screening processing comprises: when the distribution of the second text samples obtained by the screening processing over multiple categories does not satisfy the distribution balance condition, performing synonym-based expansion processing on the second text samples of the under-represented categories, wherein the distribution over multiple categories of the second text samples obtained by the expansion processing satisfies the distribution balance condition; constructing a training set based on the second text samples obtained by the expansion processing; and training the second text classification model for the second language based on the training set.
- The method according to claim 1, wherein training the second text classification model for the second language based on the second text samples obtained by the screening processing comprises: constructing a training set based on the plurality of third text samples and the second text samples obtained by the screening processing, and training the second text classification model for the second language based on the training set.
- The method according to claim 5, wherein constructing the training set based on the plurality of third text samples and the second text samples obtained by the screening processing comprises: traversing each category of the second text samples obtained by the screening processing and performing the following processing: when the number of second text samples in the category is below the category-count threshold of the category, randomly extracting third text samples of the category from the plurality of third text samples and supplementing them into the second text samples of the category, so as to update the second text samples obtained by the screening processing; and constructing the training set based on the updated second text samples obtained by the screening processing.
- The method according to claim 1, wherein before training the second text classification model for the second language based on the second text samples obtained by the screening processing, the method further comprises: determining, according to the correspondence between the computing power of a text classification model and the number of text samples that can be computed per unit time, a target sample number matching the computing power available for training the second text classification model; and screening out text samples corresponding to the target sample number from the training set constructed based on the second text samples obtained by the screening processing, as the samples for training the second text classification model for the second language.
- The method according to claim 1, wherein training the first text classification model for the second language based on the plurality of third text samples in the second language and their respective category labels comprises: performing prediction processing on the plurality of third text samples in the second language through the first text classification model to obtain the confidences of the predicted categories respectively corresponding to the plurality of third text samples; constructing the loss function of the first text classification model based on the confidences of the predicted categories and the category labels of the third text samples; and updating the parameters of the first text classification model until the loss function converges, taking the updated parameters of the first text classification model when the loss function converges as the parameters of the trained first text classification model.
- The method according to claim 8, wherein performing prediction processing on the plurality of third text samples in the second language through the first text classification model to obtain the confidences of the predicted categories respectively corresponding to the plurality of third text samples comprises: performing the following processing for any third text sample among the plurality of third text samples, through the first text classification model: performing encoding processing on the third text sample to obtain the encoding vectors of the third text sample; performing fusion processing on the encoding vectors of the third text sample to obtain a fusion vector; and performing nonlinear mapping processing on the fusion vector to obtain the confidence of the predicted category corresponding to the third text sample.
- The method according to claim 9, wherein the first text classification model comprises multiple cascaded activation layers, and performing nonlinear mapping processing on the fusion vector to obtain the confidence of the predicted category corresponding to the third text sample comprises: performing the mapping processing of the first activation layer of the multiple cascaded activation layers on the fusion vector; outputting the mapping result of the first activation layer to the subsequent cascaded activation layers, and continuing mapping processing and mapping-result output through the subsequent cascaded activation layers until the last activation layer is reached; and taking the activation result output by the last activation layer as the confidence of the predicted category corresponding to the third text sample.
- The method according to claim 1, wherein performing confidence-based screening processing on the plurality of second text samples through the trained first text classification model comprises: performing the following processing for any second text sample among the plurality of second text samples: performing prediction processing on the second text sample through the trained first text classification model to obtain the confidences of multiple predicted categories corresponding to the second text sample; determining the category label of the first text sample corresponding to the second text sample as the category label of the second text sample; and, based on the confidences of the multiple predicted categories corresponding to the second text sample and the category label of the second text sample, taking the second text samples exceeding the confidence threshold as the second text samples obtained by the screening processing.
- The method according to claim 1, wherein training the second text classification model for the second language based on the second text samples obtained by the screening processing comprises: performing prediction processing on the second text samples obtained by the screening processing through the second text classification model to obtain the predicted categories corresponding to the second text samples obtained by the screening processing; constructing the loss function of the second text classification model based on the predicted categories corresponding to the second text samples obtained by the screening processing and the corresponding category labels; and updating the parameters of the second text classification model until the loss function converges, taking the updated parameters of the second text classification model when the loss function converges as the parameters of the trained second text classification model.
- The method according to claim 12, wherein the second text classification model comprises multiple cascaded encoders, and performing prediction processing on the second text samples obtained by the screening processing through the second text classification model to obtain the predicted categories corresponding to the second text samples obtained by the screening processing comprises: performing the following processing for any text sample among the second text samples obtained by the screening processing: performing the encoding processing of the first encoder of the multiple cascaded encoders on the text sample; outputting the encoding result of the first encoder to the subsequent cascaded encoders, and continuing encoding processing and encoding-result output through the subsequent cascaded encoders until the last encoder is reached; taking the encoding result output by the last encoder as the encoding vector corresponding to the text sample; and performing nonlinear mapping on the encoding vector of the text sample to obtain the predicted category corresponding to the text sample.
- The method according to claim 13, wherein continuing encoding processing and encoding-result output through the subsequent cascaded encoders comprises: performing the following processing through the y-th encoder of the multiple cascaded encoders: performing self-attention processing on the encoding result of the (y-1)-th encoder to obtain the y-th self-attention vector; performing residual connection processing on the y-th self-attention vector and the encoding result of the (y-1)-th encoder to obtain the y-th residual vector; performing nonlinear mapping processing on the y-th residual vector to obtain the y-th mapping vector; and performing residual connection processing on the y-th mapping vector and the y-th residual vector, taking the result of the residual connection as the encoding result of the y-th encoder, and outputting the encoding result of the y-th encoder to the (y+1)-th encoder, wherein y is a sequentially increasing positive integer whose value range satisfies 2 ≤ y ≤ H-1, and H is an integer greater than 2 representing the number of the multiple cascaded encoders.
- A text classification method, comprising: obtaining a text to be classified, wherein the text to be classified is in a second language different from a first language; performing encoding processing on the text to be classified through a second text classification model whose network depth is greater than that of a first text classification model, to obtain the encoding vector of the text to be classified; and performing nonlinear mapping on the encoding vector of the text to be classified to obtain the category corresponding to the text to be classified, wherein the second text classification model is trained on text samples in the second language screened by the first text classification model, and the text samples in the second language are obtained by performing machine translation on text samples in the first language.
- A training apparatus for a text classification model, comprising: a translation module configured to perform machine translation processing on a plurality of first text samples in a first language to obtain a plurality of second text samples in one-to-one correspondence with the plurality of first text samples, wherein the plurality of second text samples are in a second language different from the first language; a first training module configured to train a first text classification model for the second language based on a plurality of third text samples in the second language and their respective category labels; a screening module configured to perform confidence-based screening processing on the plurality of second text samples through the trained first text classification model; and a second training module configured to train a second text classification model for the second language based on the second text samples obtained by the screening processing, wherein the network depth of the second text classification model is greater than the network depth of the first text classification model.
- A text classification apparatus, comprising: an obtaining module configured to obtain a text to be classified, wherein the text to be classified is in a second language different from a first language; and a processing module configured to perform encoding processing on the text to be classified through a second text classification model whose network depth is greater than that of a first text classification model, to obtain the encoding vector of the text to be classified, and to perform nonlinear mapping on the encoding vector of the text to be classified to obtain the category corresponding to the text to be classified, wherein the second text classification model is trained on text samples in the second language screened by the first text classification model, and the text samples in the second language are obtained by performing machine translation on text samples in the first language.
- An electronic device, comprising: a memory for storing executable instructions; and a processor which, when executing the executable instructions stored in the memory, implements the training method for a text classification model according to any one of claims 1 to 14, or the text classification method according to claim 15.
- A computer-readable storage medium storing executable instructions which, when executed by a processor, implement the training method for a text classification model according to any one of claims 1 to 14, or the text classification method according to claim 15.
- A computer program product comprising a computer program or instructions which, when executed by a processor, implement the training method for a text classification model according to any one of claims 1 to 14, or the text classification method according to claim 15.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2023514478A JP2023539532A (ja) | 2020-11-04 | 2021-10-18 | テキスト分類モデルのトレーニング方法、テキスト分類方法、装置、機器、記憶媒体及びコンピュータプログラム |
US17/959,402 US20230025317A1 (en) | 2020-11-04 | 2022-10-04 | Text classification model training method, text classification method, apparatus, device, storage medium and computer program product |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011217057.9A CN112214604A (zh) | 2020-11-04 | 2020-11-04 | 文本分类模型的训练方法、文本分类方法、装置及设备 |
CN202011217057.9 | 2020-11-04 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/959,402 Continuation US20230025317A1 (en) | 2020-11-04 | 2022-10-04 | Text classification model training method, text classification method, apparatus, device, storage medium and computer program product |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022095682A1 true WO2022095682A1 (zh) | 2022-05-12 |
Family
ID=74058181
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/124335 WO2022095682A1 (zh) | 2020-11-04 | 2021-10-18 | 文本分类模型的训练方法、文本分类方法、装置、设备、存储介质及计算机程序产品 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20230025317A1 (zh) |
JP (1) | JP2023539532A (zh) |
CN (1) | CN112214604A (zh) |
WO (1) | WO2022095682A1 (zh) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115033701A (zh) * | 2022-08-12 | 2022-09-09 | 北京百度网讯科技有限公司 | 文本向量生成模型训练方法、文本分类方法及相关装置 |
CN115186670A (zh) * | 2022-09-08 | 2022-10-14 | 北京航空航天大学 | 一种基于主动学习的领域命名实体识别方法及系统 |
CN115329723A (zh) * | 2022-10-17 | 2022-11-11 | 广州数说故事信息科技有限公司 | 基于小样本学习的用户圈层挖掘方法、装置、介质及设备 |
CN115346084A (zh) * | 2022-08-15 | 2022-11-15 | 腾讯科技(深圳)有限公司 | 样本处理方法、装置、电子设备、存储介质及程序产品 |
CN117455421A (zh) * | 2023-12-25 | 2024-01-26 | 杭州青塔科技有限公司 | 科研项目的学科分类方法、装置、计算机设备及存储介质 |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112214604A (zh) * | 2020-11-04 | 2021-01-12 | 腾讯科技(深圳)有限公司 | 文本分类模型的训练方法、文本分类方法、装置及设备 |
US11934795B2 (en) * | 2021-01-29 | 2024-03-19 | Oracle International Corporation | Augmented training set or test set for improved classification model robustness |
CN113010674B (zh) * | 2021-03-11 | 2023-12-22 | 平安创科科技(北京)有限公司 | 文本分类模型封装方法、文本分类方法及相关设备 |
CN112765359B (zh) * | 2021-04-07 | 2021-06-18 | 成都数联铭品科技有限公司 | 一种基于少样本的文本分类方法 |
CN114462387B (zh) * | 2022-02-10 | 2022-09-02 | 北京易聊科技有限公司 | 无标注语料下的句型自动判别方法 |
CN114328936B (zh) * | 2022-03-01 | 2022-08-30 | 支付宝(杭州)信息技术有限公司 | 建立分类模型的方法和装置 |
CN114911821B (zh) * | 2022-04-20 | 2024-05-24 | 平安国际智慧城市科技股份有限公司 | 一种结构化查询语句的生成方法、装置、设备及存储介质 |
CN116737935B (zh) * | 2023-06-20 | 2024-05-03 | 青海师范大学 | 基于提示学习的藏文文本分类方法、装置及存储介质 |
CN116720005B (zh) * | 2023-08-10 | 2023-10-20 | 四川大学 | 一种基于自适应噪声的数据协同对比推荐模型的系统 |
CN117851601A (zh) * | 2024-02-26 | 2024-04-09 | 海纳云物联科技有限公司 | 事件分类模型的训练方法、使用方法、装置及介质 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103488623A (zh) * | 2013-09-04 | 2014-01-01 | 中国科学院计算技术研究所 | 多种语言文本数据分类处理方法 |
US20190026356A1 (en) * | 2015-09-22 | 2019-01-24 | Ebay Inc. | Miscategorized outlier detection using unsupervised slm-gbm approach and structured data |
CN111813942A (zh) * | 2020-07-23 | 2020-10-23 | 苏州思必驰信息科技有限公司 | 实体分类方法和装置 |
CN111831821A (zh) * | 2020-06-03 | 2020-10-27 | 北京百度网讯科技有限公司 | 文本分类模型的训练样本生成方法、装置和电子设备 |
CN112214604A (zh) * | 2020-11-04 | 2021-01-12 | 腾讯科技(深圳)有限公司 | 文本分类模型的训练方法、文本分类方法、装置及设备 |
-
2020
- 2020-11-04 CN CN202011217057.9A patent/CN112214604A/zh active Pending
-
2021
- 2021-10-18 WO PCT/CN2021/124335 patent/WO2022095682A1/zh active Application Filing
- 2021-10-18 JP JP2023514478A patent/JP2023539532A/ja active Pending
-
2022
- 2022-10-04 US US17/959,402 patent/US20230025317A1/en active Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103488623A (zh) * | 2013-09-04 | 2014-01-01 | 中国科学院计算技术研究所 | 多种语言文本数据分类处理方法 |
US20190026356A1 (en) * | 2015-09-22 | 2019-01-24 | Ebay Inc. | Miscategorized outlier detection using unsupervised slm-gbm approach and structured data |
CN111831821A (zh) * | 2020-06-03 | 2020-10-27 | 北京百度网讯科技有限公司 | 文本分类模型的训练样本生成方法、装置和电子设备 |
CN111813942A (zh) * | 2020-07-23 | 2020-10-23 | 苏州思必驰信息科技有限公司 | 实体分类方法和装置 |
CN112214604A (zh) * | 2020-11-04 | 2021-01-12 | 腾讯科技(深圳)有限公司 | 文本分类模型的训练方法、文本分类方法、装置及设备 |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115033701A (zh) * | 2022-08-12 | 2022-09-09 | 北京百度网讯科技有限公司 | 文本向量生成模型训练方法、文本分类方法及相关装置 |
CN115346084A (zh) * | 2022-08-15 | 2022-11-15 | 腾讯科技(深圳)有限公司 | 样本处理方法、装置、电子设备、存储介质及程序产品 |
CN115186670A (zh) * | 2022-09-08 | 2022-10-14 | 北京航空航天大学 | 一种基于主动学习的领域命名实体识别方法及系统 |
CN115329723A (zh) * | 2022-10-17 | 2022-11-11 | 广州数说故事信息科技有限公司 | 基于小样本学习的用户圈层挖掘方法、装置、介质及设备 |
CN117455421A (zh) * | 2023-12-25 | 2024-01-26 | 杭州青塔科技有限公司 | 科研项目的学科分类方法、装置、计算机设备及存储介质 |
CN117455421B (zh) * | 2023-12-25 | 2024-04-16 | 杭州青塔科技有限公司 | 科研项目的学科分类方法、装置、计算机设备及存储介质 |
Also Published As
Publication number | Publication date |
---|---|
JP2023539532A (ja) | 2023-09-14 |
CN112214604A (zh) | 2021-01-12 |
US20230025317A1 (en) | 2023-01-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2022095682A1 (zh) | 文本分类模型的训练方法、文本分类方法、装置、设备、存储介质及计算机程序产品 | |
CN106997370B (zh) | 基于作者的文本分类和转换 | |
CN108984683B (zh) | 结构化数据的提取方法、系统、设备及存储介质 | |
US11775761B2 (en) | Method and apparatus for mining entity focus in text | |
US11886480B2 (en) | Detecting affective characteristics of text with gated convolutional encoder-decoder framework | |
WO2020238783A1 (zh) | 一种信息处理方法、装置及存储介质 | |
CN110140133A (zh) | 机器学习任务的隐式桥接 | |
JP7301922B2 (ja) | 意味検索方法、装置、電子機器、記憶媒体およびコンピュータプログラム | |
CN112528637B (zh) | 文本处理模型训练方法、装置、计算机设备和存储介质 | |
CN110083702B (zh) | 一种基于多任务学习的方面级别文本情感转换方法 | |
CN111860653A (zh) | 一种视觉问答方法、装置及电子设备和存储介质 | |
CN111930915B (zh) | 会话信息处理方法、装置、计算机可读存储介质及设备 | |
CN115269786B (zh) | 可解释的虚假文本检测方法、装置、存储介质以及终端 | |
CN112668347B (zh) | 文本翻译方法、装置、设备及计算机可读存储介质 | |
Sonawane et al. | ChatBot for college website | |
CN113420869B (zh) | 基于全方向注意力的翻译方法及其相关设备 | |
CN116977885A (zh) | 视频文本任务处理方法、装置、电子设备及可读存储介质 | |
CN112199954B (zh) | 基于语音语义的疾病实体匹配方法、装置及计算机设备 | |
CN113919338B (zh) | 处理文本数据的方法及设备 | |
CN114330285A (zh) | 语料处理方法、装置、电子设备及计算机可读存储介质 | |
CN117521674B (zh) | 对抗信息的生成方法、装置、计算机设备和存储介质 | |
CN115204118B (zh) | 文章生成方法、装置、计算机设备及存储介质 | |
CN113591493B (zh) | 翻译模型的训练方法及翻译模型的装置 | |
Kang et al. | Hierarchical attention networks for user profile inference in social media systems | |
CN116913278B (zh) | 语音处理方法、装置、设备和存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21888379 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2023514478 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 260923) |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 21888379 Country of ref document: EP Kind code of ref document: A1 |