WO2020140632A1 - Hidden feature extraction method, apparatus, computer device and storage medium - Google Patents

Hidden feature extraction method, apparatus, computer device and storage medium

Info

Publication number
WO2020140632A1
Authority
WO
WIPO (PCT)
Prior art keywords
corpus
word vector
feature
self
hidden
Prior art date
Application number
PCT/CN2019/118242
Other languages
English (en)
Chinese (zh)
Inventor
金戈
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020140632A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions

Definitions

  • the present application relates to the technical field of text classification, and in particular to a method, apparatus, computer equipment, and computer-readable storage medium for extracting hidden features.
  • Embodiments of the present application provide an implicit feature extraction method, device, computer equipment, and computer-readable storage medium, which can solve the problem of low text classification efficiency in the conventional technology.
  • An embodiment of the present application provides an implicit feature extraction method, the method including: acquiring a first corpus for performing implicit feature extraction; performing word embedding on the first corpus to convert the first corpus into word vectors; extracting the word vector features of the word vectors through a convolutional neural network; and encoding the word vector features by self-encoding to extract the hidden features of the word vector features.
  • An embodiment of the present application further provides a computer device, which includes a memory and a processor; a computer program is stored in the memory, and the above hidden feature extraction method is implemented when the processor executes the computer program.
  • An embodiment of the present application further provides a computer-readable storage medium that stores a computer program which, when executed by a processor, causes the processor to execute the above implicit feature extraction method.
  • FIG. 1 is a schematic diagram of an application scenario of an implicit feature extraction method provided by an embodiment of this application
  • FIG. 2 is a schematic flowchart of a method for extracting hidden features provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of word vectors in a method for extracting hidden features provided by an embodiment of the present application
  • FIG. 4 is a schematic diagram of a self-encoding structure in an implicit feature extraction method provided by an embodiment of this application;
  • FIG. 5 is a schematic flowchart of a self-encoding structure in an implicit feature extraction method provided by an embodiment of this application;
  • FIG. 6 is a schematic diagram of corpus display in a method for extracting hidden features provided by an embodiment of the present application
  • Each subject in FIG. 1 obtains a first corpus for implicit feature extraction, performs word embedding on the first corpus to convert the first corpus into a word vector, extracts the word vector features of the word vector through a convolutional neural network, and encodes the word vector features in a self-encoding manner to extract the hidden features of the word vector features.
  • FIG. 1 only illustrates a desktop computer as a terminal.
  • the type of the terminal is not limited to that shown in FIG. 1.
  • the terminal may also be an electronic device such as a mobile phone, notebook computer, or tablet computer.
  • the application scenarios of the above implicit feature extraction method are only used to illustrate the technical solution of the present application, and are not used to limit the technical solution of the present application.
  • the server obtains the first corpus for performing implicit feature extraction.
  • The first corpus may be a corpus crawled from a designated website on the web, and crawling rules may be preset according to actual needs; for example, the crawled corpus may be the corpus of a certain web page, or the relevant corpus of a crawled subject.
  • the first corpus may also be a corpus provided through a corpus database, such as user data accumulated on a website.
  • Word embedding is a type of word representation in which words with similar meanings have similar representations; it is the general term for methods that map vocabulary to real-number vectors.
  • In other words, word embedding is a technique in which each individual word is represented as a real-valued vector in a predefined vector space, with each word mapped to one vector.
  • FIG. 3 is a schematic diagram of word vectors in a method for extracting hidden features provided by an embodiment of the present application.
  • The text corpus is converted into pre-trained word vectors; that is, the input natural language is encoded into word vectors based on the pre-trained word vectors.
  • For pre-trained word vectors, a static method and a non-static method are distinguished. In the static method, the word vector parameters are no longer adjusted during the training of TextCNN.
  • The non-static method adjusts the word vector parameters during the training process, so the results of the non-static method are generally better than those of the static method.
  • TextCNN (Text Convolutional Neural Network) is a text classification model based on a convolutional neural network, that is, it uses a convolutional neural network to classify text.
  • The embedding layer (Embedding layer) is the layer of the network that holds the word vectors.
  • In the non-static method, the word vectors can, for example, be adjusted once every 100 batches, which fine-tunes the word vectors while reducing the training time, as illustrated in the sketch below.
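  • As a non-authoritative illustration, the following minimal Python/TensorFlow sketch contrasts a static and a non-static embedding layer; the names pretrained_matrix, vocab_size and embed_dim, as well as their sizes, are assumptions and not part of this disclosure.

```python
import numpy as np
import tensorflow as tf

vocab_size, embed_dim = 10000, 300          # assumed sizes
pretrained_matrix = np.random.rand(vocab_size, embed_dim).astype("float32")  # stand-in for Word2Vec vectors

# Static method: the pre-trained word vectors are frozen during TextCNN training.
static_embedding = tf.keras.layers.Embedding(
    vocab_size, embed_dim,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained_matrix),
    trainable=False)

# Non-static method: the word vectors are fine-tuned during training,
# which generally gives better results than the static method.
non_static_embedding = tf.keras.layers.Embedding(
    vocab_size, embed_dim,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained_matrix),
    trainable=True)
```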
  • the first corpus can be word embedded using a trained preset word vector dictionary to convert the first corpus into word vectors.
  • the word vector may use Word2Vec pre-trained word vectors, that is, each vocabulary has a corresponding vector representation, and such vector representations can express vocabulary information in data form.
  • Word2Vec (Word to Vector) is a software tool for training word vectors.
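  • As a rough sketch only, the conversion of a corpus into word vectors via a preset word vector dictionary could look as follows; word_vector_dict, its entries and the whitespace tokenizer are hypothetical placeholders, not the actual dictionary used in the embodiments.

```python
import numpy as np

embed_dim = 300                                            # assumed dimension
word_vector_dict = {"cat": np.ones(embed_dim),             # hypothetical entries
                    "dog": np.full(embed_dim, 0.5)}

def corpus_to_word_vectors(corpus: str) -> np.ndarray:
    """Convert a text corpus into a (num_tokens, embed_dim) word vector matrix."""
    tokens = corpus.split()                                # naive tokenizer for illustration
    vectors = [word_vector_dict.get(t, np.zeros(embed_dim)) for t in tokens]
    return np.stack(vectors)

first_corpus_vectors = corpus_to_word_vectors("cat dog")   # shape (2, 300)
```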
  • A convolutional neural network (Convolutional Neural Network, CNN) is a type of feedforward neural network that contains convolution or related computations and has a deep structure; it is one of the representative algorithms of deep learning (Deep Learning). Because a convolutional neural network can perform shift-invariant classification (Shift-Invariant Classification), it is also called a shift-invariant artificial neural network (Shift-Invariant Artificial Neural Networks, SIANN).
  • a convolutional neural network is established, and the features of the corpus are extracted using the convolutional neural network.
  • Convolutional neural networks capture local text information through multiple scale convolution kernels.
  • The vertical dimension of the first-level convolution kernels can be selected from several scales, for example from 1 to 5, to correspond to the number of words captured, while the horizontal dimension remains equal to the word vector dimension.
  • A subsequent one-dimensional convolutional layer along the longitudinal dimension can then be selected according to the length of the text to further refine the extracted information.
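  • The following minimal sketch (an assumption about one possible implementation, not the patented network itself) builds first-level convolution kernels whose vertical scale ranges over 1 to 5 words while the horizontal dimension equals the word vector dimension; seq_len, embed_dim and num_filters are assumed values.

```python
import tensorflow as tf

seq_len, embed_dim, num_filters = 100, 300, 64             # assumed sizes
inputs = tf.keras.Input(shape=(seq_len, embed_dim))        # word vectors of one text

pooled = []
for kernel_height in range(1, 6):                          # kernel scales of 1..5 words
    conv = tf.keras.layers.Conv1D(num_filters, kernel_height,
                                  activation="relu")(inputs)
    pooled.append(tf.keras.layers.GlobalMaxPooling1D()(conv))

word_vector_features = tf.keras.layers.Concatenate()(pooled)   # (batch, 5 * num_filters)
cnn_model = tf.keras.Model(inputs, word_vector_features)
```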
  • the self-encoding method refers to the way of encoding through the self-encoding structure.
  • The self-encoding structure is an unsupervised learning method that uses a neural network to learn hidden features; it is an artificial neural network used effectively in unsupervised learning.
  • The purpose of self-encoding is to learn a representation of a set of data; the representation is generally described by numbers. Such an encoding is usually used for dimensionality reduction, and self-encoding can also be used for data generation models. Please refer to FIG. 4.
  • FIG. 4 is a schematic diagram of a self-encoding structure in an implicit feature extraction method provided by an embodiment of the present application.
  • As shown in FIG. 4, the self-encoding structure generally includes an input layer, a hidden layer, and an output layer.
  • the input layer receives external input data, encodes through the hidden layer in the middle to learn hidden features, and decodes and outputs the hidden features through the output layer.
  • The hidden layer can be expressed as a functional relationship, such as H_{w,b}(x), where H is the implicit feature, x is the input variable, and w and b are parameters.
  • The hidden layer structure in the self-encoding structure can be composed of a single layer or of multiple layers.
  • When the hidden layer is composed of one layer, it can be called a single hidden layer.
  • When the hidden layer is composed of multiple layers, it can be called multiple hidden layers.
  • the hidden layer shown in FIG. 4 is one layer.
  • the hidden layer may also be multiple layers such as 2, 3 or 4 layers.
  • The self-encoding structure can be constructed through the tensorflow library in Python; the constructed network structure can then be trained, and the trained self-encoding structure can be put into actual use.
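  • A minimal sketch of such a self-encoding structure built with the tensorflow library is shown below, assuming the word vector features have feature_dim dimensions and the hidden layer compresses them to hidden_dim dimensions (for example 10 dimensions → 5 dimensions → 10 dimensions); the layer sizes and names are assumptions.

```python
import tensorflow as tf

feature_dim, hidden_dim = 10, 5                            # assumed sizes
inputs = tf.keras.Input(shape=(feature_dim,))              # input layer
hidden = tf.keras.layers.Dense(hidden_dim, activation="relu",
                               name="hidden_layer")(inputs)    # learns the hidden features
outputs = tf.keras.layers.Dense(feature_dim, name="output_layer")(hidden)  # decodes / restores

autoencoder = tf.keras.Model(inputs, outputs)
encoder = tf.keras.Model(inputs, hidden)   # exposes the hidden features after training
```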
  • The word vector features are encoded by a self-encoding function to obtain the hidden features of the word vector features. That is, the terminal encodes the word vector features through the hidden layer of the self-encoding structure to obtain a dimensionality-reduced numerical description of the first corpus. Here, the hidden layer refers to an unsupervised neural-network learning method that converts the text corpus into a numerical representation, expressing the meaning of the text in an implicit, non-literal form, so that a large amount of corpus can be extracted and then accurately restored.
  • the hidden layer is an intermediate layer between the input layer and the output layer of the neural network. Each hidden layer contains a certain number of hidden units, and there are connections between the hidden units and the input and output layers.
  • The self-encoding structure can also be understood as the following conversion process of the text corpus: 10 dimensions (Chinese characters) → 5 dimensions (numbers) → 10 dimensions (Chinese characters), where the 5-dimensional representation corresponds to the hidden features of the text, for example 5 lines, and training determines how accurately these 5 dimensions capture the original content.
  • In other words, the following process is realized through a neural network: text representation → numerical representation produced by the hidden layer (the meaning of the text expressed by numbers) → restored text representation.
  • FIG. 5 is a schematic flowchart of a self-encoding structure in a method for extracting hidden features provided by an embodiment of the present application. As shown in Figure 5, build a self-encoding network structure.
  • the embodiments of the present application belong to the technical field of text classification.
  • When implementing hidden feature extraction, a first corpus for implicit feature extraction is acquired, word embedding is performed on the first corpus to convert it into word vectors, the word vector features of the word vectors are extracted through a convolutional neural network, an unsupervised algorithm is used to cluster and describe the corpus, and the word vector features are then encoded by self-encoding to extract the hidden features of the word vector features, thereby reducing the dimensionality of the corpus data. Extracting the hidden features of the corpus through unsupervised learning in this way can improve the accuracy of subsequent learning modeling and overcome the influence of the amount of training data.
  • the method further includes:
  • the second corpus is displayed in a preset form.
  • The second corpus has a certain regularity, and it can be displayed in the form of a table or in the form of a chart, so that the user can obtain information about the second corpus from the table or graph.
  • Table 1 is an example of the second corpus obtained in the form of a table.
  • FIG. 6 is a schematic diagram of the corpus display in the method of extracting hidden features provided by an embodiment of the present application.
  • FIG. 6 shows an example of the obtained second corpus displayed in chart form.
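  • Purely as an illustrative sketch, a decoded second corpus could be shown in table form and chart form as follows; the data, column names and use of pandas/matplotlib are assumptions for demonstration only.

```python
import pandas as pd
import matplotlib.pyplot as plt

second_corpus = [("cat", 2), ("dog", 3), ("person", 1)]    # hypothetical decoded corpus
table = pd.DataFrame(second_corpus, columns=["category", "count"])
print(table)                                               # table form

table.plot.bar(x="category", y="count", legend=False)      # chart form
plt.show()
```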
  • Before the step of encoding the word vector features through the self-encoding function to obtain the hidden features of the word vector features, the method further includes: training the self-encoding function using a training corpus.
  • The step of using the training corpus to train the self-encoding function includes: S710, inputting the word vector features of the training corpus to the self-encoding function; S720, encoding the input word vector features of the training corpus through the self-encoding function to extract the hidden features of the word vector features; S730, decoding the hidden features to obtain a decoded third corpus; S740, determining whether the similarity between the training corpus and the third corpus is greater than or equal to a preset similarity threshold; S750, if the similarity between the training corpus and the third corpus is greater than or equal to the preset similarity threshold, determining that the training of the self-encoding structure is complete; S760, if the similarity between the training corpus and the third corpus is less than the preset similarity threshold, adjusting the parameters in the self-encoding function and continuing the training.
  • Before the self-encoding structure is used to learn the hidden features of the text, the self-encoding structure needs to be trained; once the self-encoding structure can extract the hidden features of the corpus with the required accuracy, the training of the self-encoding structure is complete.
  • the self-coding network structure after training can be used for feature extraction of text, and the hidden features of the text are learned according to the self-coding structure to use the hidden features of the extracted corpus for modeling and other uses.
  • The loss function in the self-encoding structure is MSE (mean-square error), which is computed from the squared distances between the predicted values and the true values.
  • The training method (optimizer) is Adam.
  • Adam stands for adaptive moment estimation, and the learning rate is 0.001.
  • The learning rate (Learning Rate) controls the learning progress of the model.
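  • The training setup can be sketched as follows, assuming the word vector features of the training corpus are held in a (num_samples, feature_dim) array; the loss is MSE, the optimizer is Adam with a learning rate of 0.001, and a cosine-similarity threshold stands in for the similarity check of steps S740 to S760. The array contents, layer sizes and threshold value are assumptions.

```python
import numpy as np
import tensorflow as tf

feature_dim, hidden_dim = 10, 5
train_features = np.random.rand(256, feature_dim).astype("float32")   # assumed training features

inputs = tf.keras.Input(shape=(feature_dim,))
hidden = tf.keras.layers.Dense(hidden_dim, activation="relu")(inputs)
outputs = tf.keras.layers.Dense(feature_dim)(hidden)
autoencoder = tf.keras.Model(inputs, outputs)

autoencoder.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                    loss="mse")                            # MSE loss, Adam, lr = 0.001
autoencoder.fit(train_features, train_features, epochs=50, batch_size=32, verbose=0)

# Compare the reconstruction (third corpus) with the input (training corpus).
reconstructed = autoencoder.predict(train_features, verbose=0)
cosine = np.sum(train_features * reconstructed, axis=1) / (
    np.linalg.norm(train_features, axis=1)
    * np.linalg.norm(reconstructed, axis=1) + 1e-8)
training_complete = cosine.mean() >= 0.95   # otherwise adjust parameters and keep training
```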
  • the self-coding network structure after training can be used to extract hidden features of text. Specifically, the self-encoding structure training process is as follows:
  • The training corpus is a text corpus; for example, the obtained training corpus includes: cat 1, dog 1, dog 3, person, cat 2, dog 2.
  • Convert the training corpus into word vectors through the word embedding layer, that is, convert the text corpus into word vectors. For example, the above training text corpus converted into word vectors is: 1' (cat 1), 2' (dog 1), 2" (dog 3), 3 (person), 1" (cat 2), 2"' (dog 2).
  • Extract the word vector features of the word vectors through the convolutional neural network to achieve an unsupervised form of clustering representation; that is, the word vectors obtained from the training corpus conversion are extracted and classified through the convolutional neural network to obtain the features describing the training corpus. For example, the word vector features obtained from the above word vectors are: 1' and 1" (cat 1, cat 2); 2', 2" and 2"' (dog 1, dog 2, dog 3); 3 (person).
  • The word vector features of the training corpus are encoded by the hidden layer to learn the hidden features of the training corpus, and a self-encoding structure is established based on the output of the convolutional neural network to learn the hidden features of the training corpus. That is, the word vector features of the training corpus are input through the input layer of the self-encoding structure to its hidden layer, i.e. they are input to the self-encoding function to be encoded, so that the corresponding meaning of the text corpus is expressed in digital form, which is an implicit representation relative to the text form.
  • The hidden features learned from the above training corpus are: 1 (1' and 1"), 2 (2', 2" and 2"'), 3 (3).
  • the hidden features of the training corpus are decoded through the output layer of the self-encoding structure to obtain the decoded third corpus, that is, the digital form of the hidden features is restored to the text form through the neural network of the self-encoding structure
  • The content of the restored corpus and the text content of the original training corpus must meet the similarity requirement to achieve decoding; that is, the digital form of the hidden features is restored to the meaning of the text form through the self-encoding structure, and the final result requires that the restored content meets the similarity requirement with respect to the original text.
  • For example, the corpus restored from the above hidden features is: cat 1, cat 2, dog 1, dog 2, dog 3, person, or cat 1, dog 1, dog 3, person, cat 2, dog 2.
  • the convolutional neural network is established during the training process.
  • the convolutional neural network here is pre-trained to realize the feature extraction of the text using the convolutional neural network.
  • The hidden features of the text refer to the features generated by the hidden layer shown in Figure 5. During the training process, the self-encoding structure, the convolutional neural network structure, and the word vectors are all updated. Finally, when the similarity between the training corpus and the third corpus meets the preset similarity threshold, the hidden layer of the trained self-encoding structure can reflect the hidden features of the text and can be used for multiple purposes.
  • The embodiment of the present application extracts the hidden features of the text using an unsupervised algorithm: the text is first converted into pre-trained word vectors, the features of the text are extracted with the convolutional neural network, and a self-encoding structure is then established based on the output of the convolutional neural network to learn the hidden features of the text.
  • During this training process, the self-encoding structure, the convolutional neural network structure, and the word vectors are all updated.
  • The hidden layer of the trained self-encoding structure can reflect the hidden features of the text, so that the hidden features of the model are extracted through unsupervised algorithms and can be used for multiple purposes; the obtained information can improve the accuracy of subsequent supervised learning modeling and overcome the impact of the amount of training data.
  • The hidden feature extraction model established by the method of the embodiments of the present application is suitable for supervised training with a small number of training samples. Since deep learning has a high risk of overfitting, a small amount of training sample data will seriously affect the generalization ability of the model. Therefore, through the method of the embodiments of the present application, a hidden feature extraction model can be established from a large amount of unlabeled training data to learn the hidden features of the text, and the hidden features from the hidden feature extraction model can then be combined with annotated training data to perform supervised learning modeling, thereby improving the accuracy of supervised learning modeling.
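  • As a hedged illustration of this combination, the sketch below feeds hidden features produced by an (untrained, stand-in) encoder to a simple supervised classifier; the encoder weights, the labels and the choice of logistic regression are assumptions, not the embodiments' actual supervised model.

```python
import numpy as np
import tensorflow as tf
from sklearn.linear_model import LogisticRegression

feature_dim, hidden_dim = 10, 5
inp = tf.keras.Input(shape=(feature_dim,))
encoder = tf.keras.Model(inp, tf.keras.layers.Dense(hidden_dim, activation="relu")(inp))
# In practice this encoder would be the trained hidden layer of the self-encoding structure.

labeled_features = np.random.rand(40, feature_dim).astype("float32")   # small annotated set
labels = np.random.randint(0, 2, size=40)                              # hypothetical labels

hidden_features = encoder.predict(labeled_features, verbose=0)         # unsupervised features
classifier = LogisticRegression().fit(hidden_features, labels)         # supervised modeling
predictions = classifier.predict(hidden_features)
```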
  • FIG. 8 is a schematic block diagram of an apparatus for extracting hidden features provided by an embodiment of the present application.
  • an embodiment of the present application further provides a hidden feature extraction device.
  • the hidden feature extraction device includes a unit for performing the aforementioned hidden feature extraction method, and the device may be configured in a computer device such as a terminal or a server.
  • the hidden feature extraction device 800 includes an acquisition unit 801, a conversion unit 802, a first extraction unit 803 and a second extraction unit 804.
  • the obtaining unit 801 is used to obtain a first corpus for performing hidden feature extraction; the conversion unit 802 is used to embed the first corpus into words to convert the first corpus into a word vector; the first extraction unit 803, used to extract word vector features of the word vector through a convolutional neural network; a second extraction unit 804, used to encode the word vector features by self-encoding to extract hidden features of the word vector features .
  • the second extraction unit 804 is configured to encode the word vector feature through a self-encoding function to obtain the hidden feature of the word vector feature.
  • the hidden feature extraction device 800 further includes: a decoding unit 805 for decoding the hidden feature to obtain a decoded second corpus; a display unit 806, It is used to display the second corpus in a preset form; a training unit 807 is used to train the self-encoding function using a training corpus.
  • The training unit 807 includes: an input subunit 8071, used to input the word vector features of the training corpus to the self-encoding function; an encoding subunit 8072, used to encode the word vector features of the training corpus through the self-encoding function to extract the hidden features of the word vector features; a decoding subunit 8073, used to decode the hidden features to obtain a decoded third corpus; a judgment subunit 8074, used to judge whether the similarity between the training corpus and the third corpus is greater than or equal to a preset similarity threshold; a determination subunit 8075, used to determine that the training of the self-encoding structure is complete if the similarity between the training corpus and the third corpus is greater than or equal to the preset similarity threshold; and an adjustment subunit 8076, used to adjust the parameters in the self-encoding function and continue the training if the similarity between the training corpus and the third corpus is less than the preset similarity threshold.
  • the display unit is configured to display the second corpus in a table form or a chart form.
  • the conversion unit 802 is configured to embed the first corpus into words using a trained preset word vector dictionary to convert the first corpus into word vectors.
  • the division and connection of the units in the above hidden feature extraction device are only for illustration.
  • In other embodiments, the hidden feature extraction device may be divided into different units as needed, or the units in the hidden feature extraction device may adopt different connection sequences and methods, to complete all or part of the functions of the hidden feature extraction device.
  • the above-mentioned hidden feature extraction device may be implemented in the form of a computer program, and the computer program may run on the computer device shown in FIG. 10.
  • FIG. 10 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 1000 may be a computer device such as a desktop computer or a server, or may be a component or part in other devices.
  • the computer device 1000 includes a processor 1002, a memory, and a network interface 1005 connected through a system bus 1001, where the memory may include a non-volatile storage medium 1003 and an internal memory 1004.
  • the non-volatile storage medium 1003 can store an operating system 10031 and a computer program 10032.
  • When the computer program 10032 is executed, it may cause the processor 1002 to execute the aforementioned implicit feature extraction method.
  • the processor 1002 is used to provide computing and control capabilities to support the operation of the entire computer device 1000.
  • the internal memory 1004 provides an environment for the operation of the computer program 10032 in the non-volatile storage medium 1003.
  • the processor 1002 can execute the above-mentioned hidden feature extraction method.
  • the network interface 1005 is used for network communication with other devices.
  • the structure shown in FIG. 10 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device 1000 to which the solution of the present application is applied.
  • the specific computer device 1000 may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • the computer device may only include a memory and a processor. In such an embodiment, the structures and functions of the memory and the processor are consistent with the embodiment shown in FIG. 10, and details are not described herein again.
  • the processor 1002 is used to run the computer program 10032 stored in the memory to implement the hidden feature extraction method in the embodiment of the present application.
  • The processor 1002 may be a central processing unit (Central Processing Unit, CPU), and the processor 1002 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor.
  • the embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium may be a non-volatile computer-readable storage medium.
  • the computer-readable storage medium stores a computer program.
  • When executed by the processor, the computer program causes the processor to perform the steps of the hidden feature extraction method described in the foregoing embodiments.
  • The storage medium is a physical, non-transitory storage medium, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a magnetic disk, or an optical disk, or various other physical storage media that can store computer programs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The embodiments of the invention relate to a hidden feature extraction method, an apparatus, a computer device, and a computer-readable storage medium. The embodiments of the present invention belong to the technical field of text classification. In the embodiments of the invention, when hidden feature extraction is performed, a first corpus for performing hidden feature extraction is acquired, word embedding is performed on the first corpus so as to convert the first corpus into a word vector, a feature of the word vector is extracted by means of a convolutional neural network, the word vector is clustered and then described by means of an unsupervised algorithm, and the feature of the word vector is encoded by means of self-encoding so as to extract a hidden feature of the feature of the word vector.
PCT/CN2019/118242 2019-01-04 2019-11-14 Hidden feature extraction method, apparatus, computer device and storage medium WO2020140632A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910007711.4A CN109871531A (zh) 2019-01-04 2019-01-04 隐含特征提取方法、装置、计算机设备及存储介质
CN201910007711.4 2019-01-04

Publications (1)

Publication Number Publication Date
WO2020140632A1 (fr)

Family

ID=66917462

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118242 WO2020140632A1 (fr) 2019-01-04 2019-11-14 Hidden feature extraction method, apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN109871531A (fr)
WO (1) WO2020140632A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113435199A (zh) * 2021-07-18 2021-09-24 谢勇 一种性格对应文化的存储读取干涉方法及系统

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871531A (zh) * 2019-01-04 2019-06-11 平安科技(深圳)有限公司 隐含特征提取方法、装置、计算机设备及存储介质
CN110413730B (zh) * 2019-06-27 2024-06-07 平安科技(深圳)有限公司 文本信息匹配度检测方法、装置、计算机设备和存储介质
CN110442677A (zh) * 2019-07-04 2019-11-12 平安科技(深圳)有限公司 文本匹配度检测方法、装置、计算机设备和可读存储介质
CN111507100B (zh) * 2020-01-14 2023-05-05 上海勃池信息技术有限公司 一种卷积自编码器及基于该编码器的词嵌入向量压缩方法
CN111222981A (zh) * 2020-01-16 2020-06-02 中国建设银行股份有限公司 可信度确定方法、装置、设备和存储介质
CN112929341A (zh) * 2021-01-22 2021-06-08 网宿科技股份有限公司 一种dga域名的检测方法、系统及装置
CN113239128B (zh) * 2021-06-01 2022-03-18 平安科技(深圳)有限公司 基于隐式特征的数据对分类方法、装置、设备和存储介质
CN113627514A (zh) * 2021-08-05 2021-11-09 南方电网数字电网研究院有限公司 知识图谱的数据处理方法、装置、电子设备和存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529721A (zh) * 2016-11-08 2017-03-22 安徽大学 一种深度特征提取的广告点击率预测系统及其预测方法
CN108733682A (zh) * 2017-04-14 2018-11-02 华为技术有限公司 一种生成多文档摘要的方法及装置
CN109871531A (zh) * 2019-01-04 2019-06-11 平安科技(深圳)有限公司 隐含特征提取方法、装置、计算机设备及存储介质

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107516110B (zh) * 2017-08-22 2020-02-18 华南理工大学 一种基于集成卷积编码的医疗问答语义聚类方法
CN108427771B (zh) * 2018-04-09 2020-11-10 腾讯科技(深圳)有限公司 摘要文本生成方法、装置和计算机设备
CN108960959B (zh) * 2018-05-23 2020-05-12 山东大学 基于神经网络的多模态互补服装搭配方法、系统及介质

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529721A (zh) * 2016-11-08 2017-03-22 安徽大学 一种深度特征提取的广告点击率预测系统及其预测方法
CN108733682A (zh) * 2017-04-14 2018-11-02 华为技术有限公司 一种生成多文档摘要的方法及装置
CN109871531A (zh) * 2019-01-04 2019-06-11 平安科技(深圳)有限公司 隐含特征提取方法、装置、计算机设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113435199A (zh) * 2021-07-18 2021-09-24 谢勇 一种性格对应文化的存储读取干涉方法及系统
CN113435199B (zh) * 2021-07-18 2023-05-26 谢勇 一种性格对应文化的存储读取干涉方法及系统

Also Published As

Publication number Publication date
CN109871531A (zh) 2019-06-11

Similar Documents

Publication Publication Date Title
WO2020140632A1 (fr) Hidden feature extraction method, apparatus, computer device and storage medium
US11816442B2 (en) Multi-turn dialogue response generation with autoregressive transformer models
US11244205B2 (en) Generating multi modal image representation for an image
WO2020140403A1 (fr) Text classification method and apparatus, computer device, and storage medium
WO2021179570A1 (fr) Sequence labeling method and apparatus, computer device, and storage medium
WO2020151175A1 (fr) Text generation method and device, computer device, and storage medium
CN110377733B (zh) 一种基于文本的情绪识别方法、终端设备及介质
US20220230061A1 (en) Modality adaptive information retrieval
JP2023500222A (ja) 系列マイニングモデルの訓練方法、系列データの処理方法、系列マイニングモデルの訓練装置、系列データの処理装置、コンピュータ機器、及びコンピュータプログラム
WO2020143303A1 (fr) Method and apparatus for training a deep learning model, computer device, and storage medium
CN113326383B (zh) 一种短文本实体链接方法、装置、计算设备与存储介质
CN117173269A (zh) 一种人脸图像生成方法、装置、电子设备和存储介质
CN115269768A (zh) 要素文本处理方法、装置、电子设备和存储介质
CN111368554B (zh) 语句处理方法、装置、计算机设备和存储介质
CN114065771A (zh) 一种预训练语言处理方法及设备
CN116127049A (zh) 模型训练方法、文本生成方法、终端设备及计算机介质
CN115033683B (zh) 摘要生成方法、装置、设备及存储介质
CN112487136A (zh) 文本处理方法、装置、设备以及计算机可读存储介质
CN112509559B (zh) 音频识别方法、模型训练方法、装置、设备及存储介质
CN117371447A (zh) 命名实体识别模型的训练方法、装置及存储介质
CN116306612A (zh) 一种词句生成方法及相关设备
CN110442706B (zh) 一种文本摘要生成的方法、系统、设备及存储介质
CN111310460B (zh) 语句的调整方法及装置
CN117252274B (zh) 一种文本音频图像对比学习方法、装置和存储介质
CN111639152B (zh) 意图识别方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19906677

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19906677

Country of ref document: EP

Kind code of ref document: A1