WO2021042517A1 - Artificial intelligence-based article gist extraction method and device, and storage medium - Google Patents

Artificial intelligence-based article gist extraction method and device, and storage medium

Info

Publication number
WO2021042517A1
WO2021042517A1 (PCT/CN2019/116936)
Authority
WO
WIPO (PCT)
Prior art keywords
word
subject
matrix
text
artificial intelligence
Prior art date
Application number
PCT/CN2019/116936
Other languages
English (en)
Chinese (zh)
Inventor
陈一峰
周骏红
汪伟
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 filed Critical 平安科技(深圳)有限公司
Publication of WO2021042517A1 publication Critical patent/WO2021042517A1/fr

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device and computer-readable storage medium for extracting the subject matter of articles based on artificial intelligence.
  • This application provides an artificial intelligence-based article subject extraction method, device, and computer-readable storage medium, the main purpose of which is to perform intelligent subject extraction based on the article input by the user.
  • an artificial intelligence-based article subject extraction method includes: receiving a text data set, and performing word segmentation and merging operations on the text data set to obtain a word text set;
  • converting the word text set into a word matrix set through an encoding operation, and inputting the word matrix set into a word vector conversion model for training to obtain a word vector set;
  • performing a dimensionality reduction operation on the word vector set and then inputting it into a convolutional neural network model
  • for training to obtain a training value, and comparing the training value with a preset threshold: if the training value is greater than the preset threshold, the convolutional neural network model continues training, and if the training value is less than the preset threshold,
  • the convolutional neural network model completes training; and receiving text data input by the user, converting the text data input by the user into a word vector, and then inputting it into the trained convolutional neural network model to obtain and output the article subject.
  • this application also provides an artificial intelligence-based article subject extraction device, which includes a memory and a processor, and the memory stores an artificial intelligence-based article
  • subject extraction program that can run on the processor; when the artificial intelligence-based article subject extraction program is executed by the processor, the following steps are implemented: receiving a text data set, and performing operations including word segmentation and merging on the text data set to obtain a word text set;
  • converting the word text set into a word matrix set through an encoding operation, and inputting the word matrix set into a word vector conversion model for training to obtain a word vector set;
  • performing a dimensionality reduction operation on the word vector set and then inputting it into the convolutional neural
  • network model for training to obtain a training value, and comparing the training value with a preset threshold; if the training value is greater than the preset threshold,
  • the convolutional neural network model continues training, and if the training value is less than the preset threshold, the convolutional neural network model completes training; and receiving text data input by the user, converting the text data input by the user into a word vector, and inputting it into the trained convolutional neural network model to obtain and output the article subject.
  • the present application also provides a computer-readable storage medium on which an artificial intelligence-based article subject extraction program is stored, which can be executed by one or more processors to implement the steps of the above-mentioned artificial intelligence-based article subject extraction method.
  • This application first performs word segmentation and merging operations on the text data set to obtain a word text set, which can avoid the influence of wrong words on the subject of the entire article.
  • the word text set is encoded and word vector transformed to obtain a word vector set.
  • the encoding operation and the word vector transformation reduce the dimension of the word while amplifying the feature attributes.
  • the convolutional neural network model has excellent feature extraction capabilities, can efficiently identify word features, and improves the accuracy of the output article subject. Therefore, the artificial intelligence-based article subject extraction method, device, and computer-readable storage medium proposed in this application can achieve accurate article subject output results.
  • FIG. 1 is a schematic flowchart of an artificial intelligence-based article subject extraction method provided by an embodiment of the application
  • FIG. 2 is a schematic diagram of the internal structure of an artificial intelligence-based article subject extraction device provided by an embodiment of the application;
  • FIG. 3 is a schematic diagram of modules of an artificial intelligence-based article subject extraction program in an artificial intelligence-based article subject extraction device provided by an embodiment of the application.
  • FIG. 1 it is a schematic flowchart of an artificial intelligence-based article subject extraction method provided by an embodiment of this application.
  • the method can be executed by a device, and the device can be implemented by software and/or hardware.
  • the method for extracting the subject matter of an article based on artificial intelligence includes:
  • S1 Receive a text data set, and perform operations including word segmentation and merging on the text data set to obtain a word text set.
  • the text data set includes multiple types of texts, such as news, social, academic, government development planning, and corporate investment.
  • the cleaning is to remove stop words, Arabic numerals, letters, and other heteromorphic words in the text data set, because heteromorphic words that have no actual meaning will reduce the text classification effect.
  • the stop words are frequently used words that have no practical meaning and no effect on text analysis, such as common pronouns and prepositions.
  • the cleaning is to construct a table of heteromorphic words in advance, sequentially traverse the words in the text data set, and if the words are the same as those in the table of heteromorphic words, remove them until the traversal is completed.
  • the word segmentation is to segment each sentence in the text data set to obtain a single word. Because there is no clear separation mark between words in Chinese representation, word segmentation is indispensable.
  • the word segmentation described in this application can be processed using the Jieba ("stutterer") word segmentation library, available for programming languages such as Python and JAVA.
  • the Jieba library is developed around Chinese part-of-speech features: it converts the number of occurrences of each word in the text data set into a frequency, finds the maximum-probability path based on dynamic programming, and thereby finds the maximum segmentation combination based on word frequency.
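The frequency-and-dynamic-programming segmentation described above can be sketched in miniature. This is a hedged toy illustration, not the Jieba library itself: the dictionary `freq`, the floor probability for unknown characters, and the 6-character word-length cap are all invented for the example.

```python
import math

def segment(sentence, freq):
    """Pick the maximum-probability segmentation of `sentence` by dynamic programming."""
    total = sum(freq.values())

    def logp(w):
        # log-probability of a known word; unknown single chars get a small floor count
        return math.log(freq.get(w, 0.5) / total)

    n = len(sentence)
    best = [0.0] + [-math.inf] * n   # best[i] = best log-probability for sentence[:i]
    back = [0] * (n + 1)             # back[i] = start index of the last word in that path
    for i in range(1, n + 1):
        for j in range(max(0, i - 6), i):  # cap candidate word length at 6 characters
            w = sentence[j:i]
            if w in freq or i - j == 1:
                score = best[j] + logp(w)
                if score > best[i]:
                    best[i], back[i] = score, j
    # recover the maximum-probability path
    words, i = [], n
    while i > 0:
        words.append(sentence[back[i]:i])
        i = back[i]
    return words[::-1]

freq = {"人工": 10, "智能": 8, "人工智能": 20, "提取": 5, "主旨": 5}
print(segment("人工智能提取主旨", freq))
```

Because "人工智能" has a higher frequency than "人工" plus "智能", the dynamic program keeps it as a single word, mirroring the "maximum segmentation combination based on word frequency" described above.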
  • the merging is to merge multiple sentences with the same subject to achieve the purpose of greatly reducing the words in the text data set.
  • the merging includes: traversing each text in the text data set, dividing the text by paragraphs to obtain several paragraphs, presetting words that appear more than twice in each paragraph as hypothetical subjects, constructing a log-likelihood function based on the conditional probability model of each sentence in each paragraph and the hypothetical subjects, optimizing the conditional probability model based on the log-likelihood function to obtain the subject of each sentence, and combining several sentences with the same subject into one sentence to complete the merging operation.
  • In the conditional probability model:
  • y_1, ..., y_N are the hypothetical subjects, y_i being the i-th hypothetical subject, and N is the number of hypothetical subjects;
  • D is the paragraph, and j is the index of the paragraph, e.g., D_1 is the first paragraph of the text;
  • s is a sentence in the paragraph;
  • p(y_i | s) is the probability that the hypothetical subject y_i is the subject of the sentence s;
  • s(i, y_i) denotes that the hypothetical subject of sentence i is y_i.
  • In the log-likelihood function, argmax selects the hypothetical subject that maximizes the partial derivative of the conditional probability model over all the hypothetical subjects.
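The merging step can be sketched roughly as follows. This is a simplified stand-in for the conditional-probability assignment above: instead of optimizing a log-likelihood, the sketch simply assigns each sentence the first hypothetical subject (a word appearing more than twice in the paragraph) that it contains. The function name `merge_paragraph` and the sample sentences are hypothetical.

```python
from collections import Counter

def merge_paragraph(sentences):
    """Merge sentences that share a hypothetical subject into one sentence."""
    counts = Counter(w for s in sentences for w in s.split())
    subjects = {w for w, c in counts.items() if c > 2}  # words appearing more than twice
    groups = {}  # subject -> merged sentence, in first-seen order
    for s in sentences:
        # stand-in for the conditional-probability assignment: take the first
        # hypothetical subject that occurs in the sentence, else the sentence itself
        subj = next((w for w in s.split() if w in subjects), s)
        groups[subj] = (groups[subj] + " " + s) if subj in groups else s
    return list(groups.values())

para = ["the model reads text", "the model encodes words",
        "the model outputs a gist", "training stops early"]
print(merge_paragraph(para))
```

The three sentences sharing the subject words are collapsed into one, greatly reducing the number of sentences, which is the stated purpose of the merging operation.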
  • the word text set is converted into a word matrix set after an encoding operation, and the word matrix set is input into a word vector conversion model for training to obtain a word vector set.
  • the encoding adopts a one-hot form: each word in the word text set is first numbered, and the largest number is recorded;
  • an encoding matrix whose dimension equals that largest number is then created; each sentence in the word text set is traversed in turn and mapped onto the encoding matrix, and the encoding operation is completed according to the number of each word in the word text set, yielding the word matrix set.
  • For example, a sentence in the word text set is: "when people know how to exchange with the system, they can tell their true self and the truth; this is reality."
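A minimal sketch of the one-hot encoding step, assuming whitespace-tokenized sentences. The function name `one_hot_encode` and the example sentences are illustrative; each sentence is mapped to a 0/1 row whose dimension equals the number of distinct words, as described above.

```python
import numpy as np

def one_hot_encode(sentences):
    """Number each distinct word, then map each sentence onto a binary vector."""
    vocab = {}
    for s in sentences:
        for w in s.split():
            vocab.setdefault(w, len(vocab))  # assign numbers in first-seen order
    mats = []
    for s in sentences:
        row = np.zeros(len(vocab), dtype=int)
        for w in s.split():
            row[vocab[w]] = 1  # set the position of each word's number to 1
        mats.append(row)
    return vocab, np.array(mats)

vocab, mats = one_hot_encode(["people know the truth", "the truth is reality"])
print(mats)
```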
  • the word vector conversion model includes assuming a weight relationship between a word matrix in the word matrix set and a word word vector in the word vector set, and calculating the weight based on the weight relationship to complete the The conversion process from the word matrix set to the word vector set.
  • In the weight relationship:
  • d is the word matrix set;
  • t_1, t_2, ..., t_n are word matrices in the word matrix set, such as the above [0,0,0,0,0,0,0,0,0,0,0,0,1,1];
  • w_1, w_2, ..., w_n are the weights of the corresponding word matrices;
  • f_i represents the number of occurrences of the word matrix in the word matrix set;
  • N is the total number of texts in the text data set, and N_j represents the total number of words in the text data set;
  • N_i represents the number of occurrences of word i in the text data set;
  • F_m is the weighting factor, whose value is generally less than 1.
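The weight relationship's exact formula is not reproduced in this text, but its ingredients (occurrence counts f_i, the text count N, and a weighting factor) resemble TF-IDF weighting. The sketch below is therefore a hedged TF-IDF-style stand-in, not the patent's own formula; `tfidf_weights` and the sample documents are invented for the example.

```python
import math

def tfidf_weights(docs):
    """TF-IDF-style weights in the spirit of the weight relationship above."""
    N = len(docs)
    df = {}  # document frequency: in how many texts each word occurs
    for d in docs:
        for w in set(d):
            df[w] = df.get(w, 0) + 1
    weights = []
    for d in docs:
        w_d = {}
        for w in d:
            tf = d.count(w) / len(d)          # occurrences converted to a frequency
            idf = math.log(N / df[w]) + 1.0   # rarer words get larger weights
            w_d[w] = tf * idf
        weights.append(w_d)
    return weights

docs = [["ai", "text", "gist"], ["ai", "model"], ["text", "text", "model"]]
print(tfidf_weights(docs)[0])
```

A word occurring in only one text ("gist") ends up weighted higher than a word shared across texts ("ai"), which is the usual effect such weighting schemes aim for.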
  • the dimensionality reduction operation includes calculating the covariance between the word vectors in the word vector set and removing word vectors whose covariance exceeds a preset covariance threshold in absolute value, to obtain a dimensionality-reduced word vector set.
  • x_i, x_j denote word vectors in the word vector set, and n is the number of word vectors in the set;
  • cov(x_i, x_j) denotes the covariance between x_i and x_j: if it is greater than 0, the two word vectors are positively correlated, and if it is less than 0, they are negatively correlated.
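The covariance-based pruning can be sketched as follows, using NumPy's `np.cov`. The threshold value and the example vectors are invented for illustration; a vector strongly correlated with an already-kept vector is treated as redundant and dropped.

```python
import numpy as np

def drop_redundant(vectors, threshold=0.9):
    """Keep only word vectors whose covariance with every kept vector stays below the threshold."""
    kept = []
    for i, v in enumerate(vectors):
        # np.cov(a, b) returns the 2x2 covariance matrix; [0, 1] is cov(a, b)
        if all(abs(np.cov(vectors[j], v)[0, 1]) <= threshold for j in kept):
            kept.append(i)
    return vectors[kept]

vecs = np.array([[1.0, 2.0, 3.0],
                 [2.0, 4.0, 6.0],   # strongly covariant with the first vector
                 [3.0, 1.0, 2.0]])
reduced = drop_redundant(vecs)
print(reduced.shape)
```

The second vector is a scaled copy of the first, so its covariance with it (2.0) exceeds the threshold and it is removed, leaving two vectors.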
  • the convolutional neural network model includes an input layer, a convolutional layer, a pooling layer, a fully connected layer, and an output layer.
  • the input layer receives the word vector set; the convolutional layer, pooling layer, and fully connected layer, combined with the activation function, are trained to obtain the training value, which is output through the output layer.
  • in the preferred embodiment of this application, the activation function may include a Softmax function, and the loss function is a least squares function.
  • the Softmax function is O_j = e^{I_j} / Σ_{k=1}^{t} e^{I_k}, where:
  • O_j represents the output value of the j-th neuron in the fully connected layer;
  • I_j represents the input value of the j-th neuron in the output layer;
  • t represents the total number of neurons in the output layer;
  • e is Euler's number, an infinite non-repeating decimal.
  • the least squares loss L(s) is L(s) = Σ_{i=1}^{k} (y_i − y′_i)², where:
  • s is the training value;
  • k is the number of word vectors after dimensionality reduction;
  • y_i is a word vector in the word vector set;
  • y′_i is the predicted value of the convolutional neural network model.
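The Softmax activation, the least-squares loss, and the threshold comparison that decides when training stops can be sketched together. The input values and the threshold 0.5 are invented for the example; only the formulas follow the definitions above.

```python
import numpy as np

def softmax(I):
    # numerically stable form of O_j = e^{I_j} / sum_k e^{I_k}
    e = np.exp(I - I.max())
    return e / e.sum()

def least_squares_loss(y, y_pred):
    # L = sum_i (y_i - y'_i)^2
    return np.sum((y - y_pred) ** 2)

I = np.array([2.0, 1.0, 0.1])       # example inputs to the output layer
O = softmax(I)
loss = least_squares_loss(np.array([1.0, 0.0, 0.0]), O)
threshold = 0.5
# training would continue while the loss exceeds the threshold, and stop once below it
print(O, loss < threshold)
```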
  • For example, the convolutional neural network model outputs the following article subject after training: an article describing ancient literary inquisitions exposes the feudal rulers' tyranny against the literati, and shows the author's deep sympathy for intellectuals and strong resentment of the brutal rule.
  • the invention also provides an article subject extraction device based on artificial intelligence.
  • FIG. 2 it is a schematic diagram of the internal structure of an artificial intelligence-based article subject extraction device provided by an embodiment of the present application.
  • the artificial intelligence-based article subject extraction device 1 may be a PC (Personal Computer, personal computer), or a terminal device such as a smart phone, a tablet computer, or a portable computer, or a server.
  • the artificial intelligence-based article subject extraction device 1 at least includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
  • the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, and the like.
  • the memory 11 may be an internal storage unit of the article subject extraction device 1 based on artificial intelligence, such as the hard disk of the artificial intelligence-based article subject extraction device 1.
  • the memory 11 may also be an external storage device of the artificial intelligence-based article subject extraction device 1, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or Flash Card equipped on the device.
  • the memory 11 may also include both an internal storage unit of the article subject extraction device 1 based on artificial intelligence and an external storage device.
  • the memory 11 can be used not only to store application software and various data installed in the artificial intelligence-based article subject extraction device 1, such as the code of the artificial intelligence-based article subject extraction program 01, but also to temporarily store data that has been output or is to be output.
  • the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, used to run program code stored in the memory 11 or process data, such as executing the artificial intelligence-based article subject extraction program 01.
  • the communication bus 13 is used to realize the connection and communication between these components.
  • the network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is usually used to establish a communication connection between the apparatus 1 and other electronic devices.
  • the device 1 may also include a user interface.
  • the user interface may include a display (Display) and an input unit such as a keyboard (Keyboard).
  • the optional user interface may also include a standard wired interface and a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, etc.
  • the display can also be appropriately called a display screen or a display unit, which is used to display the information processed in the artificial intelligence-based article subject extraction device 1 and to display a visualized user interface.
  • Figure 2 only shows the artificial intelligence-based article subject extraction device 1 with components 11-14 and the artificial intelligence-based article subject extraction program 01; the structure shown in Figure 2 does not constitute a limitation on the device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
  • the memory 11 stores an artificial intelligence-based article subject extraction program 01; when the processor 12 executes the artificial intelligence-based article subject extraction program 01 stored in the memory 11, the following steps are implemented:
  • Step 1 Receive a text data set, and perform operations including word segmentation and merging on the text data set to obtain a word text set.
  • The text data set, the cleaning, word segmentation, and merging operations, and the conditional probability model and log-likelihood function are as described in step S1 of the method embodiment above.
  • Step 2 Perform an encoding operation on the word text set and turn it into a word matrix set, and input the word matrix set into a word vector conversion model for training to obtain a word vector set.
  • The one-hot encoding operation and the word vector conversion model, including the weight relationship, are as described in the method embodiment above.
  • Step 3: After performing the dimensionality reduction operation on the word vector set, input it into the convolutional neural network model for training to obtain a training value, and compare the training value with a preset threshold; if the training value is greater than the preset threshold, the convolutional neural network model continues training, and if the training value is less than the preset threshold, the convolutional neural network model completes training.
  • The dimensionality reduction operation, the structure of the convolutional neural network model, the Softmax activation function, and the least squares loss function are as described in the method embodiment above.
  • Step 4 Receive text data input by the user, convert the text data input by the user into a word vector, and input it into the trained convolutional neural network model to obtain and output the subject of the article.
  • The trained convolutional neural network model then outputs the article subject, as in the example given in the method embodiment above.
  • the artificial intelligence-based article subject extraction program can also be divided into one or more modules, and the one or more modules are stored in the memory 11 and executed by one or more processors (in this embodiment, the processor 12) to complete this application.
  • the module referred to in this application is a series of computer program instruction segments capable of completing specific functions, used to describe the execution process of the artificial intelligence-based article subject extraction program in the article subject extraction device.
  • FIG. 3 is a schematic diagram of the program modules of the artificial intelligence-based article subject extraction program in an embodiment of the artificial intelligence-based article subject extraction device of this application.
  • In this embodiment, the artificial intelligence-based article subject extraction program can be divided into a data receiving module 10, a word vector solving module 20, a model training module 30, and an article subject output module 40. Illustratively:
  • the data receiving module 10 is used for receiving a text data set, and performing operations including word segmentation and merging on the text data set to obtain a word text set.
  • the word vector solving module 20 is configured to: perform an encoding operation on the word text set and convert it into a word matrix set, and input the word matrix set into a word vector conversion model for training to obtain a word vector set.
  • the model training module 30 is configured to: perform a dimensionality reduction operation on the word vector set and input it into a convolutional neural network model for training to obtain a training value, and compare the training value with a preset threshold; if the training value is greater than the preset threshold, the convolutional neural network model continues training, and if the training value is less than the preset threshold, the convolutional neural network model completes training.
  • the article subject output module 40 is configured to receive text data input by a user, convert the text data input by the user into a word vector and input it into the trained convolutional neural network model to obtain and output the article subject.
  • the embodiment of the present application also proposes a computer-readable storage medium that stores an artificial intelligence-based article subject extraction program, and the artificial intelligence-based article subject extraction program can be executed by one or more processors to achieve the following operations:
  • a text data set is received, and operations including word segmentation and merging are performed on the text data set to obtain a word text set.
  • the word text set is converted into a word matrix set after an encoding operation, and the word matrix set is input into a word vector conversion model for training to obtain a word vector set.
  • after the dimensionality reduction operation is performed on the word vector set, it is input into a convolutional neural network model for training to obtain a training value, and the training value is compared with a preset threshold; if the training value is greater than the preset threshold, the convolutional neural network model continues to be trained, and if the training value is less than the preset threshold, the convolutional neural network model completes the training.
  • the text data input by the user is received, and the text data input by the user is converted into a word vector and then input into the trained convolutional neural network model to obtain and output the subject matter of the article.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application relates to an artificial intelligence-based article gist extraction method, comprising: receiving a text data set, and performing word segmentation and merging operations on the text data set to obtain a word text set; performing an encoding operation on the word text set and converting it into a word matrix set, and inputting the word matrix set into a word vector conversion model for training to obtain a word vector set; performing a dimensionality reduction operation on the word vector set and then inputting it into a convolutional neural network model for training; and converting text data input by the user into word vectors and then inputting these into the trained convolutional neural network model to obtain and output the article gist. The application also relates to an artificial intelligence-based article gist extraction device and a computer-readable storage medium. The method achieves accurate and efficient artificial intelligence-based article gist extraction.
PCT/CN2019/116936 2019-09-02 2019-11-10 Artificial intelligence-based article gist extraction method and device, and storage medium WO2021042517A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910826795.4A CN110705268A (zh) 2019-09-02 2019-09-02 Artificial intelligence-based article gist extraction method and device, and computer-readable storage medium
CN201910826795.4 2019-09-02

Publications (1)

Publication Number Publication Date
WO2021042517A1 true WO2021042517A1 (fr) 2021-03-11

Family

ID=69193514

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116936 WO2021042517A1 (fr) 2019-09-02 2019-11-10 Artificial intelligence-based article gist extraction method and device, and storage medium

Country Status (2)

Country Link
CN (1) CN110705268A (fr)
WO (1) WO2021042517A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111651652B (zh) * 2020-04-30 2023-11-10 Ping An Property & Casualty Insurance Company of China, Ltd. Artificial intelligence-based emotional tendency recognition method, apparatus, device, and medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509413A (zh) * 2018-03-08 2018-09-07 Ping An Technology (Shenzhen) Co., Ltd. Automatic abstract extraction method, apparatus, computer device, and storage medium
CN109086340A (zh) * 2018-07-10 2018-12-25 Taiyuan University of Technology Evaluation object recognition method based on semantic features
CN110110330A (zh) * 2019-04-30 2019-08-09 Tencent Technology (Shenzhen) Co., Ltd. Text-based keyword extraction method and computer device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10216724B2 (en) * 2017-04-07 2019-02-26 Conduent Business Services, Llc Performing semantic analyses of user-generated textual and voice content
CN110019793A (zh) * 2017-10-27 2019-07-16 Alibaba Group Holding Ltd. Text semantic encoding method and device
CN109871532B (zh) * 2019-01-04 2022-07-08 Ping An Technology (Shenzhen) Co., Ltd. Text topic extraction method, device, and storage medium
CN110059191A (zh) * 2019-05-07 2019-07-26 Shandong Normal University Text sentiment classification method and device


Also Published As

Publication number Publication date
CN110705268A (zh) 2020-01-17

Similar Documents

Publication Publication Date Title
WO2021068329A1 Chinese named-entity recognition method, device, and computer-readable storage medium
WO2020224213A1 Sentence intention identification method, device, and computer-readable storage medium
CN109190120B Neural network training method and apparatus, and named entity recognition method and apparatus
US20230195773A1 Text classification method, apparatus and computer-readable storage medium
WO2021169116A1 Intelligent missing-data filling method, apparatus and device, and storage medium
WO2020237856A1 Knowledge graph-based intelligent question answering method and apparatus, and computer storage medium
CN111143576A Event-oriented dynamic knowledge graph construction method and apparatus
WO2021121198A1 Semantic similarity-based entity relation extraction method and apparatus, device, and medium
WO2020253042A1 Intelligent sentiment evaluation method and device, and computer-readable storage medium
CN111222305A Information structuring method and apparatus
WO2020147409A1 Text classification method and apparatus, computer device, and storage medium
WO2021056710A1 Multi-turn question-and-answer identification method, device, computer apparatus, and storage medium
US20220318515A1 Intelligent text cleaning method and apparatus, and computer-readable storage medium
CN113360654B Text classification method and apparatus, electronic device, and readable storage medium
CN113378970B Sentence similarity detection method and apparatus, electronic device, and storage medium
CN111984792A Website classification method and apparatus, computer device, and storage medium
CN112287069A Speech semantics-based information retrieval method and apparatus, and computer device
WO2020248366A1 Intelligent text intention classification method and device, and computer-readable storage medium
CN114612921B Form recognition method and apparatus, electronic device, and computer-readable medium
CN114547315A Case classification prediction method and apparatus, computer device, and storage medium
CN113627797A New-employee profile generation method and apparatus, computer device, and storage medium
CN114780746A Knowledge graph-based document retrieval method and related device
CN108268629B Keyword-based image description method and apparatus, device, and medium
CN113947095A Multilingual text translation method and apparatus, computer device, and storage medium
CN115730597A Multi-level semantic intention recognition method and related device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19944319

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19944319

Country of ref document: EP

Kind code of ref document: A1