WO2023035330A1 - Long text event extraction method and apparatus, computer device and storage medium - Google Patents

Long text event extraction method and apparatus, computer device and storage medium Download PDF

Info

Publication number
WO2023035330A1
WO2023035330A1 PCT/CN2021/120030 CN2021120030W WO2023035330A1 WO 2023035330 A1 WO2023035330 A1 WO 2023035330A1 CN 2021120030 W CN2021120030 W CN 2021120030W WO 2023035330 A1 WO2023035330 A1 WO 2023035330A1
Authority
WO
WIPO (PCT)
Prior art keywords
event
text
long text
truncated
role
Prior art date
Application number
PCT/CN2021/120030
Other languages
English (en)
Chinese (zh)
Inventor
谢翀
罗伟杰
陈永红
黄开梅
Original Assignee
深圳前海环融联易信息科技服务有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海环融联易信息科技服务有限公司 filed Critical 深圳前海环融联易信息科技服务有限公司
Publication of WO2023035330A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks

Definitions

  • the present application relates to the field of computer technology, in particular to a long text event extraction method, device, computer equipment and storage medium.
  • the existing event extraction methods for long text generally have a relatively simple definition of events.
  • some financial public opinion analysis platforms mainly extract the main event roles of financial texts, display them through keywords and other forms, and at the same time evaluate the emotional tendency of the entire text.
  • This type of platform mainly applies simple event classification together with NER (Named Entity Recognition) to extract events from long text.
  • Event classification assigns classification labels to the original text, and the same text may carry multiple labels; named entity recognition identifies and extracts keyword information that may exist in the original text, such as company names and times.
  • A second, related approach is relation extraction for shorter texts. It is mainly applied to the title, abstract, summary, etc. of an article, and focuses on the subject, the object, and the relationship between them.
  • This type of method mainly applies relation extraction technology, for which there are two broad implementations. The first uses named entity recognition to identify the subjects in the text, and then extracts the objects and their relationships through other models; the second uses named entity recognition to extract the subjects and objects at the same time, and if there are multiple subjects or objects, the different subjects and objects must be paired and grouped through a binary classification model.
  • For the first existing method mentioned above, the extracted information is limited. For example, in a long text of the "company listing" type, the existing method mainly focuses on the specific listed company and the listing time; other important information such as "financing scale", "listed market value", and "financing round" is neither extracted nor displayed. Secondly, the existing methods only remind users at the level of sentiment classification, with no corresponding reminders regarding importance, timeliness, authority, and so on.
  • As for the second, relation extraction method mentioned above, extracting only the subject, the object, and their association is overly simple.
  • Its application is also relatively narrow: limited to simple information extraction, this method is generally used only on short texts, which greatly restricts its scope of application.
  • Moreover, relation extraction requires the subject and object to exist at the same time, while real-world text often lacks one of them. For example, in "Company A goes public" there is only the subject "Company A" and no corresponding object, so the method cannot be applied.
  • In short, the second, relation extraction method has significant limitations.
  • Embodiments of the present application provide a long text event extraction method, device, computer equipment, and storage medium, aiming at improving the efficiency and accuracy of event extraction for long text.
  • the embodiment of the present application provides a long text event extraction method, including:
  • the embodiment of the present application provides a long text event extraction device, including:
  • the first truncation unit is configured to obtain trigger words in the long text of the event to be extracted, and perform text truncation on the long text according to the trigger words to obtain the truncated text;
  • a first classification prediction unit configured to use a deep learning model to classify and predict multiple event types corresponding to the truncated text
  • the first extraction unit is used to combine machine reading comprehension technology and pointer network model to extract corresponding event role information for each event type;
  • the result output unit is configured to combine all the event role information into a target event based on a sequence generation algorithm, and output the target event as an event extraction result.
  • An embodiment of the present application provides a computer device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; when the processor executes the computer program, the long text event extraction method described in the first aspect is realized.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the long text event extraction method described in the first aspect is implemented.
  • An embodiment of the present application provides a long text event extraction method, device, computer equipment, and storage medium, the method including: acquiring trigger words in the long text of the event to be extracted, and performing text truncation on the long text according to the trigger words to obtain truncated texts; using a deep learning model to classify and predict the multiple event types corresponding to each truncated text; combining machine reading comprehension technology and a pointer network model to extract the corresponding event role information for each event type; and, based on a sequence generation algorithm, combining all the event role information into a target event and outputting the target event as the event extraction result.
  • the embodiments of the present application improve the event extraction efficiency and extraction accuracy for long texts by performing event classification, event role extraction, and event combination on long texts.
  • Fig. 1 is a schematic flow chart of a long text event extraction method provided by the embodiment of the present application
  • Fig. 2 is a schematic subflow diagram of a long text event extraction method provided in the embodiment of the present application
  • FIG. 3 is a schematic subflow diagram of a long text event extraction method provided in an embodiment of the present application.
  • FIG. 4 is a schematic block diagram of a long text event extraction device provided in an embodiment of the present application.
  • FIG. 5 is a sub-schematic block diagram of a long text event extraction device provided in an embodiment of the present application.
  • FIG. 6 is a sub-schematic block diagram of an apparatus for extracting long text events provided by an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a method for extracting long text events provided by an embodiment of the present application, which specifically includes steps S101 to S104.
  • the event extraction process is specifically divided into three stages: event classification, event role extraction, and event combination.
  • In the event classification stage, trigger words are used to truncate the long text, and a deep learning model then classifies and predicts on the truncated texts.
  • In the event role extraction stage, since the truncated texts and their event classification information were obtained in the event classification stage, the event role information must be extracted for each event type; that is, an MRC (Machine Reading Comprehension) + pointer network strategy is used to extract the event role information.
  • In the event combination stage, the model extraction of the first two stages has produced all the event roles of each truncated text belonging to a certain event type, so this stage combines all the event role information into a complete event, that is, the target event.
  • This embodiment improves the event extraction efficiency and extraction accuracy for long texts by performing event classification, event role extraction, and event combination on long texts.
  • the long text described in this embodiment may be papers, news reports, magazines and so on.
  • the event extraction for news reports is more detailed, which can support more fine-grained queries and reduce the time for users to read the original text.
  • the order of importance of event roles is provided, so that users can selectively focus on some key points.
  • this embodiment adopts related technologies of deep learning, which greatly saves the workload of later operations and audits.
  • the F1 of the whole process has reached 0.7+.
  • The current evaluation index is the whole-process F1: starting from the raw text input, n events are output, and each event outputs m event roles.
  • The calculation formula of F1 is 2 * (p * r) / (p + r), where p is the precision, i.e. the proportion of the m * n predicted event roles that are correct, and r is the recall, i.e. the number of correctly predicted event roles relative to the total number of labels.
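For illustration, the whole-process F1 described above can be computed over sets of (event type, role, value) triples. This is a minimal sketch; the triple representation and the example values are assumptions for illustration, not taken from the application:

```python
def whole_process_f1(predicted, gold):
    """Micro F1 over (event type, role, value) triples.

    p (precision) is the fraction of predicted role fillers that are correct,
    r (recall) is the fraction of gold role fillers that were recovered,
    and F1 = 2 * (p * r) / (p + r).
    """
    pred_set, gold_set = set(predicted), set(gold)
    correct = len(pred_set & gold_set)
    if correct == 0:
        return 0.0
    p = correct / len(pred_set)
    r = correct / len(gold_set)
    return 2 * (p * r) / (p + r)

# Toy example: 3 predicted role fillers, 4 in the gold annotation, 2 overlap,
# so p = 2/3, r = 2/4, and F1 = 4/7.
pred = [("listing", "company", "A"), ("listing", "time", "2021"),
        ("listing", "market", "NYSE")]
gold = [("listing", "company", "A"), ("listing", "time", "2021"),
        ("listing", "round", "IPO"), ("listing", "value", "1B")]
```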
  • the step S101 includes: steps S201-S204.
  • S203 Construct a discrete interval according to the total number of words between different trigger words, and select the interval with the largest number of words distributed based on the discrete interval;
  • In the event classification stage, news reports present two major pain points: the text is too long, and the types of events contained are various.
  • For pain point 1, that is, the excessive text length:
  • A trigger word means that if the corresponding keyword appears in the text, there is a certain probability that the corresponding type of event is present.
  • text truncation is mainly combined with event trigger words.
  • The specific method is: first find all the trigger words in the text, and then truncate sentences within a certain word count threshold before and after each trigger word.
  • The word count threshold is mainly determined through statistics. Since Chinese pre-training models generally limit the maximum input text length in order to guarantee their effect, the original text needs to be truncated.
  • the specific process is:
  • The specific word counts are discretized into intervals, such as (below 50 words), (50–100 words), etc.; the distribution over each interval is counted, and finally the word count interval with the largest share of the distribution is selected as the word count threshold for text truncation before and after the trigger word.
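The statistics-driven threshold selection and the truncation itself could be sketched as follows. This is only an illustrative reading of the procedure: the bucket width of 50, the use of character distances, and both function names are assumptions, not details from the application:

```python
import re
from collections import Counter

def choose_threshold(texts, trigger_words, bucket=50):
    """Statistically pick the word-count threshold for truncation.

    For every pair of consecutive trigger-word hits, the distance between them
    is discretized into fixed-width intervals (0-49, 50-99, ...); the upper
    edge of the most populated interval becomes the threshold.
    """
    gaps = []
    for text in texts:
        hits = sorted(m.start() for w in trigger_words
                      for m in re.finditer(re.escape(w), text))
        gaps += [b - a for a, b in zip(hits, hits[1:])]
    if not gaps:
        return bucket  # fallback when fewer than two triggers are found
    densest = Counter(g // bucket for g in gaps).most_common(1)[0][0]
    return (densest + 1) * bucket

def truncate(text, trigger_words, threshold):
    """Cut a context window of `threshold` characters around each trigger."""
    return [text[max(0, m.start() - threshold):m.end() + threshold]
            for w in trigger_words
            for m in re.finditer(re.escape(w), text)]
```

Each trigger occurrence then yields one truncated segment, which is what the event classification stage consumes.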
  • the step S102 includes: steps S301 to S304.
  • The training and prediction structure of the deep learning model is modified, and a multi-label classification technique is applied to ensure that each truncated text can be predicted as multiple event types. The specific process is:
  • This embodiment splices each truncated text with each event type, separated by special characters. For example, with 10 event types, an original single training text becomes 10 training texts. The corresponding training label then becomes a binary classification label; that is, the training goal of the model is reduced to judging whether the text belongs to a given event label, which also alleviates the problem of small sample size.
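The label-splicing step above can be sketched as a small data-expansion helper. The `[SEP]` separator and the function name are assumptions for illustration; the application only specifies "special characters":

```python
def expand_with_labels(text, event_types, gold_types=(), sep="[SEP]"):
    """Expand one multi-label sample into len(event_types) binary samples.

    Each output pairs "<text> [SEP] <event label>" with a 0/1 target that
    says whether the text belongs to that event type.
    """
    gold = set(gold_types)
    return [(f"{text} {sep} {etype}", int(etype in gold))
            for etype in event_types]

# One training text and 3 event types yield 3 binary training samples.
samples = expand_with_labels(
    "Company A listed on the stock exchange.",
    event_types=["company listing", "merger", "bankruptcy"],
    gold_types=["company listing"],
)
```

At prediction time the same expansion is applied, and the event types whose binary classifier outputs 1 are collected as the text's predicted types.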
  • At the model level, some changes are made to adapt to this process. The model no longer convolves the original text alone, but convolves the original text spliced with the event labels. Since the semantics of the spliced text may differ considerably, this embodiment retains the original convolution kernels with stride 1 and adds a small number of convolution kernels with stride 2, so as to improve the ability to extract information from distant parts of the text.
  • This embodiment also modifies the final loss calculation. The original model handled multi-label text, so the original loss calculation is no longer suitable for the binary classification model; at the same time, to avoid the large number of negative samples produced by the binary reformulation, this embodiment adopts the focal-loss function, which effectively prevents the model from tending to fit the negative samples when a binary classification loss faces too many negatives.
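For reference, the binary focal loss mentioned above has the following standard form. The values gamma=2 and alpha=0.25 are the common defaults from the focal-loss literature, not values stated in the application:

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for a single prediction.

    p is the predicted positive-class probability, y the 0/1 label.  The
    (1 - p_t) ** gamma factor down-weights easy, already-well-classified
    samples, so the flood of easy negatives created by label splicing
    cannot dominate the gradient.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, 1e-12))
```

An easy negative (p near 0) contributes almost nothing, while a hard negative (p near 1) is penalized heavily, which is exactly the behaviour the embodiment relies on.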
  • During prediction, all event types are likewise spliced after the original text; for example, a single text to be predicted is expanded into 10 prediction texts.
  • The model then obtains a binary classification result for each event type; after post-processing and collecting all the event types predicted as 1, all the event types of the text are obtained.
  • With the model-level transformation, the feedforward calculation at prediction time is consistent with the training stage, including the small number of stride-2 convolution kernels, mainly to ensure that the parameters from the training stage can be fully reproduced during prediction.
  • The prediction output does not need to pass through the focal-loss calculation; the result of the previous layer's activation function can be output directly.
  • the step S103 includes:
  • For this, this embodiment adopts the strategy of MRC (Machine Reading Comprehension) + pointer network.
  • MRC technology mainly adopts an overall question-and-answer structure, that is, splicing a question after the input truncated text. This greatly enriches the truncated text, and after the question is added, the model can focus more on extracting the role information of the event in question.
  • The most important training goal of event role recognition is to obtain the start position and end position of each role in the truncated text; however, the text between one role's start and end positions may itself be another event role. For example, "Shenzhen" in "Shenzhen Huawei Technology Company" is both part of the company name and the region where the company is located.
  • the traditional event role recognition technology cannot solve this problem well.
  • The pointer network mainly uses two sets of label values to fit the start position and the end position respectively, and each event role has its own two independent label lists for isolation. The model predicts two sets of values for each event role separately and calculates the loss against the two label lists respectively, thereby ensuring that an optimal solution can be obtained for each event role.
  • the input of the pointer network is still the truncated text of the concatenated questions under the MRC structure.
  • the pointer network will construct two label lists of length 100.
  • the first label list is mainly responsible for predicting the starting position of the event role. Each position will output the probability value of whether it is the starting position, and find the position with the highest probability value as the starting position of the event role.
  • the specific process can have a variety of basic networks.
  • For example, the Transformer encoder can be used. Transformer is widely used in the NLP field; it has powerful feature transformation and processing capabilities, and can extract both the surface syntactic information and the deep semantic information of the input text. The overall process is similar to a pointer moving back and forth over the text of length 100 until the start position is found.
  • The second label list works in the same way as the first, except that the fitting target is changed from the start position to the end position of the event role.
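Decoding the two label lists into a role span can be sketched as follows. The toy probabilities, the list length of 8 (the application uses 100), and the rule that the end must not precede the start are assumptions for illustration; the encoder producing the probabilities is outside this sketch:

```python
def decode_span(start_probs, end_probs, text):
    """Decode one event-role span from the two pointer-network label lists.

    start_probs / end_probs are the per-position probability lists the
    pointer network outputs for a single event role: one list fits the
    start position, the other the end position.
    """
    # Highest-probability start position.
    start = max(range(len(start_probs)), key=start_probs.__getitem__)
    # Highest-probability end position at or after the start.
    end = max(range(start, len(end_probs)), key=end_probs.__getitem__)
    return text[start:end + 1]

# Toy decoding over the example text "深圳华为技术公司".
text = "深圳华为技术公司"
start_probs = [0.10, 0.05, 0.80, 0.02, 0.01, 0.01, 0.005, 0.005]
end_probs   = [0.01, 0.01, 0.02, 0.03, 0.05, 0.05, 0.080, 0.750]
```

Because each event role has its own pair of lists, overlapping roles (such as "深圳" as both region and part of the company name) decode independently without interfering with each other.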
  • In this way, this embodiment uses the pointer network to convert the multi-label recognition problem into a large number of single-label binary classification problems, avoiding information confusion.
  • This embodiment adopts MRC technology, which mainly transforms the original text by splicing the question text onto it and feeding them together into the pre-trained language model.
  • the model needs to predict the location of the answer to the question text.
  • the question text is strongly related to the event type, so it can realize the strong constraint of the event type on the event role, and ensure that the event role information under each event conforms to the rules formulated by domain experts.
  • the sequence generation algorithm is DOC2EDAG algorithm.
  • EDAG stands for Entity-based Directed Acyclic Graph. The algorithm organizes the series of event roles extracted from the long text into a directed acyclic graph; that is, it generates a sequence composed of event roles as a single event.
  • the step S104 includes:
  • The pain point of the event combination stage is that any event role of any event may correspond to one entity, multiple entities, or even no entity at all, so pairing and combination face extremely complex logic.
  • this pain point is mainly dealt with by rules in the industry, and there are certain models in the academic world.
  • This embodiment is based on the DOC2EDAG algorithm, which converts event combination into a sequence generation task. Specifically, for each event type, an order is defined over all its subordinate event roles, and each event role is updated step by step.
  • the standard for defining the order can be determined by domain knowledge experts, and the standard is the role importance ordering under a single event dimension. For example, the importance of roles in the "company listing" event is: listed company, listing link, listing stock exchange, listing time and so on.
  • updating the state of the event role subordinate to each event type through a state variable includes:
  • the comprehensive judgment is mainly determined by the fully connected layer of the neural network.
  • The main process is: the node feature e of the newly added event role node is transformed through a fully connected layer, then spliced with the current state variable, and then passed through another fully connected layer and an activation function to obtain the probability that the event role node matches the event role.
  • the event role node with the highest matching probability value is selected as the prediction result of the event role.
  • Each event role node may be a real entity, or it may be a null value, and finally the common prefixes are merged to form each individual event.
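One matching step of this process can be sketched in plain Python. The toy weights, dimensions, and candidate names are purely illustrative assumptions; real DOC2EDAG-style models learn these parameters, and the null candidate stands in for the "no entity" case:

```python
import math
import random

random.seed(0)

def fc(weights, vec):
    """One fully connected layer without bias: weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def match_probability(node_feat, state, w1, w2):
    """Probability that a candidate node fills the current event role.

    The node feature is transformed by a fully connected layer, spliced
    with the current state variable, passed through a second fully
    connected layer, and squashed by a sigmoid.
    """
    h = [math.tanh(v) for v in fc(w1, node_feat)]   # transform node feature
    joint = h + state                               # splice with the state
    logit = sum(w * x for w, x in zip(w2, joint))   # second FC layer
    return 1.0 / (1.0 + math.exp(-logit))           # sigmoid activation

def pick_node(candidates, state, w1, w2):
    """Select the candidate (possibly the null node) with highest probability."""
    return max(candidates,
               key=lambda c: match_probability(c[1], state, w1, w2))[0]

# Toy weights, state, and candidates.
dim = 4
w1 = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
w2 = [random.uniform(-1, 1) for _ in range(2 * dim)]
state = [random.uniform(-1, 1) for _ in range(dim)]
candidates = [("company_A", [random.uniform(-1, 1) for _ in range(dim)]),
              ("null", [0.0] * dim)]
```

Repeating this step role by role, and finally merging common prefixes of the resulting paths, yields the individual events described above.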
  • The first stage (the event classification stage) mainly outputs the event types (multi-classification) of all the truncated texts of the long text;
  • the second stage (the event role extraction stage) takes these truncated texts as input and mainly outputs all the event roles identified under each event type of each truncated text;
  • the third stage (the event combination stage) takes all the event roles as input and obtains the complete set of events through the sequence generation model, finally fulfilling the event extraction requirement.
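The three-stage data flow summarized above can be sketched as a small pipeline. All stage functions here are trivial, hypothetical stand-ins, not the actual models; the sketch only shows how the stages hand data to each other:

```python
def run_pipeline(segments, classify, extract_roles, combine):
    """Wire the three stages together; the stage callables are injected."""
    typed = [(seg, classify(seg)) for seg in segments]          # stage 1
    roles = [role
             for seg, types in typed
             for etype in types
             for role in extract_roles(seg, etype)]             # stage 2
    return combine(roles)                                       # stage 3

# Toy stand-ins for the three stage models, just to show the flow:
events = run_pipeline(
    segments=["Company A went public."],
    classify=lambda seg: ["company listing"],
    extract_roles=lambda seg, etype: [(etype, "company", "Company A")],
    combine=lambda roles: [{"type": roles[0][0], roles[0][1]: roles[0][2]}],
)
```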
  • FIG. 4 is a schematic block diagram of a long text event extraction device 400 provided in an embodiment of the present application.
  • the device 400 includes:
  • the first truncation unit 401 is configured to obtain trigger words in the long text of the event to be extracted, and perform text truncation on the long text according to the trigger words to obtain the truncated text;
  • the first classification prediction unit 402 is configured to use a deep learning model to classify and predict multiple event types corresponding to the truncated text;
  • the first extraction unit 403 is used to combine machine reading comprehension technology and pointer network model to extract corresponding event role information for each event type;
  • the result output unit 404 is configured to combine all the event role information into a target event based on a sequence generation algorithm, and output the target event as an event extraction result.
  • the first truncation unit 401 includes:
  • the trigger word selection unit 501 is used to select the trigger word in the long text through the trigger word dictionary, and utilize the trigger word to pre-truncate the long text;
  • a statistical unit 502 configured to count the number of sentences and the total number of words between different trigger words based on the pre-truncated long text
  • the interval selection unit 503 is used to construct a discrete interval according to the total number of words between different trigger words, and select the interval with the largest number of words distributed based on the discrete interval;
  • the word count threshold setting unit 504 is configured to select the mode number in the word count interval as the word count threshold, and use the word count threshold to perform text truncation on long texts.
  • the first category prediction unit 402 includes:
  • a label splicing unit 601 configured to obtain a training set comprising truncated training text and event types, and stitch the truncated training text in the training set according to the event label;
  • the convolution processing unit 602 is used to perform convolution processing on the spliced truncated training text by increasing the deep learning model of the convolution kernel;
  • An optimization update unit 603, configured to optimize and update the improved deep learning model using a focal-loss loss function
  • the second category prediction unit 604 is configured to use the updated deep learning model to perform event category prediction on the truncated text.
  • the first extraction unit 403 includes:
  • a question splicing unit for splicing questions after each event type of the truncated text using a question-and-answer structure
  • the probability prediction unit is used to construct a label list according to the concatenated questions through the pointer network model, and use the label list to predict the starting position probability value and the ending position probability value of the question sentence in the truncated text;
  • the position selection unit is configured to select the start position and the end position with the highest probability value, and use the text content between the start position and the end position as the event role information under the corresponding event type.
  • the sequence generation algorithm is DOC2EDAG algorithm.
  • the result output unit 404 includes:
  • a role sorting unit configured to sort all event roles under each event type based on the event role information
  • a state update unit configured to update the state of the event roles under each event type through a state variable
  • the sequence output unit is used to construct a directed acyclic graph for all event roles through the DOC2EDAG algorithm according to the sorting result and the state update result, to obtain a sequence of all the event role information combinations, and use the sequence as the target event output.
  • the status update unit includes:
  • a feature transformation unit configured to obtain at least one newly added event role node, and perform feature transformation on each of the event role nodes by using a fully connected layer;
  • the feature splicing unit is used to splice the feature transformation result and the state variable, and input the splicing result into the fully connected layer and the activation function in turn to obtain the matching probability value between each of the event role nodes and the corresponding event role;
  • the node selection unit is configured to select the event role node with the highest matching probability value as the prediction result of the corresponding event role, and update the corresponding event type.
  • the embodiment of the present application also provides a computer-readable storage medium, on which a computer program is stored. When the computer program is executed, the steps provided in the above-mentioned embodiments can be realized.
  • The storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and other media capable of storing program code.
  • the embodiment of the present application also provides a computer device, which may include a memory and a processor.
  • a computer program is stored in the memory.
  • the processor invokes the computer program in the memory, the steps provided in the above embodiments can be implemented.
  • the computer equipment may also include components such as various network interfaces and power supplies.


Abstract

The present application discloses a long text event extraction method and apparatus, a computer device, and a storage medium. The method comprises: acquiring a trigger word in a long text from which an event is to be extracted, and performing text truncation on the long text according to the trigger word so as to obtain truncated text; using a deep learning model to classify and predict a plurality of event types corresponding to the truncated text; extracting, in combination with machine reading comprehension technology and a pointer network model, the corresponding event role information for each event type; and, on the basis of a sequence generation algorithm, combining all the event role information into a target event and outputting the target event as the event extraction result. In the present application, by performing event classification, event role extraction, and event combination on a long text, the event extraction efficiency and extraction accuracy for the long text are improved.
PCT/CN2021/120030 2021-09-13 2021-09-24 Long text event extraction method and apparatus, computer device and storage medium WO2023035330A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111065602.1 2021-09-13
CN202111065602.1A CN113535963B (zh) 2021-09-13 2021-09-13 一种长文本事件抽取方法、装置、计算机设备及存储介质

Publications (1)

Publication Number Publication Date
WO2023035330A1 true WO2023035330A1 (fr) 2023-03-16

Family

ID=78093162

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/120030 WO2023035330A1 (fr) 2021-09-13 2021-09-24 Long text event extraction method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN113535963B (fr)
WO (1) WO2023035330A1 (fr)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501898A (zh) * 2023-06-29 2023-07-28 之江实验室 适用于少样本和有偏数据的金融文本事件抽取方法和装置
CN116776886A (zh) * 2023-08-15 2023-09-19 浙江同信企业征信服务有限公司 一种信息抽取方法、装置、设备及存储介质
CN117648397A (zh) * 2023-11-07 2024-03-05 中译语通科技股份有限公司 篇章事件抽取方法、系统、设备及存储介质

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115292568B (zh) * 2022-03-02 2023-11-17 内蒙古工业大学 一种基于联合模型的民生新闻事件抽取方法
CN114996434B (zh) * 2022-08-08 2022-11-08 深圳前海环融联易信息科技服务有限公司 一种信息抽取方法及装置、存储介质、计算机设备
CN115982339A (zh) * 2023-03-15 2023-04-18 上海蜜度信息技术有限公司 突发事件抽取方法、系统、介质、电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009205372A (ja) * 2008-02-27 2009-09-10 Mitsubishi Electric Corp 情報処理装置及び情報処理方法及びプログラム
US20200226218A1 (en) * 2019-01-14 2020-07-16 International Business Machines Corporation Automatic classification of adverse event text fragments
CN111522915A (zh) * 2020-04-20 2020-08-11 北大方正集团有限公司 中文事件的抽取方法、装置、设备及存储介质
CN112861527A (zh) * 2021-03-17 2021-05-28 合肥讯飞数码科技有限公司 一种事件抽取方法、装置、设备及存储介质
CN112905868A (zh) * 2021-03-22 2021-06-04 京东方科技集团股份有限公司 事件抽取方法、装置、设备及存储介质
CN113312916A (zh) * 2021-05-28 2021-08-27 北京航空航天大学 基于触发词语态学习的金融文本事件抽取方法及装置

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2002320280A1 (en) * 2002-07-03 2004-01-23 Iotapi., Com, Inc. Text-machine code, system and method
CN110210027B (zh) * 2019-05-30 2023-01-24 杭州远传新业科技股份有限公司 基于集成学习的细粒度情感分析方法、装置、设备及介质
CN111090763B (zh) * 2019-11-22 2024-04-05 北京视觉大象科技有限公司 一种图片自动标签方法及装置

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009205372A (ja) * 2008-02-27 2009-09-10 Mitsubishi Electric Corp 情報処理装置及び情報処理方法及びプログラム
US20200226218A1 (en) * 2019-01-14 2020-07-16 International Business Machines Corporation Automatic classification of adverse event text fragments
CN111522915A (zh) * 2020-04-20 2020-08-11 北大方正集团有限公司 中文事件的抽取方法、装置、设备及存储介质
CN112861527A (zh) * 2021-03-17 2021-05-28 合肥讯飞数码科技有限公司 一种事件抽取方法、装置、设备及存储介质
CN112905868A (zh) * 2021-03-22 2021-06-04 京东方科技集团股份有限公司 事件抽取方法、装置、设备及存储介质
CN113312916A (zh) * 2021-05-28 2021-08-27 北京航空航天大学 基于触发词语态学习的金融文本事件抽取方法及装置

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501898A (zh) * 2023-06-29 2023-07-28 之江实验室 适用于少样本和有偏数据的金融文本事件抽取方法和装置
CN116501898B (zh) * 2023-06-29 2023-09-01 之江实验室 适用于少样本和有偏数据的金融文本事件抽取方法和装置
CN116776886A (zh) * 2023-08-15 2023-09-19 浙江同信企业征信服务有限公司 一种信息抽取方法、装置、设备及存储介质
CN116776886B (zh) * 2023-08-15 2023-12-05 浙江同信企业征信服务有限公司 一种信息抽取方法、装置、设备及存储介质
CN117648397A (zh) * 2023-11-07 2024-03-05 中译语通科技股份有限公司 篇章事件抽取方法、系统、设备及存储介质

Also Published As

Publication number Publication date
CN113535963A (zh) 2021-10-22
CN113535963B (zh) 2021-12-21


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21956506

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE