CN109508459A - A method of extracting theme and key message from news - Google Patents

A method of extracting theme and key message from news

Info

Publication number
CN109508459A
Authority
CN
China
Prior art keywords
theme
label
matrix
news
serializing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811313654.4A
Other languages
Chinese (zh)
Other versions
CN109508459B (en)
Inventor
杨红飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huoshi Creation Technology Co ltd
Original Assignee
Hangzhou Firestone Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Firestone Technology Co Ltd
Priority to CN201811313654.4A
Publication of CN109508459A
Application granted
Publication of CN109508459B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/258: Heading extraction; Automatic titling; Numbering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities

Abstract

The invention discloses a method for extracting the theme and key information from news, comprising the following steps: performing html tag processing on the news content; performing theme annotation and serialization annotation (sequence labeling) separately on the processed news content to obtain the theme label corresponding to the news content and the serialization label corresponding to each word in the news content; creating a theme and key information extraction model that comprises a seq2seq network and a fully connected network, the input of the fully connected network coming from the state output of the encoding stage of the seq2seq network, and training the model to obtain the optimal parameters; and, after html tag processing, feeding unannotated news content into the extraction model to obtain the optimal theme label and serialization labels, the category of the news being obtained from the theme label and the slot values of the news content being obtained from the serialization labels. The method uses seq2seq+attention+crf, strengthens the dependency between the classification model and the slot filling model, reduces the complexity of text annotation, and reduces project development complexity.

Description

A method of extracting theme and key message from news
Technical field
The present invention relates to the fields of text classification and information extraction, and more particularly to a method for extracting the theme and key information from news.
Background art
News theme extraction falls within the scope of text classification, and slot filling in key information extraction falls within the scope of information extraction; both are major components of natural language processing. Research on text classification dates back to the 1950s, when classification was done with expert rules (patterns). By the early 1980s this had developed into expert systems built with knowledge engineering. The advantage of this approach is that it solves the most pressing problems quickly, but its ceiling is clearly low: it is time-consuming and labor-intensive, and both its coverage and its accuracy are very limited. Later, with the development of statistical learning methods, and especially after the 1990s with the growth of online text on the Internet and the rise of machine learning, a classic approach to large-scale text classification gradually took shape. The main pattern at this stage was manual feature engineering plus a shallow classification model. The whole text classification problem was thus split into two parts: feature engineering and the classifier.
Feature engineering is often the most time-consuming and labor-intensive part of machine learning, but it is extremely important. Abstractly, a machine learning problem is the process of converting data into information and then refining that information into knowledge. Features are the "data --> information" step and determine the upper limit of the result, while the classifier is the "information --> knowledge" step and approaches that upper limit. Unlike classifier models, however, feature engineering is not very general and usually requires an understanding of the specific task. The natural language field, where the text classification problem lives, naturally has its own characteristic feature-processing logic, and most of the work in traditional text classification is done here. Text feature engineering is divided into three parts: text preprocessing, feature extraction, and text representation. The final goal is to convert text into a format a computer can understand while encapsulating enough information for classification, that is, strong feature representation ability. Classifiers are essentially statistical classification methods, and most machine learning methods have been applied to text classification, such as the naive Bayes classification algorithm, KNN, SVM, maximum entropy, and neural networks.
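By way of illustration only, the traditional pattern described above (hand-built features plus a shallow classifier) might be sketched as follows. The sketch assumes whitespace tokenization, bag-of-words counts as the features, and a multinomial naive Bayes classifier with add-one smoothing; the toy documents and the labels "finance" and "sports" are invented for the example and do not come from the source.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count words per class; the bag-of-words counts are the hand-built features."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # preprocessing and feature extraction
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior under naive Bayes."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # words never seen in training are ignored
            # add-one (Laplace) smoothed log likelihood
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this example.
docs = [
    "company raises series a funding",
    "investors join funding round",
    "team wins football match",
    "football season starts today",
]
labels = ["finance", "finance", "sports", "sports"]
model = train_naive_bayes(docs, labels)
print(predict(model, "new funding round announced"))
```

The split into `train_naive_bayes` (features) and `predict` (classifier) mirrors the two-part division of the problem described in the text.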
Unstructured data such as natural language sentences is converted into structured data, which can then be queried with powerful tools such as SQL. This way of obtaining meaning from text is called information extraction. An information extraction system searches large amounts of unstructured text, finds specific types of entities and relations, and uses them to populate an organized database. Such a database can then be used to find answers to specific questions. Information extraction is mainly divided into named entity recognition and relation extraction.
Named entity recognition (NER) is a classic problem in natural language processing with very wide application, for example identifying person names and place names in a sentence, product names in e-commerce search queries, or drug names. The traditional algorithm generally accepted as working relatively well is the conditional random field (CRF). It is a discriminative probabilistic model, a kind of random field, commonly used to label or analyze sequence data such as natural language text or biological sequences. Put simply, applying it to NER means predicting the label of each word from a given set of features.
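To make "a given set of features" concrete, the following is a minimal sketch of the kind of per-token feature function a feature-based CRF tagger typically consumes. The function name `token_features`, the particular features, and the example sentence are all assumptions chosen for illustration; the source does not specify any feature set.

```python
def token_features(tokens, i):
    """Hand-crafted features for the token at position i (illustrative choice)."""
    token = tokens[i]
    return {
        "token": token.lower(),
        "is_capitalized": token[:1].isupper(),
        "is_digit": token.isdigit(),
        "suffix3": token[-3:].lower(),
        # Neighboring tokens give the tagger local context.
        "prev_token": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_token": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


# Example sentence, invented for this sketch.
tokens = ["Alice", "visited", "Hangzhou", "in", "2018"]
features = [token_features(tokens, i) for i in range(len(tokens))]
for token, feats in zip(tokens, features):
    print(token, feats)
```

A CRF would learn one weight per feature and label combination, plus label-to-label transition weights, and predict the label sequence that scores highest overall.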
Relation extraction mainly classifies the semantic relation between entities. Existing mainstream relation extraction techniques fall into three kinds: supervised learning methods, semi-supervised learning methods, and unsupervised learning methods:
1. Supervised learning methods treat the relation extraction task as a classification problem, design effective features from the training data in order to learn various classification models, and then use the trained classifier to predict relations. The problem with this approach is that it needs a large manually annotated training corpus, and corpus annotation is usually very time-consuming and labor-intensive.
2. Semi-supervised learning methods mainly use bootstrapping for relation extraction. For the relation to be extracted, the method first sets several seed instances by hand and then iteratively extracts the corresponding relation templates and more instances from the data.
3. Unsupervised learning methods assume that entity pairs with the same semantic relation have similar contextual information. The contextual information of each entity pair can therefore be used to represent the semantic relation of that pair, and the semantic relations of all entity pairs are clustered.
Compared with the other two methods, supervised learning methods can extract more effective features and achieve higher accuracy and recall. Supervised learning methods have therefore attracted the attention of more and more researchers.
In most applications today, named entity recognition and relation extraction are executed as separate tasks, let alone combined with text classification. The commonly used entity and relation extraction approach is a pipeline: given an input sentence, named entity recognition is performed first, the recognized entities are then combined in pairs, relation classification is performed on each pair, and finally the triples for which an entity relation exists are output. The pipeline approach has the following drawbacks: 1) error propagation: errors in the entity recognition module affect the performance of the subsequent relation classification; 2) the relationship between the two subtasks is ignored; 3) unnecessary redundant information is produced: because the recognized entities are paired two by two before relation classification, the pairs of entities that have no relation introduce redundant information and raise the error rate.
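The third drawback can be seen with a small count. The sketch below pairs every recognized entity with every other one, as the pipeline does, and counts how many candidate pairs carry no relation. The entity strings and the single "true" relation are invented for the example.

```python
from itertools import combinations


def candidate_pairs(entities):
    """All unordered entity pairs that the pipeline sends to relation classification."""
    return list(combinations(entities, 2))


# Hypothetical output of the entity recognition stage.
entities = ["Company A", "Company B", "Series B", "10 million"]
pairs = candidate_pairs(entities)

# Hypothetical gold standard: only one pair actually has a relation.
related = {("Company A", "Company B")}
redundant = [pair for pair in pairs if pair not in related]

print(len(pairs), "candidate pairs,", len(redundant), "without any relation")
```

With n entities the pipeline classifies n(n-1)/2 pairs, so the share of unrelated pairs grows quickly as sentences get longer.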
Existing text classification and slot filling are each trained as a separate model, which not only ignores the dependency between the tasks but also lengthens the development cycle of the whole project and increases the text annotation workload. Text classification and information extraction are both commonly implemented with supervised learning, and supervised learning requires enough sample data. Annotating samples is time-consuming and labor-intensive, and annotation quality varies from person to person. In this situation, the more tasks there are, the greater the annotation complexity and the harder it is to guarantee quality. Deep learning is currently the most popular way to solve natural language processing problems, but deep learning training cycles are generally long, and this becomes more pronounced as the number of tasks grows, which seriously constrains project iteration.
Summary of the invention
Slot filling is an important task in natural language understanding; it extracts the role information and attribute information related to an event. News classification and slot filling are usually trained as two independent models, and the two models are unrelated. From a business perspective, however, slot filling depends on news classification: the slot types to be filled differ for different categories. The technical solution provided by the invention trains news classification and slot filling as a single model, merging multiple tasks into one and fully considering the correlation between the tasks. This largely avoids mismatches between news category and slot type, shortens the development cycle, and improves the accuracy of the results.
The method of the invention mainly uses a seq2seq+attention+crf scheme and specifically comprises the following steps:
(1) performing html tag processing on news content crawled from web pages;
(2) performing theme annotation and serialization annotation (sequence labeling) separately on the processed news content to obtain the theme label corresponding to the news content and the serialization label corresponding to each word in the news content; the theme annotation marks the category to which the news belongs; the serialization annotation, given an annotated theme, determines the role or attribute information related to that theme;
(3) creating a theme and key information extraction model, the model comprising a seq2seq network and a fully connected network, the input of the fully connected network coming from the state output of the encoding stage of the seq2seq network;
(4) feeding the news data annotated in step (2) into the seq2seq network of the extraction model and encoding the words in the news content, the encoding process being as follows: first performing embedding vectorization on each word in the news content to obtain a vectorized matrix, then feeding the vectorized matrix into the encoding BiLstm bidirectional recurrent neural network to obtain the outputs output matrix and the finalState final-state matrix;
(5) for the theme label, feeding the finalState matrix into the fully connected network of the extraction model to obtain the logic intermediate-result matrix, and applying cross-entropy to the logic matrix and the actual theme label to obtain the loss value category_loss;
(6) for the serialization labels, applying an attention mechanism transformation to the outputs matrix to obtain the attention matrix;
(7) feeding the attention matrix together with the outputs matrix into the decoding BiLstm bidirectional recurrent neural network of the seq2seq network to obtain the decode_outputs decoded output matrix, and computing the loss value solt_loss between the decode_outputs matrix and the serialization labels with the crf loss function;
(8) adding category_loss and solt_loss to obtain the overall loss value loss of the extraction network, then backpropagating loss by gradient descent to obtain the optimal parameters of the extraction model;
(9) after html tag processing, feeding unannotated news content into the theme and key information extraction model to obtain the optimal theme label and serialization labels, obtaining the category of the news from the theme label, and obtaining the slot values of the news content, i.e. the role or attribute information, from the serialization labels.
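The source does not say what the html tag processing in steps (1) and (9) consists of. One plausible reading, assumed here, is that markup is stripped so that only the visible text reaches the model. The sketch below does this with Python's standard `html.parser`; the class `_TextExtractor`, the function `strip_html`, and the sample markup are hypothetical names and data introduced for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style elements."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(raw_html):
    """Return the visible text of an HTML fragment (assumed meaning of step (1))."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return parser.text()


sample = (
    "<div><h1>Funding news</h1><script>var x = 1;</script>"
    "<p>Company A raised &amp; closed a round.</p></div>"
)
print(strip_html(sample))
```

The same function would be applied to both the annotated training news and the unannotated news in step (9), so that the model sees text prepared in the same way at training and prediction time.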
Further, in step (4), the embedding vectorization of each word in the news content is specifically as follows: using transfer learning, the pre-trained embedding word vectors are fed directly into the seq2seq network, and the parameters of the embedding word vectors do not need to be updated during training.
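A minimal sketch of this idea, under stated assumptions: the pre-trained vectors are held in a read-only lookup table and the table is excluded from the set of parameters an optimizer would update. The class `FrozenEmbedding`, the helper `trainable_parameters`, the `<unk>` fallback, and the three-dimensional toy vectors are all hypothetical and not taken from the source.

```python
class FrozenEmbedding:
    """Read-only lookup table over pre-trained word vectors."""

    def __init__(self, vectors, unk_token="<unk>"):
        # Tuples make accidental in-place updates impossible.
        self._vectors = {word: tuple(vec) for word, vec in vectors.items()}
        self._unk = unk_token
        self.trainable = False  # signals that training must not update this module

    def __call__(self, tokens):
        """Map a token sequence to its vectorized matrix (one row per token)."""
        unk = self._vectors[self._unk]
        return [self._vectors.get(token, unk) for token in tokens]


def trainable_parameters(modules):
    """Names of the modules an optimizer would update."""
    return [name for name, module in modules.items()
            if getattr(module, "trainable", True)]


# Toy pre-trained vectors, invented for this example.
pretrained = {
    "<unk>": [0.0, 0.0, 0.0],
    "funding": [0.2, -0.1, 0.7],
    "round": [0.5, 0.3, -0.4],
}
embed = FrozenEmbedding(pretrained)
matrix = embed(["funding", "round", "closed"])
modules = {"embedding": embed, "encoder": object()}
print(matrix)
print(trainable_parameters(modules))
```

Freezing the table reduces the number of parameters to learn, which is consistent with the shorter training cycles the text aims for.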
Further, in step (6), the attention mechanism transformation of the outputs matrix into the attention matrix uses Self attention and Multi-head, which overcomes the inability of the traditional attention model to be parallelized and improves effectiveness and performance.
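The following is a simplified sketch of multi-head self-attention over an outputs matrix. It assumes scaled dot-product attention and, to stay short, uses the input rows directly as queries, keys, and values instead of the learned projection matrices a real multi-head layer would have; the source does not give these details. The function names and the toy matrix are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(rows):
    """Scaled dot-product self-attention with queries = keys = values = rows."""
    d = len(rows[0])
    result = []
    for query in rows:  # each position is independent of the others
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in rows]
        weights = softmax(scores)
        result.append([sum(w * value[j] for w, value in zip(weights, rows))
                       for j in range(d)])
    return result


def multi_head_self_attention(outputs, num_heads):
    """Split the feature dimension into heads, attend per head, then concatenate."""
    d_model = len(outputs[0])
    if d_model % num_heads != 0:
        raise ValueError("feature size must be divisible by num_heads")
    d_head = d_model // num_heads
    heads = []
    for h in range(num_heads):
        part = slice(h * d_head, (h + 1) * d_head)
        heads.append(self_attention([row[part] for row in outputs]))
    return [[x for head in heads for x in head[t]] for t in range(len(outputs))]


# Toy encoder output: 3 positions, 4 features (invented for this example).
outputs = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
]
attention = multi_head_self_attention(outputs, num_heads=2)
for row in attention:
    print(row)
```

Because the row for each position depends only on the full outputs matrix and not on the attention result at any earlier position, all positions can be computed at once, which is the parallelization benefit the text refers to.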
Further, in step (9), the theme and key information extraction model outputs a theme label matrix and a serialization label matrix. For the theme label matrix, softmax is used as the activation function and the theme label with the highest probability is taken as the optimal theme label; for the serialization label matrix, conditional random field (crf) decoding is applied to the decode_outputs decoded output matrix to obtain the optimal serialization labels.
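Both decoding rules can be sketched in a few lines. The sketch assumes that crf decoding means the standard Viterbi search over per-position label scores plus label-to-label transition scores; the label set (O, B, I), the score values, and the function names are invented for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def best_theme(logits, themes):
    """Pick the theme with the highest softmax probability."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return themes[index], probs[index]


def viterbi_decode(emissions, transitions):
    """Highest-scoring label path.

    emissions[t][j] is the score of label j at position t;
    transitions[i][j] is the score of moving from label i to label j.
    """
    n = len(emissions[0])
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            i_best = max(range(n), key=lambda i: score[i] + transitions[i][j])
            pointers.append(i_best)
            new_score.append(score[i_best] + transitions[i_best][j] + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    last = max(range(n), key=score.__getitem__)
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


# Toy scores, invented for this example. Label indices: 0 = O, 1 = B, 2 = I.
themes = ["other", "financing"]
theme, probability = best_theme([0.2, 2.0], themes)

emissions = [[2.0, 0.5, 0.1], [0.3, 1.5, 0.2], [0.1, 0.4, 1.0]]
transitions = [[0.5, 0.2, -5.0], [0.0, -1.0, 1.0], [0.3, 0.1, 0.6]]
path, best_score = viterbi_decode(emissions, transitions)
print(theme, path)
```

The strongly negative O to I transition score shows how a crf layer can rule out label sequences that are locally plausible but structurally invalid.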
The beneficial effects of the invention are as follows: the invention proposes a method that solves news theme extraction and key information extraction in a single pass. It uses the seq2seq+attention+crf scheme, strengthens the dependency between the classification model and the slot filling model, reduces the complexity of text annotation, and can reduce project development complexity.
Brief description of the drawings
Fig. 1 is a schematic diagram of the implementation flow of one embodiment of the invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawing and a specific embodiment.
The method for extracting the theme and key information from news provided by the invention comprises the following steps:
(1) performing html tag processing on news content crawled from web pages;
(2) performing theme annotation and serialization annotation (sequence labeling) separately on the processed news content to obtain the theme label corresponding to the news content and the serialization label corresponding to each word in the news content;
The theme annotation mainly marks the category to which the news belongs; for example, for financial institutions, news related to investment and financing information is labeled 1 and other news is labeled 0;
The serialization annotation, given an annotated theme, determines the role or attribute information related to that theme. For example, for a financing event related to investment and financing information, the corresponding roles are the investor, the investee and so on, and the corresponding attributes are the financing amount, the financing round and so on; these roles and attributes are the slots;
(3) creating a theme and key information extraction model, the model comprising a seq2seq network and a fully connected network, the input of the fully connected network coming from the state output of the encoding stage of the seq2seq network;
(4) feeding the news data annotated in step (2) into the seq2seq network of the extraction model and encoding the words in the news content, the encoding process being as follows: first performing embedding vectorization on each word in the news content to obtain a vectorized matrix, then feeding the vectorized matrix into the encoding BiLstm bidirectional recurrent neural network to obtain the outputs output matrix and the finalState final-state matrix;
During the embedding vectorization of each word in the news content, the pre-trained embedding word vectors are fed directly into the seq2seq network using transfer learning, and the parameters of the embedding word vectors do not need to be updated during training.
(5) for the theme label, feeding the finalState matrix into the fully connected network of the extraction model to obtain the logic intermediate-result matrix, and applying cross-entropy to the logic matrix and the actual theme label to obtain the loss value category_loss;
(6) for the serialization labels, applying an attention mechanism transformation to the outputs matrix to obtain the attention matrix; this process uses Self attention and Multi-head, which overcomes the inability of the traditional attention model to be parallelized and improves effectiveness and performance;
(7) feeding the attention matrix together with the outputs matrix into the decoding BiLstm bidirectional recurrent neural network of the seq2seq network to obtain the decode_outputs decoded output matrix, and computing the loss value solt_loss between the decode_outputs matrix and the serialization labels with the crf loss function;
(8) adding category_loss and solt_loss to obtain the overall loss value loss of the extraction network, then backpropagating loss by gradient descent to obtain the optimal parameters of the extraction model;
(9) after html tag processing, feeding unannotated news content into the theme and key information extraction model to obtain the optimal theme label and serialization labels, obtaining the category of the news from the theme label, and obtaining the slot values of the news content, i.e. the role or attribute information, from the serialization labels.
The theme and key information extraction model outputs a theme label matrix and a serialization label matrix. For the theme label matrix, softmax is used as the activation function and the theme label with the highest probability is taken as the optimal theme label; for the serialization label matrix, conditional random field (crf) decoding is applied to the decode_outputs decoded output matrix to obtain the optimal serialization labels.
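The loss computation of steps (5), (7), and (8) can be sketched numerically as below. The sketch assumes softmax cross-entropy for category_loss and the standard linear-chain crf negative log-likelihood for solt_loss (identifier spelled as in the source); the stand-in values for the logic and decode_outputs matrices and the transition scores are invented, and backpropagation itself is not shown.

```python
import math


def log_sum_exp(xs):
    """Stable log(sum(exp(x))) for a list of scores."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def cross_entropy(logits, gold_index):
    """Softmax cross-entropy of one example against its gold theme label."""
    return log_sum_exp(logits) - logits[gold_index]


def crf_negative_log_likelihood(emissions, transitions, gold_path):
    """Linear-chain crf loss: log partition function minus the gold path score."""
    n = len(emissions[0])
    # Score of the annotated label sequence.
    gold = emissions[0][gold_path[0]]
    for t in range(1, len(gold_path)):
        gold += transitions[gold_path[t - 1]][gold_path[t]] + emissions[t][gold_path[t]]
    # Forward algorithm: log-sum of the scores of every possible label sequence.
    alpha = list(emissions[0])
    for t in range(1, len(emissions)):
        alpha = [
            log_sum_exp([alpha[i] + transitions[i][j] for i in range(n)]) + emissions[t][j]
            for j in range(n)
        ]
    return log_sum_exp(alpha) - gold


# Stand-in values, invented for this example.
logic = [0.2, 2.0]  # fully connected output for two themes
decode_outputs = [[2.0, 0.5, 0.1], [0.3, 1.5, 0.2], [0.1, 0.4, 1.0]]
transitions = [[0.5, 0.2, -5.0], [0.0, -1.0, 1.0], [0.3, 0.1, 0.6]]

category_loss = cross_entropy(logic, gold_index=1)
solt_loss = crf_negative_log_likelihood(decode_outputs, transitions, gold_path=[0, 1, 2])
loss = category_loss + solt_loss  # step (8): one scalar for both tasks
print(category_loss, solt_loss, loss)
```

Because both terms are summed into one scalar, a single gradient step updates the shared encoder for both tasks, which is how the dependency between classification and slot filling enters training.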
For example, the method of the invention is applied to the following news item:
"Mindray has released Nüwa Resona 8, a high-end intelligent dedicated obstetrics and gynecology ultrasound system based on big data algorithms. It includes multiple intelligent applications such as automatic volume navigation of the fetal brain, automatic navigation of the fetal face, automatic volume navigation of the fetal heart, and intelligent pelvic floor ultrasound, and will bring attentive care to women's prenatal diagnosis, postpartum recovery, and reproductive health";
As shown in Fig. 1, the news item is input into the theme and key information extraction model. In the encoding stage, the model takes the word as the basic unit and applies embedding, f-lstm, and b-lstm in turn to obtain the outputs output matrix and the finalState final-state matrix. The finalState final-state matrix goes through fully connected processing to obtain the final theme label. In the decoding stage, outputs and the attention corresponding to outputs are fed together into the decoding network, where lstm and crf decode processing are applied in turn to obtain the final serialization labels, which are finally converted into the corresponding slot values.
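The final conversion from serialization labels to slot values is not detailed in the source. A minimal sketch, assuming a BIO-style label scheme (B-x opens a slot of type x, I-x continues it, O is outside any slot), could look as follows; the function name, the slot names, and the English example tokens are hypothetical.

```python
def labels_to_slots(tokens, labels):
    """Group tokens into slot values according to BIO-style labels."""
    slots = {}
    span_type, span_tokens = None, []
    # A trailing sentinel closes a span that runs to the end of the sequence.
    for token, label in list(zip(tokens, labels)) + [(None, "O")]:
        if label.startswith("I-") and label[2:] == span_type:
            span_tokens.append(token)  # continue the open span
            continue
        if span_type is not None:
            slots.setdefault(span_type, []).append(" ".join(span_tokens))
        if label.startswith("B-"):
            span_type, span_tokens = label[2:], [token]  # open a new span
        else:
            span_type, span_tokens = None, []  # O, or an I- tag with no open span
    return slots


# Hypothetical model output for a financing news sentence.
tokens = ["Company", "A", "invested", "10", "million", "in", "Company", "B"]
labels = ["B-investor", "I-investor", "O", "B-amount", "I-amount", "O",
          "B-investee", "I-investee"]
slots = labels_to_slots(tokens, labels)
print(slots)
```

For Chinese text, where the basic unit is not separated by spaces, the pieces of a span would be joined without the space separator.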
The above is merely a preferred embodiment of the invention and is not intended to limit the invention; any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (5)

1. A method for extracting the theme and key information from news, characterized by comprising the following steps:
(1) performing html tag processing on news content crawled from web pages;
(2) performing theme annotation and serialization annotation separately on the processed news content to obtain the theme label corresponding to the news content and the serialization label corresponding to each word in the news content; the theme annotation marks the category to which the news belongs; the serialization annotation, given an annotated theme, determines the role or attribute information related to that theme;
(3) creating a theme and key information extraction model, the model comprising a seq2seq network and a fully connected network, the input of the fully connected network coming from the state output of the encoding stage of the seq2seq network;
(4) feeding the news data annotated in step (2) into the seq2seq network of the extraction model and encoding the words in the news content, the encoding process being as follows: first performing embedding vectorization on each word in the news content to obtain a vectorized matrix, then feeding the vectorized matrix into the encoding BiLstm bidirectional recurrent neural network to obtain the outputs output matrix and the finalState final-state matrix;
(5) for the theme label, feeding the finalState matrix into the fully connected network of the extraction model to obtain the logic intermediate-result matrix, and applying cross-entropy to the logic matrix and the actual theme label to obtain the loss value category_loss;
(6) for the serialization labels, applying an attention mechanism transformation to the outputs matrix to obtain the attention matrix;
(7) feeding the attention matrix together with the outputs matrix into the decoding BiLstm bidirectional recurrent neural network of the seq2seq network to obtain the decode_outputs decoded output matrix, and computing the loss value solt_loss between the decode_outputs matrix and the serialization labels with the crf loss function;
(8) adding category_loss and solt_loss to obtain the overall loss value loss of the extraction network, then backpropagating loss by gradient descent to obtain the optimal parameters of the extraction model;
(9) after html tag processing, feeding unannotated news content into the theme and key information extraction model to obtain the optimal theme label and serialization labels, obtaining the category of the news from the theme label, and obtaining the slot values of the news content, i.e. the role or attribute information, from the serialization labels.
2. The method for extracting the theme and key information from news according to claim 1, characterized in that, in step (4), the embedding vectorization of each word in the news content is specifically: using transfer learning, the pre-trained embedding word vectors are fed directly into the seq2seq network, and the parameters of the embedding word vectors do not need to be updated during training.
3. The method for extracting the theme and key information from news according to claim 1, characterized in that, in step (6), the attention mechanism transformation of the outputs matrix into the attention matrix uses Self attention and Multi-head, which overcomes the inability of the traditional attention model to be parallelized and improves effectiveness and performance.
4. The method for extracting the theme and key information from news according to claim 1, characterized in that, in step (9), the theme and key information extraction model outputs a theme label matrix and a serialization label matrix; for the theme label matrix, softmax is used as the activation function and the theme label with the highest probability is taken as the optimal theme label; for the serialization label matrix, conditional random field (crf) decoding is applied to the decode_outputs decoded output matrix to obtain the optimal serialization labels.
5. The method for extracting the theme and key information from news according to claim 1, characterized in that the method uses seq2seq+attention+crf, strengthens the dependency between the classification model and the slot filling model, reduces the complexity of text annotation, and reduces project development complexity.
CN201811313654.4A 2018-11-06 2018-11-06 Method for extracting theme and key information from news Active CN109508459B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811313654.4A CN109508459B (en) 2018-11-06 2018-11-06 Method for extracting theme and key information from news

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811313654.4A CN109508459B (en) 2018-11-06 2018-11-06 Method for extracting theme and key information from news

Publications (2)

Publication Number Publication Date
CN109508459A true CN109508459A (en) 2019-03-22
CN109508459B CN109508459B (en) 2022-11-29

Family

ID=65747642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811313654.4A Active CN109508459B (en) 2018-11-06 2018-11-06 Method for extracting theme and key information from news

Country Status (1)

Country Link
CN (1) CN109508459B (en)

Also Published As

Publication number Publication date
CN109508459B (en) 2022-11-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 7 / F, building B, 482 Qianmo Road, Xixing street, Binjiang District, Hangzhou City, Zhejiang Province 310000

Patentee after: Huoshi Creation Technology Co.,Ltd.

Address before: 7 / F, building B, 482 Qianmo Road, Xixing street, Binjiang District, Hangzhou City, Zhejiang Province 310000

Patentee before: HANGZHOU FIRESTONE TECHNOLOGY Co.,Ltd.