CN109753660A - LSTM-based winning bid web page named entity extraction method - Google Patents

LSTM-based winning bid web page named entity extraction method

Info

Publication number
CN109753660A
CN109753660A CN201910013185.2A CN201910013185A CN109753660A CN 109753660 A CN109753660 A CN 109753660A CN 201910013185 A CN201910013185 A CN 201910013185A CN 109753660 A CN109753660 A CN 109753660A
Authority
CN
China
Prior art keywords
bid
word
acceptance
lstm
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910013185.2A
Other languages
Chinese (zh)
Other versions
CN109753660B (en)
Inventor
陈羽中
林剑
郭昆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-01-07
Filing date
2019-01-07
Publication date
2019-05-14
Application filed by Fuzhou University
Priority to CN201910013185.2A
Publication of CN109753660A
Application granted
Publication of CN109753660B
Status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Machine Translation (AREA)

Abstract

The present invention relates to a named entity recognition method for winning bid data, comprising the following steps: cleaning the text data of a winning bid web page to obtain the winning bid text; using a Lattice-LSTM as the encoding layer to obtain the semantic information features of the text; using an LSTM as the decoding layer to label each character and mark the entity information in the sentence sequence; applying rule-based correction and formatting; and finally outputting the named entities recognized in the winning bid web page. Being based on a Lattice-LSTM-LSTM model, the invention can effectively recognize the named entities in the winning bid detail pages of bidding websites.

Description

LSTM-based winning bid web page named entity extraction method
Technical field
The present invention relates to the technical field of named entity recognition, and in particular to an LSTM-based method for extracting named entities from winning bid web pages.
Background art
Named entity recognition is a fundamental task in natural language processing. Its purpose is to identify named entities such as person names, place names and organization names in a corpus. Because the number of such entities keeps growing, they usually cannot be listed exhaustively in a dictionary, and their formation follows its own regularities. Their recognition is therefore usually separated from lexical processing tasks (such as Chinese word segmentation) and treated as an independent task, called named entity recognition.
As a fundamental task in natural language processing, named entity recognition has attracted the close attention of many experts and scholars, who have proposed a number of optimized algorithms and models. One approach is based on stacked HMM models: it first recognizes person names and place names and then uses them as features for higher-level organization name recognition. Another is a Chinese named entity recognition algorithm based on conditional random fields, which achieves good results using character, boundary, part-of-speech and entity dictionary features. A bootstrapping-based method expands a seed vocabulary to address the shortage of manually labeled data. A neural network structure based on BLSTM no longer depends directly on hand-crafted features and domain knowledge but uses context-based word vectors and character-based word vectors; the former express the contextual information of a named entity, and the latter express the prefix, suffix and domain information that make up the entity. A BLSTM-CRF model exploits the fact that, in sequence labeling, the labels of neighboring words are not independent: it takes the label information of the preceding words into account when labeling the current word and replaces the softmax output layer with a CRF layer to produce the final prediction for each word. A deep neural network model based on stacked autoencoder classifiers solves the conversion from Chinese text sequences to model input vectors and proposes vectorized forward-backward propagation formulas that are convenient for engineering implementation.
Most current named entity recognition algorithms recognize only person names, place names and organization names without dividing them further, and they perform poorly on long entities.
Summary of the invention
In view of this, the purpose of the present invention is to provide an LSTM-based method for extracting named entities from winning bid web pages, which can quickly and effectively recognize the named entities in the winning bid detail pages of bidding websites.
To achieve the above object, the present invention adopts the following technical solution:
An LSTM-based winning bid web page named entity extraction method, comprising the following steps:
Step A: clean the text data of the winning bid web page to be processed to obtain the winning bid text;
Step B: use a Lattice-LSTM model as the encoding layer, with the winning bid text as its input, to obtain the semantic information features of the winning bid text;
Step C: use an LSTM model as the decoding layer, with the obtained semantic information features of the winning bid text as its input, to label each character in the winning bid text;
Step D: apply rule-based correction and formatting to the labeled winning bid text;
Step E: output the recognized named entities.
Further, step B specifically comprises:
Step B1: convert the characters in the winning bid text into character vectors;
Here the j-th character $c_j$ of the winning bid text is converted into a character vector $x_j^c$ by table lookup, $x_j^c = e^c(c_j)$,
where $e^c$ denotes the character vector mapping table.
Step B2: convert the words in the winning bid text into word vectors;
Step B3: input the word vectors into the Lattice-LSTM model and use the Lattice-LSTM model to obtain the semantic information features of the winning bid text.
Further, step B2 specifically comprises:
Step B21: build a lexicon D as a trie from a large-scale corpus;
Step B22: initialize an empty set P of matched words for the winning bid text;
Step B23: start the traversal with the first character of the winning bid text as the current character and perform step B24;
Step B24: add to the set P every word $w_{b,e}$ in lexicon D that matches the text and has the current character as its first character;
where b denotes the position in the sentence of the first character of the word and e denotes the position of its last character;
Step B25: take the character following the current character as the new current character and repeat step B24 until the last character of the winning bid text has been processed;
Step B26: after the traversal, convert each word $w_{b,e}$ in the set P into a word vector $x_{b,e}^w = e^w(w_{b,e})$,
where $e^w$ is the word vector mapping table.
Further, step B3 is as follows:
For each sentence in the text, the character vector sequence obtained in step B1 and the word vector sequence obtained in step B2 are input in order into the Lattice-LSTM model, which outputs a sequence of vectors representing the semantic information of each character in its context.
The inputs at time j are the character vector of the j-th character of the sentence and the word vectors of the words that end at the j-th character; the result is the output at time j. The word-level LSTM has its own weight matrices and bias terms and computes, at time j, a forget gate, an input gate, a candidate memory vector and a memory vector. The character-level LSTM likewise has its own weight matrices and bias terms and computes, at time j, an input gate, a candidate memory vector, a memory vector and an output gate, together with the weights used when computing the memory vector at time j.
Further, step C specifically comprises:
Step C1: for the named entity recognition task on winning bid web pages, divide the characters in the data into two classes;
The first class represents characters unrelated to any entity and is denoted by the label "O"; the second class represents characters related to an entity, and the label of such a character consists of three parts.
Step C2: input the hidden state information obtained in step B, which represents the semantic information of the text, into the LSTM model of the decoding layer, and compute the output state of each character under the influence of its context characters; the result is the label vector of the character;
Step C3: input the label vector into a Softmax classifier and normalize it to compute the probability that each character in the text is assigned each label,
where $W_y$ is the weight matrix, $b_y$ is the bias term and $N_t$ is the number of label types;
Step C4: use the log-likelihood function as the loss function and train the model by stochastic gradient descent, updating the model parameters iteratively through backpropagation so as to minimize the loss function,
where D denotes the size of the training set, $L_j$ is the length of sentence $x_j$, the loss is computed from the label of character t in sentence $x_j$ and its normalized probability, $\Theta$ denotes the model parameters, and $I(O)$ is a selection function that distinguishes the loss of the label "O" from the loss of the labels that indicate entities.
Further, the named entities include the tendering organization, the winning organization, the location of the tendering organization, the winning bid amount, the contact person of the tendering organization, the tender project name and the bid award time.
Further, step D specifically comprises:
Step D1: apply rule-based correction to the labeled data obtained in step C;
Step D2: format the corrected data.
Further, step D1 specifically comprises:
Step D11: for the winning bid amount, use a regular expression to check whether the entity contains Arabic numerals or Chinese numerals; if it does not, it is not considered a winning bid amount and is discarded.
Step D12: for the bid award time, discard any entity that is not in a date format.
Step D13: for the project name, since project name entities are usually long strings and a name consisting of only two or three characters essentially never occurs, discard any recognized project name entity whose string length is less than 4.
Step D14: when several entities of the same category are found in one winning bid record, keep the named entity with the longest string.
Further, in step D2 the named entities are formatted by the following steps:
Step D21: for the winning bid amount, check whether the entity contains a magnitude or currency unit such as "hundred", "thousand", "ten thousand", "hundred million", "dollar" or "yen"; if it does, perform unit conversion;
Step D22: for the bid award time, convert it to the date format YYYY-MM-DD.
Compared with the prior art, the present invention has the following beneficial effects:
Being based on a Lattice-LSTM-LSTM model, the invention can effectively recognize the named entities in the winning bid detail pages of bidding websites, and it also recognizes long entities well.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawing and an embodiment.
Referring to Fig. 1, the present invention provides an LSTM-based winning bid web page named entity extraction method, comprising the following steps:
Step A: clean the text data of the winning bid web page to be processed to obtain the winning bid text;
Step B: use a Lattice-LSTM model as the encoding layer, with the winning bid text as its input, to obtain the semantic information features of the winning bid text;
Step B1: convert the characters in the winning bid text into character vectors;
Here the j-th character $c_j$ of the winning bid text is converted into a character vector $x_j^c$ by table lookup, $x_j^c = e^c(c_j)$,
where $e^c$ denotes the character vector mapping table.
Step B2: convert the words in the winning bid text into word vectors;
Step B21: build a lexicon D as a trie from a large-scale corpus;
Step B22: initialize an empty set P of matched words for the winning bid text;
Step B23: start the traversal with the first character of the winning bid text as the current character and perform step B24;
Step B24: add to the set P every word $w_{b,e}$ in lexicon D that matches the text and has the current character as its first character;
where b denotes the position in the sentence of the first character of the word and e denotes the position of its last character;
Step B25: take the character following the current character as the new current character and repeat step B24 until the last character of the winning bid text has been processed;
Step B26: after the traversal, convert each word $w_{b,e}$ in the set P into a word vector $x_{b,e}^w = e^w(w_{b,e})$,
where $e^w$ is the word vector mapping table.
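Steps B21 to B26 can be illustrated with a short sketch. It is only an illustration under assumptions: the lexicon is a tiny hand-written list rather than one built from a large-scale corpus, positions are 0-based, and the names `build_trie` and `match_words` are hypothetical. The code builds a trie and, for each start position b, collects every lexicon word that begins there together with its span (b, e).

```python
END = ""  # end-of-word marker; a text character is never the empty string


def build_trie(words):
    """Build a nested-dict trie; the key END marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def match_words(text, trie):
    """Return (b, e, word) for every lexicon word found in text.

    b and e are the 0-based positions of the first and last character.
    """
    matches = []
    for b in range(len(text)):          # steps B23/B25: advance the current character
        node = trie
        for e in range(b, len(text)):   # step B24: follow the trie from position b
            node = node.get(text[e])
            if node is None:
                break
            if END in node:
                matches.append((b, e, text[b:e + 1]))
    return matches


# Toy lexicon, assumed for illustration only.
lexicon = ["福州", "福州大学", "大学", "中标"]
trie = build_trie(lexicon)
print(match_words("福州大学中标", trie))
```

Each matched span would then be looked up in the word vector mapping table, as in step B26.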
Step B3: input the word vectors into the Lattice-LSTM model and use the Lattice-LSTM model to obtain the semantic information features of the winning bid text.
For each sentence in the text, the character vector sequence obtained in step B1 and the word vector sequence obtained in step B2 are input in order into the Lattice-LSTM model, which outputs a sequence of vectors representing the semantic information of each character in its context.
The inputs at time j are the character vector of the j-th character of the sentence and the word vectors of the words that end at the j-th character; the result is the output at time j. The word-level LSTM has its own weight matrices and bias terms and computes, at time j, a forget gate, an input gate, a candidate memory vector and a memory vector. The character-level LSTM likewise has its own weight matrices and bias terms and computes, at time j, an input gate, a candidate memory vector, a memory vector and an output gate, together with the weights used when computing the memory vector at time j.
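As a reading aid, the computation described in step B3 can be written out following the Lattice-LSTM formulation commonly given in the literature. This is a sketch under assumptions: every symbol name below ($x_j^c$, $x_{b,e}^w$, $h_j^c$, $W^{w}$, $W^{c}$, $W^{l}$, $\alpha$ and so on) is chosen here to match the quantities listed above and may differ from the notation of the patent itself.

```latex
% Word-level cell for a lexicon word spanning characters b..e
\begin{aligned}
\begin{bmatrix} i_{b,e}^{w} \\ f_{b,e}^{w} \\ \tilde{c}_{b,e}^{w} \end{bmatrix}
  &= \begin{bmatrix} \sigma \\ \sigma \\ \tanh \end{bmatrix}
     \left( W^{w\top} \begin{bmatrix} x_{b,e}^{w} \\ h_{b}^{c} \end{bmatrix} + b^{w} \right) \\
c_{b,e}^{w} &= f_{b,e}^{w} \odot c_{b}^{c} + i_{b,e}^{w} \odot \tilde{c}_{b,e}^{w}
\end{aligned}

% Character-level cell at time j, fusing all words that end at character j
\begin{aligned}
\begin{bmatrix} i_{j}^{c} \\ o_{j}^{c} \\ \tilde{c}_{j}^{c} \end{bmatrix}
  &= \begin{bmatrix} \sigma \\ \sigma \\ \tanh \end{bmatrix}
     \left( W^{c\top} \begin{bmatrix} x_{j}^{c} \\ h_{j-1}^{c} \end{bmatrix} + b^{c} \right) \\
i_{b,j}^{c} &= \sigma\left( W^{l\top} \begin{bmatrix} x_{j}^{c} \\ c_{b,j}^{w} \end{bmatrix} + b^{l} \right) \\
\alpha_{b,j}^{c} &= \frac{\exp(i_{b,j}^{c})}{\exp(i_{j}^{c}) + \sum_{b'} \exp(i_{b',j}^{c})},
\qquad
\alpha_{j}^{c} = \frac{\exp(i_{j}^{c})}{\exp(i_{j}^{c}) + \sum_{b'} \exp(i_{b',j}^{c})} \\
c_{j}^{c} &= \sum_{b} \alpha_{b,j}^{c} \odot c_{b,j}^{w} + \alpha_{j}^{c} \odot \tilde{c}_{j}^{c} \\
h_{j}^{c} &= o_{j}^{c} \odot \tanh\left(c_{j}^{c}\right)
\end{aligned}
```

In this formulation the sums run over the start positions b of all lexicon words that end at character j, so a character with no matching word reduces to an ordinary LSTM update.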
Step C: use an LSTM model as the decoding layer, with the obtained semantic information features of the winning bid text as its input, to label each character in the winning bid text;
Step C1: for the named entity recognition task on winning bid web pages, divide the characters in the data into two classes;
The first class represents characters unrelated to any entity and is denoted by the label "O"; the second class represents characters related to an entity, and the label of such a character consists of three parts.
Step C2: input the hidden state information obtained in step B, which represents the semantic information of the text, into the LSTM model of the decoding layer, and compute the output state of each character under the influence of its context characters; the result is the label vector of the character;
Step C3: input the label vector into a Softmax classifier and normalize it to compute the probability that each character in the text is assigned each label,
where $W_y$ is the weight matrix, $b_y$ is the bias term and $N_t$ is the number of label types;
Step C4: use the log-likelihood function as the loss function and train the model by stochastic gradient descent, updating the model parameters iteratively through backpropagation so as to minimize the loss function,
where D denotes the size of the training set, $L_j$ is the length of sentence $x_j$, the loss is computed from the label of character t in sentence $x_j$ and its normalized probability, $\Theta$ denotes the model parameters, and $I(O)$ is a selection function that distinguishes the loss of the label "O" from the loss of the labels that indicate entities.
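Steps C3 and C4 can be illustrated with a small sketch. It assumes the usual Softmax form, in which the label vector is mapped linearly by $W_y$ and $b_y$ and then normalized over the $N_t$ labels, and it assumes that $I(O)$ equals 1 for the label "O" and 0 otherwise, with the terms of entity labels scaled by a weight `alpha`. That weight, its default value, the label index 0 for "O", the toy numbers and both function names are assumptions for illustration, not values given in the text.

```python
import math


def label_probabilities(label_vec, W_y, b_y):
    """Map a label vector to one probability per label (W_y has N_t rows)."""
    scores = [sum(w * t for w, t in zip(row, label_vec)) + b
              for row, b in zip(W_y, b_y)]
    # Softmax, shifted by the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]


def biased_nll(prob_rows, gold_ids, outside_id=0, alpha=10.0):
    """Negative log-likelihood of one sentence with entity labels up-weighted.

    prob_rows:  one list of label probabilities per character.
    gold_ids:   index of the gold label of each character.
    outside_id: index assumed to stand for the label "O".
    alpha:      assumed weight on entity-label terms (hypothetical parameter).
    """
    loss = 0.0
    for probs, y in zip(prob_rows, gold_ids):
        is_outside = 1.0 if y == outside_id else 0.0   # selection function I(O)
        log_p = math.log(probs[y])
        loss -= log_p * is_outside + alpha * log_p * (1.0 - is_outside)
    return loss


# Toy example: a 2-dimensional label vector and N_t = 3 labels.
W_y = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
b_y = [0.0, 0.0, 0.0]
probs = label_probabilities([2.0, 1.0], W_y, b_y)
print(probs)
print(biased_nll([probs, probs], gold_ids=[0, 2]))
```

With `alpha=1.0` the function reduces to the plain negative log-likelihood; a larger value makes errors on entity characters cost more than errors on "O" characters.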
Step D: apply rule-based correction and formatting to the labeled winning bid text;
Step D1: apply rule-based correction to the labeled data obtained in step C;
Step D11: for the winning bid amount, use a regular expression to check whether the entity contains Arabic numerals or Chinese numerals; if it does not, it is not considered a winning bid amount and is discarded.
Step D12: for the bid award time, discard any entity that is not in a date format.
Step D13: for the project name, since project name entities are usually long strings and a name consisting of only two or three characters essentially never occurs, discard any recognized project name entity whose string length is less than 4.
Step D14: when several entities of the same category are found in one winning bid record, keep the named entity with the longest string.
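The correction rules D11 to D14 can be sketched as simple filters. The regular expressions, the set of Chinese numeral characters and the category names (`amount`, `time`, `project_name`) are assumptions for illustration; the text does not give the exact patterns.

```python
import re

# Arabic digits or Chinese numeral characters (assumed character set).
NUMERAL = re.compile(r"[0-9零〇一二两三四五六七八九十百千万亿壹贰叁肆伍陆柒捌玖拾佰仟萬億]")
# Assumed date shapes: 2019-01-07, 2019/1/7, 2019.1.7, 2019年1月7日.
DATE = re.compile(r"\d{4}\s*[-/.年]\s*\d{1,2}\s*[-/.月]\s*\d{1,2}\s*日?")


def keep_entity(category, text):
    """Rules D11 to D13: decide whether a single recognized entity is kept."""
    if category == "amount":
        return NUMERAL.search(text) is not None
    if category == "time":
        return DATE.search(text) is not None
    if category == "project_name":
        return len(text) >= 4
    return True


def correct(entities):
    """Apply D11 to D13, then D14: keep the longest entity of each category."""
    best = {}
    for category, text in entities:
        if not keep_entity(category, text):
            continue
        if category not in best or len(text) > len(best[category]):
            best[category] = text
    return best


found = [
    ("amount", "人民币"),                # no numeral, dropped by D11
    ("amount", "120万元"),
    ("time", "2019年1月7日"),
    ("project_name", "工程"),            # shorter than 4, dropped by D13
    ("project_name", "校园网络改造工程"),
    ("project_name", "网络改造工程"),    # shorter duplicate, dropped by D14
]
print(correct(found))
```

Keeping the longest string per category is a direct reading of step D14; ties are resolved here in favor of the first entity seen, which is a choice made for the sketch.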
Step D2: format the corrected data.
Step D21: for the winning bid amount, check whether the entity contains a magnitude or currency unit such as "hundred", "thousand", "ten thousand", "hundred million", "dollar" or "yen"; if it does, perform unit conversion;
Step D22: for the bid award time, convert it to the date format YYYY-MM-DD.
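Steps D21 and D22 can be illustrated as follows. The sketch assumes amounts written as an Arabic number followed by an optional Chinese magnitude unit, and dates written with 年/月/日 or common separators. Currency conversion for "dollar" or "yen" is left out because no exchange rates are given. The function names and the unit table are hypothetical.

```python
import re

# Assumed multipliers for the magnitude units named in step D21.
UNIT = {"百": 100, "佰": 100, "千": 1_000, "仟": 1_000,
        "万": 10_000, "萬": 10_000, "亿": 100_000_000, "億": 100_000_000}


def normalize_amount(text):
    """Convert e.g. '120.5万元' to a plain number; None if no number is found."""
    m = re.search(r"(\d+(?:\.\d+)?)\s*([百佰千仟万萬亿億]?)", text.replace(",", ""))
    if m is None:
        return None
    return float(m.group(1)) * UNIT.get(m.group(2), 1)


def normalize_date(text):
    """Convert e.g. '2019年1月7日' or '2019/1/7' to 'YYYY-MM-DD'; None if no match."""
    m = re.search(r"(\d{4})\s*[-/.年]\s*(\d{1,2})\s*[-/.月]\s*(\d{1,2})", text)
    if m is None:
        return None
    year, month, day = (int(g) for g in m.groups())
    return f"{year:04d}-{month:02d}-{day:02d}"


print(normalize_amount("120.5万元"))
print(normalize_date("2019年1月7日"))
```

Amounts written entirely in Chinese numerals would need an additional numeral parser, which this sketch does not attempt.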
Step E: output the recognized named entities: the tendering organization, the winning organization, the location of the tendering organization, the winning bid amount, the contact person of the tendering organization, the tender project name and the bid award time.
The above is only a preferred embodiment of the present invention; all equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of the present invention.

Claims (9)

1. An LSTM-based winning bid web page named entity extraction method, characterized by comprising the following steps:
Step A: clean the text data of the winning bid web page to be processed to obtain the winning bid text;
Step B: use a Lattice-LSTM model as the encoding layer, with the winning bid text as its input, to obtain the semantic information features of the winning bid text;
Step C: use an LSTM model as the decoding layer, with the obtained semantic information features of the winning bid text as its input, to label each character in the winning bid text;
Step D: apply rule-based correction and formatting to the labeled winning bid text;
Step E: output the recognized named entities.
2. The LSTM-based winning bid web page named entity extraction method according to claim 1, characterized in that step B specifically comprises:
Step B1: convert the characters in the winning bid text into character vectors;
Here the j-th character $c_j$ of the winning bid text is converted into a character vector $x_j^c$ by table lookup, $x_j^c = e^c(c_j)$,
where $e^c$ denotes the character vector mapping table;
Step B2: convert the words in the winning bid text into word vectors;
Step B3: input the word vectors into the Lattice-LSTM model and use the Lattice-LSTM model to obtain the semantic information features of the winning bid text.
3. The LSTM-based winning bid web page named entity extraction method according to claim 2, characterized in that step B2 specifically comprises:
Step B21: build a lexicon D as a trie from a large-scale corpus;
Step B22: initialize an empty set P of matched words for the winning bid text;
Step B23: start the traversal with the first character of the winning bid text as the current character and perform step B24;
Step B24: add to the set P every word $w_{b,e}$ in lexicon D that matches the text and has the current character as its first character;
where b denotes the position in the sentence of the first character of the word and e denotes the position of its last character;
Step B25: take the character following the current character as the new current character and repeat step B24 until the last character of the winning bid text has been processed;
Step B26: after the traversal, convert each word $w_{b,e}$ in the set P into a word vector $x_{b,e}^w = e^w(w_{b,e})$,
where $e^w$ is the word vector mapping table.
4. The LSTM-based winning bid web page named entity extraction method according to claim 2, characterized in that step B3 is as follows:
For each sentence in the text, the character vector sequence obtained in step B1 and the word vector sequence obtained in step B2 are input in order into the Lattice-LSTM model, which outputs a sequence of vectors representing the semantic information of each character in its context.
The inputs at time j are the character vector of the j-th character of the sentence and the word vectors of the words that end at the j-th character; the result is the output at time j. The word-level LSTM has its own weight matrices and bias terms and computes, at time j, a forget gate, an input gate, a candidate memory vector and a memory vector. The character-level LSTM likewise has its own weight matrices and bias terms and computes, at time j, an input gate, a candidate memory vector, a memory vector and an output gate, together with the weights used when computing the memory vector at time j.
5. The LSTM-based winning bid web page named entity extraction method according to claim 4, characterized in that step C specifically comprises:
Step C1: for the named entity recognition task on winning bid web pages, divide the characters in the data into two classes;
The first class represents characters unrelated to any entity and is denoted by the label "O"; the second class represents characters related to an entity, and the label of such a character consists of three parts.
Step C2: input the hidden state information obtained in step B, which represents the semantic information of the text, into the LSTM model of the decoding layer, and compute the output state of each character under the influence of its context characters; the result is the label vector of the character;
Step C3: input the label vector into a Softmax classifier and normalize it to compute the probability that each character in the text is assigned each label,
where $W_y$ is the weight matrix, $b_y$ is the bias term and $N_t$ is the number of label types;
Step C4: use the log-likelihood function as the loss function and train the model by stochastic gradient descent, updating the model parameters iteratively through backpropagation so as to minimize the loss function,
where D denotes the size of the training set, $L_j$ is the length of sentence $x_j$, the loss is computed from the label of character t in sentence $x_j$ and its normalized probability, $\Theta$ denotes the model parameters, and $I(O)$ is a selection function that distinguishes the loss of the label "O" from the loss of the labels that indicate entities.
6. The LSTM-based winning bid web page named entity extraction method according to claim 1, characterized in that the named entities include the tendering organization, the winning organization, the location of the tendering organization, the winning bid amount, the contact person of the tendering organization, the tender project name and the bid award time.
7. The LSTM-based winning bid web page named entity extraction method according to claim 6, characterized in that step D specifically comprises:
Step D1: apply rule-based correction to the labeled data obtained in step C;
Step D2: format the corrected data.
8. The LSTM-based winning bid web page named entity extraction method according to claim 7, characterized in that step D1 specifically comprises:
Step D11: for the winning bid amount, use a regular expression to check whether the entity contains Arabic numerals or Chinese numerals; if it does not, it is not considered a winning bid amount and is discarded.
Step D12: for the bid award time, discard any entity that is not in a date format.
Step D13: for the project name, since project name entities are usually long strings and a name consisting of only two or three characters essentially never occurs, discard any recognized project name entity whose string length is less than 4.
Step D14: when several entities of the same category are found in one winning bid record, keep the named entity with the longest string.
9. The LSTM-based winning bid web page named entity extraction method according to claim 1, characterized in that in step D2 the named entities are formatted by the following steps:
Step D21: for the winning bid amount, check whether the entity contains a magnitude or currency unit such as "hundred", "thousand", "ten thousand", "hundred million", "dollar" or "yen"; if it does, perform unit conversion;
Step D22: for the bid award time, convert it to the date format YYYY-MM-DD.
CN201910013185.2A 2019-01-07 2019-01-07 LSTM-based winning bid web page named entity extraction method Active CN109753660B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910013185.2A CN109753660B (en) 2019-01-07 2019-01-07 LSTM-based winning bid web page named entity extraction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910013185.2A CN109753660B (en) 2019-01-07 2019-01-07 LSTM-based winning bid web page named entity extraction method

Publications (2)

Publication Number Publication Date
CN109753660A (en) 2019-05-14
CN109753660B (en) 2023-06-13

Family

ID=66404567

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910013185.2A Active CN109753660B (en) 2019-01-07 2019-01-07 LSTM-based winning bid web page named entity extraction method

Country Status (1)

Country Link
CN (1) CN109753660B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334300A (en) * 2019-07-10 2019-10-15 哈尔滨工业大学 Text aid reading method towards the analysis of public opinion
CN110738182A (en) * 2019-10-21 2020-01-31 四川隧唐科技股份有限公司 LSTM model unit training method and device for high-precision identification of bid amount
CN110738319A (en) * 2019-11-11 2020-01-31 四川隧唐科技股份有限公司 LSTM model unit training method and device for recognizing bid-winning units based on CRF
CN111078978A (en) * 2019-11-29 2020-04-28 上海观安信息技术股份有限公司 Web credit website entity identification method and system based on website text content
CN111737969A (en) * 2020-07-27 2020-10-02 北森云计算有限公司 Resume parsing method and system based on deep learning
CN111738002A (en) * 2020-05-26 2020-10-02 北京信息科技大学 Ancient text field named entity identification method and system based on Lattice LSTM
CN112017016A (en) * 2019-10-29 2020-12-01 河南拓普计算机网络工程有限公司 Method for cleaning bid amount of bid-attracting bulletin
CN112948588A (en) * 2021-05-11 2021-06-11 中国人民解放军国防科技大学 Chinese text classification method for quick information editing
CN112989807A (en) * 2021-03-11 2021-06-18 重庆理工大学 Long digital entity extraction method based on continuous digital compression coding
CN112990845A (en) * 2021-01-04 2021-06-18 江苏省测绘地理信息局信息中心 Intelligent acquisition method for mapping market project
JP2021111416A (en) * 2020-01-15 2021-08-02 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Method and apparatus for labeling core entity, electronic device, storage medium, and computer program
CN114048750A (en) * 2021-12-10 2022-02-15 广东工业大学 Named entity identification method integrating information advanced features

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100082331A1 (en) * 2008-09-30 2010-04-01 Xerox Corporation Semantically-driven extraction of relations between named entities
CN106569998A (en) * 2016-10-27 2017-04-19 浙江大学 Text named entity recognition method based on Bi-LSTM, CNN and CRF
CN107832400A (en) * 2017-11-01 2018-03-23 山东大学 Relation classification method using a position-based joint LSTM and CNN model
CN108416058A (en) * 2018-03-22 2018-08-17 北京理工大学 Relation extraction method based on Bi-LSTM input information enhancement
CN108509423A (en) * 2018-04-04 2018-09-07 福州大学 Named entity extraction method for bid-winning webpages based on second-order HMM

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
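Tang Min (唐敏): "Research on Chinese Entity Relation Extraction Methods Based on Deep Learning", Wanfang Data Dissertation Database (《万方数据学位论文库》), 19 December 2018 (2018-12-19), pages 1 - 75 *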

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110334300A (en) * 2019-07-10 2019-10-15 哈尔滨工业大学 Assisted text reading method for public opinion analysis
CN110738182A (en) * 2019-10-21 2020-01-31 四川隧唐科技股份有限公司 LSTM model unit training method and device for high-precision identification of bid amount
CN112017016A (en) * 2019-10-29 2020-12-01 河南拓普计算机网络工程有限公司 Method for cleaning bid amount of bid-attracting bulletin
CN110738319A (en) * 2019-11-11 2020-01-31 四川隧唐科技股份有限公司 LSTM model unit training method and device for recognizing bid-winning units based on CRF
CN111078978A (en) * 2019-11-29 2020-04-28 上海观安信息技术股份有限公司 Web credit website entity identification method and system based on website text content
CN111078978B (en) * 2019-11-29 2024-02-27 上海观安信息技术股份有限公司 Network credit website entity identification method and system based on website text content
JP2021111416A (en) * 2020-01-15 2021-08-02 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Method and apparatus for labeling core entity, electronic device, storage medium, and computer program
JP7110416B2 (en) 2020-01-15 2022-08-01 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Core entity tagging method, core entity tagging device, electronic device, storage medium and computer program
CN111738002A (en) * 2020-05-26 2020-10-02 北京信息科技大学 Ancient text field named entity identification method and system based on Lattice LSTM
CN111737969A (en) * 2020-07-27 2020-10-02 北森云计算有限公司 Resume parsing method and system based on deep learning
CN112990845A (en) * 2021-01-04 2021-06-18 江苏省测绘地理信息局信息中心 Intelligent acquisition method for mapping market project
CN112989807A (en) * 2021-03-11 2021-06-18 重庆理工大学 Long digital entity extraction method based on continuous digital compression coding
CN112989807B (en) * 2021-03-11 2021-11-23 重庆理工大学 Long digital entity extraction method based on continuous digital compression coding
CN112948588A (en) * 2021-05-11 2021-06-11 中国人民解放军国防科技大学 Chinese text classification method for quick information editing
CN114048750A (en) * 2021-12-10 2022-02-15 广东工业大学 Named entity identification method integrating information advanced features

Also Published As

Publication number Publication date
CN109753660B (en) 2023-06-13

Similar Documents

Publication Publication Date Title
CN109753660A (en) LSTM-based named entity extraction method for bid-winning webpages
CN108984526B (en) Document theme vector extraction method based on deep learning
CN110083831B (en) Chinese named entity identification method based on BERT-BiGRU-CRF
CN104834747B (en) Short text classification method based on convolutional neural networks
CN110555084B (en) Remote supervision relation classification method based on PCNN and multi-layer attention
WO2018028077A1 (en) Deep learning based method and device for chinese semantics analysis
CN106599032B (en) Text event extraction method combining sparse coding and a structured perceptron
CN110083700A (en) Enterprise public opinion sentiment classification method and system based on convolutional neural networks
CN109117472A (en) Uyghur named entity recognition method based on deep learning
CN113591483A (en) Document-level event argument extraction method based on sequence labeling
CN109902177A (en) Text sentiment analysis method based on a dual-channel convolutional memory neural network
CN113220876B (en) Multi-label classification method and system for English text
WO2022198750A1 (en) Semantic recognition method
CN110297889B (en) Enterprise emotional tendency analysis method based on feature fusion
CN110188175A (en) Question-answer pair extraction method, system and storage medium based on a BiLSTM-CRF model
CN110750646B (en) Attribute description extracting method for hotel comment text
CN111966825A (en) Power grid equipment defect text classification method based on machine learning
CN110851593B (en) Complex value word vector construction method based on position and semantics
CN111177383A (en) Text entity relation automatic classification method fusing text syntactic structure and semantic information
CN109840328A (en) Deep learning method for sentiment tendency analysis of product review text
CN108932229A (en) Financial article tendency analysis method
CN111666752A (en) Circuit teaching material entity relation extraction method based on keyword attention mechanism
CN115114926A (en) Chinese agricultural named entity identification method
CN110134950A (en) Automatic text proofreading method combining characters and words
CN113488196A (en) Drug specification text named entity recognition modeling method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant