CN107807917A - Method for extracting content of text, device, system and storage medium - Google Patents
- Publication number: CN107807917A
- Application number: CN201710896296.3A
- Authority: CN (China)
- Prior art keywords: content, text, extraction, books, book
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The invention discloses a text content extraction method, device, system, and storage medium. The method includes: receiving a text content extraction request sent by an editing terminal, and sending a text content extraction page to the editing terminal; receiving book information that the editing terminal sends via the text content extraction page, the book information including the book category, book title, and author; and, according to the book information, querying a book database, extracting the target text content of the book using semantic analysis and preset content extraction rules, and transmitting it to the editing terminal. Through the interaction between an intelligent terminal and a server, the invention semi-automates target text content extraction: while keeping the extracted target text content accurate, it improves extraction efficiency and saves time and labor costs.
Description
Technical field
The present invention relates to the field of natural language processing, and in particular to a text content extraction method, device, system, and storage medium.
Background
As teaching platforms grow in number and maturity, people are increasingly willing to pay for online education. With the rapid development of mobile terminals, mobile phones and computers have become everyday necessities, and online reading has become a common interest and habit. To provide users with high-quality reading resources, platforms, readers, and apps rely heavily on manual screening and identification of the resources offered to users, so as to present the best and most valuable content. In a commercial setting, however, selecting the essential content of a book purely by having people read, or even closely read, the full text is accurate but inefficient, and its time and labor costs are high.
Summary of the invention
The technical problem to be solved by an embodiment of the present invention is to provide a text content extraction method, device, system, and storage medium that semi-automate target text content extraction, so that, while the extracted target text content remains accurate, extraction efficiency improves and time and labor costs are saved.
To solve the above technical problem, an embodiment of the present invention provides a text content extraction method comprising the following steps:
receiving a text content extraction request sent by an editing terminal, and sending a text content extraction page to the editing terminal;
receiving book information that the editing terminal sends via the text content extraction page, the book information including the book category, book title, and author;
according to the book information, querying a book database, extracting the target text content of the book using semantic analysis and preset content extraction rules, and transmitting it to the editing terminal.
Preferably, querying the book database according to the book information, extracting the target text content of the book using semantic analysis and preset content extraction rules, and transmitting it to the editing terminal specifically comprises:
querying the book database according to the book category, book title, and author to obtain the text content of the book;
performing semantic analysis on the text content data of the book to be extracted, and matching the semantic analysis result against the corresponding content extraction rules in a rule base;
if the match succeeds, extracting the target text content from the text content of the book using the matched content extraction rule, and transmitting the extracted target text content to the editing terminal;
if the match fails, recording the semantic analysis result, establishing a new content extraction rule, and adding the newly established rule to the rule base.
Preferably, performing semantic analysis on the text content data of the book to be extracted includes: performing word segmentation and part-of-speech tagging on the text content data; performing entity annotation on the segmentation result; and building the association relations between the words in the data. The entity annotation includes person-name annotation, time annotation, and number annotation.
Preferably, performing entity annotation on the segmentation result specifically comprises:
using a conditional random field (CRF) model, starting from the word segmentation and part-of-speech tagging produced for the text content of the book by machine learning, and additionally using the context of the text content, the parts of speech of the preceding and following words, and the word length, to perform entity annotation on the text content of the book.
Preferably, the content extraction rules are obtained by training and analysis on selected samples of book text content, keywords, and the grammatical relations associated with the keywords; the rule base is built from the text content of books and its semantic analysis.
An embodiment of the present invention also provides a text content extraction device, including:
a text content extraction request receiving unit, configured to receive a text content extraction request sent by an editing terminal and to send a text content extraction page to the editing terminal;
a text content extraction unit, configured to receive book information that the editing terminal sends via the text content extraction page and, according to the book information, to query a book database, extract the target text content of the book using semantic analysis and preset content extraction rules, and transmit it to the editing terminal; the book information includes the book category, book title, and author.
An embodiment of the present invention also provides a text content extraction device including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor; when the processor executes the computer program, the text content extraction method described above is implemented.
An embodiment of the present invention also provides a storage medium that includes a stored computer program, wherein, when the computer program runs, the device on which the storage medium resides is controlled to perform the text content extraction method described above.
An embodiment of the present invention also provides a text content extraction system, including an editing terminal and a server;
the editing terminal is configured to send a text content extraction request to the server;
the server is configured to send a text content extraction page to the editing terminal in response to the text content extraction request;
the editing terminal is further configured to obtain the book information that the user chooses on the text content extraction page and to send it to the server; the book information includes the book category, book title, and author;
the server is further configured to query a book database according to the book information, extract the target text content of the book using semantic analysis and preset content extraction rules, and transmit it to the editing terminal.
Implementing the embodiments of the present invention has the following beneficial effects:
In the text content extraction method, device, system, and storage medium of the present invention, a text content extraction request sent by an editing terminal is received and a text content extraction page is sent to the editing terminal; book information sent by the editing terminal via the text content extraction page is received, the book information including the book category, book title, and author; and, according to the book information, the book database is queried, the target text content of the book is extracted using semantic analysis and preset content extraction rules, and it is transmitted to the editing terminal. The editor in charge can browse the preliminarily extracted text content that the server sends to the editing terminal and decide whether to read the book closely. Through the interaction between an intelligent terminal and a server, the invention semi-automates target text content extraction: while keeping the extracted target text content accurate, it improves extraction efficiency and saves time and labor costs.
Brief description of the drawings
To illustrate the technical solutions of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a text content extraction method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a text content extraction device provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a text content extraction method provided by an embodiment of the present invention.
The text content extraction method provided by an embodiment of the present invention can be executed by a server, and the description below takes the server as the executing entity.
The text content extraction method comprises the following steps:
S101, receiving a text content extraction request sent by an editing terminal, and sending a text content extraction page to the editing terminal;
In an embodiment of the present invention, the editing terminal can be an intelligent terminal such as a smartphone or a PC, and the text content extraction page can be a reader app page, a WeChat mini program page, a WeChat official account page, or the like. Taking a WeChat official account as an example, the data exchange between the editing terminal and the server uses the official account page, the official account editing page, or the editing page of another platform's editor as the presentation layer. After the editor in charge enters the editing page and clicks the text editing option, the editing terminal immediately sends a text content extraction request to the server; the server responds to the request and returns the text content extraction page to the editing terminal.
S102, receiving the book information that the editing terminal sends via the text content extraction page; the book information includes the book category, book title, and author;
In an embodiment of the present invention, the editor in charge can perform text content extraction operations on the text content extraction page that the server returned to the editing terminal: for example, choosing from a large collection of books the range or category of books to browse (such as finance and economics, finance, or investment) and a specific book. The book information of the chosen book, including the book category, book title, and author, is then sent to the server, which performs the next extraction step.
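The book information submitted in S102 can be pictured as a small record with three required fields. The following is a minimal sketch, assuming the editing terminal submits a dictionary-like payload; the names `BookInfo`, `parse_book_info`, and the field keys are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass

# Field names are illustrative assumptions, not taken from the patent.
REQUIRED_FIELDS = ("category", "title", "author")


@dataclass(frozen=True)
class BookInfo:
    category: str
    title: str
    author: str


def parse_book_info(payload):
    """Validate a submitted payload and turn it into a BookInfo record."""
    cleaned = {}
    for name in REQUIRED_FIELDS:
        value = payload.get(name)
        # Reject missing, non-string, or blank values.
        if not isinstance(value, str) or not value.strip():
            raise ValueError("missing book information field: " + name)
        cleaned[name] = value.strip()
    return BookInfo(**cleaned)
```

Rejecting incomplete submissions at this point keeps the later database query from running with a partial key.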
S103, according to the book information, querying the book database, extracting the target text content of the book using semantic analysis and preset content extraction rules, and transmitting it to the editing terminal.
In an embodiment of the present invention, preferably, this step specifically comprises:
querying the book database according to the book category, book title, and author to obtain the text content of the book;
performing semantic analysis on the text content data of the book to be extracted, and matching the semantic analysis result against the corresponding content extraction rules in a rule base;
if the match succeeds, extracting the target text content from the text content of the book using the matched content extraction rule, and transmitting the extracted target text content to the editing terminal;
if the match fails, recording the semantic analysis result, establishing a new content extraction rule, and adding the newly established rule to the rule base.
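The first of these sub-steps, looking up the book text by category, title, and author, could be sketched as follows. The patent does not specify a database engine or schema, so the in-memory SQLite database, the `books` table, and its columns are assumptions made purely for illustration.

```python
import sqlite3


def create_demo_database():
    """Build a throwaway in-memory database with one sample book (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (category TEXT, title TEXT, author TEXT, content TEXT)"
    )
    conn.execute(
        "INSERT INTO books VALUES (?, ?, ?, ?)",
        ("finance", "Sample Title", "Sample Author", "First sentence. Second sentence."),
    )
    conn.commit()
    return conn


def fetch_book_text(conn, category, title, author):
    """Return the stored text of the matching book, or None if no book matches."""
    row = conn.execute(
        "SELECT content FROM books WHERE category = ? AND title = ? AND author = ?",
        (category, title, author),
    ).fetchone()
    return row[0] if row else None
```

The query uses parameter placeholders rather than string formatting, so values typed by the editor cannot alter the SQL statement.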
In an embodiment of the present invention, preferably, performing semantic analysis on the text content data of the book to be extracted includes: performing word segmentation and part-of-speech tagging on the text content data; performing entity annotation on the segmentation result; and building the association relations between the words in the data. The entity annotation includes person-name annotation, time annotation, and number annotation.
Specifically, the processing in an embodiment of the present invention is as follows.
Content extraction rules are obtained by training and analysis on selected samples of book text content, keywords, and the grammatical relations associated with the keywords, and the rule base is built from the text content of books and its semantic analysis:
First step, the text content of the book is segmented into words and tagged with parts of speech, which supports the subsequent entity annotation and the building of association relations between the words in the data. This step uses common natural language processing techniques; models based on statistics or on machine learning can perform the word segmentation and part-of-speech tagging of the text content. For example, segmenting and tagging the sentence "Kenichi Ohmae (大前研一) proposes 3 key points that can touch people's hearts ..." yields "Kenichi Ohmae/n, proposes/v, 3/num, (measure word)/uj, can touch/v, people's hearts/adj, key points/n ...", where /x is a part-of-speech tag; for example, n marks a noun and v marks a verb.
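To make the first step concrete, here is a minimal sketch of word segmentation with part-of-speech lookup. It uses dictionary-based forward maximum matching, a much simpler technique than the statistical or machine-learning models the text refers to, and it is swapped in only for illustration. The Chinese example string is a reconstruction of the sentence discussed above, and the lexicon entries and the `segment_and_tag` function are hypothetical.

```python
# Toy lexicon: word -> part-of-speech tag (hypothetical entries for illustration).
LEXICON = {
    "大前研一": "n",
    "提出": "v",
    "个": "uj",
    "能打动": "v",
    "人心": "adj",
    "要点": "n",
}
MAX_WORD_LEN = max(len(word) for word in LEXICON)


def segment_and_tag(text):
    """Forward maximum matching: at each position take the longest lexicon word."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isdigit():
            # Group consecutive digits into one numeric token.
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append((text[i:j], "num"))
            i = j
            continue
        for length in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            word = text[i:i + length]
            if word in LEXICON:
                tokens.append((word, LEXICON[word]))
                i += length
                break
        else:
            # No lexicon word starts here: emit one character tagged "x" (unknown).
            tokens.append((text[i], "x"))
            i += 1
    return tokens
```

A real system would replace the lexicon lookup with a trained segmenter and tagger; the output shape, a list of (word, tag) pairs, is what the later steps consume.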
Second step, entity annotation, such as person-name annotation, time annotation, number annotation, and verb annotation, is performed on the segmentation result. Time and number annotation are simpler than the others: somewhat more involved regular expressions are enough to detect times and numbers and annotate them. Person-name and verb annotation are preferably performed with a conditional random field (CRF) model. Specifically, the CRF model takes the word segmentation and part-of-speech tagging produced for the text content of the book by machine learning, together with the context of the text content, the parts of speech of the preceding and following words, and the word length, and is trained on a large corpus; the words in the text content are then given the various entity annotations according to the training result.
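The two halves of the second step can be sketched separately: regular expressions for times and numbers, and a feature function that gathers the cues the text lists for the CRF (the word, its part of speech, its length, and the neighbouring words and their parts of speech). The patterns, label names, and function names below are assumptions for illustration; the patent does not give concrete expressions or a feature set. The feature dictionaries are the kind of input a CRF toolkit could be trained on, but no particular toolkit is implied.

```python
import re

# Hypothetical patterns: a date like 2017-09-28 or a clock time like 12:30, and a plain number.
TIME_RE = re.compile(r"\d{4}-\d{1,2}-\d{1,2}|\d{1,2}:\d{2}")
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def tag_time_and_number(token):
    """Label a token as TIME, NUM, or O (neither) using regular expressions only."""
    if TIME_RE.fullmatch(token):
        return "TIME"
    if NUMBER_RE.fullmatch(token):
        return "NUM"
    return "O"


def crf_features(tagged, i):
    """Features for token i of a list of (word, pos) pairs."""
    word, pos = tagged[i]
    feats = {"word": word, "pos": pos, "length": len(word)}
    if i > 0:
        feats["prev_word"], feats["prev_pos"] = tagged[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tagged) - 1:
        feats["next_word"], feats["next_pos"] = tagged[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    return feats
```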
Note that a conditional random field is a discriminative probabilistic model and a kind of random field, commonly used to label or analyze sequence data such as natural language text or biological sequences. Like a Markov random field, a conditional random field is an undirected graphical model: the vertices of the graph represent random variables, and the edges between vertices represent dependencies between random variables. In a conditional random field, the distribution of the random variable Y is a conditional probability, conditioned on the observed random variable X. In principle the graph layout of a conditional random field can be arbitrary; the commonly used layout is a linear chain, for which efficient algorithms exist for training, inference, and decoding.
Conditional random fields are used for lexical analysis tasks such as Chinese word segmentation and part-of-speech tagging. General sequence classification models often use the hidden Markov model (HMM), for example in class-based Chinese word segmentation. However, the hidden Markov model makes two assumptions: the output independence assumption and the Markov assumption. The output independence assumption requires the sequence data to be strictly mutually independent for the derivation to be correct, whereas in fact most sequence data cannot be represented as a series of independent events. A conditional random field instead uses a probabilistic graphical model that can express long-distance dependencies and overlapping features, handles problems such as label (classification) bias better, and normalizes all features globally, so that a globally optimal solution can be obtained.
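Decoding in a linear-chain model is usually done with the Viterbi algorithm, which finds the globally best label sequence rather than choosing each label in isolation. The sketch below assumes the per-position label scores and the label-to-label transition scores have already been produced by a trained model; the score values in the usage note are made up, and the `viterbi` function and its argument layout are illustrative rather than anything the patent specifies.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions: list with one {label: score} dict per position.
    transitions: {(previous_label, current_label): score}; missing pairs score 0.0.
    Scores are additive, as they would be in log space.
    """
    labels = list(emissions[0])
    best = {label: emissions[0][label] for label in labels}
    back = []  # back[t][label] = best previous label for `label` at position t + 1
    for emission in emissions[1:]:
        new_best, pointers = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[p] + transitions.get((p, cur), 0.0))
            new_best[cur] = best[prev] + transitions.get((prev, cur), 0.0) + emission[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda label: best[label])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With made-up scores such as `viterbi([{"PER": 2.0, "O": 0.0}, {"PER": 0.0, "O": 3.0}], {})`, the call returns `["PER", "O"]`. The running time grows linearly with sequence length and quadratically with the number of labels, which is what makes the chain layout practical.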
Third step, the association relations between the words in the data are built, that is, the dependencies and associations between the words in the text content. Commonly used, relatively mature models for this include neural networks, maximum entropy, and conditional random fields. This step establishes the grammatical relations that hold between words or keywords, such as the verb-object relation and the modifier relation.
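One simple way to hold the relations produced by this step is as (head, kind, dependent) triples. The sketch below only shows the data structure and a lookup; it does not build the relations, which the text leaves to a trained model. The `Relation` type, the relation names, and the English glosses of the example words are assumptions for illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    head: str       # governing word, e.g. a verb
    kind: str       # relation type, e.g. "verb-object" or "modifier"
    dependent: str  # governed word


def find_dependents(relations, head, kind):
    """Return every word attached to `head` by a relation of the given kind."""
    return [r.dependent for r in relations if r.head == head and r.kind == kind]


# Hand-written relations for the example sentence (English glosses).
EXAMPLE_RELATIONS = [
    Relation("proposes", "verb-object", "3"),
    Relation("touch people's hearts", "modifier", "key points"),
]
```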
Fourth step, various text content extraction rules are established from the grammatical results of the third step and saved to the rule base. For example, a text content extraction rule is established as follows: in "Kenichi Ohmae (大前研一) proposes 3 key points that can touch people's hearts ...", the keyword "Kenichi Ohmae" is annotated as a person name; the keyword "proposes" is a verb, and the numeral linked to it by the verb-object relation is "3"; the keyword "touch people's hearts" is linked by the modifier relation to the noun "key points"; ... so the sentence "Kenichi Ohmae proposes 3 key points that can touch people's hearts ..." can be extracted. By analogy, various content extraction rules are extracted from a large number of data samples, and the rule base is established.
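The example rule can be pictured as a pattern over entity labels and relation types: a person name must be present, a verb-object relation must end in a numeral, and a modifier relation must end in a noun. The sketch below encodes that reading; the patent does not define a rule format, so `ExtractionRule`, the analysis dictionary layout, and the label names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionRule:
    entity_labels: frozenset      # entity labels that must occur, e.g. {"PERSON"}
    relation_patterns: frozenset  # (relation kind, tag of the dependent) pairs that must occur


def build_rule(analysis):
    """Derive a rule from one analyzed sample sentence."""
    return ExtractionRule(
        entity_labels=frozenset(label for _, label in analysis["entities"]),
        relation_patterns=frozenset(
            (kind, dep_tag) for _, kind, _, dep_tag in analysis["relations"]
        ),
    )


def rule_matches(rule, analysis):
    """A sentence matches when it contains every label and pattern the rule requires."""
    labels = {label for _, label in analysis["entities"]}
    patterns = {(kind, dep_tag) for _, kind, _, dep_tag in analysis["relations"]}
    return rule.entity_labels <= labels and rule.relation_patterns <= patterns


# Hand-written analysis of the sample sentence: entities are (text, label) pairs,
# relations are (head, kind, dependent, dependent tag) tuples.
sample = {
    "text": "Kenichi Ohmae proposes 3 key points that can touch people's hearts",
    "entities": [("Kenichi Ohmae", "PERSON")],
    "relations": [
        ("proposes", "verb-object", "3", "num"),
        ("touch people's hearts", "modifier", "key points", "n"),
    ],
}
rule_base = {build_rule(sample)}
```

Because the rule stores labels and relation types rather than the literal words, a different sentence with the same structure also matches it, which is the point of generalizing from samples.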
Preferably, the content extraction rules are obtained by training and analysis on selected samples of book text content, keywords, and the grammatical relations associated with the keywords; the rule base is built from the text content of books and its semantic analysis.
Note that once the rule base has been established, key content can be extracted from the text content of books. Semantic analysis is performed on the text content data of the book to be extracted, and the semantic analysis result is matched against the corresponding content extraction rules in the rule base. If the match succeeds, the target text content is extracted from the text content of the book using the matched content extraction rule, and the extracted target text content is transmitted to the editing terminal. If the match fails, the semantic analysis result is recorded, a new content extraction rule is established, and the newly established rule is added to the rule base.
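The match-or-record flow just described can be sketched end to end. Everything here is a stand-in: the database is a plain dictionary, `analyze` is a toy replacement for semantic analysis that only detects a known name and digits, and rules are sets of required tags. The text does not say how a new rule is derived after a failed match, so the sketch only records the failure and exposes a separate `add_rule` step.

```python
import re

KNOWN_NAMES = ("Kenichi Ohmae",)  # hypothetical list standing in for person-name annotation


def analyze(sentence):
    """Toy stand-in for semantic analysis: the set of entity tags found in the sentence."""
    tags = set()
    if any(name in sentence for name in KNOWN_NAMES):
        tags.add("PERSON")
    if re.search(r"\d", sentence):
        tags.add("NUM")
    return frozenset(tags)


def extract_target_content(book_info, database, rule_base, failed_analyses):
    """Return the sentences that match a rule; record the analyses that match none."""
    key = (book_info["category"], book_info["title"], book_info["author"])
    extracted = []
    for sentence in database.get(key, []):
        analysis = analyze(sentence)
        if any(rule <= analysis for rule in rule_base):
            extracted.append(sentence)  # match succeeded: keep as target content
        else:
            failed_analyses.append((sentence, analysis))  # match failed: record the result
    return extracted


def add_rule(rule_base, required_tags):
    """Update the rule base with a newly established rule."""
    rule_base.add(frozenset(required_tags))
```

Keeping the failed analyses in a list lets a person or a later training pass decide which of them deserve a new rule, which fits the semi-automated framing of the method.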
In the text content extraction method provided by an embodiment of the present invention, a text content extraction request sent by an editing terminal is received and a text content extraction page is sent to the editing terminal; book information sent by the editing terminal via the text content extraction page is received, the book information including the book category, book title, and author; and, according to the book information, the book database is queried, the target text content of the book is extracted using semantic analysis and preset content extraction rules, and it is transmitted to the editing terminal. The editor in charge can browse the preliminarily extracted text content that the server sends to the editing terminal and decide whether to read the book closely. Through the interaction between an intelligent terminal and a server, the invention semi-automates target text content extraction: while keeping the extracted target text content accurate, it improves extraction efficiency and saves time and labor costs.
Referring to Fig. 2, Fig. 2 is a schematic structural diagram of a text content extraction device provided by an embodiment of the present invention.
An embodiment of the present invention also provides a text content extraction device, including:
a text content extraction request receiving unit 201, configured to receive a text content extraction request sent by an editing terminal and to send a text content extraction page to the editing terminal;
a text content extraction unit 202, configured to receive book information that the editing terminal sends via the text content extraction page and, according to the book information, to query a book database, extract the target text content of the book using semantic analysis and preset content extraction rules, and transmit it to the editing terminal; the book information includes the book category, book title, and author.
In the text content extraction device provided by an embodiment of the present invention, the text content extraction request receiving unit 201 receives the text content extraction request sent by the editing terminal and sends the text content extraction page to the editing terminal; the text content extraction unit 202 then receives the book information that the editing terminal sends via the text content extraction page, the book information including the book category, book title, and author. According to the book information, the text content extraction unit 202 queries the book database, extracts the target text content of the book using semantic analysis and preset content extraction rules, and transmits it to the editing terminal. The editor in charge can browse the preliminarily extracted text content that the server sends to the editing terminal and decide whether to read the book closely. Through the interaction between an intelligent terminal and a server, the invention semi-automates target text content extraction: while keeping the extracted target text content accurate, it improves extraction efficiency and saves time and labor costs.
An embodiment of the present invention also provides a text content extraction device including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor; when the processor executes the computer program, the text content extraction method described above is implemented.
An embodiment of the present invention also provides a storage medium that includes a stored computer program, wherein, when the computer program runs, the device on which the storage medium resides is controlled to perform the text content extraction method described above.
An embodiment of the present invention also provides a text content extraction system, including an editing terminal and a server;
the editing terminal is configured to send a text content extraction request to the server;
the server is configured to send a text content extraction page to the editing terminal in response to the text content extraction request;
the editing terminal is further configured to obtain the book information that the user chooses on the text content extraction page and to send it to the server; the book information includes the book category, book title, and author;
the server is further configured to query a book database according to the book information, extract the target text content of the book using semantic analysis and preset content extraction rules, and transmit it to the editing terminal.
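The two round trips between the editing terminal and the server can be simulated in memory to show the order of messages. This is a sketch of the exchange only: the message types, the `Server` and `EditorTerminal` classes, and the pluggable `extractor` callable are hypothetical, and no real network transport or WeChat interface is involved.

```python
class Server:
    def __init__(self, extractor):
        # extractor: callable taking a book-info dict and returning extracted passages
        self.extractor = extractor

    def handle(self, message):
        if message["type"] == "extraction_request":
            # First round trip: answer with the extraction page (here, just its form fields).
            return {"type": "extraction_page", "fields": ["category", "title", "author"]}
        if message["type"] == "book_info":
            # Second round trip: run the extraction and return the target content.
            return {"type": "target_content", "passages": self.extractor(message["book_info"])}
        raise ValueError("unknown message type: " + str(message["type"]))


class EditorTerminal:
    def __init__(self, server):
        self.server = server

    def run(self, choose_book):
        """choose_book: callable mapping the page's field names to the editor's choices."""
        page = self.server.handle({"type": "extraction_request"})
        book_info = choose_book(page["fields"])
        reply = self.server.handle({"type": "book_info", "book_info": book_info})
        return reply["passages"]
```

Passing the extractor in as a callable keeps the exchange logic separate from the semantic analysis and rule matching, which mirrors the split between the request receiving unit and the extraction unit.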
In the text content extraction system provided by an embodiment of the present invention, a text content extraction request sent by an editing terminal is received and a text content extraction page is sent to the editing terminal; book information sent by the editing terminal via the text content extraction page is received, the book information including the book category, book title, and author; and, according to the book information, the book database is queried, the target text content of the book is extracted using semantic analysis and preset content extraction rules, and it is transmitted to the editing terminal. The editor in charge can browse the preliminarily extracted text content that the server sends to the editing terminal and decide whether to read the book closely. Through the interaction between an intelligent terminal and a server, the invention semi-automates target text content extraction: while keeping the extracted target text content accurate, it improves extraction efficiency and saves time and labor costs.
The above are preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art can make improvements and modifications without departing from the principles of the present invention, and these improvements and modifications are also regarded as falling within the scope of protection of the present invention.
A person of ordinary skill in the art will understand that all or part of the processes in the above embodiment methods can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above method embodiments. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
Claims (9)
1. a kind of method for extracting content of text, it is characterised in that comprise the following steps:
The content of text extraction request that editor terminal is sent is received, and sends content of text and extracts the page to the editor terminal;
Receive the book information that editor terminal extracts page transmission according to content of text;The book information include book categories,
Books title, and author;
According to the book information, inquire about book databases and utilize semantic analysis and default contents extraction Rule Extraction book
The target text content of nationality, and transmit to the editor terminal.
2. a kind of method for extracting content of text according to claim 1, it is characterised in that described to be believed according to the books
Breath, inquire about book databases and utilize the target text content of semantic analysis and default contents extraction the Rule Extraction books,
And transmit to the editor terminal, it is specially:
According to the book categories of books, books title, and author, book databases are inquired about to obtain in the book text
Hold;
Semantic analysis is carried out to the content of text data of books to be extracted, and according to corresponding in semantic analysis result matching rule base
Contents extraction rule;
If the match is successful, target text content is extracted from the content of text of the books using the contents extraction rule,
And the target text content of extraction is transmitted to the editor terminal;
If it fails to match, semantic analysis result is recorded, and establishes new contents extraction rule, and the newly-established content is carried
Policy Updates are taken to rule base.
A kind of 3. method for extracting content of text according to claim 2, it is characterised in that the text to books to be extracted
This content-data, which carries out semantic analysis, to be included:The content of text data for extracting books are segmented and part-of-speech tagging;To participle
Result carry out entity mark;Build the incidence relation between each word in data;The entity mark includes name mark, time
Mark and numeral mark.
4. a kind of method for extracting content of text according to claim 3, it is characterised in that the result of described pair of participle is carried out
Entity marks, and is specially:
Using the model of condition random field, according to the participle and part-of-speech tagging made through machine learning to the content of text of books,
Context, the part of speech of front and rear word and the length of word of the content of text of books are utilized simultaneously, further to books
Content of text carry out entity mark.
5. The text content extraction method according to any one of claims 1 to 4, characterized in that the content extraction rule is obtained through training, analysis, and extraction based on selected book text content samples, keywords, and the grammatical relations associated with the keywords; and the rule base is established according to the text content of books and semantic analysis.
6. A text content extraction device, characterized by comprising:
a text content extraction request receiving unit, configured to receive a text content extraction request sent by an editor terminal and to send a text content extraction page to the editor terminal;
a text content extraction unit, configured to receive the book information sent by the editor terminal according to the text content extraction page, and, according to the book information, to query a book database, extract the target text content of the book using semantic analysis and preset content extraction rules, and transmit it to the editor terminal; the book information includes book category, book title, and author.
7. A text content extraction device, characterized by comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the text content extraction method according to claims 1 to 4.
8. A storage medium, characterized in that the storage medium comprises a stored computer program, wherein, when the computer program runs, the device on which the storage medium resides is controlled to perform the text content extraction method according to any one of claims 1 to 4.
9. A text content extraction system, characterized by comprising an editor terminal and a server;
the editor terminal is configured to send a text content extraction request to the server;
the server is configured to send a text content extraction page to the editor terminal according to the text content extraction request;
the editor terminal is further configured to obtain the book information selected by a user according to the text content extraction page and send it to the server; the book information includes book category, book title, and author;
the server is further configured to, according to the book information, query a book database, extract the target text content of the book using semantic analysis and preset content extraction rules, and transmit it to the editor terminal.
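The exchange between editor terminal and server in claim 9 can be sketched with two in-memory classes. This is a simplified illustration, not the claimed implementation: the class and method names, the dictionary used as a book database, and the `extractor` callable (standing in for semantic analysis plus rule-based extraction) are all assumptions, and no network transport is modelled.

```python
from typing import Callable, Optional

BookKey = tuple[str, str, str]  # (category, title, author)


class Server:
    def __init__(self, book_db: dict[BookKey, str],
                 extractor: Callable[[str], list[str]]):
        self.book_db = book_db      # hypothetical in-memory book database
        self.extractor = extractor  # stand-in for analysis + rule extraction

    def handle_extraction_request(self) -> dict:
        # Respond to the request with a description of the extraction page.
        return {"page": "text-content-extraction",
                "fields": ["category", "title", "author"]}

    def handle_book_info(self, category: str, title: str,
                         author: str) -> Optional[list[str]]:
        # Query the book database, then extract the target text content.
        text = self.book_db.get((category, title, author))
        if text is None:
            return None
        return self.extractor(text)


class EditorTerminal:
    def __init__(self, server: Server):
        self.server = server

    def run(self, user_choice: dict[str, str]) -> Optional[list[str]]:
        # 1. Send the extraction request and receive the page.
        page = self.server.handle_extraction_request()
        # 2. Collect the book information the user selected on the page.
        book_info = {name: user_choice[name] for name in page["fields"]}
        # 3. Send it to the server and receive the extracted content.
        return self.server.handle_book_info(**book_info)
```

A terminal built this way returns the extracted content for a known book and `None` when the book is not in the database.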
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710896296.3A CN107807917A (en) | 2017-09-27 | 2017-09-27 | Method for extracting content of text, device, system and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710896296.3A CN107807917A (en) | 2017-09-27 | 2017-09-27 | Method for extracting content of text, device, system and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107807917A (en) | 2018-03-16 |
Family
ID=61584547
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710896296.3A Pending CN107807917A (en) | 2017-09-27 | 2017-09-27 | Method for extracting content of text, device, system and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107807917A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109166608A (en) * | 2018-09-17 | 2019-01-08 | 新华三大数据技术有限公司 | Electronic health record information extracting method, device and equipment |
CN109259733A (en) * | 2018-10-25 | 2019-01-25 | 深圳和而泰智能控制股份有限公司 | Apnea detection method, apparatus and detection device in a kind of sleep |
CN112257388A (en) * | 2020-10-19 | 2021-01-22 | 深圳市大成天下信息技术有限公司 | Content display method, mobile terminal and system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101510221A (en) * | 2009-02-17 | 2009-08-19 | 北京大学 | Enquiry statement analytical method and system for information retrieval |
CN102456037A (en) * | 2010-10-28 | 2012-05-16 | 康佳集团股份有限公司 | Method and device for reading e-books in mobile terminal |
CN104361028A (en) * | 2014-10-23 | 2015-02-18 | 明博教育科技有限公司 | Method and system for extracting book knowledge points according to book catalogue |
CN104572849A (en) * | 2014-12-17 | 2015-04-29 | 西安美林数据技术股份有限公司 | Automatic standardized filing method based on text semantic mining |
CN105302796A (en) * | 2015-11-23 | 2016-02-03 | 浪潮软件股份有限公司 | Semantic analysis method based on dependency tree |
CN105630958A (en) * | 2015-12-24 | 2016-06-01 | 小米科技有限责任公司 | Book managing method and device |
US20160371243A1 (en) * | 2012-11-16 | 2016-12-22 | International Business Machines Corporation | Building and maintaining information extraction rules |
CN106815293A (en) * | 2016-12-08 | 2017-06-09 | 中国电子科技集团公司第三十二研究所 | System and method for constructing knowledge graph for information analysis |
-
2017
- 2017-09-27 CN CN201710896296.3A patent/CN107807917A/en active Pending
Non-Patent Citations (1)
Title |
---|
陈劲: "面向中文网页的信息抽取关键技术研究与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111177569B (en) | Recommendation processing method, device and equipment based on artificial intelligence | |
CN104408093B (en) | A kind of media event key element abstracting method and device | |
US11714839B2 (en) | Apparatus and method for automated and assisted patent claim mapping and expense planning | |
CN110633409B (en) | Automobile news event extraction method integrating rules and deep learning | |
Zhu et al. | Multimodal joint attribute prediction and value extraction for e-commerce product | |
CA3129745A1 (en) | Neural network system for text classification | |
CN108573047A (en) | A kind of training method and device of Module of Automatic Chinese Documents Classification | |
CN107729309A (en) | A kind of method and device of the Chinese semantic analysis based on deep learning | |
CN107766371A (en) | A kind of text message sorting technique and its device | |
Vakulenko et al. | Enriching iTunes App Store Categories via Topic Modeling. | |
CN103678269A (en) | Information processing method and device | |
WO2021184674A1 (en) | Text keyword extraction method, electronic device, and computer readable storage medium | |
CN107807917A (en) | Method for extracting content of text, device, system and storage medium | |
CN109299233A (en) | Text data processing method, device, computer equipment and storage medium | |
CN109582792A (en) | A kind of method and device of text classification | |
CN111782793A (en) | Intelligent customer service processing method, system and equipment | |
CN106980667A (en) | A kind of method and apparatus that label is marked to article | |
CN110880142B (en) | Risk entity acquisition method and device | |
CN111462752A (en) | Client intention identification method based on attention mechanism, feature embedding and Bi-LSTM | |
CN116070632A (en) | Informal text entity tag identification method and device | |
CN106485525A (en) | Information processing method and device | |
CN106372956A (en) | Method and system for intention entity recognition based on user query log | |
CN110968661A (en) | Event extraction method and system, computer readable storage medium and electronic device | |
CN113887202A (en) | Text error correction method and device, computer equipment and storage medium | |
CN110287341A (en) | Data processing method, device and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180316 |