CN106682209A - Cross-language scientific and technical literature retrieval method and cross-language scientific and technical literature retrieval system - Google Patents
Cross-language scientific and technical literature retrieval method and cross-language scientific and technical literature retrieval system
- Publication number
- CN106682209A
- Authority
- CN
- China
- Prior art keywords
- scientific
- technical literature
- technology
- literature
- language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a cross-language scientific and technical literature retrieval method and a cross-language scientific and technical literature retrieval system. The cross-language scientific and technical literature retrieval method includes establishing a scientific and technical literature ontology library for storing homogeneous keywords, scientific and technical literature information ontology links corresponding to the homogeneous keywords as well as original scientific and technical literature indexes pointing to the scientific and technical literature information ontology links, wherein the homogeneous keywords refer to a set of synonymous or near-synonymous Chinese keywords and English keywords; reading an index word inputted by a user, and searching the scientific and technical literature information ontology link corresponding to the homogeneous keywords matching with the index word from the scientific and technical literature ontology library; searching relevant literature according to the scientific and technical literature information ontology link corresponding to the homogeneous keywords and the original scientific and technical literature index pointing to the scientific and technical literature information ontology link, and displaying the literature to the user according to a preset sequence. The cross-language scientific and technical literature retrieval method is capable of improving scientific and technical literature retrieval precision.
Description
Technical field
The present invention relates to the technical field of computer retrieval, and in particular to a cross-language scientific and technical literature retrieval method and system.
Background
With the development of information technology, people increasingly obtain knowledge by retrieving electronic documents. The knowledge a user needs, however, may exist in documents written in different languages, while users prefer to search in their native language. This creates a demand for cross-language knowledge retrieval and extraction.
Cross-language retrieval means that a user uses search terms in one natural language (the source language) to retrieve documents expressed in another natural language (the target language). In existing knowledge bases, Chinese-English cross-language retrieval first translates the keyword into English and then searches the database using the English translation. Because a single word often has several possible translations between Chinese and English, retrieval accuracy in cross-language retrieval is substantially reduced.
Summary of the invention
In view of this, the object of the present invention is to overcome the deficiencies of the prior art and to provide a cross-language scientific and technical literature retrieval method and system that improve the accuracy of scientific and technical literature retrieval.
To achieve the above object, the present invention adopts the following technical solution:
A cross-language scientific and technical literature retrieval method, comprising:
Step S1: establishing a scientific and technical literature ontology library, wherein the ontology library stores homogeneous keywords, the scientific and technical literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those ontology links point to the source scientific and technical literature; the homogeneous keywords are a set of synonymous or near-synonymous Chinese keywords and English keywords; the source scientific and technical literature is the set of scientific and technical literature from which each Chinese keyword and English keyword in the homogeneous keywords is derived;
Step S2: reading a search term entered by a user, and looking up in the ontology library the scientific and technical literature information ontology link corresponding to the homogeneous keywords that match the search term;
Step S3: finding the relevant literature through the ontology link corresponding to the homogeneous keywords and the index by which that link points to the source scientific and technical literature, and displaying the literature to the user in a predetermined order.
Preferably, step S1 specifically comprises performing the following steps for each item of scientific and technical literature indexed in the database:
Step S11: extracting the Chinese keywords and English keywords from the scientific and technical literature;
Step S12: merging identical Chinese keywords or English keywords, and grouping synonymous or near-synonymous Chinese keywords and English keywords into one class;
Step S13: for each class of keywords, establishing a scientific and technical literature information ontology link and, at the same time, establishing the index by which that link points to the source scientific and technical literature;
Step S14: collecting the ontology links and the indexes of step S13 to form the scientific and technical literature ontology library;
wherein the scientific and technical literature information comprises: the title, author, abstract, keywords and publication time of the literature, and its background part, problem part and solution part.
Preferably, step S11 specifically comprises: performing semantic analysis on the scientific and technical literature to extract the Chinese keywords and English keywords.
Preferably, the predetermined order is: arranged as a list, from high to low, by the degree of matching between the search term and the homogeneous keywords in the ontology library.
Preferably, Jena and the SPARQL language are used to parse and query the ontologies in the ontology library.
A cross-language scientific and technical literature retrieval system, comprising:
an ontology library module, configured to establish a scientific and technical literature ontology library, wherein the ontology library stores homogeneous keywords, the scientific and technical literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those ontology links point to the source scientific and technical literature; the homogeneous keywords are a set of synonymous or near-synonymous Chinese keywords and English keywords; the source scientific and technical literature is the set of scientific and technical literature from which each Chinese keyword and English keyword in the homogeneous keywords is derived;
a retrieval module, configured to read a search term entered by a user and to look up in the ontology library the scientific and technical literature information ontology link corresponding to the homogeneous keywords that match the search term;
a display module, configured to find the relevant literature through the ontology link corresponding to the homogeneous keywords and the index by which that link points to the source scientific and technical literature, and to display the literature to the user in a predetermined order.
By adopting the above technical solution, the present invention has at least the following beneficial effects:
In the cross-language scientific and technical literature retrieval method and system provided by the present invention, a scientific and technical literature ontology library is established that stores homogeneous keywords, the literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those links point to the source literature, the homogeneous keywords being a set of synonymous or near-synonymous Chinese keywords and English keywords. After the user enters a search term, it is only necessary to look up in the ontology library the ontology link corresponding to the homogeneous keywords that match the search term, find the relevant literature through that link and its index to the source literature, and display the literature to the user in the predetermined order. Compared with the prior art, this eliminates the translation from the source language to the target language during retrieval and can therefore improve the accuracy of scientific and technical literature retrieval.
Description of the drawings
Fig. 1 is a schematic flowchart of a cross-language scientific and technical literature retrieval method provided by an embodiment of the present invention;
Fig. 2 is a schematic block diagram of a cross-language scientific and technical literature retrieval system provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.
Referring to Fig. 1, a cross-language scientific and technical literature retrieval method provided by an embodiment of the present invention comprises:
Step S1: establishing a scientific and technical literature ontology library, wherein the ontology library stores homogeneous keywords, the scientific and technical literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those ontology links point to the source scientific and technical literature; the homogeneous keywords are a set of synonymous or near-synonymous Chinese keywords and English keywords; the source scientific and technical literature is the set of scientific and technical literature from which each Chinese keyword and English keyword in the homogeneous keywords is derived;
Step S2: reading a search term entered by a user, and looking up in the ontology library the scientific and technical literature information ontology link corresponding to the homogeneous keywords that match the search term;
Step S3: finding the relevant literature through the ontology link corresponding to the homogeneous keywords and the index by which that link points to the source scientific and technical literature, and displaying the literature to the user in a predetermined order.
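The lookup path of steps S2 and S3 can be illustrated with a minimal Python sketch. It assumes the ontology library of step S1 has already been built and stands in for it with three plain dictionaries; the names `KEYWORD_TO_CLASS`, `CLASS_TO_LINKS`, `LINK_TO_SOURCE` and `retrieve`, and all sample data, are hypothetical and do not come from the patent.

```python
from __future__ import annotations

# Hypothetical in-memory stand-in for the ontology library of step S1.
# keyword -> class id (one class groups synonymous Chinese and English keywords)
KEYWORD_TO_CLASS = {
    "本体": "c1",
    "ontology": "c1",
    "信息检索": "c2",
    "information retrieval": "c2",
}
# class id -> literature information ontology links for that class
CLASS_TO_LINKS = {"c1": ["link-1", "link-2"], "c2": ["link-3"]}
# ontology link -> index of the source document it points to
LINK_TO_SOURCE = {"link-1": "paper-A", "link-2": "paper-B", "link-3": "paper-C"}


def retrieve(term: str) -> list[str]:
    """Steps S2 and S3: term -> keyword class -> ontology links -> source documents."""
    class_id = KEYWORD_TO_CLASS.get(term.strip().lower())
    if class_id is None:
        return []  # no homogeneous keyword class matches the search term
    return [LINK_TO_SOURCE[link] for link in CLASS_TO_LINKS[class_id]]
```

In this sketch a Chinese and an English search term resolve to the same class, so no translation step is involved in the lookup, which is the property the method relies on.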
It should be noted that the scientific and technical literature includes scientific papers, scientific journals, conference proceedings and the like.
For ease of understanding, the cross-language scientific and technical literature retrieval method provided by the present invention is described in detail as follows:
First, according to the semantic pattern of scientific papers, a scientific paper is divided into three parts: background analysis, problem statement and solution. These three concepts are treated as subclasses of the scientific paper class. Likewise, the publication forms of a paper, namely journal and conference, are modeled as a parent-child relationship. In addition, every paper has several keywords, and each keyword is treated as an instance of a subject field.
Second, the keyword instances of each field, the paper keywords, the papers, the journals, the authors and other information are associated with one another to establish the individual scientific and technical literature information ontology links, and these links are then associated with one another to form the scientific and technical literature ontology library. Because an ontology supports attribute inference, the system continually performs computation and inference using Jena and thereby keeps establishing new link relations, so that the ontology library is continually improved. For example, suppose "software repeated usage" is the keyword of paper A. Although paper A has only this one keyword, the three synonyms "software reuse", "software repeated usage" and "software architecture" can be grouped as homogeneous keywords. After ontology inference, paper A possesses not only the keyword "software repeated usage" but also its two synonyms "software reuse" and "software architecture". Thus, when users apply the cross-language scientific and technical literature retrieval method provided by the present invention, paper A can be retrieved whether the search keyword they enter is "software architecture", "software repeated usage" or "software reuse", and cross-language management of scientific papers is thereby achieved.
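The effect of the inference in the paper A example can be approximated with a union-find structure over keywords. This is only a sketch of the keyword-class closure, not of Jena's reasoner; the class `KeywordClasses` and the helpers `expanded_keywords` and `search` are hypothetical names, and the three keyword strings are the ones used in the example above.

```python
from __future__ import annotations


class KeywordClasses:
    """Union-find over keywords; each disjoint set is one class of synonyms."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}

    def find(self, kw: str) -> str:
        self._parent.setdefault(kw, kw)
        while self._parent[kw] != kw:
            # Path halving keeps later lookups short.
            self._parent[kw] = self._parent[self._parent[kw]]
            kw = self._parent[kw]
        return kw

    def merge(self, a: str, b: str) -> None:
        """Declare two keywords synonymous (or near-synonymous)."""
        self._parent[self.find(a)] = self.find(b)

    def members(self, kw: str) -> set[str]:
        root = self.find(kw)
        return {k for k in list(self._parent) if self.find(k) == root}


classes = KeywordClasses()
classes.merge("software repeated usage", "software reuse")
classes.merge("software reuse", "software architecture")

# Paper A carries only one keyword of its own.
paper_keywords = {"paper A": {"software repeated usage"}}


def expanded_keywords(paper: str) -> set[str]:
    """Keywords a paper holds after closing over its keyword classes."""
    out: set[str] = set()
    for kw in paper_keywords[paper]:
        out |= classes.members(kw)
    return out


def search(term: str) -> list[str]:
    return sorted(p for p in paper_keywords if term in expanded_keywords(p))
```

Under these assumptions any of the three keywords retrieves paper A, while an unrelated term retrieves nothing.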
As can be seen from the above technical solution, in the cross-language scientific and technical literature retrieval method provided by the present invention, a scientific and technical literature ontology library is established that stores homogeneous keywords, the literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those links point to the source literature, the homogeneous keywords being a set of synonymous or near-synonymous Chinese keywords and English keywords. After the user enters a search term, it is only necessary to look up in the ontology library the ontology link corresponding to the homogeneous keywords that match the search term, find the relevant literature through that link and its index to the source literature, and display the literature to the user in the predetermined order. Compared with the prior art, this eliminates the translation from the source language to the target language during retrieval and can therefore improve the accuracy of scientific and technical literature retrieval.
Preferably, step S1 specifically comprises performing the following steps for each item of scientific and technical literature indexed in the database:
Step S11: extracting the Chinese keywords and English keywords from the scientific and technical literature;
Step S12: merging identical Chinese keywords or English keywords, and grouping synonymous or near-synonymous Chinese keywords and English keywords into one class;
Step S13: for each class of keywords, establishing a scientific and technical literature information ontology link and, at the same time, establishing the index by which that link points to the source scientific and technical literature;
Step S14: collecting the ontology links and the indexes of step S13 to form the scientific and technical literature ontology library;
wherein the scientific and technical literature information comprises: the title, author, abstract, keywords and publication time of the literature, and its background part, problem part and solution part.
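Steps S12 to S14 can be sketched in Python as follows, under two assumptions that the patent does not specify: the keywords of step S11 have already been extracted, and the synonym groups are supplied as input. `Document`, `OntologyLibrary`, `build_library` and the link naming scheme are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    title: str
    keywords: list[str]  # output of step S11, assumed already extracted


@dataclass
class OntologyLibrary:
    keyword_to_class: dict[str, int] = field(default_factory=dict)
    class_to_links: dict[int, list[str]] = field(
        default_factory=lambda: defaultdict(list)
    )
    link_to_source: dict[str, str] = field(default_factory=dict)


def build_library(docs: list[Document], synonym_groups: list[list[str]]) -> OntologyLibrary:
    lib = OntologyLibrary()
    # S12: synonymous Chinese and English keywords share one class id.
    for class_id, group in enumerate(synonym_groups):
        for kw in group:
            lib.keyword_to_class[kw] = class_id
    next_class = len(synonym_groups)
    for doc in docs:
        for kw in dict.fromkeys(doc.keywords):  # merge identical keywords
            if kw not in lib.keyword_to_class:  # keyword without known synonyms
                lib.keyword_to_class[kw] = next_class
                next_class += 1
            class_id = lib.keyword_to_class[kw]
            # S13: one link per (class, document), plus its index to the source.
            link = f"link:{class_id}:{doc.doc_id}"
            if link not in lib.link_to_source:
                lib.class_to_links[class_id].append(link)
                lib.link_to_source[link] = doc.doc_id
    # S14: the collected links and indexes together form the library.
    return lib
```

A real link would also carry the literature information listed above (title, author, abstract and so on); the sketch keeps only the document id to stay short.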
Preferably, step S11 specifically comprises: performing semantic analysis on the scientific and technical literature to extract the Chinese keywords and English keywords.
Preferably, the predetermined order is: arranged as a list, from high to low, by the degree of matching between the search term and the homogeneous keywords in the ontology library.
For example, if keyword X appears in the abstract of paper A, in the background part of paper B, in the problem part of paper C and in the solution part of paper D, the papers are displayed to the user in the order paper D, paper C, paper B, paper A.
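The ordering in this example can be reproduced with a small Python sketch. The text only fixes the relative order (solution, then problem, then background, then abstract); the numeric values in `SECTION_RANK` and the function name `rank` are assumptions made for illustration.

```python
from __future__ import annotations

# Assumed weights: only their relative order is taken from the example above.
SECTION_RANK = {"solution": 3, "problem": 2, "background": 1, "abstract": 0}


def rank(hits: dict[str, str]) -> list[str]:
    """Order papers by the section in which the keyword appears, best match first."""
    return sorted(hits, key=lambda paper: SECTION_RANK[hits[paper]], reverse=True)


# paper -> section in which keyword X occurs
hits = {
    "paper A": "abstract",
    "paper B": "background",
    "paper C": "problem",
    "paper D": "solution",
}
```

Calling `rank(hits)` on this data yields paper D, paper C, paper B, paper A, matching the display order described in the example.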
Preferably, Jena and the SPARQL language are used to parse and query the ontologies in the scientific and technical literature ontology library.
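The patent does not give the ontology vocabulary or any query text. As a rough illustration of what a SPARQL query against such a library might look like, the Python sketch below only assembles the query string; the namespace `http://example.org/lit#` and the properties `lit:label`, `lit:inClass`, `lit:forClass`, `lit:pointsTo` and `lit:title` are all hypothetical, and executing the query (for example through Jena) is outside the sketch.

```python
from __future__ import annotations

PREFIX = "http://example.org/lit#"  # hypothetical namespace, not from the patent


def build_query(term: str) -> str:
    """Build a SPARQL SELECT that walks keyword -> class -> link -> source document."""
    # Escape backslashes and double quotes so the term is a valid string literal.
    literal = term.replace("\\", "\\\\").replace('"', '\\"')
    return f"""PREFIX lit: <{PREFIX}>
SELECT DISTINCT ?doc ?title WHERE {{
  ?kw    lit:label     "{literal}" ;
         lit:inClass   ?cls .
  ?link  lit:forClass  ?cls ;
         lit:pointsTo  ?doc .
  ?doc   lit:title     ?title .
}}"""
```

Because the match goes through the keyword class (`?cls`) rather than through the keyword itself, the same query shape serves Chinese and English search terms.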
Referring to Fig. 2, a cross-language scientific and technical literature retrieval system 100 comprises:
an ontology library module 101, configured to establish a scientific and technical literature ontology library, wherein the ontology library stores homogeneous keywords, the scientific and technical literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those ontology links point to the source scientific and technical literature; the homogeneous keywords are a set of synonymous or near-synonymous Chinese keywords and English keywords; the source scientific and technical literature is the set of scientific and technical literature from which each Chinese keyword and English keyword in the homogeneous keywords is derived;
a retrieval module 102, configured to read a search term entered by a user and to look up in the ontology library the scientific and technical literature information ontology link corresponding to the homogeneous keywords that match the search term;
a display module 103, configured to find the relevant literature through the ontology link corresponding to the homogeneous keywords and the index by which that link points to the source scientific and technical literature, and to display the literature to the user in a predetermined order.
The specific embodiments described above explain the object, technical solution and beneficial effects of the present invention in further detail. It should be understood that the foregoing is only a specific embodiment of the present invention and is not intended to limit its scope of protection; any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection. The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The term "a plurality of" means two or more, unless expressly limited otherwise.
Claims (6)
1. A cross-language scientific and technical literature retrieval method, characterized by comprising:
Step S1: establishing a scientific and technical literature ontology library, wherein the ontology library stores homogeneous keywords, the scientific and technical literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those ontology links point to the source scientific and technical literature; the homogeneous keywords are a set of synonymous or near-synonymous Chinese keywords and English keywords; the source scientific and technical literature is the set of scientific and technical literature from which each Chinese keyword and English keyword in the homogeneous keywords is derived;
Step S2: reading a search term entered by a user, and looking up in the ontology library the scientific and technical literature information ontology link corresponding to the homogeneous keywords that match the search term;
Step S3: finding the relevant literature through the ontology link corresponding to the homogeneous keywords and the index by which that link points to the source scientific and technical literature, and displaying the literature to the user in a predetermined order.
2. The cross-language scientific and technical literature retrieval method according to claim 1, characterized in that step S1 specifically comprises performing the following steps for each item of scientific and technical literature indexed in the database:
Step S11: extracting the Chinese keywords and English keywords from the scientific and technical literature;
Step S12: merging identical Chinese keywords or English keywords, and grouping synonymous or near-synonymous Chinese keywords and English keywords into one class;
Step S13: for each class of keywords, establishing a scientific and technical literature information ontology link and, at the same time, establishing the index by which that link points to the source scientific and technical literature;
Step S14: collecting the ontology links and the indexes of step S13 to form the scientific and technical literature ontology library;
wherein the scientific and technical literature information comprises: the title, author, abstract, keywords and publication time of the literature, and its background part, problem part and solution part.
3. The cross-language scientific and technical literature retrieval method according to claim 2, characterized in that step S11 specifically comprises: performing semantic analysis on the scientific and technical literature to extract the Chinese keywords and English keywords.
4. The cross-language scientific and technical literature retrieval method according to claim 1, characterized in that the predetermined order is: arranged as a list, from high to low, by the degree of matching between the search term and the homogeneous keywords in the ontology library.
5. The cross-language scientific and technical literature retrieval method according to any one of claims 1 to 4, characterized in that Jena and the SPARQL language are used to parse and query the ontologies in the scientific and technical literature ontology library.
6. A cross-language scientific and technical literature retrieval system, characterized by comprising:
an ontology library module, configured to establish a scientific and technical literature ontology library, wherein the ontology library stores homogeneous keywords, the scientific and technical literature information ontology links corresponding to the homogeneous keywords, and the indexes by which those ontology links point to the source scientific and technical literature; the homogeneous keywords are a set of synonymous or near-synonymous Chinese keywords and English keywords; the source scientific and technical literature is the set of scientific and technical literature from which each Chinese keyword and English keyword in the homogeneous keywords is derived;
a retrieval module, configured to read a search term entered by a user and to look up in the ontology library the scientific and technical literature information ontology link corresponding to the homogeneous keywords that match the search term;
a display module, configured to find the relevant literature through the ontology link corresponding to the homogeneous keywords and the index by which that link points to the source scientific and technical literature, and to display the literature to the user in a predetermined order.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611261604.7A CN106682209A (en) | 2016-12-30 | 2016-12-30 | Cross-language scientific and technical literature retrieval method and cross-language scientific and technical literature retrieval system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611261604.7A CN106682209A (en) | 2016-12-30 | 2016-12-30 | Cross-language scientific and technical literature retrieval method and cross-language scientific and technical literature retrieval system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106682209A (en) | 2017-05-17 |
Family ID: 58848751
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN201611261604.7A (published as CN106682209A) | Pending | 2016-12-30 | 2016-12-30 | Cross-language scientific and technical literature retrieval method and cross-language scientific and technical literature retrieval system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682209A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103336852A (en) * | 2013-07-24 | 2013-10-02 | 清华大学 | Cross-language ontology construction method and device |
US20160179945A1 (en) * | 2014-12-19 | 2016-06-23 | Universidad Nacional De Educación A Distancia (Uned) | System and method for the indexing and retrieval of semantically annotated data using an ontology-based information retrieval model |
Non-Patent Citations (3)
Title |
---|
刘伟成等: "多语言本体构建及其在跨语言信息检索中的应用", 《科技知识对象的语义模式研究,徐昊,中国优秀博士学位论文全文数据库信息科技辑,2013年第8期》 * |
吴丹等: "本体在跨语言信息检索中的应用机制研究", 《图书情报工作》 * |
徐昊: "科技知识对象的语义模式研究", 《中国优秀博士学位论文全文数据库信息科技辑》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | | |
SE01 | Entry into force of request for substantive examination | | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170517 | |