CN111078976A - Medical system crawler-based data extraction method - Google Patents
- Publication number
- CN111078976A (application CN201911104769.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- medical
- character recognition
- crawled
- baidu
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9532—Query formulation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention relates to a medical system crawler-based data extraction method and belongs to the technical field of medical image character recognition. First, a URL in the medical system is initialized; the URL queue is parsed, with HTML data parsed by regular expressions and JSON data parsed by a json module; each medical data URL is then requested over HTTP, and the target medical data are matched and crawled by the patient visit ID and the medical order ID; the crawled data are stored in a medical database; the crawled patient data are checked, and PDF documents undergo character recognition with the Baidu character recognition API; finally, word segmentation, text denoising and key information extraction are performed on the PDF document corpus produced by the Baidu character recognition API, and the key information is stored in the medical database. The invention addresses the difficulty, time cost and tedium of extracting medical data.
Description
Technical Field
The invention relates to a medical system crawler-based data extraction method and belongs to the technical field of medical image character recognition.
Background
With the development of the medical and health industry in China, domestic hospitals have successively deployed systems such as the hospital information system (HIS), PACS (picture archiving and communication system) and LIS (laboratory information system). As these information systems have come into use, a long-neglected problem has gradually surfaced: data extraction. Today data extraction has become a bottleneck and weak point that limits the performance of these information systems, and its importance has become a focus of attention.
Data mining is the non-trivial process of extracting implicit, potentially valuable and ultimately understandable patterns from a database, and is a key step in knowledge discovery. Medical databases are rich in information and may contain patients' medical images, related pathological parameters, test and measurement results, diagnosis records and basic patient parameters (age, sex, medical history, time of admission, etc.). Medical data are generally stored in a medical system that provides no corresponding interface for extraction, so organizing the data is complex and tedious, has to be done by hand, and consumes a great deal of manpower and material resources. With the development of the Internet, however, any user can obtain the knowledge they need from the vast amount of information on the network by some means. Different users need different knowledge, which greatly increases the difficulty of acquiring target information; this gave rise to the Web crawler, a specialized tool that can query many Web pages efficiently. A Web crawler starts from a single Web page, reaches other pages mainly by following hyperlinks, and repeats this operation until all pages have been retrieved and scanned and the required information has been acquired. A crawler program acquires Web pages automatically, and its implementation strategy and operating efficiency have a marked influence on the search results: a well-designed, efficient crawler delivers timely and accurate search information.
The earliest crawler was the Google crawler, in which each crawler component could carry out a different process; search engines such as Baidu and Sohu also began to research crawler programs, but the crawler technology of these engines is kept secret. A crawler can be written by combining algorithms executed by the computer with manual assistance specific to the website, and can then obtain more complete relevant information, which is urgently needed for building a medical information base. Medical systems are updated quickly; building a medical system interface may take a long time and the interface will not necessarily suit every medical department, yet organizing and collecting medical data by hand is tedious and labor-intensive.
Disclosure of Invention
The invention provides a medical system crawler-based data extraction method to solve the problems that medical data are difficult to extract and that extraction is time-consuming and tedious.
The technical scheme of the invention is as follows: in a medical system crawler-based data extraction method, a URL in the medical system is first initialized; the URL queue is parsed, with HTML data parsed by regular expressions and JSON data parsed by a json module; each medical data URL is then requested over HTTP, and the target medical data are matched and crawled by the patient visit ID and the medical order ID; the crawled data are stored in a medical database; the crawled patient data are checked to determine whether they are a PDF document; if so, character recognition is performed with the Baidu character recognition API, which converts the image data into text data; if not, the crawled data are stored in the medical database; finally, word segmentation, text denoising and key information extraction are performed on the PDF document corpus produced by the Baidu character recognition API, and the key information is stored in the medical database.
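The parsing stage of this scheme can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `parse_response`, the link pattern `HREF_RE` and the sample payloads are hypothetical, and choosing the parser from the content type is an assumption about how HTML and JSON responses might be told apart.

```python
import json
import re

# Hypothetical pattern: collects href targets from anchor tags in a hospital page.
HREF_RE = re.compile(r'<a\s[^>]*href="([^"]+)"', re.IGNORECASE)


def parse_response(body: str, content_type: str) -> dict:
    """Parse one response body according to its declared content type."""
    if "json" in content_type.lower():
        # JSON payloads go through the standard json module.
        return {"kind": "json", "data": json.loads(body)}
    # HTML payloads are scanned with a regular expression for follow-up URLs.
    return {"kind": "html", "links": HREF_RE.findall(body)}


# Made-up sample payloads, for illustration only.
html_page = '<ul><li><a class="rec" href="/record?visit_id=1001">Record</a></li></ul>'
json_page = '{"visit_id": "1001", "order_id": "A7"}'

print(parse_response(html_page, "text/html"))
print(parse_response(json_page, "application/json"))
```

A regular expression is adequate for pages with a fixed, known layout, which is the case the scheme describes; a full HTML parser would be more robust against markup changes.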
Further, the medical system crawler-based data extraction method comprises the following specific steps:
Step 1: Initializing the URL: using an HTTP (Hypertext Transfer Protocol) library, a request is sent to the target medical data website, that is, the hospital web pages in the medical system from which medical data are to be crawled; if the server responds, a Response for the hospital web page is obtained, which contains the page's HTML (HyperText Markup Language) data and its JSON (lightweight data-interchange format) data;
Step 2: Parsing the URL queue: the HTML data are parsed with regular expressions, and the JSON data are then parsed with a json module;
Step 3: Crawling patient data: each medical data URL is requested over HTTP, and the target medical data are matched and crawled by the patient visit ID and the medical order ID; the crawled data are stored in a medical database;
Step 4: PDF document character recognition: it is determined whether the patient data crawled in Step 3 are a PDF document; if so, character recognition is performed with the Baidu character recognition API, which converts the image data into text data; if not, the crawled data are stored in the medical database; the Baidu character recognition API is a platform that recognizes text in a variety of general scenes and documents and returns the results line by line;
Step 5: Word segmentation: the PDF document corpus processed by the Baidu character recognition API is segmented with the jieba word segmentation algorithm;
Step 6: Text denoising: after segmentation, the PDF document corpus processed by the Baidu character recognition API contains many symbols, punctuation marks and stop words, which degrade the quality of the medical data and hinder keyword extraction from medical reports, so this irrelevant text is removed; a Chinese stop word list, stopwords.txt, is then built, every word in the text is traversed, and any word that appears in the stop word list is deleted;
Step 7: Key information extraction: the denoised PDF document corpus does not by itself yield the key information corresponding to the keywords, so it is processed with regular expressions, the key information under each input keyword is extracted, and the key information is stored in the medical database.
Further, the method also comprises Step 8: Processing newly added data: the data updated in the medical system each day are processed according to Steps 1 to 7; the information extracted from the newly added data is looked up in the medical database by name, age, address and identity card number to see whether a patient with the same name, age, address and identity card number already exists; if so, the patient is judged to be readmitted and is stored in the readmission medical database; otherwise, the patient is treated as a new patient and is stored in the patient information base.
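The readmission check in Step 8 could be sketched in Python as follows. This is only an illustration under stated assumptions: the `Admission` record, the `classify` function and the sample patients are hypothetical, and the 30-day window is borrowed from the beneficial effects listed in this document, since Step 8 itself does not say how the window is applied.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Admission:
    name: str
    age: int
    address: str
    id_number: str
    admitted_on: date


def identity(rec: Admission) -> tuple:
    # The four attributes the method compares; the tuple form is an assumption.
    return (rec.name, rec.age, rec.address, rec.id_number)


def classify(new_rec: Admission, history: list, window_days: int = 30) -> str:
    """Return "readmitted" if a matching earlier admission falls inside the window."""
    window = timedelta(days=window_days)
    for old in history:
        if identity(old) != identity(new_rec):
            continue
        gap = new_rec.admitted_on - old.admitted_on
        if timedelta(0) <= gap <= window:
            return "readmitted"
    return "new"


# Made-up records, for illustration only.
history = [Admission("Zhang San", 58, "City A", "ID-0001", date(2019, 10, 20))]
again = Admission("Zhang San", 58, "City A", "ID-0001", date(2019, 11, 8))
first = Admission("Li Si", 41, "City B", "ID-0002", date(2019, 11, 8))

print(classify(again, history))  # earlier admission 19 days before
print(classify(first, history))  # no matching identity in history
```

The result of `classify` would decide which of the two stores a record is written to; `history` is assumed not to contain the new record itself.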
The invention has the beneficial effects that:
1. The proposed method organizes medical data, solves the difficulty of extracting information from medical documents, monitors newly added data to judge whether a patient is a readmission within 30 days or a new patient, and provides technical support for subsequent mining and analysis of medical data;
2. Medical data processing and storage are automated, saving a large amount of manpower and material resources, and unformatted medical data are converted into formatted data;
3. To some extent, a more complete database that supports the development of the medical and health industry can be obtained;
4. In place of manual extraction, the invention extracts all target medical data; performs character recognition with the Baidu character recognition API on PDF documents such as medical order documents and CT diagnosis reports; extracts key information after denoising; adds a daily lookup-and-judgment step for newly added data; and adds a database of hospital readmissions. The result is a complete medical database in which the target medical data are fully and efficiently extracted and organized, saving a large amount of manpower and material resources.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a flow chart of a network architecture for crawling targeted medical data in the present invention;
FIG. 3 is a diagram of key information matching with regular expressions in the present invention.
Detailed Description
Example 1: as shown in FIGS. 1 to 3, a medical system crawler-based data extraction method comprises the following specific steps:
Step 1: Initializing the URL: using an HTTP (Hypertext Transfer Protocol) library, a request is sent to the target medical data website, that is, the hospital web pages in the medical system from which medical data are to be crawled; if the server responds, a Response for the hospital web page is obtained, which contains the page's HTML (HyperText Markup Language) data and its JSON (lightweight data-interchange format) data;
Step 2: Parsing the URL queue: the HTML data are parsed with regular expressions, and the JSON data are then parsed with a json module;
Step 3: Crawling patient data: each medical data URL is requested over HTTP, and the target medical data are matched and crawled by the patient visit ID and the medical order ID; the crawled data are stored in a medical database; FIG. 2 is a flow chart of the network architecture for crawling target medical data according to the present invention;
Step 4: PDF document character recognition: it is determined whether the patient data crawled in Step 3 are a PDF document; if so, character recognition is performed with the Baidu character recognition API, which converts the image data into text data; if not, the crawled data are stored in the medical database; the Baidu character recognition API is a platform that recognizes text in a variety of general scenes and documents and returns the results line by line;
Step 5: Word segmentation: the PDF document corpus processed by the Baidu character recognition API is segmented with the jieba word segmentation algorithm;
Step 6: Text denoising: after segmentation, the PDF document corpus processed by the Baidu character recognition API contains many symbols, punctuation marks and stop words, which degrade the quality of the medical data and hinder keyword extraction from medical reports, so this irrelevant text is removed; a Chinese stop word list, stopwords.txt, is then built, every word in the text is traversed, and any word that appears in the stop word list is deleted;
Step 7: Key information extraction: the denoised PDF document corpus does not by itself yield the key information corresponding to the keywords, so it is processed with regular expressions, the key information under each input keyword is extracted, and the key information is stored in the medical database. For example, with the keywords "tumor size" and "nodule", the data following each keyword are extracted, yielding key information such as the size of the tumor and whether the nodule is likely to be benign or malignant; FIG. 3 shows the regular-expression matching of key information in the present invention;
Further, the method also comprises Step 8: Processing newly added data: the data updated in the medical system each day are processed according to Steps 1 to 7; the information extracted from the newly added data is looked up in the medical database by name, age, address and identity card number to see whether a patient with the same name, age, address and identity card number already exists; if so, the patient is judged to be readmitted and is stored in the readmission medical database; otherwise, the patient is treated as a new patient and is stored in the patient information base.
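The text denoising of Step 6 can be sketched in a few lines of Python. This is an illustrative sketch only: the names `denoise`, `load_stopwords`, `NOISE_RE` and the four sample stop words are hypothetical, and the token list is written by hand to stand in for the output of a segmenter such as jieba, so the block runs without third-party packages.

```python
import re

# A tiny illustrative stop-word set; the method loads these from stopwords.txt.
STOPWORDS = {"的", "了", "和", "是"}

# Tokens made only of punctuation, symbols, or whitespace are treated as noise.
NOISE_RE = re.compile(r"[\W_]+")


def load_stopwords(path: str) -> set:
    """Read one stop word per line from a UTF-8 text file."""
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def denoise(tokens, stopwords):
    """Drop punctuation-only tokens and stop words, keeping token order."""
    return [t for t in tokens if not NOISE_RE.fullmatch(t) and t not in stopwords]


# Hand-written tokens for "tumor size: 3.2 cm, nodule." as a segmenter might split them.
tokens = ["肿瘤", "的", "大小", "：", "3.2", "cm", "，", "结节", "。"]
print(denoise(tokens, STOPWORDS))
```

Traversing the tokens once and testing membership in a set keeps the pass linear in the size of the corpus, which matters when every newly crawled PDF goes through it.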
The method analyzes the web page structure of the medical data and crawls the medical data of the system; because extracting data through the system's login interface is cumbersome, the medical data of each patient are crawled by matching identifiers such as the patient visit ID and the medical order ID;
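Matching records by the two identifiers might look like the following Python sketch. Everything named here is an assumption made for illustration: the endpoint `https://his.example.org/record`, the query parameter names `visit_id` and `order_id`, and the helper functions are hypothetical, since the actual URL layout of the medical system is not described.

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE = "https://his.example.org/record"  # hypothetical endpoint


def record_url(visit_id: str, order_id: str) -> str:
    """Build the URL of one medical record from its two identifiers."""
    return f"{BASE}?{urlencode({'visit_id': visit_id, 'order_id': order_id})}"


def matches(url: str, visit_id: str, order_id: str) -> bool:
    """Check whether a queued URL points at the wanted visit and order."""
    query = parse_qs(urlparse(url).query)
    return query.get("visit_id") == [visit_id] and query.get("order_id") == [order_id]


# A made-up URL queue; only the first entry carries both wanted identifiers.
queue = [record_url("1001", "A7"), record_url("1001", "B2"), record_url("1002", "A7")]
targets = [u for u in queue if matches(u, "1001", "A7")]
print(targets)
```

Requiring both identifiers to match keeps records of the same patient visit but a different order, or the same order number under another visit, out of the result.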
The invention stores the patient's previous medical information, current basic information and the like in the database; extracts the patient's medical data from PDF medical order documents, abdominal slices, CT enhanced reconstruction documents and the like; performs character recognition with the Baidu character recognition API; and extracts the key information and stores it in the database. This offers a breakthrough for medical data that are otherwise difficult to mine and avoids wasting a large amount of human resources;
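The PDF check and the hand-off to character recognition could be sketched as below. This is a sketch under assumptions, not the patented implementation: the recognition service is injected as a callable and replaced here by `fake_ocr` so the code runs offline, the `{"words_result": [{"words": ...}]}` layout is an assumed response shape for a line-by-line OCR result, and any rendering of PDF pages to images before recognition is assumed to happen inside that callable.

```python
from typing import Callable, Dict, List


def is_pdf(payload: bytes) -> bool:
    """PDF files begin with the marker %PDF-."""
    return payload.startswith(b"%PDF-")


def ocr_lines(response: Dict) -> List[str]:
    """Collect recognized text lines from an OCR response.

    The {"words_result": [{"words": ...}]} layout is an assumed response shape.
    """
    return [item["words"] for item in response.get("words_result", [])]


def process(payload: bytes, ocr: Callable[[bytes], Dict], store: list, corpus: list) -> None:
    """Route one crawled payload: OCR for PDFs, direct storage otherwise."""
    if is_pdf(payload):
        corpus.append("\n".join(ocr_lines(ocr(payload))))
    else:
        store.append(payload.decode("utf-8"))


def fake_ocr(_payload: bytes) -> Dict:
    # Stand-in for the real recognition service so the sketch runs offline.
    return {"words_result": [{"words": "tumor size 3.2 cm"}, {"words": "nodule benign"}]}


store, corpus = [], []
process(b"%PDF-1.4 ...", fake_ocr, store, corpus)
process(b'{"visit_id": "1001"}', fake_ocr, store, corpus)
print(corpus)
print(store)
```

Passing the recognizer in as a parameter keeps the routing logic testable without network access or API credentials.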
The denoised PDF document corpus is processed with regular expressions, the key information under each input keyword is extracted, and the key information is stored in the medical database. The invention converts unformatted data into formatted data, stores the formatted data in the correspondingly constructed database, and extracts and judges the newly added data every day to obtain information on readmitted patients, finally forming a complete database. The experiment was carried out in the urology department of the second affiliated hospital of a university; a complete urology database was extracted, and the results were better than those of manual extraction and subsequent storage.
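Keyword-driven extraction with regular expressions can be illustrated with the short Python sketch below. The function `extract_key_info`, the delimiter set and the sample report text are hypothetical; the actual patterns used by the method are not given, so this only shows one plausible way to capture the text that follows each keyword.

```python
import re


def extract_key_info(text: str, keywords) -> dict:
    """Map each keyword to the text that follows it, up to the next delimiter."""
    result = {}
    for kw in keywords:
        # Capture everything after the keyword until a sentence delimiter.
        pattern = re.compile(re.escape(kw) + r"\s*[:：]?\s*([^;；。\n]+)")
        match = pattern.search(text)
        if match:
            result[kw] = match.group(1).strip()
    return result


# Made-up report text, for illustration only.
report = "tumor size: 3.2 cm x 2.1 cm; nodule: likely benign\nother findings: none"
print(extract_key_info(report, ["tumor size", "nodule", "margin"]))
```

Keywords that do not occur in the text are simply absent from the result, so a caller can tell a missing field from an empty one before writing to the database.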
To test the performance of the proposed method, a manually compiled database was compared with the database produced by the invention. Table 1 compares the time and accuracy of manual data extraction with those of the invention's data extraction, and shows that the method achieves high accuracy, requires little time and is efficient.
TABLE 1
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
Claims (3)
1. A medical system crawler-based data extraction method, characterized in that: a URL in the medical system is first initialized; the URL queue is parsed, with HTML data parsed by regular expressions and JSON data parsed by a json module; each medical data URL is then requested over HTTP, and the target medical data are matched and crawled by the patient visit ID and the medical order ID; the crawled data are stored in a medical database; the crawled patient data are checked to determine whether they are a PDF document; if so, character recognition is performed with the Baidu character recognition API, which converts the image data into text data; if not, the crawled data are stored in the medical database; and word segmentation, text denoising and key information extraction are performed on the PDF document corpus produced by the Baidu character recognition API, and the key information is stored in the medical database.
2. The medical system crawler-based data extraction method of claim 1, wherein the method comprises the following specific steps:
Step 1: Initializing the URL: using an HTTP (Hypertext Transfer Protocol) library, a request is sent to the target medical data website, that is, the hospital web pages in the medical system from which medical data are to be crawled; if the server responds, a Response for the hospital web page is obtained, which contains the page's HTML (HyperText Markup Language) data and its JSON (lightweight data-interchange format) data;
Step 2: Parsing the URL queue: the HTML data are parsed with regular expressions, and the JSON data are then parsed with a json module;
Step 3: Crawling patient data: each medical data URL is requested over HTTP, and the target medical data are matched and crawled by the patient visit ID and the medical order ID; the crawled data are stored in a medical database;
Step 4: PDF document character recognition: it is determined whether the patient data crawled in Step 3 are a PDF document; if so, character recognition is performed with the Baidu character recognition API, which converts the image data into text data; if not, the crawled data are stored in the medical database; the Baidu character recognition API is a platform that recognizes text in a variety of general scenes and documents and returns the results line by line;
Step 5: Word segmentation: the PDF document corpus processed by the Baidu character recognition API is segmented with the jieba word segmentation algorithm;
Step 6: Text denoising: after segmentation, the PDF document corpus processed by the Baidu character recognition API contains many symbols, punctuation marks and stop words, which degrade the quality of the medical data and hinder keyword extraction from medical reports, so this irrelevant text is removed; a Chinese stop word list, stopwords.txt, is then built, every word in the text is traversed, and any word that appears in the stop word list is deleted;
Step 7: Key information extraction: the denoised PDF document corpus does not by itself yield the key information corresponding to the keywords, so it is processed with regular expressions, the key information under each input keyword is extracted, and the key information is stored in the medical database.
3. The medical system crawler-based data extraction method of claim 1, further comprising Step 8: Processing newly added data: the data updated in the medical system each day are processed according to Steps 1 to 7; the information extracted from the newly added data is looked up in the medical database by name, age, address and identity card number to see whether a patient with the same name, age, address and identity card number already exists; if so, the patient is judged to be readmitted and is stored in the readmission medical database; otherwise, the patient is treated as a new patient and is stored in the patient information base.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911104769.7A CN111078976A (en) | 2019-11-08 | 2019-11-08 | Medical system crawler-based data extraction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911104769.7A CN111078976A (en) | 2019-11-08 | 2019-11-08 | Medical system crawler-based data extraction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111078976A (en) | 2020-04-28 |
Family
ID=70310938
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911104769.7A Pending CN111078976A (en) | 2019-11-08 | 2019-11-08 | Medical system crawler-based data extraction method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111078976A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113223661A (en) * | 2021-05-26 | 2021-08-06 | 杭州比康信息科技有限公司 | Traditional Chinese medicine prescription transmission system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104820697A (en) * | 2015-04-28 | 2015-08-05 | 迈德高武汉生物医学信息科技有限公司 | Medical data mining method and system |
CN109215754A (en) * | 2018-09-10 | 2019-01-15 | 平安科技(深圳)有限公司 | Medical record data processing method, device, computer equipment and storage medium |
CN109493931A (en) * | 2018-10-25 | 2019-03-19 | 平安科技(深圳)有限公司 | A kind of coding method of patient file, server and computer readable storage medium |
CN109524073A (en) * | 2018-10-17 | 2019-03-26 | 新博卓畅技术(北京)有限公司 | A kind of automatic deciphering method of hospital's audit report, system and equipment |
CN110136837A (en) * | 2019-03-29 | 2019-08-16 | 中国人民解放军总医院 | A kind of medical data processing platform |
CN110335654A (en) * | 2019-07-03 | 2019-10-15 | 重庆邮电大学 | A kind of information extraction method of electronic health record, system and computer equipment |
CN110415831A (en) * | 2019-07-18 | 2019-11-05 | 天宜(天津)信息科技有限公司 | A kind of medical treatment big data cloud service analysis platform |
Non-Patent Citations (4)
Title |
---|
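Yu Shanshan et al.: "Research on crawler technology for unstructured data retrieval in medical big data", 2014 China Hospital Information Network Conference * |
Feng Sidu et al.: "Research and design of a web crawler system based on medical information", Modern Information Technology * |
Bian Weiwei et al.: "A system for collecting and organizing health and medical big data based on web crawler technology", Journal of Shandong University (Health Sciences) * |
Miao Yue et al.: "Python-based medical data crawling, analysis and processing", Information Technology and Modernization * |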
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200428 |