WO2019080428A1

WO2019080428A1 - Method for obtaining target document and application server

Info

Publication number: WO2019080428A1
Application number: PCT/CN2018/077627
Authority: WO
Inventors: 阮晓雯; 周瑜; 徐亮; 肖京
Original assignee: 平安科技（深圳）有限公司
Priority date: 2017-10-23
Filing date: 2018-02-28
Publication date: 2019-05-02
Also published as: CN108427702A; CN108427702B

Abstract

Disclosed in the present application is a method for obtaining a target document. The method comprises: obtaining search keywords; establishing a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table; inputting preprocessed document information to the document selection model, so that the document selection model processes the document information according to the search keywords; calculating, according to a preset keyword frequency and density algorithm, word frequencies and density scores of the search keywords in documents output by the document selection model, and performing relevance ranking on the documents according to the word frequencies and the density scores; and outputting, according to a preset relevance threshold, a target document with relevance greater than the preset relevance threshold in the documents. The present application also provides an application server and a computer readable storage medium. By means of the method for obtaining a target document, the application server, and the computer readable storage medium provided by the present application, a target document can be quickly obtained.

Description

Target document acquisition method and application server

The present application claims priority to the Chinese patent application filed on October 23, 2017.

Technical field

The present application relates to the field of data analysis technologies, and in particular, to a target document acquisition method and an application server.

Background technique

With the advent of the information age, people store large amounts of information on high-capacity storage devices, use database management systems to integrate and manage it, and obtain the information they need by querying the database. At present, retrieval based on keyword matching encounters many problems because of ambiguity in vocabulary, query conditions, and expressions. For example, a medical insurance policy defines two logics for insulin use, one of which is limited to recurrent hypoglycemia. Translated into data features, this means that there are two or more glucose usage records, so the field information containing the drug name "glucose" must be captured by natural language processing. However, different cities use different writing formats and methods when entering data, and in many cases the data is difficult to parse correctly. If the raw data is used directly to capture "glucose", the results in production will be less than ideal and may even deviate from the true results. If special handling is applied for one region, the data must be reprocessed when moving to another region, which adds considerable time cost.

Therefore, in view of the above problems, it is urgent to provide a new retrieval method to obtain real retrieval results and adapt to different regions and reduce costs.

Summary of the invention

In view of this, the present application proposes a target document acquisition method and an application server to solve the problem.

First, in order to achieve the above object, the present application provides a target document acquisition method, the method comprising the steps of: acquiring at least one document and document information corresponding to the document, and preprocessing the document information; acquiring a search keyword; establishing a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table; inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword; calculating, according to a preset keyword frequency and density algorithm, a word frequency and density score of the search keyword in the documents output by the document selection model, and ranking the documents by relevance according to the word frequency and density score; and outputting, according to a preset relevance threshold, the target documents whose relevance is greater than the preset relevance threshold.

In addition, in order to achieve the above object, the present application further provides an application server, including a memory and a processor, where the memory stores a target document acquisition system executable on the processor, and the steps of the target document acquisition method described above are implemented when the target document acquisition system is executed by the processor.

Further, to achieve the above object, the present application further provides a computer readable storage medium storing a target document acquisition system, the target document acquisition system being executable by at least one processor, so that the at least one processor performs the steps of the target document acquisition method described above.

Compared with the prior art, the target document acquisition method, the application server, and the computer readable storage medium proposed by the present application first obtain a search keyword; second, establish a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table; third, input the preprocessed document information into the document selection model, which processes the document information according to the search keyword; then calculate, according to a preset keyword frequency and density algorithm, a word frequency and density score of the search keyword in the documents output by the document selection model, and rank the documents by relevance according to the word frequency and density score; and finally output, according to a preset relevance threshold, the target documents whose relevance is greater than the preset relevance threshold. The target document acquisition method, the application server, and the computer readable storage medium proposed by the present application can quickly and accurately obtain target documents on the network and can be applied to different regions, thereby greatly improving efficiency and reducing cost.

DRAWINGS

FIG. 1 is a schematic diagram of an optional hardware architecture of an application server of the present application;

FIG. 2 is a schematic diagram of the program modules of an embodiment of a target document acquisition system of the present application;

FIG. 3 is a schematic flowchart of a first embodiment of a target document acquisition method of the present application;

FIG. 4 is a schematic flowchart of a second embodiment of the target document acquisition method of the present application;

FIG. 5 is a schematic flowchart of a third embodiment of the target document acquisition method of the present application;

FIG. 6 is a schematic flowchart of a fourth embodiment of the target document acquisition method of the present application;

FIG. 7 is a schematic flowchart of a fifth embodiment of the target document acquisition method of the present application.

The implementation, functional features and advantages of the present application will be further described with reference to the accompanying drawings.

Detailed description

In order to make the objects, technical solutions, and advantages of the present application more comprehensible, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to be limiting. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.

It should be noted that descriptions such as "first" and "second" in the present application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, but only on the basis that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be implemented, the combination should be considered not to exist and is not within the scope of protection claimed by this application.

Referring to FIG. 1, it is a schematic diagram of an optional hardware architecture of the application server 1 of the present application. In this embodiment, the application server 1 may include, but is not limited to, the memory 11, the processor 12, and the network interface 13 being communicably connected to each other through a system bus. It is pointed out that Figure 1 only shows the application server 1 with components 11-13, but it should be understood that not all illustrated components may be implemented, and more or fewer components may be implemented instead.

The application server 1 may be a computing device such as a rack server, a blade server, a tower server, or a cabinet server. The application server 1 may be an independent server or a server cluster composed of multiple servers.

The memory 11 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 11 may be an internal storage unit of the application server 1, such as a hard disk or memory of the application server 1. In other embodiments, the memory 11 may also be an external storage device of the application server 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the application server 1. Of course, the memory 11 can also include both the internal storage unit of the application server 1 and its external storage device. In the present embodiment, the memory 11 is generally used to store an operating system installed in the application server 1 and various types of application software, such as program code of the target document acquisition system 200. Further, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.

The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the application server 1. In this embodiment, the processor 12 is configured to run program code or process data stored in the memory 11, such as running the target document acquisition system 200 and the like.

The network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the application server 1 and other electronic devices.

So far, the hardware structure and functions of the devices related to this application have been described in detail. The various embodiments of the present application are presented below on the basis of the above description.

First, the present application proposes a target document acquisition system 200.

Referring to FIG. 2, it is a program module diagram of the first embodiment of the target document obtaining system 200 of the present application.

In an embodiment, the target document acquisition system 200 includes a series of computer program instructions stored on the memory 11; when the computer program instructions are executed by the processor 12, the target document acquisition operations of the embodiments of the present application can be implemented. In some embodiments, the target document acquisition system 200 can be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions. For example, in FIG. 2, the target document acquisition system 200 can be divided into an acquisition module 21, an establishment module 22, a call module 23, a processing module 24, a comparison module 25, and an output module 26. Wherein:

The obtaining module 21 is configured to acquire at least one document and document information corresponding to the document, and preprocess the document information.

Specifically, the documents may be various documents from insurance institutions and medical institutions, each of which has a database for storing medical insurance reimbursement documents, drug lists, and the like; the medical institutions include hospitals, clinics, and so on established in different places.

Specifically, the preprocessing includes the steps of: segmenting the document to obtain at least one word; performing part-of-speech analysis on the word to obtain first information of the word; and taking, as candidate words, the words whose part of speech is a preset part of speech or whose first information is preset first information.

Specifically, the parts of speech include nouns, verbs, adjectives, numerals, quantifiers, pronouns, adverbs, conjunctions, auxiliary words, and so on; the first information includes person names, institution names, place names, times, dates, percentages, and so on. For example, in the field of medicine, the name of a pharmaceutical compound is often an important candidate word, and the part of speech of a pharmaceutical compound name is usually a noun, so the preset part of speech can be the noun.

Specifically, the above processing steps can be implemented with existing tools. For example, when the document is a Chinese document, ICTCLAS (Institute of Computing Technology, Chinese Lexical Analysis System), the HIT-IRLAS lexical analyzer of Harbin Institute of Technology, and the like can be used; when the document is an English document, the Stanford Parser (also known as the Stanford lexical analyzer) can be used. Preferably, the candidate words may also be subjected to shallow syntactic analysis or chunk analysis to form chunk structure information, and the chunk structure information is further used as candidate words; for example, the chunk structure information may be a non-recursive noun phrase, a verb phrase, and so on.
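The preprocessing steps above (segmentation, part-of-speech analysis, candidate selection) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: the regular-expression tokenizer, the toy `POS_LEXICON`, and the function names `segment`, `tag`, and `select_candidates` are all hypothetical stand-ins for a real analyzer such as those named above.

```python
import re

# Hypothetical toy lexicon: word -> (part of speech, first information).
# A real system would obtain these labels from a lexical analyzer.
POS_LEXICON = {
    "glucose": ("noun", None),
    "injection": ("noun", None),
    "shenzhen": ("noun", "place name"),
    "administered": ("verb", None),
    "twice": ("adverb", None),
}


def segment(text):
    """Split a document into lowercase tokens (stand-in for real word segmentation)."""
    return re.findall(r"[a-z0-9%]+", text.lower())


def tag(word):
    """Return (part of speech, first information); unknown words get ('unknown', None)."""
    return POS_LEXICON.get(word, ("unknown", None))


def select_candidates(text, wanted_pos=("noun",), wanted_info=("place name",)):
    """Keep words whose part of speech or first information matches the presets."""
    candidates = []
    for word in segment(text):
        pos, info = tag(word)
        if pos in wanted_pos or info in wanted_info:
            candidates.append(word)
    return candidates


candidates = select_candidates("Glucose injection administered twice in Shenzhen.")
```

With the preset part of speech set to the noun, only the three nouns in the sample sentence survive as candidate words; the verb, the adverb, and the unknown word are dropped.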

The obtaining module 21 is further configured to acquire a search keyword.

Specifically, there may be one or more search keywords. When the method is used to extract recurrent hypoglycemia documents, the search keyword may be defined as "glucose". Users can set different keywords according to different needs.

The establishing module 22 is configured to establish a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table.

Specifically, word segmentation of a document is an inherently fuzzy process; once the document selection model is established, the required documents can be accurately selected according to the candidate words obtained by preprocessing.

Specifically, the character deletion table includes characters among the candidate words that are inconsistent with the search keyword. In some cases, the target document includes a plurality of sentences, symbols, words, and the like, and the words obtained after the target document is segmented during preprocessing may include characters and words that do not meet the requirements; the character deletion table is created to handle words produced by erroneous or inappropriate segmentation.

Specifically, the synonym and near-synonym table includes synonyms and near-synonyms of the search keyword, and may also include the corresponding vocabulary in other languages. Near-synonyms are a group of words that have similar meanings or are related to each other. The same word can have multiple synonyms in the same language. For a keyword to be searched, different people in different places write it in different ways. For example, two different Chinese words for "computer" are synonyms. In different fields, even the same word can have different meanings, so the selected technical field is also important for correct retrieval.

Specifically, the specification parameter table includes multiple parameters corresponding to the search keyword. Taking glucose as an example, in the case of recurrent hypoglycemia, the specification parameters of glucose include the amount and frequency of use. When the search keyword has a specification parameter definition, the candidate words can be accurately located using the specification parameter table.

Specifically, establishing the document selection model includes the following steps: analyzing the search keyword to obtain the technical field of the search keyword; in the technical field, setting a character deletion table according to the analysis result; in the technical field, obtaining synonyms and near-synonyms of the keyword from a database and establishing a synonym and near-synonym table; in the technical field, analyzing the keyword and selecting the specification parameters of the keyword to establish a specification parameter table; and dynamically updating the character deletion table, the synonym and near-synonym table, and the specification parameter table.
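The model-building steps above can be illustrated with a small data structure holding the three tables. The sketch below is only one possible arrangement: the `DocumentSelectionModel` class, the in-memory `KNOWLEDGE_BASE` standing in for "a database", and the `min_uses` parameter (taken from the "two or more glucose usage records" example in the background) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentSelectionModel:
    """Container for the three tables; all field names are illustrative."""
    field_name: str                                       # technical field of the keyword
    deletion_table: set = field(default_factory=set)      # characters or words to drop
    synonym_table: dict = field(default_factory=dict)     # keyword -> set of synonyms
    spec_table: dict = field(default_factory=dict)        # keyword -> {parameter: value}


# Hypothetical stand-in for the database of per-field knowledge.
KNOWLEDGE_BASE = {
    "medicine": {
        "keywords": {"glucose"},
        "noise": {"*", "#", "n/a"},
        "synonyms": {"glucose": {"dextrose"}},
        "specs": {"glucose": {"min_uses": 2}},
    }
}


def detect_field(keyword):
    """Analyze the keyword to find its technical field (here, a simple lookup)."""
    for name, entry in KNOWLEDGE_BASE.items():
        if keyword in entry["keywords"]:
            return name
    raise KeyError(f"no technical field known for keyword {keyword!r}")


def build_model(keyword):
    """Build the three tables for one keyword within its technical field."""
    name = detect_field(keyword)
    entry = KNOWLEDGE_BASE[name]
    return DocumentSelectionModel(
        field_name=name,
        deletion_table=set(entry["noise"]),
        synonym_table={keyword: set(entry["synonyms"].get(keyword, ()))},
        spec_table={keyword: dict(entry["specs"].get(keyword, {}))},
    )


model = build_model("glucose")
```

Copying the entries into fresh sets and dictionaries keeps the model independent of the knowledge base, so a later dynamic update of the tables does not silently alter the source data.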

The calling module 23 is configured to input the pre-processed document information into the document selection model, and the document selection model processes the document information.

Specifically, the preprocessed document information is fed as input to the document selection model, and the document selection model processes the document information according to preset conditions. The processing steps include: calling the character deletion table to delete, from the document information, characters and words that are erroneous, redundant, or obviously unrelated to the search keyword; calling the synonym and near-synonym table to replace the search keyword, searching with the replaced search keyword, and saving the document information that matches the search keyword and its synonyms and near-synonyms; and calling the specification parameter table to compare and analyze the specification parameters corresponding to the search keyword and its synonyms and near-synonyms, and saving the document information that matches the data in the specification parameter table.
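The three processing steps can be sketched as a single filter function. This is a simplified illustration, assuming that documents are already token lists and that the only specification parameter is a minimum number of usage records; the function name `process_document` and the sample receipts are hypothetical.

```python
def process_document(tokens, keyword, deletion_table, synonyms, min_uses):
    """Return the cleaned tokens if the document matches, otherwise None."""
    # 1. Character deletion table: drop erroneous or unrelated tokens.
    cleaned = [t for t in tokens if t not in deletion_table]
    # 2. Synonym table: normalize every synonym to the search keyword.
    normalized = [keyword if t in synonyms else t for t in cleaned]
    # 3. Specification parameter table: require a minimum number of usage records.
    if normalized.count(keyword) >= min_uses:
        return normalized
    return None


docs = {
    "receipt-1": ["glucose", "#", "injection", "dextrose", "injection"],
    "receipt-2": ["aspirin", "tablet", "*"],
    "receipt-3": ["glucose", "injection"],
}

matched = {}
for name, tokens in docs.items():
    result = process_document(tokens, "glucose", {"#", "*"}, {"dextrose"}, 2)
    if result is not None:
        matched[name] = result
```

Here only the first receipt is kept: after noise removal and synonym normalization it contains two glucose records, while the third receipt contains only one and the second contains none.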

Specifically, when processing the document information, the following steps may be further included: establishing a bracket recognition model, and identifying different usage manners of the brackets to obtain accurate classification data.

Specifically, the bracket recognition model can identify the different functions of brackets, including a unilateral relationship, a parallel relationship, and an inclusion relationship. In the unilateral relationship, the brackets act as separators that segment the document information in the target document; in the parallel relationship, the brackets in the document information give aliases of certain words; and in the inclusion relationship, the brackets give specific parameter information for certain nouns in the document information.

The processing module 24 is configured to calculate, according to a preset keyword frequency and density algorithm, a word frequency and density score of the keyword in the documents output by the document selection model, and to rank the documents by relevance according to the word frequency and density score.
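A rule-based approximation of such a bracket recognition model might look like the sketch below. The source does not say how the three relationships are recognized, so the heuristics here are purely assumptions: a bare number or single letter is treated as a separator, content containing a digit as parameter information, and anything else as an alias.

```python
import re

# Matches one innermost bracketed span, in half-width or full-width brackets.
BRACKET = re.compile(r"[(（]([^()（）]*)[)）]")


def classify_brackets(text):
    """Return a list of (content, role) pairs, one per bracketed span."""
    results = []
    for match in BRACKET.finditer(text):
        content = match.group(1).strip()
        if re.fullmatch(r"\d+|[a-zA-Z]", content):
            role = "separator"   # assumed: "(1)" or "(a)" only splits the text
        elif re.search(r"\d", content):
            role = "inclusion"   # assumed: digits indicate specification parameters
        else:
            role = "parallel"    # assumed: otherwise an alias of the preceding word
        results.append((content, role))
    return results


roles = classify_brackets("(1) glucose (dextrose) injection (50 ml x 2)")
```

In practice the alias branch would more plausibly consult the synonym and near-synonym table rather than fall through by default, but the three-way split is what matters for obtaining correctly classified data.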

Specifically, the keyword frequency and density score M is:

M = ∑ log(total number of documents / (number of documents containing the keyword + 1)) × exp(count(keyword), S)

where count(keyword) is the number of times the query word is hit in the search result, log(total number of documents / (number of documents containing the keyword + 1)) is the importance of the keyword in the query results, and S is a preset parameter.
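The score can be computed as in the sketch below. One point is an explicit assumption: the source writes exp(count(keyword), S) without defining the two-argument form, and this sketch reads it as count(keyword) raised to the power S. The function name `keyword_score`, the toy corpus, and the default `s=1.0` are likewise illustrative.

```python
import math


def keyword_score(keywords, doc_tokens, corpus, s=1.0):
    """Sum over keywords of log(N / (n_k + 1)) * count ** s.

    ASSUMPTION: exp(count(keyword), S) in the source is interpreted here
    as count(keyword) raised to the power S.
    """
    total_docs = len(corpus)
    score = 0.0
    for kw in keywords:
        # Number of documents in the corpus that contain the keyword.
        containing = sum(1 for tokens in corpus if kw in tokens)
        importance = math.log(total_docs / (containing + 1))
        count = doc_tokens.count(kw)
        score += importance * (count ** s)
    return score


corpus = [
    ["glucose", "injection", "glucose"],
    ["aspirin", "tablet"],
    ["saline", "injection"],
    ["glucose", "tablet"],
    ["vitamin", "tablet"],
    ["insulin", "injection"],
]
scores = [keyword_score(["glucose"], doc, corpus) for doc in corpus]
```

Under this reading, a document that mentions the keyword twice scores twice as high as one that mentions it once when S is 1, and a larger S rewards repeated mentions more strongly. Note that the importance term becomes zero or negative once the keyword appears in nearly every document.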

The comparison module 25 is configured to compare the relevance of the document with a preset relevance threshold.

The output module 26 is configured to output, according to the preset relevance threshold, the target documents whose relevance is greater than the preset relevance threshold.

Further, when the target document acquisition method is applied to the medical field, for example to acquire recurrent hypoglycemia documents, the following steps may also be included: analyzing the filtered recurrent hypoglycemia documents to obtain the patient's identity information; obtaining the patient's historical medical data from the database according to the identity information; obtaining data on the patient's glucose use, disease detection, and treatment mode from the historical medical data; and obtaining all of the patient's recurrent hypoglycemia documents according to the above data.

In addition, the present application also proposes a target document acquisition method.

Referring to FIG. 3, it is a schematic flowchart of the first implementation manner of the target document obtaining method of the present application. In the present embodiment, the order of execution of the steps in the flowchart shown in FIG. 3 may be changed according to different requirements, and some steps may be omitted.

Step S110: Acquire at least one document and document information corresponding to the document, and preprocess the document information.

Step S120, acquiring a search keyword.

Step S130, establishing a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table.

Specifically, word segmentation of a document is an inherently fuzzy process; once the document selection model is established, the required documents can be accurately selected according to the candidate words obtained by preprocessing. For example, a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table can quickly and accurately obtain the desired documents.

Step S140, input the pre-processed document information into the document selection model, and the document selection model processes the document information according to the retrieval keyword.

Specifically, the document selection model is invoked to process the document information so that matching documents can be obtained quickly. The processing further includes establishing a bracket recognition model to identify the different ways brackets are used, so as to obtain accurately classified data. The content in brackets is diverse and includes explanations of the preceding word, quantitative descriptions, synonyms, foreign words, and so on; brackets may also serve merely as sentence separators. Establishing the bracket recognition model for these different situations helps to obtain better results.

Step S150, calculating, according to a preset keyword frequency and density algorithm, a word frequency and density score of the search keyword in the documents output by the document selection model, and ranking the documents by relevance according to the word frequency and density score.

Specifically, the keyword frequency and density score M is computed with the same formula as given above for the processing module 24: M = ∑ log(total number of documents / (number of documents containing the keyword + 1)) × exp(count(keyword), S).

Step S160, outputting, according to the preset relevance threshold, the target documents whose relevance is greater than the preset relevance threshold.

Specifically, setting a relevance threshold makes it possible to obtain the required documents more conveniently and accurately. The user can also fine-tune the relevance threshold according to the results of a review, and this feedback loop further improves the retrieval method.
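Ranking and thresholding can be sketched in a few lines. The scores below are made-up illustrative values rather than results from the present application, and `select_targets` is a hypothetical helper name.

```python
def select_targets(scored_docs, threshold):
    """Rank documents by score, highest first, and keep those above the threshold."""
    ranked = sorted(scored_docs.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score > threshold]


# Illustrative relevance scores only.
scored = {"receipt-2": 0.0, "receipt-3": 0.69, "receipt-1": 1.39}

strict = select_targets(scored, threshold=1.0)    # before review
relaxed = select_targets(scored, threshold=0.5)   # after lowering the threshold
```

Lowering the threshold after a review admits the second-ranked document as well, which is the kind of fine adjustment the feedback step describes.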

FIG. 4 is a schematic flowchart of a second embodiment of the target document acquisition method of the present application. In this embodiment, the preprocessing in step S110, "acquiring at least one document and document information corresponding to the document, and preprocessing the document information", specifically includes the following steps:

Step S210, segmenting the document to obtain at least one word.

Step S220, performing part of speech analysis on the words to obtain first information of the words.

Step S230, taking, as candidate words, the words whose part of speech is a preset part of speech or whose first information is preset first information.

FIG. 5 is a schematic flowchart of a third embodiment of the target document acquisition method of the present application. In this embodiment, step S130, "establishing a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table", includes the following steps:

Step S310, analyzing the search keyword to obtain a technical field of the search keyword.

Specifically, a search keyword often carries a meaning specific to its field, from which the technical field of the search keyword can be determined. For example, if the search keyword is "binary tree", the technical field can be narrowed down to computing, algorithms, and the like.

Step S320, in the technical field, setting a character deletion table according to the analysis result.

Step S330, in the technical field, obtaining synonyms and near-synonyms of the keyword from a database and establishing the synonym and near-synonym table.

Step S340, in the technical field, analyzing the keyword and selecting the specification parameters of the keyword to establish the specification parameter table.

Step S350, dynamically updating the character deletion table, the synonym and near-synonym table, and the specification parameter table.

Specifically, as more information becomes available, the character deletion table, the synonym and near-synonym table, and the specification parameter table may be dynamically updated according to the information obtained. This makes the three tables more complete, so that the document selection model based on them becomes more accurate.
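The dynamic update of step S350 can be sketched as a merge of newly observed entries into the existing tables. The dictionary layout, the `update_tables` helper, and the sample entries (including "GS" as an abbreviation seen in new data) are assumptions for illustration.

```python
def update_tables(tables, observations):
    """Merge newly observed entries into the three tables in place."""
    tables["deletion"].update(observations.get("noise", ()))
    for keyword, words in observations.get("synonyms", {}).items():
        tables["synonyms"].setdefault(keyword, set()).update(words)
    for keyword, params in observations.get("specs", {}).items():
        tables["specs"].setdefault(keyword, {}).update(params)
    return tables


tables = {
    "deletion": {"#"},
    "synonyms": {"glucose": {"dextrose"}},
    "specs": {"glucose": {"min_uses": 2}},
}

# Hypothetical entries gathered from newly processed documents.
update_tables(
    tables,
    {
        "noise": {"*"},
        "synonyms": {"glucose": {"GS"}},
        "specs": {"glucose": {"unit": "ml"}},
    },
)
```

Existing entries are preserved and new ones are added alongside them, so the tables only grow more complete as more data from different regions is seen.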

FIG. 6 is a schematic flowchart of a fourth embodiment of the target document acquisition method of the present application. In this embodiment, the step of "inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword" specifically includes the following steps:

Step S410, calling the character deletion table to delete, from the document information, characters and words that are erroneous, redundant, or obviously unrelated to the search keyword.

Step S420, calling the synonym and near-synonym table to replace the search keyword, searching with the replaced search keyword, and saving the document information that matches the search keyword and its synonyms and near-synonyms.

Step S430, calling the specification parameter table to compare and analyze the specification parameters corresponding to the search keyword and its synonyms and near-synonyms, and saving the document information that matches the data in the specification parameter table.

FIG. 7 is a schematic flowchart of a fifth embodiment of the target document acquisition method of the present application. In this embodiment, when the target document acquisition method is used to acquire recurrent hypoglycemia documents, the following steps may also be included after the step of "outputting, according to a preset relevance threshold, the target documents whose relevance is greater than the preset relevance threshold":

Step S610, analyzing the filtered target document to obtain identity information of the patient.

Specifically, the target document selected may be a recurrent hypoglycemic document.

Step S620, obtaining historical medical treatment data of the patient from the database according to the identity information of the patient.

Step S630, obtaining data such as glucose usage, disease detection, and treatment mode of the patient from the historical medical treatment data.

Step S640, obtaining all recurrent hypoglycemia documents of the patient according to the above data.
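Steps S610 to S640 can be sketched with an in-memory dictionary standing in for the institution's database. The record layout, the `HISTORY` data, and the `related_documents` helper are hypothetical; a real deployment would query the actual medical database.

```python
# Hypothetical in-memory stand-in for the historical medical database.
HISTORY = {
    "patient-001": [
        {"doc": "receipt-1", "drug": "glucose", "diagnosis": "hypoglycemia"},
        {"doc": "receipt-7", "drug": "glucose", "diagnosis": "hypoglycemia"},
        {"doc": "receipt-9", "drug": "aspirin", "diagnosis": "headache"},
    ],
}


def related_documents(target_doc, history, drug="glucose", diagnosis="hypoglycemia"):
    """Collect all of a patient's documents that match the drug and diagnosis."""
    patient_id = target_doc["patient_id"]         # S610: identity from the target document
    records = history.get(patient_id, [])         # S620: historical medical data
    hits = [                                      # S630: glucose use and disease data
        r for r in records
        if r["drug"] == drug and r["diagnosis"] == diagnosis
    ]
    return [r["doc"] for r in hits]               # S640: all matching documents


found = related_documents({"doc": "receipt-1", "patient_id": "patient-001"}, HISTORY)
```

Starting from a single filtered receipt, the lookup recovers the patient's other hypoglycemia-related receipt while leaving the unrelated one out.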

The serial numbers of the embodiments of the present application are merely for the description, and do not represent the advantages and disadvantages of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the foregoing embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on such understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the various embodiments of the present application.

The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims

A method for acquiring a target document is applied to an application server, and the method includes the steps of:

Obtaining at least one document and document information corresponding to the document, and preprocessing the document information;

Get the search keyword;

Establish a document selection model based on a character deletion table, a synonym synonym table, and a specification parameter table;

Importing the pre-processed document information into the document selection model, the document selection model processing the document information according to the retrieval keyword;

Calculating a word frequency and a density score of the search keyword in the document output by the document selection model according to a preset keyword frequency and density algorithm, and sorting the documents according to the word frequency and the density score; and

And outputting, in the document, the target document whose relevance is greater than the preset relevance threshold according to a preset relevance threshold.
The target document acquisition method according to claim 1, wherein the preprocessing in the step of "obtaining at least one document and document information corresponding to the document, and preprocessing the document information" further comprises the following steps:

Segmenting the document to obtain at least one word;

Performing part-of-speech analysis on the words to obtain first information of the words; and

Taking, as candidate words, the words whose part of speech is a preset part of speech or whose first information is preset first information.
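
The preprocessing steps above (segmentation, part-of-speech analysis, candidate selection) might look like the following toy sketch. The whitespace tokenizer, the `TOY_POS_LEXICON` lookup table, and the tag names are all hypothetical stand-ins for a real segmenter and tagger, and the sketch filters on the part-of-speech tag only because the claim does not define the "first information" further.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy lexicon: word -> part-of-speech tag ("n" noun, "v" verb, ...).
TOY_POS_LEXICON: Dict[str, str] = {
    "patient": "n",
    "received": "v",
    "glucose": "n",
    "injection": "n",
    "twice": "adv",
}


def segment(text: str) -> List[str]:
    """Split on whitespace and strip surrounding punctuation (toy segmenter)."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]


def tag(words: List[str]) -> List[Tuple[str, str]]:
    """Attach a part-of-speech tag to each word; unknown words get tag 'x'."""
    return [(w, TOY_POS_LEXICON.get(w, "x")) for w in words]


def candidate_words(text: str, keep_pos: Set[str]) -> List[str]:
    """Keep the words whose part of speech is in the preset set."""
    return [w for w, pos in tag(segment(text)) if pos in keep_pos]


if __name__ == "__main__":
    print(candidate_words("Patient received glucose injection twice.", {"n"}))
```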
The target document acquisition method according to claim 1, wherein the character deletion table includes characters, among the candidate words, that are inconsistent with the search keyword; the synonym and near-synonym table includes synonyms and near-synonyms corresponding to the search keyword; and the specification parameter table includes various parameters corresponding to the search keyword.
The target document acquisition method according to claim 3, wherein the step of establishing the document selection model comprises:

Analyzing the search keyword to obtain the technical field of the search keyword;

Setting, in the technical field, the character deletion table according to the analysis result;

Obtaining, in the technical field, synonyms and near-synonyms of the keyword from a database, and establishing the synonym and near-synonym table;

Establishing, in the technical field, the specification parameter table by selecting the specification parameters of the keyword after the keyword analysis; and

Dynamically updating the character deletion table, the synonym and near-synonym table, and the specification parameter table.
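
One way to picture the three tables and their dynamic update is as a small in-memory structure. This is only a sketch under stated assumptions: the class `SelectionTables`, its field layout, and the sample values ("dextrose", "GS", "500ml") are illustrative and are not taken from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class SelectionTables:
    """Hypothetical container for the three tables behind the selection model."""
    deletion: Set[str] = field(default_factory=set)                  # characters to strip
    synonyms: Dict[str, Set[str]] = field(default_factory=dict)      # keyword -> synonyms
    spec_params: Dict[str, List[str]] = field(default_factory=dict)  # keyword -> specifications

    def update(
        self,
        keyword: str,
        *,
        delete: Iterable[str] = (),
        synonyms: Iterable[str] = (),
        specs: Iterable[str] = (),
    ) -> None:
        """Merge new entries into the tables without discarding existing ones."""
        self.deletion.update(delete)
        self.synonyms.setdefault(keyword, set()).update(synonyms)
        bucket = self.spec_params.setdefault(keyword, [])
        for spec in specs:
            if spec not in bucket:  # keep insertion order, avoid duplicates
                bucket.append(spec)


if __name__ == "__main__":
    tables = SelectionTables()
    tables.update("glucose", delete={"#"}, synonyms={"dextrose"}, specs=["500ml"])
    tables.update("glucose", synonyms={"GS"}, specs=["500ml", "250ml"])
    print(tables)
```

Calling `update` repeatedly as new data arrives corresponds loosely to the "dynamically updating" step; how the application actually persists or refreshes the tables is not specified here.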
The target document acquisition method according to claim 1, wherein, in the step of "inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword", the processing comprises:

Calling the character deletion table to delete characters and words in the document information that are incorrect, redundant, or obviously unrelated to the search keyword;

Calling the synonym and near-synonym table to replace the search keyword, searching with the replaced search keyword, and saving the document information that matches the search keyword or its synonyms and near-synonyms; and

Calling the specification parameter table to compare and analyze the specification parameters corresponding to the search keyword and its synonyms and near-synonyms, and saving the document information that matches the data in the specification parameter table.
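
The three processing steps can be sketched as a simple filter over text records. The function names, the substring-based matching, and the choice to require both a keyword match and a specification match are assumptions made for illustration; the application does not prescribe this logic.

```python
from typing import Iterable, List, Set


def clean(text: str, deletion_table: Iterable[str]) -> str:
    """Step 1: remove every entry of the character deletion table from the text."""
    for junk in deletion_table:
        text = text.replace(junk, "")
    return text


def matches_keyword(text: str, keyword: str, synonyms: Set[str]) -> bool:
    """Step 2: match the keyword or any of its synonyms, ignoring case."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in {keyword, *synonyms})


def matches_spec(text: str, specs: Iterable[str]) -> bool:
    """Step 3: match at least one entry of the specification parameter table."""
    lowered = text.lower()
    return any(spec.lower() in lowered for spec in specs)


def process(
    records: List[str],
    keyword: str,
    deletion_table: Iterable[str],
    synonyms: Set[str],
    specs: Iterable[str],
) -> List[str]:
    """Apply the three steps in order and keep the records that pass both matches."""
    deletion_entries = list(deletion_table)
    spec_entries = list(specs)
    kept = []
    for record in records:
        cleaned = clean(record, deletion_entries)
        if matches_keyword(cleaned, keyword, synonyms) and matches_spec(cleaned, spec_entries):
            kept.append(cleaned)
    return kept


if __name__ == "__main__":
    data = ["##Dextrose injection 500ml", "glucose tablets", "saline 500ml"]
    print(process(data, "glucose", ["#"], {"dextrose"}, ["500ml"]))
```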
The target document acquisition method according to claim 5, wherein, in the step of "inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword", the processing further comprises:

Establishing a bracket recognition model to identify the different ways in which brackets are used, so as to obtain accurate classification data.
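
The claim does not say how the bracket recognition model works, so the following is only a rule-based guess at the idea: normalise full-width brackets, extract bracketed content, and label it as a specification when it looks like a number followed by a unit. The unit list, the labels "specification" and "note", and the function name are all hypothetical.

```python
import re
from typing import List, Tuple

# Map full-width and square-lenticular brackets to their ASCII counterparts.
_FULLWIDTH = str.maketrans({"（": "(", "）": ")", "【": "[", "】": "]"})
# Capture the content of a non-nested (...) or [...] pair.
_BRACKETED = re.compile(r"[(\[]([^()\[\]]*)[)\]]")
# Assumed pattern for a specification: a number followed by a unit.
_SPEC = re.compile(r"^\s*\d+(?:\.\d+)?\s*(?:ml|mg|g|%)\s*$", re.IGNORECASE)


def classify_brackets(text: str) -> List[Tuple[str, str]]:
    """Return (content, kind) for each bracketed span found in the text."""
    normalised = text.translate(_FULLWIDTH)
    results = []
    for content in _BRACKETED.findall(normalised):
        kind = "specification" if _SPEC.match(content) else "note"
        results.append((content.strip(), kind))
    return results


if __name__ == "__main__":
    print(classify_brackets("Glucose injection（500ml）[imported]"))
```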
The target document acquisition method according to claim 1, wherein the keyword word frequency and density score M is: M = ∑ log(total number of documents / (number of documents including the search keyword + 1)) * Exp(count(the search keyword), S), wherein count(the search keyword) is the number of times the search keyword is hit in the search result, log(total number of documents / (number of documents including the search keyword + 1)) is the importance degree of the search keyword in the query result, and S is a preset parameter.
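
A minimal sketch of the score M follows. Two points are assumptions, not statements from the claim: the sum is taken over the search keywords, and the two-argument Exp(count, S) is read as count raised to the power S. Because that reading is uncertain, the sketch accepts the function as a replaceable parameter. All function and parameter names are hypothetical.

```python
import math
from typing import Callable, Mapping, Sequence


def power_exp(count: float, s: float) -> float:
    """Assumed reading of Exp(count, S): count raised to the power S."""
    return count ** s


def density_score(
    keywords: Sequence[str],
    hit_counts: Mapping[str, int],       # count(keyword): hits in the search result
    docs_containing: Mapping[str, int],  # number of documents including the keyword
    total_docs: int,
    s: float,                            # the preset parameter S
    exp_fn: Callable[[float, float], float] = power_exp,
) -> float:
    """M = sum over keywords of log(N / (n_k + 1)) * Exp(count_k, S)."""
    score = 0.0
    for kw in keywords:
        # Importance degree of the keyword: rarer keywords weigh more.
        importance = math.log(total_docs / (docs_containing.get(kw, 0) + 1))
        score += importance * exp_fn(hit_counts.get(kw, 0), s)
    return score


if __name__ == "__main__":
    m = density_score(["glucose"], {"glucose": 4}, {"glucose": 9}, total_docs=100, s=0.5)
    print(m)
```

With a parameter S between 0 and 1, this reading dampens the effect of very frequent hits; a different `exp_fn` can be passed in if Exp is meant to be something else.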
The target document acquisition method according to claim 1, wherein, when the target document acquisition method is used to acquire recurrent hypoglycemia documents, the method further comprises the following steps after the step of "outputting, according to a preset relevance threshold, the target documents among the documents whose relevance is greater than the preset relevance threshold":

Analyzing the selected target document to obtain identity information of the patient;

Obtaining historical medical data of the patient from the database according to the identity information of the patient;

Obtaining data such as glucose use, disease detection, and treatment methods of the patient from the historical medical data; and

Obtaining all recurrent hypoglycemia documents of the patient according to the above data.
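
The follow-up steps can be sketched as a lookup from selected documents to patient history. The rule of "two or more glucose usage records" comes from the background section of the application; the record layout (`patient_id`, `drug`), the in-memory `history_db`, and the function name are hypothetical.

```python
from typing import Dict, List


def recurrent_hypoglycemia_records(
    target_docs: List[Dict[str, str]],            # selected documents, each with a patient id
    history_db: Dict[str, List[Dict[str, str]]],  # hypothetical: patient id -> history records
    min_glucose_uses: int = 2,
) -> Dict[str, List[Dict[str, str]]]:
    """Return, per patient, the glucose usage records when there are enough of them."""
    result: Dict[str, List[Dict[str, str]]] = {}
    for doc in target_docs:
        patient_id = doc["patient_id"]
        history = history_db.get(patient_id, [])
        # Keep only the history records that mention glucose in the drug field.
        glucose = [r for r in history if "glucose" in r.get("drug", "").lower()]
        if len(glucose) >= min_glucose_uses:
            result[patient_id] = glucose
    return result


if __name__ == "__main__":
    db = {
        "p1": [{"drug": "Glucose injection"}, {"drug": "glucose tablets"}, {"drug": "insulin"}],
        "p2": [{"drug": "glucose injection"}],
    }
    docs = [{"patient_id": "p1"}, {"patient_id": "p2"}]
    print(recurrent_hypoglycemia_records(docs, db))
```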
An application server, comprising a memory and a processor, the memory storing a target document acquisition system executable on the processor, wherein the target document acquisition system, when executed by the processor, implements the following steps:

Obtaining at least one document and document information corresponding to the document, and preprocessing the document information;

Obtaining a search keyword;

Establishing a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table;

Inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword;

Calculating, according to a preset keyword frequency and density algorithm, a word frequency and a density score of the search keyword in the documents output by the document selection model, and ranking the documents by relevance according to the word frequency and the density score; and

Outputting, according to a preset relevance threshold, the target documents among the documents whose relevance is greater than the preset relevance threshold.
The application server according to claim 9, wherein the preprocessing in the step of "obtaining at least one document and document information corresponding to the document, and preprocessing the document information" further comprises the following steps:

Segmenting the document to obtain at least one word;

Performing part-of-speech analysis on the words to obtain first information of the words; and

Taking, as candidate words, the words whose part of speech is a preset part of speech or whose first information is preset first information.
The application server according to claim 9, wherein the character deletion table includes characters, among the candidate words, that are inconsistent with the search keyword; the synonym and near-synonym table includes synonyms and near-synonyms corresponding to the search keyword; and the specification parameter table includes various parameters corresponding to the search keyword.
The application server according to claim 11, wherein the step of establishing the document selection model comprises:

Analyzing the search keyword to obtain the technical field of the search keyword;

Setting, in the technical field, the character deletion table according to the analysis result;

Obtaining, in the technical field, synonyms and near-synonyms of the keyword from a database, and establishing the synonym and near-synonym table;

Establishing, in the technical field, the specification parameter table by selecting the specification parameters of the keyword after the keyword analysis; and

Dynamically updating the character deletion table, the synonym and near-synonym table, and the specification parameter table.
The application server according to claim 9, wherein, in the step of "inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword", the processing comprises:

Calling the character deletion table to delete characters and words in the document information that are incorrect, redundant, or obviously unrelated to the search keyword;

Calling the synonym and near-synonym table to replace the search keyword, searching with the replaced search keyword, and saving the document information that matches the search keyword or its synonyms and near-synonyms; and

Calling the specification parameter table to compare and analyze the specification parameters corresponding to the search keyword and its synonyms and near-synonyms, and saving the document information that matches the data in the specification parameter table.
The application server according to claim 9, wherein the keyword word frequency and density score M is: M = ∑ log(total number of documents / (number of documents including the search keyword + 1)) * Exp(count(the search keyword), S), wherein count(the search keyword) is the number of times the search keyword is hit in the search result, log(total number of documents / (number of documents including the search keyword + 1)) is the importance degree of the search keyword in the query result, and S is a preset parameter.
The application server according to claim 9, wherein, when the target document acquisition method is used to acquire recurrent hypoglycemia documents, the following steps are further implemented after the step of "outputting, according to a preset relevance threshold, the target documents among the documents whose relevance is greater than the preset relevance threshold":

Analyzing the selected target document to obtain identity information of the patient;

Obtaining historical medical data of the patient from the database according to the identity information of the patient;

Obtaining data such as glucose use, disease detection, and treatment methods of the patient from the historical medical data; and

Obtaining all recurrent hypoglycemia documents of the patient according to the above data.
A computer readable storage medium storing a target document acquisition system, the target document acquisition system being executable by at least one processor to cause the at least one processor to perform the following steps:

Obtaining at least one document and document information corresponding to the document, and preprocessing the document information;

Obtaining a search keyword;

Establishing a document selection model based on a character deletion table, a synonym and near-synonym table, and a specification parameter table;

Inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword;

Calculating, according to a preset keyword frequency and density algorithm, a word frequency and a density score of the search keyword in the documents output by the document selection model, and ranking the documents by relevance according to the word frequency and the density score; and

Outputting, according to a preset relevance threshold, the target documents among the documents whose relevance is greater than the preset relevance threshold.
The computer readable storage medium according to claim 16, wherein the preprocessing in the step of "obtaining at least one document and document information corresponding to the document, and preprocessing the document information" further comprises the following steps:

Segmenting the document to obtain at least one word;

Performing part-of-speech analysis on the words to obtain first information of the words; and

Taking, as candidate words, the words whose part of speech is a preset part of speech or whose first information is preset first information.
The computer readable storage medium according to claim 16, wherein the character deletion table includes characters, among the candidate words, that are inconsistent with the search keyword; the synonym and near-synonym table includes synonyms and near-synonyms corresponding to the search keyword; and the specification parameter table includes various parameters corresponding to the search keyword.
The computer readable storage medium according to claim 18, wherein the step of establishing the document selection model comprises:

Analyzing the search keyword to obtain the technical field of the search keyword;

Setting, in the technical field, the character deletion table according to the analysis result;

Obtaining, in the technical field, synonyms and near-synonyms of the keyword from a database, and establishing the synonym and near-synonym table;

Establishing, in the technical field, the specification parameter table by selecting the specification parameters of the keyword after the keyword analysis; and

Dynamically updating the character deletion table, the synonym and near-synonym table, and the specification parameter table.
The computer readable storage medium according to claim 16, wherein, in the step of "inputting the preprocessed document information into the document selection model, the document selection model processing the document information according to the search keyword", the processing comprises:

Calling the character deletion table to delete characters and words in the document information that are incorrect, redundant, or obviously unrelated to the search keyword;

Calling the synonym and near-synonym table to replace the search keyword, searching with the replaced search keyword, and saving the document information that matches the search keyword or its synonyms and near-synonyms; and

Calling the specification parameter table to compare and analyze the specification parameters corresponding to the search keyword and its synonyms and near-synonyms, and saving the document information that matches the data in the specification parameter table.