CN106503265A - Structured search system and its searching method based on weights - Google Patents
Structured search system and its searching method based on weights
- Publication number
- CN106503265A (application number CN201611077910.5A)
- Authority
- CN
- China
- Prior art keywords
- search
- weights
- module
- tree
- structured
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/322—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/338—Presentation of query results
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a weight-based structured search system comprising: a structure tree module, which segments each item of unstructured data in a data storage module into individual keywords, assigns a weight to each keyword, and builds a text structure tree according to grammatical context; a search tree module, which receives a search expression from a client, segments it into individual keywords, assigns a weight to each keyword, and builds a search tree according to grammatical context; an analysis module, which matches the search tree against every text structure tree and computes a matching value from the weights; and a display module, which sorts the matching values in descending order of score and presents the search results to the client. The invention also discloses a search method for the weight-based structured search system. The invention enables accurate and effective search over unstructured data such as free text.
Description
Technical field
The present invention relates to the field of medical informatics, and more particularly to a weight-based structured search system and its search method.
Background
With the development of information technology and the build-out of medical informatics, hospitals and other medical institutions have long since deployed a variety of information systems, such as "HIS (hospital information management system)", "EMR (electronic medical record system)", "PACS (medical image archiving and transmission system)", and "RIS (imaging information management system)". As these systems have spread to every area of medicine, they have generated and accumulated large volumes of data over the years. The data fall into two main classes:
1 Structured data: for example, demographic information such as patient name and sex. Information systems usually store each such item separately in its own database field, so it can be queried conveniently.
2 Unstructured data: for example, the patient's chief complaint, medical history, and imaging reports. This material is usually descriptive, consisting of long passages of free text. Because it is typed by doctors or dictated by patients, the language is highly non-standardized, and information systems usually store each passage as a single whole.
Search over the structured data described above is already mature: an information system can simply use database tools such as Structured Query Language (SQL). For unstructured data such as a patient's radiological findings and diagnoses, however, there is no particularly effective method for accurate search and use, even though such data contain extremely valuable information. Existing software searches unstructured data mainly in the following two ways:
1 Searching by "keyword" with database tools: for example, running a pattern-matching query (LIKE) in the SQL of a relational database to find all data containing the keyword. This approach has many drawbacks and cannot produce accurate, trustworthy search results. For example:
1.1 Synonyms cannot be handled: to query descriptions containing "5th thoracic vertebra", for instance, doctors in practice may write "thoracic vertebra 5", "thoracic vertebral body 5", "T5", or "T5 vertebra", all of which mean the same thing.
1.2 Only simple keywords can be defined, so precise multi-word queries are impossible: to find all patients with a "5th thoracic vertebra fracture", for instance, the complexity of the Chinese language means that the actual description may read "fracture visible at thoracic vertebra 5", "fracture found at T5", "fracture appears on the 5th thoracic vertebra", and so on, so the validity of the search results is extremely low.
1.3 Value ranges cannot be queried: for example, searching for "tumor diameter between 2-3 cm".
2 Natural-language search engines: search engines similar to Baidu and Google. Compared with the method above, this approach brings some improvement, such as synonym handling, but because of the particularities of the medical field it still cannot produce accurate search results. It has three main defects:
2.1 Keywords have no logical dependency on one another, so precise multi-word queries are impossible: to query "5th thoracic vertebra fracture", for instance, the system actually searches for the two words "5th thoracic vertebra" and "fracture", or the three words "5th", "thoracic vertebra", and "fracture". Because each keyword is searched separately, much non-matching content is returned, such as the description "5th thoracic vertebra hyperplasia, 7th thoracic vertebra fracture", while much content that truly matches cannot be found because the keywords do not match, such as the description "fracture of thoracic vertebrae 3-6" (thoracic vertebrae 3-6 actually comprise the 3rd, 4th, 5th, and 6th thoracic vertebrae);
2.2 Value ranges likewise cannot be queried: for example, searching for "tumor diameter between 2-3 cm";
2.3 Search result relevance has no quantitative indicator: a search usually lists a large number of results, but do those results fully match what the user actually wants? If not, what is the degree of match? There is no quantitative indicator, so the user has to screen and judge the results one by one.
Therefore, neither of the above methods can search accurately and effectively. As medical information systems become more widespread and more deeply used, more and more unstructured data are produced, containing a large amount of extremely valuable information, so helping doctors and other users find the data they care about accurately is increasingly urgent.
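The keyword-matching failure described in this background can be reproduced in a few lines. The following is only an illustrative sketch: the English report strings stand in for the Chinese originals, and `keyword_match` is a hypothetical helper that mimics a LIKE-style "contains every keyword" test rather than any particular database.

```python
def keyword_match(text: str, keywords: list) -> bool:
    """Return True when every keyword occurs somewhere in the text (LIKE-style)."""
    return all(keyword in text for keyword in keywords)


reports = [
    # Relevant to the query, but written with a synonym ("T5").
    "T5 fracture found",
    # Not relevant: the fracture is at the 7th vertebra, not the 5th.
    "5th thoracic vertebra hyperplasia, 7th thoracic vertebra fracture",
]

query = ["5th thoracic vertebra", "fracture"]
hits = [report for report in reports if keyword_match(report, query)]
```

With this input, `hits` contains only the second report: the relevant report is missed because of the synonym, and the irrelevant one is returned because the keywords are matched independently of each other.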
Summary of the invention
In view of this, the main object of the present invention is to provide a weight-based structured search system and its search method that overcome the limitations of the search approaches in the prior art, namely that the low validity of search results prevents accurate search results from being obtained.
To achieve the above object, the technical solution of the present invention is realized as follows:
In one aspect, the invention provides a weight-based structured search system comprising a data storage module, a structure tree module, a search tree module, an analysis module, and a display module, wherein: the data storage module is connected to the structure tree module and stores unstructured data; the structure tree module is connected to the data storage module and the analysis module, and is configured to segment each item of unstructured data into individual keywords, assign a weight to each keyword, and build, according to grammatical context, a text structure tree corresponding to that item of unstructured data; the search tree module is connected to the analysis module, and is configured to receive a search expression from a client, segment it into individual keywords, assign a weight to each keyword, and build, according to grammatical context, a search tree corresponding to the search expression; the analysis module is connected to the search tree module and the structure tree module, and is configured to match the search tree against every text structure tree and compute a matching value from the weights; and the display module is connected to the analysis module, and is configured to sort the matching values in descending order of score, remove all zero-valued items, and present the search results to the client.
Preferably, the system further comprises a synonym conversion module, connected to the structure tree module and the search tree module, configured to convert keyword synonyms by normalizing them against a synonym dictionary.
Preferably, the system further comprises a value range identification module, connected to the structure tree module and the search tree module, configured to identify the value ranges of keywords.
Preferably, the search tree module further comprises an operator processing unit configured to identify and process the logical operators in the search expression.
Preferably, the weight assigned to each keyword is determined according to its relevance to the basic knowledge underlying the unstructured text and the importance of its particular features.
Preferably, the display module further comprises a star display unit configured to determine a number of stars from the matching value and to present both the number of stars and the matching value to the client.
In another aspect, the invention also provides a search method for the weight-based structured search system, comprising: the structure tree module segments each item of unstructured data in the data storage module into individual keywords, assigns a weight to each keyword, and builds, according to grammatical context, a text structure tree corresponding to that item of unstructured data; the search tree module receives a search expression from a client, segments it into individual keywords, assigns a weight to each keyword, and builds, according to grammatical context, a search tree corresponding to the search expression; the analysis module matches the search tree against every text structure tree and computes a matching value from the weights; and the display module sorts the matching values in descending order of score, removes all zero-valued items, and presents the search results to the client.
Preferably, before the text structure tree or the search tree is built, the method further comprises: the synonym conversion module converts keyword synonyms by normalizing them against a synonym dictionary.
Preferably, before the text structure tree or the search tree is built, the method further comprises: the value range identification module identifies the value ranges of keywords.
Preferably, before the search tree is built, the method further comprises: the operator processing unit identifies and processes the logical operators in the search expression.
Preferably, the weight assigned to each keyword is determined according to its relevance to the basic knowledge underlying the unstructured text and the importance of its particular features.
Preferably, the method further comprises: the star display unit determines a number of stars from the matching value and presents both the number of stars and the matching value to the client.
Technical effects of the present invention:
1. Because the invention provides a structure tree module and a search tree module, unstructured free text and search expressions are segmented and structurally reconstructed into text structure trees and a search tree, with weights defined for each keyword and branch; the analysis module matches the search tree against every text structure tree and computes a matching value from the weights, so the search results are accurate and trustworthy;
2. Because the invention provides a synonym conversion module and a value range identification module, keyword synonyms are normalized and keyword value ranges are identified, which solves the prior art's inability to handle synonyms and value ranges, makes search results more accurate, and avoids missing valuable information;
3. Search conditions are based on natural language; because the invention provides an operator processing unit that identifies and processes the logical operators in the search expression, search results are more comprehensive and the system is easier for users to operate;
4. Because the invention also provides a star display unit, search results are scored by weight according to their degree of match and given a star rating, with the highest-scoring results shown first. Users need not screen results one by one; the results are clear at a glance and intuitive, which improves search efficiency and makes the system more user-friendly.
Description of the drawings
The accompanying drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a structural diagram of the weight-based structured search system according to embodiment one of the present invention;
Fig. 2 is a schematic diagram of a text structure tree of the weight-based structured search system according to embodiment one;
Fig. 3 is a schematic diagram of a search tree of the weight-based structured search system according to embodiment one;
Fig. 4 is a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module of the weight-based structured search system according to embodiment one;
Fig. 5 is a schematic diagram of a text structure tree of the weight-based structured search system according to embodiment one;
Fig. 6 is a schematic diagram of a search tree of the weight-based structured search system according to embodiment one;
Fig. 7 is a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module of the weight-based structured search system according to embodiment one;
Fig. 8 is a schematic diagram of the search result display of the weight-based structured search system according to embodiment one;
Fig. 9 is a structural diagram of the weight-based structured search system according to embodiment two;
Fig. 10 is a structural diagram of the weight-based structured search system according to embodiment three;
Fig. 11 is a structural diagram of the weight-based structured search system according to embodiment four;
Fig. 12 is a structural diagram of the weight-based structured search system according to embodiment five;
Fig. 13 is a schematic diagram of the search result display of the weight-based structured search system according to embodiment five;
Fig. 14 is a schematic diagram of the search result display of the weight-based structured search system according to embodiment five;
Fig. 15 is a flowchart of the search method of the weight-based structured search system according to embodiment six;
Fig. 16 is a schematic diagram of a text structure tree in the search method according to embodiment six;
Fig. 17 is a schematic diagram of a search tree in the search method according to embodiment six;
Fig. 18 is a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module in the search method according to embodiment six;
Fig. 19 is a schematic diagram of a text structure tree in the search method according to embodiment six;
Fig. 20 is a schematic diagram of a search tree in the search method according to embodiment six;
Fig. 21 is a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module in the search method according to embodiment six;
Fig. 22 is a schematic diagram of the search result display in the search method according to embodiment six;
Fig. 23 is a schematic diagram of the search result display in the search method according to embodiment six;
Fig. 24 is a schematic diagram of the search result display in the search method according to embodiment six.
Detailed description
The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Embodiment one
Fig. 1 is a structural diagram of the weight-based structured search system according to embodiment one of the present invention. As shown in Fig. 1, the system comprises a data storage module 10, a structure tree module 20, a search tree module 30, an analysis module 40, and a display module 50, wherein:
The data storage module 10 is connected to the structure tree module 20 and stores unstructured data;
The unstructured data referred to here are typed by doctors or dictated by patients, such as the patient's chief complaint, medical history, and imaging reports;
The structure tree module 20 is connected to the data storage module 10 and the analysis module 40, and is configured to segment each item of unstructured data into individual keywords, assign a weight to each keyword, and build, according to grammatical context, a text structure tree corresponding to that item of unstructured data;
Fig. 2 is a schematic diagram of a text structure tree of the weight-based structured search system according to embodiment one. As shown in Fig. 2, the structure tree module splits a sentence into individual keywords according to its semantics;
The search tree module 30 is connected to the analysis module 40, and is configured to receive a search expression from a client, segment it into individual keywords, assign a weight to each keyword, and build, according to grammatical context, a search tree corresponding to the search expression;
Fig. 3 is a schematic diagram of a search tree of the weight-based structured search system according to embodiment one. As shown in Fig. 3, the search tree module structurally reconstructs the search condition from the search expression entered by the user;
The weight assigned to each keyword above is determined according to its relevance to the basic knowledge underlying the unstructured text and the importance of its particular features;
The analysis module 40 is connected to the search tree module 30 and the structure tree module 20, and is configured to match the search tree against every text structure tree and compute a matching value from the weights;
Fig. 4 is a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module of the weight-based structured search system according to embodiment one;
The display module 50 is connected to the analysis module 40, and is configured to sort the matching values in descending order of score, remove all zero-valued items, and present the search results to the client.
The present embodiment is illustrated below with an example:
Fig. 5 is a schematic diagram of a text structure tree of the weight-based structured search system according to embodiment one; Fig. 6 is a schematic diagram of a search tree; Fig. 7 is a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module; Fig. 8 is a schematic diagram of the search result display. As shown in Figs. 5, 6, 7, and 8:
Suppose a doctor has written the following description:
"The wall of the distal esophagus is markedly and unevenly thickened, with formation of a soft-tissue mass shadow; the thickest level measures about 2.8 cm"
After structural reconstruction, the system generates the text structure tree shown in Fig. 5;
Suppose the user searches with the following search expression:
"mass shadow thickest greater than about 2.6 cm"
After structural reconstruction, the system generates the search tree shown in Fig. 6;
The analysis module matches the search tree against the text structure tree and calculates the degree of match, obtaining a score of 10 points, as shown in Figs. 7 and 8.
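The text does not disclose how the matching value is computed from the weights, so the following is only a minimal sketch under stated assumptions: each keyword is a hypothetical `Node` with a term, a weight, and children; a query node may align with any text node carrying the same term; and the matching value is the matched share of the query's total weight scaled to 0-10. The tree shapes and weights below are invented for illustration and are not taken from Figs. 5-7. Numeric comparison such as "greater than 2.6 cm" is left out of this sketch.

```python
class Node:
    """One keyword in a tree: its term, its weight, and its child keywords."""

    def __init__(self, term, weight=1.0, children=None):
        self.term = term
        self.weight = weight
        self.children = children or []


def total_weight(node):
    """Sum of the weights of a node and all of its descendants."""
    return node.weight + sum(total_weight(c) for c in node.children)


def match_here(query, text):
    """Weight of the query subtree matched when query is aligned with text."""
    if query.term != text.term:
        return 0.0
    matched = query.weight
    for q_child in query.children:
        # Each query child may match anywhere below the aligned text node.
        matched += max((match_below(q_child, t) for t in text.children), default=0.0)
    return matched


def match_below(query, text):
    """Best match of the query subtree at text or at any of its descendants."""
    best = match_here(query, text)
    for child in text.children:
        best = max(best, match_below(query, child))
    return best


def matching_value(query, text, scale=10.0):
    """Matched share of the query's total weight, scaled to 0..scale."""
    return scale * match_below(query, text) / total_weight(query)


# Hypothetical text structure tree for the esophagus report.
text_tree = Node("esophagus", 1.0, [
    Node("wall", 1.0, [
        Node("thickening", 2.0),
        Node("mass shadow", 2.0, [Node("thickness", 3.0)]),
    ]),
])

full_query = Node("mass shadow", 2.0, [Node("thickness", 3.0)])
partial_query = Node("mass shadow", 2.0, [Node("calcification", 3.0)])

full_score = matching_value(full_query, text_tree)        # every query node found
partial_score = matching_value(partial_query, text_tree)  # only the root found
```

Under these assumptions a query whose every branch is found in the report scores the full 10 points, while a query that matches only partly receives a proportionally lower score, which is the kind of quantitative indicator the background section says keyword search lacks.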
This embodiment of the invention provides a structure tree module and a search tree module. Unstructured free text and search expressions are segmented and structurally reconstructed into text structure trees and a search tree, with weights defined for each keyword and branch. The analysis module matches the search tree against every text structure tree and computes a matching value from the weights, so the search results are accurate and trustworthy.
Embodiment two
Fig. 9 is a structural diagram of the weight-based structured search system according to embodiment two of the present invention. As shown in Fig. 9, the system further comprises a synonym conversion module 60, connected to the structure tree module 20 and the search tree module 30, configured to convert keyword synonyms by normalizing them against a synonym dictionary.
For example, to query descriptions containing "5th thoracic vertebra", doctors in practice may write "thoracic vertebra 5", "thoracic vertebral body 5", "T5", or "T5 vertebra", all of which mean the same thing. In this case the synonym conversion module normalizes the synonyms against the synonym dictionary, which solves the synonym problem and improves the validity of the search results.
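Synonym normalization of this kind can be sketched as a dictionary lookup. The dictionary contents, the English stand-in terms, and the `normalize` helper below are assumptions for illustration; the patent does not specify the format of its synonym dictionary.

```python
# Hypothetical synonym dictionary: every known variant maps to one canonical term.
SYNONYMS = {
    "thoracic vertebra 5": "5th thoracic vertebra",
    "thoracic vertebral body 5": "5th thoracic vertebra",
    "t5": "5th thoracic vertebra",
    "t5 vertebra": "5th thoracic vertebra",
}


def normalize(term: str) -> str:
    """Map a keyword to its canonical form; unknown keywords pass through unchanged."""
    key = term.strip().lower()
    return SYNONYMS.get(key, key)


variants = ["thoracic vertebra 5", "Thoracic vertebral body 5", "T5", "T5 vertebra"]
canonical = {normalize(v) for v in variants}
```

Applying the same normalization to both the report keywords and the search keywords before the trees are built means the later tree matching only ever compares canonical terms.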
Embodiment three
Fig. 10 is a structural diagram of the weight-based structured search system according to embodiment three of the present invention. As shown in Fig. 10, the system further comprises a value range identification module 70, connected to the structure tree module 20 and the search tree module 30, configured to identify the value ranges of keywords.
Examples include area, length, volume, and capacity. This solves the prior art's inability to search over value ranges, such as searching for "tumor diameter between 2-3 cm".
This embodiment of the invention provides a value range identification module that identifies the value ranges of keywords, which solves the prior art's inability to handle range values, makes search results more accurate, and avoids missing valuable information.
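One plausible way to recognize range values is a regular expression that extracts a number or a number pair with its unit. The sketch below is an assumption, not the patent's method: `parse_range`, `overlaps`, and `expand_levels` are hypothetical helpers, only centimetre and millimetre units are handled, and English strings stand in for the Chinese text.

```python
import re
from typing import Optional, Tuple

# A number, optionally followed by "-number", then a length unit.
_RANGE = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:-\s*(\d+(?:\.\d+)?))?\s*(cm|mm)", re.IGNORECASE
)


def parse_range(text: str) -> Optional[Tuple[float, float]]:
    """Return (low, high) in centimetres, or None when no measurement is found."""
    m = _RANGE.search(text)
    if m is None:
        return None
    low = float(m.group(1))
    high = float(m.group(2)) if m.group(2) else low
    divisor = 10.0 if m.group(3).lower() == "mm" else 1.0
    return (low / divisor, high / divisor)


def overlaps(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def expand_levels(span: str) -> list:
    """Expand an ordinal span such as "3-6" into [3, 4, 5, 6]."""
    first, _, last = span.partition("-")
    return list(range(int(first), int(last or first) + 1))


wanted = parse_range("tumor diameter 2-3CM")
found = parse_range("thickest level about 2.8 cm")
is_hit = wanted is not None and found is not None and overlaps(wanted, found)
```

The `expand_levels` helper addresses the "thoracic vertebrae 3-6" case raised in the background section: once the span is expanded, a query for the 5th thoracic vertebra can match a report that only mentions the span.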
Embodiment four
Fig. 11 is a structural diagram of the weight-based structured search system according to embodiment four of the present invention. As shown in Fig. 11, the search tree module 30 further comprises an operator processing unit 302 configured to identify and process the logical operators in the search expression.
Examples include "and", "contains", "or", and "greater than".
In this embodiment of the invention, search conditions are based on natural language. Because the invention provides an operator processing unit that identifies and processes the logical operators in the search expression, search results are more comprehensive and the system is easier for users to operate.
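How the operator processing unit evaluates operators is not described, so the following sketch rests on assumptions: comparison operators are looked up in a table and applied to a recognized value, and "and" / "or" combine the matching values of sub-conditions by taking the minimum and maximum respectively (a common fuzzy-logic convention, not something the patent states). The "less than" entry is added for symmetry and does not appear in the source list.

```python
import operator

# Comparison operators recognized in a search expression (names are illustrative).
COMPARATORS = {
    "greater than": operator.gt,
    "less than": operator.lt,  # assumption: not listed in the source
}


def compare(op_name: str, found: float, wanted: float) -> bool:
    """Apply a named comparison to a value found in the text."""
    return COMPARATORS[op_name](found, wanted)


def combine(op_name: str, scores: list) -> float:
    """Combine sub-condition matching values under a logical operator."""
    if op_name == "and":
        return min(scores)  # every sub-condition must match well
    if op_name == "or":
        return max(scores)  # the best sub-condition is enough
    raise ValueError(f"unsupported operator: {op_name}")


# "mass shadow thickest greater than about 2.6 cm" against a report stating 2.8 cm.
thickness_ok = compare("greater than", 2.8, 2.6)
both = combine("and", [10.0, 4.0])
either = combine("or", [10.0, 4.0])
```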
Embodiment five
Fig. 12 is a structural diagram of the weight-based structured search system according to embodiment five of the present invention; Fig. 13 is a schematic diagram of the search result display of the system according to embodiment five. As shown in Figs. 12 and 13, the display module 50 further comprises a star display unit 502 configured to determine a number of stars from the matching value and to present both the number of stars and the matching value to the client.
Scores run from 0 to 10, with a maximum of 5 stars, and can be divided into 10 grades, as follows:
Fig. 14 is a schematic diagram of the search result display of the weight-based structured search system according to embodiment five;
For example, the search results for "mass shadow thickest greater than about 2.6 cm" from embodiment one are displayed as shown in Fig. 14.
Because this embodiment of the invention also provides a star display unit, search results are scored by weight according to their degree of match and given a star rating, with the highest-scoring results shown first. Users need not screen results one by one; the results are clear at a glance and intuitive, which improves search efficiency and makes the system more user-friendly.
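The grading table itself is not reproduced in this text, so the mapping below is an assumption: a score of 0-10 is rounded to the nearest whole point and each point is worth half a star, which yields 10 non-zero grades up to 5 stars. `stars_for` and `render` are hypothetical names.

```python
def stars_for(score: float) -> float:
    """Map a 0-10 matching value to 0-5 stars in half-star steps (assumed rule)."""
    clamped = min(max(score, 0.0), 10.0)
    return int(clamped + 0.5) / 2  # round half up, then two points per star


def render(stars: float) -> str:
    """Render a star count as text, using a half mark for half stars."""
    full = int(stars)
    return "★" * full + ("½" if stars - full >= 0.5 else "")


top = render(stars_for(10))
middle = render(stars_for(7))
```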
Embodiment six
Figure 15 shows the searching method flow process of the according to embodiments of the present invention six structured search system based on weights
Figure, as shown in figure 15, the method is comprised the following steps:
Step S601: the structure tree module performs word segmentation on each piece of unstructured data in the data storage module, splits the unstructured data into individual keywords, assigns a weight to each keyword, and builds a text structure tree corresponding to the unstructured data according to the grammatical context. Figure 16 shows a schematic diagram of the text structure tree in the searching method of the weight-based structured search system according to Embodiment 6 of the present invention.
The unstructured data mentioned here is text entered by a doctor or dictated by a patient, such as the patient's chief complaint, medical history, imaging reports, and so on.
Step S602: the search tree module receives the search expression from the client, performs word segmentation on the search expression, splits it into individual keywords, assigns a weight to each keyword, and builds a search tree corresponding to the search expression according to the grammatical context. Figure 17 shows a schematic diagram of the search tree in the searching method of the weight-based structured search system according to Embodiment 6 of the present invention.
Here, the weight assigned to each keyword is determined according to the relevance of the background knowledge of the unstructured data text and the importance of specific features.
Step S603: the analysis module matches the search tree against all of the text structure trees and calculates a matching value from the weights. Figure 18 shows a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module in the searching method of the weight-based structured search system according to Embodiment 6 of the present invention.
Step S604: the display module sorts the matching values in descending order of score, removes all items with a value of zero, and displays the search results to the client.
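Steps S601 to S604 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Node` layout, the exact-keyword matching rule, and the scaling of matched weight to a 0-10 matching value are all hypothetical, because the description does not say how the analysis module computes the value. Word segmentation and weight assignment are assumed to have been done already, so the trees are built by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One keyword in a text structure tree or search tree (assumed layout)."""
    keyword: str
    weight: float = 1.0
    children: list["Node"] = field(default_factory=list)


def total_weight(node: Node) -> float:
    """Sum of the weights of a node and all of its descendants."""
    return node.weight + sum(total_weight(c) for c in node.children)


def walk(node: Node):
    """Yield every node of a tree, parents before children."""
    yield node
    for child in node.children:
        yield from walk(child)


def matched_weight(query: Node, text: Node) -> float:
    """Weight of the query subtree that is found rooted at `text`."""
    if query.keyword != text.keyword:
        return 0.0
    score = query.weight
    for q_child in query.children:
        # Credit each query child with its best match among the text children.
        score += max((matched_weight(q_child, t) for t in text.children), default=0.0)
    return score


def match_value(query: Node, text: Node) -> float:
    """S603: best match of the search tree anywhere in the text tree, scaled to 0-10."""
    best = max(matched_weight(query, n) for n in walk(text))
    return 10 * best / total_weight(query)


def rank(query: Node, texts: dict[str, Node]) -> list[tuple[str, float]]:
    """S604: sort by matching value in descending order and drop zero items."""
    scored = [(doc_id, match_value(query, tree)) for doc_id, tree in texts.items()]
    return sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)
```

With a search tree of `mass` (weight 2) over `thickness` (weight 1), a text tree that contains both keywords in the same parent-child relation scores 10, one that contains only `mass` scores about 6.7, and one that contains neither is removed from the results.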
The present embodiment is illustrated with an example below:
Suppose a doctor has written the following description:
" distal esophagus tube wall is substantially uneven to be thickened, and sees that lump shadow of soft tissue is formed, about 2.8 centimetres of thickest layer face "
After structuring and reconstructing this description, the system generates the text structure tree shown in Figure 19. Figure 19 shows a schematic diagram of the text structure tree in the searching method of the weight-based structured search system according to Embodiment 6 of the present invention.
Suppose the user then searches with the following search expression:
" lump shadow is most thick greater than about 2.6 centimetres "
After structuring and reconstructing this expression, the system generates the search tree shown in Figure 20. Figure 20 shows a schematic diagram of the search tree in the searching method of the weight-based structured search system according to Embodiment 6 of the present invention.
The analysis module matches the search tree against the text structure tree and calculates the degree of match, obtaining a score of 10 points, as shown in Figures 21 and 22. Figure 21 shows a schematic diagram of the matching values between the search tree and each text structure tree, as calculated by the analysis module in the searching method according to Embodiment 6; Figure 22 shows a schematic diagram of the search result display in that method.
Before the text structure tree or the search tree is built, the method further includes: the synonym conversion module performs synonym conversion on the keywords, normalizing synonyms according to a synonym dictionary.
For example, consider a query for descriptions containing "the 5th thoracic vertebra". In practice, doctors may write "thoracic vertebra 5", "thoracic 5 vertebral body", "T5", "T5 vertebra" and so on, all of which mean the same thing. In this case the synonym conversion module normalizes the synonyms according to the synonym dictionary, which solves the synonym problem and improves the validity of the search results.
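Synonym normalization of this kind can be sketched with a plain dictionary lookup. The dictionary contents, the choice of canonical form, and the function name `normalize` are illustrative assumptions; the description does not say how the synonym dictionary is stored.

```python
# Hypothetical synonym dictionary: canonical keyword -> variants seen in reports.
SYNONYMS = {
    "fifth thoracic vertebra": [
        "5th thoracic vertebra",
        "thoracic vertebra 5",
        "thoracic 5 vertebral body",
        "T5",
        "T5 vertebra",
    ],
}

# Invert the dictionary once so that each variant maps to its canonical form.
_LOOKUP = {
    variant.lower(): canonical
    for canonical, variants in SYNONYMS.items()
    for variant in variants
}


def normalize(keyword: str) -> str:
    """Return the canonical form of a keyword, or the cleaned keyword itself."""
    key = keyword.strip().lower()
    return _LOOKUP.get(key, key)
```

Applying `normalize` to every keyword before either tree is built means that a query for "T5" and a report that says "thoracic vertebra 5" end up with the same node keyword.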
Before the text structure tree or the search tree is built, the method further includes: the value range identification module identifies the value ranges of the keywords.
Examples of such values include area, length, volume, and capacity. This solves the prior-art problem that value intervals cannot be searched, for example a search for "diameter of tumor is between 2-3CM".
Before the search tree is built, the method further includes: the operator processing unit identifies and processes the logical operators in the search expression, for example "and", "contains", "or", "greater than", and so on.
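One possible way to handle "and" and "or" is sketched below. The combination rule is an assumption that the description does not state: clauses joined by "and" take the lowest clause score, groups joined by "or" take the highest, and "or" binds more loosely than "and". The callback `score_clause` is a hypothetical stand-in for the tree matching performed by the analysis module.

```python
import re
from typing import Callable


def combine(expression: str, score_clause: Callable[[str], float]) -> float:
    """Score a search expression that joins clauses with 'and' / 'or'."""
    best = 0.0
    # Split on "or" first so that it binds more loosely than "and".
    for group in re.split(r"\s+or\s+", expression.strip(), flags=re.IGNORECASE):
        clauses = re.split(r"\s+and\s+", group, flags=re.IGNORECASE)
        # Every clause of an "and" group must match, so take the weakest score.
        group_score = min(score_clause(c.strip()) for c in clauses)
        best = max(best, group_score)
    return best
```

With this rule an "and" group scores zero as soon as one of its clauses fails to match, so it is dropped when zero-valued items are removed from the results.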
The method further includes: the star display unit determines the number of stars according to the matching value and displays the number of stars and the matching value to the client at the same time. Figure 23 shows a schematic diagram of the search result display in the searching method of the weight-based structured search system according to Embodiment 6 of the present invention.
Matching values range from 0 to 10 and the highest rating is 5 stars; the ratings can be divided into 10 grades, as follows:
Figure 24 shows a schematic diagram of the search result display in the searching method of the weight-based structured search system according to Embodiment 6 of the present invention.
For example, the results of the search "lump shadow is most thick greater than about 2.6 centimetres" are displayed as shown in Figure 24.
The embodiments of the present invention provide a structure tree module and a search tree module, which segment the unstructured free text and the search expression into words and restructure them into text structure trees and a search tree, with a weight defined for each keyword and each branch. The analysis module matches the search tree against all of the text structure trees and calculates a matching value from the weights, so that the search results are accurate and reliable. A synonym conversion module and a value range identification module are also provided: synonym conversion normalizes synonymous keywords, and the value ranges of keywords can be identified. This solves the prior-art inability to handle synonyms and value ranges, makes the search results more accurate, and avoids missing valuable information. For search conditions written in natural language, the operator processing unit identifies and processes the logical operators in the search expression, which makes the search results more comprehensive and the system easier for the user to operate. The embodiments also provide a star display unit: the search results are scored by weight according to their degree of match and given a star rating, with the highest-scoring results shown first. The user does not need to screen and judge the results one by one; the results can be taken in at a glance, which improves search efficiency and makes the system more user-friendly.
From the above description, it can be seen that the above embodiments of the present invention achieve the following technical effects. The structure tree module and the search tree module segment the unstructured free text and the search expression into words and restructure them into text structure trees and a search tree, with a weight defined for each keyword and each branch; the analysis module matches the search tree against all of the text structure trees and calculates a matching value from the weights, so that the search results are accurate and reliable. The synonym conversion module normalizes synonymous keywords and the value range identification module identifies the value ranges of keywords, which solves the prior-art inability to handle synonyms and value ranges, makes the search results more accurate, and avoids missing valuable information. For search conditions written in natural language, the operator processing unit identifies and processes the logical operators in the search expression, which makes the search results more comprehensive and the system easier for the user to operate. The star display unit scores the search results by weight according to their degree of match and gives them a star rating, with the highest-scoring results shown first, so the user does not need to screen and judge the results one by one; the results can be taken in at a glance, which improves search efficiency and makes the system more user-friendly.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed across a network formed by multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be fabricated as a separate integrated circuit module; or several of the modules or steps can be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.
Claims (12)
1. A weight-based structured search system, characterized by comprising a data storage module, a structure tree module, a search tree module, an analysis module and a display module, wherein:
the data storage module is connected to the structure tree module and is configured to store unstructured data;
the structure tree module is connected to the data storage module and to the analysis module, and is configured to perform word segmentation on each piece of the unstructured data, split the unstructured data into individual keywords, assign a weight to each of the keywords, and build a text structure tree corresponding to the unstructured data according to the grammatical context;
the search tree module is connected to the analysis module and is configured to receive a search expression from a client, perform word segmentation on the search expression, split the search expression into individual keywords, assign a weight to each of the keywords, and build a search tree corresponding to the search expression according to the grammatical context;
the analysis module is connected to the search tree module and to the structure tree module, and is configured to match the search tree against all of the text structure trees and calculate a matching value from the weights;
the display module is connected to the analysis module and is configured to sort the matching values in descending order of score, remove all items with a value of zero, and display the search results to the client.
2. The weight-based structured search system according to claim 1, characterized in that the system further comprises a synonym conversion module, connected to the structure tree module and to the search tree module, configured to perform synonym conversion on the keywords and normalize synonyms according to a synonym dictionary.
3. The weight-based structured search system according to claim 1, characterized in that the system further comprises a value range identification module, connected to the structure tree module and to the search tree module, configured to identify the value ranges of the keywords.
4. The weight-based structured search system according to claim 1, characterized in that the search tree module further comprises an operator processing unit configured to identify and process the logical operators in the search expression.
5. The weight-based structured search system according to claim 1, characterized in that the weight assigned to each of the keywords is determined according to the relevance of the background knowledge of the unstructured data text and the importance of specific features.
6. The weight-based structured search system according to claim 1, characterized in that the display module further comprises a star display unit configured to determine the number of stars according to the matching value and to display the number of stars and the matching value to the client at the same time.
7. A searching method of a weight-based structured search system, characterized by comprising:
a structure tree module performing word segmentation on each piece of unstructured data in a data storage module, splitting the unstructured data into individual keywords, assigning a weight to each of the keywords, and building a text structure tree corresponding to the unstructured data according to the grammatical context;
a search tree module receiving a search expression from a client, performing word segmentation on the search expression, splitting the search expression into individual keywords, assigning a weight to each of the keywords, and building a search tree corresponding to the search expression according to the grammatical context;
an analysis module matching the search tree against all of the text structure trees and calculating a matching value from the weights;
a display module sorting the matching values in descending order of score, removing all items with a value of zero, and displaying the search results to the client.
8. The searching method of the weight-based structured search system according to claim 7, characterized in that, before the text structure tree or the search tree is built, the method further comprises: a synonym conversion module performing synonym conversion on the keywords and normalizing synonyms according to a synonym dictionary.
9. The searching method of the weight-based structured search system according to claim 7, characterized in that, before the text structure tree or the search tree is built, the method further comprises: a value range identification module identifying the value ranges of the keywords.
10. The searching method of the weight-based structured search system according to claim 7, characterized in that, before the search tree is built, the method further comprises: an operator processing unit identifying and processing the logical operators in the search expression.
11. The searching method of the weight-based structured search system according to claim 7, characterized in that the weight assigned to each of the keywords is determined according to the relevance of the background knowledge of the unstructured data text and the importance of specific features.
12. The searching method of the weight-based structured search system according to claim 7, characterized in that the method further comprises: a star display unit determining the number of stars according to the matching value and displaying the number of stars and the matching value to the client at the same time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611077910.5A CN106503265A (en) | 2016-11-30 | 2016-11-30 | Structured search system and its searching method based on weights |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611077910.5A CN106503265A (en) | 2016-11-30 | 2016-11-30 | Structured search system and its searching method based on weights |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106503265A (en) | 2017-03-15 |
Family
ID=58327973
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611077910.5A Pending CN106503265A (en) | 2016-11-30 | 2016-11-30 | Structured search system and its searching method based on weights |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106503265A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107491534A (en) * | 2017-08-22 | 2017-12-19 | 北京百度网讯科技有限公司 | Information processing method and device |
WO2019028631A1 (en) * | 2017-08-07 | 2019-02-14 | 深圳益强信息科技有限公司 | Method for determining relative confidentiality of technical know-how |
CN110209829A (en) * | 2018-02-12 | 2019-09-06 | 百度在线网络技术(北京)有限公司 | Information processing method and device |
CN111309870A (en) * | 2020-03-04 | 2020-06-19 | 平安养老保险股份有限公司 | Data rapid searching method and device and computer equipment |
CN111309853A (en) * | 2019-09-03 | 2020-06-19 | 东南大学 | Code searching method based on structured information |
CN112069305A (en) * | 2020-11-13 | 2020-12-11 | 北京智慧星光信息技术有限公司 | Data screening method and device and electronic equipment |
CN113254588A (en) * | 2021-06-02 | 2021-08-13 | 竹间智能科技(上海)有限公司 | Data searching method and system |
CN114006719A (en) * | 2021-09-14 | 2022-02-01 | 国科信创科技有限公司 | AI verification method, device and system based on situation awareness |
CN114564938A (en) * | 2020-11-27 | 2022-05-31 | 阿里巴巴集团控股有限公司 | Document parsing method and device, storage medium and processor |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060004721A1 (en) * | 2004-04-23 | 2006-01-05 | Bedworth Mark D | System, method and technique for searching structured databases |
CN101093493A (en) * | 2006-06-23 | 2007-12-26 | 国际商业机器公司 | Speech conversion method for database inquiry, converter, and database inquiry system |
CN101510221A (en) * | 2009-02-17 | 2009-08-19 | 北京大学 | Enquiry statement analytical method and system for information retrieval |
US20100281063A1 (en) * | 2009-05-01 | 2010-11-04 | Brother Kogyo Kabushiki Kaisha | Distributed storage system, management apparatus, node apparatus, recording medium on which node program is recorded, page information acquisition method, recording medium on which page information sending program is recorded, and page information sending method |
CN103324678A (en) * | 2013-05-27 | 2013-09-25 | 俞声 | Information retrieval method and device |
CN104252533A (en) * | 2014-09-12 | 2014-12-31 | 百度在线网络技术(北京)有限公司 | Search method and search device |
CN105843960A (en) * | 2016-04-18 | 2016-08-10 | 上海泥娃通信科技有限公司 | Semantic tree based indexing method and system |
CN105955976A (en) * | 2016-04-15 | 2016-09-21 | 中国工商银行股份有限公司 | Automatic answering system and method |
CN105975625A (en) * | 2016-05-26 | 2016-09-28 | 同方知网数字出版技术股份有限公司 | Chinglish inquiring correcting method and system oriented to English search engine |
-
2016
- 2016-11-30 CN CN201611077910.5A patent/CN106503265A/en active Pending
Non-Patent Citations (2)
Title |
---|
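Liu Qiong: "Research and application of an ontology-based query method for unstructured text", Proceedings of the 22nd National Academic Symposium on Computer Information Management, 31 December 2008 (2008-12-31), pages 123 - 129 * |
Wen Shutian: "Retrieval and Utilization of Traditional Chinese Medicine Literature Information", Fourth Military Medical University Press, pages: 180 - 181 * |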
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019028631A1 (en) * | 2017-08-07 | 2019-02-14 | 深圳益强信息科技有限公司 | Method for determining relative confidentiality of technical know-how |
US11232140B2 (en) | 2017-08-22 | 2022-01-25 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Method and apparatus for processing information |
CN107491534B (en) * | 2017-08-22 | 2020-11-20 | 北京百度网讯科技有限公司 | Information processing method and device |
CN107491534A (en) * | 2017-08-22 | 2017-12-19 | 北京百度网讯科技有限公司 | Information processing method and device |
CN110209829A (en) * | 2018-02-12 | 2019-09-06 | 百度在线网络技术(北京)有限公司 | Information processing method and device |
CN111309853B (en) * | 2019-09-03 | 2024-03-22 | 东南大学 | Code searching method based on structured information |
CN111309853A (en) * | 2019-09-03 | 2020-06-19 | 东南大学 | Code searching method based on structured information |
CN111309870B (en) * | 2020-03-04 | 2022-11-18 | 平安养老保险股份有限公司 | Data rapid searching method and device and computer equipment |
CN111309870A (en) * | 2020-03-04 | 2020-06-19 | 平安养老保险股份有限公司 | Data rapid searching method and device and computer equipment |
CN112069305B (en) * | 2020-11-13 | 2021-03-30 | 北京智慧星光信息技术有限公司 | Data screening method and device and electronic equipment |
CN112069305A (en) * | 2020-11-13 | 2020-12-11 | 北京智慧星光信息技术有限公司 | Data screening method and device and electronic equipment |
CN114564938A (en) * | 2020-11-27 | 2022-05-31 | 阿里巴巴集团控股有限公司 | Document parsing method and device, storage medium and processor |
CN113254588A (en) * | 2021-06-02 | 2021-08-13 | 竹间智能科技(上海)有限公司 | Data searching method and system |
CN113254588B (en) * | 2021-06-02 | 2023-08-22 | 竹间智能科技(上海)有限公司 | Data searching method and system |
CN114006719A (en) * | 2021-09-14 | 2022-02-01 | 国科信创科技有限公司 | AI verification method, device and system based on situation awareness |
CN114006719B (en) * | 2021-09-14 | 2023-10-13 | 国科信创科技有限公司 | AI verification method, device and system based on situation awareness |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170315 |