CN104933159B - A kind of semantic query method based on drug ontology library - Google Patents

Semantic query method based on a drug ontology library

Publication number
CN104933159B
Authority
CN
China
Prior art keywords
drug
ontology
query
model
library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510363192.7A
Other languages
Chinese (zh)
Other versions
CN104933159A (en)
Inventor
叶宁
杨铄
黄海平
沙超
王汝传
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN201510363192.7A priority Critical patent/CN104933159B/en
Publication of CN104933159A publication Critical patent/CN104933159A/en
Application granted granted Critical
Publication of CN104933159B publication Critical patent/CN104933159B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462: Approximate or statistical queries
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Abstract

The invention discloses a semantic query method based on a drug ontology library. The drug ontology library is built with the help of domain experts, following the skeleton method. First, in ontology capture, the concept terms of the pharmaceutical product domain and the relationships among them are obtained with the help of domain experts, and these terms and relationships are checked to be unambiguous. The drugs are then given a structured representation. Next, ontology evaluation judges whether the drug ontology meets the standard: if it does, the drug ontology information is saved as a file; otherwise the process returns to the ontology capture stage.

Description

Semantic query method based on a drug ontology library
Technical field
This patent relates to a semantic query method based on a drug ontology library and belongs to the field of intelligent search technology.
Background art
Traditional text retrieval based on keyword matching supports semantic matching poorly. It only matches the segmented keywords against the words in the index, and cannot correctly understand or process the semantics and intent of a user's search, so results are inaccurate and contain too much irrelevant information. Although keyword-based retrieval has been improved many times, it still lacks semantic processing, so retrieval performance has not fundamentally improved and efficient, accurate queries remain out of reach.
Ontology is a concept that originates in philosophy, where it refers to the theory of existence and its essence and laws. It was later introduced into computer science, where it refers specifically to an explicit, formal specification of a shared conceptual model. It emphasizes the essential concepts of a domain and the associations among them. The ontology of a domain can express the concepts of that domain and the relationships among them explicitly and formally, thereby making the semantics contained in the concepts explicit.
In recent years, the development and gradual maturing of ontology theory have given new impetus to information retrieval technology and provide a better basis for improving the precision and recall of retrieval systems. As an effective theory and method for representing concept hierarchies and semantics, ontology has been widely used in computer science and information management and has been successfully applied to building new intelligent retrieval systems. The present invention addresses the problems described above.
Summary of the invention
The present invention aims to remedy the above deficiencies of the prior art by proposing a semantic query method based on a drug ontology library. The method introduces a drug ontology library to improve the accuracy and efficiency of drug queries.
The technical solution adopted by the invention is a semantic query method based on a drug ontology library, comprising the following steps:
Step 1: build the drug ontology library
The drug ontology library is built with the help of domain experts, following the skeleton method;
Step 1-1: ontology capture, i.e. with the help of domain experts, obtain the concept terms of the pharmaceutical product domain and the relationships among them, and ensure that these concept terms and relationships are unambiguous;
Step 1-2: give the drugs a structured representation;
Step 1-3: ontology evaluation, i.e. judge whether the drug ontology meets the standard; if it does, save the ontology as a file, otherwise return to step 1-1;
Step 2: the user issues a query request to the user interaction module, which submits the data to be queried to the query data processing module;
Step 3: the query data processing module processes the data to be queried, as follows:
Step 3-1: data preprocessing
1) clean the words according to the ontology model, removing irrelevant content;
2) divide the words according to the ontology model, i.e. transform the query statement into disjunctive normal form.
Step 3-2: synonym expansion
Map each conjunctive clause of the disjunctive normal form into the ontology library and expand its synonyms, i.e. look the clause up in the ontology library, find the corresponding standardized concept, and generate the query model of the drug ontology;
Step 3-3: query execution
Step 3-4: submit the generated ontology query model to the ontology information processor.
Step 4: the ontology information processor processes the query model. It consists of the drug ontology library and a semantic relation expansion module; the semantic relation expansion module performs the following steps:
Step 4-1: according to the relational model of the drug ontology, expand the drug relations and generate the drug relational query set;
Step 4-2: compute the similarity between the drug relational query set and the drug ontology query set, i.e. the similarity between the expanded drug set and the drug query set. The similarity formula is:
[Formula 1: equation image not available in this text]
where MaxLenth is the maximum depth of the ontology network and min is the number of directed edges on the shortest path between concept nodes w1 and w2.
Step 4-3: set a similarity threshold; the search terms whose similarity exceeds the threshold, together with the drug query model, form a set that is taken as the final drug query set;
Step 5: map the expanded query set onto the drug information library to obtain the specific information of the drug;
Step 6: the user interaction module presents the final query information to the user.
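Steps 4-1 to 4-3 can be sketched roughly as follows. Formula 1 is not available in this text, so the sketch assumes a common path-based form, sim(w1, w2) = (2 * MaxLenth - min) / (2 * MaxLenth), which uses only the two quantities defined above; the patent's actual formula may differ. The toy ontology, the drug names, and all function names are hypothetical, and the shortest path is searched without regard to edge direction so that two concepts can be connected through a shared parent.

```python
from collections import deque

# Hypothetical toy ontology: directed edges from a concept to its children.
EDGES = {
    "drug": ["analgesic", "antibiotic"],
    "analgesic": ["aspirin", "ibuprofen"],
    "antibiotic": ["amoxicillin"],
}

def build_undirected(edges):
    """Index neighbours in both directions so paths can cross a shared parent."""
    graph = {}
    for parent, children in edges.items():
        for child in children:
            graph.setdefault(parent, set()).add(child)
            graph.setdefault(child, set()).add(parent)
    return graph

def max_depth(edges, root):
    """MaxLenth: number of edges on the longest root-to-leaf path."""
    children = edges.get(root, [])
    if not children:
        return 0
    return 1 + max(max_depth(edges, c) for c in children)

def shortest_path_edges(graph, w1, w2):
    """min: number of edges on the shortest path between w1 and w2 (BFS)."""
    seen, queue = {w1}, deque([(w1, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == w2:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two concepts are not connected

def similarity(graph, max_len, w1, w2):
    """ASSUMED path-based formula, not the patent's Formula 1."""
    if max_len == 0:
        return 1.0 if w1 == w2 else 0.0
    dist = shortest_path_edges(graph, w1, w2)
    if dist is None:
        return 0.0
    return (2 * max_len - dist) / (2 * max_len)

def final_query_set(graph, max_len, query_terms, related_terms, threshold):
    """Step 4-3: keep related terms above the threshold, plus the query terms."""
    kept = {
        r for r in related_terms
        if any(similarity(graph, max_len, q, r) > threshold for q in query_terms)
    }
    return set(query_terms) | kept

GRAPH = build_undirected(EDGES)
MAX_LEN = max_depth(EDGES, "drug")
```

With this toy data, a query for "aspirin" and a threshold of 0.4 keeps "ibuprofen" and "analgesic" and drops "amoxicillin", which sits in a different branch of the tree.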
The utility model has the advantages that
1, the present invention is calculated based on drug ontology model, and using body similarity, is inquired medicine information It is inquired with expansion.
2, drug ontology library is introduced into drug inquiry system by the present invention, improves the Efficiency and accuracy of drug inquiry;
3, system structure of the invention is clear, simple, it is easy to accomplish.
Description of the drawings:
Fig. 1 is a schematic diagram of the drug ontology library of the present invention.
Fig. 2 is a flow chart of the method of the present invention.
Detailed description of the embodiments
The invention is described in further detail below with reference to the accompanying drawings.
As shown in Fig. 1, the drug ontology library of the present invention is built with the help of domain experts, following the skeleton method. First, in ontology capture, the concept terms of the pharmaceutical product domain and the relationships among them are obtained with the help of domain experts, and these terms and relationships are checked to be unambiguous. The drugs are then given a structured representation. Next, ontology evaluation judges whether the drug ontology meets the standard: if it does, the drug ontology information is saved as a file; otherwise the process returns to the ontology capture stage.
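The capture, structuring, and evaluation loop described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not say how the ontology is represented or what "the standard" is, so the triple format, the `ALLOWED_PREDICATES` check, the JSON file format, and all function names here are hypothetical.

```python
import json

# Hypothetical stand-in for "the standard" used in ontology evaluation.
ALLOWED_PREDICATES = {"is_a", "synonym_of", "treats"}

def capture(expert_triples):
    """Step 1-1: collect concept terms and relations supplied by domain experts."""
    concepts, relations = set(), []
    for subject, predicate, obj in expert_triples:
        concepts.update([subject, obj])
        relations.append((subject, predicate, obj))
    return concepts, relations

def structure(concepts, relations):
    """Step 1-2: give the captured knowledge a structured representation."""
    return {
        "concepts": sorted(concepts),
        "relations": [
            {"subject": s, "predicate": p, "object": o} for s, p, o in relations
        ],
    }

def evaluate(ontology):
    """Step 1-3: accept only non-empty ontologies whose relations use known predicates."""
    relations = ontology["relations"]
    return bool(relations) and all(
        r["predicate"] in ALLOWED_PREDICATES for r in relations
    )

def build_ontology(expert_rounds, path):
    """Repeat capture -> structure -> evaluate until a round passes, then save to a file."""
    for triples in expert_rounds:
        ontology = structure(*capture(triples))
        if evaluate(ontology):
            with open(path, "w", encoding="utf-8") as fh:
                json.dump(ontology, fh, ensure_ascii=False, indent=2)
            return ontology
        # Evaluation failed: fall through to the next expert round (back to capture).
    raise ValueError("no expert round produced an ontology that meets the standard")
```

Each element of `expert_rounds` stands for one pass of expert input; a round that fails evaluation sends the process back to capture, mirroring the return to step 1-1.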
As shown in Fig. 2, the drug information query method of the present invention comprises the following steps:
Step 1: the user terminal logs on to the drug information retrieval system;
Step 2: through the user interaction module, the user issues a retrieval request to the drug information retrieval system, searching for drug D;
Step 3: clean the words of the query for drug D according to the ontology model, removing irrelevant content to obtain the search term D1;
Step 4: divide D1 according to the ontology model, i.e. transform the query statement into disjunctive normal form, for example of the form p ∨ q; here the query is a single word, which is itself a disjunctive normal form, so no transformation is needed;
Step 5: map each conjunctive clause of the disjunctive normal form into the ontology library and expand its synonyms, i.e. look it up in the ontology library, find the corresponding standardized concept, and generate the query model S of the drug ontology;
Step 6: expand the drug relations of S according to the relational model of the drug ontology, generating the drug relational query set R (r1, r2, r3, ..., rn);
Step 7: compute the similarity between the drug relational query set R and the drug ontology query model S using the similarity formula, i.e. the similarity between the expanded drug set and the drug query set;
Step 8: set a similarity threshold and remove the relations with low similarity; the search terms whose similarity exceeds the threshold, together with the query model S, form the set O (o1, o2, o3, ..., on), which is taken as the final drug query set;
Step 9: map the drug query set O onto the drug information library to obtain the specific information of the drug;
Step 10: the drug information retrieval system presents the final drug query information to the user through the user interaction module.
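Steps 3 to 5 of this embodiment (cleaning, division into disjunctive normal form, and synonym expansion) can be sketched as below. The sketch assumes flat queries without parentheses, where "or" separates clauses and adjacent words or "and" join terms within a clause; such a query is already in disjunctive normal form once split. The stop-word list, the synonym table, and the function names are hypothetical and stand in for what the ontology model would supply.

```python
import re

# Hypothetical data that the ontology model would normally provide.
STOP_WORDS = {"please", "find", "the", "drug"}
SYNONYMS = {"asa": "aspirin", "paracetamol": "acetaminophen"}

def clean(query):
    """Step 3: lower-case, strip punctuation, and drop irrelevant words."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def to_dnf(tokens):
    """Step 4: split into conjunctive clauses; the result is a list of clauses."""
    clauses, current = [], []
    for tok in tokens:
        if tok == "or":
            if current:
                clauses.append(current)
            current = []
        elif tok != "and":
            current.append(tok)
    if current:
        clauses.append(current)
    return clauses

def normalise(clauses):
    """Step 5: map every term to its standardized concept where one is known."""
    return [[SYNONYMS.get(t, t) for t in clause] for clause in clauses]

def query_model(query):
    """Produce the query model S for a raw user query."""
    return normalise(to_dnf(clean(query)))
```

A single-word query such as "Please find ASA!" yields one clause containing one standardized concept, matching the remark in step 4 that a single word needs no transformation.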

Claims (2)

1. A semantic query method based on a drug ontology library, characterized in that the method comprises the following steps:
Step 1: build the drug ontology library; the drug ontology library is built with the help of domain experts, following the skeleton method, comprising:
Step 1-1: ontology capture, i.e. with the help of domain experts, obtain the concept terms of the pharmaceutical product domain and the relationships among them, and ensure that these concept terms and relationships are unambiguous;
Step 1-2: give the drugs a structured representation;
Step 1-3: ontology evaluation, i.e. judge whether the drug ontology meets the standard; if it does, save the ontology as a file, otherwise return to step 1-1;
Step 2: the user issues a query request to the user interaction module, which submits the data to be queried to the query data processing module;
Step 3: the query data processing module processes the data to be queried, comprising: step 3-1: data preprocessing;
1) clean the words according to the ontology model, removing irrelevant content;
2) divide the words according to the ontology model, i.e. transform the query statement into disjunctive normal form;
Step 3-2: synonym expansion;
map each conjunctive clause of the disjunctive normal form into the ontology library and expand its synonyms, i.e. look each conjunctive clause of the disjunctive normal form up in the ontology library, find the corresponding standardized concept, and generate the query model of the drug ontology;
Step 3-3: query execution;
Step 3-4: submit the generated ontology query model to the ontology information processor;
Step 4: the ontology information processor processes the query model; it consists of the drug ontology library and a semantic relation expansion module, and the semantic relation expansion module comprises:
Step 4-1: according to the relational model of the drug ontology, expand the drug relations and generate the drug relational query set;
Step 4-2: compute the similarity between the drug relational query set and the query model of the drug ontology, i.e. the similarity between the drug relational query set and the drug ontology query set, the similarity formula being:
[equation image not available in this text]
where MaxLenth is the maximum depth of the ontology network and min is the number of directed edges on the shortest path between concept nodes w1 and w2;
Step 4-3: set a similarity threshold; the search terms whose similarity exceeds the threshold, together with the drug ontology query model, form a set that is taken as the final drug query set;
Step 5: map the final drug query set onto the drug information library to obtain the specific information of the drug;
Step 6: the user interaction module presents the query information to the user.
2. A semantic query method based on a drug ontology library, characterized in that the method comprises the following steps:
Step 1: the user terminal logs on to the drug information retrieval system;
Step 2: through the user interaction module, the user issues a retrieval request to the drug information retrieval system, searching for drug D;
Step 3: clean the words of the query for drug D according to the ontology model, removing irrelevant content to obtain the search term D1;
Step 4: divide D1 according to the ontology model, i.e. transform the query statement into disjunctive normal form; here the query is a single word, which is itself a disjunctive normal form, so no transformation is needed;
Step 5: map each conjunctive clause of the disjunctive normal form into the ontology library and expand its synonyms, i.e. look each conjunctive clause of the disjunctive normal form up in the ontology library, find the corresponding standardized concept, and generate the query model S of the drug ontology;
Step 6: expand the drug relations of S according to the relational model of the drug ontology, generating the drug relational query set R (r1, r2, r3, ..., rn);
Step 7: compute the similarity between the drug relational query set R and the drug ontology query model S using the similarity formula;
Step 8: set a similarity threshold and remove the relations with low similarity; the search terms whose similarity exceeds the threshold, together with the query model S, form the set O (o1, o2, o3, ..., on), which is taken as the final drug query set;
Step 9: map the drug query set O onto the drug information library to obtain the specific information of the drug;
Step 10: the drug information retrieval system presents the final drug query information to the user through the user interaction module.
CN201510363192.7A 2015-06-26 2015-06-26 A kind of semantic query method based on drug ontology library Active CN104933159B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510363192.7A CN104933159B (en) 2015-06-26 2015-06-26 A kind of semantic query method based on drug ontology library

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510363192.7A CN104933159B (en) 2015-06-26 2015-06-26 A kind of semantic query method based on drug ontology library

Publications (2)

Publication Number Publication Date
CN104933159A (en) 2015-09-23
CN104933159B (en) 2019-01-18

Family

ID=54120326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510363192.7A Active CN104933159B (en) 2015-06-26 2015-06-26 A kind of semantic query method based on drug ontology library

Country Status (1)

Country Link
CN (1) CN104933159B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107818124B (en) * 2017-03-03 2020-07-14 平安医疗健康管理股份有限公司 Data matching method and device
CN107577800A (en) * 2017-09-21 2018-01-12 合肥集知网知识产权运营有限公司 A kind of big data patent retrieval method based on fuzzy set model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060047649A1 (en) * 2003-12-29 2006-03-02 Ping Liang Internet and computer information retrieval and mining with intelligent conceptual filtering, visualization and automation
CN102081668A (en) * 2011-01-24 2011-06-01 熊晶 Information retrieval optimizing method based on domain ontology
CN103020074A (en) * 2011-09-23 2013-04-03 倪毅 Object-level search technique based on main body

Also Published As

Publication number Publication date
CN104933159A (en) 2015-09-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 210003 No. 66 Xinmofan Road, Gulou District, Nanjing, Jiangsu

Applicant after: Nanjing Post & Telecommunication Univ.

Address before: 210023 9 Wen Yuan Road, Qixia District, Nanjing, Jiangsu.

Applicant before: Nanjing Post & Telecommunication Univ.

GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20150923

Assignee: NUPT INSTITUTE OF BIG DATA RESEARCH AT YANCHENG CO., LTD.

Assignor: Nanjing Post & Telecommunication Univ.

Contract record no.: X2019980001249

Denomination of invention: Semantic query method based on drug ontology library

Granted publication date: 20190118

License type: Common License

Record date: 20191224