CN110085290A - Method for building a semantic tree model of breast molybdenum-target (mammography) reports supporting heterogeneous information integration - Google Patents
- Publication number
- CN110085290A (application CN201910256713.7A)
- Authority
- CN
- China
- Prior art keywords
- text
- molybdenum target
- semantic
- semantic tree
- breast cancer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/322—Trees
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/34—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H15/00—ICT specially adapted for medical reports, e.g. generation or transmission thereof
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The present invention relates to a method for building a semantic tree model of breast molybdenum-target (mammography) reports that supports heterogeneous information integration, characterized by comprising the following steps: forming a text normalization database for textual descriptions of breast cancer molybdenum-target imaging findings; obtaining a textual description of breast cancer molybdenum-target imaging findings in real time and, based on the text normalization database, dividing the description into phrases according to semantic information; obtaining the semantic constraints of each entity; and forming the semantic tree of the textual description. By constructing a breast molybdenum-target semantic tree, the invention structures the complex textual information of breast cancer molybdenum-target images that comes from different hospitals and different doctors, and thereby achieves semantics-based integration of heterogeneous information.
Description
Technical field
The present invention relates to a method for building a semantic tree model of breast molybdenum-target reports that supports heterogeneous information integration, and belongs to the field of medical text structuring.
Background art
With the rapid development of medical informatization, 80% of hospitals have now completed the construction of their information services, and electronic medical records have replaced paper records. Nevertheless, a patient's diagnostic report is still an unstructured natural-language description of the region of interest, written according to the doctor's knowledge and working experience, and natural language cannot be directly recognized and processed by a computer.
Text structuring is key to the development of artificial intelligence in the medical field. Foreign natural language processing systems such as MedLEE (Medical Language Extraction and Encoding System) and UMLS (Unified Medical Language System) are very mature, but because Chinese and English differ greatly in semantics and syntactic structure, they port poorly to Chinese medical text. Domestic research on medical text structuring started late. Drawing on existing foreign techniques, it has achieved many breakthroughs, but research on structuring the text of breast molybdenum-target diagnostic imaging reports remains scarce.
Summary of the invention
The object of the present invention is to provide a method for structuring the text of breast molybdenum-target diagnostic imaging reports.
To achieve the above object, the technical solution of the present invention provides a method for building a semantic tree model of breast molybdenum-target reports that supports heterogeneous information integration, characterized by comprising the following steps:
Step 1, form, according to expert rules, a text normalization database for textual descriptions of breast cancer molybdenum-target imaging findings; the database stores phrases that relate to such descriptions and conform to current medical norms;
Step 2, obtain a textual description of breast cancer molybdenum-target imaging findings in real time; based on the text normalization database, divide the description into phrases according to semantic information, remove unneeded redundant information, extract the descriptions relevant to breast cancer diagnosis, and delimit the range of each entity, where the lesion classification results are used and each lesion is treated as one entity;
Step 3, obtain the semantic constraints of each entity;
Step 4, form the semantic tree of the textual description obtained in step 2, in which the root node is the entity, the internal nodes are the attributes of the entity, and the leaf nodes are the attribute descriptions corresponding to each attribute.
Preferably, the method further includes step 5: visualizing the semantic tree obtained in the previous step.
By constructing a breast molybdenum-target semantic tree, the invention structures the complex textual information of breast cancer molybdenum-target images that comes from different hospitals and different doctors, and thereby achieves semantics-based integration of heterogeneous information.
Brief description of the drawings
Fig. 1 is the flow chart for constructing the semantic tree of Chinese breast molybdenum-target imaging findings text. The main process is as follows: input the breast molybdenum-target imaging text to be processed; segment the text into words; find the main nodes of the semantic tree according to text features and use their semantic constraints to find their leaf nodes; attach the leaf nodes to the nodes of the semantic tree in input order, which completes the scan for that semantic tree.
Fig. 2 is a word segmentation sample for Chinese breast molybdenum-target imaging findings text, showing the segmentation result of one clause selected from a breast cancer molybdenum-target imaging text description. The result shows that, leaving omissions aside, the syntactic structure of a clause in such text can be summarized as location + subject + predicate + descriptions of different attributes. With this structure, the category of each word can be found quickly.
Fig. 3 shows the construction of semantic constraints for the semantic tree of Chinese breast molybdenum-target imaging findings: the result of sorting the descriptions associated with an entity according to their features, on the basis of word segmentation. Each keyword is assigned a category, mainly using the part-of-speech features of the word and the words stored in the database built from expert rules. Unneeded redundant words are discarded.
Fig. 4 shows the semantic tree construction for Chinese breast molybdenum-target imaging findings: the semantic subtree derived from one clause of a breast cancer molybdenum-target imaging text description. A hierarchical nesting approach is used: each entity is nested level by level within the levels that contain it, attributes continue to be added to an entity until the next entity is found, and attribute values that are absent are ignored. Every semantic tree obtained in this way is a nested structure.
Detailed description of the embodiments
To make the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
The method provided by the invention for building a semantic tree model of breast molybdenum-target reports supporting heterogeneous information integration constructs semantic trees for textual descriptions of breast molybdenum-target images as they are expressed in real-world medical settings. Building the semantic tree of Chinese breast molybdenum-target imaging findings mainly comprises the following steps: forming, according to expert rules, a database of breast cancer molybdenum-target imaging findings text descriptions; using prior knowledge to segment the textual description of the imaging findings into words, with lesion features as the dividing nodes; applying semantic constraints to the entities in the textual description using those features; and, according to the semantic constraints, connecting the related nodes in the semantic tree to form a complete semantic tree of the breast cancer molybdenum-target image.
Text normalization database of breast cancer molybdenum-target imaging findings
First, a database of breast cancer molybdenum-target imaging findings descriptions is built according to expert rules. Because no unified specification for describing breast cancer medical images has yet been established domestically, the descriptions of breast molybdenum-target imaging findings written by different radiology specialists need to be analyzed. By surveying the different structures that different hospitals use for breast molybdenum-target imaging findings text and analyzing their content, phrases that conform to current medical norms are obtained. In this way, the textual descriptions of breast molybdenum-target imaging findings are normalized into a standard, unified description of breast cancer signs.
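One simple way to picture such a normalization database is a lookup table from the wording variants used by different hospitals to a single standard phrase. The sketch below assumes this representation; the table name `NORMALIZATION_DB`, the function `normalize_phrase`, and every entry are hypothetical stand-ins for content that radiology experts would curate.

```python
from typing import Optional

# Hypothetical entries: variant wording -> standard phrase.
# A real database would be compiled from expert rules, not hard-coded.
NORMALIZATION_DB = {
    "mass": "mass",
    "lump": "mass",
    "spiculated margin": "spiculated margin",
    "spiculate margin": "spiculated margin",
    "burr-like margin": "spiculated margin",
}


def normalize_phrase(phrase: str) -> Optional[str]:
    """Return the standard phrase for a variant, or None if it is not in the database."""
    return NORMALIZATION_DB.get(phrase.strip().lower())
```

Returning `None` for unknown wording leaves it to the caller to decide whether the phrase is redundant or should be added to the database.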
Word segmentation of breast cancer molybdenum-target imaging findings text
The input breast molybdenum-target imaging text is divided into phrases according to semantic information, unneeded redundant information is removed, and the important descriptions relevant to breast cancer diagnosis are extracted. A synonym lexicon is built for words that are close in meaning and belong to the same kind of description, which ensures that near-synonyms are recognized effectively and makes the semantic tree more extensible. Observation and summary of imaging texts lead to the following conclusion: taking the entity as the center, what precedes the entity describes its location and what follows it describes its attribute features, so the range of each entity can be delimited more efficiently. Because medical imaging descriptions are characterized by many nouns, few verbs, and omitted subjects, particular attention must be paid to distinguishing nouns and noun phrases. According to their grammatical features, the segmented words can be divided into the following 6 categories.
Classification | Number |
Entity | 1 |
Predicate | 2 |
Attribute | 3 |
Value | 4 |
Quantifier | 5 |
Distribution | 6 |
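The six categories in the table can be represented as an enumeration, with a lexicon assigning a category to each known word. The following is a sketch under stated assumptions: the category numbers come from the table above, while the lexicon entries and the function names `tag_tokens` and `drop_redundant` are hypothetical, and the input is assumed to be already segmented into tokens.

```python
from enum import IntEnum
from typing import List, Optional, Tuple


class Category(IntEnum):
    """Word categories, numbered as in the table above."""
    ENTITY = 1
    PREDICATE = 2
    ATTRIBUTE = 3
    VALUE = 4
    QUANTIFIER = 5
    DISTRIBUTION = 6


# Hypothetical lexicon; in the described method it would come from the
# expert-rule database together with part-of-speech features.
LEXICON = {
    "mass": Category.ENTITY,
    "seen": Category.PREDICATE,
    "margin": Category.ATTRIBUTE,
    "spiculated": Category.VALUE,
    "multiple": Category.QUANTIFIER,
    "scattered": Category.DISTRIBUTION,
}

Tagged = Tuple[str, Optional[Category]]


def tag_tokens(tokens: List[str]) -> List[Tagged]:
    """Pair each token with its category; None marks words outside the six categories."""
    return [(tok, LEXICON.get(tok)) for tok in tokens]


def drop_redundant(tagged: List[Tagged]) -> List[Tagged]:
    """Discard uncategorized tokens, treating them as redundant in this sketch."""
    return [(tok, cat) for tok, cat in tagged if cat is not None]
```

Location words, which the text says precede the entity, are not among the six categories and would need separate handling that this sketch does not attempt.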
Semantic constraints of breast cancer molybdenum-target imaging findings text
Based on the understanding of the text structure of breast molybdenum-target imaging findings gained from the preliminary survey, the lesion classification results are used and each lesion is treated as one entity. Divided by content, entities have different semantic constraints according to their different characteristics. On the semantic tree, these constraints take the form of leaf nodes. Textual descriptions of molybdenum-target imaging findings may also describe accompanying signs such as the skin, but such statements generally appear only when the lesion is malignant, so features of this kind require supplementary handling. The final result is obtained by jointly considering the practical application scenario and the distinct features of lesions in breast molybdenum-target imaging text descriptions.
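One possible reading of a semantic constraint is a per-entity list of the attributes that entity may carry, so that a description is attached only where it makes sense. The sketch below follows that reading; the table `SEMANTIC_CONSTRAINTS`, its entries, and the two helper functions are assumptions for illustration and are not taken from the patent.

```python
from typing import Dict, Set

# Hypothetical constraint table: entity type -> attributes it may carry.
SEMANTIC_CONSTRAINTS: Dict[str, Set[str]] = {
    "mass": {"shape", "margin", "density", "size"},
    "calcification": {"morphology", "distribution"},
}


def allowed_attributes(entity: str) -> Set[str]:
    """Attributes permitted for an entity; empty set for unknown entities."""
    return SEMANTIC_CONSTRAINTS.get(entity, set())


def satisfies_constraint(entity: str, attribute: str) -> bool:
    """True if the attribute may hang under the given entity in the tree."""
    return attribute in allowed_attributes(entity)
```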
Semantic tree construction for breast cancer molybdenum-target imaging findings text
Through the semantic constraints, the semantic tree links different entities, entities and their attributes, and attributes and their values. Attention must be paid to the syntactic structure of the text and to the connections between the words of a sentence after segmentation. Several entities may share the same attribute description, and a single attribute of a single entity may have several descriptions. In this process it is necessary to consider not only the separators contained in the input breast molybdenum-target imaging text, such as commas, full stops, and conjunctions, but also the relationships within the context, so that the semantics remain complete and coherent.
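The linking step can be sketched as a single pass over categorized tokens: a new entity opens a new tree, an attribute opens a slot under the current entity, and values fill the most recent slot. This is a simplified sketch, not the patented procedure. The function name `build_trees`, the string category labels, the rule that only a full stop closes the current attribute, and the sample tokens are all assumptions, and the case of several entities sharing one attribute description is not handled.

```python
from typing import Dict, List, Optional, Tuple

SENTENCE_END = {"."}  # assumption: only a full stop closes the current attribute


def build_trees(tagged: List[Tuple[str, str]]) -> List[Dict]:
    """Build one nested dict per entity from (token, category) pairs.

    Categories used here: 'entity', 'attribute', 'value', 'separator'.
    """
    trees: List[Dict] = []
    current: Optional[Dict] = None
    attr: Optional[str] = None
    for token, cat in tagged:
        if cat == "entity":
            # A new entity ends attribute collection for the previous one.
            current = {"entity": token, "attributes": {}}
            trees.append(current)
            attr = None
        elif current is None:
            continue  # nothing to attach to before the first entity
        elif cat == "attribute":
            attr = token
            current["attributes"].setdefault(attr, [])
        elif cat == "value" and attr is not None:
            # Several values may describe the same attribute.
            current["attributes"][attr].append(token)
        elif cat == "separator" and token in SENTENCE_END:
            attr = None  # commas and conjunctions keep the attribute open
    return trees


# Hypothetical pre-tagged clause, for illustration only.
sample = [
    ("mass", "entity"), ("margin", "attribute"), ("spiculated", "value"),
    ("and", "separator"), ("indistinct", "value"), (".", "separator"),
    ("calcification", "entity"), ("distribution", "attribute"), ("clustered", "value"),
]
trees = build_trees(sample)
```

Letting commas and conjunctions keep the attribute open is what allows "spiculated and indistinct" to land under the same attribute; a production system would also need the contextual checks the text calls for.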
Semantic tree visualization for breast cancer molybdenum-target imaging findings text
Semantic tree visualization presents the result of structuring breast cancer molybdenum-target imaging findings text in a more intuitive way and makes it easy to show how the text is classified. Because of its tree structure, a semantic tree lends itself to visualization, whereas traditional output formats struggle to convey its structure and content clearly. A visualized semantic tree is also easier to search and browse by its different features.
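As a minimal illustration of why the tree form is easy to display, the sketch below renders a three-level entity tree as indented text. The patent does not specify a rendering technique; the function `render_tree`, the nested-dict input format, and the sample values are assumptions made here, and a graphical tool could equally be used.

```python
from typing import Dict


def render_tree(tree: Dict) -> str:
    """Render {'entity': ..., 'attributes': {name: [values]}} as an ASCII tree."""
    lines = [tree["entity"]]
    attrs = list(tree["attributes"].items())
    for i, (name, values) in enumerate(attrs):
        last_attr = i == len(attrs) - 1
        lines.append(("`-- " if last_attr else "|-- ") + name)
        # Continue the vertical rule only while more attributes follow.
        prefix = "    " if last_attr else "|   "
        for j, value in enumerate(values):
            last_val = j == len(values) - 1
            lines.append(prefix + ("`-- " if last_val else "|-- ") + value)
    return "\n".join(lines)


# Hypothetical lesion tree, for illustration only.
rendered = render_tree(
    {"entity": "mass", "attributes": {"shape": ["irregular"], "margin": ["spiculated", "indistinct"]}}
)
print(rendered)
```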
Claims (2)
1. A method for building a semantic tree model of breast molybdenum-target reports supporting heterogeneous information integration, characterized by comprising the following steps:
Step 1, form, according to expert rules, a text normalization database for textual descriptions of breast cancer molybdenum-target imaging findings; the database stores phrases that relate to such descriptions and conform to current medical norms;
Step 2, obtain a textual description of breast cancer molybdenum-target imaging findings in real time; based on the text normalization database, divide the description into phrases according to semantic information, remove unneeded redundant information, extract the descriptions relevant to breast cancer diagnosis, and delimit the range of each entity, where the lesion classification results are used and each lesion is treated as one entity;
Step 3, obtain the semantic constraints of each entity;
Step 4, form the semantic tree of the textual description obtained in step 2, in which the root node is the entity, the internal nodes are the attributes of the entity, and the leaf nodes are the attribute descriptions corresponding to each attribute.
2. The method for building a semantic tree model of breast molybdenum-target reports supporting heterogeneous information integration according to claim 1, characterized by further comprising step 5:
visualizing the semantic tree obtained in the previous step.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910256713.7A CN110085290A (en) | 2019-04-01 | 2019-04-01 | Method for building a semantic tree model of breast molybdenum-target reports supporting heterogeneous information integration |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110085290A (en) | 2019-08-02 |
Family ID: 67413908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910256713.7A (Pending), CN110085290A (en) | Method for building a semantic tree model of breast molybdenum-target reports supporting heterogeneous information integration | 2019-04-01 | 2019-04-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110085290A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||