WO2019164064A1 - System for interpreting medical image through generation of refined artificial intelligence reinforcement learning data, and method therefor - Google Patents


Info

Publication number
WO2019164064A1
WO2019164064A1 (PCT/KR2018/005641)
Authority
WO
WIPO (PCT)
Prior art date
Application number
PCT/KR2018/005641
Other languages
French (fr)
Korean (ko)
Inventor
이병일
김성현
Original Assignee
(주)헬스허브
Priority date
Filing date
Publication date
Application filed by (주)헬스허브
Publication of WO2019164064A1 publication Critical patent/WO2019164064A1/en

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. ICT SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/20: ICT specially adapted for handling medical images, e.g. DICOM, HL7 or PACS
    • G16H 50/20: ICT specially adapted for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H 50/50: ICT specially adapted for simulation or modelling of medical disorders
    • G16H 50/70: ICT specially adapted for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • The present invention relates to a medical image reading system and method based on the generation of refined artificial intelligence reinforcement learning data. More particularly, it extracts reinforcement learning data for medical image reading from the readings written by medical image reading experts and uses that data as artificial intelligence training data, thereby reducing the computational cost and complexity of AI-based medical image reading while improving its accuracy.
  • A doctor's skill is often measured by the ability to read lesions from medical images without missing them. Accurate interpretation of medical images is of paramount importance, because early detection of pathologies greatly increases the likelihood of treating and curing the disease.
  • CAD: computer-aided diagnosis
  • The image analysis technology used here has evolved from pattern recognition based on image processing to lesion prediction through machine learning: features are extracted from the images, the images are vectorized with those features, and various machine-learning classification techniques are then applied. In recent years, deep learning has become the mainstream method.
  • CNN: convolutional neural network
  • The present invention provides a medical image reading system and method that improve the performance of a conventional CNN by applying refined learning data to the CNN.
  • Korean Patent Publication No. 10-2017-0140757 (December 21, 2017) relates to a clinical decision support ensemble system and a clinical decision support method using the same, which predict a patient's clinical outcome through machine learning on data received from a plurality of external medical institutions.
  • Korean Patent Publication No. 10-2015-0108701 (September 30, 2015) relates to a system and method for visualizing anatomical elements in a medical image, which automatically classify the anatomical elements included in the medical image using anatomical context information and present the classified elements in a user-friendly visualization.
  • Korean Patent Publication No. 10-2015-0098119 (August 27, 2015) relates to a system and method for removing false-positive lesion candidates in a medical image by verifying the lesion candidates detected in the image against anatomical context information.
  • The prior art thus performs ensemble prediction by integrating patients' clinical prediction results, uses anatomical context information in medical images, or removes false-positive lesion candidates. None of it suggests, as the present invention does, refining extensive learning data for deep learning and thereby reducing the computational cost and complexity of the learning model while improving the accuracy of AI-based medical image reading.
  • The present invention was created to solve the above problems. Its object is to provide a medical image reading system and method, based on refined AI reinforcement-learning data, that extract reinforcement learning data from the readings of medical image reading experts (radiology specialists) and use it as AI training data, reducing the computational cost and complexity of reading while improving accuracy.
  • Another object of the present invention is to improve the performance of the medical image reading system by improving the learning effect obtained from images, identifying not only the presence and location of lesions but also the various types of conditions that can appear in the same body part, and generating a reading-record supervised learning model that normalizes the text contained in well-trained radiologists' readings to produce refined learning data.
  • A further object of the present invention is to provide the structure of, and a method for generating, refined learning data of a new form: learning data refined by passing verified readings through the generated reading-record supervised learning model and combining them with the medical images, so that reinforcement learning can enhance the learning effect and identify the types of diseases.
  • Another object is to build a reading-based supervised learning model that forms a converged convolutional neural network of a new structure: filters that can identify the type of condition are extracted from the refined learning data and made to interact with the convolutional neural network, improving the reading performance of the existing CNN and enabling it to recognize the type of disease.
  • To achieve these objects, the medical image reading system based on refined AI reinforcement-learning data comprises a reading-record supervised learning unit, which generates refined learning data in a normalized form extracted from the readings of medical image reading experts, and a learning model generating unit, which performs machine learning to read medical images using the refined learning data as input. The machine learning automatically receives the experts' readings of the medical images as input, so that the learning data automatically becomes reinforcement learning data.
  • The reading-record supervised learning unit may include: a medical record data loading unit that reads data from the file location addresses of the readings; a labeling unit that classifies each reading, by body part, into sections containing findings, conclusions, and recommendations, and labels the disease-related words or phrases from the plain text of each section as a set; a feature extractor that extracts disease-related words or phrases from the labeled readings and derives common features from them; a feature matrix generator that generates a feature matrix by regularizing the extracted features; a feature analysis unit that, given a reading, analyzes its features by mapping it onto the feature matrix; and a refined data generator that generates refined learning data using only the analyzed features.
  • The medical record data loading unit reads the file position, the total number of files, the length of each file, or a combination thereof from the file location address of the original reading data given as an input value, and loads the files into system memory.
  • The data loaded into memory is labeled for each reading section by body part, classified into respective sets, and rearranged in system memory. The feature extractor selects, from the plain text of each section, only the text that matches a standard medical term data set including SNOMED-CT, ICD-11, LOINC, and KCD-10; it then analyzes word type, description form, description frequency, or a combination thereof in the extracted medical terms, and extracts the features of terms related to the presence or absence of lesions, the location of lesions, the description of symptoms, the types of conditions, or a combination thereof. The feature matrix generator maps the features extracted by the feature extractor onto newly input plain-text data sets, generating a feature matrix in which terms of similar or identical meaning can be compared and analyzed. The feature analysis unit, when an unrefined original reading is input, maps its descriptive plain text onto the feature matrix to extract, analyze, and classify the presence or absence of a lesion, the location of the lesion, the symptoms, and the type of disease. The refined data generator then generates the refined learning data from the data extracted, analyzed, and classified by the feature analysis unit.
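The refinement pipeline described above (section labeling, term extraction, feature-matrix mapping, refined-data generation) can be sketched in a few lines of Python. Everything here is a simplified stand-in: the section names, the toy feature matrix, and the substring matching are illustrative assumptions, not the patent's actual data structures.

```python
# Illustrative sketch of the reading-refinement pipeline. The feature
# matrix below is a tiny hypothetical example, not real clinical data.

SECTIONS = ("findings", "conclusion", "recommendation")

# Hypothetical normalized feature matrix: canonical feature -> synonym set
FEATURE_MATRIX = {
    "nodule":    {"nodule", "nodular opacity", "mass"},
    "left_lung": {"left lung", "left lower lobe"},
}

def label_sections(reading: str) -> dict:
    """Split a plain-text reading into labeled sections (very naive)."""
    labeled = {s: "" for s in SECTIONS}
    current = "findings"
    for line in reading.lower().splitlines():
        for s in SECTIONS:
            if line.startswith(s):
                current = s
                line = line[len(s):].lstrip(": ")
        labeled[current] += line + " "
    return labeled

def refine(reading: str) -> dict:
    """Map section text onto the feature matrix -> refined learning data."""
    labeled = label_sections(reading)
    refined = {}
    for section, text in labeled.items():
        hits = [feat for feat, syns in FEATURE_MATRIX.items()
                if any(s in text for s in syns)]
        if hits:
            refined[section] = hits
    return refined

print(refine("Findings: nodular opacity in the left lower lobe.\n"
             "Conclusion: benign nodule."))
```

A real implementation would replace the substring matching with the NLP and standard-vocabulary steps described in the claims.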
  • The learning model generation unit is trained by a converged convolutional neural network (CCNN), which continuously takes the refined learning data extracted from the experts' readings as input. Performing deep machine learning through this converged CNN reduces the amount of computation and improves accuracy, improving overall performance. In the converged CNN, the weights are updated by backpropagation.
  • Likewise, the medical image reading method comprises a reading-record supervised learning step, which generates refined learning data in a normalized form extracted from the readings of a medical image reading expert, and a learning model generation step, which performs machine learning to read the medical image using the learning data refined in the supervised learning step as input; the machine learning takes the expert's readings of the medical images as input.
  • The reading-record supervised learning step may include: a medical record data loading step that reads data from the file location addresses of the readings; a labeling step that classifies each reading, by body part, into findings, conclusions, and recommendations; a feature extraction step; a feature matrix generation step; a feature analysis step that analyzes features by mapping onto the feature matrix; and a refined data generation step that generates refined learning data from only the analyzed features.
  • The medical record data loading step reads the file location, the total number of files, the length of each file, or a combination thereof from the file location address of the original reading data given as an input value, and loads the files into system memory.
  • The data loaded into system memory is labeled by reading section for each body part, classified into respective sets, and rearranged in system memory. The medical term extraction step selectively extracts, from the plain text of each section, only the text matching a standard medical term data set including SNOMED-CT, ICD-11, LOINC, and KCD-10. The feature extraction step analyzes word type, narrative form, description frequency, or a combination thereof in the extracted medical terms, and extracts the features of terms related to the presence or absence of lesions, the location of lesions, the description of symptoms, the types of symptoms, or a combination thereof. The feature matrix generation step maps the extracted features onto newly input plain text as a set, creating a feature matrix in which terms of similar or identical meaning can be compared and analyzed. The feature analysis step, when an unrefined original reading is input, maps its descriptive plain text onto the feature matrix to extract, analyze, and classify the presence or absence of a lesion, the location of the lesion, the symptoms, and the type of disease. The refined data generation step then generates refined learning data from the data classified in the feature analysis step.
  • In the learning model generation step, learning is performed by a converged convolutional neural network that continuously takes the refined learning data from the experts' readings as input. Deep machine learning through this converged CNN reduces the amount of computation and improves accuracy, improving overall performance, and the weights are updated by backpropagation.
  • When a user reads the presence, location, and pathology of lesions in a medical image using the convolutional neural network, the system fuses the output of the convolutional neural network, obtained by analyzing the pixel information of the medical image, with the output obtained from the reading-record supervised learning model's analysis of the reading records for the same body part. This makes it possible to obtain readings more accurate than those of a conventional CNN, or to reduce the CNN's computational complexity, and to predict types of conditions that a conventional CNN could not identify.
  • FIG. 1 is a view showing the concept of a conventional medical image reading system.
  • FIG. 2 is a view showing the concept of a medical image reading system and method based on the generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
  • FIG. 3 is a view for explaining the configuration of a medical image reading system based on the generation of refined AI reinforcement learning data according to an embodiment of the present invention.
  • FIG. 4 is a block diagram showing the configuration of a reading-record supervised learning unit for generating refined AI reinforcement learning data according to an embodiment of the present invention.
  • FIG. 5 is a view showing a process of generating refined AI reinforcement learning data according to an embodiment of the present invention.
  • FIG. 6 is a conceptual diagram illustrating the configuration of a converged convolutional neural network (CCNN) based on the generation of refined AI reinforcement learning data according to an embodiment of the present invention.
  • CCNN: converged convolutional neural network
  • FIG. 7 is a flowchart of generating the feature matrix of the reading-record supervised learning model according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of extracting disease-related feature values from an arbitrary reading in the reading-record supervised learning model according to an embodiment of the present invention.
  • FIG. 9 is a flowchart of a medical image reading process based on the generation of refined AI reinforcement learning data according to an embodiment of the present invention.
  • FIG. 1 shows the concept of a conventional medical image reading system, and FIG. 2 shows the concept of a medical image reading system and method based on the generation of refined AI reinforcement learning data according to an embodiment of the present invention.
  • As shown in FIG. 1, the conventional medical image reading system helps the doctor determine the presence of lesions by applying medical images, stored in local, hospital, or medical-institution databases, to a learning model that predicts lesions and produces a reading result.
  • As shown in FIG. 2, the present invention proposes a structure that performs reinforcement learning to upgrade the learning model using readings, i.e. the specialists' reading results for the medical images. A specialist composes a reading by interpreting a medical image, and the generated reading is used to improve the learning result of the system's learning model. In this way, the invention improves learning performance and reduces complexity in the process of generating the learning model that learns medical images.
  • FIG. 3 is a view for explaining the configuration of a medical image reading system through the generation of purified artificial intelligence enhanced learning data according to an embodiment of the present invention.
  • As shown in FIG. 3, the medical image reading system 10 may include a reading-record supervised learning unit 100, a medical image learning unit 200, a medical image database 300, and a reading database 400. The medical image database 300 and the reading database 400 may be configured as a single database.
  • A radiology specialist reads a medical image from the medical image database 300 and then writes a reading, stored in the reading database 400, for that image. The medical image is input to the medical image learning unit 200, which performs machine learning. The reading is input to the reading-record supervised learning unit 100, which extracts the characteristics of the reading and provides them to the medical image learning unit 200, improving its learning performance (computation amount and complexity) for the medical image. In other words, the reading-record supervised learning unit 100 provides the refined learning data, extracted from the specialists' readings, that the medical image learning unit 200 requires when learning the medical images, thereby improving the learning performance for medical images.
  • FIG. 4 is a block diagram showing the configuration of a reading-record supervised learning unit for generating refined AI reinforcement learning data according to an embodiment of the present invention.
  • As shown in FIG. 4, the reading-record supervised learning unit 100 includes a reference feature matrix extractor 110, which comprises: a medical record data loading unit 111 that reads data from the file location addresses of the readings; a labeling processor 112 that classifies each reading, by body part, into sections containing findings, conclusions, and recommendations, and labels the disease-related words or phrases from the plain text of each section into a set; a feature extractor 113 that extracts disease-related words or phrases from the labeled readings and derives common features from them; and a feature matrix generator 114 that generates a feature matrix by regularizing the extracted features. The resulting feature matrix is stored in a database and can be used to analyze features by mapping incoming readings onto it.
  • The reading-record supervised learning unit 100 further comprises a supervised learning feature analysis unit 120, which, given an arbitrary reading, maps it onto the feature matrix and analyzes its features, and a refined data generation unit 130, which generates refined learning data from only the analyzed features.
  • The medical record data loading unit 111 reads the file position, the total number of files, the length of each file, or a combination thereof from the file location address of the original reading data given as an input value, and loads the files into system memory or auxiliary memory; the loaded readings are then used for the supervised learning. The labeling processor 112 labels the data loaded in system or auxiliary memory by reading section for each body part, classifies the sections into sets, and rearranges them in system or auxiliary memory.
  • The feature extractor 113 selectively extracts, from the plain text of each section, only the text matching a standard medical term data set including SNOMED-CT, ICD-11, LOINC, and KCD-10; it analyzes word type, description form, description frequency, or a combination thereof in the extracted medical terms, and extracts the features of terms related to the presence or absence of lesions, the location of lesions, the description of symptoms, the types of conditions, or a combination thereof.
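The vocabulary-filtering part of this step can be illustrated as below. The term set is a tiny hand-made stand-in; in practice it would be SNOMED-CT, ICD-11, LOINC, or KCD-10, and the tokenizer would be far more sophisticated.

```python
# Toy illustration of the vocabulary filter: keep only tokens that appear
# in a standard medical term set, and count description frequency per term
# (one of the features analyzed above).
import re
from collections import Counter

STANDARD_TERMS = {"nodule", "opacity", "effusion", "pneumonia", "lobe"}

def extract_medical_terms(plain_text: str) -> Counter:
    tokens = re.findall(r"[a-z]+", plain_text.lower())
    kept = [t for t in tokens if t in STANDARD_TERMS]
    return Counter(kept)

print(extract_medical_terms(
    "Small nodule and faint opacity in the right lower lobe. No effusion."))
```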
  • The feature matrix generator 114 generates a feature matrix by mapping the features extracted by the feature extractor 113 onto newly input plain text, so that terms of similar or identical meaning can be compared and analyzed.
  • When an unrefined original reading is input, the supervised learning feature analysis unit 120 applies it to the supervised learning model, maps it onto the feature matrix, and extracts, analyzes, and classifies the presence or absence of lesions, their location, symptoms, and types of conditions from the plain text describing the reading. The refined data generation unit 130 then generates the refined learning data from the data extracted, analyzed, and classified by the supervised learning feature analysis unit 120; this refined learning data serves as additional information for improving medical image reading performance.
  • In performing supervised learning, the supervised learning feature analysis unit 120 first learns the reading of any newly input medical image against the well-defined feature matrix for that image, determining whether the reading matches the feature matrix. Disease-related words or phrases are extracted from the new reading through natural language processing (NLP) and input to the supervised learning model, from which the features are extracted.
  • NLP: natural language processing
  • When any new reading is input, it is classified by comparison with the existing feature matrix. This may be done either by re-running the functions of the reference feature matrix extractor 110, that is, the processes of the medical record data loading unit 111, the labeling processor 112, the feature extractor 113, and the feature matrix generator 114, or by inputting the reading into the supervised learning model and classifying it there.
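Classifying a new reading's terms against an existing feature matrix, while tolerating "similar or identical" lexical expressions, might look like the sketch below. The feature vocabulary and the similarity cutoff are illustrative assumptions.

```python
# Hedged sketch: classify a term from a new reading against an existing
# feature matrix, accepting near matches via difflib's string similarity.
import difflib

FEATURE_VOCAB = {
    "lesion_presence": ["nodule", "mass", "lesion"],
    "lesion_location": ["upper lobe", "lower lobe", "lingula"],
    "symptom_type":    ["consolidation", "effusion"],
}

def classify_term(term: str, cutoff: float = 0.8):
    """Return (feature_class, canonical_term) for the closest match, if any."""
    for feature_class, vocab in FEATURE_VOCAB.items():
        match = difflib.get_close_matches(term, vocab, n=1, cutoff=cutoff)
        if match:
            return feature_class, match[0]
    return None

print(classify_term("nodules"))  # plural form still maps to "nodule"
print(classify_term("heart"))    # no close match in the vocabulary
```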
  • FIG. 5 is a view showing a process of generating refined AI reinforcement learning data according to an embodiment of the present invention.
  • As shown in FIG. 5, a radiologist's reading R_k is input, and sets are generated that extract only the content corresponding to findings (F), conclusions (C), and recommendations (R) from the readings; each set is labeled, and the sets are distinguished by body part (BP).
  • FM_1, ..., FM_x are the feature matrices extracted from the labeled F, C, and R sets of the readings.
  • In the feature matrix of the X-th feature, all k synonyms, expressions, and vocabulary items a_1 to a_k representing the same disease or condition converge to (are dominated by) a single representative entry A of the overall feature matrix.
  • The feature matrix consists of metrics indicating disease name, location expression, and severity.
  • The generated result RR_x is the output obtained by refining the X-th row of data R_x through the feature matrix.
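The convergence of synonyms a_1 ... a_k onto one representative entry A, turning a raw row R_x into a refined row RR_x, can be sketched as follows. The synonym table is a hypothetical example.

```python
# Sketch of FIG. 5's mapping: every known synonym for a disease or
# condition converges to one canonical feature-matrix entry, so a raw
# reading row R_x becomes a refined row RR_x.

SYNONYM_TO_CANONICAL = {
    "pulmonary nodule": "A_nodule",
    "nodular density":  "A_nodule",
    "coin lesion":      "A_nodule",
    "pleural fluid":    "A_effusion",
    "pleural effusion": "A_effusion",
}

def refine_row(raw_reading: str):
    """R_x -> RR_x: replace every known synonym by its canonical feature."""
    text = raw_reading.lower()
    return sorted({canon for syn, canon in SYNONYM_TO_CANONICAL.items()
                   if syn in text})

print(refine_row("Coin lesion in the right lung with trace pleural fluid."))
```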
  • FIG. 6 is a conceptual diagram illustrating the configuration of a converged convolutional neural network (CCNN) based on the generation of refined AI reinforcement learning data according to an embodiment of the present invention.
  • As shown in FIG. 6, a converged convolutional neural network can be constructed. From the plain text describing the reading in the refined AI data, the presence or absence of lesions, their location, symptoms, and types of disease are extracted, analyzed, and classified, so that the convolutional stages concentrate prediction on the relevant parts while the remaining parts add relatively little computational load. That is, by learning precisely in the regions where lesions exist, and learning differently according to the symptoms or type of lesion, the network reduces complexity and improves learning performance compared with learning at the same intensity over the entire input image.
  • After the reading for the medical image is decoded, the medical records are analyzed according to the supervised learning model: information on the presence or absence of lesions, their location, symptoms, and types of symptoms is extracted and classified, and a convolutional neural network fused with the supervised learning model is constructed. The fused convolutional neural network has a plurality of convolutional layers, and at each layer a customized convolution is performed using the result of the supervised learning. In this way, the present invention can create a learning model with further improved performance.
  • The radiologists' readings are labeled by findings, conclusions, and recommendations for each body part; the features are extracted, and the feature matrix is generated and stored. A feature matrix is likewise extracted from new medical image readings and mapped against the stored feature matrix to derive refined learning data. Segmenting this feature matrix yields information on the presence or absence of lesions, their location, symptoms, and symptom types, from which a converged CNN can be constructed and trained.
  • The converged convolutional network according to the present invention is configured so that its weights are updated by backpropagation reflecting the specialist's evaluation of the results. The reading result at the output unit is propagated back through the hidden and convolutional layers, correcting the weights to enable more accurate reading.
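This weight-correction loop can be illustrated with a toy gradient step, where the specialist's evaluation plays the role of the target label. A single linear unit stands in for the network, and the learning rate is an arbitrary choice.

```python
# Toy backpropagation: the error between the network's output and the
# specialist-evaluated target is propagated back to correct the weights.

def backprop_step(weights, inputs, target, lr=0.1):
    """One gradient-descent step on squared error for a linear unit."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = prediction - target
    grad = [2 * error * x for x in inputs]
    return [w - lr * g for w, g in zip(weights, grad)], error

weights = [0.5, -0.2]
inputs = [1.0, 2.0]
target = 1.0  # specialist-evaluated label

for _ in range(20):
    weights, error = backprop_step(weights, inputs, target)
print(error)  # error shrinks toward 0 as the weights are corrected
```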
  • FIG. 7 is a flowchart of generating the feature matrix of the reading-record supervised learning model according to an embodiment of the present invention.
  • As shown in FIG. 7, the feature matrix generation process of the supervised learning model first loads the DICOM metadata and the reading of the medical image from the network or local storage (S110). Next, the body part field included in the DICOM metadata of the loaded image is extracted (S120). The reading is then labeled by findings, conclusions, and recommendations, and the plain text is inserted into each corresponding set (S130).
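Steps S110 to S130 can be walked through with plain dictionaries standing in for real DICOM headers and readings (the `BodyPartExamined` key mirrors the DICOM attribute of that name; a library such as pydicom would expose it on actual files).

```python
# Hedged walk-through of S110-S130: for each study, take the body part
# field from the metadata (S120) and file the labeled section texts into
# per-body-part sets (S130). The studies list is illustrative data.

studies = [  # S110: "loaded" metadata and readings
    {"meta": {"BodyPartExamined": "CHEST"},
     "reading": {"findings": "nodule", "conclusions": "benign",
                 "recommendations": "follow-up ct"}},
    {"meta": {"BodyPartExamined": "ABDOMEN"},
     "reading": {"findings": "normal", "conclusions": "no lesion",
                 "recommendations": "none"}},
]

def build_labeled_sets(studies):
    sets_by_part = {}
    for study in studies:
        part = study["meta"]["BodyPartExamined"]  # S120
        sections = sets_by_part.setdefault(
            part, {"findings": [], "conclusions": [], "recommendations": []})
        for name, text in study["reading"].items():  # S130
            sections[name].append(text)
    return sets_by_part

labeled = build_labeled_sets(studies)
print(labeled["CHEST"]["findings"])
```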
  • FIG. 8 is a flowchart of extracting disease-related feature values from an arbitrary reading in the reading-record supervised learning model according to an embodiment of the present invention.
  • As shown in FIG. 8, the process of extracting disease-related feature values from an arbitrary reading first loads the DICOM metadata and the reading of the medical image from the network or local storage (S210). The body part field included in the DICOM metadata is then extracted (S220). The reading is labeled by findings, conclusions, and recommendations, and the plain text is extracted for each section (S230). The extracted plain text is mapped onto the elements of the feature matrix for the same body part (S240), and the text data containing similar or identical terms or lexical expressions is extracted (S250). The extracted data can then be analyzed to obtain information on the presence or absence of lesions, their location, symptoms, and symptom types, from which a converged CNN can be constructed and trained.
  • FIG. 9 is a flowchart of the medical image reading process through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
  • In the medical image reading process through generation of refined artificial intelligence reinforcement learning data, reading-record supervised learning is first performed to generate refined learning data in a normalized form extracted from the reports of medical image reading experts (S310).
  • Next, machine learning is performed to read the medical image, taking as input the learning data refined in the reading-record supervised learning step (S320).
  • When a user reads the presence or absence of a lesion, the location of the lesion, the type of disease, and the like from the medical image using the convolutional neural network, the output value of the convolutional neural network is produced by analyzing the pixel information of the medical image.
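A minimal sketch of how the image CNN's output could be fused with the report model's output for the same body part. The convex-combination weighting and the three-class layout are assumptions for illustration; the patent does not fix a specific fusion formula.

```python
# Hypothetical fusion of two class-probability vectors: one from the pixel
# CNN, one from the reading-record supervised learning model. alpha is an
# assumed mixing weight, not a value specified by the patent.
def fuse_outputs(cnn_probs, report_probs, alpha=0.7):
    assert len(cnn_probs) == len(report_probs)
    fused = [alpha * c + (1 - alpha) * r for c, r in zip(cnn_probs, report_probs)]
    total = sum(fused)
    return [p / total for p in fused]  # renormalize to a probability vector

cnn_out = [0.6, 0.3, 0.1]      # e.g. P(no lesion), P(benign), P(malignant)
report_out = [0.2, 0.5, 0.3]
fused = fuse_outputs(cnn_out, report_out)
```

Even this simple scheme shows the intended effect: the report-derived distribution shifts probability mass between classes that pixel analysis alone rates similarly.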


Abstract

The present invention relates to a system and method for interpreting medical images through generation of refined artificial intelligence reinforcement learning data. The system extracts reinforcement learning data for medical image interpretation from the interpretation reports of medical image interpretation experts and uses it as artificial intelligence learning data, thereby decreasing the computational cost and complexity of medical image interpretation using artificial intelligence and improving its accuracy.

Description

Medical image reading system and method through generation of refined artificial intelligence reinforcement learning data
The present invention relates to a medical image reading system and method through generation of refined artificial intelligence reinforcement learning data, and more particularly, to a system and method that extract reinforcement learning data for medical image reading from the reports of medical image reading experts and use them as artificial intelligence learning data, thereby reducing the computational cost and complexity of artificial-intelligence-based medical image reading and improving its accuracy.
Many of the medical devices used to diagnose human diseases acquire and output medical images. The image obtained by scanning a part of the human body is processed to determine whether a lesion is present.
A doctor's skill is often measured by the ability to read lesions from medical images without missing any. Because accurate identification of lesions and early detection of disease are critical to raising the likelihood of treating and curing an illness, accurate interpretation of medical images is of paramount importance.
However, finding every lesion across the many fields of medical imaging is no easy task, and it is very difficult to accurately detect lesions in medical images without long experience in a particular specialty. Moreover, the reality is that specialists with long experience in a given field are not common.
This problem has long been recognized, and many have sought to solve it. A representative approach is computer-aided diagnosis (CAD), in which the scanned medical image is digitized and processed by computer, and a rule-based system or expert system is then constructed through mathematical modeling based on the features of the objects in the image.
The image analysis technology used here has evolved from pattern recognition by image processing to prediction of lesions through machine learning: features are extracted from the image, the image is vectorized using the extracted features, and various machine learning classification techniques are applied. Recently, deep learning methods have become the mainstream.
The main tasks in analyzing medical images include classification of images, detection of objects, segmentation of object boundaries, and registration of different images. As a means of addressing these tasks, the convolutional neural network (CNN), which is optimized for processing images as input, is the most widely used.
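The convolution operation at the heart of a CNN can be illustrated in plain Python (like most deep learning frameworks, this computes cross-correlation; the image patch and edge filter below are toy values for illustration):

```python
# Minimal "valid" 2D convolution of an image patch with a small filter.
def conv2d_valid(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Sum of element-wise products over the kernel window.
            out[i][j] = sum(
                image[i + u][j + v] * kernel[u][v]
                for u in range(kh) for v in range(kw)
            )
    return out

# A vertical-edge filter applied to a patch containing an intensity step.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[1, -1], [1, -1], [1, -1]]
result = conv2d_valid(patch, edge_kernel)
```

The filter responds only where the intensity changes, which is why stacks of learned filters of this kind can localize lesion boundaries in medical images.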
Most learning methods for medical images belong to the category of supervised learning, in which a CNN learns the functional relationship between the input and the correct answer, using input data and ground-truth data as training data. However, deep-learning-based artificial intelligence requires a very large amount of training data, so refining the training data is a very important issue. In other words, securing a large amount of high-quality training data and ensuring accurate reading performance are of the utmost importance.
Therefore, the present invention extracts reinforcement learning data for medical image reading from the reports of medical image reading experts and uses it as artificial intelligence training data, thereby presenting a medical image reading system and method through generation of refined artificial intelligence reinforcement learning data that can reduce the computational cost and complexity of artificial-intelligence medical image reading and improve its accuracy.
More specifically, the present invention presents the structure and method of a supervised learning model that generates a normalized form from the report text of reading experts, the structure and generation of refined training data extracted from a large number of valid reports, and a medical image reading system and method that can improve the performance of a conventional CNN by applying the refined training data to the CNN.
Next, the prior art existing in the technical field of the present invention is briefly described, followed by the technical features that the present invention seeks to achieve in distinction from that prior art.
First, Korean Patent Publication No. 10-2017-0140757 (December 21, 2017) relates to a clinical decision support ensemble system and a clinical decision support method using the same: a system and method that integrate the clinical prediction results of a patient, obtained through machine learning from a plurality of external medical institutions, and perform ensemble prediction to predict not only the current state of the patient but also the future progression of the patient's disease, thereby supporting prompt and accurate clinical decision-making by medical personnel.
In addition, Korean Patent Publication No. 10-2015-0108701 (September 30, 2015) relates to a system and method for visualizing anatomical elements in a medical image, which automatically classifies the anatomical elements contained in a medical image by verifying them with anatomical context information and visualizes the classified elements in a user-friendly manner.
Meanwhile, Korean Patent Publication No. 10-2015-0098119 (August 27, 2015) relates to a system and method for removing false-positive lesion candidates in a medical image, which removes false-positive lesion candidates by verifying the lesion candidates detected in the medical image with anatomical context information.
The above prior art performs ensemble prediction by integrating clinical prediction results, uses anatomical context information in medical images, or removes false-positive lesion candidates; however, none of it presents, as the present invention does, the refinement of massive training data for deep learning and, through it, the reduction of the computational cost and complexity and the improvement of the accuracy of a learning model for artificial-intelligence medical image reading.
The present invention was created to solve the above problems, and an object of the present invention is to provide a medical image reading system and method through generation of refined artificial intelligence reinforcement learning data that extract reinforcement learning data for medical image reading from the reports of medical image reading experts (radiologists) and use them as artificial intelligence learning data, thereby reducing the computational cost and complexity of artificial-intelligence medical image reading and improving its accuracy.
Another object of the present invention is, in order to improve the performance of the medical image reading system, to generate a medical report supervised learning model that can produce refined learning data by normalizing the texts contained in the reports of well-trained radiologists, so as to improve the learning effect obtained from images and to identify not only the presence and location of lesions but also the various types of disease that can appear in the same body part.
Another object of the present invention is to provide the structure and generation method of refined learning data with a new structure that trains the reports verified through the generated medical report supervised learning model, extracts refined data, and reinforcement-trains it together with the medical images, thereby improving the learning effect and enabling identification of the type of disease.
Another object of the present invention is to build a medical report supervised learning model that extracts, from the refined learning data having the new data structure, filters capable of identifying the type of disease and makes them interact with a convolutional neural network, thereby constituting a converged convolutional neural network with a new structure that improves the reading performance of the existing convolutional neural network and can identify the type of disease.
In order to achieve the above objects, a medical image reading system through generation of refined artificial intelligence reinforcement learning data according to a preferred embodiment of the present invention includes a reading-record supervised learning unit that generates refined learning data in a normalized form extracted from the reports of medical image reading experts, and a learning model generation unit that performs machine learning to read the medical image, taking as input the learning data refined by the reading-record supervised learning unit, wherein the machine learning continuously receives updated refined learning data from the reports of the medical image reading experts, so that the learning data automatically becomes reinforcement learning data.
The reading-record supervised learning unit includes: a medical record data loading unit that reads data from the file location address of the report; a labeling unit that classifies the report by body part into sections including findings, conclusion, and recommendation, and labels disease-related words or phrases from the plain text of each section as a set; a feature extraction unit that extracts disease-related words or phrases from the labeled report and extracts common features from the extracted words or phrases; a feature matrix generation unit that regularizes the extracted features to generate a feature matrix; a feature analysis unit that, given an arbitrary report, maps the report onto the feature matrix and analyzes its features; and a refined data generation unit that generates refined learning data from the analyzed features alone.
The medical record data loading unit reads the file location, the total number of files, the file length, or a combination thereof from the file location address of the original report data given as an input value, and loads them into system memory. The labeling unit labels the data loaded in system memory by body part for each reading section, classifies it into the respective sets, and rearranges it in system memory. The feature extraction unit compares the plain text of each section with standard medical terminology data sets including SNOMED-CT, ICD-11, LOINC, and KCD-10, selectively extracts only the matching plain text, and analyzes the word type, description form, description frequency, or a combination thereof of the extracted medical terms to extract features of terms related to the presence or absence of lesions, features of terms related to the indication of lesion location, features of terms related to the description of symptoms, features of terms indicating the type of disease, or a combination thereof. The feature matrix generation unit uses the features extracted by the feature extraction unit as a data set and maps newly input plain text onto it, thereby generating a feature matrix with which terms of similar or identical meaning can be compared and analyzed. The feature analysis unit, when an unrefined original report is input, maps it onto the feature matrix and extracts, analyzes, and classifies the presence or absence of lesions, the location of lesions, symptoms, the type of disease, and the like from the plain text describing the report. The refined data generation unit generates refined learning data from the data extracted, analyzed, and classified by the feature analysis unit.
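The terminology-filtering step can be sketched as follows. The tiny vocabulary here merely stands in for the SNOMED-CT/ICD-11/LOINC/KCD-10 term sets named above, which are vastly larger in practice; the sample sentence and term list are illustrative assumptions.

```python
import re
from collections import Counter

# Illustrative stand-in for a standard medical terminology data set.
STANDARD_TERMS = {"nodule", "effusion", "consolidation", "pneumothorax"}

def extract_medical_terms(plain_text):
    """Keep only tokens found in the standard vocabulary and count their
    description frequency, one of the features the unit analyzes."""
    tokens = re.findall(r"[a-z]+", plain_text.lower())
    return Counter(t for t in tokens if t in STANDARD_TERMS)

freq = extract_medical_terms("No pneumothorax. Small nodule; nodule unchanged.")
```

Frequency is only one of the signals mentioned (word type and description form being others), but it shows how raw report text is reduced to a comparable numeric form.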
In the learning model generation unit, learning is performed by a converged convolutional neural network. The converged convolutional neural network continuously receives updated refined learning data from the reports of the medical image reading experts and reflects it in the generation of the learning model, thereby performing deep-learning machine learning to reduce the amount of computation, improve accuracy, and enhance overall performance.
In addition, the converged convolutional neural network is configured such that its weights are updated by backward propagation.
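Backward propagation updates each weight by the gradient of the loss with respect to that weight. A minimal single-unit illustration of such an update (gradient descent on squared error for one linear unit, not the patent's network):

```python
# One backward-propagation weight update for a single linear unit
# trained with loss 0.5 * (y_pred - y_true)^2.
def backprop_step(w, b, x, y_true, lr=0.1):
    y_pred = sum(wi * xi for wi, xi in zip(w, x)) + b
    err = y_pred - y_true                       # dLoss/dy_pred
    w_new = [wi - lr * err * xi for wi, xi in zip(w, x)]  # dLoss/dw_i = err * x_i
    b_new = b - lr * err                        # dLoss/db = err
    return w_new, b_new, 0.5 * err ** 2

w, b = [0.0, 0.0], 0.0
loss_first = None
for _ in range(50):
    w, b, loss = backprop_step(w, b, x=[1.0, 2.0], y_true=1.0)
    if loss_first is None:
        loss_first = loss
```

A full CNN applies the same chain-rule update layer by layer; the converged network described here additionally propagates gradients through the report-derived branch.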
Meanwhile, a medical image reading method through generation of refined artificial intelligence reinforcement learning data according to another embodiment of the present invention includes a reading-record supervised learning step of generating refined learning data in a normalized form extracted from the reports of medical image reading experts, and a learning model generation step of performing machine learning to read the medical image, taking as input the learning data refined in the reading-record supervised learning step, wherein the machine learning continuously receives updated refined learning data from the reports of the medical image reading experts, so that the learning data automatically becomes reinforcement learning data.
The reading-record supervised learning step includes: a medical record data loading step of reading data from the file location address of the report; a labeling step of classifying the report by body part into sections including findings, conclusion, and recommendation, and labeling disease-related words or phrases from the plain text of each section as a set; a medical terminology extraction step of extracting disease-related words or phrases from the labeled report; a feature extraction step of extracting common features from the extracted words or phrases; a feature matrix generation step of regularizing the extracted features to generate a feature matrix; a feature analysis step of, given an arbitrary report, mapping the report onto the feature matrix and analyzing its features; and a refined data generation step of generating refined learning data from the analyzed features alone.
The medical record data loading step reads the file location, the total number of files, the file length, or a combination thereof from the file location address of the original report data given as an input value, and loads them into system memory. The labeling step labels the data loaded in system memory by body part for each reading section, classifies it into the respective sets, and rearranges it in system memory. The medical terminology extraction step compares the plain text of each section with standard medical terminology data sets including SNOMED-CT, ICD-11, LOINC, and KCD-10, and selectively extracts only the matching plain text. The feature extraction step analyzes the word type, description form, description frequency, or a combination thereof of the medical terms extracted in the medical terminology extraction step, to extract features of terms related to the presence or absence of lesions, features of terms related to the indication of lesion location, features of terms related to the description of symptoms, features of terms indicating the type of disease, or a combination thereof. The feature matrix generation step uses the extracted features as a data set and maps newly input plain text onto it, thereby generating a feature matrix with which terms of similar or identical meaning can be compared and analyzed. The feature analysis step, when an unrefined original report is input, maps it onto the feature matrix and extracts, analyzes, and classifies the presence or absence of lesions, the location of lesions, symptoms, the type of disease, and the like from the plain text describing the report. The refined data generation step generates refined learning data from the data thus extracted, analyzed, and classified.
In the learning model generation step, learning is performed by a converged convolutional neural network. The converged convolutional neural network continuously receives updated refined learning data from the reports of the medical image reading experts and reflects it in the generation of the learning model, thereby performing deep-learning machine learning to reduce the amount of computation, improve accuracy, and enhance overall performance.
In addition, the learning model generation step is configured such that the weights are updated by backward propagation.
As described above, according to the medical image reading system and method through generation of refined artificial intelligence reinforcement learning data of the present invention, when a user reads the presence or absence of a lesion, the location of the lesion, the type of disease, and the like in a medical image using a convolutional neural network, the output value of the convolutional neural network obtained by analyzing the pixel information of the medical image is fused with the output value obtained by analyzing the reading records for the same body part, produced as a learning result by the medical report supervised learning model. This yields more accurate reading results than medical image reading using a conventional convolutional neural network, or reduces the computational complexity of the convolutional neural network, and further enables prediction of the type of disease, which could not be known through a conventional convolutional neural network.
FIG. 1 is a diagram showing the concept of a conventional medical image reading system.
FIG. 2 is a diagram showing the concept of a medical image reading system and method through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining the configuration of a medical image reading system through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
FIG. 4 is a block diagram showing the configuration of a medical report supervised learning unit for generating refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
FIG. 5 is a diagram showing the process of generating refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
FIG. 6 is a conceptual diagram showing the configuration of a converged convolutional neural network (CCNN) through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
FIG. 7 is a flowchart for generating the feature matrix of the medical report supervised learning model according to an embodiment of the present invention.
FIG. 8 is a flowchart of extracting disease-related feature values from an arbitrary report in the medical report supervised learning model according to an embodiment of the present invention.
FIG. 9 is a flowchart of the medical image reading process through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
Hereinafter, preferred embodiments of the medical image reading system and method through generation of refined artificial intelligence reinforcement learning data of the present invention will be described in detail with reference to the accompanying drawings. Like reference numerals in the drawings denote like elements. Specific structural and functional descriptions of the embodiments of the present invention are given only for the purpose of describing embodiments according to the present invention; unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be construed as having meanings consistent with their meanings in the context of the related art, and should not be construed in an ideal or excessively formal sense unless expressly so defined herein.
FIG. 1 is a diagram illustrating the concept of a conventional medical image interpretation system.
FIG. 2 is a diagram illustrating the concept of a medical image interpretation system and method through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
As shown in FIG. 1, a conventional medical image interpretation system helps a physician determine whether a lesion is present in a medical image. It generates a learning model from medical images stored in local, hospital, or medical-institution databases, and when a new medical image is input, applies the input image to the learning model and predicts a lesion from the interpretation result.
In this case, because the learning model is generated from a very large number of medical images, training takes a long time and accuracy suffers. To overcome this problem, the present invention proposes a structure that performs reinforcement learning to refine the learning model using reading reports, that is, the interpretation results that specialists produce for medical images.
As shown in FIG. 2, a specialist interprets a medical image and composes a reading report, and the generated report is used to improve the learning results of the learning model of the medical image interpretation system. In particular, in the present invention, the interpretation results contained in the report serve to improve learning performance and reduce complexity in the process of generating the learning model that learns the medical images.
Hereinafter, the detailed configuration of the medical image interpretation system according to an embodiment of the present invention will be discussed.
FIG. 3 is a diagram for explaining the configuration of a medical image interpretation system through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
As shown in FIG. 3, the medical image interpretation system 10 according to an embodiment of the present invention includes a report supervised learning unit 100, a medical image learning unit 200, a medical image database 300, and a report database 400. The medical image database 300 and the report database 400 may be configured as a single database.
A radiologist interprets a medical image 300 and then writes a reading report 400 for that image. The medical image is input to the medical image learning unit 200, which performs machine learning on it. The report 400 is input to the report supervised learning unit 100, which extracts the features of the report and provides them to the medical image learning unit 200, thereby improving the learning performance (computation amount, complexity) for the medical image.
Accordingly, in the medical image interpretation system 10 of the present invention, the report supervised learning unit 100 uses data extracted from the specialist's report on the medical image 300 to provide the refined learning data required by the medical image learning unit 200, which takes the medical image 300 as input and learns it, thereby improving the learning performance for medical images.
In this process, the radiologist's reports on the medical images are progressively refined and reflected, improving the learning performance for medical images.
Hereinafter, the configurations of the report supervised learning unit 100 and the medical image learning unit 200 will be described in detail.
FIG. 4 is a block diagram showing the configuration of a medical report supervised learning unit for generating refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
As shown in FIG. 4, the report supervised learning unit 100 includes a medical record data loading unit 111 that reads data from the file location address of a report; a labeling processing unit 112 that classifies the report, per body part, into sections including Findings, Conclusion, and Recommendation, and labels disease-related words or phrases from the plain text of each section into a set; a feature extraction unit 113 that extracts disease-related words or phrases from the labeled report and extracts common features from the extracted words or phrases; and a feature matrix generation unit 114 that regularizes the extracted features to generate a feature matrix. The resulting feature matrix is stored in a database, and the features of subsequently input reports can be analyzed by mapping them onto this feature matrix.
In addition, the report supervised learning unit 100 further includes a supervised learning feature analysis unit 120 that, given an arbitrary report, maps the report onto the feature matrix through supervised learning to analyze its features, and a refined data generation unit 130 that generates refined learning data from the analyzed features alone.
Here, the medical record data loading unit 111 reads the file location, the total number of files, the file length, or a combination thereof from the file location address of the original report data given as an input value, loads them into system memory or auxiliary memory, and uses the result for report supervised learning.
The labeling processing unit 112 labels the data loaded in the system memory or auxiliary memory by body part and by reading section, classifies it into respective sets, and rearranges it in the system memory or auxiliary memory.
The feature extraction unit 113 compares the plain text of each section against standard medical terminology data sets including SNOMED-CT, ICD-11, LOINC, and KCD-10 and selectively extracts only the matching plain text; it then analyzes word type, description form, description frequency, or a combination thereof from the extracted medical terms to extract features of terms related to the presence or absence of a lesion, features of terms related to indicating the location of a lesion, features of terms related to describing symptoms, features of terms indicating the type of condition, or a combination thereof.
The feature matrix generation unit 114 generates a feature matrix that, using the features extracted by the feature extraction unit 113 as a data set, can be mapped against newly input plain text to compare and analyze whether terms have similar or identical meanings.
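The terminology matching performed by the feature extraction unit 113 can be sketched as follows. The miniature term list, its feature categories, and the sample sentence are illustrative stand-ins only, not actual SNOMED-CT, ICD-11, LOINC, or KCD-10 entries.

```python
# Toy sketch of the feature extraction unit (113): match section plain text
# against a standard-terminology set and keep only disease-related terms.
# The term list below is a hypothetical stand-in for SNOMED-CT/ICD-11/etc.
STANDARD_TERMS = {
    "nodule": "lesion-presence",
    "mass": "lesion-presence",
    "right upper lobe": "lesion-location",
    "cough": "symptom",
    "pneumonia": "condition-type",
}

def extract_medical_terms(section_text: str) -> list[tuple[str, str]]:
    """Return (term, feature-category) pairs found in the plain text."""
    text = section_text.lower()
    hits = []
    for term, category in STANDARD_TERMS.items():
        if term in text:
            hits.append((term, category))
    return hits

findings = "A 1.2 cm nodule is seen in the right upper lobe."
print(extract_medical_terms(findings))
```

In a real system, substring matching would be replaced by proper tokenization and concept lookup against the full terminology data sets.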
Next, when an unrefined original report is input, the supervised learning feature analysis unit 120 applies it to the supervised learning model and maps it onto the feature matrix, thereby extracting, analyzing, and classifying the presence or absence of a lesion, the location of the lesion, the symptoms, the type of condition, and the like from the plain text describing the report.
The refined data generation unit 130 generates refined learning data from the data extracted, analyzed, and classified by the supervised learning feature analysis unit 120. The refined learning data corresponds to additional information for improving the performance of medical image interpretation.
In particular, the supervised learning feature analysis unit 120 performs supervised learning: based on a well-defined feature matrix for medical images, it learns the report of an arbitrary newly input medical image and classifies which feature matrix the input report corresponds to.
To this end, disease-related words or phrases are extracted from the report of the newly input medical image through an NLP (Natural Language Processing) module and input to the supervised learning model, which maps them against the feature matrix to extract refined learning data.
When an arbitrary new report is input, it is classified by comparison with the existing feature matrix. Since a feature matrix must first be extracted from the new report, it is possible either to perform, as-is, the functions of the reference feature matrix extraction unit 110, which comprises the processes of the medical record data loading unit 111, the labeling processing unit 112, the feature extraction unit 113, and the feature matrix generation unit 114, or to input the report into the supervised learning model for classification.
Preferably, both methods can be used in the present invention.
Hereinafter, the process of receiving training text data and generating refined learning data will be described.
FIG. 5 is a diagram showing a process of generating refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
As shown in FIG. 5, a radiologist's report Rk is first received, and a set containing only the content corresponding to Findings (F) is generated; likewise, a set containing only the content corresponding to Conclusion (C) and a set containing only the content corresponding to Recommendation (R) are generated and labeled. These sets need to be distinguished by body part (BP).
Features are then extracted from the labeled items. Here, FM1, …, FMx are the sets of feature metrics extracted from the F, C, and R sets of the labeled reports; that is, features are extracted and a metric is then constructed per feature. For example, the metric of the X-th feature maps the k synonyms, expressions, and vocabulary items a1 through ak that denote the same disease or symptom onto a single element A that converges, or dominates, them all, and the overall feature matrix consists of metrics indicating the disease name, location expression, and severity. The generated result is RRx, the output obtained when the X-th raw data Rx is refined through the feature matrix.
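The synonym-dominating feature metric described above can be sketched as a mapping from the k surface forms a1 … ak to one canonical element A. The vocabulary below is invented purely to illustrate the refinement of raw data Rx into RRx.

```python
# Sketch of one feature metric: k synonyms/expressions for the same disease
# are all mapped to (dominated by) one canonical element A. The vocabulary
# is hypothetical, chosen only to illustrate the refinement R_x -> RR_x.
FEATURE_METRIC = {
    # a_1 .. a_k                      -> A
    "pneumonia": "PNEUMONIA",
    "lung infection": "PNEUMONIA",
    "pulmonary inflammation": "PNEUMONIA",
}

def refine(raw_text: str) -> set[str]:
    """Map raw report text R_x onto canonical features, yielding RR_x."""
    text = raw_text.lower()
    return {canon for surface, canon in FEATURE_METRIC.items() if surface in text}

r_x = "Findings suggest a lung infection in the left lower lobe."
rr_x = refine(r_x)
print(rr_x)  # all synonymous surface forms converge on the same element
```

A full feature matrix would hold one such metric per feature (disease name, location expression, severity), applied per body part.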
FIG. 6 is a conceptual diagram showing the configuration of a converged convolutional neural network (CCNN) through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
As shown in FIG. 6, a converged convolutional neural network can be constructed by applying the refined artificial intelligence reinforcement learning data to a conventional convolutional neural network. For example, the presence or absence of a lesion, the location of the lesion, the symptoms, and the type of condition are extracted, analyzed, and classified from the plain text describing the report in the refined reinforcement learning data, so that the portions requiring fine-grained prediction are learned intensively in the convolution stage, while the remaining portions do not add relatively large computational complexity. That is, by learning precisely the regions where a lesion exists and learning differently according to the symptoms or lesion type, complexity is reduced and learning performance is improved compared with learning all input images at the same intensity.
When an arbitrary report is input, the report for the corresponding medical image is decoded, the medical record is analyzed according to the supervised learning model, and information on the presence or absence of a lesion, the location of the lesion, the symptoms, and the type of condition is extracted, analyzed, and classified, so that a convolutional neural network fused with the supervised learning model can be constructed and executed.
The converged convolutional neural network has a plurality of convolutional layers, and at each layer a customized convolution is performed using the results of the supervised learning. In this way, the present invention can generate a learning model with further improved performance. From the radiologist's interpretation of a medical image and its report, the content is labeled by Findings, Conclusion, and Recommendation for each body part, features are extracted, and a feature matrix is generated and stored; for the report of a new medical image, a feature matrix is likewise extracted and then mapped against the stored feature matrix to derive refined learning data. Analyzing this feature matrix makes it possible to extract information on the presence or absence of a lesion, the location of the lesion, the symptoms, and the type of condition. Through this, a converged CNN can be constructed and trained.
In addition, the converged convolutional network according to the present invention is configured so that its weights are updated by backward propagation that reflects the specialist's evaluation results. In updating the parameter weights at each stage of the CNN, the interpretation result at the output unit is propagated backward to the hidden layers and convolutional layers to correct the weights, enabling more accurate interpretation.
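The backward weight update described above can be illustrated, in heavily simplified form, with a single logistic unit trained by gradient descent. This is a generic backpropagation sketch under invented inputs, not the patent's specific network; the expert's evaluation is represented only by the target label y.

```python
import math

# Minimal backpropagation sketch: one logistic unit, one gradient step.
# The expert's evaluation enters as the target label y; the output error
# is propagated back to update the weights, as described for the CCNN.
def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def backprop_step(w: list[float], x: list[float], y: float, lr: float = 0.1):
    """Return updated weights after one backward pass (cross-entropy loss)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = sigmoid(z)                      # forward: predicted probability
    grad = [(p - y) * xi for xi in x]   # backward: dL/dw for each weight
    return [wi - lr * gi for wi, gi in zip(w, grad)]

w = [0.0, 0.0]
for _ in range(200):                    # repeated expert feedback
    w = backprop_step(w, [1.0, 0.5], 1.0)
print(sigmoid(sum(wi * xi for wi, xi in zip(w, [1.0, 0.5]))))  # approaches 1
```

In the actual CCNN, the same error signal would flow through every convolutional and hidden layer rather than a single unit.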
FIG. 7 is a flowchart for generating the feature matrix of the report supervised learning model according to an embodiment of the present invention.
As shown in FIG. 7, the feature matrix generation process of the supervised learning model first loads the DICOM metadata and report of a medical image from a network or local storage (S110). Next, the Body Part field included in the DICOM metadata of the loaded medical image is extracted (S120). Then, the report of the medical image is labeled by Findings, Conclusion, and Recommendation, and the plain texts are inserted into the respective sets (S130).
Then, if there are additional medical images (S140), the processes of S110 to S130 are repeated; otherwise, the standard medical terminology data sets are loaded (S150). For each labeled Findings, Conclusion, and Recommendation set, the elements of the set are sequentially mapped against the standard medical terminology data sets to extract disease-related words or vocabulary (S160).
Features are extracted from the extracted words and vocabulary through analysis of word type, description form, and description frequency (S170). Finally, a feature matrix whose elements are the extracted features is generated (S180).
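Steps S130 to S180 above can be sketched as follows. The section headers, terminology list, and sample report are hypothetical; a real implementation would take the body part from the DICOM Body Part field (S120) rather than the literal string used here.

```python
# Sketch of steps S130-S180: label a report into Findings / Conclusion /
# Recommendation sets, map each set against a terminology list, and build
# a feature matrix {body_part: {section: [terms]}}. All data is invented.
DISEASE_TERMS = ["nodule", "effusion", "follow-up ct"]

def label_sections(report: str) -> dict[str, str]:
    """Split a report whose sections start with 'FINDINGS:', etc. (S130)."""
    sections, current = {}, None
    for line in report.splitlines():
        head, _, rest = line.partition(":")
        if head.strip().upper() in ("FINDINGS", "CONCLUSION", "RECOMMENDATION"):
            current = head.strip().capitalize()
            sections[current] = rest.strip()
        elif current:
            sections[current] += " " + line.strip()
    return sections

def feature_matrix(body_part: str, report: str) -> dict:
    """Map each labeled section against the term list (S160-S180)."""
    sections = label_sections(report)
    return {body_part: {sec: [t for t in DISEASE_TERMS if t in text.lower()]
                        for sec, text in sections.items()}}

report = "FINDINGS: small nodule.\nCONCLUSION: benign nodule.\nRECOMMENDATION: follow-up CT in 6 months."
print(feature_matrix("CHEST", report))
```

The word-type, description-form, and description-frequency analysis of S170 is omitted here; only the labeling and term-mapping skeleton is shown.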
FIG. 8 is a flowchart of extracting disease-related feature values from an arbitrary report in the report supervised learning model according to an embodiment of the present invention.
As shown in FIG. 8, the process of extracting disease-related feature values from an arbitrary report according to an embodiment of the present invention first loads the DICOM metadata and report of a medical image from a network or local storage (S210). The Body Part field included in the DICOM metadata of the medical image is then extracted (S220).
The report of the medical image is labeled by Findings, Conclusion, and Recommendation, and plain text is extracted for each section (S230). The extracted plain text is mapped against the elements of the feature matrix of the same body part (S240). As a result of the feature matrix mapping, the data of the text in which similar or identical terms or lexical expressions exist is extracted (S250).
By analyzing the data extracted in this way, information on the presence or absence of a lesion, the location of the lesion, the symptoms, and the type of condition can be extracted. Through this, a converged CNN can be constructed and trained.
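Steps S240 and S250 — mapping extracted plain text against the stored feature matrix of the same body part and keeping the matching text — can be sketched as follows. The feature matrix contents and sentences are illustrative, not real clinical data.

```python
# Sketch of steps S240-S250: map extracted plain text against the stored
# feature matrix of the same body part and keep the text whose terms match.
# The feature matrix contents below are invented for illustration.
FEATURE_MATRIX = {
    "CHEST": {
        "lesion-presence": {"nodule", "mass", "opacity"},
        "condition-type": {"pneumonia", "tuberculosis"},
    },
}

def map_report(body_part: str, sentences: list[str]) -> dict[str, list[str]]:
    """Return, per feature, the sentences containing a matching term."""
    matrix = FEATURE_MATRIX.get(body_part, {})
    out: dict[str, list[str]] = {}
    for feature, terms in matrix.items():
        hits = [s for s in sentences if any(t in s.lower() for t in terms)]
        if hits:
            out[feature] = hits
    return out

sents = ["An ill-defined opacity is noted.", "No evidence of tuberculosis."]
print(map_report("CHEST", sents))
```

Note that this keyword sketch does not handle negation ("no evidence of …"); a real analysis unit would also classify such phrases as absence findings.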
FIG. 9 is a flowchart of a medical image interpretation process through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention.
Referring to FIG. 9, the medical image interpretation process through generation of refined artificial intelligence reinforcement learning data according to an embodiment of the present invention first performs report supervised learning, which generates refined learning data in a normalized form extracted from the reports of a medical image interpretation expert (S310).
Machine learning is then performed so as to interpret the medical image, taking the learning data refined in the report supervised learning step as input (S320).
By continuously updating and receiving the refined learning data from the reports of the medical image interpretation expert and reflecting it in the generation of the learning model, deep learning machine learning is performed through the converged convolutional neural network, reducing the amount of computation and improving accuracy, and thereby improving overall performance (S330).
According to the present invention, when a user interprets the presence or absence of a lesion, the location of a lesion, the type of condition, and the like in a medical image using a convolutional neural network, the output of the convolutional neural network obtained by analyzing the pixel information of the medical image is fused with the output obtained by analyzing the reading records for the same body part, computed as the learning result of the report supervised learning model. This yields interpretation results more accurate than those obtained with a conventional convolutional neural network, or lowers the computational complexity of the convolutional neural network, and can even predict types of conditions that could not be determined through a conventional convolutional neural network.
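The fusion described above — combining the pixel-based CNN output with the report-based supervised-model output for the same body part — can be sketched as a simple weighted combination. The class names, probabilities, and fixed fusion weight are illustrative assumptions; the patent does not prescribe a particular fusion rule.

```python
# Sketch of fusing the CNN's pixel-based class probabilities with the
# report-model's output for the same body part. Class names, scores, and
# the fixed fusion weight alpha are hypothetical.
def fuse(cnn_probs: dict[str, float], report_probs: dict[str, float],
         alpha: float = 0.7) -> dict[str, float]:
    """Weighted fusion; alpha weights the image branch."""
    classes = set(cnn_probs) | set(report_probs)
    return {c: alpha * cnn_probs.get(c, 0.0) + (1 - alpha) * report_probs.get(c, 0.0)
            for c in classes}

cnn_out = {"nodule": 0.62, "normal": 0.38}
report_out = {"nodule": 0.90, "normal": 0.10}
fused = fuse(cnn_out, report_out)
print(max(fused, key=fused.get))
```

Because the report branch can carry classes the image branch does not score, such a fusion can also surface condition types that the conventional CNN alone would miss.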

Claims (10)

  1. A medical image interpretation system through generation of refined artificial intelligence reinforcement learning data, comprising:
    a report supervised learning unit for generating refined learning data in a normalized form extracted from a report of a medical image interpretation expert; and
    a learning model generation unit for performing machine learning to interpret the medical image, taking the learning data refined by the report supervised learning unit as input,
    wherein the machine learning continuously updates and receives, as input, the learning data refined from the report of the medical image interpretation expert, whereby the learning data automatically becomes reinforcement learning data.
  2. The system according to claim 1,
    wherein the report supervised learning unit comprises:
    a medical record data loading unit for reading data from a file location address of the report;
    a labeling processing unit for classifying the report, per body part, into sections including Findings, Conclusion, and Recommendation, and labeling disease-related words or phrases from the plain text of each section into a set;
    a feature extraction unit for extracting disease-related words or phrases from the labeled report and extracting common features from the extracted words or phrases;
    a feature matrix generation unit for regularizing the extracted features to generate a feature matrix;
    a feature analysis unit for, given an arbitrary report, mapping the report onto the feature matrix to analyze its features; and
    a refined data generation unit for generating refined learning data from the analyzed features alone.
  3. The system according to claim 2,
    wherein the medical record data loading unit reads the file location, the total number of files, the file length, or a combination thereof from the file location address of the original report data given as an input value and loads them into system memory,
    wherein the labeling processing unit labels the data loaded in the system memory by body part and by reading section, classifies it into respective sets, and rearranges it in the system memory,
    wherein the feature extraction unit compares the plain text of each section against standard medical terminology data sets including SNOMED-CT, ICD-11, LOINC, and KCD-10, selectively extracting only the matching plain text, and analyzes word type, description form, description frequency, or a combination thereof from the medical terms extracted by a medical terminology extraction unit to extract features of terms related to the presence or absence of a lesion, features of terms related to indicating the location of a lesion, features of terms related to describing symptoms, features of terms indicating the type of condition, or a combination thereof,
    wherein the feature matrix generation unit generates a feature matrix that, using the features extracted by the feature extraction unit as a data set, can be mapped against newly input plain text to compare and analyze whether terms have similar or identical meanings,
    wherein, when an unrefined original report is input, the feature analysis unit maps it onto the feature matrix to extract, analyze, and classify the presence or absence of a lesion, the location of the lesion, the symptoms, the type of condition, and the like from the plain text describing the report, and
    wherein the refined data generation unit generates refined learning data from the data extracted, analyzed, and classified by the feature analysis unit.
  4. The system according to claim 1,
    wherein learning in the learning model generation unit is performed by a converged convolutional neural network, and the converged convolutional neural network continuously updates and receives, as input, the learning data refined from the report of the medical image interpretation expert and reflects it in the generation of the learning model, so that deep learning machine learning performed through the converged convolutional neural network reduces the amount of computation and improves accuracy, thereby improving overall performance.
  5. The system according to claim 4,
    wherein the learning model generation unit is configured so that its weights are updated by backward propagation.
  6. A medical image interpretation method through generation of refined artificial intelligence reinforcement learning data, comprising:
    a report supervised learning step of generating refined learning data in a normalized form extracted from a report of a medical image interpretation expert; and
    a learning model generation step of performing machine learning to interpret the medical image, taking the learning data refined in the report supervised learning step as input,
    wherein the machine learning continuously updates and receives, as input, the learning data refined from the report of the medical image interpretation expert, whereby the learning data automatically becomes reinforcement learning data.
  7. The method according to claim 6,
    wherein the read-report supervised learning step comprises:
    a medical record data loading step of reading data from a file location address of the reading;
    a labeling step of classifying the reading into sections including Findings, Conclusion, and Recommendation for each body part, and labeling disease-related words or phrases from the plain text of each section as one set;
    a medical term extraction step of extracting the disease-related words or phrases from the labeled reading;
    a feature extraction step of extracting common features from the extracted words or phrases;
    a feature matrix generation step of regularizing the extracted features to generate a Feature Matrix;
    a feature analysis step of, when an arbitrary reading is given, mapping the reading onto the feature matrix to analyze its features; and
    a refined data generation step of generating refined learning data from only the analyzed features.
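The first stages of the step sequence above can be loosely illustrated as follows. All names are hypothetical; `STANDARD_TERMS` is a tiny stand-in for a lookup against SNOMED-CT, ICD-11, or similar terminologies, not an actual terminology API.

```python
# Toy sketch of the claim-7 pipeline: section labeling followed by
# terminology-matched extraction of disease-related words.

STANDARD_TERMS = {"nodule", "effusion", "fracture"}  # stand-in for SNOMED-CT etc.

def label_sections(report: str) -> dict[str, str]:
    """Split a report into Findings/Conclusion/Recommendation sections."""
    sections = {}
    for line in report.splitlines():
        key, _, body = line.partition(":")
        sections[key.strip().lower()] = body.strip()
    return sections

def extract_terms(section_text: str) -> list[str]:
    """Keep only words that appear in the standard terminology set."""
    return [w.strip(".,").lower() for w in section_text.split()
            if w.strip(".,").lower() in STANDARD_TERMS]

report = ("Findings: small nodule noted.\n"
          "Conclusion: benign nodule.\n"
          "Recommendation: follow-up.")
sections = label_sections(report)
terms = extract_terms(sections["findings"])
```

The later feature extraction, matrix generation, and analysis steps would then operate on `terms` rather than on the raw plain text.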
  8. The method according to claim 7,
    wherein the medical record data loading step reads the file location, the total number of files, the file length, or a combination thereof from the file location address of the original reading data given as an input value, and loads them into system memory,
    the labeling step labels the data loaded in the system memory by reading section for each body part, classifies the data into respective sets, and rearranges them in the system memory,
    the medical term extraction step compares the plain text of each section with standard medical terminology data sets including SNOMED-CT, ICD-11, LOINC, and KCD-10, and selectively extracts only the matching plain text,
    the feature extraction step analyzes word type, description form, description frequency, or a combination thereof from the medical terms extracted in the medical term extraction step, and extracts features of terms related to the presence or absence of a lesion, features of terms related to indicating the location of a lesion, features of terms related to describing symptoms, features of terms indicating the type of condition, or a combination thereof,
    the feature matrix generation step generates a Feature Matrix enabling comparison and analysis of whether terms have similar or identical meanings, by mapping newly input plain text against the features extracted in the feature extraction step as a data set,
    the feature analysis step, when an unrefined original reading is input, maps it onto the feature matrix to extract, analyze, and classify, from the plain text describing the reading, the presence or absence of a lesion, the location of the lesion, the symptoms, and the type of condition, and
    the refined data generation step generates refined learning data from the data extracted, analyzed, and classified in the feature analysis step.
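One simplified way to read the feature-matrix mapping described above is as term-presence vectorization: a new reading is mapped to a vector over the extracted feature terms, which can then be compared against stored vectors. The `FEATURES` list below is a hypothetical stand-in for the features produced by the feature extraction step.

```python
# Toy feature-matrix mapping (claim 8): a reading is projected onto a
# fixed list of feature terms as a binary presence vector.

FEATURES = ["nodule", "left", "right", "pain", "fracture"]  # hypothetical

def to_vector(reading: str) -> list[int]:
    """Map a plain-text reading onto the feature axes."""
    words = {w.strip(".,").lower() for w in reading.split()}
    return [1 if f in words else 0 for f in FEATURES]

v = to_vector("Nodule in the right lung.")
```

Rows of such vectors, one per reading, form a feature matrix in which similar or identical terms land on the same axes, enabling the comparison the claim describes.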
  9. The method according to claim 6,
    wherein the learning model generation step performs learning by a Converged Convolutional Neural Network, and the Converged Convolutional Neural Network continuously receives, as updated input, learning data refined from the readings of the medical image interpretation expert and reflects it in generating the learning model, so that deep learning machine learning performed through the converged convolutional neural network reduces the amount of computation and improves accuracy, thereby improving overall performance.
  10. The method according to claim 9,
    wherein the learning model generation step is configured such that its weights are updated by backpropagation.
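Claims 5 and 10 specify weight updates by backpropagation. For a single linear weight this reduces to plain gradient descent on a squared error, sketched here as a toy example (not the patented network):

```python
# Minimal backpropagation-style weight update: w <- w - lr * dL/dw
# for a one-weight model pred = w * x with loss (pred - y)^2.

def backprop_step(w: float, x: float, y: float, lr: float = 0.1) -> float:
    pred = w * x
    grad = 2 * (pred - y) * x  # d/dw of (w*x - y)^2
    return w - lr * grad

w = 0.0
for _ in range(50):
    w = backprop_step(w, x=1.0, y=2.0)  # converges toward y/x = 2.0
```

In a full convolutional network the same rule is applied layer by layer, with gradients propagated backward from the output loss through each weight tensor.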
PCT/KR2018/005641 2018-02-26 2018-05-17 System for interpreting medical image through generation of refined artificial intelligence reinforcement learning data, and method therefor WO2019164064A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020180022735A KR102153920B1 (en) 2018-02-26 2018-02-26 System and method for interpreting medical images through the generation of refined artificial intelligence reinforcement learning data
KR10-2018-0022735 2018-02-26

Publications (1)

Publication Number Publication Date
WO2019164064A1 true WO2019164064A1 (en) 2019-08-29

Family

ID=67687142

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2018/005641 WO2019164064A1 (en) 2018-02-26 2018-05-17 System for interpreting medical image through generation of refined artificial intelligence reinforcement learning data, and method therefor

Country Status (2)

Country Link
KR (1) KR102153920B1 (en)
WO (1) WO2019164064A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102140402B1 (en) * 2019-09-05 2020-08-03 주식회사 루닛 Apparatus for quality managment of medical image interpretation usnig machine learning, and method thereof
KR102325555B1 (en) * 2019-11-27 2021-11-12 주식회사 에프앤디파트너스 Medical image automatic doctor's recommendation device
KR102183310B1 (en) * 2020-03-02 2020-11-26 국민대학교산학협력단 Deep learning-based professional image interpretation device and method through expertise transplant
KR102240932B1 (en) * 2020-03-23 2021-04-15 한승호 Method, apparatus, and system of managing oral health data
KR102365287B1 (en) * 2020-03-31 2022-02-18 인제대학교 산학협력단 Method and system for automatically writing obtained brain MRI image techniques
KR102426091B1 (en) * 2020-06-26 2022-07-29 고려대학교 산학협력단 System for Refining Pathology Report through Ontology Database Based Deep Learning
KR102213924B1 (en) * 2020-07-15 2021-02-08 주식회사 루닛 Apparatus for quality managment of medical image interpretation usnig machine learning, and method thereof
KR102480134B1 (en) * 2020-07-15 2022-12-22 주식회사 루닛 Apparatus for quality managment of medical image interpretation usnig machine learning, and method thereof
KR102516820B1 (en) 2020-11-19 2023-04-04 주식회사 테렌즈 3d convolutional neural network for detection of alzheimer's disease
KR102516868B1 (en) 2020-11-19 2023-04-04 주식회사 테렌즈 3d convolutional neural network for detection of parkinson's disease
KR102476957B1 (en) * 2020-12-11 2022-12-12 가천대학교 산학협력단 Apparatus and method for providing hologram based on medical image
KR102507315B1 (en) * 2021-01-19 2023-03-08 주식회사 루닛 Apparatus for quality management of medical image interpretation using machine learning, and method thereof
KR102326740B1 (en) * 2021-04-30 2021-11-17 (주)제이엘케이 Method and apparatus for implementing automatic evolution platform through automatic machine learning
WO2022260292A1 (en) * 2021-06-11 2022-12-15 주식회사 라인웍스 Cancer pathology report data extraction method, and system and program for implementing same
WO2023205177A1 (en) * 2022-04-19 2023-10-26 Synthesis Health Inc. Combining natural language understanding and image segmentation to intelligently populate text reports

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110075920A1 (en) * 2009-09-14 2011-03-31 Siemens Medical Solutions Usa, Inc. Multi-Level Contextual Learning of Data
KR20140042531A (en) * 2012-09-28 2014-04-07 삼성전자주식회사 Apparatus and method for diagnosing lesion using categorized diagnosis model
KR20150098119A (en) * 2014-02-19 2015-08-27 삼성전자주식회사 System and method for removing false positive lesion candidate in medical image
KR20160066481A (en) * 2014-11-29 2016-06-10 주식회사 인피니트헬스케어 Method of intelligently searching medical image and medical information
KR20160096460A (en) * 2015-02-05 2016-08-16 삼성전자주식회사 Recognition system based on deep learning including a plurality of classfier and control method thereof

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111104595A (en) * 2019-12-16 2020-05-05 华中科技大学 Deep reinforcement learning interactive recommendation method and system based on text information
CN111104595B (en) * 2019-12-16 2023-04-07 华中科技大学 Deep reinforcement learning interactive recommendation method and system based on text information
CN112190269A (en) * 2020-12-04 2021-01-08 兰州大学 Construction method of depression auxiliary identification model based on multi-source electroencephalogram data fusion
CN112190269B (en) * 2020-12-04 2024-03-12 兰州大学 Depression auxiliary identification model construction method based on multisource brain electric data fusion
CN116543918A (en) * 2023-07-04 2023-08-04 武汉大学人民医院(湖北省人民医院) Method and device for extracting multi-mode disease features
CN116543918B (en) * 2023-07-04 2023-09-22 武汉大学人民医院(湖北省人民医院) Method and device for extracting multi-mode disease features

Also Published As

Publication number Publication date
KR102153920B1 (en) 2020-09-09
KR20190102399A (en) 2019-09-04

Similar Documents

Publication Publication Date Title
WO2019164064A1 (en) System for interpreting medical image through generation of refined artificial intelligence reinforcement learning data, and method therefor
US20240203599A1 (en) Method and system of for predicting disease risk based on multimodal fusion
US11610678B2 (en) Medical diagnostic aid and method
WO2017022908A1 (en) Method and program for bone age calculation using deep neural networks
US11244755B1 (en) Automatic generation of medical imaging reports based on fine grained finding labels
Carchiolo et al. Medical prescription classification: a NLP-based approach
CN111477320B (en) Treatment effect prediction model construction system, treatment effect prediction system and terminal
EP3557584A1 (en) Artificial intelligence querying for radiology reports in medical imaging
US20210375488A1 (en) System and methods for automatic medical knowledge curation
CN112151183A (en) Entity identification method of Chinese electronic medical record based on Lattice LSTM model
CN113707307A (en) Disease analysis method and device, electronic equipment and storage medium
CN111275118A (en) Chest film multi-label classification method based on self-correction type label generation network
CN115859914A (en) Diagnosis ICD automatic coding method and system based on medical history semantic understanding
CN116383413B (en) Knowledge graph updating method and system based on medical data extraction
Zhao et al. Exploiting classification correlations for the extraction of evidence-based practice information
CN114242194A (en) Natural language processing device and method for medical image diagnosis report based on artificial intelligence
Waheeb et al. An efficient sentiment analysis based deep learning classification model to evaluate treatment quality
WO2024090712A1 (en) Artificial intelligence chatting system for psychotherapy through empathy
WO2024005413A1 (en) Artificial intelligence-based method and device for extracting information from electronic document
Sidnyaev et al. Formal grammar theory in recognition methods of unknown objects
CN109859813B (en) Entity modifier recognition method and device
CN113658688B (en) Clinical decision support method based on word segmentation-free deep learning
US11809826B2 (en) Assertion detection in multi-labelled clinical text using scope localization
CN112309519B (en) Electronic medical record medication structured processing system based on multiple models
Cui et al. Intelligent recommendation for departments based on medical knowledge graph

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18906826

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18906826

Country of ref document: EP

Kind code of ref document: A1