KR101718159B1

KR101718159B1 - Method for Extracting Region of Interest Value From Medical Image and Computer Readable Recording Medium Recorded with Program for Performing the Same Method

Info

Publication number: KR101718159B1
Application number: KR1020140056536A
Authority: KR
Inventors: 이영한; 서진석
Original assignee: 연세대학교 산학협력단
Priority date: 2014-05-12
Filing date: 2014-05-12
Publication date: 2017-03-22
Also published as: KR20150129922A

Abstract

The present invention relates to a method for automatically extracting region-of-interest (ROI) values from a medical image provided by a medical image storage system (PACS), and to a computer-readable recording medium on which a program for performing the method is recorded. Because the method extracts ROI values from the medical image automatically, efficiency and accuracy are greatly improved compared with extracting the ROI values by hand. The method is broadly accessible because it uses open-source optical character recognition (OCR) software and an open-source macro program. The present invention is expected to be applied to PACS and to greatly improve the efficiency of such systems.

Description


The present invention relates to a method for extracting region-of-interest (ROI) values from a medical image and to a computer-readable recording medium on which a program for performing the method is recorded.

In conventional film-based radiology, setting a region of interest (ROI) was limited to the CT console or an image-reconstruction workstation. As the medical image storage system (PACS; Picture Archiving and Communicating System) has come into use in digital radiology, demand has grown for setting ROIs on the PACS viewer for a variety of purposes. When an ROI is set, its statistical values, including the area, mean, standard deviation, maximum (max), and minimum (min), are displayed on the screen (see FIG. 1). The radiologist reads the values on the PACS screen and types them into the radiology report as needed. Sometimes, especially for research purposes, radiologists must set a very large number of ROIs and enter the associated values into a spreadsheet by hand. This manual process is repetitive and tedious, and typographical errors may occur along the way.

Optical character recognition (OCR) is the mechanical or electronic conversion of images of typed, scanned, or printed text into digitally encoded text [1]. An OCR engine can convert image values into text data and is widely used for pattern recognition and character recognition. In the digitized radiology environment, applications of OCR in radiology have been reported [2-4].

Numerous papers and patent documents are referenced throughout this specification, and their citations are indicated. The disclosures of the cited papers and patent documents are incorporated herein by reference in their entirety, so that the state of the art to which the present invention pertains and the content of the present invention are described more clearly.

Korean Patent No. 10-1287382
Korean Patent No. 10-1099446
U.S. Published Patent Application No. 2013/0251233 A1

1. Optical character recognition. Wikipedia. Available at http://en.wikipedia.org/wiki/Optical_character_recognition. Accessed August 24, 2013.
2. Cook TS, Zimmerman S, Maidment AD, Kim W, Boonn WW. J Am Coll Radiol 2010; 7:871-877.
3. Li X, Zhang D, Liu B. AJR Am J Roentgenol 2011; 196:W781-783.
4. Lee YH, Song HT, Suh JS. Journal of digital imaging 2012; 25:815-818.
5. AutoHotkey Web site. Available at http://www.autohotkey.com. Accessed August 24, 2013.
6. GOCR Web site. Available at http://jocr.sourceforge.net/api. Accessed August 24, 2013.
7. Goyal N, Jain N, Rachapalli V. Clin Radiol 2009; 64:119-126.
8. Harisinghani MG, Blake MA, Saksena M, et al. Radiographics 2004; 24:615-627.

The present inventors sought to develop a method for automatically extracting the statistical values of a region of interest (ROI) from a medical image provided by a medical image storage system (PACS; Picture Archiving and Communicating System). As a result, they successfully developed a method in which the area of a PACS image where the ROI values are displayed is captured as a graphic file, the characters and numerals in the captured image are converted to text by optical character recognition (OCR), and the statistical values of the ROI are effectively extracted from the converted text file, thereby completing the present invention.

Accordingly, an object of the present invention is to provide a method for extracting ROI values from a medical image.

Another object of the present invention is to provide a computer-readable recording medium on which a program for automatically performing the method is recorded.

The objects and advantages of the present invention will become more apparent from the following detailed description, claims, and drawings.

According to one aspect, the present invention provides a method for extracting region-of-interest (ROI) values from a medical image, comprising the following steps:

(a) capturing, as a graphic file, the area of an image of a medical image storage system (PACS; Picture Archiving and Communicating System) in which the ROI values are displayed;

(b) converting the ROI values in the captured graphic file into a text file using optical character recognition (OCR) software;

(c) correcting errors in the text file produced by the OCR;

(d) extracting the statistical values of the ROI from the corrected text file;

(e) assembling the extracted statistical values into a temporary string; and

(f) copying the temporary string into a spreadsheet.

The present invention is described in detail below, step by step.

Step (a): capturing, as a graphic file, the area of a PACS (Picture Archiving and Communicating System) image in which the ROI values are displayed

As used herein, the term "medical image storage system (PACS; Picture Archiving and Communicating System)" refers to a system that handles the digital acquisition, storage, transmission, and retrieval of medical images in an integrated manner. More specifically, it refers to a system that converts medical images taken by various devices into digital images and stores them in mass storage so that they can be transmitted and retrieved, allowing radiology specialists to read them on a monitor.

In the present invention, the PACS image is preferably, but not limited to, an X-ray image, a CT (computed tomography) image, an MRI (magnetic resonance imaging) image, a PET (positron emission tomography) image, or a SPECT (single photon emission computed tomography) image.

As used herein, the term "region of interest (ROI)" refers to an area set in a displayed medical image for a particular medical purpose. For example, to measure the size of a tumor, it is the area bounded by an outline enclosing the tumor in the image in which the tumor was captured.

As used herein, the term "ROI value" refers to a specific value representing information about the region of interest, preferably a statistical value of the ROI such as the area, average, standard deviation, maximum (max), or minimum (min).

The area containing the ROI values in the image displayed on the PACS viewer is captured and temporarily stored as a graphic file.

As used herein, the term "capture" means setting aside the needed portion of the original picture or image information for editing or storage. That is, it means selecting part of the information shown on the screen for editing or storage, such as conversion into a text string or a graphic file.

The capture can be performed using any of various freely available, non-commercial open-source programs.

The stored graphic file must contain the ROI values that are subsequently converted into a text file by optical character recognition (OCR). The graphic file is not limited to a specific format; any file format that OCR can recognize and convert to text may be used.

Step (b): converting the ROI values in the captured graphic file into a text file using optical character recognition (OCR) software

The OCR software converts the ROI values in the image of the captured graphic file into a text file. The converted text file may be stored in a temporary folder.

Step (c): correcting errors in the text file produced by the OCR

Errors in the text file produced by the OCR are corrected. Here, an error is a value that was misrecognized when the OCR converted the ROI values to text and therefore appears as a character or number in an incorrect form. Typical error patterns are segmentation errors, extra-space errors, and errors in which a digit is recognized as a letter. A segmentation error is one in which the recognized characters are split incorrectly, for example "1,441" recognized as "1 ,44 1"; it can be corrected by removing the spaces between the digits or characters. An extra-space error arises when the space between recognized characters is recognized as wider than it is; it can be corrected by replacing an unnecessary double space with a single space or by removing the space. A digit-as-letter error occurs when a digit is misrecognized as a similarly shaped letter, for example "8" recognized as "B".

In the present invention, the OCR recognition errors are preferably corrected by a macro program.

As used herein, the term "macro program" has the same meaning as "macro generating program" and refers to a program in which a sequence of related instructions is treated as a single macro instruction and executed as one statement.
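The spacing corrections described for step (c) can be illustrated with a short Python sketch. This is not the AutoHotkey macro used by the inventors; the function name `fix_spacing` is hypothetical, and the sketch assumes that every numeric value is preceded by a text label, so that a space between two digit-like characters is always a segmentation artifact rather than a boundary between two values.

```python
import re


def fix_spacing(raw: str) -> str:
    """Repair extra-space and segmentation errors in one line of OCR text.

    Hypothetical helper for illustration; assumes each number follows a label.
    """
    text = raw.strip()
    # Extra-space error: collapse any run of whitespace to a single space.
    text = re.sub(r"\s+", " ", text)
    # Segmentation error: remove a space that has a digit, comma, or period
    # on both sides, because it splits what should be one number.
    text = re.sub(r"(?<=[\d,.]) (?=[\d,.])", "", text)
    return text


if __name__ == "__main__":
    print(fix_spacing("1 ,44 1"))
    print(fix_spacing("Average  1 ,88 5.8   SD 67.9"))
```

The order of the two substitutions matters: collapsing whitespace first means the second pattern only has to deal with single spaces.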

Step (d): extracting the statistical values of the ROI from the corrected text file

The statistical values of the ROI are separated out and extracted from the string in the corrected text file. The extracted statistical values preferably include, but are not limited to, the area, average, standard deviation, maximum (max), and minimum (min).

Step (e): assembling the extracted statistical values into a temporary string

The extracted statistical values of the ROI (area, average, standard deviation, maximum, and minimum) are assembled, in that order, into a temporary string. Preferably, the values are separated from one another by tabs.

Step (f): copying the temporary string into a spreadsheet

To export the temporary string to a spreadsheet in Excel (Microsoft), the string is copied and then pasted into the spreadsheet.

According to another aspect, the present invention provides a computer-readable recording medium on which is recorded a program for extracting ROI values from a medical image, the program comprising the same steps as the method above.

The steps of the program recorded on the computer-readable recording medium are the same as those described above for the method of extracting ROI values from a PACS image, which is the other aspect of the present invention, and the description is therefore not repeated.

The features and advantages of the present invention are summarized as follows:

(i) The present invention relates to a method for automatically extracting ROI values from a medical image provided by a medical image storage system (PACS), and to a computer-readable recording medium on which a program for performing the method is recorded.

(ii) Because the method extracts ROI values from the medical image automatically, efficiency and accuracy are greatly improved compared with extracting the ROI values by hand.

(iii) The method is broadly accessible because it uses open-source optical character recognition (OCR) software and an open-source macro program.

(iv) The present invention is expected to be applied to PACS and to greatly improve the efficiency of such systems.

FIG. 1 shows PACS screenshots. Panel A of FIG. 1 shows ROI values as displayed on the PACS screen; in the large image, the ROI values overlap the background signal. Panel B of FIG. 1 shows that, after zooming out, the ROI values are displayed without background.
FIG. 2 is a flowchart of the ROI extraction module. PACS: Picture Archiving and Communicating System; ROI: region of interest; OCR: optical character recognition.
FIG. 3 shows the OCR results for an ROI. Panel A of FIG. 3 is the captured image of the ROI values. Panel B of FIG. 3 is the raw, unprocessed text from the OCR. Panel C of FIG. 3 is the result after error correction and space removal on the raw text.
FIG. 4 shows a spreadsheet in Excel. The extracted ROI values were successfully inserted into the spreadsheet with appropriate cell separation.

Hereinafter, the present invention is described in more detail by way of an example. The example serves only to illustrate the invention more concretely, and it will be apparent to those of ordinary skill in the art that the scope of the invention, in accordance with its gist, is not limited by the example.

Example

Tools and Methods

1. Hardware and Software

The PACS workstation (DB-P600, Samsung, Korea) was configured with a 3.3-GHz i5-2500 processor (Intel, Santa Clara, CA), 4096 MB of RAM (random access memory), an internal video graphics array, HD Graphics (Intel), and an external video graphics array, NVIDIA Quadro FX 1400 (NVIDIA, Santa Clara, CA), and used a 5-megapixel flat-panel LCD (liquid-crystal display) monitor (Totoku Electric, Japan). The PACS software was the Windows-based PACS viewer Centricity® PACS RA1000 (GE Healthcare, Barrington, IL). The operating system (OS) was Microsoft Windows 7. The main module was designed using AutoHotkey (version 1.0.48); this open-source macro program can be downloaded from its official web page (http://www.autohotkey.com) [5]. GOCR version 0.49 was chosen for the OCR function; this open-source OCR software can be downloaded from its official web page (http://jocr.sourceforge.net/api) [6].

2. Main Module for ROI Value Extraction

A single press of a predefined key (keystroke) activates the main module, which converts the ROI values on the screen into ROI values in text form. The conversion process in detail is as follows (see FIG. 2).

(i) Capturing the area of the ROI values as a graphic file:

For OCR, a screenshot of the OCR target area containing the ROI values was saved as a JPG-format graphic file in a temporary folder on the local drive (see FIG. 1). To keep the screenshot efficient, the module was designed so that OCR operates only within the specific area where the ROI values are expected to appear.

(ii) OCR of the captured image:

The OCR engine loaded the JPG-format graphic file (see panel A of FIG. 3) and converted the characters of the ROI values in the image file into a text file (see panel B of FIG. 3). The converted text file was saved in the temporary folder.
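As a rough illustration of this step, the Python sketch below calls the GOCR command-line program on a captured image and reads back the text it writes. It is not the inventors' AutoHotkey module. The function names and file names are hypothetical, only the `-i` (input image) and `-o` (output file) options of `gocr` are used, and whether a given GOCR build can read a JPG file directly (it may rely on external conversion helpers) is an assumption that should be checked on the target system.

```python
import shutil
import subprocess


def gocr_command(image_path: str, text_path: str) -> list:
    """Build the gocr call: -i names the input image, -o the output text file."""
    return ["gocr", "-i", image_path, "-o", text_path]


def run_gocr(image_path: str, text_path: str) -> str:
    """Run gocr on a captured image and return the raw recognized text.

    Hypothetical wrapper; assumes the gocr executable is installed and on PATH.
    """
    if shutil.which("gocr") is None:
        raise FileNotFoundError("gocr executable not found on PATH")
    subprocess.run(gocr_command(image_path, text_path), check=True)
    # The raw text may contain misrecognized bytes, so decode leniently.
    with open(text_path, encoding="utf-8", errors="replace") as handle:
        return handle.read()


if __name__ == "__main__":
    # Placeholder file names for illustration only.
    print(gocr_command("roi_capture.jpg", "roi_capture.txt"))
```

Keeping command construction separate from execution makes the call easy to inspect without having GOCR installed.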

(iii) Correction of errors in the optically recognized ROI values:

Errors from the optical character recognition (OCR) process were corrected automatically by software written with macro string-replacement operations. In this module, the typical recognition error patterns of the OCR engine were segmentation errors, extra-space errors, and errors in which a digit is recognized as a letter. A segmentation error is, for example, "1,441" recognized as "1 ,44 1"; to correct such errors, the macro program removed the spaces between digits. Unnecessary double spaces were replaced by a single space or removed (see panel C of FIG. 3). Because the statistical values of an ROI are essentially numeric, cases in which a digit in an ROI value was recognized as a letter, for example "0" recognized as "O" or "o", "8" recognized as "B", or "7" recognized as "Z", were also corrected using the macro program.
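The digit-as-letter correction can be sketched in Python as follows. The mapping contains only the three confusions named above (0 read as O or o, 8 read as B, 7 read as Z). Restricting the replacement to tokens that already contain at least one digit is an assumption made here so that labels such as "Average" are left untouched; the source does not say how the AutoHotkey macro treated labels, and the function name is hypothetical.

```python
import re

# Look-alike letters mapped back to the digits named in the text.
LOOKALIKES = str.maketrans({"O": "0", "o": "0", "B": "8", "Z": "7"})


def fix_digit_lookalikes(text: str) -> str:
    """Replace look-alike letters, but only inside tokens that contain a digit."""

    def repair(match: re.Match) -> str:
        token = match.group(0)
        # Label tokens such as "Average" or "SD" contain no digit: keep them.
        if any(ch.isdigit() for ch in token):
            return token.translate(LOOKALIKES)
        return token

    return re.sub(r"\S+", repair, text)


if __name__ == "__main__":
    print(fix_digit_lookalikes("Max 2OB0 Min 1Z14"))
```

A token made up entirely of look-alike letters, such as "BO" for 80, would pass through unchanged under this rule, so a stricter variant would need to know which positions in the line are expected to hold numbers.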

(iv) Extraction of the statistical values of the ROI:

The statistical values of the ROI, including the area, average, standard deviation, maximum (max), and minimum (min), were separated from the OCR-recognized string. In the PACS viewer (Centricity, GE Healthcare), the format of the ROI values is A [area value], Average [average value], SD [standard deviation value], Max [maximum value], and Min [minimum value].
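A minimal Python sketch of this extraction, assuming the corrected text is a single line in which the five labels appear in the order given above, each followed by its value and separated by whitespace with no units. The exact on-screen layout is not specified in the source, so the pattern and the function name `parse_roi` are illustrative only.

```python
import re

# Labels in the order given for the Centricity viewer format.
LABELS = ("A", "Average", "SD", "Max", "Min")
# A number with optional sign, thousands commas, and decimal part.
NUMBER = r"(-?[\d,]*\.?\d+)"
PATTERN = re.compile(r"\s+".join(rf"\b{label}\s*{NUMBER}" for label in LABELS))


def parse_roi(text: str) -> dict:
    """Return the five ROI statistics from one corrected OCR line."""
    match = PATTERN.search(text)
    if match is None:
        raise ValueError("text does not match the assumed ROI format")
    # Strip thousands separators before converting to float.
    return {
        label: float(value.replace(",", ""))
        for label, value in zip(LABELS, match.groups())
    }


if __name__ == "__main__":
    print(parse_roi("A 101.9 Average 1885.8 SD 67.9 Max 2080 Min 1714"))
```

Raising an error on a mismatch, instead of returning partial values, fits the evaluation procedure described later, in which a failed recognition is retried after zooming out.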

(v) Building a tab-separated temporary string:

The temporary format consists of tab-separated values. An example of the predefined template for the temporary string is: 101.9 [Tab] 1885.8 [Tab] 67.9 [Tab] 2080 [Tab] 1714 [Tab]. Here, [Tab] denotes a tab character separating the values. Such a string can be pasted into an Excel (Microsoft) spreadsheet, where the tabs separate the data into cells.
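The template above can be reproduced with a few lines of Python. The sketch assumes the extracted values are held as strings in a dictionary keyed by the viewer's labels; the names `ORDER` and `to_tab_string` are hypothetical. Each value is followed by a tab, including the last one, to mirror the template.

```python
# Output order: area, average, standard deviation, maximum, minimum.
ORDER = ("A", "Average", "SD", "Max", "Min")


def to_tab_string(values: dict) -> str:
    """Join the ROI values in the fixed order, ending each with a tab."""
    return "".join(f"{values[key]}\t" for key in ORDER)


if __name__ == "__main__":
    example = {"A": "101.9", "Average": "1885.8", "SD": "67.9",
               "Max": "2080", "Min": "1714"}
    print(repr(to_tab_string(example)))
```

Keeping the values as strings avoids reformatting them (for example, turning 2080 into 2080.0) before they reach the spreadsheet.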

(vi) Copying the temporary string to the clipboard:

The temporary string is copied for export to the spreadsheet. By sending the [Control-V] key to paste, the string is pasted into the spreadsheet and split into cells (see FIG. 4).

3. Evaluation and Comparison of Program Accuracy

The accuracy of the OCR for the selected ROI values was evaluated on 1040 recognitions from 280 ROIs in magnetic resonance images obtained from PACS server data between March 2010 and December 2011. One ROI measurement comprises five ROI values: area, average, standard deviation, maximum (max), and minimum (min). The OCR-recognized ROI values obtained from the extraction module were compared with the ROI values confirmed visually by one radiologist. If a recognition error or failure occurred, the extraction was repeated after zooming out so that unnecessary background image content would not interfere with the OCR. For comparison with manual entry, the time taken to enter values into an Excel spreadsheet was compared between manual entry and entry assisted by the extraction module; one radiologist recorded the entry time for each method. The images on the PACS were zoomed out so that the ROI values were not obscured. As a reference, the accuracy of the results from the manual, visually read method was also evaluated. A total of 12 ROIs, as three sets of four ROIs, were compared on the PACS workstation.

Results

The ROI value extraction module ran successfully using OCR and the macro program. The five values (area, average, standard deviation, maximum, and minimum) could be recognized by OCR. In addition, the AutoHotkey-coded module extracted the ROI values, corrected the errors, and separated the individual ROI values with tabs. Accuracy was evaluated on 1040 recognitions from 280 ROIs in magnetic resonance images, and value extraction was 100% accurate for ROIs that did not overlap the background. The mean entry times for the conventional manual method and the extraction-module-assisted method were 34.97 seconds and 7.87 seconds, respectively. In this time-saving comparison test as well, the accuracy of the extraction module of the present invention was 100%.

고찰 Review

컴퓨터의 사용이 의학 분야 및 특히 영상의학 분야에서 증가되고 있다[7]. 반복업무에 의한 육체적 스트레스를 최소화하기 위해, 필름이 없는 영상의학이 언어 인식 소프트웨어 및 전자적 질문지 형식 등을 도입하였으나, 관심영역(ROI)의 설정 작업을 대체할 수는 없었다. 관심영역(ROI)의 설정은 영상의학 판독 및 연구에서 가장 지루하고 성가진 작업중에 하나이다. 또한, 매일 컴퓨터를 사용하는 사람들은 반복적인 물리적 스트레스에 기인한 근골결질환에 걸릴 수 있다는 것도 잘 알려져 있다[7, 8]. 관심영역(ROI)값들을 스크린상에서 육안으로 인식하여 이를 수동으로 입력하는 경우 인식 오류 및 오타에 의한 오류가 발생할 수 있는데, 이러한 오류는 매우 심각한 결과를 초래할 수 있다. 본 발명자들은 상업적으로 접근가능한 PACS 뷰가 임시 파일에 관심영역(ROI)을 저장하거나 관심영역(ROI) 수치들을 클립보드(clipboard)에 복사하는 유용한 기능을 갖추지 못하고 있는 지에 대해 의문을 가졌으며, 차기 PACS 세대에 이러한 유용한 기능들이 포함될 수 있는지를 검토하였다. 관심영역(ROI) 값들의 추출 모듈의 개발을 위해, 본 발명자들은 OCR을 위해 GOCR 및 매크로 기능을 위해 오토핫키(AutoHotkey)를 포함하는 오픈 소스 소프트웨어를 사용하였으며, 관심영역(ROI) 설정에 적용할 수 있는 이러한 모듈을 이용하여 오류 수정 기능을 갖는 관심영역(ROI) 값들의 자동화된 추출 방법을 발명하였다. 본 발명의 방법에 의해 정확성을 확보되었으며 반복되는 다수의 육안-확인 및 타이핑 작업이 최소화되었다. 우리의 정확도 실험에서 OCR의 정확도는 100％이었다. 관심영역(ROI)의 값들이 모두 숫자 값들이기 때문에, OCR의 인식 오류는 거의 없다고 생각된다. 변환된 관심영역(ROI)값들의 오류 수정은 매크로 프로그램으로 수행하였다. 분할 오류(segmentation error), 과공간(extra-spaces) 및 문자인 것으로 인식하는 오류를 포함하여 흔한 인식 오류 패턴은 성공적으로 보정되었다. 관심영역(ROI)값들의 포맷은 PACS 뷰어에 따라 달라질 수 있으며, 값들은 특정한 포맷으로 추출될 수 있다. 이러한 포맷은 관심영역(ROI) 포맷에 따라 변경될 수 있다. 그러나, 백그라운드 신호 강도의 존재에 따른 인식의 한계는 존재한다. 관심영역(ROI) 값들이 스크린상의 백그라운드에 의해 간섭되면, 인식 오류가 발생할 수 있다(도 1의 패널 A 및 B 참조), 본 발명의 연구에서, 백그라운드 신호에 기인하는 인식 오류는 영상(이미지)를 줌 아웃(zoom out)함에 의해 백그라운드 신호를 제거하여 보정되었다. 결론적으로, 관심영역(ROI) 값들의 간단하고 효율적인 추출 모듈은 오픈 소스 OCR 소프트웨어와 매크로 프로그램을 이용함으로서 확립할 수 있었다. 이러한 모듈과 개념은 다음 세대 PACS에 적용 가능할 것이다.
Computer use is increasing in medicine, and in radiology in particular [7]. To minimize the physical stress of repetitive work, filmless radiology has introduced speech recognition software and electronic questionnaire forms, but these could not replace the task of setting regions of interest (ROIs). Setting ROIs is one of the most tedious and annoying tasks in radiological reading and research. It is also well known that people who use computers every day can develop musculoskeletal disorders caused by repetitive physical stress [7, 8]. When ROI values are read from the screen by eye and entered manually, recognition errors and typing errors can occur, and such errors can have very serious consequences. The present inventors questioned why commercially available PACS viewers lack useful functions such as storing ROIs in a temporary file or copying ROI values to the clipboard, and examined whether such functions could be included in the next generation of PACS. To develop the extraction module for ROI values, the inventors used open-source software, namely GOCR for OCR and AutoHotkey for macro functions, and with this module, which can be applied to ROI setting, devised an automated method for extracting ROI values with error correction. The method of the present invention secured accuracy and minimized the repeated visual checks and typing. In our accuracy experiment, the accuracy of OCR was 100%. Because ROI values are all numeric, OCR recognition errors are thought to be rare. Errors in the converted ROI values were corrected by the macro program. Common recognition error patterns, including segmentation errors, extra spaces, and values misrecognized as letters, were successfully corrected.
The format of ROI values may vary with the PACS viewer, and the values can be extracted in a specific format, which can be adapted to the viewer's ROI format. However, recognition is limited by the presence of background signal intensity. If the ROI values on the screen are interfered with by the background, recognition errors may occur (see panels A and B of FIG. 1). In this study, recognition errors caused by the background signal were corrected by zooming out the image to remove the background signal. In conclusion, a simple and efficient extraction module for ROI values was established using open-source OCR software and a macro program. This module and concept should be applicable to next-generation PACS.
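The error patterns named in this discussion (segmentation errors, extra spaces, and digits read as letters) can be illustrated with a short Python sketch. This is not the inventors' AutoHotkey macro, whose rules the text does not list. The look-alike table `LOOKALIKES`, the function name `correct_ocr_number`, and the sample strings are assumptions made for illustration, and a real viewer would need rules fitted to its own overlay format.

```python
import re

# Assumed look-alike substitutions; the patent does not list its actual rules.
LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"}
)

# A plain decimal number: optional sign, digits, optional fractional part.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def correct_ocr_number(raw: str) -> str:
    """Clean one OCR-recognized numeric field and return it as text.

    Raises ValueError if the cleaned text is still not a plain number,
    so that a doubtful value is flagged instead of silently pasted.
    """
    # Digits misrecognized as letters are mapped back to digits.
    text = raw.translate(LOOKALIKES)
    # Extra spaces, including those that split one number into pieces,
    # are removed so the digits are rejoined.
    text = re.sub(r"\s+", "", text)
    if not _NUMBER.fullmatch(text):
        raise ValueError(f"unrecognized ROI value: {raw!r}")
    return text


print(correct_ocr_number("3 1O.4"))
print(correct_ocr_number(" l2.7 "))
```

Rejecting anything that still fails the numeric pattern reflects the point made above that wrong ROI values can have serious consequences: an unreadable field is better reported than guessed.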

이상으로 본 발명의 특정한 부분을 상세히 기술하였는 바, 당업계의 통상의 지식을 가진 자에게 있어서 이러한 구체적인 기술은 단지 바람직한 구현 예일 뿐이며, 이에 본 발명의 범위가 제한되는 것이 아닌 점은 명백하다. 따라서, 본 발명의 실질적인 범위는 첨부된 청구항과 그의 등가물에 의하여 정의된다고 할 것이다.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the same is by way of illustration and example only and is not to be construed as limiting the scope of the present invention. Accordingly, the actual scope of the present invention will be defined by the appended claims and their equivalents.

Claims

A method for extracting a region-of-interest (ROI) value from a medical image comprising the steps of:
(a) capturing a region containing region-of-interest (ROI) values from an image of a PACS (Picture Archiving and Communication System) into a graphic file, wherein the image of the PACS is an X-ray image, a computed tomography (CT) image, a magnetic resonance imaging (MRI) image, a positron emission tomography (PET) image, or a single photon emission computed tomography (SPECT) image, and wherein, when a recognition error of the ROI values occurs due to a background signal, the background signal is removed by zooming out the image;
(b) converting values of a region of interest (ROI) of the captured graphic file image into a text file using optical character recognition (OCR) software;
(c) correcting an error in the text file converted by the optical character recognition (OCR) software, wherein the error is a segmentation error, an extra-space error, or an error of recognizing a value as a letter;
(d) extracting statistical values of the ROI from the error-corrected text file, wherein the statistical values of the ROI include an area value, an average value, a standard deviation value, a maximum value, and a minimum value;
(e) constructing the extracted statistical values in a temporary string format; and
(f) copying the temporary string format into a spreadsheet.

delete

The method of claim 1, wherein the error correction in step (c) is performed by a macro program.

delete

The method of claim 1, wherein the temporary character string in step (e) is in the order of an area value, an average value, a standard deviation value, a maximum value, and a minimum value.

7. The method of claim 6, wherein the statistical values in the temporary string are separated from one another by tabs.

A computer-readable medium having recorded thereon a program for extracting a region of interest (ROI) value from a medical image comprising the steps of:
(a) capturing a region containing region-of-interest (ROI) values from an image of a PACS into a graphic file, wherein the image of the PACS is an X-ray image, a computed tomography (CT) image, a magnetic resonance imaging (MRI) image, a positron emission tomography (PET) image, or a single photon emission computed tomography (SPECT) image, and wherein, when a recognition error of the ROI values occurs due to a background signal, the background signal is removed by zooming out the image;
(b) converting values of a region of interest (ROI) of the captured graphic file image into a text file using optical character recognition (OCR) software;
(c) correcting an error in the text file converted by the optical character recognition (OCR) software, wherein the error is a segmentation error, an extra-space error, or an error of recognizing a value as a letter;
(d) extracting statistical values of the ROI from the error-corrected text file, wherein the statistical values of the ROI include an area value, an average value, a standard deviation value, a maximum value, and a minimum value;
(e) constructing the extracted statistical values in a temporary string format; and
(f) copying the temporary string format into a spreadsheet.

delete

9. The computer-readable recording medium of claim 8, on which a program for extracting a ROI value from a medical image is recorded, wherein the error correction in step (c) is performed by a macro program.

delete

9. The computer-readable recording medium of claim 8, on which a program for extracting a ROI value from a medical image is recorded, wherein the temporary string in step (e) is in the order of an area value, an average value, a standard deviation value, a maximum value, and a minimum value.

14. The computer-readable recording medium of claim 13, on which a program for extracting a ROI value from a medical image is recorded, wherein the statistical values in the temporary string are separated from one another by tabs.