KR102339714B1

KR102339714B1 - Apparatus, method and program for extraction EMF frequency bandwidth information in research literature

Info

Publication number: KR102339714B1
Application number: KR1020190143718A
Authority: KR
Inventors: 김의직; 이상우; 권정혁
Original assignee: 한림대학교 산학협력단
Priority date: 2019-11-11
Filing date: 2019-11-11
Publication date: 2021-12-14
Also published as: KR20210056814A

Abstract

An apparatus for extracting electromagnetic field frequency band information from research literature according to the present invention includes: a processor that generates a frequency band word list including conventional frequency words used when describing the frequency band of an electromagnetic field, extracts candidate electromagnetic field frequency band information from the research literature using the conventional frequency words, and selects, from among the candidate electromagnetic field frequency band information, the candidate that satisfies an electromagnetic field emission source condition as the electromagnetic field frequency band information of the electromagnetic field emission source; and a memory that stores the research literature.

Description

Apparatus, method and program for extraction EMF frequency bandwidth information in research literature

The present invention relates to an apparatus, method, and program for extracting electromagnetic field frequency band information from research literature, and more particularly, to an apparatus, method, and program that extract the electromagnetic field frequency band information of an electromagnetic field emission source studied in the research literature.

Recently, as the harmfulness of electromagnetic waves has become known, methods and materials for shielding against them have emerged. An electromagnetic wave is a wave having electromagnetic field components, and a wave that adversely affects the human body is called a harmful wave. In particular, the harmfulness to the human body of low-frequency waves with magnetic properties has recently drawn attention, and reports of a correlation between the magnetic field (60 Hz) around power transmission towers and cancer have caused a strong reaction at home and abroad.

Beyond the debate over risks such as cancer caused by electromagnetic waves, it is known that when the human body is exposed to low-frequency electromagnetic waves with magnetic properties for a long period, induced currents are generated in the body, causing an imbalance of ions such as Na+, K+, and Cl- present in the cell membrane, which affects hormone secretion and immune cells. Research results have also been reported that magnetic fields change the amount of melatonin secreted, which is related to sleep, so that long-term exposure is associated with insomnia and similar conditions. Accordingly, countries around the world set electromagnetic wave exposure limits and also use the regulation of electromagnetic waves emitted from electronic devices as an export barrier. For example, a monitor whose magnetic leakage is 2 mG or more cannot be exported to European regions such as Sweden.

As the harmfulness of electromagnetic waves and electromagnetic fields has gained importance, many research documents reporting studies of the risks that electromagnetic fields pose to the human body have been published. However, for such research literature, only brief information such as author names, title, publisher name, and publication date is provided, so research literature on human risk from electromagnetic fields cannot be searched and indexed effectively.

In particular, the electromagnetic field emission source that a research document mainly addresses, and the electromagnetic field frequency band of that source, can be identified only by reading the document in detail. It is therefore difficult to efficiently determine from the research literature the human risk associated with the electromagnetic field frequency band of an emission source.

The present invention can provide an apparatus, method, and program for extracting electromagnetic field frequency band information from research literature, capable of extracting the electromagnetic field frequency band information of an electromagnetic field emission source from the research literature.

The objects of the present invention are not limited to those mentioned above. Other objects and advantages not mentioned can be understood from the following description and will be understood more clearly from the embodiments of the present invention. It will also be readily apparent that the objects and advantages of the present invention can be realized by the means set forth in the claims and combinations thereof.

The processor may search the research literature for the conventional frequency word, search for numeric text adjacent to the found conventional frequency word, and extract the found conventional frequency word together with the found numeric text as the candidate electromagnetic field frequency band information.

The processor may identify a frequency unit corresponding to the conventional frequency word of the candidate electromagnetic field frequency band information and, based on the identified frequency unit, identify the electromagnetic field frequency band of the candidate electromagnetic field frequency band information.

When there are a plurality of pieces of candidate electromagnetic field frequency band information, the processor may determine that the candidates falling within a preset number, counted in descending order of electromagnetic field frequency band, satisfy the electromagnetic field emission source condition.

When the electromagnetic field frequency band of a piece of candidate electromagnetic field frequency band information falls within a preset frequency band, the processor may determine that the candidate satisfies the electromagnetic field emission source condition.

A method for extracting electromagnetic field frequency band information from research literature according to the present invention includes: generating, by a processor, a frequency band word list including conventional frequency words used when describing the frequency band of an electromagnetic field; extracting, by the processor, candidate electromagnetic field frequency band information from the research literature using the conventional frequency words; and selecting, by the processor, from among the candidate electromagnetic field frequency band information, the candidate that satisfies an electromagnetic field emission source condition as the electromagnetic field frequency band information of the electromagnetic field emission source.

The extracting of the candidate electromagnetic field frequency band information may include: searching, by the processor, the research literature for the conventional frequency word, searching for numeric text adjacent to the found conventional frequency word, and extracting the found conventional frequency word together with the found numeric text as the candidate electromagnetic field frequency band information.

The selecting as the electromagnetic field frequency band information of the electromagnetic field emission source may include: identifying, by the processor, a frequency unit corresponding to the conventional frequency word of the candidate electromagnetic field frequency band information and, based on the identified frequency unit, identifying the electromagnetic field frequency band of the candidate electromagnetic field frequency band information.

The selecting as the electromagnetic field frequency band information of the electromagnetic field emission source may include: determining, by the processor, when there are a plurality of pieces of candidate electromagnetic field frequency band information, that the candidates falling within a preset number, counted in descending order of electromagnetic field frequency band, satisfy the electromagnetic field emission source condition.

The selecting as the electromagnetic field frequency band information of the electromagnetic field emission source may include: determining, by the processor, when the electromagnetic field frequency band of a piece of candidate electromagnetic field frequency band information falls within a preset frequency band, that the candidate satisfies the electromagnetic field emission source condition.

A computer program according to the present invention may be combined with a computer, which is hardware, and stored in a computer-readable recording medium so as to perform the method for extracting electromagnetic field frequency band information from research literature.

According to the present invention, electromagnetic field frequency band information is extracted by searching the research literature for conventional frequency words and then searching for the numeric text adjacent to them, so that the electromagnetic field frequency band information of the electromagnetic field emission source addressed in the research literature can be extracted quickly and with high accuracy.

FIG. 1 is a diagram illustrating an apparatus for extracting electromagnetic field frequency band information from research literature and an electronic device according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the configuration of the apparatus for extracting electromagnetic field frequency band information from research literature according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining a process of extracting electromagnetic field frequency band information from research literature according to an embodiment of the present invention.
FIG. 4 is a flowchart of a method for extracting electromagnetic field frequency band information from research literature according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating an example of a research-purpose word list and a non-purpose word list according to another embodiment of the present invention.
FIG. 6 is a diagram illustrating an example of a research topic word list according to another embodiment of the present invention.
FIG. 7 is a diagram for explaining a weight calculation process according to another embodiment of the present invention.
FIG. 8 is a diagram illustrating an example of a category feature word list according to yet another embodiment of the present invention.
FIG. 9 is a diagram for explaining a process of evaluating the importance of research literature according to yet another embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of its scope. The present invention is defined only by the scope of the claims.

The terminology used herein is for describing the embodiments and is not intended to limit the present invention. In this specification, the singular form also includes the plural unless the phrase specifically states otherwise. As used herein, "comprises" and/or "comprising" does not exclude the presence or addition of one or more components other than those stated. Like reference numerals refer to like components throughout the specification, and "and/or" includes each of the stated components and every combination of one or more of them. Although "first", "second", and the like are used to describe various components, these components are of course not limited by these terms. The terms are used only to distinguish one component from another. Accordingly, a first component mentioned below may of course be a second component within the technical spirit of the present invention.

Unless otherwise defined, all terms used herein (including technical and scientific terms) have the meaning commonly understood by those of ordinary skill in the art to which the present invention pertains. In addition, terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are clearly and specifically defined.

The term "processor" as used herein refers to software or a hardware component such as an FPGA or ASIC, and a "processor" performs certain roles. However, "processor" is not limited to software or hardware. A "processor" may be configured to reside on an addressable storage medium. Thus, as an example, a "processor" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The functionality provided within the components and "processors" may be combined into a smaller number of components and "processors" or further separated into additional components and "processors".

Spatially relative terms such as "below", "beneath", "lower", "above", and "upper" may be used to easily describe the relationship between one component and other components as shown in the drawings. Spatially relative terms should be understood as covering different orientations of the components during use or operation in addition to the orientation shown in the drawings. For example, if a component shown in a drawing is turned over, a component described as "below" or "beneath" another component may be placed "above" that component. Accordingly, the exemplary term "below" can encompass both the below and above orientations. Components may also be oriented in other directions, and spatially relative terms may be interpreted according to the orientation.

In this specification, a computer means any kind of hardware device including at least one processor and, depending on the embodiment, may be understood to also encompass the software configuration operating on that hardware device. For example, a computer may be understood to include, without limitation, a smartphone, a tablet PC, a desktop, a notebook, and the user clients and applications running on each device.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Each step described in this specification is described as being performed by a processor, but the entity performing each step is not limited thereto, and depending on the embodiment, at least some of the steps may be performed on different devices.

FIG. 1 is a diagram illustrating an apparatus 10 for extracting electromagnetic field frequency band information from research literature (hereinafter, the "extraction apparatus") and an electronic device 20 according to an embodiment of the present invention.

Referring to FIG. 1, the extraction apparatus 10 may receive research literature from the electronic device 20 and perform text mining on it to extract the electromagnetic field frequency band information of the research literature.

Meanwhile, the extraction apparatus 10 may perform text mining on every sentence in the research literature, but, in consideration of computation speed and extraction efficiency, may instead perform text mining only on the abstract of the research literature.

Hereinafter, when the extraction apparatus 10 is described as calculating various values for the research literature or transforming its text, this may mean that the operation is performed on the abstract of the research literature rather than on the entire document.

Meanwhile, the research literature referred to in this specification is not limited by research category, author, or publisher, and may be any subset of all published research literature.

In an embodiment, the extraction apparatus 10 may receive from the electronic device 20 research literature containing research on human health risks associated with electromagnetic fields.

The electronic device 20 is a component for providing research literature to the extraction apparatus 10. The electronic device 20 according to the present invention may be implemented as a server or a smartphone, but this is only one embodiment. Besides a server or smartphone, it may be any one of a tablet PC (tablet personal computer), a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a PDA (personal digital assistant), a PMP (portable multimedia player), or a wearable device.

FIG. 2 is a diagram illustrating the configuration of the extraction apparatus 10 according to an embodiment of the present invention, and FIG. 3 is a diagram for explaining a process of extracting electromagnetic field frequency band information from research literature according to an embodiment of the present invention.

The extraction apparatus 10 according to an embodiment of the present invention may extract the electromagnetic field frequency band information of an electromagnetic field emission source from research literature.

To this end, the extraction apparatus 10 according to an embodiment of the present invention may include a processor 11 and a memory unit 12.

The processor 11 may preprocess the research literature in order to perform text mining on it.

Specifically, as a preprocessing step, the processor 11 may perform a fourth preprocessing step of tokenizing the words or sentences of the research literature.

In an embodiment, the processor 11 may implement the preprocessing described above using NLTK (Natural Language Toolkit), a Python library.
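
The tokenizing step can be illustrated with a minimal sketch. The description names NLTK, but this sketch swaps in the standard-library `re` module so that it runs without extra downloads. The function name `tokenize`, the two regular expressions, and the sample abstract are assumptions made for illustration and do not come from the patent.

```python
import re

# Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
# A decimal such as "2.45" is not split because no whitespace follows the dot.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

# Assumption: a token is a number, a run of letters, or a single other symbol.
TOKEN = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]+|[^\sA-Za-z\d]")


def tokenize(text: str) -> list[list[str]]:
    """Split text into sentences, then split each sentence into tokens."""
    sentences = [s for s in SENTENCE_END.split(text.strip()) if s]
    return [TOKEN.findall(s) for s in sentences]


# Hypothetical abstract text, used only to exercise the function.
abstract = "Rats were exposed to 900 MHz fields. No effect was seen at 50 Hz."
tokens = tokenize(abstract)
```

Keeping numbers and unit words as separate tokens makes the later search for a number next to a frequency word a simple neighbor lookup.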

Thereafter, the processor 11 may generate a frequency band word list including conventional frequency words used when describing the frequency band of an electromagnetic field.

Here, a conventional frequency word may be a word frequently used when describing a frequency band. For example, a conventional frequency word may be any one of "Hz", "MHz", and "GHz".

The processor 11 may extract candidate electromagnetic field frequency band information from the research literature using the conventional frequency words.

Specifically, as shown in FIG. 3, the processor 11 may search the research literature for a conventional frequency word, search for numeric text adjacent to the found conventional frequency word, and extract the found conventional frequency word together with the found numeric text as candidate electromagnetic field frequency band information.
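
The search described here can be sketched as follows, under stated assumptions. The word list `FREQUENCY_WORDS` uses the three examples given in the description. The reading of "adjacent" as "the token immediately before the unit", the function name `extract_candidates`, and the sample sentence are assumptions made for illustration.

```python
import re

# Word list built from the examples in the description.
FREQUENCY_WORDS = {"Hz", "MHz", "GHz"}

# Assumption: numeric text is an integer or a simple decimal.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def extract_candidates(tokens: list[str]) -> list[tuple[float, str]]:
    """Return (value, unit) pairs where a number directly precedes a unit."""
    candidates = []
    for i, token in enumerate(tokens):
        if token not in FREQUENCY_WORDS or i == 0:
            continue
        previous = tokens[i - 1]
        if NUMBER.fullmatch(previous):
            candidates.append((float(previous), token))
    return candidates


# Hypothetical sentence, already split into tokens on whitespace.
tokens = "Exposure at 900 MHz and 2.45 GHz was compared with 50 Hz".split()
candidates = extract_candidates(tokens)
```

A unit word with no number in front of it is skipped, so prose mentions such as "in the GHz range" do not produce candidates in this sketch.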

The processor 11 may select, from among the candidate electromagnetic field frequency band information, the candidate that satisfies the electromagnetic field emission source condition as the electromagnetic field frequency band information of the electromagnetic field emission source.

To this end, the processor 11 may identify a frequency unit corresponding to the conventional frequency word of the candidate electromagnetic field frequency band information and, based on the identified frequency unit, identify the electromagnetic field frequency band of the candidate electromagnetic field frequency band information.
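
One way to read this step is as a unit normalization, so that candidates written in different units become comparable. The sketch below assumes that reading. The table `UNIT_TO_HZ` and the function `to_hertz` are hypothetical names introduced for illustration.

```python
# Multiplier for each conventional frequency word named in the description.
UNIT_TO_HZ = {"Hz": 1.0, "MHz": 1e6, "GHz": 1e9}


def to_hertz(value: float, unit: str) -> float:
    """Convert a numeric value with its unit word into hertz."""
    return value * UNIT_TO_HZ[unit]


# Values taken from the example bands in the description.
low_band_hz = to_hertz(150, "MHz")
high_band_hz = to_hertz(10, "GHz")
```

With both values in hertz, a direct numeric comparison shows that 10 GHz is the higher of the two.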

Thereafter, when there are a plurality of pieces of candidate electromagnetic field frequency band information, the processor 11 may determine that the candidates falling within a preset number, counted in descending order of electromagnetic field frequency band, satisfy the electromagnetic field emission source condition.

For example, when the preset number is one and, as shown in FIG. 8, there are two pieces of candidate electromagnetic field frequency band information, "10 GHz to 11 GHz" and "150 MHz to 160 MHz", the processor 11 may determine that "10 GHz to 11 GHz", which has the highest electromagnetic field frequency band, satisfies the electromagnetic field emission source condition and may determine "10 GHz to 11 GHz" to be the electromagnetic field frequency band information.
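
The selection in this example can be sketched as a sort followed by a cut. The description does not say how two ranges are compared, so ranking by the upper edge of each range is an assumption here, as are the `(low, high, unit)` tuple layout and the function name `select_top`.

```python
UNIT_TO_HZ = {"Hz": 1.0, "MHz": 1e6, "GHz": 1e9}

# Assumed candidate layout: (lower value, upper value, unit word).
Candidate = tuple[float, float, str]


def select_top(candidates: list[Candidate], preset_count: int = 1) -> list[Candidate]:
    """Keep the preset_count candidates with the highest frequency band."""
    ranked = sorted(
        candidates,
        key=lambda c: c[1] * UNIT_TO_HZ[c[2]],  # assumption: rank by upper edge in Hz
        reverse=True,
    )
    return ranked[:preset_count]


# The two candidate bands from the example in the description.
candidates = [(150, 160, "MHz"), (10, 11, "GHz")]
selected = select_top(candidates, preset_count=1)
```

Under these assumptions `selected` holds only the "10 GHz to 11 GHz" band, which matches the outcome stated in the example.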

In another embodiment, when the electromagnetic field frequency band of a piece of candidate electromagnetic field frequency band information falls within a preset frequency band, the processor 11 may determine that the candidate satisfies the electromagnetic field emission source condition.

For example, when the preset frequency band is 1 GHz to 100 GHz, the processor 11 may determine that "10 GHz to 11 GHz", an electromagnetic field frequency band that falls within the preset band of 1 GHz to 100 GHz, satisfies the electromagnetic field emission source condition and may determine "10 GHz to 11 GHz" to be the electromagnetic field frequency band information.
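
This alternative condition can be sketched as a range check. The description says only that the candidate band is "included in" the preset band, so requiring both edges of the candidate to lie inside the preset band is an assumption. The names `within_preset_band` and `PRESET_BAND_HZ` are hypothetical.

```python
UNIT_TO_HZ = {"Hz": 1.0, "MHz": 1e6, "GHz": 1e9}

# Preset band from the example in the description: 1 GHz to 100 GHz, in hertz.
PRESET_BAND_HZ = (1e9, 100e9)


def within_preset_band(
    candidate: tuple[float, float, str],
    preset: tuple[float, float] = PRESET_BAND_HZ,
) -> bool:
    """Return True when the whole candidate range lies inside the preset band."""
    low, high, unit = candidate
    low_hz = low * UNIT_TO_HZ[unit]
    high_hz = high * UNIT_TO_HZ[unit]
    return preset[0] <= low_hz and high_hz <= preset[1]


# The two candidate bands used in the description's examples.
candidates = [(150, 160, "MHz"), (10, 11, "GHz")]
matching = [c for c in candidates if within_preset_band(c)]
```

Unlike the top-N rule, this check evaluates each candidate independently, so it can return zero, one, or several bands.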

In this way, the processor 11 can quickly and accurately extract from research literature the electromagnetic field frequency band information emitted by an electromagnetic field emission source that poses a risk to human health.

The memory unit 12 may store research literature received from the electronic device 20. The memory unit 12 may also store the programs needed for the operations of the processor 11 described above.

The memory unit 12 may include non-volatile memory (for example, flash memory or a hard disk) and volatile memory (for example, random access memory (RAM)). A program may be stored in the non-volatile memory and loaded into the volatile memory to run.

도 4는 본 발명의 일 실시 예에 따른 연구문헌 내 전자기장 주파수 대역 정보 추출 방법의 순서도이다.4 is a flowchart of a method for extracting electromagnetic field frequency band information in a research document according to an embodiment of the present invention.

Referring to FIG. 4, in the method for extracting electromagnetic field frequency band information from research literature according to an embodiment of the present invention, in step S1 the processor generates a frequency band word list including frequency idiomatic words used when describing the frequency band of an electromagnetic field.

Then, in step S2, the processor extracts candidate electromagnetic field frequency band information from the research literature using the frequency idiomatic words.

In step S2, the processor further searches the research literature for a frequency idiomatic word, searches for numeric text adjacent to the found word, and extracts the found frequency idiomatic word together with the found numeric text as candidate electromagnetic field frequency band information.
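A minimal sketch of step S2, assuming the frequency words are unit strings and that "adjacent numeric text" means a number or number range directly preceding the unit. The word list `FREQUENCY_WORDS`, the function name `extract_candidates`, and the range separators ("-" and "to") are illustrative assumptions; the source does not enumerate them.

```python
import re

# Hypothetical word list; the source does not list the actual entries.
FREQUENCY_WORDS = ["Hz", "kHz", "MHz", "GHz"]


def extract_candidates(text, words=FREQUENCY_WORDS):
    """Find numbers (or number ranges) immediately followed by a frequency word."""
    # Try longer words first so that "GHz" is matched before "Hz".
    units = "|".join(sorted(map(re.escape, words), key=len, reverse=True))
    number = r"\d+(?:\.\d+)?"
    pattern = re.compile(
        rf"({number})(?:\s*(?:-|to)\s*({number}))?\s*({units})\b",
        re.IGNORECASE,
    )
    candidates = []
    for low, high, unit in pattern.findall(text):
        # A single value is stored as a band whose edges coincide.
        candidates.append((float(low), float(high or low), unit))
    return candidates
```

Each candidate is returned as `(low, high, unit)` so that a later step can convert the unit to hertz.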

Then, in step S3, the processor selects, from among the candidate electromagnetic field frequency band information, the candidate that satisfies the electromagnetic field emission source condition as the electromagnetic field frequency band information of the electromagnetic field emission source.

In step S3, the processor further identifies the frequency unit corresponding to the frequency idiomatic word of the candidate electromagnetic field frequency band information, and identifies the electromagnetic field frequency band of the candidate based on the identified frequency unit.

Also in step S3, when there are multiple pieces of candidate electromagnetic field frequency band information, the processor further determines that the candidates falling within a preset number, counted in descending order of electromagnetic field frequency band, satisfy the electromagnetic field emission source condition.
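The unit lookup and the "highest bands first" selection of step S3 can be sketched as below. The mapping `UNIT_TO_HZ`, the function names, and the choice to rank bands by their upper edge are assumptions made for illustration; the source only states that candidates are taken in descending order of frequency band up to a preset count.

```python
# Multipliers to hertz for each frequency word (assumed mapping).
UNIT_TO_HZ = {"hz": 1.0, "khz": 1e3, "mhz": 1e6, "ghz": 1e9}


def to_hz(candidate):
    """Convert a (low, high, unit) candidate into a (low_hz, high_hz) band."""
    low, high, unit = candidate
    factor = UNIT_TO_HZ[unit.lower()]
    return (low * factor, high * factor)


def select_top_bands(candidates, max_count):
    """Keep the max_count candidates with the highest frequency bands."""
    bands = [to_hz(c) for c in candidates]
    # Rank by upper edge, then lower edge (the ranking key is an assumption).
    bands.sort(key=lambda band: (band[1], band[0]), reverse=True)
    return bands[:max_count]
```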

Meanwhile, a computer program according to an embodiment of the present invention may be combined with a computer, which is hardware, and stored in a computer-readable recording medium so as to carry out the method for extracting electromagnetic field frequency band information from research literature according to an embodiment of the present invention.

The steps of a method or algorithm described in connection with an embodiment of the present invention may be implemented directly in hardware, in a software module executed by hardware, or in a combination of the two. The software module may reside in random access memory (RAM), read-only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, a hard disk, a removable disk, a CD-ROM, or any other form of computer-readable recording medium well known in the art to which the present invention pertains.

FIG. 5 shows an example of a research purpose word list and a non-purpose word list according to another embodiment of the present invention, FIG. 6 shows an example of a research topic word list according to another embodiment of the present invention, and FIG. 7 illustrates a weight calculation process according to another embodiment of the present invention.

Referring to FIGS. 5 to 7, the extraction apparatus 10 may include a processor 11 and a memory unit 12 that stores research literature.

Specifically, as preprocessing, the processor 11 may perform a first preprocessing step of converting uppercase letters in the text of the research literature to lowercase, a second preprocessing step of removing stop words such as punctuation marks, whitespace, forms of the verb "be", and words that appear only once, a third preprocessing step of extracting the base form of any inflected word, and a fourth preprocessing step of tokenizing the words or sentences of the research literature.
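The four preprocessing steps can be sketched with the standard library alone. This is an illustration under stated assumptions: `BE_VERBS` and `LEMMAS` are tiny hypothetical tables standing in for a real stop-word list and lemmatizer, the function name `preprocess` is invented here, and tokenization is performed before stop-word removal for convenience even though the source lists it fourth.

```python
import re
from collections import Counter

# Tiny illustrative tables; a real system would use a full stop-word list
# and a proper lemmatizer.
BE_VERBS = {"be", "am", "is", "are", "was", "were", "been", "being"}
LEMMAS = {"exposed": "expose", "fields": "field", "studies": "study"}


def preprocess(text):
    """Return the document as a list of sentences, each a list of tokens."""
    lowered = text.lower()  # step 1: lowercase
    # Step 4 (tokenization) is done early so the later steps work on tokens.
    # Splitting on "." is naive and would also split decimals such as "2.4".
    sentences = [s for s in re.split(r"[.!?]+", lowered) if s.strip()]
    tokenized = [re.findall(r"[a-z0-9]+", s) for s in sentences]
    counts = Counter(tok for sent in tokenized for tok in sent)
    result = []
    for sent in tokenized:
        kept = [
            LEMMAS.get(tok, tok)  # step 3: map inflected forms to a base form
            for tok in sent
            # step 2: drop "be" verbs and words that occur only once
            if tok not in BE_VERBS and counts[tok] > 1
        ]
        result.append(kept)
    return result
```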

Then, the processor 11 may determine whether each sentence in the research literature contains research purpose information, and select candidate purpose statement sentences from among the sentences.

Specifically, as shown in FIG. 5, the processor 11 may generate a research purpose word list that includes purpose idiomatic words, which are used when describing a research purpose, and title words, which appear in the title of the research literature, and may generate a non-purpose word list that includes experiment idiomatic words, which are used when describing the experimental results of a study.

That is, the processor 11 may build the research purpose word list by combining the purpose idiomatic words frequently used in sentences that state a research purpose with the title words, which imply the research purpose of the literature being processed.

The processor 11 may also build the non-purpose word list so that it includes experiment idiomatic words, which occur frequently in abstracts when experimental results are described but are unrelated to the research purpose.

Then, the processor 11 may select candidate purpose statement sentences from among the sentences in the research literature based on whether the words in each sentence are included in the research purpose word list and in the non-purpose word list.

Specifically, if at least one word of a sentence in the research literature is included in the research purpose word list and none of its words is included in the non-purpose word list, the processor 11 may select that sentence as a candidate purpose statement sentence.
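The selection rule just described (at least one word from the research purpose list, none from the non-purpose list) can be sketched as a set test. The function name and the sample words in the usage note are hypothetical; FIG. 5 is not reproduced in this text, so the actual list entries are unknown.

```python
def select_candidate_sentences(sentences, purpose_words, non_purpose_words):
    """Keep sentences with at least one purpose word and no non-purpose word.

    Each sentence is a list of preprocessed tokens.
    """
    purpose = set(purpose_words)
    non_purpose = set(non_purpose_words)
    return [
        sent
        for sent in sentences
        if purpose & set(sent) and not (non_purpose & set(sent))
    ]
```

For instance, with hypothetical lists `{"aim", "investigate", "emf"}` and `{"result", "show"}`, a sentence containing "aim" and "emf" is kept, while one containing both "emf" and "result" is rejected.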

Accordingly, the processor 11 may select, from among the sentences in the research literature, only the sentences that are highly relevant to the research purpose as candidate purpose statement sentences.

Meanwhile, as shown in FIG. 6, the processor 11 may calculate the in-document importance of each word in the research literature and generate a research topic word list based on that importance.

Specifically, the processor 11 may calculate the in-document importance between a first word, which is any one of the words in the research literature, and a first sentence, which is any one of the sentences in the research literature.

The processor 11 may calculate a first topic appearance ratio between the number of words in the first sentence and the first topic appearance count, i.e. the number of times the first word appears in the first sentence, and a second topic appearance ratio between the number of sentences in the research literature and the number of sentences that contain the first word.

Then, the processor 11 may calculate the in-document importance between the first word and the first sentence based on the first topic appearance ratio and the second topic appearance ratio.

Here, the processor 11 may use Equation 1 below to calculate the in-document importance between an n-th word, which is any one of the words in the research literature, and an m-th sentence, which is any one of the sentences in the research literature.

<Equation 1>

Here, W_n,m is the in-document importance between the n-th word, which is any one of the words in the research literature, and the m-th sentence, which is any one of the sentences in the research literature; W_nPS_m is the first topic appearance count, i.e. the number of times the n-th word appears in the m-th sentence; WPS_m is the number of words in the m-th sentence; SPL is the number of sentences in the research literature; S_WnPL is the number of sentences in the research literature that contain the n-th word; and a is an adjustment constant.
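The body of Equation 1 did not survive in this text, so the following is only a sketch of a TF-IDF-style computation that is consistent with the symbol definitions above. It assumes the form W_n,m = (W_nPS_m / WPS_m) * log(SPL / (S_WnPL + a)); in particular, the position of the adjustment constant a inside the logarithm is an assumption, not something the source confirms. The function name `importance` is hypothetical.

```python
import math


def importance(word, sentence, sentences, a=1.0):
    """TF-IDF-style importance of `word` in `sentence` (lists of tokens).

    Assumed form: (W_nPS_m / WPS_m) * log(SPL / (S_WnPL + a)).
    `sentence` must be non-empty.
    """
    tf = sentence.count(word) / len(sentence)            # W_nPS_m / WPS_m
    containing = sum(1 for s in sentences if word in s)  # S_WnPL
    idf = math.log(len(sentences) / (containing + a))    # SPL / (S_WnPL + a)
    return tf * idf
```

Evaluating this function for every word and sentence pair yields a word-by-sentence matrix of importance values.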

By calculating the in-document importance between every n-th word and every m-th sentence in the research literature in this manner, the processor 11 may generate TF-IDF matrix data in matrix form. To this end, the processor 11 may use the TF-IDF (Term Frequency-Inverse Document Frequency) analysis method, which calculates importance based on frequency of appearance.

Then, the processor 11 may use the singular value decomposition (SVD) function of a latent semantic analysis (LSA) tool to extract only the significant values from among the in-document importance values between the n-th words and the m-th sentences, and generate six candidate research topic word lists, each containing ten candidate research topic words.

The processor 11 may count, for each candidate research topic word list, the number of words it shares with the title of the research literature, and select the candidate list with the largest number of shared words as the research topic word list of the research literature.
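The final selection among the candidate topic word lists can be sketched as a maximum over title overlap. The SVD step that produces the candidate lists is not shown here; the function name `pick_topic_word_list` is hypothetical, and breaking ties in favour of the first list is an assumption the source does not address.

```python
def pick_topic_word_list(candidate_lists, title_words):
    """Return the candidate list that shares the most words with the title.

    Ties are resolved in favour of the earliest list (an assumption).
    """
    title = set(title_words)
    return max(candidate_lists, key=lambda words: len(title & set(words)))
```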

In this way, the processor 11 may generate a research topic word list composed of research topic words that are highly relevant to the research topic of the research literature.

As shown in FIG. 6, the research topic word list may contain ten research topic words together with the in-document importance matched to each of them.

Then, as shown in FIG. 7, the processor 11 may calculate the weight of each candidate purpose statement sentence by computing a provisional weight for the sentence and the title similarity between the title of the research literature and the sentence.

Specifically, the processor 11 may calculate the provisional weight by summing the in-document importance of the research topic words that appear in the candidate purpose statement sentence.

The processor 11 may then calculate, as the title similarity, the ratio of the number of words in the candidate purpose statement sentence that also appear in the title of the research literature to the number of words in the title.

Finally, the processor 11 may calculate the weight of each candidate purpose statement sentence by multiplying its provisional weight by its title similarity.

The processor 11 may select the candidate purpose statement sentence with the largest weight as the purpose statement sentence.
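The weighting and selection steps above can be sketched as follows. All function names are hypothetical, the topic words and importance values used in the usage note are invented for illustration, and treating the title as a set of distinct words is an assumption.

```python
def provisional_weight(sentence, topic_importance):
    """Sum the importance of the topic words that appear in the sentence."""
    present = set(sentence)
    return sum(score for word, score in topic_importance.items() if word in present)


def title_similarity(sentence, title_words):
    """Share of (distinct) title words that also appear in the sentence."""
    title = set(title_words)
    return len(title & set(sentence)) / len(title)


def sentence_weight(sentence, topic_importance, title_words):
    """Provisional weight multiplied by title similarity."""
    return provisional_weight(sentence, topic_importance) * title_similarity(sentence, title_words)


def select_best_sentence(candidates, topic_importance, title_words):
    """Return the candidate sentence with the largest weight."""
    return max(candidates, key=lambda s: sentence_weight(s, topic_importance, title_words))
```

With invented importance values `{"emf": 0.5, "exposure": 0.25, "rat": 0.25}` and a four-word title containing "emf", "exposure", "rat" and "brain", a sentence containing "emf" and "exposure" has provisional weight 0.75 and title similarity 0.5, giving a weight of 0.375.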

That is, the processor 11 extracts a plurality of candidate purpose statement sentences indicating the research purpose from among the sentences of the research literature, calculates the provisional weight and the title similarity of each candidate, and multiplies the two to obtain the weight of each candidate. By selecting the purpose statement sentence on the basis of these weights, the processor 11 can select the sentence whose content is closest to the research purpose of the research literature.

FIG. 8 shows an example of a category feature word list according to yet another embodiment of the present invention.

The extraction apparatus 10 according to yet another embodiment of the present invention may extract the research category information of research literature using a category feature word list for each research category.

To this end, the extraction apparatus 10 according to this embodiment may include a processor 11 and a memory unit 12 that stores research literature.

Specifically, as preprocessing, the processor 11 may perform a first preprocessing step of converting uppercase letters in the text of the research literature to lowercase, a second preprocessing step of removing stop words such as punctuation marks, whitespace, and forms of the verb "be", a third preprocessing step of extracting the base form of any inflected word, and a fourth preprocessing step of tokenizing the words or sentences of the research literature.

Then, as shown in FIG. 8, the processor 11 may generate a category feature word list for each candidate research category.

Specifically, the processor 11 may generate a category feature word list for each of the candidate research categories: the epidemiological study category, the animal experiment study category, and the cell experiment study category.

The processor 11 may classify the category feature words input from the electronic device 20 and generate a category feature word list for each candidate research category.

Here, a category feature word list may be a word list containing category feature words that characterize the category.

Then, the processor 11 may calculate a category similarity for each candidate research category based on whether the category feature words in the category feature word list appear in the research literature.

Specifically, the processor 11 may calculate the in-document importance of each category feature word that appears in the research literature.

Here, the processor 11 may calculate a first category appearance ratio between the number of words in the research literature and the first category appearance count, i.e. the number of times a category feature word appears in the research literature, and a second category appearance ratio between the number of sentences in the research literature and the number of sentences that contain the category feature word.

The processor 11 may then calculate the in-document importance based on the first category appearance ratio and the second category appearance ratio.

Here, the processor 11 may use Equation 2 below to calculate the in-document importance of an x-th word, which is any one of the category feature words.

<Equation 2>

Here, W_x,x is the in-document importance of the x-th word, which is any one of the category feature words; W_xPL is the first category appearance count, i.e. the number of times the x-th word appears in the research literature; WPL is the number of words in the research literature; SPL is the number of sentences in the research literature; S_WxPL is the number of sentences in the research literature that contain the x-th word; and b is an adjustment constant.

The processor 11 may calculate the in-document importance of the x-th word using the TF-IDF (Term Frequency-Inverse Document Frequency) analysis method, which calculates importance based on frequency of appearance.

In this way, the processor 11 may calculate the in-document importance of each category feature word in the category feature word list of each candidate research category.

Then, the processor 11 may sum the in-document importance values for each candidate research category to obtain the category similarity of that category, and select the candidate research category with the largest category similarity as the research category.
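A sketch of the category scoring described above. Because the body of Equation 2 is missing from this text, the code assumes the form (W_xPL / WPL) * log(SPL / (S_WxPL + b)); the placement of the adjustment constant b is an assumption. The function names and the feature words in the usage note are hypothetical, since FIG. 8 is not reproduced here.

```python
import math


def word_importance(word, sentences, b=1.0):
    """Document-level TF-IDF-style importance of a category feature word.

    Assumed form: (W_xPL / WPL) * log(SPL / (S_WxPL + b)).
    """
    total_words = sum(len(s) for s in sentences)         # WPL
    occurrences = sum(s.count(word) for s in sentences)  # W_xPL
    containing = sum(1 for s in sentences if word in s)  # S_WxPL
    if occurrences == 0:
        return 0.0  # feature words absent from the document contribute nothing
    return (occurrences / total_words) * math.log(len(sentences) / (containing + b))


def classify(sentences, category_words, b=1.0):
    """Return (best_category, similarity_per_category)."""
    similarity = {
        category: sum(word_importance(w, sentences, b) for w in words)
        for category, words in category_words.items()
    }
    return max(similarity, key=similarity.get), similarity
```

For example, with hypothetical feature words such as "rat" and "mouse" for the animal experiment category, a document mentioning both would score highest in that category.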

Accordingly, the processor 11 can quickly and accurately select, from among the candidate research categories, the one closest to the research category of the research literature.

FIG. 9 illustrates a process of evaluating the importance of research literature according to yet another embodiment of the present invention.

The extraction apparatus 10 according to yet another embodiment of the present invention may evaluate the importance of research literature using its research purpose and its electromagnetic field frequency band information.

To this end, the extraction apparatus 10 according to this embodiment may include a processor 11 and a memory unit 12 that stores research literature.

The processor 11 may calculate a purpose similarity between the purpose statement sentence, which indicates the research purpose of the research literature, and the category important word list corresponding to the research category of the research literature.

The processor 11 may calculate, as the purpose similarity, the ratio of the number of words in the category important word list that appear in the purpose statement sentence to the number of words in the purpose statement sentence.
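The ratio above can be sketched as below. The wording of the source is ambiguous, so this reads it as "words of the sentence that are in the category important word list, divided by all words of the sentence"; that reading, the function name, and the sample words are assumptions.

```python
def purpose_similarity(purpose_sentence, important_words):
    """Share of the sentence's words that are in the category important word list.

    `purpose_sentence` is a non-empty list of tokens.
    """
    important = set(important_words)
    hits = sum(1 for word in purpose_sentence if word in important)
    return hits / len(purpose_sentence)
```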

Here, the category important word list may be a word list containing only those words of the category feature word list that appear most frequently in research literature.

Meanwhile, the processor 11 may calculate a frequency similarity between the electromagnetic field frequency band information of the electromagnetic field emission source in the research literature and the category important frequency band information corresponding to the research category.

Here, the category important frequency band may be the frequency band that is mainly studied in each research category.

Specifically, the processor 11 may calculate, as the frequency similarity, the frequency band ratio between the frequency band of the category important frequency band information and the frequency band of the electromagnetic field frequency band information of the electromagnetic field emission source.

For example, if the frequency band of the category important frequency band information is "1 Hz to 100 Hz" and the frequency band of the electromagnetic field frequency band information of the electromagnetic field emission source is "51 Hz to 120 Hz", the processor 11 may calculate, as the frequency band ratio, the proportion of the category important frequency band that overlaps the frequency band of the emission source, obtaining "50%".
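The overlap ratio in this example can be sketched as follows, treating the bands as continuous intervals (an assumption). With continuous widths the example evaluates to 49/99, about 49.5%; the "50%" quoted in the text presumably reflects rounding or counting the endpoints as inclusive whole-hertz values (50 of 100). The function name is hypothetical.

```python
def band_overlap_ratio(category_band, source_band):
    """Fraction of the category band that overlaps the source band.

    Bands are (low_hz, high_hz) tuples treated as continuous intervals.
    """
    cat_low, cat_high = category_band
    src_low, src_high = source_band
    overlap = max(0.0, min(cat_high, src_high) - max(cat_low, src_low))
    return overlap / (cat_high - cat_low)


ratio = band_overlap_ratio((1.0, 100.0), (51.0, 120.0))
print(f"{ratio:.1%}")  # prints 49.5%
```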

Finally, the processor 11 may evaluate the importance of the research literature based on the purpose similarity and the frequency similarity.

Specifically, the processor 11 may calculate the importance of the research literature as the sum of the purpose similarity and the frequency similarity, and when the importance exceeds a preset reference importance, may grade the research literature as important.

Meanwhile, the processor 11 according to another embodiment may calculate a purpose dissimilarity between the purpose statement sentence and the category important word list corresponding to a research category different from the research category of the research literature.

The processor 11 may calculate, as the purpose dissimilarity, the ratio of the number of words in that other category's important word list that appear in the purpose statement sentence to the number of words in the purpose statement sentence.

That is, the more the words of the purpose statement sentence overlap with the words of the category important word list of a research category different from that of the research literature, the higher the purpose dissimilarity calculated by the processor 11.

Then, when the difference between the purpose similarity and the purpose dissimilarity falls within a preset difference range, the processor 11 may grade the research literature as non-important.
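The two grading rules can be sketched as simple predicates. The function names, the threshold, and the difference range used in the usage note are hypothetical; the source does not state what grade applies when neither rule fires, so the sketch returns only booleans.

```python
def importance_score(purpose_sim, frequency_sim):
    """Importance of the document: purpose similarity plus frequency similarity."""
    return purpose_sim + frequency_sim


def is_important(purpose_sim, frequency_sim, reference_importance):
    """True when the importance exceeds the preset reference importance."""
    return importance_score(purpose_sim, frequency_sim) > reference_importance


def is_non_important(purpose_sim, purpose_dissim, diff_range):
    """True when (similarity - dissimilarity) lies inside the preset range."""
    low, high = diff_range
    return low <= purpose_sim - purpose_dissim <= high
```

For example, with an assumed reference importance of 0.75, similarities of 0.5 and 0.5 give a score of 1.0 and the document is graded important; with an assumed difference range of (-0.125, 0.125), equal similarity and dissimilarity values mark it as non-important.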

The present invention has been described above with a focus on preferred embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that it can be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a restrictive sense. The scope of the present invention is indicated by the claims rather than by the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

As described above, although the present invention has been described with reference to limited embodiments and drawings, it is not limited to them, and those of ordinary skill in the art to which the present invention pertains may make various modifications and variations within the technical idea of the present invention and the range of equivalents of the claims set out below.

10: Apparatus for extracting electromagnetic field frequency band information from research literature
11: Processor
12: Memory unit
20: Electronic device

Claims

An apparatus for extracting electromagnetic field frequency band information from research literature, the apparatus comprising:
a processor that generates a frequency band word list including frequency idiomatic words used when describing the frequency band of an electromagnetic field, extracts candidate electromagnetic field frequency band information from the research literature using the frequency idiomatic words, and selects, from among the candidate electromagnetic field frequency band information, candidate electromagnetic field frequency band information satisfying an electromagnetic field emission source condition as electromagnetic field frequency band information of an electromagnetic field emission source; and
a memory for storing the research literature,
wherein the processor
identifies a frequency unit corresponding to the frequency idiomatic word of the candidate electromagnetic field frequency band information, and identifies an electromagnetic field frequency band of the candidate electromagnetic field frequency band information based on the identified frequency unit,
wherein the processor
determines that the candidate electromagnetic field frequency band information satisfies the electromagnetic field emission source condition if the electromagnetic field frequency band of the candidate electromagnetic field frequency band information is included in a preset frequency band,
wherein the processor
calculates a frequency similarity between the electromagnetic field frequency band information and category important frequency band information corresponding to the research category of the research literature,
wherein the processor
calculates, as the frequency similarity, a frequency band ratio between the frequency band of the category important frequency band information and the frequency band of the electromagnetic field frequency band information of the electromagnetic field emission source, and
wherein the processor
calculates, as the frequency band ratio, the proportion of the frequency band of the category important frequency band information that overlaps the frequency band of the electromagnetic field frequency band information of the electromagnetic field emission source.

The apparatus according to claim 1,
wherein the processor
searches the research literature for the frequency idiomatic word, searches for numeric text adjacent to the found frequency idiomatic word, and extracts the found frequency idiomatic word and the found numeric text as the candidate electromagnetic field frequency band information.

(deleted)

generating, by the processor, a frequency band word list including frequency idiomatic words used when describing a frequency band of an electromagnetic field;
extracting, by the processor, candidate electromagnetic field frequency band information from the research literature by using the frequency idiomatic words; and
selecting, by the processor, candidate electromagnetic field frequency band information satisfying the electromagnetic field emission source condition from among the candidate electromagnetic field frequency band information as electromagnetic field frequency band information of the electromagnetic field emission source;
wherein the step of selecting the electromagnetic field frequency band information of the electromagnetic field emission source includes:
determining, by the processor, that the candidate electromagnetic field frequency band information satisfies the electromagnetic field emission source condition when the electromagnetic field frequency band of the candidate electromagnetic field frequency band information is included in a preset frequency band; and
identifying, by the processor, a frequency unit corresponding to the frequency idiomatic word of the candidate electromagnetic field frequency band information, and identifying an electromagnetic field frequency band of the candidate electromagnetic field frequency band information based on the identified frequency unit;
the method further includes calculating, by the processor, a frequency similarity between the electromagnetic field frequency band information and category important frequency band information corresponding to the research category of the research literature;
wherein calculating the frequency similarity includes calculating, by the processor, as the frequency similarity, a frequency band ratio between the frequency band of the category important frequency band information and the frequency band of the electromagnetic field frequency band information of the electromagnetic field emission source; and
A method of extracting electromagnetic field frequency band information in research literature, characterized in that calculating the frequency similarity includes calculating, as the frequency band ratio, the proportion of the frequency band of the category important frequency band information that overlaps the frequency band of the electromagnetic field frequency band information of the electromagnetic field emission source.
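The word list generation, unit identification, and preset band check of this method can be sketched as follows. The claim does not say how the word list is built or what the preset band is, so the sketch assumes the list is formed from SI prefixes on "Hz", that each candidate is a single frequency rather than a range, and that the preset band is a (low, high) pair in hertz. All names and values are hypothetical.

```python
# Assumed multipliers; the claim only requires that each word map to a unit.
SI_PREFIXES = {"": 1.0, "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}


def build_word_list(base="Hz"):
    """Generate frequency words and the multiplier (in Hz) for each."""
    return {prefix + base: factor for prefix, factor in SI_PREFIXES.items()}


def to_hertz(word, numeric_text, word_list):
    """Identify the unit for a word and convert the numeric text to hertz."""
    return float(numeric_text) * word_list[word]


def select_source_frequencies(candidates, preset_band, word_list):
    """Keep candidates whose frequency lies inside the preset band."""
    low, high = preset_band
    selected = []
    for word, numeric_text in candidates:
        hz = to_hertz(word, numeric_text, word_list)
        if low <= hz <= high:  # emission source condition (assumed inclusive)
            selected.append(hz)
    return selected


word_list = build_word_list()
# Hypothetical candidates and a hypothetical preset band of 100 kHz to 6 GHz.
selected = select_source_frequencies(
    [("Hz", "50"), ("MHz", "900"), ("GHz", "60")], (100e3, 6e9), word_list
)
```

With these example values only the 900 MHz candidate falls inside the preset band, so it alone would be treated as frequency information of the emission source.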

7. The method of claim 6,
wherein the step of extracting the candidate electromagnetic field frequency band information includes the processor searching the research literature for the frequency idiomatic word, searching for numeric text adjacent to the found frequency idiomatic word, and extracting the found frequency idiomatic word and the found numeric text as the candidate electromagnetic field frequency band information.

(deleted)

A computer program stored in a computer-readable recording medium and combined with a computer, which is hardware, to perform the method of claim 6.