WO2015023031A1 - Method for supporting search in specialist fields and apparatus therefor - Google Patents

Method for supporting search in specialist fields and apparatus therefor

Info

Publication number
WO2015023031A1
Authority
WO
WIPO (PCT)
Prior art keywords
words
mapping
word
mapping table
query
Prior art date
Application number
PCT/KR2013/011920
Other languages
French (fr)
Korean (ko)
Inventor
이수원
백종범
Original Assignee
숭실대학교산학협력단
Priority date
Filing date
Publication date
Application filed by 숭실대학교산학협력단
Publication of WO2015023031A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries

Definitions

  • The present invention relates to a specialized field search support method and apparatus that learn mapping probabilities between general terms and technical terms or statutes from Q&A data or precedent data collected from websites, and then use those probabilities to predict and provide technical terms or statutes for a query.
  • The BEST-Project is developing a precedent search system in which a member of the general public enters their current situation, the input is mapped to a general-term ontology (User Ontology), and the corresponding precedent search results are provided to the user.
  • Building and maintaining such general-term and technical-term ontologies requires a great deal of time and money.
  • The present invention aims to provide a specialized field search support method and apparatus that learn mapping probabilities between general terms and technical terms or statutes from Q&A data or precedent data collected from websites, and then use those probabilities to predict and provide technical terms or statutes for a query.
  • the general term-terminology mapping probability may be calculated using the frequency of the words that appear simultaneously in the question unit and the answer unit.
  • The general term-technical term mapping probability is calculated using pairwise mutual information (PMI).
  • The symbol in Figure PCTKR2013011920-appb-I000003 represents the general term set, and the symbol in Figure PCTKR2013011920-appb-I000004 represents the set of words appearing in statutes.
  • The symbol in Figure PCTKR2013011920-appb-I000005 represents the technical term set, the symbol in Figure PCTKR2013011920-appb-I000006 denotes words that are included in the general term set but not in the statute word set, and the symbol in Figure PCTKR2013011920-appb-I000007 denotes words that are included in the technical terms and also in the statute keywords.
  • In step (d), mapping probabilities may be calculated, and a prediction made, over the n (n being a natural number) technical terms that match words included in the query, using the mapping table.
  • In step (d), the mapping probability is calculated using a naïve Bayesian classifier.
  • The mapping probability is calculated by the following formula.
  • The symbol in Figure PCTKR2013011920-appb-I000010 represents a mapping probability included in the mapping table, the symbol in Figure PCTKR2013011920-appb-I000011 denotes a technical term, and X denotes the query.
  • The word-statute mapping table includes a reliability for each mapping between a word and a statute, and the reliability is calculated by the following formula.
  • a collection unit for collecting query-answer data in a web document;
  • An extracting unit that extracts a word by analyzing the question unit and the answer unit in the question-answer data;
  • a mapping table generator configured to generate a general term-terminology mapping table by calculating a general term-terminology mapping probability by analyzing correlations between the words included in the question unit and the answer unit;
  • a prediction unit for extracting and providing a term including a word included in a query using the term mapping table.
  • an extracting unit for extracting each word by analyzing the case data;
  • a mapping table generator for generating a word-statute mapping table by calculating a mapping probability between a word and a law using the word;
  • a prediction unit that predicts a law for a query using the word-law mapping table.
  • The present invention therefore has the advantage of making search more convenient for users who have relatively little knowledge of technical terms or statutes.
  • FIG. 1 is a diagram schematically illustrating a configuration of a specialized field search support apparatus according to an embodiment of the present invention.
  • FIG. 2 illustrates a mapping table in accordance with an embodiment of the present invention.
  • FIG. 3 is a block diagram schematically illustrating an internal configuration of an apparatus for searching a specialty field according to another exemplary embodiment of the present invention.
  • FIG. 4 illustrates a word-statute mapping table according to another embodiment of the present invention.
  • FIG. 5 is a view showing the statutes predicted for a query according to an embodiment of the present invention.
  • FIG. 6 is a flowchart illustrating a method of providing a specialized term for a query by a specialized field search support apparatus according to an exemplary embodiment of the present invention.
  • FIG. 7 is a flowchart illustrating a method by which a specialized field search support apparatus according to another embodiment of the present invention provides a statute for a query.
  • Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.
  • To provide a statute search service to members of the general public who have relatively little legal knowledge, the present invention maps the general terms that laypeople commonly use to technical terms.
  • The present invention may also generate a word-statute mapping table from the mapping probabilities between words and statutes learned from precedent data, and use the table to predict and provide a statute for a query.
  • FIG. 1 is a diagram schematically illustrating a structure of a specialized field search support apparatus according to an embodiment of the present invention
  • FIG. 2 is a diagram illustrating a mapping table according to an embodiment of the present invention.
  • An apparatus that collects Q&A data from a website, divides the Q&A data into a question part and an answer part, and extracts the words from each part is described below.
  • The specialized field search support apparatus includes a collector 110, an extractor 115, a mapping table generator 120, a predictor 125, a memory 130, and a controller 135.
  • The collector 110 is a means for collecting Q&A data over a communication network.
  • The collector 110 may collect Q&A data (e.g., from Naver Knowledge iN) from web documents and store it in a database.
  • the extraction unit 115 is a means for analyzing the Q & A data to classify the question unit and the answer unit, and extracting words from the question unit and the answer unit, respectively.
  • The extractor 115 may extract words in morpheme units by morphologically analyzing the Q&A data. For example, if the question part of the Q&A data is "I ask about the image copyright of the product," the extractor 115 can extract "product", "image", "copyright", and "question" as words.
  • In this description, the words extracted from the question part are referred to as general terms, and the words extracted from the answer part are referred to as technical terms.
  • The mapping table generator 120 is a means for calculating a mapping probability by analyzing correlations between the words extracted from the Q&A data, and for generating a term mapping table based on that probability.
  • For the words extracted from the question part and the answer part, the mapping table generator 120 may calculate mapping probabilities using the frequencies with which the words appear in the question part and the answer part.
  • That is, the mapping table generator 120 may calculate the mapping probability by computing the pairwise mutual information (PMI) between general terms and technical terms.
  • The mutual information (PMI) may be calculated using Equation 1 below.
  • FIG. 2 illustrates a mapping table according to an embodiment of the present invention.
  • For each word in a question part, the mapping table lists the corresponding words in the answer part, the number of times the two co-occur, and the mapping probability calculated from that count.
  • the prediction unit 125 is a means for calculating a terminology mapping probability for a query input by a user using a mapping table to predict and provide related terminology.
  • By referring to the mapping table, the predictor 125 may extract and provide the top n (n being a natural number) technical terms with the highest mapping probabilities among the technical terms corresponding to the words included in the query.
  • The predictor 125 may use a naïve Bayesian classifier to calculate the probability that a user's input query implies a given statute, as described in more detail with reference to FIG. 3 below.
  • the memory 130 is a means for storing various algorithms, mapping tables, and the like necessary for operating the specialty field search support apparatus 100 according to an embodiment of the present invention.
  • The controller 135 controls the internal components (e.g., the collector 110, the extractor 115, the mapping table generator 120, the memory 130, and the like) of the specialized field search support apparatus according to an embodiment of the present invention.
  • The apparatus that calculates the mapping probability between general terms and technical terms and generates the term mapping table according to an embodiment of the present invention has been described above.
  • An apparatus according to another embodiment of the present invention, which predicts and provides a statute for a query based on the mapping probability between the query and statutes, is described next.
  • FIG. 3 is a block diagram schematically illustrating an internal configuration of an apparatus for searching a specialty field according to another embodiment of the present invention
  • FIG. 4 is a diagram illustrating a word-law mapping table according to another embodiment of the present invention
  • FIG. 5 is a diagram illustrating a result of a predicted law for a query word according to an exemplary embodiment of the present invention.
  • The specialized field search support apparatus may include a collector 310, an extractor 315, a mapping table generator 320, a predictor 325, a memory 330, and a controller 335.
  • Collecting unit 310 is a means for collecting the case data.
  • the collector 310 may collect the case data from a specific web site that provides the case data.
  • the extraction unit 315 is a means for analyzing the case data and extracting words, respectively.
  • the extractor 315 may extract the words by analyzing the case data in morpheme units. Since the method itself for extracting words in morpheme units from a specific sentence is already known to those skilled in the art, a separate description thereof will be omitted.
  • the mapping table generator 320 is a means for generating a mapping table by calculating a mapping probability between a word extracted from the precedent data and a law.
  • the mapping table generator 320 may use the reliability as a measure for calculating the mapping probability between the words extracted from the precedent data and the law.
  • the mapping table generator 320 may calculate a mapping probability between a word and a law using Equation 2 below.
  • The symbol in Figure PCTKR2013011920-appb-I000014 represents the set of words appearing in the precedent data, and the symbol in Figure PCTKR2013011920-appb-I000015 represents the set of statute names.
  • The word-statute mapping table includes each word, the statutes mapped to the word, and the reliability of each mapping. That is, a higher reliability indicates a stronger mapping between the word and the statute.
  • The predictor 325 is a means for predicting the statute mapping probability for a user's input query using the word-statute mapping table.
  • The predictor 325 may use a naïve Bayesian classifier to calculate the probability that the user's input query implies a given statute.
  • The predictor 325 may calculate the query-statute mapping probability using Equation 3, which is the naïve Bayesian classifier with the MAX function removed.
  • Here, the mapping probability term is the same as the one given by Equation 2.
  • To keep the mapping probability from shrinking excessively when predicting the query-statute mapping probability, the predictor 325 may refer to the word-statute mapping table and use only the top n word-statute mappings with the highest probability of mapping to the statute C i. Expressed as a formula, this is Equation 4.
  • FIG. 5 shows the laws and the mapping probabilities predicted corresponding to the query statements.
  • The predictor 325 calculates the mapping probability of each statute for a query using the word-statute mapping table, and then predicts and provides the statutes for the query in descending order of mapping probability.
  • the memory 330 is a means for storing various algorithms, mapping tables, etc. required for operating the specialty field search support apparatus 300 according to an embodiment of the present invention.
  • The controller 335 controls the internal components (e.g., the collector 310, the extractor 315, the mapping table generator 320, the predictor 325, the memory 330, and the like) of the specialized field search support apparatus according to an embodiment of the present invention.
  • FIG. 6 is a flowchart illustrating a method of providing a specialized term for a query by the apparatus for supporting a specialized field search according to an embodiment of the present invention.
  • The specialized field search support apparatus collects Q&A data from a website and stores it in a database.
  • the specialized field search support apparatus divides the collected Q & A data into a question unit and an answer unit, and analyzes each to extract a word.
  • the specialized field search support apparatus may extract the words in the morpheme units by analyzing the question unit and the answer unit in the morpheme units. Since this is already apparent to those skilled in the art, a separate description thereof will be omitted.
  • In operation 620, the specialized field search support apparatus generates a mapping table through correlation analysis between the words extracted from the question part and the answer part.
  • a mapping table may be generated by calculating a mapping probability through correlation analysis between words extracted from the question unit and the answer unit (words extracted from the question unit and words extracted from the answer unit).
  • the specialized field search support apparatus predicts and provides a term corresponding to a query using a mapping table.
  • the specialized field search support apparatus may calculate a mapping probability of the input query based on the naive Bayesian classifier with reference to the mapping table, and then predict and provide the top n terminologies having a high mapping probability.
  • FIG. 7 is a flowchart illustrating a method of providing a law for a query by a specialized field search support apparatus according to another exemplary embodiment of the present invention.
  • the specialization search support device collects case data.
  • the specialty search support device may collect and store case data from a web site.
  • the specialized field search support apparatus extracts words by analyzing case data.
  • The specialized field search support apparatus may extract the words in morpheme units by morphologically analyzing the case data. Since this is already apparent to those skilled in the art, a separate description is omitted.
  • the specialized field search support apparatus generates a word-statute mapping table by calculating a mapping probability between words extracted from case data and the laws.
  • the specialized field search support apparatus predicts a law for a query using a word-law mapping table.
  • The specialized field search support apparatus may calculate the statute mapping probability for a query by referring to the word-statute mapping table, and predict a statute for the query based on that probability.
  • The specialized field search support apparatus may use a naïve Bayesian classifier, a classification technique, in the same way as described above using Equation 4.
  • The specialized field search support method according to an embodiment of the present invention may be implemented as program instructions that can be executed by various electronic means for processing information, and may be recorded on a storage medium.
  • the storage medium may include program instructions, data files, data structures, etc. alone or in combination.
  • the program instructions recorded in the storage medium may be those specially designed and constructed for the present invention, or may be known and available to those skilled in the software art.
  • Examples of storage media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specifically configured to store and execute program instructions, such as ROM, RAM, and flash memory.
  • Examples of program instructions include not only machine code generated by a compiler, but also high-level language code that can be executed, using an interpreter, by a device that processes information electronically, for example a computer.
  • the hardware device described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.
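The top-n restriction described for the predictor 325 in the list above can be sketched as follows. Equation 4 itself is not reproduced in the text, so this is only one plausible reading: for each statute, multiply at most the n largest word-statute reliabilities found for the query words. The function name `score_statutes`, the table layout, and the default `n=3` are assumptions made for illustration.

```python
import heapq
import math

def score_statutes(query_words, word_statute_table, n=3):
    """Rank statutes for a query, using at most n mappings per statute.

    word_statute_table: {word: {statute: reliability}} (assumed layout).
    """
    per_statute = {}
    for word in set(query_words):
        for statute, reliability in word_statute_table.get(word, {}).items():
            per_statute.setdefault(statute, []).append(reliability)
    # Multiplying only the n largest factors keeps the product from shrinking
    # toward zero as the query grows longer.
    scores = {
        statute: math.prod(heapq.nlargest(n, values))
        for statute, values in per_statute.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A smaller `n` favors statutes supported by a few strong words; a larger `n` rewards broad but weaker support.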

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Algebra (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method and an apparatus for supporting a search in specialist fields are disclosed. A method for supporting a search in specialist fields comprises the steps of: (a) collecting query-answer data from web pages; (b) extracting words from the query-answer data by analyzing and classifying the query-answer data into a query part and an answer part; (c) calculating a general term-technical term mapping probability by analyzing the correlation between words respectively included in the query part and the answer part and creating a general term-technical term mapping table; and (d) extracting and providing a technical term including the word included in a query using the term mapping table.

Description

Specialized field search support method and apparatus therefor
The present invention relates to a specialized field search support method and apparatus that learn the mapping probabilities between general terms and technical terms or statutes using Q&A data or precedent data collected from websites, and then use those probabilities to predict and provide technical terms or statutes for a query.

It is very difficult for members of the general public who have no prior knowledge of a specialized field to search for and make use of knowledge in that field. For example, it is nearly impossible for someone without medical knowledge to search for and understand medical information about their own physical condition, and it is likewise very hard for someone without legal knowledge to find and apply the statutes relevant to their difficulty without a lawyer's help. In the legal field in particular, statutes consist almost entirely of legal jargon that is rarely used in everyday life. As a result, statute information retrieval is limited when it relies only on queries written by laypeople who do not know legal terminology.
A representative study aimed at supporting laypeople's statute or precedent search is the BEST-Project, carried out in the Netherlands since 2005. The BEST-Project is developing a precedent search system in which a layperson enters their current situation, the input is mapped to a general-term ontology (User Ontology), and the corresponding precedent search results are returned to the user. However, building and maintaining such general-term and technical-term ontologies requires a great deal of time and money. In addition, in Korean it is difficult to draw a clear dividing line between general terms and technical terms.

The present invention aims to provide a specialized field search support method and apparatus that learn the mapping probabilities between general terms and technical terms or statutes using Q&A data or precedent data collected from websites, and then use those probabilities to predict and provide technical terms or statutes for a query.

According to one aspect of the present invention, a specialized field search support method is provided that learns the mapping probabilities between general terms and technical terms or statutes using Q&A data or precedent data collected from websites, and then uses those probabilities to predict and provide technical terms or statutes for a query.
According to one embodiment of the present invention, a specialized field search support method may be provided that comprises: (a) collecting question-answer data from web documents; (b) distinguishing a question part and an answer part in the question-answer data and analyzing them to extract words; (c) calculating a general term-technical term mapping probability through correlation analysis between the words included in the question part and the answer part, and generating a general term-technical term mapping table; and (d) using the term mapping table to extract and provide technical terms that include words contained in a query.
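Step (b) can be illustrated with a minimal sketch. The document describes morphological analysis of Korean text; the regular-expression tokenizer and the stop-word list below are simplified stand-ins chosen for illustration, and the names `QARecord`, `extract_words`, and `split_and_extract` are hypothetical.

```python
import re
from typing import NamedTuple

class QARecord(NamedTuple):
    """One collected Q&A item, already divided into its two parts."""
    question: str
    answer: str

# Hypothetical stop-word list; a real system would filter by part-of-speech
# tags produced by a morphological analyzer instead.
STOP_WORDS = {"i", "the", "of", "a", "an", "about", "is", "to", "by"}

def extract_words(text: str) -> list[str]:
    """Lower-case the text, split on non-letters, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

def split_and_extract(record: QARecord) -> tuple[list[str], list[str]]:
    """Return (general-term candidates, technical-term candidates)."""
    return extract_words(record.question), extract_words(record.answer)
```

Words from the question part serve as general-term candidates and words from the answer part as technical-term candidates, matching the convention used in the description.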
In step (c), the general term-technical term mapping probability may be calculated using the frequency of words that co-occur in the question part and the answer part.
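The co-occurrence frequencies used in step (c) could be gathered as in the sketch below. It assumes each Q&A item has already been reduced to a pair of word lists and counts each word pair at most once per item; both choices, and the function name `count_cooccurrences`, are assumptions rather than details given in the text.

```python
from collections import Counter
from itertools import product

def count_cooccurrences(pairs):
    """Count question words, answer words, and their co-occurrences.

    pairs: iterable of (question_words, answer_words), one entry per Q&A item.
    Returns (question_counts, answer_counts, joint_counts, number_of_items).
    """
    question_counts, answer_counts, joint_counts = Counter(), Counter(), Counter()
    n_items = 0
    for question_words, answer_words in pairs:
        n_items += 1
        q_set, a_set = set(question_words), set(answer_words)
        question_counts.update(q_set)
        answer_counts.update(a_set)
        # Every (question word, answer word) combination in this item co-occurs.
        joint_counts.update(product(q_set, a_set))
    return question_counts, answer_counts, joint_counts, n_items
```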
The general term-technical term mapping probability is calculated using PMI (pairwise mutual information), as follows:
Figure PCTKR2013011920-appb-I000001
Figure PCTKR2013011920-appb-I000002
Here, the symbol in Figure PCTKR2013011920-appb-I000003 represents the general term set, and the symbol in Figure PCTKR2013011920-appb-I000004 represents the set of words appearing in statutes. The symbol in Figure PCTKR2013011920-appb-I000005 represents the technical term set, the symbol in Figure PCTKR2013011920-appb-I000006 represents words that are included in the general term set but not in the statute word set, and the symbol in Figure PCTKR2013011920-appb-I000007 represents words that are included in the technical terms and also in the statute keywords.
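The equation images are not reproduced in the text, so the exact formula cannot be confirmed here. As an illustration only, the sketch below uses the textbook definition of PMI (usually called pointwise mutual information), log2 of P(x, y) / (P(x) P(y)), with probabilities estimated from counts over Q&A items. The function name and argument names are hypothetical.

```python
import math

def pmi(joint_count: int, question_count: int, answer_count: int, n_items: int) -> float:
    """Textbook PMI between a question word x and an answer word y.

    joint_count: items in which x is in the question and y is in the answer
    question_count: items whose question contains x
    answer_count: items whose answer contains y
    n_items: total number of Q&A items
    """
    if joint_count == 0:
        return float("-inf")  # the pair never co-occurs
    p_xy = joint_count / n_items
    p_x = question_count / n_items
    p_y = answer_count / n_items
    return math.log2(p_xy / (p_x * p_y))
```

A positive value means the two words co-occur more often than independence would predict. How the document converts this score into the mapping probability stored in the table is not recoverable from the text.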
In step (d), the mapping table may be used to calculate mapping probabilities for the n (n being a natural number) technical terms that match the words included in the query, and a prediction is made from them.
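Selecting the n best candidates from the mapping table might look like the sketch below. The nested-dictionary table layout and the rule of keeping the best score per technical term are assumptions made for illustration; `top_n_terms` is a hypothetical name.

```python
import heapq

def top_n_terms(mapping_table, query_words, n):
    """Return the n technical terms with the highest mapping score.

    mapping_table: {general_term: {technical_term: score}} (assumed layout).
    """
    candidates = {}
    for word in query_words:
        for term, score in mapping_table.get(word, {}).items():
            # Keep the best score seen for each technical term.
            candidates[term] = max(score, candidates.get(term, float("-inf")))
    return heapq.nlargest(n, candidates.items(), key=lambda item: item[1])
```

Query words that are missing from the table simply contribute no candidates.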
In step (d), the mapping probability is calculated using a naïve Bayesian classifier, according to the following formula:
Figure PCTKR2013011920-appb-I000008
Here, the relation in Figure PCTKR2013011920-appb-I000009 holds, where the symbol in Figure PCTKR2013011920-appb-I000010 represents a mapping probability included in the mapping table, the symbol in Figure PCTKR2013011920-appb-I000011 represents a technical term, and X represents the query.
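Because the formula image is not available in the text, the sketch below assumes the standard naive Bayes form: the score of a technical term c for a query X is P(c) multiplied by P(x | c) for every query word x. It works in log space to avoid underflow, and the smoothing `floor` for unseen pairs is an added assumption. All names are hypothetical.

```python
import math

def naive_bayes_scores(query_words, likelihood, prior, floor=1e-6):
    """Rank technical terms for a query with a standard naive Bayes score.

    likelihood: {(word, term): P(word | term)} taken from a mapping table
    prior: {term: P(term)}
    floor: assumed smoothing value for pairs missing from the table
    Returns (term, log_score) pairs, best first.
    """
    scores = {}
    for term, p_term in prior.items():
        log_score = math.log(p_term)
        for word in query_words:
            log_score += math.log(likelihood.get((word, term), floor))
        scores[term] = log_score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Returning the full ranking rather than a single best class is in keeping with the description of providing the top n technical terms.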

According to another embodiment of the present invention, a specialized field search support method may be provided that comprises: (a) analyzing precedent data to extract words; (b) calculating mapping probabilities between words and statutes using the words, and generating a word-statute mapping table; and (c) predicting a statute for a query using the word-statute mapping table.
The word-statute mapping table includes a reliability for each mapping between a word and a statute, the reliability being calculated by the following formula:
Figure PCTKR2013011920-appb-I000012
Figure PCTKR2013011920-appb-I000013
Here, the symbol in Figure PCTKR2013011920-appb-I000014 represents the set of words appearing in the precedent data, and the symbol in Figure PCTKR2013011920-appb-I000015 represents the set of statute names. The symbol in Figure PCTKR2013011920-appb-I000016 represents a word that belongs to the set of words appearing in the precedent data but is not part of a statute name, and the symbol in Figure PCTKR2013011920-appb-I000017 represents the statutes, among the words appearing in the precedent data, that belong to the set of statute names.
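The reliability formula is only available as an image, so the following sketch assumes the usual association-rule confidence: the number of precedents that contain the word and cite the statute, divided by the number of precedents that contain the word. The function name and the input layout are hypothetical.

```python
from collections import Counter, defaultdict

def build_word_statute_table(cases):
    """Build {word: {statute: reliability}} from precedent data.

    cases: iterable of (words, statutes), one entry per precedent, where
    statutes are the statute names cited in that precedent.
    """
    word_count = Counter()
    pair_count = Counter()
    for words, statutes in cases:
        unique_words = set(words)
        word_count.update(unique_words)
        for word in unique_words:
            for statute in set(statutes):
                pair_count[(word, statute)] += 1
    table = defaultdict(dict)
    for (word, statute), count in pair_count.items():
        # Assumed definition: confidence of the rule word -> statute.
        table[word][statute] = count / word_count[word]
    return dict(table)
```

Each precedent is counted once per word, so a word repeated many times in a single precedent does not inflate its reliability.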

According to another aspect of the present invention, an apparatus is provided that learns the mapping probabilities between general terms and technical terms or statutes using Q&A data or precedent data collected from websites, and then uses those probabilities to predict and provide technical terms or statutes for a query.
According to one embodiment of the present invention, a specialized field search support apparatus may be provided that comprises: a collection unit that collects question-answer data from web documents; an extraction unit that distinguishes and analyzes the question part and the answer part of the question-answer data to extract words; a mapping table generation unit that calculates a general term-technical term mapping probability through correlation analysis between the words included in the question part and the answer part, and generates a general term-technical term mapping table; and a prediction unit that uses the term mapping table to extract and provide technical terms that include words contained in a query.

According to another embodiment of the present invention, a specialized field search support apparatus may be provided that comprises: an extraction unit that analyzes precedent data to extract words; a mapping table generation unit that calculates mapping probabilities between words and statutes using the words, and generates a word-statute mapping table; and a prediction unit that predicts a statute for a query using the word-statute mapping table.

By providing the specialized field search support method and apparatus according to an embodiment of the present invention, the mapping probabilities between general terms and technical terms or statutes can be learned from Q&A data or precedent data collected from websites, and then used to predict and provide the technical terms or statutes relevant to a query.
As a result, the present invention has the advantage of making search more convenient for users who have relatively little knowledge of technical terms or statutes.

FIG. 1 is a diagram schematically illustrating the configuration of a specialized field search support apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating a mapping table according to an embodiment of the present invention.
FIG. 3 is a block diagram schematically illustrating the internal configuration of a specialized field search support apparatus according to another embodiment of the present invention.
FIG. 4 is a diagram illustrating a word-statute mapping table according to another embodiment of the present invention.
FIG. 5 is a diagram showing the statutes predicted for a query according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating a method by which a specialized field search support apparatus according to an embodiment of the present invention provides technical terms for a query.
FIG. 7 is a flowchart illustrating a method by which a specialized field search support apparatus according to another embodiment of the present invention provides statutes for a query.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. In describing the present invention, detailed descriptions of related known technologies are omitted where they could obscure the gist of the invention.
Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.
The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.
The present invention is intended to provide a statute search service to members of the general public who have relatively little legal knowledge, and can do so by mapping the general terms that they commonly use to technical terms.
In addition, the present invention can generate a word-statute mapping table based on the mapping probabilities between words and statutes derived from precedent data, and use the table to predict and provide the statutes relevant to a query.
Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram schematically illustrating the configuration of a specialized field search support apparatus according to an embodiment of the present invention, and FIG. 2 is a diagram illustrating a mapping table according to an embodiment of the present invention. With reference to FIG. 1, an apparatus is described that collects Q&A data from websites, divides the Q&A data into a question part and an answer part, extracts words from each, calculates the mapping probabilities between general terms and technical terms, and then predicts and provides the technical terms for a query.
Referring to FIG. 1, the specialized field search support apparatus 100 according to an embodiment of the present invention includes a collection unit 110, an extraction unit 115, a mapping table generation unit 120, a prediction unit 125, a memory 130, and a control unit 135.
The collection unit 110 is a means for collecting question-answer data through a communication network.
For example, the collection unit 110 may collect Q&A data from web documents (for example, Naver Knowledge iN) and store it in a database.
The extraction unit 115 is a means for analyzing the Q&A data to separate the question part from the answer part, and then extracting words from each of the question part and the answer part.
For example, the extraction unit 115 may perform morphological analysis on the Q&A data and extract each word in morpheme units. Suppose the question in the Q&A data is "I have a question about the image copyright of a product." The extraction unit 115 may then extract "product", "image", "copyright", and "question" as words.
In addition, given the nature of Q&A data, the people who answer a question are more likely than the questioner to be experts in the relevant field, and some websites operate services in which experts in the field, such as doctors, lawyers, and patent attorneys, provide the answers.
Accordingly, in this specification, words extracted from the question part of the Q&A data are referred to as general terms, and words extracted from the answer part are referred to as technical terms, depending on the context.
The mapping table generation unit 120 is a means for calculating mapping probabilities by analyzing the correlation between the words extracted from the Q&A data, and for generating a term mapping table based on those probabilities.
To this end, the mapping table generation unit 120 may calculate the mapping probability for the words extracted from the question part and the answer part using the frequency with which the words co-occur across the question part and the answer part.
For example, the mapping table generation unit 120 may calculate the mapping probability by computing the pairwise mutual information (PMI) between a general term and a technical term.
The PMI may be calculated using Equation 1 below.
Figure PCTKR2013011920-appb-M000001
Figure PCTKR2013011920-appb-I000018
Here,
Figure PCTKR2013011920-appb-I000019
represents the set of words appearing in questions,
Figure PCTKR2013011920-appb-I000020
represents the set of words appearing in statutes, and
Figure PCTKR2013011920-appb-I000021
represents the set of words appearing in answers.
In addition,
Figure PCTKR2013011920-appb-I000022
represents the words in the question word set that are not in the statute word set, and
Figure PCTKR2013011920-appb-I000023
represents the words that are in both the answer word set and the statute word set.
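As an illustration only, the co-occurrence-based PMI calculation described above might be sketched as follows. Because the equation itself is available here only as an image reference, the sketch assumes the standard pointwise mutual information form, log2(P(wi, wj) / (P(wi) P(wj))), estimated from the fraction of Q&A pairs in which the words occur. The function name `build_pmi_table`, its parameters, and the sample data are hypothetical. Note that a PMI value is an association score rather than a normalized probability.

```python
import math
from collections import Counter

def build_pmi_table(qa_pairs, statute_vocab):
    """Return {(general_term, technical_term): (co_occurrence_count, pmi)}.

    qa_pairs: iterable of (question_words, answer_words) for each Q&A item.
    statute_vocab: words that appear in statutes (assumed to be given).
    """
    statute_vocab = set(statute_vocab)
    n = 0
    q_count, a_count, pair_count = Counter(), Counter(), Counter()
    for q_words, a_words in qa_pairs:
        n += 1
        # General terms: question words that are not statute words.
        general = set(q_words) - statute_vocab
        # Technical terms: answer words that are also statute words.
        technical = set(a_words) & statute_vocab
        q_count.update(general)
        a_count.update(technical)
        pair_count.update((g, t) for g in general for t in technical)

    table = {}
    for (g, t), c in pair_count.items():
        p_joint = c / n
        p_g = q_count[g] / n
        p_t = a_count[t] / n
        table[(g, t)] = (c, math.log2(p_joint / (p_g * p_t)))
    return table

# Hypothetical sample data, for illustration only.
pairs = [
    (["picture", "copy"], ["copyright", "thanks"]),
    (["picture"], ["copyright"]),
    (["loan"], ["interest"]),
    (["loan", "interest"], ["interest"]),
]
table = build_pmi_table(pairs, {"copyright", "interest"})
```

Each entry pairs a co-occurrence count with a score, which mirrors the two quantities that the mapping table of FIG. 2 is described as holding.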
FIG. 2 illustrates a mapping table according to an embodiment of the present invention. Referring to FIG. 2, the mapping table contains, for each word in the question part, the corresponding word in the answer part, the number of times the two co-occur, and the mapping probability calculated from that count.
The prediction unit 125 is a means for using the mapping table to calculate the technical term mapping probability for a query entered by a user, and for predicting and providing the related technical terms. With reference to the mapping table, the prediction unit 125 may also extract and provide the top n (n being a natural number) technical terms with the highest mapping probabilities among the technical terms corresponding to the words included in the query.
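A minimal sketch of the top-n lookup described above is given below, assuming the mapping table is held as a dictionary keyed by (general term, technical term). The function name `top_n_terms` and the sample scores are hypothetical, and keeping the best score when several query words map to the same technical term is an assumption of this sketch, not something the description specifies.

```python
def top_n_terms(query_words, mapping_table, n=3):
    """Return the n technical terms with the highest mapping score.

    mapping_table: {(general_term, technical_term): score}.
    """
    query_words = set(query_words)
    best = {}
    for (general, technical), score in mapping_table.items():
        if general in query_words:
            # Assumption: keep the highest score seen for each technical term.
            best[technical] = max(score, best.get(technical, float("-inf")))
    # Sort by descending score, breaking ties alphabetically for stable output.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]

# Hypothetical mapping table, for illustration only.
mapping_table = {
    ("picture", "copyright"): 0.9,
    ("picture", "portrait right"): 0.4,
    ("copy", "copyright"): 0.7,
    ("loan", "interest"): 0.8,
}
result = top_n_terms(["picture", "copy"], mapping_table, n=2)
```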
For example, the prediction unit 125 may use a naive Bayesian classifier to calculate the probability of each statute implied by the user's input query, which is described in more detail below with reference to FIG. 3.
The memory 130 is a means for storing the various algorithms, mapping tables, and other data needed to operate the specialized field search support apparatus 100 according to an embodiment of the present invention.
The control unit 135 is a means for controlling the internal components of the specialized field search support apparatus 100 according to an embodiment of the present invention (for example, the collection unit 110, the extraction unit 115, the mapping table generation unit 120, the memory 130, and so on).
FIG. 1 has been described with a focus on calculating the mapping probabilities between general terms and technical terms and generating the corresponding mapping table according to an embodiment of the present invention. Next, with reference to FIG. 3, an apparatus according to another embodiment of the present invention is described that predicts and provides the statutes for a query based on the mapping probabilities between queries and statutes.

FIG. 3 is a block diagram schematically illustrating the internal configuration of a specialized field search support apparatus according to another embodiment of the present invention, FIG. 4 is a diagram illustrating a word-statute mapping table according to another embodiment of the present invention, and FIG. 5 is a diagram showing the statutes predicted for a query according to an embodiment of the present invention.
Referring to FIG. 3, the specialized field search support apparatus 300 according to another embodiment of the present invention includes a collection unit 310, an extraction unit 315, a mapping table generation unit 320, a prediction unit 325, a memory 330, and a control unit 335.
The collection unit 310 is a means for collecting precedent data.
For example, the collection unit 310 may collect precedent data from a specific website that provides precedent data.
The extraction unit 315 is a means for analyzing the precedent data to extract words.
For example, the extraction unit 315 may analyze the precedent data in morpheme units to extract each word. The method of extracting words in morpheme units from a sentence is already known to those skilled in the art, so a separate description is omitted.
The mapping table generation unit 320 is a means for calculating the mapping probabilities between the words extracted from the precedent data and statutes, and for generating a mapping table. Hereinafter, to distinguish it from the mapping table described with reference to FIG. 1, it is referred to as the word-statute mapping table.
For example, the mapping table generation unit 320 may use confidence as the measure for calculating the mapping probability between a word extracted from the precedent data and a statute.
The mapping table generation unit 320 may calculate the mapping probability between a word and a statute using Equation 2 below.

Figure PCTKR2013011920-appb-M000002
Figure PCTKR2013011920-appb-I000024
Here,
Figure PCTKR2013011920-appb-I000025
represents the set of words appearing in the precedent data,
Figure PCTKR2013011920-appb-I000026
represents the set of statute names,
Figure PCTKR2013011920-appb-I000027
represents the words in the precedent word set that are not statute names, and
Figure PCTKR2013011920-appb-I000028
represents the statutes, among the words appearing in the precedent data, that are included in the set of statute names.
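The confidence measure can be illustrated with a short sketch. Since the equation is available here only as an image reference, the sketch assumes the usual association-rule definition: the confidence of a word-to-statute mapping is the number of precedents containing both the word and the statute name, divided by the number of precedents containing the word. The function name `build_word_statute_table` and the sample precedents are hypothetical.

```python
from collections import Counter

def build_word_statute_table(precedents, statute_names):
    """Return {(word, statute): confidence} from tokenized precedents.

    precedents: iterable of token lists, one list per precedent.
    statute_names: set of known statute names (assumed to be given).
    """
    statute_names = set(statute_names)
    word_count, pair_count = Counter(), Counter()
    for tokens in precedents:
        tokens = set(tokens)
        statutes = tokens & statute_names   # statute names cited in the precedent
        words = tokens - statute_names      # remaining ordinary words
        word_count.update(words)
        pair_count.update((w, s) for w in words for s in statutes)
    # Confidence of w -> s: count(w and s) / count(w).
    return {(w, s): c / word_count[w] for (w, s), c in pair_count.items()}

# Hypothetical precedents, for illustration only.
precedents = [
    ["fraud", "deceive", "Criminal Act"],
    ["fraud", "Criminal Act", "Civil Act"],
    ["fraud", "contract", "Civil Act"],
    ["contract", "Civil Act"],
]
table = build_word_statute_table(precedents, {"Criminal Act", "Civil Act"})
```

In this sample, "fraud" appears in three precedents, two of which cite the Criminal Act, so that mapping receives a confidence of two thirds.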
FIG. 4 illustrates the word-statute mapping table. As shown in FIG. 4, the word-statute mapping table contains each word, the statutes mapped to that word, and the corresponding confidence. That is, the higher the confidence recorded for a word-statute pair, the more reliably that word indicates the statute.
The prediction unit 325 is a means for using the keyword-statute table to predict the statute mapping probability for the user's input query.
For example, the prediction unit 325 may use a naive Bayesian classifier to calculate the probability of each statute implied by the user's input query.
Because a query is generally related to several statutes, the prediction unit 325 may calculate the query-statute mapping probability using the naive Bayesian classifier with the MAX function removed, as expressed in Equation 3.
Figure PCTKR2013011920-appb-M000003
Here,
Figure PCTKR2013011920-appb-I000029
represents the mapping probability, which is the same as in Equation 2. To prevent the mapping probability from decreasing excessively when predicting the query-statute mapping probability, the prediction unit 325 may refer to the keyword-statute table and calculate the mapping probability using only the top n keyword-statute mappings that are most likely to map to statute C_i. Expressed as a formula, this is Equation 4.
Figure PCTKR2013011920-appb-M000004
Figure PCTKR2013011920-appb-I000030
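The following sketch illustrates the idea of scoring every statute rather than selecting only the maximum, and of limiting the product to the top n keyword-statute mappings. It is a simplification under stated assumptions: the equations are available here only as image references, so the sketch simply multiplies the per-word mapping probabilities from the table and omits any prior term. The function name `rank_statutes` and the sample values are hypothetical.

```python
import math

def rank_statutes(query_words, table, n=2):
    """Rank statutes for a query using only the top-n word mappings per statute.

    table: {(word, statute): mapping_probability}.
    Returns a list of (statute, score) sorted by descending score.
    """
    query_words = set(query_words)
    per_statute = {}
    for (word, statute), prob in table.items():
        if word in query_words and prob > 0:
            per_statute.setdefault(statute, []).append(prob)

    scores = {}
    for statute, probs in per_statute.items():
        # Keep only the n strongest mappings so that weak ones do not
        # drive the product toward zero.
        top = sorted(probs, reverse=True)[:n]
        scores[statute] = math.prod(top)
    # No max/argmax step: every candidate statute is returned with its score.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical keyword-statute table, for illustration only.
table = {
    ("fraud", "Criminal Act"): 0.8,
    ("deceive", "Criminal Act"): 0.5,
    ("money", "Criminal Act"): 0.1,
    ("fraud", "Civil Act"): 0.6,
    ("money", "Civil Act"): 0.5,
}
query = ["fraud", "deceive", "money"]
ranked_top2 = rank_statutes(query, table, n=2)
ranked_top3 = rank_statutes(query, table, n=3)
```

With n=2 the Criminal Act ranks first in this sample, whereas with n=3 the weak "money" mapping pulls its product below that of the Civil Act, which is the kind of excessive decrease that restricting to the top n mappings is meant to avoid.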
FIG. 5 shows the statutes predicted for a query and the corresponding mapping probabilities.
As shown in FIG. 5, the prediction unit 325 may use the word-statute mapping table to calculate the mapping probability of each statute for the query, and then predict and provide the statutes for the query in descending order of mapping probability.
The memory 330 is a means for storing the various algorithms, mapping tables, and other data needed to operate the specialized field search support apparatus 300 according to an embodiment of the present invention.
The control unit 335 is a means for controlling the internal components of the specialized field search support apparatus 300 according to an embodiment of the present invention (for example, the collection unit 310, the extraction unit 315, the mapping table generation unit 320, the prediction unit 325, the memory 330, and so on).

FIG. 6 is a flowchart illustrating a method by which a specialized field search support apparatus according to an embodiment of the present invention provides technical terms for a query.
In step 610, the specialized field search support apparatus collects Q&A data from websites and stores it in a database.
In step 615, the specialized field search support apparatus divides the collected Q&A data into a question part and an answer part, and analyzes each to extract words.
For example, as described above, the specialized field search support apparatus may analyze the question part and the answer part in morpheme units and extract each word in morpheme units. This is already apparent to those skilled in the art, so a separate description is omitted.
In step 620, the specialized field search support apparatus generates a mapping table by analyzing the correlation between the words extracted from the question part and the words extracted from the answer part.
As described above, the mapping table may be generated by calculating the mapping probability through correlation analysis between each word extracted from the question part and each word extracted from the answer part.
In step 625, the specialized field search support apparatus uses the mapping table to predict and provide the technical terms corresponding to the query.
For example, the specialized field search support apparatus may refer to the mapping table, calculate the mapping probabilities for the input query based on a naive Bayesian classifier, and then predict and provide the top n technical terms with the highest mapping probabilities.

FIG. 7 is a flowchart illustrating a method by which a specialized field search support apparatus according to another embodiment of the present invention provides statutes for a query.
In step 710, the specialized field search support apparatus collects precedent data. For example, the specialized field search support apparatus may collect precedent data from websites and store it.
In step 715, the specialized field search support apparatus analyzes the precedent data to extract words. For example, the specialized field search support apparatus may analyze the precedent data in morpheme units and extract each word in morpheme units. This is already apparent to those skilled in the art, so a separate description is omitted.
In step 720, the specialized field search support apparatus calculates the mapping probabilities between the words extracted from the precedent data and statutes, and generates a word-statute mapping table.
This is the same as described with reference to FIG. 3, so a redundant description is omitted.
In step 725, the specialized field search support apparatus uses the word-statute mapping table to predict the statutes for a query.
That is, the specialized field search support apparatus may refer to the word-statute mapping table to calculate the statute mapping probabilities for the query, and predict the statutes for the query on that basis. To this end, the specialized field search support apparatus may use a naive Bayesian classifier, which is a concept classification technique. This is the same as described using Equation 4 with reference to FIG. 3, so a redundant description is omitted.

Meanwhile, the specialized field search support method according to an embodiment of the present invention may be implemented in the form of program instructions that can be executed by various means of electronically processing information, and may be recorded on a storage medium. The storage medium may include program instructions, data files, data structures, and the like, alone or in combination.
The program instructions recorded on the storage medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the software field. Examples of storage media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed, using an interpreter or the like, by a device that electronically processes information, for example a computer.
The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

Although the present invention has been described above with reference to preferred embodiments, those of ordinary skill in the art will understand that the invention may be modified and changed in various ways without departing from the spirit and scope of the invention as set forth in the claims below.

Claims (10)

  1. A specialized field search support method comprising:
    (a) collecting question-answer data from web documents;
    (b) separating a question part and an answer part in the question-answer data and analyzing each to extract words;
    (c) calculating general term-technical term mapping probabilities by analyzing the correlation between the words included in the question part and the answer part, and generating a general term-technical term mapping table; and
    (d) extracting and providing, using the term mapping table, a technical term that includes a word included in a query.

  2. The method of claim 1, wherein step (c) comprises
    calculating the general term-technical term mapping probabilities using the frequency of words that co-occur in the question part and the answer part.

  3. The method of claim 1,
    wherein the general term-technical term mapping probability is calculated using pairwise mutual information (PMI):
    Figure PCTKR2013011920-appb-I000031

    Figure PCTKR2013011920-appb-I000032

    where
    Figure PCTKR2013011920-appb-I000033
    represents the set of general terms,
    Figure PCTKR2013011920-appb-I000034
    represents the set of words appearing in statutes,
    Figure PCTKR2013011920-appb-I000035
    represents the set of technical terms,
    Figure PCTKR2013011920-appb-I000036
    represents the words that are included in the general term set but not in the statute word set, and
    Figure PCTKR2013011920-appb-I000037
    represents the words that are included in the technical terms and also in the statute keywords.

  4. The method of claim 1,
    wherein step (d) comprises
    using the mapping table to calculate and predict mapping probabilities for n (n being a natural number) technical terms that match the words included in the query.

  5. The method of claim 4,
    wherein, in step (d),
    the mapping probability is calculated using a naïve Bayesian classifier.

  6. The method of claim 5,
    wherein the mapping probability is calculated by the following formula:
    Figure PCTKR2013011920-appb-I000038

    where
    Figure PCTKR2013011920-appb-I000039
    and where
    Figure PCTKR2013011920-appb-I000040
    represents the mapping probability included in the mapping table,
    Figure PCTKR2013011920-appb-I000041
    represents a technical term, and X represents the query.

  7. A specialized field search support method comprising:
    (a) analyzing precedent data to extract words;
    (b) calculating mapping probabilities between words and statutes using the words, and generating a word-statute mapping table; and
    (c) predicting a statute for a query using the word-statute mapping table.

  8. 제8 항에 있어서,
    상기 단어-법령 매핑 테이블은 단어와 법령간의 매핑에 따른 신뢰도를 포함하되, 상기 신뢰도는 하기 수식에 의해 계산되는 것을 특징으로 하는 전문 분야 검색 지원 방법.
    Figure PCTKR2013011920-appb-I000042

    Figure PCTKR2013011920-appb-I000043

    여기서,
    Figure PCTKR2013011920-appb-I000044
    는 판례 데이터내 출현한 단어의 집합을 나타내고,
    Figure PCTKR2013011920-appb-I000045
    는 법령명에 대한 집합을 나타내며,
    Figure PCTKR2013011920-appb-I000046
    은 판례 데이터내 출현 단어 집합에 포함되는 단어들 중 법령명에 포함되지 않는 단어를 나타내고,
    Figure PCTKR2013011920-appb-I000047
    는 판례 데이터내에 출현하는 단어들 중 법령명의 집합에 포함되는 법령들을 나타냄.

    The method of claim 7,
    wherein the word-statute mapping table includes a reliability for each mapping between a word and a statute, the reliability being calculated by the following formulas:
    Figure PCTKR2013011920-appb-I000042

    Figure PCTKR2013011920-appb-I000043

    where
    Figure PCTKR2013011920-appb-I000044
    denotes the set of words appearing in the precedent data,
    Figure PCTKR2013011920-appb-I000045
    denotes the set of statute names,
    Figure PCTKR2013011920-appb-I000046
    denotes a word, among the words in the set of words appearing in the precedent data, that is not included in a statute name, and
    Figure PCTKR2013011920-appb-I000047
    denotes the statutes, among the words appearing in the precedent data, that are included in the set of statute names.

  9. 웹 문서에서 질의-답변 데이터를 수집하는 수집부;
    상기 질의-답변 데이터에서 질문부와 답변부를 구분하여 분석하여 단어를 추출하는 추출부;
    상기 질문부 및 상기 답변부에 포함된 각 단어들간의 상관성 분석을 통해 일반 용어-전문 용어 매핑 확률을 계산하여 일반 용어-전문 용어 매핑 테이블을 생성하는 매핑 테이블 생성부; 및
    상기 용어 매핑 테이블을 이용하여 질의문에 포함된 단어를 포함하는 전문 용어를 추출하여 제공하는 예측부를 포함하는 전문 분야 검색 지원 장치.

    An apparatus for supporting search in a specialist field, comprising:
    a collection unit that collects question-answer data from web documents;
    an extraction unit that separates the question part and the answer part of the question-answer data, analyzes them, and extracts words;
    a mapping table generation unit that calculates general term-technical term mapping probabilities through correlation analysis between the words included in the question part and the answer part, and generates a general term-technical term mapping table; and
    a prediction unit that uses the term mapping table to extract and provide technical terms that contain a word included in a query.
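
The mapping table generation unit of claim 9 might be approximated as below. The sketch assumes that general terms are taken from the question part, that technical terms are recognized in the answer part through a predefined lexicon, and that "correlation" is reduced to a simple co-occurrence ratio. The sample pairs, the lexicon `TECHNICAL_TERMS`, and the function name `build_term_mapping` are hypothetical and are not taken from the patent.

```python
from collections import Counter, defaultdict

# Hypothetical question-answer pairs and technical-term lexicon.
QA_PAIRS = [
    ("landlord will not give back my deposit",
     "file a claim for deposit refund under the lease"),
    ("paid money to the wrong account by mistake",
     "this is unjust_enrichment so demand restitution"),
    ("sent money by mistake", "unjust_enrichment applies"),
]
TECHNICAL_TERMS = {"lease", "unjust_enrichment", "restitution"}


def build_term_mapping(qa_pairs, technical_terms):
    """Estimate P(question word | technical term) from Q&A co-occurrence."""
    term_count = Counter()       # number of Q&A pairs whose answer has the term
    cooc = defaultdict(Counter)  # pairs where the term and question word co-occur
    for question, answer in qa_pairs:
        q_words = set(question.split())
        a_terms = set(answer.split()) & technical_terms
        for term in a_terms:
            term_count[term] += 1
            for word in q_words:
                cooc[term][word] += 1
    return {
        term: {w: c / term_count[term] for w, c in words.items()}
        for term, words in cooc.items()
    }
```

The resulting dictionary has the same shape as the mapping table a naive Bayes predictor would consume: one row per technical term, with a probability for each general word seen alongside it.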

  10. 판례 데이터를 분석하여 단어를 각각 추출하는 추출부;
    상기 단어를 이용하여 단어와 법령간 매핑 확률을 계산하여 단어-법령 매핑 테이블을 생성하는 매핑 테이블 생성부; 및
    상기 단어-법령 매핑 테이블을 이용하여 질의문에 대한 법령을 예측하는 예측부를 포함하는 전문 분야 검색 지원 장치.

    An apparatus for supporting search in a specialist field, comprising:
    an extraction unit that extracts words by analyzing precedent data;
    a mapping table generation unit that generates a word-statute mapping table by calculating mapping probabilities between the words and statutes; and
    a prediction unit that predicts a statute for a query using the word-statute mapping table.

PCT/KR2013/011920 2013-08-14 2013-12-20 Method for supporting search in specialist fields and apparatus therefor WO2015023031A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020130096419A KR101515413B1 (en) 2013-08-14 2013-08-14 Professional field search supporting method and apparatus
KR10-2013-0096419 2013-08-14

Publications (1)

Publication Number Publication Date
WO2015023031A1 (en) 2015-02-19

Family

ID=52468407

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2013/011920 WO2015023031A1 (en) 2013-08-14 2013-12-20 Method for supporting search in specialist fields and apparatus therefor

Country Status (2)

Country Link
KR (1) KR101515413B1 (en)
WO (1) WO2015023031A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101707941B1 (en) * 2015-12-09 2017-02-27 펄슨정보기술 주식회사 Method, device and computer readable recording medium for searching precedent using automatic coversion between general term and legal term
KR102607216B1 (en) * 2016-04-01 2023-11-29 삼성전자주식회사 Method of generating a diagnosis model and apparatus generating a diagnosis model thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020169743A1 (en) * 2001-05-08 2002-11-14 David Arnold Web-based method and system for identifying and searching patents
KR20040042065A (en) * 2002-11-12 2004-05-20 하창승 Intelligent information searching method using case-based reasoning algorithm and association rule mining algorithm
US20110093449A1 (en) * 2008-06-24 2011-04-21 Sharon Belenzon Search engine and methodology, particularly applicable to patent literature
US20110313987A1 (en) * 2009-12-01 2011-12-22 Rishab Aiyer Ghosh System and method for search of sources and targets based on relative expertise of the sources
WO2012178152A1 (en) * 2011-06-23 2012-12-27 I3 Analytics Methods and systems for retrieval of experts based on user customizable search and ranking parameters

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018040501A1 (en) * 2016-09-05 2018-03-08 北京百度网讯科技有限公司 Man-machine interaction method and apparatus based on artificial intelligence
KR20190028793A (en) * 2016-09-05 2019-03-19 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Human Machine Interactive Method and Device Based on Artificial Intelligence
KR102170563B1 (en) 2016-09-05 2020-10-27 베이징 바이두 넷컴 사이언스 앤 테크놀로지 코., 엘티디. Human machine interactive method and apparatus based on artificial intelligence
CN111353301A (en) * 2020-02-24 2020-06-30 成都网安科技发展有限公司 Auxiliary secret fixing method and device
CN112182019A (en) * 2020-10-20 2021-01-05 国网福建省电力有限公司经济技术研究院 Semantic parsing search method for power grid statistics professional index feature extraction

Also Published As

Publication number Publication date
KR101515413B1 (en) 2015-04-29
KR20150019474A (en) 2015-02-25

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13891351; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 13891351; Country of ref document: EP; Kind code of ref document: A1)