KR102216065B1

KR102216065B1 - Method for providing search result for video segment

Info

Publication number: KR102216065B1
Application number: KR1020200053412A
Authority: KR
Inventors: 박승범; 장수현; 안근진
Original assignee: 호서대학교 산학협력단; 주식회사 리빈에이아이
Priority date: 2020-05-04
Filing date: 2020-05-04
Publication date: 2021-02-18

Abstract

The present invention provides a search method of a video segment or an image using an artificial intelligence-based search model. A video segment or an image does not basically include text-based information, but a corresponding passage can be generated by using a technique such as object recognition and caption generation. If summary information on a video or an image exists, the information can also be included in the passage. A video segment or an image can be searched by using the generated text passage. If an artificial intelligence-based search model is used, search results with improved performance can be obtained in comparison to using only an unsupervised search model.

Description

Method for providing search result for video segment

The present invention relates to a method of providing search results for video segments and, more specifically, to a method of providing search results for video segments using an information retrieval model trained with a weak-supervision methodology. The present invention can also be used for image search.

Search technology has advanced steadily since Google introduced PageRank, a search technique grounded in graph theory. Such technology is based on unsupervised learning, so search is possible given only a document corpus. BM25 is a representative search model based on unsupervised learning, and it shows markedly improved performance when combined with the query expansion technique RM3. Among open-source tools, Anserini is widely used in both academia and industry.

Meanwhile, following academic research that applies artificial intelligence techniques to natural language processing, various search models have been proposed. Examples include deep learning-based search models such as DRMM, KNRM, and PACRR. BERT, released by Google in 2018, has performed well across a range of natural language processing tasks, and research on using it as a transformer-based or language model-based search model has followed.

The Ad-Hoc Information Retrieval section of Papers With Code, a website that lists artificial intelligence models with published open-source code for each field, shows the current state of the art (SOTA), that is, the best-performing search models, among artificial intelligence-based search models and Anserini, a search model based on unsupervised learning. According to Jimmy Lin, a researcher at the University of Waterloo in Canada, the deep learning search models that predate BERT, namely DRMM, KNRM, and PACRR, performed similarly to or worse than Anserini, whereas models proposed after BERT outperform Anserini (see Lin, Jimmy. "The Neural Hype, Justified! A Recantation."). This can also be confirmed on the leaderboard of the same Ad-Hoc Information Retrieval section of Papers With Code. These academic results indicate that artificial intelligence-based search models can improve the accuracy of search results.

However, artificial intelligence-based search models have several limitations.

Before an artificial intelligence-based search model can be used for inference it must be trained, and training requires a large amount of labeled data. Labeled data must, as a rule, be produced by humans, and given the volume of data needed for training, the labeling cost makes this uneconomical.

Another problem is document length. Search models based on unsupervised learning generally handle long documents without difficulty, but most artificial intelligence-based search models limit the length of the documents they can process. BERT, for example, can process at most 512 tokens. This is not a problem when the search target is a corpus of short texts, but it makes such models hard to apply to long documents such as patents and papers.

Videos and images, meanwhile, do not inherently contain text information, so they are difficult to search with information retrieval techniques.

[1] https://paperswithcode.com/task/ad-hoc-information-retrieval
[2] MacAvaney, Sean, et al. "CEDR: Contextualized embeddings for document ranking." Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.
[3] Dai, Zhuyun, and Jamie Callan. "Deeper text understanding for IR with contextual neural language modeling." Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019.

The present invention aims to solve the problems described above by providing a method of providing search results for video segments or images using a text-form query.

According to one aspect of the present invention, there is provided a computer-implemented method for searching video segments, which provides search results corresponding to an input query from a corpus including a plurality of passages, the method comprising: (a) retrieving N passages corresponding to the input query from the corpus by a search model based on an unsupervised methodology; (b) re-ranking the N passages retrieved in step (a), based on the input query, by an artificial intelligence-based search model trained by a weak-supervision methodology; and (c) outputting a list of search results corresponding to the N passages re-ranked in step (b), wherein each passage includes at least one of object information and subtitle information of a video segment.
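As an illustrative sketch only, a passage of the kind described in this aspect could be assembled as follows. The `VideoSegment` type, its field names, and the `build_passage` helper are hypothetical names introduced here; the specification states only that a passage includes at least one of object information and subtitle information of a video segment.

```python
from dataclasses import dataclass, field


@dataclass
class VideoSegment:
    """Hypothetical container for the text-like information of one segment."""
    segment_id: str
    objects: list[str] = field(default_factory=list)  # e.g. labels from object recognition
    subtitles: str = ""                               # subtitle text of the segment


def build_passage(segment: VideoSegment) -> str:
    """Join object information and subtitle information into one text passage."""
    parts = []
    if segment.objects:
        parts.append(" ".join(segment.objects))
    if segment.subtitles:
        parts.append(segment.subtitles)
    if not parts:
        # A passage needs at least one of the two kinds of information.
        raise ValueError(f"segment {segment.segment_id} has no text information")
    return " ".join(parts)


segment = VideoSegment("s1", objects=["dog", "ball"], subtitles="fetch the ball")
passage = build_passage(segment)
```

The resulting string can then be indexed like any other text passage in the corpus.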

According to another aspect of the present invention, there is provided an apparatus for searching video segments, which provides search results corresponding to an input query from a corpus including a plurality of passages, the apparatus comprising: at least one processor; and at least one memory storing computer-executable instructions, wherein the computer-executable instructions stored in the at least one memory cause the at least one processor to perform: (a) retrieving N passages corresponding to the input query from the corpus by a search model based on an unsupervised methodology; (b) re-ranking the N passages retrieved in step (a), based on the input query, by an artificial intelligence-based search model trained by a weak-supervision methodology; and (c) outputting a list of search results corresponding to the N passages re-ranked in step (b), wherein each passage includes at least one of object information and subtitle information of a video segment.

According to yet another aspect of the present invention, there is provided a computer program stored on a non-transitory storage medium for searching video segments, which provides search results corresponding to an input query from a corpus including a plurality of passages, the computer program causing a processor to perform: (a) retrieving N passages corresponding to the input query from the corpus by a search model based on an unsupervised methodology; (b) re-ranking the N passages retrieved in step (a), based on the input query, by an artificial intelligence-based search model trained by a weak-supervision methodology; and (c) outputting a list of search results corresponding to the N passages re-ranked in step (b), wherein each passage includes at least one of object information and subtitle information of a video segment.

According to yet another aspect of the present invention, there is provided a computer-implemented method for searching images, which provides search results corresponding to an input query from a corpus including a plurality of passages, the method comprising: (a) retrieving N passages corresponding to the input query from the corpus by a search model based on an unsupervised methodology; (b) re-ranking the N passages retrieved in step (a), based on the input query, by an artificial intelligence-based search model trained by a weak-supervision methodology; and (c) outputting a list of search results corresponding to the N passages re-ranked in step (b), wherein each passage includes at least one of object information and caption information of an image.

According to yet another aspect of the present invention, there is provided an apparatus for searching images, which provides search results corresponding to an input query from a corpus including a plurality of passages, the apparatus comprising: at least one processor; and at least one memory storing computer-executable instructions, wherein the computer-executable instructions stored in the at least one memory cause the at least one processor to perform: (a) retrieving N passages corresponding to the input query from the corpus by a search model based on an unsupervised methodology; (b) re-ranking the N passages retrieved in step (a), based on the input query, by an artificial intelligence-based search model trained by a weak-supervision methodology; and (c) outputting a list of search results corresponding to the N passages re-ranked in step (b), wherein each passage includes at least one of object information and caption information of an image.

According to yet another aspect of the present invention, there is provided a computer program stored on a non-transitory storage medium for searching images, which provides search results corresponding to an input query from a corpus including a plurality of passages, the computer program causing a processor to perform: (a) retrieving N passages corresponding to the input query from the corpus by a search model based on an unsupervised methodology; (b) re-ranking the N passages retrieved in step (a), based on the input query, by an artificial intelligence-based search model trained by a weak-supervision methodology; and (c) outputting a list of search results corresponding to the N passages re-ranked in step (b), wherein each passage includes at least one of object information and subtitle information of an image.

According to the present invention, a method of providing search results for video segments or images using a text-form query is provided.

FIG. 1 is a flowchart showing a method for training an artificial intelligence-based information retrieval model according to the present invention.
FIG. 2 is a flowchart showing an information retrieval method using an artificial intelligence-based information retrieval model according to the present invention.
FIG. 3 is a diagram schematically showing an information retrieval model according to an example of the present invention.
FIG. 4 is a diagram schematically showing an apparatus for performing an information retrieval method using an artificial intelligence-based information retrieval model according to the present invention.
FIG. 5 is a diagram comparing search results according to an embodiment of the present invention with search results of another model.
FIG. 6 is a diagram comparing search results according to an embodiment of the present invention with search results of yet another model.
FIG. 7 is a diagram schematically illustrating a method of generating passages that allow video segments to be searched by the search method according to the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. Identical or similar components are given identical or similar reference numerals, and redundant descriptions of them are omitted. In describing the embodiments disclosed herein, detailed descriptions of related known technologies are omitted where they would obscure the gist of the embodiments. The accompanying drawings are provided only to aid understanding of the embodiments disclosed herein; they do not limit the technical idea disclosed herein, which is to be understood as covering all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

Terms that include ordinal numbers, such as first and second, may be used to describe various components, but they serve only to distinguish one component from another, and the components are not limited by these terms. A singular expression includes the plural unless the context clearly indicates otherwise.

As used herein, terms such as "comprise", "include", and "have" are to be understood as specifying the presence of the features, steps, components, or combinations thereof described in the specification, and are not intended to exclude the possibility that one or more other features, steps, components, or combinations thereof exist or are added.

FIG. 1 shows a method of training an artificial intelligence-based information retrieval model according to the present invention, FIG. 2 shows the steps of performing inference using the artificial intelligence-based information retrieval model according to the present invention, and FIG. 3 schematically shows an example of an information retrieval model according to the present invention. The present invention is described below with reference to these drawings.

[Training stage of the artificial intelligence-based information retrieval model]

Information retrieval model

In this specification, information retrieval models are divided into 'search models based on an unsupervised methodology' and 'artificial intelligence-based search models'. The former are information retrieval models based on statistical or other unsupervised learning methodologies, such as BM25 and QL (Query Likelihood). The latter are information retrieval models obtained by training, including the deep learning family such as DRMM, KNRM, and PACRR and the language model family such as BERT.

In the present invention, the former, the search model based on an unsupervised learning methodology, is used in the training stage to prepare pseudo-labels and in the inference stage to retrieve passages from the corpus. The latter is used in the inference stage to re-rank the retrieved passages.

Information retrieval models for which open-source code exists are listed in the Ad-Hoc Information Retrieval section of Papers With Code.

Search model based on unsupervised learning methodology

Known search models based on an unsupervised methodology include BM25 and QL (Query Likelihood), and Anserini is a widely used open-source implementation. Anserini is a search model that uses BM25 together with RM3, a query expansion methodology. According to published research, the performance of Anserini is comparable to that of deep learning search models such as DRMM, KNRM, and PACRR, but lower than that of language model-based search models such as BERT.

A search model based on an unsupervised learning methodology requires no training and determines the similarity between a query and a document on the basis of statistical theory and the like.

Artificial intelligence-based search model

Known artificial intelligence-based search models include deep learning search models such as DRMM, KNRM, and PACRR, and transformer-based or language model-based search models such as BERT. According to Papers With Code, as of the filing date of this patent application the best-performing search model, that is, the SOTA (State-of-the-Art) model, is understood to be CEDR, whose structure combines the language model BERT with deep learning search models.

An artificial intelligence-based search model is obtained by training it on training data to learn the similarity between queries and documents. In academic work, such models are trained on datasets that provide query-document relations, such as TREC, SQuAD, and MS MARCO.

Language model-based search models such as BERT use a self-supervised learning methodology in the pre-training stage, but in the fine-tuning stage for information retrieval they require supervised learning on a query-document relation dataset, just as deep learning search models do.

Weak-supervision methodology

The query-document relation datasets used in academia, such as TREC, SQuAD, and MS MARCO, either represent ground truth extracted from the search activity of real users or use document titles as queries. In practice, however, often only a document corpus exists: either no user-entered queries for searching it are available, or the few that exist are far too scarce to train an artificial intelligence-based search model. The present invention provides a methodology that, when only a document corpus exists, generates pseudo-queries from each document, uses the generated pseudo-queries to form pseudo query-document relations and generate pseudo-labels, and uses these to train an artificial intelligence-based search model. The weak-supervision methodology for an artificial intelligence-based search model according to the present invention broadly comprises the following steps: 1) generating pseudo-queries from each document in the document corpus; 2) forming pseudo query-document relations using the generated pseudo-queries and generating pseudo-labels based on those relations; and 3) training the artificial intelligence-based search model using the generated pseudo-labels. Each of these steps is described below under its own heading.

1) Generating pseudo-queries

One or more keywords are extracted from each document in the document corpus and used as the pseudo-query for that document. Keywords are extracted with an existing, known keyword extraction technique. Keyword extraction algorithms can be broadly divided into techniques based on unsupervised learning and techniques based on supervised learning, and open-source implementations exist for many of them.
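A minimal sketch of pseudo-query generation, assuming TF-IDF term weighting as the keyword extraction technique. The specification does not name a particular algorithm, so the choice of TF-IDF, the smoothed IDF formula, and the names `tokenize` and `pseudo_queries` are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def pseudo_queries(corpus: list[str], k: int = 3) -> list[str]:
    """Return one pseudo-query per document: its k highest-weighted TF-IDF terms."""
    tokenized = [tokenize(doc) for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    queries = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {t: tf[t] * math.log((1 + n_docs) / (1 + df[t])) for t in tf}
        # Highest score first; ties are broken alphabetically for determinism.
        top = sorted(scores, key=lambda t: (-scores[t], t))[:k]
        queries.append(" ".join(top))
    return queries


corpus = ["the cat sat on the mat", "the dog chased the cat", "the bird sang"]
queries = pseudo_queries(corpus, k=2)
```

Terms that occur in every document, such as "the" in this toy corpus, receive a weight of zero and therefore drop to the bottom of the ranking.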

Most of these techniques extract a plurality of keywords or keyphrases from a document. Recently, however, a supervised technique called Doc2Query, which uses BERT to generate queries in the form of natural language sentences, has also been released as open source.

2) Generating pseudo-labels

The pseudo-query extracted from each document is given as input to a search model based on an unsupervised learning methodology, such as BM25, to retrieve M documents from the document corpus. The document from which the pseudo-query was extracted is likely to rank near the top of the M documents, but it may not be included at all.

Among the M retrieved documents, the top m (< M) documents are labeled as positive training data, and at least some of the remaining documents are labeled as negative training data. Depending on the artificial intelligence-based search model, some of the other documents may additionally be labeled as neutral training data. In general, positive and negative data are required, whereas neutral data is optional.

Here, the number of retrieved documents M, the number of positive training examples m, the number of negative training examples, and the number of neutral training examples are integers. They act as hyperparameters and may be set differently according to the characteristics of the document corpus. A developer can raise the accuracy of the artificial intelligence-based search model by tuning these hyperparameters, experimentally or theoretically, to suit the corpus.
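The labeling rule above can be sketched as follows. The function name `pseudo_label` and the parameter `n_neg` are hypothetical, and taking the negatives from the bottom of the ranking is an assumption: the specification says only that at least some of the remaining documents are labeled negative.

```python
def pseudo_label(ranked_doc_ids: list[str], m: int, n_neg: int) -> dict[str, list[str]]:
    """Split the M documents retrieved for one pseudo-query into label groups.

    ranked_doc_ids: the M retrieved document ids, best match first.
    m: number of top-ranked documents labeled positive (0 < m < M).
    n_neg: number of documents labeled negative (assumed: the lowest ranked).
    """
    M = len(ranked_doc_ids)
    if not 0 < m < M:
        raise ValueError("m must satisfy 0 < m < M")
    positives = ranked_doc_ids[:m]
    remaining = ranked_doc_ids[m:]
    negatives = remaining[-n_neg:] if n_neg > 0 else []
    return {"positive": positives, "negative": negatives}


labels = pseudo_label(["d3", "d1", "d7", "d2", "d9"], m=2, n_neg=2)
```

In this example M = 5, the two top-ranked documents become positives, the two lowest-ranked become negatives, and `d7` is left unlabeled; it could be treated as a neutral example if the model uses one.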

3) Training the artificial intelligence-based search model

Training an artificial intelligence-based search model requires query-document relations, so the training data is provided as relations between the query used for retrieval and the documents retrieved by that query. For example, it may be given as a pair of two items, such as (pseudo-query, positive document), or as three associated items, such as (pseudo-query, positive document, negative document). Four items may also be associated by adding a neutral document.
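The pair and triple forms described above can be sketched as follows. The helper names `make_pairs` and `make_triples` are hypothetical, and combining every positive with every negative is one possible choice, not something the specification prescribes.

```python
from itertools import product


def make_pairs(query: str, positives: list[str]) -> list[tuple[str, str]]:
    """Build (pseudo-query, positive document) training pairs."""
    return [(query, pos) for pos in positives]


def make_triples(
    query: str, positives: list[str], negatives: list[str]
) -> list[tuple[str, str, str]]:
    """Build (pseudo-query, positive document, negative document) triples."""
    return [(query, pos, neg) for pos, neg in product(positives, negatives)]


pairs = make_pairs("mat on", ["d3", "d1"])
triples = make_triples("mat on", ["d3", "d1"], ["d9"])
```

Triples of this shape suit pairwise ranking losses, where the model is trained to score the positive document above the negative one for the same query.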

The amount of data needed for training is not precisely known and is established mainly by experiment. The amount does, however, grow with the number of parameters in the model. For BERT-family language model-based search models, models pre-trained on Wikipedia and similar sources are generally publicly available, and fine-tuning a pre-trained model for a specific task such as information retrieval requires far less data than pre-training.

In academic research the amount of training data is determined by the datasets available, whereas in practice it may be determined by the number of documents in the document corpus. If the corpus contains too few documents for training, documents from a similar field can be added to the corpus, or the amount of data can be increased with data augmentation techniques.

When training an artificial intelligence-based model, it is generally desirable to split the data into training data and validation data in order to verify performance. For example, 80% of the data can be used for training and the remaining 20% for validation.
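
The 80/20 split described above can be illustrated with a minimal Python sketch. The function name `split_train_validation`, the fixed random seed, and the toy (pseudo-query, passage) pairs are assumptions made for illustration only and do not come from the specification.

```python
import random


def split_train_validation(examples, train_fraction=0.8, seed=0):
    """Shuffle a copy of `examples` and split it into (train, validation)."""
    shuffled = list(examples)
    # A fixed seed keeps the illustrative split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical (pseudo-query, passage) pairs used only to exercise the split.
examples = [(f"query-{i}", f"passage-{i}") for i in range(10)]
train, validation = split_train_validation(examples)
```

Shuffling before splitting avoids a validation set that consists only of the most recently generated examples.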

[Inference stage]

In the present invention, the inference stage refers to the process of providing search results in response to a query entered by a user. The inference stage can be broadly divided into the following steps: a retrieving step, a re-ranking step, and an output step. Each step is described in detail below.

Retrieving step

In this step, N documents are extracted from the document corpus to be searched by a search model based on an unsupervised learning methodology. In general, an artificial intelligence-based search model takes considerably longer to perform inference than a search model based on an unsupervised learning methodology. Applying an artificial intelligence-based search model to every document in the document corpus would therefore take too long and degrade the user experience. For this reason, it is common to first retrieve a number of documents with the relatively fast unsupervised search model and then apply the artificial intelligence-based search model only to the retrieved documents. Research aimed at making artificial intelligence-based search models fast enough to omit the retrieving step is ongoing, but inference does not yet appear to be fast enough from the standpoint of user convenience.

In this step, accuracy matters, but recall matters more. In the second-stage re-ranking step, by contrast, accuracy matters more than recall. One way to increase recall is to retrieve several times more documents than are to be presented in the search results. Another is to increase the recall of the unsupervised search model itself. With a BERT-based technique called DeepCT Index, recall can be increased while still using the same search model based on an unsupervised learning methodology. The time required for retrieval increases in proportion to the number of documents retrieved. With the DeepCT Index technique, fewer documents need to be retrieved, which shortens the search time while achieving the same level of recall as retrieving several times more documents.

The search model based on an unsupervised learning methodology may be the same one used in the training stage or a different one. It is important to select a search model with high recall in the relevant field. In academic research, Anserini (BM25+RM3) is widely used as an open-source search model.

Re-ranking step

After the retrieving step, the N retrieved documents have been scored for relevance to the user's query by the search model based on an unsupervised learning methodology and are sorted in order of that relevance. In this step, the relevance of the N retrieved documents to the user's query is evaluated again by the artificial intelligence-based search model, and the documents are re-sorted, that is, re-ranked. This step is performed by the artificial intelligence-based search model trained in the training stage described above.

Search result output

The N documents re-ranked by the artificial intelligence-based search model are sorted in order of relevance and output as the search results.
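
The three steps of the inference stage (retrieving, re-ranking, output) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `term_overlap_score` is a hypothetical stand-in for a fast unsupervised scorer such as BM25, `query_coverage_score` is a hypothetical stand-in for the trained artificial intelligence-based model, and the toy corpus is invented for illustration.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]


def term_overlap_score(query: str, text: str) -> float:
    """Hypothetical fast first-stage scorer (stand-in for BM25)."""
    query_terms = set(query.lower().split())
    return float(sum(1 for term in text.lower().split() if term in query_terms))


def query_coverage_score(query: str, text: str) -> float:
    """Hypothetical second-stage scorer (stand-in for the trained model)."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & set(text.lower().split())) / len(query_terms)


def retrieve(query: str, corpus: List[str], scorer: Scorer, n: int) -> List[Tuple[int, float]]:
    """Score every document with the fast scorer and keep the top n."""
    scored = [(i, scorer(query, text)) for i, text in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def rerank(query: str, corpus: List[str], candidates: List[Tuple[int, float]],
           scorer: Scorer) -> List[Tuple[int, float]]:
    """Re-score only the retrieved candidates with the slower scorer."""
    rescored = [(i, scorer(query, corpus[i])) for i, _ in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored


def search(query: str, corpus: List[str], n: int) -> List[Tuple[int, float]]:
    """Retrieve n candidates, re-rank them, and return them as the output."""
    candidates = retrieve(query, corpus, term_overlap_score, n)
    return rerank(query, corpus, candidates, query_coverage_score)


corpus = [
    "wheelchair driving aid device",
    "wheelchair wheelchair wheelchair wheelchair cushion",
    "marine sonar monitoring",
]
results = search("wheelchair driving aid", corpus, n=2)
```

The slower scorer is applied to only n candidates rather than to the whole corpus, which is the reason given above for keeping the retrieving step.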

[Handling of special cases]

Matters related to document length

In general, a search model based on an unsupervised learning methodology runs without problems even on long documents, whereas an artificial intelligence-based search model often has a limit on the maximum document length it can process. In particular, the artificial intelligence-based search models introduced since BERT limit the maximum number of tokens they can process. BERT, for example, can process at most 512 tokens.

When the artificial intelligence-based search model has a limit on the maximum document length it can process, each document in the document corpus can be divided into passages no longer than that limit, and the passages can be used as the search targets. In this specification, a collection of such short passages is called a 'corpus' to distinguish it from a document corpus made up of long documents.

The training stage is the same as described above, except that the passages in the corpus take the place of documents. That is, pseudo-queries are extracted from the passages, pseudo-labels are generated using a search model based on an unsupervised learning methodology, and the artificial intelligence-based search model is then trained. The artificial intelligence-based search model thus learns from the relevance between a pseudo-query and a passage, not between a pseudo-query and an entire document.

The inference stage requires preprocessing in which each document in the document corpus is divided into passages to form the corpus, and the correspondence between passages and documents is stored in a form that can be looked up later. In the retrieving step, N passages are retrieved from the corpus based on the query entered by the user, and in the re-ranking step, the N retrieved passages are re-ranked.

The search results must present documents, not passages, sorted in order of relevance. To this end, the passage-document correspondence stored in the preprocessing step is consulted, and the documents are sorted to match the relevance order of the re-ranked passages and provided as the search results. Because one document is split into several passages, the re-ranked results may contain multiple passages from the same document. In that case, the position of the document can be made to correspond to the position of its most relevant passage.
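
The preprocessing and the passage-to-document mapping described above can be sketched in Python as follows. The function names, the blank-line paragraph delimiter, and the toy documents are assumptions made for illustration; the specification does not prescribe a storage format.

```python
def build_passage_corpus(documents):
    """Split each document into paragraph passages and record their origin.

    `documents` maps a document id to its text; paragraphs are assumed to be
    separated by blank lines (an illustrative convention only).
    """
    passages = []
    passage_to_document = {}
    for document_id, text in documents.items():
        for paragraph in text.split("\n\n"):
            paragraph = paragraph.strip()
            if paragraph:
                passage_to_document[len(passages)] = document_id
                passages.append(paragraph)
    return passages, passage_to_document


def documents_from_ranked_passages(ranked_passage_ids, passage_to_document):
    """Order documents by their best-ranked passage, ignoring later passages."""
    seen = set()
    ranked_documents = []
    for passage_id in ranked_passage_ids:
        document_id = passage_to_document[passage_id]
        if document_id not in seen:
            seen.add(document_id)
            ranked_documents.append(document_id)
    return ranked_documents


documents = {"A": "first paragraph of A\n\nsecond paragraph of A", "B": "only paragraph of B"}
passages, passage_to_document = build_passage_corpus(documents)
# Suppose the re-ranker ordered the passages as 2, 1, 0 (most relevant first).
ranked_documents = documents_from_ranked_passages([2, 1, 0], passage_to_document)
```

Because each document is emitted only at the position of its first (highest-ranked) passage, the number of documents returned never exceeds the number of re-ranked passages.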

Document expansion

A long document contains many paragraphs, and the paragraphs do not necessarily share the same topic. Consequently, when each paragraph is stored in the corpus as a separate passage, a paragraph may be rated as having low relevance even though it comes from a relevant document.

To prevent this problem, the title of the document, or an equivalent phrase, can be added to each passage taken from that document. That is, the same phrase is added to every passage taken from one document. In this specification, extending a passage by adding a relevant phrase in this way is called 'document expansion'. A phrase equivalent to the document title can be generated, for example, using a text summarization technique. Text summarization techniques are broadly divided into extractive summarization and abstractive summarization, and either can be applied. The technique for generating the added phrase is not limited to text summarization; any technique that can concisely express the topic of the document may be used.

For example, a query entered by a user may contain several keywords, some of which appear in the original passage while others appear in the phrase added by document expansion. Because the added phrase captures the topic of the whole document, such a passage may well contain what the user intended to find. A passage that would not have been retrieved without the added phrase can thus be retrieved thanks to document expansion. This can increase the recall of the retrieving step. As explained above, high recall is important in the retrieving step.

If the phrase added by document expansion is too long, the expanded passage may exceed the length that the artificial intelligence-based search model can process. To minimize the resulting side effects, the added phrase is preferably placed at the front of the passage. Even if the tail of the expanded passage is truncated, the passage will still be extracted in the retrieving step as long as the keywords in the user's query do not fall within the truncated tail.
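
Placing the added phrase at the front and truncating the tail can be sketched in Python as follows. Whitespace-separated words stand in for model tokens here, which is an assumption; an actual BERT tokenizer splits text into subword tokens, so real token counts would differ. The function name `expand_passage` is hypothetical.

```python
def expand_passage(title, passage, max_tokens=512):
    """Prepend the title (or equivalent phrase) and truncate the tail.

    Tokens are approximated by whitespace-separated words for illustration.
    """
    tokens = title.split() + passage.split()
    # Truncation removes tokens from the end, so the prepended title survives.
    return " ".join(tokens[:max_tokens])


expanded = expand_passage(
    "Wheelchair aid", "the motor drives the rear wheel", max_tokens=5
)
```

With the phrase in front, any truncation falls on the end of the original passage rather than on the document-level topic information.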

Domain adaptiveness

When an artificial intelligence-based search model is trained on documents from a specific domain, it can be used as a search model specialized for that domain. For example, when BERT pre-trained on a general document corpus such as Wikipedia is fine-tuned on documents from a specific domain, it learns general relationships between words during pre-training and learns domain-specific vocabulary during fine-tuning.

A search model specialized for a specific domain may perform relatively poorly on other domains, but its performance on that domain improves.

Ensemble search model

In the method described above, the results re-ranked by the artificial intelligence-based search model are used as the final search results, but search results can also be provided using an ensemble of the search model based on an unsupervised learning methodology and the artificial intelligence-based search model. In this case, the final evaluation is expressed as in Equation 1.

[Equation 1]

(Final evaluation) = a*(evaluation by the search model based on an unsupervised learning methodology) + (1-a)*(evaluation by the artificial intelligence-based search model)

In Equation 1, a is a value between 0 and 1 and is a hyperparameter that can be tuned to provide the best search results.
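
Equation 1 can be written as a short Python function. The function name `ensemble_score` and the example values are hypothetical. The sketch also assumes that the two evaluations are already on comparable scales; the specification does not state whether or how they are normalized.

```python
def ensemble_score(unsupervised_score, ai_score, a):
    """Weighted combination of the two evaluations as in Equation 1."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must be between 0 and 1")
    return a * unsupervised_score + (1.0 - a) * ai_score


# Illustrative values only: a = 0.25 weights the AI-based model more heavily.
combined = ensemble_score(0.2, 0.8, a=0.25)
```

Setting a to 0 reduces the ensemble to the artificial intelligence-based model alone, and setting a to 1 reduces it to the unsupervised model alone.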

An ensemble model is not guaranteed to outperform a single model, and whether to adopt one can be decided domain by domain.

FIG. 4 shows a computer device for performing the search method according to the present invention.

Because the search method and the training method according to the present invention have been described in detail with reference to FIGS. 1 to 3, the device 100 for performing the search method is described only briefly with reference to FIG. 4.

Referring to FIG. 4, the computer device 100 includes a processor 110, a non-volatile storage unit 120 that stores programs and data, a volatile memory 130 that stores running programs, an input/output unit 140 that exchanges information with a user, and a bus that serves as the internal communication path between these components. The running programs may include an operating system and various applications. Although not shown, the device also includes a power supply unit.

In the training stage, the artificial intelligence-based search model is trained in the memory 130 using training data stored in the storage unit 120. In the inference stage, the unsupervised-learning-based search model 210 and the artificial intelligence-based search model 220 stored in the storage unit 120 are executed in the memory 130. The corpus is stored in the storage unit 120, and the search method is performed based on a query entered through the input/output unit 140.

[Example]

The following describes an example in which the method for providing search results according to the present invention was applied to patent documents held by the Korea Institute of Industrial Technology.

Artificial intelligence-based search model

The open-source CEDR-KNRM model was used as the artificial intelligence-based search model. Briefly, the CEDR-KNRM model has a structure that processes BERT and KNRM in parallel; details can be found in the related paper and the published source code. According to Papers with Code, it was the best-performing model at the time the example was implemented, and this was confirmed to be unchanged as of the filing date.

Corpus

There were approximately 3,000 patent documents to be searched, that is, in the document corpus. Because every document exceeded the 512 tokens that BERT can process, the documents were divided by paragraph to form the corpus. The corpus contains approximately 120,000 passages.

The relationship between each passage and the patent document from which it was extracted is stored so that it can be referenced later in the inference stage.

Pseudo-query generation

A keyword extraction technique was applied to the passages formed from the paragraphs of the patent documents, and one or more keywords or phrases were extracted from each as pseudo-queries. The technique used was RAKE, for which the paper and open-source code are publicly available.

After duplicate pseudo-queries were removed, approximately 40,000 pseudo-queries remained.
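
Pseudo-query extraction followed by deduplication can be sketched in Python as follows. This is not the RAKE implementation used in the example: it only mimics the first RAKE step of splitting text into candidate phrases at stopwords and punctuation, and it omits RAKE's phrase scoring. The stopword list, the function names, and the English toy passages are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "is", "for", "by", "with", "on"}


def candidate_phrases(passage):
    """Split a passage into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()]", passage.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(" ".join(current))
    return phrases


def build_pseudo_queries(passages):
    """Collect candidate phrases from all passages, removing duplicates."""
    seen = set()
    queries = []
    for passage in passages:
        for phrase in candidate_phrases(passage):
            if phrase not in seen:
                seen.add(phrase)
                queries.append(phrase)
    return queries


pseudo_queries = build_pseudo_queries(
    ["The sonar scans the seabed.", "The sonar scans the harbor."]
)
```

Deduplicating across passages is what reduces the raw set of extracted phrases to the smaller set of distinct pseudo-queries described above.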

Query-document relationship formation

With each pseudo-query as input, 300 passages were extracted from the corpus using BM25, a well-known search model based on an unsupervised learning methodology. Since there were approximately 40,000 pseudo-queries, a total of about 12,000,000 passages were extracted. This is a sufficient amount of data to train the CEDR-KNRM model.

Pseudo-labeling

Of the 300 passages extracted for each pseudo-query, the top m passages were classified as positive training data and the bottom p passages as negative training data. Because the CEDR-KNRM model also uses neutral training data, the passages remaining after excluding the top m and bottom p were classified as neutral training data.
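
The pseudo-labeling rule above can be sketched in Python as follows. The function name `pseudo_label` and the values m = 10 and p = 50 are hypothetical; the specification does not disclose the values of m and p that were used.

```python
def pseudo_label(ranked_passage_ids, m, p):
    """Label the top m passages positive, the bottom p negative, the rest neutral.

    `ranked_passage_ids` is assumed to be ordered from most to least relevant,
    as returned by the unsupervised search model for one pseudo-query.
    """
    total = len(ranked_passage_ids)
    if m < 0 or p < 0 or m + p > total:
        raise ValueError("m and p must be non-negative and m + p must not exceed the list length")
    labels = {}
    for rank, passage_id in enumerate(ranked_passage_ids):
        if rank < m:
            labels[passage_id] = "positive"
        elif rank >= total - p:
            labels[passage_id] = "negative"
        else:
            labels[passage_id] = "neutral"
    return labels


# 300 retrieved passages per pseudo-query, with illustrative m and p.
labels = pseudo_label(list(range(300)), m=10, p=50)
counts = {name: list(labels.values()).count(name) for name in ("positive", "neutral", "negative")}
```

With roughly 40,000 pseudo-queries and 300 passages each, this labeling yields on the order of 40,000 × 300 = 12,000,000 labeled query-passage relationships.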

Hyperparameters such as m and p were tuned while the inventors verified the performance of the trained search model by reviewing its search results. Because the dataset has no ground truth, part of the prepared training data could have been used as validation data, but since the inventors were able to verify results by hand in this field, performance was verified manually.

Inference stage

When a user enters a query consisting of several keywords, 300 passages are first retrieved from the corpus using BM25. The 300 retrieved passages are then passed, together with the user's query, to the trained CEDR-KNRM model, which re-ranks them. The patent documents corresponding to the 300 re-ranked passages are looked up, and the patent documents are ordered to match the order of the passages. When several retrieved passages belong to the same patent document, that document is placed according to its highest-ranked passage, and its lower-ranked passages are ignored when ordering the patent documents. The search results therefore contain at most 300 documents.

Comparison of search results

FIGS. 5 and 6 compare results obtained with the search method developed in this example against results obtained with other search methods.

FIG. 5 shows the results of searching the patents held by the Korea Institute of Industrial Technology with a query containing the two keywords "wheelchair" (휠체어) and "driving assistance" (주행보조); the left side shows the results of the search method developed in this example, and the right side shows the results from KIPRIS. As shown, the search method developed in this example finds not only documents containing words that exactly match the keywords "wheelchair" and "driving assistance" but also patents on other devices usable by people with limited mobility, whereas KIPRIS finds only documents containing words that exactly match the keywords.

FIG. 6 shows the results of searching the patents held by the Korea Institute of Industrial Technology with a query containing six keywords, including "scanning sonar" (스캐닝소나), "safety monitoring" (안전감시), "marine operation equipment" (해양운용장비), "maritime surveillance" (해상감시), and "fish farm monitoring" (양식장감시); the upper part shows the results of the search method developed in this example, and the lower part shows the results from Google Patents. Because Google Patents did not perform a search when the query contained more than five keywords, the keyword "fish farm monitoring" was excluded for that search. As shown, the search method developed in this example retrieves documents containing semantically similar words even when they do not exactly match the query keywords, whereas Google Patents returns no results. KIPRIS gives the same outcome (not shown).

As FIGS. 5 and 6 show, the search method according to the present invention, which uses an artificial intelligence-based search model, can additionally retrieve documents that do not contain the keywords in the input query but are semantically relevant.

[Application to video segment search]

FIG. 7 shows how the search method according to the present invention is applied to video segment search.

Text information related to a video includes, for example, the names of objects recognized by object recognition, subtitles, and a video overview. Of these, the video overview applies to the video as a whole, while recognized object names and subtitles apply only to specific segments.

In general, subtitle information is kept in a separate text file that records the subtitle corresponding to each timestamp (or frame number) interval. A video player reads the subtitle information along with the video and, using the timestamps, displays the corresponding subtitle over the video during each interval.

Various techniques for object recognition in video are known, including YOLO, and they recognize and label objects in near real time. The labeled object names can be stored, in a format similar to a subtitle file, for the intervals in which each object appears. If the object recognition technique supports named entity recognition, more specific information can be obtained.

As shown in FIG. 7, the subtitle file and the object information file are merged, and where their intervals overlap, the intervals are split so that they remain distinct. The objects and subtitles appearing in each interval can then be identified. Using the document expansion technique described above, the video overview is added to the subtitle information and object information of each interval. The information for each interval then becomes a passage of text.
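
Merging the subtitle intervals with the object intervals and turning each resulting interval into a passage can be sketched in Python as follows. The tuple layout (start, end, text), the function name `build_segment_passages`, and the toy data are assumptions made for illustration. Placing the overview at the front of each passage follows the recommendation given for document expansion and is likewise an assumption here.

```python
def build_segment_passages(subtitles, objects, overview):
    """Split the timeline at every interval boundary and build one passage per piece.

    `subtitles` and `objects` are lists of (start, end, text) tuples in seconds.
    Returns a list of (start, end, passage) tuples for non-empty pieces.
    """
    # Every start or end time is a boundary at which the active set may change.
    boundaries = sorted({t for start, end, _ in subtitles + objects for t in (start, end)})
    segments = []
    for start, end in zip(boundaries, boundaries[1:]):
        active_subtitles = [text for s, e, text in subtitles if s <= start and end <= e]
        active_objects = [name for s, e, name in objects if s <= start and end <= e]
        if active_subtitles or active_objects:
            passage = " ".join([overview] + active_subtitles + active_objects)
            segments.append((start, end, passage))
    return segments


subtitles = [(0.0, 4.0, "Where is the dog?")]
objects = [(2.0, 6.0, "dog"), (5.0, 6.0, "ball")]
segments = build_segment_passages(subtitles, objects, "A family picnic.")
```

Each returned tuple keeps its start and end time, so a retrieved passage can be mapped back to the video segment it describes.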

These passages form a corpus and can be searched with an input query. For example, the search method of this example can be used to find scenes in which specific objects appear, or to find scenes in which a specific line of dialogue occurs in films with specific content.

[Application to image search]

As with video, object recognition is also possible for images. In addition, research on artificial intelligence that generates captions, that is, descriptions corresponding to images, is steadily progressing. A caption is usually a single sentence but may consist of several sentences.

A corpus passage can be formed by combining the generated caption with the recognized object names. However, because recognized object names are used in generating the caption, adding them to the caption may be redundant. In addition, if overview information exists for a group of images, it can be included in common in the passages for all images in that group.
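
Forming an image passage from a caption, recognized object names, and an optional group overview can be sketched in Python as follows. The function name `build_image_passage` and the rule of skipping object names that already occur as words in the caption are assumptions made for illustration; the specification only notes that such names may be redundant.

```python
def build_image_passage(caption, object_names, group_overview=""):
    """Combine overview, caption, and object names not already in the caption."""
    caption_words = set(caption.lower().split())
    extra_names = [name for name in object_names if name.lower() not in caption_words]
    parts = [group_overview, caption] + extra_names
    # Empty parts (for example, a missing overview) are dropped.
    return " ".join(part for part in parts if part)


passage = build_image_passage("a dog chases a ball", ["dog", "tree"], "Park photos")
```

The word-level comparison is intentionally simple and would not detect a name that appears in the caption with attached punctuation.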

A passage formed in this way can be searched with the search method according to the present invention.

The foregoing detailed description should not be construed as limiting in any respect and should be considered illustrative. The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention fall within its scope.

Claims

A computer-implemented method for training an artificial intelligence-based search model for searching video segments using a weak-supervision methodology, the method comprising:
(a) generating, for each video segment, a passage corresponding to the video segment and including at least one of object information and subtitle information extracted from the video segment;
(b) forming a corpus from the plurality of passages generated in step (a) for a plurality of video segments;
(c) generating a pseudo-query from each passage of the corpus formed in step (b) using one or more keywords extracted by a technique based on unsupervised learning or a technique based on supervised learning;
(d) generating a pseudo-label using the corpus formed in step (b), the pseudo-query generated in step (c), and a search model based on an unsupervised methodology; and
(e) training the artificial intelligence-based search model using the pseudo-label generated in step (d).

The method according to claim 1,
wherein each passage of the corpus further includes an overview of the entire video to which the corresponding video segment belongs.

A device for training an artificial intelligence-based search model for searching video segments using a weak-supervision methodology, the device comprising:
at least one processor; and
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions stored in the at least one memory, when executed by the at least one processor, cause the device to perform:
(a) generating, for each video segment, a passage corresponding to the video segment and including at least one of object information and subtitle information extracted from the video segment;
(b) forming a corpus from the plurality of passages generated in step (a) for a plurality of video segments;
(c) generating a pseudo-query from each passage of the corpus formed in step (b) using one or more keywords extracted by a technique based on unsupervised learning or a technique based on supervised learning;
(d) generating a pseudo-label using the corpus formed in step (b), the pseudo-query generated in step (c), and a search model based on an unsupervised methodology; and
(e) training the artificial intelligence-based search model using the pseudo-label generated in step (d).

The device of claim 3,
wherein each passage of the corpus further includes an overview of the entire video to which the corresponding video segment belongs.

A computer program stored in a non-transitory storage medium, the computer program providing a method of training, by a weak-supervision methodology, an artificial intelligence-based search model for searching video segments, and causing a processor to execute:
(a) generating, for each video segment, a passage corresponding to the video segment, the passage including at least one of object information and subtitle information extracted from the video segment;
(b) forming a corpus from the plurality of passages generated in step (a) for a plurality of video segments;
(c) generating a pseudo-query from each passage of the corpus formed in step (b) using one or more keywords extracted by a technique based on unsupervised learning or a technique based on supervised learning;
(d) generating a pseudo-label using the corpus formed in step (b), the pseudo-query generated in step (c), and a search model based on an unsupervised methodology; and
(e) training the artificial intelligence-based search model using the pseudo-label generated in step (d).

A computer-implemented method for searching for video segments using an artificial intelligence-based search model trained by the method according to claim 1, the method comprising:
(a) generating, for each video segment, a passage corresponding to the video segment, the passage including object information and subtitle information extracted from the video segment using an object recognition artificial intelligence model;
(b) forming a corpus from the plurality of passages generated in step (a) for a plurality of video segments;
(c) retrieving, from the corpus formed in step (b), N passages corresponding to an input query using a search model based on an unsupervised methodology;
(d) re-ranking the N passages retrieved in step (c) with respect to the input query using the artificial intelligence-based search model; and
(e) outputting, as search results, the video segments corresponding to the N passages re-ranked in step (d).
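
The retrieve-then-re-rank flow of steps (c) through (e) can be sketched as follows. Both scoring functions are deliberately simple stand-ins: `first_stage_score` is a term-overlap count in place of an unsupervised model such as BM25, and `toy_rerank_score` is a length-normalized overlap in place of the trained artificial intelligence-based search model. All names, the segment identifiers, and the toy corpus are hypothetical.

```python
def first_stage_score(query, passage):
    """Stand-in for the unsupervised search model: count passage tokens found in the query."""
    terms = set(query.lower().split())
    return sum(1 for t in passage.lower().split() if t in terms)


def toy_rerank_score(query, passage):
    """Stand-in for the trained model: fraction of passage tokens found in the query."""
    terms = set(query.lower().split())
    tokens = passage.lower().split()
    return sum(1 for t in tokens if t in terms) / max(len(tokens), 1)


def search(query, corpus, rerank_score, n=2):
    """corpus maps a video segment id to its passage; returns re-ranked segment ids."""
    # Step (c): retrieve the N best candidates with the first-stage model.
    candidates = sorted(
        corpus, key=lambda sid: first_stage_score(query, corpus[sid]), reverse=True
    )[:n]
    # Step (d): re-rank only those N candidates with the second-stage model.
    # Step (e): the returned ids identify the video segments to output.
    return sorted(
        candidates, key=lambda sid: rerank_score(query, corpus[sid]), reverse=True
    )


corpus = {
    "seg1": "dog dog dog park bench tree sky cloud",
    "seg2": "dog beach",
    "seg3": "cat sofa",
}
results = search("dog beach", corpus, toy_rerank_score, n=2)
```

Here the first stage ranks seg1 ahead of seg2 on raw overlap, and the second stage reverses that order. Restricting the second stage to N candidates keeps the more expensive model from scoring the whole corpus.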

The method of claim 6,
wherein each passage of the corpus further includes an overview of the entire video to which the corresponding video segment belongs.

A device for searching for video segments using an artificial intelligence-based search model trained by the method according to claim 1, the device comprising:
at least one processor; and
at least one memory storing computer-executable instructions,
wherein the computer-executable instructions stored in the at least one memory, when executed by the at least one processor, cause the device to perform:
(a) generating, for each video segment, a passage corresponding to the video segment, the passage including object information and subtitle information extracted from the video segment using an object recognition artificial intelligence model;
(b) forming a corpus from the plurality of passages generated in step (a) for a plurality of video segments;
(c) retrieving, from the corpus formed in step (b), N passages corresponding to an input query using a search model based on an unsupervised methodology;
(d) re-ranking the N passages retrieved in step (c) with respect to the input query using the artificial intelligence-based search model; and
(e) outputting, as search results, the video segments corresponding to the N passages re-ranked in step (d).

The device of claim 8,
wherein each passage of the corpus further includes an overview of the entire video to which the corresponding video segment belongs.

A computer program stored in a non-transitory storage medium, the computer program providing a method of searching for video segments using an artificial intelligence-based search model trained by the method according to claim 1, and causing a processor to execute:
(a) generating, for each video segment, a passage corresponding to the video segment, the passage including object information and subtitle information extracted from the video segment using an object recognition artificial intelligence model;
(b) forming a corpus from the plurality of passages generated in step (a) for a plurality of video segments;
(c) retrieving, from the corpus formed in step (b), N passages corresponding to an input query using a search model based on an unsupervised methodology;
(d) re-ranking the N passages retrieved in step (c) with respect to the input query using the artificial intelligence-based search model; and
(e) outputting, as search results, the video segments corresponding to the N passages re-ranked in step (d).