KR102620260B1

KR102620260B1 - Method and device for recogniting object based on graph

Info

Publication number: KR102620260B1
Application number: KR1020230069422A
Authority: KR
Inventors: 박규동; 전호철; 손미애; 김종모; 이정빈
Original assignee: 국방과학연구소
Priority date: 2023-05-30
Filing date: 2023-05-30
Publication date: 2023-12-29

Abstract

A graph-based object recognition method and device are provided that can recognize an object using different types of data about the object, namely text data and image data. The object recognition method receives a plurality of text data and a plurality of image data about an object from the outside and groups them, converts the text data into a text graph and the image data into an image graph through neural network models, and recognizes the object by connecting the text graph and the image graph.

Description

Graph-based object recognition method and device {METHOD AND DEVICE FOR RECOGNITING OBJECT BASED ON GRAPH}

The present invention relates to a graph-based object recognition method and device.

Object recognition technology identifies objects such as people or things from images or video collected by sensors and the like.

Object recognition technology is applied in various fields. For example, it can be used to assess a battlefield situation or to detect objects such as targets from images provided by synthetic aperture radar (SAR).

With recent advances in sensor technology, multiple images of various types are being collected from sensors. Accordingly, methods for accurately recognizing objects from multiple images are being studied.

However, conventional object recognition technology performs recognition using only the collected images. As a result, recognition performance degrades for images collected in low-light environments or in environments where visibility is limited by fog, clouds, or the like.

Korean Patent Laid-Open Publication No. 10-2022-0148053 (2022.11.04.)

An object of the present invention is to provide a graph-based object recognition method and device capable of recognizing an object using different types of data about the object, namely text data and image data.

An object recognition method according to an embodiment of the present invention includes: receiving a plurality of text data and a plurality of image data about an object from the outside and grouping them; converting the text data of each group into a text graph using a pre-trained text graph conversion unit; converting the image data of each group into an image graph using a pre-trained image graph conversion unit; and recognizing the object by connecting the text graph and the image graph using a pre-trained object recognition unit.

The grouping includes: determining spatiotemporal identity between the plurality of text data and the plurality of image data based on one or more pieces of spatiotemporal context information; and, based on the determination result, grouping one or more text data and image data that share the same time and place, from among the plurality of text data and the plurality of image data, into one group.

Here, the one or more pieces of spatiotemporal context information include the location coordinates and the timestamp at which each of the plurality of text data and the plurality of image data was generated.

The converting into the text graph includes: extracting a plurality of keywords from the text data of each group; extracting one or more core keywords by determining the importance of each of the plurality of keywords; generating an initial text graph having the one or more core keywords as nodes; determining the common appearance (co-occurrence) frequency of the one or more core keywords in the text data; and generating the text graph by applying edge weights to the initial text graph based on the determination result.

The text graph conversion unit includes a keyword extraction unit, and the keyword extraction unit is trained to extract and output the one or more core keywords from the text data when it receives, together with the text data, a correct extraction answer as label data.

The converting into the image graph includes: extracting, from the image data of each group, a plurality of objects and a plurality of tags corresponding to the respective objects; generating an initial image graph having each of the plurality of tags as a node; determining the importance of each node of the initial image graph; and generating the image graph by pruning nodes of the initial image graph based on the determination result.

The image graph conversion unit includes an object extraction unit, and the object extraction unit is trained to extract and output the plurality of objects and the plurality of tags from the image data when it receives, together with the image data, a correct extraction answer as label data.

The recognizing of the object includes: determining whether entities of the text graph and entities of the image graph are identical; connecting the text graph and the image graph based on the determination result; and outputting a recognition result for the object based on the connected graph.

The object recognition unit includes an identity determination unit, and the identity determination unit is trained to determine and output whether the text graph and the image graph are identical when it receives, together with the text graph and the image graph, a correct determination answer as label data.

An object recognition device according to an embodiment of the present invention includes: a memory storing an object recognition program; and a processor that executes the object recognition program, receives a plurality of text data and a plurality of image data about an object from the outside and groups them, converts the text data of each group into a text graph using a pre-trained text graph conversion unit, converts the image data of each group into an image graph using a pre-trained image graph conversion unit, and recognizes the object by connecting the text graph and the image graph using a pre-trained object recognition unit, wherein the object recognition unit includes an identity determination unit.

The present invention can convert different types of data about an object, namely text data and image data, into respective graphs, determine whether the converted graphs are identical, and recognize and output the object.

Accordingly, even when image data is collected in a low-light or limited-visibility environment, the present invention recognizes the object by also using the corresponding text data, and can thereby increase the accuracy of object recognition at a specific place or a specific point in time.

In addition, the present invention recognizes the object using a graph formed by determining whether the graphs converted from the image data and the text data are identical and connecting them. Compared with the prior art, this can reduce the complexity of the neural network model used for object recognition and thereby improve the speed of object recognition.

FIG. 1 is a diagram showing a graph-based object recognition device according to an embodiment of the present invention.
FIG. 2 is a diagram conceptually showing the functions of the object recognition program of FIG. 1.
FIG. 3 is a diagram showing the text graph conversion unit of FIG. 2.
FIG. 4 is a diagram showing the image graph conversion unit of FIG. 2.
FIG. 5 is a diagram showing the object recognition unit of FIG. 2.
FIG. 6 is a diagram showing a graph-based object recognition method according to an embodiment of the present invention.
FIGS. 7 to 9 are diagrams showing the graph-based object recognition method of the present invention in detail.

The advantages and features of the present invention, and the methods for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only so that the disclosure of the present invention is complete and so that the scope of the invention is fully conveyed to those of ordinary skill in the art to which the present invention pertains. The present invention is defined only by the scope of the claims.

In describing the embodiments of the present invention, a detailed description of a known function or configuration will be omitted where it is judged that it could unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary depending on the intention or custom of users or operators. Their definitions should therefore be based on the content of this specification as a whole.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram showing a graph-based object recognition device according to an embodiment of the present invention.

Referring to FIG. 1, the object recognition device 100 of this embodiment may include an input/output unit 110, a processor 120, and a memory 130.

The input/output unit 110 may receive various types of data about objects detected by external collection devices, for example a plurality of sensors. The input/output unit 110 may also output the object recognition result that the processor 120, described later, generates for the object data to an external device such as a server.

Here, the object data may include a plurality of text data and a plurality of image data collected for each of one or more objects at a specific place or a specific point in time. The object data may include information about the object such as its type, name, shape, quantity, and location; information about the place where the object is located such as its position, terrain, and environment; and information on the time at which the object data was generated.

The processor 120 may receive the object data, that is, the plurality of text data and the plurality of image data, through the input/output unit 110, and may use the object recognition program 140 stored in the memory 130 to generate and output a recognition result for one or more objects at a specific place or a specific point in time.

The memory 130 may store the object recognition program 140 and the information needed to execute it. The object recognition program 140 may be software containing instructions that recognize one or more objects from the plurality of text data and the plurality of image data provided through the input/output unit 110 and generate the result.

Accordingly, when a plurality of text data and image data about an object are received through the input/output unit 110, the processor 120 may execute the object recognition program 140 in the memory 130 to generate and output an object recognition result for one or more objects from the received text data and image data.

FIG. 2 is a diagram conceptually showing the functions of the object recognition program of FIG. 1.

Referring to FIG. 2, the object recognition program 140 of this embodiment may include a data grouping unit 141, a text graph conversion unit 143, an image graph conversion unit 145, and an object recognition unit 147.

The data grouping unit 141, the text graph conversion unit 143, the image graph conversion unit 145, and the object recognition unit 147 shown in FIG. 2 are a conceptual division made to explain the functions of the object recognition program 140 of this embodiment simply, and the present invention is not limited thereto.

For example, depending on the embodiment of the present invention, the functions of the data grouping unit 141, the text graph conversion unit 143, the image graph conversion unit 145, and the object recognition unit 147 of the object recognition program 140 may be merged or separated, and they may be implemented as a series of instructions included in a single program.
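
As a non-limiting illustration of how the four conceptual units might be composed, the following Python sketch wires them together as plain functions. The function name `recognize_objects`, the `Group` type, and the four callables passed in are hypothetical placeholders; the disclosure does not specify any programming interface.

```python
from typing import Callable, Iterable, List, Tuple

# A group holds the text items and image items that share one place and time.
Group = Tuple[List[str], List[str]]


def recognize_objects(
    records: Iterable,
    group_fn: Callable[[Iterable], List[Group]],              # stands in for unit 141
    text_to_graph: Callable[[List[str]], object],             # stands in for unit 143
    image_to_graph: Callable[[List[str]], object],            # stands in for unit 145
    link_and_recognize: Callable[[object, object], object],   # stands in for unit 147
) -> List[object]:
    """Run grouping, graph conversion, and recognition once per group."""
    results = []
    for texts, images in group_fn(records):
        text_graph = text_to_graph(texts)
        image_graph = image_to_graph(images)
        results.append(link_and_recognize(text_graph, image_graph))
    return results
```

Passing the units in as callables mirrors the statement that their functions may be merged or separated without changing the overall flow.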

The data grouping unit 141 may group the plurality of text data and the plurality of image data provided through the input/output unit 110 into a plurality of groups.

For example, the data grouping unit 141 may determine the spatiotemporal identity between the plurality of text data and the plurality of image data based on one or more pieces of spatiotemporal context information extracted from each of them.

In addition, according to the identity determination result, the data grouping unit 141 may group one or more text data and image data that share the same time and place, from among the plurality of text data and the plurality of image data, into one group.

Each resulting data group may therefore include one or more text data and image data collected for one or more objects at a specific place and point in time.

Here, the spatiotemporal context information may include the coordinates of the location where each item of object data, that is, each text data and each image data, was generated (collected), and a timestamp for the time of collection.

Meanwhile, the data grouping unit 141 of this embodiment may also improve the grouping performance for the plurality of text data and the plurality of image data by further expanding, temporally or spatially, the spatiotemporal context information collected from each of them.
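
The grouping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Record` type, the greedy comparison against the first record of each group, and the tolerance parameters `max_deg` and `max_seconds` are all hypothetical. Widening the tolerances is one simple way to model the temporal or spatial expansion of the context information.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    kind: str         # "text" or "image"
    payload: str      # the text itself, or an image identifier
    lat: float        # location coordinate (assumed to be in degrees)
    lon: float
    timestamp: float  # collection time (assumed to be in seconds)


def same_spacetime(a: Record, b: Record, max_deg: float, max_seconds: float) -> bool:
    """Treat two records as co-located and simultaneous within the given tolerances."""
    return (
        abs(a.lat - b.lat) <= max_deg
        and abs(a.lon - b.lon) <= max_deg
        and abs(a.timestamp - b.timestamp) <= max_seconds
    )


def group_records(
    records: List[Record], max_deg: float = 0.01, max_seconds: float = 600.0
) -> List[List[Record]]:
    """Greedily assign each record to the first group whose anchor record matches it."""
    groups: List[List[Record]] = []
    for rec in records:
        for group in groups:
            if same_spacetime(group[0], rec, max_deg, max_seconds):
                group.append(rec)
                break
        else:
            groups.append([rec])  # no existing group matched: start a new one
    return groups
```

Each returned group then contains the text and image records that the sketch regards as sharing one place and time.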

The text graph conversion unit 143 may generate a text graph by converting the one or more text data in each of the groups formed by the data grouping unit 141 into graph form.

The text graph conversion unit 143 may include a pre-trained neural network model for generating the text graph.

FIG. 3 is a diagram showing the text graph conversion unit of FIG. 2.

Referring to FIG. 3, the text graph conversion unit 143 of this embodiment may include a keyword extraction unit 151 and a text graph generation unit 153.

The keyword extraction unit 151 may be trained so that, upon receiving text data, it extracts a plurality of keywords from the text data, for example nouns or verbs, determines the importance of each of the plurality of keywords, and extracts and outputs one or more core keywords.

In addition, the keyword extraction unit 151 may generate a loss value using the core keywords it outputs, and may repeat the training for extracting one or more core keywords from the text data so that the loss value is minimized.

To this end, the keyword extraction unit 151 may receive, together with the text data, a correct extraction answer as label data, for example the actual core keywords of the text data. The keyword extraction unit 151 may then compare the correct extraction answer with the extraction result it actually output, that is, the one or more core keywords, and generate the loss value according to the comparison result.
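
The disclosure does not state how the loss value is computed. Purely as an illustration of comparing output keywords with label keywords, the sketch below uses a set-overlap loss equal to one minus the F1 score; the function name `keyword_f1_loss` and the choice of F1 are assumptions. This quantity is not differentiable, so a neural model would typically be trained on a differentiable surrogate and use a measure like this only for monitoring.

```python
from typing import Iterable


def keyword_f1_loss(predicted: Iterable[str], label: Iterable[str]) -> float:
    """Return 1 - F1 between the predicted and the labeled core-keyword sets."""
    pred, gold = set(predicted), set(label)
    if not pred and not gold:
        return 0.0  # nothing to extract and nothing extracted
    overlap = len(pred & gold)
    if overlap == 0:
        return 1.0  # no labeled keyword was recovered
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return 1.0 - f1
```

A loss of 0 means the extracted core keywords match the label exactly, and a loss of 1 means they share no keyword.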

The text graph generation unit 153 may generate a text graph corresponding to the text data based on the one or more core keywords output by the keyword extraction unit 151.

For example, the text graph generation unit 153 may generate an initial text graph having the one or more core keywords as nodes. Here, the nodes of the initial text graph may be connected to one another by edges.

The text graph generation unit 153 may determine the appearance frequency of the one or more core keywords in the text data, for example the common appearance (co-occurrence) frequency of the core keywords. Based on the determination result, the text graph generation unit 153 may assign weights to the edges between the nodes of the initial text graph, and may generate and output a text graph that includes the weighted edges.
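
One way to picture the weighted text graph is the sketch below, which counts how often two core keywords appear in the same sentence and uses that count as the edge weight. The function name `build_text_graph`, whitespace tokenization, and the sentence as the co-occurrence window are assumptions made for illustration; the disclosure does not fix any of them.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def build_text_graph(
    sentences: Iterable[str], core_keywords: Iterable[str]
) -> Tuple[List[str], Dict[Tuple[str, str], int]]:
    """Return (nodes, edge_weights), where a weight counts same-sentence co-occurrences."""
    nodes = sorted({k.lower() for k in core_keywords})
    weights: Counter = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        present = [k for k in nodes if k in tokens]
        # nodes is sorted, so each pair is emitted in a consistent (a, b) order
        for a, b in combinations(present, 2):
            weights[(a, b)] += 1
    return nodes, dict(weights)
```

Keyword pairs that never co-occur simply receive no edge in this sketch, which is one possible reading of applying edge weights to the initial graph.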

Referring again to FIG. 2, the image graph conversion unit 145 may generate an image graph by converting the one or more image data in each of the groups formed by the data grouping unit 141 into graph form.

The image graph conversion unit 145 may include a pre-trained neural network model for generating the image graph.

FIG. 4 is a diagram showing the image graph conversion unit of FIG. 2.

Referring to FIG. 4, the image graph conversion unit 145 of this embodiment may include an object extraction unit 161 and an image graph generation unit 163.

The object extraction unit 161 may be trained so that, upon receiving image data, it extracts and outputs one or more objects included in the image data and the tags corresponding to them.

Here, each object may be extracted in the form of an image, and each tag may be extracted in the form of text giving the name of the object.

In addition, the object extraction unit 161 may generate a loss value using the objects and tags it outputs, and may repeat the training for extracting objects and tags from the image data so that the loss value is minimized.

To this end, the object extraction unit 161 may receive, together with the image data, a correct extraction answer as label data, for example the actual objects and tags of the image data. The object extraction unit 161 may then compare the correct extraction answer with the result it actually output, that is, the one or more objects and tags, and generate the loss value according to the comparison result.

The image graph generation unit 163 may generate an image graph corresponding to the image data based on the objects and tags extracted by the object extraction unit 161.

For example, the image graph generation unit 163 may generate an initial image graph having the one or more tags as nodes. Here, the nodes of the initial image graph may be connected to one another by edges.

The image graph generation unit 163 may calculate the information entropy of each node of the initial image graph and determine the importance of each node based on the calculated information entropy.

The image graph generation unit 163 may prune the nodes of the initial image graph based on the determined importance, and may generate and output the image graph accordingly.

Here, the image graph generation unit 163 may perform pruning that deletes at least one node of the initial image graph based on the amount of entropy change calculated along the path between a lowest-level node of the initial image graph and the upper-level nodes connected to it through edges.
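
The disclosure does not define how the entropy of a node is obtained or how the path-wise entropy change is evaluated. The following is therefore only a simplified sketch of entropy-guided pruning under loud assumptions: each tag node is assumed to carry a probability distribution (for example, hypothetical class scores from a detector), a node with high Shannon entropy is assumed to be less important, and pruning uses a fixed threshold `max_entropy` instead of the path-based entropy change described above.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def prune_nodes(
    node_probs: Dict[str, Sequence[float]],
    edges: List[Tuple[str, str]],
    max_entropy: float = 1.5,
) -> Tuple[Set[str], List[Tuple[str, str]]]:
    """Keep nodes whose entropy is at most max_entropy, plus edges between kept nodes."""
    keep = {n for n, probs in node_probs.items() if shannon_entropy(probs) <= max_entropy}
    kept_edges = [(a, b) for a, b in edges if a in keep and b in keep]
    return keep, kept_edges
```

In this sketch a tag whose distribution is close to uniform is dropped together with its edges, while confidently tagged nodes remain in the image graph.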

Referring again to FIG. 2, the object recognition unit 147 may connect the text graph and the image graph to generate a connected graph, recognize one or more objects in each of the groups based on the connected graph, and output the result.

The object recognition unit 147 may include a pre-trained neural network model for generating the object recognition result.

FIG. 5 is a diagram showing the object recognition unit of FIG. 2.

Referring to FIG. 5, the object recognition unit 147 may include an identity determination unit 171 and a graph connection unit 173.

The identity determination unit 171 may be trained so that, upon receiving a text graph and an image graph, it determines and outputs whether an entity of the text graph and an entity of the image graph are identical.

In addition, the identity determination unit 171 may generate a loss value based on the identity determination result it outputs, and may repeat the training of the identity determination so that the loss value is minimized.

To this end, the identity determination unit 171 may receive, together with the text graph and the image graph, a correct determination answer as label data, for example whether the entities of the two graphs are actually identical. The identity determination unit 171 may then compare the correct determination answer with the result it actually output, that is, the identity determination result, and generate the loss value according to the comparison result.
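
Because the determination is a yes/no judgment compared against a yes/no label, binary cross-entropy is one common choice of loss for this kind of training. The disclosure does not name a loss function, so the sketch below is an assumption for illustration only; the function name `identity_loss` and the clipping constant `eps` are hypothetical.

```python
import math


def identity_loss(predicted_prob: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy between a predicted 'same entity' probability and a 0/1 label."""
    # Clip the probability away from 0 and 1 so that log() stays finite.
    p = min(max(predicted_prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
```

The loss shrinks as the predicted probability moves toward the labeled answer, which matches the described goal of repeating training until the loss value is minimized.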

The graph connection unit 173 may generate a connected graph by connecting the text graph and the image graph based on the determination result of the identity determination unit 171.

Accordingly, the object recognition unit 147 may recognize the object from the connected graph and output the object recognition result.
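
The connection step can be pictured with the sketch below, which adds a cross-modal edge wherever a text-graph entity and an image-graph entity are judged identical. The function name `link_graphs` is hypothetical, and the default `same_entity` rule (case-insensitive label equality) is only a placeholder for the trained identity determination unit 171. Edges inside each graph are omitted for brevity.

```python
from typing import Callable, List, Optional, Set, Tuple

Node = Tuple[str, str]   # (modality, label), e.g. ("text", "tank")
Edge = Tuple[Node, Node]


def link_graphs(
    text_nodes: List[str],
    image_nodes: List[str],
    same_entity: Optional[Callable[[str, str], bool]] = None,
) -> Tuple[Set[Node], List[Edge]]:
    """Return all nodes of both graphs plus the cross-modal links between identical entities."""
    if same_entity is None:
        # Placeholder rule: case-insensitive label equality (an assumption).
        same_entity = lambda a, b: a.lower() == b.lower()
    nodes = {("text", t) for t in text_nodes} | {("image", i) for i in image_nodes}
    links = [
        (("text", t), ("image", i))
        for t in text_nodes
        for i in image_nodes
        if same_entity(t, i)
    ]
    return nodes, links
```

Entities confirmed by both modalities end up joined by a link, which a downstream recognizer could treat as stronger evidence than a node seen in only one graph.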

In this way, the object recognition device 100 of this embodiment can convert different types of externally provided data, namely text data and image data, into respective graphs, determine whether the converted graphs are identical, and recognize and output the object.

Accordingly, even when image data is collected in a low-light or limited-visibility environment, the present invention recognizes the object by also using the corresponding text data, and can thereby increase the accuracy of object recognition at a specific place or a specific point in time.

In addition, the present invention converts the image data and the text data into respective graphs and recognizes the object using a graph formed by determining whether the converted graphs are identical and connecting them. Compared with the prior art, this can reduce the complexity of the neural network model used for object recognition and thereby improve the speed of object recognition.

FIG. 6 is a diagram showing a graph-based object recognition method according to an embodiment of the present invention, and FIGS. 7 to 9 are diagrams showing the graph-based object recognition method of the present invention in detail.

Referring to FIG. 6, the object recognition device 100 of this embodiment may receive, from a plurality of external sensors or the like, object data including a plurality of text data and a plurality of image data about one or more objects.

Accordingly, the processor 120 may execute the object recognition program 140 stored in the memory 130, recognize one or more objects at a specific place or a specific point in time from the received text data and image data, and output the result.

For example, the processor 120 may group the plurality of text data and the plurality of image data into a plurality of groups through the data grouping unit 141 (S10).

At this time, the data grouping unit 141 may extract one or more pieces of context information from each of the plurality of text data and the plurality of image data, and may determine the spatiotemporal identity between them based on the extracted context information.

이어, 데이터 그룹핑부(141)는 판단 결과에 기초하여 복수의 텍스트 데이터 및 복수의 이미지 데이터 중 동일 시공간을 갖는 하나 이상의 텍스트 데이터 및 이미지 데이터를 동일한 하나의 그룹으로 그룹핑할 수 있다. Next, the data grouping unit 141 may group one or more text data and image data having the same time and space among the plurality of text data and the plurality of image data into the same group based on the determination result.
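As a minimal illustrative sketch of grouping step S10, and not the implementation of the data grouping unit 141, the spatiotemporal identity check could be expressed as follows. The sketch assumes that the context information consists of planar location coordinates and a timestamp, and that two items share the same time and space when they fall within distance and time thresholds; the names `Record`, `same_spacetime`, `group_records`, `max_dist`, and `max_dt` are hypothetical and do not appear in the original.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Record:
    kind: str      # "text" or "image"
    payload: str   # stand-in for the actual text or image content
    x: float       # location coordinate (assumed planar)
    y: float
    t: float       # timestamp in seconds


def same_spacetime(a, b, max_dist=1.0, max_dt=60.0):
    """Assumed identity test: close in both space and time."""
    return hypot(a.x - b.x, a.y - b.y) <= max_dist and abs(a.t - b.t) <= max_dt


def group_records(records, max_dist=1.0, max_dt=60.0):
    """Greedy grouping: each record joins the first group whose first
    member (the anchor) shares its time and space, else starts a new group."""
    groups = []
    for rec in records:
        for members in groups:
            if same_spacetime(members[0], rec, max_dist, max_dt):
                members.append(rec)
                break
        else:
            groups.append([rec])
    return groups


records = [
    Record("text", "report A", 0.0, 0.0, 0.0),
    Record("image", "photo A", 0.5, 0.0, 10.0),
    Record("text", "report B", 10.0, 10.0, 0.0),
]
for members in group_records(records):
    print([r.payload for r in members])
```

The greedy anchor comparison is chosen only for brevity; any clustering over location and time could take its place.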

Next, the processor 120 may convert the text data of each of the plurality of groups into graph form using the pre-trained text graph conversion unit 143, thereby generating a text graph (S20).

Referring to FIG. 7, the keyword extraction unit 151 of the text graph conversion unit 143 may extract a plurality of keywords from the text data. The keyword extraction unit 151 may then determine the importance of each of the extracted keywords and, according to the determination result, extract at least one core keyword from among the plurality of keywords (S110).

Next, the text graph generation unit 153 of the text graph conversion unit 143 may generate an initial text graph having the extracted core keywords as nodes (S120). At this time, the nodes of the initial text graph may be connected to one another through edges.

Subsequently, the text graph generation unit 153 may determine the frequency with which the core keywords appear together in the text data (S130).

Then, according to the determination result, the text graph generation unit 153 may calculate a weight for each edge between nodes of the initial text graph, and apply the calculated edge weights to generate a text graph corresponding to the text data (S140).
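Steps S110 to S140 can be sketched roughly as follows. This is an illustration under stated assumptions rather than the trained text graph conversion unit 143: keyword importance is approximated by raw term frequency (the original uses a trained keyword extraction unit), the joint appearance of two core keywords is counted per sentence, and the function name `build_text_graph` and the parameter `top_k` are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations


def build_text_graph(text, top_k=3):
    """Return (core_keywords, edge_weights) for a piece of text.

    Assumptions: importance = term frequency; edge weight = number of
    sentences in which both core keywords appear.
    """
    # Split into sentences, then into lowercase word tokens.
    sentences = [re.findall(r"[a-z]+", s.lower()) for s in re.split(r"[.!?]", text)]
    sentences = [s for s in sentences if s]

    # S110: rank keywords by importance and keep the top_k as core keywords.
    freq = Counter(word for s in sentences for word in s)
    core = [word for word, _ in freq.most_common(top_k)]

    # S120: initial graph, every pair of core keywords joined by an edge.
    edges = {pair: 0 for pair in combinations(sorted(core), 2)}

    # S130/S140: count joint appearances and use them as edge weights.
    for s in sentences:
        present = sorted(set(s) & set(core))
        for pair in combinations(present, 2):
            edges[pair] += 1
    return core, edges


core, edges = build_text_graph("tank bridge. tank river. tank bridge river. truck")
print(core)
print(edges)
```

Edges are keyed by alphabetically sorted keyword pairs so that each undirected edge has a single representation.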

Referring again to FIG. 6, the processor 120 may convert the image data of each of the plurality of groups into graph form using the pre-trained image graph conversion unit 145, thereby generating an image graph (S30).

Referring to FIG. 8, the object extraction unit 161 of the image graph conversion unit 145 may extract one or more objects and their corresponding tags from the image data (S210).

Next, the image graph generation unit 163 of the image graph conversion unit 145 may generate an initial image graph having the one or more extracted tags as nodes (S220). At this time, the nodes of the initial image graph may be connected to one another through edges.

Subsequently, the image graph generation unit 163 may calculate the information entropy of each node of the initial image graph and determine the importance of each node based on the calculated information entropy (S230).

Then, the image graph generation unit 163 may prune the nodes of the initial image graph based on the determined importance, and generate and output the resulting image graph (S240).
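The entropy-based pruning of steps S230 and S240 can be illustrated with the following sketch. The original does not state how the information entropy of a node is computed or how it maps to importance, so the sketch makes explicit assumptions: each tag node carries a class-probability distribution (for example from a detector), the Shannon entropy of that distribution is used, lower entropy is treated as higher importance, and nodes above a threshold are pruned together with their edges. The names `shannon_entropy`, `prune_image_graph`, and `max_entropy` are hypothetical.

```python
from math import log2


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)


def prune_image_graph(nodes, edges, max_entropy=1.0):
    """Prune nodes whose entropy exceeds max_entropy (assumed criterion).

    nodes: dict mapping tag -> probability distribution for that tag
    edges: list of (tag, tag) pairs of the initial image graph
    Returns (kept_nodes, kept_edges, entropy_per_node).
    """
    # S230: information entropy per node, used as an importance signal.
    entropy = {tag: shannon_entropy(p) for tag, p in nodes.items()}
    # S240: keep low-entropy nodes and only the edges between kept nodes.
    kept = {tag for tag, h in entropy.items() if h <= max_entropy}
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges, entropy


nodes = {
    "tank": [0.9, 0.1],
    "truck": [0.5, 0.5],
    "cloud": [0.25, 0.25, 0.25, 0.25],
}
edges = [("tank", "truck"), ("tank", "cloud"), ("truck", "cloud")]
kept, kept_edges, entropy = prune_image_graph(nodes, edges)
print(sorted(kept), kept_edges)
```

If higher entropy were instead taken to indicate higher importance, only the comparison in `prune_image_graph` would need to change.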

Referring again to FIG. 6, the processor 120 may connect the text graph and the image graph of each of the plurality of groups using the pre-trained object recognition unit 147, recognize one or more objects in each of the plurality of groups based on the connected graph, and output the result (S40).

Referring to FIG. 9, the identity determination unit 171 of the object recognition unit 147 may extract one or more entities from each of the text graph and the image graph. The identity determination unit 171 may then analyze the equivalence relationship between the entities of the text graph and the entities of the image graph, and accordingly determine whether the text graph and the image graph are identical (S310).

If the text graph and the image graph are determined to be identical, the graph connection unit 173 may connect the text graph and the image graph based on the determination result (S320).

Accordingly, the object recognition unit 147 may recognize the object from the connected graph and output the object recognition result (S330).

On the other hand, if the text graph and the image graph are determined not to be identical, the identity determination unit 171 may determine whether another text graph and image graph are identical.
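The identity determination and connection of steps S310 and S320 can be sketched as follows. This is not the trained identity determination unit 171; it is an illustration that assumes entities are node labels, that equivalence between entities is decided by normalized label matching with an optional alias table, and that two graphs count as identical when the Jaccard overlap of their entity sets reaches a threshold. The names `normalize`, `connect_graphs`, `aliases`, and `min_overlap` are hypothetical.

```python
def normalize(label, aliases):
    """Map a label to a canonical entity name (assumed equivalence rule)."""
    key = label.strip().lower()
    return aliases.get(key, key)


def connect_graphs(text_edges, image_edges, aliases=None, min_overlap=0.5):
    """Return the connected edge set, or None if the graphs are not identical.

    text_edges, image_edges: iterables of (label, label) pairs.
    """
    aliases = aliases or {}

    def canonical(edges):
        # Undirected edges stored as sorted tuples of canonical entity names.
        return {
            tuple(sorted((normalize(a, aliases), normalize(b, aliases))))
            for a, b in edges
        }

    t_edges, i_edges = canonical(text_edges), canonical(image_edges)
    t_nodes = {n for e in t_edges for n in e}
    i_nodes = {n for e in i_edges for n in e}

    # S310: identity decision from entity overlap (Jaccard index, assumed).
    union = t_nodes | i_nodes
    overlap = len(t_nodes & i_nodes) / len(union) if union else 0.0
    if overlap < min_overlap:
        return None  # caller moves on to another text/image graph pair

    # S320: connect the graphs; equivalent entities collapse into one node.
    return t_edges | i_edges


merged = connect_graphs(
    [("Tank", "bridge")],
    [("armored vehicle", "bridge"), ("bridge", "river")],
    aliases={"armored vehicle": "tank"},
)
print(merged)
```

Recognition in step S330 would then operate on the returned edge set; returning `None` corresponds to the case in which another pair of graphs is examined.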

In this way, the present invention can convert externally provided text data and image data, which are different types of data, into respective graphs, determine whether the converted graphs are identical, and recognize and output the object.

Combinations of the blocks of the block diagrams and the steps of the flowcharts described above may be performed by computer program instructions. Because these computer program instructions can be loaded onto the encoding processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed through the encoding processor of the computer or other programmable data processing equipment create means for performing the functions described in each block of the block diagram or each step of the flowchart. These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means that perform the functions described in each block of the block diagram or each step of the flowchart. The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the computer or other programmable data processing equipment to produce a computer-executed process, and the instructions that operate the computer or other programmable data processing equipment can provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or each step may represent a module, a segment, or a portion of code that includes one or more executable instructions for executing the specified logical function(s). It should also be noted that, in some alternative embodiments, the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially simultaneously, or the blocks or steps may sometimes be performed in reverse order, depending on the functions involved.

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from the essential qualities of the present invention. Accordingly, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical idea of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present invention.

100: Object recognition device
110: Input/output unit
120: Processor
130: Memory
140: Object recognition program
141: Data grouping unit
143: Text graph conversion unit
145: Image graph conversion unit
147: Object recognition unit

Claims

A graph-based object recognition method comprising:
receiving a plurality of text data and a plurality of image data for an object from the outside and grouping them;
converting the text data of each group into a text graph using a pre-trained text graph conversion unit;
converting the image data of each group into an image graph using a pre-trained image graph conversion unit, by extracting a plurality of objects and a plurality of tags corresponding to each of the plurality of objects from the image data of each group, generating an initial image graph with each of the plurality of tags as a node, and pruning nodes of the initial image graph based on a result of determining the importance of each node of the initial image graph; and
recognizing the object by connecting the text graph and the image graph using a pre-trained object recognition unit.

The method according to claim 1,
wherein the grouping step comprises:
determining spatiotemporal identity between the plurality of text data and the plurality of image data based on one or more pieces of spatiotemporal context information; and
grouping one or more text data and image data that share the same time and space, among the plurality of text data and the plurality of image data, into one group based on a determination result.

The method according to claim 2,
wherein the one or more pieces of spatiotemporal context information include
location coordinates and timestamps at which each of the plurality of text data and the plurality of image data occurred.

The method according to claim 1,
wherein the step of converting into a text graph comprises:
extracting a plurality of keywords from the text data of each group;
extracting one or more core keywords by determining the importance of each of the plurality of keywords;
generating an initial text graph using the one or more core keywords as nodes;
determining the frequency with which the one or more core keywords appear together in the text data; and
generating the text graph by applying an edge weight to the initial text graph based on a determination result.

The method according to claim 1,
wherein the text graph conversion unit includes a keyword extraction unit, and
the keyword extraction unit
is trained, upon receiving an extraction answer as label data together with the text data, to extract and output the one or more core keywords from the text data.

(Deleted)

The method according to claim 1,
wherein the image graph conversion unit includes an object extraction unit, and
the object extraction unit
is trained, upon receiving an extraction answer as label data together with the image data, to extract and output the plurality of objects and the plurality of tags from the image data.

The method according to claim 1,
wherein the step of recognizing the object comprises:
determining whether entities of the text graph and the image graph are identical;
connecting the text graph and the image graph based on a determination result; and
outputting a recognition result for the object based on the connected graph.

The method according to claim 8,
wherein the object recognition unit includes an identity determination unit, and
the identity determination unit
is trained, upon receiving a correct answer as label data together with the text graph and the image graph, to determine whether the text graph and the image graph are identical and to output the result.

A graph-based object recognition device comprising:
a memory in which an object recognition program is stored; and
a processor that executes the object recognition program to: receive a plurality of text data and a plurality of image data for an object from the outside and group them; convert the text data of each group into a text graph using a pre-trained text graph conversion unit; extract, using a pre-trained image graph conversion unit, a plurality of objects and a plurality of tags corresponding to each of the plurality of objects from the image data of each group; generate an initial image graph with each of the plurality of tags as a node; convert the image data of each group into an image graph by pruning nodes of the initial image graph based on a result of determining the importance of each node of the initial image graph; and recognize the object by connecting the text graph and the image graph using a pre-trained object recognition unit.

The device according to claim 10,
wherein the processor
determines spatiotemporal identity between the plurality of text data and the plurality of image data based on one or more pieces of spatiotemporal context information, and, based on the determination result, groups one or more text data and image data that share the same time and space, among the plurality of text data and the plurality of image data, into one group.

The device according to claim 11,
wherein the one or more pieces of spatiotemporal context information include
location coordinates and timestamps at which each of the plurality of text data and the plurality of image data occurred.

The device according to claim 10,
wherein the processor
extracts a plurality of keywords from the text data of each group, determines the importance of each of the plurality of keywords to extract one or more core keywords, generates an initial text graph with the one or more core keywords as nodes, determines the frequency with which the one or more core keywords appear together in the text data, and generates the text graph by applying edge weights to the initial text graph based on the determination result.

The device according to claim 10,
wherein the text graph conversion unit includes a keyword extraction unit, and
the keyword extraction unit
is trained, upon receiving an extraction answer as label data together with the text data, to extract and output the one or more core keywords from the text data.

(Deleted)

The device according to claim 10,
wherein the image graph conversion unit includes an object extraction unit, and
the object extraction unit
is trained, upon receiving an extraction answer as label data together with the image data, to extract and output the plurality of objects and the plurality of tags from the image data.

The device according to claim 10,
wherein the processor
determines whether entities of the text graph and the image graph are identical, connects the text graph and the image graph based on the determination result, and outputs a recognition result for the object based on the connected graph.

The device according to claim 10,
wherein the object recognition unit includes an identity determination unit, and
the identity determination unit
is trained, upon receiving a correct answer as label data together with the text graph and the image graph, to determine whether the text graph and the image graph are identical and to output the result.

A computer-readable recording medium storing a computer program,
wherein the computer program includes instructions for causing a processor to perform a graph-based object recognition method comprising:
receiving a plurality of text data and a plurality of image data for an object from the outside and grouping them;
converting the text data of each group into a text graph using a pre-trained text graph conversion unit;
converting the image data of each group into an image graph using a pre-trained image graph conversion unit, by extracting a plurality of objects and a plurality of tags corresponding to each of the plurality of objects from the image data of each group, generating an initial image graph with each of the plurality of tags as a node, and pruning nodes of the initial image graph based on a result of determining the importance of each node of the initial image graph; and
recognizing the object by connecting the text graph and the image graph using a pre-trained object recognition unit.

A computer program stored on a computer-readable recording medium,
wherein the computer program includes instructions for causing a processor to perform a graph-based object recognition method comprising:
receiving a plurality of text data and a plurality of image data for an object from the outside and grouping them;
converting the text data of each group into a text graph using a pre-trained text graph conversion unit;
converting the image data of each group into an image graph using a pre-trained image graph conversion unit, by extracting a plurality of objects and a plurality of tags corresponding to each of the plurality of objects from the image data of each group, generating an initial image graph with each of the plurality of tags as a node, and pruning nodes of the initial image graph based on a result of determining the importance of each node of the initial image graph; and
recognizing the object by connecting the text graph and the image graph using a pre-trained object recognition unit.