KR102365087B1

KR102365087B1 - System for recognizing object through cooperation with user terminals and a server and method thereof

Info

Publication number: KR102365087B1
Application number: KR1020200014246A
Authority: KR
Inventors: 민 장; 오주병; 홍승곤
Original assignee: 주식회사 코이노
Priority date: 2019-12-20
Filing date: 2020-02-06
Publication date: 2022-02-21
Also published as: KR20210080143A

Abstract

A multi-party collaboration-based object recognition system and method are disclosed. According to an embodiment, the system includes a plurality of learner terminals, each of which recognizes an object by performing machine learning on the same input data, and a management server that receives an object recognition result value from each learner terminal, determines a single final object recognition result value from them, and transmits the final object recognition result value to each learner terminal so that the object recognition results are synchronized across the learner terminals.

Description

Multilateral collaboration-based object recognition system and method thereof {System for recognizing object through cooperation with user terminals and a server and method thereof}

The present invention relates to data processing technology and, more particularly, to object recognition technology.

Machine learning is a field of artificial intelligence that is applied to speech, video, and similar data, and it is used especially widely for image classification, matching, and comparative analysis. When the target data is an image, a typical approach is to collect an image library, categorize it, extract features with an artificial neural network such as a convolutional neural network (hereinafter 'CNN'), and raise accuracy by training on those features.

According to an embodiment, a multi-party collaboration-based object recognition system and method are proposed that increase the accuracy and efficiency of machine learning through collaboration between a plurality of learner terminals and a management server.

A multi-party collaboration-based object recognition system according to an embodiment includes a plurality of learner terminals, each of which recognizes an object by performing machine learning on the same input data, and a management server that receives an object recognition result value from each learner terminal, determines a single final object recognition result value from them, and transmits the final object recognition result value to each learner terminal so that the object recognition results are synchronized across the learner terminals.

The management server may compare and analyze the object recognition result values received from the learner terminals and determine, as the final object recognition result value, an object recognition result value that satisfies at least one of two conditions: it is the value shared by the largest number of identical results, or it has the highest learning accuracy.
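The selection rule above can be illustrated with a minimal Python sketch. The patent only requires that at least one of the two conditions be met; the particular combination used here (majority vote first, with the highest reported accuracy as a tie-breaker) is an assumption, as are the names `RecognitionResult`, `terminal_id`, `label`, `accuracy`, and `decide_final_label`.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionResult:
    terminal_id: str   # hypothetical identifier of a learner terminal
    label: str         # recognized object, e.g. "cat"
    accuracy: float    # learning accuracy reported by the terminal (0.0 to 1.0)


def decide_final_label(results):
    """Return the most frequent label; ties go to the label with the highest accuracy.

    This ordering of the two conditions is an illustrative assumption,
    not something the patent text prescribes.
    """
    if not results:
        raise ValueError("no recognition results received")
    counts = Counter(r.label for r in results)
    best_accuracy = {}
    for r in results:
        best_accuracy[r.label] = max(best_accuracy.get(r.label, 0.0), r.accuracy)
    return max(counts, key=lambda label: (counts[label], best_accuracy[label]))


results = [
    RecognitionResult("2-1", "cat", 0.80),
    RecognitionResult("2-2", "dog", 0.95),
    RecognitionResult("2-3", "cat", 0.70),
]
print(decide_final_label(results))
```

With the three sample results, two terminals agree on "cat", so that label wins even though a single terminal reports "dog" with higher accuracy. A server that favoured the accuracy condition instead would only need a different sort key.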

The management server may combine the object recognition result values received from the learner terminals, perform learning on the combined data, and determine the learning result of the combined data as the final object recognition result value.

The management server may combine the object recognition result values received from the learner terminals, perform learning on the combined data, and then determine the final object recognition result value by comprehensively comparing and analyzing the learning result of the combined data and the object recognition result of each learner terminal.

The management server may include: a receiver that receives an object recognition result value from each learner terminal; an analysis unit that compares and analyzes the object recognition result values; a learning unit that combines the object recognition result values and performs learning on the combined data; a final decision unit that determines a single final object recognition result value by comprehensively evaluating the analysis unit's comparative analysis of the object recognition result values and the learning unit's learning result for the combined data; and a transmitter that transmits the final object recognition result value to each learner terminal to synchronize the object recognition results across the learner terminals.

The analysis unit may compare and analyze the object recognition result values received from the learner terminals to determine whether any of them are identical. The final decision unit may include a first determination unit and a second determination unit. If identical object recognition result values exist, the first determination unit makes a preliminary determination, selecting as an intermediate object recognition result value the value that satisfies at least one of two conditions: being shared by the largest number of identical results, or having the highest learning accuracy. The second determination unit compares the intermediate object recognition result value with the learning unit's learning result for the combined data to determine whether they indicate the same object. If they are the same, it confirms the intermediate object recognition result value as the final object recognition result value; if they differ, it passes both the intermediate recognition result value and the learning result for the combined data to the transmitter.

When the learning accuracy of the learning performed by the learning unit exceeds a preset threshold, the second determination unit may determine and provide the learning unit's learning result value as the final object recognition result value; when the learning accuracy does not exceed the preset threshold, the second determination unit may pass both the intermediate recognition result value and the learning unit's learning result value to the transmitter.
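The behaviour of the second determination unit can be sketched as follows. This is only one possible reading: the order in which the equality check and the threshold check are applied, the default threshold of 0.9, and the names `Decision` and `second_stage` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Decision:
    final_label: Optional[str]     # None when no single result could be confirmed
    candidates: Tuple[str, ...]    # labels handed to the transmitter


def second_stage(intermediate_label, combined_label, combined_accuracy, threshold=0.9):
    """Combine the first-stage (intermediate) result with the combined-data learning result.

    The threshold default and the order of the checks are illustrative assumptions.
    """
    # Same object: confirm the intermediate result as final.
    if intermediate_label == combined_label:
        return Decision(intermediate_label, (intermediate_label,))
    # Different objects, but the combined-data learning is accurate enough: use it.
    if combined_accuracy > threshold:
        return Decision(combined_label, (combined_label,))
    # Otherwise forward both values without confirming either.
    return Decision(None, (intermediate_label, combined_label))


print(second_stage("cat", "cat", 0.50))
print(second_stage("cat", "dog", 0.95))
print(second_stage("cat", "dog", 0.50))
```

Returning both candidates in the unresolved case mirrors the text, which has both values passed to the transmitter rather than discarding one of them.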

The management server may train the learning of each learner terminal by repeating a process in which the transmitter requests re-learning while transmitting the final object recognition result value to each learner terminal, and the receiver receives the re-learned object recognition result values from the learner terminals.

The multi-party collaboration-based object recognition system may further include a guide terminal that provides the input data to the plurality of learner terminals as a machine learning assignment, and each learner terminal may be an educational terminal that performs the machine learning assigned to it.

Each learner terminal may include: a memory in which machine learning instructions are stored; a processor that uses the machine learning instructions to perform machine learning on the input data and generates an object recognition result value including the learned object information and a learning record, and that, upon receiving the final object recognition result value from the management server, repeats a process of either changing its object recognition result value to the final object recognition result value or performing machine learning anew with reference to the final object recognition result value to recognize the object; and a communication unit that transmits the object recognition result value to the management server and receives the final object recognition result value from the management server.

Each learner terminal may include a vision assistance device that is mounted on a main body worn on the user's body and captures images of the user's surroundings.

A multi-party collaboration-based object recognition method according to another embodiment includes: receiving an object recognition result value for the same input data from each of a plurality of learner terminals; determining a single final object recognition result value from the received object recognition result values; and transmitting the final object recognition result value to each learner terminal to synchronize the object recognition results across the learner terminals.

Determining the final object recognition result value may include: comparing and analyzing the object recognition result values; combining the object recognition result values and performing learning on the combined data; and determining a single final object recognition result value by comprehensively evaluating the comparative analysis of the object recognition result values and the learning result for the combined data.

Determining a single final object recognition result value by comprehensive evaluation may include: comparing and analyzing the object recognition result values received from the learner terminals and determining, as an intermediate object recognition result value, the value that satisfies at least one of two conditions, namely being shared by the largest number of identical results or having the highest learning accuracy; and comparing the intermediate object recognition result value with the learning result for the combined data to determine whether they indicate the same object, confirming the intermediate object recognition result value as the final object recognition result value if they are the same, and providing both the intermediate recognition result value and the learning result for the combined data if they differ.

Determining a single final object recognition result value by comprehensive evaluation may further include: determining the learning result value of the combined data as the final object recognition result value when the learning accuracy of that learning exceeds a preset threshold; and providing both the intermediate recognition result value and the learning result value of the combined data when the learning accuracy does not exceed the preset threshold.

The multi-party collaboration-based object recognition system and method according to an embodiment take into account that terminal capabilities differ among the learner terminals: through collaboration with the management server, the learning of each learner terminal is guided, and the object recognition results are synchronized through that guidance, which can increase learning accuracy. In particular, when this collaboration function is applied to education, objects are recognized collaboratively, which can yield more accurate results in identifying shapes or guide learners to report new forms of observation.

FIG. 1 is a diagram showing the structure of a convolutional neural network (CNN), provided to aid understanding of the present invention;
FIG. 2 is a diagram showing a neural network structure that illustrates an example of actually carrying out a machine learning process in a convolutional neural network (CNN) according to an embodiment of the present invention;
FIG. 3 is a diagram showing the configuration of a multi-party collaboration-based object recognition system according to an embodiment of the present invention;
FIG. 4 is a diagram showing the structure of a learner terminal according to an embodiment of the present invention;
FIG. 5 is a diagram showing the detailed configuration of a learner terminal according to an embodiment of the present invention;
FIG. 6 is a diagram showing the configuration of a management server according to an embodiment of the present invention;
FIG. 7 is a diagram showing a multi-party collaboration-based object recognition scenario according to an embodiment of the present invention; and
FIG. 8 is a diagram showing the process of a multi-party collaboration-based object recognition method according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of the scope of the invention, which is defined only by the scope of the claims. Like reference numerals refer to like elements throughout the specification.

In describing the embodiments of the present invention, detailed descriptions of well-known functions or configurations are omitted where they would unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the embodiments of the present invention and may vary according to the intention or custom of users or operators. Their definitions should therefore be based on the content of this specification as a whole.

Combinations of the blocks in the accompanying block diagrams and the steps in the flowcharts may be performed by computer program instructions (an execution engine). Because these computer program instructions can be loaded onto the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by that processor create means for performing the functions described in each block of the block diagram or each step of the flowchart.

These computer program instructions may also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular manner, so that the instructions stored in that memory can produce an article of manufacture containing instruction means that perform the function described in each block of the block diagram or each step of the flowchart.

The computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on that equipment to produce a computer-executed process; the instructions that run on the computer or other programmable data processing equipment can thereby provide steps for executing the functions described in each block of the block diagram and each step of the flowchart.

In addition, each block or step may represent a module, segment, or portion of code that includes one or more executable instructions for executing the specified logical functions. It should be noted that, in some alternative embodiments, the functions mentioned in the blocks or steps may occur out of order. For example, two blocks or steps shown in succession may in fact be performed substantially simultaneously, or may be performed in the reverse order of the corresponding functions where necessary.

A neural network trained according to the present invention can be used for a variety of complex computational tasks. For example, a neural network can be used for object recognition when given image data. Object recognition includes facial recognition, handwriting analysis, medical image analysis, and tasks required for analyzing objects or features contained in an image, or similar tasks. Neural networks can be used for environmental monitoring, manufacturing and production control, medical diagnostic assistance, and a variety of similar procedures. Given speech data, neural networks can also perform speech recognition, language translation, and other language tasks.

The terms used herein are defined below to aid understanding of the present invention.

As used herein, the term “neural network” generally refers to software useful for machine learning that implements statistical learning algorithms of an adaptive nature. A neural network includes a plurality of artificial nodes, known as “neurons”, “processing elements”, “units”, or other similar terms, that are interconnected to form a network modeled on biological neural networks. In general, a neural network includes sets of adaptive weights (numeric parameters tuned by a learning algorithm) and is capable of approximating nonlinear functions of its inputs. The adaptive weights represent the strength of the connections between neurons that are activated during training or prediction. In general, neural networks operate according to the principles of nonlinear, distributed, parallel, and local processing and adaptation.

One type of artificial neural network is the convolutional neural network (CNN). In general, convolution is a mathematical operation on two functions (f, g) that produces a third function, which can be viewed as a modified version of one of the original functions. The third function gives the area of overlap between the two functions as a function of the amount by which one of the original functions is shifted.
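For reference, the standard textbook definition of the convolution of two functions f and g, in continuous and discrete form, can be written as below. The symbols t, τ, i, and m are notation assumed here and do not appear in the patent text.

```latex
(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau
\qquad
(f * g)[i] = \sum_{m=-\infty}^{\infty} f[m]\, g[i - m]
```

Here t (or i) is the amount by which one function is shifted relative to the other, and the value of the third function at that shift is the accumulated product of the two overlapping functions.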

In general, a convolutional neural network (CNN) is a type of neural network in which the individual neurons are tiled so that they respond to overlapping regions of the visual field. A convolutional neural network (CNN) includes an input layer, intermediate layers, and an output layer. The input layer receives the input data, and the output layer outputs the final classification result for the input data. The intermediate layers can be described as three types of layers: convolution layers, pooling layers, and a fully connected layer at the top. A convolution layer extracts convolutional features, that is, meaningful features. Each convolution layer can be parameterized by a convolution filter. The power of a convolutional neural network (CNN) comes from its layers, which start with simple features of the input data and learn increasingly complex features from layer to layer, so that later layers carry high-level meaning. A pooling layer is used immediately after a convolution layer and simplifies the output of the convolution layer. The fully connected layer performs classification using the features produced by the convolution and pooling layers.

As used herein, the term “subsampling” or “downsampling” means reducing the overall size of a signal. The technique referred to as “max pooling” means taking, for each element of the reduced matrix, the maximum value of the corresponding region.

In exemplary embodiments, the methods and apparatus disclosed herein are useful for training neural networks. A neural network may be configured to perform object recognition from image data. The exemplary embodiments are illustrative only, however, and the present invention is not limited to them. The methods and apparatus disclosed herein can therefore be used equally in other applications that use neural networks.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. The embodiments illustrated below may be modified into various other forms, and the scope of the present invention is not limited to them. The embodiments are provided to explain the present invention more completely to those of ordinary skill in the art to which the invention pertains.

FIG. 1 is a diagram showing the structure of a convolutional neural network (CNN), provided to aid understanding of the present invention.

Referring to FIG. 1, a convolutional neural network (CNN) consists of a feature extraction stage and a classification stage. The feature extraction stage consists of a plurality of feature extraction layers, each composed of a convolution layer and a pooling layer. The classification stage creates one fully connected layer and produces a result using the extracted features.

The convolution layer performs a convolution function to represent the features of the input image. The convolution function computes an output unit by applying a k×k convolution filter to an input unit. The output unit carries feature information of the image. The convolution computation extracts every possible k×k subregion from the whole area of the input unit, then multiplies each element of the convolution filter uniquely assigned between the input unit and the output unit by the corresponding value of the k×k subregion and sums the products (that is, it takes the dot product of the filter and the subregion). The convolution filter consists of k×k parameters and is also called a kernel. One kernel is applied in common to all subregions of an input unit (that is, a channel).
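The multiply-and-sum computation described above can be sketched in plain Python. Stride 1 and no padding are assumptions (they match the 32×32 to 28×28 reduction given for a 5×5 filter in FIG. 2), and the function name `convolve_valid` is hypothetical.

```python
def convolve_valid(unit, kernel):
    """Apply a k x k kernel to every k x k subregion of a 2D input unit.

    Assumes stride 1 and no padding, so an N x N input yields an
    (N - k + 1) x (N - k + 1) output unit (feature map).
    """
    k = len(kernel)
    rows, cols = len(unit), len(unit[0])
    output = []
    for i in range(rows - k + 1):
        output_row = []
        for j in range(cols - k + 1):
            total = 0
            # Dot product of the kernel and the subregion whose top-left corner is (i, j).
            for a in range(k):
                for b in range(k):
                    total += kernel[a][b] * unit[i + a][j + b]
            output_row.append(total)
        output.append(output_row)
    return output


unit = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, 1],
]
print(convolve_valid(unit, kernel))  # [[6, 8], [12, 14]]
```

The same kernel values are reused at every position, which is what the text means by one kernel being applied in common to all subregions of a channel.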

The output unit of one layer can be used as the input unit of the next layer. A unit used as the input of a layer is also called a channel, and a unit used as the output of a layer is also called a feature map.

The pooling layer simplifies the output of the convolution layer. For example, the pooling layer spatially downsamples its input. Because image data contains many pixels, it is subsampled to reduce the number of features. One pooling method is max-pooling subsampling, which selects only the largest response in each window.
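Max pooling can be sketched as follows. Non-overlapping windows (stride equal to the window size) are an assumption, and the function name `max_pool` is hypothetical.

```python
def max_pool(feature_map, window=2):
    """Downsample a 2D feature map by keeping the largest value in each window.

    Assumes non-overlapping windows; rows or columns that do not fill a
    whole window are dropped.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[i + a][j + b]
                for a in range(window)
                for b in range(window)
            )
            for j in range(0, cols - window + 1, window)
        ]
        for i in range(0, rows - window + 1, window)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [7, 0, 3, 2],
    [1, 6, 4, 8],
]
print(max_pool(feature_map))  # [[4, 5], [7, 8]]
```

A 4×4 map becomes 2×2, so each pooling step with a 2×2 window cuts the number of values to a quarter while keeping the strongest response from each region.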

The output units of the last feature extraction layer are additionally connected to a fully connected layer. The fully connected layer uses the features extracted through the plurality of feature extraction layers to classify the image data as, for example, a dog, a cat, or a bird.

FIG. 2 is a diagram showing a neural network structure that illustrates an example of actually carrying out a machine learning process in a convolutional neural network (CNN) according to an embodiment of the present invention.

Referring to FIG. 2, in the feature extraction stage, features are extracted from 32×32-pixel input image data (Input) through a 5×5 convolution filter to produce four 28×28 images (C₁), and 2×2 subsampling (an operation that reduces size) is performed on them to produce the same number, four, of 14×14 images (S₁). Features are then extracted again through a 5×5 convolution filter to produce twelve 10×10 images (C₂), and 2×2 subsampling is performed on them to produce the same number, twelve, of 5×5 images (S₂). Next, in the classification stage, a single fully connected matrix (n₁) is formed and fed into the neural network, the values are compared, and the result (Output) is obtained.
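The image sizes quoted for FIG. 2 follow from simple arithmetic, which the short sketch below reproduces. It assumes convolution without padding at stride 1 and non-overlapping 2×2 subsampling; the function names are hypothetical.

```python
def conv_output_size(size, kernel):
    """Side length after a kernel x kernel convolution (no padding, stride 1)."""
    return size - kernel + 1


def subsample_output_size(size, window):
    """Side length after non-overlapping window x window subsampling."""
    return size // window


def trace_sizes(input_size=32, kernel=5, window=2, stages=2):
    """List the side length after each convolution and subsampling step."""
    sizes = [input_size]
    for _ in range(stages):
        sizes.append(conv_output_size(sizes[-1], kernel))
        sizes.append(subsample_output_size(sizes[-1], window))
    return sizes


print(trace_sizes())  # [32, 28, 14, 10, 5]
```

The sequence matches Input, C₁, S₁, C₂, and S₂ in the text. Under the same assumptions, the twelve 5×5 images in S₂ hold 12 × 5 × 5 = 300 values in total; how those values are arranged in n₁ is not stated in the text.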

FIG. 3 is a diagram illustrating the configuration of a multi-party collaboration-based object recognition system according to an embodiment of the present invention.

Referring to FIG. 3, the multi-party collaboration-based object recognition system (1) includes learner terminals (2), a management server (4), and a network (5), and may further include a guide terminal (3).

There are n learner terminals (2), where n is an integer. They are held by different users and are intelligent terminals capable of object recognition. A learner terminal identifies the type of an object displayed on its screen, recognizes the object, and can then provide additional services for the recognized object. Additional services include, for example, a service that links to explanatory material about the object or to a website that can offer related suggestions.

The learner terminal (2) has computing resources for artificial intelligence-based machine learning. Exemplary resources include memory, processing power, and data storage for performing machine learning. The degree of learning inevitably differs from one learner terminal (2-1, 2-2, …, 2-n) to another according to terminal capability. For example, depending on the capability of each learner terminal (2-1, 2-2, …, 2-n), the size of the object recognized through learning may differ between terminals, or its shape may differ greatly with orientation, which can make object recognition difficult. Terminal capability is a parameter indicating how well the learner terminal (2) can process machine learning, for example the number of CPUs, the clock speed, or the cache memory size. It may also be the operating system (OS), web browser, or the like of the learner terminal (2).

The multi-party collaboration-based object recognition system (1) according to an embodiment takes into account the differing capabilities of the individual learner terminals (2-1, 2-2, …, 2-n) described above. Through collaboration with the management server (4), it guides the learning of each learner terminal (2-1, 2-2, …, 2-n) and synchronizes the object recognition results through that guidance, thereby increasing learning accuracy. In particular, when this collaboration function is applied to education, objects are recognized collaboratively, which can yield more accurate results in identifying shapes or guide learners to report newly observed forms.

Each individual learner terminal (2-1, 2-2, …, 2-n) recognizes an object using its artificial intelligence and transmits its object recognition result value to the management server (4) via the network (5). Each object recognition result value includes learned object information and a learning record. For example, assuming that a 'ball image' is learned as the object recognition target, the learned object information consists of the images to be learned, the learning result 'Pokemon ball', and the number of images, '50', and the learning record is 'learning accuracy 80%'.
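One way to picture the object recognition result value described above is as a small record serialized for transmission. This is a sketch only: the class name `RecognitionResult`, its field names, and the use of JSON are assumptions, since the patent does not specify a data format. The images themselves are omitted for brevity.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RecognitionResult:
    """Hypothetical record sent from a learner terminal to the management server."""
    terminal_id: str   # e.g. "2-1"
    label: str         # learned object information: recognized object
    image_count: int   # learned object information: number of images learned
    accuracy: float    # learning record: learning accuracy in the range 0..1


# Values taken from the 'ball image' example in the text.
result = RecognitionResult(terminal_id="2-1", label="Pokemon ball",
                           image_count=50, accuracy=0.80)
payload = json.dumps(asdict(result))
print(payload)
```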

The management server (4) may compare the object recognition result values received from the individual learner terminals (2-1, 2-2, …, 2-n) and determine the most probable one as the final object recognition result value. The management server (4) then transmits the determined final object recognition result value to the individual learner terminals (2-1, 2-2, …, 2-n) via the network (5). Each learner terminal either changes its own object recognition result value to the final object recognition result value or repeats the process of recognizing the object anew with reference to the final value, whereby object recognition is synchronized among the individual learner terminals (2-1, 2-2, …, 2-n). In other words, the individual learner terminals receive the object confirmed by the management server (4) and align themselves to it, so that the final object recognition result value is shared among them.

The management server (4) compares and analyzes the object recognition result values received from the individual learner terminals (2-1, 2-2, …, 2-n), and may also assemble them into a combined whole and re-examine it. In the re-examination step, the management server (4) checks whether the received object recognition result values differ from one another, confirms the single most probable final object recognition result value, and transmits the confirmed value to the individual learner terminals (2-1, 2-2, …, 2-n) to synchronize the object recognition results.

Even for an object that can be taken in at a glance on the screen, different object recognition result values may be derived depending on terminal performance, including the recognition level and degree of learning of the individual learner terminals (2-1, 2-2, …, 2-n). When different learner terminals recognize different images and transmit their object recognition results to the management server (4), the management server (4) checks whether each is a normal object recognition result value and, if not, checks whether the result values of the individual learner terminals all differ. If they differ, the value reported identically by the largest number of terminals may be confirmed as the final object recognition result value.

As another example, the management server (4) may combine the object recognition result values received from the individual learner terminals (2-1, 2-2, …, 2-n), perform learning on the combined data, and determine its own learning result as the final object recognition result value. For example, in the case of a large object such as an elephant, the legs, nose, and torso are recognized as legs, a nose, and a torso when viewed separately, but as an elephant when viewed as a single object. The management server (4) may combine the object recognition result values received from the individual learner terminals (a leg image, a nose image, and a torso image), learn the combination, and derive the final object recognition result value (elephant).

Furthermore, when the object recognition result values of the individual learner terminals (2-1, 2-2, …, 2-n) differ from one another, the management server (4) may provisionally determine the value reported identically by the largest number of terminals as an intermediate object recognition result value. It may then combine the received object recognition result values, perform learning on the combined data, and determine the final object recognition result value after comprehensively comparing and analyzing its own learning result and the learning results of the individual learner terminals (2-1, 2-2, …, 2-n).

The learner terminal (2) according to an embodiment is a mobile device. A mobile device has the computing resources available in a mobile environment and may have a reduced set of computing resources. Examples of mobile devices include smartphones and tablet computers. Supported mobile devices include iPhones running Apple's iOS, Android phones running Google's Android, and Windows phones running Microsoft's Windows. The learner terminal (2) may also be a wearable terminal that the user wears on the body, such as a head-mounted display (HMD) or smart glasses. The learner terminal (2) can communicate through the network (5) with resources operating on the remote management server (4).

The learner terminal (2) according to another embodiment is an Internet of Things (IoT) device. The thing may be any of various devices, such as a webcam, a security camera, a surveillance camera, a thermostat, a heart rate monitor, a smart appliance, a smart car, a field operation device, or various sensors.

In one embodiment, the management server (4) may include a conventional server such as a blade server, and may be a mainframe, a network of personal computers, or simply a personal computer. The management server (4) may be located remotely from the learner terminal (2). The management server (4) performs centralized data storage, processing, and analysis, and may perform deep learning, which requires high computing power and a large data storage capacity. The management server (4) may remotely control the learner terminal (2), in which case it can monitor and control the execution screen of the learner terminal (2). The management server (4) may include image processing capability for combining the input images fed to the neural network.

The guide terminal (3) provides the same input data to the individual learner terminals (2-1, 2-2, …, 2-n) as a machine learning task. The individual learner terminals (2-1, 2-2, …, 2-n) can then function as educational terminals that perform the assigned machine learning task and train to raise their learning rate.

FIG. 4 is a diagram illustrating the structure of a learner terminal according to an embodiment of the present invention.

As shown in FIG. 4(a), the learner terminal (2) may be divided into a vision assistance device (20) and a control device (22). The vision assistance device (20) may be a wearable device that attaches to and detaches from the temple of the user's glasses. A connection unit (26) connects the vision assistance device (20) and the control device (22). The connection unit (26) may connect them by wire, wirelessly, or by a combination of wired and wireless methods. A photographing unit capable of capturing external images may be mounted on the vision assistance device (20), and a battery may be mounted in the control device (22).

The vision assistance device (20) is a wearable device that the user can easily wear and carry. One example is a device that attaches to and detaches from the frame or temple of glasses worn by the user. A photographing unit is mounted on its front or side, so that images can be captured in real time. Such a wearable device can be attached to and detached from ordinary existing glasses, and its weight is minimized so that it is comfortable to wear.

As shown in FIG. 4(b), the learner terminal (2) may take a form in which the vision assistance device (20) is linked with a smart device (24), such as a smartphone. The connection unit (26) connects the vision assistance device (20) and the smart device (24). The connection unit (26) may connect them by wire, wirelessly, or by a combination of wired and wireless methods. A photographing unit capable of capturing external images is mounted on the vision assistance device (20), which may draw on the battery of the smart device (24). The smart device (24) can recognize visual information displayed on its screen.

As shown in FIG. 4(c), the learner terminal (2) may be a smart device (24) alone, without a vision assistance device. For example, the functions described above can be performed using only the smart device (24), without wearing the vision assistance device (20). In this case, external images can be captured using the photographing unit mounted on the smart device (24) itself, and the captured image or the display screen can be recognized.

FIG. 5 is a diagram illustrating the detailed configuration of a learner terminal according to an embodiment of the present invention.

Referring to FIGS. 3 and 5, the learner terminal (2) includes an input unit (31), a processor (32), an output unit (33), a memory (34), a communication unit (35), and a battery (36).

The photographing unit (30) captures external images in real time and may be a camera. The input unit (31) receives operation signals from the user, for example through an input device such as a keyboard or mouse. The input unit (31) also acquires the input data to be used for machine learning. The input data may be entered by the user, may be image data captured by the photographing unit (30), or may be received from an external device (for example, the guide terminal (3)). The input data may be a still image, video, audio, or the like.

The memory (34) stores data and machine learning instructions. The machine learning instructions are for executing the method of the present invention, which is implemented through control of computing resources and related components.

The processor (32) controls each component of the learner terminal (2). The processor (32) according to an embodiment uses the machine learning instructions stored in the memory (34) to perform machine learning on the data received through the input unit (31), recognizes an object, and transmits the object recognition result value to the management server (4) through the communication unit (35). The machine learning process of the processor (32) is as described above with reference to FIGS. 1 and 2. When the processor (32) receives the final object recognition result value from the management server (4) through the communication unit (35), it either changes its object recognition result value to the final value or repeats the process of performing machine learning anew with reference to the final value to recognize the object, thereby synchronizing its object recognition result with the other learner terminals.

The communication unit (35) communicates with the management server (4) through a wired or wireless interface. The wireless interface may use protocols such as cellular, Bluetooth, Wi-Fi, NFC, and ZigBee. Communication services may be provided through communication interfaces including Bluetooth, Wi-Fi, Ethernet, DSL, LTE, PCS, 2G, 3G, 4G, 5G, LAN, CDMA, TDMA, GSM, WDM, and WLAN. The communication interface may include a voice channel. The communication unit (35) according to an embodiment transmits the object recognition result value generated by the processor (32) to the management server (4) and receives the final object recognition result value generated by the management server (4). The communication unit (35) may also transmit to the management server (4) intermediate data whose size has been reduced through subsampling. In this case, far less data is sent than when the input data is transmitted whole, so the required bandwidth is reduced and the management server (4) can handle the subsequent processing quickly, which improves accuracy and efficiency.
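The size reduction from sending subsampled intermediate data can be estimated with the layer sizes given for FIG. 2. This is a rough sketch that assumes a single-channel 32×32 input and the same number of bytes per value at every layer; neither assumption is stated in the text. Under those assumptions the saving appears at the second subsampling layer (S₂), with 300 values against 1,024.

```python
def element_count(maps, height, width):
    """Number of values in a stack of feature maps."""
    return maps * height * width


raw_input = element_count(1, 32, 32)   # whole input image (assumed single channel)
after_s1 = element_count(4, 14, 14)    # four 14x14 maps after the first subsampling
after_s2 = element_count(12, 5, 5)     # twelve 5x5 maps after the second subsampling

reduction = 1 - after_s2 / raw_input   # fraction of values saved by sending S2
print(raw_input, after_s1, after_s2)   # 1024 784 300
print(f"{reduction:.1%}")              # 70.7%
```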

The output unit (33) outputs information needed for the learner terminal (2) to operate and information generated by its operation. The output unit (33) may be an output device such as a display or touch panel, or may be connected to one. When the communication unit (35) receives the final object recognition result value from the management server (4), the output unit (33) may display it on the screen or output it as auditory information, such as a voice signal or a warning sound.

FIG. 6 is a diagram illustrating the configuration of a management server according to an embodiment of the present invention.

Referring to FIGS. 3 and 6, the management server (4) includes a receiving unit (40), an analysis unit (42), a learning unit (44), a final determination unit (46), and a transmission unit (48).

The receiving unit (40) receives an object recognition result value from each learner terminal (2-1, 2-2, …, 2-n).

The analysis unit (42) compares and analyzes the object recognition result values received through the receiving unit (40). It can determine whether the value received from each learner terminal (2-1, 2-2, …, 2-n) is a normal value, and can compare the values to determine whether any of them are identical.

The learning unit (44) combines the object recognition result values received through the receiving unit (40) and performs learning on the combined data. Methods of combination include overlaying the received images on one another and gathering the received images onto a single sheet. For example, overlaying applies when each learner terminal has recognized a common object, a 'ball', such as a Pokemon ball, a yoga ball, or a bowling ball. Gathering the received images onto a single sheet applies when each learner terminal has recognized a part of an elephant, such as the legs, nose, or torso.

The final determination unit (46) may compare and analyze the object recognition result values received from the learner terminals and determine, as the final object recognition result value, the value that satisfies at least one of two conditions: being the value reported identically by the largest number of terminals, or having the highest learning accuracy. For example, suppose the object recognition result value of learner terminal 1 (2-1) is 50 training images, 80% learning accuracy, and 'Pokemon ball'; that of learner terminal 2 (2-2) is 70 training images, 85% learning accuracy, and 'yoga ball'; and that of learner terminal 3 (2-3) is 100 training images, 90% learning accuracy, and 'bowling ball'. Then 'bowling ball', which has the most training images and the highest learning accuracy, is determined as the final object recognition result value.
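The selection rule above can be sketched as follows. The text allows either condition to be used; this sketch assumes one particular ordering (prefer a label reported by more than one terminal, otherwise fall back to the highest learning accuracy), and the function name `decide_final` and the tuple layout are illustrative only.

```python
from collections import Counter


def decide_final(results):
    """Pick a final label from (label, accuracy) pairs.

    Assumed ordering: if some label is reported by more than one terminal,
    take the most frequent label (ties go to the label seen first);
    otherwise take the label with the highest learning accuracy.
    """
    counts = Counter(label for label, _ in results)
    label, votes = counts.most_common(1)[0]
    if votes > 1:
        return label
    return max(results, key=lambda r: r[1])[0]


# The three-terminal example from the text: all labels differ.
all_different = [("Pokemon ball", 0.80), ("yoga ball", 0.85), ("bowling ball", 0.90)]
print(decide_final(all_different))  # bowling ball

# A hypothetical case in which two terminals agree.
two_agree = [("Pokemon ball", 0.80), ("Pokemon ball", 0.70), ("bowling ball", 0.90)]
print(decide_final(two_agree))  # Pokemon ball
```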

As another example, the final determination unit (46) may combine the object recognition result values received from the learner terminals (2-1, 2-2, …, 2-n), perform learning on the combined data, and determine the learning result of the combined data as the final object recognition result value. For example, if the object recognition result value of learner terminal 1 (2-1) is 'legs', that of learner terminal 2 (2-2) is 'nose', and that of learner terminal 3 (2-3) is 'torso', then 'elephant', the result of learning the data that combines 'legs', 'nose', and 'torso', is determined as the final object recognition result value.

As yet another example, the final determination unit (46) may combine the object recognition result values received from the learner terminals (2-1, 2-2, …, 2-n), perform learning on the combined data, and determine the final object recognition result value after comprehensively comparing and analyzing the learning result of the combined data and the learning results of the learner terminals (2-1, 2-2, …, 2-n). To this end, the final determination unit (46) may include a first determination unit (461) and a second determination unit (462).

The first determination unit (461) compares and analyzes the learning results of the learner terminals (2-1, 2-2, …, 2-n). If identical object recognition result values exist, it makes a first determination, selecting as the intermediate object recognition result value the value that satisfies at least one of two conditions: being the value reported identically by the largest number of terminals, or having the highest learning accuracy. The second determination unit (462) compares the intermediate object recognition result value from the first determination unit (461) with the result of the learning unit (44) learning the combined data, and determines whether they are the same object. If they are the same, the intermediate object recognition result value is confirmed as the final object recognition result value. If they differ, the intermediate recognition result value and the learning result value of the combined data are provided together.

In this case, if the learning accuracy of the learning unit (44) on the combined data exceeds a preset threshold, for example 95%, the second determination unit (462) may determine the learning result value of the learning unit (44) as the final object recognition result value. If the learning accuracy is at or below the preset threshold, for example 95%, the second determination unit (462) may pass both the intermediate recognition result value and the learning result value of the combined data to the transmission unit (48). When the learning result of the learning unit (44) exceeds the preset threshold, the learning of the management server (4) can be trusted. When it does not, the learning of the management server (4) is hard to trust, so the intermediate recognition result value and the learning result value of the combined data are provided together to each learner terminal (2-1, 2-2, …, 2-n), after which the learning unit (44) may perform additional learning until the preset threshold is exceeded.
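The behavior of the second determination unit can be sketched as a small decision function. The function name `second_decision`, the returned dictionary layout, and the order in which the checks are applied are assumptions made for illustration; the 95% threshold is the example value from the text.

```python
def second_decision(intermediate_label, combined_label, combined_accuracy,
                    threshold=0.95):
    """Sketch of the second determination step under the stated assumptions.

    Returns a dict with a confirmed 'final' label, or final=None together
    with both candidates when the server's learning cannot yet be trusted.
    """
    # Same object: confirm the intermediate value as final.
    if intermediate_label == combined_label:
        return {"final": intermediate_label, "candidates": []}
    # Different, but the server's learning accuracy exceeds the threshold.
    if combined_accuracy > threshold:
        return {"final": combined_label, "candidates": []}
    # Different and at or below the threshold: provide both values together.
    return {"final": None, "candidates": [intermediate_label, combined_label]}


print(second_decision("bowling ball", "Pokemon ball", 0.97))
# {'final': 'Pokemon ball', 'candidates': []}
print(second_decision("bowling ball", "Pokemon ball", 0.95))
# {'final': None, 'candidates': ['bowling ball', 'Pokemon ball']}
```

An accuracy exactly equal to the threshold is treated as untrusted, matching the "at or below" wording in the text.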

The transmission unit (48) transmits the final object recognition result value to each learner terminal (2-1, 2-2, …, 2-n) in order to synchronize the object recognition results among the learner terminals. The management server (4) can train the learner terminals (2-1, 2-2, …, 2-n) by repeating a cycle in which the transmission unit (48) sends the final object recognition result value to each learner terminal together with a request for re-learning, and the receiving unit (40) receives the re-learned object recognition result values from each learner terminal.

FIG. 7 is a diagram illustrating a multi-party collaboration-based object recognition scenario according to an embodiment of the present invention.

Referring to FIGS. 3 and 7, the guide terminal 3 provides a 'Pokemon ball image', a new object recognition task, to ten learner terminals (2-1, 2-2, ..., 2-10). Each of the ten learner terminals performs learning on the task. Suppose that, as a result, the object recognition result value of learner terminal 1 (2-1), held by learner 1, is 50 training images, 80% learning accuracy, 'Pokemon ball'; that of learner terminal 2 (2-2), held by learner 2, is 70 training images, 85% learning accuracy, 'yoga ball'; and that of learner terminal 10 (2-10), held by learner 10, is 100 training images, 90% learning accuracy, 'bowling ball'. Each terminal transmits its result to the management server 4.

The management server 4 could determine 'bowling ball', which has the most training images and the highest learning accuracy, as the final object recognition result value. However, as shown in the example of FIG. 7, 'bowling ball' is not the 'Pokemon ball' that the guide terminal 3 provided as the new object recognition task, so this result is incorrect. To avoid such an error, the management server 4 may combine the object recognition result values received from the learner terminals (2-1, 2-2, ..., 2-10), perform learning on the combined data, and then determine the final object recognition result value after comprehensively comparing the learning result of the combined data with the learning results of the individual learner terminals. For example, the management server 4 performs learning on the 220 images obtained by combining the 50 training images of learner terminal 1 (2-1), the 70 training images of learner terminal 2 (2-2), and the 100 training images of learner terminal 10 (2-10), and then determines 'Pokemon ball', the learning result of the combined data, as the final object recognition result value. The management server 4 transmits the final object recognition result value 'Pokemon ball' to each learner terminal (2-1, 2-2, ..., 2-10), causing each terminal to re-learn and change its object recognition result to 'Pokemon ball', so that the object recognition results of all learner terminals are ultimately synchronized to 'Pokemon ball'.
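The arithmetic of this scenario can be reproduced in a few lines. The sketch below only encodes the three reports named in the example and shows why a naive "highest accuracy, most images" rule picks the wrong label; the dictionary layout and variable names are assumptions for illustration, and the combined-data learning step itself is not modelled.

```python
# Reports from the three terminals named in the FIG. 7 example.
reports = [
    {"terminal": "2-1", "images": 50, "accuracy": 0.80, "label": "Pokemon ball"},
    {"terminal": "2-2", "images": 70, "accuracy": 0.85, "label": "yoga ball"},
    {"terminal": "2-10", "images": 100, "accuracy": 0.90, "label": "bowling ball"},
]

# Naive rule: take the report with the highest accuracy (ties broken by image count).
naive = max(reports, key=lambda r: (r["accuracy"], r["images"]))

# Size of the combined data set the management server 4 would learn on.
combined_images = sum(r["images"] for r in reports)

print(naive["label"])     # bowling ball
print(combined_images)    # 220
```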

FIG. 8 is a diagram illustrating the process of a multi-party collaboration-based object recognition method according to an embodiment of the present invention.

Referring to FIGS. 3 and 8, the management server 4 receives an object recognition result value for the same input data from each of a plurality of learner terminals (2-1, 2-2, ..., 2-n) (810).

Next, the management server 4 determines a single final object recognition result value from the received object recognition result values (820).

In the step 820 of determining the final object recognition result value, the management server 4 may compare and analyze the object recognition result values, combine them and perform learning on the combined data, and then determine a single final object recognition result value by comprehensively judging the comparative analysis result for the object recognition result values and the learning result value for the combined data. In doing so, the management server 4 may compare and analyze the object recognition result values received from the learner terminals (2-1, 2-2, ..., 2-n) and determine, as an intermediate object recognition result value, the object recognition result value that satisfies at least one of two conditions: being the most frequent among identical object recognition result values, or having the highest learning accuracy.
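One way to read the two conditions for the intermediate object recognition result value is sketched below. The description only requires that at least one condition be met; the ordering used here (prefer a unique most-frequent label, otherwise fall back to the highest accuracy) and the function name `intermediate_result` are assumptions chosen for the illustration.

```python
from collections import Counter


def intermediate_result(reports: list[tuple[str, float]]) -> str:
    """Pick an intermediate label from (label, accuracy) reports."""
    if not reports:
        raise ValueError("at least one report is required")
    ranked = Counter(label for label, _ in reports).most_common()
    top_label, top_count = ranked[0]
    # A label "wins by frequency" only if it repeats and no other label ties it.
    unique_majority = top_count > 1 and (
        len(ranked) == 1 or ranked[1][1] < top_count
    )
    if unique_majority:
        return top_label
    # Otherwise use the report with the highest learning accuracy.
    return max(reports, key=lambda r: r[1])[0]


by_frequency = intermediate_result([("cup", 0.80), ("cup", 0.70), ("bowl", 0.90)])
by_accuracy = intermediate_result(
    [("Pokemon ball", 0.80), ("yoga ball", 0.85), ("bowling ball", 0.90)]
)
```

With the three reports of the FIG. 7 scenario no label repeats, so the fallback selects 'bowling ball', which is exactly the intermediate value that the combined-data learning is meant to double-check.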

Furthermore, the management server 4 may compare the intermediate object recognition result value with the learning result value of the combined data to determine whether they indicate the same object. If they are the same, the intermediate object recognition result value is confirmed as the final object recognition result value; if they differ, the intermediate recognition result value and the learning result value of the combined data may be provided together to each learner terminal (2-1, 2-2, ..., 2-n).
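The comparison step reduces to a small branch, sketched here under the assumption that both results are plain label strings. The function name `reconcile` and the dictionary it returns are hypothetical and serve only to make the two outcomes explicit.

```python
from typing import Optional, TypedDict


class Reconciliation(TypedDict):
    final: Optional[str]   # confirmed final label, or None if undecided
    provided: list[str]    # values provided to each learner terminal


def reconcile(intermediate_label: str, combined_label: str) -> Reconciliation:
    """Compare the intermediate result with the combined-data learning result."""
    if intermediate_label == combined_label:
        # Same object: the intermediate value is confirmed as final.
        return {"final": intermediate_label, "provided": [intermediate_label]}
    # Different objects: both candidates are provided together.
    return {"final": None, "provided": [intermediate_label, combined_label]}


agree = reconcile("Pokemon ball", "Pokemon ball")
disagree = reconcile("bowling ball", "Pokemon ball")
```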

Next, the management server 4 transmits the final object recognition result value to each learner terminal (2-1, 2-2, ..., 2-n) to synchronize the object recognition results among the learner terminals (830). The management server 4 may train each learner terminal by repeating the following process: it requests re-learning while transmitting the final object recognition result value to each learner terminal, each learner terminal performs re-learning with reference to the final object recognition result value, and the management server 4 receives the re-learned object recognition result value from each learner terminal.
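The repeated transmit, re-learn, and receive cycle of step 830 can be pictured as a loop. In the sketch below the `LearnerTerminal` class, its `relearn` method, and the `max_rounds` limit are all hypothetical; in particular, re-learning is replaced by a stand-in that simply adopts the server's label, whereas a real terminal would retrain its model.

```python
from typing import Optional


class LearnerTerminal:
    """Toy stand-in for a learner terminal (hypothetical interface)."""

    def __init__(self, label: str) -> None:
        self.label = label

    def relearn(self, final_label: str) -> str:
        # Stand-in for retraining: adopt the final label and report it back.
        self.label = final_label
        return self.label


def synchronize(terminals: list[LearnerTerminal],
                final_label: str,
                max_rounds: int = 3) -> Optional[int]:
    """Repeat transmit/re-learn/receive until all terminals agree.

    Returns the round in which agreement was reached, or None if it was not.
    """
    for round_no in range(1, max_rounds + 1):
        # Transmit the final value with a re-learning request, collect replies.
        replies = [t.relearn(final_label) for t in terminals]
        if all(reply == final_label for reply in replies):
            return round_no
    return None


terminals = [LearnerTerminal("Pokemon ball"),
             LearnerTerminal("yoga ball"),
             LearnerTerminal("bowling ball")]
rounds_needed = synchronize(terminals, "Pokemon ball")
```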

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the invention can be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a restrictive sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within a scope equivalent thereto should be construed as being included in the present invention.

Claims

A multi-party collaboration-based object recognition system comprising:
a plurality of learner terminals, each recognizing an object by performing machine learning on the same input data; and
a management server that receives an object recognition result value from each learner terminal, determines a single final object recognition result value therefrom, and transmits the final object recognition result value to each learner terminal to synchronize the object recognition results among the learner terminals,
wherein the management server comprises:
a receiver that receives an object recognition result value from each learner terminal;
an analysis unit that compares and analyzes the object recognition result values;
a learning unit that combines the object recognition result values and performs learning on the combined data;
a final determination unit that determines a single final object recognition result value by comprehensively judging the comparative analysis result of the analysis unit for the object recognition result values and the learning result value of the learning unit for the combined data; and
a transmitter that transmits the final object recognition result value to each learner terminal to synchronize the object recognition results among the learner terminals,
wherein the analysis unit compares and analyzes the object recognition result values received from the learner terminals to determine whether identical object recognition result values exist, and
wherein the final determination unit comprises:
a first determination unit that, when identical object recognition result values exist, first determines as an intermediate object recognition result value an object recognition result value satisfying at least one of a condition of being the most frequent among the identical object recognition result values and a condition of having the highest learning accuracy; and
a second determination unit that compares the intermediate object recognition result value of the first determination unit with the learning result value of the learning unit for the combined data to determine whether they indicate the same object and, if they are the same, determines the intermediate object recognition result value as the final object recognition result value.

(Deleted)

The system of claim 1, wherein the second determination unit, if the objects differ from each other, passes both the intermediate recognition result value and the learning result value of the combined data to the transmitter.

The system of claim 1, wherein the second determination unit determines and provides the learning result value of the learning unit as the final object recognition result value when the learning result of the learning unit exceeds a preset threshold value, and passes both the intermediate recognition result value and the learning result value of the learning unit to the transmitter when the preset threshold value is not exceeded.

The system of claim 1, wherein the management server trains each learner terminal by repeating a process in which the transmitter requests re-learning while transmitting the final object recognition result value to each learner terminal and the receiver receives the re-learned object recognition result value from each learner terminal.

The system of claim 1, further comprising a guide terminal that provides the input data to the plurality of learner terminals as a machine learning task,
wherein each learner terminal is an educational terminal that performs the machine learning assigned as the task.

The system of claim 1, wherein each learner terminal comprises:
a memory storing machine learning instructions;
a processor that repeats a process of performing machine learning on the input data using the machine learning instructions to generate an object recognition result value including learned object information and a learning record and, upon receiving the final object recognition result value from the management server, recognizing the object either by changing the object recognition result value to the final object recognition result value or by newly performing machine learning with reference to the final object recognition result value; and
a communication unit that transmits the object recognition result value to the management server and receives the final object recognition result value from the management server.

The system of claim 1, wherein each learner terminal comprises a vision assistance device that is mounted on a main body worn on a user's body and photographs the exterior to acquire a captured image.

A multi-party collaboration-based object recognition method comprising:
receiving an object recognition result value for the same input data from each of a plurality of learner terminals;
determining a single final object recognition result value from the received object recognition result values; and
transmitting the final object recognition result value to each learner terminal to synchronize the object recognition results among the learner terminals,
wherein determining the final object recognition result value comprises:
comparing and analyzing the object recognition result values;
combining the object recognition result values and performing learning on the combined data; and
determining a single final object recognition result value by comprehensively judging the comparative analysis result for the object recognition result values and the learning result value for the combined data,
wherein determining a single final object recognition result value by comprehensive judgment comprises:
comparing and analyzing the object recognition result values received from the learner terminals and determining, as an intermediate object recognition result value, an object recognition result value satisfying at least one of a condition of being the most frequent among identical object recognition result values and a condition of having the highest learning accuracy; and
comparing the intermediate object recognition result value with the learning result value of the combined data to determine whether they indicate the same object and, if they are the same, determining the intermediate object recognition result value as the final object recognition result value.

(Deleted)

The method of claim 12, wherein determining a single final object recognition result value by comprehensive judgment further comprises providing the intermediate recognition result value and the learning result value of the combined data together if the objects differ from each other.

The method of claim 12, wherein determining a single final object recognition result value by comprehensive judgment further comprises:
determining the learning result value of the combined data as the final object recognition result value when the learning accuracy resulting from the learning exceeds a preset threshold value; and
providing the intermediate recognition result value and the learning result value of the combined data together when the learning accuracy resulting from the learning does not exceed the preset threshold value.