KR102508952B1

KR102508952B1 - Electronic apparatus for allocating task to multiple robots and operation method thereof

Info

Publication number: KR102508952B1
Application number: KR1020220098671A
Authority: KR
Inventors: 임형우; 이다솔; 김종희; 최준성
Original assignee: 국방과학연구소
Priority date: 2022-08-08
Filing date: 2022-08-08
Publication date: 2023-03-14

Abstract

A method of allocating tasks to a plurality of robots by an electronic apparatus of the present disclosure includes a step of obtaining first information and second information indicating suitability for behaviors of the plurality of robots in a predetermined situation; a step of confirming first task information for allocating tasks to the plurality of robots based on the output of a first network into which the first information is input, and confirming second task information for allocating tasks to the plurality of robots based on the output of a second network into which the second information is input; a step of determining final task information for allocating tasks to the plurality of robots based on the first and second task information; and a step of providing the final task information.

Description

Electronic device for allocating tasks to a plurality of robots and its operation method

The present disclosure relates to an electronic device for allocating tasks to a plurality of robots and an operating method thereof.

Technology for assigning tasks to robots is regarded as one of the key technologies for future battlefields, given shrinking manpower, the growing contribution of robots, and the need to control multiple robots. Placing personnel where they are needed is an important factor in deciding victory or defeat on a future battlefield, and achieving an objective with the minimum number of personnel helps keep human resources available. At present, task assignment for multiple robots is decided by applying human intuition and command doctrine to collected data. Assignments that rest on a person's final decision can vary with human error, emotional state, and skill level, and the gap in ability between skilled and unskilled operators in particular can be large. Because convolutional deep neural networks contribute greatly to improving task assignment accuracy, the present disclosure seeks to apply them to task assignment and decision-making for multiple robots.

The disclosed embodiments provide an electronic device for allocating tasks to a plurality of robots and an operating method thereof. The technical problems to be solved by the present embodiments are not limited to those described above, and other technical problems may be inferred from the following embodiments.

According to an embodiment, there may be provided a method by which an electronic device allocates tasks to a plurality of robots, the method including: obtaining first information and second information, each indicating the suitability of behaviors of the plurality of robots in a predetermined situation; identifying first task information for allocating tasks to the plurality of robots based on an output of a first network into which the first information is input, and identifying second task information for allocating tasks to the plurality of robots based on an output of a second network into which the second information is input; determining final task information for allocating tasks to the plurality of robots based on the first task information and the second task information; and providing the final task information.

In the method according to an embodiment, the first information may include information in the form of a three-dimensional (3D) tensor, and the second information may include two-dimensional (2D) image information.

In the method according to an embodiment, the obtaining may include generating the first information and the second information based on resource information of each of the plurality of robots, information about a battlefield situation, information about a fitness function for each battlefield situation, and information about matching conditions between resources and robot behaviors.

In the method according to an embodiment, the first network may be a genetic algorithm (GA)-based network that outputs task information for allocating tasks to the plurality of robots based on information indicating the suitability of robot behaviors in a battlefield situation, and the second network may be a convolutional neural network (CNN) trained to output task information for allocating tasks to the plurality of robots based on information indicating the suitability of robot behaviors in a battlefield situation.
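As a rough illustration of what a genetic-algorithm search over task assignments can look like, the sketch below evolves one behavior index per robot. Everything in it is an assumption made for illustration: the function name `ga_assign`, the per-robot suitability table, the additive score, and the selection, crossover, and mutation settings are not taken from the disclosure, which does not specify the fitness function or GA operators in this passage.

```python
import random
from typing import List, Sequence


def ga_assign(suitability: Sequence[Sequence[float]],
              generations: int = 50,
              pop_size: int = 20,
              mutation_rate: float = 0.1,
              seed: int = 0) -> List[int]:
    """Return one behavior index per robot (hypothetical GA sketch).

    suitability[r][a] is an assumed score for robot r performing behavior a.
    """
    rng = random.Random(seed)
    n_robots, n_actions = len(suitability), len(suitability[0])

    def score(chromosome: List[int]) -> float:
        # Assumed fitness: sum of per-robot suitability scores.
        return sum(suitability[r][a] for r, a in enumerate(chromosome))

    # Random initial population: each chromosome assigns a behavior to every robot.
    population = [[rng.randrange(n_actions) for _ in range(n_robots)]
                  for _ in range(pop_size)]

    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Single-point crossover.
            cut = rng.randrange(1, n_robots) if n_robots > 1 else 0
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a gene with a random behavior.
            child = [rng.randrange(n_actions) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children

    return max(population, key=score)


# Hypothetical table: 4 robots, 2 candidate behaviors each.
table = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.5, 0.4]]
assignment = ga_assign(table)
```

With an additive score like this one the optimum could be found directly per robot; a GA becomes useful when the fitness function couples robots together, which is presumably the case for the situation-specific fitness functions the disclosure refers to.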

In the method according to an embodiment, the second information is information generated by processing the first information, and the method may include: obtaining the second information as input information for the second network, and obtaining output information of the first network, into which the first information is input, as target information for that input information; and training the second network based on the input information and the target information.

The method according to an embodiment may include: obtaining the second information as input information for the second network, and obtaining task information of a skilled operator corresponding to the second information as target information for that input information; and training the second network based on the input information and the target information.

In the method according to an embodiment, the plurality of robots may include a first robot, a second robot, a third robot, and a fourth robot. The first task information may include a 1-1 task value, a 1-2 task value, a 1-3 task value, and a 1-4 task value for allocating tasks to the first, second, third, and fourth robots, respectively. The second task information may include a 2-1 task value, a 2-2 task value, a 2-3 task value, and a 2-4 task value for allocating tasks to the first, second, third, and fourth robots, respectively. The final task information may include a first final task value, a second final task value, a third final task value, and a fourth final task value for allocating tasks to the first, second, third, and fourth robots, respectively. The determining may include determining the first final task value from the 1-1 task value and the 2-1 task value, the second final task value from the 1-2 task value and the 2-2 task value, the third final task value from the 1-3 task value and the 2-3 task value, and the fourth final task value from the 1-4 task value and the 2-4 task value.

In the method according to an embodiment, the identifying may include identifying a first reliability, the 1-1 task value, the 1-2 task value, the 1-3 task value, and the 1-4 task value based on the output of the first network into which the first information is input, and identifying a second reliability, the 2-1 task value, the 2-2 task value, the 2-3 task value, and the 2-4 task value based on the output of the second network into which the second information is input. The determining may include, based on the first reliability and the second reliability, determining the first final task value from the 1-1 task value and the 2-1 task value, the second final task value from the 1-2 task value and the 2-2 task value, the third final task value from the 1-3 task value and the 2-3 task value, and the fourth final task value from the 1-4 task value and the 2-4 task value.
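The per-robot pairing described above might be sketched as follows. The selection rule used here (keep a value both networks agree on, otherwise take the value from the network with the higher reliability) is only one plausible reading and is an assumption; the passage states that the two reliabilities are used but not how. The function name `fuse_task_values` and the integer task codes are hypothetical.

```python
from typing import List, Sequence


def fuse_task_values(first_values: Sequence[int],
                     second_values: Sequence[int],
                     first_reliability: float,
                     second_reliability: float) -> List[int]:
    """Combine per-robot task values from two networks (assumed rule)."""
    if len(first_values) != len(second_values):
        raise ValueError("both networks must output one task value per robot")
    final_values = []
    # Pair the i-th value of the first network with the i-th value of the second.
    for v1, v2 in zip(first_values, second_values):
        if v1 == v2:
            # Both networks agree, so the choice is unambiguous.
            final_values.append(v1)
        else:
            # Assumption: trust the network with the higher reliability.
            final_values.append(v1 if first_reliability >= second_reliability else v2)
    return final_values


# Hypothetical task codes for four robots.
final_info = fuse_task_values([1, 2, 3, 4], [1, 5, 3, 6], 0.7, 0.9)
```

A weighted combination would be an equally reasonable design if the task values were continuous scores rather than discrete task codes.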

In the method according to an embodiment, the first reliability may be calculated based on Cronbach's alpha coefficient of the output of the first network, and the second reliability may be calculated based on the outputs of a softmax layer of the second network.
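A minimal sketch of the two reliability measures follows. Cronbach's alpha is computed with its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of the row totals). How the first network's output is arranged into observations and items is not stated in this passage, so the layout below (rows as observations, columns as items) is an assumption. The second measure, one minus the normalized entropy of the softmax output, is likewise only one assumed way to derive a confidence from softmax outputs; the names `cronbach_alpha`, `softmax`, and `entropy_confidence` are illustrative.

```python
import math
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for scores[i][j] = observation i, item j.

    Needs at least two observations, at least two items, and
    non-constant row totals (otherwise the total variance is zero).
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_confidence(probs: Sequence[float]) -> float:
    """Assumed confidence: 1 - H(p) / log(K); 1 for one-hot, 0 for uniform."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1 - entropy / math.log(len(probs))


# Hypothetical data: three observations of three perfectly consistent items.
first_reliability = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
second_reliability = entropy_confidence(softmax([4.0, 0.5, 0.2, 0.1]))
```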

The method according to an embodiment may further include transmitting the final task information to the plurality of robots through a communication device.

According to an embodiment, there may be provided an electronic device for allocating tasks to a plurality of robots, the electronic device including: a memory storing at least one program; and a processor that, by executing the at least one program, obtains first information and second information, each indicating the suitability of robot behaviors in a predetermined situation, identifies first task information for allocating tasks to the plurality of robots based on an output of a first network into which the first information is input, identifies second task information for allocating tasks to the plurality of robots based on an output of a second network into which the second information is input, determines final task information for allocating tasks to the plurality of robots based on the first task information and the second task information, and provides the final task information.

According to an embodiment, there may be provided a non-transitory computer-readable recording medium on which a program for executing, on a computer, a method of allocating tasks to a plurality of robots is recorded, the method including: obtaining first information and second information, each indicating the suitability of behaviors of the plurality of robots in a predetermined situation; identifying first task information for allocating tasks to the plurality of robots based on an output of a first network into which the first information is input, and identifying second task information for allocating tasks to the plurality of robots based on an output of a second network into which the second information is input; determining final task information for allocating tasks to the plurality of robots based on the first task information and the second task information; and providing the final task information.

According to the present disclosure, the electronic device may assign tasks to a plurality of robots using a neural network trained on information indicating the suitability of the robots' behaviors in a predetermined situation, together with a genetic algorithm-based network. In particular, the electronic device may take into account the individual characteristics of heterogeneous robots and assign a task to each robot based on information indicating the suitability of the robots' behaviors in the predetermined situation. Because the electronic device uses the genetic algorithm-based network and the neural network together, it can effectively reduce the amount of data needed to assign tasks to the plurality of robots compared with using the genetic algorithm-based network alone. In addition, because the electronic device determines the final task value from the task value of the first network and the task value of the second network based on the first reliability of the first network and the second reliability of the second network, it can compute a more reliable final task value and assign tasks to the plurality of robots more accurately. By fusing the output of the first network, obtained from the first information, with the output of the second network, obtained from the second information, while taking their reliabilities into account, the electronic device may provide improved task assignment performance.

FIG. 1 shows an electronic device according to the present disclosure.
FIG. 2 shows a flowchart of a method of assigning a task to a unit robot.
FIG. 3 shows an embodiment of generating information indicating the suitability of a robot's behavior in a predetermined situation.
FIG. 4A shows an embodiment of first information, which is input information of the first network.
FIG. 4B shows an embodiment of second information, which is input information of the second network.
FIG. 5 shows a convolutional neural network as an example of the second network.
FIG. 6 shows an example of entropy for the outputs of the second network.
FIG. 7 shows an embodiment in which the processor determines a first final task value, a second final task value, a third final task value, and a fourth final task value.
FIG. 8 shows an embodiment in which the electronic device operates.
FIG. 9 shows an operating method of the electronic device according to an embodiment.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

The terms used in the embodiments have been selected, as far as possible, from general terms that are currently in wide use, in consideration of their functions in the present disclosure; however, they may vary depending on the intention of those working in the field, precedent, the emergence of new technologies, and the like. In certain cases a term has been arbitrarily selected by the applicant, and in such cases its meaning is described in detail in the corresponding part of the description. Therefore, the terms used in the present disclosure should be defined based on their meaning and on the overall content of the present disclosure, not simply on their names.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "..unit" and "..module" used in the specification refer to a unit that processes at least one function or operation, which may be implemented as hardware, as software, or as a combination of hardware and software.

Throughout the specification, the expression "at least one of a, b, and c" may encompass 'a alone', 'b alone', 'c alone', 'a and b', 'a and c', 'b and c', or 'all of a, b, and c'.

Hereinafter, embodiments of the present disclosure are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present disclosure pertains can readily practice them. However, the present disclosure may be embodied in many different forms and is not limited to the embodiments described herein.

To support the task assignment decision system for multiple robots, the present disclosure may, before any deep learning training, implement a rule-based task assignment scheme using a method such as a genetic algorithm (GA) and derive results from it. To assist the commander's decision, the present disclosure derives a model by combining the results obtained through the genetic algorithm with a deep learning-based training method, and assigns tasks to multiple robots based on the trained model when previously unseen data is input.

FIG. 1 shows an electronic device according to the present disclosure.

The electronic device 100 includes a processor 110 and a memory 120. FIG. 1 shows only the components of the electronic device 100 that are related to the present embodiments. Accordingly, it will be apparent to those of ordinary skill in the art that the electronic device 100 may further include other general-purpose components in addition to those shown in FIG. 1.

The electronic device 100 may assign a task to a robot in a predetermined situation. Specifically, the electronic device 100 may assign tasks to a plurality of robots based on information indicating the suitability of the robots' behaviors in a battlefield situation.

The processor 110 controls the overall functions of the electronic device 100. For example, the processor 110 controls the electronic device 100 as a whole by executing programs stored in the memory 120 of the electronic device 100. The processor 110 may be implemented as a central processing unit (CPU), a graphics processing unit (GPU), an application processor (AP), or the like provided in the electronic device 100, but is not limited thereto.

The memory 120 is hardware that stores various data processed in the electronic device 100, and may store data that has been processed and data that is to be processed by the electronic device 100. The memory 120 may also store applications, drivers, and the like to be run by the electronic device 100. The memory 120 may include random access memory (RAM) such as dynamic random access memory (DRAM) or static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), CD-ROM, Blu-ray or other optical disk storage, a hard disk drive (HDD), a solid state drive (SSD), or flash memory.

According to an embodiment, the processor 110 may obtain information indicating the suitability of a robot's behavior in a predetermined situation. Here, suitability may mean the degree to which a specific behavior performed by a specific robot is appropriate for the situation in which it is performed, or it may denote information used to judge how appropriate the robot's behavior is for that situation. For example, when a robot performs a 'counterattack' in the 'enemy detected' situation, the behavior may be evaluated against criteria such as required time, attack power, and relative distance, and, according to the evaluation result, the suitability of the behavior may be expressed as a first value for required time, a second value for attack power, and a third value for relative distance. According to an embodiment, the processor 110 may obtain first information and second information, each indicating the suitability of the behaviors of a plurality of robots in a predetermined situation. For example, each of the first information and the second information may indicate the suitability of the first robot's behavior in the predetermined situation, and each may likewise indicate the suitability of the second robot's behavior in a predetermined battlefield situation. The predetermined situation may be a situation in a task with a specific purpose or a situation on a battlefield, but is not limited thereto. When the predetermined situation is a battlefield situation, it may include 'enemy detected', 'obstacle detected', and 'reconnaissance requested', but is not limited thereto. The robot's behavior may include 'counterattack', 'detour', 'request support', 'move', 'request route', 'reconnaissance', 'ignore', and the like, and the suitability of a behavior may be expressed or evaluated as values according to criteria such as 'additional required time', 'reconnaissance area', 'risk', 'relative distance', 'required time', and 'attack power', but is not limited thereto. According to an embodiment, the first information may include information in the form of a 3D tensor, and the second information may include image information. In other words, the first information may include information that represents the suitability of each robot's behavior in the predetermined situation as a 3D tensor, and the second information may include information that represents the same suitability as a 2D image. According to an embodiment, the second information may be generated by processing the first information. Specifically, the processor 110 may generate the second information by converting the first information, expressed as a 3D tensor, into a 2D matrix and then rendering that matrix as an image. The processor 110 may generate the first information and the second information based on resource information of each of the plurality of robots, information about the predetermined situation, information about a fitness function for each predetermined situation, and information about matching conditions between resources and robot behaviors. The process of generating information indicating the suitability of a robot's behavior in a predetermined situation is described in detail below with reference to FIGS. 3, 4A, and 4B.
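The conversion from a 3D tensor to a 2D image could be sketched as below. The axis order (robot, behavior, criterion), the row layout (one image row per robot), and the min-max scaling to 8-bit grayscale are all assumptions for illustration; the passage only states that the tensor is turned into a 2D matrix and then into an image. The name `tensor_to_image` is hypothetical.

```python
from typing import List

# Assumed axes: tensor[robot][behavior][criterion]
Tensor3D = List[List[List[float]]]


def tensor_to_image(tensor: Tensor3D) -> List[List[int]]:
    """Flatten a 3D suitability tensor into a 2D grayscale image (0-255)."""
    # One row per robot; behaviors x criteria are laid out side by side.
    matrix = [[value for behavior in robot for value in behavior]
              for robot in tensor]
    flat = [value for row in matrix for value in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # guard against a constant tensor
    # Min-max scale every entry to an 8-bit pixel intensity.
    return [[round(255 * (value - lo) / span) for value in row]
            for row in matrix]


# Hypothetical first information: 2 robots, 2 behaviors, 3 criteria.
first_info = [
    [[0.0, 0.5, 1.0], [0.2, 0.4, 0.6]],
    [[1.0, 0.0, 0.5], [0.8, 0.1, 0.3]],
]
second_info = tensor_to_image(first_info)
```

The result is a nested list of pixel intensities with one row per robot, which an image library could then encode or which could be fed to a CNN as a single-channel input.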

According to an embodiment, the processor 110 may obtain the first information and the second information from an external source through a communication device (not shown) in the electronic device 100. According to another embodiment, the processor 110 may obtain the first information and the second information from the memory 120.

FIG. 2 shows a flowchart of a method of assigning a task to a unit robot.

In step S210, the unit robot may identify its primary task. For example, the unit robot may identify that the task assigned to it is "maneuver". Specifically, the unit robot may identify the task value corresponding to the task and perform the task that the task value denotes. For example, when the unit robot identifies the task value corresponding to "maneuver", it may perform the "maneuver" task.

In step S220, the unit robot may check whether a specific situation (event) has occurred on the battlefield. The situation may be one registered in the unit robot in advance and may include, for example, an enemy attack, the arrival of enemy reinforcements, or an enemy retreat. If no specific situation has occurred, the unit robot continues to perform the primary task; if one has occurred, it may proceed to step S230.

In step S230, the unit robot may determine the nature of the specific situation. The unit robot may do so using battlefield information received through a communication device or battlefield information it has collected directly. The unit robot may also determine the nature of the specific situation through the electronic device 100 of the present disclosure.

In step S240, the unit robot may check whether task replanning is necessary according to the nature of the identified situation. Task replanning is necessary when it is confirmed that the unit robot must perform a task different from the primary task it has been performing. If it is confirmed that task replanning is not necessary, the unit robot may continue to perform the primary task. The unit robot may check whether task replanning is necessary through the electronic device 100 of the present disclosure. If it is confirmed that task replanning is necessary, the unit robot may proceed to step S250.

In step S250, the unit robot may check the (1+n)th task. Specifically, when it is confirmed in step S240 that task replanning is necessary, the unit robot may check the (1+n)th task corresponding to the final task information determined by the electronic device 100 of the present disclosure. Here, the (1+n)th task may mean the task assigned immediately after the n-th assigned task. The unit robot may check the task value corresponding to the task assigned to it and perform the task that the value denotes. For example, when tasks such as "retreat (maneuver with a changed destination)" or "engage" are assigned as the (1+n)th task, the unit robot may check the task values corresponding to "retreat (maneuver with a changed destination)" and "engage" and perform those tasks.

In step S260, after performing the (1+n)th task assigned to it, the unit robot may check whether another specific situation has occurred on the battlefield. If a situation different from the previous one has occurred, the unit robot performs step S230 again; if no such situation has occurred, the unit robot may determine that the task has been accomplished.
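The flow of steps S210 to S260 can be read as a simple loop over observed events. The following Python sketch is only an illustration under stated assumptions: the function name run_unit_robot, the scripted event list, and the two callbacks (needs_replanning and allocate_task, standing in for the decisions that the electronic device 100 would provide) are hypothetical and do not appear in the disclosure.

```python
from typing import Callable, List, Optional


def run_unit_robot(
    primary_task: str,
    events: List[Optional[str]],
    needs_replanning: Callable[[str, str], bool],
    allocate_task: Callable[[str], str],
) -> List[str]:
    """Return the sequence of tasks a unit robot performs for scripted events.

    `events` is a hypothetical stand-in for battlefield observations:
    None means that no registered situation occurred at that check.
    """
    current = primary_task            # S210: check the primary task
    performed = [current]
    for event in events:              # S220 / S260: did a registered event occur?
        if event is None:
            continue                  # no event: keep performing the current task
        nature = event                # S230: here the event label itself is its "nature"
        if needs_replanning(nature, current):   # S240: is replanning necessary?
            current = allocate_task(nature)     # S250: check the (1+n)th task
            performed.append(current)
    return performed                  # events exhausted: task regarded as accomplished


if __name__ == "__main__":
    tasks = run_unit_robot(
        "maneuver",
        [None, "enemy attack", "enemy retreat"],
        needs_replanning=lambda nature, task: nature == "enemy attack",
        allocate_task=lambda nature: "retreat",
    )
    print(tasks)
```

In this sketch the replanning decision and the new task are injected as callbacks, which mirrors the description that the unit robot may rely on the electronic device 100 for both.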

Assigning a task to a unit robot in this way is relatively simple compared with the multi-robot case, so a small number of operators may be able to assign tasks through real-time monitoring. Multiple robots, however, are generally heterogeneous, so their characteristics differ from platform to platform, and there may be limits to how well a small number of operators can monitor them in real time. Therefore, automation technology that relieves the operator's burden is essential for controlling multiple robots effectively. Such automation technology may include a failsafe function for each platform and the autonomy technologies (for example, environment perception and autonomous movement) needed to execute what the task allocation technology has decided as a response to a contingency.

As described above, allocating tasks that must be provided synchronously to multiple robots is a difficult decision for experts and non-experts alike. This is because the decision relies on an expert's command experience applied to data such as the current situation, a rapidly changing environment, and disturbances, so it is not consistent and can vary with the person's emotions and circumstances. Therefore, the present disclosure builds, in advance, an AI-based model that assists human decision-making, so that the multi-robot task allocation decision process can be supported when an environment or event similar to the learned ones arises.

FIG. 3 shows an embodiment of generating information representing the suitability of a robot's behavior in a predetermined situation.

According to one embodiment, to improve task allocation performance, the processor 110 may use a first network based on a genetic algorithm (GA) that outputs task information for allocating tasks to a plurality of robots based on first information representing the suitability of the behaviors of the plurality of robots in a predetermined situation. The processor 110 may also use a second network trained to output task information for allocating tasks to the plurality of robots based on second information representing the suitability of the behaviors of the plurality of robots in the predetermined situation.

A genetic algorithm is an optimization methodology that mimics biological evolution. It uses a genetic representation of the candidate solutions to be optimized. A genetic algorithm has the advantage that it can be used in any application for which a fitness function can be designed to evaluate the genetically represented solutions. It also has the advantages of being free from the local minima problem during optimization and, when GPU servers or the like are available, of greatly reducing optimization time through parallelization. However, a genetic algorithm is difficult to apply in a continuous space as opposed to a discrete space, and its overall optimization time can be strongly affected by the complexity of the fitness function. In contrast, a CNN-based learning model has the advantage that, for inputs similar to the training data, its inference time varies little and is relatively short compared with a genetic algorithm. A CNN-based learning model can also take 2D RGB images as input data, so widely used state-of-the-art (SOTA) object recognition models can be applied. A CNN-based learning model may have the disadvantage that output quality degrades for unseen inputs that have relatively low similarity to the training data, but a separate data augmentation method may be employed to overcome this. The present disclosure integrates the multi-channel information of such a GA-based network and a neural network and, based on a measurement of the reliability of the data input through each channel, finds the optimal combination of the networks in order to improve multi-robot task allocation performance.
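To make the genetic representation and the fitness function concrete, the following Python sketch runs a small genetic algorithm over task assignments. It is a generic illustration, not the first network of the disclosure: the chromosome layout (one action index per robot), the elitist selection, the single-point crossover, the mutation rate, and the example score table are all assumptions chosen for brevity.

```python
import random
from typing import Callable, List

Chromosome = List[int]  # assumed encoding: gene i is the action index of robot i


def genetic_assign(
    fitness: Callable[[Chromosome], float],
    n_robots: int,
    n_actions: int,
    pop_size: int = 30,
    generations: int = 40,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Chromosome:
    """Search for a high-fitness assignment with a basic elitist GA."""
    rng = random.Random(seed)
    population = [
        [rng.randrange(n_actions) for _ in range(n_robots)]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            # Single-point crossover.
            cut = rng.randrange(1, n_robots) if n_robots > 1 else 0
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: occasionally replace a gene with a random action.
            child = [
                rng.randrange(n_actions) if rng.random() < mutation_rate else gene
                for gene in child
            ]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)


if __name__ == "__main__":
    # Hypothetical suitability scores: scores[robot][action].
    scores = [[3, 9, 1], [7, 2, 5], [4, 4, 8], [6, 1, 2]]

    def total_score(chromosome: Chromosome) -> float:
        return sum(scores[robot][action] for robot, action in enumerate(chromosome))

    print(genetic_assign(total_score, n_robots=4, n_actions=3))
```

Because the search space here is discrete (a finite set of action indices per robot), the example fits the setting in which the text says a genetic algorithm is easy to apply; the run time grows with the cost of each fitness evaluation, as the text also notes.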

According to one embodiment, the processor 110 may generate information representing the suitability of the behaviors of the plurality of robots in a predetermined situation, based on resource information of each of the plurality of robots, information about the predetermined situation, information about a fitness function for each predetermined situation, and information about matching conditions between resources and robot behaviors. The generated information may take the form of a three-dimensional tensor.

Referring to FIG. 3, the processor 110 may generate information representing the suitability of the behaviors of the plurality of robots in a battlefield situation, based on predefined resource information of each of the plurality of robots, real-time resource information, predefined information about battlefield situations, and real-time information about the battlefield situation. The generated information may take the form of a three-dimensional tensor. According to one embodiment, the predefined resource information may include information about the type and capability of mission equipment and about operating range and communication range, and the real-time resource information may include information about position, speed, and the status of mission equipment. The predefined information about battlefield situations may include information about the fitness function for each battlefield situation and information about matching conditions between resources and robot behaviors, and the real-time information about the battlefield situation may include information about enemy positions and enemy force size. According to one embodiment, the processor 110 generates information about currently available resources based on the predefined resource information and the real-time resource information of each of the plurality of robots, and generates information about the current battlefield situation and the battlefield based on the predefined information about battlefield situations and the real-time information about the battlefield situation. The processor 110 may then generate the information in three-dimensional tensor form (the information tensor) from the information about currently available resources and the information about the current battlefield situation and the battlefield. As a result, the processor 110 condenses the predefined information and the real-time information together into information representing the suitability of the behaviors of the plurality of robots in the battlefield situation, so the differing characteristics of the individual robots are effectively reflected in the generated information. The information in three-dimensional tensor form is described in detail below with reference to FIG. 4A.

FIG. 4A shows an embodiment of the first information, which is the input information of the first network.

According to one embodiment, the processor 110 may generate information representing the suitability of the behaviors of the plurality of robots in a predetermined situation, based on resource information of each of the plurality of robots, information about the predetermined situation (event), information about a fitness function for each predetermined situation, and information about matching conditions between resources and robot behaviors, and may identify the generated information as the first information, which is the input information of the first network. For example, the processor 110 may identify the information in three-dimensional tensor form generated in FIG. 3 as the first information.

Referring to FIG. 4A, the first information may take the form of a three-dimensional tensor. The x-axis of the first information represents resources, the y-axis represents robot behaviors, and the z-axis represents categories. A resource on the x-axis is an available resource (that is, the x-axis size is the number of robots), a robot behavior on the y-axis is an action that a robot can take when a specific situation occurs, and a category on the z-axis is a criterion for judging the suitability of a robot's behavior when a specific situation occurs. For example, a tensor of size (10, 5, 6) means that there are 10 resources (robots), that a robot can take 5 kinds of actions, and that the suitability of a robot's behavior when a specific situation occurs is expressed or evaluated as values under 6 criteria. According to one embodiment, the information tensor that constitutes the first information may be divided into at least one sector. For example, referring to FIG. 4A, the information tensor may be divided into three sectors. Each sector represents the suitability of robot behaviors when a particular situation occurs. Specifically, the lower-left sector represents the suitability of robot behaviors when an "enemy detected" situation occurs, the middle sector when an "obstacle detected" situation occurs, and the upper-right sector when a "reconnaissance request" situation occurs. The numbers in the information tensor are the suitability values obtained by assuming that the robot performs a particular action when the situation corresponding to the sector occurs. For example, in the lower-left sector, if an "enemy detected" situation occurs and the robot performs the counterattack action, the relative distance is 8, the attack power is 5, and the required time is 2. Likewise, if the robot performs the detour action in the lower-left sector, the relative distance is 8, the attack power is 0, and the required time is 4. Within the same sector (that is, for the same situation), the suitability values for different actions may be the same or may differ. For example, in the lower-left sector (the "enemy detected" situation), whether the robot performs the counterattack, detour, or support request action, the relative distance to the enemy does not change, so the relative distance on the y-axis may be 8 in every case. In addition, a detour implies that the robot does not attack while moving, so the attack power on the y-axis becomes 0 for the detour action and may differ from the attack power for the counterattack or support request action. Finally, a detour takes relatively longer than a counterattack, so the required time on the y-axis is 2 for the counterattack action but may increase to 4 for the detour action. In FIGS. 4A and 4B, suitability is expressed as values under at least one criterion, and the criteria may be set differently for each situation; the values and criteria shown in FIGS. 4A and 4B are only examples and are not limiting.
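The layout of the information tensor can be sketched with nested Python lists indexed as tensor[resource][action][criterion]. The counterattack and detour values below are the ones quoted for the "enemy detected" sector; the helper names, the two-robot size, and the zero placeholder left for the support request action are assumptions made for illustration, since the text does not give those values.

```python
from typing import List, Tuple

Tensor3D = List[List[List[int]]]

# Axis labels for one sector ("enemy detected"); the identifiers are illustrative.
ACTIONS = ["counterattack", "detour", "request_support"]
CRITERIA = ["relative_distance", "attack_power", "required_time"]


def make_tensor(n_resources: int, n_actions: int, n_criteria: int) -> Tensor3D:
    """Create a zero-filled tensor indexed as [resource][action][criterion]."""
    return [
        [[0] * n_criteria for _ in range(n_actions)]
        for _ in range(n_resources)
    ]


def shape(tensor: Tensor3D) -> Tuple[int, int, int]:
    """Return (number of resources, number of actions, number of criteria)."""
    return (len(tensor), len(tensor[0]), len(tensor[0][0]))


def lookup(tensor: Tensor3D, robot: int, action: str, criterion: str) -> int:
    """Read one suitability value by robot index and axis labels."""
    return tensor[robot][ACTIONS.index(action)][CRITERIA.index(criterion)]


# Two robots (assumed), three actions, three criteria.
tensor = make_tensor(2, len(ACTIONS), len(CRITERIA))
# Values quoted in the text for robot 0 in the "enemy detected" sector.
tensor[0][ACTIONS.index("counterattack")] = [8, 5, 2]
tensor[0][ACTIONS.index("detour")] = [8, 0, 4]
# The support request row keeps its zero placeholder: its values are not given.

if __name__ == "__main__":
    print(shape(tensor))
    print(lookup(tensor, 0, "detour", "required_time"))
```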

FIG. 4B shows an embodiment of the second information, which is the input information of the second network.

According to one embodiment, the processor 110 may generate the second information by processing the first information. For example, the processor 110 may generate the second information of FIG. 4B by converting the first information of FIG. 4A, which has the form of a three-dimensional tensor, into the form of a two-dimensional matrix and rendering it as an image. According to another embodiment, the processor 110 may generate information representing the suitability of the behaviors of the plurality of robots in a predetermined situation, based on resource information of each of the plurality of robots, information about the predetermined situation, information about a fitness function for each predetermined situation, and information about matching conditions between resources and robot behaviors, and may identify the generated information as the second information, which is the input information of the second network.

The second information of FIG. 4B (the information matrix) is the same as the information tensor of FIG. 4A except for the x-axis, so description of content that overlaps with FIG. 4A is omitted.

According to one embodiment, to improve task allocation performance, the processor 110 may derive a model by training with a deep-learning CNN technique on label data when the second information (for example, the information matrix of FIG. 4B) is provided as input information. When new input information that was not used in training is provided to the processor 110, the processor 110 may allocate tasks to the plurality of robots based on the previously trained model. Task allocation through learning (based on CNN deep learning) mainly relies on pairing each input with its correct answer. To implement this, the processor 110 may derive the model by training the second network with the results obtained from the genetic algorithm as ground-truth labels, or by training the second network with the task allocations made by an expert in the operating environment as ground-truth labels. According to one embodiment, the processor 110 may obtain the second information, which represents the suitability of a robot's behavior in a predetermined situation, as information for training the second network.

Referring to FIG. 4B, according to one embodiment, the processor 110 may generate the second information representing the suitability of a robot's behavior in a predetermined situation by condensing predefined information and real-time information together, in the same way as for the first information. According to another embodiment, the processor 110 may generate the second information by processing the previously generated first information and rendering it as an image. For example, an information tensor of size (4, 8, 8) means that there are 4 resources (robots), that a robot can perform 8 kinds of actions, and that the suitability of a robot's behavior when a specific situation occurs can be judged under 8 criteria. Accordingly, when the first information (the information tensor of FIG. 4A) is converted into the second information (the information matrix of FIG. 4B), the (4, 8, 8) tensor becomes four (8, 8) matrices, and, as in the information matrix of FIG. 4B, the values corresponding to the y-axis and the z-axis are obtained from the information tensor for each of the first, second, third, and fourth robots. Like the information tensor, the information matrix of FIG. 4B may be divided into at least one sector. Each sector may represent the suitability of robot behaviors when a particular situation occurs. For example, referring to FIG. 4B, the information matrix may be divided into three sectors, which may be displayed in red, green, and blue (RGB). To use a backbone architecture such as HR-Net32, linear-net, or ResNet50, the processor 110 may render the matrix as an image by linearly mapping the numbers in the information matrix (minimum 0, maximum 10) onto the range from the minimum value (0) to the maximum value (255) of each sector's color. In addition, to use the information matrix as the input information of the second network, the processor 110 may generate the second information by resizing the image to 224*224 pixels.
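The conversion from tensor to image can be sketched in three small steps: slice the (4, 8, 8) tensor into one (8, 8) matrix per robot, map each value from the range 0 to 10 onto 0 to 255, and resize to 224 by 224 pixels. The Python sketch below is a simplified illustration. It produces one single-channel image per robot and uses nearest-neighbor resizing; the disclosure does not state the interpolation method or how the RGB sector colors and the four robots are combined into the final image, so those choices and all function names are assumptions.

```python
from typing import List

Matrix = List[List[int]]
Tensor3D = List[Matrix]


def split_per_robot(tensor: Tensor3D) -> List[Matrix]:
    """Slice a (robots, actions, criteria) tensor into one matrix per robot."""
    return [[row[:] for row in robot] for robot in tensor]


def to_pixel(value: float, vmin: float = 0, vmax: float = 10) -> int:
    """Linearly map a suitability value in [vmin, vmax] onto an intensity in [0, 255]."""
    return round((value - vmin) / (vmax - vmin) * 255)


def resize_nearest(image: Matrix, out_h: int = 224, out_w: int = 224) -> Matrix:
    """Resize by nearest-neighbor sampling (an assumed, simple interpolation)."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def matrix_to_image(matrix: Matrix) -> Matrix:
    """Map values to pixel intensities, then resize to 224 x 224."""
    pixels = [[to_pixel(value) for value in row] for row in matrix]
    return resize_nearest(pixels)


if __name__ == "__main__":
    # Hypothetical (4, 8, 8) tensor with values between 0 and 10.
    demo = [
        [[(r + a + c) % 11 for c in range(8)] for a in range(8)]
        for r in range(4)
    ]
    images = [matrix_to_image(m) for m in split_per_robot(demo)]
    print(len(images), len(images[0]), len(images[0][0]))
```

With an 8 by 8 source and a 224 by 224 target, each source cell becomes a 28 by 28 block of identical pixels, so the resized image preserves the cell values exactly.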

Referring to FIG. 4B, the table to the right of the information matrix shows the tasks allocated to the first, second, third, and fourth robots. Multi-robot task allocation refers to an algorithm that derives responses to the various contingencies that arise while the robots are performing their tasks. A response derived in this way corresponds to a high-level action command (HLAC) from the robot control point of view. For example, an HLAC may be move, attack, evade, or wait. To be executed, a generated HLAC leads to the generation of a low-level control command (LLCC). For example, when the HLAC is "move", the LLCC may use GPS values to generate and provide a path for carrying out the move. An HLAC can be turned into a variable by representing a response such as move, attack, or evade as the index of the corresponding task value. Because such task value indices can serve as the genetic representation that a genetic algorithm requires, an optimized response can be derived with a genetic algorithm. According to one embodiment, the processor 110 may obtain, as the input information of the second network, the second information produced by processing the first information (into the information matrix) and rendering it as an image; obtain, as the target information for that input, the output information of the first network to which the first information was input (for example, the table on the right of FIG. 4B); and train the second network based on the input information and the target information of the second network. In other words, the processor 110 may index the output information of the first network and store it as a JSON file, and train the second network using the stored JSON file together with the second information generated by processing the first information.
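The indexing of HLACs and the storage of labels as JSON can be sketched as follows. The index table, the field names "sample_id" and "task_values", and the helper functions are hypothetical; the disclosure states only that the first network's output is indexed and saved as a JSON file that is later paired with the second information for training.

```python
import json
from typing import Dict, List

# Hypothetical index table: the disclosure does not specify the actual values.
HLAC_INDEX: Dict[str, int] = {"move": 0, "attack": 1, "evade": 2, "wait": 3}


def encode_assignment(assignment: List[str]) -> List[int]:
    """Turn one HLAC per robot into task-value indices (a GA-style chromosome)."""
    return [HLAC_INDEX[command] for command in assignment]


def to_label_json(sample_id: str, assignment: List[str]) -> str:
    """Serialize the indexed assignment so it can be paired with an input image."""
    record = {"sample_id": sample_id, "task_values": encode_assignment(assignment)}
    return json.dumps(record)


def from_label_json(text: str) -> List[int]:
    """Read the task values back when building (input, target) training pairs."""
    return json.loads(text)["task_values"]


if __name__ == "__main__":
    label = to_label_json("sample-0001", ["move", "attack", "evade", "wait"])
    print(label)
    print(from_label_json(label))
```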

Referring again to FIG. 1, the processor 110 may identify first task information for allocating tasks to the plurality of robots based on the output of the first network to which the first information is input, and may identify second task information for allocating tasks to the plurality of robots based on the output of the second network to which the second information is input. According to one embodiment, the first network may be a network that outputs task information for allocating tasks to the plurality of robots, and the second network may be trained to infer task information for allocating tasks to the plurality of robots. In other words, the first network may be a genetic algorithm-based network that outputs task information for allocating tasks to the plurality of robots based on information representing the suitability of the behaviors of the plurality of robots in a predetermined situation, and the second network may be a convolutional neural network trained to output such task information based on information representing the suitability of the behaviors of the plurality of robots in the predetermined situation.

According to one embodiment, the processor 110 may train the second network. According to one embodiment, the processor 110 may identify the second information generated by processing the first information, obtain the identified second information as the input information of the second network, obtain the output information of the first network to which the first information was input as the target information for that input, and train the second network based on the input information and the target information of the second network. According to another embodiment, the processor 110 may identify the second information generated by processing the first information, obtain the identified second information as the input information of the second network, obtain an expert's task information corresponding to the second information as the target information for that input, and train the second network based on the input information and the target information of the second network.

According to one embodiment, the plurality of robots comprises individual robots, and the task information may include a task value for allocating a task to each robot. Specifically, the first task information identified from the output of the first network may include a 1-1 task value for allocating a task to the first robot, a 1-2 task value for the second robot, a 1-3 task value for the third robot, and a 1-4 task value for the fourth robot. The second task information identified from the output of the second network may include a 2-1 task value for allocating a task to the first robot, a 2-2 task value for the second robot, a 2-3 task value for the third robot, and a 2-4 task value for the fourth robot.

FIG. 5 shows a convolutional neural network as an example of the second network.

As shown in FIG. 5, the convolutional neural network 500 may include convolutional layers, fully connected layers, and a softmax layer. According to an embodiment, the convolutional neural network 500 may include five convolutional layers, two fully connected layers, and a softmax layer. According to an embodiment, the convolutional neural network 500 may be a neural network trained on the second information, which is generated by processing the first information that is the input information of the first network, and on the task information for allocating tasks to the plurality of robots, which is the output information of the first network for that first information. According to another embodiment, the convolutional neural network 500 may be a neural network trained on the second information, which is generated by processing the first information that is the input information of the first network, and on task information of an expert corresponding to the second information. The convolutional neural network 500 may also use a Flatten function, which changes the shape of data (a tensor). For example, the Flatten function may convert 200x200x1 data into 40000x1 data.
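
The reshaping performed by the Flatten function can be illustrated with a minimal pure-Python sketch. The helper name `flatten` and the nested-list representation of the tensor are assumptions made for illustration; the text does not say which framework provides the function.

```python
def flatten(tensor):
    """Flatten an arbitrarily nested list into one flat list, in row-major order."""
    if not isinstance(tensor, list):
        return [tensor]
    flat = []
    for item in tensor:
        flat.extend(flatten(item))
    return flat


# A 200 x 200 x 1 input filled with zeros (illustrative data only).
image = [[[0.0] for _ in range(200)] for _ in range(200)]

# Reshape to 40000 x 1: one single-element row per original value.
column = [[value] for value in flatten(image)]
```

Only the shape changes; the 40,000 values and their order are preserved.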

According to an embodiment, the convolutional neural network 500, to which the second information generated by processing the first information is input, may output candidate task values through respective sets of 40 neurons. For each of the 2-1 task value 510, the 2-2 task value 520, the 2-3 task value 530, and the 2-4 task value 540, the processor 110 may identify, from among the 40 candidate task values, the candidate task value with the highest softmax-layer output. In other words, the processor 110 may identify the 2-1 task value 510, the 2-2 task value 520, the 2-3 task value 530, and the 2-4 task value 540 as the task information for allocating tasks to the plurality of robots.
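
The selection step above can be sketched as follows. This is a minimal illustration under two assumptions that the text does not state: the softmax outputs for the four task values are laid out as four consecutive groups of 40, and the index of the winning neuron within its group serves as the task value. The function name `pick_task_values` is hypothetical.

```python
def pick_task_values(softmax_outputs, group_size=40):
    """Return, for each consecutive group of outputs, the index of the largest one."""
    if len(softmax_outputs) % group_size != 0:
        raise ValueError("output length must be a multiple of group_size")
    picks = []
    for start in range(0, len(softmax_outputs), group_size):
        group = softmax_outputs[start:start + group_size]
        # Index of the candidate with the highest softmax output in this group.
        picks.append(max(range(len(group)), key=group.__getitem__))
    return picks


# Illustrative outputs: each group of 40 sums to 1 and has one clear peak.
outputs = []
for peak in (3, 17, 0, 39):
    group = [0.01] * 40
    group[peak] = 0.61
    outputs.extend(group)

picks = pick_task_values(outputs)  # task values 2-1, 2-2, 2-3, 2-4
```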

In addition, the processor 110 may determine the Cronbach's alpha coefficient of the output of the first network as the reliability of the first network. Cronbach's alpha is a reliability index estimated from the internal consistency of items. The coefficient takes a value from 0 to 1, and a value closer to 1 is interpreted as higher reliability. Because the first network is based on a genetic algorithm, the processor 110 may repeatedly run the genetic algorithm on the first information, which is the input information, collect the task values output in each repetition, and compute the Cronbach's alpha coefficient over the task values of all repetitions. According to an embodiment, the processor 110 may check whether the computed Cronbach's alpha coefficient falls within a preset range and, if it does, determine the computed coefficient as the reliability of the first network. For example, when the preset range is 0.75 to 0.95, a coefficient of 0.8 is within the range, so the reliability of the first network is set to 0.8. A coefficient of 0.6 is outside the range, so the reliability of the first network is not set to 0.6, and the processor 110 repeats the genetic algorithm until the coefficient falls within the preset range.
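
The computation described above can be sketched with the standard Cronbach's alpha formula. The mapping onto that formula is an assumption: each repetition of the genetic algorithm is treated as an "item" and each robot's task value as an observation, which the text does not specify. The function names, the inclusive range bounds, and the sample data are likewise illustrative.

```python
from statistics import variance


def cronbach_alpha(runs):
    """runs[r][k] is the task value for robot k in repetition r of the genetic algorithm."""
    k = len(runs)
    if k < 2:
        raise ValueError("at least two repetitions are required")
    item_variances = [variance(run) for run in runs]
    totals = [sum(column) for column in zip(*runs)]  # per-robot sum over repetitions
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total variance is zero; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def reliability_or_none(alpha, low=0.75, high=0.95):
    """Return alpha if it lies in the preset range (bounds assumed inclusive), else None."""
    return alpha if low <= alpha <= high else None


# Three repetitions that produced identical task values for four robots.
consistent_runs = [[3, 7, 1, 5], [3, 7, 1, 5], [3, 7, 1, 5]]
alpha = cronbach_alpha(consistent_runs)
```

A result of `None` from `reliability_or_none` corresponds to the case in which the genetic algorithm is run again.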

According to an embodiment, the processor 110 may compute the reliability of the second network based on the outputs of the softmax layer of the convolutional neural network 500. According to an embodiment, the processor 110 may compute the reliability of the second network by computing the entropy of the softmax-layer outputs corresponding to 40 neurons. In other words, the processor 110 may identify the reliability of the 2-1 task value 510 as the reliability of the second network. According to another embodiment, the processor 110 may compute the reliability of the second network by computing the entropy of the softmax-layer outputs corresponding to another 40 neurons. In other words, the processor 110 may identify the reliability of the 2-2 task value 520, the 2-3 task value 530, or the 2-4 task value 540 as the reliability of the second network. According to yet another embodiment, the processor 110 may identify, as the reliability of the second network, either the average or the lowest of the reliability of the 2-1 task value 510, the reliability of the 2-2 task value 520, the reliability of the 2-3 task value 530, and the reliability of the 2-4 task value 540.
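
The last embodiment above, which reduces the four per-task reliabilities to a single value, can be sketched as follows. The function name, the `mode` parameter, and the sample reliabilities are hypothetical.

```python
def second_network_reliability(per_task_reliabilities, mode="mean"):
    """Combine the per-task reliabilities into one reliability for the second network."""
    if mode == "mean":
        return sum(per_task_reliabilities) / len(per_task_reliabilities)
    if mode == "min":
        return min(per_task_reliabilities)
    raise ValueError("mode must be 'mean' or 'min'")


# Reliabilities of task values 2-1, 2-2, 2-3, 2-4 (illustrative numbers).
per_task = [0.9, 0.5, 0.7, 0.3]
mean_reliability = second_network_reliability(per_task, "mean")
min_reliability = second_network_reliability(per_task, "min")
```

Taking the minimum is the more conservative choice: one uncertain task value lowers the reliability of the whole network.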

The processor 110 may compute a first reliability for the first information from the first network to which the first information is input, and a second reliability for the second information from the second network to which the second information is input. Specifically, the processor 110 may compute the first reliability based on the Cronbach's alpha coefficient of the output of the first network, and the second reliability based on the outputs of the softmax layer of the second network.

According to an embodiment, the processor 110 may compute the reliability of the second network according to Equations 1 and 2 below.
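
The equations themselves are not reproduced in this text. The following is a reconstruction from the accompanying description, offered as an assumption rather than the original notation. The symbols are chosen here for illustration: the i-th softmax output is written $p_i$, the entropy $H$, the reliability $C$, and the small constant $\epsilon$. The natural logarithm is assumed because the entropy of 2.9957 reported for FIG. 6 equals $\ln 20$.

```latex
% Equation 1 (entropy of the n softmax outputs, natural logarithm assumed)
H = -\sum_{i=1}^{n} p_i \ln p_i

% Equation 2 (reliability as the reciprocal of the entropy)
C = \frac{1}{H + \epsilon}
```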

In Equations 1 and 2, $p_i$ denotes the i-th of the n output values of the softmax layer of the neural network. The processor 110 may therefore compute the entropy $H$ through Equation 1 and, through Equation 2, compute the reliability $C$ as the reciprocal of the entropy $H$. Here, $\epsilon$ is a minimum value that prevents division by zero and may be set to, for example, 0.00001. In addition, $p_i$ may be computed according to Equation 3 below, in which $z_i$ denotes the i-th value input to the softmax layer.
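
The entropy-based reliability can be sketched in a few lines of Python. The sketch assumes that Equation 3 is the standard softmax, $p_i = e^{z_i} / \sum_j e^{z_j}$, and that the entropy uses the natural logarithm. The function names are hypothetical, and subtracting the maximum logit is a numerical-stability step added here, not something the text describes.

```python
import math


def softmax(logits):
    """Standard softmax; the maximum is subtracted for numerical stability (added here)."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Entropy with natural logarithm; zero probabilities contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def reliability(probs, eps=1e-5):
    """Reciprocal of the entropy; eps prevents division by zero."""
    return 1.0 / (entropy(probs) + eps)


uniform = softmax([0.0, 0.0, 0.0, 0.0])  # no candidate preferred
peaked = softmax([5.0, 0.0, 0.0, 0.0])   # one candidate clearly preferred
```

With these inputs, `reliability(peaked)` is larger than `reliability(uniform)`, which matches the relationship between entropy and reliability described in the text.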

According to another embodiment, the processor 110 may compute the reliability $C$ according to Equation 4 below.

In Equation 4, $p_{\max}$ denotes the largest of the n output values of the softmax layer of the neural network. Computing the reliability with Equation 4 may be faster than computing it with Equations 1 and 2.

According to an embodiment, the processor 110 may identify, based on the output of the second network, the 2-1 task value $T_{2,1}$, the 2-2 task value $T_{2,2}$, the 2-3 task value $T_{2,3}$, and the 2-4 task value $T_{2,4}$ for allocating tasks to the plurality of robots (categorical mode).

In Equation 5, the 2-1 task value $T_{2,1}$ denotes the i-th candidate 2-1 task value for which the softmax-layer output $p_i$ is highest among the candidate 2-1 task values output by the neural network. Likewise, the 2-2 task value $T_{2,2}$ denotes the i-th candidate 2-2 task value for which $p_i$ is highest among the candidate 2-2 task values, the 2-3 task value $T_{2,3}$ denotes the i-th candidate 2-3 task value for which $p_i$ is highest among the candidate 2-3 task values, and the 2-4 task value $T_{2,4}$ denotes the i-th candidate 2-4 task value for which $p_i$ is highest among the candidate 2-4 task values.

FIG. 6 shows examples of the entropy of the outputs of the second network.

The left graph 610 shows the output values of the softmax layer of a neural network according to an embodiment. Specifically, the left graph 610 shows $p_i$ for the candidate 2-1 task values and indicates that all $p_i$ take the same constant value. When the probability values output by the neural network are similar to one another, the entropy is high, so the reliability, which varies inversely with the entropy, is low. Conversely, when the output probability is concentrated on a few values, the entropy is low, so the reliability is high. Referring to FIG. 6, in the left graph 610 the candidate task values have similar probabilities, so the entropy is 2.9957, a relatively large value, which may mean that the reliability of the second network is low. In general, for an example that an artificial neural network finds difficult (for example, the left graph 610), the output probabilities are not concentrated in one place but take similar values, so the reliability is low.
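
As a worked check of the reported value: 2.9957 matches the natural-logarithm entropy of a uniform distribution over 20 outcomes. That the left graph shows 20 equally likely candidates is an inference from this number, not a statement in the text; a uniform distribution over 40 outcomes would instead give $\ln 40 \approx 3.6889$. With the example value 0.00001 for the small constant in Equation 2, the corresponding reliability follows directly.

```latex
H = -\sum_{i=1}^{20} \frac{1}{20} \ln \frac{1}{20} = \ln 20 \approx 2.9957,
\qquad
\frac{1}{H + 0.00001} \approx 0.3338
```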

The right graph 620 shows the output values of the softmax layer of a neural network according to another embodiment. Specifically, the right graph 620 shows $p_i$ for the candidate 2-1 task values and indicates that $p_i$ is high for a specific candidate 2-1 task value. Referring to FIG. 6, in the right graph 620 the probability is concentrated on a few candidate 2-1 task values, so the entropy is 1.0457, a relatively small value, which may mean that the reliability of the second network is high. The present disclosure describes a method of using multi-channel information in a mutually complementary way, by fusing the reliabilities computed from the respective networks, in battlefield situations where the amount of data is insufficient and similarity may be low.

Referring back to FIG. 1, the processor 110 may determine final task information for allocating tasks to the plurality of robots based on the first task information and the second task information. According to an embodiment, the processor 110 may determine a first final task value from the 1-1 task value and the 2-1 task value, a second final task value from the 1-2 task value and the 2-2 task value, a third final task value from the 1-3 task value and the 2-3 task value, and a fourth final task value from the 1-4 task value and the 2-4 task value. The processor 110 may then allocate tasks to the plurality of robots based on the first, second, third, and fourth final task values. The processor 110 may transmit the determined final task information to the plurality of robots through a communication device.

According to an embodiment, the processor 110 may use the first reliability and the second reliability, identified from the output of the first network and the output of the second network respectively, to determine the final task values. Based on the first reliability and the second reliability, the processor 110 may determine the first final task value for allocating a task to the first robot from the 1-1 task value and the 2-1 task value, the second final task value for allocating a task to the second robot from the 1-2 task value and the 2-2 task value, the third final task value for allocating a task to the third robot from the 1-3 task value and the 2-3 task value, and the fourth final task value for allocating a task to the fourth robot from the 1-4 task value and the 2-4 task value.

According to an embodiment, the first network may be a network that represents information indicating the suitability of the behaviors of the plurality of robots in a predetermined situation as a three-dimensional tensor and outputs the corresponding task information for allocating tasks to the plurality of robots. The second network may be a network in which information indicating the suitability of the behaviors of the plurality of robots in a predetermined situation is represented as a matrix and labeled with the corresponding task information for allocating tasks to the plurality of robots. Each network may receive input information and predict identification information (for example, task values). As a specific embodiment, a method may be provided for fusing the 1-1, 1-2, 1-3, and 1-4 task values from the first network with the 2-1, 2-2, 2-3, and 2-4 task values from the second network. For example, the task values from the first network and the second network may be fused by averaging. According to an embodiment, when the reliabilities identified through the respective networks differ, fusing the output values of the networks based on their reliabilities may be more effective than averaging. Specific embodiments are described below in which the output values (that is, task values) of the networks are combined in a weighted sum based on each network's reliability, or in which the output value of the network with the higher reliability is determined as the final output value.

According to an embodiment, the processor 110 may weight the 1-1 task value and the 2-1 task value according to the ratio of the first reliability to the second reliability and determine the first final task value from the weighted 1-1 and 2-1 task values. The processor 110 may weight the 1-2 task value and the 2-2 task value according to the same ratio and determine the second final task value from the weighted 1-2 and 2-2 task values. The processor 110 may weight the 1-3 task value and the 2-3 task value according to the same ratio and determine the third final task value from the weighted 1-3 and 2-3 task values. The processor 110 may weight the 1-4 task value and the 2-4 task value according to the same ratio and determine the fourth final task value from the weighted 1-4 and 2-4 task values.

For example, the processor 110 may determine the first final task value $T_1$, the second final task value $T_2$, the third final task value $T_3$, and the fourth final task value $T_4$ according to Equation 6 below.

In Equation 6, $C_1$ may denote the first reliability and $C_2$ may denote the second reliability. For example, the first reliability may be the reliability of the first network, and the second reliability may be the reliability of the second network. Also in Equation 6, $T_{1,1}$ may denote the 1-1 task value, $T_{2,1}$ the 2-1 task value, $T_{1,2}$ the 1-2 task value, $T_{2,2}$ the 2-2 task value, $T_{1,3}$ the 1-3 task value, $T_{2,3}$ the 2-3 task value, $T_{1,4}$ the 1-4 task value, and $T_{2,4}$ the 2-4 task value. Accordingly, the processor 110 may determine the first final task value by fusing the 1-1 task value and the 2-1 task value weighted according to the ratio of the first reliability to the second reliability, the second final task value by fusing the 1-2 task value and the 2-2 task value weighted in the same way, the third final task value by fusing the weighted 1-3 and 2-3 task values, and the fourth final task value by fusing the weighted 1-4 and 2-4 task values.
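
Equation 6 is not reproduced in this text. A minimal sketch of reliability-weighted fusion, assuming the weights are the two reliabilities normalized to sum to one, so that each final value is (first reliability × first-network value + second reliability × second-network value) / (first reliability + second reliability). This normalized form is an assumption consistent with "weighted according to the ratio", not a confirmed reproduction of the equation, and the function name and sample values are hypothetical.

```python
def fuse_weighted(first_values, second_values, first_reliability, second_reliability):
    """Fuse per-robot task values from two networks, weighted by network reliability."""
    total = first_reliability + second_reliability
    if total <= 0:
        raise ValueError("reliabilities must sum to a positive number")
    w1 = first_reliability / total
    w2 = second_reliability / total
    return [w1 * a + w2 * b for a, b in zip(first_values, second_values)]


# Task values 1-1..1-4 and 2-1..2-4 (illustrative numbers).
first_values = [10, 20, 30, 40]
second_values = [20, 20, 10, 0]
final_values = fuse_weighted(first_values, second_values, 0.8, 0.2)
```

When the two reliabilities are equal, this reduces to the plain average mentioned earlier as the simplest fusion method.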

According to another embodiment, the processor 110 may compare the magnitudes of the first reliability and the second reliability and, based on the comparison, determine either the 1-1 task value or the 2-1 task value as the first final task value, either the 1-2 task value or the 2-2 task value as the second final task value, either the 1-3 task value or the 2-3 task value as the third final task value, and either the 1-4 task value or the 2-4 task value as the fourth final task value.

For example, the processor 110 may determine the first final task value $T_1$, the second final task value $T_2$, the third final task value $T_3$, and the fourth final task value $T_4$ according to Equation 7 below.

In Equation 7, $C_1$ may denote the first reliability and $C_2$ may denote the second reliability. For example, the first reliability may be the reliability of the first network, and the second reliability may be the reliability of the second network. Also in Equation 7, $T_{1,1}$ may denote the 1-1 task value, $T_{2,1}$ the 2-1 task value, $T_{1,2}$ the 1-2 task value, $T_{2,2}$ the 2-2 task value, $T_{1,3}$ the 1-3 task value, $T_{2,3}$ the 2-3 task value, $T_{1,4}$ the 1-4 task value, and $T_{2,4}$ the 2-4 task value. Accordingly, based on the comparison of the first reliability and the second reliability, the processor 110 may take the task values obtained from the network with the higher reliability: it may determine the 1-1 task value or the 2-1 task value as the first final task value, the 1-2 task value or the 2-2 task value as the second final task value, the 1-3 task value or the 2-3 task value as the third final task value, and the 1-4 task value or the 2-4 task value as the fourth final task value.
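
The selection rule can be sketched as follows. Equation 7 is not reproduced in this text, so the handling of a tie (equal reliabilities go to the first network) is an assumption, as are the function name and sample values.

```python
def fuse_select(first_values, second_values, first_reliability, second_reliability):
    """Take all task values from whichever network has the higher reliability."""
    if first_reliability >= second_reliability:  # tie-breaking is an assumption
        return list(first_values)
    return list(second_values)


first_values = [10, 20, 30, 40]   # task values 1-1..1-4 (illustrative)
second_values = [20, 20, 10, 0]   # task values 2-1..2-4 (illustrative)
chosen = fuse_select(first_values, second_values, 0.8, 0.9)
```

Unlike a weighted sum, this rule always returns a value that one of the networks actually produced.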

According to an embodiment, the plurality of robots may include a first robot, a second robot, a third robot, and a fourth robot, and the second reliability may include a 2-1 reliability for the 2-1 task value, a 2-2 reliability for the 2-2 task value, a 2-3 reliability for the 2-3 task value, and a 2-4 reliability for the 2-4 task value. The processor 110 may determine the first final task value from the 1-1 task value and the 2-1 task value based on the first reliability and the 2-1 reliability, the second final task value from the 1-2 task value and the 2-2 task value based on the first reliability and the 2-2 reliability, the third final task value from the 1-3 task value and the 2-3 task value based on the first reliability and the 2-3 reliability, and the fourth final task value from the 1-4 task value and the 2-4 task value based on the first reliability and the 2-4 reliability. According to an embodiment, the processor 110 may weight the 1-1 task value and the 2-1 task value according to the ratio of the first reliability to the 2-1 reliability and determine the first final task value from the weighted values. The processor 110 may also weight the 1-2 task value and the 2-2 task value according to the ratio of the first reliability to the 2-2 reliability and determine the second final task value from the weighted values. Likewise, the processor 110 may weight the 1-3 task value and the 2-3 task value according to the ratio of the first reliability to the 2-3 reliability and determine the third final task value from the weighted values, and may weight the 1-4 task value and the 2-4 task value according to the ratio of the first reliability to the 2-4 reliability and determine the fourth final task value from the weighted values. According to another embodiment, the processor 110 may compare the first reliability with the 2-1 reliability and determine either the 1-1 task value or the 2-1 task value as the first final task value, compare the first reliability with the 2-2 reliability and determine either the 1-2 task value or the 2-2 task value as the second final task value, compare the first reliability with the 2-3 reliability and determine either the 1-3 task value or the 2-3 task value as the third final task value, and compare the first reliability with the 2-4 reliability and determine either the 1-4 task value or the 2-4 task value as the fourth final task value.
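
The per-robot variant described above can be sketched as follows. It differs from the network-level case in that the second reliability is a separate number for each robot. The function name, the `mode` parameter, the normalized weighting, and the tie-breaking toward the first network are assumptions made for illustration.

```python
def fuse_per_robot(first_values, second_values, first_reliability,
                   second_reliabilities, mode="ratio"):
    """Fuse task values robot by robot, using a separate second reliability per robot."""
    fused = []
    for a, b, c2 in zip(first_values, second_values, second_reliabilities):
        if mode == "ratio":
            # Reliability-weighted average of the two candidate task values.
            fused.append((first_reliability * a + c2 * b) / (first_reliability + c2))
        elif mode == "compare":
            # Keep the value from the more reliable source (ties go to the first network).
            fused.append(a if first_reliability >= c2 else b)
        else:
            raise ValueError("mode must be 'ratio' or 'compare'")
    return fused


first_values = [1, 2, 3, 4]           # task values 1-1..1-4 (illustrative)
second_values = [5, 6, 7, 8]          # task values 2-1..2-4 (illustrative)
second_rel = [0.9, 0.5, 0.8, 0.95]    # reliabilities 2-1..2-4 (illustrative)
selected = fuse_per_robot(first_values, second_values, 0.8, second_rel, mode="compare")
```

In comparison mode, different robots can end up with values from different networks, which the single network-level comparison cannot produce.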

According to an embodiment, the electronic device 100 may determine the first, second, third, and fourth final task values through the first network, which is based on a genetic algorithm, and the second network, which is based on a neural network, and may allocate tasks to the first, second, third, and fourth robots based on the determined final task values. The electronic device 100 may transmit to each robot information about the task allocated to it through the first, second, third, and fourth final task values. By using the neural-network-based network and the genetic-algorithm-based network in a mutually complementary way, the electronic device 100 may improve task allocation performance: it mitigates the disadvantage of the genetic algorithm, which requires a vast amount of data, and the disadvantage of the neural network, whose output quality drops when untrained data with relatively low similarity is input. In addition, by using the neural network, the electronic device 100 may infer task values and allocate tasks to multiple robots faster than when only the genetic-algorithm-based network is used.

FIG. 7 illustrates an embodiment in which the processor determines the first final task value, the second final task value, the third final task value, and the fourth final task value.

Referring to FIG. 7, the processor 110 may identify a 1-1 task value, a 1-2 task value, a 1-3 task value, and a 1-4 task value based on the output of the first network 730, into which first information 710 indicating the suitability of the behaviors of the plurality of robots in a predetermined situation is input, and may identify a first reliability for the 1-1 task value, the 1-2 task value, the 1-3 task value, and the 1-4 task value. In addition, the processor 110 may identify a 2-1 task value, a 2-2 task value, a 2-3 task value, and a 2-4 task value based on the output of the second network 740, into which second information 720 indicating the suitability of the behaviors of the plurality of robots in the predetermined situation is input, and may identify a 2-1 reliability for the 2-1 task value, a 2-2 reliability for the 2-2 task value, a 2-3 reliability for the 2-3 task value, and a 2-4 reliability for the 2-4 task value.

According to an embodiment, the processor 110 may determine the first final task value from the 1-1 task value and the 2-1 task value based on the first reliability and the 2-1 reliability, determine the second final task value from the 1-2 task value and the 2-2 task value based on the first reliability and the 2-2 reliability, determine the third final task value from the 1-3 task value and the 2-3 task value based on the first reliability and the 2-3 reliability, and determine the fourth final task value from the 1-4 task value and the 2-4 task value based on the first reliability and the 2-4 reliability. For example, according to the ratio of the first reliability to the 2-1 reliability, or a comparison of their magnitudes, the processor 110 may determine, from the 1-1 task value and the 2-1 task value, the first final task value for assigning a task to the first robot. Likewise, according to the ratio of the first reliability to the 2-2 reliability, or a comparison of their magnitudes, the processor 110 may determine, from the 1-2 task value and the 2-2 task value, the second final task value for assigning a task to the second robot; according to the ratio of the first reliability to the 2-3 reliability, or a comparison of their magnitudes, it may determine, from the 1-3 task value and the 2-3 task value, the third final task value for assigning a task to the third robot; and according to the ratio of the first reliability to the 2-4 reliability, or a comparison of their magnitudes, it may determine, from the 1-4 task value and the 2-4 task value, the fourth final task value for assigning a task to the fourth robot. According to an embodiment, a final task value may be a value corresponding to a predefined task. The processor 110 may assign the tasks corresponding to the first final task value, the second final task value, the third final task value, and the fourth final task value to the first robot, the second robot, the third robot, and the fourth robot, respectively.
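The magnitude-comparison variant of this selection can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_final_task_values`, the integer encoding of task values, and the tie-breaking rule in favor of the first network are all assumptions, and the ratio-based variant is not shown.

```python
from typing import List, Sequence


def select_final_task_values(
    first_values: Sequence[int],
    first_reliability: float,
    second_values: Sequence[int],
    second_reliabilities: Sequence[float],
) -> List[int]:
    """Pick, per robot, the task value from whichever network is more reliable.

    first_values[i] is the 1-(i+1) task value, second_values[i] the 2-(i+1)
    task value, and second_reliabilities[i] the 2-(i+1) reliability.
    The first network has a single reliability shared by all its task values.
    """
    if not (len(first_values) == len(second_values) == len(second_reliabilities)):
        raise ValueError("all per-robot sequences must have the same length")

    final_values = []
    for v1, v2, r2 in zip(first_values, second_values, second_reliabilities):
        # Assumption: on a tie, the first (genetic-algorithm) network wins.
        final_values.append(v1 if first_reliability >= r2 else v2)
    return final_values


# Four robots; task values are indices into a hypothetical list of predefined tasks.
final = select_final_task_values(
    first_values=[2, 0, 1, 3],
    first_reliability=0.80,
    second_values=[1, 1, 0, 0],
    second_reliabilities=[0.95, 0.60, 0.85, 0.70],
)
print(final)  # [1, 0, 0, 3]
```

Robots 1 and 3 take the second network's value because its per-robot reliability exceeds 0.80; robots 2 and 4 keep the first network's value.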

FIG. 8 illustrates an embodiment in which the electronic device operates.

In step S810 for data acquisition, the electronic device 100 may acquire information indicating the suitability of robot behaviors in a predetermined situation. For example, the electronic device 100 may acquire an information tensor as the first information, and may acquire, as the second information, two-dimensional image information obtained by rendering as an image an information matrix produced by processing the first information. According to an embodiment, the electronic device 100 may acquire first information and second information, stored in the memory 120, each indicating the suitability of robot behaviors in a battlefield situation.

In step S820 for data editing, the electronic device 100 may edit the data acquired in S810 to remove portions that are not needed for the iterations of the first network or for training the second network. For example, when second information indicating the suitability of robot behaviors in a battlefield situation, previously acquired to train the second network, has been duplicated through simple human error, the electronic device 100 may remove the duplicated data.
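The duplicate removal in S820 could look like the following sketch. The sample layout (a 2D image as nested lists paired with a list of task labels) and the name `remove_duplicate_samples` are assumptions made for illustration; the source does not specify how the data is stored.

```python
from typing import List, Sequence, Tuple

Image = Sequence[Sequence[float]]
Sample = Tuple[Image, Sequence[int]]


def remove_duplicate_samples(samples: Sequence[Sample]) -> List[Sample]:
    """Drop exact duplicates of (image, label) pairs, keeping first occurrences."""
    seen = set()
    unique: List[Sample] = []
    for image, label in samples:
        # Nested lists are not hashable, so build a tuple-based key.
        key = (tuple(tuple(row) for row in image), tuple(label))
        if key not in seen:
            seen.add(key)
            unique.append((image, label))
    return unique


samples = [
    ([[0.1, 0.9], [0.4, 0.2]], [1, 0]),
    ([[0.1, 0.9], [0.4, 0.2]], [1, 0]),  # entered twice by mistake
    ([[0.7, 0.3], [0.5, 0.5]], [0, 1]),
]
print(len(remove_duplicate_samples(samples)))  # 2
```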

In step S830 for training, the electronic device 100 may train the second network based on the data edited in S820. Specifically, the electronic device 100 may train the second network using the second information as the input and, as the label, either the first task information output by the first network or the task information of an expert.

In step S840 for testing, the electronic device 100 may identify second task information for allocating tasks to the plurality of robots using the second network trained in S830, and may determine and provide final task information according to the second task information. Specifically, the electronic device 100 may determine the first final task value, the second final task value, the third final task value, and the fourth final task value based on the first information and the second information, and may provide the determined final task values to the respective robots.

According to an embodiment, the electronic device 100 may transmit the final task information to the plurality of robots. If the output task allocation information differs greatly from the task information of an actual expert or from the task information output by the first network, it may be determined that the electronic device 100 has failed to provide tasks to the plurality of robots accurately. In that case, the electronic device 100 may retrain the second network.
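One way to make the retraining decision concrete is sketched here. The source only says the output must differ greatly from the reference; the mismatch-ratio measure, the 0.5 default threshold, and the name `needs_retraining` are assumptions.

```python
from typing import Sequence


def needs_retraining(
    predicted: Sequence[int],
    reference: Sequence[int],
    max_mismatch_ratio: float = 0.5,  # hypothetical threshold, not from the source
) -> bool:
    """Return True when too many per-robot task values disagree with the reference.

    `reference` may be an expert's task information or the first network's output.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(p != r for p, r in zip(predicted, reference))
    return mismatches / len(predicted) > max_mismatch_ratio


print(needs_retraining([1, 0, 0, 3], [1, 0, 0, 3]))  # False
print(needs_retraining([1, 0, 0, 3], [2, 1, 1, 3]))  # True (3 of 4 differ)
```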

FIG. 9 illustrates a method of operating an electronic device according to an embodiment.

Each step of the operating method of FIG. 9 may be performed by the electronic device 100 of FIG. 1, so descriptions that overlap with the foregoing are omitted.

In step S910, the electronic device 100 may obtain first information and second information, each indicating the suitability of robot behaviors in a predetermined situation. The first information may include information in the form of a three-dimensional tensor, and the second information may include two-dimensional image information. The first information and the second information may be generated based on resource information of each of the plurality of robots, information about the battlefield situation, information about a fitness function for each battlefield situation, and information about resource-to-robot-behavior matching conditions.
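As an illustration of how a three-dimensional tensor might be turned into two-dimensional image information, the sketch below stacks the slices of the tensor into one matrix and rescales it to 8-bit grayscale. The axis order, the stacking rule, the min-max scaling, and the name `tensor_to_image` are all assumptions; this passage does not specify how the first information is processed.

```python
from typing import List, Sequence

Tensor3D = Sequence[Sequence[Sequence[float]]]


def tensor_to_image(tensor: Tensor3D) -> List[List[int]]:
    """Flatten a 3D suitability tensor into a 2D grayscale image (0..255).

    Assumed layout: tensor[slice][row][column]. Slices are stacked vertically
    to form the information matrix, which is then min-max scaled.
    """
    matrix = [list(row) for slice_ in tensor for row in slice_]
    flat = [value for row in matrix for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        # A constant tensor carries no contrast; map it to an all-black image.
        return [[0 for _ in row] for row in matrix]
    return [[round(255 * (value - lowest) / span) for value in row] for row in matrix]


tensor = [
    [[0.0, 0.2], [1.0, 0.4]],
    [[0.4, 0.0], [0.2, 1.0]],
]
image = tensor_to_image(tensor)
print(len(image), len(image[0]))  # 4 2
```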

In step S920, the electronic device 100 may identify first task information for allocating tasks to the plurality of robots based on the output of the first network, into which the first information is input, and may identify second task information for allocating tasks to the plurality of robots based on the output of the second network, into which the second information is input. The first network may be a genetic-algorithm-based network that outputs task information for allocating tasks to the plurality of robots based on information indicating the suitability of robot behaviors in a battlefield situation, and the second network may be a convolutional neural network trained to output task information for allocating tasks to the plurality of robots based on information indicating the suitability of robot behaviors in a battlefield situation.
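For readers unfamiliar with genetic algorithms, a generic sketch of a genetic search over task assignments follows. It is not the first network of this disclosure: the chromosome encoding (one task index per robot), the fitness (sum of suitability scores), one-point crossover, elitism, and every parameter value are assumptions chosen for brevity.

```python
import random
from typing import List, Sequence, Tuple

Suitability = Sequence[Sequence[float]]  # suitability[robot][task]


def fitness(chromosome: Sequence[int], suitability: Suitability) -> float:
    """Total suitability of giving task chromosome[i] to robot i."""
    return sum(suitability[robot][task] for robot, task in enumerate(chromosome))


def genetic_assignment(
    suitability: Suitability,
    population_size: int = 30,   # must be at least 4
    generations: int = 40,
    mutation_rate: float = 0.1,
    seed: int = 0,
) -> Tuple[List[int], List[float]]:
    """Return the best assignment found and the best fitness per generation."""
    rng = random.Random(seed)
    n_robots = len(suitability)  # must be at least 2 for one-point crossover
    n_tasks = len(suitability[0])
    population = [
        [rng.randrange(n_tasks) for _ in range(n_robots)]
        for _ in range(population_size)
    ]
    history: List[float] = []
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, suitability), reverse=True)
        history.append(fitness(population[0], suitability))
        next_population = [population[0][:]]          # elitism: keep the best
        parents = population[: population_size // 2]  # truncation selection
        while len(next_population) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_robots)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: each gene is replaced by a random task with small probability.
            child = [
                rng.randrange(n_tasks) if rng.random() < mutation_rate else gene
                for gene in child
            ]
            next_population.append(child)
        population = next_population
    best = max(population, key=lambda c: fitness(c, suitability))
    return best, history


suitability = [
    [0.9, 0.1, 0.3, 0.2],
    [0.2, 0.8, 0.1, 0.4],
    [0.3, 0.2, 0.7, 0.1],
    [0.1, 0.3, 0.2, 0.6],
]
best, history = genetic_assignment(suitability, seed=1)
print(best, fitness(best, suitability))
```

Because the best chromosome is always carried over, the recorded best fitness never decreases from one generation to the next. The iterative search is also what makes this approach slower at inference time than a single forward pass of a trained neural network.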

In step S930, the electronic device 100 may determine final task information for allocating tasks to the plurality of robots based on the first task information and the second task information. The electronic device 100 may determine the first final task value from the 1-1 task value and the 2-1 task value, determine the second final task value from the 1-2 task value and the 2-2 task value, determine the third final task value from the 1-3 task value and the 2-3 task value, and determine the fourth final task value from the 1-4 task value and the 2-4 task value. In addition, the electronic device 100 may make each of these determinations based on the first reliability and the second reliability.
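The claims state that the first reliability is calculated based on Cronbach's alpha coefficient of the first network's output and the second reliability based on the softmax layer outputs of the second network. The sketch below shows the standard formulas only. How the disclosure arranges the first network's output into a score matrix is not stated here, so treating rows as repeated runs and columns as robots is an assumption, as are the function names and the use of the largest softmax probability as the reliability.

```python
import math
from statistics import variance
from typing import List, Sequence, Tuple


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a matrix with observations as rows and items as columns.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(row totals))
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two observations")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("row totals have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_confidence(logits: Sequence[float]) -> Tuple[int, float]:
    """Return (predicted task index, its softmax probability)."""
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return best, probabilities[best]


# Perfectly consistent columns give alpha = 1.
print(cronbach_alpha([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]]))
# Hypothetical logits for one robot over three candidate tasks.
print(softmax_confidence([2.0, 0.0, 0.0]))
```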

In step S940, the electronic device 100 may provide the final task information. The electronic device 100 may provide the determined first final task value, second final task value, third final task value, and fourth final task value. In addition, the electronic device 100 may transmit the final task information to the plurality of robots through a communication device.

The electronic device according to the above-described embodiments may include a processor, a memory that stores and executes program data, permanent storage such as a disk drive, a communication port for communicating with external devices, and user interface devices such as a touch panel, keys, and buttons. Methods implemented as software modules or algorithms may be stored on a computer-readable recording medium as computer-readable code or program instructions executable on the processor. Here, computer-readable recording media include magnetic storage media (e.g., read-only memory (ROM), random-access memory (RAM), floppy disks, hard disks, etc.) and optically readable media (e.g., CD-ROM, DVD (Digital Versatile Disc)). The computer-readable recording medium may be distributed across computer systems connected through a network, so that the computer-readable code is stored and executed in a distributed manner. The medium can be read by a computer, stored in a memory, and executed by a processor.

The present embodiment may be represented in terms of functional block configurations and various processing steps. Such functional blocks may be implemented by any number of hardware and/or software configurations that perform specific functions. For example, an embodiment may employ integrated circuit configurations such as memory, processing, logic, and look-up tables, which can perform various functions under the control of one or more microprocessors or other control devices. Just as the components may be implemented as software programming or software elements, the present embodiment may be implemented in a programming or scripting language such as C, C++, Java, or assembler, with the various algorithms implemented as combinations of data structures, processes, routines, or other programming constructs. Functional aspects may be implemented as algorithms that run on one or more processors. In addition, the present embodiment may employ conventional techniques for electronic environment setting, signal processing, and/or data processing. Terms such as "mechanism", "element", "means", and "configuration" are used broadly and are not limited to mechanical or physical components. These terms may include the meaning of a series of software routines operating in conjunction with a processor or the like.

The foregoing embodiments are merely examples, and other embodiments may be implemented within the scope of the claims that follow.

Claims

A method by which an electronic device assigns tasks to a plurality of robots, the method comprising:
obtaining first information, which includes information representing the suitability of behaviors of the plurality of robots in a predetermined situation in the form of a three-dimensional tensor, and second information, which includes information representing the suitability of the behaviors of the plurality of robots in the predetermined situation as a two-dimensional image;
identifying first task information for allocating tasks to the plurality of robots, the first task information being an output of a first network into which the first information is input, and identifying second task information for allocating tasks to the plurality of robots, the second task information being an output of a second network into which the second information is input;
determining final task information for allocating tasks to the plurality of robots based on the first task information and the second task information; and
providing the final task information,
wherein the first network is
a genetic-algorithm-based network that outputs task information for allocating tasks to a plurality of robots based on information representing the suitability of robot behaviors in a predetermined situation, and
the second network is
a convolutional neural network trained to output task information for allocating tasks to a plurality of robots based on information representing the suitability of robot behaviors in a predetermined situation.

(Deleted)

The method of claim 1,
wherein the obtaining comprises
generating the first information and the second information based on resource information of each of the plurality of robots, information about the predetermined situation, information about a fitness function for each predetermined situation, and information about resource-to-robot-behavior matching conditions.

(Deleted)

The method of claim 1,
wherein the second information is information generated by processing the first information, the method further comprising:
acquiring the second information as input information of the second network, and acquiring output information of the first network into which the first information is input as target information for the input information of the second network; and
training the second network based on the input information and the target information.

The method of claim 1, further comprising:
acquiring the second information as input information of the second network, and acquiring task information of an expert corresponding to the second information as target information for the input information of the second network; and
training the second network based on the input information and the target information.

The method of claim 1,
wherein the plurality of robots include a first robot, a second robot, a third robot, and a fourth robot,
the first task information includes
a 1-1 task value for allocating a task to the first robot, a 1-2 task value for allocating a task to the second robot, a 1-3 task value for allocating a task to the third robot, and a 1-4 task value for allocating a task to the fourth robot,
the second task information includes
a 2-1 task value for allocating a task to the first robot, a 2-2 task value for allocating a task to the second robot, a 2-3 task value for allocating a task to the third robot, and a 2-4 task value for allocating a task to the fourth robot,
the final task information includes
a first final task value for allocating a task to the first robot, a second final task value for allocating a task to the second robot, a third final task value for allocating a task to the third robot, and a fourth final task value for allocating a task to the fourth robot, and
the determining comprises
determining the first final task value from the 1-1 task value and the 2-1 task value, determining the second final task value from the 1-2 task value and the 2-2 task value, determining the third final task value from the 1-3 task value and the 2-3 task value, and determining the fourth final task value from the 1-4 task value and the 2-4 task value.

The method of claim 1,
wherein the identifying comprises
identifying a first reliability, a 1-1 task value, a 1-2 task value, a 1-3 task value, and a 1-4 task value based on the output of the first network into which the first information is input, and identifying a second reliability, a 2-1 task value, a 2-2 task value, a 2-3 task value, and a 2-4 task value based on the output of the second network into which the second information is input, and
the determining comprises,
based on the first reliability and the second reliability, determining a first final task value from the 1-1 task value and the 2-1 task value, determining a second final task value from the 1-2 task value and the 2-2 task value, determining a third final task value from the 1-3 task value and the 2-3 task value, and determining a fourth final task value from the 1-4 task value and the 2-4 task value.

The method of claim 8,
wherein the first reliability is
calculated based on Cronbach's alpha coefficient of the output of the first network, and
the second reliability is
calculated based on outputs of the softmax layer of the second network.

The method of claim 1,
further comprising transmitting the final task information to the plurality of robots through a communication device.

An electronic device that assigns tasks to a plurality of robots, the electronic device comprising:
a memory in which at least one program is stored; and
a processor that, by executing the at least one program, obtains first information, which includes information representing the suitability of behaviors of the plurality of robots in a predetermined situation in the form of a three-dimensional tensor, and second information, which includes information representing the suitability of the behaviors of the plurality of robots in the predetermined situation as a two-dimensional image,
identifies first task information for allocating tasks to the plurality of robots, the first task information being an output of a first network into which the first information is input, and identifies second task information for allocating tasks to the plurality of robots, the second task information being an output of a second network into which the second information is input,
determines final task information for allocating tasks to the plurality of robots based on the first task information and the second task information, and
provides the final task information,
wherein the first network is
a genetic-algorithm-based network that outputs task information for allocating tasks to a plurality of robots based on information representing the suitability of robot behaviors in a predetermined situation, and
the second network is
a convolutional neural network trained to output task information for allocating tasks to a plurality of robots based on information representing the suitability of robot behaviors in a predetermined situation.

A non-transitory computer-readable recording medium on which is recorded a program for executing, on a computer, a method of assigning tasks to a plurality of robots,
the method comprising:
obtaining first information, which includes information representing the suitability of behaviors of the plurality of robots in a predetermined situation in the form of a three-dimensional tensor, and second information, which includes information representing the suitability of the behaviors of the plurality of robots in the predetermined situation as a two-dimensional image;
identifying first task information for allocating tasks to the plurality of robots, the first task information being an output of a first network into which the first information is input, and identifying second task information for allocating tasks to the plurality of robots, the second task information being an output of a second network into which the second information is input;
determining final task information for allocating tasks to the plurality of robots based on the first task information and the second task information; and
providing the final task information,
wherein the first network is
a genetic-algorithm-based network that outputs task information for allocating tasks to a plurality of robots based on information representing the suitability of robot behaviors in a predetermined situation, and
the second network is
a convolutional neural network trained to output task information for allocating tasks to a plurality of robots based on information representing the suitability of robot behaviors in a predetermined situation.