KR102537471B1

KR102537471B1 - Method and device for deep-learning based computation time management of go game service

Info

Publication number: KR102537471B1
Application number: KR1020210046229A
Authority: KR
Inventors: 이상현; 이창율
Original assignee: 엔에이치엔클라우드 주식회사
Priority date: 2021-04-09
Filing date: 2021-04-09
Publication date: 2023-05-30
Also published as: KR20220140123A

Abstract

A deep-learning-based apparatus for managing computation time in a Go game service according to an embodiment of the present invention comprises: a communication unit that receives at least one of a value computed for a board state and an opponent move time; a memory that stores a time adjustment unit; and a processor that reads the time adjustment unit and controls it to determine a move preparation time using at least one of the value and the opponent move time.

Description

Deep learning-based Go game service computation time management method and device thereof

The present invention relates to a deep-learning-based method and apparatus for managing computation time in a Go game service. More specifically, it relates to a deep-learning-based method and apparatus that manage deep learning computation time according to the state of a game in the Go game service.

As user terminals such as smartphones, tablet PCs, PDAs (Personal Digital Assistants), and laptops have become widespread and information processing technology has advanced, it has become possible to play Go, a type of board game, on a user terminal, and furthermore to play Go against a programmed artificial intelligence computer rather than a human.

Compared with other board games such as chess or janggi, Go has a far larger number of possible positions, which has limited the ability of artificial intelligence computers to play at a human level, and research on increasing the playing strength of artificial intelligence computers is being actively pursued.

Recently, developers have applied the Monte Carlo Tree Search (MCTS) algorithm and deep learning technology to artificial intelligence computers and raised their playing strength above the level of professional players.

Go is also a time-limited board game. Time limits differ between tournaments, but each player is typically given between one and five hours. Once that time is used up, a byo-yomi (countdown) rule applies, and a player who exceeds the allowed number of byo-yomi periods loses.
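The time rule described above can be sketched as a small clock model. This is a minimal illustration only, assuming Japanese-style byo-yomi in which a period is lost each time a move uses up a full period; the class name `ByoYomiClock`, its fields, and the exact boundary behavior are assumptions and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class ByoYomiClock:
    """Hypothetical clock: main time followed by a fixed number of byo-yomi periods."""
    main_time: float    # seconds of main time remaining
    period_time: float  # length of one byo-yomi period in seconds
    periods: int        # byo-yomi periods remaining

    def spend(self, seconds: float) -> bool:
        """Charge one move to the clock; return False if the player loses on time."""
        if self.main_time > 0:
            overflow = seconds - self.main_time
            self.main_time = max(0.0, self.main_time - seconds)
            if overflow <= 0:
                return True  # the move fit inside the main time
            seconds = overflow  # the remainder is charged to byo-yomi
        # Each full period consumed by a single move is lost (assumed rule).
        self.periods -= int(seconds // self.period_time)
        return self.periods > 0
```

A move that finishes inside the current period leaves the period count unchanged, which is why the choice of how long to think on each move matters once the main time has run out.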

Therefore, keeping track of the remaining game time and deciding how much time to spend on each move is an important factor in winning the game.

However, an artificial intelligence computer always spends the same amount of time on each move, and as a result it may play a poor move at a critical point in the game.

In addition, casual players, amateurs, and artificial intelligence computers generally cannot predict how long the rest of the game will last, and therefore cannot plan a time strategy.

JP 4392621 B2

An object of the present invention is to provide a deep-learning-based method and apparatus for managing computation time in a Go game service, which manage deep learning computation time according to the state of a game in the Go game service.

In detail, an object of the present invention is to provide a deep-learning-based method and apparatus for managing computation time in a Go game service that change the move preparation time at critical points in the game, taking into account factors such as changes in the win rate and the time the opponent takes to move.

A further object of the present invention is to provide a deep-learning-based method and apparatus for managing computation time in a Go game service that can divide the move preparation time effectively using a predicted remaining game length.

However, the technical problems to be solved by the present invention and its embodiments are not limited to those described above, and other technical problems may exist.

Here, the time adjustment unit determines at least one of the trend and the magnitude of change in the value, and determines the move preparation time according to the determined trend and magnitude.

In addition, the time adjustment unit increases the move preparation time when the rate of change of the value, which represents the trend of the value, is at or below a predetermined value.

In addition, the time adjustment unit increases the move preparation time when the current value has fallen by a predetermined amount or more relative to the value at the previous move.
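The two value-based conditions above can be sketched as a single check. This is a minimal illustration rather than the patented implementation: the function name, the thresholds `rate_threshold` and `drop_threshold`, the `window` length, and the way the rate of change is computed (average per-move change over the window) are all assumptions, since the text only speaks of "predetermined" values.

```python
from typing import Sequence


def should_extend_for_value(values: Sequence[float],
                            rate_threshold: float = -0.02,
                            drop_threshold: float = 0.10,
                            window: int = 4) -> bool:
    """Return True if either value-based condition for thinking longer holds.

    values holds the value estimate after each of our moves, oldest first.
    All thresholds are hypothetical placeholders.
    """
    if len(values) < 2:
        return False  # not enough history to measure any change
    recent = values[-window:]
    # Trend: average change per move across the recent window.
    rate = (recent[-1] - recent[0]) / (len(recent) - 1)
    # Magnitude: how far the value fell since the previous move.
    drop = values[-2] - values[-1]
    return rate <= rate_threshold or drop >= drop_threshold
```

For example, a history of `[0.60, 0.59, 0.45]` triggers the check because the value is trending down, while a slowly rising history such as `[0.50, 0.51, 0.52]` does not.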

In addition, based on the opponent move time, the time adjustment unit calculates a last opponent move time, which is the time the opponent took for the immediately preceding move, and an average opponent move time, which is the average of the times the opponent took for a plurality of moves made over a predetermined period.

In addition, the time adjustment unit increases the move preparation time when the last opponent move time is greater than the average opponent move time.
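The comparison between the opponent's latest move time and the opponent's average can be sketched as follows. The function names and the `window` parameter are hypothetical, and whether the average includes the latest move is an assumption; the text only says the average covers moves made over a predetermined period.

```python
from statistics import mean
from typing import Sequence, Tuple


def opponent_time_stats(opponent_times: Sequence[float],
                        window: int = 10) -> Tuple[float, float]:
    """Return (last opponent move time, average over the last `window` moves)."""
    if not opponent_times:
        raise ValueError("at least one opponent move time is required")
    last = opponent_times[-1]
    # Assumption: the latest move is included in the averaged window.
    average = mean(opponent_times[-window:])
    return last, average


def should_extend_for_opponent(opponent_times: Sequence[float],
                               window: int = 10) -> bool:
    """True when the opponent's latest move took longer than their average."""
    last, average = opponent_time_stats(opponent_times, window)
    return last > average
```

The intuition is that an unusually long think by the opponent suggests a difficult position, so spending more time on the reply is reasonable.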

In addition, when a condition for increasing the move preparation time is satisfied, the time adjustment unit calculates a move-time increment, which is the amount by which the move preparation time is to be increased.

In addition, the memory further includes a time management unit that provides a base move time, which is a preset move preparation time, and the time adjustment unit determines the move-time increment based on the base move time provided by the time management unit.

In addition, the time adjustment unit determines the move-time increment based on a preset byo-yomi time and the base move time.

In addition, the memory further includes a time management unit that provides a base move time, which is a preset move preparation time, and the time adjustment unit selects the larger of the last opponent move time and the base move time provided by the time management unit as the move preparation time.
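The text states that the increment depends on the base move time and the byo-yomi time but does not give a formula, so the sketch below fills that gap with an assumed rule: the increment is a fixed fraction of the base time, capped so that the total stays within one byo-yomi period. The function names and the `factor` parameter are hypothetical. Only the last function, which takes the larger of the two times, follows directly from the description.

```python
def move_time_increment(base_time: float, byoyomi_time: float,
                        factor: float = 0.5) -> float:
    """Assumed rule: extra time is factor * base, capped by the byo-yomi period."""
    headroom = max(0.0, byoyomi_time - base_time)
    return min(factor * base_time, headroom)


def prep_time_with_increment(base_time: float, byoyomi_time: float,
                             extend: bool, factor: float = 0.5) -> float:
    """Base move time, plus the increment when an extension condition holds."""
    if not extend:
        return base_time
    return base_time + move_time_increment(base_time, byoyomi_time, factor)


def prep_time_from_opponent(base_time: float, last_opponent_time: float) -> float:
    """Use the larger of the base move time and the opponent's last move time."""
    return max(base_time, last_opponent_time)
```

Capping by the byo-yomi period is one way to make sure a longer think never costs a period; other caps would fit the description equally well.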

In addition, the time adjustment unit calculates an opponent move time ratio, which is the ratio of the last opponent move time to the average opponent move time; the memory further includes a time management unit that provides a base move time, which is a preset move preparation time; and when the opponent move time ratio satisfies a predetermined criterion, the time adjustment unit determines the move preparation time by performing a predetermined operation based on the opponent move time ratio and the base move time provided by the time management unit.
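A ratio-based variant might look like the sketch below. The "predetermined criterion" and "predetermined operation" are not specified in the text, so this example assumes the criterion is a minimum ratio (`ratio_threshold`) and the operation is scaling the base move time by the ratio, with a hypothetical upper bound `max_scale` to keep one very slow opponent move from consuming the clock.

```python
def prep_time_from_ratio(base_time: float,
                         last_opponent_time: float,
                         avg_opponent_time: float,
                         ratio_threshold: float = 1.5,
                         max_scale: float = 3.0) -> float:
    """Scale the base move time by the opponent move time ratio (assumed rule)."""
    if avg_opponent_time <= 0:
        return base_time  # no usable average yet
    ratio = last_opponent_time / avg_opponent_time
    if ratio < ratio_threshold:
        return base_time  # criterion not met: keep the base move time
    return base_time * min(ratio, max_scale)
```

With a base move time of 10 seconds, an opponent who just took twice their average would yield a preparation time of 20 seconds under these assumed parameters.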

Meanwhile, a deep-learning-based method for managing computation time in a Go game service according to an embodiment of the present invention is a method in which a time management model server manages the computation time of a deep-learning-based Go game service, and comprises: receiving at least one of a value computed for a board state and an opponent move time; and determining a move preparation time using at least one of the received value and opponent move time. The step of determining the move preparation time includes: determining at least one of the trend and the magnitude of change in the value, and determining the move preparation time according to the determined trend and magnitude; and calculating, based on the opponent move time, a last opponent move time, which is the time the opponent took for the immediately preceding move, and an average opponent move time, which is the average of the times the opponent took for a plurality of moves made over a predetermined period, and determining the move preparation time based on the calculated last opponent move time and average opponent move time.

The deep-learning-based method and apparatus for managing computation time in a Go game service according to an embodiment of the present invention can change the move preparation time at critical points in the game.

In addition, the method and apparatus according to an embodiment of the present invention can predict the remaining game time.

In addition, the method and apparatus according to an embodiment of the present invention can distribute the move preparation time effectively using the predicted remaining game time.

However, the effects obtainable from the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood from the description below.

FIG. 1 is an exemplary diagram of a deep-learning-based Go game service system according to an embodiment of the present invention.
FIG. 2 is a diagram for explaining the structure of the move model of a move model server, which selects moves for the artificial intelligence computer in a deep-learning-based Go game service according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining the move probability distribution over candidate points according to the policy of the move model.
FIG. 4 is a diagram for explaining the value and the visit count of each candidate point of the move model.
FIG. 5 is a diagram for explaining the process by which the move model selects a move according to the pipeline of the search unit.
FIG. 6 is an exemplary diagram of a screen providing the position judgment function of a deep-learning-based Go game service according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining the structure of the position judgment model of the position judgment model server of the present invention.
FIG. 8 is a diagram for explaining one block of the neural network structure, composed of a plurality of blocks, of the position judgment model of the present invention.
FIG. 9 is a diagram for explaining the first and second preprocessing steps for generating the ground-truth labels used to train the position judgment model of the present invention.
FIG. 10 is a diagram for explaining the first and second preprocessing steps for generating the ground-truth labels used to train the position judgment model of the present invention.
FIG. 11 is a diagram for explaining the third preprocessing step for generating the ground-truth labels used to train the position judgment model of the present invention.
FIG. 12 is a diagram for explaining a position judgment result of the position judgment model of the present invention.
FIG. 13 compares a position judgment result of the position judgment model of the present invention with that of a deep learning model according to the prior art.
FIG. 14 compares a position judgment result of the position judgment model of the present invention with that of a deep learning model according to the prior art.
FIG. 15 compares a position judgment result of the position judgment model of the present invention with that of a deep learning model according to the prior art.
FIG. 16 is an exemplary diagram of the signal flow in a deep-learning-based Go game service system according to an embodiment of the present invention.
FIG. 17 shows a position judgment method within a deep-learning-based Go game service method according to an embodiment of the present invention.
FIG. 18 shows a method of preprocessing training data to generate ground-truth labels within the position judgment method of FIG. 17.
FIG. 19 is a diagram for explaining the time management unit of a time management model server according to an embodiment of the present invention.
FIGS. 20A and 20B are diagrams for explaining the variance calculation of the time management unit according to an embodiment of the present invention.
FIG. 21 is an exemplary diagram of the signal flow of the time management model server in the Go game service system according to an embodiment of the present invention.
FIG. 22 shows a method of determining the move preparation time within a deep-learning-based Go game service method according to an embodiment of the present invention.
FIG. 23 is a diagram for explaining the time adjustment unit of a time management model server according to another embodiment of the present invention.
FIG. 24 shows a method of determining the move preparation time within a deep-learning-based Go game service method according to another embodiment of the present invention.
FIG. 25 is an example of a diagram for explaining a method of determining the move preparation time according to the trend of the value, according to another embodiment of the present invention.
FIG. 26 is an example of a diagram for explaining a method of determining the move preparation time according to the magnitude of change in the value, according to another embodiment of the present invention.
FIG. 27 shows a method of determining the move preparation time within a deep-learning-based Go game service method according to another embodiment of the present invention.
FIG. 28 is an example of a diagram for explaining a method of determining the move preparation time according to the opponent move time, according to another embodiment of the present invention.
FIG. 29 is a diagram for explaining the time management model of a time management model server according to yet another embodiment of the present invention.
FIGS. 30A and 30B are diagrams for explaining the change in territory count used to generate game time information according to yet another embodiment of the present invention.
FIGS. 31A and 31B are diagrams for explaining the change in territory count used to generate game time information according to yet another embodiment of the present invention.
FIGS. 32A and 32B are diagrams for explaining the number of neutral points (dame) used to generate game time information according to yet another embodiment of the present invention.
FIG. 33 is an exemplary diagram of the signal flow of the time management model server in the Go game service system according to yet another embodiment of the present invention.
FIG. 34 shows a method of generating game time information within a deep-learning-based Go game service method according to yet another embodiment of the present invention.

Since the present invention can be modified in various ways and can have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. The effects and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described below in detail together with the drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various forms. In the following embodiments, terms such as first and second are used not in a limiting sense but to distinguish one component from another. Singular expressions include plural expressions unless the context clearly indicates otherwise. Terms such as "include" or "have" mean that the features or components described in the specification are present, and do not preclude the possibility that one or more other features or components may be added. In the drawings, the size of components may be exaggerated or reduced for convenience of description. For example, the size and thickness of each component shown in the drawings are chosen arbitrarily for convenience of description, and the present invention is not necessarily limited to what is illustrated.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. In the description, identical or corresponding components are given the same reference numerals, and duplicate descriptions of them are omitted.

FIG. 1 is an exemplary diagram of a deep-learning-based Go game service system according to an embodiment of the present invention.

Referring to FIG. 1, the deep-learning-based Go game service system according to an embodiment may include a terminal 100, a Go server 200, a move model server 300, a position judgment model server 400, and a network 500.

The components of FIG. 1 may be connected through the network 500. The network 500 refers to a connection structure that allows information to be exchanged among nodes such as the terminal 100, the Go server 200, the move model server 300, the position judgment model server 400, and the time management model server 500. Examples of such a network include, but are not limited to, a 3GPP (3rd Generation Partnership Project) network, an LTE (Long Term Evolution) network, a WiMAX (Worldwide Interoperability for Microwave Access) network, the Internet, a LAN (Local Area Network), a wireless LAN (Wireless Local Area Network), a WAN (Wide Area Network), a PAN (Personal Area Network), a Bluetooth network, a satellite broadcasting network, an analog broadcasting network, and a DMB (Digital Multimedia Broadcasting) network.

<Terminal (100)>

First, the terminal 100 is the terminal of a user who wishes to receive the Go game service. The terminal 100 is one or more computers or other electronic devices that the user uses to run applications that perform various tasks. Examples include a computer, a laptop computer, a smartphone, a mobile phone, a PDA, a tablet PC, or any other device operable to communicate with the Go server 200. However, the terminal 100 is not limited to these: it may run on a variety of machines and may include various other elements, such as processing logic that interprets and executes instructions stored in a plurality of memories, and processes that display graphical information for a graphical user interface (GUI) on an external input/output device. The terminal 100 may also be connected to an input device (for example, a mouse, a keyboard, or a touch-sensitive surface) and an output device (for example, a display device, a monitor, or a screen). Applications executed by the terminal 100 may include a game application, a web browser, a web application running in a web browser, word processors, media players, spreadsheets, image processors, security software, and others.

The terminal 100 may also include at least one memory 101 that stores instructions, at least one processor 102, and a communication unit 103.

The memory 101 of the terminal 100 may store a plurality of application programs or applications that run on the terminal 100, as well as data and instructions for the operation of the terminal 100. The instructions are executable by the processor 102 to cause the processor 102 to perform operations, and the operations may include transmitting a Go game execution request signal, transmitting and receiving game data, transmitting and receiving move information, transmitting a position judgment request signal, receiving a position judgment result, requesting game time information, receiving game time information, and receiving various other information. In terms of hardware, the memory 101 may be any of various storage devices such as ROM, RAM, EPROM, a flash drive, or a hard drive, and the memory 101 may also be web storage that performs the storage function of the memory 101 over the Internet.

The processor 102 of the terminal 100 may control overall operation and perform the data processing needed to receive the Go game service. When the Go game application is executed on the terminal 100, a Go game environment is set up on the terminal 100. The Go game application then exchanges Go game data with the Go server 200 through the network 500 so that the Go game service runs on the terminal 100. The processor 102 may be an ASIC (application-specific integrated circuit), a DSP (digital signal processor), a DSPD (digital signal processing device), a PLD (programmable logic device), an FPGA (field-programmable gate array), a controller, a microcontroller, a microprocessor, or any other type of processor for performing functions.

The communication unit 103 of the terminal 100 may transmit and receive wireless signals to and from at least one of a base station, an external terminal, and a server over a network built according to communication schemes such as GSM (Global System for Mobile communication), CDMA (Code Division Multiple Access), HSDPA (High Speed Downlink Packet Access), HSUPA (High Speed Uplink Packet Access), LTE (Long Term Evolution), LTE-A (Long Term Evolution-Advanced), WLAN (Wireless LAN), Wi-Fi (Wireless Fidelity), Wi-Fi Direct, DLNA (Digital Living Network Alliance), WiBro (Wireless Broadband), and WiMAX (Worldwide Interoperability for Microwave Access).

The terminal 100 may also perform at least part of the functional operations performed by at least one of the Go server 200, the move model server 300, the position judgment model server 400, and the time management model server 500, which are described later.

<바둑서버(200)><Go server (200)>

바둑서버(200)가 제공하는 바둑 게임 서비스는 바둑서버(200)가 제공하는 가상의 컴퓨터 유저와 실제 유저가 함께 게임에 참여하는 형태로 구성될 수 있다. 이는 유저측 단말기(100) 상에서 구현되는 바둑 게임 환경에서 하나의 실제 유저와 하나의 컴퓨터 유저가 함께 게임을 플레이 한다. 다른 측면에서, 바둑서버(200)가 제공하는 바둑 게임 서비스는 복수의 유저측 디바이스가 참여하여 바둑 게임이 플레이되는 형태로 구성될 수도 있다.The Go game service provided by the Go server 200 may be configured in a form in which a virtual computer user provided by the Go server 200 and a real user participate in the game together. In the Go game environment implemented on the user-side terminal 100, one real user and one computer user play the game together. In another aspect, the Go game service provided by the Go server 200 may be configured in a form in which a plurality of user-side devices participate to play the Go game.

바둑서버(200)는 명령들을 저장하는 적어도 하나의 메모리(201), 적어도 하나의 프로세서(202) 및 통신부(203)를 포함할 수 있다. The Go server 200 may include at least one memory 201 storing instructions, at least one processor 202 and a communication unit 203.

The memory 201 of the Go server 200 may store a plurality of application programs or applications running on the Go server 200, as well as data and instructions for the operation of the Go server 200. The instructions are executable by the processor 202 to cause the processor 202 to perform operations, and the operations may include receiving a game execution request signal, transmitting and receiving game data, transmitting and receiving move information, transmitting and receiving a situation judgment request signal, transmitting and receiving a situation judgment result, transmitting and receiving a move preparation time, transmitting and receiving game time information, and various other transmission operations. In addition, the memory 201 may store a plurality of game records of games played on the Go server 200 or a plurality of previously published game records. Each game record may include all information from the first move, which is the move information at the start of the game, to the final move at which the game ends. That is, the game records may include history information about the moves. The Go server 200 may provide the stored game records to the situation judgment model server 400 for training of the situation judgment model server 400. In addition, in terms of hardware, the memory 201 may be any of various storage devices such as a ROM, RAM, EPROM, flash drive, or hard drive, and the memory 201 may also be web storage that performs the storage function of the memory 201 on the Internet.
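As an illustration of a game record that holds every move from the first to the last, the following is a minimal sketch. The class names `Move` and `GameRecord`, the `(row, col)` coordinate convention, and the use of `None` for a pass are assumptions made for this example and do not come from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Move:
    number: int                       # 1 for the first move of the game
    color: str                        # "B" (black) or "W" (white)
    point: Optional[Tuple[int, int]]  # (row, col) on the board; None is an assumed pass marker


@dataclass
class GameRecord:
    """Hypothetical container for one game record (move history)."""
    moves: List[Move] = field(default_factory=list)

    def add_move(self, color: str, point: Optional[Tuple[int, int]]) -> Move:
        # Moves are numbered in the order they are played.
        move = Move(number=len(self.moves) + 1, color=color, point=point)
        self.moves.append(move)
        return move

    def history(self, up_to: int) -> List[Move]:
        # Moves from the first move through move number `up_to`.
        return self.moves[:up_to]


record = GameRecord()
record.add_move("B", (3, 15))
record.add_move("W", (15, 3))
record.add_move("B", (15, 15))
```

Keeping the full ordered move list makes it possible to rebuild the board state at any point of the game, which is what a training pipeline would need from such records.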

The processor 202 of the Go server 200 may control overall operation and perform the data processing needed to provide the Go game service. The processor 202 may be an ASIC (application specific integrated circuit), DSP (digital signal processor), DSPD (digital signal processing device), PLD (programmable logic device), FPGA (field programmable gate array), controller, micro-controller, microprocessor, or any other type of processor for performing such functions.

The Go server 200 may communicate with the terminal 100, the move model server 300, and the situation judgment model server 400 via the network 500 through the communication unit 203.

<Move model server 300>

The move model server 300 may include a separate cloud server or computing device. The move model server 300 may also be a neural network system installed in the processor of the terminal 100 or in the data processing unit of the Go server 200, but hereinafter the move model server 300 is described as a device separate from the terminal 100 and the Go server 200.

The move model server 300 may include at least one memory 301 storing instructions, at least one processor 302, and a communication unit 303.

The move model server 300 is an artificial intelligence computer that learns by itself according to the rules of Go to build a move model, which is a deep learning model, and that can play a game against the user of the terminal 100. On its own turn, it can place a stone so as to win the game. A detailed description of how the move model server 300 trains the move model follows the description of the move model of FIGS. 2 to 5.

The memory 301 of the move model server 300 may store a plurality of application programs or applications running on the move model server 300, as well as data and instructions for the operation of the move model server 300. The instructions are executable by the processor 302 to cause the processor 302 to perform operations, and the operations may include a move model learning (training) operation, transmitting and receiving move information, receiving a move preparation time, receiving game time information, and various other transmission operations. In addition, the memory 301 may store the move model, which is a deep learning model. In terms of hardware, the memory 301 may be any of various storage devices such as a ROM, RAM, EPROM, flash drive, or hard drive, and the memory 301 may also be web storage that performs the storage function of the memory 301 on the Internet.

The processor 302 of the move model server 300 reads the move model stored in the memory 301 and, according to the constructed neural network system, performs the move model learning and stone placement described below. Depending on the embodiment, the processor 302 may be configured to include a main processor that controls all units and a plurality of graphics processing units (GPUs) that handle the large volume of computation required to run the neural network according to the move model.

The move model server 300 may communicate with the Go server 200 via the network 500 through the communication unit 303.

<Situation judgment model server 400>

The situation judgment model server 400 may include a separate cloud server or computing device. The situation judgment model server 400 may also be a neural network system installed in the processor of the terminal 100 or in the data processing unit of the Go server 200, but hereinafter the situation judgment model server 400 is described as a device separate from the terminal 100 and the Go server 200.

The situation judgment model server 400 may include at least one memory 401 storing instructions, at least one processor 402, and a communication unit 403.

The situation judgment model server 400 may receive a training data set from the Go server 200 through the communication unit 403. The training data set may consist of a plurality of game records and situation judgment information for those game records. Using the received training data set, the situation judgment model server 400 performs supervised learning so that it can judge the situation of a Go board state in which stones have been placed, thereby building a situation judgment model, which is a deep learning model, and may perform a situation judgment in response to a situation judgment request from the user of the terminal 100. A detailed description of how the situation judgment model server 400 trains the situation judgment model follows the description of the situation judgment model of FIGS. 6 to 18.

The memory 401 of the situation judgment model server 400 may store a plurality of application programs or applications running on the situation judgment model server 400, as well as data and instructions for the operation of the situation judgment model server 400. The instructions are executable by the processor 402 to cause the processor 402 to perform operations, and the operations may include a situation judgment model learning (training) operation, performing a situation judgment, transmitting a situation judgment result, receiving information on a plurality of game records, transmitting information on the change in the territory count, transmitting information on the number of neutral points (dame), and various other transmission operations. In addition, the memory 401 may store the situation judgment model, which is a deep learning model. In terms of hardware, the memory 401 may be any of various storage devices such as a ROM, RAM, EPROM, flash drive, or hard drive, and the memory 401 may also be web storage that performs the storage function of the memory 401 on the Internet.

The processor 402 of the situation judgment model server 400 reads the situation judgment model stored in the memory 401 and, according to the constructed neural network system, performs the situation judgment model learning and the judgment of the board situation during a game described below. Depending on the embodiment, the processor 402 may be configured to include a main processor that controls all units and a plurality of graphics processing units (GPUs) that handle the large volume of computation required to run the neural network according to the situation judgment model.

The situation judgment model server 400 may communicate with the Go server 200 via the network 500 through the communication unit 403.

<Time management model server 500>

The time management model server 500 may include a separate cloud server or computing device. The time management model server 500 may also be a neural network system installed in the processor of the terminal 100, the memory of the Go server 200, the memory of the move model server 300, or the memory of the situation judgment model server 400, but hereinafter the time management model server 500 is described as a device separate from the terminal 100, the Go server 200, the move model server 300, and the situation judgment model server 400.

The time management model server 500 may include at least one memory 501 storing instructions, at least one processor 502, and a communication unit 503.

In addition, the time management model server 500 may receive the number of visits, the search probability value, or the value from the move model server 300 through the communication unit 503. The time management model server 500 may determine the move preparation time using the received number of visits, search probability value, and value. A detailed description of how the time management model server 500 determines the move preparation time follows the description of FIGS. 19 to 22.

In addition, the time management model server 500 may receive a training data set from the Go server 200 through the communication unit 503. The training data set may consist of a plurality of game records, situation judgment information for those game records, and time information for each Go board state in those game records.

In addition, the time management model server 500 may receive the value and/or the opponent's move preparation time (hereinafter, the opponent move time) from the move model server 300 and/or the Go server 200 through the communication unit 503. The time management model server 500 may determine the move preparation time using the received value and/or opponent move time. A detailed description of how the time management model server 500 determines the move preparation time follows the description of FIGS. 23 to 28.

In addition, the time management model server 500 may receive situation judgment information from the situation judgment model server 400 through the communication unit 503. The situation judgment information may include information on the change in the territory count, information on the number of neutral points (dame), and the like. The time management model server 500 may perform supervised learning so that it can generate game time information using the received training data set, situation judgment information, value, and the like, thereby building a time management model, which is a deep learning model, and may provide game time information corresponding to the Go board state to the move model server 300 or the terminal 100. A detailed description of how the time management model server 500 trains the time management model follows the description of the time management model of FIGS. 29 to 34.

The memory 501 of the time management model server 500 may store a plurality of application programs or applications running on the time management model server 500, as well as data and instructions for the operation of the time management model server 500. The instructions are executable by the processor 502 to cause the processor 502 to perform operations, and the operations may include a time management model learning (training) operation, receiving the number of visits, the search probability value, or the value, determining the move preparation time, receiving information on a plurality of game records, receiving information on the change in the territory count, receiving information on the number of neutral points (dame), generating game time information, and various other transmission operations. In addition, the memory 501 may store a time management unit, a time adjustment unit, or a time management model that is a deep learning model. In terms of hardware, the memory 501 may be any of various storage devices such as a ROM, RAM, EPROM, flash drive, or hard drive, and the memory 501 may also be web storage that performs the storage function of the memory 501 on the Internet.

The processor 502 of the time management model server 500 reads the time management unit stored in the memory 501 and performs the move preparation time determination described below.

In addition, the processor 502 of the time management model server 500 reads the time adjustment unit stored in the memory 501 and performs the move preparation time determination described later.

In addition, the processor 502 of the time management model server 500 reads the time management model stored in the memory 501 and, according to the constructed neural network system, generates the game time information described below.

Depending on the embodiment, the processor 502 may be configured to include a main processor that controls all units and a plurality of graphics processing units (GPUs) that handle the large volume of computation required to run the neural network according to the time management model.

The time management model server 500 may communicate with the Go server 200, the move model server 300, and the situation judgment model server 400 via the network 500 through the communication unit 503.

<Move model>

FIG. 2 is a diagram for explaining the structure of the move model of the move model server, which is used for the moves of the artificial intelligence computer in the deep learning-based Go game service according to an embodiment of the present invention. FIG. 3 is a diagram for explaining the move probability distribution over points according to the policy of the move model. FIG. 4 is a diagram for explaining the value and the number of visits for the points of the move model. FIG. 5 is a diagram for explaining the process by which the move model places a stone according to the pipeline of the search unit.

Referring to FIG. 2, the move model according to an embodiment of the present invention is a deep learning model of the move model server 300 and may include a search unit 310, a self-play unit 320, and a move neural network 330.

The move model may be trained, using the search unit 310, the self-play unit 320, and the move neural network 330, into a model that places stones so as to win a game. More specifically, the search unit 310 may perform a Monte Carlo Tree Search (MCTS) under the guidance of the move neural network 330. MCTS is a heuristic search algorithm for decision making. That is, the search unit 310 may perform MCTS based on the move probability value (p) and/or the value (V) provided by the move neural network 330. For example, the search unit 310 guided by the move neural network 330 may perform MCTS and output a search probability value (π), which is a probability distribution over the candidate points. The self-play unit 320 may play Go against itself according to the search probability value (π). The self-play unit 320 plays against itself until the outcome of the game is decided and, when the self-play game ends, may provide the Go board state (S), the search probability value (π), and the self-play value (z) to the move neural network 330. The Go board state (S) is the state in which stones are placed on the points of the board. The self-play value (z) is the win rate obtained when self-play is carried out from the Go board state (S). The move neural network 330 may output the move probability value (p) and the value (V). The move probability value (p) is a probability distribution that expresses numerically, for the Go board state (S), which points are good moves for winning the game. The value (V) represents the win rate when a stone is placed at the corresponding point. For example, a point with a high move probability value (p) may be a good move. The move neural network 330 may be trained so that the move probability value (p) becomes equal to the search probability value (π) and the value (V) becomes equal to the self-play value (z). The trained move neural network 330 then guides the search unit 310, and the search unit 310 runs MCTS during the move preparation time to find a better move than the previous search probability value (π) indicated, and outputs a new search probability value (π). For example, depending on the MCTS running time, the move preparation time may be any one of an average move preparation time, a first move preparation time, and a second move preparation time. The move preparation time may be provided by the time management model server 500 and may be set to the average move preparation time by default. The self-play unit 320 may output a new self-play value (z) for the Go board state (S) based on the new search probability value (π), and may provide the Go board state (S), the new search probability value (π), and the new self-play value (z) to the move neural network 330. The move neural network 330 may be trained again so that the move probability value (p) and the value (V) are output as the new search probability value (π) and the new self-play value (z). That is, by repeating this process, the move model can train the move neural network 330 to find better points for winning the game. For example, the move model may use a move loss (l). The move loss (l) is given by Equation 1.

(Equation 1)

$$ l = (z - v)^2 - \pi^{T} \log p + c \lVert \theta \rVert^2 $$

Here θ denotes the parameters of the neural network, and c is a very small constant.

In the move loss (l) of Equation 1, making z and v equal corresponds to the mean square loss term, making π and p equal corresponds to the cross entropy loss term, and multiplying θ by c is a regularization term intended to prevent overfitting.
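The three terms of the move loss can be illustrated with a small numerical sketch. It assumes the loss has the form (z - v)^2 - π·log p + c·‖θ‖^2, treats π, p, and θ as plain Python lists, and uses a default `c` and a small `eps` guard that are assumptions of this example, not values from the description.

```python
import math


def move_loss(z, v, pi, p, theta, c=1e-4):
    """Illustrative move loss for a single training sample."""
    # Mean square term: pulls the predicted value v toward the self-play value z.
    value_term = (z - v) ** 2
    # Cross entropy term: pulls the move probabilities p toward the search probabilities pi.
    eps = 1e-12  # assumed guard against log(0)
    policy_term = -sum(pi_a * math.log(p_a + eps) for pi_a, p_a in zip(pi, p))
    # Regularization term: c times the squared L2 norm of the network parameters.
    reg_term = c * sum(w * w for w in theta)
    return value_term + policy_term + reg_term


# A perfect prediction with zero parameters gives a loss of (almost exactly) zero.
perfect = move_loss(z=1.0, v=1.0, pi=[1.0, 0.0], p=[1.0, 0.0], theta=[0.0])
# A wrong value prediction and non-zero parameters raise the loss.
worse = move_loss(z=1.0, v=0.0, pi=[0.5, 0.5], p=[0.5, 0.5], theta=[1.0, 2.0], c=0.1)
```

In the second call the value term contributes 1, the cross entropy term contributes ln 2, and the regularization term contributes 0.1 × (1 + 4) = 0.5.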

For example, referring to FIG. 3, the trained move model may represent the move probability values (p) of the points as a probability distribution, as shown in FIG. 3. Referring to FIG. 4, the search probability value (π) of the trained move model may be represented by the upper value displayed at each point. The search probability value (π) may be the ratio of the number of visits to a candidate move to the total number of visits. For example, if the total number of MCTS simulations is 1000 and 90.00 is displayed, the candidate move was visited 900 times out of 1000. The value (V) of the trained move model may be represented by the lower value displayed at each point in FIG. 4. The move neural network 330 may have a neural network structure. For example, the move neural network 330 may consist of one convolution block and 19 residual blocks. The convolution block may be a stack of several 3x3 convolution layers. Each residual block may be a stack of several 3x3 convolution layers with a skip connection. A skip connection is a structure in which the input of a given layer is added to the output of that layer, and the sum is passed on as the input to another layer. In addition, the input to the move neural network 330 may be a 19*19*17 RGB image containing the stone positions for the last 8 moves of the black player, the stone positions for the last 8 moves of the white player, and turn information indicating whether the current player is black or white.
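The relationship between visit counts and the search probability value can be sketched as follows. The function name `search_probabilities` and the point labels are hypothetical and chosen only for this example. The visit counts reproduce the case of 900 visits out of 1000 simulations.

```python
def search_probabilities(visit_counts):
    """Turn per-point MCTS visit counts into visit ratios."""
    total = sum(visit_counts.values())
    if total == 0:
        raise ValueError("no simulations were run")
    # Each probability is the share of all visits that went to that point.
    return {point: n / total for point, n in visit_counts.items()}


# Hypothetical visit counts after 1000 simulations.
visits = {"D4": 900, "Q16": 70, "K10": 30}
pi = search_probabilities(visits)
percent_d4 = round(pi["D4"] * 100, 2)  # the kind of figure displayed as 90.00
```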

Referring to FIG. 5, the trained move model may place a stone on its turn using the move neural network 330 and the search unit 310. Through the selection step (a), the move model selects, from the current first Go board state (S1), the second Go board state (S1-2) whose point has a high action value function (Q) and confidence value (U) among the branches not yet explored by MCTS. The action value function (Q) is the average of the values (V) computed each time the branch is traversed. The confidence value (U) is inversely proportional to the number of visits (N) through the branch and proportional to the move probability value (p). Through the expansion and evaluation step (b), the move model may expand to the third Go board state (S1-2-1) at the selected point and compute the move probability value (p). The move model may compute the value (V) of the expanded third Go board state (S1-2-1) and, through the backup step (c), store the action value function (Q), the number of visits (N), and the move probability value (p) of the branches traversed. During the move preparation time, the move model repeats the selection (a), expansion and evaluation (b), and backup (c) steps, builds a probability distribution from the number of visits (N) for each point, and may output the search probability value (π). The move model may place its stone at the point with the highest search probability value (π).
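The selection and backup steps can be sketched for a single tree node as follows. The description only states that U is proportional to p and inversely proportional to N, so the concrete form `c_puct * p / (1 + N)`, the constant `c_puct`, and the names `Edge`, `select`, `backup`, and `most_visited` are assumptions of this sketch. Expansion and network evaluation are omitted.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    p: float        # move probability value from the network
    n: int = 0      # number of visits N through this branch
    w: float = 0.0  # running sum of values V backed up through this branch

    @property
    def q(self) -> float:
        # Q is the average of the values seen on this branch (0 before any visit).
        return self.w / self.n if self.n else 0.0


def confidence(edge: Edge, c_puct: float = 1.0) -> float:
    # Assumed form of U: grows with p, shrinks as the branch is visited more.
    return c_puct * edge.p / (1 + edge.n)


def select(edges: dict, c_puct: float = 1.0) -> str:
    # Selection step: pick the branch with the highest Q + U.
    return max(edges, key=lambda m: edges[m].q + confidence(edges[m], c_puct))


def backup(edge: Edge, value: float) -> None:
    # Backup step: record one more visit and accumulate the evaluated value.
    edge.n += 1
    edge.w += value


def most_visited(edges: dict) -> str:
    # Final choice: the point with the highest visit share.
    return max(edges, key=lambda m: edges[m].n)


edges = {"a": Edge(p=0.6), "b": Edge(p=0.4)}
first = select(edges)          # "a": highest prior, nothing visited yet
backup(edges[first], -1.0)     # a poor evaluation lowers Q for "a"
second = select(edges)         # "b" now has the higher Q + U
```

Because U shrinks with each visit, branches that have been explored many times need a high Q to keep being selected, which balances exploration against exploitation.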

<Situation judgment model>

FIG. 6 is an exemplary view showing a screen that provides the situation judgment function of the deep learning-based Go game service according to an embodiment of the present invention. FIG. 7 is a diagram for explaining the structure of the situation judgment model of the situation judgment model server of the present invention. FIG. 8 is a diagram for explaining one block of the neural network structure, composed of a plurality of blocks, of the situation judgment model of the present invention.

Referring to FIG. 6, the deep learning-based Go game service according to an embodiment of the present invention can judge the situation of the current Go board state. For example, as shown in FIG. 6, when a user requests a situation judgment by clicking the situation judgment menu (A) on the screen of the terminal 100 during a game, the deep learning-based Go game service may present the situation judgment result in a pop-up window. Situation judgment means counting the opponent's territory and one's own territory during a game to determine who is winning and by how many points. For example, a user who judges the situation to be favorable will avoid overplaying and adopt a strategy of ending the game while preserving the current advantage, whereas a user who judges it to be unfavorable can look for various strategies to change the course of the game. The criteria for situation judgment are territory, dead stones, stones, neutral points (dame), and seki, according to how the stones are placed on the board. Stones are the stones placed on the board and are not points under Korean rules. Territory is an area of empty points surrounded by stones of a single color and counts as points under Korean rules. Neutral points and seki are areas that are neither black territory nor white territory when the game ends and are not points under Korean rules. Dead stones on the board are stones that will be captured however play continues. Under Korean rules they are used to fill the opponent's territory and therefore count as points. Seki refers to an area that is neither black territory nor white territory when the game ends. Therefore, an accurate situation judgment requires accurately distinguishing or predicting territory, dead stones, stones, neutral points, and seki from the board state in which the stones are placed. Here, accurately distinguishing them means identifying states in which territory, dead stones, stones, neutral points, and seki are fully settled, and accurately predicting them may mean predicting states that are highly likely to become territory, dead stones, stones, neutral points, or seki.

Referring to FIG. 7, the position judgment model according to an embodiment of the present invention is a deep learning model of the position judgment model server 400 and may include a position judgment neural network 410, an input feature extraction unit 420, and a ground-truth label generation unit 430.

The position judgment model may perform supervised learning so that it can judge the position of the current board state using the position judgment neural network 410. More specifically, the position judgment model may generate a training data set for board states (S) and use it to train the position judgment neural network 410 to judge the position corresponding to the current board state (S). The position judgment model server 400 may receive a plurality of game records from the Go server 200. Each game record may include the board state (S) after each move, in move order.

The input feature extraction unit 420 may extract input features (IF) from the board states (S) of the plurality of game records and provide them to the position judgment neural network 410 as input data for training. The input features (IF) of a board state (S) may be a 19*19*18 RGB image containing the stone positions for the black player's last 8 moves, the stone positions for the white player's last 8 moves, and turn information indicating whether the current player is Black or White. For example, the input feature extraction unit 420 may have a neural network structure and may include a kind of encoder.
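
As a rough illustration of the input features described above, the following Python sketch stacks 18 binary 19*19 planes. The plane order (8 Black history planes, 8 White history planes, 2 turn planes) and the names `encode_input_features`, `history`, and `to_play` are assumptions made for illustration; the text only states that the features cover the last 8 moves of each player plus turn information in a 19*19*18 layout.

```python
BOARD_SIZE = 19
HISTORY = 8  # last 8 positions per player, as stated in the text


def empty_plane(fill=0):
    """Return a 19x19 plane filled with a constant."""
    return [[fill] * BOARD_SIZE for _ in range(BOARD_SIZE)]


def encode_input_features(history, to_play):
    """Build 18 planes from board snapshots.

    history: list of snapshots, most recent first; each snapshot is a
             19x19 grid of 'B', 'W' or '.'.
    to_play: 'B' or 'W'.
    Assumed layout: planes 0-7 Black, 8-15 White, 16-17 turn indicators.
    """
    planes = []
    for color in ("B", "W"):
        for t in range(HISTORY):
            plane = empty_plane()
            if t < len(history):  # missing history stays all-zero
                for r in range(BOARD_SIZE):
                    for c in range(BOARD_SIZE):
                        if history[t][r][c] == color:
                            plane[r][c] = 1
            planes.append(plane)
    planes.append(empty_plane(1 if to_play == "B" else 0))
    planes.append(empty_plane(1 if to_play == "W" else 0))
    return planes


board = [["."] * BOARD_SIZE for _ in range(BOARD_SIZE)]
board[3][3] = "B"
board[15][15] = "W"
features = encode_input_features([board], to_play="B")
print(len(features), len(features[0]), len(features[0][0]))  # 18 19 19
```

Keeping one plane per color and time step lets a convolutional network read recent move history directly from the channel dimension.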

The ground-truth label generation unit 430 may generate a ground-truth label by preprocessing the current board state (S) and provide the ground-truth label to the position judgment neural network 410 as target data for training. The generation of the ground-truth label by the ground-truth label generation unit 430 follows the description of FIGS. 9 to 11 below. For example, the ground-truth label generation unit 430 may include a rollout or an encoder having a neural network structure.

The position judgment model may use a training data set, with the input features (IF) as input data and the ground-truth label as target data, to sufficiently train the position judgment neural network 410 so that the output data (o) generated by the position judgment neural network 410 becomes equal to the target data. For example, the position judgment model may use a position judgment loss. The position judgment loss may use the mean square error. For example, the position judgment loss is given by Equation 2.

(Equation 2)
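
A mean square error consistent with the definitions given for Equation 2 can be written as follows. The symbols $L$ for the position judgment loss and $y_i$ for the target position value at intersection $i$ are assumptions introduced here, since the text does not spell them out; $o_i$ follows the text's notation (o) for the output data, and $B$ is the total number of intersections.

```latex
L = \frac{1}{B} \sum_{i=1}^{B} \left( y_i - o_i \right)^2
```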

B is the total number of intersections on the board. On a board with 19 horizontal and 19 vertical lines, the lines cross to form 361 intersections. The board is not limited to this size; a board with 9 horizontal and 9 vertical lines has 81 intersections.

The target data term in Equation 2 is the position value, according to the ground-truth label, for a given intersection (i) in the current board state (S). The position value is described with reference to FIG. 11 below.

The output data term in Equation 2 is the output data produced when a given intersection (i) of the current board state (S) is input to the position judgment neural network 410. The position judgment model may train the position judgment neural network 410 by adjusting the weights and bias values in the position judgment neural network 410 using gradient descent and backpropagation so that the position judgment loss is minimized.
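
The loss described above can be sketched in a few lines of Python. This is only an illustration of a mean square error over per-intersection values; the function names and the sample numbers are hypothetical, and the gradient shown is the derivative of the loss with respect to each output, which is the quantity backpropagation would pass back into the network.

```python
def position_judgment_loss(target, output):
    """Mean square error over all B intersections."""
    if len(target) != len(output):
        raise ValueError("target and output must have the same length")
    b = len(target)
    return sum((y - o) ** 2 for y, o in zip(target, output)) / b


def loss_gradient(target, output):
    """Derivative of the loss with respect to each output value."""
    b = len(target)
    return [2 * (o - y) / b for y, o in zip(target, output)]


target = [1, 1, 0, -1]          # ground-truth position values
output = [0.5, 1.0, 0.0, -0.5]  # hypothetical network outputs
print(position_judgment_loss(target, output))  # 0.125
print(loss_gradient(target, output))           # [-0.25, 0.0, 0.0, 0.25]
```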

The position judgment neural network 410 may have a neural network structure. For example, the position judgment neural network 410 may consist of 19 residual blocks. Referring to FIG. 8, one residual block may be arranged in the following order: a 3X3 convolution layer with 256 filters, a batch normalization layer, a ReLU activation layer, a 3X3 convolution layer with 256 filters, a batch normalization layer, a skip connection, and a ReLU activation layer. The batch normalization layer is intended to prevent covariate shift, in which the input distribution of the current layer changes during training because the parameters of the preceding layer change. The skip connection prevents the performance of the network from degrading as the stack of blocks gets deeper, which allows more blocks to be stacked to improve overall performance. The skip connection may take the form of adding the initial input of the residual block to the output of the second batch normalization layer and feeding the sum to the second ReLU activation layer.
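
The layer order of one residual block can be sketched in plain Python as below. This is a deliberately simplified, single-channel illustration and not the network of the embodiment: it uses one 3x3 kernel per convolution instead of 256 filters, and `batch_norm` normalizes a single plane without learned scale or shift. All function names are hypothetical.

```python
import math


def conv3x3(x, kernel):
    """3x3 convolution with zero padding on a single-channel 2-D grid."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += x[rr][cc] * kernel[dr + 1][dc + 1]
            out[r][c] = acc
    return out


def batch_norm(x, eps=1e-5):
    """Normalize the grid to zero mean and unit variance (simplified)."""
    values = [v for row in x for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [[(v - mean) / math.sqrt(var + eps) for v in row] for row in x]


def relu(x):
    return [[max(0.0, v) for v in row] for row in x]


def residual_block(x, kernel1, kernel2):
    """conv -> BN -> ReLU -> conv -> BN -> add input (skip) -> ReLU."""
    y = relu(batch_norm(conv3x3(x, kernel1)))
    y = batch_norm(conv3x3(y, kernel2))
    y = [[a + b for a, b in zip(ry, rx)] for ry, rx in zip(y, x)]  # skip connection
    return relu(y)


x = [[0.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 0.0]]
blur = [[1 / 9] * 3 for _ in range(3)]
zero = [[0.0] * 3 for _ in range(3)]
out = residual_block(x, blur, zero)
# With an all-zero second kernel the residual branch contributes nothing,
# so the non-negative input passes through the skip connection unchanged.
print(out == x)  # True
```

The final check shows why the skip connection helps deep stacks: a block whose learned branch outputs zero still passes its input through intact.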

FIGS. 9 and 10 illustrate the first and second preprocessing steps for generating the ground-truth labels used to train the position judgment model of the present invention, and FIG. 11 illustrates the third preprocessing step for generating those ground-truth labels.

The ground-truth label generation unit 430 may generate the ground-truth labels used to train the position judgment neural network 410 to judge positions accurately.

More specifically, the ground-truth label generation unit 430 may receive as input the board state (S) on which the input data is based and perform a first preprocessing, playing out the endgame from the current board state (S), to generate a first preprocessed state (P1). The endgame playout of the first preprocessing is the process of finishing the game by playing certain moves so that territory boundaries become clear before territory is counted. For example, referring to FIG. 9, the ground-truth label generation unit 430 may play out the endgame from the current board state (S) of FIG. 9(a) to generate the first preprocessed state (P1) of FIG. 9(b).

The ground-truth label generation unit 430 may perform a second preprocessing on the first preprocessed state (P1), removing stones that lie inside territory boundaries and are unnecessary for delimiting territory, to generate a second preprocessed state (P2). For example, such a stone may be a dead stone. As described above, a dead stone is an opponent's stone inside a territory that will be captured no matter how play continues. Such a stone may also be one of a player's own stones placed inside that player's territory. For example, referring to FIG. 9, the ground-truth label generation unit 430 may remove the stones unnecessary for delimiting territory from the first preprocessed state (P1) of FIG. 9(b) to generate the second preprocessed state (P2) of FIG. 9(c).

As another example, referring to FIG. 10, the ground-truth label generation unit 430 may play moves at the points marked with a red x in FIG. 10(b) to play out the endgame, which is the first preprocessing, from the current board state (S) of FIG. 10(a). To remove the dead stones marked with a blue x in FIG. 10(b), the ground-truth label generation unit 430 may play at the points marked with a green x, remove the dead stones, and then also remove the stones played at the green x points, thereby performing the second preprocessing.

The ground-truth label generation unit 430 may perform a third preprocessing that converts each intersection of the second preprocessed state (P2) into a position value (g, where g is an integer) from -1 to +1. That is, in the third preprocessing the ground-truth label generation unit 430 converts the second preprocessed state (P2), which is an image feature, into a third preprocessed state (P3), which is a numerical feature. For example, in the second preprocessed state (P2), an intersection may be mapped to 0 if it holds my stone, +1 if it is my territory, 0 if it holds an opponent's stone, and -1 if it is the opponent's territory. In this case, the position judgment neural network 410 can be trained to distinguish territory, stones, and dead stones when judging a position. As another example, an intersection may be mapped to 0 if it holds my stone, +1 if it is my territory, 0 if it holds an opponent's stone, -1 if it is the opponent's territory, and 0 if it is seki or dame. In this case, the position judgment neural network 410 can be trained to distinguish seki or dame when judging a position. For example, referring to FIG. 11, the ground-truth label generation unit 430 may convert the second preprocessed state (P2) of FIG. 11(a) into the third preprocessed state (P3) of FIG. 11(b).
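
The third preprocessing amounts to a per-intersection lookup, which the following Python sketch illustrates. The single-character symbols used to encode the second preprocessed state and the name `third_preprocessing` are assumptions for illustration; the numeric values follow the second mapping given in the text.

```python
# Assumed symbols for the second preprocessed state (P2):
# 'M' my stone, 'O' opponent stone, 'm' my territory,
# 'o' opponent territory, '.' dame or seki.
POSITION_VALUE = {"M": 0, "m": 1, "O": 0, "o": -1, ".": 0}


def third_preprocessing(p2):
    """Convert each intersection of P2 into an integer position value."""
    return [[POSITION_VALUE[cell] for cell in row] for row in p2]


p2 = ["mmMOo", "mMMOo", "M.OOo"]  # a toy 3x5 excerpt, not a full board
p3 = third_preprocessing(p2)
print(p3)  # [[1, 1, 0, 0, -1], [1, 0, 0, 0, -1], [0, 0, 0, 0, -1]]
```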

The third preprocessed state (P3) serves as the ground-truth label for position judgment of the board state (S) and may be used as target data when training the position judgment neural network 410.

FIG. 12 is a diagram for explaining the position judgment result of the position judgment model of the present invention.

When a board state is input, the trained position judgment model can provide position values for all intersections of the board. That is, it can provide a position value, an integer from -1 to +1, for each of the 361 intersections of the board.

Referring to FIG. 12, the position judgment model server 400 may judge the position using the position values provided by the position judgment model, predetermined thresholds, and the presence or absence of a stone. For example, for a point with no stone, the position judgment model server 400 may judge that the point is likely to become my territory if the position value exceeds a first threshold, and that it is my territory if the value is close to +1. The position judgment model server 400 may mark such a point with a square in the color of my stones that grows larger as the likelihood of my territory increases. For a point with no stone, the position judgment model server 400 may judge that the point is likely to become the opponent's territory if the position value is at or below a second threshold, and that it is the opponent's territory if the value is close to -1. The position judgment model server 400 may mark such a point with a square in the color of the opponent's stones that grows larger as the likelihood of the opponent's territory increases. For a point with no stone, the position judgment model server 400 may judge the point to be dame or seki if the position value is within a third threshold range or close to 0, and may mark it with an X. For a point with a stone, the position judgment model server 400 may judge the point to be my stone or the opponent's stone if the position value is within the third threshold range or close to 0, and in that case may display no mark. For a point with a stone, the position judgment model server 400 may judge that the stone is likely to become a dead stone of the opponent if the position value exceeds the first threshold, and that it is a dead stone of the opponent if the value is close to +1. The position judgment model server 400 may mark it with a square in the color of my stones that grows larger as the likelihood of it being the opponent's dead stone increases. For a point with a stone, the position judgment model server 400 may judge that the stone is likely to become a dead stone of mine if the position value is at or below the second threshold, and that it is a dead stone of mine if the value is close to -1. The position judgment model server 400 may mark it with a square in the color of the opponent's stones that grows larger as the likelihood of it being my dead stone increases.
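
The case analysis above can be sketched as a small decision function. The threshold values 0.3 and -0.3, the returned label strings, and the name `judge_point` are assumptions for illustration; the text only specifies that a first threshold, a second threshold, and a third range around 0 are used together with the presence or absence of a stone.

```python
def judge_point(value, has_stone, t1=0.3, t2=-0.3):
    """Classify one intersection from my point of view.

    t1 and t2 stand in for the first and second thresholds (assumed values);
    anything between them is treated as the third range around 0.
    """
    if value > t1:
        return "opponent dead stone" if has_stone else "my territory"
    if value <= t2:
        return "my dead stone" if has_stone else "opponent territory"
    return "stone" if has_stone else "dame or seki"


def marker_scale(value):
    """Square size grows with the likelihood, taken here as |value|."""
    return min(1.0, abs(value))


print(judge_point(0.9, has_stone=False))   # my territory
print(judge_point(-0.8, has_stone=True))   # my dead stone
print(judge_point(0.05, has_stone=False))  # dame or seki
```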

In addition, the position judgment model server 400 may display the scoring result for the current board state using the position judgment made at each intersection.

In addition, the position judgment model server 400 may generate information on the change in territory count and information on the number of dame according to the board state. For example, the position judgment model server 400 may generate the change in territory count from the position judgment result for the board state after the previous move and the position judgment result for the current board state. The position judgment model server 400 may also generate the dame count from the position judgment result for the board state.
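
A minimal sketch of how the territory change and dame count could be derived from two per-intersection judgment results is shown below. The label strings `'mine'`, `'theirs'`, `'dame'`, and `'stone'` and both function names are hypothetical; the text does not specify a data format.

```python
def summarize(labels):
    """Count territory and dame from per-intersection judgments."""
    return {
        "mine": labels.count("mine"),
        "theirs": labels.count("theirs"),
        "dame": labels.count("dame"),
    }


def territory_change(prev_labels, curr_labels):
    """Territory difference between the previous and current board states."""
    prev, curr = summarize(prev_labels), summarize(curr_labels)
    return {side: curr[side] - prev[side] for side in ("mine", "theirs")}


prev_labels = ["mine", "mine", "dame", "theirs", "stone"]
curr_labels = ["mine", "mine", "mine", "theirs", "stone"]
print(territory_change(prev_labels, curr_labels))  # {'mine': 1, 'theirs': 0}
print(summarize(curr_labels)["dame"])              # 0
```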

Therefore, the deep learning-based Go game service device according to the embodiment can judge a Go position using a deep learning neural network. The device can also judge the position accurately by accurately distinguishing territory, dead stones, stones, dame, and seki according to the rules of Go. The device can likewise judge the position accurately by predicting territory, dead stones, stones, dame, and seki according to the rules of Go. In addition, the device can judge the position quickly during a game.

FIGS. 13, 14, and 15 each compare the position judgment result of the position judgment model of the present invention with the position judgment result of a deep learning model according to the prior art.

Referring to FIG. 13, the position judgment model of the present invention judges the position by distinguishing territory, stones, and dead stones at each intersection, as in region B of FIG. 13(a). In contrast, the prior-art deep learning model fails to distinguish territory, stones, and dead stones at the intersections of the corresponding region in FIG. 13(b).

Likewise, referring to FIG. 14, the position judgment model of the present invention judges the position by distinguishing territory, stones, and dead stones at each intersection, as in region C of FIG. 14(a). In contrast, the prior-art deep learning model fails to distinguish territory, stones, and dead stones at the intersections of the region in FIG. 14(b) that corresponds to FIG. 14(a).

Referring to FIG. 15, the position judgment model of the present invention correctly recognizes White's territory, as in region D of FIG. 15(a). In contrast, the prior-art deep learning model fails to identify White's territory in the region of FIG. 15(b) that corresponds to FIG. 15(a).

FIG. 16 is an exemplary diagram of the signal flow in a deep learning-based Go game service system according to an embodiment of the present invention.

Referring to FIG. 16, the move model server 300, as an artificial intelligence computer, may train a move model, which is a deep learning model, by teaching itself according to the rules of Go so that it can play moves on its turn that lead to winning the game (S11). The Go server 200 may transmit a plurality of game records to the position judgment model server 400. The position judgment model server 400 may generate a training data set. First, the position judgment model server 400 may extract input features from the board states of the game records (S13). The position judgment model server 400 may generate ground-truth labels using the board states from which the input features were extracted (S14). The position judgment model server 400 may train the position judgment model using a training data set with the input features as input data and the ground-truth labels as target data (S15). The terminal 100 may request from the Go server 200 a game against the artificial intelligence computer or against another user's terminal (S16). When the terminal 100 requests a game against the artificial intelligence computer, the Go server 200 may request moves from the move model server 300 (S17). The Go server 200 runs the game, and the terminal 100 and the move model server 300 may each play a move on their own turn (S18 to S20). During the game, the terminal 100 may request a position judgment from the Go server 200 (S21). The Go server 200 may request from the position judgment model server 400 a position judgment for the current board state (S22). The position judgment model server 400 may extract the input features of the current board state, have the position judgment model, which is a deep learning model, generate position values from the input features, and perform the position judgment using the board state and the position values (S23). The position judgment model server 400 may provide the position judgment result to the Go server 200 (S24). The Go server 200 may provide the position judgment result to the terminal 100 (S25).

FIG. 17 shows a position judgment method within the deep learning-based Go game service method according to an embodiment of the present invention, and FIG. 18 shows a method of preprocessing training data to generate ground-truth labels within the position judgment method of FIG. 17.

Referring to FIG. 17, the deep learning-based Go game service method according to an embodiment of the present invention may include a step (S100) in which the position judgment model server receives a plurality of game records from the Go server.

The deep learning-based Go game service method according to an embodiment of the present invention may include a step (S200) in which the input feature extraction unit of the position judgment model in the position judgment model server extracts input features from the board states of the plurality of game records. The method of extracting the input features follows the description of FIG. 7.

The deep learning-based Go game service method according to an embodiment of the present invention may include a step (S300) in which the ground-truth label generation unit of the position judgment model generates ground-truth labels based on the board states from which the input features were extracted. For example, referring to FIG. 18, the ground-truth label generation step (S300) may include a first preprocessing step (S301) in which the ground-truth label generation unit plays out the endgame from the current board state. The first preprocessing step (S301) follows the description of FIGS. 9 and 10. The ground-truth label generation step (S300) may include a second preprocessing step (S302) in which the ground-truth label generation unit removes unnecessary stones from the first preprocessed board state. The second preprocessing step (S302) follows the description of FIGS. 9 and 10. The ground-truth label generation step (S300) may include a third preprocessing step (S303) in which the ground-truth label generation unit converts each intersection of the second preprocessed board state into a position value. The third preprocessing step (S303) follows the description of FIG. 11. The ground-truth label generation step (S300) may include a step (S304) of providing the third preprocessed state to the position judgment neural network as target data, using it as the ground-truth label. The step of providing the target data (S304) follows the description of FIGS. 7 and 11.

The deep learning-based Go game service method according to an embodiment of the present invention may include a step (S400) of training the position judgment neural network of the position judgment model using the training data set. The method of training the position judgment neural network follows the description of FIG. 7.

The deep learning-based Go game service method according to an embodiment of the present invention includes a step (S500) of building the position judgment model once training of the position judgment neural network is complete. For example, training of the position judgment neural network may be regarded as complete when the position judgment loss described with FIG. 7 falls to or below a predetermined value.
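
The stopping rule above (train until the loss falls to or below a predetermined value) can be illustrated with a toy gradient-descent loop. This sketch fits a two-parameter linear model in place of the position judgment neural network, which is a substitution made purely for brevity; the function name, learning rate, threshold, and data are all hypothetical.

```python
def train_until_threshold(xs, ys, loss_threshold=1e-4, lr=0.1, max_steps=10_000):
    """Gradient descent on a toy model o = w * x + b with an MSE loss.

    Stops as soon as the loss is at or below loss_threshold.
    Returns (w, b, loss, steps_taken).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    loss = float("inf")
    for step in range(max_steps):
        outputs = [w * x + b for x in xs]
        loss = sum((y - o) ** 2 for y, o in zip(ys, outputs)) / n
        if loss <= loss_threshold:  # training is regarded as complete
            return w, b, loss, step
        grad_w = sum(2 * (o - y) * x for x, y, o in zip(xs, ys, outputs)) / n
        grad_b = sum(2 * (o - y) for y, o in zip(ys, outputs)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss, max_steps


xs = [-1.0, 0.0, 1.0]
ys = [-1.0, 0.0, 1.0]  # targets in the same -1..+1 range as position values
w, b, loss, steps = train_until_threshold(xs, ys)
print(loss <= 1e-4, steps < 10_000)  # True True
```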

The deep learning-based Go game service method according to an embodiment of the present invention may include a step (S600) in which the current board state is input to the position judgment model in response to a position judgment request from the terminal.

The deep learning-based Go game service method according to an embodiment of the present invention may include a step (S700) in which the position judgment model performs a position judgment on the input current board state. The step of performing the position judgment (S700) may follow the description, given with FIG. 12, of the position judgment model generating position values for the current board state.

The deep learning-based Go game service method according to an embodiment of the present invention may include a step (S800) in which the position judgment model server outputs the position judgment result. The step of outputting the position judgment result (S800) may follow the description, given with FIG. 12, of the position judgment model server providing the position judgment result using the position values, the board state, and predetermined thresholds.

Therefore, the deep learning-based Go game service method according to the embodiment can judge a Go position using a deep learning neural network. The method can also judge the position accurately by accurately distinguishing territory, dead stones, stones, dame, and seki according to the rules of Go. The method can likewise judge the position accurately by predicting territory, dead stones, stones, dame, and seki according to the rules of Go. In addition, the method can judge the position quickly during a game.

<Time management model server according to an embodiment>

FIG. 19 is a diagram for explaining the time management unit of the time management model server according to an embodiment of the present invention, and FIG. 20 is a diagram for explaining the variance calculation of the time management unit according to an embodiment of the present invention.

In the deep learning-based Go game service according to an embodiment of the present invention, the time management model server 500 may determine the move preparation time, which is one item of time management information, and the move model may make a move based on a preset or determined move preparation time. When a game between a user and an artificial intelligence computer, or between artificial intelligence computers, is closely contested, the time management model server 500 may increase the move preparation time so that the artificial intelligence computer can search for and play a better move.

More specifically, referring to FIG. 19, the time management model server 500 may receive the search probability value, the number of visits (N), or the value (V) from the move model server 300, and may provide the move preparation time (PP) to the move model server 300. For example, the move preparation time (PP) may include an average move preparation time, a first move preparation time, or a second move preparation time. The first move preparation time may be longer than the average move preparation time, and the second move preparation time may be longer than the first move preparation time. The time management model server 500 may include a time management unit 510. The time management unit 510 may determine the move preparation time (PP) based on the search probability value and the number of visits (N). The time management unit 510 may calculate a variance using the search probability value and the number of visits (N), and may determine the move preparation time by comparing the variance with a threshold variance. That is, the lower the variance across the candidate moves, the more likely it is that the candidate move is not the best move, and in light of this the current game is more likely to be closely contested. Accordingly, when the variance is lower than the threshold variance, the time management unit may lengthen the move preparation time so that the move model can play a better move. The variance can be obtained using Equation 3 and Equation 4.

(Equation 3)

In Equation 3, n is the number of intersections, and the remaining symbols denote the number of visits to each intersection, the search probability value for each intersection, and the average number of visits.

(Equation 4)

In Equation 4, Var is the variance.

The time management unit 510 may set the move preparation time to the first move preparation time when the variance (Var) is lower than the threshold variance, and to the average move preparation time when the variance (Var) is not lower than the threshold variance.

For example, referring to FIG. 20, the threshold variance may be 0.05. In the case of FIG. 20(a), the time management unit 510 may calculate the variance (Var) as 0.2109. Because this variance is higher than the threshold variance, the time management unit 510 may set the move preparation time to the average move preparation time. In the case of FIG. 20(b), the time management unit 510 may calculate the variance (Var) as 0.0145. Because this variance is lower than the threshold variance, the time management unit 510 may set the move preparation time to the first move preparation time. The move model can then run MCTS simulations for longer, or run more of them, and select a better candidate move.
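The variance check described above can be sketched in Python as follows. Equations 3 and 4 are not reproduced in this text, so the formula used here is an assumption: visit counts are normalized into search probabilities, and the population variance of those probabilities is compared with the 0.05 threshold variance from the FIG. 20 example. The function names are hypothetical, and the sketch is not expected to reproduce the 0.2109 and 0.0145 values of FIG. 20.

```python
from statistics import pvariance


def search_probabilities(visits):
    """Normalize per-intersection visit counts into probabilities (assumed form)."""
    total = sum(visits)
    if total == 0:
        raise ValueError("at least one visit is required")
    return [v / total for v in visits]


def visit_variance(visits):
    """Population variance of the normalized visit distribution (an assumption)."""
    return pvariance(search_probabilities(visits))


def needs_more_time(visits, threshold_variance=0.05):
    """True when the variance is below the threshold, i.e. no move stands out."""
    return visit_variance(visits) < threshold_variance


# Visits concentrated on one candidate: high variance, average time suffices.
print(needs_more_time([100, 0, 0, 0]))
# Visits spread evenly: low variance, a longer preparation time is selected.
print(needs_more_time([10, 10, 10, 10]))
```

Under this assumed formula, a search that concentrates its visits on one intersection yields a high variance, while an evenly spread search yields a variance near zero, which matches the direction of the rule stated in the text.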

In addition, the time management unit 510 may determine the move preparation time (PP) based on the value (V). More specifically, if the value (V) is less than or equal to a threshold value after the first move preparation time has been selected, the time management unit 510 may set the move preparation time (PP) to the second move preparation time. For example, the threshold value may be 50.0%. That is, a low value for a candidate move means that the game is likely being lost, so the time management unit 510 allows the move model to run MCTS simulations for longer, or to run more of them, in order to find a better candidate move.

Therefore, the deep learning-based Go game service device according to the embodiment can manage time in a Go game. In addition, the device can change the move preparation time at critical points in the game.

FIG. 21 is an exemplary diagram of the signal flow of the time management model server in the Go game service system according to an embodiment of the present invention.

Referring to FIG. 21, the move model server 300 may perform MCTS simulations during a game to find a move that wins the game (S2101). The move model server 300 may transmit the search probability value, the number of visits (N), and the value (V) generated by the MCTS simulations to the time management model server 500 (S2102). The time management model server 500 may determine the move preparation time based on the received search probability value, number of visits (N), and value (V) (S2103). The time management model server 500 may transmit the determined move preparation time to the move model server 300 (S2104). The move model server 300 may make a move based on the received move preparation time or a preset move preparation time (S2105). Between the two, the move model server 300 may give priority to the received move preparation time.
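The priority rule in step S2105 can be illustrated with a short Python sketch. The function name and the use of `None` to represent "nothing received" are assumptions made for illustration only.

```python
from typing import Optional


def effective_prep_time(received: Optional[float], preset: float) -> float:
    """Prefer the time received from the time management model server (S2104);
    fall back to the preset time only when nothing was received."""
    return preset if received is None else received


print(effective_prep_time(20.0, 10.0))  # received time takes priority
print(effective_prep_time(None, 10.0))  # preset time is used as a fallback
```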

FIG. 22 illustrates a method for determining the move preparation time in the deep learning-based Go game service method according to an embodiment of the present invention.

Referring to FIG. 22, the deep learning-based Go game service method according to an embodiment of the present invention may include a step (S2201) in which the time management model server 500 receives the search probability value, the number of visits (N), or the value (V).

The method according to an embodiment of the present invention may include a step (S2202) in which the time management unit of the time management model server 500 calculates the variance based on the search probability value and the number of visits (N). The variance is calculated as described with reference to FIGS. 19 and 20.

The method according to an embodiment of the present invention may include a step (S2203) in which the time management unit of the time management model server 500 determines whether the calculated variance is less than the threshold variance.

The method according to an embodiment of the present invention may include a step (S2204) in which the time management unit sets the move preparation time to the average move preparation time if the variance is not less than the threshold variance. The average move preparation time is selected as described with reference to FIGS. 19 and 20.

The method according to an embodiment of the present invention may include a step (S2205) in which the time management unit sets the move preparation time to the first move preparation time if the variance is less than the threshold variance. The first move preparation time is selected as described with reference to FIGS. 19 and 20.

The method according to an embodiment of the present invention may include a step (S2206) in which the time management unit, after setting the move preparation time to the first move preparation time, determines whether the value is less than or equal to the threshold value. This determination follows the description given with reference to FIGS. 19 and 20. If the value is not less than or equal to the threshold value, the time management unit may keep the first move preparation time as the move preparation time.

The method according to an embodiment of the present invention may include a step (S2207) in which the time management unit sets the move preparation time to the second move preparation time if the value is less than or equal to the threshold value. The second move preparation time is selected as described with reference to FIGS. 19 and 20.

The method according to an embodiment of the present invention may include a step (S2208) in which the time management unit of the time management model server 500 transmits the determined move preparation time.
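Steps S2203 to S2207 amount to a two-stage threshold decision, sketched below in Python. The 0.05 threshold variance and the 50% threshold value come from the examples in the text; the three durations (10, 20, and 30 seconds) and the function name are hypothetical placeholders, chosen only so that the first time is longer than the average and the second is longer than the first.

```python
def choose_prep_time(
    variance,
    value,
    *,
    threshold_variance=0.05,  # example threshold from FIG. 20
    threshold_value=0.5,      # example threshold of 50.0%
    average_time=10.0,        # hypothetical duration in seconds
    first_time=20.0,          # hypothetical, longer than average_time
    second_time=30.0,         # hypothetical, longer than first_time
):
    """Return the move preparation time (PP) for the given variance and value."""
    # S2203 -> S2204: variance is not below the threshold, keep the average time.
    if not variance < threshold_variance:
        return average_time
    # S2205 -> S2206 -> S2207: close game and the value is at or below the threshold.
    if value <= threshold_value:
        return second_time
    # S2206: value is above the threshold, keep the first move preparation time.
    return first_time


print(choose_prep_time(0.2109, 0.60))
print(choose_prep_time(0.0145, 0.60))
print(choose_prep_time(0.0145, 0.40))
```

Note that the value check only runs on the low-variance branch, mirroring the order of steps in FIG. 22.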

Therefore, the deep learning-based Go game service method according to an embodiment can manage time in a Go game. In addition, the method can change the move preparation time at critical points in the game.

<Time management model server according to another embodiment>

FIG. 23 is a diagram for explaining the time adjustment unit of the time management model server according to another embodiment of the present invention, FIG. 24 illustrates a method for determining the move preparation time in the deep learning-based Go game service method according to another embodiment of the present invention, FIG. 25 is an example diagram for explaining a method of determining the move preparation time according to the trend of the value according to another embodiment of the present invention, and FIG. 26 is an example diagram for explaining a method of determining the move preparation time according to the magnitude of change of the value according to another embodiment of the present invention.

Referring to FIG. 23, in the deep learning-based Go game service according to another embodiment of the present invention, the time management model server 500 may determine the move preparation time (PP), which is one item of time management information, using the value (V) and/or the opponent move time (OT, that is, the opponent's move preparation time). In addition, the Go game service according to this embodiment may make a move based on a preset or determined move preparation time (PP).

Here, when a game between a user and an artificial intelligence computer, or between artificial intelligence computers, reaches a critical phase (for example, when the game is closely contested), the time management model server 500 may increase the move preparation time (PP) so that the artificial intelligence computer can search for and play a better move.

In detail, the time management model server 500 may include a time adjustment unit 550. The time adjustment unit 550 may determine the move preparation time (PP) based on the value (V) and/or the opponent move time (OT).

Specifically, referring to FIG. 24, in the deep learning-based Go game service method according to another embodiment of the present invention, the time management model server 500 may determine the move preparation time (PP) based on the value (V).

In detail, the time management model server 500 may determine the move preparation time (PP) according to changes in the value (V), that is, changes in the win rate.

In more detail, the method according to another embodiment of the present invention may include a step (S2301) in which the time management model server 500 receives a plurality of values (V).

That is, the time management model server 500 may receive from the move model server 300 a plurality of values (V), one for each move of the game.

In addition, the method according to another embodiment of the present invention may include a step (S2302) in which the time management model server 500 accumulates and stores the received values (V).

In detail, the time management model server 500 may accumulate and store, in order, the values (V) received from the move model server 300.

In addition, the method according to another embodiment of the present invention may include a step (S2303) in which the time management model server 500 determines the change in win rate based on the accumulated values (V).

Specifically, the time management model server 500 may identify the trend and/or the magnitude of change of the value (V) based on the accumulated values (V).

Here, the trend of the value (V) may mean the rate of change of the value (V), indicating whether the value (V) (that is, the win rate) is gradually rising or falling. In detail, the trend represents how the plurality of values (V) have changed from a predetermined move after the start of the game up to the current move. More specifically, when the rate of change for each of the plurality of values (V) is negative a predetermined number of times or more, that is, when the value (V) decreases a predetermined number of times or more, the value (V) may be judged to be gradually falling. Conversely, when the rate of change for each of the plurality of values (V) is positive a predetermined number of times or more, that is, when the value (V) increases a predetermined number of times or more, the value (V) may be judged to be gradually rising. The trend of the value (V) may also be determined from the slope of its rate of change.

For example, the trend of the value (V) may be the rate of change and/or slope of the value (V) over the five most recent moves (that is, whether the value (V) is rising or falling).

Meanwhile, the magnitude of change of the value (V) may mean the amount (degree) by which the value (V) changes.

In addition, the time management model server 500 may determine the change in win rate from the accumulated values (V) based on the identified trend and/or magnitude of change.

Next, the method according to another embodiment of the present invention may include a step (S2304) in which the time management model server 500 determines the move preparation time (PP) according to the change in win rate determined as above.

In detail, referring to FIG. 25, the time management model server 500 may determine the move preparation time (PP) according to the trend of the value (V).

In more detail, when the trend of the plurality of values (V) is a continuous decline, the time management model server 500 may increase the move preparation time (PP). This reflects the view that a continuously falling win rate marks a critical phase in which more move preparation time (PP) should be allocated to find a better move.

For example, when the values (V) for the five most recent moves (e.g., moves 72, 74, 76, 78, and 80) show a gradually falling trend (e.g., move 72: 70%, move 74: 67%, move 76: 65%, move 78: 63%, and move 80: 61%), the time management model server 500 may judge this to be a critical phase and increase the move preparation time (PP) for the current move.
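The trend test can be sketched in Python by counting falling and rising steps over the most recent values. The five-move window follows the example in the text; the `min_count` parameter stands in for the unspecified "predetermined number of times", and its default of 3 and the function name are assumptions.

```python
def value_trend(values, window=5, min_count=3):
    """Classify the recent trend of the value (V) as 'falling', 'rising', or 'flat'."""
    recent = values[-window:]
    # Change between each pair of consecutive values in the window.
    deltas = [later - earlier for earlier, later in zip(recent, recent[1:])]
    falls = sum(d < 0 for d in deltas)
    rises = sum(d > 0 for d in deltas)
    if falls >= min_count:
        return "falling"
    if rises >= min_count:
        return "rising"
    return "flat"


# Values (in percent) for moves 72, 74, 76, 78, and 80 from the example.
history = [70, 67, 65, 63, 61]
if value_trend(history) == "falling":
    print("critical phase: increase the move preparation time (PP)")
```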

In addition, referring to FIG. 26, the time management model server 500 may determine the move preparation time (PP) according to the magnitude of change of the value (V).

In detail, when the value (V) drops sharply, by a predetermined criterion (e.g., a preset figure and/or ratio) or more, the time management model server 500 may judge the situation to be a critical phase and allocate more move preparation time (PP) to find a better move.

For example, when the value (V) drops between the two most recent moves (e.g., moves 64 and 66) by the predetermined criterion (e.g., a preset figure and/or ratio) or more (e.g., move 64: 58% and move 66: 35%), the time management model server 500 may judge this to be a critical phase and increase the move preparation time (PP) for the current move.
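A minimal Python sketch of the sharp-drop test follows. The text leaves the criterion unspecified, so the 10 percentage point default for `min_drop` and the function name are assumptions for illustration.

```python
def is_sharp_drop(previous_value, current_value, min_drop=10.0):
    """True when the value (V) fell by at least min_drop percentage points."""
    return (previous_value - current_value) >= min_drop


# Values (in percent) for moves 64 and 66 from the example: a 23-point drop.
print(is_sharp_drop(58, 35))
```

A ratio-based criterion, which the text also allows, could be expressed the same way by comparing `(previous_value - current_value) / previous_value` with a preset ratio.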

Here, when increasing the move preparation time (PP), the time management model server 500 may determine, in a predetermined manner, the amount by which the move preparation time (PP) is to be increased (hereinafter, the move time increment).

In detail, the time management model server 500 may set the move preparation time (PP) determined by the time management unit 510 described above as the base move preparation time (hereinafter, the base move time).

In addition, the time management model server 500 may determine the move time increment 1) based on the base move time.

Specifically, the time management model server 500 may set the move time increment to a figure calculated by a predetermined operation on the base move time.

In an embodiment, the time management model server 500 may determine the move time increment so that the time increases by a predetermined percentage of the base move time (e.g., a percentage no greater than a preset figure). For example, if the base move time is 10 seconds and the predetermined percentage is 30%, the time management model server 500 may set the move time increment to 3 seconds.

In addition, the time management model server 500 may determine the adjusted move preparation time (PP) by applying the move time increment determined as above to the base move time.

Alternatively, the time management model server 500 may determine the move time increment 2) based on the byo-yomi (countdown) time preset in the Go game service.

In detail, the time management model server 500 may take the byo-yomi time preset in the Go game service as the move time increment and add it to the base move time determined by the time management unit 510 to obtain the adjusted move preparation time (PP). For example, if the preset byo-yomi time is 60 seconds and the base move time is 10 seconds, the time management model server 500 may set the final adjusted move preparation time (PP) to 70 seconds.
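The two ways of computing the increment can be written out as a short Python sketch that reuses the figures from the examples (10 seconds base time, 30%, and a 60 second countdown). The function names are hypothetical.

```python
def percent_increment(base_time, percent):
    """Method 1): the increment is a percentage of the base move time."""
    return base_time * percent / 100


def adjusted_time_percent(base_time, percent):
    """Base move time plus the percentage-based increment."""
    return base_time + percent_increment(base_time, percent)


def adjusted_time_byoyomi(base_time, byoyomi_time):
    """Method 2): the preset countdown (byo-yomi) time is used as the increment."""
    return base_time + byoyomi_time


print(percent_increment(10, 30))       # increment of 3 seconds
print(adjusted_time_percent(10, 30))   # adjusted time of 13 seconds
print(adjusted_time_byoyomi(10, 60))   # adjusted time of 70 seconds
```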

Next, the time management model server 500 may provide the move preparation time (PP) determined as above to the move model server 300.

That is, the time adjustment unit 550 of the time management model server 500 allows the move model of the move model server 300, which receives the move preparation time (PP), to run MCTS simulations for longer, or to run more of them, and select a better candidate move.

Therefore, the deep learning-based Go game service device according to the embodiment can manage time in a Go game. In addition, the device can change the move preparation time (PP) at critical points in the game.

FIG. 27 illustrates a method for determining the move preparation time (PP) in the deep learning-based Go game service method according to another embodiment of the present invention, and FIG. 28 is an example diagram for explaining a method of determining the move preparation time (PP) according to the opponent move time (OT) according to another embodiment of the present invention.

한편, 도 27을 참조하면, 본 발명의 다른 실시예에 따른 딥러닝 기반의 바둑 게임 서비스 방법은 시간 관리 모델 서버(500)가 상대 착수시간(OT)을 기초로 착수 준비 시간(PP)을 결정하게 할 수 있다. Meanwhile, referring to FIG. 27, in the deep learning-based Go game service method according to another embodiment of the present invention, the time management model server 500 determines the start preparation time (PP) based on the relative start time (OT). can do

자세히, 본 발명의 다른 실시예에 따른 딥러닝 기반의 바둑 게임 서비스 방법은 시간 관리 모델 서버(500)가 상대방의 직전 착수(즉, 현재시점 바로 이전에 착수된 상대방의 착수)에 소요된 상대 착수시간(이하, 직전 상대 착수시간)과, 상대방이 사용한 복수의 상대 착수시간(OT)의 평균(이하, 평균 상대 착수시간)을 이용하여 착수 준비 시간(PP)을 결정하게 할 수 있다. In detail, in the deep learning-based Go game service method according to another embodiment of the present invention, the time management model server 500 takes the opponent's start immediately before the opponent's start (ie, the opponent's start that started right before the current point in time) The start preparation time (PP) can be determined using the time (hereinafter, the previous relative start time) and the average of the plurality of relative start times (OT) used by the other party (hereinafter, the average relative start time).

구체적으로, 본 발명의 다른 실시예에 따른 딥러닝 기반의 바둑 게임 서비스 방법은 시간 관리 모델 서버(500)가 복수의 상대 착수시간(OT)을 수신하는 단계(S2401)를 포함할 수 있다. Specifically, the deep learning-based Go game service method according to another embodiment of the present invention may include a step of receiving, by the time management model server 500, a plurality of relative start times (OT) (S2401).

보다 상세히, 시간 관리 모델 서버(500)는 착수 모델 서버(300) 및/또는 바둑서버(500)로부터 대국이 진행되는 과정에서 상대방이 수행한 매 착수마다 소요된 상대방의 착수 준비 시간(즉, 상대 착수시간(OT))을 복수 개 수신할 수 있다. In more detail, the time management model server 500 is the start preparation time of the other party (ie, opponent A plurality of start times (OT) may be received.

이때, 상기 수신되는 복수의 상대 착수시간(OT)은 대국 시작 이후 소정의 시점부터 상기 직전 착수 시점 또는 상기 직전 착수 시점을 제외한 그 이전까지의 시점에 대한 모든 상대 착수시간(OT)을 포함할 수 있다. At this time, the received plurality of relative start times (OT) may include all relative start times (OT) for the time from a predetermined time after the game starts to the point before the previous start time or the previous start time excluding the previous start time. there is.

예를 들면, 상기 복수의 상대 착수시간(OT)은 상대방의 첫 착수 시점의 제1 착수에 대한 제1 상대 착수시간(OT)부터 상기 직전 상대 착수시간까지의 모든 상대 착수시간(OT)을 포함할 수도 있고, 또는 상기 상대방의 첫 착수 시점의 제1 착수에 대한 제1 상대 착수시간(OT)부터 상기 직전 상대 착수시간을 제외한 그 이전까지의 모든 상대 착수시간(OT)을 포함할 수도 있다. For example, the plurality of relative launch times (OT) include all relative launch times (OT) from the first relative launch time (OT) for the first launch at the time of the other party's first launch to the previous relative launch time. Or it may include all relative start times (OT) from the first relative start time (OT) for the first start at the time of the other party's first start to the previous relative start time excluding the immediately preceding relative start time.

In addition, the deep learning-based Go game service method according to another embodiment of the present invention may include a step (S2402) in which the time management model server 500 calculates the average opponent move time from the received plurality of opponent move times (OT).

In an embodiment, the time management model server 500 may calculate the average opponent move time by averaging the plurality of opponent move times (OT).

In addition, the deep learning-based Go game service method according to another embodiment of the present invention may include a step (S2403) in which the time management model server 500 determines the move preparation time (PP) based on the average opponent move time calculated as above and the previous opponent move time.

Here, the previous opponent move time may be received from the move model server 300 and/or the Go server 200 as part of the plurality of opponent move times (OT), or it may be received from the move model server 300 and/or the Go server 200 separately from the plurality of opponent move times (OT).

In detail, referring to FIG. 28, when the previous opponent move time, spent on the opponent's immediately preceding move (that is, the opponent's move made just before the current point in time), is greater than the calculated average opponent move time, the time management model server 500 may increase the move preparation time (PP). The reasoning is that if the opponent spent more time on the immediately preceding move than the opponent's average move time during the game, that move was an important phase for the opponent. The current move, which follows a move that was important for the opponent, is therefore also judged to be an important phase, and the move preparation time (PP) may be increased so that more time is allocated to finding a better move.

For example, the time management model server 500 may receive a plurality of opponent move times (OT) covering the opponent's eight most recent moves (e.g., moves 54, 56, 58, 60, 62, 64, 66 and the immediately preceding move 68), namely move 54: 20 seconds, move 56: 10 seconds, move 58: 20 seconds, move 60: 10 seconds, move 62: 30 seconds, move 64: 5 seconds, move 66: 2 seconds and move 68: 38 seconds.

In this example, the time management model server 500 may then average the received plurality of opponent move times (OT) to obtain the average opponent move time (e.g., about 14 seconds, which matches the mean of the seven times preceding move 68, 97 seconds / 7; including the 38 seconds of move 68 would instead give about 16.9 seconds).

In this example, the time management model server 500 may also compare the calculated average opponent move time (e.g., about 14 seconds) with the previous opponent move time (e.g., 38 seconds) of the immediately preceding move (e.g., move 68).

In this example, if the comparison shows that the previous opponent move time (e.g., 38 seconds) is greater than the average opponent move time (e.g., about 14 seconds), the time management model server 500 may judge the current position to be an important phase and increase the move preparation time (PP) to be used for the current move.
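The comparison in steps S2402 and S2403 can be illustrated with the numbers from the example. The following is a minimal sketch, not the patented implementation; the function name `compare_with_average` and the `include_last` switch are hypothetical and only reflect the statement that the averaged set may include or exclude the previous opponent move time.

```python
from statistics import mean

def compare_with_average(earlier_times, last_time, include_last=False):
    """Return (average, is_important) for opponent move times in seconds.

    include_last selects whether the last move time is part of the average;
    the text allows either choice.
    """
    samples = list(earlier_times) + ([last_time] if include_last else [])
    average = mean(samples)
    # The phase is treated as important when the last move took longer than average.
    return average, last_time > average

# Opponent move times for moves 54-66 from the example; move 68 took 38 s.
earlier = [20, 10, 20, 10, 30, 5, 2]
avg_excl, important = compare_with_average(earlier, 38)
avg_incl, _ = compare_with_average(earlier, 38, include_last=True)
print(round(avg_excl, 1), round(avg_incl, 1), important)
```

Either way of averaging marks move 68 as an important phase here, because 38 seconds exceeds both averages.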

Here, when increasing the move preparation time (PP), the time management model server 500 may determine, according to a predetermined scheme, a move time increment indicating by how much the move preparation time (PP) is to be increased.

In detail, as described above, the time management model server 500 may 1) determine the move time increment based on the base move time determined by the time management unit 510.

Alternatively, the time management model server 500 may 3) determine the move time increment according to the opponent move time (OT).

Here, the opponent move time (OT) may include the previous opponent move time and/or the average opponent move time.

Specifically, the time management model server 500 may 1] compare the base move time with the previous opponent move time and determine the larger of the two as the move preparation time (PP) to be used for the current move.

That is, when the previous opponent move time is greater than the base move time derived by the time management unit 510 described above, the time management model server 500 may set the move time increment to the difference between the two. For example, if the base move time is 35 seconds and the previous opponent move time is 38 seconds, the time management model server 500 may set the move time increment to 3 seconds, the difference between the two times, so that the larger value, the previous opponent move time, becomes the move preparation time (PP).
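The first scheme reduces to taking the maximum of two times. A minimal sketch under that reading follows; the function name `increment_by_max` is hypothetical.

```python
def increment_by_max(base_time, previous_opponent_time):
    """Use the larger of the two times as the move preparation time.

    Returns (preparation_time, increment), both in seconds. The increment is
    zero when the base move time is already the larger value.
    """
    preparation_time = max(base_time, previous_opponent_time)
    return preparation_time, preparation_time - base_time

# Example from the text: base 35 s, previous opponent move 38 s.
print(increment_by_max(35, 38))
```

With the values from the text, the preparation time becomes 38 seconds and the increment 3 seconds.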

Alternatively, the time management model server 500 may 2] calculate the ratio of the previous opponent move time to the average opponent move time (hereinafter, the opponent move time ratio) and determine the move time increment based on the calculated opponent move time ratio.

In detail, the time management model server 500 may calculate the opponent move time ratio (that is, previous opponent move time / average opponent move time) to determine how much time the opponent spent on the immediately preceding move relative to the time the opponent spends on a move on average.

In addition, when the calculated opponent move time ratio satisfies a predetermined criterion (in an embodiment, exceeding a predetermined threshold, e.g., 1.00, that is, 100%), the time management model server 500 may determine the move time increment by applying the opponent move time ratio to the base move time. That is, when the opponent move time ratio indicates that the opponent spent more time on the immediately preceding move than on an average move (in an embodiment, when the ratio exceeds the predetermined threshold), the time management model server 500 may apply that ratio to the base move time to determine the move time increment.

In an embodiment, when the calculated opponent move time ratio exceeds the predetermined threshold (e.g., 1.00, that is, 100%), the time management model server 500 may perform a predetermined operation (e.g., multiplication) on the calculated opponent move time ratio and the base move time, and determine the move time increment from the result. The predetermined criterion is described here as exceeding a predetermined threshold (e.g., 1.00), but it is not limited thereto.

For example, when the base move time is 30 seconds, the average opponent move time is 40 seconds and the previous opponent move time is 60 seconds, the time management model server 500 may calculate the opponent move time ratio as 1.5 (60 seconds / 40 seconds), that is, 150%. The time management model server 500 may then determine whether the calculated ratio exceeds the predetermined threshold of 1.00 (100%). If it does, the time management model server 500 may apply the opponent move time ratio (1.5 in this example) to the base move time (30 seconds in this example) using the predetermined operation (multiplication in this example). The time management model server 500 may thus derive a final move preparation time (PP) of 45 seconds (30 seconds * 1.5) and set the corresponding move time increment to 15 seconds.
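The ratio-based scheme can be sketched as follows. This is an illustration only: the function name `increment_by_ratio`, the default threshold of 1.0 and the guard against a non-positive average are assumptions, and multiplication is just the example operation named in the text.

```python
def increment_by_ratio(base_time, average_opponent_time,
                       previous_opponent_time, threshold=1.0):
    """Scale the base move time by the opponent move time ratio.

    Returns (preparation_time, increment) in seconds. The base time is left
    unchanged when the ratio does not exceed the threshold.
    """
    if average_opponent_time <= 0:
        # Assumed guard: without a usable average, keep the base time.
        return base_time, 0
    ratio = previous_opponent_time / average_opponent_time
    if ratio > threshold:
        preparation_time = base_time * ratio
    else:
        preparation_time = base_time
    return preparation_time, preparation_time - base_time

# Example from the text: base 30 s, average 40 s, previous opponent move 60 s.
print(increment_by_ratio(30, 40, 60))
```

With the values from the text, the ratio is 1.5, so the preparation time is 45 seconds and the increment 15 seconds.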

That is, the time management model server 500 may determine the move time increment based on the opponent move time (OT), the time the opponent spent on moves during the game, and may determine an adjusted move preparation time (PP) by applying the determined increment to the base move time.

In addition, the time management model server 500 may provide the move preparation time (PP) determined as above to the move model server 300.

Accordingly, the time adjustment unit 550 of the time management model server 500 may cause the move model of the move model server 300, which has received the move preparation time (PP), to run MCTS simulations for a longer time or for a greater number of iterations and thereby select a better candidate move.

Therefore, the deep learning-based Go game service apparatus according to the embodiment can manage Go game time and can change the move preparation time (PP) in important phases.

<Time management model server according to still another embodiment>

FIG. 29 is a diagram illustrating the time management model of a time management model server according to still another embodiment of the present invention. FIGS. 30 and 31 are diagrams illustrating the change in territory count used to generate game time information according to still another embodiment of the present invention. FIG. 32 is a diagram illustrating the dame (neutral point) count used to generate game time information according to still another embodiment of the present invention.

Referring to FIG. 29, the deep learning-based Go game service according to an embodiment of the present invention may generate game time information for the current Go board state using the time management model of the time management model server 500. The game time information may include the remaining game length predicted, from the Go board state, until the end of the game. The time management model is a deep learning model of the time management model server 500 and may include a first time management neural network 520, a second time management neural network 530 and a first input feature extraction unit 540.

The time management model may be trained by supervised learning to predict the remaining game length, which is the game time information for the current Go board state. More specifically, the time management model server 500 may generate a training data set for predicting the remaining game length from a Go board state (S), and use that data set to train the time management model to predict the remaining game length for the current Go board state (S). The time management model server 500 may receive a plurality of game records from the Go server 200. Each game record may include the Go board state (S) at each move in order of play. Each game record may also include, for each Go board state (S), game time information, in particular the remaining game length until the end of the game. In addition, the time management model server 500 may receive position judgment information from the position judgment model server 400. The position judgment information is derived from the plurality of game records that the Go server 200 provides to the time management model server 500, and may include territory count change information and dame (neutral point) count information. The time management model server 500 may also receive a value from the move model server 300. The value may be derived from the plurality of game records that the Go server 200 provides to the time management model server 500.

The first input feature extraction unit 540 may extract a first input feature (IF1) from the Go board states (S) of the plurality of game records and provide it to the first time management neural network 520 as input data for training. The first input feature (IF1) of a Go board state (S) may be a 19*19*18 RGB image containing the stone positions for the black player's eight most recent moves, the stone positions for the white player's eight most recent moves, and turn information indicating whether the current player is black or white. As an example, the first input feature extraction unit 540 may have a neural network structure and may include a kind of encoder.

In addition, the time management model may provide a second input feature (IF2) and a third input feature (IF3) to the first time management neural network 520 as input data for training. The second input feature (IF2) may be the territory count change information for the Go board state (S). The third input feature (IF3) may be the dame count information for the Go board state (S). The time management model may provide a fourth input feature (IF4) to the second time management neural network 530 as input data for training. The fourth input feature (IF4) may be the value for the Go board state (S).

The first time management neural network 520 may take the first to third input features (IF1 to IF3) as input and provide its output to the second time management neural network 530. The first time management neural network 520 may have a neural network structure. As an example, it may consist of 20 residual blocks. Referring to FIG. 8, one residual block may be arranged in the following order: a 3X3 convolution layer with 256 filters, a batch normalization layer, a ReLU activation layer, a 3X3 convolution layer with 256 filters, a batch normalization layer, a skip connection and a ReLU activation layer. The batch normalization layer is intended to prevent covariate shift, the phenomenon in which the distribution of a layer's inputs changes during training because the parameters of the preceding layers change. The skip connection prevents the performance of the network from degrading as the stack of blocks becomes deeper, so that the stack can be made deeper to improve overall performance. In the skip connection, the original input of the residual block may be added to the output of the second batch normalization layer, and the sum may be fed to the second ReLU activation layer.
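The layer order of one residual block can be sketched in plain Python. This is a single-channel toy under stated simplifications, not the network of the embodiment: it uses one 3X3 kernel per convolution instead of 256 filters, and a per-plane normalization stands in for batch normalization. All function names are hypothetical.

```python
import math

def conv3x3(x, k):
    """Zero-padded 'same' 3x3 convolution of a 2D plane with kernel k."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        s += x[ii][jj] * k[di + 1][dj + 1]
            out[i][j] = s
    return out

def normalize(x, eps=1e-5):
    """Zero-mean, unit-variance scaling over the plane (stand-in for batch norm)."""
    vals = [v for row in x for v in row]
    m = sum(vals) / len(vals)
    var = sum((v - m) ** 2 for v in vals) / len(vals)
    return [[(v - m) / math.sqrt(var + eps) for v in row] for row in x]

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def residual_block(x, k1, k2):
    """conv -> norm -> ReLU -> conv -> norm -> add block input -> ReLU."""
    y = relu(normalize(conv3x3(x, k1)))
    y = normalize(conv3x3(y, k2))
    y = [[a + b for a, b in zip(ry, rx)] for ry, rx in zip(y, x)]  # skip connection
    return relu(y)

zero = [[0.0] * 3 for _ in range(3)]
plane = [[1.0, -2.0], [0.5, 3.0]]
print(residual_block(plane, zero, zero))
```

With all-zero kernels the convolution path contributes nothing, so the block returns the ReLU of its own input. This shows how the skip connection lets the input pass through even when the convolution layers have learned nothing.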

The second time management neural network 530 may take the output of the first time management neural network 520 and the fourth input feature (IF4) as input and generate game time information about the predicted remaining game length. The second time management neural network 530 may have a neural network structure. As an example, it may have a fully connected layer structure.

Using a training data set in which the first to fourth input features (IF1 to IF4) are the input data and the remaining game length information is the target data, the time management model may sufficiently train the first time management neural network 520 and the second time management neural network 530 so that the output data (r), generated by the second time management neural network 530 after passing through the first time management neural network 520, becomes equal to the target data. As an example, the time management model may be trained so that a remaining game length prediction loss is minimized. For example, the remaining game length prediction loss may follow Equation 5.

(Equation 5)

Equation 5 includes a territory count change loss; that is, the remaining game length prediction loss uses the territory count change loss. The change in territory count is used for predicting the remaining game length because the change tends to be relatively large early in a game and relatively small late in a game, so it can serve as one factor in the prediction. As an example, FIG. 30 shows position judgment information generated by the position judgment model server 400 for a Go board state (S) in the early part of a game record. FIG. 30(a) shows the position after move 104, where Black has 46 points of territory and White has 63.5. FIG. 30(b) shows the position after Black plays one more move, move 105, where Black has 46 points and White has 59.5, a difference of 4 points for White. That is, in an early Go board state (S), the territory count changes considerably with each move. FIG. 31 shows position judgment information generated by the position judgment model server 400 for a Go board state (S) in the late part of a game record. FIG. 31(a) shows the position after move 222, where Black has 64 points and White has 63.5. FIG. 31(b) shows the position after White plays one more move, move 223, where Black has 63 points and White has 64.5, a difference of 1 point for White. That is, in a late Go board state (S), the territory count changes only slightly with each move.
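The territory figures quoted for FIGS. 30 and 31 can be checked with a few lines of arithmetic. This sketch assumes the change is measured as the per-color absolute difference between two consecutive positions; the text does not fix the exact definition, and `territory_change` is a hypothetical name.

```python
def territory_change(before, after):
    """Absolute change in territory per color between consecutive positions."""
    return {color: abs(after[color] - before[color]) for color in ("black", "white")}

# FIG. 30: moves 104 -> 105 (early part of the game record).
early = territory_change({"black": 46, "white": 63.5}, {"black": 46, "white": 59.5})
# FIG. 31: moves 222 -> 223 (late part of the game record).
late = territory_change({"black": 64, "white": 63.5}, {"black": 63, "white": 64.5})
print(early["white"], late["white"])
```

White's territory moves by 4 points in the early example and by 1 point in the late one, matching the differences stated for the two figures.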

Equation 5 also includes a dame count loss; that is, the remaining game length prediction loss uses the dame count loss. The dame count is used for predicting the remaining game length because the number of dame (neutral points) gradually decreases toward the end of a game, so it can serve as one factor in the prediction. As an example, FIG. 32 shows position judgment information generated by the position judgment model server 400 for Go board states (S) in the early and late parts of a game record. Dame are the points marked with a red X in the position judgment results of FIG. 32. FIG. 32(a) is from the early part of the game, so there are many dame. FIG. 32(b) is from the late part of the game, so there are few dame.

Equation 5 also includes a value loss; that is, the remaining game length prediction loss uses the value loss. The value is used for predicting the remaining game length because one side's value rises toward the end of a game, so it can serve as one factor in the prediction. In addition, because the value does not change much in the early part of a game, the value loss may be used in the latter part of the game. For example, the latter part of the game may be a Go board state in which 250 or more moves have been played in a game record, but it is not limited thereto.

Equation 5 also contains hyperparameters. The user can adjust the hyperparameters to control the relative importance of each loss. For example, a hyperparameter whose loss becomes more important toward the latter part of the game may take a higher value there, and a lower value in the early part of the game, where the loss is less important.
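The text states only that the remaining game length prediction loss uses three auxiliary losses and that hyperparameters set their relative importance. One form consistent with that description is a weighted sum, sketched below. This is an assumption for illustration, not Equation 5 itself: the squared-error base term, the weighted-sum form, the step-shaped weight schedule and every name in the code are hypothetical.

```python
def value_weight(move_number, late_start=250, late_weight=1.0):
    """Assumed schedule: the value loss is switched on in the latter part of the game."""
    return late_weight if move_number >= late_start else 0.0

def remaining_length_loss(predicted, target, territory_loss, dame_loss,
                          value_loss, weights=(1.0, 1.0, 1.0)):
    """Hypothetical weighted-sum loss for remaining game length prediction."""
    w_territory, w_dame, w_value = weights
    base = (predicted - target) ** 2  # assumed squared error on the length itself
    return (base
            + w_territory * territory_loss
            + w_dame * dame_loss
            + w_value * value_loss)

# Illustrative numbers only; move 260 is treated as the latter part of the game.
weights = (0.5, 1.0, value_weight(260))
print(remaining_length_loss(100, 90, 2.0, 1.0, 0.5, weights))
```

Raising or lowering an entry of `weights` changes how strongly the corresponding loss influences training, which is the role the text assigns to the hyperparameters.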

When the current Go board state is input during a game, the trained time management model may generate game time information about the predicted remaining game length. More specifically, when the current Go board state is input during a game, the first input feature extraction unit 540 may extract the first input feature. The time management model may use the territory count change information, which the position judgment model server provides as its position judgment for the current Go board state, as the second input feature, and the dame count information as the third input feature. The time management model may provide the output that the first time management neural network 520 generates from the first to third input features to the second time management neural network 530. The time management model may use the value that the move model server provides for a candidate move in the current Go board state as the fourth input feature. The time management model may then generate, in the second time management neural network 530, game time information about the predicted remaining game length from the output of the first time management neural network 520 and the fourth input feature.
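The inference path described above (IF1 to IF3 into the first network, its output plus IF4 into the second network) can be sketched as a small wiring function. The callables passed in are stand-ins so the sketch runs; the dictionary keys and all names are hypothetical, and real trained networks would replace the lambdas.

```python
def predict_remaining_length(board_state, position_info, value,
                             extract_if1, first_net, second_net):
    """Wire the four input features through the two networks."""
    if1 = extract_if1(board_state)            # first input feature from the board
    if2 = position_info["territory_change"]   # second input feature
    if3 = position_info["dame_count"]         # third input feature
    hidden = first_net(if1, if2, if3)         # first time management network
    return second_net(hidden, value)          # second network also receives IF4

# Stand-in callables that only demonstrate the data flow.
empty_board = [[0] * 19 for _ in range(19)]
result = predict_remaining_length(
    empty_board,
    {"territory_change": 4.0, "dame_count": 12},
    0.5,
    extract_if1=lambda board: sum(map(sum, board)),
    first_net=lambda a, b, c: a + b + c,
    second_net=lambda hidden, v: hidden + v,
)
print(result)
```

Passing the networks in as arguments keeps the wiring separate from the models, so the same function would serve training-time and game-time inputs.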

In addition, the time management model server 500 may adjust the move preparation time according to the predicted remaining game length in the game time information. The time management model server 500 may provide the adjusted move preparation time to the move model server 300.
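The text does not specify how the predicted remaining game length is turned into a move preparation time. One simple possibility, shown purely as an assumption, is to spread the remaining clock time evenly over the player's own share of the predicted remaining moves. The function name, the even split, the interpretation of game length as a number of moves and the minimum time are all hypothetical.

```python
import math

def preparation_time(remaining_clock, predicted_remaining_moves, min_time=1.0):
    """Evenly split the remaining clock (seconds) over the player's own moves."""
    # Roughly half of the remaining moves belong to this player.
    own_moves = max(1, math.ceil(predicted_remaining_moves / 2))
    return max(min_time, remaining_clock / own_moves)

print(preparation_time(600, 120))  # many moves left: shorter time per move
print(preparation_time(600, 20))   # few moves left: longer time per move
```

Under this rule a shorter predicted remaining game frees more time per move, which is one way to read the statement that the preparation time can be divided effectively.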

Therefore, the deep learning-based Go game service apparatus according to still another embodiment can predict the remaining game length. In addition, it can use the predicted remaining game length to divide the move preparation time effectively.

FIG. 33 is an exemplary diagram of the signal flow in a Go game service system for a time management model server according to still another embodiment of the present invention.

Referring to FIG. 33, the Go server 200 may transmit a plurality of game records to the move model server 300, the position judgment model server 400 and the time management model server 500 (S2701). The time management model server 500 may extract the first input feature from the Go board states of the received game records (S2702). The position judgment model server 400 may generate position judgment information for the received game records and transmit it to the time management model server 500 (S2703, S2704). The position judgment information may be territory count change information and dame count information. The time management model server 500 may input the territory count change information as the second input feature and the dame count information as the third input feature to the first time management neural network of the time management model (S2705). The move model server 300 may generate values for the Go board states of the game records and transmit the generated values to the time management model server 500 (S2706, S2707). The time management model server 500 may input the received value as the fourth input feature to the second time management neural network of the time management model (S2708). The time management model server 500 may train the time management model using the first to fourth input features as input data and the remaining game length information for the Go board states of the game records as target data (S2709). The Go server 200 may run a Go game in which the terminal 100 and the move model server 300 each play a move on their own turn (S2710 to S2712). The position judgment model server 400 may extract the input features of the current Go board state, have the position judgment model, a deep learning model, generate position values from those input features, and perform position judgment using the Go board state and the position values (S2713). The time management model server 500 may extract the input features of the current Go board state as the first input feature, use the position judgment information as the second and third input features, and use the value provided by the move model server 300 as the fourth input feature, so that the time management model, a deep learning model, generates game time information (S2714).

FIG. 34 illustrates a method of generating game time information in a deep learning-based Go game service method according to yet another embodiment of the present invention.

Referring to FIG. 34, the deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 receives a plurality of game records (S2801). The game records are as described with reference to FIG. 29.

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 extracts a first input feature from the board states of the received game records (S2802). The method of extracting the first input feature is as described with reference to FIG. 29. The time management model server 500 may provide the extracted first input feature as input data to the first time management neural network.

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 receives position evaluation information for the board states of the game records (S2803). The method of receiving the position evaluation information is as described with reference to FIG. 29. The position evaluation information may be information on the change in territory count and information on the number of neutral points. The time management model server 500 may take the territory-change information as a second input feature and the neutral-point information as a third input feature.

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 inputs the second input feature and the third input feature to the first time management neural network (S2804).

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 receives a value from the move model server (S2805). The time management model server 500 may take the value as a fourth input feature.

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 inputs the fourth input feature to the second time management neural network (S2806).
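The routing of the four input features in steps S2802 to S2806 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the class name `TimeModelSample`, the function `route_inputs`, the field names, and the use of plain Python lists as feature containers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TimeModelSample:
    """One sample for the time management model (hypothetical layout)."""
    board_features: List[float]  # first input feature, extracted from the board state
    territory_delta: float       # second input feature: change in territory count
    neutral_points: int          # third input feature: number of neutral points
    value: float                 # fourth input feature: value from the move model server
    remaining_length: int        # training target: remaining game length


def route_inputs(sample: TimeModelSample) -> Tuple[List[float], List[float]]:
    """Split a sample into the inputs of the two time management networks.

    Following the description, the first to third input features go to the
    first time management neural network and the fourth input feature goes
    to the second one. Concatenating them into flat lists is an assumption.
    """
    first_net_input = sample.board_features + [
        sample.territory_delta,
        float(sample.neutral_points),
    ]
    second_net_input = [sample.value]
    return first_net_input, second_net_input


sample = TimeModelSample([0.0, 1.0, -1.0], territory_delta=2.5,
                         neutral_points=14, value=0.62, remaining_length=87)
first_in, second_in = route_inputs(sample)
```

Keeping the value in a separate input mirrors the two-network split in the description; how the two network outputs are combined is not shown here.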

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 trains the time management model using the first to fourth input features as input data and the remaining game length information for each board state of the game records as target data (S2807). The training method of the time management model is as described with reference to FIGS. 29 to 32.
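One way to derive the target data of step S2807 from a game record is sketched below. The text above only says that the target is the remaining game length for each board state; measuring that length in moves, and the function name `remaining_length_targets`, are assumptions made for this example.

```python
from typing import List


def remaining_length_targets(total_moves: int) -> List[int]:
    """Return the number of moves still to be played at each board state.

    Board state i is the position after i moves (i = 0 is the empty board),
    so its target is total_moves - i. Counting in moves is an assumption;
    the description only calls the target "remaining game length information".
    """
    if total_moves < 0:
        raise ValueError("total_moves must be non-negative")
    return [total_moves - i for i in range(total_moves + 1)]


# A (very short) record of four moves yields five board states.
targets = remaining_length_targets(4)
```

Each entry of `targets` would be paired with the first to fourth input features of the corresponding board state to form one training sample.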

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 receives the board state, position evaluation information, and value during a game (S2809).

The deep learning-based Go game service method according to yet another embodiment of the present invention may include a step in which the time management model server 500 generates game time information using the received board state, position evaluation information, and value (S2810). The method of generating the game time information is as described with reference to FIG. 29.

Accordingly, the deep learning-based Go game service method according to yet another embodiment can predict the remaining game time. In addition, it can use the predicted remaining game time to allocate the move preparation time effectively.
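A simple allocation rule of this kind can be sketched as follows. The text above does not specify the formula; the even split over the player's own remaining moves, the halving of the predicted move count, the floor `min_time_s`, and the function name `move_preparation_time` are all assumptions chosen for illustration.

```python
def move_preparation_time(remaining_time_s: float,
                          predicted_remaining_moves: float,
                          min_time_s: float = 1.0) -> float:
    """Spread the remaining clock time evenly over the moves still to be played.

    Assumptions (not from the patent text): the prediction counts moves by
    both players, so roughly half of them are ours; each move gets at least
    min_time_s; a single move never gets more than the time that is left.
    """
    if remaining_time_s < 0:
        raise ValueError("remaining_time_s must be non-negative")
    own_moves = max(predicted_remaining_moves / 2.0, 1.0)
    budget = remaining_time_s / own_moves
    return min(max(budget, min_time_s), remaining_time_s)


# 600 s left and about 100 moves predicted: 50 own moves, 12 s each.
per_move = move_preparation_time(600.0, 100.0)
```

An even split is the simplest choice; a deployed system could weight the budget by the difficulty of the position instead.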

The embodiments of the present invention described above may be implemented in the form of program instructions executable by various computer components and recorded on a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in the computer software field. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. A hardware device may be replaced by one or more software modules to perform the processing according to the present invention, and vice versa.

The specific implementations described in the present invention are examples and do not limit the scope of the present invention in any way. For brevity, descriptions of conventional electronic components, control systems, software, and other functional aspects of the systems may be omitted. In addition, the connecting lines or connecting members between components shown in the drawings illustrate functional connections and/or physical or circuit connections by way of example; in an actual device they may be represented as various replaceable or additional functional, physical, or circuit connections. Furthermore, unless a component is specifically described with terms such as "essential" or "important", it may not be a component necessarily required for applying the present invention.

Although the detailed description of the present invention has been given with reference to preferred embodiments, those skilled in the art, or those having ordinary knowledge in the relevant technical field, will understand that the present invention can be modified and changed in various ways without departing from the spirit and technical scope of the present invention set forth in the claims below. Therefore, the technical scope of the present invention is not limited to the contents of the detailed description of the specification but should be defined by the claims.

100 Terminal
200 Go server
300 Move model server
310 Search unit
320 Self-play unit
330 Move neural network
400 Position evaluation model server
410 Position evaluation neural network
420 Input feature extraction unit
430 Ground-truth label generation unit
500 Time management model server
510 Time management unit
520 First time management neural network
530 Second time management neural network
540 First input feature extraction unit
550 Time adjustment unit

Claims

A computation time management device comprising:
a communication unit that receives at least one of a value according to a Go board state and an opponent's move time;
a memory that stores a time adjustment unit; and
a processor that reads the time adjustment unit and controls the time adjustment unit to determine a move preparation time using at least one of the value and the opponent's move time,
wherein the time adjustment unit determines at least one of a change trend and a change magnitude of the value, and determines the move preparation time according to the determined change trend and change magnitude.

(Deleted)

◈Claim 3 was abandoned when the registration fee was paid.◈

The computation time management device according to claim 1,
wherein the time adjustment unit increases the move preparation time if the rate of change of the value, which is the change trend of the value, is less than a predetermined value.

◈Claim 4 was abandoned when the registration fee was paid.◈

The computation time management device according to claim 1,
wherein the time adjustment unit increases the move preparation time if the change between the value at the previous move step and the current value is a decrease of at least a predetermined value.

A computation time management device comprising:
a communication unit that receives at least one of a value according to a Go board state and an opponent's move time;
a memory that stores a time adjustment unit; and
a processor that reads the time adjustment unit and controls the time adjustment unit to determine a move preparation time using at least one of the value and the opponent's move time,
wherein the time adjustment unit calculates, based on the opponent's move time, a previous opponent move time, which is the time the opponent took for its immediately preceding move, and an average opponent move time, which is the average of the times the opponent took for a plurality of moves made during a predetermined period.

The computation time management device according to claim 5,
wherein the time adjustment unit increases the move preparation time if the previous opponent move time is greater than the average opponent move time.

◈Claim 7 was abandoned when the registration fee was paid.◈

The computation time management device according to any one of claims 3, 4, and 6,
wherein the time adjustment unit calculates, when a condition for increasing the move preparation time is satisfied, a move time increment that serves as the basis for increasing the move preparation time.

◈Claim 8 was abandoned when the registration fee was paid.◈

The computation time management device according to claim 7,
wherein the memory further includes a time management unit that provides a basic move time, which is a preset move preparation time, and
the time adjustment unit determines the move time increment based on the basic move time provided by the time management unit.

◈Claim 9 was abandoned when the registration fee was paid.◈

The computation time management device according to claim 8,
wherein the time adjustment unit determines the move time increment based on a preset countdown (byo-yomi) time and the basic move time.

◈Claim 10 was abandoned when the registration fee was paid.◈

The computation time management device according to claim 6,
wherein the memory further includes a time management unit that provides a basic move time, which is a preset move preparation time, and
the time adjustment unit determines, as the move preparation time, the larger of the previous opponent move time and the basic move time provided by the time management unit.

The computation time management device according to claim 6,
wherein the time adjustment unit calculates an opponent move time ratio, which is the ratio of the previous opponent move time to the average opponent move time,
the memory further includes a time management unit that provides a basic move time, which is a preset move preparation time, and
the time adjustment unit determines, when the opponent move time ratio meets a predetermined criterion, the move preparation time by performing a predetermined operation based on the opponent move time ratio and the basic move time provided by the time management unit.

A computation time management method for managing the computation time of a deep learning-based Go game service in a time management model server, the method comprising:
receiving at least one of a value according to a Go board state and an opponent's move time; and determining a move preparation time using at least one of the received value and the opponent's move time,
wherein the determining of the move preparation time includes:
determining at least one of a change trend and a change magnitude of the value, and determining the move preparation time according to the determined change trend and change magnitude;
calculating, based on the opponent's move time, a previous opponent move time, which is the time the opponent took for its immediately preceding move, and an average opponent move time, which is the average of the times the opponent took for a plurality of moves made during a predetermined period, and determining the move preparation time based on the calculated previous opponent move time and average opponent move time.
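The adjustment conditions recited in the claims above can be illustrated with the following sketch. It is not the claimed implementation: the function names, the thresholds, the inclusion of the latest move in the average, and the choice of multiplication as the "predetermined operation" are all assumptions introduced for this example.

```python
from statistics import mean
from typing import Sequence


def adjust_by_opponent_time(opponent_times: Sequence[float],
                            base_time: float,
                            ratio_threshold: float = 1.0) -> float:
    """Lengthen the move preparation time when the opponent just thought longer.

    opponent_times holds the opponent's recent move times in seconds, with the
    most recent move last. The ratio of the previous time to the average is
    compared with ratio_threshold; scaling base_time by that ratio stands in
    for the unspecified "predetermined operation" (an assumption).
    """
    if not opponent_times:
        return base_time
    previous = opponent_times[-1]
    average = mean(opponent_times)
    if average <= 0:
        return base_time
    ratio = previous / average
    if ratio > ratio_threshold:
        return base_time * ratio
    return base_time


def value_dropped(previous_value: float, current_value: float,
                  drop_threshold: float = 0.1) -> bool:
    """Report whether the value fell by at least drop_threshold (hypothetical)."""
    return (previous_value - current_value) >= drop_threshold


# The opponent took 40 s on the last move against a 20 s average.
longer = adjust_by_opponent_time([10, 10, 40], base_time=5.0)
```

Here a `True` result from `value_dropped` would be one trigger for increasing the move preparation time, while `adjust_by_opponent_time` covers the opponent-time condition.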