KR102430468B1

KR102430468B1 - Surgical robot system based on Headset using voice recognition Microphone

Info

Publication number: KR102430468B1
Application number: KR1020200131637A
Authority: KR
Inventors: 김성완; 김영균; 임민혁; 윤단; 문경빈; 김병수
Original assignee: 서울대학교 산학협력단; 서울대학교병원
Priority date: 2020-10-13
Filing date: 2020-10-13
Publication date: 2022-08-09
Also published as: WO2022080788A1; KR20220049062A

Abstract

The present invention relates to a surgical robot system using a headset-based voice recognition microphone, and more particularly to a surgical robot system in which a user views an endoscopic image through a VR (Virtual Reality) headset and controls the operation of an endoscope camera by voice through the voice recognition microphone, without using the hands. A surgical robot system according to an embodiment of the present invention may include: an endoscope camera that is inserted into an operating space through a predetermined incision and captures the surgical site in real time; a driving unit that moves the endoscope camera; a headset that is detachably worn on the user's head and displays the image captured by the endoscope camera; a voice recognition unit that recognizes the user's voice; and a control unit that determines the user's intention regarding movement of the endoscope camera from the voice data recognized by the voice recognition unit, transmits a driving control signal for controlling the driving unit to the driving unit based on that intention, and transmits the image information received from the endoscope camera to the headset.

Description

Surgical robot system based on Headset using voice recognition Microphone

The present invention relates to a surgical robot system using a headset-based voice recognition microphone, and more particularly to a surgical robot system in which a user views an endoscopic image through a VR (Virtual Reality) headset and controls the operation of an endoscope camera by voice through the voice recognition microphone, without using the hands.

In general, a laparoscope is an endoscope inserted through a small hole in the abdominal wall to examine the gallbladder, peritoneum, liver, the outside of the digestive tract, and the like. Endoscopic surgery refers to surgery performed through one or several holes rather than through a conventional skin incision.

Laparoscopic surgery is performed by making two or more small holes, each roughly 5 mm to 1 cm in diameter, in the abdomen, inserting a laparoscope, forceps, or a scalpel through the holes, and operating while watching the interior of the body, as captured by the laparoscope, on a video monitor or similar display.

Robotic surgery refers to surgery in which a doctor or other operator controls the motion of a robot that moves the surgical tools. In particular, laparoscopic robotic surgery is performed by making several holes in the abdomen and inserting through them a laparoscopic endoscope together with the robot arms of a surgical robot, each fitted with a surgical tool (end-effector) for robotic surgery.

Compared with conventional open surgery, laparoscopic robotic surgery involves a smaller incision area, so the patient recovers faster and has less pain and scarring. In addition, because the operation is carried out with robot arms, more precise surgery is possible. Robotic surgery also uses a laparoscopic endoscope and a display capable of providing 3D images to convey depth information about the operating space, which allows more accurate surgery than laparoscopic surgery based on 2D images.

According to the prior art, minimally invasive surgery such as laparoscopic surgery is performed using approximately three robot arms and an endoscope, and the user operates a controller with both hands to manipulate the robot arms and the endoscope. To move the laparoscopic endoscope, the user steps on a pedal or the like to switch the manipulation target from the robot arms to the endoscope and then manipulates the endoscope.

Consequently, whenever the user switches the manipulation target from the robot arms to the endoscope, the flow of the surgery is interrupted, which may lead to collisions between surgical tools, damage to the surgical site, and longer operating times.

As one approach to this problem, Korean Patent Laid-Open Publication No. 10-2020-0108124 (published September 17, 2020) discloses a "surgical robot system for minimally invasive surgery and a method of driving the surgical robot system" in which movement of the endoscope is controlled by movement of a headset.

According to that prior art, however, the user must control the endoscope by moving the head while wearing the headset. It is difficult to control the endoscope precisely with head movements, and during surgery in particular, various unintended head movements can occur in addition to the intentional head movements meant to move the endoscope.

Korean Patent Laid-Open Publication No. 10-2020-0108124 (published September 17, 2020)

The present invention is intended to solve the problems described above, and an object of the present invention is to provide a surgical robot system in which a user can control the operation of an endoscope camera by voice, without manual operation.

Another object of the present invention is to provide a surgical robot system that uses artificial intelligence to determine the user's intention regarding movement of the endoscope camera from voice data acquired through a voice recognition microphone, and controls the operation of the endoscope camera on that basis.

Another object of the present invention is to provide a surgical robot system in which information on the operation control of the endoscope camera is displayed as a caption on the endoscopic image shown in the headset.

Another object of the present invention is to provide a surgical robot system that calculates the probability of collision between the endoscope camera and a surgical tool and displays the calculated collision probability as a caption on the endoscopic image shown in the headset.

Another object of the present invention is to provide a surgical robot system that allows the user to select the criterion for the collision probability displayed as a caption on the endoscopic image shown in the headset.

A surgical robot system according to an embodiment of the present invention may include: an endoscope camera that is inserted into an operating space through a predetermined incision and captures the surgical site in real time; a driving unit that moves the endoscope camera; a headset that is detachably worn on the user's head and displays the image captured by the endoscope camera; a voice recognition unit that recognizes the user's voice; and a control unit that determines the user's intention regarding movement of the endoscope camera from the voice data recognized by the voice recognition unit, transmits a driving control signal for controlling the driving unit to the driving unit based on that intention, and transmits the image information received from the endoscope camera to the headset.

In the surgical robot system according to an embodiment of the present invention, the control unit may include: a voice recognition determination unit that determines the user's intention regarding movement of the endoscope camera from the voice data obtained through the voice recognition unit; an utterance instruction generation unit that generates the utterance instruction needed to control the operation of the endoscope camera, based on the intention determined by the voice recognition determination unit; a driving control signal generation unit that generates a driving control signal for controlling the driving unit based on the utterance instruction; and a transmitter that transmits the image information received from the endoscope camera and the utterance instruction to the headset and transmits the driving control signal to the driving unit. The headset may include: an image display unit that displays the image information transmitted from the transmitter; and a caption display unit that displays the utterance instruction transmitted from the transmitter as a caption on the image display unit.

In the surgical robot system according to an embodiment of the present invention, the voice recognition determination unit may use a machine learning-based voice recognition model to pick out, from the voice data obtained through the voice recognition unit, the words needed to control the operation of the endoscope camera and to pattern them, thereby determining the user's intention regarding movement of the endoscope camera.
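The idea of separating control-relevant words from the rest of the surgeon's speech can be illustrated with a small sketch. It is not the machine learning model the specification refers to: it assumes the speech has already been converted to text and substitutes a plain regular expression for the learned model. The names `COMMAND_PATTERN` and `extract_camera_commands` are hypothetical.

```python
import re

# Assumption: the wake word "cam" followed by one of six motion phrases.
# A regular expression stands in for the learned recognition model.
COMMAND_PATTERN = re.compile(
    r"\bcam\s+(right|left|up|down|zoom\s+in|zoom\s+out)\b",
    re.IGNORECASE,
)


def extract_camera_commands(transcript: str) -> list[str]:
    """Return the camera commands found in a transcript, in spoken order."""
    commands = []
    for match in COMMAND_PATTERN.finditer(transcript):
        # Collapse repeated whitespace and normalize case.
        phrase = " ".join(match.group(1).lower().split())
        commands.append(f"cam {phrase}")
    return commands


if __name__ == "__main__":
    # Speech unrelated to the camera is ignored.
    print(extract_camera_commands("pass the forceps, Cam zoom in, then cam left"))
```

A learned model would tolerate accents, noise, and paraphrases that a fixed pattern cannot. The sketch only shows the filtering step that turns free speech into a small command vocabulary.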

In the surgical robot system according to an embodiment of the present invention, the voice recognition model may include a deep neural network built through deep learning-based training.

In the surgical robot system according to an embodiment of the present invention, the utterance instructions may include 'Cam right', 'Cam left', 'Cam up', 'Cam down', 'Cam zoom in', and 'Cam zoom out'.

In the surgical robot system according to an embodiment of the present invention, the voice recognition unit may be a microphone that is detachably mounted on one side of the headset or integrated into the headset.

A surgical robot system according to an embodiment of the present invention may also include: an endoscope camera that is inserted into an operating space through a predetermined incision and captures the surgical site in real time; a robot arm that is inserted into the operating space to operate on the surgical site; a driving unit that moves the endoscope camera; a headset that is detachably worn on the user's head and displays the image captured by the endoscope camera; a voice recognition unit that recognizes the user's voice; and a control unit that determines the user's intention regarding movement of the endoscope camera from the voice data recognized by the voice recognition unit, transmits a driving control signal for controlling the driving unit to the driving unit based on that intention, calculates the probability of collision between the endoscope camera and the robot arm and transmits the calculated collision probability to the headset, and transmits the image information received from the endoscope camera to the headset.

In the surgical robot system according to an embodiment of the present invention, the control unit may include: a voice recognition determination unit that determines the user's intention regarding movement of the endoscope camera from the voice data obtained through the voice recognition unit; an utterance instruction generation unit that generates the utterance instruction needed to control the operation of the endoscope camera, based on the intention determined by the voice recognition determination unit; a driving control signal generation unit that generates a driving control signal for controlling the driving unit based on the utterance instruction; a collision probability calculation unit that calculates, as a probability, the likelihood of collision between the endoscope camera and the robot arm; and a transmitter that transmits the image information received from the endoscope camera, the utterance instruction, and the collision probability calculated by the collision probability calculation unit to the headset and transmits the driving control signal to the driving unit. The headset may include: an image display unit that displays the image information transmitted from the transmitter; and a caption display unit that displays the utterance instruction and the collision probability transmitted from the transmitter on the image display unit.

In the surgical robot system according to an embodiment of the present invention, the voice recognition determination unit may be configured to determine the user's intention regarding the criterion for displaying the collision probability between the endoscope camera and the robot arm. The control unit may further include a collision probability display determination unit that causes the transmitter to transmit the collision probability calculated by the collision probability calculation unit to the headset when the calculated probability is higher than the collision probability that the user intends as the display criterion, as determined by the voice recognition determination unit.
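The display decision described here reduces to a comparison between the calculated probability and the user-selected criterion. The following minimal sketch assumes probabilities are expressed in the range 0 to 1. The function name `collision_caption` and the caption wording are hypothetical, and the specification does not say how the probability itself is calculated.

```python
from typing import Optional


def collision_caption(probability: float, threshold: float) -> Optional[str]:
    """Return caption text only when the calculated collision probability
    is higher than the user-selected display criterion; otherwise None."""
    for value in (probability, threshold):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if probability > threshold:
        # Hypothetical caption format; the specification does not define one.
        return f"Collision probability: {probability:.0%}"
    return None


if __name__ == "__main__":
    print(collision_caption(0.72, 0.50))  # above the criterion: caption shown
    print(collision_caption(0.30, 0.50))  # below the criterion: nothing sent
```

Because the comparison is strict, a probability exactly equal to the criterion is not displayed, matching the "higher than" wording of the text.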

In the surgical robot system according to an embodiment of the present invention, the voice recognition determination unit may use a machine learning-based voice recognition model to pick out, from the voice data obtained through the voice recognition unit, the words needed to control the operation of the endoscope camera and the words relating to the criterion for displaying the collision probability between the endoscope camera and the robot arm, and to pattern them, thereby determining both the user's intention regarding movement of the endoscope camera and the user's intention regarding the criterion for displaying the collision probability between the endoscope camera and the robot arm.

In the surgical robot system according to an embodiment of the present invention, the voice recognition unit may be a microphone that is detachably mounted on one side of the headset or integrated into the headset.

With the surgical robot system according to an embodiment of the present invention configured as described above, the user can control the operation of the endoscope camera by voice while viewing the endoscopic image displayed through the headset. The user can therefore control the robot arm by hand and the endoscope camera at the same time, so that robotic surgery proceeds smoothly without interruption.

In addition, the surgical robot system according to an embodiment of the present invention uses artificial intelligence to determine the user's intention regarding movement of the endoscope camera from the voice data acquired through the voice recognition microphone and controls the endoscope camera on that basis, so the endoscope camera can be controlled accurately in line with the user's intention.

In addition, because information on the operation control of the endoscope camera is displayed as a caption on the endoscopic image shown in the headset, the user can control the movement of the endoscope camera while checking the endoscopic image and the camera's operation control information at the same time.

In addition, because the probability of collision between the endoscope camera and the surgical tool is displayed as a caption on the endoscopic image shown in the headset, the user can control the movement of the endoscope camera while checking the endoscopic image and the collision probability at the same time.

In addition, because the user can select the criterion for the collision probability displayed as a caption on the endoscopic image, the user is not exposed to unnecessary or excessive information, which promotes the safety of robotic surgery and improves user convenience.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those of ordinary skill in the art to which the present invention belongs from the claims and the detailed description.

FIG. 1 is a schematic diagram showing the configuration of a surgical robot system according to an embodiment of the present invention;
FIG. 2 is a block diagram showing the configuration of the headset and the control unit of the surgical robot system of FIG. 1;
FIG. 3 shows an example in which an utterance instruction is displayed as a caption on the image display unit of the headset;
FIG. 4 shows an example in which a voice recognition model using a multi-layer neural network is applied to the voice recognition determination unit, which outputs the result of classifying the words used to form an utterance instruction;
FIG. 5 is a block diagram showing the configuration of the control unit of a surgical robot system according to another embodiment; and
FIG. 6 shows an example in which the collision probability is displayed as a caption on the image display unit of the headset.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

The device configurations and control methods described below are intended only to explain embodiments of the present invention and not to limit its scope.

In the description with reference to the accompanying drawings, identical components are given identical reference numerals, and redundant descriptions of them are omitted.

When a part is said to "include" a component, this means that it may further include other components, not that it excludes them, unless specifically stated otherwise.

The embodiments may be implemented independently or together, and some components may be omitted where this is consistent with the purpose of the invention.

FIG. 1 is a schematic diagram showing the configuration of a surgical robot system according to an embodiment of the present invention.

Referring to FIG. 1, a surgical robot system 100 according to an embodiment of the present invention may include an endoscope camera 110 that is inserted into an operating space 12 through one of the predetermined incisions 21, 23, 25, and 27 and captures the surgical site in real time; a driving unit 120 that moves the endoscope camera 110 inside the operating space 12; a headset 130 that is detachably worn on the user's head and displays the image captured by the endoscope camera 110; a voice recognition unit 140 that recognizes the user's voice; and a control unit 150 that transmits the image information received from the endoscope camera 110 to the headset 130.

The control unit 150 may also determine the user's intention regarding movement of the endoscope camera 110 from the voice data recognized by the voice recognition unit 140 and, based on that intention, transmit a driving control signal for controlling the driving unit 120 to the driving unit 120.

The surgical robot system 100 may further include a plurality of robot arms 115 that are inserted into the operating space 12 to operate on the surgical site.

According to the prior art, minimally invasive surgery is performed using approximately three robot arms and an endoscope camera, and the user operates a separate controller with both hands to manipulate the robot arms. To move the endoscope camera, a separate endoscope camera controller, distinct from the robot arm controller, is used. Consequently, when the user moves a hand from the robot arm controller to the endoscope camera controller, the flow of the surgery is interrupted, which may lead to collisions between surgical tools, damage to the surgical site, and longer operating times.

Another type of prior-art device provides a foot-operated control for moving the endoscope camera. However, fine manipulation is relatively difficult with the foot compared with the hand, so this does not fully solve the problem described above.

In yet another type of prior-art device, the user wears a headset and controls the movement of the endoscope camera by moving the head. However, it is difficult to control the endoscope precisely with head movements, and during surgery in particular, various unintended head movements can occur in addition to the intentional head movements meant to move the endoscope.

To solve these problems, the present invention allows the user to control the operation of the endoscope camera by voice while viewing the endoscopic image displayed through the headset. The user can therefore operate the robot arm controller by hand and move the endoscope camera by voice, so that the flow of the surgery is not interrupted and the operating time can be shortened.

Referring to FIG. 1, the operating space 12 corresponds to the inside of a human or animal body, and organs 15, various blood vessels 17 such as arteries and veins, nerves 19, and the like may be distributed within the operating space 12.

The lower ends of the robot arms 115 and the endoscope camera 110 may be inserted into the operating space 12 through the pre-formed incisions 21, 23, 25, and 27.

Each robot arm 115 may carry at its lower end a surgical tool or the like capable of performing the necessary procedure on the surgical site inside the operating space 12, and is not limited to any particular form.

The incisions 21, 23, 25, and 27 may be holes, such as abdominal ports, with a size or diameter that allows the lower end of the endoscope camera 110 or a robot arm 115 to be inserted.

The number of incisions 21, 23, 25, and 27 may be determined to match the number of robot arms 115 and endoscope cameras 110 required for the surgery. For example, when surgery is performed using three robot arms 115 and one endoscope camera 110 as shown in the drawing, the incisions 21, 23, 25, and 27 may consist of one endoscope incision 21 for inserting the endoscope camera 110 and three robot arm incisions 23, 25, and 27 for inserting the robot arms 115.

However, the number of incisions 21, 23, 25, and 27 described here is merely an example and may be modified as appropriate.

The headset 130 worn on the user's head may be implemented as a VR (virtual reality) headset, a type of device that has been under active development in recent years. However, the headset 130 is not limited to VR devices and may be implemented in various other forms.

The voice recognition unit 140 may be implemented as a microphone that is detachably mounted on one side of the headset 130 or integrated into the headset 130.

FIG. 2 is a block diagram showing the configuration of the headset and the control unit of the surgical robot system of FIG. 1.

Referring to FIG. 2, the control unit 150 may include a voice recognition determination unit 151, an utterance instruction generation unit 153, a driving control signal generation unit 155, and a transmission unit 157.

The voice recognition determination unit 151 may determine, from the voice data acquired through the voice recognition unit 140, the user's intention regarding the movement of the endoscope camera 110.

Based on the user's intention regarding the movement of the endoscope camera 110, as determined by the voice recognition determination unit 151, the utterance instruction generation unit 153 may generate the utterance instruction needed to control the operation of the endoscope camera 110.

The utterance instructions may include 'Cam right', 'Cam left', 'Cam up', 'Cam down', 'Cam zoom in', and 'Cam zoom out'.

For example, if a three-dimensional Cartesian coordinate system is applied to the position of the endoscope camera 110, 'Cam right' and 'Cam left' may denote movement along the X-axis, 'Cam up' and 'Cam down' movement along the Y-axis, and 'Cam zoom in' and 'Cam zoom out' movement along the Z-axis.
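
The correspondence between the six utterance instructions and the three axes can be sketched as a lookup table. This is only an illustrative sketch: the names COMMAND_TO_DIRECTION and direction_for are hypothetical, and the choice of which direction is positive on each axis is an assumption, since the specification states only which axis each instruction refers to.

```python
from typing import Dict, Tuple

Direction = Tuple[int, int, int]  # unit step along (X, Y, Z)

# Hypothetical table: the sign convention on each axis is assumed,
# the specification only assigns each pair of instructions to an axis.
COMMAND_TO_DIRECTION: Dict[str, Direction] = {
    "cam right": (1, 0, 0),
    "cam left": (-1, 0, 0),
    "cam up": (0, 1, 0),
    "cam down": (0, -1, 0),
    "cam zoom in": (0, 0, 1),
    "cam zoom out": (0, 0, -1),
}


def direction_for(instruction: str) -> Direction:
    """Return the axis direction for an utterance instruction.

    Case and repeated whitespace are ignored; unknown instructions raise
    ValueError so that an unrecognized phrase never moves the camera.
    """
    key = " ".join(instruction.lower().split())
    if key not in COMMAND_TO_DIRECTION:
        raise ValueError(f"unknown utterance instruction: {instruction!r}")
    return COMMAND_TO_DIRECTION[key]
```

Rejecting unknown phrases rather than guessing is a design choice of this sketch, made because an unintended camera movement inside the operating space is the costlier error.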

The driving control signal generation unit 155 may generate, based on the utterance instruction, a driving control signal for controlling the driving unit 120.

The driving control signal operates the driving unit 120 so that the endoscope camera 110 performs the motion called for by the utterance instruction; driving control signals of various forms may be generated depending on the specific configuration of the driving unit 120.

For example, if the driving unit 120 consists of three motors, one for moving the endoscope camera 110 along each of the three axes, and the utterance instruction is 'Cam right' or 'Cam left', the driving control signal may be a signal that operates the motor that moves the endoscope camera 110 along the X-axis.
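
The three-motor example can be sketched as follows. The sketch assumes one motor per axis and a fixed step size per instruction; the class names DriveControlSignal and ThreeMotorDrive, the step_mm parameter, and the sign conventions are hypothetical and do not come from the specification, which leaves the form of the driving control signal open.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class DriveControlSignal:
    axis: str       # which motor to run: "X", "Y" or "Z" (assumed identifiers)
    direction: int  # +1 or -1 (assumed sign convention)


# Hypothetical mapping from utterance instruction to driving control signal.
INSTRUCTION_TO_SIGNAL: Dict[str, DriveControlSignal] = {
    "Cam right": DriveControlSignal("X", +1),
    "Cam left": DriveControlSignal("X", -1),
    "Cam up": DriveControlSignal("Y", +1),
    "Cam down": DriveControlSignal("Y", -1),
    "Cam zoom in": DriveControlSignal("Z", +1),
    "Cam zoom out": DriveControlSignal("Z", -1),
}


@dataclass
class ThreeMotorDrive:
    """Simulated driving unit with one motor per axis."""
    step_mm: float = 1.0  # assumed travel per instruction
    position: Dict[str, float] = field(
        default_factory=lambda: {"X": 0.0, "Y": 0.0, "Z": 0.0}
    )

    def apply(self, signal: DriveControlSignal) -> None:
        # Only the motor named in the signal runs; the other axes stay put.
        self.position[signal.axis] += signal.direction * self.step_mm
```

For instance, applying INSTRUCTION_TO_SIGNAL["Cam right"] to a ThreeMotorDrive changes only the X entry of its position.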

The transmission unit 157 may transmit the image information received from the endoscope camera 110, together with the utterance instruction, to the headset 130, and may transmit the driving control signal to the driving unit 120.

Meanwhile, the headset 130 may include an image display unit 133 that displays the image information transmitted from the transmission unit 157.

The image information captured by the endoscope camera 110 may be sent to the transmission unit 157, which forwards it to the headset 130, where it is displayed on the image display unit 133.

The headset 130 may be connected to the transmission unit 157 wirelessly or by wire to receive the image information.

In the headset 130, the image display unit 133 may be provided at positions corresponding to the user's two eyes, in the manner of a head-mounted display (HMD).

Accordingly, the user can perform the surgery while checking, in real time on the image display unit 133, the image information captured by the endoscope camera 110.

In addition, the headset 130 may further include a caption display unit 134 that displays the utterance instruction transmitted from the transmission unit 157 as a caption on the image display unit 133.

FIG. 3 shows an example in which an utterance instruction is displayed as a caption on the image display unit 133 of the headset 130.

As shown in FIG. 3, when information about the operation control of the endoscope camera 110, that is, the utterance instruction, is displayed as a caption over the endoscopic image on the image display unit 133 of the headset 130, the user can control the operation of the endoscope camera 110 while viewing the endoscopic image and the utterance instruction at the same time.

Meanwhile, the voice recognition determination unit 151 may use a machine-learning-based voice recognition model to pick out, from the voice data acquired through the voice recognition unit 140, the words needed to control the operation of the endoscope camera 110, classify and pattern them, and thereby determine the user's intention regarding the movement of the endoscope camera 110.

Here, the voice recognition model may include a deep neural network built through deep-learning-based training.

FIG. 4 shows an example in which a voice recognition model using a multi-layer neural network is applied to the voice recognition determination unit 151.

The voice recognition model applied to the voice recognition determination unit 151 is described in detail below with reference to FIG. 4. Deep learning is described as an example of the artificial intelligence on which the voice recognition model is based.

Artificial intelligence is a field of computer science and information technology that studies how to enable computers to perform the thinking, learning, and self-development of which human intelligence is capable. Machine learning, one of its research areas, may refer to systems that make predictions based on empirical data and improve their own performance through learning.

Deep learning, a type of machine learning, learns from data through multiple successive levels of processing.

Deep learning may denote a set of machine learning algorithms that extract increasingly essential data from a plurality of data as the levels go up.

A deep learning structure may include an artificial neural network (ANN); for example, it may be a deep neural network such as a convolutional neural network (CNN), a recurrent neural network (RNN), or a deep belief network (DBN).

Referring to FIG. 4, the artificial neural network may include an input layer, a hidden layer, and an output layer. Each layer includes a plurality of nodes and is connected to the next layer. Nodes in adjacent layers may be connected to each other with weights.

Here, the hidden layer may consist of a single layer or of multiple layers. The former may be called a single-layer neural network and the latter a multi-layer neural network. FIG. 4 shows an example in which a multi-layer neural network is applied to the voice recognition determination unit 151.

The voice recognition determination unit 151 may find certain patterns in the voice data acquired through the voice recognition unit 140 and form a feature map.

For example, as shown in FIG. 4, the voice recognition determination unit 151 may extract low-level features (hidden layer 1), then mid-level features (hidden layer 2), then high-level features (hidden layer 3), recognize the target, and output the result (output layer).

The artificial neural network may abstract the data into higher-level features with each successive layer, and the number of nodes may decrease with each successive layer.

Each node may operate according to an activation model, which determines the output value corresponding to an input value.

The output value of any node, for example a node of the low-level features (hidden layer 1), may be input to a node of the next layer connected to it, for example a node of the mid-level features (hidden layer 2). A node of the next layer, for example a node of the mid-level features (hidden layer 2), may receive the values output from a plurality of nodes of the low-level features (hidden layer 1).

Here, the input value of each node may be the output value of a node of the previous layer with a weight applied. The weight may represent the strength of the connection between nodes.

The deep learning process can thus also be viewed as a process of finding appropriate weights.

Likewise, the output value of any node of the mid-level features (hidden layer 2) may be input to a node of the next layer connected to it, for example a node of the high-level features (hidden layer 3). A node of the high-level features (hidden layer 3) may receive the values output from a plurality of nodes of the mid-level features (hidden layer 2).

Using the trained layer corresponding to each level, the artificial neural network may extract the feature information corresponding to that level. By abstracting sequentially, it may output the feature information of the highest level.
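
The layer-by-layer propagation described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the model of the specification: the sigmoid function stands in for the unspecified activation model, the bias terms and the helper constant_layer are hypothetical, the node counts are arbitrary, and training (the search for appropriate weights) is not shown.

```python
import math
from typing import List, Sequence, Tuple

# A layer is (weights, biases); weights[j][i] links input node i to output node j.
Layer = Tuple[List[List[float]], List[float]]


def sigmoid(x: float) -> float:
    # Assumed activation model; the specification does not name one.
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs: Sequence[float], layer: Layer) -> List[float]:
    """Each node sums the weighted outputs of the previous layer, then activates."""
    weights, biases = layer
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Pass the input through every layer in order and return the last output."""
    activations = list(inputs)
    for layer in layers:
        activations = layer_forward(activations, layer)
    return activations


def constant_layer(n_in: int, n_out: int, weight: float) -> Layer:
    """Hypothetical helper: a layer whose weights are all equal and biases zero."""
    return ([[weight] * n_in for _ in range(n_out)], [0.0] * n_out)


# Node counts shrink with depth: 8 inputs -> 6 -> 4 -> 3 hidden nodes -> 2 outputs.
network = [
    constant_layer(8, 6, 0.1),
    constant_layer(6, 4, 0.1),
    constant_layer(4, 3, 0.1),
    constant_layer(3, 2, 0.1),
]
scores = forward([0.5] * 8, network)
```

In a trained network the weights would differ from node to node; equal weights are used here only so that the example is deterministic.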

Accordingly, the voice recognition determination unit 151 may determine the user's intention regarding the movement of the endoscope camera 110 from the highest-level feature information.

That is, the voice recognition determination unit 151 may use the machine-learning-based voice recognition model to pick out, from the voice data acquired through the voice recognition unit 140, the words needed to control the operation of the endoscope camera 110, classify and pattern them, and thereby determine the user's intention regarding the movement of the endoscope camera 110.

The user's intention determined in this way may then be turned by the utterance instruction generation unit 153 into the utterance instruction needed to control the operation of the endoscope camera 110.

Meanwhile, when the movement of the endoscope camera 110 is controlled by voice as in this embodiment, the surgical site is observed only through the endoscope camera 110, so the endoscope camera 110 inserted into the operating space may collide with a robot arm 115. The likelihood of such a collision increases when a plurality of robot arms 115 are provided. If a robot arm 115 and the endoscope camera 110 collide, organs 15, blood vessels 17 such as arteries and veins, or nerves 19 located in the operating space 12 may be damaged.

In particular, in minimally invasive surgery using a surgical robot, the surgical site is confined and the control range of the endoscope camera 110 is correspondingly narrow. The user therefore needs to be made aware in real time not only of the feedback information described above, in which the utterance instruction is displayed as a caption over the endoscopic image on the image display unit 133 of the headset 130 so that the user can readily follow the movement of the endoscope camera 110, but also of information on the possibility of a collision between the endoscope camera 110 and a robot arm 115.

FIG. 5 is a block diagram showing the configuration of the control unit of a surgical robot system according to another embodiment.

Referring to FIG. 5, the control unit 150 of the surgical robot system 100 according to this embodiment may, compared with the control unit of the preceding embodiment, additionally be configured to calculate the probability of a collision between the endoscope camera 110 and the robot arm 115 and to transmit the calculated collision probability to the headset 130.

To this end, the control unit 150 of the surgical robot system 100 according to this embodiment may, compared with the control unit 150 of the preceding embodiment, further include a collision probability calculation unit 160.

The collision probability calculation unit 160 may express the possibility of a collision between the endoscope camera 110 and the robot arm 115 as a probability.

For example, the collision probability calculation unit 160 may take a distance of 0 between the endoscope camera 110 and the robot arm 115 as a collision probability of 100%, and calculate a lower collision probability as the distance between them increases.
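
One function with the two stated properties (100% at zero distance, decreasing as the distance grows) is exponential decay. The sketch below is an assumption made for illustration only: the specification does not give a formula, and the exponential form, the scale_mm parameter, the use of tip-to-tip Euclidean distance, and the choice of the nearest arm are all hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def collision_probability(distance_mm: float, scale_mm: float = 20.0) -> float:
    """Map a camera-to-arm distance to a probability in (0, 1].

    Assumed model: exp(-d / scale). It equals 1.0 at d = 0 and decreases
    monotonically with distance, matching the two properties in the text.
    """
    if distance_mm < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-distance_mm / scale_mm)


def camera_collision_probability(
    camera: Point, arms: Iterable[Point], scale_mm: float = 20.0
) -> float:
    """Use the nearest robot arm, since it dominates the collision risk (assumption)."""
    nearest = min(math.dist(camera, arm) for arm in arms)
    return collision_probability(nearest, scale_mm)
```

With several robot arms, taking the minimum distance means the reported value always reflects the arm closest to the camera; other aggregation rules are equally compatible with the text.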

The transmission unit 157 may then transmit the collision probability calculated by the collision probability calculation unit 160 to the headset 130, and the caption display unit 134 of the headset 130 may display the collision probability received from the transmission unit 157 as a caption on the image display unit 133.

FIG. 6 shows an example in which the collision probability is displayed as a caption on the image display unit 133 of the headset 130.

As shown in FIG. 6, because the collision probability between the endoscope camera 110 and the robot arm 115 is displayed as a caption over the endoscopic image on the image display unit 133 of the headset 130, the user can control the operation of the endoscope camera 110 while viewing the endoscopic image and the collision probability at the same time.

Accordingly, the surgical robot system 100 according to this embodiment displays the utterance instruction as a caption over the endoscopic image on the image display unit 133 of the headset 130, so that the user can readily follow the movement of the endoscope camera 110, and also lets the user perceive in real time information on the possibility of a collision between the endoscope camera 110 and the robot arm 115. The user can therefore control the movement of the endoscope camera 110 more safely and smoothly.

In addition, the surgical robot system 100 according to this embodiment may be configured so that the user can select the criterion above which the collision probability is displayed as a caption on the image display unit 133 of the headset 130, so that the user is not exposed to excessive collision probability information.

To this end, the voice recognition determination unit 151 may be configured to determine, from the voice data acquired through the voice recognition unit 140, not only the user's intention regarding the movement of the endoscope camera 110 but also the user's intention regarding the criterion for displaying the collision probability between the endoscope camera 110 and the robot arm 115.

Here, the voice recognition determination unit 151 may use the machine-learning-based voice recognition model to pick out from the voice data acquired through the voice recognition unit 140 both the words needed to control the operation of the endoscope camera 110 and the words relating to the criterion for displaying the collision probability between the endoscope camera 110 and the robot arm 115, classify and pattern them, and thereby determine both the user's intention regarding the movement of the endoscope camera 110 and the user's intention regarding the display criterion.

As described above, the voice recognition model may include a deep neural network built through deep-learning-based training.

In addition, the control unit 150 may further include a collision probability display determination unit 165.

When the collision probability calculated by the collision probability calculation unit 160 is higher than the collision probability given by the user's intention regarding the display criterion, as determined by the voice recognition determination unit 151, the collision probability display determination unit 165 may cause the transmission unit 157 to transmit the calculated collision probability to the headset 130.

For example, if the user says through the voice recognition unit 140 that the collision probability criterion is 80%, the voice recognition determination unit 151 identifies the criterion as 80%. When the collision probability calculated by the collision probability calculation unit 160 is below 80%, the collision probability display determination unit 165 keeps the transmission unit 157 from sending it to the headset 130, so that it is not displayed on the image display unit 133. When the calculated collision probability is 80% or more, the collision probability display determination unit 165 has the transmission unit 157 send it to the headset 130, so that it is displayed on the image display unit 133.
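
The display decision in the 80% example can be sketched as a single comparison. The function name caption_for and the caption wording are hypothetical. The sketch follows the worked example, in which a probability equal to the criterion is displayed ("80% or more"), whereas the general description speaks of a probability "higher than" the criterion; the comparison operator is the only place where the two readings differ.

```python
from typing import Optional


def caption_for(probability: float, criterion: float) -> Optional[str]:
    """Return the caption to send to the headset, or None to send nothing.

    Both arguments are fractions in [0, 1]; a criterion of 0.8 corresponds
    to the user saying "80%".
    """
    if not (0.0 <= probability <= 1.0 and 0.0 <= criterion <= 1.0):
        raise ValueError("probability and criterion must lie in [0, 1]")
    if probability >= criterion:  # "80% or more" per the worked example
        return f"Collision probability: {probability:.0%}"
    return None  # below the criterion: nothing is transmitted or displayed
```

Returning None rather than an empty string keeps "nothing to show" distinct from a caption, so the caller can skip the transmission entirely.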

Accordingly, the user sees the collision probability in the caption only in dangerous situations in which the probability of a collision between the endoscope camera 110 and the robot arm 115 is high. This prevents the user from being exposed to excessive collision probability information, promoting the safety of robotic surgery while improving user convenience.

As described above, the present invention relates to a surgical robot system in which the user, while viewing the endoscopic image through a VR (virtual reality) headset, can control the operation of the endoscope camera by voice through a voice recognition microphone, without using the hands; its embodiments may be modified in various ways. The present invention is therefore not limited to the embodiments disclosed in this specification, and all forms that a person of ordinary skill in the art to which the invention pertains could derive by modification also fall within the scope of the invention.

100: surgical robot system
110: endoscope camera
120: driving unit
130: headset
133: image display unit
134: caption display unit
140: voice recognition unit
150: control unit
151: voice recognition determination unit
153: utterance instruction generation unit
155: driving control signal generation unit
157: transmission unit
160: collision probability calculation unit
165: collision probability display determination unit

Claims

delete

A surgical robot system comprising:
an endoscope camera inserted into an operating space through a predetermined incision to capture images of a surgical site in real time;
a robot arm inserted into the operating space to operate on the surgical site;
a driving unit that moves the endoscope camera;
a headset that is detachably worn on a user's head and displays the images captured by the endoscope camera;
a voice recognition unit that recognizes the user's voice; and
a control unit that determines, from voice data recognized by the voice recognition unit, the user's intention regarding movement of the endoscope camera, transmits to the driving unit a driving control signal for controlling the driving unit based on that intention, calculates a collision probability between the endoscope camera and the robot arm, transmits the calculated collision probability to the headset, and transmits image information received from the endoscope camera to the headset,
wherein the control unit comprises:
a voice recognition determination unit that determines, from the voice data acquired through the voice recognition unit, the user's intention regarding the movement of the endoscope camera;
an utterance instruction generation unit that generates, based on the user's intention regarding the movement of the endoscope camera determined by the voice recognition determination unit, an utterance instruction needed to control the operation of the endoscope camera;
a driving control signal generation unit that generates, based on the utterance instruction, the driving control signal for controlling the driving unit;
a collision probability calculation unit that calculates the possibility of a collision between the endoscope camera and the robot arm as a probability; and
a transmission unit that transmits the image information received from the endoscope camera, the utterance instruction, and the collision probability calculated by the collision probability calculation unit to the headset, and transmits the driving control signal to the driving unit,
wherein the headset comprises:
an image display unit that displays the image information transmitted from the transmission unit; and
a caption display unit that displays the utterance instruction and the collision probability transmitted from the transmission unit as captions on the image display unit,
wherein the voice recognition determination unit is configured to determine the user's intention regarding a criterion for displaying the collision probability between the endoscope camera and the robot arm, and
wherein the control unit further comprises a collision probability display determination unit that causes the transmission unit to transmit the collision probability calculated by the collision probability calculation unit to the headset when the calculated collision probability is higher than the collision probability according to the user's intention regarding the criterion for displaying the collision probability, as determined by the voice recognition determination unit.

The surgical robot system of claim 12,
wherein the voice recognition determination unit uses a machine-learning-based voice recognition model to classify and pattern, from the voice data acquired through the voice recognition unit, the words needed to control the operation of the endoscope camera and the words relating to the criterion for displaying the collision probability between the endoscope camera and the robot arm, and thereby determines the user's intention regarding the movement of the endoscope camera and the user's intention regarding the criterion for displaying the collision probability between the endoscope camera and the robot arm.

The surgical robot system of claim 13,
wherein the voice recognition model comprises a deep neural network built through deep-learning-based training.

The surgical robot system of claim 13,
wherein the utterance instructions include 'Cam right', 'Cam left', 'Cam up', 'Cam down', 'Cam zoom in', and 'Cam zoom out'.

The surgical robot system of claim 12,
wherein the voice recognition unit is a microphone that is detachably provided on one side of the headset or provided integrally with the headset.