RU2693197C2

RU2693197C2 - Universal operator intelligent 3-d interface

Info

Publication number: RU2693197C2
Application number: RU2017115870A
Authority: RU
Inventors: Юрий Монгушевич Шыырап; Роман Юрьевич Скоробогатов
Priority date: 2017-05-04
Filing date: 2017-05-04
Publication date: 2019-07-01
Also published as: RU2017115870A; RU2017115870A3

Abstract

FIELD: physics.

SUBSTANCE: the invention relates to contactless interaction of a user with controlled devices. The technical result is achieved in that the user issues a control signal, a three-dimensional image is captured, the command is recognized, and the corresponding instruction is transmitted to the controlled device. The signal is transmitted in a minimally redundant form with a volume of 1–10 Mb, and the original command is restored on the controlled device by inverse intelligent-processing transforms. In real time, the operator universal intelligent 3D interface forms, supplements, and modifies a distributed database of spatial images of the operator, the operator space, and the working object, and develops a knowledge base.

EFFECT: a universal intelligent 3D interface that lets the operator interact with different technical systems in a uniform way that is as close as possible to natural human interaction.

1 cl, 5 dwg

Description

The invention relates to contactless interaction of a user with controlled devices in a distributed multi-agent information network, namely to the control of a local operating system, a distributed network computing system, robots, and automated production control systems.

Patent application US 20100020078 discloses a system and method for creating a three-dimensional map of an object [1]. The system (Fig. 1) comprises an imaging device (1) that generates and projects a plurality of beams onto the object and captures the image reflected by the object.

The processor (2) processes the image data generated by the imaging device to reconstruct a 3D map of the object. The image processor computes the 3D coordinates of points on the object's surface by determining differences in illumination intensity. The image processor may be a separate device or a chip inside the housing of the imaging device. The 3D map (3) produced by the image processor can be used for various purposes. For example, the map may be sent to an output device such as a display. If the object (4) is a body or a part of one, for example a hand, the system can provide a gesture-based user interface for interactive control of an application, in place of tactile interface elements such as a mouse or joystick.

Patent application US 20100199228, "Gesture Keyboarding", is also known [2]. According to its description, the system (Fig. 2) consists of a computing device (5) and a capture device (6). The computing device may be a computer, a gaming system, or a console, and may include hardware and/or software components.

According to the description of this method, the capture device is connected to the computing device through a communication link. The link may be wired (for example, a USB, FireWire, or Ethernet cable connection) or wireless. The capture device generates a skeleton model and transmits it over the communication link to the computing device, which receives depth data in order to recognize the gesture of the user (7) and control the application (8). The computing device may include a gesture recognizer. Captured data can be compared with the gestures in the recognizer in order to control applications.

Patent RU 2455676, "Method of controlling a device using gestures and a 3D sensor for its implementation", is also known [3]. In this method of controlling a device by gestures (Fig. 3) made by the user (9), a 3D sensor (10) captures a three-dimensional image, the gesture (11) is recognized, and the command corresponding to the gesture is issued to the controlled device. At least one sensor region (12) is defined in the space surrounding the user, at least one expected gesture is associated with each sensor region, and one command is associated with each combination of sensor region and gesture. The positions of the user's eyes, head, and hand are determined, and from them a notional line of sight (13) that originates at the user's eye and is directed at the point of the surrounding space the user is looking at. The method checks whether the notional line of sight is directed at a sensor region and analyzes the gesture made by the user's hand. If the gesture is made while the line of sight is directed at a sensor region, the command associated with that combination of sensor region and gesture is issued to the controlled device.
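The lookup described for RU 2455676 can be illustrated with a minimal sketch. It assumes spherical sensor regions and a ray-sphere test for the line of sight; the region shape, the names `Region`, `gaze_hits`, and `select_command`, and the sample command table are hypothetical and are not taken from the cited patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical spherical sensor region in the space around the user."""
    name: str
    center: tuple
    radius: float


def gaze_hits(eye, direction, region):
    """Return True if the ray from `eye` along `direction` passes through `region`."""
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0:
        return False
    unit = tuple(d / norm for d in direction)
    to_center = tuple(c - e for c, e in zip(region.center, eye))
    # Distance along the ray to the point closest to the region center.
    along = sum(t * u for t, u in zip(to_center, unit))
    if along < 0:
        return False  # the region is behind the user
    closest_sq = sum(t * t for t in to_center) - along * along
    return closest_sq <= region.radius ** 2


def select_command(eye, direction, gesture, regions, commands):
    """Return the command for (region, gesture) if the gaze is on a region, else None."""
    for region in regions:
        if gaze_hits(eye, direction, region):
            return commands.get((region.name, gesture))
    return None


regions = [Region("screen", (0.0, 0.0, 2.0), 0.5)]
commands = {("screen", "swipe"): "next_page"}
```

With these sample values, a swipe made while looking along the z axis yields `next_page`, and the same swipe made while looking sideways yields no command.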

Patent US 8933876, "Three dimensional user interface session control", is also known [4]. The method (Fig. 4) describes contactless interaction of the user (14) with a computer (15) by means of hand gestures (16) made within the field of view of a sensor (17) connected to the computer. When a gesture is detected, the contactless three-dimensional user interface transitions from one state to another.
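A gesture-driven state change of this kind can be sketched as a transition table. The state names and gesture names below are hypothetical placeholders chosen for illustration; the text above does not list the states or gestures used in US 8933876.

```python
# Hypothetical states and gestures; the source only says that a detected
# gesture moves the interface from one state to another.
TRANSITIONS = {
    ("locked", "focus"): "tracking",
    ("tracking", "unlock"): "active",
    ("active", "drop"): "locked",
}


def next_state(state, gesture):
    """Return the state after `gesture`; unknown combinations leave the state unchanged."""
    return TRANSITIONS.get((state, gesture), state)


def run(gestures, state="locked"):
    """Apply a sequence of detected gestures and return the final state."""
    for gesture in gestures:
        state = next_state(state, gesture)
    return state
```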

Patent US 8933876 is the closest to the proposed invention and is taken as the prototype.

A disadvantage of the prototype and of other known gesture recognition devices is that they use external interfaces, such as an S-Video cable, a coaxial cable, or a USB cable, to transmit the image. The fairly large volume of transmitted data and the limited bandwidth of the external interface restrict how quickly the system can respond to the user's gestures and, consequently, the speed of the control system as a whole.

Another disadvantage of existing gesture detection systems is the difficulty of recognizing both large and small gestures with a single device. This problem stems mainly from the low resolution of the captured 3D image, the limited accuracy of coordinate determination, and the insufficient speed of the system as a whole.

In the proposed method of control by gestures (Fig. 5) made by the operator(s) (18), the transmitting side (19) is connected to the receiving side (21) by a communication channel (20), over which the signal is transmitted in a minimally redundant form with a volume of 1-10 MB. The transmitting side recognizes the position of the operator(s) in the specified operator, working, and global (for example, geopositional) spaces. On the basis of distributed databases of spatial images (22) and the corresponding knowledge bases (23), code messages are formed that carry the coordinates of the skeleton nodes of the operator(s), the name of the operator(s) (spatial image), the control signals of a virtual device (mouse (26), keyboard (27), joystick (28), touch screen (29), microphone (30), Kinect (31)), the exchange data of the distributed knowledge bases and databases, the coordinates of the working objects present in the operator space together with their names and state descriptions, and data on the interaction of objects and subjects of the multi-agent information environment. On the receiving side, the original information stream is restored from the data received from the transmitting side by inverse intelligent-processing transforms.
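The idea of a compact code message and its inverse transform can be illustrated with a minimal sketch. The wire format here is an assumption made for illustration only: the patent does not specify a byte layout, and the function names `encode_message` and `decode_message`, the little-endian header, and the 32-bit coordinates are hypothetical.

```python
import struct

HEADER = "<BH"  # assumed header: name length (1 byte), node count (2 bytes)
NODE = "<3f"    # assumed node layout: x, y, z as 32-bit floats


def encode_message(operator, nodes):
    """Pack an operator name and skeleton node coordinates into bytes."""
    name = operator.encode("utf-8")
    header = struct.pack(HEADER, len(name), len(nodes))
    body = b"".join(struct.pack(NODE, *node) for node in nodes)
    return header + name + body


def decode_message(payload):
    """Inverse of encode_message: recover the operator name and node list."""
    name_len, count = struct.unpack_from(HEADER, payload, 0)
    offset = struct.calcsize(HEADER)
    operator = payload[offset:offset + name_len].decode("utf-8")
    offset += name_len
    step = struct.calcsize(NODE)
    nodes = [struct.unpack_from(NODE, payload, offset + i * step) for i in range(count)]
    return operator, nodes


message = encode_message("op1", [(0.5, 1.25, -2.0), (0.0, 0.75, 1.5)])
restored = decode_message(message)
```

In this layout each skeleton node costs 12 bytes, so the sample message with two nodes and a three-byte name occupies 30 bytes. Names longer than 255 bytes do not fit the assumed one-byte length field.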

The main element of the system is the intelligent module (IM) (24). The IM receives as input 3D video and audio information from a 3D scanner (25). The functions of the intelligent module are:

1) Extract from the overall information stream the part corresponding to individual objects (subjects) of the 3D operator space.

2) Recognize the extracted objects using the distributed database (DB).

3) Determine, for the operator(s) and working object(s), the coordinates of the skeleton nodes that unambiguously define their position.

4) Recognize commands from the obtained data in accordance with the established specification and transmit them to the receiving side as code messages.

5) Supplement the DB and transmit the necessary information over the communication channel.

6) Track dynamic changes in the coordinates of objects (subjects) of the operator space and store information about these changes in the DB. Using the information accumulated in the DB and machine learning algorithms, the IM derives new knowledge and adds it to the corresponding knowledge base (KB).
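Function 6 can be illustrated with a minimal sketch of change tracking, in which a new record is stored only when an object has moved by more than a tolerance. The class name `SpatialLog`, the in-memory list standing in for the DB, and the tolerance value are assumptions; the patent does not describe the storage scheme or any threshold.

```python
class SpatialLog:
    """Hypothetical in-memory stand-in for the DB of coordinate changes."""

    def __init__(self, tolerance=0.01):
        self.tolerance = tolerance  # assumed minimum movement worth recording
        self.last = {}              # object id -> last stored coordinates
        self.history = []           # (timestamp, object id, coordinates)

    def observe(self, timestamp, object_id, coords):
        """Store `coords` if the object moved beyond the tolerance; return True if stored."""
        previous = self.last.get(object_id)
        if previous is not None:
            moved = max(abs(a - b) for a, b in zip(coords, previous))
            if moved <= self.tolerance:
                return False
        self.last[object_id] = coords
        self.history.append((timestamp, object_id, coords))
        return True


log = SpatialLog()
results = [
    log.observe(0, "hand", (0.0, 0.0, 0.0)),
    log.observe(1, "hand", (0.0, 0.0, 0.005)),
    log.observe(2, "hand", (0.0, 0.5, 0.0)),
]
```

The accumulated `history` is the kind of record on which a learning step could later operate; that step is outside this sketch.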

The problem addressed by the claimed method is the recognition, encoding, transmission, and reproduction of the control actions of the operator(s) for a local computing system (32), a distributed network computing system (33), robots (34), and automated production control systems (35).

The technical result is that the operator universal intelligent 3D interface makes it possible to:

- provide universal interaction of the operator with various technical systems (a local computing system, a distributed network computing system, robots, automated production control systems) that is as close as possible to natural human interaction and is based on uniform movements, gestures, and voice commands, and also control these systems by generating the data of any conventional operator control hardware (keyboard, mouse, joystick, Kinect, touch screen, etc.);

- significantly reduce the amount of information transmitted over the communication channel by intelligently extracting and transmitting, in real time, the coordinates of the skeleton nodes used to form and recognize a given list of control commands, and by transmitting only the changing information about the 3D spatial images of the operator(s), the operator space, and the working object(s) that is needed to interpret operator control scenes.
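As a rough illustration of the second point, the sketch below compares the byte rate of a raw depth stream with that of skeleton node coordinates. The frame size, pixel depth, frame rate, joint count, and 32-bit coordinates are assumptions chosen for the arithmetic; the patent gives none of these figures.

```python
def raw_depth_rate(width, height, bytes_per_pixel, fps):
    """Bytes per second for an uncompressed depth stream."""
    return width * height * bytes_per_pixel * fps


def skeleton_rate(joints, bytes_per_coord, fps):
    """Bytes per second when only x, y, z of each skeleton node is sent."""
    return joints * 3 * bytes_per_coord * fps


# Assumed values: 640x480 depth frames, 16-bit pixels, 30 frames per second.
raw = raw_depth_rate(640, 480, 2, 30)   # 18_432_000 bytes/s
# Assumed values: 20 skeleton nodes, 32-bit coordinates, 30 updates per second.
skeleton = skeleton_rate(20, 4, 30)     # 7_200 bytes/s
ratio = raw // skeleton                 # 2560
```

Under these assumptions, sending skeleton nodes instead of raw depth frames reduces the byte rate by a factor of 2560.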

INFORMATION SOURCES

1. Patent application US 20100020078 A1, cl. G09G 5/00, G06T 17/00. "Depth mapping using multi-beam illumination". Alexander Shpunt.

2. Patent application US 20100199228 A1, cl. G06F 17/28, G06F 3/033. "Gesture Keyboarding". Stephen G. Latta.

3. Patent RU 2455676, cl. G06F 3/00. "Method of controlling a device using gestures and a 3D sensor for its implementation". A.V. Valik, P.A. Zaitsev, D.A. Morozov.

4. Patent US 8933876 B2, cl. G06F 3/00, G06F 3/01, G09G 5/00, G06F 3/03. "Three dimensional user interface session control". Micha Galor, Jonathan Pokrass, Amir Hoffhung.

Claims

An operator universal intelligent 3D interface, comprising the issuing of a control signal (gesture, voice command, movement) by the user, capturing a three-dimensional image, recognizing the command, and transmitting the corresponding command to the controlled device, characterized in that the signal is transmitted in a minimally redundant form with a volume of 1-10 MB, and the original command is restored on the controlled device by inverse intelligent-processing transforms, wherein the operator universal intelligent 3D interface makes it possible to form, supplement, and modify in real time a distributed database of spatial images of the operator, the operator space, and the working object, and to develop a knowledge base for tasks of intelligent computer control and online interaction, and for manufacturing processes and robotics.