RU2789609C1

RU2789609C1 - Method for tracking, detection and identification of objects of interest and autonomous device with protection from copying and hacking for their implementation

Info

Publication number: RU2789609C1
Application number: RU2021137574A
Authority: RU
Inventors: Константин Викторович Глебов; Алексей Владимирович Долгополов; Павел Александрович Казанцев; Павел Вячеславович Скрибцов; Сергей Олегович Суриков; Владимир Юрьевич Сухоруков; Денис Владимирович Тюляев
Original assignee: Ооо "Мирп-Ис"
Filing date: 2021-12-17
Publication date: 2023-02-06

Abstract

FIELD: detection equipment.

SUBSTANCE: the invention relates to methods and a device for tracking, detecting and identifying persons. The method for detecting persons of interest includes the following stages: modifying the weight coefficients of deep neural networks to a level at which persons can no longer be detected but most of the required computation can still be performed; setting the size N of a processing cycle; filtering motion zones; forming a face detection thread; forming a tracking module queue; forming a sequence of face images of the same person; forming a completed-tracking queue; filtering the completed-tracking queue to select the best images; determining a face image signature in the detection thread; determining the person's age and sex; comparing the face image signature with the face image signatures in the memory of the computing device; storing the obtained data.

EFFECT: reduced computing capacity required by an autonomous video analytics system, fewer detection errors, faster identification of persons of interest, and protection of the neural network.

13 cl, 11 dwg

Description

The claimed group of inventions relates to the field of computer technology, in particular to methods and devices for tracking, detecting and identifying persons, and can be used as an autonomous video analytics system for recognizing and tracking persons, estimating their sex and age, and counting persons in stores, offices, industrial enterprises and similar sites.

Video analytics systems are currently widely used in retail (person recognition for queue monitoring, analysis of customer movement trajectories, heat maps, collection of demographic data on store visitors), in industry (detection of human presence in hazardous areas, monitoring of operations, biometric access control, checking of compliance with safety regulations) and in transport (face recognition and person detection for passenger-flow analytics, searching for offenders, monitoring of the driver's actions and condition).

Methods for detecting human objects in video are known (RU 2635066 C2, 08.11.2017). According to the first claim, they comprise: determining the pixels of a video image that are priority pixels, the group of priority pixels constituting a set of one or more priority blobs; for each of N predefined shapes at a corresponding one of N locations in the video image, where N is an integer, comparing the corresponding predefined shape with the set of priority blobs to obtain a corresponding probability that a person is present at that location, thereby obtaining N probabilities corresponding to the N locations; using the N probabilities to determine X people represented by the set of priority blobs, where X is an integer; and issuing at least one of an alert report and an event detection report based on the determined representation of X people. In an alternative variant, for each of N predefined shapes at a corresponding one of N predefined locations in the video image, the corresponding predefined shape is compared with the set of priority blobs to determine X people represented by the set of priority blobs, the location of each of these X people being determined as a location in the horizontal plane of the real world, and at least one of an alert report and an event detection report is issued when the crowd density exceeds a threshold, based on the determined representation of X people.

The disadvantage of this group of methods is that it can neither provide statistical information from estimates of the sex and age of persons nor identify persons of interest.

A method and system for recognizing images from a video stream are known (RU 2714901 C1, 02.20.2020). The method is performed by at least one computing device and comprises: receiving from at least one camera at least one frame containing at least one product from a kiosk, each frame being divided by a virtual checkout line; passing the frame obtained in the previous step to the input of a trained artificial neural network, which identifies at least one product in the frame by obtaining the coordinates of the product and the class to which it belongs; determining, by means of the computing device, whether the center of the product is behind the virtual checkout line or outside it, such that if at least one product has appeared in the checkout area and crossed the line, that product is added to a virtual basket; and sending the contents of the virtual basket formed in the previous step to a server to settle with the user for at least one removed product.
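The line-crossing test in this prior-art method can be illustrated with a minimal sketch. It assumes a vertical checkout line at a fixed x coordinate and a per-product track identifier; neither detail is given in the cited description, and the names `Detection`, `VirtualBasket` and `line_x` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    track_id: int   # hypothetical identifier that follows one product across frames
    label: str      # product class returned by the neural network
    box: tuple      # (x_min, y_min, x_max, y_max) in pixels


def center_x(box):
    """Horizontal coordinate of the bounding-box center."""
    return (box[0] + box[2]) / 2


@dataclass
class VirtualBasket:
    line_x: float                                   # assumed vertical checkout line
    last_side: dict = field(default_factory=dict)   # track_id -> was inside?
    items: list = field(default_factory=list)

    def update(self, det):
        """Add the product once, when its center moves from outside to inside."""
        inside = center_x(det.box) > self.line_x
        was_inside = self.last_side.get(det.track_id)
        if was_inside is False and inside:
            self.items.append(det.label)
        self.last_side[det.track_id] = inside


basket = VirtualBasket(line_x=100)
basket.update(Detection(1, "cola", (10, 0, 30, 20)))     # center at x=20, outside
basket.update(Detection(1, "cola", (110, 0, 130, 20)))   # center at x=120, crossed
basket.update(Detection(1, "cola", (120, 0, 140, 20)))   # still inside, no duplicate
print(basket.items)
```

Remembering the previous side for each track means a product is counted only at the moment it crosses, rather than on every frame in which it is visible inside the checkout area.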

The disadvantage of this method and system is that only one region of interest can be analyzed: the area inside the virtual checkout line.

A method and system for ensuring security by biometric identification of persons are known (WO 2008081051 A1, 10.07.2008). The method includes the following steps: reading biometric data; composing a message containing the terminal identification, the biometric reading and an instruction; encrypting the message; sending it over a virtual private network on the Internet; receiving and decrypting the message at a server computer; identifying the person; matching the terminal identifier, person identifier and instruction identifier against a database; and, in the case of a positive match, sending an encrypted message to the security means specified by the database. The security system includes security means controlled from terminals with a biometric reader, an encryption device, and a device for communicating over the Internet with one or more server computers via a virtual private network.

The disadvantages of this method and system are the need to spend computing power on encrypting and decrypting biometric data; the storage of data on remote resources in a form that is encrypted but not anonymized, which creates the conditions for unauthorized access to biometric data over the Internet; and the inability of the system to operate, and of the method to be carried out, without Internet access.

An age recognition method and device and a training method and device for an age recognition model are known (CN111914772 A, 10.11.2020). They include the following steps: after images containing the faces to be recognized are obtained, inputting an image into a pre-trained age recognition model, in which the age recognition model recognizes the sex of the faces to be recognized on the basis of their features, excluding gender features; and outputting the age recognition results for the faces to be recognized. According to the method, when the ages of the faces are recognized, age recognition is performed on the basis of features unrelated to sex, so that gender factors that significantly affect the age recognition result are eliminated. This increases the accuracy of age recognition, makes the model easier to train, and improves the generalization ability of the age recognition model.

The disadvantage is the need for preliminary sex recognition and the exclusion of gender features.

An electronic device for recognizing face identity and face attributes in an image using an augmented convolutional neural network is known (KR 20200010993 A, 31.01.2020). It comprises a memory storing a convolutional neural network (CNN) trained, on training data including a plurality of images, to extract an identification feature for identifying a face included in at least one image. The image is input to the CNN to obtain at least one identification feature for the input image; the obtained identification feature is input to at least one hidden layer to which dropout regularization is applied; and a property of the face included in the input image is identified and recognized through one or more independent fully connected layers based on the output of the hidden layer.

The disadvantages are the need to use an identification feature included in at least one image, and the fact that the age and sex of people are estimated by a predictive calculation for individual photographs of the processed cluster.

A system and method for age estimation and/or sex recognition based on facial features are known (CN 110991256 A, 10.04.2020). The method includes the following steps: obtaining an image captured by front-end equipment as the input image; detecting and calibrating faces in the input image with a face detection and calibration network, and applying white balance processing to the image data after detection and calibration; sending the processed image data to a model for extracting facial features to obtain the features of the face in the image; mapping the obtained facial features and equal numbers of male and female facial features into the same multidimensional space, calculating the distances between the feature points, and evaluating the sums of the distances from the feature points to the male and to the female facial features to obtain the sex; and, likewise, mapping the obtained facial features and the facial features of each age group into the same multidimensional space, calculating the distances between the feature points, and evaluating the distance between the feature points and each age group to obtain the age group to which the feature points belong.
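The distance-sum comparison described for this prior-art method can be sketched as follows. This is only an illustration under stated assumptions: Euclidean distance is assumed (the cited description does not name a metric), the feature vectors are toy values, and `classify_by_distance_sum` is a hypothetical name.

```python
import math


def classify_by_distance_sum(query, references):
    """Return the label whose reference vectors have the smallest summed
    Euclidean distance to the query, together with all per-label sums.

    references maps a label to a list of feature vectors; each label is
    assumed to hold the same number of vectors, as the description requires.
    """
    totals = {
        label: sum(math.dist(query, ref) for ref in refs)
        for label, refs in references.items()
    }
    return min(totals, key=totals.get), totals


# Toy two-dimensional features; real face features would have many more dimensions.
references = {
    "male": [(0.0, 0.0), (1.0, 0.0)],
    "female": [(5.0, 5.0), (6.0, 5.0)],
}
label, totals = classify_by_distance_sum((0.5, 0.0), references)
print(label, totals)
```

The same function applies to age groups if the dictionary keys are age-group labels instead of the two sex labels.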

The disadvantage is that the uniqueness of the face in the captured image cannot be verified.

A method for verifying a person's identity using a portable data carrier on which encoded personal identification information signals have been stored is known (US4532508A, 30.07.1985). It includes the following steps: sensing personal characteristics of the person to be identified; obtaining current information signals indicative of the sensed personal characteristics; deriving an encryption key from the current information signals; obtaining a first set of comparison signals from the current information signals; detecting the encoded personal identification information signals stored on the portable data carrier; decoding the encoded personal identification information signals read from the portable data carrier, the scrambling key being used to decode them; obtaining a second set of comparison signals from the decoded personal identification information signals; and comparing the first and second sets of comparison signals to verify the person's identity.

The disadvantage here is likewise the need to spend computing power on encoding and decoding the personal identification information signals, and the storage of data in a form that is encrypted but not anonymized, which creates the conditions for unauthorized access to personal identification information.

A common disadvantage of the identified technical solutions is that face recognition and tracking require high-performance computers to be installed and to interact with the video cameras that produce the video streams. Existing pattern recognition solutions do not provide efficient autonomous recognition of faces, sex and age on low-cost equipment of small size and low power consumption. Nor do they protect the installed software against copying and hacking.

The known method for simultaneous recognition of face attributes and identification of persons when organizing photo albums (RU2710942C1, 14.01.2020) can be regarded as the prototype for the tracking method and for the method of detecting and identifying persons of interest. It is based on a modification of an efficient convolutional neural network that extracts face representations suitable for face identification and attribute recognition (age, sex, ethnicity, emotions, etc.). The method solves all of these tasks simultaneously without requiring additional convolutional neural networks, which yields a very fast face analysis system. In one of the dependent claims of this method, regions associated with faces are detected in one or more input images, and these regions are used as at least parts of the one or more input images from which the layers of the base convolutional neural network extract features suitable for face identification.

The disadvantage is that data cannot be stored in anonymized form.

The known electronic device for recognizing face identity and face attributes in an image using an augmented convolutional neural network (KR20200010993A, 31.01.2020) can be regarded as the prototype for the autonomous device that implements the tracking method and the method of detecting and identifying persons of interest. It is implemented for automatic analysis of sex, age and so on, that is, of the customer base, by automatically recognizing a customer's face when the customer visits a store and buys a product while driving himself, and the publication also covers a method therefor. In the device, a camera unit senses the approach of the customer and focuses on the customer's face in the photograph; a face recognition unit recognizes the customer's face photographed by the camera unit and caches the face image data; a customer base analysis unit analyzes the customer base by extracting feature points from the face image data cached by the face recognition unit; and a data storage unit stores the customer base data analyzed by the customer base analysis unit.

The disadvantage is that data cannot be stored in anonymized form.

The technical problem is to create an efficient autonomous video analytics system with protection against copying and hacking, built on low-cost equipment of small size and low power consumption, that implements tracking, detection and identification of persons of interest, and to develop fast methods for tracking, detecting and identifying persons of interest that determine the total number of faces, the number of unique faces, the sex and age of the persons, and the number of persons of interest that entered the camera's field of view.

The technical result of the claimed group of inventions is a reduction in the computing capacity required by an autonomous video analytics system, a reduction in the number of recognition errors, and an increase in the speed of tracking, detection and identification of persons of interest, including determination of the total number of faces, the number of unique faces, the sex and age of the persons, and the number of persons of interest that entered the camera's field of view. A further technical result is protection of the neural network against copying and hacking and protection of personal data.

These technical results are achieved by the following essential features.

The tracking method includes a preparatory stage: software comprising a set of interconnected neural network algorithms is pre-installed using a computer connected locally to the computing device; the software is trained; the weight coefficients of the deep neural networks are modified to a level at which face recognition and determination of face properties become impossible, while most of the required computation can still be performed; the modified weight coefficients are written to a dedicated memory area of the computing device; and the processing cycle size N is set to an integer number of frames.
Further, directly in the course of tracking, a video stream of images is received from one or several matrix video cameras, faces are detected on the frame by means of a computing device, on the frames of the video stream of images, except for each Nth, in real time, motion zones are filtered by means of a computing device, a queue is formed of the detector module, from each N-th frames of the video stream of images with filtered motion zones, for each filtered motion zone, a face detection stream is formed, by means of a computing device, in each face detection stream, face detection is performed on the frame in real time, a queue is formed from frames with detected faces of the tracking module, which is fed into the stream of the tracking (tracking) module, in the stream of the tracking (tracking) module, sequences of images of the faces of the same person are formed, for each sequence of images of the faces of the same person, a queue is formed of the produced tracking, the queue of the produced tracking is filtered in order to select the best images by selecting according to the parameters of the image size, the orientation of the face in the image, the degree of severity of image defects, the selected images from the queue of the produced tracking are fed into the recognition stream, in the recognition stream offline by means of a computing device and the trained first neural network determines the signature of the face image in the form of a vector of real numbers, by means of the computing device and the second trained neural network the age of the person is determined from the face image, and by means of the computing device and the third trained neural network the gender of the person is determined from the image. 
After that, the face image signature is compared with the face image signatures in the memory of the computing device, and if the similarity of the signature obtained in the recognition stream with the face image signatures in the memory of the computing device is less than a specified threshold, then the face image signature is entered into the memory of the computing device, then data on the total number of persons, the number of unique persons, gender and age of persons are entered into the memory of the computing device.

The method for detecting and identifying persons of interest performs the same actions as the tracking method but additionally includes preliminary loading of images of the persons of interest into a dedicated memory area of the computing device and determination of the signature of the image of each person of interest. To indicate that a person belongs to a group of persons of interest, at least one element of the vector describing the signature is used as an indicator. One person can belong to several groups at once, and a separate indicator can be used for each group. If the similarity of the signature obtained in the recognition thread to the face image signatures in the memory of the computing device is greater than the specified threshold, and the signature obtained in the recognition thread contains an indicator of belonging to a group of persons of interest, then data on the number of persons of interest that have entered the field of view of the video camera are entered into the memory of the computing device.

Statistical data on the total number of persons, the number of unique persons, and the sex and age of the persons and, when the method for detecting and identifying persons of interest is implemented, data on the number of persons of interest that have entered the field of view of the video camera can be transferred to the memory of the computing device, to a connected display, or to a connected USB storage device.

The stand-alone device with protection against copying and hacking for implementing the methods comprises a housing that accommodates a computing device; a matrix video camera with a lens, configured to capture an image of the monitored zone and connected to the computing device; a memory card connected to the computing device; and a power cell connected to the computing device. Software installed on the computing device contains a set of interconnected programs and neural network algorithms for detecting faces in the image, determining face image signatures, and determining the age and sex of persons. A programmable logic controller is placed inside the housing and connected to the computing device through one of the input/output interfaces of the computing device; the programmable logic controller protects the software and the stored signatures.

The sequence of actions mentioned in the tracking method, in which motion zones are filtered in real time by means of the computing device on the frames of the video stream except for every N-th frame, a detector module queue is formed from every N-th frame of the video stream with filtered motion zones, a face detection thread is formed for each filtered motion zone, face detection on the frame is performed in real time in each face detection thread, a tracking module queue is formed from the frames with detected faces and fed into the tracking module thread, sequences of face images of one and the same person are formed in the tracking module thread, a completed-tracking queue is formed for each such sequence, and the completed-tracking queue is filtered to select the best images according to image size, orientation of the face in the image, and severity of image defects, saves computing resources and improves face recognition accuracy through preliminary selection of the face images best suited for signature calculation. Saving computing resources in turn increases the speed of tracking as well as of detection and identification of persons of interest.

This is because, first, the computing power of the device is not spent on calculating signatures for low-quality images and, second, the images most suitable for recognition are selected from the tracking thread to improve recognition quality. The following selection criteria are used:

- image size: the largest image that satisfies all other conditions is selected;

- orientation of the face in the image, which is estimated in two ways: from the characteristic points of the face (found during face image alignment) and from information obtained during operation of the face detector (if it is available for the particular image); other things being equal, preference is given to frontal faces;

- image quality: image defects (level of overexposure, blur, etc.) are assessed and, other things being equal, the highest-quality image in the track is selected.
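The three criteria above can be sketched in Python as a filter followed by a ranking. This is only an illustration under stated assumptions: the `FaceCrop` fields, the threshold values, and the tie-breaking order are hypothetical and are not specified in the description.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class FaceCrop:
    """One face image from a track; all fields are illustrative assumptions."""
    width: int
    height: int
    yaw_deg: float       # 0 means a frontal face
    blur: float          # 0 (sharp) .. 1 (fully blurred)
    overexposure: float  # 0 (none) .. 1 (fully blown out)


def select_best(track: Iterable[FaceCrop],
                max_yaw_deg: float = 20.0,
                max_blur: float = 0.5,
                max_overexposure: float = 0.5) -> Optional[FaceCrop]:
    """Return the best crop of a track, or None if every crop is rejected."""
    # Keep only crops that satisfy the orientation and quality conditions.
    usable = [c for c in track
              if abs(c.yaw_deg) <= max_yaw_deg
              and c.blur <= max_blur
              and c.overexposure <= max_overexposure]
    if not usable:
        return None
    # Largest area first; ties go to the more frontal, then the cleaner crop.
    return max(usable, key=lambda c: (c.width * c.height,
                                      -abs(c.yaw_deg),
                                      -(c.blur + c.overexposure)))
```

Returning `None` for a track with no usable crop corresponds to not spending signature computation on low-quality images.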

Such selection reduces the computational load on the device by approximately 40% and the number of recognition errors by 15%.

Performing face detection on the frame on parallel cores of the central processing unit of the computing device, and performing determination of the face image signature on parallel cores of the graphics processing unit of the computing device, both further increase the speed of tracking as well as of detection and identification of persons of interest.

The sequence of actions mentioned in the tracking method, in which the weight coefficients of the deep neural networks are modified once in advance to a level at which face recognition and determination of face properties become impossible, while it remains possible to perform most of the required amount of computation, the modified weight coefficients are loaded into a dedicated memory area of the computing device, and then, in the recognition thread in offline mode, the face image signature is determined in the form of a vector of real numbers by means of the computing device and the trained first neural network, together with the use in the stand-alone device of a programmable logic controller that protects the software and the stored signatures, achieves the technical result of protecting the neural network from copying and hacking and of protecting personal data through the use of anonymized data for collecting statistical information.

The implementation of the group of inventions is illustrated by the following drawings:

Fig. 1 - multithreaded software model using the CPU and GPU;

Fig. 2 - diagram of the analysis of an Appearance by the statistics module;

Fig. 3 - weight coefficient distortion method;

Fig. 4 - implementation of the weight coefficient distortion method;

Fig. 5 - graph of the number of faces detected using a neural network with distorted coefficients (blue - with the security key, orange - without the security key);

Fig. 6 - diagram of the time taken to calculate the outputs of a neural network with distorted coefficients and the time taken to restore the correct output vector;

Fig. 7 - stand-alone device with protection against copying and hacking for implementing the tracking method and the method for detecting and identifying persons of interest;

Fig. 8 - graph of the total number of visitors and the number of unique visitors based on statistical information obtained in the course of the claimed methods;

Fig. 9 - distribution of persons by age;

Fig. 10 and 11 - examples of the physical implementation of the stand-alone device.

The software can be trained using a dataset for training and testing face recognition models (Labeled Faces in the Wild, LFW) [1].

The processing cycle size, equal to an integer number of frames, is selected in the range from 2 to 16. Decreasing this number below the range increases the load on the processor of the computing device and reduces performance, while increasing it above the range may cause a face to be lost because it leaves the field of view of the video camera during fast movement.
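One possible reading of the processing cycle is sketched below: every N-th frame is marked for the detector queue and all other frames are marked for motion-zone filtering. The function name `route_frames` and the labels `"detect"` and `"motion"` are hypothetical; only the range 2 to 16 comes from the description.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def route_frames(frames: Iterable[Frame], n: int) -> Iterator[Tuple[str, Frame]]:
    """Yield ("detect", frame) for every n-th frame and ("motion", frame) otherwise."""
    if not 2 <= n <= 16:
        raise ValueError("processing cycle size N must be in the range 2..16")
    # Frames are counted from 1, so the n-th, 2n-th, ... frames go to detection.
    for index, frame in enumerate(frames, start=1):
        yield ("detect" if index % n == 0 else "motion", frame)
```

With N = 4, for example, one frame in four would reach the comparatively expensive face detector.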

The vector of real numbers for the face image signature is chosen to have, for example, 256 elements. Reducing the signature size lowers identification accuracy, while increasing it raises the load on the processor of the computing device, reduces performance, and increases the amount of anonymized information that must be stored in the memory of the computing device. Signatures are compared using the cosine measure.
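For reference, the cosine measure of two signatures a and b is their dot product divided by the product of their Euclidean norms. A minimal Python sketch follows; the function name is an assumption, and the description does not state whether signatures are normalized beforehand.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two signatures of equal length."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("a signature must not be the zero vector")
    return dot / (norm_a * norm_b)
```

The result lies between -1 and 1, with 1 meaning that the two signatures point in the same direction.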

The software that implements the tracking method is based on a multithreaded model. In this model the whole processing pipeline is divided into separate tasks: the main loop, face detection, face tracking, and recognition (Fig. 1). Each task runs in a separate CPU thread. The detection task uses a pool of CPU threads. The recognition task uses the GPU for its calculations. Priority queues are used to exchange data between threads. This model makes it possible to use CPU and GPU resources in parallel and thereby increase performance.
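The structure of such a model can be sketched with the Python standard library, using one thread per task and a priority queue between tasks. This is a simplified illustration, not the software of the invention: the stage functions are placeholders that only tag a string, the frame number is assumed to serve as the priority, and the detection thread pool and GPU work are omitted.

```python
import queue
import threading
from typing import Callable, List

SENTINEL_PRIORITY = float("inf")  # sorts after every real item


def stage(inbox: queue.PriorityQueue, outbox: queue.PriorityQueue,
          work: Callable[[str], str]) -> None:
    """Take items in priority order, process them, and pass them on."""
    while True:
        priority, item = inbox.get()
        if item is None:  # shutdown marker: forward it and stop
            outbox.put((priority, None))
            return
        outbox.put((priority, work(item)))


def run_pipeline(frames: List[str]) -> List[str]:
    detect_q, track_q, recog_q, done_q = (queue.PriorityQueue() for _ in range(4))
    stages = [
        (detect_q, track_q, lambda f: f + ">detected"),
        (track_q, recog_q, lambda f: f + ">tracked"),
        (recog_q, done_q, lambda f: f + ">recognized"),
    ]
    threads = [threading.Thread(target=stage, args=s) for s in stages]
    for t in threads:
        t.start()
    # The main loop feeds the detector queue; the frame number is the priority.
    for number, frame in enumerate(frames):
        detect_q.put((number, frame))
    detect_q.put((SENTINEL_PRIORITY, None))
    for t in threads:
        t.join()
    # Drain the final queue in priority order until the shutdown marker.
    results = []
    while True:
        _, item = done_q.get()
        if item is None:
            return results
        results.append(item)
```

Because every stage only blocks on its own input queue, a slow recognition stage does not stall detection or tracking of newer frames.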

The general scheme of operation of the software in the multithreaded model is as follows. In the main loop, a video stream of images is received from one matrix video camera. On the frames of the video stream, except for every N-th frame, motion zones are filtered in real time by means of the computing device, and a queue is formed from every N-th frame of the video stream with filtered motion zones. The resulting frames with filtered motion zones are fed into the detector module queue. Each detection module thread checks the queue and processes a frame with filtered motion zones in parallel with the other detection threads. The main task of the detection module is face detection. The detection module generates frames with detected faces in real time and passes them to the queue of the tracking module. In the tracking module thread, Appearances are formed: sequences of Appearance data structures in which space is allocated for a face image, a signature, age, and sex. While an Appearances sequence is being formed, face images of one and the same person are written to it. The Appearances formed by the tracking module enter the queue of the recognition module.

The tracking module tracks the faces found by the face detection module and can track multiple faces simultaneously. The DSST (Discriminative Scale Space Tracker) correlation algorithm is used as the basic algorithm for tracking a single face. A separate tracker is used for each face sequence. To track multiple faces, a correspondence matrix between the existing trackers and the found faces is formed first. The measure of correspondence is a function with the following components: location, size, LBP (local binary pattern) descriptor, and time. The correspondence matrix is then analyzed to form the best matches between the existing trackers and the found faces. After that, the existing trackers are updated using information about the found faces, or new trackers are created for found faces that have no match. A tracker stops tracking if the length of the Appearance being formed exceeds a specified threshold or if the tracked object changes abruptly (for example, when it is occluded by another face or by other scene objects).
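The matching step can be sketched as follows. The description does not say how the correspondence matrix is analyzed, so greedy lowest-cost assignment is used here purely as an assumption; the cost function covers only location and size (the LBP descriptor and time components are omitted), and the `max_cost` cut-off is hypothetical.

```python
import math
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def match_cost(tracker: Box, face: Box) -> float:
    """Smaller is better: center distance plus size difference (illustrative)."""
    tx, ty, tw, th = tracker
    fx, fy, fw, fh = face
    return math.hypot(tx - fx, ty - fy) + abs(tw - fw) + abs(th - fh)


def match(trackers: List[Box], faces: List[Box],
          max_cost: float = 50.0) -> Tuple[Dict[int, int], List[int]]:
    """Return {tracker_index: face_index} and the indices of unmatched faces."""
    # Correspondence matrix: one row per tracker, one column per found face.
    matrix = [[match_cost(t, f) for f in faces] for t in trackers]
    pairs = sorted((matrix[i][j], i, j)
                   for i in range(len(trackers)) for j in range(len(faces)))
    assigned: Dict[int, int] = {}
    used_faces = set()
    # Greedy pass: accept the cheapest remaining pair until the cut-off.
    for cost, i, j in pairs:
        if cost > max_cost:
            break
        if i not in assigned and j not in used_faces:
            assigned[i] = j
            used_faces.add(j)
    unmatched = [j for j in range(len(faces)) if j not in used_faces]
    return assigned, unmatched
```

Matched trackers would then be updated with their faces, and each unmatched face would start a new tracker.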

As a result, sequences of face images of one and the same person are formed at the output of the tracking module. From these sequences, the faces that can be recognized well are then selected. This approach reduces the computational load on the recognition module and improves the performance of the system as a whole. The following criteria are used to select faces: illumination, orientation (the face must be frontal), and blur (faces that are too blurred are recognized poorly).

The selected faces are then recognized in the recognition module: face signatures are formed and sex and age are determined. The resulting face signatures, sex, and age are written to the Appearances.

After recognition, the Appearances are passed to the statistics module, which analyzes them to determine the uniqueness of the faces and adds them to the local database located in the memory of the computing device.

The statistics module receives Appearances containing face signatures as input. An analysis process is performed for each Appearance (Fig. 2). First, the local database is searched for the best-matching person. During the search, a cosine metric is used to compare two signatures. If a best-matching person (hereinafter BestPerson) is found, the possibility of merging the Appearance with BestPerson is checked: if the similarity value found using the cosine metric is less than the specified threshold, the Appearance is merged with BestPerson. During merging, only those face signatures that differ strongly from the signatures already recorded in BestPerson are added to it.

If BestPerson is not found, or if BestPerson cannot be merged with the Appearance, a new person is created from the Appearance and written to the local database.
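The analysis of one Appearance can be sketched as follows. Because merging happens when the metric value is below the threshold, this sketch assumes that the metric behaves as a cosine distance (1 minus the cosine of the angle), so that smaller values mean more similar faces; that reading, the class name `PersonDatabase`, and both threshold values are assumptions. Each Appearance is reduced to a single signature for brevity.

```python
import math
from typing import List, Optional, Sequence, Tuple

Signature = Sequence[float]


def cosine_distance(a: Signature, b: Signature) -> float:
    """1 minus the cosine of the angle between two non-zero signatures."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


class PersonDatabase:
    """In-memory stand-in for the local database of persons."""

    def __init__(self, merge_threshold: float = 0.4,
                 novelty_threshold: float = 0.1) -> None:
        self.persons: List[List[Signature]] = []
        self.merge_threshold = merge_threshold
        self.novelty_threshold = novelty_threshold
        self.total_appearances = 0

    def _best_person(self, signature: Signature) -> Tuple[Optional[int], float]:
        """Return the index of the closest person and the distance to it."""
        best_index, best_distance = None, math.inf
        for index, stored in enumerate(self.persons):
            distance = min(cosine_distance(signature, s) for s in stored)
            if distance < best_distance:
                best_index, best_distance = index, distance
        return best_index, best_distance

    def add_appearance(self, signature: Signature) -> int:
        """Merge into the best person or create a new one; return its index."""
        self.total_appearances += 1
        index, distance = self._best_person(signature)
        if index is not None and distance < self.merge_threshold:
            # Store the signature only if it differs enough from stored ones.
            if distance > self.novelty_threshold:
                self.persons[index].append(signature)
            return index
        self.persons.append([signature])
        return len(self.persons) - 1
```

In this sketch `total_appearances` and `len(persons)` correspond to the total and unique person counts kept as statistics.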

Storing in the memory of the computing device modified weight coefficients of the neural network, which do not allow the network to be used directly for recognition, makes the proposed methods for tracking, detecting, and identifying persons resistant to hacking by means of adversarial attacks. Such attacks consist in searching for input signals for the neural network (in this case, images) on which the network produces an incorrect output signal. Since the original (unmodified) weight coefficients of the neural network cannot be obtained from the memory of the computing device even with access to the device, finding the input signals required for an adversarial attack becomes an extremely difficult and, in most cases, infeasible task.

To protect the face recognition neural network from unauthorized use, a weight coefficient distortion method has been developed and is applied (Fig. 3).

In this method, the weights (W) and biases (B) of a fully connected layer are extracted from the neural network N. The weights W are represented in the 32-bit IEEE 754 format (1 sign bit, 8 exponent bits, 23 mantissa bits). The signs (S) of all weights W are extracted. A secret key K, whose dimension equals the number of outputs of the layer, is then generated using a random number generator. The extracted (true) biases B, the weight signs S, and the secret key K are written to the memory of the PLC. Next, the weights W, the biases B, and the signs of the weights W are modified: the biases and the weight signs are set to zero, and the exponents of the weights W are modified by a specified function F(W, K) using the secret key K. The new weights (W_new) and biases (B_new) are saved back to the fully connected layer, and the new protected network N_s is written to disk.
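The sign-extraction part of this method can be illustrated on the IEEE 754 bit pattern directly. The sketch below covers only splitting a 32-bit weight into its fields, zeroing the sign bits, and putting them back; the exponent function F(W, K) and the key K are not specified in the description and are deliberately left out. All function names are hypothetical.

```python
import struct
from typing import List, Tuple


def float_to_bits(value: float) -> int:
    """Return the 32-bit IEEE 754 pattern of a value rounded to float32."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 float32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def split_fields(value: float) -> Tuple[int, int, int]:
    """Return (sign, exponent, mantissa): 1, 8 and 23 bits respectively."""
    bits = float_to_bits(value)
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def strip_signs(weights: List[float]) -> Tuple[List[float], List[int]]:
    """Zero every sign bit; return the unsigned weights and the removed signs."""
    signs = [float_to_bits(w) >> 31 for w in weights]
    stripped = [bits_to_float(float_to_bits(w) & 0x7FFFFFFF) for w in weights]
    return stripped, signs


def restore_signs(stripped: List[float], signs: List[int]) -> List[float]:
    """Put the separately stored sign bits back onto the weights."""
    return [bits_to_float(float_to_bits(w) | (s << 31))
            for w, s in zip(stripped, signs)]
```

Without the stored signs, a layer holding only the stripped weights computes with every weight non-negative, which is one reason its output cannot be used directly.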

Neural networks with distorted weights are used as follows (Fig. 4).

First, the neural network N_s with distorted weights is loaded. The correct weight signs are then transferred from the PLC controller into the fully connected layer. Next, the network outputs are computed in the usual way (on the CPU or GPU). The output O of such a network is distorted. The true output O_true = F_r(O, B, K) is restored on the PLC controller (STM32) from the distorted network output vector O, the biases B, and the secret key K using the specified recovery function F_r.
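Inference with a protected layer and the restoration of its output can be sketched as follows. Neither F nor F_r is disclosed in the source, so the sketch assumes a hypothetical distortion in which output row i of the layer was scaled by 2^K[i] and the biases were zeroed; under that assumption F_r shifts the exponent back and adds the true bias. The names `host_forward` and `plc_restore` are illustrative; in the device the second function would run on the controller.

```python
import numpy as np


def host_forward(W_new, S, x):
    """Runs on the CPU/GPU: re-apply the true signs, then a plain product."""
    W_signed = np.where(S == 1, -W_new, W_new)
    return W_signed @ x                 # distorted output O (biases are zero)


def plc_restore(O, B, K):
    """Hypothetical F_r(O, B, K): undo the exponent shift, add the true bias."""
    return np.ldexp(O, -K) + B


# Build a small layer and distort it under the stated assumption.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)).astype(np.float32)   # original weights
B = rng.standard_normal(4).astype(np.float32)        # original biases
x = rng.standard_normal(8).astype(np.float32)        # layer input
K = rng.integers(1, 8, size=4).astype(np.int32)      # secret key
S = np.signbit(W).astype(np.uint8)                   # true signs
W_new = np.ldexp(np.abs(W), K[:, None])              # distorted weights

O = host_forward(W_new, S, x)       # what the host sees
O_true = plc_restore(O, B, K)       # what the controller returns
```

Under this assumption the controller performs one exponent shift and one addition per output element, while the matrix product stays on the host processor.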

If the PLC controller is unavailable, the device can continue to operate, but the outputs of the neural networks are heavily distorted, which leads both to a sharp drop in the number of unique people detected (Fig. 5) and to less accurate recognition of faces, gender, and age.

The proposed distortion method does not significantly increase the time needed to compute the neural network outputs and does not reduce recognition accuracy. Restoring the correct output vector takes no more than 1% of the total computation time (Fig. 6).

As an example implementation, the computing device of the claimed autonomous device can be based on an Odroid XU4Q single-board computer running the Ubuntu 16.04 operating system, with 8 CPU cores and 6 GPU cores (Fig. 7). The video camera has a Gaona-291 lens and is connected to a USB port of the computing device. The programmable logic controller is based on the Core405R development board and is also connected to a USB port of the computing device. The memory card is a Samsung EVO Plus microSDXC 64 GB card. The battery is a lithium cell (CR2032) that keeps the computer's system clock running and is connected through the RTC Battery Connector of the computing device. To output statistical results and/or visualization, a display (via the HDMI connector) and a USB drive (via a USB connector) can be connected to the computing device. An external computer for installing software and retrieving statistical results can be connected either through the Ethernet connector of the computing device or through a GSM modem attached to a USB connector of the computing device.

The developed autonomous, compact device can identify up to 10,000 persons while counting both the total number of visitors and the number of unique visitors. Example graphs of the total and unique visitor counts, based on statistics obtained while carrying out the claimed methods, are shown in Fig. 8, where the graphs show the hourly number of visitors averaged over 30 days. An example of statistically processed results of identifying face images by gender and age is shown in Fig. 9, which gives the distribution of persons across the age groups under 20, 20 to 30, 30 to 40, 40 to 50, and 50 and over.

Recognition accuracy was verified on a working sample of the device using a test on a dataset for training and testing face recognition models [1] and was 98.3% for faces, 95% for gender, and 80% for age.

Examples of the physical design of the autonomous device for carrying out the tracking method and the method for detecting and identifying persons of interest are shown in Figs. 10 and 11.

List of references.

1. Learned-Miller, E., Huang, G.B., RoyChowdhury, A., Li, H., Hua, G.: Labeled faces in the wild: A survey. In: Advances in face detection and facial image analysis. Springer (2016), pp. 189-248.

As an example implementation, the computing device of the claimed autonomous device can be built as shown in Fig. 7, where the following are designated:

1 - Odroid XU4Q single-board computer running the Ubuntu 16.04 operating system, with 8 CPU cores and 6 GPU cores,

2 - external power supply,

3 - video camera with a Gaona-291 lens,

4 - programmable logic controller based on the Core405R development board,

5 - Samsung EVO Plus microSDXC 64 GB memory card,

6 - lithium battery (CR2032) that keeps the computer's system clock running, connected through the RTC Battery Connector of the computing device,

7 - display for visualization, connected through the HDMI connector of the computing device,

8 - USB drive for outputting statistical results, connected through a USB connector of the computing device,

9 - external computer for installing software and retrieving statistical results, connected either through the Ethernet connector of the computing device or through a GSM modem attached to a USB connector of the computing device.


Claims

1. A method of tracking persons of interest, in which:

software comprising a set of interconnected neural network algorithms executed using high-performance computing is installed;

a video stream of images is received from one or more matrix video cameras;

faces are detected in the frame by means of a computing device,

characterized in that

the weight coefficients of the deep neural networks are modified in advance to a level at which recognizing faces and determining their properties becomes impossible, while most of the required computation can still be performed;

the modified weight coefficients of the deep neural networks are written to a dedicated memory area of the computing device;

the size of the processing cycle N is set equal to an integer number of frames;

in the frames of the video stream of images, except for every N-th frame, motion zones are filtered out in real time by means of the computing device;

a queue of the detector module is formed from every N-th frame of the video stream of images with filtered motion zones;

a face detection stream is generated for each filtered motion zone;

faces are detected in the frame by the computing device in each face detection stream in real time;

the frames with detected faces form a queue of the tracking module;

the queue of the tracking module is fed into the stream of the tracking module;

in the stream of the tracking module, a sequence of face images of the same person is formed;

for each sequence of face images of the same person, a queue of produced tracking is formed;

the queue of produced tracking is filtered to select the best images according to image size, orientation of the face in the image, and severity of image defects;

the selected images from the queue of produced tracking are fed into the recognition stream;

in the recognition stream, a face image signature in the form of a vector of real numbers is determined offline by means of the computing device and a trained first neural network, wherein, in addition to the main processor of the computing device, the microprocessor of a programmable logic controller connected to the computing device and having a firmware encryption function is used to determine the face image signature;

the age of the person is determined offline from the face image by means of the computing device and a second trained neural network;

the gender of the person is determined offline from the face image by means of the computing device and a third trained neural network;

the face image signature is compared with the face image signatures in the memory of the computing device, and if the similarity of the signature obtained in the recognition stream to the face image signatures in the memory of the computing device is below a predetermined threshold, the face image signature is written to the memory of the computing device;

data on the total number of persons, the number of unique persons, and the gender and age of the persons are written to the memory of the computing device.

2. The tracking method according to claim 1, characterized in that the processing cycle size N is set in the range from 2 to 16.

3. The tracking method according to claim 1, characterized in that the face image signature is determined at the output of the first trained neural network as a vector of 256 real numbers.

4. The tracking method according to claim 1, characterized in that the detection of faces in the frame by means of the computing device is performed on cores of the central processing unit of the computing device operating in parallel.

5. The tracking method according to claim 1, characterized in that the determination of the face image signature by means of the computing device is performed on cores of the graphics processor of the computing device operating in parallel.

6. A method for detecting and identifying persons of interest, in which

perform the tracking method according to claim 1,

characterized in that

the images of persons of interest are entered in advance into the dedicated memory area of the computing device, after which, by means of the computing device and the trained first neural network, the image signature of each person of interest is determined in the form of a vector of real numbers and written to the memory of the computing device, and at least one of the elements of the vector describing the signature is used as an indicator of membership in a group of persons of interest;

if the similarity of the signature obtained in the recognition stream to the face image signatures in the memory of the computing device is greater than the specified threshold, and the signature obtained in the recognition stream contains an indicator of membership in a group of persons of interest, then data on the number of persons of interest who entered the field of view of the video camera are written to the memory of the computing device.

7. An autonomous device with protection against copying and hacking for implementing the tracking method and the method for detecting and identifying persons of interest, comprising a housing in which are placed:

a computing device,

a matrix video camera with a lens, capable of capturing an image of the monitored zone and connected to the computing device,

a memory card connected to the computing device,

a battery connected to the computing device,

characterized in that

software is installed on the computing device, containing a software set of interconnected programs and neural network algorithms for detecting faces in an image, determining face image signatures, determining the age and gender of faces;

a programmable logic controller is located inside the housing and connected to the computing device through one of the input-output interfaces of the computing device, and the programmable logic controller has a function of protecting its firmware against copying and hacking;

the computing device contains multi-core central and graphic processors.

8. The autonomous device according to claim 7, characterized in that the computing device is based on the Odroid XU4Q single-board computer with the Ubuntu 16.04 operating system installed.

9. The autonomous device according to claim 8, characterized in that the video camera is connected to the computing device via a USB port of the computing device.

10. The autonomous device according to claim 7, characterized in that the programmable logic controller is based on the Core405R development board.

11. The autonomous device according to claim 7, characterized in that the programmable logic controller is connected to the computing device via a USB port of the computing device.

12. The autonomous device according to claim 7, characterized in that the battery is connected to the computing device through the RTC Battery Connector.

13. The autonomous device according to claim 7, characterized in that a display is connected to the computing device through the HDMI connector of the computing device to output statistical results and/or visualization.