RU2616152C1

RU2616152C1 - Method for monitoring the spatial position of sporting event participants on the playing field

Info

Publication number: RU2616152C1
Application number: RU2016114013A
Authority: RU
Inventors: Александр Владиславович Мартынов; Евгений Николаевич Хохлов; Олег Юрьевич Лашманов; Анна Владимировна Трушкина; Мария Геннадьевна Серикова; Антон Валерьевич Пантюшин
Original assignee: Общество с ограниченной ответственностью "ТрекИТ"
Priority date: 2016-04-11
Filing date: 2016-04-11
Publication date: 2017-04-12

Abstract

FIELD: sports.

SUBSTANCE: a method for monitoring the spatial position of participants in a sporting event on the playing field is proposed. The method comprises obtaining, by means of at least one video system calibrated for internal and external parameters, a video sequence of colour frames containing images of the participants on the playing field and of the playing field itself during the sporting event. Regions containing images of the participants are then selected on each frame of the obtained video sequence, and the positions of these regions on the frame are determined. The movements of these regions are tracked across the frames of the video sequence, and the participants are identified by their assigned game numbers and the colours of their uniforms.

EFFECT: higher data processing quality through better image processing and more accurate determination of participant trajectories.

2 cl, 3 dwg

Description

Method for monitoring the spatial position of sporting event participants on the playing field

The invention relates to methods for generating a record of events associated with participants in a sporting event, specifically to the automated processing of data obtained with measuring video systems, and can be used to track the movement of participants on the playing field during a sporting event.

In sports competitions such as hockey, football and basketball matches and other team games, it is customary to collect statistics on how effectively each participant acts during the match. The data collection procedure includes determining, from the frame sequences captured during the match, the position of each participant on the playing field and continuously tracking their movements during the event. Manually editing the recorded material to follow the actions of all participants throughout the event and to record their trajectories is a laborious task that is prone to operator error. Therefore, as computing technology developed, attempts were made to hand this task over to automated means of processing frame sequences.

The purpose of monitoring the spatial position of participants on the playing field is to obtain a report on the movements of each participant during the match, which makes it possible to optimize the effectiveness of the team as a whole.

The prior art (US 9007463 B2, published April 14, 2015) discloses a method for automated tracking of participants in a sporting event on the playing field and their identification by processing a video recording of the event. The method includes: obtaining, by means of at least one video system calibrated for internal and external parameters, a sequence of color frames containing images of the participants on the playing field and of the playing field during the sporting event; selecting, on each frame of the obtained video sequence, regions containing images of the participants and determining the positions of these regions on the frame; tracking the movements of these regions across the frames of the video sequence; identifying the participants by their game numbers and the colors of their uniforms; processing the obtained data; and determining the trajectories of the participants by matching the images of participants on each frame with the images of participants on the preceding frames of the video sequence. The procedure for selecting regions with images of participants includes separating the images of participants from the image of the playing field and the image of the spectators, morphological filtering, segmentation, and filtering of the segments containing images of participants by area.

The above method simplifies the operator's work by reducing it to correcting the results of automated trajectory determination. However, because the procedure for selecting regions with images of participants is imperfect, automatic trajectory determination is not free from errors either.

The method described in the prototype does not generate a report with warning marks tied to specific points of the trajectories that could be presented to the operator for manual correction where such warnings exist. Consequently, the operator has to check all video sequences for data processing errors.

On the one hand, errors can arise from poor selection of regions with images of participants on individual frames, which produces a large number of false objects classified as images of participants and hence a large number of false trajectories. A common source of such errors is the splitting of one participant's image into several images (shirt, shorts, arm, leg, etc.) caused by incorrect image segmentation.

On the other hand, errors can arise when matching individual images of participants with the identified trajectories. Such errors most often result from an abrupt change in the direction of movement of a participant's image and/or abrupt changes in the speed or acceleration of that movement.

The present invention attempts to reduce the number of such errors, which ultimately reduces the number of corrections required from the operator. The technical result is thus higher data processing quality through better image processing and more accurate determination of participant trajectories.

The problem is solved as follows. The known method for monitoring the spatial position of participants in a sporting event on the playing field includes: obtaining, by means of at least one video system calibrated for internal and external parameters, a video sequence of color frames containing images of the participants on the playing field and of the playing field during the sporting event; selecting, on each frame of the obtained video sequence, regions containing images of the participants and determining the positions of these regions on the frame; tracking the movements of these regions across the frames of the video sequence; identifying the participants by their assigned game numbers and the colors of their uniforms; processing the obtained data; and determining the trajectories of the participants by matching the images of participants on each frame with the images of participants on the preceding frames of the video sequence. According to the present invention, this method is supplemented as described below.

The illumination of the image is normalized. This includes filtering the image, converting it to a grayscale image that is subjected to histogram equalization, and normalizing the image intensity to the range from 0 to 1 to form a matrix of normalizing coefficients, which is inverted. All frames of the video sequence are then multiplied by the resulting coefficient matrix.

The regions containing images of participants are selected on each frame by multi-pass image segmentation using an algorithm for interactive separation from the background with iterated graph cuts, trained in automatic mode. The training is performed with a binarized mask created from a previously obtained segment.

Each participant is identified by the colors of their uniform by classifying the participant's image into predefined groups, based on the parameters and characteristics of the joint distribution of the frequency with which the values of the chromatic components of the image fall into specified ranges. This joint distribution is determined by means of a two-dimensional histogram in which the number of bins depends on the type of sporting event and the number of colors in the participants' uniforms. The histogram is treated as a pseudo-image, which is binarized at a low threshold. The total intensity of the binarized pseudo-image, that is, the number of pseudo-image pixels that passed the threshold, is then compared with an area threshold. If the total intensity is less than the threshold area, the corresponding segment is marked as "bad" and takes no part in further processing; if it is greater than or equal to the threshold area, the segment is marked as "identifiable".

The trajectories of the participants are determined by finding the smallest value of a metric computed for each "possible trajectory - possible participant" pair by comparing the parameters and characteristics of the images of participants on each frame with those on the preceding frames of the video sequence. The metric is a total penalty over several criteria. The pair with the smallest metric value is assigned as the continuation of the object's trajectory on the current frame, provided that the distance between the coordinates of the object on the current frame and the coordinates of the last object of the trajectory is less than a set threshold that depends on the maximum possible speed of a participant and the camera frame rate.
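As an illustration of the color-based identification step described above, the following minimal Python sketch builds a two-dimensional histogram of two chromatic components, binarizes it at a low threshold, and compares the number of surviving bins with an area threshold. The function names, the scaling of both components to [0, 1), the bin count and both threshold values are assumptions made for this example; the source does not specify them.

```python
def chroma_histogram(pixels, bins=8):
    """2-D histogram ("pseudo-image") of two chromatic components.

    `pixels` is an iterable of (c1, c2) pairs, each assumed to lie in [0, 1).
    """
    hist = [[0] * bins for _ in range(bins)]
    for c1, c2 in pixels:
        i = min(int(c1 * bins), bins - 1)
        j = min(int(c2 * bins), bins - 1)
        hist[i][j] += 1
    return hist


def classify_segment(pixels, bins=8, low_threshold=2, area_threshold=2):
    """Mark a segment as "bad" or "identifiable" (threshold values are hypothetical)."""
    hist = chroma_histogram(pixels, bins)
    # Binarize the pseudo-image at a low threshold.
    binary = [[1 if count >= low_threshold else 0 for count in row] for row in hist]
    # Total intensity = number of pseudo-image pixels that passed the threshold.
    total_intensity = sum(sum(row) for row in binary)
    return "bad" if total_intensity < area_threshold else "identifiable"


# A segment with two well-populated color clusters versus scattered noise.
uniform_pixels = [(0.1, 0.1)] * 5 + [(0.6, 0.7)] * 4
noise_pixels = [(0.1, 0.1), (0.3, 0.9), (0.5, 0.2)]
print(classify_segment(uniform_pixels))
print(classify_segment(noise_pixels))
```

In practice the bin count would be chosen per sport and per number of uniform colors, as the text states; the value 8 here is only a placeholder.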

In one possible embodiment, the video system is calibrated for internal and external parameters using a calibration method for video systems that monitor objects on a flat area. The method includes: scanning reference objects in object space with each camera of the video system so that each scanned frame contains an image of at least one reference object; measuring the positions of the image points of each reference object on the scanned frames; determining the internal parameters of each camera for which the total deviation of the measured points from their calculated positions is minimal over all frames scanned by that camera; correcting the nonlinear distortions in the frames from each camera by applying the corresponding internal camera parameters; and determining the matrix for converting the images obtained by the cameras of the video system into object space. A rectilinear object whose image lies at the smallest distance from the center of the frame and extends as far as possible toward the periphery of the frame is chosen as the reference object in object space. For the measured image points of the reference object, an approximating straight line is drawn through at least two measured image points that lie closest to the center of the frame. To determine the internal parameters of each camera, the total absolute deviation of the measured image points of the reference object from the approximating line is minimized. The relative positions of the elements of the area are determined by processing its image. The matrices for converting coordinates on the images obtained by the cameras into object space are determined by matching the positions of at least four points belonging to elements of the area with the corresponding points on the images obtained by each camera, after the nonlinear distortions present in those images have been corrected.
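The last step above, deriving a coordinate conversion matrix from point correspondences, can be sketched as a planar homography estimated from exactly four points. This is a minimal illustration, not the patented calibration procedure: the function names, the sample pixel coordinates and the 30 × 15 field dimensions are hypothetical, the images are assumed to be already corrected for distortion, and a plain linear solve is used where a least-squares fit would be needed for more than four points.

```python
def solve_linear(a, b):
    """Solve a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_4_points(image_pts, field_pts):
    """3x3 matrix mapping undistorted image coordinates to field coordinates."""
    a, b = [], []
    for (x, y), (fx, fy) in zip(image_pts, field_pts):
        a.append([x, y, 1, 0, 0, 0, -fx * x, -fx * y])
        b.append(fx)
        a.append([0, 0, 0, x, y, 1, -fy * x, -fy * y])
        b.append(fy)
    h = solve_linear(a, b) + [1.0]  # the last matrix element is fixed to 1
    return [h[0:3], h[3:6], h[6:9]]


def image_to_field(h, point):
    """Convert one image point to field coordinates (projective division)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical correspondences: four field corners as seen by one camera.
image_pts = [(100, 400), (540, 400), (420, 100), (220, 100)]  # pixels
field_pts = [(0, 0), (30, 0), (30, 15), (0, 15)]              # field units
H = homography_from_4_points(image_pts, field_pts)
print(image_to_field(H, (100, 400)))
```

Four correspondences determine the eight free parameters of the matrix exactly, which is consistent with the "at least four points" requirement in the text.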

Thus, the full set of essential features improves data processing quality by improving image processing, through multi-pass image segmentation and equalization of image illumination, while also improving the accuracy with which participant trajectories are determined, through the use of metrics and the identification of each participant by uniform colors and numbers. Ultimately, this leads to fewer errors in monitoring the spatial position of participants on the playing field and fewer corrections required from the operator. The claimed invention thus significantly reduces the number of false matches and the operator's workload in correcting the results of automatic matching.
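The use of metrics for trajectory determination can be illustrated with a small Python sketch: each "possible trajectory - possible participant" pair receives a total penalty, pairs are taken in order of increasing penalty, and a pair is accepted only if the displacement is below a threshold derived from the maximum speed and the frame rate. Everything specific here is an assumption for illustration: the function and field names, the two penalty criteria and their weights, the greedy pairing order, and the 12 m/s and 25 fps values.

```python
import math


def gate_distance(max_speed, fps):
    """Largest plausible displacement between two consecutive frames."""
    return max_speed / fps


def penalty(track, detection, w_dist=1.0, w_team=5.0):
    """Total penalty over several criteria (here: distance and team color)."""
    dist = math.dist(track["last_xy"], detection["xy"])
    team_mismatch = 0.0 if track["team"] == detection["team"] else 1.0
    return w_dist * dist + w_team * team_mismatch


def assign(tracks, detections, max_speed=12.0, fps=25.0):
    """Return {track index: detection index} for accepted continuations."""
    gate = gate_distance(max_speed, fps)
    pairs = sorted(
        (penalty(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    )
    used_tracks, used_detections, result = set(), set(), {}
    for _, ti, di in pairs:
        if ti in used_tracks or di in used_detections:
            continue
        # Accept the lowest-penalty pair only if it passes the distance gate.
        if math.dist(tracks[ti]["last_xy"], detections[di]["xy"]) < gate:
            result[ti] = di
            used_tracks.add(ti)
            used_detections.add(di)
    return result


tracks = [{"last_xy": (0.0, 0.0), "team": "A"},
          {"last_xy": (10.0, 10.0), "team": "B"}]
detections = [{"xy": (10.1, 10.0), "team": "B"},
              {"xy": (0.2, 0.1), "team": "A"},
              {"xy": (50.0, 50.0), "team": "A"}]
print(assign(tracks, detections))
```

The third detection stays unassigned because it lies far beyond the gate, which is the situation in which a real system would have to start a new trajectory or flag the point for the operator.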

The applicant has conducted a patent search on this subject and found no disclosure of the claimed combination of essential features.

The method is carried out as follows.

FIG. 1 shows an image of a flat area (for example, a playing field) warped by the distortion of the optical system.

FIG. 2 shows a sequence of frames of the area with an image of a rectilinear object at the center.

FIG. 3 shows how the field of view of a video system camera is determined relative to the area.

The method can be implemented with an automated tracking system that includes a video system containing at least one camera. The required number of cameras is installed outside the playing field so that their combined field of view covers the entire surface of the playing field without blind spots. To process the frames received from the cameras and to perform the necessary algorithmic calibration procedures, a computer with a data storage device and appropriate software can be used, connected to the cameras by communication lines with enough bandwidth to transfer images of the required resolution and to control the frame capture mode. The specific configuration of the software and hardware, as well as their architecture, depends on the number of cameras, their resolution and their operating mode. The cameras must be able to record video frames synchronously and must have a resolution sufficient to distinguish the players' numbers. The relative positions of the cameras, their number and their position relative to the playing field, as well as the specific camera parameters, depend on the configuration of the playing field, the distance to it and the shooting angle.

The video cameras used to film the sporting event must be calibrated for external and internal parameters. That is, it must be possible to correct nonlinear distortions in the images received from the cameras and to convert coordinates measured on the video frames into a coordinate system tied to the playing field.

Next, the playing time intervals are determined for the footage of the sporting event recorded by each camera. Such intervals may include the periods of play, excluding stoppage time. Only frames that correspond to playing time intervals are processed further.

Thus, by the time processing begins, a set of color video frames corresponding to the playing time intervals has been formed for each camera.

To filter high-frequency noise in the frames, Gaussian blur, a moving average or other known filtering methods can be used.
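Of the filters named above, the moving average is the simplest to show. The sketch below is a plain-Python box filter over a single-channel image stored as a list of rows; the function name, the `radius` parameter and the edge handling (averaging only over pixels inside the frame) are choices made for this example and are not taken from the source.

```python
def box_blur(image, radius=1):
    """Moving-average filter: each output pixel is the mean of its neighborhood."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            window = [image[j][i] for j in rows for i in cols]
            out[y][x] = sum(window) / len(window)
    return out


# A single bright pixel (impulse noise) is spread out and attenuated.
noisy = [[0, 0, 0],
         [0, 9, 0],
         [0, 0, 0]]
print(box_blur(noisy))
```

For color frames the same function could be applied to each channel separately; a Gaussian blur differs only in weighting the neighborhood by distance from the center.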

Then the background is removed from each frame. The background is an image of the playing field free of images of participants. Such an image can be obtained either before the sporting event begins or during the event, using algorithms that analyze motion in the frame. The resulting background image, unique to each camera of the system, is subtracted from all frames of the color video sequence of the corresponding camera. As a result, only the regions with images of participants and the regions with images of spectators in the stands remain on the frames. The remaining parts of the image (belonging to the background) become black (zero intensity in every color channel).
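A minimal Python sketch of this background removal step is given below. It compares each frame pixel with the background image of the same camera and blackens pixels that match the background. Two details are assumptions of this example rather than statements from the source: a per-channel `tolerance` is used so that small sensor noise still counts as background, and foreground pixels keep their original color instead of the raw difference. The function name is hypothetical.

```python
BLACK = (0, 0, 0)


def remove_background(frame, background, tolerance=10):
    """Blacken pixels that match the background; keep the rest unchanged.

    `frame` and `background` are lists of rows of (r, g, b) tuples of equal size.
    """
    result = []
    for frame_row, background_row in zip(frame, background):
        row = []
        for pixel, reference in zip(frame_row, background_row):
            differs = any(abs(p - b) > tolerance for p, b in zip(pixel, reference))
            row.append(pixel if differs else BLACK)
        result.append(row)
    return result


# One-row example: green field on the left, a red-shirted player on the right.
background = [[(0, 100, 0), (0, 100, 0)]]
frame = [[(2, 101, 3), (200, 10, 10)]]
print(remove_background(frame, background))
```

Keeping the original colors of the foreground is convenient here because a later step classifies participants by uniform color.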

Regions containing images of spectators can be filtered out by applying a predefined zeroing mask, specified by the operator for each camera of the video system when the video sequences are prepared for processing. The zeroing mask sets to zero the intensity of regions containing images of spectators or of other objects that are in the frame but belong neither to the participants nor to the playing field.
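The zeroing mask can be sketched in Python as a boolean image of the same size as the frame, with `True` marking pixels to be blackened. The rectangular mask shape and both function names are assumptions made for this example; an operator-defined mask could have any shape.

```python
def rectangle_mask(height, width, top, left, bottom, right):
    """Boolean mask that is True inside the rectangle [top, bottom) x [left, right)."""
    return [[top <= y < bottom and left <= x < right for x in range(width)]
            for y in range(height)]


def apply_zeroing_mask(frame, mask):
    """Set every masked pixel of an RGB frame to zero intensity."""
    return [[(0, 0, 0) if masked else pixel
             for pixel, masked in zip(frame_row, mask_row)]
            for frame_row, mask_row in zip(frame, mask)]


# Hypothetical 2x3 frame whose right-hand column shows the stands.
frame = [[(9, 9, 9)] * 3 for _ in range(2)]
stands = rectangle_mask(height=2, width=3, top=0, left=2, bottom=2, right=3)
print(apply_zeroing_mask(frame, stands))
```

Because the mask depends only on camera placement, it can be built once per camera and reused for every frame of that camera's sequence.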

Imaging lenses are often not free from vignetting, which causes the illumination in captured frames to fall off from the center toward the periphery; in addition, the lighting of the playing field is itself often non-uniform. This can significantly degrade the extraction of regions containing images of the participants. For example, if the segmentation of participant images uses threshold filtering with a single threshold applied to both the center of the frame and its periphery, a substantial part of the participant images in the peripheral region of the frame may be lost, and these images may then be discarded from further processing when segments are filtered by area. Conversely, for participant images located near the center of the frame, the lack of appropriate correction can cause two closely spaced participant images to be erroneously merged into a single segment. To correct this effect, a composite binarization threshold can be used, which sets individual threshold values for each pixel or group of pixels in the frame.

Alternatively, the illumination of the frame image can be normalized. In the first stage of normalization, an image, for example the background image of the corresponding camera, is blurred with a large-aperture filter, such as a Gaussian filter with an aperture of 300×200 pixels. The blurred color image is first converted to a grayscale image, which then undergoes histogram equalization (the dynamic range of the image is stretched to cover the full brightness range, for example from 0 to 255). The image is then normalized in intensity to the range from 0 to 1, forming a matrix of normalizing coefficients. This matrix is inverted according to the rules of image inversion, after which all frames of the video sequence of the corresponding camera are multiplied by the resulting matrix of coefficients. Video frames corrected in this way have largely uniform illumination across all regions of the frame, which reduces the risk of errors in the subsequent segmentation of participant images when a single binarization threshold is used.
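The composite binarization threshold can be illustrated with a minimal sketch that compares a single global threshold with a per-pixel threshold map. The function name, the intensity values and the threshold map are hypothetical; the description does not state how the per-pixel thresholds are chosen.

```python
def binarize(gray, thresholds):
    """Binarize a grayscale image with an individual threshold per pixel.

    gray, thresholds: lists of rows of numbers with identical shape.
    """
    return [[1 if value > limit else 0 for value, limit in zip(row, limit_row)]
            for row, limit_row in zip(gray, thresholds)]


# One image row: a dim player at the left edge (70), noise near the
# center (90) and a bright player right of the center (160).
gray = [[70, 50, 90, 160, 30]]

# Single threshold tuned for the center of the frame.
single = binarize(gray, [[100] * 5])

# Hypothetical composite threshold: lower at the periphery, higher at the center.
adaptive = binarize(gray, [[60, 80, 100, 80, 60]])
```

With the single threshold the dim peripheral player is lost, while the composite threshold keeps both players and still rejects the noise near the center.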

After binarization, the binary image of each frame of the video sequences undergoes morphological filtering, which consists of the successive application of morphological closing and opening. This procedure removes small details from the frame, for example the images of the puck and the sticks. In some cases it also separates regions of the frame that contain closely spaced images of different participants (collisions). However, a side effect of this operation, which is conventionally used in systems of this kind, is the risk of splitting a participant's image into several separate regions, some of which are then removed when regions are filtered by area. To rule out such errors, the disclosed method uses an improved two-pass segmentation procedure.

In the first pass, all binary images in the frame are segmented using 8-connectivity. The smallest segments, those that do not pass an area threshold, are then discarded (their intensity in the binarized image is set to zero); the specific threshold value is determined from the resolution of the cameras, the focal length of their lenses and the position of the cameras relative to the playing field. Next, all segments whose bounding rectangle has a disproportionately large or small aspect ratio are removed from consideration. After these operations, only those segments remain whose area is close to the threshold area corresponding to the area of a player's image at the far point of the camera's field of view. The specific value of the threshold area depends on the characteristics of the cameras and their position relative to the playing field. The remaining segments are checked against an a priori condition on the aspect ratio of the bounding rectangle (for example, the vertical side is longer than the horizontal side).

For segments that pass this check and share no boundary with the mask used to filter out spectator images (boundary segments are very likely to be images of spectators), a refining segmentation is performed in a window of a given size, chosen so that it is certain to cover a player's image in that region of the frame. For this segmentation, the area inside the window is filled with the image of the original frame (including the background). The resulting image is processed by interactive foreground extraction using iterated graph cuts, known as GrabCut segmentation. The distinctive feature here is that the GrabCut segmentation is trained automatically, using a binarized mask built from the previously obtained segment (the segment around which the refinement window was created).

The automatic labelling is performed as follows. Pixels located along the border of the window are assigned the label “background”. Pixels that correspond to the pixels of the binarized mask built from the previously obtained segment are assigned the label “this object”. Small segments of the binarized mask that fall entirely within the window are labelled “possibly this object”; here, small segments are those whose size (area) corresponds to images of individual body parts of players, and the specific value of this parameter depends on the characteristics of the cameras and their position relative to the playing field. Small segments that fall only partially within the window are labelled “background”. Large segments of the binarized mask (those whose size corresponds to the size of a player in that region of the frame) that fall partially or entirely within the window are labelled “background”. All other pixels within the window are labelled “possibly background”.

The image in the window, with its pixels labelled “this object”, “possibly this object”, “background” and “possibly background”, is passed to the GrabCut interactive foreground extraction algorithm, which refines the boundaries of the binary mask for the segment. If GrabCut enlarges the segment to, for example, 98% of the window area, the segment is discarded (its intensity in the binarized mask is set to zero), because it is classified as a background image. The main purpose of this refinement is to reduce the number of poorly segmented (for example, erroneously split) images of the participants.
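The automatic labelling rules can be sketched as follows. The numeric label values mirror the four mask values used by OpenCV's GrabCut implementation; the function names, the data layout of `other_segments`, the `small_area` parameter and the order in which conflicting rules are applied (border last) are assumptions made for illustration only.

```python
# Label values as used by OpenCV's GrabCut mask.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def build_grabcut_labels(height, width, target_pixels, other_segments, small_area):
    """Build the label array for one refinement window.

    target_pixels: (row, col) pixels of the segment being refined.
    other_segments: dicts with "pixels" (inside the window), "area" (full
        segment area) and "fully_inside" (True if entirely in the window).
    small_area: hypothetical area limit below which a segment is "small".
    """
    # All pixels start as "possibly background".
    labels = [[GC_PR_BGD] * width for _ in range(height)]
    for seg in other_segments:
        is_small = seg["area"] <= small_area
        # Small and fully inside: possibly this object; otherwise background.
        label = GC_PR_FGD if (is_small and seg["fully_inside"]) else GC_BGD
        for r, c in seg["pixels"]:
            labels[r][c] = label
    for r, c in target_pixels:
        labels[r][c] = GC_FGD  # "this object"
    for r in range(height):
        for c in range(width):
            if r in (0, height - 1) or c in (0, width - 1):
                labels[r][c] = GC_BGD  # window border is background
    return labels


def is_background_result(refined_area, window_area, max_fill=0.98):
    """True if the refined segment fills almost the whole window."""
    return refined_area / window_area >= max_fill


labels = build_grabcut_labels(
    5, 6,
    target_pixels=[(2, 2), (2, 3)],
    other_segments=[
        {"pixels": [(3, 2)], "area": 1, "fully_inside": True},
        {"pixels": [(1, 4), (2, 4)], "area": 40, "fully_inside": False},
    ],
    small_area=3,
)
```

A label array of this kind is what a mask-initialized GrabCut run (for example OpenCV's `cv2.grabCut` with `cv2.GC_INIT_WITH_MASK`) takes as its starting mask; `is_background_result` corresponds to the 98% rejection check given as an example in the description.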

The next step of the improved segmentation procedure is to apply the binarized mask to the frame image obtained after illumination normalization. The image modified in this way is segmented again using 8-connectivity.
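Segmentation by 8-connectivity, which is used in both passes, can be illustrated with a minimal breadth-first labelling sketch. The function name and the toy image are hypothetical; a production system would more likely rely on an optimized library routine.

```python
from collections import deque


def label_8_connected(binary):
    """Label connected regions of non-zero pixels using 8-connectivity.

    Returns (labels, count), where labels has the same shape as binary and
    each segment is numbered from 1 in scan order.
    """
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if not binary[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit all 8 neighbours, including the diagonal ones.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


binary = [[1, 0, 0, 0],
          [0, 1, 0, 1],
          [0, 0, 0, 1]]
labels, count = label_8_connected(binary)
```

The two pixels that touch only diagonally end up in the same segment, which is the difference between 8-connectivity and 4-connectivity.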

At the output of the improved two-pass segmentation procedure described above (multi-pass image segmentation based on an automatically trained algorithm for interactive foreground extraction using iterated graph cuts), the resulting image segments are filtered by area: segments whose area is below a threshold corresponding, for example, to a quarter of the area of a player's image in that region of the image are discarded (the specific value depends on the type of sports event, the appearance of the participants' uniforms, the characteristics of the cameras and their position relative to the playing field). Segments that pass the threshold are marked as “unidentifiable”, whereas segments whose area is, for example, between 70% and 130% of the area of a player's image are marked as “identifiable”; the specific threshold values depend on the type of sports event, the appearance of the participants' uniforms, the characteristics of the cameras and their position relative to the playing field.
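The area-based marking can be summarized in a short sketch that uses the example values from the description (a quarter of the player image area as the lower limit, and 70% to 130% for “identifiable” segments). The function name and the label `"discarded"` are hypothetical.

```python
def classify_segment(area, player_area, min_fraction=0.25, ident_range=(0.7, 1.3)):
    """Mark a segment by its area relative to the expected player image area.

    player_area: expected area of a player's image in this region of the frame.
    min_fraction, ident_range: example values from the description; real
    values depend on the sport, the uniforms and the camera set-up.
    """
    ratio = area / player_area
    if ratio < min_fraction:
        return "discarded"
    if ident_range[0] <= ratio <= ident_range[1]:
        return "identifiable"
    return "unidentifiable"


marks = [classify_segment(a, 1000) for a in (100, 500, 900, 1500)]
```

For an expected player area of 1000 pixels, the four sample segments are marked as discarded, unidentifiable, identifiable and unidentifiable respectively.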

Both “identifiable” and “unidentifiable” segments take part in determining the movement trajectories of the participants. However, identification of participants by uniform color and by the number on the uniform is performed only for “identifiable” segments.

When participants are identified by team membership (by uniform color) and by the number on the uniform, the “referee team” is also treated as a team. Identification by uniform color can rely either on uniform colors known a priori or on training with participant images taken from the video frame sequence being processed (in many cases such training is preferable, because it accounts for the distortion of uniform colors caused by the shooting conditions and camera settings). This reduces the risk of assigning a segment (containing a participant's image) to the wrong team. In addition, automatic clustering of players based on a priori data about the type of sports event can be implemented.

Participants are identified by team membership according to the following procedure.

For each segment marked as “identifiable”, a new window is formed whose dimensions match the bounding rectangle of the segment. The pixels of the new window that correspond to pixels of the segment are filled with the pixels of the original RGB color image; the remaining pixels (those of the background and of other segments) are left black. The resulting image is converted to a color space that is linear with respect to human perception, for example Lab. A two-dimensional histogram is then computed for the a and b components (in the case of Lab), that is, the joint distribution of how often the values of the chromatic components a and b fall within given ranges. The number of histogram bins (the number of given ranges) depends on the type of sports event and the number of colors in the participants' uniforms; for example, bin indices may run over [0, 31] for each component. Such a histogram can be regarded as a pseudo-image (32×32 pixels in this example).

The pseudo-image is binarized with a low threshold, for example 5% of the maximum “intensity” of the pseudo-image; the specific value depends on the type of sports event, a priori information about the number of colors in the participants' uniforms, the characteristics of the cameras and their position relative to the playing field. The total intensity of the binarized pseudo-image, that is, the number of pseudo-image pixels that passed the threshold, is then computed. This value is compared with an area threshold equal to, for example, 10; the threshold value corresponds to the most multicolored patch of the playing field that contains no participant image and is comparable in size to a participant. If the total intensity is below the area threshold, the segment is marked as “bad” and is excluded from further processing; if it is greater than or equal to the threshold, the segment remains marked as “identifiable”. This makes it possible to exclude from further processing regions that contain no participant images, as well as regions in which the background area considerably exceeds the area of the participant's image.

The pseudo-image of each “identifiable” segment is converted to a vector (a 32×32 pseudo-image becomes a vector of 1024 values), which is fed to a support vector machine (SVM) trained on labelled examples, for classification. The SVM output classifies the pseudo-image into one of four categories: “team 1”, “team 2”, “referee team” and “team undetermined”.
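The construction of the chromatic pseudo-image and the color-spread check can be sketched as follows. The sketch assumes that the (a, b) values of the segment pixels are already available and lie roughly in [-128, 128); the function names and this value range are assumptions, while the 32 bins, the 5% threshold and the area threshold of 10 are the example values from the description.

```python
def ab_histogram(ab_values, bins=32, lo=-128.0, hi=128.0):
    """Joint histogram of the a and b chroma components (the pseudo-image)."""
    hist = [[0] * bins for _ in range(bins)]
    scale = bins / (hi - lo)
    for a, b in ab_values:
        # Clamp indices so out-of-range chroma values fall into the edge bins.
        i = min(bins - 1, max(0, int((a - lo) * scale)))
        j = min(bins - 1, max(0, int((b - lo) * scale)))
        hist[i][j] += 1
    return hist


def count_significant_bins(hist, fraction=0.05):
    """Number of bins that exceed the given fraction of the histogram maximum."""
    peak = max(max(row) for row in hist)
    if peak == 0:
        return 0
    return sum(1 for row in hist for value in row if value > fraction * peak)


def is_identifiable(hist, min_bins=10):
    """Segments with too few distinct colors are treated as "bad"."""
    return count_significant_bins(hist) >= min_bins


# A uniformly colored patch of the field: all pixels fall into one bin.
plain = ab_histogram([(10.0, -20.0)] * 200)
# A multicolored region: 12 distinct chroma values, 10 pixels each.
colorful = ab_histogram([(-100.0 + 16 * k, 5.0) for k in range(12)] * 10)

# Flattened pseudo-image, the form in which it would be passed to a classifier.
feature_vector = [value for row in colorful for value in row]
```

The flattened 1024-element vector is the input that the description feeds to the SVM; the classifier itself is outside the scope of this sketch.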

The support vector machine can be trained before the video sequences are processed, by asking the operator to mark, on frames captured by different cameras, a number of images of participants from the different teams (“team 1”, “team 2”, “referee team”), as well as arbitrary regions of the playing field that contain no complete images of participants. For example, at least 15 to 20 participant images obtained from different cameras can be selected for each “team”. To improve the classification results, the selected images should preferably come from different areas of the fields of view of the cameras. The regions selected by the operator are processed by the same procedure that produces the pseudo-images during classification. This adapts the classifier to images obtained under the specific shooting conditions (the effect of color rendition, camera noise, the spectral composition of the lighting equipment at the playing field, and so on, on the colors of the uniform image) and also accounts for color distortion in the uniform image caused by the participants' movement (as a participant moves, the image of the uniform deforms to follow the contours of the body, which changes the ratio of image areas occupied by differently colored parts of the uniform).

The classifier can also be trained on uniform images obtained separately (not from this match, not with these cameras, and without participant movement, for example a simple photograph of the uniform design). In addition, automatic clustering of uniform images based on a priori data about the type of sports event, for example the places where the different teams enter the playing field, can be implemented.

For each “unidentifiable” and “identifiable” segment (a region containing an image of a participant), the boundaries of the bounding rectangle are determined. The coordinates of the midpoint of its lower side are converted from pixels (the camera coordinate system) to the world coordinate system tied to the playing field, in convenient units (meters or feet), using the camera parameters determined during calibration.
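The conversion from pixels to field coordinates can be sketched under a simplifying assumption: lens distortion has already been corrected and the calibration has been reduced to a 3×3 homography `H` that maps image points to the plane of the playing field. The description itself only refers to calibrated camera parameters, so the homography, the function names and the numeric values are hypothetical.

```python
def foot_point(bbox):
    """Midpoint of the lower side of a bounding rectangle.

    bbox: (x_min, y_min, x_max, y_max) in pixels, with y growing downward.
    """
    x_min, _y_min, x_max, y_max = bbox
    return ((x_min + x_max) / 2.0, float(y_max))


def pixel_to_world(point, H):
    """Map an image point to field coordinates with a 3x3 homography H."""
    u, v = point
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    # Divide by the homogeneous coordinate to obtain metric coordinates.
    return (x / w, y / w)


# Hypothetical homography for one camera (pixels to meters).
H = [[0.5, 0.0, -100.0],
     [0.0, 0.5, -50.0],
     [0.0, 0.0, 10.0]]

world = pixel_to_world(foot_point((380, 200, 420, 300)), H)
```

The lower midpoint is used because it approximates the point where the participant touches the field, which is the only part of the silhouette that lies in the field plane.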

Thus, for each frame of the video sequence of each camera, a set of coordinates (for example, in meters) is obtained that corresponds to the positions of the participants on the playing field at a given moment in time.

Each coordinate pair from this set, together with the team membership data of the corresponding segment, forms an object. For each such object, it is determined whether it belongs to the same trajectory as one of the objects in the previous frame.

This matching is often a source of many errors, because the movement of participants in a sports event, in particular a hockey match, is hard to predict owing to changing speed and acceleration. In this situation, conventional tracking algorithms, for example those based on the Kalman filter, cannot minimize the number of such errors, which ultimately requires the operator to make a considerable number of manual corrections. The present invention discloses a solution that significantly reduces the number of false matches and lowers the operator's workload in correcting the results of automatic matching.

For every object detected in the frame, correspondences are sought with all trajectories whose last object was detected in one of the N preceding frames (for example, N = 5); that is, candidate pairs “possible trajectory - possible participant” are sought. For each pair being checked (the object in the current frame and the last object of a trajectory), a metric is computed as the total penalty over several criteria. The pair with the lowest metric value is accepted as the match (the object in the current frame is recorded as the continuation of the trajectory), but only if the distance between the coordinates of the object in the current frame and the coordinates of the last object of the trajectory (in meters) is less than a threshold that depends on the maximum possible speed of a participant and on the frame rate of the cameras.
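The search for the best trajectory for one object can be sketched as follows. The trajectory representation, the function name, the value of `max_distance` and the per-object (rather than joint) assignment are assumptions; the penalty is passed in as a function so that the sketch does not depend on a particular set of criteria.

```python
import math


def match_object(obj_xy, trajectories, frame_index, penalty,
                 max_gap=5, max_distance=1.5):
    """Return the id of the best-matching trajectory, or None.

    trajectories: dict mapping id -> list of (frame_index, (x, y)) entries.
    penalty: function (distance, trajectory) -> total penalty for the pair.
    max_gap: N, the number of preceding frames that are searched.
    max_distance: hypothetical distance gate in meters.
    """
    best_id, best_penalty = None, math.inf
    for traj_id, traj in trajectories.items():
        last_frame, last_xy = traj[-1]
        gap = frame_index - last_frame
        if not 1 <= gap <= max_gap:
            continue  # trajectory was not updated in the last N frames
        distance = math.dist(obj_xy, last_xy)
        if distance >= max_distance:
            continue  # physically implausible jump
        value = penalty(distance, traj)
        if value < best_penalty:
            best_id, best_penalty = traj_id, value
    return best_id


trajectories = {
    "A": [(9, (10.0, 5.0))],
    "B": [(9, (10.5, 5.0))],
    "C": [(2, (10.1, 5.0))],  # last seen 8 frames ago, outside the search range
}
matched = match_object((10.4, 5.0), trajectories, frame_index=10,
                       penalty=lambda distance, traj: distance)
unmatched = match_object((30.0, 30.0), trajectories, frame_index=10,
                         penalty=lambda distance, traj: distance)
```

An object that matches no trajectory (here `unmatched`) would presumably start a new trajectory, although the description does not state this explicitly.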

The total penalty is calculated as the combination of, for example, the following penalty points.

The distance penalty P₁ can be found as the distance between the coordinates of the object in the current frame and the coordinates of the last object of the trajectory, multiplied by the weighting coefficient w₁. The coefficient w₁ is chosen so that priority is given to the trajectories that pass closest to the object in the current frame. If the calculated distance exceeds the threshold, the given "possible object - possible trajectory" pair is excluded from the candidates for a match.

The penalty P₂ for touching the zeroing mask, which filters out regions containing images of spectators, can be calculated as the weighting coefficient w₂ if the object does not touch the zeroing mask, or as the coefficient w₃ otherwise. The ratio of w₂ to w₃ controls how undesirable it is for objects adjacent to the spectator area to enter a trajectory.

The trajectory lifetime penalty P₃ can be found as the weighting coefficient w₄ divided by the lifetime of the trajectory. The value of the coefficient is chosen so that priority is given to longer-lived trajectories.

The penalty P₄ for the time gap between the frame t₁, in which the last object of the given trajectory was detected, and the current frame t₂ can be calculated as P₄ = (t₂ - t₁) * w₅. The value of the coefficient w₅ is chosen so that priority is given to the trajectories whose last object was detected most recently (in the frame closest to the current one).

The penalty P₅ for an abrupt change in the direction of the trajectory can be calculated as follows. For the N last objects of the trajectory, the vector r of the preferred direction is calculated, for example by weighted averaging in which the most recent trajectory points carry more weight. Then the direction vector k between the object in the current frame and the last object of the trajectory is calculated, the distance between the vectors ||r, k|| is found, and the penalty is calculated as P₅ = ||r, k|| / w₆. The coefficient w₆ is chosen so that priority is given to the objects that, when added to the trajectory, change the preferred direction of movement of its objects the least.
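A minimal sketch of the direction penalty follows. Several details are assumptions, since the text leaves them open: linearly increasing weights for the weighted average, unit-length step directions, and the Euclidean norm of the difference r - k as the "distance between the vectors". The function names are hypothetical.

```python
import math

def preferred_direction(points):
    """Weighted mean of unit step directions; later steps weigh more.

    points: the N last trajectory positions [(x, y), ...], oldest first.
    Linear weights 1, 2, 3, ... are an assumption.
    """
    sx = sy = total = 0.0
    for i, ((x0, y0), (x1, y1)) in enumerate(zip(points, points[1:]), start=1):
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            continue  # a stationary step carries no direction
        sx += i * dx / norm
        sy += i * dy / norm
        total += i
    if total == 0.0:
        return (0.0, 0.0)
    return (sx / total, sy / total)

def direction_penalty(points, obj_xy, w6):
    """P5 = ||r, k|| / w6, with ||r, k|| taken as the Euclidean distance."""
    r = preferred_direction(points)
    lx, ly = points[-1]
    dx, dy = obj_xy[0] - lx, obj_xy[1] - ly
    norm = math.hypot(dx, dy)
    k = (dx / norm, dy / norm) if norm else (0.0, 0.0)
    return math.hypot(r[0] - k[0], r[1] - k[1]) / w6
```

With this choice, an object that continues straight ahead receives a zero penalty, and a full reversal of direction receives the largest one.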

The penalty P₆ for an abrupt change in segment area can be calculated as the ratio of the area of the object in the current frame to the area of the last object of the trajectory, multiplied by the weighting coefficient w₇, which is chosen so that priority is given to the objects whose area differs least from that of the last object of the trajectory.

The balance between the penalties is adjusted by changing the weighting coefficients so as to reduce the probability of erroneously matching "possible object - possible trajectory" pairs. It depends on the type of sporting event, the placement of the cameras, and the likelihood that false objects with spatial and spectral characteristics similar to those of the processed objects (for example, glare on the surface of the site) appear in the camera's field of view.
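The penalties listed above can be combined as in the sketch below, which takes the "total penalty" to be a plain sum. The `Weights` container, the argument names, and the example weight values in the usage note are hypothetical; the formulas follow the wording of the text, including division by w₆ in the direction penalty and the raw area ratio in the area penalty.

```python
from dataclasses import dataclass

@dataclass
class Weights:
    w1: float  # distance
    w2: float  # object does not touch the zeroing mask
    w3: float  # object touches the zeroing mask
    w4: float  # trajectory lifetime
    w5: float  # time gap since the trajectory's last object
    w6: float  # direction change (used as a divisor, as in the text)
    w7: float  # area change

def total_penalty(distance_m, touches_mask, lifetime, t1, t2,
                  direction_distance, area_ratio, w):
    """Sum of the individual penalties for one object-trajectory pair."""
    p_distance = distance_m * w.w1
    p_mask = w.w3 if touches_mask else w.w2
    p_lifetime = w.w4 / lifetime
    p_gap = (t2 - t1) * w.w5
    p_direction = direction_distance / w.w6
    p_area = area_ratio * w.w7
    return p_distance + p_mask + p_lifetime + p_gap + p_direction + p_area
```

For example, with `Weights(1.0, 0.0, 5.0, 10.0, 0.5, 2.0, 1.0)`, touching the mask raises the total by w₃ - w₂ = 5 while all other terms stay the same, which is how the ratio of the two coefficients steers the matcher away from the spectator area.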

Other penalties may be added when calculating the total penalty, provided that the shooting conditions or the actions of the participants during the sporting event can be described by a quantitative penalty.

If the object in the current frame is not matched to any of the trajectories under consideration, a new trajectory is created for it.

To reduce the number of trajectories (to bring the number of final trajectories as close as possible to the actual number of participants in the sporting event), additional rules can be added, applied, among other things, at the post-processing stage (after all available frames have been processed and all objects have been matched to trajectories). Such rules can be based on the predominant position of objects in certain zones known a priori (for example, the goalkeeper) or on additional collision analysis (for example, based on the permissible segment area). A collision occurs when the images of two or more participants overlap. In this case the procedure that extracts regions containing images of participants may mistake such images for the image of a single participant, which leads to errors in determining the participants' trajectories. To detect collisions among the constructed trajectories, the distance between trajectories at synchronous moments of time is analyzed. If the distance is below the threshold and one of the trajectories breaks off at one of the subsequent moments, the moment at which the trajectory breaks off is taken as the moment of collision of the analyzed trajectories. The threshold depends on the characteristics of the cameras, their position relative to the site, the type of sporting event, and the shooting conditions. In addition, detection may also take into account the number of newly created trajectories within an a priori specified radius (set by the operator when preparing the video sequence for processing) around the collision location at the moments closest in time to the collision. Collision moments can also be detected from the difference between the known number of participants actually present on the site (known from the protocol of the sporting event) and the number of participants found by the participant image extraction procedure at the corresponding moment.
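The distance-based collision test can be sketched as follows. This is an illustration under stated assumptions: trajectories are stored as dictionaries mapping a frame index to ground-plane coordinates in meters, a trajectory "breaks off" when it ends before the other one, and the `lookahead` window bounding "one of the subsequent moments" is a hypothetical parameter.

```python
import math

def find_collision(traj_a, traj_b, dist_threshold, lookahead=5):
    """Return the frame taken as the collision moment, or None.

    traj_a, traj_b: dict mapping frame index -> (x, y) in meters.
    """
    end_a, end_b = max(traj_a), max(traj_b)
    if end_a == end_b:
        return None  # neither trajectory broke off while the other continued
    break_frame = min(end_a, end_b)
    # Compare positions only at synchronous moments (frames present in both).
    for frame in sorted(set(traj_a) & set(traj_b)):
        close = math.dist(traj_a[frame], traj_b[frame]) < dist_threshold
        if close and 0 <= break_frame - frame <= lookahead:
            return break_frame  # the moment at which the trajectory broke off
    return None
```

The other cues mentioned above (new trajectories appearing near the collision location, or a mismatch with the participant count from the protocol) are not modeled here.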

Various approaches can be used to correct the errors caused by collisions. For example, for each trajectory presumably involved in a collision, its team membership is analyzed for all moments preceding the collision and for all moments following it. The team label assigned to the analyzed trajectory segment is the most frequent team label among all objects of that trajectory (from the set "team 1", "team 2", "referee team"). That is, the most frequent team label is sought only among objects whose label is other than "team not defined".
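The team-label rule can be illustrated with a short sketch. The data layout (a dictionary from frame index to the team label of the object in that frame) and the function names are assumptions made for the example.

```python
from collections import Counter

UNDEFINED = "team not defined"

def segment_team(labels):
    """Most frequent team label, ignoring objects with an undefined team."""
    known = [label for label in labels if label != UNDEFINED]
    if not known:
        return UNDEFINED
    return Counter(known).most_common(1)[0][0]

def teams_around_collision(labels_by_frame, collision_frame):
    """Classify a trajectory separately before and after the collision."""
    before = [lab for f, lab in labels_by_frame.items() if f < collision_frame]
    after = [lab for f, lab in labels_by_frame.items() if f > collision_frame]
    return segment_team(before), segment_team(after)
```

If the two returned labels differ, the trajectory has probably switched participants at the collision, which is the case the separate classification is meant to expose.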

If a suspected collision involves two trajectories classified as belonging to different teams, they can be separated by classifying the trajectories separately before and after the moment of collision.

If more than two trajectories are involved in the collision, or the trajectories are identified as belonging to the same team, possible errors can be avoided by analyzing the direction of movement of the participants' images before, after, and at the moment of collision on the basis of image processing, for example by the optical flow method.

In addition, to avoid possible errors, images from additional video material (for example, a television broadcast) taken at the moments corresponding to the collision can be processed for the trajectories involved. This makes it possible to analyze additional, higher-resolution views of the participants in which their images can be separated.

The number of trajectories can also be reduced by identifying the participants by the numbers on their uniforms using optical recognition and machine learning. Motion analysis is performed for all segments marked as "identifiable". Among the marked segments, those are selected for which the displacement vector of the corresponding trajectory objects points "away from the camera" (this is done for each camera). This is possible because the position of each object of each trajectory is known in the world coordinate system (in meters), and the position of every camera of the system is known in the same world coordinate system (as a result of the calibration procedure). Such a position of the participant's image in the frame provides the most favorable conditions for number recognition: the largest number on a participant's uniform is usually on the back, so this procedure finds images of numbers from the most favorable angle.
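A minimal sketch of the "away from the camera" test follows. It assumes that the camera position is projected onto the ground plane and that a movement counts as "away" when the cosine of the angle between the displacement and the camera-to-participant direction exceeds a threshold; the `min_cos` value and the function name are hypothetical.

```python
import math

def moves_away_from_camera(prev_xy, curr_xy, camera_xy, min_cos=0.5):
    """True if the displacement roughly follows the camera's line of sight.

    All coordinates are ground-plane world coordinates in meters.
    """
    mx, my = curr_xy[0] - prev_xy[0], curr_xy[1] - prev_xy[1]
    vx, vy = curr_xy[0] - camera_xy[0], curr_xy[1] - camera_xy[1]
    m_norm, v_norm = math.hypot(mx, my), math.hypot(vx, vy)
    if m_norm == 0.0 or v_norm == 0.0:
        return False  # no movement, or the participant is at the camera position
    cos_angle = (mx * vx + my * vy) / (m_norm * v_norm)
    return cos_angle >= min_cos
```

The test would be repeated for every camera, and a segment kept for number recognition only in the views where it returns `True`.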

A bounding rectangle is defined around each selected segment and filled with the corresponding region of the source image after high-pass filtering.

The participant image obtained in this way is checked for the presence of digits. To do this, the image is scanned in turn with a set of windows of different sizes, none of which exceeds the size of the participant image. The window sizes in the scanning set are chosen a priori or calculated from the position of the participant's image in the video frame so that they correspond to the most probable size of a digit on the participant's uniform. For example, windows of eight sizes can be used for scanning, each size larger than the previous one by 1 pixel on every side of the window (the aspect ratio of the scanning window is preserved).

For each position of the scanning window of each size, the values of the features selected for digit recognition at the machine learning stage are calculated. Machine learning can be performed in advance on a database of digit images typical of the given sport, of the fonts used for the digits on the uniforms of the playing teams, of the distortions that arise, for example, from the participant's movement during the match, and so on. The calculated feature values are fed to the input of a trained algorithm that analyzes these features, and each position of the scanning window of each size is assigned a digit with a certain probability. The outcome that the analyzed image contains no digit is also allowed. The digit with the highest probability, or the "no digit" outcome, is selected as the recognition result for the given position of the given scanning window.

Thus, for the scanned participant image, a set of probable digits (or "no digit" outcomes) is obtained, corresponding to different parts of the image. This set is analyzed for the presence of digits (for example, whether an absolute majority of the members of the set "vote" for the presence of some digit in the image), and for the number of digits in the number and their final values (for example, by analyzing the two-dimensional probability distribution of the set members' values in terms of the ratio of the distribution peak heights and the distance between the peaks).
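The voting step can be illustrated by a deliberately simplified sketch. It is a stand-in for the two-dimensional distribution analysis mentioned above, not a reproduction of it: each window contributes its horizontal center and its recognized digit (or `None`), a second digit is accepted when its vote count reaches a hypothetical fraction `peak_ratio` of the strongest one, and digits are ordered left to right by mean window position. Numbers with repeated digits (such as 11) are not handled by this simplification.

```python
from collections import Counter, defaultdict

def read_number(predictions, peak_ratio=0.5):
    """Return the recognized number as a string, or None if no digit wins.

    predictions: list of (x_center, digit or None), one per window position.
    """
    digits = [(x, d) for x, d in predictions if d is not None]
    # Presence test: an absolute majority of windows must report a digit.
    if len(digits) * 2 <= len(predictions):
        return None
    peaks = Counter(d for _, d in digits).most_common(2)
    chosen = [peaks[0][0]]
    if len(peaks) == 2 and peaks[1][1] >= peak_ratio * peaks[0][1]:
        chosen.append(peaks[1][0])
    # Order the digits left to right by the mean position of their windows.
    positions = defaultdict(list)
    for x, d in digits:
        positions[d].append(x)
    chosen.sort(key=lambda d: sum(positions[d]) / len(positions[d]))
    return "".join(str(d) for d in chosen)
```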

The participant number obtained in this way can additionally be checked against the database of players present on and absent from the field during the analyzed time interval (information from the match protocol), and against the information on which team the corresponding trajectory belongs to. This information makes it possible to carry out an additional procedure that merges separate trajectories belonging to the same participant. This reduces the number of "unknown" trajectories presented to the operator for manual identification.

In addition, throughout processing a series of warnings is generated for the operator about the need to verify particular moments of the match in cases where an automatic decision is impossible or is highly likely to produce an incorrect result, for example: a collision involving more than two players or two participants from the same team; inability to assign a participant to one of the teams; inability to determine a participant's number; a trajectory ending at a point outside the field exit zones at a time other than the end of a playing period; a new trajectory starting at a point outside the playing area entry zones at a time other than the start of a playing period; and so on.

The video system is calibrated for internal and external parameters using a method for calibrating a video system for monitoring objects on a flat site. The method includes: scanning reference objects in object space with each camera of the video system so that each scanned frame contains an image of at least one reference object; measuring the positions of object image points for each reference object in the scanned frames; determining the internal parameters of each camera for which the total deviation of the measured points from their calculated positions, over all frames scanned by that camera, is minimal; correcting nonlinear distortions in the frames from each camera by applying the corresponding internal camera parameters; and determining the matrix that maps the images obtained by the cameras of the video system into object space. A rectilinear object whose image lies at the smallest distance from the frame center and extends as far as possible toward the frame periphery is chosen as the reference object in object space. For the measured image points of the reference object, an approximating straight line is determined, drawn through at least two measured image points closest to the frame center. To determine the internal parameters of each camera, the total absolute deviation of the measured image points of the reference object from the approximating line is minimized. The relative position of the site elements is determined by processing the image of the site, and the matrices that map image coordinates to object space are determined by matching the positions of at least four points belonging to site elements with the corresponding points in the images obtained by each camera of the video system, after the nonlinear distortions present in those images have been corrected.

At the first stage, the internal parameters of each camera of the video system are determined in order to eliminate nonlinear distortions in the image of the site. Fig. 1 shows the deviation 1 of the straight boards of the playing field from a straight line, caused by nonlinear lens distortion, and marks the center 2 of the frame. To this end, the configuration of the cameras is fixed so that their internal parameters (focal length, f-number, etc.) remain unchanged throughout the calibration procedure and the image of the site is as sharp as possible over the entire frame area. The relative arrangement of the cameras, their number, and their position relative to the site depend on the configuration of the site, the distance to it, and the shooting angle. To process the frames received from the cameras and to carry out the necessary calibration algorithms, a computer with a data storage device and appropriate software can be used, connected to the cameras by communication lines with sufficient bandwidth to transfer images of the required resolution and to control the frame capture mode. The specific configuration of the software and hardware, as well as their architecture, may differ from those described above, provided that it remains possible to control the camera operating modes and to capture, store, and process frames in accordance with the method disclosed in the present invention.

After the internal parameters of the cameras have been fixed, the rectilinear marking elements 3 are scanned by each camera of the video system through a series of linear displacements and rotations of the cameras relative to the site (Fig. 2). This scanning yields a set of frames, each containing at least one image of a straight line 3 that passes through the center 2 of the frame and reaches the frame periphery. When shooting a hockey match, for example, such lines may be the images of the blue lines separating the defending and attacking zones from the neutral zone and/or the image of the rectilinear part of the yellow kick plate.

For the image of each rectilinear object satisfying the conditions described above, the positions of image points belonging to a single line in object space are measured on the scanned frames. The measurements can be obtained either by an operator manually selecting the required number of points on the images or by selecting them automatically using image recognition algorithms. The minimum required number of points is three; at least two of them must lie at the smallest distance from the center 2 of the frame, and the rest at the greatest distance from it.

For each such line, an approximating straight line is found that passes through at least two measured points of its image located closest to the center 2 of the frame.

The total absolute deviation of the measured points from the corresponding approximating straight line is determined. Using optimization algorithms, for example, the internal camera parameters (common to all frames scanned by one video camera) are selected such that the total deviation of the measured points from their corresponding approximating lines is minimal over all frames scanned by that camera.
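
The straightness metric described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the invention: the function name `straightness_cost`, the use of NumPy, and the choice of exactly two anchor points (the text allows "at least two") are assumptions.

```python
import numpy as np

def straightness_cost(points, center):
    """Total absolute deviation of the measured points from the straight
    line drawn through the two points nearest to the frame center."""
    pts = np.asarray(points, dtype=float)
    c = np.asarray(center, dtype=float)
    # Order the points by distance from the frame center and take the
    # two nearest ones as anchors of the approximating line (assumption).
    order = np.argsort(np.linalg.norm(pts - c, axis=1))
    p0, p1 = pts[order[0]], pts[order[1]]
    direction = p1 - p0
    # Unit normal to the approximating line.
    normal = np.array([-direction[1], direction[0]]) / np.linalg.norm(direction)
    # Perpendicular distance of every point to the line, summed.
    return float(np.abs((pts - p0) @ normal).sum())
```

An optimizer would evaluate this cost on the corrected points of every scanned line and sum the results over all frames from one camera.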

To select the internal camera parameters and correct distortion, a known mathematical model can be used, for example [Description of the calibration parameters. Internal camera parameters (camera model). http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/parameters.html].

The measured point coordinates are brought into the unit plane:

where x¹, y¹ are the abscissas and ordinates of the points in the unit plane, respectively; c_x, c_y are the abscissa and ordinate of the center of distortion; and ƒ_x, ƒ_y are the ratios of the focal length to the pixel size along x and y, respectively.
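
Under the cited camera model, and assuming zero skew, this normalization can be written as shown below. The symbols x and y for the measured pixel coordinates are a naming assumption; the remaining symbols follow the definitions above.

```latex
x^{1} = \frac{x - c_x}{f_x}, \qquad y^{1} = \frac{y - c_y}{f_y}
```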

The distortions caused by the radial and tangential components of distortion are then determined, for example, by the following formulas [Description of the calibration parameters. Internal camera parameters (camera model). http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/parameters.html]:

where k_r is the distortion caused by the radial component of distortion, r² is the square of the distance from the center of distortion to each of the points in the unit plane, and k₁, k₂, k₃ are the radial distortion coefficients.

where t_x, t_y are the distortions of the abscissa and ordinate of the points in the unit plane caused by the tangential component of distortion.
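
For reference, the cited camera model defines the radial factor and the tangential offsets as sketched below. The names p_1 and p_2 for the tangential coefficients are an assumption borrowed from common usage; the text above does not name them.

```latex
\begin{aligned}
r^{2} &= (x^{1})^{2} + (y^{1})^{2}, \\
k_r   &= 1 + k_1 r^{2} + k_2 r^{4} + k_3 r^{6}, \\
t_x   &= 2 p_1 x^{1} y^{1} + p_2 \left( r^{2} + 2 (x^{1})^{2} \right), \\
t_y   &= p_1 \left( r^{2} + 2 (y^{1})^{2} \right) + 2 p_2 x^{1} y^{1}
\end{aligned}
```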

The errors introduced by distortion are removed, for example, by the relation:

where the resulting values are the abscissas and ordinates of the points in the unit plane with the distortion corrected.

The corrected coordinates are returned from the unit plane to the image plane:
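
This step is the inverse of the normalization and, assuming zero skew, can be written as follows. The symbols \hat{x}, \hat{y} (corrected unit-plane coordinates) and x_c, y_c (corrected pixel coordinates) are hypothetical names introduced only for this sketch.

```latex
x_c = f_x \, \hat{x} + c_x, \qquad y_c = f_y \, \hat{y} + c_y
```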

It should be noted that, since the distortion model introduces the distortions in the unit plane, the ratio of the focal length to the pixel size ƒ_x, ƒ_y may be chosen arbitrarily, provided it is close to the actual value.

In the known method (US 9007463 B2), the metric minimized when selecting the internal camera parameters is the reprojection error of the calibration marks (calibration grid). That is, the coordinates of the calibration grid, known with high accuracy in object space, are projected into the frame plane through the internal and external camera parameters being fitted. The internal parameters include the ratio of the focal length to the pixel size, the center of distortion and the distortion coefficients; the external parameters describe the position of the camera relative to the coordinate system in which the known coordinates of the calibration grid are measured in object space.

In the disclosed invention, the external camera parameters are not involved in selecting the internal ones, because no template is projected from object space into image space. The metric is in no way related to the coordinates of objects in object space. All calculations are carried out in image space only (the model exists only in that space). The metric being minimized is the deviation of the image of an object known to be rectilinear from a straight line, since this deviation (nonlinearity in the image of the object) is caused by the distortion of the imaging lens. The model is a straight line drawn through the points closest to the center of the frame, because in that zone the influence of distortion is negligible and the constructed line is the most reliable one (closest to the position that the image of the rectilinear object would occupy if the lens had no distortion).

Nonlinear distortions in the image of the site are corrected in all frames using the internal parameters found for the corresponding cameras.
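
The whole correction chain for a set of measured points can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the invention: the function name `correct_points`, the tangential coefficient names `p`, and in particular step 3 (applying the radial factor and tangential offsets directly as the correction) are assumptions.

```python
import numpy as np

def correct_points(pixels, fx, fy, cx, cy, k=(0.0, 0.0, 0.0), p=(0.0, 0.0)):
    """Map measured pixel coordinates to distortion-corrected pixel coordinates."""
    pts = np.asarray(pixels, dtype=float)
    # 1. Bring the measured points into the unit plane.
    x = (pts[:, 0] - cx) / fx
    y = (pts[:, 1] - cy) / fy
    # 2. Radial factor and tangential offsets of the cited camera model.
    r2 = x * x + y * y
    k_r = 1.0 + k[0] * r2 + k[1] * r2**2 + k[2] * r2**3
    t_x = 2.0 * p[0] * x * y + p[1] * (r2 + 2.0 * x * x)
    t_y = p[0] * (r2 + 2.0 * y * y) + 2.0 * p[1] * x * y
    # 3. Assumed correction step: apply the factor and offsets directly.
    x_c = x * k_r + t_x
    y_c = y * k_r + t_y
    # 4. Return the corrected points to the image plane.
    return np.column_stack((x_c * fx + cx, y_c * fy + cy))
```

With all coefficients set to zero the function returns its input unchanged, which is a convenient sanity check before fitting.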

The task of the second calibration stage is to determine the coefficients for conversion between the coordinate system associated with the flat site and the coordinate systems of each camera of the video system (Fig. 3), i.e. to determine the matrices that convert coordinates in the images obtained by the cameras of the video system into object space.

To this end, a panoramic image 4 of the site is first obtained by scanning the surface of the flat site with at least one camera, correcting the nonlinear distortions of the resulting frames, and then stitching the scanned frames into a single panorama of the site.

Next, the relative positions of the elements of the site are determined by processing its image.

Although the layout of the site markings is usually unknown without additional measurements, the markings may include elements whose configuration is strictly prescribed by the rules for the site. Examples are the width of the goal and the dimensions of the penalty area on a football pitch, or the dimensions of the goal and the configuration of the face-off spots on a hockey rink.

By means of image processing, the coordinates of the images of the known marking elements are determined in the panoramic image 4, and a transformation of the coordinate system of the panoramic image 4 is found that converts the coordinates of the elements in the panoramic image 4 into the coordinate system associated with the flat site. To do this, optimization algorithms are used to minimize the absolute deviation between the sizes of the known marking elements and the sizes of their images in the panoramic image 4 after the transformation. In this way, a homography matrix is determined, for example, that recalculates the site coordinates in the panoramic image 4 so that the prescribed sizes of the marking elements in object space coincide with their sizes in the image up to a scale factor S; consequently, the relative positions of the elements of the site become known.
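
The cost minimized at this step can be sketched as follows, treating the sought transformation as a plane-to-plane projective mapping (homography). The function names, the representation of each marking element as a segment between two image points, and the convention that the scale factor multiplies the transformed length are all assumptions made for illustration.

```python
import numpy as np

def apply_homography(H, points):
    """Apply a 3x3 projective transformation to an array of 2D points."""
    pts = np.asarray(points, dtype=float)
    homog = np.column_stack((pts, np.ones(len(pts))))
    mapped = homog @ np.asarray(H, dtype=float).T
    # Divide by the third homogeneous coordinate.
    return mapped[:, :2] / mapped[:, 2:3]

def size_deviation(H, scale, segments, known_lengths):
    """Sum of |known size - scale * transformed size| over marking elements."""
    total = 0.0
    for (a, b), known in zip(segments, known_lengths):
        pa, pb = apply_homography(H, [a, b])
        total += abs(known - scale * np.linalg.norm(pb - pa))
    return float(total)
```

An optimizer would vary the entries of `H` and the value of `scale` until `size_deviation` is as small as possible for all regulated marking elements found in the panorama.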

Next, the positions of the cameras of the video system relative to one another and relative to the site are rigidly fixed. The fixed position must correspond to the working position of the video system required for filming events on the flat site.

The positions of at least four points 6 are measured in the panoramic image of the site, together with the corresponding points 6 in the images 7 of the site obtained by each camera of the system (after the nonlinear distortions present in them have been corrected). The points in the panoramic image are converted into the site coordinate system using the homography matrix found earlier and the scale factor S. For each camera, a homography matrix is determined that relates the converted points of the panoramic image to the points in the image of that camera. The homography matrices obtained in this way define the external parameters of the cameras, namely their position in the site coordinate system (the world coordinate system), by converting coordinates in the images obtained by the cameras of the video system into object space.
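
The text does not prescribe how the per-camera matrix is computed from the point correspondences. One common choice, swapped in here purely as an illustration, is the direct linear transform (DLT) solved with a singular value decomposition. The function name `fit_homography` and the use of NumPy are assumptions.

```python
import numpy as np

def fit_homography(src, dst):
    """Estimate the 3x3 matrix H mapping src points to dst points (DLT).

    Requires at least four correspondences, no three of them collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    if len(src) < 4 or len(src) != len(dst):
        raise ValueError("need at least four matching point pairs")
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in H.
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    # Normalize; assumes H[2, 2] is non-zero, which holds in typical setups.
    return H / H[2, 2]
```

Here `src` would hold the points measured in a camera image and `dst` the matching panorama points already converted to site coordinates; with more than four pairs the same code returns a least-squares estimate.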

If the relative arrangement of the cameras of the video system changes, or the position of the video system changes relative to the world coordinate system, it is sufficient to find new correspondences between points in the frames from the cameras in their new position and the panoramic frame, and then to repeat the calculation of the homography matrices of the individual cameras. Such correspondences can be found either manually or automatically by software installed on the processing computer.

Applying the claimed method makes it possible to estimate the focal length, the center of the field of view, the distortion coefficients and the homography coefficients of the cameras. The calibrated video system is then ready to monitor the positions of objects on the site.

Thus, the described calibration method requires neither laborious preliminary measurements of the site and of the markings applied to it nor the placement of special calibration markers on the flat site, and it provides a convenient way to calibrate video systems, including varifocal video systems with a reconfigurable layout.

As a result of the video sequence processing procedures described above, a report on the movements of each participant in the sporting event during the match is generated automatically, which makes it possible to optimize the effectiveness of the team as a whole. The report contains the calculated coordinates of the spatial positions of the participants on the playing field in the world coordinate system (tied to the playing field), referenced to the game clock. If the report contains warning marks attached to particular points of the trajectories, it is presented to the operator for manual correction.

Thus, the number of errors is reduced both when detecting the match participants on the playing field and when matching individual images of the participants to the identified trajectories, which ultimately reduces the number of corrections required from the operator.

Claims

1. A method of controlling the spatial position of participants in a sporting event on a playing field, comprising: obtaining, by means of at least one video system calibrated for internal and external parameters, a video sequence of color frames containing images of the participants in the sporting event on the playing field and of the playing field during the sporting event; selecting, in each frame of the obtained video sequence, regions containing images of the participants in the sporting event and determining the positions of these regions in the frame; tracking the movements of said regions across the frames of the video sequence; identifying the participants in the sporting event by their assigned game numbers and by the colors of their uniforms; and processing the obtained data and determining the trajectories of the participants by comparing the images of the participants in each frame with the images of the participants in the preceding frames of the video sequence; characterized in that the illumination of the image is normalized, which includes filtering the image, converting the image to grayscale, subjecting the grayscale image to histogram equalization, and normalizing the image intensity to the range from 0 to 1 so as to form a matrix of normalizing coefficients, which is inverted, after which all frames of the video sequence are multiplied by the resulting matrix of coefficients; the selection, in each frame of the obtained video sequence, of regions containing images of the participants is carried out by multi-pass image segmentation using an algorithm for interactive separation of an object from the background based on iterated graph cuts, trained in automatic mode, the training being carried out using a binarized mask created from a previously obtained segment; each participant in the sporting event is identified by the colors of the uniform by classifying the image of the participant into a priori specified groups by determining the parameters and characteristics of the joint distribution of the frequency with which the values of the chromatic components of the image fall within specified ranges, this joint distribution being determined by a two-dimensional histogram in which the number of intervals depends on the type of sporting event and on the number of colors in the uniforms of the participants, the histogram being a pseudo-image that is binarized at a low threshold; the total intensity of the binarized pseudo-image is found, which represents the number of pixels of the pseudo-image that have passed the threshold; the resulting total intensity is compared with an area threshold; if the total intensity is less than the area threshold, the corresponding segment is marked as “bad” and does not take part in further processing, and if it is greater than or equal to the area threshold, the corresponding segment is marked as “identifiable”; and the trajectories of the participants are determined by finding the lowest value of a metric that is determined for each pair “possible trajectory - possible participant” by comparing the parameters and characteristics of the images of the participants in each frame with those of the images of the participants in the preceding frames of the video sequence, the metric being a total penalty over several criteria, wherein the pair “possible trajectory - possible participant” having the lowest metric value is assigned as the continuation of the object's trajectory in the current frame if the distance between the coordinates of the object in the current frame and the last coordinates of the object's trajectory is less than a set threshold value that depends on the maximum possible speed of a participant in the sporting event and on the frame rate of the cameras.

2. The method according to claim 1, characterized in that the video system is calibrated for internal and external parameters using a method of calibrating a video system for monitoring objects on a flat site, which includes: scanning reference objects in object space with each camera of the video system so that each scanned frame contains an image of at least one reference object; measuring, for each reference object, the positions of the image points of the object in the scanned frames; determining the internal parameters of each camera for which the total deviation of the measured points from their calculated positions is minimal over all frames scanned by that camera; correcting the nonlinear distortions in the frames from each camera by applying the corresponding internal camera parameters; and determining the matrices for converting coordinates in the images obtained by the cameras of the video system into object space; wherein a rectilinear object whose image lies at the smallest distance from the center of the frame and has the maximum extent toward the periphery of the frame is chosen as the reference object in object space; for the measured image points of the reference object, an approximating straight line is determined that passes through at least two measured image points of the object located closest to the center of the frame; the internal parameters of each camera are determined so as to minimize the total absolute deviation of the positions of the measured image points of the reference object from the approximating straight line; the relative positions of the elements of the site are determined by processing its image; and the matrices for converting coordinates in the images obtained by the cameras of the video system into object space are determined by matching the positions of at least four points belonging to the elements of the site with the corresponding points in the images obtained by each camera of the video system after the nonlinear distortions present in them have been corrected.