RU2805760C1

RU2805760C1 - Method and device for retail goods identification

Info

Publication number: RU2805760C1
Application number: RU2023103180A
Authority: RU
Inventors: Анна Ильинична СОКОЛОВА; Александр Георгиевич ЛИМОНОВ; Анна Борисовна ВОРОНЦОВА; Антон Сергеевич Конушин
Original assignee: Самсунг Электроникс Ко., Лтд.
Filing date: 2023-02-13
Publication date: 2023-10-23

Abstract

FIELD: goods identification.

SUBSTANCE: methods for identifying retail goods located on supermarket shelves. The product identification device contains a camera connected to an inertial measurement unit and a detector, a computing unit connected to the inertial measurement unit and the detector, a database, and a comparison unit connected to the computing unit and the database. To identify the goods, an image of the rack with goods is taken using the camera. The angle between the optical axis of the camera when shooting and the direction of gravity is determined. Each image of a product from a shelf that appears in the captured image is detected and enclosed in a bounding rectangle. The position of the center of each bounding rectangle in the image captured by the camera is determined. The angle between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle is calculated. A set of images of known products is retrieved from the database, from which the image that has the highest degree of agreement with the detected image of the product from the shelf is selected, and the product is identified.

EFFECT: increasing the accuracy of identification of goods located on the shelf of the rack.

3 cl, 4 dwg

Description

Field of technology

The present invention relates to methods for identifying retail goods located on rack shelves in a supermarket.

Description of the Prior Art

Video analytics of goods and retail shelving is currently an actively developing field. Competition among retail chains drives the development of new approaches to attracting customers. Computing power is rapidly increasing and becoming cheaper, which opens up new technological opportunities that can be used, among other areas, in retail. Hypermarkets hold thousands of goods on their shelves, and according to statistics the average attendance of a hypermarket ranges from 16,000 to 18,000 people per day. Technological solutions are therefore needed that help quickly analyze the condition of goods on the shelves, their availability and their expiration dates, so that unusable goods can be replaced with fresh ones and the shelves can be replenished with popular goods in a timely manner.

Retail product recognition is an important task of visual analytics in retail. Product recognition may be required in retail business systems such as:

internal management systems;

consultant robots and mobile devices that assist supermarket staff;

marketing campaigns;

recommendations to the customer based on what the customer is looking at;

customer service;

helping the customer find the product they are looking for.

Products on store shelves are arranged arbitrarily, so labels may not be visible. Since most products look different from different sides, this makes it extremely difficult for both store employees and customers to recognize and find a specific product. This problem can be solved by expanding the product database with images of the products taken from different angles. However, this significantly increases the time required to recognize a product, since the number of images to compare against increases accordingly.

Document US 20200380226 A1 (published 3 December 2020) is known from the prior art and discloses a method and device for identifying goods. Product images captured with conventional or plenoptic cameras are processed to produce several different images through perspective transformation. Folds and other deformations of the product packaging are detected optically. The known system contains first and second sensors, one of which comprises a camera while the other indicates the presence or absence of goods on the shelf, as well as one or more processors and one or more storage devices. A disadvantage of the known solution is the need for two sensors, one of which must be attached to the ceiling and capture the scene from a known fixed angle, which complicates this approach and limits its applicability. In addition, this known approach uses additional characteristics such as barcodes, text or watermarks to identify the product, which adds complexity to the process.

Document US 10922353 B2 (published 16 February 2021) is known from the prior art and discloses a system and method for determining an object or product shown in an image. The system receives a first image, determines a region of interest in the first image, determines a classification score for the region of interest using a convolutional neural network that assigns the region of interest a classification score corresponding to a class, and identifies the product in the image based on the classification score. This approach does not use information about the viewing angle of the object, which would improve the quality of recognition.

Document US 8397181 B2 (published 12 March 2013) is known from the prior art and discloses a method for marking the position of a real-world object on a transparent display. The known method involves capturing an image of a real-world object using an imaging device. The position of the object in the real world is calculated from the viewing angle of the object and the distance to the object. The position on the transparent display that corresponds to the real position of the object is determined. A mark is then shown on the display at the location that corresponds to the real-world object. Along with a color camera and a gyroscope, this known solution uses GPS sensors and range sensors to determine the distance to the object. The need for these devices limits the applicability of the solution. In addition, the purpose of the known solution is to display objects in virtual reality, not to identify them.

Thus, there is a need for a device that is easy to use and manufacture and that can recognize a product located on a rack shelf from an image captured by the user's camera.

Summary of the Invention

A product identification device is proposed, comprising:

a camera configured to capture at least one image of goods located on the shelves of a rack;

an inertial measurement unit connected to the camera and configured to determine the angle between the optical axis of the camera when shooting and the direction of gravity;

a detector connected to the camera and configured to:

- detect at least one image of a product from the shelf that appears in the image captured by the camera,

- form bounding rectangles, each of which is circumscribed around one detected image of a product from the shelf,

- determine the position of the center of each bounding rectangle in the image captured by the camera;

a computing unit connected to the inertial measurement unit and the detector and configured to calculate the angle Ɣ_i between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle;

a database that includes images of known products, wherein for each known product the database stores images in each of which the known product is captured at a known angle between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle circumscribed around the known product in the image, said known angles being different for all images of the same known product;

a comparison unit connected to the computing unit and the database and configured to:

- retrieve from the database a set of images of known products captured at said known angle differing from Ɣ_i by no more than a predetermined angle,

- select, from the retrieved set, the image of a known product that has the highest degree of match with the detected image of the product from the shelf compared with the other images in the retrieved set,

- identify the product in the detected image of the product from the shelf based on the selected image of the known product from the database.

A method of operating the product identification device is also proposed, comprising the steps of:

A) capturing at least one image of goods located on the shelves of a rack by means of the camera;

B) determining the angle between the optical axis of the camera when shooting and the direction of gravity by means of the inertial measurement unit;

C) by means of the detector:

- detecting at least one image of a product from the shelf that appears in the captured image,

- forming bounding rectangles, each of which is circumscribed around one detected image of a product from the shelf,

- determining the position of the center of each bounding rectangle in the image captured by the camera;

for each detected image of a product from the shelf:

D) calculating the angle Ɣ_i between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle by means of the computing unit;

E) by means of the comparison unit:

- retrieving from the database a set of images of known products, each captured at a known angle between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle circumscribed around the known product in the database image, wherein said set contains the images whose known angle differs from Ɣ_i by no more than a predetermined angle;

- selecting, from the retrieved set, the image of a known product that has the highest degree of match with the detected image of the product from the shelf compared with the other images in the retrieved set;

- identifying the product in the detected image of the product from the shelf based on the selected image of the known product from the database.

The angle Ɣ_i can be calculated using the relation:

[formula image not reproduced in this text]

where α is the angle between the optical axis of the camera and the direction of gravity,

f is the focal length of the camera lens,

b_i is the ordinate of the center of the bounding rectangle, the origin being located in an upper corner of the image captured by the camera and the y-axis being directed along a side edge of that image,

c is the ordinate of the point of intersection of the optical axis of the camera lens with the image plane.
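The relation itself did not survive in this text. One relation consistent with the definitions of α, f, b_i and c can be sketched as follows. It assumes a pinhole camera held without roll, f, b_i and c expressed in the same units (for example pixels), the y-axis pointing down the image, and Ɣ_i counted as positive above the horizontal plane. These conventions are assumptions, and the formula in the original publication may use a different sign or reference direction.

```latex
\[
  \gamma_i \;=\; \alpha \;-\; \frac{\pi}{2} \;-\; \arctan\!\left(\frac{b_i - c}{f}\right)
\]
```

Here α − π/2 is the elevation of the optical axis above the horizontal plane (zero when the camera looks horizontally, so that α = π/2), and arctan((b_i − c)/f) is the angular offset of the ray below the optical axis for a bounding rectangle whose center lies b_i − c below the principal point.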

Brief description of drawings

The above and other features and advantages of the present invention are explained in the following description, illustrated by the drawings, in which:

Fig. 1 briefly illustrates the algorithm of the proposed method.

Fig. 2 illustrates how the appearance of the same product in an image differs depending on its position relative to the camera during shooting.

Fig. 3 illustrates the geometric rationale for calculating the angle Ɣ_i.

Fig. 4 illustrates a block diagram of the proposed method for recognizing retail products in an image captured by a camera.

Detailed Description of the Invention

The proposed invention makes it possible to quickly and accurately recognize a product located on a rack shelf using only its image captured by the camera at the current moment.

The proposed retail product identification device is an electronic device comprising a camera for capturing an image, an inertial measurement unit (IMU) connected to the camera, a detector connected to the camera, a computing unit connected to the IMU and the detector, a database, and a comparison unit connected to the computing unit and the database.

IMUs are known from the prior art. An IMU is an electronic device that uses a combination of accelerometers, gyroscopes and magnetometers to measure and report angular velocity, that is, the speed at which the camera rotates about its axis of rotation as it moves, and the orientation of the body (rotation angles about three axes). In this case the IMU determines the angle between the optical axis of the camera when shooting and the direction of gravity. The present invention proposes using the camera orientation information to estimate the viewing angle and improve recognition of products on the shelf. The IMU may be located inside the camera or attached to the camera.
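As an illustration of how the angle between the optical axis and the direction of gravity might be obtained, the sketch below computes it from a gravity direction vector expressed in the camera frame. The function name, the vector layout and the choice of (0, 0, 1) as the optical axis are assumptions made for this example; real IMU drivers differ in axis conventions and in the sign of the reported acceleration.

```python
import math


def angle_to_gravity(gravity_cam, optical_axis=(0.0, 0.0, 1.0)):
    """Return the angle alpha (radians) between the optical axis and gravity.

    gravity_cam  -- gravity direction in the camera frame (assumed input,
                    e.g. derived from a stationary accelerometer reading)
    optical_axis -- optical axis in the same frame (assumed to be +z)
    """
    dot = sum(g * a for g, a in zip(gravity_cam, optical_axis))
    norm_g = math.sqrt(sum(g * g for g in gravity_cam))
    norm_a = math.sqrt(sum(a * a for a in optical_axis))
    if norm_g == 0.0 or norm_a == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_alpha = max(-1.0, min(1.0, dot / (norm_g * norm_a)))
    return math.acos(cos_alpha)
```

With these conventions a camera looking horizontally gives α = π/2, and a camera pointing straight down gives α = 0.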

The proposed device can be integrated into a smartphone or any other suitable electronic computing device that has a camera, memory and the ability to run software.

Fig. 1 briefly illustrates the algorithm of the proposed method for identifying goods on a rack shelf.

A) The user photographs a shelf with goods using the camera, obtaining an image of the shelf with goods.

B) Software detects each image of a product from the shelf in the obtained image of the shelf with goods. To do this, the obtained image is scanned by a detection algorithm that finds images of individual items in the image of the shelf with goods. For brevity, this algorithm is referred to below as the detector. Such detectors are known in the art, for example the DenseDet algorithm described in Tianze Rong, Yanjia Zhu, Hongxiang Cai, and Yichao Xiong, "A Solution to Product Detection in Densely Packed Scenes", arXiv preprint arXiv:2007.11946 (2020), https://arxiv.org/pdf/2007.11946.pdf, or the RetailDet algorithm described in Chen, F. et al. (2022), "Unitail: Detecting, Reading, and Matching in Retail Scene", in: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds), Computer Vision - ECCV 2022, Lecture Notes in Computer Science, vol 13667, Springer, Cham, https://arxiv.org/pdf/2204.00298v4.pdf.

The detector finds all images of products from the shelf in the image of the shelf with goods obtained by the camera, and each detected image of a product from the shelf is then analyzed.

Each detected image of a product from the shelf is inscribed in a rectangle. As an example, in Fig. 1 (position B) only one product detected by the detector is shown inscribed in a rectangle; however, the entire procedure described below is performed for all products detected in the image, not just one.

In known detection algorithms, the objects sought in captured images are parameterized by the coordinates of bounding rectangles that contain the image of the object. A rectangle is the generally accepted shape used to delimit the image of a product for further calculations. When the image is processed, a bounding rectangle is formed that is circumscribed around the image of the object, that is, the image of the object is inscribed in the bounding rectangle. The output of the detector is the coordinates of the bounding rectangles.
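Since the detector outputs the coordinates of bounding rectangles, the center of each rectangle follows directly from its corners. The minimal sketch below assumes the rectangles are given as (x_min, y_min, x_max, y_max) tuples in pixels; this layout and the function name are assumptions for illustration, and a particular detector may use a different format.

```python
def box_centers(boxes):
    """Return the (x, y) center of each bounding rectangle.

    boxes -- iterable of (x_min, y_min, x_max, y_max) tuples (assumed format)
    """
    return [((x0 + x1) / 2.0, (y0 + y1) / 2.0) for (x0, y0, x1, y1) in boxes]
```

The y component of each center is the ordinate b_i used later when the viewing angle of the i-th detected product is computed.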

A coordinate system is associated with the image obtained by the camera. The image obtained by the camera is an image of a rack with shelves on which the goods to be identified are located. The image of the rack has a top edge located above the image of the topmost shelf in the frame, a bottom edge located below the lowest shelf in the frame, and side edges perpendicular to the top and bottom edges of the frame. The origin is placed in one of the upper corners of the image obtained by the camera, the y-axis is directed along one of the side edges of the image, and the x-axis along the top or bottom edge of the image.

In Fig. 1, the center of the bounding rectangle that encloses one of the detected products is marked as point b. Point b has the coordinate b_i along the y-axis, where i is the index (number) of the detected image of a product from the shelf, i = 1, 2…N. The image captured by the camera also shows point C, the intersection of the optical axis of the camera lens with the plane of the image captured by the camera, and the projection c of point C onto the y-axis.

C) Based on the data received from the inertial measurement unit, the camera parameters (described below) are determined in order to calculate the angle Ɣ_i between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle in which the detected image of the i-th product from the shelf is inscribed. The angle Ɣ_i is the viewing angle at which the product on the shelf is seen in the image, or in other words, the orientation Ɣ_i of the product on the shelf relative to the camera when shooting.
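A numerical sketch of this step is given below. It is not the formula of the publication but one pinhole-camera relation that matches the quantities named in the text: α (angle between the optical axis and gravity), f (focal length), b_i (ordinate of the rectangle center) and c (ordinate of the principal point). The sign convention, the absence of camera roll and the function name are assumptions.

```python
import math


def viewing_angle(alpha, f, b_i, c):
    """Signed angle (radians) between the horizontal plane and the ray
    through the lens center and the bounding-rectangle center.

    Assumed conventions (not stated in the source): pinhole camera without
    roll, image y-axis pointing down, f, b_i and c in the same units,
    result positive when the ray points above the horizontal plane.
    """
    if f <= 0:
        raise ValueError("focal length must be positive")
    axis_elevation = alpha - math.pi / 2.0        # optical axis vs. horizontal
    offset_below_axis = math.atan((b_i - c) / f)  # ray vs. optical axis
    return axis_elevation - offset_below_axis
```

For a camera held horizontally (α = π/2) and a rectangle centered on the principal point, the function returns zero; a rectangle centered below the principal point gives a negative angle, meaning the product is viewed from above.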

D) Each detected shelf product image is compared with images of known products from the database. The database holds images of known products (items) that may be placed on a store shelf; for each known product it holds images taken at different orientations of that product relative to the camera, that is, at different angles Ɣ_j, where j is the index of a known product image in the database, j = 1, 2, …, M. First, the images of all known products whose orientation Ɣ_j relative to the camera matches the orientation of the shelf product detected in the captured image are selected from the database.

E) Then, from the selected database images, the known product image with the highest degree of match to the detected shelf product image b is chosen. The search for and selection of a similar image in the database may be performed, for example, by a neural network or by another suitable method known in the art.

F) Based on the comparison with the selected known product image from the database, the product in the detected shelf product image is identified.

Fig. 2 illustrates identical products 1 and 2 in the captured image of a shelf with goods, as they appear depending on their location relative to the camera during shooting, that is, depending on the viewing angle Ɣ_i and on their rotation about their own axis. The dotted line drawn to the middle of the image on the left side of the drawing marks the projection of point C (the intersection of the camera's optical axis with the image plane) onto the y-axis; that is, c is the y-coordinate of point C. The bounding boxes drawn around product 1 on the bottom shelf and product 2 on the top shelf are examples of product images found by the detector in the captured image. The center of the bounding box of product 1 on the bottom shelf is denoted b1, and the center of the bounding box of product 2 on the top shelf is denoted b2. All shelf product images found by detection in the captured image are processed in this way; the user takes no part in selecting a product from the shelf and only photographs the rack with the camera. Point b₁ denotes the projection onto the y-axis of the center b1 of the bounding box of the detected image of product 1.

Products on a shelf are usually seen at different viewing angles Ɣ relative to the camera. During shooting, the camera may be aimed upward at some products, downward at others, sideways at others, and so on. The same product captured at different angles Ɣ may therefore look different in the frame, because different visual features of the product are visible depending on the viewing angle. For this reason, the set of images of each known product stored in the database must consist of images taken at various predetermined viewing angles Ɣ_j.

Various known methods exist for building a database of images of the same known product placed on a shelf and captured at different known viewing angles Ɣ_j. For the present invention, the following may be used, for example:

1) Including photographs of real products in the database. Each known product is photographed at a predetermined set of angles Ɣ_j (the angle between the horizontal plane and the ray connecting the center of the camera lens to the center of the box bounding the product). For each position of the product at an angle Ɣ_j, photographs are taken while rotating the product about its axis in steps of, for example, 10° to 15° (or any other step), so that the product stays at the same viewing angle Ɣ_j but is seen from different sides, yielding images of the same product with the different visual features visible from each side. When conventional image comparison programs known in the art are used, such features may include shape, color on different sides, the color intensity gradient on one side, the presence of text and pictures on different sides of the product, corners and lines in the image, and so on. When neural networks are used for comparison, the visual features are generally hard to interpret. It is sometimes possible to single out patterns to which the network responds, such as lines at certain angles, text, color, or shapes, but there may also be features that are meaningless to the human eye, because features produced by a neural network are abstract multidimensional vectors.

The resulting images are stored in the database.

2) Including in the database individual product images generated from its 3D model. Given a 3D model of a product, rendering can be performed, that is, different views of the same product can be generated on a computer at various predetermined angles Ɣ_j. Rendering must be done for every product available in the store. The 3D model is obtained with a 3D scanner; such processes are known in the art.

In either variant, the following must be stored in the database for each product available in the store:

- sets of images taken at different angles Ɣ_j between the horizontal plane and the ray connecting the center of the camera lens to the center of the box bounding the product,

- where each set of images at each angle Ɣ_j contains images in which the product is rotated about its vertical axis, horizontal axis, or any other convenient axis (depending on the shape of the product and how it may be placed on the shelf), so that the product's features are visible from its different sides.
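The database contents described above can be pictured as a grid of reference views: one entry per product, per viewing angle, per rotation step. The sketch below is illustrative only; the `ReferenceView` record, the `capture_plan` helper, and the default 15° step are assumptions, and a full 360° rotation is assumed although the text leaves the rotation range open.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ReferenceView:
    """One reference image slot (hypothetical record layout)."""
    product_id: str
    view_angle_deg: float  # viewing angle of the reference image
    rotation_deg: float    # rotation of the product about its own axis


def capture_plan(product_ids, view_angles_deg, rotation_step_deg=15.0):
    """List every (product, viewing angle, rotation) combination to photograph or render."""
    if rotation_step_deg <= 0:
        raise ValueError("rotation_step_deg must be positive")
    # Assumption: the product is rotated through a full turn.
    n_steps = int(round(360.0 / rotation_step_deg))
    rotations = [k * rotation_step_deg for k in range(n_steps)]
    return [
        ReferenceView(pid, angle, rot)
        for pid, angle, rot in product(product_ids, view_angles_deg, rotations)
    ]
```

With three viewing angles and a 15° step, each product would need 3 × 24 = 72 reference images under these assumptions, which illustrates why filtering by viewing angle before comparison reduces the work.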

In the prior art, to identify a product, the detected shelf product image is usually compared with every image variant in the database, which takes considerable time and memory. In the proposed invention, the angle Ɣ_j is known for every known product image stored in the database. During identification, the angle Ɣ_i of the shelf product being identified is first compared with the angle Ɣ_j at which the database images of known products were taken. The number of comparisons against the database is thus substantially reduced, and so is the identification time, because database images whose angle Ɣ_j does not match the angle Ɣ_i of the detected product image are disregarded.

The angle Ɣ_i at which the user photographs the shelf product to be identified is calculated as follows, given the known tilt of the camera's optical axis relative to the horizon and the known position of the product in the captured image. The position of the product in the image is given by the y-coordinate of the center of the bounding box enclosing the detected product image.

Fig. 3 schematically illustrates the geometric basis for calculating the angle Ɣ_i. The entire image frame is shown as segment AB. Point C and its projection c onto the y-axis, shown in Fig. 2, appear in Fig. 3 as the single point c (that is, the ordinate of point C); in the side view, the y-axis coincides with the line containing segment AB.

The x-axis points toward the observer and is therefore not visible in Fig. 3. The values b_i, that is, b₁, b₂, …, b_N, are the y-coordinates of the centers of the bounding boxes enclosing the detected shelf product images; in the side view, each center coincides with its y-coordinate.

c is the y-coordinate (ordinate) of the optical center C of the camera in the coordinate system shown in Fig. 2. Point C is called the principal point of the camera and is the point where the optical axis of the camera lens intersects the image plane; as a rule, it lies at or near the center of the image. In an ideal camera the principal point is exactly at the image center, since an ideal camera has no optical distortions; in real cameras the principal point is slightly offset from the center because of, for example, tangential distortion, imperfect centering of lens components, and other manufacturing defects. The principal point is a known intrinsic parameter of any camera; that is, the value c is known and depends on the chosen camera, and the calculations can use the y-coordinate of the projection of point C onto the y-axis, which is the same for every image taken with the same camera. As shown in Fig. 3, AC is not equal to BC.

Fig. 3 shows examples for two detected shelf product images: b₁ is the projection onto the y-axis of the center b1 of the bounding box around the first detected product image, shown on the bottom shelf in Fig. 2, and b₂ is the projection onto the y-axis of the center b2 of the bounding box around the second detected product image, shown on the top shelf in Fig. 2. If the frame contains several products on one shelf and several on another, detection will find the products on both shelves, regardless of how they are positioned relative to one another. The images of all detected shelf products in the frame are processed by software. The focal length f of the camera lens is an intrinsic parameter of the camera. As stated above, Ɣ_i is the angle between the horizontal plane and the ray connecting the center of the camera to the center of the box bounding the detected shelf product; the horizontal plane is the plane orthogonal to the direction of gravity. The angle α is the angle between the optical axis of the camera and the direction of gravity, determined by means of the IMU.
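A minimal sketch of how α might be derived from IMU data, assuming the IMU supplies a gravity direction vector expressed in the camera's own coordinate frame and that the optical axis is the +z axis of that frame. Both are assumptions: the text does not specify the IMU output format or the axis convention, and `angle_to_gravity_deg` is a hypothetical helper.

```python
import math


def angle_to_gravity_deg(gravity_xyz, optical_axis_xyz=(0.0, 0.0, 1.0)):
    """Angle alpha (degrees) between the optical axis and the gravity direction.

    gravity_xyz: gravity direction in the camera frame (assumed IMU output).
    optical_axis_xyz: optical axis in the same frame (assumed to be +z).
    """
    gx, gy, gz = gravity_xyz
    ax, ay, az = optical_axis_xyz
    g_norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    a_norm = math.sqrt(ax * ax + ay * ay + az * az)
    if g_norm == 0.0 or a_norm == 0.0:
        raise ValueError("vectors must be non-zero")
    cos_alpha = (gx * ax + gy * ay + gz * az) / (g_norm * a_norm)
    # Clamp against rounding error before taking the arccosine.
    cos_alpha = max(-1.0, min(1.0, cos_alpha))
    return math.degrees(math.acos(cos_alpha))
```

Under these conventions a camera held level, with its optical axis parallel to the horizon, gives α = 90°, and a camera pointed straight down gives α = 0°.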

Point O is the center of the camera lens, from which the optical axis of the camera originates. Segment AB, passing through points b₁, c, and b₂, is the projection of the image plane onto the y-axis. The values b_i, that is, b₁, b₂, …, b_N, are the y-coordinates of the centers of the bounding boxes enclosing the detected images of products 1, 2, …, N; in the side view they coincide with the centers of those boxes.

β_i, that is, any of β₁, β₂, …, β_N, is the angle at which the i-th product is seen in the image, that is, the angle between the image plane at the time of shooting (which is parallel to the plane of the camera lens) and the ray connecting the center of the camera to the center of the bounding box enclosing the detected product image. If the detected image of shelf product i is at the center of the image (the center of the frame), then β_i = 0. If the detected image of shelf product i is at the bottom of the image (the bottom of the frame), the product is seen from above and the angle β_i is large. The specific value of β_i depends on the position of the detected shelf product image, the ordinate c of the principal point C of the camera, and the focal length f of the camera.
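The dependence of β_i on b_i, c, and f can be sketched with the standard pinhole camera relation. This is an illustration under stated assumptions, not the patent's own formula: β_i is taken here as measured from the optical axis (consistent with β_i = 0 at the frame center), b_i, c, and f are all expressed in pixels, and `offset_angle_deg` is a hypothetical helper.

```python
import math


def offset_angle_deg(b_i: float, c: float, f: float) -> float:
    """Angle (degrees) between the optical axis and the ray through image row b_i.

    b_i: y-coordinate of the bounding box center, in pixels.
    c:   y-coordinate of the principal point, in pixels.
    f:   focal length, in pixels (assumed unit).
    """
    if f <= 0:
        raise ValueError("focal length must be positive")
    # Pinhole model: tan(beta_i) = (b_i - c) / f.
    return math.degrees(math.atan2(b_i - c, f))
```

With the origin at an upper corner of the image, rows below the principal point give positive angles and rows above it give negative angles.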

The relationships between these quantities, as follows from Fig. 3, are given by the following expressions:

[Expression (1): the formula images are not reproduced in this text.]
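The formulas of expression (1) are not legible in this text. One form they could take, reconstructed from the definitions around Fig. 3, is sketched below. It rests on assumptions that the text does not confirm: a pinhole camera model, b_i, c, and f in the same units, the y-axis pointing down the image, β_i measured from the optical axis, α measured from the downward gravity direction, and Ɣ_i (written here as \gamma_i) counted positive when the ray points below the horizontal plane.

```latex
\[
  \beta_i = \arctan\!\left(\frac{b_i - c}{f}\right),
  \qquad
  \gamma_i = \left(90^\circ - \alpha\right) + \beta_i .
\]
```

Under these conventions, a level camera (α = 90°) looking at a product in the center of the frame (b_i = c) gives Ɣ_i = 0, and both tilting the camera downward and moving the product lower in the frame increase Ɣ_i. A different sign convention would change the signs of the terms but not the structure of the relation.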

In Fig. 3:

Large triangle △ODE: side DE is the plane of the shelf whose image is captured by the camera. Vertex O is the location of the camera. OD and OE bound the camera's field of view.

Small triangle △OAB: side AB is the image plane onto which the camera projects the real shelf. The frame plane (image plane) is orthogonal to the optical axis OH of the camera. Because Fig. 3 is a side view, the image plane appears as segment AB, on which points b₁, b₂, and c lie. The orthogonality of the image plane to the optical axis can be seen in Fig. 3 (Oc ⊥ AB). Since the optical axis may not be parallel to the horizon during shooting (that is, not orthogonal to gravity), line AB, corresponding to the image plane, is not vertical. Points F and G are the centers of the boxes bounding the real shelf products captured in the image. Points b₁ and b₂ are the points onto which the camera projects F and G.

α is the angle between the optical axis of the camera and the axis (direction) of gravity.

OH is the direction of the camera's optical axis during shooting.

Knowing the viewing angle of the detected shelf product image, that is, the angle Ɣ_i calculated using expression (1), the detected image is compared with product images from the database. A set of known product images whose angle Ɣ_j differs from the calculated Ɣ_i by no more than a predetermined angle is selected from the database. From this set, the single known product image whose features have the highest degree of match with the features of the detected shelf product image is then chosen.
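The two-stage selection described above can be sketched as a filter on the viewing angle followed by a best-match search. The record layout (dictionaries with `product_id`, `angle_deg`, and `descriptor` keys), the function names, and the pluggable `similarity` callable are all assumptions made for illustration; the text leaves the comparison method open.

```python
def candidates_by_angle(references, gamma_i_deg, tolerance_deg):
    """Keep reference images whose stored angle is within the tolerance of gamma_i."""
    return [
        ref for ref in references
        if abs(ref["angle_deg"] - gamma_i_deg) <= tolerance_deg
    ]


def identify(query_descriptor, gamma_i_deg, references, tolerance_deg, similarity):
    """Return the product_id of the best-matching candidate, or None if none qualifies.

    similarity: callable(query, reference) -> float, higher meaning more alike
    (hypothetical; any descriptor comparison could be plugged in).
    """
    candidates = candidates_by_angle(references, gamma_i_deg, tolerance_deg)
    if not candidates:
        return None
    best = max(candidates, key=lambda ref: similarity(query_descriptor, ref["descriptor"]))
    return best["product_id"]
```

Only the candidates that pass the angle filter are scored, so the number of descriptor comparisons shrinks in proportion to how selective the predetermined tolerance is.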

The product in the detected shelf product image is identified from the selected database image; that is, it is identified as the product whose image was selected from the database.

Two or more detected shelf product images may have the same y-coordinate b_i of their bounding box centers in the frame, for example if the products stand on the same shelf and are the same size. For these detected images the angles β_i are equal, and so are the calculated angles Ɣ_i. In this case the same set of known product images is selected from the database for all of these detected images. Then, for each of them, the single known product image whose features match those of the detected image is chosen from this set. The real product in each detected image is identified from the selected database image.
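Because detections with the same ordinate share one candidate set, an implementation could group detections by b_i and query the database once per group. The sketch below is one hedged way to do that; the `group_by_ordinate` helper and the rounding of b_i to a chosen precision are assumptions, since the text speaks only of identical coordinates.

```python
from collections import defaultdict


def group_by_ordinate(ordinates, precision=0):
    """Map each rounded ordinate to the indices of the detections that share it.

    precision: number of decimal places kept when comparing ordinates
    (assumption: nearly equal ordinates are treated as equal).
    """
    groups = defaultdict(list)
    for index, b in enumerate(ordinates):
        groups[round(b, precision)].append(index)
    return dict(groups)
```

Each key then needs a single angle calculation and a single database selection, and the per-detection work reduces to the final descriptor comparison.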

The proposed method improves the quality of product recognition in the image captured by the user. Recognition quality improves relative to the case where the database holds only one image of each known product: in that case there is a high chance that the shelf product captured by the camera is not seen at the same angle as the database image of the known product, so the similarity between the detected product image and the corresponding database image may be low, which can lead to a recognition error.

The image captured by the user may be processed, for example, by a neural network. After shelf product images have been detected in the captured image, neural network descriptors, which are image characteristics (features) computed by the neural network, are compared, matching each detected shelf product image against the known product images from the database. The descriptors are used to assess the degree of similarity between images. Such descriptors are known in the art and can be obtained in various ways, for example with a pretrained neural network. The network may be trained, for example, on datasets of store products, such as the SKU-110K dataset known in the art (https://paperswithcode.com/dataset/sku110k). This dataset is not the only one of its kind, however; the network can also be trained on any other known dataset suitable for the context.
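As an illustration of how a "degree of similarity" between two descriptor vectors might be scored, the sketch below uses cosine similarity. This choice is an assumption: the text does not name a similarity measure, and `cosine_similarity` is a hypothetical helper operating on plain sequences of numbers.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two descriptor vectors (1.0 means same direction)."""
    if len(u) != len(v):
        raise ValueError("descriptors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("descriptors must be non-zero")
    return dot / (norm_u * norm_v)
```

Cosine similarity ignores the overall magnitude of the descriptors, which is one common reason it is used for comparing neural network embeddings; other measures, such as Euclidean distance, would fit the description equally well.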

Classical descriptors known in the art can also be used, such as SIFT (scale-invariant feature transform), a computer vision algorithm for detecting and describing local features in images. Such methods are known in the art and are not the subject of the invention. In short, the recognition algorithm finds the database object that corresponds to the query object. This correspondence can be established on the basis of any descriptors: neural network descriptors, SIFT, or any others. Any suitable algorithm known in the art may be used.
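The descriptor comparison described above can be illustrated with a minimal Python sketch. It assumes that each image has already been reduced to a non-zero fixed-length descriptor vector (by a neural network, SIFT aggregation, or any other means) and uses cosine similarity as the "degree of similarity"; the description does not prescribe a particular measure, so this choice, along with the names `cosine_similarity` and `best_match` and the toy descriptor values, is an assumption made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query, gallery):
    """Return (name, score) of the gallery descriptor most similar to the query.

    `gallery` maps a hypothetical product name to its descriptor vector.
    """
    scores = {name: cosine_similarity(query, desc) for name, desc in gallery.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy descriptors, invented purely for illustration.
gallery = {
    "tea": [1.0, 0.0, 0.0],
    "coffee": [0.0, 1.0, 0.0],
    "cocoa": [0.6, 0.8, 0.0],
}
name, score = best_match([0.5, 0.9, 0.0], gallery)
print(name, round(score, 3))
```

In practice the gallery passed to such a function would be the subset of database images already narrowed down by capture angle, so that the similarity search only compares views taken from a similar direction.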

A neural network is not the only way to detect products. Products can be detected in the image by any suitable method known in the art, so the invention is not limited to neural network approaches.

Fig. 4 shows a flowchart of the proposed method for identifying a product on a rack shelf. The sequence of actions is as follows:

1 - the user photographs a rack of goods with the camera.

2 - the camera obtains an image of the rack of goods, which is converted into the form required for processing by software (for example, a numerical array of fixed size with three channels corresponding to red, green and blue (RGB)).

3 - using the IMU connected to the camera, the angle α between the optical axis of the camera during shooting and the direction of gravity is determined;

4 - the detector connected to the camera is activated; it detects the images of all products that appear in the captured image and forms a bounding rectangle circumscribed around each detected shelf-product image.

For each i-th detected shelf-product image:

4a - using the detector, the y-coordinate (ordinate) of the center of the bounding rectangle in which the i-th detected shelf-product image is inscribed is obtained, that is, b_i, which defines the position of the detected shelf-product image within the image of the shelf captured by the camera.

Optionally, 4b - the image of the product itself can be cropped from the bounding rectangle in the image produced by the detector; the product is then recognized from this crop. This step is optional. Technically, product recognition can also be implemented without the cropping step if the database contains product images that have not been cropped from their bounding rectangles.

5 - using the computing unit connected to the IMU and the detector, the angle Ɣ_i for the detected shelf-product image is calculated using expression (1). Here the angle α was determined in step 3, b_i was determined in step 4a, and the quantities c and f, as stated above, are known characteristics of the camera.

Ɣ_i - the result of the preceding step 5, that is, the angle Ɣ_i calculated for the detected shelf-product image.

6 - using the comparison unit connected to the computing unit and the database, images of known products taken at an angle Ɣ_j that differs from the angle Ɣ_i calculated in the previous step by no more than a predetermined angle are retrieved from the database. For example, if the database was collected with an angle step of 10 degrees for Ɣ_j, then the angle Ɣ_i calculated in the previous step may differ from the database angle Ɣ_j by no more than, for example, 5 degrees (that is, half of the angle step of the database).

6a - the result of the preceding step 6, that is, the set of images of different known products retrieved from the database, taken at angles Ɣ_j that differ from the angle Ɣ_i calculated in step 5 by no more than the predetermined angle.

7 - next, using the comparison unit, the image of a known product whose visual features best match those of the detected shelf-product image is selected from the retrieved set 6a, thereby identifying the product 7a detected in the image captured by the user. Such a comparison of images by computed features is described in the prior art, for example at https://arxiv.org/pdf/1404.1777.pdf. Steps 4a - 7 are repeated for every shelf-product image detected in the image captured by the camera.
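Steps 5 and 6 above can be sketched in Python as follows. Expression (1) is not reproduced in this part of the text, so the angle formula below is only an assumed pinhole-camera form, not necessarily the patent's exact relation. It takes the focal length f in pixels, an image y-axis pointing downward from the top of the frame, and a signed angle that is positive for rays above the horizontal plane. The names `GalleryImage`, `ray_angle_deg` and `select_by_angle`, and all numeric values, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class GalleryImage:
    product_id: str   # hypothetical identifier of a known product
    angle_deg: float  # known capture angle of this database image


def ray_angle_deg(alpha_deg, b_i, c, f):
    """Signed angle between the horizontal plane and the ray through the box center.

    ASSUMED geometry: alpha_deg is the angle between the optical axis and the
    direction of gravity (90 degrees for a level camera), b_i and c are pixel
    ordinates measured downward, and f is the focal length in pixels.
    """
    axis_elevation = alpha_deg - 90.0
    # A box center below the principal point (b_i > c) lowers the ray.
    offset = math.degrees(math.atan2(b_i - c, f))
    return axis_elevation - offset


def select_by_angle(gallery, gamma_i_deg, max_diff_deg=5.0):
    """Keep database images whose capture angle is within max_diff_deg of gamma_i."""
    return [g for g in gallery if abs(g.angle_deg - gamma_i_deg) <= max_diff_deg]


# Database collected with a 10-degree angle step; tolerance is half the step.
gallery = [GalleryImage("tea", a) for a in (-20.0, -10.0, 0.0, 10.0)]
gamma_i = ray_angle_deg(alpha_deg=90.0, b_i=1200, c=960, f=1500)
candidates = select_by_angle(gallery, gamma_i, max_diff_deg=5.0)
print(round(gamma_i, 1), [g.angle_deg for g in candidates])
```

With a level camera and a box center 240 pixels below the principal point, the ray points roughly 9 degrees below the horizontal, so only the database view taken at -10 degrees is kept for the visual comparison of step 7.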

The proposed invention makes it possible to quickly and easily find a required, previously known product on a store shelf among many other products. For example, the user can simply carry a switched-on electronic device with its camera pointed at the racks of goods, and when the required product is detected the device notifies the user that the product has been found. As the user carries the camera along the racks, the camera photographs the shelves at a given frequency and the images are analyzed, with the recognition result for all products in the frame shown on the screen. This process can be applied to recognize specific, pre-marked products of interest to the user, but that is outside the scope of the invention, which is directed only at identifying all products that appear in the image captured by the camera. In another embodiment, by pointing an electronic device containing a camera at a product of interest, the user can see on the device screen information about the products in the frame, for example, name, composition, expiration date, price, etc.

Although the invention has been described with reference to certain illustrative embodiments, it should be understood that the invention is not limited to these specific embodiments. On the contrary, the invention is intended to include all alternatives, modifications and equivalents that may fall within the spirit and scope of the claims.

In addition, the invention includes all equivalents of the claimed invention, even if the claims are amended during examination.

Claims

1. A product identification device comprising:

a camera configured to capture at least one image of goods located on the shelves of the rack;

an inertial measurement unit connected to the camera and configured to determine the angle between the optical axis of the camera when shooting and the direction of gravity;

a detector connected to the camera and configured to:

- detect at least one image of a product from the shelf that appears in the image captured by the camera,

- form bounding rectangles, each circumscribed around a respective detected image of a product from the shelf,

- determine the position of the center of each bounding rectangle in the image captured by the camera;

a computing unit connected to the inertial measurement unit and the detector, configured to calculate an angle Ɣ_i between the horizontal plane and a ray connecting the center of the camera lens and the center of the bounding rectangle;

a database that includes images of known products, wherein for each known product the database stores images in each of which the known product was captured at a known angle between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle circumscribed around the known product in the image, said known angles being different for all images of the same known product;

a comparison unit connected to the computing unit and the database, configured to:

- retrieve from the database a set of images of known products taken at said known angle differing from Ɣ_i by no more than a predetermined angle,

- select, from the retrieved set of images of known products, the image of a known product that has the highest degree of match with the detected image of the product from the shelf, compared with the other images of known products in said retrieved set;

- identify the product in the detected image of the product from the shelf based on the selected image of a known product from the database.

2. A method of operating the product identification device according to claim 1, comprising steps in which:

A) take at least one image of the goods located on the shelves of the rack using a camera;

B) determine the angle between the optical axis of the camera when shooting and the direction of gravity using the inertial measurement unit;

C) by means of the detector:

- detect at least one image of a product from the shelf included in the captured image,

- form bounding rectangles, each circumscribed around a respective detected image of a product from the shelf,

- determine the position of the center of each bounding rectangle in the image captured by the camera;

for each detected product image from the shelf:

D) calculate the angle Ɣ_i between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle by means of the computing unit;

E) by means of the comparison unit:

- retrieve from the database a set of images of known products, each taken at a known angle between the horizontal plane and the ray connecting the center of the camera lens and the center of the bounding rectangle circumscribed around the known product in the image from the database, wherein said set contains images of known products taken at said known angle differing from Ɣ_i by no more than a predetermined angle;

- select, from said retrieved set of images of known products, the image of a known product that has the highest degree of match with the detected image of the product from the shelf, compared with the other images of known products in said retrieved set;

- identify the product in the detected image of the product from the shelf based on the selected image of a known product from the database.

3. The method according to claim 2, in which the angle Ɣ_i is calculated using the relation

[formula image not reproduced],

where α is the angle between the optical axis of the camera and the direction of gravity,

f is the focal length of the camera lens,

b_i is the ordinate of the center of the bounding rectangle, with the origin located in the upper corner of the image captured by the camera and the y-axis directed along the side of the image captured by the camera;

c is the ordinate of the point of intersection of the optical axis of the camera lens with the image plane.