CN111882606A - Goods shelf commodity layering method based on deep learning

Info

Publication number: CN111882606A
Application number: CN202010626453.0A
Authority: CN
Inventors: 魏勋
Original assignee: Shanghai Pinlan Data Technology Co ltd
Current assignee: Shanghai Pinlan Data Technology Co ltd
Priority date: 2020-07-01
Filing date: 2020-07-01
Publication date: 2020-11-03

Abstract

The invention discloses a deep-learning-based method for assigning shelf commodities to shelf layers, and relates to the technical field of artificial intelligence and machine-vision deep learning. The method comprises the following steps: S1, obtaining shelf pictures taken at different angles, under different illumination and at different resolutions; S2, training a commodity detection model; S3, training a shelf layer segmentation model; S4, running detection on the shelf image to obtain the position frame and category information of each commodity; S5, performing shelf layer segmentation on the shelf image to obtain the region of each shelf layer; S6, obtaining the minimum circumscribed rectangle of each shelf layer and deriving the shelf line from it; and S7, obtaining the center point of each commodity position frame and determining the shelf layer on which the commodity is located. The invention automatically assigns the commodities on a shelf to layers, yielding the number of the shelf layer on which each commodity is located, and can thereby determine whether a commodity is displayed in a golden position. The method is efficient and accurate, and replaces manual inspection, which is inefficient and error-prone.

Description

Goods shelf commodity layering method based on deep learning

Technical Field

The invention belongs to the technical field of artificial intelligence and machine vision deep learning, and particularly relates to a goods shelf commodity layering method based on deep learning.

Background

In the new retail era, in order to keep track of how their commodities are sold offline, retail manufacturers generally send sales representatives to supermarkets to photograph the shelves, and then review how the commodities are displayed. Whether a commodity is displayed in a golden position is particularly important. "Golden positions" generally refer to shelf positions that consumers can easily see and reach; for example, on a 5-layer shelf the golden positions are generally the second and third layers (counted from top to bottom).

To judge whether a commodity is displayed in a golden position, the shelf must first be divided into layers, and the number of the shelf layer on which the commodity is located must then be obtained. At present, this information is mainly obtained by inspecting the pictures by eye; with massive numbers of pictures and commodities, however, this approach is inefficient and error-prone. A deep-learning-based method for layering shelf commodities that solves these problems is therefore of great significance.

Disclosure of Invention

The invention provides a shelf commodity layering method based on deep learning, which solves the problems of low efficiency and proneness to error in manual inspection.

In order to solve the technical problems, the invention is realized by the following technical scheme:

The invention relates to a shelf commodity layering method based on deep learning, which comprises the following steps:

S1, obtaining shelf pictures taken at different angles, under different illumination and at different resolutions;

S2, training a commodity detection model, specifically:

S21, collecting shelf commodity pictures taken at different angles and under different illumination;

S22, manually annotating a part of the commodity pictures and training a weak commodity detection model M0 on them;

S23, pre-labeling the full data with the commodity detection model M0, and manually adjusting the pre-labels;

S24, training a commodity detection model M1 on the full data;

S3, training a shelf layer segmentation model, specifically:

S31, collecting shelf commodity pictures taken at different angles and under different illumination;

S32, manually annotating a part of the commodity pictures and training a weak shelf layer segmentation model M2 on them;

S33, pre-labeling the full data with the shelf layer segmentation model M2, and manually adjusting the pre-labels;

S34, training a shelf layer segmentation model M3 on the full data;

S4, running detection on the shelf image to obtain the position frame and category information of each commodity;

S5, performing shelf layer segmentation on the shelf image to obtain the region of each shelf layer;

S6, obtaining the minimum circumscribed rectangle of each shelf layer, and then obtaining the shelf line;

and S7, obtaining the center point of each commodity position frame, and then determining the shelf layer on which the commodity position frame is located.
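By way of non-limiting illustration, the two-round annotation procedure of steps S21 to S24 (and likewise S31 to S34) may be sketched as follows. This is a minimal sketch only: the function name, the `seed_count` parameter and the four callables are hypothetical placeholders introduced for illustration, and the text does not state how large the manually annotated part is.

```python
def bootstrap_train(images, annotate, train, predict, review, seed_count):
    """Train a weak model on a hand-labelled subset, then a final model on all data.

    All callables are hypothetical placeholders supplied by the caller:
      annotate(image) -> labels          manual annotation of the subset
      train(images, labels) -> model     any training routine
      predict(model, image) -> labels    pre-labelling with the weak model
      review(image, labels) -> labels    manual adjustment of the pre-labels
    """
    # Manually annotate a part of the pictures and train the weak model.
    seed_images = images[:seed_count]
    seed_labels = [annotate(img) for img in seed_images]
    weak_model = train(seed_images, seed_labels)

    # Pre-label the full data with the weak model, then adjust by hand.
    pre_labels = [predict(weak_model, img) for img in images]
    full_labels = [review(img, lab) for img, lab in zip(images, pre_labels)]

    # Train the final model on the full, corrected data.
    return train(images, full_labels)
```

The same routine would be called once with detection annotations (weak model, then final detection model) and once with segmentation annotations (weak model, then final segmentation model).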

Further, the commodity detection model M0 and the commodity detection model M1 both use a RetinaNet-based target detection model.

Further, the shelf layer segmentation model M2 and the shelf layer segmentation model M3 both use a Mask R-CNN-based image segmentation model.

Compared with the prior art, the invention has the following beneficial effects:

The invention automatically assigns the commodities on a shelf to layers, yielding the number of the shelf layer on which each commodity is located, and can thereby determine whether a commodity is displayed in a golden position. The method is efficient and accurate, and replaces manual inspection, which is inefficient and error-prone.

Of course, it is not necessary for any product in which the invention is practiced to achieve all of the above-described advantages at the same time.

Drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a flow chart of a shelf commodity layering method based on deep learning according to the present invention;

FIG. 2 is a diagram of the overall steps of the method of the present invention based on FIG. 1;

FIG. 3 is a substep diagram of step S2 in FIG. 2;

FIG. 4 is a substep diagram of step S3 in FIG. 2.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to FIGS. 1-4, a shelf commodity layering method based on deep learning according to the present invention includes the following steps:

S2, training a commodity detection model, specifically:

S24, training a commodity detection model M1 on the full data;

S3, training a shelf layer segmentation model, specifically:

S34, training a shelf layer segmentation model M3 on the full data.

The commodity detection model M0 and the commodity detection model M1 both adopt a RetinaNet-based target detection model, which balances detection accuracy and detection speed. RetinaNet uses Focal Loss to balance foreground targets against the background, so that the detector concentrates on hard-to-distinguish targets and detection accuracy improves; meanwhile, owing to its single-stage design, it achieves a high frame rate (up to 15 FPS on an Nvidia Tesla V100).

The RetinaNet-based target detection model is described below in terms of its training settings, its forward (inference) settings and its output.

RetinaNet training settings:

1. images are resized with the long edge at 1333 pixels and the short edge at 800 pixels;

2. the batch size is set to 2;

3. the backbone is ResNet-50, initialized with ImageNet-pretrained weights;

4. the anchors use the default settings;

5. the learning rate is initialized to 0.01 and decayed by a factor of 0.1 every 10 epochs;

6. step is set to 10000 steps and epoch is set to 50.
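By way of non-limiting illustration, the learning-rate decay of setting 5 may be sketched as a step schedule. The sketch assumes that epochs are counted from 0 and that the factor of 0.1 is applied at epochs 10, 20, 30 and 40; the text does not specify the exact boundary convention, and the function name is hypothetical.

```python
def step_decay_lr(epoch, base_lr=0.01, decay=0.1, step_epochs=10):
    """Learning rate for a 0-based epoch index under step decay.

    Assumption: the rate is multiplied by `decay` once every `step_epochs`
    completed epochs.
    """
    return base_lr * decay ** (epoch // step_epochs)


# Learning rate at the start of each 10-epoch block over the 50 training epochs.
schedule = [step_decay_lr(epoch) for epoch in range(0, 50, 10)]
```

Under these assumptions the rate falls from 0.01 in the first ten epochs to roughly 1e-6 in the last ten.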

RetinaNet forward (inference) settings:

2. the detection box score threshold is set to 0.5;

3. the detection box NMS threshold is set to 0.5;
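By way of non-limiting illustration, the two thresholds above may be applied as in the following sketch. It assumes that each detection is a list [x1, y1, x2, y2, score], that a box is kept when its score is at least the score threshold, and that non-maximum suppression (NMS) is the usual greedy, class-agnostic variant; none of these details is stated in the text, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2, ...]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, score_thresh=0.5, nms_thresh=0.5):
    """Drop low-scoring boxes, then suppress boxes overlapping a better one."""
    # Keep boxes at or above the score threshold, highest score first.
    candidates = sorted(
        (b for b in boxes if b[4] >= score_thresh),
        key=lambda b: b[4],
        reverse=True,
    )
    kept = []
    for box in candidates:
        # Greedy NMS: keep a box only if it does not overlap any kept box
        # by more than the NMS threshold.
        if all(iou(box, k) <= nms_thresh for k in kept):
            kept.append(box)
    return kept
```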

The output of the RetinaNet model has two parts. The first is the detection frames of the objects: a list whose elements are [x1, y1, x2, y2, score]. The second is the classes of the objects: a list whose elements are [class_1_pro, class_2_pro, class_3_pro, ..., class_n_pro], the probability vector over the classes. For each object, the maximum of its class probability vector is taken, and the corresponding index is the predicted class number.
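By way of non-limiting illustration, taking the predicted class number from the class probability vectors may be sketched as follows. The sketch assumes 0-based class indices; whether the class numbers of the actual model start at 0 or 1 is not stated, and the function name is hypothetical.

```python
def decode_classes(class_probs):
    """Return (class_index, probability) for each object's probability vector."""
    decoded = []
    for probs in class_probs:
        # Index of the largest probability; the first one wins on ties.
        best = max(range(len(probs)), key=lambda i: probs[i])
        decoded.append((best, probs[best]))
    return decoded
```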

The shelf layer segmentation model M2 and the shelf layer segmentation model M3 both adopt a Mask R-CNN-based image segmentation model, which segments the shelf layers accurately at the pixel level. Because shelves usually appear tilted at some angle in the picture, a conventional detection model, which outputs axis-aligned frames, cannot obtain the precise contour of a shelf layer. A segmentation model is therefore needed to obtain the complete contour of the shelf layer, from which the tilted rectangular frame of the shelf layer is obtained and the shelf line is then calculated.

The Mask R-CNN-based image segmentation model is described below in terms of its training settings, its forward (inference) settings and its output.

Mask R-CNN training settings:

1. images are resized with the long side at 1024 pixels and the short side at 800 pixels;

2. the batch size is set to 2;

3. the backbone is ResNet-50, initialized with ImageNet-pretrained weights;

4. the anchors use the default settings;

5. step is set to 10000 steps and epoch is set to 50.
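By way of non-limiting illustration, the image resize setting may be sketched as follows. The text gives only the target long-side and short-side lengths (1024 and 800 pixels here, 1333 and 800 pixels for RetinaNet); the sketch assumes the convention common in detection frameworks, in which the aspect ratio is preserved, the short side is scaled to its target, and the scale is reduced if the long side would otherwise exceed its target. The function names are hypothetical.

```python
def resize_scale(width, height, short_side=800, long_side=1024):
    """Scale factor that preserves aspect ratio under both side limits."""
    # First try to bring the short side to its target length.
    scale = short_side / min(width, height)
    # If the long side would overshoot, cap it instead (assumed convention).
    if max(width, height) * scale > long_side:
        scale = long_side / max(width, height)
    return scale


def resized_shape(width, height, short_side=800, long_side=1024):
    """Width and height after resizing, rounded to whole pixels."""
    scale = resize_scale(width, height, short_side, long_side)
    return round(width * scale), round(height * scale)
```

For a 1600 x 1200 picture, this convention yields 1024 x 768 under the Mask R-CNN setting and 1067 x 800 under the RetinaNet setting.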

Mask R-CNN forward (inference) settings:

The output of the Mask R-CNN model has three parts: the object detection frames, the object masks and the object classes. The detection frame information and the class information have the same form as for RetinaNet. For the mask information, the output for each RoI has dimension K × m × m: it encodes K binary masks of resolution m × m, one for each of the K classes.
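By way of non-limiting illustration, reading the mask of one RoI out of the K × m × m output may be sketched as follows. The sketch assumes that the output holds per-pixel probabilities as nested lists, that the mask of the predicted class is the one used, and that it is binarized at 0.5; the function name and the threshold parameter are hypothetical.

```python
def select_class_mask(roi_masks, class_index, threshold=0.5):
    """Pick the m x m mask of one class from a K x m x m output and binarize it.

    roi_masks: K masks, each a list of m rows of m probabilities.
    class_index: 0-based index of the predicted class (assumed convention).
    """
    mask = roi_masks[class_index]
    return [[1 if p >= threshold else 0 for p in row] for row in mask]
```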

After the contour of each shelf layer is obtained, the minimum circumscribed rectangle of each shelf layer is fitted; the midpoints of its two vertical sides are then taken and connected to obtain the shelf line. Next, the center point of each commodity is calculated from its detection frame, and the shelf layer on which each commodity is located is obtained from the distance between that point and the shelf lines.
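By way of non-limiting illustration, the shelf line and the layer assignment may be sketched as follows. The sketch makes several assumptions that the text does not state: the four corners of the minimum circumscribed rectangle are already available (for example from OpenCV's `cv2.boxPoints(cv2.minAreaRect(contour))`); the rectangle is wide and only moderately tilted, so that its two leftmost corners form one vertical side and its two rightmost corners the other; image y coordinates grow downward; and a commodity belongs to the nearest shelf line lying below its center point, with layers numbered from the top. All function names are hypothetical.

```python
import math


def midpoint(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def shelf_line(corners):
    """Connect the midpoints of the left and right sides of the rectangle."""
    pts = sorted(corners)  # sort by x, so pts[:2] is the left side
    return midpoint(pts[0], pts[1]), midpoint(pts[2], pts[3])


def box_center(box):
    """Center point of a detection frame [x1, y1, x2, y2, ...]."""
    x1, y1, x2, y2 = box[:4]
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def signed_distance(point, line):
    """Point-to-line distance, positive when the point is above the line
    in image coordinates. Assumes the line runs left to right and has
    non-zero length."""
    (ax, ay), (bx, by) = line
    px, py = point
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return -cross / math.hypot(bx - ax, by - ay)


def assign_layer(box, lines):
    """1-based layer, counted from the top, of the nearest shelf line below
    the box center; None if every line is above the center."""
    center = box_center(box)
    ordered = sorted(lines, key=lambda l: (l[0][1] + l[1][1]) / 2.0)
    best = None
    for layer, line in enumerate(ordered, start=1):
        d = signed_distance(center, line)
        if d >= 0 and (best is None or d < best[1]):
            best = (layer, d)
    return best[0] if best else None
```

With the layer number known, checking it against the golden layers of the shelf (for example the second and third layers of a 5-layer shelf) is a simple membership test.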

The preferred embodiments of the invention disclosed above are intended to be illustrative only. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise embodiments disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best utilize the invention. The invention is limited only by the claims and their full scope and equivalents.

Claims

1. A shelf commodity layering method based on deep learning, characterized by comprising the following steps:

S2, training a commodity detection model, specifically:

S24, training a commodity detection model M1 on the full data;

S3, training a shelf layer segmentation model, specifically:

S34, training a shelf layer segmentation model M3 on the full data.

2. The shelf commodity layering method based on deep learning according to claim 1, wherein the commodity detection model M0 and the commodity detection model M1 both use a RetinaNet-based target detection model.

3. The shelf commodity layering method based on deep learning according to claim 1, wherein the shelf layer segmentation model M2 and the shelf layer segmentation model M3 both use a Mask R-CNN-based image segmentation model.