CN113591811A

CN113591811A - Retail container commodity searching and identifying method, system and computer readable storage medium

Info

Publication number: CN113591811A
Application number: CN202111143607.1A
Authority: CN
Inventors: 李庆鹏; 付浩龙; 李智勇; 方乐缘; 康予涵
Original assignee: Hunan University
Current assignee: Hunan University
Priority date: 2021-09-28
Filing date: 2021-09-28
Publication date: 2021-11-02

Abstract

The invention provides a retail container commodity searching and identifying method. Commodity images acquired by a camera at the top of a retail container are manually annotated to build a data set, and the training set is fed into a model for training, where the training process mainly involves feature aggregation and a Re-id-first search method. Ablation experiments are then performed on the model, followed by fine-tuning, to obtain an optimal model. Finally, the optimal search model is applied to the commodity search task of the retail container, yielding a commodity search and identification method that allows commodities to be added and offers high precision and high speed, which effectively reduces the cost of retail containers and accelerates deployment in the related industry. The invention also provides a retail container commodity searching and identifying system and a computer readable storage medium.

Description

Retail container commodity searching and identifying method, system and computer readable storage medium

Technical Field

The invention relates to the technical field of data identification, in particular to a retail container commodity searching and identifying method, a retail container commodity searching and identifying system and a computer readable storage medium.

Background

With the vigorous development of the Internet of Things, electronic commerce and mobile payment technology, the growth of online shopping has gradually slowed and the e-commerce industry has reached a ceiling, so a new development direction is urgently needed. The concept of 'new retail' emerged in response: the links from production through circulation to sale of commodities are upgraded and transformed, and big data analysis and artificial intelligence are relied upon to realize a consumption closed loop of people, goods and places. The retail container is an entry point through which retail enterprises acquire offline traffic and is very beneficial to the overall layout of the retail ecosystem; as its technical solutions mature, it is attracting more and more attention.

However, in current retail container applications, commodity bar codes and Radio Frequency Identification (RFID) are still the main commodity identification methods. Bar codes often require manual scanning, and a commodity cannot be identified if its bar code is damaged or falls off; radio frequency identification incurs the extra cost of electronic tags, which makes it difficult to popularize. Therefore, there is a need to provide a retail container commodity searching and identifying method, system and computer readable storage medium to solve the above problems.

Disclosure of Invention

The invention provides a retail container commodity searching and identifying method, system and computer readable storage medium. Image features are extracted at multiple levels by a feature aggregation module so that image semantic information is represented more accurately; in addition, within an anchor-free framework, the Re-id operation is performed before detection, which ensures the precision and speed of commodity matching, reduces the cost of commodity identification, and improves identification speed and precision.

In order to solve the technical problems, the invention adopts the technical scheme that:

A retail container commodity searching and identifying method comprises the following steps:

S1: acquiring commodity images in the retail container, manually annotating them to build a data set, and dividing the annotated data set into two types, wherein one type is a commodity library whose images each contain only one commodity, and the other type is a database whose images contain a plurality of commodities;

S2: sending the images and labels in the database into an anchor-free search framework for training and feature extraction to obtain multi-level feature tensors, and aggregating the multi-level feature tensors with a feature aggregation module to realize comprehensive feature fusion containing shallow-layer, middle-layer and high-layer information;

S3: training with the feature tensor output by the feature aggregation module as the input of Re-id, matching potential targets against the image library, and supervising the training process with a Circle Loss function;

S4: while the Re-id task is carried out, taking the feature tensor output by the feature aggregation module as the input of a detection head, wherein the anchor-free search framework divides the input feature tensor into two branches, for regression and classification respectively; each branch passes through deep convolution layers in sequence and is then connected to a fully connected layer, the regression branch predicting the regression offsets and center score of the bounding box, and the classification branch performing foreground/background classification;

S5: associating each position on the feature map output by the feature aggregation module with a bounding box carrying classification and center scores and with a Re-id feature tensor, and matching each detection box to a label name in the commodity library, thereby completing the commodity retrieval process.
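The matching in step S5 can be pictured with the following minimal sketch. It is an illustration under stated assumptions rather than the patented implementation: cosine similarity as the matching measure, the 0.5 threshold, the data layout and the names `cosine_similarity` and `match_detections` are all hypothetical, and the feature vectors are toy values.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def match_detections(detections, library, threshold=0.5):
    """Assign each detection box the best-matching label in the commodity library.

    detections: list of (box, reid_feature) pairs (hypothetical layout).
    library: dict mapping label name -> reference Re-id feature.
    threshold: assumed minimum similarity; below it the box stays unmatched.
    """
    results = []
    for box, feature in detections:
        best_label, best_score = None, threshold
        for label, reference in library.items():
            score = cosine_similarity(feature, reference)
            if score >= best_score:
                best_label, best_score = label, score
        results.append((box, best_label))
    return results

# Toy data, invented for illustration only.
library = {"green_tea": [1.0, 0.0, 0.0], "cola": [0.0, 1.0, 0.0]}
detections = [((10, 20, 50, 80), [0.9, 0.1, 0.0]),
              ((60, 20, 90, 80), [0.0, 0.0, 1.0])]
matches = match_detections(detections, library)

# Adding a commodity only extends the library; no retraining happens in this sketch.
library["juice"] = [0.0, 0.0, 1.0]
matches_after = match_detections(detections, library)
```

In this sketch the second box is unmatched until a "juice" sample is added to the library, which mirrors the idea that new commodities are handled by extending the commodity library.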

Preferably, the content labeled in step S1 includes the category and coordinate position of the commodity.

Preferably, the step S2 specifically includes:

S21: inputting the images and labels in the database into the anchor-free search framework, and extracting shallow-layer, middle-layer and high-layer features through a ResNet-50 base network;

S22: fusing the obtained multi-level feature tensors through the feature aggregation module, wherein a dilated (hole) convolution operation is performed on the middle-layer feature tensor, the high-layer feature tensor is up-sampled so that its dimensions equal those of the middle-layer feature tensor, and the high-layer and middle-layer feature tensors are then concatenated to obtain a new first feature tensor;

S23: performing a dilated convolution operation on the shallow-layer feature tensor, up-sampling the first feature tensor so that its dimensions equal those of the shallow-layer feature tensor, and then concatenating the first feature tensor and the shallow-layer feature tensor to obtain a new second feature tensor;

S24: performing a dilated convolution on the second feature tensor to obtain the final fused feature tensor.

Preferably, the ResNet-50 base network includes an initial convolutional layer, a final fully connected layer, and four groups of blocks, where the four groups respectively include 3, 4, 6 and 3 modules, and each module includes three convolutional layers.
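As a small worked check of the structure just described, the layer count of the base network can be tallied as follows; the variable names are illustrative only.

```python
# Modules per block group, as stated in the text.
modules_per_group = [3, 4, 6, 3]
convs_per_module = 3

# Initial convolutional layer + convolutions in the four groups + final fully connected layer.
total_layers = 1 + sum(modules_per_group) * convs_per_module + 1
print(total_layers)  # 50
```

The total of 50 weighted layers is where the "50" in the network name comes from.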

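Preferably, for a single sample in the feature space, there are K within-class similarity scores and L between-class similarity scores associated with the sample, denoted respectively as $\{s_p^i\}\ (i = 1, 2, \dots, K)$ and $\{s_n^j\}\ (j = 1, 2, \dots, L)$, and the Circle Loss function is expressed as:

$$L_{circle} = \log\left[1 + \sum_{j=1}^{L} \exp\left(\gamma \alpha_n^j s_n^j\right) \sum_{i=1}^{K} \exp\left(-\gamma \alpha_p^i s_p^i\right)\right]$$

where $\gamma$ is a scale factor, and $\alpha_n^j$ and $\alpha_p^i$ are non-negative weighting factors:

$$\alpha_p^i = \left[O_p - s_p^i\right]_+, \qquad \alpha_n^j = \left[s_n^j - O_n\right]_+$$

where $O_p$ and $O_n$ are the optimal values of the within-class and between-class similarity scores, and $[\cdot]_+$ is a cut-off-at-zero operation that ensures $\alpha_p^i$ and $\alpha_n^j$ are non-negative.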

The invention also provides a retail container commodity search and identification system, which comprises computer equipment, wherein the computer equipment at least comprises a microprocessor and a memory which are connected with each other, the microprocessor is programmed or configured to execute the steps of the retail container commodity search and identification method, or the memory is stored with a computer program which is programmed or configured to execute the retail container commodity search and identification method.

The present invention also provides a computer-readable storage medium having stored therein a computer program programmed or configured to execute the retail container commodity search identification method described above.

Compared with the related technology, the invention has the beneficial technical effects that:

(1) the invention builds the commodity search model on an anchor-free framework. When a new commodity is added, only the sample library needs to be extended and the model does not need to be retrained, which effectively reduces time cost, greatly increases the search speed of the model while preserving precision, and effectively reduces operating cost;

(2) compared with a target detection method, commodity categories can be added, deleted and modified at any time simply by changing the commodity samples in the commodity library, which overcomes the poor adaptability and difficult commodity updating of target detection methods; the method offers better extensibility, higher search speed and wider application scenarios;

(3) compared with a radio frequency identification method, the method greatly reduces early-stage labor cost, and the commodity searching and identifying process can be realized with only one fisheye camera.

Drawings

FIG. 1 is a schematic diagram of a basic flow chart of a method for searching and identifying commodities in a retail container provided by the invention;

FIG. 2 is a schematic structural diagram of an anchorless search model in the retail container commodity search identification method provided by the present invention;

FIG. 3 is a graph of the overall Loss function during training using the retail container commodity search identification method provided by the present invention;

FIG. 4 is a schematic diagram of search results on a test set of a method for searching and identifying commodities in retail containers according to the present invention.

Detailed Description

The following description of the present invention is provided to enable those skilled in the art to better understand the technical solutions in the embodiments of the present invention and to make the above objects, features and advantages of the present invention more comprehensible.

Referring to fig. 1-4, the invention provides a retail container commodity searching and identifying method, which comprises the following steps:

S1: commodity images in the retail container are acquired and manually annotated to build a data set, and the annotated data set is divided into two types, wherein one type is a commodity library whose images each contain only one commodity, and the other type is a database whose images contain a plurality of commodities.

The commodity images in the retail container can be acquired by a fisheye camera. The images are annotated manually with an image annotation tool to obtain commodity image labels, which record the category and coordinate position of each commodity.

The commodity library covers all commodity categories in the retail container, and each commodity has at least one image.

An XML file records information such as the file name, size, coordinate positions and categories of each annotated commodity image. Three file directories, annotations, frames and query, are then created: the annotations directory stores the XML description file corresponding to each commodity image, the frames directory stores the database, and the query directory stores the commodity library. After annotation is finished, the data set is converted into a json file in COCO format, which comprises the following contents:

info comprises the data set description, licenses comprises category information, images comprises commodity image information, annotations comprises annotation information, and categories comprises training set and verification set division information.
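The conversion from XML description files to a COCO-format json file might look like the following sketch. The XML schema is not given in the text, so a Pascal VOC-like layout is assumed; the sample annotation, the function name `xml_to_coco` and the description string are hypothetical. The sketch also follows the usual COCO convention for the meaning of each top-level key (for example, `categories` holding the category list), which may differ from the field description above.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical annotation in a Pascal VOC-like layout (assumed, not from the text).
SAMPLE_XML = """
<annotation>
  <filename>frame_0001.jpg</filename>
  <size><width>960</width><height>720</height></size>
  <object>
    <name>green_tea</name>
    <bndbox><xmin>100</xmin><ymin>150</ymin><xmax>180</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>
"""

def xml_to_coco(xml_strings):
    """Convert VOC-like XML annotation strings into a COCO-style dictionary."""
    coco = {"info": {"description": "retail container data set"},
            "licenses": [], "images": [], "annotations": [], "categories": []}
    category_ids = {}
    for image_id, text in enumerate(xml_strings, start=1):
        root = ET.fromstring(text.strip())
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name not in category_ids:
                category_ids[name] = len(category_ids) + 1
                coco["categories"].append({"id": category_ids[name], "name": name})
            box = obj.find("bndbox")
            x1, y1, x2, y2 = (int(box.findtext(k))
                              for k in ("xmin", "ymin", "xmax", "ymax"))
            coco["annotations"].append({
                "id": len(coco["annotations"]) + 1,
                "image_id": image_id,
                "category_id": category_ids[name],
                "bbox": [x1, y1, x2 - x1, y2 - y1],  # COCO boxes are [x, y, width, height]
                "area": (x2 - x1) * (y2 - y1),
                "iscrowd": 0,
            })
    return coco

coco = xml_to_coco([SAMPLE_XML])
coco_json = json.dumps(coco, indent=2)
```

Corner coordinates are converted to the [x, y, width, height] box layout that COCO tools expect.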

S2: the images and labels in the database are sent into an anchor-free search framework for training and feature extraction to obtain multi-level feature tensors, and the multi-level feature tensors are aggregated by a feature aggregation module to realize comprehensive feature fusion containing shallow-layer, middle-layer and high-layer information.

The step S2 specifically includes:

In this embodiment, in addition to the first convolutional layer and the last fully connected layer, the ResNet-50 base network includes four groups of blocks containing 3, 4, 6 and 3 modules respectively, each module including three convolutional layers. C3, C4 and C5 are selected as the outputs of ResNet-50 feature extraction, and feature aggregation is performed by the feature aggregation module to obtain the final feature extraction result. Three different 3x3 dilated (hole) convolutions with dilation rate = 1, 2 and 3, followed by 2x up-sampling, are applied to the feature tensor output by C5; three different 3x3 dilated convolutions with dilation rate = 1, 2 and 3 are applied to the feature tensor output by C4; and the two results are concatenated to obtain P4. Similarly, P4 is up-sampled 2x, three different 3x3 dilated convolutions with dilation rate = 1, 2 and 3 are applied to the feature tensor output by C3, and the two results are concatenated to obtain P3. One further 3x3 dilated convolution then completes feature extraction, realizing comprehensive feature fusion containing shallow-layer, middle-layer and high-layer information.
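The data flow through the feature aggregation module can be traced with a shape-only sketch such as the one below. It computes tensor shapes, not features. Several details are assumptions because the text does not state them: padding equal to the dilation rate, 256 output channels per dilated convolution, element-wise summation of the three dilated branches, dilation rate 1 for the final convolution, and a 640x640 input. All function names are hypothetical.

```python
def conv_out(size, kernel=3, dilation=1, padding=None, stride=1):
    """Output length of a convolution along one spatial axis."""
    if padding is None:
        padding = dilation  # assumed "same" padding for a 3x3 dilated kernel
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def dilated_branches(shape, out_channels, rates=(1, 2, 3)):
    """Shape after three 3x3 dilated convolutions merged by summation (assumed)."""
    _, h, w = shape
    outs = [(out_channels, conv_out(h, dilation=r), conv_out(w, dilation=r))
            for r in rates]
    assert len(set(outs)) == 1, "branches must agree in shape to be merged"
    return outs[0]

def upsample2x(shape):
    """Shape after 2x spatial up-sampling."""
    c, h, w = shape
    return (c, 2 * h, 2 * w)

def concat(a, b):
    """Shape after channel-wise concatenation."""
    assert a[1:] == b[1:], "spatial sizes must match before concatenation"
    return (a[0] + b[0], a[1], a[2])

def aggregate(c3, c4, c5, mid=256):
    """Trace shapes for C5 -> P4 -> P3 -> fused output."""
    p4 = concat(upsample2x(dilated_branches(c5, mid)), dilated_branches(c4, mid))
    p3 = concat(upsample2x(p4), dilated_branches(c3, mid))
    fused = (mid, conv_out(p3[1]), conv_out(p3[2]))  # final 3x3 convolution
    return p4, p3, fused

# ResNet-50 C3/C4/C5 shapes (channels, height, width) for a hypothetical
# 640x640 input, using the standard strides 8, 16 and 32.
p4, p3, fused = aggregate((512, 80, 80), (1024, 40, 40), (2048, 20, 20))
```

With padding equal to the dilation rate, each 3x3 dilated convolution preserves spatial size, so the three branches can be merged and the up-sampled tensor lines up with the next shallower level.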

S3: training is performed with the feature tensor output by the feature aggregation module as the input of Re-id, potential targets are matched against the image library, and the training process is supervised with a Circle Loss function.

For the Re-id (re-identification) process, the fused feature tensor output by the feature aggregation module is used directly as the input, without an additional embedding layer, and matching is supervised by the Circle Loss function.

For a single sample x in the feature space, there are K within-class similarity scores and L between-class similarity scores associated with x, denoted respectively as $\{s_p^i\}\ (i = 1, 2, \dots, K)$ and $\{s_n^j\}\ (j = 1, 2, \dots, L)$. Therefore, to minimize each $s_n^j$ and maximize each $s_p^i$, the Circle Loss function is as follows:

$$L_{circle} = \log\left[1 + \sum_{j=1}^{L} \exp\left(\gamma \alpha_n^j s_n^j\right) \sum_{i=1}^{K} \exp\left(-\gamma \alpha_p^i s_p^i\right)\right]$$

where $\gamma$ is a scale factor, and $\alpha_n^j$ and $\alpha_p^i$ are non-negative weighting factors.

Through iteration over similar image pairs, the Circle Loss function reduces $(\alpha_n^j s_n^j - \alpha_p^i s_p^i)$ and allows $s_n^j$ and $s_p^i$ to be learned at different speeds. In addition, $\alpha_n^j$ and $\alpha_p^i$ are taken as linear functions of $s_n^j$ and $s_p^i$ respectively, which realizes segmented learning and makes the learning speed of the algorithm adaptive to the optimization state. The further a similarity score deviates from its optimal value, the larger its weighting factor:

$$\alpha_p^i = \left[O_p - s_p^i\right]_+, \qquad \alpha_n^j = \left[s_n^j - O_n\right]_+$$

where $O_p$ and $O_n$ are the optimal values of $s_p^i$ and $s_n^j$, and $[\cdot]_+$ is a cut-off-at-zero operation that ensures $\alpha_p^i$ and $\alpha_n^j$ are non-negative.
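For illustration, the Circle Loss of a single sample can be evaluated with a few lines of plain Python. This is a minimal sketch of the margin-free form, with K within-class scores `s_p` and L between-class scores `s_n`; the scale factor 64 and the optimal values 1 and 0 are assumed values that the text does not state, and the similarity scores are toy numbers.

```python
import math

def circle_loss(s_p, s_n, gamma=64.0, o_p=1.0, o_n=0.0):
    """Circle Loss for one sample, without margins.

    s_p: within-class similarity scores; s_n: between-class similarity scores.
    gamma, o_p and o_n are assumed values, not taken from the text.
    """
    # Weighting factors, cut off at zero so that they are never negative.
    alpha_p = [max(o_p - s, 0.0) for s in s_p]
    alpha_n = [max(s - o_n, 0.0) for s in s_n]

    sum_n = sum(math.exp(gamma * a * s) for a, s in zip(alpha_n, s_n))
    sum_p = sum(math.exp(-gamma * a * s) for a, s in zip(alpha_p, s_p))
    return math.log1p(sum_n * sum_p)

# Toy similarity scores, invented for illustration only.
well_separated = circle_loss(s_p=[0.9, 0.95], s_n=[0.05, 0.1])
poorly_separated = circle_loss(s_p=[0.4, 0.5], s_n=[0.6, 0.7])
```

Scores far from their optimal values receive larger weighting factors, so the poorly separated sample produces a much larger loss than the well separated one.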

Symmetrically optimized re-identification loss functions, such as Triplet Loss and AM-Softmax Loss, are commonly used for matching in the related art. Such loss functions have the following two limitations:

1) the optimization lacks flexibility: reward and penalty are strictly equal, so the gradient remains large even when convergence is approached, which is inefficient and unreasonable;

2) the convergence state is ambiguous: convergence is judged only by the distance between positive and negative samples, so when two groups of positive and negative samples both satisfy the convergence condition, the positive samples of the first group may still be similar to the negative samples of the second group, which reduces the separability of the feature space.

The Circle Loss function specifically addresses these limitations, so that a more accurate commodity Re-id process can be obtained.

S4: while the Re-id task is carried out, the fused feature tensor is taken as the input of a detection head. The FCOS (Fully Convolutional One-Stage object detection) anchor-free search framework divides the input fused feature tensor into two branches, for regression and classification respectively; each branch passes through deep convolution layers in sequence and is then connected to a fully connected layer, wherein the regression branch predicts the regression offsets and center score of the bounding box, and the classification branch performs foreground/background classification.

The detection process of the detection head and the Re-id process are carried out synchronously. The detection head uses the FCOS anchor-free search framework to perform the anchor-free search task: the framework divides the input fused feature tensor into two branches, for regression and classification respectively. The regression branch passes through convolution layers and is then connected to a fully connected layer; IoU Loss supervises the coordinate regression of the bounding box, and Cross Entropy Loss supervises the regression of the center score. The classification branch passes through convolution layers and is then connected to a fully connected layer; Focal Loss supervises the foreground/background classification of targets.
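The three supervision signals named above can be sketched in plain Python as follows. The formulas are the commonly used FCOS-style center score, the negative-log IoU loss, binary cross entropy and the Focal Loss; the text does not give its exact formulas, so these forms, the default values alpha = 0.25 and gamma = 2.0, and all function names are assumptions.

```python
import math

def centerness(l, t, r, b):
    """FCOS-style center score from positive distances to the four box sides."""
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))

def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def iou_loss(pred, target, eps=1e-7):
    """Negative log IoU; the IoU is clamped to avoid log(0)."""
    return -math.log(max(iou(pred, target), eps))

def binary_cross_entropy(p, y, eps=1e-7):
    """Cross entropy between a predicted center score p and its target y."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Focal Loss for foreground (y=1) / background (y=0) classification."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# Toy values, invented for illustration only.
center_target = centerness(1.0, 5.0, 9.0, 5.0)
box_loss = iou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
center_loss = binary_cross_entropy(0.5, center_target)
cls_loss = focal_loss(0.9, 1)
```

The center score is 1 at the exact center of a box and falls toward 0 near its edges, while the focusing term in the Focal Loss down-weights examples that are already classified confidently.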

Ablation experiments are performed on whether the feature aggregation module and the Circle Loss function are adopted, and the optimal model is selected, according to accuracy and recall, as the final model for retail container commodity searching and identification.

In the specific training process, a ResNet-50 base network pre-trained on ImageNet is used as the basic feature extraction network, the batch size is set to 4, stochastic gradient descent (SGD) is used for optimization, and the weight decay is 0.0005. The initial learning rate is set to 0.001 and is reduced by a factor of 10 at epochs 16 and 22, for a total of 24 epochs. In addition, a multi-scale training strategy is adopted: during training the longer side of each image is randomly resized to between 667 and 2000, and zero padding is used to fit images of different resolutions. At test time, the test image is resized to a fixed size of 960x720. The decline of the overall loss during training is shown in fig. 3.
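The learning rate schedule and the multi-scale resizing described above can be sketched as follows. The sketch makes assumptions the text leaves open: epochs are indexed from 0 and each drop takes effect from the stated epoch index onward, the aspect ratio is preserved when resizing, and padding is added on the right and bottom. The function names and the 1280x960 example image are hypothetical.

```python
import random

BASE_LR = 0.001
MILESTONES = (16, 22)
TOTAL_EPOCHS = 24

def learning_rate(epoch):
    """Learning rate for a 0-based epoch index (indexing is an assumption)."""
    drops = sum(1 for m in MILESTONES if epoch >= m)
    return BASE_LR * (0.1 ** drops)

def resize_longer_side(width, height, rng, low=667, high=2000):
    """Pick a random longer-side length and scale the image size to match."""
    target = rng.randint(low, high)
    scale = target / max(width, height)
    return round(width * scale), round(height * scale)

def pad_to(width, height, canvas_width, canvas_height):
    """Zero padding (right, bottom) needed to reach a common canvas size."""
    return canvas_width - width, canvas_height - height

schedule = [learning_rate(e) for e in range(TOTAL_EPOCHS)]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
new_w, new_h = resize_longer_side(1280, 960, rng)
pad_right, pad_bottom = pad_to(new_w, new_h, 2000, 2000)
```

Under these assumptions the rate is 0.001 for epochs 0 to 15, 0.0001 for epochs 16 to 21, and 0.00001 for the last two epochs.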

The trained model is used for testing, taking retail container commodity green tea as an example, fig. 4a is a commodity sample, fig. 4b is a search result when green tea exists in a scene image, and fig. 4c is a search result when green tea does not exist in the scene image.

The present invention also provides a computer-readable storage medium having stored therein a computer program programmed or configured to execute the retail container commodity search identification method described above. The contents in the above method embodiments are all applicable to the present storage medium embodiment, and the realized functions and advantageous effects are the same as those in the method embodiments.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

The logic and/or steps represented in the embodiments or otherwise described herein, e.g., an ordered list of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.

More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

Claims

1. A retail container commodity searching and identifying method is characterized by comprising the following steps:

2. The retail container commodity search identification method as claimed in claim 1, wherein the content labeled in the step S1 includes a category and a coordinate position of the commodity.

3. The method for searching and identifying commodities in retail containers according to claim 1, wherein the step S2 is specifically:

4. The retail container commodity search identification method of claim 3, wherein the ResNet-50 base network comprises an initial convolutional layer, a final fully connected layer, and four groups of blocks, the four groups respectively comprising 3, 4, 6 and 3 modules, each module comprising three convolutional layers.

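5. The retail container commodity search identification method as claimed in claim 1, wherein for a single sample in the feature space, there are K within-class similarity scores and L between-class similarity scores associated with the sample, denoted respectively as $\{s_p^i\}\ (i = 1, 2, \dots, K)$ and $\{s_n^j\}\ (j = 1, 2, \dots, L)$, and the Circle Loss function is expressed as:

$$L_{circle} = \log\left[1 + \sum_{j=1}^{L} \exp\left(\gamma \alpha_n^j s_n^j\right) \sum_{i=1}^{K} \exp\left(-\gamma \alpha_p^i s_p^i\right)\right]$$

where $\gamma$ is a scale factor, and $\alpha_n^j$ and $\alpha_p^i$ are non-negative weighting factors:

$$\alpha_p^i = \left[O_p - s_p^i\right]_+, \qquad \alpha_n^j = \left[s_n^j - O_n\right]_+$$

where $O_p$ and $O_n$ are the optimal values of the within-class and between-class similarity scores, and $[\cdot]_+$ is a cut-off-at-zero operation that ensures $\alpha_p^i$ and $\alpha_n^j$ are non-negative.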

6. A retail container commodity search identification system, characterized by comprising a computer device comprising at least a microprocessor and a memory connected to each other, the microprocessor being programmed or configured to perform the steps of the retail container commodity search identification method according to any one of claims 1-5, or the memory having stored therein a computer program programmed or configured to perform the retail container commodity search identification method according to any one of claims 1-5.

7. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program programmed or configured to perform the retail container item search identification method of any one of claims 1-5.