CN114255377A - Differential commodity detection and classification method for intelligent container - Google Patents


Info

Publication number: CN114255377A
Authority: CN (China)
Prior art keywords: commodity, different, difference, images, commodities
Legal status (assumption, not a legal conclusion): Pending
Application number: CN202111476957.XA
Other languages: Chinese (zh)
Inventor
冯栋
刘治宇
刘浩
陈洪伟
Current Assignee (listed assignees may be inaccurate): Qingdao Turing Technology Co ltd
Original Assignee: Qingdao Turing Technology Co ltd
Application filed by Qingdao Turing Technology Co ltd filed Critical Qingdao Turing Technology Co ltd
Priority to CN202111476957.XA priority Critical patent/CN114255377A/en
Publication of CN114255377A publication Critical patent/CN114255377A/en

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 — Classification techniques
    • G06F18/241 — Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/25 — Fusion techniques
    • G06F18/253 — Fusion techniques of extracted features

Abstract

The invention provides a differential commodity detection and classification method for an intelligent container. The method comprises: acquiring two commodity images captured at different moments by a camera arranged above a shelf of the intelligent container; detecting the commodities in the two images with a pre-trained commodity difference detection model to obtain a differential commodity detection result, the result comprising the coordinates of each differential commodity detection box and the image to which each differential commodity belongs; and recognizing the detected differential commodities according to the detection result to obtain their category information. The scheme directly detects the image positions of the differential commodities on the scene images captured before and after a consumer's purchase, and then recognizes the commodities at those positions to determine their categories, thereby avoiding the high labeling, updating, and deployment costs of existing fully supervised object detection models.

Description

Differential commodity detection and classification method for intelligent container
Technical Field
The invention relates to the technical fields of computer vision and deep learning, and in particular to a differential commodity detection and classification method for intelligent containers.
Background
E-commerce grew very rapidly in the internet era, but after a period of high-speed development it has entered a bottleneck period: consumers' demands for convenience and timeliness keep rising, and traditional e-commerce struggles to meet them. Under the concept of "new retail", traditional e-commerce is trying to integrate with offline sales channels, and intelligent containers are an important development direction of new retail.
Intelligent container solutions fall into two categories, vision-based and non-vision-based. Non-vision solutions occupy most of the market thanks to their simple principle, convenient deployment, and high accuracy, but with the progress of deep neural networks in computer vision, vision solutions based on deep neural networks have become the research focus for intelligent containers.
Disclosure of Invention
The invention provides a differential commodity detection and classification method for intelligent containers. The method directly detects the positions of the differential commodities on the scene images captured before and after a consumer's purchase, and then obtains the categories of the differential commodities with an object recognition model, thereby enabling functions such as automatic commodity settlement and inventory checking.
The invention provides a differential commodity detection and classification method for an intelligent container, comprising the following steps:
acquiring two commodity images captured at different moments by a camera arranged above a shelf of the intelligent container, wherein the two images are captured by the camera from a top-down view;
detecting the commodities in the two images with a pre-trained commodity difference detection model, the detection process comprising feature extraction, feature fusion, and target regression, to obtain a differential commodity detection result; the detection result comprises the coordinates of each differential commodity detection box and the image to which each differential commodity belongs;
and recognizing the detected differential commodities according to the detection result to obtain their category information.
In an optional embodiment, the commodity difference detection model comprises two weight-sharing feature extractors whose outputs are connected to a feature fusion operator, and the output of the feature fusion operator is connected to a regression network;
correspondingly, detecting the commodities in the two images with the pre-trained commodity difference detection model through feature extraction, feature fusion, and target regression comprises:
extracting features from the two commodity images with the two weight-sharing feature extractors to obtain a first image feature and a second image feature;
computing the difference between the first and second image features with the feature fusion operator to obtain a fused image feature;
and processing the fused image feature with the regression network to obtain the differential commodity detection result.
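As a rough illustration of this structure, the sketch below is not the patent's implementation: a single linear projection stands in for a deep CNN backbone, but it exhibits the three claimed parts, namely one set of weights shared by both images, an element-wise difference as the fusion operator, and a fused feature that is exactly zero when nothing on the shelf changed.

```python
import numpy as np

def extract_features(image, weights):
    """Toy stand-in for a weight-sharing feature extractor: a single
    linear projection followed by ReLU.  A real implementation would
    be a deep CNN such as ResNet-18."""
    flat = image.reshape(-1)
    return np.maximum(weights @ flat, 0.0)

def fuse_by_difference(feat_a, feat_b):
    """Feature fusion operator: element-wise difference of the two
    feature vectors, so the fused feature encodes only what changed."""
    return feat_a - feat_b

rng = np.random.default_rng(0)
shared_w = rng.normal(size=(16, 8 * 8))  # ONE weight matrix, used for both images

img_before = rng.normal(size=(8, 8))
img_after = img_before.copy()
img_after[2:4, 2:4] += 5.0  # simulate a region where a product was added/removed

f1 = extract_features(img_before, shared_w)
f2 = extract_features(img_after, shared_w)
fused = fuse_by_difference(f1, f2)

# Identical inputs through the shared weights give a zero difference feature.
assert np.allclose(fuse_by_difference(f1, f1), 0.0)
print(fused.shape)  # -> (16,)
```

Because the weights are shared, both images land in the same feature space, which is exactly why a plain subtraction is a meaningful fusion operator here.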
Further, the two weight-sharing feature extractors adopt ResNet-18 with the final fully connected layer removed.
Further, the regression network contains a spatial attention module and a channel attention module.
In an optional embodiment, recognizing the detected differential commodities according to the detection result to obtain their category information comprises:
determining the positions of the differential commodities from the coordinates of the detection boxes and the images to which the differential commodities belong;
and recognizing the commodities at those positions with a pre-trained commodity recognition model to determine the category information of the differential commodities.
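A minimal sketch of this two-stage decoupling, with hypothetical stand-ins for both models (`crop_box` for extracting the detected region, `classify_crop` for the pre-trained recognition model; neither name comes from the patent):

```python
def crop_box(image, box):
    """Crop a detection box (x1, y1, x2, y2) out of a 2-D image grid."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]

def classify_crop(crop):
    """Hypothetical stand-in for the pre-trained commodity recognition
    model: it simply looks up the dominant pixel value in the crop."""
    labels = {0: "empty", 1: "cola", 2: "chips"}
    values = [v for row in crop for v in row]
    return labels[max(set(values), key=values.count)]

def detect_and_classify(image_pair, detections):
    """Stage 2 of the pipeline: each detection carries the box
    coordinates plus which image ('A' or 'B') the difference is in."""
    results = []
    for det in detections:
        src = image_pair[det["image"]]
        results.append(classify_crop(crop_box(src, det["box"])))
    return results

# Image A (before purchase) contains a "cola" region that is gone in B.
img_a = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
img_b = [[0] * 4 for _ in range(4)]

dets = [{"box": (1, 1, 3, 3), "image": "A"}]
print(detect_and_classify({"A": img_a, "B": img_b}, dets))  # -> ['cola']
```

Note that the detector never outputs a class, only a box and an image id; swapping in a new recognition model changes nothing upstream.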
In an optional embodiment, before acquiring the two commodity images captured by the camera above the intelligent container shelf, the method further comprises:
collecting multiple groups of differential commodity images, each group comprising two commodity images captured top-down from the same viewing angle at different moments;
annotating the differential commodities in the image groups with bounding boxes to obtain annotation box information for each group, and generating label information for each group from the annotation box information;
applying data augmentation to the image groups to obtain multiple groups of differential commodity training images;
and training the constructed commodity difference detection model with the training images and label information to obtain the pre-trained commodity difference detection model.
Further, generating the label information for each group from the annotation box information comprises:
dividing the training images into at least one grid cell according to a preset grid size;
and generating, from the annotation box information of each group, label information for every grid cell of that group; the label information comprises whether an annotation box centre falls in the cell, which of the two images the box belongs to, the horizontal and vertical coordinates of the box centre, and the width and height of the box.
Further, the data augmentation applied to the image groups comprises at least one of:
randomly swapping the order of the two images in each training group;
randomly cropping and/or randomly padding the two images in each training group;
randomly mirror-flipping the two images in each training group;
and adjusting the contrast and/or brightness and/or saturation of the two images in each training group.
Further, training the constructed commodity difference detection model with the training images and label information comprises:
performing differential commodity detection on every grid cell of the training images, and adjusting the anchor boxes preset in each cell according to the detection result to obtain differential commodity prediction information for each cell;
computing a detection loss from the prediction information and the label information of each cell, and back-propagating the loss through every layer of the model to update the layer weight parameters;
and repeating the training steps until the model converges.
The invention provides a differential commodity detection and classification method for an intelligent container: acquiring two commodity images captured top-down at different moments by a camera arranged above the shelf; detecting the commodities in the two images with a pre-trained commodity difference detection model through feature extraction, feature fusion, and target regression to obtain a detection result comprising the coordinates of each detection box and the image to which each differential commodity belongs; and recognizing the detected differential commodities to obtain their category information. Compared with the prior art, this scheme directly detects the image positions of the differential commodities on the two scene images captured before and after the consumer's purchase, then recognizes the commodities at those positions with an object recognition model to obtain their categories, enabling automatic settlement and intelligent commodity management.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic diagram of the scenario architecture on which the present disclosure is based;
FIG. 2 is a schematic flowchart of a differential commodity detection and classification method for an intelligent container according to an embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a commodity difference detection model according to an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of a training method for the commodity difference detection model according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, intelligent containers mainly identify the commodities taken by consumers through a visual recognition algorithm and settle accounts automatically: a camera photographs the purchased commodities, a trained object recognition model identifies the categories of the purchased commodities, and the cost is then settled according to those categories.
However, because this approach relies on a fully supervised object detection model for commodity recognition and settlement, it suffers from high labeling cost, high updating cost, high deployment cost, and similar problems.
FIG. 1 is a schematic diagram of the scenario architecture on which the present disclosure is based. As shown in FIG. 1, the architecture may include an intelligent container 1, a differential commodity detection and classification apparatus 2, and a camera 3.
The differential commodity detection and classification apparatus 2 is hardware or software that interacts with the camera 3 over a network and can execute the differential commodity detection and classification method described in the embodiments below.
When the apparatus 2 is hardware, it may be an electronic device with computing capability; when it is software, it may be installed in such an electronic device. Such devices include, but are not limited to, servers, smart boxes, and desktop computers.
The camera 3 may be a hardware device integrated on the intelligent container 1 that can photograph a wide area at close range.
In an actual scenario, the apparatus 2 may be integrated in or installed on the intelligent container 1 and run there, or it may be integrated in or installed on a back-end server that processes commodity images and provides the detection and classification service for the container 1. The specific process is as follows: the apparatus 2 obtains the two commodity images captured by the camera 3 before and after the consumer's purchase, detects them with the method of the embodiments below to determine the positions of the differential commodities in the container 1, and identifies the commodity categories at those positions for automatic settlement.
The method for detecting and classifying the different commodities of the intelligent container provided by the application is further explained as follows:
FIG. 2 is a schematic flowchart of a differential commodity detection and classification method for an intelligent container according to an embodiment of the present disclosure. As shown in FIG. 2, the method includes:
s21, acquiring two commodity images shot by a camera arranged above the intelligent container shelf at different moments; wherein the two commodity images are obtained by the camera through overlooking shooting.
The camera can shoot a large-range commodity image in a short distance.
In this embodiment, since the consumer takes the commodity away from the shelf of the intelligent container when purchasing the commodity in the intelligent container, and the camera can be configured to the position capable of shooting all the commodities on the shelf in order to accurately identify the commodity taken away by the user, the overlooking image of all the commodities on the shelf can be shot by using the camera, and the commodity identification error caused by the shooting dead angle can be avoided.
S22, detecting the commodities in the two images with the pre-trained commodity difference detection model, the detection process comprising feature extraction, feature fusion, and target regression, to obtain the differential commodity detection result; the result comprises the coordinates of each detection box and the image to which each differential commodity belongs.
In this embodiment, unlike a common object detection model that takes a single commodity image as input, the commodity difference detection model takes a pair of commodity images and sequentially performs feature extraction, feature fusion, and target regression on them to obtain the coordinates of the differential commodity detection boxes and the image to which each differential commodity belongs.
Specifically, as shown in FIG. 3, the commodity difference detection model consists of two weight-sharing feature extractors, a feature fusion operator, and a regression network. The detection process comprises: extracting features from the two commodity images with the two weight-sharing extractors to obtain a first and a second image feature; computing their difference with the fusion operator to obtain a fused image feature; and processing the fused feature with the regression network to obtain the differential commodity detection result.
Further, the feature extractors of this embodiment extract deep features from the two input commodity images using ResNet-18 with the final fully connected layer removed; the ResNet-18 parameters are initialized from pre-training on the ImageNet dataset. The regression network contains a spatial attention module and a channel attention module.
This arrangement has several advantages. The two weight-sharing feature extractors process the pair of input images separately; because their parameters are shared, both images are mapped into the same feature space, which makes it possible to subsequently locate their differences in spatial position. The key of the algorithm is the feature fusion operator: it subtracts the features produced by the two extractors to obtain a fused image feature that carries the difference information, from which the differential commodity information is regressed. Finally, to better model the dependency between the spatial differences of the fused feature and the global features, attention modules in both the spatial and the channel dimension are added to the regression network; they model the semantic interdependencies in each dimension and help analyse the spatial difference features.
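The patent does not give the internals of the two attention modules. The sketch below is a generic self-attention formulation, under the assumption that "spatial attention" means position-to-position attention over the flattened feature map and "channel attention" means channel-to-channel attention:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(feat):
    """feat: (C, N) with N = H*W spatial positions.  Each position
    attends to every other position, modelling long-range
    dependencies between spatial differences."""
    attn = softmax(feat.T @ feat, axis=-1)   # (N, N) position affinities
    return feat @ attn.T                     # (C, N) re-weighted features

def channel_attention(feat):
    """Each channel attends to every other channel, modelling
    semantic interdependencies in the channel dimension."""
    attn = softmax(feat @ feat.T, axis=-1)   # (C, C) channel affinities
    return attn @ feat                       # (C, N)

rng = np.random.default_rng(1)
feat = rng.normal(size=(4, 9))   # 4 channels over a flattened 3x3 map
out = channel_attention(spatial_attention(feat))
print(out.shape)  # -> (4, 9)
```

Both modules preserve the feature shape, so they can be inserted anywhere inside the regression network without changing the surrounding layers.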
S23, recognizing the detected differential commodities according to the detection result to obtain their category information.
In this embodiment, given the coordinates of the detection boxes and the images to which the differential commodities belong, a commodity recognition model can identify the differential commodities in those images. The identified category is the category of the commodity purchased by the consumer, so cost settlement can be performed accordingly, realizing intelligent management of the commodities in the intelligent container.
Specifically, the positions of the differential commodities are determined from the detection box coordinates and the image information, and the commodities at those positions are recognized with the pre-trained commodity recognition model to determine their category information.
This decoupling has a clear advantage: the commodity difference detection model only locates the differential commodities and does not classify them; classification is handled by a dedicated commodity recognition model. When a new product is introduced, only the recognition model needs updating and the difference detection model need not be retrained, which reduces training difficulty and saves substantial model update cost.
This embodiment provides a commodity recognition method for an intelligent container: acquiring two commodity images captured top-down at different moments by a camera above the shelf; detecting the commodities in the two images with the pre-trained commodity difference detection model through feature extraction, feature fusion, and target regression to obtain the detection box coordinates and the image to which each differential commodity belongs; and recognizing the detected differential commodities to obtain their category information. By directly detecting the image positions of the differential commodities on the two scene images captured before and after the consumer's purchase, and then recognizing the commodities at those positions with an object recognition model, the scheme avoids the high labeling, updating, and deployment costs of fully supervised detection.
On the basis of the foregoing embodiment, FIG. 4 is a flowchart of a training method for the commodity difference detection model according to an embodiment of the present disclosure. Before the two commodity images are acquired in step S21, the method further includes a training stage for the commodity difference detection model. As shown in FIG. 4, the training stage includes:
s41, collecting a plurality of groups of difference commodity images; each group of difference commodity images comprises two commodity images which are overlooked and shot at the same visual angle and different moments.
In this embodiment, because there is no available public data set to directly train, the data set used for training the model needs to be collected and labeled according to the actual application scene, the image pair captured in the container can be collected, each pair of images is a simulation of the scene in the container at two moments before and after consumer consumption, the cameras in the container are used for looking down and capturing at different moments at the same viewing angle, the capturing time interval is not long, so as to ensure that the two images are basically in the same illumination and background, but the commodities in the images are different, the capturing condition is set to simulate the situation that the consumer consumes once in the intelligent container, the time for consuming once is usually not too long, the purchased commodities are not too many, and the commodity change caused by consumption is not very large.
S42, annotating the differential commodities in the image groups with bounding boxes to obtain annotation box information for each group, and generating label information for each group from the annotation box information.
In this embodiment, the annotation boxes of each group mark the positions of the differential commodities in that group, and the annotation information can be stored as an XML file. The annotation information records the bounding boxes of the differential commodities on the two images of each group; the position and size of each differential commodity can be recorded with the coordinates of the box's top-left and bottom-right corners. During training, the annotation information of each group is parsed according to a specific rule into the label information used for model training.
Specifically, after the differential commodities in each group are annotated, the annotation information is parsed as follows: the training images are divided into at least one grid cell according to a preset grid size; then label information is generated for every grid cell of each group from that group's annotation information. The label information comprises whether an annotation box centre falls in the cell, which of the two images the box belongs to, the horizontal and vertical coordinates of the box centre, and the width and height of the box.
For example, each group of differential commodity images comprises an image A and an image B, and each group is divided into S × S grid cells. Each cell corresponds to a label vector of the form [P(Obj), P(A|Obj), P(B|Obj), midx, midy, w, h], where P(Obj) indicates whether an annotation box centre falls in the cell, P(A|Obj) and P(B|Obj) indicate whether that box lies on image A or image B, midx and midy are the horizontal and vertical coordinates of the box centre, and w and h are the width and height of the box. If an annotation box centre lies in the current cell, P(Obj) is set to 1, otherwise 0; if the box lies on image A, P(A|Obj) is set to 1, otherwise 0, and P(B|Obj) is set likewise. If two different boxes lie on image A and image B respectively and their centres fall in the same cell, P(A|Obj) and P(B|Obj) are both set to 1, and (midx, midy, w, h) records the size and position of the box. Following this rule, each group of images yields a label tensor of size S × S × 7.
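The grid-labelling rule above (an S × S grid of 7-vectors) can be sketched as follows; `make_labels` is an illustrative helper, not from the patent, and coordinates are assumed normalised to [0, 1]:

```python
def make_labels(boxes, S=4):
    """Build the S x S grid of 7-element label vectors
    [P(Obj), P(A|Obj), P(B|Obj), midx, midy, w, h].
    Each box is (image_id, cx, cy, w, h) with image_id 'A' or 'B'
    and all coordinates normalised to [0, 1]."""
    cell = 1.0 / S
    labels = [[[0.0] * 7 for _ in range(S)] for _ in range(S)]
    for image_id, cx, cy, w, h in boxes:
        col = min(int(cx / cell), S - 1)
        row = min(int(cy / cell), S - 1)
        vec = labels[row][col]
        vec[0] = 1.0                 # P(Obj): a box centre falls in this cell
        if image_id == "A":
            vec[1] = 1.0             # P(A|Obj)
        else:
            vec[2] = 1.0             # P(B|Obj)
        vec[3], vec[4] = cx, cy      # box centre coordinates
        vec[5], vec[6] = w, h        # box width and height
    return labels

# One differential product on image A, another on image B whose
# centre falls in the same grid cell: both flags end up set to 1.
labels = make_labels([("A", 0.3, 0.3, 0.2, 0.1),
                      ("B", 0.35, 0.3, 0.1, 0.1)], S=4)
print(labels[1][1])  # -> [1.0, 1.0, 1.0, 0.35, 0.3, 0.1, 0.1]
```

When two boxes share a cell, this toy version keeps only the last box's geometry, which matches the single (midx, midy, w, h) slot in the label vector.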
And S43, performing data enhancement processing on the multiple groups of difference commodity images to obtain multiple groups of difference commodity training images.
In this embodiment, since training sample data is insufficient and the distribution of the difference objects on each group of difference commodity images is irregular, the type and the number of the training sample data are expanded by processing the initial training sample data in a data enhancement mode.
Specifically, the data enhancement processing includes at least one of: randomly swapping the positions of the two commodity images in each group of difference commodity training images; randomly cropping and/or randomly padding the two commodity images in each group; randomly mirror-flipping the two commodity images in each group; and adjusting the contrast and/or brightness and/or saturation of the two commodity images in each group.
For example, to reduce the negative influence of the uneven distribution of difference commodities over the two images, a random-swap data enhancement strategy is designed: the order of graph A and graph B, together with the corresponding label information, is swapped with 50% probability. This removes the negative influence that a fixed input order of graph A and graph B would have on learning, and balances the number of difference commodities distributed over graph A and graph B. To increase the diversity of the training samples, the two commodity images in each group are randomly cropped or randomly padded to obtain samples of richer sizes; the contrast, brightness and saturation of the images are then randomly adjusted; and finally graph A and graph B are mirror-flipped with 50% probability. When a group of difference commodity training images undergoes these enhancement operations, graph A and graph B must be processed in exactly the same way.
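A minimal sketch of the paired enhancement, assuming NumPy image arrays and the S × S × 7 label layout described earlier. Only the swap and mirror steps are shown; crop/pad and color jitter would be added the same way, applied identically to graph A and graph B.

```python
import random
import numpy as np

def augment_pair(img_a, img_b, label):
    """Apply the paired augmentations to one (A, B) training pair.

    img_a, img_b: H x W x 3 uint8 arrays; label: S x S x 7 tensor whose
    channels 1 and 2 are P(A|Obj) and P(B|Obj). Illustrative sketch only.
    """
    # 50% chance: swap A and B, and swap the P(A|Obj)/P(B|Obj) channels
    if random.random() < 0.5:
        img_a, img_b = img_b, img_a
        label = label.copy()
        label[..., [1, 2]] = label[..., [2, 1]]
    # 50% chance: horizontal mirror, applied identically to both images;
    # grid columns are reversed and each occupied cell's midx becomes 1 - midx
    if random.random() < 0.5:
        img_a = img_a[:, ::-1].copy()
        img_b = img_b[:, ::-1].copy()
        label = label[:, ::-1].copy()
        occupied = label[..., 0] > 0
        label[..., 3][occupied] = 1.0 - label[..., 3][occupied]
    return img_a, img_b, label
```

Because every geometric transform is mirrored in the label tensor, the S × S × 7 supervision stays consistent with the augmented images.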
And S44, training the constructed commodity difference detection model by using the plurality of groups of difference commodity training images and the label information to obtain the commodity difference detection model which is trained in advance.
In this embodiment, the multiple groups of difference commodity training images and the label information are input into the constructed commodity difference detection model, and the difference commodities in each group of training images are detected by the model to obtain difference commodity prediction information. A difference commodity detection loss value is calculated from the prediction information and the label information and back-propagated through the layers of the model, so that the weight parameters of each layer are updated according to the loss value; these training steps are repeated until the commodity difference detection model converges.
Specifically, difference commodity detection is performed on each grid area of the multiple groups of training images, and the anchor frames preset in each grid area are adjusted according to the detection result to obtain the difference commodity prediction information of that grid area. The detection loss value is then calculated from the prediction information and label information of each grid area and back-propagated to update the weight parameters of each layer, and the training steps are repeated until the model converges.
For example, predicting the offsets of the box center coordinates and of the length and width from a preset anchor box is much simpler than directly regressing the coordinates; this simplifies the regression problem and makes the network easier to train. The preset anchor boxes tile the feature map in a convolutional manner, so that the position of each anchor box relative to its corresponding grid is fixed. For each prediction box, the network predicts: the probability that a target object is in the prediction box, the probabilities that the object falls on each of the two images, and the position of the prediction box. If K anchor boxes are preset for each grid, then 1 objectness score, 2 position scores and 4 offsets from the anchor box are predicted per anchor box, so 7K output filters are applied at each grid location of the feature map. The model formulates difference detection as a regression problem: the image is first divided into an S × S grid, and each grid cell predicts K bounding boxes. The overall loss function is divided into a prior-frame loss and a regression loss; the regression loss comprises an object loss, a coordinate loss and a class loss. Note that the "class" of the commodity difference detection model indicates whether an object falls on graph A or graph B in space; it is spatial-level information and does not represent the category of a commodity.
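The anchor-offset prediction described above can be decoded into boxes as in the following sketch. The exact parameterization (sigmoid center offset within the cell, exponential scaling of the anchor size) is a YOLO-style assumption; the embodiment only states that offsets from the preset anchor boxes are predicted.

```python
import numpy as np

def decode_predictions(raw, anchors, S):
    """Decode the 7K-per-cell raw head output into boxes (illustrative).

    raw: S x S x (7K) array; per anchor the 7 channels are assumed to be
    [obj, pA, pB, tx, ty, tw, th]. anchors: list of (aw, ah) in grid units.
    Returns S x S x K x 4 boxes as (cx, cy, w, h) normalized to [0, 1].
    """
    K = len(anchors)
    raw = raw.reshape(S, S, K, 7)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    boxes = np.zeros((S, S, K, 4), dtype=np.float64)
    for row in range(S):
        for col in range(S):
            for j, (aw, ah) in enumerate(anchors):
                obj, pa, pb, tx, ty, tw, th = raw[row, col, j]
                cx = (col + sigmoid(tx)) / S   # center stays inside its cell
                cy = (row + sigmoid(ty)) / S
                w = aw * np.exp(tw) / S        # size scales the anchor prior
                h = ah * np.exp(th) / S
                boxes[row, col, j] = [cx, cy, w, h]
    return boxes
```

Tying each prediction to its cell and anchor is what keeps the regression well-conditioned: the network only learns small corrections, not absolute coordinates.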
The prior-frame loss is defined as:

L_prior = \mathbb{1}(t < T_0) \sum_{i=0}^{S^2-1} \sum_{j=0}^{K-1} \sum_{r \in \{x,y,w,h\}} (prior_{ij}^{r} - \hat{b}_{ij}^{r})^2

where t represents the total number of training samples seen so far, and the indicator \mathbb{1}(t < T_0) equals 1 only while t is smaller than the preset number T_0. The prior-frame loss is calculated only when this condition is met, because L_prior is designed to let the model learn the preset anchor frames faster during the early stage of training.
The regression loss is defined as follows:

L_obj = \sum_{i=0}^{S^2-1} \sum_{j=0}^{K-1} \mathbb{1}_{ij}^{obj} (1 - \hat{P}_{ij}(Obj))^2

L_noobj = \sum_{i=0}^{S^2-1} \sum_{j=0}^{K-1} \mathbb{1}_{ij}^{noobj} (0 - \hat{P}_{ij}(Obj))^2

L_coord = \sum_{i=0}^{S^2-1} \sum_{j=0}^{K-1} \mathbb{1}_{ij}^{obj} (2 - \hat{w}_{ij}\hat{h}_{ij}) \sum_{r \in \{x,y,w,h\}} (b_{ij}^{r} - \hat{b}_{ij}^{r})^2

L_class = \sum_{i=0}^{S^2-1} \sum_{j=0}^{K-1} \mathbb{1}_{ij}^{obj} [ (P_{ij}(A|Obj) - \hat{P}_{ij}(A|Obj))^2 + (P_{ij}(B|Obj) - \hat{P}_{ij}(B|Obj))^2 ]

When the intersection-over-union between the prediction box produced by the j-th anchor of the i-th grid and the labeling boxes falling into that grid is smaller than a preset threshold Thresh, the prediction box is considered to contain no target object, and L_noobj is calculated for it; when the j-th anchor box of the i-th grid matches a labeling box falling into the grid, L_obj, L_coord and L_class are calculated for the prediction box. To reflect that the same prediction deviation affects large-scale and small-scale labeling boxes differently, the coordinate error is weighted by the factor (2 - \hat{w}_{ij}\hat{h}_{ij}); this factor reduces the penalty on prediction deviations for large boxes and increases it for small boxes, where \hat{w}_{ij} and \hat{h}_{ij} are the length and width of the labeling frame normalized relative to the current grid, taking values between 0 and 1.
The total loss function is obtained by adding the above loss terms with different weights, as shown in formula six:

L_t = \lambda_{prior} L_{prior} + \lambda_{coord} L_{coord} + \lambda_{noobj} L_{noobj} + \lambda_{obj} L_{obj} + \lambda_{class} L_{class}    (formula six)

In practice, the weights are set to \lambda_{prior} = 0.01, \lambda_{noobj} = 0.5, \lambda_{obj} = 5, \lambda_{coord} = 2, \lambda_{class} = 1.
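Formula six then amounts to a simple weighted sum; the sketch below uses the embodiment's reported weights as defaults (the function name and dict keys are illustrative):

```python
def total_loss(parts, weights=None):
    """Weighted sum of the five loss terms of formula six.

    parts: dict with keys 'prior', 'coord', 'noobj', 'obj', 'class'
    mapping to the scalar value of each loss term.
    """
    if weights is None:
        # weights reported in the embodiment
        weights = {'prior': 0.01, 'coord': 2.0, 'noobj': 0.5,
                   'obj': 5.0, 'class': 1.0}
    return sum(weights[k] * parts[k] for k in parts)
```

The large weight on the object term and the small one on the prior term reflect that the prior loss only matters early in training, while confident objectness is the main training signal.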
It can be seen that the loss function of this embodiment computes the loss terms for every grid cell, but its meaning is not quite the same as the loss function of an ordinary object detection model. First, its input is the fused feature of a pair of images, so the information it operates on differs from that of an ordinary detector. Second, the semantics of "class" is entirely different: here the class indicates which image the current prediction box belongs to, which is spatial-level semantic information, whereas the class in an ordinary object detection model refers to the specific category of the object inside the prediction box.
Although the present invention has been described with reference to the above embodiments, it should be understood that various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention.

Claims (9)

1. The method for detecting and classifying the different commodities of the intelligent container is characterized by comprising the following steps of:
acquiring two commodity images shot at different moments by a camera arranged above a goods shelf of an intelligent container; wherein the two commodity images are captured by the camera from a top-down view;
detecting the commodities in the two commodity images by using a commodity difference detection model trained in advance, wherein the detection process comprises the following steps: extracting, fusing and target regressing the characteristics to obtain a detection result of the different commodity; the difference commodity detection result comprises the coordinates of a difference commodity detection frame and image information to which the difference commodity belongs;
and identifying the detected different commodities according to the detection result of the different commodities to obtain the category information of the different commodities.
2. The method for detecting and classifying the difference commodities of the intelligent container according to claim 1, wherein the commodity difference detection model is provided with two weight-sharing feature extractors, the output ends of the two weight-sharing feature extractors are connected with a feature fusion operator, and the output ends of the feature fusion operator are connected with a regression network;
correspondingly, the commodity difference detection model trained in advance is used for detecting the commodities in the two commodity images, and the detection processing comprises the following steps: and (3) obtaining a detection result of the different commodity by feature extraction, feature fusion and target regression, wherein the detection result comprises the following steps:
respectively extracting the features of the two commodity images through the feature extractor shared by the two weights to obtain a first image feature and a second image feature;
calculating the difference value of the first image characteristic and the second image characteristic through the characteristic fusion operator to obtain a fusion image characteristic;
and identifying the fusion image characteristics through the regression network to obtain a difference commodity detection result.
3. The method for detecting and classifying differential commodities of intelligent containers as claimed in claim 2, wherein the two weight-sharing feature extractors use ResNet-18 with the last fully-connected layer deleted.
4. The method for detecting and classifying differential commodities of intelligent containers according to claim 2, wherein a space attention module and a channel attention module exist in the regression network.
5. The method for detecting and classifying the different commodities in the intelligent container according to claim 1, wherein the identifying the detected different commodities according to the detection result of the different commodities to obtain the category information of the different commodities comprises:
determining position information of the differential commodities according to the coordinates of the differential commodity detection frame and the image information of the differential commodities;
and identifying the commodities at the positions of the different commodities by using the commodity identification model trained in advance, and determining the category information of the different commodities.
6. The method for detecting and classifying the differential commodities of the intelligent container according to claim 1, wherein before the obtaining of two commodity images taken by the camera arranged above the shelf of the intelligent container at different times, the method further comprises:
collecting a plurality of groups of difference commodity images; each group of difference commodity images comprising two commodity images captured top-down from the same viewing angle at different moments;
performing frame marking on the different commodities in the multiple groups of different commodity images to obtain marking frame information corresponding to each group of different commodity images, and generating label information of each group of different commodity images according to the marking frame information;
performing data enhancement processing on the multiple groups of difference commodity images to obtain multiple groups of difference commodity training images;
and training the constructed commodity difference detection model by using the plurality of groups of difference commodity training images and the label information to obtain the commodity difference detection model which is trained in advance.
7. The method for detecting and classifying the different commodities in the intelligent container according to the claim 6, wherein the generating of the label information of each group of different commodity images according to the labeling frame information comprises:
performing grid division on the multiple groups of different commodity training images according to the preset grid size to obtain at least one grid area;
generating label information corresponding to each grid area of each group of the differential commodity training images according to the labeling frame information corresponding to each group of the differential commodity training images; the label information comprises the position relationship between the grids and the central point of the labeling frame, the position relationship between the two difference commodity images and the central point of the labeling frame, the abscissa and the ordinate of the central point of the labeling frame, and the length and the width of the labeling frame.
8. The method for detecting and classifying the differential commodities in the intelligent container according to claim 7, wherein the data enhancement processing on the plurality of groups of differential commodity images comprises at least one of the following steps:
randomly swapping the positions of the two commodity images in each group of difference commodity training images;
randomly cropping and/or randomly padding the two commodity images in each group of difference commodity training images;
randomly mirror-flipping the two commodity images in each group of difference commodity training images;
and adjusting the contrast and/or brightness and/or saturation of the two commodity images in each group of difference commodity training images.
9. The method for detecting and classifying the different commodities in the intelligent container according to claim 8, wherein the training of the constructed commodity difference detection model by using the plurality of groups of different commodity training images and the label information comprises:
carrying out different commodity detection on each grid area in the multiple groups of different commodity training images, and adjusting an anchor frame preset in each grid area according to a detection result to obtain different commodity prediction information of each grid area;
calculating a difference commodity detection loss value according to the difference commodity prediction information and the label information of each grid area, and reversely transmitting the difference commodity detection loss value to each layer of the commodity difference detection model so as to update weight parameters of each layer according to the difference commodity detection loss value;
and repeating the training steps until the commodity difference detection model converges.
CN202111476957.XA 2021-12-02 2021-12-02 Differential commodity detection and classification method for intelligent container Pending CN114255377A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111476957.XA CN114255377A (en) 2021-12-02 2021-12-02 Differential commodity detection and classification method for intelligent container


Publications (1)

Publication Number Publication Date
CN114255377A true CN114255377A (en) 2022-03-29

Family

ID=80791722

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111476957.XA Pending CN114255377A (en) 2021-12-02 2021-12-02 Differential commodity detection and classification method for intelligent container

Country Status (1)

Country Link
CN (1) CN114255377A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115423695A (en) * 2022-07-15 2022-12-02 清华大学 Streetscape image sampling method and device for city prediction task
CN117422937A (en) * 2023-12-18 2024-01-19 成都阿加犀智能科技有限公司 Intelligent shopping cart state identification method, device, equipment and storage medium
CN117422937B (en) * 2023-12-18 2024-03-15 成都阿加犀智能科技有限公司 Intelligent shopping cart state identification method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
JP7058669B2 (en) Vehicle appearance feature identification and vehicle search methods, devices, storage media, electronic devices
CN108960119B (en) Commodity recognition algorithm for multi-angle video fusion of unmanned sales counter
Zhang et al. Toward new retail: A benchmark dataset for smart unmanned vending machines
CN112131978B (en) Video classification method and device, electronic equipment and storage medium
CN110298297A (en) Flame identification method and device
WO2020134102A1 (en) Article recognition method and device, vending system, and storage medium
US11501110B2 (en) Descriptor learning method for the detection and location of objects in a video
CN111340126A (en) Article identification method and device, computer equipment and storage medium
CN111626201A (en) Commodity detection method and device and readable storage medium
CN114255377A (en) Differential commodity detection and classification method for intelligent container
CN109934081A (en) A kind of pedestrian's attribute recognition approach, device and storage medium based on deep neural network
CN111523421A (en) Multi-user behavior detection method and system based on deep learning and fusion of various interaction information
CN115272652A (en) Dense object image detection method based on multiple regression and adaptive focus loss
CN115797736B (en) Training method, device, equipment and medium for target detection model and target detection method, device, equipment and medium
CN109712324A (en) A kind of automatic vending machine image-recognizing method, good selling method and vending equipment
CN114937179B (en) Junk image classification method and device, electronic equipment and storage medium
CN111368634B (en) Human head detection method, system and storage medium based on neural network
Bappy et al. Real estate image classification
Yang et al. Increaco: incrementally learned automatic check-out with photorealistic exemplar augmentation
CN111428743B (en) Commodity identification method, commodity processing device and electronic equipment
CN111444802A (en) Face recognition method and device and intelligent terminal
CN111126264A (en) Image processing method, device, equipment and storage medium
Chen et al. Self-supervised multi-category counting networks for automatic check-out
CN110472639B (en) Target extraction method based on significance prior information
Achakir et al. An automated AI-based solution for out-of-stock detection in retail environments

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination