CN116342985A - Robust feature learning method for dynamic intelligent container - Google Patents
- Publication number
- CN116342985A (application CN202310109875.4A)
- Authority
- CN
- China
- Prior art keywords
- feature
- significant
- features
- learning method
- dynamic intelligent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/778—Active pattern-learning, e.g. online learning of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention provides a robust feature learning method for a dynamic intelligent container, comprising the following steps: step S1, generating a first global suppression mask G according to a first output feature map M; step S2, generating a second global suppression mask G'; step S3, obtaining a second feature map M' after global suppression; step S4, dividing M into a plurality of contiguous small-scale local feature maps m_i and generating a local suppression mask K_i' on each m_i; step S5, uniformly covering each K_i' on the corresponding m_i to complete the suppression of the salient features on m_i, obtaining M_K'; and step S6, combining the global suppression operation of steps S1-S3 with the local suppression operation of steps S4-S5 to finally generate a third output feature map M''. The invention can dynamically adjust the suppression strength applied to the salient features, obtain accurate robust feature learning performance, and improve the reliability of retail commodity recognition for dynamic intelligent containers.
Description
Technical Field
The invention relates to the technical field of dynamic intelligent containers and computer vision, in particular to a robust feature learning method for a dynamic intelligent container.
Background
As an important component of the unmanned retail model, the dynamic intelligent container offers consumers a fast and efficient way to shop and gives merchants an entry point to low-cost, high-quality operation. However, owing to factors such as consumer behavior habits and shopping scenes, the retail goods in a dynamic intelligent container are highly prone to occlusion of many different degrees during the retail process, which severely affects the feature representation and extraction of the goods and prevents the related functions of the dynamic intelligent container from being performed effectively. When occlusion is severe, the only remedy is for back-office staff to screen the surveillance video manually, so the core advantages of the technology cannot be exploited; moreover, owing to operator fatigue and similar causes, the rates of missed and erroneous operations are high and the features of retail goods cannot be learned accurately in real time. Automatic, intelligent analysis for the dynamic intelligent container has therefore become an urgent need.
Although computer vision, human-computer interaction and cloud-edge collaboration technologies have all advanced greatly, robustness has become a limiting factor. If high-quality features of the retail goods in a dynamic intelligent container can be obtained by mining, manual workload can be reduced, work efficiency and recognition accuracy can be improved, and consumers can be offered a highly reliable, efficient and accurate dynamic intelligent container shopping experience.
To learn generalizable robust features, current research is mostly based on deep learning and applies, during the training phase of a neural network: 1) data augmentation, i.e. generating training data of different distributions through digital image transformations (e.g., random rotation and cropping), which is simple and efficient to implement but requires more storage space and training time; 2) injection of regularization noise, i.e. randomly injecting noise at the input, weights or other parts of the network to prevent the feature detectors from co-adapting and to improve the robustness of the model. For example, a binary matrix mask with the same dimensions as the fully connected layer weights may be created to randomly suppress the activation values of neurons during training, and structured random discarding of contiguous input regions takes the continuity of the input into account. However, adding regularization noise has the following problems: a) the discard rule has to be designed separately, and its quality not only determines whether important features are lost but may also cause the network to ignore the associations between features; b) the discard units (elements or regions) cannot be predicted in advance, and because they are selected randomly the discard process is blind and irregular; c) the discard processes are interrelated: when the features of the discard units are coupled with one another, the correlations between features propagate to the next layer of the network even without the discard units themselves, resulting in co-adaptation of the feature detectors. Moreover, regularization-noise methods are generally suitable only for a particular level of the network (such as a hidden layer or a fully connected layer) and cannot conveniently be used across other levels.
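As a hedged illustration of the two regularization-noise strategies described above, the following NumPy sketch contrasts element-wise random suppression of activations (a dropout-style binary mask) with structured discarding of one contiguous region; the function names, shapes and block size are illustrative assumptions, not taken from the patent or from any specific library.

```python
import numpy as np

def elementwise_drop(act, p_keep=0.7, rng=np.random.default_rng(0)):
    """Dropout-style noise: each activation is kept independently with probability p_keep."""
    mask = rng.binomial(1, p_keep, size=act.shape)
    return act * mask

def block_drop(feat, block=3, rng=np.random.default_rng(0)):
    """Structured discard: zero one contiguous block x block region of a 2-D feature map."""
    h, w = feat.shape
    y = rng.integers(0, h - block + 1)
    x = rng.integers(0, w - block + 1)
    out = feat.copy()
    out[y:y + block, x:x + block] = 0.0
    return out
```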
In summary, current commodity recognition methods for dynamic intelligent containers mostly improve their generalization either by constructing large-scale data sets through time-consuming and labor-intensive manual labeling, or by using regularization techniques such as data augmentation. At the same time, most such methods are confined to certain specific subspaces and cannot generate augmented examples that reflect irregular, unstable conditions susceptible to environmental factors, so robust feature learning remains limited.
Disclosure of Invention
The invention aims to provide a robust feature learning method for a dynamic intelligent container, the specific technical solution of which is as follows:
a robust feature learning method for a dynamic intelligent container, comprising the steps of:
Step S1: extracting the color, contour and texture information of the retail goods in the dynamic intelligent container with a hidden layer of a deep neural network to obtain a first output feature map M of the hidden layer, and generating a first global suppression mask G according to the dimension, height and width of M;
Step S2: traversing M with a maximum-value algorithm to obtain the regions where the salient features of M are located and the regions where the non-salient features of M are located, initializing and assigning G using the regions of the salient and non-salient features of M, and generating a second global suppression mask G' by using a Bernoulli distribution to adjust, within G, the proportion of the salient features of M to the non-salient features of M;
Step S3: uniformly covering G' on M to complete the suppression of the salient features on M, thereby obtaining a second feature map M' after global suppression, where M' = M * G' denotes the element-wise product of the corresponding matrix elements of M and G';
s4, dividing the M into a plurality of continuous small-scale local feature graphs M i At said m i Up-generating a local suppression mask K i ';
wherein ,s r represents a line division interval, s, for the M c Represents a column division interval for the M, W represents a width of the M, H represents a height of the M, +.>Representing the number of the local feature graphs after M division;
Step S5: uniformly covering each K_i' on the corresponding m_i to complete the suppression of the salient features on m_i, obtaining M_K', where M_K' denotes the feature map obtained by splicing the locally suppressed local feature maps m_i;
Step S6: adopting the global suppression operation of steps S1-S3 and the local suppression operation of steps S4-S5 to finally generate a third output feature map M'', where M'' is given by expression (1):
in expression (1), ω is an inactivation coefficient, ω ∈ [0, 1), whose value is adjusted manually according to the observed experimental results and the feature expression shown in the visualized class activation map; the experimental results are the retail commodity recognition results of the dynamic intelligent container.
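The image carrying expression (1) is not reproduced in this text. As a hedged sketch only, one plausible reading consistent with the surrounding description (ω an inactivation coefficient in [0, 1) weighting the globally suppressed map M' against the locally suppressed map M_K') is a linear blend; the formula in the code below is an assumption for illustration, not the claimed expression.

```python
import numpy as np

def combine_suppressed_maps(M_global, M_local, omega=0.5):
    """Assumed form of expression (1): blend M' and M_K' into the third map M''."""
    assert 0.0 <= omega < 1.0  # omega is stated to lie in [0, 1)
    return omega * M_global + (1.0 - omega) * M_local
```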
Optionally, in step S1, M ∈ R^(D×H×W), where D denotes the dimension of M, H denotes the height of M, and W denotes the width of M.
Optionally, the G is a binary matrix consisting of 0 and 1, and is overlaid on the M.
Optionally, in step S1, the M is an integer multiple of the G.
Optionally, in step S1, the dimensions, the height and the width of the G and the M are the same.
Optionally, in step S2, a maximum-value algorithm is used to traverse M and obtain the position L(d, x, y) of the maximum-value point in each dimension of M; the region where the salient features of M are located is represented by a square of side length 3t centered at L(d, x, y), i.e. L(d, x±t, y±t), where t may take any real value;
when G is initialized and assigned at the positions of the regions where the salient features of M are located, G takes the value 0, and the pixel units corresponding to the regions where the salient features of M are located are discarded;
when G is initialized and assigned at the positions of the regions where the non-salient features of M are located, G takes the value 1, and the pixel units corresponding to the regions where the non-salient features of M are located are retained;
the process of generating G' is as follows:
G' = G * B_1, B_1 ~ Bernoulli(p_1);
where B_1 denotes a Bernoulli distribution with parameter p_1, p_1 ∈ [0, 1];
B_1 is used to adjust, within G, the proportion of the salient features of M to the non-salient features of M; specifically, the proportion of the salient features of M within M ranges from 0 to 1, the proportion of the non-salient features of M within M ranges from 0 to 1, and the ratio of the salient features of M to the non-salient features of M ranges from 0 to positive infinity.
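A minimal NumPy sketch of steps S1-S3 as described above, assuming that the salient region in each dimension d is the (x±t, y±t) square around that dimension's maximum and that the Bernoulli sample B_1 is an element-wise binary matrix multiplied into G; these are readings of the text, not a reference implementation.

```python
import numpy as np

def global_suppression(M, t=1, p1=0.3, rng=np.random.default_rng(0)):
    """Build G (0 on salient squares, 1 elsewhere), then G' = G * B1 and M' = M * G'."""
    D, H, W = M.shape
    G = np.ones_like(M)
    for d in range(D):
        # position L(d, x, y) of the maximum-value point in dimension d
        y, x = np.unravel_index(np.argmax(M[d]), (H, W))
        y0, y1 = max(0, y - t), min(H, y + t + 1)
        x0, x1 = max(0, x - t), min(W, x + t + 1)
        G[d, y0:y1, x0:x1] = 0          # discard the salient square
    # assumption: B1 is an element-wise Bernoulli(p1) mask of the same shape as G
    B1 = rng.binomial(1, p1, size=G.shape)
    G_prime = G * B1
    M_prime = M * G_prime               # element-wise product of M and G'
    return M_prime, G_prime
```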
Optionally, in step S4, the basis for dividing M is set by jointly considering the size of the data used by the deep neural network for training on the retail goods and the height and width of M.
Optionally, in step S4, the values of s_r and s_c are the same.
Optionally, the process of generating K_i' in step S4 is as follows:
first, a binarization matrix K_i that is independent of m_i and has the same height and width as m_i is created;
second, a maximum-value algorithm is used to traverse and locate the regions where the salient features of m_i are located and the regions where the non-salient features of m_i are located, and K_i is initialized and assigned using the regions of the salient and non-salient features of m_i;
finally, a Bernoulli distribution is used to adjust, within K_i, the proportion of the salient features of m_i to the non-salient features of m_i, generating K_i', where K_i' = K_i * B_2, B_2 ~ Bernoulli(p_2);
where B_2 denotes a Bernoulli distribution with parameter p_2, p_2 ∈ [0, 1];
B_2 is used to adjust, within K_i, the proportion of the salient features of m_i to the non-salient features of m_i; specifically, the proportion of the salient features of m_i within K_i ranges from 0 to 1, the proportion of the non-salient features of m_i within K_i ranges from 0 to 1, and the ratio of the salient features of m_i to the non-salient features of m_i ranges from 0 to positive infinity.
Optionally, in step S4, the process of initializing and assigning K_i is as follows:
when K_i is initialized and assigned at the positions of the regions where the salient features of m_i are located, K_i takes the value 0, indicating that the pixel units corresponding to the regions where the salient features of m_i are located are discarded;
when K_i is initialized and assigned at the positions of the regions where the non-salient features of m_i are located, K_i takes the value 1, indicating that the pixel units corresponding to the regions where the non-salient features of m_i are located are retained.
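A companion sketch for steps S4-S5 under the same assumptions: M is split into non-overlapping s_r × s_c tiles m_i, a mask K_i is built per tile (0 on the tile's salient square, 1 elsewhere), K_i' = K_i * B_2, and the locally suppressed tiles are spliced back into M_K'. The per-dimension tiling and the splicing order are illustrative choices, not details stated in the patent.

```python
import numpy as np

def local_suppression(M, s_r=4, s_c=4, t=1, p2=0.7, rng=np.random.default_rng(0)):
    """Split M into s_r x s_c tiles, suppress each tile's salient square, re-splice into M_K'."""
    D, H, W = M.shape
    M_K = M.copy()
    for d in range(D):
        for r0 in range(0, H, s_r):
            for c0 in range(0, W, s_c):
                tile = M[d, r0:r0 + s_r, c0:c0 + s_c]   # local feature map m_i
                K = np.ones_like(tile)
                y, x = np.unravel_index(np.argmax(tile), tile.shape)
                K[max(0, y - t):y + t + 1, max(0, x - t):x + t + 1] = 0
                B2 = rng.binomial(1, p2, size=K.shape)  # B2 ~ Bernoulli(p2)
                M_K[d, r0:r0 + s_r, c0:c0 + s_c] = tile * (K * B2)
    return M_K
```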
The technical scheme of the invention has the following beneficial effects:
(1) The robust feature learning method for a dynamic intelligent container adopts the global suppression operation of steps S1-S3 and the local suppression operation of steps S4-S5 to decouple the first output feature map M, at different scales, into salient features related to the true target label and irrelevant non-salient features, actively suppresses the expression of the salient features during the training stage, and forces the network to learn the associations between the hard-to-learn non-salient features and the true target labels, so that the network gains stronger insight into salient and non-salient features at the same time and the operational performance of retail commodity recognition for dynamic intelligent containers is improved. In addition, the method combines step S6 with steps S1-S5, takes the interpretability of the extracted robust features into account, interprets the robust feature extraction process from the perspective of human cognition with the help of class activation maps, and finally generates the third output feature map M''; it can dynamically adjust the suppression strength applied to the salient features, obtain accurate robust feature learning, and improve the reliability of retail commodity recognition for dynamic intelligent containers.
(2) The invention can be flexibly inserted and used on the first output feature map M of step S1 in both convolutional layers and fully connected layers without modifying the original network structure, thereby improving the robust feature learning capability of retail commodity recognition for dynamic intelligent containers in different scenes.
In addition to the objects, features and advantages described above, the present invention has other objects, features and advantages. The present invention will be described in further detail with reference to the drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of a robust feature learning method for a dynamic intelligent container in accordance with embodiment 1 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art on the basis of the embodiments of the invention fall within the scope of protection of the invention.
Example 1:
Referring to FIG. 1, a robust feature learning method for a dynamic intelligent container comprises the following steps:
s1, extracting color, outline and texture information of retail goods in a dynamic intelligent container by adopting a hidden layer of a deep neural network (particularly a ResNet-56 network) to obtain a first output characteristic diagram M of the hidden layer; generating a first global suppression mask G according to the dimension, the height and the width of the M;
Step S2: traversing M with a maximum-value algorithm to obtain the regions where the salient features of M are located and the regions where the non-salient features of M are located, initializing and assigning G using the regions of the salient and non-salient features of M, and generating a second global suppression mask G' by using a Bernoulli distribution to adjust, within G, the proportion of the salient features of M to the non-salient features of M;
Step S3: uniformly covering G' on M to complete the suppression of the salient features on M, thereby obtaining a second feature map M' after global suppression, where M' = M * G' denotes the element-wise product of the corresponding matrix elements of M and G'; comparing M with M' before and after shows that the salient features in the maximum-value regions of M are all suppressed;
s4, dividing the M into a plurality of continuous small-scale local feature graphs M i At said m i Up-generating a local suppression mask K i ';
wherein ,s r represents a line division interval, s, for the M c Representing a column division interval for the M, the s r and sc The values of the two are the same and are all 4; w represents the width of M, H represents the height of M, < >>The number of the local feature graphs after M division is represented, and the specific value is 16;
Step S5: uniformly covering each K_i' on the corresponding m_i to complete the suppression of the salient features on m_i, obtaining M_K', where M_K' denotes the feature map obtained by splicing the locally suppressed local feature maps m_i;
Step S6: adopting the global suppression operation of steps S1-S3 and the local suppression operation of steps S4-S5 to finally generate a third output feature map M'', where M'' is given by expression (1):
in expression (1), ω is an inactivation coefficient, ω ∈ [0, 1), whose value is adjusted manually according to the observed experimental results and the feature expression shown in the visualized class activation map; the experimental results are the retail commodity recognition results of the dynamic intelligent container. Specifically, by analyzing the commodity recognition accuracy observed in the retail commodity recognition experiments of the dynamic intelligent container, the robust feature learning effect of the dynamic intelligent container is found to be optimal when ω is set to 0.5.
In step S1, M ∈ R^(D×H×W), where D denotes the dimension of M (specifically 4096), H denotes the height of M (specifically 16), and W denotes the width of M (specifically 16).
The G is a binary matrix consisting of 0 and 1, and is overlaid on the M.
In step S1, the M is an integer multiple of the G.
In step S1, the dimensions, height and width of G and M are the same.
In step S2, a maximum-value algorithm is used to traverse M and obtain the position L(d, x, y) of the maximum-value point in each dimension of M; the region where the salient features of M are located is represented by a square of side length 3t centered at L(d, x, y), i.e. L(d, x±t, y±t), with t set to 1;
when G is initialized and assigned at the positions of the regions where the salient features of M are located, G takes the value 0, and the pixel units corresponding to the regions where the salient features of M are located are discarded;
when G is initialized and assigned at the positions of the regions where the non-salient features of M are located, G takes the value 1, and the pixel units corresponding to the regions where the non-salient features of M are located are retained;
the process of generating G' is as follows:
G' = G * B_1, B_1 ~ Bernoulli(p_1);
where B_1 denotes a Bernoulli distribution with parameter p_1, p_1 ∈ [0, 1]; specifically, p_1 is set to 0.3 by analyzing the commodity recognition accuracy observed in the retail commodity recognition experiments of the dynamic intelligent container;
B_1 is used to adjust, within G, the proportion of the salient features of M to the non-salient features of M; specifically, the proportion of the salient features of M within M ranges from 0 to 1, the proportion of the non-salient features of M within M ranges from 0 to 1, and the ratio of the salient features of M to the non-salient features of M ranges from 0 to positive infinity.
In step S4, the basis for dividing M is set by jointly considering the size of the data used by the deep neural network for training on the retail goods and the height and width of M.
The process of generating K_i' in step S4 is as follows:
first, a binarization matrix K_i that is independent of m_i and has the same height and width as m_i (specifically, height and width both 4) is created;
second, a maximum-value algorithm is used to traverse and locate the regions where the salient features of m_i are located and the regions where the non-salient features of m_i are located, and K_i is initialized and assigned using the regions of the salient and non-salient features of m_i;
finally, a Bernoulli distribution is used to adjust, within K_i, the proportion of the salient features of m_i to the non-salient features of m_i, generating K_i', where K_i' = K_i * B_2, B_2 ~ Bernoulli(p_2);
where B_2 denotes a Bernoulli distribution with parameter p_2, p_2 ∈ [0, 1]; p_2 is set to 0.7 by analyzing the commodity recognition accuracy observed in the retail commodity recognition experiments of the dynamic intelligent container;
B_2 is used to adjust, within K_i, the proportion of the salient features of m_i to the non-salient features of m_i; specifically, the proportion of the salient features of m_i within K_i ranges from 0 to 1, the proportion of the non-salient features of m_i within K_i ranges from 0 to 1, and the ratio of the salient features of m_i to the non-salient features of m_i ranges from 0 to positive infinity.
In step S4, the process of initializing and assigning K_i is as follows:
when K_i is initialized and assigned at the positions of the regions where the salient features of m_i are located, K_i takes the value 0, indicating that the pixel units corresponding to the regions where the salient features of m_i are located are discarded;
when K_i is initialized and assigned at the positions of the regions where the non-salient features of m_i are located, K_i takes the value 1, indicating that the pixel units corresponding to the regions where the non-salient features of m_i are located are retained.
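The hyper-parameter values reported in this embodiment, collected into one place as a hedged configuration sketch; the key names below are illustrative placeholders, not identifiers from the patent.

```python
# Settings reported in Embodiment 1 (key names are illustrative).
example1_config = {
    "backbone": "ResNet-56",      # deep neural network whose hidden layer yields M
    "D": 4096, "H": 16, "W": 16,  # dimensions of the first output feature map M
    "t": 1,                       # half-width of each salient square region
    "s_r": 4, "s_c": 4,           # row / column division intervals (16 local maps)
    "p1": 0.3,                    # Bernoulli parameter for the global mask G'
    "p2": 0.7,                    # Bernoulli parameter for the local masks K_i'
    "omega": 0.5,                 # inactivation coefficient in expression (1)
}
```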
In Embodiment 1 of the invention, the contributions of the salient and non-salient features reflected in the visualized class activation map to the decision that the network makes on the input image can be mined, reaching non-salient feature regions that advanced robust feature learning methods fail to attend to. Embodiment 1 therefore actively suppresses the expression of salient features in the training stage and forces the network to learn the associations between the hard-to-learn non-salient features and the true target labels, so that the network gains stronger insight into salient and non-salient features at the same time and the operational performance of retail commodity recognition for dynamic intelligent containers is improved. On the basis of the retail commodity recognition network ResNet-56 of Embodiment 1, the invention improves retail commodity recognition accuracy by 3% on average. Specifically, the invention achieves an average commodity recognition accuracy of 96% on 13 types of beverage commodities, including Assam milk tea, Pecam (canned), Fenda (canned), Red Cow (canned), Cola (canned), Kangshifu green tea, Kangshifu iced black tea, Farmer Mountain Spring (500 ml), Snow Bi (canned) and Yibao (500 ml).
Table 1. Recognition results for beverage products on the retail commodity recognition network ResNet-56 of Example 1 and the prior art, respectively
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A robust feature learning method for a dynamic intelligent container, comprising the steps of:
Step S1: extracting the color, contour and texture information of the retail goods in the dynamic intelligent container with a hidden layer of a deep neural network to obtain a first output feature map M of the hidden layer, and generating a first global suppression mask G according to the dimension, height and width of M;
Step S2: traversing M with a maximum-value algorithm to obtain the regions where the salient features of M are located and the regions where the non-salient features of M are located, initializing and assigning G using the regions of the salient and non-salient features of M, and generating a second global suppression mask G' by using a Bernoulli distribution to adjust, within G, the proportion of the salient features of M to the non-salient features of M;
Step S3: uniformly covering G' on M to complete the suppression of the salient features on M, thereby obtaining a second feature map M' after global suppression, where M' = M * G' denotes the element-wise product of the corresponding matrix elements of M and G';
s4, dividing the M into a plurality of continuous small-scale local feature graphs M i At said m i Up-generating a local suppression mask K i ';
wherein ,s r represents a line division interval, s, for the M c Represents a column division interval for the M, W represents a width of the M, H represents a height of the M, +.>Representing the number of the local feature graphs after M division;
Step S5: uniformly covering each K_i' on the corresponding m_i to complete the suppression of the salient features on m_i, obtaining M_K', where M_K' denotes the feature map obtained by splicing the locally suppressed local feature maps m_i;
Step S6: adopting the global suppression operation of steps S1-S3 and the local suppression operation of steps S4-S5 to finally generate a third output feature map M'', where M'' is given by expression (1):
in expression (1), ω is an inactivation coefficient, ω ∈ [0, 1), whose value is adjusted manually according to the observed experimental results and the feature expression shown in the visualized class activation map; the experimental results are the retail commodity recognition results of the dynamic intelligent container.
2. The robust feature learning method for a dynamic intelligent container according to claim 1, characterized in that in step S1, M ∈ R^(D×H×W), where D denotes the dimension of M, H denotes the height of M, and W denotes the width of M.
3. The robust feature learning method for dynamic intelligent containers according to claim 2, wherein G is a binary matrix consisting of 0 and 1 and is overlaid on the M.
4. A robust feature learning method for dynamic smart containers according to claim 3, characterized in that in step S1, the M is an integer multiple of the G.
5. The robust feature learning method for dynamic intelligent containers according to claim 4, characterized in that in step S1, the dimensions, height and width of the G and M are the same.
6. The robust feature learning method for a dynamic intelligent container according to claim 5, characterized in that in step S2, a maximum-value algorithm is used to traverse M and obtain the position L(d, x, y) of the maximum-value point in each dimension of M; the region where the salient features of M are located is represented by a square of side length 3t centered at L(d, x, y), i.e. L(d, x±t, y±t), where t may take any real value;
when G is initialized and assigned at the positions of the regions where the salient features of M are located, G takes the value 0, and the pixel units corresponding to the regions where the salient features of M are located are discarded;
when G is initialized and assigned at the positions of the regions where the non-salient features of M are located, G takes the value 1, and the pixel units corresponding to the regions where the non-salient features of M are located are retained;
the process of generating G' is as follows:
G' = G * B_1, B_1 ~ Bernoulli(p_1);
where B_1 denotes a Bernoulli distribution with parameter p_1, p_1 ∈ [0, 1];
B_1 is used to adjust, within G, the proportion of the salient features of M to the non-salient features of M; specifically, the proportion of the salient features of M within M ranges from 0 to 1, the proportion of the non-salient features of M within M ranges from 0 to 1, and the ratio of the salient features of M to the non-salient features of M ranges from 0 to positive infinity.
7. The robust feature learning method for a dynamic intelligent container according to claim 1, characterized in that in step S4, the basis for dividing M is set by jointly considering the size of the data used by the deep neural network for training on the retail goods and the height and width of M.
8. The robust feature learning method for a dynamic intelligent container according to claim 1, characterized in that in step S4, the values of s_r and s_c are the same.
9. The robust feature learning method for a dynamic intelligent container according to any one of claims 1-8, characterized in that the process of generating K_i' in step S4 is as follows:
first, a binarization matrix K_i that is independent of m_i and has the same height and width as m_i is created;
second, a maximum-value algorithm is used to traverse and locate the regions where the salient features of m_i are located and the regions where the non-salient features of m_i are located, and K_i is initialized and assigned using the regions of the salient and non-salient features of m_i;
finally, a Bernoulli distribution is used to adjust, within K_i, the proportion of the salient features of m_i to the non-salient features of m_i, generating K_i', where K_i' = K_i * B_2, B_2 ~ Bernoulli(p_2);
where B_2 denotes a Bernoulli distribution with parameter p_2, p_2 ∈ [0, 1];
B_2 is used to adjust, within K_i, the proportion of the salient features of m_i to the non-salient features of m_i; specifically, the proportion of the salient features of m_i within K_i ranges from 0 to 1, the proportion of the non-salient features of m_i within K_i ranges from 0 to 1, and the ratio of the salient features of m_i to the non-salient features of m_i ranges from 0 to positive infinity.
10. The robust feature learning method for a dynamic intelligent container according to claim 9, characterized in that in step S4, the process of initializing and assigning K_i is as follows:
when K_i is initialized and assigned at the positions of the regions where the salient features of m_i are located, K_i takes the value 0, indicating that the pixel units corresponding to the regions where the salient features of m_i are located are discarded;
when K_i is initialized and assigned at the positions of the regions where the non-salient features of m_i are located, K_i takes the value 1, indicating that the pixel units corresponding to the regions where the non-salient features of m_i are located are retained.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310109875.4A CN116342985B (en) | 2023-02-14 | 2023-02-14 | Robust feature learning method for dynamic intelligent container |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116342985A (en) | 2023-06-27
CN116342985B (en) | 2023-09-12
Family
ID=86893839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310109875.4A Active CN116342985B (en) | 2023-02-14 | 2023-02-14 | Robust feature learning method for dynamic intelligent container |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116342985B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210133479A1 (en) * | 2019-11-05 | 2021-05-06 | Beijing University Of Posts And Telecommunications | Fine-grained image recognition method, electronic device and storage medium |
CN113435531A (en) * | 2021-07-07 | 2021-09-24 | 中国人民解放军国防科技大学 | Zero sample image classification method and system, electronic equipment and storage medium |
WO2021244079A1 (en) * | 2020-06-02 | 2021-12-09 | 苏州科技大学 | Method for detecting image target in smart home environment |
CN114005096A (en) * | 2021-11-09 | 2022-02-01 | 河北工业大学 | Vehicle weight recognition method based on feature enhancement |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210133479A1 (en) * | 2019-11-05 | 2021-05-06 | Beijing University Of Posts And Telecommunications | Fine-grained image recognition method, electronic device and storage medium |
WO2021244079A1 (en) * | 2020-06-02 | 2021-12-09 | 苏州科技大学 | Method for detecting image target in smart home environment |
CN113435531A (en) * | 2021-07-07 | 2021-09-24 | 中国人民解放军国防科技大学 | Zero sample image classification method and system, electronic equipment and storage medium |
CN114005096A (en) * | 2021-11-09 | 2022-02-01 | 河北工业大学 | Vehicle weight recognition method based on feature enhancement |
Non-Patent Citations (2)
Title |
---|
PINGPING ZHANG ET AL.: "Learning Uncertain Convolutional Features for Accurate Saliency Detection", 2017 IEEE International Conference on Computer Vision (ICCV), pages 212 - 221 *
HE LIANGJIE ET AL.: "Saliency Detection Based on Local Contrast and Global Rarity" (基于局部对比和全局稀有度的显著性检测), Application Research of Computers (《计算机应用研究》), pages 2832 - 2840 *
Also Published As
Publication number | Publication date |
---|---|
CN116342985B (en) | 2023-09-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Building-a-nets: Robust building extraction from high-resolution remote sensing images with adversarial networks | |
Moore et al. | “Lattice Cut”-Constructing superpixels using layer constraints | |
CN110717896B (en) | Plate strip steel surface defect detection method based on significance tag information propagation model | |
Zhuang et al. | Computer vision-based localisation of picking points for automatic litchi harvesting applications towards natural scenarios | |
CN102902956B (en) | A kind of ground visible cloud image identifying processing method | |
Grady et al. | Fast approximate random walker segmentation using eigenvector precomputation | |
CN109858487B (en) | Weak supervision semantic segmentation method based on watershed algorithm and image category label | |
CN106780485A (en) | SAR image change detection based on super-pixel segmentation and feature learning | |
CN106683046A (en) | Real-time image splicing method for police unmanned aerial vehicle investigation and evidence obtaining | |
CN107133558B (en) | Infrared pedestrian significance detection method based on probability propagation | |
US11531994B2 (en) | Electronic detection of products and arrangement of products in a display structure, electronic detection of objects and arrangement of objects on and around the display structure, electronic detection of conditions of and around the display structure, and electronic scoring of the detected product and object arrangements and of the detected conditions | |
CN109685806B (en) | Image significance detection method and device | |
Cui et al. | Single image dehazing by latent region‐segmentation based transmission estimation and weighted L1‐norm regularisation | |
CN112733614B (en) | Pest image detection method with similar size enhanced identification | |
CN115272652A (en) | Dense object image detection method based on multiple regression and adaptive focus loss | |
Ding et al. | Semantic recognition of workpiece using computer vision for shape feature extraction and classification based on learning databases | |
Nguyen-Quynh et al. | Image colorization using the global scene-context style and pixel-wise semantic segmentation | |
CN110472639B (en) | Target extraction method based on significance prior information | |
CN105160666B (en) | SAR image change detection based on Non-Stationary Analysis and condition random field | |
Yu et al. | Mean shift based clustering of neutrosophic domain for unsupervised constructions detection | |
CN116342985B (en) | Robust feature learning method for dynamic intelligent container | |
CN113705579A (en) | Automatic image annotation method driven by visual saliency | |
CN113591850A (en) | Two-stage trademark detection method based on computer vision robustness target detection | |
CN109523015A (en) | Image processing method in a kind of neural network | |
CN113343900A (en) | Combined nuclear remote sensing image target detection method based on combination of CNN and superpixel |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |