CN111126264A

CN111126264A - Image processing method, device, equipment and storage medium

Info

Publication number: CN111126264A
Application number: CN201911344600.9A
Authority: CN
Inventors: 蔡丁丁
Original assignee: Beijing Missfresh Ecommerce Co Ltd
Current assignee: Beijing Missfresh Ecommerce Co Ltd
Priority date: 2019-12-24
Filing date: 2019-12-24
Publication date: 2020-05-08

Abstract

The application discloses an image processing method, an image processing device, image processing equipment and a storage medium, and belongs to the technical field of image processing. The method comprises the following steps: acquiring at least one image pair corresponding to at least one goods storage area of intelligent vending equipment, wherein any image pair is obtained by respectively carrying out image acquisition on the same goods storage area at a first moment and a second moment, the first moment is the moment before a vending door of the intelligent vending equipment is opened at this time, and the second moment is the moment after the vending door is closed at this time; determining a target image pair in the at least one image pair, wherein the difference degree between two images included in the target image pair is larger than a target threshold value; and identifying the two images included in the target image pair to obtain attribute information of the removed goods in the intelligent vending equipment. The method and the device reduce the image recognition amount, shorten the time of the whole image processing process and improve the image processing efficiency.

Description

Image processing method, device, equipment and storage medium

Technical Field

The present application relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, an image processing device, and a storage medium.

Background

With the popularization of mobile payment technology and the maturity of image recognition technology, the large-scale landing application of intelligent unmanned sales counter is realized. The user only needs to scan the code to open the door, take the commodity, and close the door, and the unmanned sales counter of intelligence settles accounts the commodity that the user took away, can realize 24 hours unmanned retail, provides the noninductive shopping experience for the user.

In the related technology, the intelligent unmanned sales counter utilizes an image recognition technology to recognize commodities in images shot by cameras arranged on each layer of shelves before and after the door is opened, and compares the difference of the commodities before and after the door is opened to determine the commodities taken away by a user, so as to automatically settle accounts and deduct money for the commodities.

The technology identifies the images shot by the cameras on all the shelves to determine the removed commodities, and the process takes a long time and is low in efficiency.

Disclosure of Invention

The embodiment of the application provides an image processing method, an image processing device, image processing equipment and a storage medium, and can solve the problem of low efficiency of related technologies. The technical scheme is as follows:

in a first aspect, an image processing method is provided, including:

acquiring at least one image pair corresponding to at least one goods storage area of intelligent vending equipment, wherein any image pair is obtained by respectively carrying out image acquisition on the same goods storage area at a first moment and a second moment, the first moment is the moment before a vending door of the intelligent vending equipment is opened at this time, and the second moment is the moment after the vending door is closed at this time;

determining a target image pair in the at least one image pair, wherein the difference degree between two images included in the target image pair is larger than a target threshold value;

and identifying the two images included in the target image pair to obtain attribute information of the removed goods in the intelligent vending equipment.

In one possible implementation, the determining a target image pair of the at least one image pair includes:

and determining the target image pair in the at least one image pair and a difference region of the target image pair, wherein the difference region is a region with difference in two images included in the target image pair.

In one possible implementation, the attribute information of any good includes a name and a location area of the any good,

the identifying the two images included in the target image pair to obtain the attribute information of the removed goods in the intelligent vending equipment comprises the following steps:

respectively identifying the two images to obtain attribute information of goods respectively contained in the two images;

determining attribute information of at least one target cargo according to the attribute information of the cargos contained in the two images respectively, wherein the attribute information of the two images and the attribute information of the at least one target cargo comprise different quantities of cargos with the same name;

and according to the attribute information of the at least one target cargo and the difference area, taking the attribute information of the target cargo with the position area included by the attribute information in the difference area as the attribute information of the removed cargo.

respectively identifying goods in the difference areas in the two images to obtain attribute information of the goods respectively contained in the difference areas in the two images;

determining attribute information of at least one target cargo according to the attribute information of the cargos contained in the difference areas in the two images, wherein the attribute information of the difference areas in the two images and the attribute information of the at least one target cargo comprise different quantities of cargos with the same name;

and taking the attribute information of the at least one target cargo as the attribute information of the removed cargo.

inputting the at least one image pair into a target model, and outputting a probability value of each of the at least one image pair, wherein the probability value of any image pair is used for representing the difference degree between two images included in any image pair;

and determining the target image pair according to the respective probability value of the at least one image pair, wherein the probability value of the target image pair is larger than a target threshold value.

In one possible implementation, the inputting the at least one image pair into a target model, outputting respective probability values of the at least one image pair, includes:

inputting the at least one image pair into the target model, outputting a probability value for each of the at least one image pair and a difference region for each of the at least one image pair.

and inputting any image pair into a target model for any image pair in the at least one image pair, respectively extracting features of two images included in any image pair by the target model, fusing the two extracted feature matrixes, calculating a difference feature matrix obtained by fusion processing, and outputting a probability value of any image pair.

In one possible implementation, the obtaining process of the target model includes:

acquiring a training data set and a test data set, the training data set and the test data set comprising at least one first image pair and at least one second image pair, the first image pair comprising a first label value indicating that a degree of difference between the two images is greater than a target threshold, the second image pair comprising a second label value indicating that the degree of difference between the two images is less than or equal to the target threshold;

training an initial model based on the training data set to obtain a first model;

testing the first model based on the test data set to obtain the accuracy of the first model;

when the accuracy of the first model is greater than a first threshold, treating the first model as the target model.

In one possible implementation, the training an initial model based on the training data set to obtain a first model includes:

inputting at least one image pair in the training data set into the initial model, and outputting a probability value of each of the at least one image pair in the training data set;

obtaining a loss value according to the probability value of at least one image pair in the training data set, the label value of at least one image pair in the training data set and a loss function, wherein the loss value is used for representing the difference between the probability value output by the initial model and the label value of the input image pair;

and when the loss value is larger than a second threshold value, adjusting the parameters of the initial model until the loss value meets a target condition, and taking the current obtained model as the first model.

In one possible implementation, the testing the first model based on the test data set to obtain the accuracy of the first model includes:

inputting at least one image pair in the test data set into the first model, outputting a probability value for each of the at least one image pair in the test data set;

determining an image pair of which the difference between the probability value and the label value in the at least one image pair in the test data set is smaller than a third threshold value according to the respective probability value of the at least one image pair in the test data set and the respective label value of the at least one image in the test data set;

determining a ratio of the determined image pairs to a number of at least one image in the test data set as an accuracy of the first model.

In a second aspect, there is provided an image processing apparatus comprising:

the intelligent vending equipment comprises an acquisition module, a storage module and a control module, wherein the acquisition module is used for acquiring at least one image pair corresponding to at least one goods storage area of the intelligent vending equipment, and any image pair is obtained by respectively acquiring images of the same goods storage area at a first moment and a second moment, the first moment is the moment before a vending door of the intelligent vending equipment is opened at this time, and the second moment is the moment after the vending door is closed at this time;

the determining module is used for determining a target image pair in the at least one image pair, and the difference degree between two images included in the target image pair is greater than a target threshold value;

and the identification module is used for identifying the two images included in the target image pair to obtain the attribute information of the removed goods in the intelligent vending equipment.

In one possible implementation, the determining module is configured to:

In one possible implementation, the attribute information of any cargo includes a name and a location area of the any cargo, and the identification module is configured to:

In one possible implementation, the determining module is configured to:

In one possible implementation manner, the obtaining module is further configured to:

In one possible implementation, the obtaining module is configured to:

In a third aspect, an electronic device is provided, which includes one or more processors and one or more memories, and at least one program code is stored in the one or more memories, and the at least one program code is loaded and executed by the one or more processors to implement the method steps of any one of the implementations of the first aspect.

In a fourth aspect, a computer-readable storage medium is provided, in which at least one program code is stored, which is loaded and executed by a processor to implement the method steps of any of the implementations of the first aspect.

The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:

the image pairs obtained by image acquisition of each goods storage area before and after the intelligent goods selling equipment is opened are obtained, firstly, the image pairs are obtained, the target image pairs with the difference degree larger than the target threshold value are determined, namely, the changed image pairs are determined, only the target image pairs are identified, the attribute information of the goods removed from the intelligent goods selling equipment can be obtained, the image pairs which are not changed are not required to be identified, the image identification amount is reduced, the time of the whole image processing process is shortened, and the image processing efficiency is improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.

Fig. 1 is a schematic diagram of an implementation environment of an image processing method according to an embodiment of the present application;

fig. 2 is a flowchart of an image processing method provided in an embodiment of the present application;

FIG. 3 is a flow chart of processing an image by a model according to an embodiment of the present application;

FIG. 4 is a schematic diagram of an image recognition effect provided by an embodiment of the present application;

FIG. 5 is a schematic diagram of an image recognition effect provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of an image recognition effect provided by an embodiment of the present application;

fig. 7 is a schematic structural diagram of an image processing apparatus 700 according to an embodiment of the present application;

fig. 8 is a schematic structural diagram of an electronic device 800 according to an embodiment of the present application.

Detailed Description

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

Fig. 1 is a schematic diagram of an implementation environment of an image processing method according to an embodiment of the present application. Referring to FIG. 1, the fulfillment environment may include a smart vending apparatus 101 and a server 102.

The intelligent vending apparatus 101 is used for providing goods for sale to a user, and the intelligent vending apparatus 101 may include a plurality of goods storage areas, and each goods storage area may store the same kind of goods or different kinds of goods. For example, the intelligent vending apparatus 101 may be an intelligent unmanned vending cabinet, which may include a plurality of shelves, each serving as a storage area. The server 102 is used to provide services to the smart vending apparatus 101, such as settlement and deduction of removed items in the smart vending apparatus 101. The server 102 may be a server or a cluster of servers. The intelligent vending apparatus 101 and the server 102 may establish a communication connection through a wired network or a wireless network.

Fig. 2 is a flowchart of an image processing method according to an embodiment of the present application. The method is executed by an electronic device, which may be a smart vending apparatus, or an apparatus other than the smart vending apparatus, such as a server, and referring to fig. 2, the method includes:

201. the electronic equipment acquires at least one image pair corresponding to at least one goods storage area of the intelligent vending equipment, and any image pair is obtained by respectively carrying out image acquisition on the same goods storage area at a first moment and a second moment.

The first moment is the moment before the selling door of the intelligent selling equipment is opened at this time, and the second moment is the moment after the selling door is closed at this time. This intelligence equipment of selling goods can be unmanned sales counter of intelligence, and this storage goods region can be the sales rack in the cabinet, and every storage goods region can be provided with corresponding camera for carry out image acquisition to this storage goods region, for example, the camera can set up at the regional top of storage goods.

The outside of the intelligent vending equipment can be pasted with a graphic identification code, such as a two-dimensional code, and the graphic identification code can be generated based on identification information of the intelligent vending equipment, wherein the identification information is used for uniquely identifying the intelligent vending equipment. The user can be used at the payment at terminal and carry out the scanning operation, the terminal can be based on the figure identification code scanning function that this payment application provided, scan this figure identification code on the intelligent vending equipment, obtain the identification information of the intelligent vending equipment that this figure identification code bore, then the terminal can be sent this intelligent vending equipment's identification information for the server, the server can send the instruction of opening the door to intelligent vending equipment, intelligent vending equipment is at the moment of receiving this instruction of opening the door, also be this first moment, can carry out image acquisition to the storage goods region that corresponds separately through the camera that every storage goods region corresponds, obtain an image that every storage goods region corresponds separately. The intelligent vending apparatus may then open the vending door. After goods are taken away from a certain goods storage area of the intelligent goods selling equipment by the user, the goods selling door of the intelligent goods selling equipment can be closed, the intelligent goods selling equipment can carry out image acquisition on the goods storage area corresponding to the intelligent goods selling equipment through the camera corresponding to each goods storage area at the moment when the goods selling door is successfully closed, namely the second moment, and an image corresponding to each goods storage area is obtained. For any goods storage area, the camera corresponding to the goods storage area respectively collects images of the goods storage area at a first moment and a second moment, so that two images, namely one image pair, can be obtained. For the case that the electronic device is the intelligent vending apparatus, the intelligent vending apparatus may directly perform the subsequent steps 202 to 204 on the acquired at least one image pair. For the case that the electronic device is a server, the intelligent vending apparatus may send the acquired at least one image pair to the server, and the server performs the following steps 202 to 204 on the at least one image pair.

202. The electronic device inputs the at least one image pair into the target model, and outputs a respective probability value for the at least one image pair, the probability value for any image pair being indicative of a degree of difference between two images included in the any image pair.

Wherein, the target model is used for outputting corresponding probability value according to the input image pair.

The acquisition process of the target model may include two processes, a training process and a testing process (verification process), respectively. The training process can utilize an artificial neural network to construct a deep learning algorithm, namely, an initial model, the model is trained by utilizing a large amount of collected data, parameters in the model are automatically learned and adjusted, and when the model can achieve an expected effect under a certain precision, the training is stopped. The testing process may test the performance of the evaluation model and if the requirements are met, take it as the target model.

In one possible implementation, the target model obtaining process includes the following steps one to four:

step one, a training data set and a test data set are obtained, wherein the training data set and the test data set comprise at least one first image pair and at least one second image pair.

Wherein the first image pair comprises a first label value indicating that the degree of difference between the two images is greater than the target threshold, and the second image pair comprises a second label value indicating that the degree of difference between the two images is less than or equal to the target threshold.

The electronic device may obtain the training data set and the test data set from a local storage, or may obtain the training data set and the test data set from other devices. For the source of the data set, the training data set and the test data set may be collected and labeled manually and then stored on the device. For example, a tester can simulate normal user purchasing operation by using off-line intelligent vending equipment, capture images through a camera in each storage area of the intelligent vending equipment to collect image pairs before and after opening and closing a door, label the image pairs with the difference degree larger than a threshold value (the image pairs with change) as one type, mark a first label value, mark the image pairs with the difference degree smaller than or equal to the threshold value (the image pairs without change) as another type, mark a second label value, and randomly select and combine the types of goods in each storage area of the intelligent vending equipment to simulate the combined placement of various goods in the actual operation process. The labeled data can be divided into a training data set and a testing data set, wherein each data set is divided into two types, namely a first image pair (invariant image pair) and a second image pair (variant image pair), the training data set is used for training a model, and the testing data set is used for testing and verifying the effect of the trained model.

And secondly, training the initial model based on the training data set to obtain a first model.

In one possible implementation, the training the initial model based on the training data set to obtain a first model includes: inputting at least one image pair in the training data set into the initial model, and outputting a probability value of each of the at least one image pair in the training data set; obtaining a loss value according to the probability value of at least one image pair in the training data set, the label value of at least one image pair in the training data set and a loss function, wherein the loss value is used for representing the difference between the probability value output by the initial model and the label value of the input image pair; and when the loss value is larger than a second threshold value, adjusting the parameters of the initial model until the loss value meets a target condition, and taking the currently obtained model as the first model.

In order to avoid increasing extra time consumption in the whole image recognition process as much as possible, an initial model in the training process is designed into a lightweight neural network structure, the model is input into two images, and the model is output into a probability value, and the probability value represents the difference degree (change degree) of the two images. Referring to fig. 3, fig. 3 is a flowchart of processing an image by a model according to an embodiment of the present application, and as shown in fig. 3, the design and construction of the model may be divided into three stages, i.e., feature extraction, feature fusion, and probability function calculation. The feature extraction part may adopt an inclusion _ V3 network, the inclusion _ V3 network may be a network under a Pytorch deep learning framework, and the feature extraction part may use an output result of a middle layer in the network as a feature matrix of the image, such as a depth feature matrix. Besides selecting the Pythrch as the learning framework in the technical scheme, other deep learning frameworks such as Tensorflow, Caffe, MXnet and the like can be used. The network for extracting the features of the image can also select other basic networks such as Resnet series, VGGNet series, DenseNet series and the like to extract the features of the image besides the inclusion _ V3 network. For selecting which layer of the output in the basic network as the feature matrix of the image, in addition to selecting the output result of a certain intermediate layer in the inclusion _ v3, the output of other intermediate layers can be selected as the feature matrix of the image, such as any intermediate layer from mixed5a to mixed5d or mixed6a to mixed6 e.

The feature fusion part can perform feature fusion calculation on feature matrixes of the two images at a depth feature level to obtain a difference feature matrix, and the difference feature matrix can represent the difference condition of the two images at an abstract feature level. And the probability function calculation part calculates a probability value and a difference area through a probability function based on the difference characteristic matrix obtained in the last step through the operation of a series of convolution layers and full connection layers, wherein the difference area is an area with difference in the two images.

Each element in the difference feature matrix can represent the difference condition of the corresponding position in the two images. The difference region may also be referred to as a change region, and the output form of the difference region may be a matrix, where each element in the matrix may indicate whether there is a difference (whether there is a change) in the corresponding position of the element in the two images, for example, if an element in the matrix is 0, it indicates that there is no difference (there is no change) in the corresponding position of the element in the two images, and if the element is 1, it indicates that there is a difference (there is a change) in the corresponding position of the element in the two images, so that the difference region (change region) of the two images may be determined.

As shown in fig. 3, taking an image pair including an image a and an image B as an example, the image a and the image B may be input into the target model, feature extraction modules of the target model respectively perform feature extraction on the image a and the image B to obtain a feature matrix a of the image a and a feature matrix B of the image B, a feature fusion module of the target model performs fusion processing on the feature matrix a and the feature matrix B to obtain a difference feature matrix, and a probability function calculation module of the target model calculates the difference feature matrix to obtain a probability value and a difference region of the image pair.

After the initial model is constructed, the collected data is needed to train and learn the parameters inside the model, and the model parameters are automatically adjusted until the model finally achieves the expected effect. Specifically, in the training process, for at least one image pair in the training data set, after the at least one image pair is input into the initial model, the probability value of each of the at least one image pair may be obtained through the process shown in fig. 3. For each image pair in the at least one image pair, the difference between the probability value and the label value of the image pair can be measured by a loss function, and the larger the loss value calculated by the loss function is, the more the result output by the model deviates from the real situation, the model needs to be optimized, and the parameters of the model are adjusted, so that the result output by the model is consistent with the real situation as much as possible. Specifically, for at least one image pair, a difference between the probability value and the label value of the image pair may be calculated, the obtained at least one difference is substituted into the loss function to calculate, so as to obtain a loss value, and if the loss value is greater than a certain threshold, the parameters of the initial model may be adjusted. Specifically, based on the loss value, parameters inside the model are updated and adjusted by using an Adam optimization algorithm, so that the loss value is as small as possible, the initial learning rate of Adam is set to lr, then the learning rate is gradually reduced every several rounds until the loss value cannot be further reduced, the training process is stopped, and the currently obtained model is used as the first model obtained by training. The first model can be used to identify whether the two images are the same or not, and if there is a difference, the difference region is located, that is, if there is a change, the change region is located.

For example, in addition to the model obtained when the training process is stopped being used as the first model, at least one intermediate model obtained before the training is stopped may be used as the first model.

And thirdly, testing the first model based on the test data set to obtain the accuracy of the first model.

In one possible implementation, the testing the first model based on the test data set to obtain the accuracy of the first model includes: inputting at least one image pair in the test data set into the first model, and outputting a probability value of each of the at least one image pair in the test data set; determining an image pair of which the difference between the probability value and the label value in the at least one image pair in the test data set is smaller than a third threshold value according to the respective probability value of the at least one image pair in the test data set and the respective label value of the at least one image in the test data set; the accuracy of the first model is determined as a ratio of the determined number of image pairs to at least one image in the test data set.

After the training is finished, a first model is obtained, the first model is tested by using a test data set, for each image pair in the at least one image pair, the first model can output a probability value p, and the p value represents the difference degree of the image pair, namely, the larger the p value is, the larger the difference degree of the two images is. By setting a threshold T, if p > T, the difference between the two images included in the image pair is greater than the threshold, that is, the image pair is changed, and if p < T, the difference between the two images included in the image pair is less than the threshold, that is, the image pair is not changed. Different thresholds T may affect the final classification effect, and the appropriate threshold T needs to be adjusted according to specific service requirements. Optionally, for each image pair, the first model may output a difference region (change region Mask) in addition to a probability value p, where the difference region represents a region with a difference degree greater than a threshold value, that is, a region with a change.

For at least one of the image pairs, a difference between the probability value and the label value for each image pair may be calculated, thereby determining the respective image pair for which the difference is less than a threshold value, and then calculating a ratio of the number of the respective image pair to the at least one image pair, the ratio being taken as the accuracy of the first model.

And step four, when the accuracy of the first model is greater than a first threshold value, taking the first model as the target model.

For the case that the first model is one, if the accuracy of the first model is greater than the first threshold value through testing, the first model can be used as a target model for image recognition.

For the case that the number of the first models is multiple, the accuracy of each first model can be tested respectively, and the model with the highest accuracy is selected as the target model finally used for image recognition.

The target model obtained through the first step to the fourth step achieves preset precision, the target model can be integrated and applied to the intelligent vending system, and the image pair is identified through the target model.

In one possible implementation, step 202 may include: and inputting any image pair in the at least one image pair into a target model, respectively extracting the features of two images included in the any image pair by the target model, fusing the two extracted feature matrixes, calculating the difference feature matrix obtained by the fusion processing, and outputting the probability value of the any image pair.

The processing of the image pair by the target model is the same as the processing shown in fig. 3, and is not described here again.

In one possible implementation, the inputting the at least one image pair into the target model, outputting respective probability values for the at least one image pair, includes: the at least one image pair is input into the target model, and a respective probability value of the at least one image pair and a respective difference region of the at least one image pair are output.

Similarly to the process shown in fig. 3, for any image pair input to the target model, in addition to the probability value of the image pair, the difference region of the image pair may be output. Referring to fig. 4 to 6, fig. 4 to 6 are schematic diagrams of an image identification effect provided by an embodiment of the present application, and as shown in fig. 4 to 6, the target model may identify a difference region of each image pair, such as the difference region 401 in fig. 4, the difference region 501 in fig. 5, and the difference region 601 in fig. 6.

203. The electronic device determines a target image pair according to the respective probability value of the at least one image pair, wherein the probability value of the target image pair is greater than a target threshold value.

The electronic device may select a target image pair having a probability value greater than a target threshold from the at least one image pair according to the respective probability value of the at least one image pair. Since the probability value is used to indicate the difference between the two images included in the image pair, and the probability value of the target image pair is greater than the target threshold, it indicates that the difference between the two images included in the target image pair is greater than the target threshold.

For any image pair, the electronic device may directly filter out the image pair if the probability value for the image pair is less than or equal to the target threshold, indicating that the image pair includes two images that differ by less than or equal to the target threshold, and the image pair is considered to be an invariant image pair. If the probability value of the image pair is greater than the target threshold, indicating that the difference between the two images included in the image pair is greater than the target threshold, the electronic device may regard the image pair as a changed image pair, and perform the subsequent step 204.

Optionally, a difference region of each of the at least one image pair may be further input for the target model in step 202, and accordingly, in step 203, the electronic device may determine the target image pair and the difference region of the target image pair according to the respective probability value of each of the at least one image pair and the difference region of each of the at least one image pair, where the difference region is a region where a difference exists between two images included in the target image pair.

It should be noted that

steps

202 and 203 are one possible implementation of determining the target image pair in the at least one image pair. By filtering the image pairs which are not changed and only executing the subsequent identification steps on the image pairs which are changed, the data transmission quantity and the calculated quantity can be reduced, the identification time is shortened, the settlement efficiency is improved, and the calculation cost is reduced.

204. The electronic equipment identifies the two images included in the target image pair to obtain attribute information of the removed goods in the intelligent vending equipment.

The attribute information of any cargo may include a name and a location area of the any cargo, and the name may be used to indicate a category of the any cargo.

For the situation that the electronic device determines the target image pair, after the electronic device determines the target image pair, the electronic device may respectively identify the two images by using an image identification technology to obtain attribute information of goods respectively contained in the two images, and by comparing the attribute information of the goods respectively contained in the two images, attribute information of the goods removed from the intelligent vending equipment may be determined.

For the case that the electronic device can also determine the difference region of the target image pair, the specific implementation manner of this step 204 may include the following two types:

in one possible implementation, this step 204 may include the following steps a1 to a 3:

step a1, the two images are respectively identified to obtain the attribute information of the goods respectively contained in the two images.

The electronic device may use an image recognition technique to respectively recognize the articles in the two images, so as to obtain the attribute information of the goods respectively contained in each image.

Step a2, determining the attribute information of at least one target cargo according to the attribute information of the cargo contained in the two images respectively, wherein the attribute information in the two images is different from the attribute information of the at least one target cargo in the number of cargos with the same name.

The electronic device may count the number of the goods whose attribute information includes the name in the two images for each name according to the attribute information of the goods included in the two images, respectively, determine whether the number changes, and if the number changes, take the attribute information of at least one goods of the name as the attribute information of the at least one target goods.

Step a3, according to the attribute information of the at least one target cargo and the difference area, using the attribute information of the target cargo whose position area included in the attribute information is located in the difference area as the attribute information of the removed cargo.

The electronic device may filter at least one target good according to the difference region, specifically, for each target good, the electronic device may compare a location region included in the attribute information of the target good with the difference region, and if the location region is located in the difference region, take the attribute information of the target good as the attribute information of the removed good.

The realization mode is that the articles in the two images are identified firstly, and then the articles are filtered according to the difference area, so that the error identification of the articles can be reduced, and the accuracy of the identification result is ensured.

In another possible implementation, this step 204 may include the following steps b 1-b 3:

and b1, identifying the goods in the difference area in the two images respectively to obtain the attribute information of the goods contained in the difference area in the two images respectively.

The electronic device may use an image recognition technique to respectively recognize the articles in the difference regions in the two images, so as to obtain the attribute information of the goods respectively contained in the difference regions in each image.

Step b2, determining the attribute information of at least one target cargo according to the attribute information of the cargo contained in the difference area in the two images, wherein the number of the cargo with the same name in the attribute information of the difference area in the two images is different from the number of the cargo with the same name in the attribute information of the at least one target cargo.

The electronic device may count, for each name, the number of the goods whose attribute information includes the name in the difference region in the two images, respectively, according to the attribute information of the goods included in the difference region in the two images, determine whether the number changes, and if the number changes, take the attribute information of at least one good of the name as the attribute information of the at least one target good.

And b3, taking the attribute information of the at least one target cargo as the attribute information of the removed cargo.

Since the attribute information of the at least one target good is determined based on the difference region, the attribute information of the at least one target good can be directly used as the attribute information of the removed good.

The realization mode is to identify the articles in the difference area of the two images, so that the identification range can be reduced, the time required by identification can be reduced, and the accuracy of the identification result can be ensured.

After the electronic equipment obtains the attribute information of the removed goods in the intelligent vending equipment, automatic settlement and deduction can be carried out according to the attribute information of the goods. For example, after the user scans the graphic identification code of the smart vending device using the terminal, the user account information may be sent to the server, and the server may determine the cost information corresponding to the goods according to the attribute information of the removed goods, and deduct the corresponding cost from the user account.

Above-mentioned technical scheme is through detecting that there is the target image pair of difference in each image pair that intelligent vending equipment gathered around opening the door, only needs to open and close the article in the storage goods region that changes around the door and discern, and need not discernment to the storage goods region that does not change, so can settle out the article that the user bought in a shorter time sooner, can guarantee the discernment precision better to and improve recognition efficiency. Taking the intelligent vending equipment as an intelligent unmanned vending cabinet and the goods storage area as a vending rack as an example, for example, a single vending rack can be divided into four layers of vending racks, each layer is provided with a camera, only the goods on a certain layer of the vending rack generally change in a normal shopping process, and the other three layers do not change at all, so that the changed goods on the layer only need to be identified, and three-quarters of computing resources and time can be saved compared with the identification of the goods on all four layers of the vending racks. Furthermore, for two images before and after opening and closing the door at the same layer, the purchasing operation of the user often only affects a small area of the shelf at the same layer, and articles in other most areas of the shelf may not be changed at all.

This technical scheme is through utilizing light-weight convolution neural network, provides an effective image filtering mechanism to intelligent vending system, under increasing a small amount of calculation loads, through automatic filtration the pair of images that does not change before and after the switch door, only to taking place the pair of images that change and carry out further processing, can reduce calculation load and data transmission volume on the whole for article discernment and settlement speed, also can improve the degree of accuracy of article discernment simultaneously, reduce the misidentification order, provide better user's purchase experience.

The method provided by the embodiment of the application comprises the steps of acquiring image pairs obtained by image acquisition of each goods storage area before and after the intelligent vending equipment opens the door, determining the target image pairs with the difference degree larger than the target threshold value from the image pairs, namely determining the changed image pairs, and only identifying the target image pairs, so that the attribute information of the removed goods in the intelligent vending equipment can be obtained.

Fig. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application. Referring to fig. 7, the apparatus includes:

the acquisition module 701 is configured to acquire at least one image pair corresponding to at least one goods storage area of the intelligent vending apparatus, where any image pair is obtained by respectively performing image acquisition on the same goods storage area at a first time and a second time, the first time is a time before a vending door of the intelligent vending apparatus is opened this time, and the second time is a time after the vending door is closed this time;

a determining module 702, configured to determine a target image pair in the at least one image pair, where a difference between two images included in the target image pair is greater than a target threshold;

the identifying module 703 is configured to identify two images included in the target image pair to obtain attribute information of the removed goods in the intelligent vending apparatus.

In one possible implementation, the determining module 702 is configured to:

and determining the target image pair in the at least one image pair and a difference region of the target image pair, wherein the difference region is a region where a difference exists between the two images included in the target image pair.

In a possible implementation manner, the attribute information of any cargo includes a name and a location area of the any cargo, and the identifying module 703 is configured to:

and according to the attribute information of the at least one target cargo and the difference area, taking the attribute information of the target cargo with the position area included by the attribute information positioned in the difference area as the attribute information of the removed cargo.

identifying the goods in the difference area in the two images respectively to obtain the attribute information of the goods contained in the difference area in the two images respectively;

determining attribute information of at least one target cargo according to the attribute information of the cargo respectively contained in the difference area in the two images, wherein the attribute information in the difference area in the two images and the attribute information of the at least one target cargo comprise different quantities of the cargo with the same name;

In one possible implementation, the determining module 702 is configured to:

the at least one image pair is input into the target model, and a respective probability value of the at least one image pair and a respective difference region of the at least one image pair are output.

In one possible implementation, the determining module 702 is configured to:

and inputting any image pair in the at least one image pair into a target model, respectively extracting the features of two images included in the any image pair by the target model, fusing the two extracted feature matrixes, calculating the difference feature matrix obtained by the fusion processing, and outputting the probability value of the any image pair.

In one possible implementation, the obtaining module 701 is further configured to:

obtaining a training data set and a test data set, the training data set and the test data set comprising at least one first image pair and at least one second image pair, the first image pair comprising a first label value indicating that a degree of difference between the two images is greater than a target threshold, the second image pair comprising a second label value indicating that the degree of difference between the two images is less than or equal to the target threshold;

training the initial model based on the training data set to obtain a first model;

and when the accuracy of the first model is greater than a first threshold value, taking the first model as the target model.

In one possible implementation, the obtaining module 701 is configured to:

and when the loss value is larger than a second threshold value, adjusting the parameters of the initial model until the loss value meets a target condition, and taking the currently obtained model as the first model.

In one possible implementation, the obtaining module 701 is configured to:

inputting at least one image pair in the test data set into the first model, and outputting a probability value of each of the at least one image pair in the test data set;

the accuracy of the first model is determined as a ratio of the determined number of image pairs to at least one image in the test data set.

It should be noted that: in the image processing apparatus provided in the above embodiment, only the division of the above functional modules is taken as an example for image processing, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to complete all or part of the above described functions. In addition, the image processing apparatus and the image processing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.

Fig. 8 is a schematic structural diagram of an electronic device 800 according to an embodiment of the present application, where the electronic device 800 may generate a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 801 and one or more memories 802, where the memory 802 stores at least one program code, and the at least one program code is loaded and executed by the processor 801 to implement the methods provided by the foregoing method embodiments. Of course, the electronic device may further have components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input/output, and the electronic device may further include other components for implementing the functions of the device, which is not described herein again.

In an exemplary embodiment, there is also provided a computer readable storage medium, such as a memory, storing at least one program code, which is loaded and executed by a processor, to implement the image processing method in the above embodiments. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. An image processing method, characterized in that the method comprises:

2. The method of claim 1, wherein determining the target image pair of the at least one image pair comprises:

3. The method of claim 2, wherein the attribute information of any cargo includes a name and a location area of the any cargo,

4. The method of claim 2, wherein the attribute information of any cargo includes a name and a location area of the any cargo,

5. The method of claim 1, wherein determining the target image pair of the at least one image pair comprises:

6. The method of claim 5, wherein inputting the at least one image pair into a target model, outputting a probability value for each of the at least one image pair, comprises:

7. The method of claim 5, wherein the obtaining of the target model comprises:

8. An image processing apparatus, characterized in that the apparatus comprises:

9. An electronic device, comprising one or more processors and one or more memories having at least one program code stored therein, the at least one program code being loaded and executed by the one or more processors to implement the image processing method according to any one of claims 1 to 8.

10. A computer-readable storage medium, having stored therein at least one program code, which is loaded and executed by a processor, to implement the image processing method according to any one of claims 1 to 8.