CN111353540B

CN111353540B - Commodity category identification method and device, electronic equipment and storage medium

Info

Publication number: CN111353540B
Application number: CN202010135255.4A
Authority: CN
Inventors: 秦永强; 高达辉
Original assignee: Innovation Qizhi Qingdao Technology Co ltd
Current assignee: Innovation Qizhi Qingdao Technology Co ltd
Priority date: 2020-02-28
Filing date: 2020-02-28
Publication date: 2023-07-18
Anticipated expiration: 2040-02-28
Also published as: CN111353540A

Abstract

The application provides a commodity category identification method and device, electronic equipment and a computer readable storage medium, wherein the method comprises the following steps: acquiring commodity area images in an image to be identified; extracting image features of each commodity area image, and fusing the image features to obtain a relative size feature vector; for each commodity area image, acquiring a commodity feature vector corresponding to the commodity area image based on the image features of the commodity area image and the relative size feature vector; and carrying out classification calculation on the commodity feature vector corresponding to each commodity area image to obtain commodity category information corresponding to each commodity area image. According to the embodiment provided by the application, a relative size feature vector capable of representing the relative size information of different commodities is obtained by fusing the feature maps of the commodity area images, so that commodities of the same brand and variety that have similar packaging but different specifications can be distinguished by means of the relative size information, and the identification accuracy is improved.

Description

Commodity category identification method and device, electronic equipment and storage medium

Technical Field

The present disclosure relates to the field of image processing technologies, and in particular, to a method and apparatus for identifying a commodity category, an electronic device, and a computer readable storage medium.

Background

Using image recognition technology to verify commodities displayed in sales channels is more efficient and less costly than traditional manual verification. Products of the same brand and variety but of different specifications often use packaging with a similar appearance and texture. In a channel display scene, because of occlusion between objects, long shooting distances, shooting angles and similar problems, the image actually acquired may not be clear enough, and the text or bar codes on the packaging that could distinguish commodity specifications may be missing.

At present, image recognition technology identifies commodity categories in a "detect, then classify" manner: commodities are first detected in the image, the image regions where the commodities are located are then cropped out, and the commodity category information of each cropped image is recognized separately. This approach makes it difficult to distinguish similarly packaged commodities of the same brand and variety that differ only in specification.

Disclosure of Invention

An object of an embodiment of the present application is to provide a commodity category identification method and apparatus, an electronic device, and a computer-readable storage medium for identifying commodities having similar packages and different specifications.

In one aspect, the present application provides a method for identifying a commodity category, including:

acquiring a commodity area image in an image to be identified;

extracting image features of each commodity region image, and fusing the image features to obtain a relative size feature vector;

aiming at each commodity area image, acquiring a commodity feature vector corresponding to the commodity area image based on the image feature of the commodity area image and the relative size feature vector;

and carrying out classification calculation on the commodity feature vector corresponding to each commodity area image to obtain commodity category information corresponding to each commodity area image.

In an embodiment, the acquiring the commodity area image in the image to be identified includes:

extracting commodity position information in the image to be identified;

and cutting a commodity area image corresponding to the position information of each commodity in the image to be identified.

In an embodiment, before the extracting the image feature of each commodity area image, the method further includes:

determining a scaling ratio for scaling the commodity area image with the maximum resolution to a target size;

scaling all commodity area images according to the scaling ratio;

and judging whether each scaled commodity area image is smaller than the target size, and if so, padding the commodity area image to the target size.

In an embodiment, the image features comprise feature maps;

the fusing the image features to obtain a relative size feature vector comprises:

compressing the feature map in the channel dimension direction;

stacking the compressed feature maps in the channel dimension direction;

and performing dimension reduction processing on the stacked relative size feature map to obtain the relative size feature vector.

In an embodiment, before the dimension reduction processing is performed on the stacked relative size feature map, the method further includes:

judging whether the relative size feature map reaches a preset first channel length in the channel dimension direction;

if not, padding the stacked relative size feature map to the first channel length.

In an embodiment, the obtaining the commodity feature vector corresponding to the commodity area image based on the image feature of the commodity area image and the relative size feature vector includes:

if the image features are feature maps, performing dimension reduction processing on the feature maps to obtain feature vectors;

and connecting the feature vector and the relative size feature vector in the channel dimension direction to obtain the commodity feature vector.

In an embodiment, the obtaining the commodity feature vector corresponding to the commodity area image based on the image feature of the commodity area image and the relative size feature vector further includes:

and if the image features are feature vectors, connecting the feature vectors with the relative size feature vectors in the channel dimension direction to obtain commodity feature vectors.

On the other hand, the application also provides a commodity category identification device, which comprises:

the acquisition module is used for acquiring the commodity area image in the image to be identified;

the extraction module is used for extracting the image characteristics of each commodity area image, and fusing the image characteristics to obtain a relative size characteristic vector;

the fusion module is used for obtaining commodity feature vectors corresponding to the commodity region images based on the image features of the commodity region images and the relative size feature vectors for each commodity region image;

and the identification module is used for carrying out classification calculation on the commodity characteristic vector corresponding to each commodity area image to obtain commodity category information corresponding to each commodity area image.

Further, the present application also provides an electronic device, including:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform the above-described commodity category identification method.

In addition, the application also provides a computer readable storage medium storing a computer program executable by a processor to perform the above commodity category identification method.

In the embodiment provided by the application, a relative size feature vector capable of representing the relative size information of different commodities is obtained by fusing the feature maps of the commodity area images. Because classification is calculated from each image's own features together with the relative size feature vector, commodities of the same brand and variety that have similar packaging but different specifications can be distinguished by means of the relative size information, and the identification accuracy is improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the following description will briefly explain the drawings that are required to be used in the embodiments of the present application.

Fig. 1 is a schematic view of an application scenario of a commodity category identification method according to an embodiment of the present application;

Fig. 2 is a schematic structural diagram of an electronic device according to an embodiment of the present application;

Fig. 3 is a flow chart of a method for identifying a commodity category according to an embodiment of the present application;

Fig. 4 is a schematic diagram of generating a relative size feature vector according to an embodiment of the present application;

Fig. 5 is a schematic diagram of image preprocessing according to an embodiment of the present application;

Fig. 6 is a block diagram of a commodity category identification apparatus according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.

Like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.

Fig. 1 is a schematic diagram of an application scenario of a commodity category identification method according to an embodiment of the present application. As shown in fig. 1, the application scenario includes a server 30 and a client 20, where the server 30 may be a single server, a server cluster, or a cloud computing center, and the server 30 may perform commodity category identification on an image of a display scene acquired by the client 20. The client 20 may be a smart device such as a camera, a smart phone, or a tablet computer.

As shown in fig. 2, the present embodiment provides an electronic apparatus 1 including: at least one processor 11 and a memory 12, one processor 11 being exemplified in fig. 2. The processor 11 and the memory 12 are connected by a bus 10, and the memory 12 stores instructions executable by the processor 11, which instructions are executed by the processor 11, so that the electronic device 1 may perform all or part of the flow of the method in the embodiments described below. In an embodiment, the electronic device 1 may be the server 30.

The memory 12 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disk.

The present application also provides a computer readable storage medium storing a computer program executable by the processor 11 to perform the commodity category identification method provided herein.

Fig. 3 is a flow chart of a method for identifying a commodity category according to an embodiment of the present application. As shown in fig. 3, the method may include the following steps 310 to 340.

Step 310: acquire the commodity area images in the image to be identified.

The commodity area image is an image of an area where the commodity is located in the image to be identified.

In an embodiment, the server may extract the commodity location information in the image to be identified.

The server may calculate commodity position information in the image to be identified through a target detection model. The target detection model may be any one of network models such as YOLO (You Only Look Once), Fast R-CNN (Fast Region Convolutional Neural Networks), Faster R-CNN (Faster Region Convolutional Neural Networks), and the like.

The target detection model can be trained on a first sample set, which comprises a large number of labeled sample images; each label contains the position information of a commodity to be identified in the sample image. A label is denoted (x1, y1, x2, y2), indicating that the rectangular box with upper-left corner (x1, y1) and lower-right corner (x2, y2) is the commodity position information.

Once trained, the target detection model can calculate the commodity position information in the image to be identified.

The server may crop the commodity area image corresponding to the position information of each commodity from the image to be identified. The server may thereby obtain a plurality of commodity area images.
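The cropping step can be sketched as follows. This is a minimal illustration, assuming the image is a grayscale grid stored as nested lists and that x2 and y2 are exclusive bounds; the function name `crop_regions` and the toy boxes are hypothetical and stand in for the output of a target detection model.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2): upper-left and lower-right corners


def crop_regions(image: Image, boxes: List[Box]) -> List[Image]:
    """Cut one commodity area image per detected box.

    Treating x2/y2 as exclusive bounds is an assumption of this sketch.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    regions = []
    for x1, y1, x2, y2 in boxes:
        # Clamp the box to the image so a slightly oversized detection still crops.
        x1, x2 = max(0, x1), min(width, x2)
        y1, y2 = max(0, y1), min(height, y2)
        regions.append([row[x1:x2] for row in image[y1:y2]])
    return regions


# A 4x6 toy image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(4)]
regions = crop_regions(image, [(0, 0, 2, 2), (3, 1, 6, 4)])
```

Each entry of `regions` is an independent commodity area image, so regions of different sizes are expected at this point.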

Step 320: extract the image features of each commodity area image, and fuse the image features to obtain a relative size feature vector.

Wherein the image feature may be a feature map or a feature vector.

In an embodiment, the server may extract the image features of the commodity area image through a basic convolutional network in the neural network model. The base convolution network may include a convolution layer to obtain image features by performing convolution calculations on the merchandise region image.

If the image features extracted by the base convolutional network are feature maps, the server may compress the feature maps in the channel dimension direction to reduce the subsequent amount of calculation.

The server may compress the feature maps through a feature compression network in the neural network model. The feature compression network may include a convolution layer, which reduces the feature map in the channel dimension by performing convolution calculation on it.
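One common way for a convolution layer to reduce the channel dimension is a 1×1 convolution, which mixes the channels of each pixel independently. The sketch below assumes that choice (the text only says "a convolution layer"); `compress_channels` and the weight values are hypothetical, and in a trained model the weights would be learned.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [row][column][channel]


def compress_channels(fmap: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """Apply a 1x1 convolution: every pixel's D1 channels are mixed into D2 channels.

    weights has shape D2 x D1; in a real model these values would be learned.
    """
    return [
        [
            [sum(w * x for w, x in zip(out_w, pixel)) for out_w in weights]
            for pixel in row
        ]
        for row in fmap
    ]


# Toy 2x2 feature map with D1 = 4 channels per pixel.
fmap = [
    [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]],
    [[2.0, 2.0, 2.0, 2.0], [4.0, 3.0, 2.0, 1.0]],
]
# Hypothetical weights that average channel pairs, giving D2 = 2.
weights = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
compressed = compress_channels(fmap, weights)
```

The spatial size is unchanged; only the number of channels per pixel drops from 4 to 2.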

The server may stack the compressed feature maps in the channel dimension direction. Here, the stacking order of the feature maps is the same as the order in which the plurality of commodity area images are stacked by the server side when the neural network model is input.

The server may perform dimension reduction processing on the stacked relative size feature map to obtain the relative size feature vector.

The server may perform the dimension reduction on the relative size feature map through the feature compression network in the neural network model. The feature compression network may include a pooling layer, which reduces the relative size feature map to the relative size feature vector by global average pooling. Alternatively, the feature compression network may reduce the relative size feature map to the relative size feature vector through successive convolution calculations.
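Stacking followed by global average pooling can be sketched as below. The helper names `stack_channels` and `global_average_pool` and the toy values are hypothetical; the sketch assumes all compressed feature maps share the same width and height.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [row][column][channel]


def stack_channels(fmaps: List[FeatureMap]) -> FeatureMap:
    """Concatenate same-sized feature maps along the channel axis, in input order."""
    rows, cols = len(fmaps[0]), len(fmaps[0][0])
    return [
        [[v for fmap in fmaps for v in fmap[r][c]] for c in range(cols)]
        for r in range(rows)
    ]


def global_average_pool(fmap: FeatureMap) -> List[float]:
    """Average each channel over all pixels: W x H x D becomes a D-length vector."""
    pixels = [pixel for row in fmap for pixel in row]
    channels = len(pixels[0])
    return [sum(p[ch] for p in pixels) / len(pixels) for ch in range(channels)]


# Two compressed 2x2 maps with one channel each (toy values).
map_a = [[[1.0], [3.0]], [[5.0], [7.0]]]
map_b = [[[2.0], [2.0]], [[4.0], [4.0]]]
stacked = stack_channels([map_a, map_b])
relative_size_vector = global_average_pool(stacked)
```

Because the maps are stacked in input order, each position in `relative_size_vector` always corresponds to the same slot in the batch of commodity area images.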

In an embodiment, before performing the dimension reduction processing on the stacked relative size feature map, the server may determine whether the relative size feature map reaches a preset first channel length in the channel dimension direction.

The first channel length may be set based on the computational capacity of the neural network model. For example, if the neural network model is set to process 10 commodity area images simultaneously, the first channel length is the channel length obtained by stacking the compressed feature maps of 10 commodity area images.

If the relative size feature map does not reach the preset first channel length, the server may pad the stacked relative size feature map to the first channel length.

The server may pad the relative size feature map with 0 in the channel dimension direction, so that the relative size feature map reaches the first channel length.

For example, if the first channel length is the channel length obtained by stacking the compressed feature maps of 10 commodity area images and the server obtains only 3 commodity area images from the image to be identified, the stacked compressed feature maps reach only 30% of the first channel length in the channel dimension direction. In this case, the server may append, after the relative size feature map, padding feature maps equivalent to the compressed feature maps of 7 commodity area images, with all values in the padding feature maps being 0.
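The zero-padding in the 3-of-10 example can be sketched as follows. The function name `pad_to_channel_length` and the choice of 2 compressed channels per image are assumptions made for illustration only.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # indexed as [row][column][channel]


def pad_to_channel_length(fmap: FeatureMap, first_channel_length: int) -> FeatureMap:
    """Append zero-valued channels until every pixel has first_channel_length channels."""
    current = len(fmap[0][0])
    if current > first_channel_length:
        raise ValueError("more commodity area images than the model is configured for")
    missing = first_channel_length - current
    return [[pixel + [0.0] * missing for pixel in row] for row in fmap]


# Model sized for 10 images with 2 compressed channels each (hypothetical D2 = 2).
channels_per_image = 2
first_channel_length = 10 * channels_per_image
# Only 3 images were detected, so the stacked 1x1 map has 3 * 2 = 6 channels.
stacked = [[[0.4, 0.1, 0.9, 0.3, 0.7, 0.2]]]
padded = pad_to_channel_length(stacked, first_channel_length)
```

Here 6 of 20 channels carry data, which is the 30% figure from the example; the remaining 14 channels are zero.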

To more clearly illustrate the process of obtaining the relative dimensional feature vector, reference is made to fig. 4, which is a schematic diagram of generating the relative dimensional feature vector according to an embodiment of the present application.

As shown in fig. 4, the image feature calculated by the server is a feature map of size W×H×D1, where W represents the number of pixels in the width direction, H represents the number of pixels in the height direction, and D1 represents the number of channels.

The server compresses feature map 1 and feature map 2 in the channel dimension direction, reducing the number of channels of each to D2, and stacks the compressed feature map 1 and the compressed feature map 2. Since the stacked relative size feature map does not reach the first channel length D4 in the channel dimension direction, the server pads the relative size feature map with a padding feature map of size W×H×D3. Dimension reduction is then performed on the relative size feature map of size W×H×D4 to obtain a relative size feature vector of size 1×1×D4.
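The channel bookkeeping in fig. 4 can be summarized as follows. The relation D4 = 2·D2 + D3 is inferred from the description (two compressed maps of D2 channels each, plus a padding map of D3 channels) rather than stated explicitly.

```latex
\begin{aligned}
\text{feature map } i &: W \times H \times D_1 \;\longrightarrow\; W \times H \times D_2 \qquad (i = 1, 2) \\
\text{stacked} &: W \times H \times 2D_2 \\
\text{padded} &: W \times H \times (2D_2 + D_3) \;=\; W \times H \times D_4 \\
\text{reduced} &: 1 \times 1 \times D_4
\end{aligned}
```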

In yet another embodiment, if the image feature is a feature vector, the server may compress the feature vector in the channel dimension direction, and then stack the compressed feature vector in the channel dimension direction to obtain the relative size feature vector.

The server may determine whether the relative size feature vector reaches a preset first channel length in the channel dimension direction, and if not, may complement the relative size feature vector to the first channel length.

Step 330: for each commodity area image, acquire the commodity feature vector corresponding to the commodity area image based on the image features of the commodity area image and the relative size feature vector.

In an embodiment, if the image feature is a feature vector, the server may connect the feature vector corresponding to each commodity region image with the relative size feature vector in the channel dimension direction, to obtain the commodity feature vector.

For example, if there are 10 commodity area images, after obtaining the feature vector corresponding to each commodity area image and the relative size feature vector, the server may connect the feature vector of each commodity area image with the relative size feature vector, obtaining 10 commodity feature vectors.

In yet another embodiment, if the image feature is a feature map, the server may perform a dimension reduction process on the feature map to obtain a feature vector.

The server may perform dimension reduction processing on the feature map corresponding to each commodity area image through a feature fusion network in the neural network model. The feature fusion network may include a pooling layer, which reduces the feature map to a feature vector by global average pooling. Alternatively, the feature fusion network may reduce the feature map to a feature vector through successive convolution calculations. The feature fusion network may also use other calculation methods capable of dimension reduction, which are not limited herein.

After the feature vector is obtained, the server may connect the feature vector corresponding to each commodity region image and the relative size feature vector in the channel dimension direction, so as to obtain the commodity feature vector.
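The connection step can be sketched as a plain concatenation. The function name `build_commodity_vectors` and the toy values are hypothetical; the point illustrated is that one shared relative size feature vector is appended to every per-image feature vector.

```python
from typing import List


def build_commodity_vectors(
    feature_vectors: List[List[float]], relative_size_vector: List[float]
) -> List[List[float]]:
    """Append the shared relative size vector to each per-image feature vector."""
    return [fv + relative_size_vector for fv in feature_vectors]


feature_vectors = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]  # one per commodity area image
relative_size_vector = [4.0, 3.0, 0.0]                   # shared by all images
commodity_vectors = build_commodity_vectors(feature_vectors, relative_size_vector)
```

Every commodity feature vector therefore carries both the appearance of its own region and information about all regions in the same image.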

Step 340: perform classification calculation on the commodity feature vector corresponding to each commodity area image to obtain commodity category information corresponding to each commodity area image.

In an embodiment, the server may perform classification calculation on the commodity feature vector through a classification network in the neural network model, to obtain commodity category information corresponding to the commodity area image. The classification network may include a softmax classifier, which is configured to calculate commodity category information corresponding to the commodity feature vector.
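A softmax classifier can be sketched as a linear scoring layer followed by softmax. The linear layer, the weight values, and the category meanings below are assumptions for illustration; the text only states that the classification network may include a softmax classifier.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(vector: List[float], weights: List[List[float]], biases: List[float]) -> int:
    """Score each category with a linear layer, then return the most probable index."""
    logits = [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, biases)]
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical weights: category 0 = small pack, category 1 = large pack.
weights = [[1.0, -1.0], [-1.0, 1.0]]
biases = [0.0, 0.0]
probs = softmax([2.0, 0.0])
category = classify([0.2, 0.9], weights, biases)
```

In practice the input would be a full commodity feature vector and the weights would come from training on labeled sample images.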

Before the server executes the technical scheme of the application through the neural network model, the neural network model comprising the base convolutional network, the feature compression network, the feature fusion network and the classification network can be trained using labeled sample images. In the sample images, commodities of the same brand and variety but of different specifications carry different labels.

Because the relative size feature vector can represent the relative size information of different commodities, when the classification calculation is performed on commodity feature vectors that include the relative size feature vector, commodities of the same brand and variety that have similar packaging but different specifications can be distinguished according to the relative size information, and the identification accuracy is improved.

In one embodiment, since the sizes of the plurality of merchandise region images acquired by the server may be different, the server may pre-process the plurality of merchandise region images before executing step 320.

The server side can acquire the resolution of each commodity area image and determine the scaling ratio of the commodity area image with the largest resolution to the target size. The resolution is herein the image resolution, and represents the number of pixels contained in the commodity area image, and the larger the resolution is, the larger the image size of the commodity area image is; the target size may be an image size that facilitates extraction of image features.

The server may scale all merchandise area images according to the scaling described above.

For example, if the image size of the commodity area image with the maximum resolution is 600×800 and the target size is 300×400, the scaling ratio is 0.25 in terms of pixel count (0.5 along each side), and the server may scale the other commodity area images according to the same scaling ratio.

The server judges whether each scaled commodity area image is smaller than the target size, and if so, pads the commodity area image to the target size.

The server may add 0 values around a commodity area image that is smaller than the target size after scaling, so as to pad it to the target size.
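The preprocessing can be sketched in terms of sizes alone. This sketch makes several assumptions: the shared ratio is expressed per side (0.5 for 600×800 scaled to 300×400, which is a pixel-count ratio of 0.25), the smaller of the width and height ratios is used when aspect ratios differ, and zero padding is split evenly around the image. The function name `plan_preprocessing` is hypothetical.

```python
from typing import List, Tuple

Size = Tuple[int, int]               # (width, height) in pixels
Padding = Tuple[int, int, int, int]  # (left, top, right, bottom) zero padding


def plan_preprocessing(sizes: List[Size], target: Size) -> List[Tuple[Size, Padding]]:
    """Return, per image, its scaled size and the zero padding needed to reach target."""
    target_w, target_h = target
    # The image with the most pixels sets the shared per-side scale factor.
    ref_w, ref_h = max(sizes, key=lambda s: s[0] * s[1])
    scale = min(target_w / ref_w, target_h / ref_h)
    plan = []
    for w, h in sizes:
        # Clamping is a safeguard of this sketch for images that would still exceed the target.
        new_w = min(target_w, max(1, round(w * scale)))
        new_h = min(target_h, max(1, round(h * scale)))
        pad_w, pad_h = target_w - new_w, target_h - new_h
        left, top = pad_w // 2, pad_h // 2
        plan.append(((new_w, new_h), (left, top, pad_w - left, pad_h - top)))
    return plan


plan = plan_preprocessing([(600, 800), (300, 400), (200, 500)], target=(300, 400))
```

Using one shared ratio, rather than resizing every region to the target independently, is what preserves the size relationship between commodities in the same image.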

For a clearer description of the preprocessing process, refer to fig. 5, which is a schematic diagram of image preprocessing provided in an embodiment of the present application.

As shown in fig. 5, the server crops three commodity area images of different sizes from the image to be identified. The commodity area image in which commodity 3 is located has the largest resolution and therefore the largest image size; the server determines from this image the scaling ratio for scaling to the target size, and scales the other two commodity area images by the same ratio. In fig. 5, the scaled commodity area images corresponding to "commodity 2" and "commodity 3" are indicated by dashed frames, and the solid frame outside each dashed frame indicates the target size. Because the commodity area images corresponding to commodity 2 and commodity 3 are smaller than the target size after scaling, the server adds 0 values around these commodity area images. The server may stack the zero-padded commodity area images and then execute step 320.

Fig. 6 is a block diagram of a commodity category identifying apparatus according to an embodiment of the present invention. As shown in fig. 6, the apparatus may include: an acquisition module 610, an extraction module 620, a fusion module 630, and an identification module 640.

The acquiring module 610 is configured to acquire an image of a commodity area in the image to be identified.

The extracting module 620 is configured to extract image features of each commodity area image, and fuse the image features to obtain a relative size feature vector.

The fusion module 630 is configured to obtain, for each commodity area image, a commodity feature vector corresponding to the commodity area image based on the image feature of the commodity area image and the relative size feature vector.

And the identification module 640 is configured to perform classification calculation on the commodity feature vector corresponding to each commodity area image, and obtain commodity category information corresponding to each commodity area image.

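In an embodiment, the acquisition module 610 is further configured to:

extract commodity position information in the image to be identified;

and crop a commodity area image corresponding to the position information of each commodity in the image to be identified.

In an embodiment, the apparatus further comprises a scaling module for:

determining a scaling ratio for scaling the commodity area image with the maximum resolution to a target size;

scaling all commodity area images according to the scaling ratio;

and judging whether each commodity area image is smaller than the target size, and if so, padding the commodity area image to the target size.

In an embodiment, the extraction module 620 is further configured to:

compress the feature maps in the channel dimension direction;

stack the compressed feature maps in the channel dimension direction;

and perform dimension reduction processing on the stacked relative size feature map to obtain the relative size feature vector.

In an embodiment, the extraction module 620 is further configured to:

judge whether the relative size feature map reaches a preset first channel length in the channel dimension direction;

and if not, pad the stacked relative size feature map to the first channel length.

In an embodiment, the fusion module 630 is further configured to:

if the image features are feature maps, perform dimension reduction processing on the feature maps to obtain feature vectors;

and connect the feature vectors and the relative size feature vector in the channel dimension direction to obtain the commodity feature vectors.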

The implementation process of the functions and roles of each module in the above device is specifically detailed in the implementation process of the corresponding steps in the above commodity category identification method, and will not be described herein.

In the several embodiments provided in the present application, the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, for example, flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, the functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.

The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored on a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.

Claims

1. A method for identifying a category of merchandise, comprising:

acquiring a commodity area image in an image to be identified;

extracting image features of each commodity area image, and fusing the image features to obtain a relative size feature vector, wherein the image features comprise feature maps, and the fusing comprises: compressing the feature maps in the channel dimension direction; stacking the compressed feature maps in the channel dimension direction; and performing dimension reduction processing on the stacked relative size feature map to obtain the relative size feature vector;

for each commodity area image, obtaining a commodity feature vector corresponding to the commodity area image based on the image feature of the commodity area image and the relative size feature vector, including: if the image features are feature maps, performing dimension reduction processing on the feature maps to obtain feature vectors; and connecting the feature vector and the relative size feature vector in the channel dimension direction to obtain the commodity feature vector;

2. The method of claim 1, wherein the acquiring the merchandise region image in the image to be identified comprises:

extracting commodity position information in the image to be identified;

3. The method of claim 1, wherein prior to said extracting the image features of each commodity region image, the method further comprises:

scaling all commodity area images according to the scaling ratio;

4. The method of claim 1, wherein prior to the performing dimension reduction processing on the stacked relative size feature map, the method further comprises:

5. A commodity category identification device, comprising:

the extraction module is used for extracting the image features of each commodity area image, and fusing the image features to obtain a relative size feature vector, wherein the image features comprise feature maps, and the fusing comprises: compressing the feature maps in the channel dimension direction; stacking the compressed feature maps in the channel dimension direction; and performing dimension reduction processing on the stacked relative size feature map to obtain the relative size feature vector;

the fusion module is configured to obtain, for each commodity area image, a commodity feature vector corresponding to the commodity area image based on the image feature of the commodity area image and the relative size feature vector, including: if the image features are feature maps, performing dimension reduction processing on the feature maps to obtain feature vectors; and connecting the feature vector and the relative size feature vector in the channel dimension direction to obtain the commodity feature vector;

6. An electronic device, the electronic device comprising:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform the commodity category identification method of any one of claims 1-4.

7. A computer readable storage medium storing a computer program executable by a processor to perform the commodity category identification method of any one of claims 1-4.