WO2020211499A1 - A self-service checkout method and device for commodities - Google Patents

A self-service checkout method and device for commodities

Info

Publication number
WO2020211499A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
code
area
commodity
product
Prior art date
Application number
PCT/CN2020/072059
Other languages
English (en)
French (fr)
Inventor
宋杨
Original Assignee
创新先进技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 创新先进技术有限公司
Priority to US16/810,670 (US11113680B2)
Publication of WO2020211499A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K 7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K 7/14 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K 7/1404 Methods for optical code recognition
    • G06K 7/1408 Methods for optical code recognition the method being specifically adapted for the type of code
    • G06K 7/1413 1D bar codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K 7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K 7/14 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K 7/1404 Methods for optical code recognition
    • G06K 7/1408 Methods for optical code recognition the method being specifically adapted for the type of code
    • G06K 7/1417 2D bar codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K 7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K 7/14 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K 7/1404 Methods for optical code recognition
    • G06K 7/1439 Methods for optical code recognition including a method step for retrieval of the optical code
    • G06K 7/1443 Methods for optical code recognition including a method step for retrieval of the optical code locating of the code in an image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V 10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G07 CHECKING-DEVICES
    • G07G REGISTERING THE RECEIPT OF CASH, VALUABLES, OR TOKENS
    • G07G 1/00 Cash registers
    • G07G 1/0036 Checkout procedures
    • G07G 1/0045 Checkout procedures with a code reader for reading of an identifying code of the article to be registered, e.g. barcode reader or radio-frequency identity [RFID] reader
    • G07G 1/0081 Checkout procedures with a code reader for reading of an identifying code of the article to be registered, e.g. barcode reader or radio-frequency identity [RFID] reader the reader being a portable scanner or data reader

Definitions

  • This specification relates to the field of computer technology, and in particular to self-service checkout methods and devices for commodities.
  • One or more embodiments of this specification describe a self-service checkout method and device that combine code recognition and visual recognition to improve the efficiency and accuracy of product recognition and to improve the user experience.
  • embodiments of this specification provide a self-service cash register method for commodities, including:
  • the first image being obtained by a first camera photographing at least one commodity placed on a cash register;
  • the category of the first commodity is recognized based on the first image area through visual recognition
  • the pricing result of the first commodity is determined.
  • acquiring the first image includes controlling the first camera to shoot the at least one commodity to obtain the first image.
  • obtaining the first image includes receiving the first image from a self-service checkout counter.
  • the first image is taken from one of the top view direction, the front view direction, the left view direction, the right view direction, the rear view direction, and the oblique view direction of the at least one commodity, where the oblique view direction is a direction whose angle with the vertical direction of the cash register countertop is 30°-60°.
  • performing image segmentation on the first image includes using an image segmentation model to perform image segmentation on the first image; wherein the image segmentation model is pre-trained using segmented sample pictures, and each segmented sample picture contains a product image and has label data labeling the outline of the product.
  • the product code is a barcode; in this case, performing code area detection in the first image area includes: using a first target detection model to detect the barcode region in the first image area; wherein the first target detection model is pre-trained using first training sample pictures, each of which contains a product image and has labeling data framing the barcode area in the product image.
  • recognizing the code in the code area includes: correcting the detected barcode area through perspective transformation to obtain a corrected barcode; and performing code recognition on the corrected barcode.
  • the commodity code is a two-dimensional code; in this case, performing code area detection in the first image area includes: detecting the positioning graphics of the two-dimensional code in the first image area; when at least two positioning graphics are detected, it is determined that a two-dimensional code area is detected.
  • the identifying of the code in the code area includes: performing perspective correction on the detected two-dimensional code area through perspective transformation to obtain a corrected two-dimensional code; determining, based on the at least two positioning patterns, the corner-point graphic relationship in the corrected two-dimensional code; and extracting coding features from the corrected two-dimensional code based on the corner-point graphic relationship, thereby identifying the two-dimensional code encoding.
  • the visual recognition includes: using a second target detection model to determine the category of the first product based on the first image area; wherein the second target detection model is pre-trained using second training sample pictures, each of which contains a product image and has label data framing the product and labeling the product category.
  • the method further includes:
  • a self-service cash register method for commodities comprising:
  • the multiple images being obtained by multiple cameras respectively taking pictures of at least one commodity placed on the cashier counter;
  • the category of the same commodity is recognized based on at least one image area of the plurality of image areas through visual recognition;
  • the pricing result of the commodity is determined.
  • a self-service cash register device for commodities comprising:
  • an image acquisition unit, configured to acquire a first image, the first image being obtained by a first camera photographing at least one commodity placed on the cashier counter;
  • an image segmentation unit, configured to perform image segmentation on the first image to obtain at least one image area, which includes a first image area;
  • a code area detection unit, configured to perform code area detection of a product code in the first image area;
  • a code recognition unit, configured to, when the code area is detected, recognize the code in the code area and determine the category of the first commodity contained in the first image area according to the recognized code;
  • a visual recognition unit, configured to, when the code area is not detected or the code cannot be recognized, recognize the category of the first commodity based on the first image area through visual recognition;
  • a pricing unit, configured to determine the pricing result of the first commodity according to the category of the first commodity.
  • a self-service cash register device for commodities comprising:
  • an image acquisition unit, configured to acquire multiple images, the multiple images being obtained by multiple cameras photographing at least one commodity placed on the cash register;
  • an image segmentation unit, configured to perform image segmentation on the multiple images to obtain the image areas corresponding to each image;
  • an area relationship determining unit, configured to determine, from the image areas corresponding to the respective images, multiple image areas corresponding to the same commodity according to the relative positional relationship of the multiple cameras;
  • a code area detection unit, configured to perform code area detection of commodity codes in the multiple image areas;
  • a code recognition unit, configured to, when a code area is detected in any image area, recognize the code in the code area and determine the category of the same commodity according to the recognized code;
  • a visual recognition unit, configured to, when no code area is detected or no code is recognized in the multiple image areas, recognize the category of the same commodity based on at least one of the multiple image areas through visual recognition;
  • a pricing unit, configured to determine the pricing result of the commodity according to the category of the same commodity.
  • the embodiments of this specification provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to execute the method described in the first or second aspect.
  • an embodiment of the present specification provides a self-service cashier, including a storage device and a processor, the processor being communicatively coupled to the storage device; the storage device stores an application program, and the processor can execute the application program to implement the method described in the first aspect or the second aspect.
  • an embodiment of the present specification provides a server, including a storage device, a network interface, and a processor, the processor being communicatively coupled to the storage device and the network interface; the storage device stores a server program, and the processor can execute the server program to implement the method described in any one of the first aspect or the second aspect.
  • In this way, the solution takes into account both the scanning speed and accuracy of the commodity barcode scheme and the user experience of the visual recognition scheme.
  • Figure 1 is a schematic diagram of the self-service cash register system disclosed in this specification;
  • Figure 2 illustrates a bottom view of the panel seen from below;
  • Figure 3 is a schematic diagram of the electronic structure of the computing device in the self-service cash register according to an embodiment of this specification;
  • Figure 4 is a structural diagram of an image recognition server according to an embodiment of this specification;
  • Figure 5 is a schematic flowchart of a method for product identification and pricing according to an embodiment of this specification;
  • Figures 6a-6c show schematic diagrams of the process of detecting and identifying a barcode area in an example;
  • Figure 7a illustrates a schematic diagram of the effect of a two-dimensional code perspective transformation;
  • Figure 7b shows the various sub-stages of determining the corner-point graphic relationship;
  • Figure 8 shows the calibration of the camera in an example;
  • Figure 9 shows a flowchart of product identification and pricing for multiple images according to an embodiment;
  • Figure 10 illustrates a schematic block diagram of a commodity cash register device according to an embodiment of this specification;
  • Figure 11 illustrates a schematic block diagram of a commodity cash register device according to another embodiment of this specification.
  • Self-service cashiers are widely used in the new retail sector to improve cashier efficiency and reduce labor costs.
  • the product recognition solution based on machine vision has become one of the mainstream solutions in the industry due to its advantages in cost and accuracy.
  • However, the visual solution recognizes products based on their appearance using machine learning algorithms; affected by external lighting, product placement angle, and other conditions, it cannot guarantee 100% recognition accuracy.
  • the outer packaging of the product has a product barcode (barcode) that can clearly identify the product.
  • this specification proposes a hybrid way of combining visual recognition and product code recognition.
  • FIG. 1 is a schematic diagram of the self-service cash register system disclosed in this specification.
  • the self-service cashier system includes a self-service cashier 12 and a server system 18.
  • the self-service cash register and the server system can be connected via the network 16.
  • the self-service cashier counter 12 may include a countertop 130 and a panel 120, which are arranged opposite to each other.
  • the countertop 130 is located below and is used to carry one or more commodities 132 and 134.
  • the panel 120 is located on the upper side to provide an ideal lighting environment for the commodities on the countertop, so that the lighting of the commodities is stable, and is helpful for the operation of commodity detection or recognition algorithms.
  • the self-service cash register can be equipped with at least one camera.
  • the camera can record or take pictures of the commodities 134 and 132 placed on the cashier counter to obtain videos or images of these commodities.
  • the video or image may include the video or image portion of each product in multiple products. For the sake of simplicity, the following will only take images as an example. Those skilled in the art are aware that the images in this specification can be either directly shot images or images extracted from videos.
  • the user can freely place commodities on the countertop 130 of the cash register.
  • the form/position of each product on the countertop can be different.
  • the self-service cashier 12 sends the video or image to the server system via the network.
  • the network 16 may be a wired network, a wireless network, a local area network, the Internet, and so on.
  • the server 18 determines the category of each product from the image through product code detection and/or visual recognition. Specifically, in an example, the server first segments the image, then detects and recognizes the product code carried in each segmented image area. If the product code can be read normally, the product category is obtained accurately and the identification is complete. For an image area that corresponds to a product but from which no product code can be read, the visual recognition algorithm is invoked to determine the product category.
  • the server system or the self-service cash register can determine the pricing result of the commodity according to the identified commodity category.
  • the server system may also include multiple servers, which perform corresponding detection or identification tasks simultaneously or separately as required.
  • the product segmentation of the image is performed by server A
  • the product code detection is performed by server B
  • the visual recognition is performed by server C. Therefore, in this specification, the server can refer to either a single server itself or a server cluster.
  • FIG. 1 illustrates an example in which the server performs product recognition based on the product image
  • the recognition and pricing of the product can also be performed by a computing device provided at the cashier counter. In this case, the computing device directly obtains the image taken by the camera, without sending it to the server.
  • the solution of this specification can integrate the advantages of the two solutions of barcode recognition and visual recognition, and realize automatic cash register on the basis of ensuring user experience.
  • Figure 2 illustrates a bottom view of the panel as seen from below.
  • the panel can be opaque or semi-transparent, so as to block external light from the ceiling or other angles and prevent the external light from having uncontrollable effects on the lighting conditions of the goods.
  • the panel can have a variety of shapes, such as a rectangle, an arc, or a plate with cover plates extending on both sides; any panel shape that blocks or partially blocks light from above the panel is acceptable.
  • the panel includes one or more light sources 121 and 122.
  • the light source can be an LED or another type of light source.
  • the light source can be located on the lower surface of the panel or embedded in the panel.
  • the light source stabilizes the light of the commodities placed on the self-service checkout counter and helps the operation of the commodity recognition algorithm.
  • there can be more choices for the number and arrangement of the light sources, such as dual light sources, four light sources, or even more.
  • the layout method can also be customized according to requirements.
  • One of the key points of vision-based product recognition is the need to obtain clear pictures, and stable and uniform lighting is a good guarantee.
  • the light source can take the form of a controllable light source, and the brightness of the light source can be adjusted according to changes in the working environment.
  • One or more cameras 124, 125, 126 can be installed at the self-service cash register.
  • the camera can be an ordinary RGB camera or a 3D camera. These cameras are arranged according to the field of view (FOV) of the cameras and the size of the product placement table. Multiple cameras can acquire images from different angles, thereby effectively avoiding occlusion.
  • a slide rail can be provided so that the camera slides on the slide rail to obtain product images at different angles.
  • different cameras can acquire images of the product from at least one of the top view direction, the front view direction, the left view direction, the right view direction, and the rear view direction of the product.
  • the camera can be installed on other objects than the panel.
  • the camera can also acquire images from the oblique direction of the product.
  • the oblique viewing direction refers to the direction where the angle between the shooting direction and the vertical direction of the cashier countertop is 30-60 degrees, where the shooting direction is the direction pointed by the center line of the FOV of the camera.
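The 30-60 degree criterion above can be made concrete with a short Python check. This sketch is illustrative only and is not part of the specification; the vector convention (z axis pointing up from the countertop, so straight down is (0, 0, -1)) is an assumption.

```python
import math

def is_oblique_view(shooting_dir, min_deg=30.0, max_deg=60.0):
    """Check whether a camera's shooting direction (the center line of its
    FOV, as described in the text) counts as an oblique view: its angle to
    the countertop's vertical must lie in [min_deg, max_deg] degrees.

    shooting_dir: (x, y, z) vector in a frame where (0, 0, -1) points
    straight down at the countertop (convention assumed for illustration).
    """
    vertical = (0.0, 0.0, -1.0)  # straight down onto the countertop
    dot = sum(a * b for a, b in zip(shooting_dir, vertical))
    norm = math.sqrt(sum(a * a for a in shooting_dir))
    angle = math.degrees(math.acos(dot / norm))
    return min_deg <= angle <= max_deg

# A camera looking down and forward at 45 degrees is an oblique view;
# a camera looking straight down (top view) is not.
print(is_oblique_view((0.0, 1.0, -1.0)))   # True  (45 degrees)
print(is_oblique_view((0.0, 0.0, -1.0)))   # False (0 degrees, top view)
```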
  • FIG. 3 is a schematic diagram of the electronic structure of the computing device in the self-service cash register in the embodiment of the present specification.
  • the electronic structure of the self-checkout counter may include multiple electronic devices or components.
  • the processor 510 controls the overall operation of the computing device.
  • the LED controller 532 can be used to control multiple LED lights (LED#1, LED#2, ..., LED#N) so as to provide uniform and stable lighting.
  • CAM HUB 534 is a camera hub that can be used to control two or more cameras (CAM#1, CAM#2, ..., CAM#N) to acquire images.
  • the computing device may further include a network/bus interface 526 coupled to the data link for data communication with the server; the network/bus interface 526 may also receive images from the camera; In this case, the network/bus interface 526 may include a wireless transceiver.
  • the electronic structure also includes flash memory (FLASH) 524.
  • the FLASH 524 can store software, which is loaded from the FLASH into the DRAM 522 and thereby controls the processor 510 to perform corresponding operations.
  • Fig. 4 is a structural diagram of an image recognition server according to an embodiment of this specification.
  • the server may include a processor 702, which represents a microprocessor for controlling the overall operation of the server.
  • the data bus 715 can facilitate data transmission between the storage device 740, the processor 702, and the network interface 714.
  • the server further includes a storage device 740, which can store a server program.
  • the server may also include random access memory (RAM) 720 and read-only memory (ROM) 722.
  • the ROM 722 can store programs, utilities, or processes to be executed in a non-volatile manner, such as an operating system.
  • the RAM 720, also known as memory, provides volatile data storage and stores instructions and related data for the running operating system and server programs.
  • the server program is loaded from the storage device 740 into the RAM 720, and thus controls the processor 702 to perform corresponding operations.
  • Fig. 5 is a schematic flowchart of a method for product identification and pricing according to an embodiment of the present specification. This method can be implemented as software and executed by the cash register computing device shown in FIG. 3 or the server shown in FIG. 4. Alternatively, it can be implemented jointly by the cashier computing device and the server, with each executing a part of the method flow.
  • step 501 a first image is acquired, and the first image is obtained by photographing at least one commodity placed on the cash register by the first camera.
  • one or more cameras can be arranged in the cash register for taking pictures of the goods.
  • the image taken by the camera is the aforementioned first image.
  • the multiple cameras can take pictures of the merchandise placed on the table from different angles to generate multiple images.
  • any one of the cameras may be referred to as the first camera, and the commodity image captured by the camera may be referred to as the first image.
  • the terms “first” and “second” in this text are used only for distinction in description, and are not intended to limit order or any other aspect.
  • the method is executed by a cash register computing device.
  • the computing device controls the first camera in the checkout counter to photograph the merchandise placed on the counter to obtain the aforementioned first image.
  • the method is executed by the server.
  • the first camera in the cash register generates a first image by photographing the commodity on the counter
  • the computing device sends the first image to the server via the network/bus interface shown in FIG. 3.
  • the server receives the first image from the self-service cash register, thereby acquiring the first image.
  • step 502 image segmentation is performed on the first image to obtain at least one image region.
  • Image segmentation can be implemented using multiple algorithms and/or models.
  • image segmentation may be performed based on conventional image processing.
  • image processing includes object boundary recognition based on pixel grayscale or contrast analysis (similar to the boundary recognition method in the matting tool). Based on the boundaries thus identified, the image can be divided into several image regions. Generally, each image area corresponds to an identified object, which corresponds to a commodity in the scene of this embodiment.
  • the conventional image processing method is suitable for the situation where the color difference between the object and the background is obvious, the background color is single, and the object boundary is clear.
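The pixel-contrast idea behind conventional segmentation can be illustrated with a toy Python sketch. This is not part of the specification; the threshold value and 4-connectivity are arbitrary assumptions, and it only works in exactly the simple situation described above (clear object/background contrast, uniform background).

```python
from collections import deque

def segment_by_threshold(gray, background_max=50):
    """Toy conventional segmentation: threshold against a dark background,
    group foreground pixels into 4-connected regions, and return one
    bounding box (top, left, bottom, right) per region."""
    h, w = len(gray), len(gray[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if gray[r][c] > background_max and not seen[r][c]:
                # flood fill one connected region, tracking its extent
                top, left, bottom, right = r, c, r, c
                q = deque([(r, c)])
                seen[r][c] = True
                while q:
                    y, x = q.popleft()
                    top, bottom = min(top, y), max(bottom, y)
                    left, right = min(left, x), max(right, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] \
                                and gray[ny][nx] > background_max:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

# Two bright "products" on a dark countertop -> two image regions.
image = [
    [0,   0,   0,   0,   0,   0],
    [0, 200, 200,   0,   0,   0],
    [0, 200, 200,   0, 180,   0],
    [0,   0,   0,   0, 180,   0],
    [0,   0,   0,   0,   0,   0],
]
print(segment_by_threshold(image))  # [(1, 1, 2, 2), (2, 4, 3, 4)]
```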
  • the situation of photographing products on the counter is usually more complicated, especially in the case of multiple products. Therefore, in one embodiment, an image segmentation model is trained in advance, and such a model is used to perform image segmentation on the first image to be analyzed.
  • a large number of product pictures can be taken; the pictures can contain combinations of one or more arbitrarily placed products. Such product pictures are distributed to annotators, who mark the outlines of the products in the pictures.
  • Such pictures containing product images and labeled product outlines can be used as segmentation sample pictures for training the image segmentation model.
  • the image segmentation model can use, for example, a Mask-RCNN-based model, a conditional random field CRF-based model, and so on.
  • the model can be used to perform image segmentation on the first image.
  • the first image can be segmented into image regions corresponding to the respective products in the picture.
  • the following description takes any one of the image areas, called the first image area, as an example.
  • In step 503, product code area detection is performed in the first image area. If a code area is detected, then in step 504, the code in the code area is recognized, and the category of the product contained in the first image area is determined according to the recognized code. If no code area is detected, then in step 505, the category of the product contained in the first image area is recognized through visual recognition.
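The code-first, vision-fallback branching of steps 503-505 can be sketched in a few lines of Python. This is purely illustrative and not defined by the specification: the four callables (detect_code_area, decode, visual_recognize) and the price table are hypothetical stand-ins for the detection models and pricing lookup described in the text.

```python
def identify_and_price(image_areas, detect_code_area, decode,
                       visual_recognize, price_table):
    """Sketch of steps 503-505: for each segmented image area, try
    product-code recognition first; fall back to visual recognition when
    no code area is found or the code cannot be decoded."""
    results = []
    for area in image_areas:
        code_area = detect_code_area(area)            # step 503
        category = decode(code_area) if code_area is not None else None  # 504
        if category is None:                          # step 505 fallback
            category = visual_recognize(area)
        results.append((category, price_table[category]))
    return results

# Toy stand-ins: area "A" carries a readable code, area "B" does not.
prices = {"cola": 3.0, "chips": 5.5}
out = identify_and_price(
    ["A", "B"],
    detect_code_area=lambda a: "code" if a == "A" else None,
    decode=lambda c: "cola",
    visual_recognize=lambda a: "chips",
    price_table=prices,
)
print(out)  # [('cola', 3.0), ('chips', 5.5)]
```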
  • the barcode can uniquely identify the specific type of commodity, or called category.
  • the seller of the commodity associates the commodity category with the price in advance.
  • by recognizing the barcode, the category and price information of the product can be obtained directly.
  • some products are also printed with QR codes.
  • the category and price information of the product can also be obtained by identifying the two-dimensional code. Therefore, the detection of the commodity code area in step 503 may include the detection of barcodes and the detection of two-dimensional codes. The following describes the specific implementation of the above steps in combination with these two situations.
  • the aforementioned product code is a barcode.
  • a target detection model may be trained in advance, and the target detection model is used to detect the barcode area in the first image area.
  • the target detection model is a common model in image recognition, which is used to identify a specific target object from a picture.
  • the target detection model is obtained based on the training of image samples labeled with specific target objects.
  • the training sample picture with the barcode can be used to train a target detection model dedicated to detecting the barcode area.
  • the pictures can contain a combination of one or more products arbitrarily placed, and such product pictures are distributed to annotators, who will mark the barcode area of the products in the pictures.
  • the annotators can frame the barcode with a smallest enclosing rectangular box to mark the barcode area. In this way, product pictures with barcode label frames are obtained as training sample pictures for training the target detection model.
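As a trivial illustration of the "smallest rectangular box" annotation, the snippet below assumes (this is an assumption; the specification does not define an annotation format) that the annotator records barcode corners as (x, y) points and derives the smallest axis-aligned label frame from them:

```python
def smallest_enclosing_box(points):
    """Given annotated corner/outline points of a barcode as (x, y) pairs,
    return the smallest axis-aligned rectangle (x_min, y_min, x_max, y_max)
    that frames it -- the kind of label frame described above."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

# Four clicked corners of a slightly tilted barcode:
print(smallest_enclosing_box([(12, 40), (88, 36), (90, 72), (14, 76)]))
# (12, 36, 90, 76)
```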
  • a one-stage detection model can directly determine the category probability and position coordinates of the target object from the picture, that is, directly recognize the target object.
  • Typical examples of one-stage detection models include the SSD model, the YOLO model, etc.
  • a two-stage detection model first generates candidate regions, or regions of interest (ROI), in the picture, and then performs target recognition and bounding-box regression within the candidate regions.
  • Typical examples of two-stage detection models include R-CNN model, Fast R-CNN model, Faster R-CNN model, etc.
  • Other target detection models have also been proposed. Models with any of the above structures and algorithms can be used as the target detection model for barcode detection.
  • the barcode area is detected in the first image area through the pre-trained target detection model. If the barcode area is detected, then in step 504, the code in the barcode area is identified.
  • the recognition of the barcode encoding can be achieved by using conventional barcode reading technology.
  • the cash register of the embodiment of this specification allows the user to place multiple commodities on the counter at will.
  • the barcode contained in the captured image often has various distortions such as skew and distortion.
  • the detected barcode area is first corrected by perspective transformation to obtain a corrected barcode.
  • Perspective transformation can be achieved by performing a transformation operation with a projection transformation matrix, so as to obtain a corrected barcode with a standardized shape and direction. Code recognition can then be performed on the corrected barcode to obtain the encoding.
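As background for the projection-matrix operation mentioned above, the following snippet (an illustration, not taken from the specification) applies a 3x3 perspective matrix H to a 2D point. The division by w is what makes the transform projective rather than affine, and is what allows a skewed barcode quadrilateral to be straightened into a rectangle:

```python
def apply_homography(H, point):
    """Apply a 3x3 projective (perspective) transformation matrix H,
    given as nested lists, to a 2D point (x, y)."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return (u, v)

# With the identity matrix the point is unchanged; a matrix with a
# nonzero bottom row warps coordinates non-uniformly.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(apply_homography(identity, (3.0, 4.0)))   # (3.0, 4.0)
warp = [[1, 0, 0], [0, 1, 0], [0.1, 0, 1]]
print(apply_homography(warp, (10.0, 0.0)))      # (5.0, 0.0)
```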
  • Figures 6a-6c show schematic diagrams of the process of detecting and identifying the barcode area in an example.
  • the left part (a) shows part of the original image obtained by photographing the product. Perform barcode detection on this part of the image area to get the barcode area.
  • the middle part (b) schematically shows the deformed barcode obtained from the barcode area in the original image
  • the right part (c) shows the corrected barcode obtained after the perspective transformation of the part (b).
  • the aforementioned commodity code is a two-dimensional code.
  • Various methods can be used in step 503 to detect the two-dimensional code area.
  • a target detection model is trained for a two-dimensional code, and the target detection model is used to detect a two-dimensional code area in the first image area.
  • the structural characteristics of the two-dimensional code itself are used to detect it directly in the image area.
  • current two-dimensional codes usually have three positioning patterns at the upper-left, upper-right, and lower-left corners, and these positioning patterns have specific, prominent structural characteristics.
  • the positioning patterns often adopt a "回"-shaped (nested-square) structure, with a black square set inside a black frame. Therefore, these structural features can be used to detect positioning patterns in the image area.
  • the two-dimensional code detection usually has a certain degree of fault tolerance, allowing the use of two positioning graphics to recover the graphic relationship of the two-dimensional code when a certain positioning graphic cannot be detected due to contamination, occlusion, etc. Therefore, when at least two positioning graphics are detected, it can be determined that a two-dimensional code is detected.
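One common way to detect the nested-square positioning pattern is the run-length ratio test from the QR code specification (ISO/IEC 18004): a scan line through the pattern's center crosses dark and light runs in a 1:1:3:1:1 module ratio. A minimal sketch of that test on a binary scan line (1 = dark; the tolerance value is illustrative):

```python
# Detect a QR finder (positioning) pattern on one binary scan line by
# checking for a dark:light:dark:light:dark run sequence in the
# 1:1:3:1:1 module ratio defined by ISO/IEC 18004.

def runs(scanline):
    """Run-length encode a binary scan line into [(value, length), ...]."""
    out = []
    for v in scanline:
        if out and out[-1][0] == v:
            out[-1] = (v, out[-1][1] + 1)
        else:
            out.append((v, 1))
    return out

def looks_like_finder(scanline, tol=0.5):
    """True if some 5-run window starting on a dark run matches the
    1:1:3:1:1 ratio within a fractional tolerance per run."""
    r = runs(scanline)
    expected = [1, 1, 3, 1, 1]
    for i in range(len(r) - 4):
        window = r[i:i + 5]
        if window[0][0] != 1:          # the pattern must start on a dark run
            continue
        total = sum(n for _, n in window)
        module = total / 7.0           # 1+1+3+1+1 = 7 modules across
        if all(abs(n - e * module) <= tol * module
               for (_, n), e in zip(window, expected)):
            return True
    return False
```

A full detector repeats this test over rows and columns and intersects the candidates; the tolerance gives the fault tolerance mentioned above for slightly distorted codes.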
  • step 504 the coding information in the two-dimensional code area is identified.
  • the two-dimensional code area is corrected first, the graphic relationship is determined, and then the code recognition is performed.
  • perspective correction is performed on the detected two-dimensional code region through perspective transformation to obtain a corrected two-dimensional code.
  • Perspective transformation can be realized by using a projection transformation matrix to perform transformation operations.
  • Figure 7a illustrates the effect of the perspective transformation of a two-dimensional code.
  • the vertices of the two-dimensional code area can be corrected to obtain a square two-dimensional code with a standardized shape, that is, a corrected two-dimensional code.
  • Figure 7b shows the various sub-phases of this process.
  • the positioning pattern detection is performed again based on the corrected two-dimensional code, that is, the secondary feature detection.
  • the position of the positioning graphic and the position of the corresponding corner point are accurately determined.
  • determine the positional relationship (diagonal or same-side) of the two detected positioning patterns and, based on it, perform virtual corner positioning, that is, locate the corner corresponding to the third positioning pattern.
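Under the simplifying assumption that the code has already been perspective-corrected into a square, both the same-side/diagonal classification and the virtual corner can be sketched in a few lines (`size` is the side length of the corrected code; the half-size threshold and the parallelogram completion are illustrative heuristics, not the patent's exact rules):

```python
# Classify two detected finder-pattern centers and complete the missing
# fourth corner of a perspective-corrected QR code.

def relation(p, q, size):
    """Return 'diagonal' if the two finder centers sit on opposite corners
    of a size x size corrected code, else 'same-side' (heuristic)."""
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    return 'diagonal' if dx > size / 2 and dy > size / 2 else 'same-side'

def fourth_corner(tl, tr, bl):
    """After correction the three finder-pattern corners span a
    parallelogram, so the fourth (virtual) corner is tr + bl - tl."""
    return (tr[0] + bl[0] - tl[0], tr[1] + bl[1] - tl[1])
```

Once the virtual corner is located, the third positioning pattern can be filled in at that position, restoring the corner graphic relationship before decoding.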
  • the virtual corner positioning in FIG. 7b can be omitted, and the corner graphics relationship can be determined through secondary detection and fine positioning of the corners.
  • the coding feature is extracted from the corrected two-dimensional code, so as to identify the two-dimensional code code.
  • the category of the product can be accurately determined.
  • step 505 the category of the commodity contained therein is identified based on the first image area through visual recognition.
  • the visual recognition above mainly refers to training a target detection model through machine learning, and then using that model to directly detect and identify product categories.
  • the target detection model used in the visual recognition in step 505 is different from the aforementioned target detection model used to detect the barcode area.
  • a large number of product pictures can be taken in advance.
  • the pictures may contain one or more products placed arbitrarily; such product pictures are distributed to annotators, who draw a box around each product and label its category. Product pictures with such category annotation data can then be used as training sample images to train a target detection model for visual recognition.
  • After such a target detection model is trained, it can be used to perform product recognition on the aforementioned first image region and directly output the category of the product contained in that region.
  • a target detection model for visual recognition requires a large number of sample pictures for training, and using it for product recognition involves more complex computation and consumes more computing resources. Therefore, in the process of Figure 5, product-code detection, which is accurate and computationally cheap, is preferred for identifying the product; only when no product code is detected, or its code cannot be recognized, is visual recognition started, ensuring that the product can ultimately be recognized.
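The code-first, vision-fallback policy described above amounts to a short decision function. In this sketch the three callables are injected stubs standing in for the code detector, the decoder, and the visual classifier; none of these names come from the patent:

```python
# Per-region identification policy: try cheap, accurate product-code
# recognition first; fall back to the more expensive visual recognizer
# only when no code area is found or the code cannot be decoded.

def identify_product(region, detect_code, decode, visual_classify):
    """Return ('code', code) when a product code is read, otherwise
    ('visual', category) from the fallback visual recognizer."""
    code_area = detect_code(region)          # step 503: code area detection
    if code_area is not None:
        code = decode(code_area)             # step 504: decode the code
        if code is not None:
            return ('code', code)            # category looked up from the code
    return ('visual', visual_classify(region))  # step 505: visual fallback
```

The fallback fires both when detection fails and when decoding fails, matching the two "not detected or not recognized" conditions in the flow.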
  • the pricing result is determined according to the category of the commodity. Specifically, the price of the commodity may be determined based on the associated data of the commodity category and price recorded by the seller in advance. Finally, the pricing result can include information such as the name of the product corresponding to the product category, and the product price.
  • the above steps 503 to 506 describe the process of product identification and pricing for any first image region obtained by segmentation in the first image. It can be understood that the above process can be performed for each segmented image area, so as to identify the commodities in each image area and then calculate the price. Thus, the pricing result of each commodity included in the first image can be obtained.
  • multiple cameras can be arranged in the cash register to take photos of the commodities from multiple angles to obtain multiple images.
  • the above first image may be any one of a plurality of images produced by multi-angle shooting.
  • multiple images can also be synthesized to obtain the overall pricing result of the goods on the counter.
  • For each of the multiple images, the process shown in FIG. 5 is executed, so as to obtain the commodity pricing result corresponding to each image. Then, according to the relative position relationship between the multiple cameras, the image areas corresponding to the same commodity in the different images are determined, and duplicate pricing of that commodity is removed from the pricing results, so that the same commodity is not priced more than once. This process is also called "de-duplication".
  • the multiple images also include another image, which is called a second image.
  • the second image is obtained by taking a picture of the commodity on the table by the second camera. Similar to FIG. 5, the second image can be segmented to obtain each image area corresponding to the number of commodities contained therein. It can be understood that the number of commodities contained in the second image may be different from the first image. For example, if there are 3 items on the table and one of them obscures the other item in a certain direction, when the first camera shoots in the above direction, the first image contains only 2 items. When the second camera shoots from different angles, the second image obtained can contain 3 items.
  • assume the image areas obtained by segmenting the second image include a second image area.
  • the product category corresponding to the second image area is determined through code recognition or visual recognition.
  • According to the relative positional relationship between the first camera and the second camera, it is determined whether the first image area and the second image area correspond to the same product. It can be understood that after the multiple cameras are installed, position calibration can be performed to obtain calibration information, which indicates the relative position relationship between two cameras, for example the first and second cameras, and the overlap relationship between the pictures they capture.
  • Figure 8 shows the calibration of the camera in an example.
  • the cameras C1 and C2 are both fixed on the lighting panel, facing the cashier counter.
  • the overlapping area of the FOV of C1 and C2 on the plane corresponding to the cashier counter can be determined.
  • the picture P1 taken by C1 and the picture P2 taken by C2 will have a corresponding overlap range, as shown in the shaded area.
  • For the aforementioned first and second cameras, based on the calibration information obtained in this way, it can be determined whether the first image area and the second image area fall into the overlapping area between the pictures, and thus whether they correspond to the same product.
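One simple way to realize this check is to project each region's center onto the shared counter plane using the per-camera homographies obtained during calibration, and treat two regions as the same product when the projected centers nearly coincide. A sketch under that assumption; the homographies used in the test and the distance tolerance are illustrative:

```python
# Decide whether two image regions, one from each camera, show the same
# product: map each region center into shared counter-plane coordinates
# with that camera's calibration homography, then compare distances.

def same_product(center1, H1, center2, H2, tol=5.0):
    """True if the two centers land within `tol` of each other on the
    counter plane. H1/H2 are 3x3 calibration homographies."""
    def to_table(H, p):
        x, y = p
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
                (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
    (x1, y1) = to_table(H1, center1)
    (x2, y2) = to_table(H2, center2)
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5 <= tol
```

When this returns true, only one of the two regions' pricing results is kept, which is exactly the de-duplication step.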
  • If so, the product pricing result only needs to include the pricing of the product corresponding to one of the two regions; in other words, the pricing result of the product corresponding to one of the two image regions is excluded from the overall pricing result.
  • the above is the process of "vertically” performing product recognition processing on multiple images, and then integrating the processing results of the multiple images.
  • the image areas of multiple images can be integrated "horizontally", and then product identification and pricing can be performed.
  • Fig. 9 shows a flow chart of product identification and pricing for multiple images according to an embodiment. Similar to FIG. 5, the process of this method can be implemented by a computing device in a cash register or a server. As shown in Figure 9, the method flow includes the following steps.
  • step 901 a plurality of images are acquired, and the plurality of images are obtained by photographing the commodities placed on the cashier counter by the plurality of cameras.
  • multiple cameras can take pictures of the product from different angles and positions to obtain the above-mentioned multiple images.
  • step 902 image segmentation is performed on the multiple images to obtain image regions corresponding to each image.
  • image segmentation reference may be made to the foregoing description of step 502, which will not be repeated.
  • step 903 a plurality of image regions corresponding to the same product are determined from the image regions corresponding to each image according to the relative positional relationship of the multiple cameras.
  • the overlap relationship between the pictures taken by each camera can be known through the calibration information. In this way, multiple image areas corresponding to the same commodity can be determined in each image area of the multiple images.
  • 6 cameras are used to take pictures of 4 items on the table to obtain 6 images.
  • Through image segmentation, each of these 6 images is divided into several regions. Because there may be occlusion between commodities at certain angles, the number of commodities captured by each camera may differ, and so may the number of image regions obtained by segmentation. For example, 5 of the 6 images are each divided into 4 image regions, while the remaining image is divided into 3. Then, according to the positional relationship of the six cameras, the image areas corresponding to the same product can be matched across the six segmented images. A product that is not occluded has a corresponding image area in all 6 images, yielding 6 image regions for that product; a product occluded in one image yields 5.
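Assuming each detected region has already been mapped into shared counter-plane coordinates via the calibration information, matching regions across cameras reduces to clustering nearby points. A greedy sketch follows; the tolerance is illustrative, and a real system would use a more robust assignment:

```python
# Greedily cluster per-camera region detections, already mapped to
# shared counter-plane coordinates, so that each cluster gathers the
# image regions of one physical product across all cameras.

def group_regions(regions, tol=5.0):
    """regions: list of (camera_id, (x, y)). Returns a list of clusters,
    each a list of (camera_id, point) for one product."""
    clusters = []
    for cam, p in regions:
        for cluster in clusters:
            cx, cy = cluster[0][1]   # compare against the cluster's seed point
            if ((p[0] - cx) ** 2 + (p[1] - cy) ** 2) ** 0.5 <= tol:
                cluster.append((cam, p))
                break
        else:
            clusters.append([(cam, p)])
    return clusters
```

In the 6-camera example above, an unoccluded product would yield a cluster of 6 entries and an occluded one a cluster of 5.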
  • step 904 the product code area detection is performed in the plurality of image areas.
  • the commodity code area detection reference may be made to the foregoing description in conjunction with step 503, which will not be repeated.
  • When a code area is detected in any image area corresponding to the same product, in step 905 the code in the code area is recognized, and the category of that product is determined according to the recognized code.
  • If no code area is detected, or no code can be recognized, in any of the multiple image areas, then in step 906 the category of the same commodity is identified through visual recognition based on at least one of those image areas.
  • the process of visual recognition is as described above in conjunction with step 505.
  • Thus, through the code recognition in step 905 or the visual recognition in step 906, the category of the same commodity is determined.
  • the pricing result of the product is determined according to the category of the same product.
  • the comprehensive method of prioritizing code recognition and then visual recognition allows users to place multiple commodities at the checkout counter for pricing at will, which greatly improves user experience.
  • a self-service checkout counter can be set up in an express lane, allowing users who have purchased only a few commodities to complete pricing by self-service. This improves the user's convenience and greatly shortens checkout time.
  • Figures 10-11 illustrate some possible solutions in which the functions described in the embodiments of this specification are implemented in hardware, firmware, software, or a combination thereof.
  • Fig. 10 illustrates a schematic block diagram of a commodity cash register device according to an embodiment of the present specification.
  • the device can be deployed in the computing device of the self-service cash register shown in FIG. 3, or in the server shown in FIG. 4.
  • the device 100 includes:
  • the image acquisition unit 101 is configured to acquire a first image, the first image being obtained by a first camera photographing at least one commodity placed on the cashier counter;
  • the image segmentation unit 102 is configured to perform image segmentation on the first image to obtain at least one image area, which includes the first image area;
  • the code area detection unit 103 is configured to perform code area detection in the first image area
  • the code recognition unit 104 is configured to recognize the code in the code area when the code area is detected, and determine the category of the first commodity contained in the first image area according to the recognized code;
  • the visual recognition unit 105 is configured to recognize the category of the first commodity based on the first image area through visual recognition when the code area is not detected or the aforementioned code cannot be recognized;
  • the pricing unit 106 is configured to determine the pricing result of the first commodity according to the category of the first commodity.
  • the image acquisition unit 101 is configured to control the first camera to capture the at least one commodity to obtain the first image.
  • the image acquisition unit 101 is configured to receive the first image from a self-service checkout counter.
  • the first image is taken from one of the top view, front view, left view, right view, rear view, and oblique view directions of the at least one commodity, where the oblique view direction means the angle between the shooting direction and the vertical direction of the cash register counter is 30-60 degrees.
  • the image segmentation unit 102 is configured to perform image segmentation on the first image using an image segmentation model; wherein the image segmentation model is obtained by pre-training using segmented sample pictures, and the segmented sample The picture contains the image of the product and has annotation data for marking the outline of the product.
  • the code area is a barcode area; correspondingly, the code area detection unit 103 is configured to: use a first target detection model to detect the barcode area in the first image area; wherein, the The first target detection model is obtained by pre-training using a first training sample picture, the first training sample picture containing a product image and having label data for frame selection of a barcode area in the product image.
  • the code recognition unit 104 is configured to: correct the detected barcode area through perspective transformation to obtain a corrected barcode; and perform code recognition on the corrected barcode.
  • the code area is a two-dimensional code area; correspondingly, the code area detection unit 103 is configured to: detect the positioning patterns of the two-dimensional code in the first image area, and determine that a two-dimensional code area is detected when at least two positioning patterns are found.
  • the code recognition unit 104 is configured to: perform perspective correction on the detected two-dimensional code area through perspective transformation to obtain a corrected two-dimensional code; determine the corner graphic relationship in the corrected two-dimensional code based on the at least two positioning patterns; and, based on that relationship, extract coding features from the corrected two-dimensional code so as to recognize the two-dimensional code.
  • the visual recognition unit 105 is configured to:
  • a second target detection model is used to determine the category of the first commodity based on the first image area; the second target detection model is pre-trained using second training sample pictures, which contain product images and carry annotation data that frames each product and marks its category.
  • the image acquisition unit 101 is further configured to acquire a second image, and the second image is obtained by photographing the at least one commodity by a second camera;
  • the image segmentation unit 102 is further configured to perform image segmentation on the second image to obtain at least a second image area;
  • the code recognition unit 104 or the visual recognition unit 105 is further configured to determine the category of the second commodity corresponding to the second image area;
  • the device further includes (not shown): a relationship determining unit configured to determine, according to the relative positional relationship between the first camera and the second camera, that the first image area and the second image area correspond to the same product; and an exclusion unit configured to exclude the pricing result of one of the first commodity and the second commodity from the commodity pricing result.
  • Fig. 11 illustrates a schematic block diagram of a commodity cash register device according to another embodiment of the present specification.
  • the device can be deployed in the computing device of the self-service cash register shown in Figure 3, or can be deployed in the server shown in Figure 4.
  • the device 110 includes:
  • the image acquisition unit 111 is configured to acquire a plurality of images, and the plurality of images are obtained by photographing at least one commodity placed on the cash register by a plurality of cameras;
  • the image segmentation unit 112 is configured to perform image segmentation on the multiple images to obtain image regions corresponding to each image;
  • the region relationship determining unit 113 is configured to determine, from the image regions corresponding to the respective images, a plurality of image regions corresponding to the same commodity according to the relative position relationship of the plurality of cameras;
  • the code area detection unit 114 is configured to perform code area detection in the multiple image areas
  • the code recognition unit 115 is configured to recognize the code in the code area when a code area is detected in any image area, and determine the category of the same commodity according to the recognized code;
  • the visual recognition unit 116 is configured to recognize based on at least one image area of the plurality of image areas through visual recognition when no code area is detected or no code is recognized in the plurality of image areas The category of the same commodity;
  • the pricing unit 117 is configured to determine the pricing result of the commodity according to the category of the same commodity.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Toxicology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Electromagnetism (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Cash Registers Or Receiving Machines (AREA)

Abstract

A self-service checkout method for commodities and a self-service checkout counter. The method includes acquiring an image obtained by a camera photographing commodities placed on the checkout counter (501), and then segmenting the image into image areas (502). In any image area, a code area of a commodity code is detected (503); if a code area is detected, the code in it is recognized, and the category of the commodity contained in that image area is determined according to the recognized code (504); if no code area is detected or the code cannot be recognized, the category of the commodity is identified through visual recognition based on the image area (505); finally, the pricing result of the commodity is determined according to its category (506). This approach combines the scanning speed and accuracy of commodity barcode schemes with the user experience of visual recognition schemes.

Description

Self-service checkout method and device for commodities — Technical Field
This specification relates to the field of computer technology, and in particular to self-service checkout methods and devices for commodities.
Background
Technological development has driven change in the retail sector. The new retail field uses self-service checkout counters to improve checkout efficiency and reduce labor costs. In one scheme, a customer aligns a commodity's barcode with the machine scanning area of the checkout counter, whereby the type of commodity is identified.
This scheme requires the customer to find the commodity barcode and help the machine read it. It demands a high degree of user participation, commodities can only be checked out one at a time, and there is considerable room to improve the user experience.
Summary
One or more embodiments of this specification describe a self-service checkout method and apparatus that combine code recognition and visual recognition to improve the efficiency and accuracy of commodity identification and enhance the user experience.
According to a first aspect, an embodiment of this specification provides a self-service checkout method for commodities, including:
acquiring a first image, the first image being obtained by a first camera photographing at least one commodity placed on the checkout counter;
performing image segmentation on the first image to obtain at least one image area, including a first image area;
performing code area detection for a commodity code in the first image area;
when a code area is detected, recognizing the code in the code area, and determining the category of the first commodity contained in the first image area according to the recognized code;
when no code area is detected or the code cannot be recognized, identifying the category of the first commodity based on the first image area through visual recognition;
determining the pricing result of the first commodity according to the category of the first commodity.
In one implementation, acquiring the first image includes controlling the first camera to photograph the at least one commodity to obtain the first image.
In another implementation, acquiring the first image includes receiving the first image from a self-service checkout counter.
According to different embodiments, the first image is taken from one of the top view, front view, left view, right view, rear view, and oblique view directions of the at least one commodity, where the oblique view direction means the angle between the shooting direction and the vertical direction of the checkout counter is 30-60 degrees.
In one implementation, performing image segmentation on the first image includes using an image segmentation model; the model is pre-trained on segmentation sample pictures that contain commodity images and carry annotation data marking the commodity outlines.
According to one implementation, the commodity code is a barcode; in this case, performing code area detection in the first image area includes: using a first target detection model to detect a barcode area in the first image area, the first target detection model being pre-trained on first training sample pictures that contain commodity images and carry annotation data framing the barcode areas in those images.
Further, in one embodiment, recognizing the code in the code area includes: correcting the detected barcode area through perspective transformation to obtain a corrected barcode, and performing code recognition on the corrected barcode.
According to another implementation, the commodity code is a two-dimensional code; in this case, performing code area detection in the first image area includes: detecting positioning patterns of the two-dimensional code in the first image area, and determining that a two-dimensional code area is detected when at least two positioning patterns are detected.
Further, in one embodiment, recognizing the code in the code area includes: performing perspective correction on the detected two-dimensional code area through perspective transformation to obtain a corrected two-dimensional code; determining the corner graphic relationship in the corrected two-dimensional code based on the at least two positioning patterns; and, based on that corner graphic relationship, extracting coding features from the corrected two-dimensional code so as to recognize the two-dimensional code.
In one implementation, visual recognition includes: using a second target detection model to determine the category of the first commodity based on the first image area, the second target detection model being pre-trained on second training sample pictures that contain commodity images and carry annotation data framing each commodity and marking its category.
In one implementation, the method further includes:
acquiring a second image, obtained by a second camera photographing the at least one commodity;
performing image segmentation on the second image to obtain at least a second image area;
determining the category of a second commodity corresponding to the second image area through code recognition or visual recognition;
determining, according to the relative positional relationship between the first camera and the second camera, that the first image area and the second image area correspond to the same commodity;
excluding the pricing result of one of the first commodity and the second commodity from the commodity pricing result.
According to a second aspect, a self-service checkout method for commodities is provided, including:
acquiring multiple images, obtained by multiple cameras each photographing at least one commodity placed on the checkout counter;
performing image segmentation on each of the multiple images to obtain the image areas corresponding to each image;
determining, from the image areas corresponding to the respective images and according to the relative positional relationship of the multiple cameras, multiple image areas corresponding to the same commodity;
performing code area detection for a commodity code in the multiple image areas;
when a code area is detected in any image area, recognizing the code in the code area and determining the category of the same commodity according to the recognized code;
when no code area is detected, or no code is recognized, in any of the multiple image areas, identifying the category of the same commodity through visual recognition based on at least one of the multiple image areas;
determining the pricing result of the commodity according to the category of the same commodity.
According to a third aspect, a self-service checkout apparatus for commodities is provided, including:
an image acquisition unit, configured to acquire a first image obtained by a first camera photographing at least one commodity placed on the checkout counter;
an image segmentation unit, configured to perform image segmentation on the first image to obtain at least one image area, including a first image area;
a code area detection unit, configured to perform code area detection for a commodity code in the first image area;
a code recognition unit, configured to, when a code area is detected, recognize the code in the code area and determine the category of the first commodity contained in the first image area according to the recognized code;
a visual recognition unit, configured to, when no code area is detected or the code cannot be recognized, identify the category of the first commodity based on the first image area through visual recognition;
a pricing unit, configured to determine the pricing result of the first commodity according to the category of the first commodity.
According to a fourth aspect, a self-service checkout apparatus for commodities is provided, including:
an image acquisition unit, configured to acquire multiple images obtained by multiple cameras each photographing at least one commodity placed on the checkout counter;
an image segmentation unit, configured to perform image segmentation on each of the multiple images to obtain the image areas corresponding to each image;
an area relationship determining unit, configured to determine, from the image areas corresponding to the respective images and according to the relative positional relationship of the multiple cameras, multiple image areas corresponding to the same commodity;
a code area detection unit, configured to perform code area detection for a commodity code in the multiple image areas;
a code recognition unit, configured to, when a code area is detected in any image area, recognize the code in the code area and determine the category of the same commodity according to the recognized code;
a visual recognition unit, configured to, when no code area is detected, or no code is recognized, in any of the multiple image areas, identify the category of the same commodity through visual recognition based on at least one of the multiple image areas;
a pricing unit, configured to determine the pricing result of the commodity according to the category of the same commodity.
According to a fifth aspect, an embodiment of this specification provides a computer-readable storage medium having a computer program stored thereon which, when executed in a computer, causes the computer to perform the method of the first or second aspect.
According to a sixth aspect, an embodiment of this specification provides a self-service checkout counter including a storage device and a processor communicably coupled to the storage device; the storage device stores an application program, and the processor can execute the application program to implement the method of the first or second aspect.
According to a seventh aspect, an embodiment of this specification provides a server including a storage device, a network interface, and a processor communicably coupled to the storage device and the network interface; the storage device stores a server program, and the processor can execute the server program to implement the method of either the first or second aspect.
The self-service checkout method and counter provided by the embodiments of this specification combine the scanning speed and accuracy of commodity barcode schemes with the user experience of visual recognition schemes.
Brief Description of the Drawings
The above and/or additional aspects and advantages of this application will become apparent and easy to understand from the following description of embodiments in conjunction with the drawings, in which:
Fig. 1 is a scene diagram of the self-service checkout system disclosed in this specification;
Fig. 2 shows a bottom view of the panel seen from below;
Fig. 3 is a schematic diagram of the electronic structure of the computing device in the self-service checkout counter of an embodiment of this specification;
Fig. 4 is a structural diagram of an image recognition server according to an embodiment of this specification;
Fig. 5 is a schematic flowchart of a commodity identification and pricing method according to an embodiment of this specification;
Figs. 6a-6c are schematic diagrams of the process of detecting and recognizing a barcode area in one example;
Fig. 7a illustrates the effect of perspective transformation of a two-dimensional code;
Fig. 7b shows the sub-stages of determining the corner graphic relationship;
Fig. 8 shows camera calibration in one example;
Fig. 9 shows a flowchart of commodity identification and pricing for multiple images according to an embodiment;
Fig. 10 is a schematic block diagram of a commodity checkout device according to an embodiment of this specification;
Fig. 11 is a schematic block diagram of a commodity checkout device according to another embodiment of this specification.
Detailed Description
Embodiments of this application are described in detail below, with examples shown in the drawings, where identical or similar reference numerals throughout denote identical or similar modules or modules with identical or similar functions. The embodiments described below with reference to the drawings are exemplary, intended only to explain this application, and are not to be construed as limiting it.
Self-service checkout counters are widely used in the new retail field to improve checkout efficiency and reduce labor costs. Among the options, commodity identification based on machine vision has become one of the mainstream industry schemes because of its cost and accuracy. However, a vision scheme identifies commodities from their appearance using machine learning algorithms; affected by external lighting, commodity placement angle, and similar conditions, it cannot guarantee 100% recognition accuracy. On the other hand, commodity packaging carries a barcode that unambiguously identifies the commodity, but since users place commodities themselves, it is hard to guarantee that the code is always exposed and recognizable.
Therefore, this specification proposes a hybrid approach combining visual recognition and commodity code recognition. When identifying a commodity, the commodity barcode is recognized first; if it can be recognized, the commodity type is obtained quickly and accurately. If the barcode cannot be detected or recognized, normal visual recognition is started to detect the commodity.
Fig. 1 is a scene diagram of the self-service checkout system disclosed in this specification. As shown in Fig. 1, the self-service checkout system includes a self-service checkout counter 12 and a server system 18, which can be connected through a network 16.
The self-service checkout counter 12 may include a countertop 130 and a panel 120 arranged opposite each other. The countertop 130 is below and carries one or more commodities 132, 134. The panel 120 is above and provides an ideal lighting environment for the commodities on the countertop, keeping their illumination stable and helping the commodity detection or recognition algorithms run well.
The self-service checkout counter may be equipped with at least one camera, which can video or photograph the commodities 134, 132 placed on the counter to obtain videos or images of them. A video or image may include the video or image portion of each of multiple commodities. For brevity, only images are used as examples below; those skilled in the art will appreciate that an image in this specification may be either a directly captured image or an image extracted from a video.
With the checkout counter of the embodiments of this specification, the user can place commodities on the countertop 130 fairly freely; in other words, the orientation and position of each commodity on the countertop may differ.
The self-service checkout counter 12 sends the video or images to the server system through the network. The network 16 may be a wired network, a wireless network, a local area network, the Internet, and so on.
The server 18 determines the category of each commodity from the images through commodity code detection and/or visual recognition. Specifically, in one example, the server first segments the image and, for each segmented image area, detects and recognizes the commodity code it carries. If the commodity code can be read normally, the commodity type is obtained accurately and recognition is complete. For an image area that relates to a commodity but from which no commodity code can be read, a visual recognition algorithm is started to detect its category.
The server system or the self-service checkout counter can determine the pricing result of a commodity according to the recognized commodity category.
Those skilled in the art will appreciate that commodity code detection and visual recognition can be implemented by a single server. The server system may also include multiple servers that perform the corresponding detection or recognition work simultaneously or separately as required; for example, commodity segmentation of the image is performed by server A, commodity code detection by server B, and visual recognition by server C. Therefore, in this specification a server can refer either to a single server itself or to a server cluster.
Moreover, although Fig. 1 illustrates an example in which the server performs commodity recognition from commodity images, recognition and pricing can also be performed by a computing device installed at the checkout counter. In that case, the computing device obtains the images captured by the cameras directly, without sending them to a server.
Through the hybrid approach described above, the scheme of this specification can combine the advantages of both barcode recognition and visual recognition, achieving automatic checkout while ensuring a good user experience.
It should be understood that the overall architecture, arrangement, and operation of the self-service checkout system and its components are merely exemplary, and differently configured systems can also be used to implement the method examples disclosed in this invention.
Fig. 2 shows a bottom view of the panel seen from below. The panel may be opaque or translucent so as to block external light from the ceiling or other angles, thereby avoiding uncontrollable effects of external light on the illumination of the commodities. The panel can take many shapes, such as rectangular or arc-shaped, with extended cover plates on both sides; any panel shape that blocks or partially blocks the light sources above the panel is acceptable.
As shown in Fig. 2, the panel includes one or more light sources 121, 122, which may be LEDs or other types. The light sources may be located on the lower surface of the panel or embedded in it. They keep the illumination of the commodities on the self-service checkout counter stable, which helps the commodity recognition algorithms. In a specific application scenario there are many choices for the number and arrangement of light sources: two, four, or even more are possible, and the layout can be customized as needed. One key to vision-based commodity recognition is obtaining clear pictures, for which stable and uniform illumination is a good guarantee.
The light sources can take the form of controllable light sources whose brightness is adjusted as the working environment changes.
The self-service checkout counter may be provided with one or more cameras 124, 125, 126, which may be ordinary RGB cameras or 3D cameras, arranged according to the field of view (FOV) of the cameras used and the size of the commodity placement counter. Multiple cameras can acquire images from different angles, effectively avoiding occlusion. With only one camera, a slide rail can be configured so the camera slides along the rail to obtain commodity images from different angles.
In one example, different cameras can acquire images of the commodities from at least one of the top view, front view, left view, right view, and rear view directions. To obtain images from the front, rear, left, or right view directions, cameras may be mounted on objects other than the panel.
In one example, a camera can also acquire images from an oblique view direction of the commodities, meaning a direction whose angle to the vertical of the counter surface is 30-60 degrees, where the shooting direction is the direction of the center line of the camera's FOV.
To implement self-service checkout, the counter also needs a corresponding computing device. Fig. 3 is a schematic diagram of the electronic structure of the computing device in the self-service checkout counter of an embodiment of this specification, which may include multiple electronic devices or apparatuses. As shown in Fig. 3, the processor 510 controls the overall operation of the computing device. The LED controller 532 can control multiple LED lamps (LED#1, LED#2, LED#N) so that they provide uniform and stable illumination. CAM HUB 534 is a camera hub that can control two or more cameras (CAM#1, CAM#2, CAM#N) to acquire images. Optionally, the computing device may further include a network/bus interface 526 coupled to a data link for data communication with the server; the network/bus interface 526 can also receive images from the cameras, and in the case of a wireless connection it may include a wireless transceiver. The device also includes flash memory FLASH 524. In one example, FLASH 524 stores software that is loaded from FLASH into DRAM 522 and thereby directs the CPU 510 to perform the corresponding operations.
Fig. 4 is a structural diagram of an image recognition server according to an embodiment of this specification. As shown in Fig. 4, the server may include a processor 702, a microprocessor that controls the overall operation of the server. A data bus 715 facilitates data transfer among the storage device 740, the processor 702, and the network interface 714.
The server also includes a storage device 740, which can store the server program. The device may further include random access memory (RAM) 720 and read-only memory (ROM) 722. ROM 722 can store, in a non-volatile manner, programs, utilities, or processes to be executed, such as the operating system. RAM 720, also called memory, provides volatile data storage and stores the instructions for running the operating system and server program, together with their related data.
In operation, the server program is loaded from the storage device 740 into RAM 720 and thereby directs the processor 702 to perform the corresponding operations.
Fig. 5 is a schematic flowchart of a commodity identification and pricing method according to an embodiment of this specification. The method can be compiled as software and implemented by the checkout counter computing device shown in Fig. 3 or by the server shown in Fig. 4; alternatively, it can be implemented jointly by both, with the checkout counter computing device and the server each executing part of the flow.
As shown in Fig. 5, first, in step 501, a first image is acquired, obtained by a first camera photographing at least one commodity placed on the checkout counter.
As mentioned above, one or more cameras may be arranged in the checkout counter to photograph the commodities. If only one camera is arranged, the image it captures is the aforementioned first image. If multiple cameras are arranged, they can photograph the commodities on the counter from different angles, producing multiple images; in that case, for clarity and simplicity of description, any one of them may be called the first camera and the commodity image it captures the first image. It should be understood that "first" and "second" herein are merely for distinction in description and are not intended to limit order of appearance or any other aspect.
In one embodiment, the method is executed by the checkout counter computing device. In that case, in step 501 the computing device controls the first camera in the counter to photograph the commodities placed on the countertop, obtaining the first image.
In another embodiment, the method is executed by the server. In that case, the first camera in the counter produces the first image by photographing the commodities on the countertop, and the computing device sends the first image to the server via the network/bus interface shown in Fig. 3; accordingly, in step 501 the server receives the first image from the self-service checkout counter, thereby acquiring it.
Next, in step 502, image segmentation is performed on the first image to obtain at least one image area.
Image segmentation can be implemented with a variety of algorithms and/or models.
In one embodiment, image segmentation can be based on conventional image processing, including object boundary recognition based on pixel grayscale or contrast analysis (similar to the boundary recognition in matting tools). Based on boundaries so recognized, the image can be divided into several image areas; generally, each image area corresponds to one recognized object, i.e. one commodity in the scenario of this embodiment.
In general, conventional image processing suits cases where the object contrasts clearly with a uniform background color and has clear boundaries. Photographing commodities on a countertop is usually more complicated, especially with multiple commodities. Therefore, in one implementation, an image segmentation model is trained in advance and used to segment the first image to be analyzed.
Specifically, a large number of commodity pictures can be taken, each containing one or more commodities placed in arbitrary combinations. Such pictures are distributed to annotators, who mark the outlines of the commodities in them. Pictures containing commodity images with annotated outlines can then serve as segmentation sample pictures for training the image segmentation model, which may be, for example, a Mask-RCNN-based model, a conditional random field (CRF) based model, and so on.
After such an image segmentation model is trained, it can be used to segment the first image. Generally, through image segmentation, the first image can be divided into image areas corresponding to the number of commodities in the picture.
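As a toy stand-in for the trained segmentation model described above, the following sketch splits a binary foreground mask into connected regions, one per commodity blob, via flood fill. A real system would use the learned model (e.g. Mask-RCNN) rather than this heuristic, but the output shape is the same: one image region per commodity.

```python
# Split a binary foreground mask into 4-connected regions via BFS flood
# fill; each region is a set of (row, col) pixels for one commodity blob.
from collections import deque

def segment(mask):
    """mask: 2D list of 0/1. Returns a list of pixel sets, one per region."""
    h, w = len(mask), len(mask[0])
    seen, regions = set(), []
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and (sy, sx) not in seen:
                queue, region = deque([(sy, sx)]), set()
                seen.add((sy, sx))
                while queue:
                    y, x = queue.popleft()
                    region.add((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                regions.append(region)
    return regions
```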
For simplicity of description, any one of the image areas, called the first image area, is taken as an example below.
Then, in step 503, code area detection for a commodity code is performed in the first image area. If a code area is detected, then in step 504 the code in the code area is recognized, and the category of the commodity contained in the first image area is determined according to the recognized code. If no code area is detected, then in step 505 the category of the commodity it contains is identified through visual recognition based on the first image area.
It can be understood that at present the vast majority of commodities are printed with a barcode that uniquely identifies the specific type, or category, of the commodity. The seller associates commodity categories with prices in advance, so recognizing the barcode directly yields the commodity's category and price information. In addition, some commodities are printed with a two-dimensional code, and in some cases category and price information can likewise be obtained by recognizing it. Therefore, the commodity code area detection in step 503 may include detection of barcodes and detection of two-dimensional codes. The specific implementation of the above steps is described below for these two cases.
In one embodiment, the commodity code is a barcode. To detect the barcode area, in one embodiment a target detection model can be trained in advance and used to detect the barcode area in the first image area.
A target detection model is a common model in image recognition for recognizing specific target objects in a picture. Generally, a target detection model is trained on picture samples in which the specific target objects are annotated. When barcode areas need to be detected, training sample pictures with annotated barcodes can be used to train a target detection model dedicated to detecting barcode areas.
Specifically, a large number of commodity pictures can be taken, containing one or more commodities in arbitrary arrangements, and distributed to annotators, who mark the barcode areas in the pictures; more specifically, an annotator can mark a barcode area with the smallest rectangular box enclosing the barcode. Commodity pictures with barcode annotation boxes obtained in this way serve as training sample pictures for the target detection model.
In this field, a wide variety of target detection models based on various network structures and detection algorithms have been proposed. For example, a one-stage detection model determines the class probability and position coordinates of the target object directly from the picture, i.e. recognizes the target object directly; typical examples include the SSD and Yolo models. A two-stage detection model first generates candidate regions, or regions of interest (ROI), in the picture and then performs target recognition and bounding-box regression within them; typical examples include the R-CNN, Fast R-CNN, and Faster R-CNN models. Other target detection models have also been proposed. Models with any of these structures and algorithms can serve as the target detection model for barcode detection.
In this way, the barcode area is detected in the first image area using the pre-trained target detection model. If a barcode area is detected, then in step 504 the code in the barcode area is recognized.
In one embodiment, recognition of the barcode's code can be achieved with conventional barcode reading technology.
However, unlike a conventional checkout counter where the user actively holds the barcode up to the scanning window, the checkout counter of the embodiments of this specification allows the user to place multiple commodities on the countertop freely. As a result, barcodes in the captured image often exhibit various deformations such as skew and warping. To improve the barcode reading recognition rate, in one embodiment the detected barcode area is first corrected through perspective transformation to obtain a corrected barcode. The perspective transformation can be implemented by applying a projection transformation matrix, yielding a corrected barcode with standardized shape and orientation; code recognition can then be performed on the corrected barcode to obtain its code.
Figs. 6a-6c are schematic diagrams of the process of detecting and recognizing a barcode area in one example. In Figs. 6a-6c, part (a) on the left shows part of the original image obtained by photographing the commodity; performing barcode detection on this image area yields the barcode area. Part (b) in the middle schematically shows the deformed barcode obtained from the barcode area in the original image, and part (c) on the right shows the corrected barcode obtained after perspective transformation of part (b). By recognizing the corrected barcode in part (c), the code corresponding to the commodity can be obtained.
In another embodiment, the commodity code is a two-dimensional code. Various methods can be used in step 503 to detect the two-dimensional code area. For example, in one example, similarly to barcodes, a target detection model is trained for two-dimensional codes and used to detect the two-dimensional code area in the first image area.
In another example, the structural characteristics of the two-dimensional code itself are used to detect it directly in the image area. Specifically, current two-dimensional codes usually have three positioning patterns at the upper-left, upper-right, and lower-left corners, and these patterns have specific, prominent structural characteristics: a positioning pattern often adopts a "回"-shaped (nested-square) structure, a black square set inside a black frame. These structural features can therefore be used to detect positioning patterns in the image area. Generally, two-dimensional code detection has a certain fault tolerance, allowing the graphic relationship of the code to be recovered from two positioning patterns when one pattern cannot be detected due to smudging, occlusion, and the like. Therefore, when at least two positioning patterns are detected, it can be determined that a two-dimensional code is detected.
When a two-dimensional code is determined to be detected, in step 504 the coding information in the two-dimensional code area is recognized.
As noted earlier, because the user places commodities on the counter casually, the detected two-dimensional code area often also exhibits large deformation and is hard to decode directly. Therefore, in one embodiment, the two-dimensional code area is first corrected and the graphic relationship determined, and then code recognition is performed.
Specifically, in one embodiment, perspective correction is first applied to the detected two-dimensional code area through perspective transformation to obtain a corrected two-dimensional code. The perspective transformation can be implemented by applying a projection transformation matrix.
Fig. 7a illustrates the effect of the perspective transformation of a two-dimensional code. As shown, perspective transformation corrects the vertices of the two-dimensional code area, producing a square two-dimensional code with standardized shape, i.e. the corrected two-dimensional code.
Then, based on the at least two positioning patterns detected earlier, the corner graphic relationship in the corrected two-dimensional code is determined. Fig. 7b shows the sub-stages of this process.
As shown in Fig. 7b, positioning pattern detection is first performed again on the corrected two-dimensional code, i.e. secondary feature detection, thereby precisely determining the positions of the positioning patterns and the corresponding corner points. Then the positional relationship of the two detected positioning patterns (diagonal or same-side) is determined and, based on it, virtual corner positioning is performed, i.e. the corner corresponding to the third positioning pattern is located. Finally, the third positioning pattern is filled in at the position corresponding to the virtual corner, restoring the corner graphic relationship of the two-dimensional code. When three positioning patterns are detected, the virtual corner positioning in Fig. 7b can be omitted, and the corner graphic relationship is determined through secondary detection and fine corner positioning.
Next, based on the obtained corner graphic relationship, coding features are extracted from the corrected two-dimensional code, thereby recognizing the two-dimensional code.
Whether barcode or two-dimensional code, once its code is recognized, the category of the commodity can be determined accurately.
Returning to Fig. 5, if no code area is detected in step 503, or the code in the code area cannot be recognized, then in step 505 the category of the commodity contained in the first image area is identified through visual recognition. The visual recognition here mainly refers to training a target detection model through machine learning and then using the model to directly detect and identify commodity categories.
It should be understood that, because the target objects to be detected and the required outputs differ, the target detection model used for visual recognition in step 505 is not the same as the aforementioned model used to detect barcode areas.
To train the target detection model for visual recognition, a large number of commodity pictures can be taken in advance, containing one or more commodities in arbitrary arrangements, and distributed to annotators, who frame the commodities in the pictures and mark their categories. Commodity pictures with such category annotation data can then serve as training sample pictures for training the visual recognition target detection model.
After such a target detection model is trained, it can be used to perform commodity recognition on the aforementioned first image area and directly output the category of the commodity contained in that area.
In general, a target detection model for visual recognition needs a large number of sample pictures for training, and using the model for commodity recognition involves more complex computation and consumes more computing resources. Therefore, in the process of Fig. 5, commodity code detection, which is accurate and computationally cheap, is preferred for identifying the commodity; visual recognition is started only when no commodity code is detected or its code cannot be recognized, ensuring that the commodity can ultimately be recognized.
Once the category of the commodity corresponding to the first image area is determined, in step 506 its pricing result is determined according to that category. Specifically, the commodity's price can be determined from the category-price association data recorded in advance by the seller. The final pricing result can include information such as the commodity name corresponding to the category and the commodity price.
Steps 503 to 506 above describe commodity recognition and pricing for any first image area obtained by segmenting the first image. It can be understood that this process can be performed for each segmented image area, recognizing and then pricing the commodity in each area; thus the pricing result of every commodity contained in the first image can be obtained.
As mentioned above, to avoid occlusion between commodities, multiple cameras can be arranged in the checkout counter to photograph the commodities from multiple angles, obtaining multiple images. The first image above can be any one of the multiple images produced by multi-angle shooting. With multiple images, they can also be combined to obtain the overall pricing result for the commodities on the counter.
在一个实施例中,对于多个摄像头拍摄得到的多个图像中的各个图像,分别执行图5所示的流程,从而得到各个图像对应的商品计价结果。然后,根据多个摄像头之间的相对位置关系,确定各个图像中对应于同一商品的图像区域,从该图像对应的计价结果中去除该同一商品的计价,从而避免对同一商品重复多次计价。这个过程又称为“去重”。
具体的,假定多个图像除了包含前述的第一图像外,还包括另一图像,称为第二图像。该第二图像由第二摄像头对台面上的商品进行拍摄而得到。与图5类似的,可以对第二图像进行分割,得到与其中包含的商品件数相对应的各个图像区域。可以理解,第二图像中包含的商品件数有可能与第一图像不同。例如,如果台面上摆放了3件商品,其中一件沿着某个方向遮挡住了另一件商品,那么当第一摄像头沿着上述方向拍摄时,第一图像中仅包含2件商品。而第二摄像头从不同角度拍摄时,得到的第二图像可以包含3件商品。
For simplicity of description, assume that the image regions obtained by segmenting the second image include a second image region. For this second image region, similarly to steps 503 to 506 above, the commodity category corresponding to the region is determined through code recognition or visual recognition.
Next, based on the relative positions of the first camera and the second camera, it is determined whether the first image region and the second image region correspond to the same commodity. It can be understood that after the cameras are installed, their positions can be calibrated to obtain calibration information. Such calibration information can indicate the relative positional relation between two cameras, for example the first and second cameras, as well as the overlap between the pictures they capture.
Fig. 8 shows camera calibration in one example. In the example of Fig. 8, assume that cameras C1 and C2 are both fixed to the lighting panel, facing the checkout counter surface. Once the positions (including orientations) of C1 and C2 are fixed, the overlap region of their fields of view (FOV) on the plane of the counter surface can be determined. Corresponding to the FOV overlap, picture P1 captured by C1 and picture P2 captured by C2 have a corresponding overlap range, as shown by the shaded area. During calibration, markers can be placed on the counter surface, the cameras, with their positions fixed, then photograph the surface, and by comparing the positions of the markers in the pictures captured by each camera, the overlap relation between the pictures is determined.
For the aforementioned first and second cameras, the calibration information obtained in this way can be used to determine whether the first image region and the second image region fall into the overlap region between the pictures, and thereby to judge whether the two regions correspond to the same commodity.
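Deciding whether two regions fall into the calibrated overlap can be sketched as rectangle containment tests (a sketch only; the overlap rectangles stand in for the calibration output, and all coordinates are illustrative):

```python
def inside(rect, box):
    """True if bounding box `box` lies inside rectangle `rect`, both (x1, y1, x2, y2)."""
    rx1, ry1, rx2, ry2 = rect
    bx1, by1, bx2, by2 = box
    return rx1 <= bx1 and ry1 <= by1 and bx2 <= rx2 and by2 <= ry2

def same_commodity(overlap_in_p1, overlap_in_p2, region1, region2):
    """Judge whether region1 (in picture P1) and region2 (in picture P2)
    both fall into the calibrated overlap area, i.e. may be the same commodity."""
    return inside(overlap_in_p1, region1) and inside(overlap_in_p2, region2)

# Hypothetical calibration: the overlap occupies the right half of P1
# and the left half of P2 (both pictures 640x480)
dup = same_commodity((320, 0, 640, 480), (0, 0, 320, 480),
                     region1=(400, 100, 500, 200),
                     region2=(60, 110, 160, 210))  # → True
```

A production system would additionally compare region positions mapped into shared counter coordinates; the containment test above is the coarse first filter.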
If the first image region and the second image region correspond to the same commodity, then the commodity pricing result need include the pricing of only one of the two regions; in other words, the pricing result of the commodity corresponding to one of the two image regions is excluded from the commodity pricing result.
In this way, when combining the multiple images captured by the multiple cameras into a commodity pricing result, the known relative positions of the cameras are used to exclude duplicate commodities and avoid pricing the same commodity more than once.
The above is a "vertical" process that performs commodity recognition on each of the multiple images separately and then combines the processing results. According to another implementation, the image regions of the multiple images can instead be combined "horizontally" first, and commodity recognition and pricing performed afterwards.
Fig. 9 shows a flowchart of commodity recognition and pricing for multiple images according to one embodiment. Similarly to Fig. 5, this method flow can be executed by the computing device in the checkout counter, or by a server. As shown in Fig. 9, the method flow includes the following steps.
At step 901, multiple images are acquired, captured by multiple cameras each photographing the commodities placed on the checkout counter. As mentioned above, the multiple cameras can photograph the commodities from different angles and positions to produce the multiple images.
Then, at step 902, image segmentation is performed on each of the multiple images to obtain the image regions of each image. For the segmentation method, refer to the description of step 502 above, which is not repeated here.
Next, at step 903, based on the relative positions of the multiple cameras, multiple image regions corresponding to the same commodity are determined from the image regions of the individual images.
As mentioned above, when the positional relations of the multiple cameras are known, their calibration information reveals the overlap relations between the pictures captured by the cameras. In this way, the image regions, among all the regions of the multiple images, that correspond to the same commodity can be determined.
For example, in one case six cameras photograph four commodities on the counter surface, producing six images. Through image segmentation, each of the six images is divided into several regions. Because commodities may occlude each other at certain angles, the number of commodities captured by each camera may differ, and so may the number of regions produced by segmentation. For example, five of the six images may each be segmented into four image regions, while the remaining image is segmented into three. Then, based on the positional relations of the six cameras, the image regions corresponding to the same commodity can be obtained from the regions segmented out of the six images. A commodity that is not occluded has a corresponding region in all six images, yielding six image regions for that commodity; a commodity occluded in one image yields five corresponding regions.
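Grouping regions from different images by commodity can be sketched as clustering region centers after calibration has mapped them into shared counter-plane coordinates (the mapping itself is assumed already done; the greedy clustering and distance threshold below are illustrative, not the patent's method):

```python
def group_by_commodity(points, max_dist=30.0):
    """Greedily cluster (camera_id, x, y) region centers whose counter-plane
    positions lie within `max_dist` of a cluster's first member."""
    clusters = []
    for cam, x, y in points:
        for cluster in clusters:
            _, cx, cy = cluster[0]
            if ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 <= max_dist:
                cluster.append((cam, x, y))
                break
        else:
            clusters.append([(cam, x, y)])  # start a new commodity group
    return clusters

# Two commodities seen by three cameras; commodity B is occluded for camera 2
points = [(0, 100, 100), (1, 104, 98), (2, 99, 103),   # commodity A
          (0, 300, 220), (1, 305, 224)]                # commodity B
groups = group_by_commodity(points)  # → 2 groups, of sizes 3 and 2
```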
In this way, multiple image regions corresponding to the same commodity are determined from the image regions of the individual images. Then, at step 904, commodity code area detection is performed in these image regions. For the specific execution of code area detection, refer to the description of step 503 above, which is not repeated here.
If a code area is detected in any of the image regions corresponding to the same commodity, then at step 905 the code in the code area is recognized, and the category of that commodity is determined from the recognized code. For the specific execution of code recognition, refer to the description of step 504 above.
If no code area is detected in any of the image regions, or none of the codes in them can be recognized, then at step 906 the category of the commodity is identified through visual recognition based on at least one of the image regions. The visual recognition process is as described for step 505 above.
Thus, the category of the commodity is determined through the code recognition of step 905 or the visual recognition of step 906. Next, at step 907, the pricing result of the commodity is determined according to its category.
In the method flow of Fig. 9, the image regions corresponding to the same commodity across the multiple images are first combined horizontally, and commodity recognition and pricing are performed on them jointly. In this way, there is no need to "deduplicate" commodities after each image has been processed, and code recognition is performed as long as any of the regions contains a code area, minimizing the proportion of cases in which visual recognition must be started and improving overall recognition efficiency.
In summary, the combined approach of performing code recognition first and visual recognition afterwards allows users to place multiple commodities casually on the checkout counter for pricing, greatly improving the user experience.
The technical solutions disclosed in this specification are applicable to supermarkets and convenience stores. For example, a self-service checkout counter can be set up in an express lane, allowing users with few purchases to complete pricing by self-service checkout. This improves user convenience and greatly shortens checkout time.
Those skilled in the art should appreciate that, in the one or more examples above, the functions described in the embodiments of this specification may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. According to an embodiment of another aspect, a computer-readable storage medium is also provided, on which a computer program is stored; when the computer program is executed in a computer, it causes the computer to perform the methods described in connection with Fig. 5 and Fig. 9.
Figs. 10-11 illustrate some possible schemes in which the functions described in the embodiments of this specification are implemented in hardware, firmware, a combination thereof, or in combination with software.
Fig. 10 is a schematic block diagram of a commodity checkout apparatus according to one embodiment of this specification. The apparatus can be deployed in the computing device of the self-service checkout counter shown in Fig. 3, or in the server shown in Fig. 4. As shown in Fig. 10, from the perspective of functional modules, the apparatus 100 includes:
an image acquisition unit 101, configured to acquire a first image, the first image being captured by a first camera photographing at least one commodity placed on a checkout counter;
an image segmentation unit 102, configured to perform image segmentation on the first image to obtain at least one image region, including a first image region;
a code area detection unit 103, configured to perform code area detection in the first image region;
a code recognition unit 104, configured to, when a code area is detected, recognize the code in the code area and determine, from the recognized code, the category of a first commodity contained in the first image region;
a visual recognition unit 105, configured to, when no code area is detected or the code cannot be recognized, identify the category of the first commodity through visual recognition based on the first image region;
a pricing unit 106, configured to determine the pricing result of the first commodity according to the category of the first commodity.
In one implementation, the image acquisition unit 101 is configured to control the first camera to photograph the at least one commodity to obtain the first image.
In another implementation, the image acquisition unit 101 is configured to receive the first image from the self-service checkout counter.
In different embodiments, the first image is captured from one of a top view, a front view, a left view, a right view, a rear view, and an oblique view of the at least one commodity, the oblique view being a shooting direction at an angle of 30 to 60 degrees from the vertical of the checkout counter.
In one implementation, the image segmentation unit 102 is configured to perform image segmentation on the first image using an image segmentation model, where the image segmentation model is trained in advance with segmentation sample pictures that contain commodity images and carry annotation data marking the commodity contours.
According to one implementation, the code area is a barcode region; accordingly, the code area detection unit 103 is configured to detect the barcode region in the first image region using a first object detection model, where the first object detection model is trained in advance with first training sample pictures that contain commodity images and carry annotation data boxing the barcode regions in the commodity images.
Further, in one embodiment, the code recognition unit 104 is configured to rectify the detected barcode region through a perspective transform to obtain a rectified barcode, and to perform code recognition on the rectified barcode.
According to one implementation, the code area is a QR code region; accordingly, the code area detection unit 103 is configured to detect finder patterns of a QR code in the first image region and to determine that a QR code region is detected when at least two finder patterns are detected.
Further, in one embodiment, the code recognition unit 104 is configured to:
apply perspective correction to the detected QR code region through a perspective transform to obtain a rectified QR code;
determine the corner-pattern geometry of the rectified QR code based on the at least two finder patterns;
extract coding features from the rectified QR code based on the corner-pattern geometry, thereby decoding the QR code.
In one implementation, the visual recognition unit 105 is configured to:
determine the category of the first commodity based on the first image region using a second object detection model, where the second object detection model is trained in advance with second training sample pictures that contain commodity images and carry annotation data boxing the commodities and labeling their categories.
In one implementation, the image acquisition unit 101 is further configured to acquire a second image, the second image being captured by a second camera photographing the at least one commodity;
the image segmentation unit 102 is further configured to perform image segmentation on the second image to obtain at least a second image region;
the code recognition unit 104 or the visual recognition unit 105 is further configured to determine the category of a second commodity corresponding to the second image region;
the apparatus further includes (not shown): a relation determination unit, configured to determine, based on the relative positions of the first camera and the second camera, that the first image region and the second image region correspond to the same commodity; and an exclusion unit, configured to exclude the pricing result of one of the first commodity and the second commodity from the commodity pricing result.
Fig. 11 is a schematic block diagram of a commodity checkout apparatus according to another embodiment of this specification. The apparatus can be deployed in the computing device of the self-service checkout counter shown in Fig. 3, or in the server shown in Fig. 4. As shown in Fig. 11, from the perspective of functional modules, the apparatus 110 includes:
an image acquisition unit 111, configured to acquire multiple images, the multiple images being captured by multiple cameras each photographing at least one commodity placed on a checkout counter;
an image segmentation unit 112, configured to perform image segmentation on each of the multiple images to obtain the image regions of each image;
a region relation determination unit 113, configured to determine, based on the relative positions of the multiple cameras, multiple image regions corresponding to the same commodity from the image regions of the individual images;
a code area detection unit 114, configured to perform code area detection in the multiple image regions;
a code recognition unit 115, configured to, when a code area is detected in any of the image regions, recognize the code in the code area and determine the category of the commodity from the recognized code;
a visual recognition unit 116, configured to, when no code area is detected in any of the multiple image regions or none of the codes are recognized, identify the category of the commodity through visual recognition based on at least one of the multiple image regions;
a pricing unit 117, configured to determine the pricing result of the commodity according to its category.
It should be understood that the apparatuses described here, deployed in the self-service checkout counter and the server, may in many respects make use of, or be combined with, the method embodiments described above.
The specific implementations described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific implementations of the present invention and are not intended to limit its scope of protection; any modification, equivalent replacement, improvement, or the like made on the basis of the technical solutions of the present invention shall be included within the scope of protection of the present invention.

Claims (27)

  1. A self-service commodity checkout method, the method comprising:
    acquiring a first image, the first image being captured by a first camera photographing at least one commodity placed on a checkout counter;
    performing image segmentation on the first image to obtain at least one image region, including a first image region;
    detecting a code area of a commodity code in the first image region;
    when a code area is detected, recognizing the code in the code area and determining, from the recognized code, the category of a first commodity contained in the first image region;
    when no code area is detected or the code cannot be recognized, identifying the category of the first commodity through visual recognition based on the first image region;
    determining a pricing result of the first commodity according to the category of the first commodity.
  2. The method according to claim 1, wherein acquiring the first image comprises controlling the first camera to photograph the at least one commodity to obtain the first image.
  3. The method according to claim 1, wherein acquiring the first image comprises receiving the first image from a self-service checkout counter.
  4. The method according to claim 1, wherein the first image is captured from one of a top view, a front view, a left view, a right view, a rear view, and an oblique view of the at least one commodity, the oblique view being a shooting direction at an angle of 30 to 60 degrees from the vertical of the checkout counter.
  5. The method according to claim 1, wherein performing image segmentation on the first image comprises performing image segmentation on the first image using an image segmentation model, wherein the image segmentation model is trained in advance with segmentation sample pictures, the segmentation sample pictures containing commodity images and carrying annotation data marking commodity contours.
  6. The method according to claim 1, wherein the commodity code is a barcode;
    detecting the code area of the commodity code in the first image region comprises:
    detecting a barcode region in the first image region using a first object detection model, wherein the first object detection model is trained in advance with first training sample pictures, the first training sample pictures containing commodity images and carrying annotation data boxing the barcode regions in the commodity images.
  7. The method according to claim 6, wherein recognizing the code in the code area comprises:
    rectifying the detected barcode region through a perspective transform to obtain a rectified barcode;
    performing code recognition on the rectified barcode.
  8. The method according to claim 1, wherein the commodity code is a QR code;
    detecting the code area of the commodity code in the first image region comprises:
    detecting finder patterns of the QR code in the first image region, and determining that a QR code region is detected when at least two finder patterns are detected.
  9. The method according to claim 8, wherein recognizing the code in the code area comprises:
    applying perspective correction to the detected QR code region through a perspective transform to obtain a rectified QR code;
    determining the corner-pattern geometry of the rectified QR code based on the at least two finder patterns;
    extracting coding features from the rectified QR code based on the corner-pattern geometry, thereby decoding the QR code.
  10. The method according to claim 1, wherein identifying the category of the first commodity through visual recognition based on the first image region comprises:
    determining the category of the first commodity based on the first image region using a second object detection model, wherein the second object detection model is trained in advance with second training sample pictures, the second training sample pictures containing commodity images and carrying annotation data boxing the commodities and labeling their categories.
  11. The method according to claim 1, further comprising:
    acquiring a second image, the second image being captured by a second camera photographing the at least one commodity;
    performing image segmentation on the second image to obtain at least a second image region;
    determining, through code recognition or visual recognition, the category of a second commodity corresponding to the second image region;
    determining, based on the relative positions of the first camera and the second camera, that the first image region and the second image region correspond to the same commodity;
    excluding the pricing result of one of the first commodity and the second commodity from the commodity pricing result.
  12. A self-service commodity checkout method, the method comprising:
    acquiring multiple images, the multiple images being captured by multiple cameras each photographing at least one commodity placed on a checkout counter;
    performing image segmentation on each of the multiple images to obtain the image regions of each image;
    determining, based on the relative positions of the multiple cameras, multiple image regions corresponding to the same commodity from the image regions of the individual images;
    detecting a code area of a commodity code in the multiple image regions;
    when a code area is detected in any of the image regions, recognizing the code in the code area and determining the category of the commodity from the recognized code;
    when no code area is detected in any of the multiple image regions or none of the codes can be recognized, identifying the category of the commodity through visual recognition based on at least one of the multiple image regions;
    determining the pricing result of the commodity according to the category of the commodity.
  13. A self-service commodity checkout apparatus, the apparatus comprising:
    an image acquisition unit, configured to acquire a first image, the first image being captured by a first camera photographing at least one commodity placed on a checkout counter;
    an image segmentation unit, configured to perform image segmentation on the first image to obtain at least one image region, including a first image region;
    a code area detection unit, configured to detect a code area of a commodity code in the first image region;
    a code recognition unit, configured to, when a code area is detected, recognize the code in the code area and determine, from the recognized code, the category of a first commodity contained in the first image region;
    a visual recognition unit, configured to, when no code area is detected or the code cannot be recognized, identify the category of the first commodity through visual recognition based on the first image region;
    a pricing unit, configured to determine the pricing result of the first commodity according to the category of the first commodity.
  14. The apparatus according to claim 13, wherein the image acquisition unit is configured to control the first camera to photograph the at least one commodity to obtain the first image.
  15. The apparatus according to claim 13, wherein the image acquisition unit is configured to receive the first image from a self-service checkout counter.
  16. The apparatus according to claim 13, wherein the first image is captured from one of a top view, a front view, a left view, a right view, a rear view, and an oblique view of the at least one commodity, the oblique view being a shooting direction at an angle of 30 to 60 degrees from the vertical of the checkout counter.
  17. The apparatus according to claim 13, wherein the image segmentation unit is configured to perform image segmentation on the first image using an image segmentation model, wherein the image segmentation model is trained in advance with segmentation sample pictures, the segmentation sample pictures containing commodity images and carrying annotation data marking commodity contours.
  18. The apparatus according to claim 13, wherein the code area is a barcode region;
    the code area detection unit is configured to:
    detect the barcode region in the first image region using a first object detection model, wherein the first object detection model is trained in advance with first training sample pictures, the first training sample pictures containing commodity images and carrying annotation data boxing the barcode regions in the commodity images.
  19. The apparatus according to claim 18, wherein the code recognition unit is configured to:
    rectify the detected barcode region through a perspective transform to obtain a rectified barcode;
    perform code recognition on the rectified barcode.
  20. The apparatus according to claim 13, wherein the code area is a QR code region;
    the code area detection unit is configured to:
    detect finder patterns of a QR code in the first image region, and determine that a QR code region is detected when at least two finder patterns are detected.
  21. The apparatus according to claim 20, wherein the code recognition unit is configured to:
    apply perspective correction to the detected QR code region through a perspective transform to obtain a rectified QR code;
    determine the corner-pattern geometry of the rectified QR code based on the at least two finder patterns;
    extract coding features from the rectified QR code based on the corner-pattern geometry, thereby decoding the QR code.
  22. The apparatus according to claim 13, wherein the visual recognition unit is configured to:
    determine the category of the first commodity based on the first image region using a second object detection model, wherein the second object detection model is trained in advance with second training sample pictures, the second training sample pictures containing commodity images and carrying annotation data boxing the commodities and labeling their categories.
  23. The apparatus according to claim 13, wherein:
    the image acquisition unit is further configured to acquire a second image, the second image being captured by a second camera photographing the at least one commodity;
    the image segmentation unit is further configured to perform image segmentation on the second image to obtain at least a second image region;
    the code recognition unit or the visual recognition unit is further configured to determine the category of a second commodity corresponding to the second image region;
    the apparatus further comprises:
    a relation determination unit, configured to determine, based on the relative positions of the first camera and the second camera, that the first image region and the second image region correspond to the same commodity;
    an exclusion unit, configured to exclude the pricing result of one of the first commodity and the second commodity from the commodity pricing result.
  24. A self-service commodity checkout apparatus, the apparatus comprising:
    an image acquisition unit, configured to acquire multiple images, the multiple images being captured by multiple cameras each photographing at least one commodity placed on a checkout counter;
    an image segmentation unit, configured to perform image segmentation on each of the multiple images to obtain the image regions of each image;
    a region relation determination unit, configured to determine, based on the relative positions of the multiple cameras, multiple image regions corresponding to the same commodity from the image regions of the individual images;
    a code area detection unit, configured to detect a code area of a commodity code in the multiple image regions;
    a code recognition unit, configured to, when a code area is detected in any of the image regions, recognize the code in the code area and determine the category of the commodity from the recognized code;
    a visual recognition unit, configured to, when no code area is detected in any of the multiple image regions or none of the codes are recognized, identify the category of the commodity through visual recognition based on at least one of the multiple image regions;
    a pricing unit, configured to determine the pricing result of the commodity according to the category of the commodity.
  25. A computer-readable storage medium on which a computer program is stored, the computer program, when executed in a computer, causing the computer to perform the method of any one of claims 1-12.
  26. A self-service checkout counter, comprising: a storage device and a processor, the processor communicably coupled to the storage device, the storage device storing an application program, the processor being operable to execute the application program to implement the method of any one of claims 1-12.
  27. A server, comprising: a storage device, a network interface, and a processor, the processor communicably coupled to the storage device and the network interface, the storage device storing a server program, the processor being operable to execute the server program to implement the method of any one of claims 1-12.
PCT/CN2020/072059 2019-04-16 2020-01-14 Self-service commodity checkout method and device WO2020211499A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/810,670 US11113680B2 (en) 2019-04-16 2020-03-05 Self-service checkout counter checkout

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910305807.9A CN110264645A (zh) 2019-04-16 2019-04-16 一种商品的自助收银方法和设备
CN201910305807.9 2019-04-16

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/810,670 Continuation US11113680B2 (en) 2019-04-16 2020-03-05 Self-service checkout counter checkout

Publications (1)

Publication Number Publication Date
WO2020211499A1 true WO2020211499A1 (zh) 2020-10-22

Family

ID=67913615

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/072059 WO2020211499A1 (zh) Self-service commodity checkout method and device

Country Status (2)

Country Link
CN (1) CN110264645A (zh)
WO (1) WO2020211499A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210201258A1 (en) * 2019-12-27 2021-07-01 United Parcel Service Of America, Inc. System and method for delivery confirmation using a local device for optical scans
CN115880676A (zh) 2023-03-31 Nantong University Deep-learning-based commodity recognition method for self-service vending machines
US11948044B2 (en) 2022-12-19 2024-04-02 Maplebear Inc. Subregion transformation for label decoding by an automated checkout system

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11182759B2 (en) 2019-04-15 2021-11-23 Advanced New Technologies Co., Ltd. Self-service checkout counter
US11113680B2 (en) 2019-04-16 2021-09-07 Advanced New Technologies Co., Ltd. Self-service checkout counter checkout
CN110264645A (zh) * 2019-04-16 2019-09-20 阿里巴巴集团控股有限公司 一种商品的自助收银方法和设备
CN110992140A (zh) * 2019-11-28 2020-04-10 浙江由由科技有限公司 一种用于识别模型的匹配方法和系统
CN112308175A (zh) * 2020-02-26 2021-02-02 北京字节跳动网络技术有限公司 用于识别物品的方法和装置
CN111583557A (zh) * 2020-05-11 2020-08-25 湖北汽车工业学院 一种基于rfid的商场智能结算装置、系统及其方法
CN112016339B (zh) * 2020-08-18 2023-12-29 中移(杭州)信息技术有限公司 二维码识别及缺损修复方法、装置、电子设备及存储介质
CN112686220B (zh) * 2021-03-10 2021-06-22 浙江口碑网络技术有限公司 商品识别方法及装置、计算设备、计算机存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202904599U (zh) 2012-05-09 2013-04-24 深圳普诺玛商业安全设备有限公司 Automatic identification device for checkout determination
US20140177912A1 (en) 2012-10-31 2014-06-26 Toshiba Tec Kabushiki Kaisha Commodity reading apparatus, commodity sales data processing apparatus and commodity reading method
US20140263603A1 (en) 2013-03-14 2014-09-18 Wal-Mart Stores, Inc. Method and Apparatus Pertaining to Use of Both Optical and Electronic Product Codes
CN106529365A (zh) 2016-12-05 2017-03-22 广东工业大学 Automatic pricing machine
KR20180077910A (ko) 2016-12-29 2018-07-09 주식회사 성우하이텍 Bumper beam unit for a vehicle
CN110264645A (zh) 2019-04-16 2019-09-20 阿里巴巴集团控股有限公司 Self-service commodity checkout method and device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5372191B2 (ja) 2012-01-30 2013-12-18 東芝テック株式会社 Commodity reading apparatus and commodity reading program
CN102799850B (zh) 2012-06-30 2016-03-30 北京百度网讯科技有限公司 Barcode recognition method and device
CN106326802B (zh) 2016-08-19 2018-07-27 腾讯科技(深圳)有限公司 QR code correction method, device, and terminal equipment
CN107578582A (zh) 2017-08-31 2018-01-12 昆山中骏博研互联网科技有限公司 Batch commodity billing system and billing method
CN108062837A (zh) 2018-01-26 2018-05-22 浙江行雨网络科技有限公司 Image-recognition-based commodity settlement system for unattended supermarkets
CN109190439B (zh) 2018-09-21 2021-06-08 南京机灵侠软件技术有限公司 Optical splitter and method for recognizing the QR codes on its port-line labels
CN109389068A (zh) 2018-09-28 2019-02-26 百度在线网络技术(北京)有限公司 Method and device for recognizing driving behavior
CN109522967A (zh) 2018-11-28 2019-03-26 广州逗号智能零售有限公司 Commodity localization and recognition method, device, equipment, and storage medium



Also Published As

Publication number Publication date
CN110264645A (zh) 2019-09-20


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20792060

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20792060

Country of ref document: EP

Kind code of ref document: A1