WO2019099987A1 - Barcode validator and text extraction from images for data quality checks and remediation - Google Patents

Barcode validator and text extraction from images for data quality checks and remediation

Info

Publication number
WO2019099987A1
Authority
WO
WIPO (PCT)
Prior art keywords
product
listing
identifier
discrepancy
information
Prior art date
Application number
PCT/US2018/061781
Other languages
French (fr)
Inventor
Karthikeyan KARUNANITHI
VenkatRamanaRao RAPOLU
Azad Krishna TRIPATHI
Skumar BAGHEL
Sandeep George MOOLAYIL
JP De VILLIERS
Original Assignee
Walmart Apollo, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Walmart Apollo, Llc
Publication of WO2019099987A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5846 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9554 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL] by using bar codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/087 Inventory or stock management, e.g. order filling, procurement or balancing against orders

Definitions

  • This invention relates generally to data quality checks and remediation and, more specifically, to data quality checks and remediation for online product listings.
  • FIG. 1 depicts a web browser 102 presenting a product listing, according to some embodiments
  • FIG. 2 depicts a system for updating product listings, according to some embodiments.
  • FIG. 3 depicts example operations for updating a product listing, according to some embodiments.
  • a system for updating a product listing comprises a product database, wherein the product database includes product information for a plurality of products, and a control circuit communicatively coupled to the product database, the control circuit configured to retrieve, from a website, a product listing, extract, from the product listing, a product identifier, retrieve, from the product database based on the product identifier, product information for a product associated with the product identifier, determine, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically update the product listing to remove the discrepancy, and in the event the discre
  • product listings include a plurality of attributes for the product.
  • the attributes can include a title
  • product identifier e.g., a universal product code (UPC) or a global trade item number (GTIN)
  • UPC universal product code
  • GTIN global trade item number
  • the product listing may be inaccurate, incorrect, superfluous, or inconsistent with information for the product.
  • the customer may not receive the product that he or she expects, resulting in customer dissatisfaction and/or added costs to the retailer.
  • the risk of discrepancies is rising.
  • a system scans product listings on one or more websites. The system identifies the products in the listings and compares the product listings to information associated with the products. That is, the system compares the attributes in the product listings to the attributes for the products, for example, included in a product database. Additionally, in some embodiments, the system can update the product listing to remove the discrepancy (e.g., include correct attributes, remove incorrect attributes, etc.) and/or transmit a notification indicating that the discrepancy exists.
  • the discussion of FIG. 1 provides an overview of such a system.
  • FIG. 1 depicts a web browser 102 presenting a product listing, according to some embodiments.
  • the product listing is presented via a webpage of a retailer’s website.
  • the product listing includes several fields, each of which can include attributes for the product.
  • the webpage depicted in FIG. 1 includes three fields: a title field 104, a product description 106, and a product image 108.
  • the title field 104 includes a title for the product, such as a name and/or brand.
  • the product description 106 includes information about the product.
  • the product description 106 can include a summary or description of the product, as well as any other attributes for the product.
  • the product description can include a product identifier 110A (e.g., a UPC, barcode, GTIN, or any other text or image that identifies the product).
  • the product image 108 includes one or more representations of the product (e.g., pictures, drawings, renderings, schematics, etc.).
  • the product image 108 may also include a product identifier 110B.
  • a UPC for the product may be visible in a picture of the product. It should be noted that while some product listings may include the product identifier 110A in the product description 106 and the product identifier 110B in the product image 108, such is not always the case.
  • some product listings may only include the product identifier 110B in the product image 108.
  • a product identifier may not be included in either the product description 106 or the product image 108. Instead other fields of the product listing, or possibly metadata or a uniform resource locator (URL), may include a product identifier.
  • URL uniform resource locator
  • a system scans product listings.
  • the system can scan the product listing depicted in FIG. 1.
  • the system scans the product listing to determine attributes included in the product listing.
  • the system can scan the product listing using optical character recognition (OCR), stroke width transform (SWT), image recognition, etc.
  • OCR optical character recognition
  • SWT stroke width transform
  • the system determines the product identifier 110A/110B from the product description 106 and/or the product image 108. If a product identifier is not included in the product description 106 or the product image 108, the system can determine the product identifier elsewhere (e.g., from a different field, metadata, a URL, etc.).
  • the system compares the product listing with information for a product associated with the product identifier 110A/110B.
  • the system can reference a product database to determine information about the product associated with the product identifier.
  • the information for the product associated with the product identifier can include the attributes for the product.
  • the product database is associated with, or provided by, a third party (e.g., a manufacturer, supplier, third party aggregator, etc.). Additionally, or alternatively, the product database can be provided by the retailer (e.g., the retailer can compile product information, and/or create and supply product information, for the product database).
  • the system compares the product listing with the information for the product associated with the product identifier 110A/110B to determine if any discrepancies exist between the product listing and the information for the product associated with the product identifier 110A/110B. If the system determines that a discrepancy exists between the product listing and the information for the product associated with the product identifier 110A/110B, the system can automatically update the product listing and/or transmit an indication of the discrepancy (e.g., to an employee of the retailer). In some embodiments, the action that the system performs can be based on the type of discrepancy found between the product listing and the information for the product associated with the product identifier.
  • the action the system performs can be dependent upon whether the attribute for which the discrepancy exists was provided externally (e.g., by a manufacturer, supplier, third party aggregator, etc.) or derived internally (e.g., based on the retailer’s examination of the product).
  • the system can automatically update the product listing to remove the discrepancy. That is, the system can update the product listing based on the information for the product associated with the product identifier in the product database.
  • the system can flag the product listing or otherwise transmit an indication of the discrepancy.
  • FIG. 1 provides an overview of a system for updating product listings
  • FIG. 2 provides additional detail regarding such a system.
  • FIG. 2 depicts a system for updating product listings, according to some embodiments.
  • the system includes a product database 202, a control circuit 204, a network 206 (e.g., the Internet), a server 208, and user device(s) 210.
  • the control circuit 204 can comprise a fixed-purpose hard-wired hardware platform (including but not limited to an application-specific integrated circuit (ASIC) (which is an integrated circuit that is customized by design for a particular use, rather than intended for general-purpose use), a field-programmable gate array (FPGA), and the like) or can comprise a partially or wholly-programmable hardware platform (including but not limited to microcontrollers, microprocessors, and the like).
  • ASIC application-specific integrated circuit
  • FPGA field-programmable gate array
  • control circuit 204 is configured (for example, by using corresponding programming as will be well understood by those skilled in the art) to carry out one or more of the steps, actions, and/or functions described herein.
  • control circuit 204 operably couples to a memory.
  • the memory may be integral to the control circuit 204 or can be physically discrete (in whole or in part) from the control circuit 204 as desired.
  • This memory can also be local with respect to the control circuit 204 (where, for example, both share a common circuit board, chassis, power supply, and/or housing) or can be partially or wholly remote with respect to the control circuit 204 (where, for example, the memory is physically located in another facility, metropolitan area, or even country as compared to the control circuit 204).
  • This memory can serve, for example, to non-transitorily store the computer instructions that, when executed by the control circuit 204, cause the control circuit 204 to behave as described herein.
  • this reference to “non-transitorily” will be understood to refer to a non-ephemeral state for the stored contents (and hence excludes when the stored contents merely constitute signals or waves) rather than volatility of the storage media itself and hence includes both non-volatile memory (such as read-only memory (ROM)) as well as volatile memory (such as an erasable programmable read-only memory (EPROM)).
  • ROM read-only memory
  • EPROM erasable programmable read-only memory
  • the server 208 hosts websites, such as online stores for retailers. Although FIG. 2 depicts only one server 208, in some embodiments the system may include multiple servers 208.
  • the user device(s) 210 retrieve the websites from the server 208 via the network 206.
  • the control circuit 204 retrieves the websites from the server 208 via the network 206.
  • the control circuit 204 scans product listings on the websites to determine if discrepancies exist in the product listings. Once the control circuit 204 has reviewed the product listing and identified the product presented in the product listing, the control circuit 204 accesses the product database to retrieve information about the product. In some embodiments, the control circuit 204 is coupled directly to the product database 202.
  • control circuit 204 and the product database 202 may both be controlled by a retailer and housed in the same location.
  • the control circuit 204 may access the product database 202 directly or via an intranet.
  • the control circuit 204 accesses the product database 202 via the network 206.
  • the control circuit 204 may access the product database 202 via the network 206.
  • the product database 202 includes information about products.
  • the product database 202 can be specific to a brand, a retailer, a manufacturer, etc.
  • the product database 202 may contain information about all products produced by Company X, or all products offered for sale by Retailer Y.
  • the information in the product database 202 can be populated by any source. For example, if the product database 202 contains information about all products produced by Company X, Company X may provide, and update, the information contained in the product database 202.
  • Retailer Y may manage the product database 202. In such a case, the information in the product database 202 may come from multiple sources.
  • manufacturers and third parties may provide some of the information, and some of the information may be derived internally (e.g., measurements, observations, etc. performed by Retailer Y).
  • a third party may provide the product database 202 and charge customers, such as retailers, a fee for access to the product database 202.
  • the product database 202 can be controlled, maintained, and/or populated by one or more parties dependent upon the embodiment.
  • the product database can include information provided by one or more of suppliers, retailers, manufacturers, service providers, subscription services, and internal determinations.
  • FIG. 2 provides additional detail regarding a system for updating product listings
  • FIG. 3 describes example operations for updating a product listing.
  • FIG. 3 depicts example operations for updating a product listing, according to some embodiments.
  • the flow begins at block 302
  • at block 302, a product listing is retrieved.
  • a control circuit can retrieve the product listing.
  • the control circuit can retrieve the product listing from a website, such as an online store. Additionally, or alternatively, the control circuit can retrieve product listings that aren’t associated with online stores. For example, the control circuit can retrieve product listings from product aggregation websites (e.g., websites that compile information on products but do not necessarily sell the products for which they compile the information).
  • product aggregation websites e.g., websites that compile information on products but do not necessarily sell the products for which they compile the information.
  • the control circuit can analyze the product listing. For example, the control circuit can use text recognition and image recognition to determine attributes listed in the product listing. The flow continues at block 304.
  • a product identifier is extracted.
  • the control circuit can extract the product identifier.
  • the control circuit extracts the product identifier from the product listing.
  • the control circuit can extract the product identifier from one of the fields in the product listing using text and/or image recognition (e.g., OCR, SWT, etc.).
  • the control circuit can extract the product identifier from data associated with the product listing, such as a URL or metadata.
  • the product identifier identifies the product.
  • the product identifier can be a UPC or a GTIN.
  • the product identifier can identify a manufacturer of the product, a brand of the product, a model number of the product, a type of the product, and/or a supplier of the product. The flow continues at block 306.
  • the control circuit can determine if a discrepancy exists between the product listing and information for a product associated with the product identifier (i.e., one of the attributes is incorrect and/or null). For example, the control circuit can access a product database to retrieve the information for a product associated with the product identifier. The control circuit compares the product listing to the information for the product associated with the product identifier. In some embodiments, a discrepancy exists if the product listing includes attributes that differ from the information contained in the product database. For example, the product listing may include an incorrect image, dimension, detail, description, title, brand, availability, expected shipment data, expected delivery date, price, quantity, size, etc. The flow continues at decision diamond 308.
  • the control circuit can determine if the attribute to which the discrepancy is related is provided externally.
  • attributes are provided externally if a party other than the retailer provided the attribute. For example, if a manufacturer, supplier, third party aggregator, etc. provided the attribute, the attribute is provided externally. Attributes are not provided externally if they are derived internally. That is, attributes are derived internally if the retailer creates the attribute by, for example, measuring, observing, testing, photographing, etc. It should be noted that in some embodiments, certain attributes for a single product may be provided externally and other attributes for the product may be derived internally. In the event that the attribute to which the discrepancy is related is provided externally, the flow continues at block 310.
  • the product listing is updated automatically.
  • the control circuit can automatically update the product listing to remove the discrepancy.
  • the control circuit can add, replace, and/or remove one or more attributes from the product listing to remove the discrepancy.
  • the control circuit can remove the incorrect image and update the product listing to include the correct image (e.g., retrieved from the product database).
  • the control circuit can determine if the attribute to which the discrepancy is related is provided externally. If the attribute to which the discrepancy is related is not provided externally (i.e., the attribute to which the discrepancy is related is derived internally), the flow continues at block 312.
  • an indication of the discrepancy is transmitted.
  • the control circuit can cause an indication of the discrepancy to be transmitted.
  • the indication of the discrepancy can be transmitted to an employee of the retailer, a discrepancy aggregation site for addition to a posting of discrepancies, etc.
  • the indication of the discrepancy identifies the attribute to which the discrepancy is related.
  • the indication of the discrepancy can include the product identifier, information about the product, a suggested change to the product listing, a link to the product listing, information retrieved from the product database, a reproduction of the product listing, and/or any other suitable information.
  • a system for updating a product listing comprises a product database, wherein the product database includes product information for a plurality of products, and a control circuit communicatively coupled to the product database, the control circuit configured to retrieve, from a website, a product listing, extract, from the product listing, a product identifier, retrieve, from the product database based on the product identifier, product information for a product associated with the product identifier, determine, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically update the product listing to remove the discrepancy, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute derived internally, transmit
  • an apparatus and a corresponding method performed by the apparatus comprises retrieving, from a website, a product listing, extracting, from the product listing, a product identifier, retrieving, from a product database based on the product identifier, product information for a product associated with the product identifier, wherein the product database includes product information for a plurality of products, determining, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier, and, in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically updating the product listing to remove the discrepancy, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute derived internally, transmit an indication of the discrepancy.
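The FIG. 3 flow described in the list above (compare at block 306, branch at decision diamond 308, update at block 310, notify at block 312) can be sketched roughly as follows. This is only an illustrative sketch, not the claimed implementation: the names `Attribute`, `provided_externally`, `process_listing`, and `notify` are hypothetical, the listing and database are modeled as plain dictionaries, and a simple string equality stands in for whatever comparison an embodiment would actually use.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Attribute:
    """One attribute stored in the (hypothetical) product database."""
    value: str
    # True: supplied by a manufacturer, supplier, aggregator, etc.
    # False: derived internally by the retailer (measured, observed, ...).
    provided_externally: bool


ProductRecord = Dict[str, Attribute]


def process_listing(
    listing: Dict[str, str],
    identifier: str,
    database: Dict[str, ProductRecord],
    notify: Callable[[str], None],
) -> Dict[str, str]:
    """Return a copy of the listing with externally provided attributes corrected.

    Discrepancies in internally derived attributes are not changed; an
    indication is passed to `notify` instead.
    """
    record = database.get(identifier)
    if record is None:
        return dict(listing)  # no product information to compare against

    updated = dict(listing)
    for name, attribute in record.items():
        # Block 306: does the listing disagree with the database?
        if updated.get(name) == attribute.value:
            continue
        # Decision diamond 308: where did the attribute come from?
        if attribute.provided_externally:
            updated[name] = attribute.value  # block 310: automatic update
        else:
            notify(f"{identifier}: discrepancy in '{name}'")  # block 312
    return updated
```

Keeping the provenance flag on each attribute, rather than on the product as a whole, reflects the point made above that a single product may mix externally provided and internally derived attributes.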

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Quality & Reliability (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

In some embodiments, apparatuses and methods are provided herein useful to updating a product listing. In some embodiments, a system for updating a product listing comprises a product database including product information, and a control circuit configured to retrieve, from a website, a product listing, extract, from the product listing, a product identifier, retrieve, from the product database, product information for a product associated with the product identifier, determine, based on the product listing and the product information, that a discrepancy exists between the product listing and the product information, and in the event the discrepancy between the product listing and the product information is related to an attribute provided externally, automatically update the product listing to remove the discrepancy, and in the event the discrepancy between the product listing and the product information is related to an attribute derived internally, transmit an indication of the discrepancy.

Description

BARCODE VALIDATOR AND TEXT EXTRACTION FROM IMAGES FOR DATA
QUALITY CHECKS AND REMEDIATION
Cross-Reference to Related Application
[0001] This application claims the benefit of U.S. Provisional Application Number 62/588,672, filed November 20, 2017, which is incorporated by reference in its entirety herein.
Technical Field
[0002] This invention relates generally to data quality checks and remediation and, more specifically, to data quality checks and remediation for online product listings.
Background
[0003] Online shopping has become increasingly popular in recent years and this trend appears to be continuing. In response to this demand, retailers have developed online stores. Customers can visit websites (i.e., the online stores) to browse and purchase products. While many customers find online shopping to be quite convenient, it poses challenges for retailers and customers that are sometimes different than those encountered in a traditional retail facility. Specifically, sometimes there are discrepancies between the product listing (i.e., a listing for the product on the website) and information for the product listed. For example, the product listing may be for a single shirt, but the product image shows a multipack of shirts. Consequently, when purchasing the product, the customer sees the image of the multipack of shirts and expects to receive the multipack of shirts. This can be problematic in that the customer will likely be dissatisfied when the single shirt arrives. Consequently, a need exists for systems, methods, and apparatuses that can determine that discrepancies exist between product listings and product information.
Brief Description of the Drawings
[0004] Disclosed herein are embodiments of systems, apparatuses and methods pertaining to updating a product listing. This description includes drawings, wherein:
[0005] FIG. 1 depicts a web browser 102 presenting a product listing, according to some embodiments;
[0006] FIG. 2 depicts a system for updating product listings, according to some embodiments; and
[0007] FIG. 3 depicts example operations for updating a product listing, according to some embodiments.
[0008] Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions and/or relative positioning of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of various embodiments of the present invention. Also, common but well-understood elements that are useful or necessary in a commercially feasible embodiment are often not depicted in order to facilitate a less obstructed view of these various embodiments of the present invention. Certain actions and/or steps may be described or depicted in a particular order of occurrence while those skilled in the art will understand that such specificity with respect to sequence is not actually required. The terms and expressions used herein have the ordinary technical meaning as is accorded to such terms and expressions by persons skilled in the technical field as set forth above except where different specific meanings have otherwise been set forth herein.
Detailed Description
[0009] Generally speaking, pursuant to various embodiments, systems, apparatuses, and methods are provided herein useful to updating a product listing. In some embodiments, a system for updating a product listing comprises a product database, wherein the product database includes product information for a plurality of products, and a control circuit communicatively coupled to the product database, the control circuit configured to retrieve, from a website, a product listing, extract, from the product listing, a product identifier, retrieve, from the product database based on the product identifier, product information for a product associated with the product identifier, determine, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically update the product listing to remove the discrepancy, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute derived internally, transmit an indication of the discrepancy.
[0010] As previously discussed, online shopping is a convenient way for customers to find products and for retailers to market products. However, because shopping online does not typically allow the customer to view the actual product that he or she will purchase, the customer must rely on product listings to learn about the product. Typically, product listings include a plurality of attributes for the product. For example, the attributes can include a title, manufacturer, name, type, category, price, picture, description, product identifier (e.g., a universal product code (UPC) or a global trade item number (GTIN)), dimensions, weights, colors, availability, delivery estimate, etc. of/for the product. Unfortunately, there may be a discrepancy between the product listing and the information about the product (i.e., attributes provided for, or derived from, the product). That is, one or more of the attributes included in the product listing may be inaccurate, incorrect, superfluous, or inconsistent with information for the product. If there is a discrepancy in the product listing, the customer may not receive the product that he or she expects, resulting in customer dissatisfaction and/or added costs to the retailer. Additionally, as many retailers are now permitting third party sellers to create product listings on their websites, the risk of discrepancies is rising.
[0011] Described herein are systems, methods, and apparatuses that seek to minimize, or eliminate, discrepancies in product listings. In some embodiments, a system scans product listings on one or more websites. The system identifies the products in the listings and compares the product listings to information associated with the products. That is, the system compares the attributes in the product listings to the attributes for the products, for example, included in a product database. Additionally, in some embodiments, the system can update the product listing to remove the discrepancy (e.g., include correct attributes, remove incorrect attributes, etc.) and/or transmit a notification indicating that the discrepancy exists. The discussion of FIG. 1 provides an overview of such a system.
[0012] FIG. 1 depicts a web browser 102 presenting a product listing, according to some embodiments. The product listing is presented via a webpage of a retailer’s website. The product listing includes several fields, each of which can include attributes for the product. For example, the webpage depicted in FIG. 1 includes three fields: a title field 104, a product description 106, and a product image 108. The title field 104 includes a title for the product, such as a name and/or brand. The product description 106 includes information about the product. For example, the product description 106 can include a summary or description of the product, as well as any other attributes for the product. Additionally, in some embodiments, the product description can include a product identifier 110A (e.g., a UPC, barcode, GTIN, or any other text or image that identifies the product). The product image 108 includes one or more representations of the product (e.g., pictures, drawings, renderings, schematics, etc.). In some embodiments, the product image 108 may also include a product identifier 110B. For example, a UPC for the product may be visible in a picture of the product. It should be noted that while some product listings may include the product identifier 110A in the product description 106 and the product identifier 110B in the product image 108, such is not always the case. For example, some product listings may only include the product identifier 110B in the product image 108. Additionally, in some embodiments, a product identifier may not be included in either the product description 106 or the product image 108. Instead other fields of the product listing, or possibly metadata or a uniform resource locator (URL), may include a product identifier.
[0013] As previously discussed, in some embodiments, a system scans product listings. For example, the system can scan the product listing depicted in FIG. 1. The system scans the product listing to determine attributes included in the product listing. For example, the system can scan the product listing using optical character recognition (OCR), stroke width transform (SWT), image recognition, etc. In some embodiments, the system determines the product identifier 110A/110B from the product description 106 and/or the product image 108. If a product identifier is not included in the product description 106 or the product image 108, the system can determine the product identifier elsewhere (e.g., from a different field, metadata, a URL, etc.).
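The text does not say how a candidate identifier found by OCR is validated. One common approach, sketched here as an assumption rather than as the described system, is to keep only digit runs that pass the standard GS1 modulo-10 check digit used by UPC and GTIN codes. The function names are hypothetical, and the input text is assumed to have already been produced by an OCR step.

```python
import re


def gtin_check_digit_ok(code: str) -> bool:
    """Return True if `code` is a GTIN-8/12/13/14 with a valid GS1 check digit."""
    if not re.fullmatch(r"[0-9]+", code) or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Weights alternate 3, 1, 3, ... starting from the digit next to the check digit.
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def find_identifier_candidates(text: str) -> list:
    """Return digit runs in already-recognized text that pass the check-digit test."""
    runs = re.findall(r"(?<![0-9])[0-9]{8,14}(?![0-9])", text)
    return [run for run in runs if gtin_check_digit_ok(run)]
```

Filtering on the check digit discards most OCR misreads and unrelated numbers (lot codes, weights) before any database lookup is attempted.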
[0014] After determining the product identifier 110A/110B, the system compares the product listing with information for a product associated with the product identifier 110A/110B. For example, the system can reference a product database to determine information about the product associated with the product identifier. The information for the product associated with the product identifier can include the attributes for the product. In some embodiments, the product database is associated with, or provided by, a third party (e.g., a manufacturer, supplier, third party aggregator, etc.). Additionally, or alternatively, the product database can be provided by the retailer (e.g., the retailer can compile product information, and/or create and supply product information, for the product database).
[0015] The system compares the product listing with the information for the product associated with the product identifier 110A/110B to determine if any discrepancies exist between the product listing and the information for the product associated with the product identifier 110A/110B. If the system determines that a discrepancy exists between the product listing and the information for the product associated with the product identifier 110A/110B, the system can automatically update the product listing and/or transmit an indication of the discrepancy (e.g., to an employee of the retailer). In some embodiments, the action that the system performs can be based on the type of discrepancy found between the product listing and the information for the product associated with the product identifier. As one example, the action the system performs can be dependent upon whether the attribute for which the discrepancy exists was provided externally (e.g., by a manufacturer, supplier, third party aggregator, etc.) or derived internally (e.g., based on the retailer’s examination of the product). In some embodiments, if the attribute for which the discrepancy exists is provided externally, the system can automatically update the product listing to remove the discrepancy. That is, the system can update the product listing based on the information for the product associated with the product identifier in the product database. Additionally, in some embodiments, if the attribute for which the discrepancy exists was derived internally, the system can flag the product listing or otherwise transmit an indication of the discrepancy.
[0016] While the discussion of FIG. 1 provides an overview of a system for updating product listings, the discussion of FIG. 2 provides additional detail regarding such a system.
[0017] FIG. 2 depicts a system for updating product listings, according to some embodiments. The system includes a product database 202, a control circuit 204, a network 206 (e.g., the Internet), a server 208, and user device(s) 210. The control circuit 204 can comprise a fixed-purpose hard-wired hardware platform (including but not limited to an application-specific integrated circuit (ASIC) (which is an integrated circuit that is customized by design for a particular use, rather than intended for general-purpose use), a field-programmable gate array (FPGA), and the like) or can comprise a partially or wholly-programmable hardware platform (including but not limited to microcontrollers, microprocessors, and the like). These architectural options for such structures are well known and understood in the art and require no further description here. The control circuit 204 is configured (for example, by using corresponding programming as will be well understood by those skilled in the art) to carry out one or more of the steps, actions, and/or functions described herein.
[0018] By one optional approach the control circuit 204 operably couples to a memory. The memory may be integral to the control circuit 204 or can be physically discrete (in whole or in part) from the control circuit 204 as desired. This memory can also be local with respect to the control circuit 204 (where, for example, both share a common circuit board, chassis, power supply, and/or housing) or can be partially or wholly remote with respect to the control circuit 204 (where, for example, the memory is physically located in another facility, metropolitan area, or even country as compared to the control circuit 204).
[0019] This memory can serve, for example, to non-transitorily store the computer instructions that, when executed by the control circuit 204, cause the control circuit 204 to behave as described herein. As used herein, this reference to “non-transitorily” will be understood to refer to a non-ephemeral state for the stored contents (and hence excludes when the stored contents merely constitute signals or waves) rather than volatility of the storage media itself and hence includes both non-volatile memory (such as read-only memory (ROM) or an erasable programmable read-only memory (EPROM)) as well as volatile memory (such as random access memory (RAM)).
[0020] The server 208 hosts websites, such as online stores for retailers. Although FIG. 2 depicts only one server 208, in some embodiments the system may include multiple servers 208. The user device(s) 210 retrieve the websites from the server 208 via the network 206. Like the user device(s) 210, the control circuit 204 retrieves the websites from the server 208 via the network 206. The control circuit 204 scans product listings on the websites to determine if discrepancies exist in the product listings. Once the control circuit 204 has reviewed the product listing and identified the product presented in the product listing, the control circuit 204 accesses the product database to retrieve information about the product. In some embodiments, the control circuit 204 is coupled directly to the product database 202. For example, the control circuit 204 and the product database 202 may both be controlled by a retailer and housed in the same location. In such embodiments, the control circuit 204 may access the product database 202 directly or via an intranet. In other embodiments, the control circuit 204 accesses the product database 202 via the network 206. For example, in embodiments where the product database 202 is maintained by a third party, the control circuit 204 may access the product database 202 via the network 206.
[0021] The product database 202 includes information about products. The product database 202 can be specific to a brand, a retailer, a manufacturer, etc. For example, the product database 202 may contain information about all products produced by Company X, or all products offered for sale by Retailer Y. The information in the product database 202 can be populated by any source. For example, if the product database 202 contains information about all products produced by Company X, Company X may provide, and update, the information contained in the product database 202. As another example, if the product database 202 includes information for all products offered for sale by Retailer Y, Retailer Y may manage the product database 202. In such a case, the information in the product database 202 may come from multiple sources. For example, manufacturers and third parties may provide some of the information, and some of the information may be derived internally (e.g., measurements, observations, etc. performed by Retailer Y). As another example, a third party may provide the product database 202 and charge customers, such as retailers, a fee for access to the product database 202. In any event, the product database 202 can be controlled, maintained, and/or populated by one or more parties dependent upon the embodiment. The product database can include information provided by one or more of suppliers, retailers, manufacturers, service providers, subscription services, and internal determinations.
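Because a single product can mix externally provided and internally derived attributes, a product record needs to carry provenance per attribute rather than per product. The following is a minimal in-memory sketch of that idea; the class names, field names, and the `source` label are illustrative assumptions and do not come from the described system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Provenance(Enum):
    EXTERNAL = "external"  # supplied by a manufacturer, supplier, aggregator, etc.
    INTERNAL = "internal"  # derived by the retailer (measurements, observations)


@dataclass(frozen=True)
class AttributeValue:
    value: object
    provenance: Provenance
    source: str  # hypothetical label naming who supplied the value


@dataclass
class ProductRecord:
    product_id: str
    attributes: dict = field(default_factory=dict)  # attribute name -> AttributeValue


class ProductDatabase:
    """In-memory stand-in for a product database keyed by product identifier."""

    def __init__(self) -> None:
        self._records = {}

    def upsert(self, product_id, name, value, provenance, source):
        # Create the record on first use, then set or overwrite one attribute.
        record = self._records.setdefault(product_id, ProductRecord(product_id))
        record.attributes[name] = AttributeValue(value, provenance, source)

    def get(self, product_id):
        # Returns None when the identifier is unknown.
        return self._records.get(product_id)
```

Storing provenance next to each value lets later steps decide, attribute by attribute, whether a mismatch may be corrected automatically.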
[0022] While the discussion of FIG. 2 provides additional detail regarding a system for updating product listings, the discussion of FIG. 3 describes example operations for updating a product listing.
[0023] FIG. 3 depicts example operations for updating a product listing, according to some embodiments. The flow begins at block 302.
[0024] At block 302, a product listing is retrieved. For example, a control circuit can retrieve the product listing. The control circuit can retrieve the product listing from a website, such as an online store. Additionally, or alternatively, the control circuit can retrieve product listings that aren’t associated with online stores. For example, the control circuit can retrieve product listings from product aggregation websites (e.g., websites that compile information on products but do not necessarily sell the products for which they compile the information). When the control circuit retrieves the product listing, the control circuit can analyze the product listing. For example, the control circuit can use text recognition and image recognition to determine attributes listed in the product listing. The flow continues at block 304.
[0025] At block 304, a product identifier is extracted. For example, the control circuit can extract the product identifier. In some embodiments, the control circuit extracts the product identifier from the product listing. For example, the control circuit can extract the product identifier from one of the fields in the product listing using text and/or image recognition (e.g., OCR, SWT, etc.). Additionally, or alternatively, the control circuit can extract the product identifier from data associated with the product listing, such as a URL or metadata. The product identifier identifies the product. For example, the product identifier can be a UPC or a GTIN. The product identifier can identify a manufacturer of the product, a brand of the product, a model number of the product, a type of the product, and/or a supplier of the product. The flow continues at block 306.
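The extraction step can be sketched as an ordered search over the possible sources: listing fields first, then metadata, then the URL. In this illustration the listing is assumed to be a plain dictionary, the `image_text` field is assumed to hold text already recognized from the product image, and all key names are hypothetical.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# A standalone run of 12 to 14 ASCII digits (UPC-A, EAN-13, or GTIN-14 length).
_DIGITS = re.compile(r"(?<![0-9])[0-9]{12,14}(?![0-9])")


def extract_product_identifier(listing: dict) -> Optional[tuple]:
    """Return (identifier, source) from the first source that yields a match."""
    # 1. Text fields, including text already recognized from the product image.
    for name in ("description", "image_text", "title"):
        match = _DIGITS.search(listing.get(name) or "")
        if match:
            return match.group(), name
    # 2. Metadata attached to the listing.
    for key in ("gtin", "upc"):
        value = str((listing.get("metadata") or {}).get(key, ""))
        if _DIGITS.fullmatch(value):
            return value, "metadata." + key
    # 3. URL query string, then URL path.
    url = urlparse(listing.get("url") or "")
    for values in parse_qs(url.query).values():
        for value in values:
            if _DIGITS.fullmatch(value):
                return value, "url"
    match = _DIGITS.search(url.path)
    if match:
        return match.group(), "url"
    return None
```

Returning the source alongside the identifier makes it possible to record where a value came from when a reviewer later inspects the result.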
[0026] At block 306, it is determined if a discrepancy exists. For example, the control circuit can determine if a discrepancy exists between the product listing and information for a product associated with the product identifier (i.e., one of the attributes is incorrect and/or null). For example, the control circuit can access a product database to retrieve the information for a product associated with the product identifier. The control circuit compares the product listing to the information for the product associated with the product identifier. In some embodiments, a discrepancy exists if the product listing includes attributes that differ from the information contained in the product database. For example, the product listing may include an incorrect image, dimension, detail, description, title, brand, availability, expected shipment date, expected delivery date, price, quantity, size, etc. The flow continues at decision diamond 308.
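The comparison at block 306 can be illustrated with a small function that reports an attribute as "null" when the listing omits it and as "incorrect" when the listed value differs from the database value. This is a sketch under simplifying assumptions: both sides are plain dictionaries, string comparison ignores case and surrounding whitespace, and the `Discrepancy` type is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Discrepancy:
    attribute: str
    kind: str  # "incorrect" or "null"
    listed: object
    expected: object


def _normalize(value):
    # Compare strings case-insensitively and ignore surrounding whitespace.
    return value.strip().casefold() if isinstance(value, str) else value


def find_discrepancies(listing: dict, reference: dict) -> list:
    """Compare listing attributes against the database values for the same product."""
    found = []
    for name, expected in reference.items():
        listed = listing.get(name)
        if listed is None or listed == "":
            found.append(Discrepancy(name, "null", listed, expected))
        elif _normalize(listed) != _normalize(expected):
            found.append(Discrepancy(name, "incorrect", listed, expected))
    return found
```

Attributes such as images or dimensions would need their own comparison rules (for example, tolerance on measurements); the sketch only covers exact and case-insensitive matches.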
[0027] At decision diamond 308, it is determined if the attribute was provided externally. For example, the control circuit can determine if the attribute to which the discrepancy is related is provided externally. In some embodiments, attributes are provided externally if a party other than the retailer provided the attribute. For example, if a manufacturer, supplier, third party aggregator, etc. provided the attribute, the attribute is provided externally. Attributes are not provided externally if they are derived internally. That is, attributes are derived internally if the retailer creates the attribute by, for example, measuring, observing, testing, photographing, etc. It should be noted that in some embodiments, certain attributes for a single product may be provided externally and other attributes for the product may be derived internally. In the event that the attribute to which the discrepancy is related is provided externally, the flow continues at block 310.
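The decision at diamond 308 amounts to partitioning the discrepant attributes by provenance. A minimal sketch follows, assuming provenance is available as a mapping from attribute name to the string "external" or "internal". Treating an attribute with unknown provenance as needing review is an assumption of this sketch, not something the text specifies.

```python
def route_discrepancies(discrepant_attributes, provenance_by_attribute):
    """Split attribute names into (auto_update, notify) lists."""
    auto_update, notify = [], []
    for name in discrepant_attributes:
        if provenance_by_attribute.get(name) == "external":
            auto_update.append(name)
        else:
            # Internally derived, or provenance unknown: leave for human review.
            notify.append(name)
    return auto_update, notify
```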
[0028] At block 310, the product listing is updated automatically. For example, the control circuit can automatically update the product listing to remove the discrepancy. In some embodiments, the control circuit can add, replace, and/or remove one or more attributes from the product listing to remove the discrepancy. As one example, if the product listing includes an incorrect image (i.e., the image is the attribute to which the discrepancy is related), the control circuit can remove the incorrect image and update the product listing to include the correct image (e.g., retrieved from the product database).
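The add, replace, and remove operations at block 310 can be sketched as a single pass over the attributes selected for automatic correction. The function below is illustrative only: listings and database records are assumed to be dictionaries, and the function returns a corrected copy rather than writing to a live website.

```python
import copy


def apply_corrections(listing: dict, reference: dict, attributes: list) -> dict:
    """Return a corrected copy of `listing` for the named attributes."""
    updated = copy.deepcopy(listing)
    for name in attributes:
        if name in reference:
            # Add a missing value or replace an incorrect one.
            updated[name] = reference[name]
        else:
            # Remove an attribute the database does not list for this product.
            updated.pop(name, None)
    return updated
```

Working on a copy keeps the original listing available for logging or for rolling back a change.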
[0029] As previously discussed, at decision diamond 308, the control circuit can determine if the attribute to which the discrepancy is related is provided externally. If the attribute to which the discrepancy is related is not provided externally (i.e., the attribute to which the discrepancy is related is derived internally), the flow continues at block 312. At block 312, an indication of the discrepancy is transmitted. For example, the control circuit can cause an indication of the discrepancy to be transmitted. The indication of the discrepancy can be transmitted to an employee of the retailer, a discrepancy aggregation site for addition to a posting of discrepancies, etc. The indication of the discrepancy identifies the attribute to which the discrepancy is related. Additionally, in some embodiments, the indication of the discrepancy can include the product identifier, information about the product, a suggested change to the product listing, a link to the product listing, information retrieved from the product database, a reproduction of the product listing, and/or any other suitable information.
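The contents of the indication at block 312 can be illustrated as a small JSON payload. The field names and the wording of the suggested change are assumptions made for this sketch, and the transport (email, message queue, review dashboard) is deliberately left out.

```python
import json


def build_indication(product_id, listing_url, attribute, listed, expected):
    """Assemble a JSON notification describing one discrepancy."""
    payload = {
        "product_identifier": product_id,
        "listing_url": listing_url,
        "attribute": attribute,
        "listed_value": listed,
        "database_value": expected,
        "suggested_change": f"Set {attribute!r} to {expected!r}",
    }
    # Sorted keys give stable output, which simplifies de-duplication downstream.
    return json.dumps(payload, sort_keys=True)
```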
[0030] In some embodiments, a system for updating a product listing comprises a product database, wherein the product database includes product information for a plurality of products, and a control circuit communicatively coupled to the product database, the control circuit configured to retrieve, from a website, a product listing, extract, from the product listing, a product identifier, retrieve, from the product database based on the product identifier, product information for a product associated with the product identifier, determine, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically update the product listing to remove the discrepancy, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute derived internally, transmit an indication of the discrepancy.
[0031] In some embodiments, an apparatus and a corresponding method performed by the apparatus comprises retrieving, from a website, a product listing, extracting, from the product listing, a product identifier, retrieving, from a product database based on the product identifier, product information for a product associated with the product identifier, wherein the product database includes product information for a plurality of products, determining, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier, and, in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically updating the product listing to remove the discrepancy, and in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute derived internally, transmitting an indication of the discrepancy.
[0032] Those skilled in the art will recognize that a wide variety of other modifications, alterations, and combinations can also be made with respect to the above described embodiments without departing from the scope of the invention, and that such modifications, alterations, and combinations are to be viewed as being within the ambit of the inventive concept.

Claims

What is claimed is:
1. A system for updating product listings, the system comprising:
a product database, wherein the product database includes product information for a plurality of products; and
a control circuit communicatively coupled to the product database, the control circuit configured to:
retrieve, from a website, a product listing;
extract, from the product listing, a product identifier;
retrieve, from the product database based on the product identifier, product information for a product associated with the product identifier;
determine, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier; and
in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically update the product listing to remove the discrepancy;
in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute derived internally, transmit an indication of the discrepancy.
2. The system of claim 1, wherein the product identifier is extracted from an image associated with the product listing.
3. The system of claim 1, wherein the product identifier is one or more of a universal product code (UPC) and a global trade item number (GTIN).
4. The system of claim 1, wherein the product identifier is extracted with use of one or more of optical character recognition (OCR) and stroke width transform (SWT).
5. The system of claim 1, wherein the product listing includes a plurality of attributes.
6. The system of claim 5, wherein the plurality of attributes includes one or more of a title, a description, dimensions, an image, a price, an availability, and a delivery estimate.
7. The system of claim 5, wherein the discrepancy between the product listing and the product information for a product associated with the product identifier occurs because at least one of the plurality of attributes is incorrect.
8. The system of claim 5, wherein the discrepancy between the product listing and the product information for a product associated with the product identifier occurs because at least one of the plurality of attributes is null.
9. The system of claim 1, wherein the product information for a plurality of products is provided by one or more of suppliers, retailers, manufacturers, service providers, subscription services, and internal determinations.
10. A method for updating product listings, the method comprising:
retrieving, from a website, a product listing;
extracting, from the product listing, a product identifier;
retrieving, from a product database based on the product identifier, product information for a product associated with the product identifier, wherein the product database includes product information for a plurality of products;
determining, based on the product listing and the product information for a product associated with the product identifier, that a discrepancy exists between the product listing and the product information for a product associated with the product identifier; and
in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute provided externally, automatically updating the product listing to remove the discrepancy; and
in the event the discrepancy between the product listing and the product information for a product associated with the product identifier is related to an attribute derived internally, transmitting an indication of the discrepancy.
11. The method of claim 10, wherein the extracting step comprises extracting the product identifier from an image associated with the product listing.
12. The method of claim 10, wherein the product identifier is one or more of a universal product code (UPC) and a global trade item number (GTIN).
13. The method of claim 10, wherein the extracting step comprises extracting the product identifier using one or more of optical character recognition (OCR) and stroke width transform (SWT).
14. The method of claim 10, wherein the product listing includes a plurality of attributes.
15. The method of claim 14, wherein the plurality of attributes includes one or more of a title, a description, dimensions, an image, a price, an availability, and a delivery estimate.
16. The method of claim 14, wherein the discrepancy between the product listing and the product information for a product associated with the product identifier occurs because at least one of the plurality of attributes is incorrect.
17. The method of claim 14, wherein the discrepancy between the product listing and the product information for a product associated with the product identifier occurs because at least one of the plurality of attributes is null.
18. The method of claim 10, wherein the product information for the plurality of products is provided by one or more of suppliers, retailers, manufacturers, service providers, subscription services, and internal determinations.
PCT/US2018/061781 2017-11-20 2018-11-19 Barcode validator and text extraction from images for data quality checks and remediation WO2019099987A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762588672P 2017-11-20 2017-11-20
US62/588,672 2017-11-20

Publications (1)

Publication Number Publication Date
WO2019099987A1 (en) 2019-05-23

Family

ID=66539167

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/061781 WO2019099987A1 (en) 2017-11-20 2018-11-19 Barcode validator and text extraction from images for data quality checks and remediation

Country Status (1)

Country Link
WO (1) WO2019099987A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8527475B1 (en) * 2011-09-21 2013-09-03 Amazon Technologies, Inc. System and method for identifying structured data items lacking requisite information for rule-based duplicate detection
US20140108206A1 (en) * 2012-10-15 2014-04-17 Cbs Interactive Inc. System and method for managing product catalogs
US20170286901A1 (en) * 2016-03-29 2017-10-05 Bossa Nova Robotics Ip, Inc. System and Method for Locating, Identifying and Counting Items
US9818144B2 (en) * 2013-04-09 2017-11-14 Ebay Inc. Visual product feedback
US20180322540A1 (en) * 2017-05-04 2018-11-08 Wal-Mart Stores, Inc. Systems and methods for updating website modules

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18879257

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 18879257

Country of ref document: EP

Kind code of ref document: A1