WO2019245801A1 - Digital supplement association and retrieval for visual search - Google Patents

Digital supplement association and retrieval for visual search

Info

Publication number
WO2019245801A1
Authority
WO
WIPO (PCT)
Prior art keywords
supplement
digital
digital supplement
image
computing device
Prior art date
Application number
PCT/US2019/036542
Other languages
English (en)
Inventor
Alan JOYCE
Edgar Chung
Zhe Yang
Ian MESA
Joseph Olson
Original Assignee
Google Llc
Priority date
Filing date
Publication date
Priority claimed from US16/014,512 (US10579230B2)
Priority claimed from US16/014,520 (US10878037B2)
Application filed by Google LLC
Priority to KR1020227044320A (KR20230003388A)
Priority to CN201980022269.0A (CN112020712B)
Priority to EP19735444.2A (EP3811238A1)
Priority to JP2020570146A (JP7393361B2)
Priority to KR1020207031107A (KR20200136030A)
Publication of WO2019245801A1
Priority to JP2022077546A (JP2022110057A)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40: Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/41: Indexing; Data structures therefor; Storage structures
    • G06F16/48: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51: Indexing; Data structures therefor; Storage structures
    • G06F16/53: Querying
    • G06F16/532: Query formulation, e.g. graphical querying
    • G06F16/58: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/70: Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/903: Querying
    • G06F16/90335: Query processing
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Definitions

  • This application claims priority to U.S. Nonprovisional Patent Application No. 16/014,520, filed on June 21, 2018, entitled “DIGITAL SUPPLEMENT ASSOCIATION AND RETRIEVAL FOR VISUAL SEARCH”, the disclosure of which is incorporated by reference herein in its entirety.
  • Mobile computing devices, such as smartphones, often include cameras. These cameras can be used to capture images of entities in the environment around the computing device. Various types of content or experiences that relate to those entities may be available for users via the mobile computing device.
  • This disclosure describes systems and methods for digital supplement association and retrieval for visual search.
  • systems and techniques described herein may be used to provide digital supplements, such as augmented reality (AR) content or experiences, that are responsive to a visual search.
  • the visual search may, for example, be based on an image or an entity identified within an image.
  • the digital supplement may, for example, provide information or functionality associated with the image.
  • One aspect is a computer-implemented method that includes receiving data specifying a digital supplement, the data identifying a digital supplement and a supplement anchor for associating the digital supplement with visual content.
  • the method also includes generating a data structure instance that specifies the digital supplement and the supplement anchor.
  • the method further includes, after generating the data structure instance, enabling triggering of the digital supplement by an image based at least on storing the data structure instance in a database that includes a plurality of other data structure instances.
  • Each of the other data structure instances specifies a digital supplement and one or more supplement anchors.
  • Another aspect is a computing device that includes at least one processor and memory storing instructions.
  • the instructions, when executed by the at least one processor, cause the computing device to receive data specifying a digital supplement, the data identifying a digital supplement, a supplement anchor for associating the digital supplement with visual content, and context information.
  • the instructions also cause the computing device to generate a data structure instance that specifies the digital supplement, the supplement anchor, and the context information.
  • the instructions further cause the computing device to, after generating the data structure instance, enable triggering of the digital supplement by an image based at least on storing the data structure instance in a database that includes a plurality of other data structure instances.
  • Each of the other data structure instances specifies a digital supplement and one or more supplement anchors.
  • Yet another aspect is a computer-implemented method that includes receiving a visual-content query from a client computing device and identifying a supplement anchor based on the visual-content query. The method also includes generating an ordered list of digital supplements based on the identified supplement anchor and transmitting the ordered list to the client computing device.
  • FIG. 1 is a block diagram illustrating a system according to an example implementation.
  • FIG. 2 is a third person view of an example physical space in which an embodiment of the client computing device of FIG. 1 is accessing digital supplements.
  • FIG. 3 is a diagram of an example method of enabling triggering of a digital supplement, in accordance with implementations described herein.
  • FIG. 4 is a diagram of an example method of enabling triggering of a digital supplement, in accordance with implementations described herein.
  • FIG. 5 is a diagram of an example method of searching for and presenting a digital supplement, in accordance with implementations described herein.
  • FIG. 6 is a diagram of an example method of identifying and presenting a digital supplement based on an image, in accordance with implementations described herein.
  • FIGS. 7A-7C are schematic diagrams of user interface screens displayed by embodiments of the client computing device of FIG. 1 to conduct a visual-content search and display a digital supplement.
  • FIGS. 8A-8C are schematic diagrams of user interface screens displayed by embodiments of the client computing device of FIG. 1 to conduct a visual-content search and display a digital supplement.
  • FIGS. 9A and 9B are schematic diagrams of user interface screens displayed by embodiments of the client computing device of FIG. 1 to conduct a visual-content search and display a digital supplement.
  • FIGS. 10A-10C are schematic diagrams of user interface screens displayed by embodiments of the client computing device of FIG. 1 to conduct a visual-content search and display a digital supplement.
  • FIGS. 11A-11C are schematic diagrams of user interface screens displayed by embodiments of the client computing device of FIG. 1 to conduct various visual-content searches within a store.
  • FIGS. 12A-12C are schematic diagrams of user interface screens displayed by embodiments of the client computing device of FIG. 1 during various visual-content searches.
  • FIG. 13 is a schematic diagram of an example of a computer device and a mobile computer device that can be used to implement the techniques described herein.
  • the present disclosure describes technological improvements that simplify the identification and presentation of digital supplements based on visual content.
  • Some implementations of technology described herein generate an index of digital supplements that are relevant to particular types of visual content and provide those digital supplements in response to a visual-content query received from a client computing device.
  • This index can allow a user to access relevant digital supplements that are provided by network-accessible resources (e.g., web pages) disposed throughout the world. This may provide a functional data structure that allows more efficient retrieval of information.
  • A client computing device, such as a smartphone, may capture an image of a supplement anchor, such as an entity. The client computing device may then transmit a visual-content query based on the image to a server computing device to retrieve digital supplements associated with the identified supplement anchor.
  • the supplement anchor is based on the physical environment around the client computing device and the digital supplement is virtual content that may supplement a user’s experience in the physical environment.
  • the visual-content query may include the image or data that is determined from the image (e.g., such as an indicator of the identified supplement anchor).
  • An example of data determined from the image is text that is extracted from the image using, for example, optical character recognition.
  • Other examples of data extracted from the image include values read from barcodes, QR codes, etc., in the image, and identifiers or descriptions of entities, products, or entity types identified in the image.
  • the entities, products, or entity types may be identified in the image using, for example, a neural network system such as a convolutional neural network system.
  • the identifiers or descriptions of entities, products, or entity types may include metadata or a reference to a record in a database that relates to an entity, product, or entity type.
  • Non-limiting examples of the entities include buildings, works of art, products, books, posters, photographs, catalogs, signs, documents (e.g., business cards, receipts, coupons, catalogs), people, and body parts.
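  • As an illustration only (the patent does not prescribe particular libraries), the sketch below shows how OCR text and barcode or QR-code values might be pulled from a captured image so they can be mapped to supplement anchors; pytesseract, pyzbar, and the function name extract_query_data are assumptions made for this example.

      # Illustrative only: extract OCR text and barcode/QR payloads from an image
      # so they can later be mapped to supplement anchors. pytesseract and pyzbar
      # are third-party libraries chosen for this sketch, not named in the patent.
      from PIL import Image
      import pytesseract
      from pyzbar.pyzbar import decode


      def extract_query_data(image_path: str) -> dict:
          """Return recognized text and code values from a captured image."""
          image = Image.open(image_path)
          text = pytesseract.image_to_string(image).strip()        # OCR text
          codes = [c.data.decode("utf-8") for c in decode(image)]  # barcode/QR values
          return {"text": text, "codes": codes}


      # The returned values could then be matched against known supplement anchors.
      print(extract_query_data("captured_image.jpg"))
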
  • Various types of digital supplements may be available that are related to a supplement anchor.
  • the digital supplement may be provided by a network-accessible resource, such as a web page that is available on the Internet.
  • Some implementations generate and maintain an index of digital supplements that are associated with entities for use in responding to visual content queries.
  • the index may, for example, be populated by crawling network-accessible resources to determine whether the network-accessible resources include or provide any digital supplements and to determine the supplement anchors associated with those digital supplements.
  • the network-accessible resource may include metadata that identifies the supplement anchors (e.g., text, codes, entities, or types of entities) with which a digital supplement is associated.
  • the metadata may be included by the network-accessible resource in response to a hypertext transfer protocol (HTTP) request.
  • the metadata may be provided in various formats such as extensible markup language (XML), JavaScript Object Notation (JSON), or another format.
  • the metadata for a digital supplement may include one or more of the following: a type indicator, an anchor indicator, a name, a description, a snippet of the content (i.e., an excerpt or preview of a portion of the content), an associated image, a link such as a URL to the digital supplement, and an identifier of an application associated with the digital supplement.
  • the metadata may also include information about a publisher of the digital supplement.
  • the metadata may include one or more of a publisher name, a publisher description, and an image or icon associated with the publisher.
  • the metadata includes context information related to providing the digital supplement.
  • the metadata may also include conditions (e.g., geographic conditions, required applications) associated with providing or accessing the digital supplement.
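  • For concreteness, a hypothetical example of such metadata is sketched below as a Python dictionary; every field name and value is an assumption chosen to mirror the elements listed above, not a schema defined by the patent.

      supplement_metadata = {
          "type": "ar_experience",                        # type indicator
          "anchors": ["entity:artwork", "Starry Night"],  # supplement anchors (text, entities, or entity types)
          "name": "Gallery Audio Guide",
          "description": "Narrated background on the artwork.",
          "snippet": "Painted in June 1889, this work depicts...",
          "image": "https://example.com/preview.png",
          "url": "https://example.com/supplements/audio-guide",  # link to the digital supplement
          "application": "com.example.museumguide",              # associated application identifier
          "publisher": {
              "name": "Example Museum",
              "description": "Illustrative publisher entry.",
              "icon": "https://example.com/icon.png",
          },
          "conditions": {"geo": "museum grounds", "required_app": "com.example.museumguide"},
      }
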
  • the identified digital supplements may be added to an index that is stored in a memory.
  • the associated supplement anchor for a digital supplement is used as a key to the index.
  • the digital supplements may also be associated with various scores.
  • a digital supplement may be associated with a prestige score that is based on how many other links are found (e.g., while crawling network-accessible resources) that reference the digital supplement or the network-accessible resource associated with the digital supplement and the prestige of the network-accessible resources that provide those links.
  • a digital supplement may be associated with one or more relevance scores that correspond to the relevance of the digital supplement (or the associated network-accessible resource) to a particular anchor.
  • the relevance score may also be associated with a keyword or subject matter.
  • the relevance score may be determined based on one or more of the content of the digital supplement, the content of the network-accessible resource, the content of sites that link to the network-accessible resource, and the contents (e.g., text) of links to the network-accessible resources.
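  • The link-based prestige idea can be illustrated with a toy iteration in Python; the damping constants, the example resources, and the update rule below are arbitrary assumptions used only to show how prestige can depend on both the number and the prestige of linking resources.

      # Toy prestige computation over an assumed crawl graph: each resource passes a
      # share of its prestige to the resources it links to.
      links_to = {
          "blog.example.com": ["supplement.example.com/a"],
          "news.example.com": ["supplement.example.com/a", "supplement.example.com/b"],
          "supplement.example.com/a": [],
          "supplement.example.com/b": [],
      }

      prestige = {resource: 1.0 for resource in links_to}
      for _ in range(10):  # a handful of rounds is enough for this tiny graph
          updated = {resource: 0.15 for resource in links_to}
          for source, targets in links_to.items():
              for target in targets:
                  updated[target] += 0.85 * prestige[source] / len(targets)
          prestige = updated

      # supplement.example.com/a ends up with higher prestige than .../b because it
      # has more inbound links.
      print(prestige)
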
  • FIG. 1 is a block diagram illustrating a system 100 according to an example implementation.
  • the system 100 may associate digital supplements with entities or entity types and may retrieve digital supplements in response to visual searches.
  • a visual search is a search based on visual content.
  • a visual search may be performed based on a visual-content query.
  • a visual-content query is a query based on an image or other visual content.
  • a visual-content query may include an image.
  • a visual-content query may include text or data that is based on an image.
  • the text or data may be generated by recognizing one or more entities in an image.
  • the system 100 includes a client computing device 102, a search server 152, and a digital supplement server 172. Also shown is a network 190 over which the client computing device 102, the search server 152, and the digital supplement server 172 may communicate.
  • the client computing device 102 may include a processor assembly 104, a communication module 106, a sensor system 110, and a memory 120.
  • the sensor system 110 may include various sensors, such as a camera assembly 112, an inertial motion unit (IMU) 114, and a global positioning system (GPS) receiver 116. Implementations of the sensor system 110 may also include other sensors, including, for example, a light sensor, an audio sensor, an image sensor, a distance and/or proximity sensor, a contact sensor such as a capacitive sensor, a timer, and/or other sensors and/or different combinations of sensors.
  • the client computing device 102 is a mobile device (e.g., a smartphone).
  • the camera assembly 112 captures images or videos of the physical space around the client computing device 102.
  • the camera assembly 112 may include one or more cameras.
  • the camera assembly 112 may also include an infrared camera. Images captured with the camera assembly 112 may be used to identify supplement anchors and to form visual-content queries.
  • images captured with the camera assembly 112 may also be used to determine a location and orientation of the client computing device 102 within a physical space, such as an interior space, based on a representation of that physical space that is received from the memory 120 or an external computing device.
  • the representation of a physical space may include visual features of the physical space (e.g., features extracted from images of the physical space).
  • the representation may also include location-determination data associated with those features that can be used by a visual positioning system to determine location and/or position within the physical space based on one or more images of the physical space.
  • the representation may also include a three-dimensional model of at least some structures within the physical space. In some implementations, the representation does not include three-dimensional models of the physical space.
  • the IMU 114 may detect motion, movement, and/or acceleration of the client computing device.
  • the IMU 114 may include various different types of sensors such as, for example, an accelerometer, a gyroscope, a magnetometer, and other such sensors. An orientation of the client computing device 102 may be detected and tracked based on data provided by the IMU 114 or GPS receiver 116.
  • the GPS receiver 116 may receive signals emitted by GPS satellites.
  • the signals include a time and position of the satellite. Based on receiving signals from several satellites (e.g., at least four), the GPS receiver 116 may determine a global position of the client computing device 102.
  • the memory 120 may include an application 122, other applications 140, and a device positioning system 142.
  • the other applications 140 include any other applications that are installed or otherwise available for execution on the client computing device 102.
  • the application 122 may cause one of the other applications 140 to be launched to provide a digital supplement.
  • some digital supplements may only be available if the other applications 140 include a specific application associated with or required to provide the digital supplement.
  • the device positioning system 142 determines a position of the client computing device 102.
  • the device positioning system 142 may use the sensor system 110 to determine a location and orientation of the client computing device 102 globally or within a physical space.
  • the device positioning system 142 determines a location of the client computing device 102 based on, for example, cellular triangulation.
  • the client computing device 102 may include a visual positioning system that compares images captured by the camera assembly 112 (or features extracted from those images) to a known arrangement of features within the representation of the physical space to determine the six degree-of-freedom pose (e.g., the location and orientation) of the client computing device 102 within a physical space.
  • the application 122 may include a supplement anchor identification engine 124, a digital supplement retrieval engine 126, a digital supplement presentation engine 128, and a user interface engine 130. Some implementations of the application 122 may include fewer, additional, or other components.
  • the supplement anchor identification engine 124 identifies supplement anchors based on, for example, images captured with the camera assembly 112.
  • the supplement anchor identification engine 124 analyzes an image to identify text.
  • the text may then be used to identify an anchor.
  • the text may be mapped to a node in a knowledge graph.
  • the text may be recognized as the name of an entity such as a person, place, product, building, artwork, movie, or other type of entity.
  • the text may be recognized as a phrase that is commonly associated with a specific entity or as a phrase that describes a specific entity.
  • the text may then be recognized as an anchor associated with the specific entity.
  • the supplement anchor identification engine 124 identifies one or more codes, such as a barcode, QR code, or another type of code, within an image. The code may then be mapped to a supplement anchor.
  • the supplement anchor identification engine 124 may include a machine learning module that can recognize at least some types of entities within an image.
  • the machine learning module may include a neural network system.
  • Neural networks are computational models used in machine learning and made up of nodes organized in layers with weighted connections. Training a neural network uses training examples, each example being an input and a desired output, to determine, over a series of iterative rounds, weight values for the connections between layers that increase the likelihood of the neural network providing the desired output for a given input. During each training round, the weights are adjusted to address incorrect output values. Once trained, the neural network can be used to predict an output based on provided input.
  • the neural network system includes a convolutional neural network (CNN).
  • a convolutional neural network (CNN) is a neural network in which at least one of the layers of the neural network is a convolutional layer.
  • a convolutional layer is a layer in which the values of a layer are calculated based on applying a kernel function to a subset of the values of a previous layer. Training the neural network may involve adjusting weights of the kernel function based on the training examples. Typically, the same kernel function is used to calculate each value in a convolutional layer. Accordingly, there are far fewer weights that must be learned while training a convolutional layer than a fully-connected layer (e.g., a layer in which each value is calculated as a weighted combination of all of the values in the previous layer).
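  • To make the weight-count difference concrete, the small comparison below uses PyTorch purely for illustration (the patent does not specify any framework); the layer sizes are arbitrary assumptions.

      import torch.nn as nn


      def count_params(layer: nn.Module) -> int:
          return sum(p.numel() for p in layer.parameters())


      # 3-channel 32x32 input mapped to 16 feature maps with a 3x3 kernel.
      conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)

      # A fully-connected layer producing a comparably sized (16x30x30) output
      # from the same flattened 3x32x32 input.
      dense = nn.Linear(in_features=3 * 32 * 32, out_features=16 * 30 * 30)

      print(count_params(conv))   # 448 parameters (16*3*3*3 weights + 16 biases)
      print(count_params(dense))  # 44,251,200 parameters
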
  • a textual description of the entity or entity type may be generated. Additionally, the entity or entity type may be mapped to a supplement anchor.
  • a supplement anchor is associated with one or more digital supplements.
  • the supplement anchor identification engine 124 determines a confidence score for a recognized anchor.
  • a higher confidence score may indicate that the content (e.g., image, extracted text, barcode, QR code) from an image is more likely to be associated with the determined anchor than if a lower confidence score is determined.
  • Although FIG. 1 shows the supplement anchor identification engine 124 as a component of the application 122 on the client computing device 102, some implementations include a supplement anchor identification engine on the search server 152.
  • the client computing device 102 may send an image captured by the camera assembly 112 to the search server 152, which may then identify supplement anchors within the image.
  • the supplement anchor identification engine 124 identifies potential supplement anchors. For example, the supplement anchor identification engine 124 may identify (recognize) various entities within an image. Identifiers of the recognized entities may then be transmitted to the search server 152, which may determine whether any of the entities are associated with any supplement anchors. In some implementations, the search server 152 may use the identified entities as contextual information even if the identified entities are not supplement anchors.
  • the digital supplement retrieval engine 126 retrieves digital supplements.
  • the digital supplement retrieval engine 126 may retrieve digital supplements associated with supplement anchors identified by the supplement anchor identification engine 124.
  • the digital supplement retrieval engine 126 retrieves a digital supplement from the search server 152 or the digital supplement server 172.
  • the digital supplement retrieval engine 126 may retrieve one or more digital supplements that are associated with the identified supplement anchors.
  • the digital supplement retrieval engine 126 may generate a visual-content query that includes the image (or identifiers of supplement anchors or entities within the image) and transmit the visual-content query to the search server 152.
  • the visual- content query may also include contextual information such as the location of the client computing device 102.
  • data relating to the digital supplements, such as a name, an image, or a description, is retrieved and presented to a user (e.g., by the user interface engine 130). If multiple digital supplements are presented, a user may select one of the digital supplements via a user interface generated by the user interface engine 130.
  • the digital supplement presentation engine 128 presents or causes digital supplements to be presented on the client computing device 102.
  • the digital supplement presentation engine 128 causes the client computing device to initiate one of the other applications 140.
  • the digital supplement presentation engine 128 causes information or content to be displayed.
  • the digital supplement presentation engine 128 may cause the user interface engine 130 to generate a user interface that includes information or content from a digital supplement to be displayed by the client computing device 102.
  • the digital supplement presentation engine 128 is triggered by the digital supplement retrieval engine 126 retrieving a digital supplement. The digital supplement presentation engine 128 may then trigger the display device 108 to display content associated with a digital supplement.
  • the digital supplement presentation engine 128 causes a digital supplement to be displayed at a different time than when the digital supplement retrieval engine 126 retrieves the digital supplement.
  • a digital supplement may be retrieved in response to a visual-content query at a first time and the digital supplement may be presented at a second time.
  • a digital supplement may be retrieved in response to a visual-content query based on an image of a home furnishing or furniture from a catalog or store at a first time (e.g., while the user is looking through a catalog or is at a store).
  • a digital supplement that includes AR content of the home furnishing or furniture may be presented at a second time (e.g., while the user is in a room in which the home furnishing or furniture may be placed).
  • the user interface engine 130 generates user interfaces.
  • the user interface engine 130 may also cause the client computing device 102 to display the generated user interfaces.
  • the generated user interfaces may, for example, display information or content from a digital supplement.
  • the user interface engine 130 generates a user interface that includes multiple user-actuatable controls that are each associated with a digital supplement. For example, a user may actuate one of the user-actuatable controls (e.g., by touching the control on a touchscreen, clicking on the control using a mouse or another input device, or otherwise actuating the control).
  • the search server 152 is a computing device.
  • the search server 152 may respond to search requests such as visual-content queries.
  • the response may include one or more digital supplements that are potentially relevant to the visual-content query.
  • the search server 152 includes memory 160, a processor assembly 154, and a communication module 156.
  • the memory 160 may include a content crawler 162, a digital supplement search engine 164, and a digital supplement data store 166.
  • the content crawler 162 may crawl network-accessible resources to identify digital supplements.
  • the content crawler 162 may access web pages that are accessible via the Internet, such as web pages provided by the digital supplement server 172.
  • Crawling a network-accessible resource may include requesting the resource from a web server and parsing at least a portion of the resource.
  • Digital supplements may be identified based on metadata provided by the network-accessible resource, such as XML or JSON data that provides information about a digital supplement.
  • the crawler identifies network-accessible resources based on extracting links from previously crawled network-accessible resources.
  • the content crawler 162 may also identify network-accessible resources to crawl based on receiving input submitted by a user. For example, a user may submit a URL (or other information) to a network-accessible resource that includes a digital supplement via a web form or application programming interface (API).
  • the content crawler 162 generates an index of the identified digital supplements.
  • the content crawler 162 may also generate scores associated with the digital supplements, such as relevance scores or popularity (prestige) scores.
  • the digital supplement search engine 164 receives search queries and generates responses that may include one or more potentially relevant digital supplements.
  • the digital supplement search engine 164 may receive a visual-content query from the client computing device 102.
  • the visual-content query may include an image.
  • the digital supplement search engine 164 may identify supplement anchors in the image and, based on the identified supplement anchor, identify related or potentially relevant digital supplements.
  • the digital supplement search engine 164 may transmit to the client computing device 102 a response that includes the digital supplement or information that can be used to access the digital supplement.
  • the digital supplement search engine 164 may return information associated with multiple digital supplements. For example, a list of digital supplements may be included in a response to the query. The list may be ordered based on relevance to the supplement anchor, popularity, or other properties of the digital supplement.
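  • A simplified, self-contained sketch of this lookup-and-order step is shown below; the index contents, score values, and field names are invented for illustration and do not come from the patent.

      SUPPLEMENT_INDEX = {
          "entity:artwork": [
              {"name": "Gallery Audio Guide", "description": "Narrated background.",
               "url": "https://example.com/audio-guide", "relevance": 0.9, "prestige": 0.8},
              {"name": "Museum Tour", "description": "Guided tour stop.",
               "url": "https://example.com/tour", "relevance": 0.7, "prestige": 0.9},
          ],
      }


      def respond_to_query(anchors: list[str]) -> list[dict]:
          """Return digital supplements for the identified anchors, most relevant first."""
          matches = []
          for anchor in anchors:
              matches.extend(SUPPLEMENT_INDEX.get(anchor, []))
          return sorted(matches, key=lambda s: (s["relevance"], s["prestige"]), reverse=True)


      # The ordered list carries descriptive data (name, description) and access data (url).
      ordered_list = respond_to_query(["entity:artwork"])
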
  • the visual-content queries may, for example, include images captured by the camera assembly 112 or text or other data associated with images captured by the camera assembly 112.
  • the visual-content queries may also include other information such as the location of the client computing device 102 or an identifier of a user of the client computing device 102.
  • the search server 152 may determine a probable location of the client computing device 102 from the user identifier (e.g., if the user has enabled a location service on the client computing device 102 that associates information about a user’s location with the user’s account).
  • the digital supplement data store 166 stores information about digital supplements.
  • the digital supplement data store 166 includes an index of digital supplements.
  • the index may be generated by the content crawler 162.
  • the digital supplement search engine 164 may use the index to respond to search queries.
  • the digital supplement server 172 is a computing device.
  • the digital supplement server 172 provides digital supplements.
  • the digital supplement server 172 includes memory 180, a processor assembly 174, and a communication module 176.
  • the memory 180 may include a digital supplement 182 and metadata 184.
  • the memory 180 may also include other network- accessible resources such as web pages that are not necessarily digital supplements.
  • the memory 180 may store a web page that includes metadata to provide details about one or more digital supplements and how to access those digital supplements.
  • the memory 180 may include a resource serving engine such as a web server that, for example, responds to requests, such as HTTP requests, with network-accessible resources such as web pages and digital supplements.
  • the digital supplement 182 is content of any type that can be provided as a supplement to something in the physical environment around a user.
  • the digital supplement 182 may also include content of any type that can supplement a stored image (e.g., of a previous physical environment around a user).
  • the digital supplement may be associated with a supplement anchor, such as an image, an object or product identified in the image, or a location.
  • the digital supplement 182 may include one or more images, audio content, textual data, videos, games, data files, applications, or structured text documents. Examples of structured text documents include hypertext markup language (HTML) documents, XML documents, and other types of structured text documents.
  • the digital supplement 182 may cause an application to be launched and may define parameters for that application.
  • the digital supplement 182 may also cause a request to be transmitted to a server (e.g., an HTTP request) and may define parameters for that request.
  • the digital supplement 182 initiates a workflow for completing an activity, such as a workflow for completing a purchase.
  • the digital supplement 182 may transmit an HTTP request to a server that adds a particular product to a user’s shopping cart, adds a coupon code, and retrieves a purchase confirmation page.
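  • Purely as a sketch of such a workflow (the endpoints, parameter names, and use of the requests library are assumptions made for this example, not part of the disclosure):

      import requests

      STORE = "https://store.example.com"  # hypothetical shopping site


      def run_purchase_workflow(session: requests.Session, product_id: str, coupon: str) -> str:
          """Add a product to the cart, apply a coupon code, and fetch the confirmation page."""
          session.post(f"{STORE}/cart/add", data={"product_id": product_id})
          session.post(f"{STORE}/cart/coupon", data={"code": coupon})
          return session.get(f"{STORE}/checkout/confirmation").text


      confirmation_page = run_purchase_workflow(requests.Session(), "SKU-1234", "SAVE10")
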
  • the metadata 184 is data that describes a digital supplement.
  • the metadata 184 may describe one or more digital supplements that are provided by the digital supplement server 172 or that are provided elsewhere.
  • the metadata 184 for a digital supplement may include one or more of the following: a type indicator, an anchor indicator, a name, a description, a preview snippet or excerpt, an associated image, a link such as a URL to the digital supplement, and an identifier of an application associated with the digital supplement.
  • the metadata may also include information about a publisher of the digital supplement, such as a publisher name, a publisher description, and an image or icon associated with the publisher.
  • the metadata also includes context information about the digital supplement or that must be satisfied to provide the digital supplement.
  • the metadata may include conditions (e.g., geographic conditions, client computing devices requirements, required applications) that must be met to access the digital supplement.
  • Example context information includes locations, entities identified within an image, or multiple entities identified within an image (e.g., some digital supplements may require a combination of entities to be recognized within the image).
  • the recognized entities may be supplement anchors. In some implementations, the recognized entities are not supplement anchors but instead provide contextual information.
  • the metadata 184 may also include supplement anchors (e.g., text, codes, entities, or types of entities) that are associated with a digital supplement.
  • the metadata 184 may be stored in various formats.
  • the metadata 184 is stored in a database.
  • the metadata 184 may also be stored as an XML file, a JSON file or another format file.
  • the digital supplement server 172 retrieves the metadata 184 from a database and formats the metadata 184 as XML, JSON, or another format.
  • the search server 152 may access the metadata 184 to generate data stored in the digital supplement data store 166 and used to respond to search requests from the client computing device 102.
  • the communication module 106 includes one or more devices for communicating with other computing devices, such as the search server 152 and the digital supplement server 172.
  • the communication module 106 may communicate via wireless or wired networks, such as the network 190.
  • the communication module 156 of the search server 152 and the communication module 176 of the digital supplement server 172 may be similar to the communication module 106.
  • the display device 108 may, for example, include an LCD (liquid crystal display) screen, an LED (light emitting diode) screen, an OLED (organic light emitting diode) screen, a touchscreen, or any other screen or display for displaying images or information to a user.
  • the display device 108 includes a light projector arranged to project light onto a portion of a user’s eye.
  • the memory 120 can include one or more non-transitory computer-readable storage media.
  • the memory 120 may store instructions and data that are usable by the client computing device 102 to implement the technologies described herein, such as to generate visual-content queries based on captured images, transmit visual-content queries, receive responses to the visual-content queries, and present a digital supplement identified in a response to a visual-content query.
  • the memory 160 of the search server 152 and the memory 180 of the digital supplement server 172 may be similar to the memory 120 and may store data instructions that are usable to implement the technology of the search server 152 and the digital supplement server 172, respectively.
  • the processor assembly 104 includes one or more devices that are capable of executing instructions, such as instructions stored by the memory 120, to perform various tasks associated with digital supplement association and retrieval for visual search.
  • the processor assembly 104 may include a central processing unit (CPU) and/or a graphics processor unit (GPU).
  • image/video rendering tasks such as generating and displaying a user interface or displaying portions of a digital supplement may be offloaded from the CPU to the GPU.
  • some image recognition tasks may also be offloaded from the CPU to the GPU.
  • Although FIG. 1 does not show it, some implementations include a head-mounted display device (HMD).
  • the HMD may be a separate device from the client computing device 102 or the client computing device 102 may include the HMD.
  • the client computing device 102 communicates with the HMD via a cable.
  • the client computing device 102 may transmit video signals and/or audio signals to the HMD for display for the user, and the HMD may transmit motion, position, and/or orientation information to the client computing device 102.
  • the client computing device 102 may also include various user input components (not shown) such as a controller that communicates with the client computing device 102 using a wireless communications protocol.
  • the client computing device 102 may communicate via a wired connection (e.g., a Universal Serial Bus (USB) cable) or via a wireless communication protocol (e.g., any WiFi protocol, any BlueTooth protocol, Zigbee, etc.) with a HMD (not shown).
  • the client computing device 102 is a component of the HMD and may be contained within a housing of the HMD.
  • the network 190 may be the Internet, a local area network (LAN), a wireless local area network (WLAN), and/or any other network.
  • the client computing device 102 may receive the audio/video signals, which may be provided as part of a digital supplement in an illustrative example implementation, via the network.
  • FIG. 2 is a third person view of an example physical space 200 in which an embodiment of the client computing device 102 is accessing digital supplements.
  • the physical space 200 includes an object 222.
  • the object 222 is an artwork on a wall of the physical space 200.
  • the object 222 is contained within the field of view 204 of the camera assembly 112 of the client computing device 102.
  • the user interface screen 206 may, for example, be generated by the user interface engine 130 of the client computing device 102.
  • the user interface screen 206 includes an image display panel 208, and a digital supplement selection panel 210.
  • the image display panel 208 shows an image.
  • the image display panel 208 may show an image corresponding to a real-time feed from the camera assembly 112 of the client computing device 102.
  • the image display panel 208 shows a previously captured image or an image that has been retrieved from the memory 120 of the client computing device 102.
  • the user interface screen 206 is displayed to the user on a display device of the client computing device 102.
  • The user interface screen 206 may be overlaid on an image (or video feed being captured by the camera of the computing device) of the physical space. Additionally, the user interface screen 206 may be displayed as AR content over the user’s field of view using an HMD worn by the user.
  • the image display panel 208 may also include annotations or user interface elements that may relate to the image.
  • the image display panel 208 may include an indicator that an object in the image (e.g., the object 222) has been recognized as a supplement anchor.
  • the indicator may include a user-actuatable control to access or view information about digital supplements associated with the identified supplement anchor.
  • the image displayed in the image display panel 208 may include multiple objects that are recognized as supplement anchors, and the image display panel 208 may include multiple annotations that overlay the image to identify those supplement anchors.
  • the supplement anchors may be recognized by a supplement anchor identification engine of the client computing device 102.
  • the supplement anchors are identified by transmitting an image to the search server 152.
  • the search server 152 may then analyze the image and identify supplement anchors in the image.
  • the search server 152 may transmit one or more of the locations (e.g., image coordinates) or the dimensions of any identified objects that are associated with supplement anchors to the client computing device 102.
  • the client computing device 102 may then update the user interface screen to show annotations that identify the supplement anchors (or associated objects) in the image.
  • the client computing device 102 may track the locations of the supplement anchors (or associated objects) in a video stream (e.g., a sequence of sequentially captured images) captured by the camera assembly 112 (e.g., the supplement anchor identification engine 124 may track supplement anchors identified by the search server 152).
  • the digital supplement selection panel 210 allows a user to select a digital supplement for presentation.
  • the digital supplement selection panel 210 may include a menu that includes user-actuatable controls that are each associated with a digital supplement.
  • the digital supplement selection panel 210 includes a user- actuatable control 212 and a user-actuatable control 214, which each include information about the associated digital supplement.
  • the user-actuatable controls may display one or more of a name (or title), a brief description, and an image associated with the digital supplements, which may be received from the search server 152.
  • the content of the associated digital supplement may be presented to the user.
  • Presenting the digital supplement to the user may include causing the client computing device 102 to display a user interface screen that includes images, videos, text, other content, or a combination thereof from the digital supplement.
  • the digital supplement content is displayed as an overlay on the image display panel 208 over an image or camera feed.
  • the digital supplement content may be three-dimensional augmented reality content.
  • presenting a digital supplement includes activating an application that is installed on the client computing device 102 (e.g., one of the other applications 140). Presenting the digital supplement may also include transmitting a request to a URL associated with the digital supplement.
  • the request may include parameters associated with the digital supplement, such as an identifier of a product or object identified within the image.
  • the image (or other content) from the visual-content query is passed as a parameter with the request.
  • the image may also be provided via an API associated with a digital supplement server 172.
  • the client computing device 102 transmits the image to the digital supplement server 172.
  • the search server 152 may transmit the image to the digital supplement server 172.
  • the client computing device 102 may transmit an indicator of the selection to the search server 152 and the search server 152 may then transmit the image to a corresponding digital supplement server.
  • the client computing device 102 may also transmit a URL to a location on the search server 152 that the digital supplement server 172 can use to access the image.
  • these implementations may reduce the amount of data the client computing device needs to transmit.
  • the digital supplement associated with the user-actuatable control 212 may cause information about the object 222, such as information from a museum, to be displayed.
  • the digital supplement associated with the user-actuatable control 214 may cause information related to a museum tour to be displayed. For example, presentation of the digital supplement may cause a stop on a museum tour to be marked as completed and information about a next stop to be displayed.
  • FIG. 3 is a diagram of an example method 300 of enabling triggering of a digital supplement, in accordance with implementations described herein.
  • This method 300 may, for example, be performed by the content crawler 162 of the search server 152 to allow a user to access a digital supplement based on a visual-content query.
  • data specifying a digital supplement is received.
  • the data may identify a digital supplement and situations in which the digital supplement should be provided.
  • the data specifying a digital supplement may be received in various ways.
  • the data specifying the digital supplement may be received from a network- accessible resource such as a web page that includes metadata about the digital supplement.
  • the data specifying a digital supplement may also be received via an API or form provided by, for example, the search server 152.
  • the data specifying a digital supplement may also be received from a memory location or data store.
  • the data about the digital supplement may include access data that is usable by a client computing device to access the digital supplement.
  • the access data may include a URL of the digital supplement and parameters to pass to that URL.
  • the access data may also include an application identifier and parameters for the application.
  • the data about the digital supplement may also include descriptive data about the digital supplement.
  • the descriptive data may be usable by a client computing device to present information about a digital supplement to a user (e.g., on a menu in which the user may select a digital supplement).
  • the descriptive data may include, for example, a name (or title), a description, a publisher name, and an image.
  • the data about the digital supplement may also include identifiers of supplement anchors.
  • a data structure instance based on the received data is generated.
  • the data structure may, for example, be a record in a database.
  • the database may be a relational database and the data structure instances may be linked (e.g., via a foreign key) with one or more records associated with supplement anchors.
  • a database field associated with the data structure instance may be set to active so that the digital supplement search engine 164 can access and return the associated digital supplement.
  • enabling triggering of the digital supplement may include saving or committing a database record.
  • enabling of retrieval of the digital supplement includes enabling triggering of the digital supplement by a client computing device. For example, after the instance is generated, the digital supplement may be returned to a client computing device in response to a search and activated or presented by the client computing device.
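  • A minimal sketch of this step, assuming a relational store with illustrative table and column names, is shown below: a data structure instance is generated for a digital supplement and triggering is enabled by committing the record.

      import sqlite3

      conn = sqlite3.connect("supplements.db")
      conn.execute(
          """CREATE TABLE IF NOT EXISTS digital_supplements (
                 id INTEGER PRIMARY KEY,
                 anchor TEXT NOT NULL,      -- supplement anchor used as the lookup key
                 name TEXT,                 -- descriptive data
                 url TEXT,                  -- access data usable by a client computing device
                 active INTEGER DEFAULT 0   -- set to 1 once triggering is enabled
             )"""
      )


      def enable_supplement(anchor: str, name: str, url: str) -> None:
          """Store a data structure instance and mark it active so it can be triggered."""
          conn.execute(
              "INSERT INTO digital_supplements (anchor, name, url, active) VALUES (?, ?, ?, 1)",
              (anchor, name, url),
          )
          conn.commit()  # saving/committing the record enables retrieval in searches


      enable_supplement("entity:artwork", "Gallery Audio Guide", "https://example.com/audio-guide")
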
  • FIG. 4 is a diagram of an example method 400 of enabling triggering of a digital supplement, in accordance with implementations described herein.
  • This method 400 may, for example, be performed by the content crawler 162 of the search server 152 to allow a user to access a digital supplement based on a visual-content query.
  • a network-accessible resource is analyzed.
  • the network accessible resource is a web page served by, for example, the digital supplement server 172.
  • a set of network-accessible resources is analyzed.
  • the set of network-accessible resources may be generated based on submissions via a form or API.
  • the set of network-accessible resources may be generated by crawling other network-accessible resources to identify URLs. This crawling process may be performed recursively.
  • the network-accessible resource may include an indicator of metadata associated with a digital supplement.
  • the network-accessible resource may include a tag that identifies a portion of the network-accessible resource that includes the metadata.
  • the tag may be an XML tag with a specific type or attribute.
  • the tag may be an HTML tag, such as a script tag that includes a JSON data structure containing metadata.
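  • One way such a tag could be parsed is sketched below; the script-tag attributes and metadata fields are assumptions for this example, and a production crawler would likely use a real HTML parser rather than a regular expression.

      import json
      import re

      page_html = """
      <html><head>
      <script type="application/json" id="digital-supplement">
      {"name": "Gallery Audio Guide", "anchors": ["entity:artwork"],
       "url": "https://example.com/audio-guide"}
      </script>
      </head><body>...</body></html>
      """

      match = re.search(
          r'<script[^>]*id="digital-supplement"[^>]*>(.*?)</script>', page_html, re.DOTALL
      )
      if match:
          metadata = json.loads(match.group(1))       # JSON data structure containing metadata
          print(metadata["anchors"], metadata["url"])
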
  • a digital supplement data structure instance based on the metadata is generated.
  • the operation 406 may be similar to the operation 304.
  • a visual-content query is received.
  • the visual-content query may for example be sent by a client computing device such as the client computing device 102.
  • the visual-content query includes an image.
  • the visual- content query may also include textual data that describes an image.
  • the textual data may include identifiers of supplement anchors within an image captured by a camera assembly of the client computing device.
  • the visual-content query also includes other information, such as a location of the client computing device or an identifier of a user account associated with the client computing device.
  • multiple digital supplement data structure instances are identified based on the visual-content query.
  • supplement anchors are identified within an image provided in the visual-content query.
  • the supplement anchors may then be used to query an index or a database for relevant digital supplements.
  • other data provided with the query may be used to identify the digital supplements too, such as a location of the client computing device or information associated with a user account.
  • multiple supplement anchors are used to identify the relevant digital supplements.
  • an ordering of the multiple digital supplement data structure instances is determined.
  • the ordering may be based on various scores associated with the digital supplement or the relevance of the digital supplement to the visual-content query.
  • a relevance score that corresponds to the relevance of a digital supplement to the visual-content query is used to order the multiple digital supplement data structure instances.
  • the relevance score may be determined from multiple factors, such as one or more of the content of the digital supplement, the content of network-accessible resources that link to the digital supplement (or a network-accessible resource associated with the digital supplement), the link text or content near the links to the digital supplement on other network-accessible resources.
  • the scores may also be based on popularity metrics.
  • a prestige metric is an example of a popularity metric.
  • the prestige metric may be based on a combination of how many other network resources link to the digital supplement and the prestige score of those other network-accessible resources.
  • the popularity score may be based on how frequently the digital resource is or has been selected. In some implementations, the popularity score may correspond to how frequently the digital resource is selected for the visual-content query.
  • the scores may be determined or may be retrieved from a data store or an API.
  • an API is accessed to retrieve scores for a digital supplement.
  • the scores may be retrieved from a search engine that has determined a relevance and/or popularity for a digital resource with respect to search terms that are based on the supplement anchors.
  • the multiple digital supplement data structures may also be ordered based on frequency of use by a specific user (e.g., the user of the client computing device) or recency of use by the specific user. In some implementations, the multiple digital supplement data structures are ordered randomly.
  • the visual-content query is responded to based on the multiple digital supplement data structure instances.
  • information associated with the multiple digital supplement data structure instances may be transmitted to the client computing device in the order determined at operation 412.
  • the information includes descriptive data that can be shown in a menu or another type of user interface that is configured to receive a user selection of a digital supplement.
  • the information may also include access data that can be used by the client computing device to access or present the digital supplement.
  • FIG. 5 is a diagram of an example method 500 of searching for and presenting a digital supplement, in accordance with implementations described herein. This method 500 may, for example, be performed by the application 122 of the client computing device 102 to identify and access a digital supplement based on a visual-content query.
  • a visual-content query that is based on an image is transmitted to a server computing device (e.g., the search server 152).
  • an image may be captured with the camera assembly 112 of the client computing device 102.
  • the image may also be a stored image such as an image that was previously captured by the camera assembly 112.
  • the visual-content query includes only the image.
  • the visual-content query includes additional information.
  • the visual-content query may include information such as a location of the client computing device 102 or an identifier of an account associated with a user of the client computing device 102.
  • the application 122 may also identify anchors in an image (e.g., with the supplement anchor identification engine 124).
  • the visual-content query may include identifiers (e.g., textual, numeric or other types of identifiers) of the identified anchors.
  • the visual-content query does not include an image.
  • transmitting the visual-content query to the server includes calling an API. In some implementations, transmitting the visual-content query to the server includes calling an API provided by the server. In some implementations, transmitting the visual-content query to the server includes submitting a form using the HTTP protocol (e.g., submitting a GET or POST request).
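As a sketch of the form-submission variant, a client could POST the image together with optional location and account identifiers. The endpoint URL and parameter names here are hypothetical, not part of this disclosure.

```python
# Illustrative sketch: transmitting a visual-content query as an HTTP POST.

import requests  # third-party HTTP client

def send_visual_content_query(image_path, location=None, account_id=None,
                              endpoint="https://search.example.com/visual-query"):
    data = {}
    if location is not None:
        data["lat"], data["lng"] = location          # optional device location
    if account_id is not None:
        data["account_id"] = account_id              # optional account identifier
    with open(image_path, "rb") as f:
        response = requests.post(endpoint, data=data,
                                 files={"image": ("query.jpg", f, "image/jpeg")})
    response.raise_for_status()
    return response.json()  # expected to describe the identified digital supplements
```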
  • a response to the visual-content query that identifies a digital supplement is received.
  • the response may be received via the network 190 from the search server 152.
  • the response may include one or more digital supplements that were identified based on the visual-content query by the search server 152.
  • the response may include an array of data associated with the digital supplements.
  • the data associated with the digital supplements may include descriptive data that can be used to present digital supplement options for a user to select.
  • the descriptive data may include a name, a short description, a publisher name, and an image.
  • the data may also include access data, such as a URL and parameters to include with a request via the URL or an application name and associated parameters.
  • the data may also include the location, coordinates, or dimensions of supplement anchors in an image transmitted with the visual-content query (e.g., if the supplement anchors are identified by the search server 152).
  • a user interface screen that includes information associated with the digital supplement is displayed.
  • the user interface screen includes annotations that overlay the identified supplement anchors (e.g., based on the provided coordinates).
  • the annotations may provide information about the object in the image associated with the identified supplement anchors.
  • the annotations may include user-actuatable controls that can be actuated to present or activate a digital supplement.
  • the user interface screen may also include a digital supplement selection panel that can be used to select from multiple digital supplements that are identified in the response received at operation 504.
  • the user interface screen may be generated by a web browser that opens a URL specified by the digital supplement.
  • the user interface screen may also be generated by another application that is launched to provide the digital supplement.
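A small sketch of how a client might act on the access data once a supplement is selected: open a URL (with its parameters) in a browser, or hand off to another application. The access-data field names are assumptions for illustration.

```python
# Illustrative sketch: presenting a selected digital supplement from its
# access data, either via a web browser or by handing off to an application.

import webbrowser
from urllib.parse import urlencode

def present_supplement(access):
    if access.get("url"):
        query = urlencode(access.get("params", {}))
        webbrowser.open(access["url"] + ("?" + query if query else ""))
    elif access.get("application"):
        # On a real client this would launch the named application with the
        # associated parameters (e.g., via a platform intent); stubbed here.
        print("Would launch", access["application"], "with", access.get("params", {}))
```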
  • FIG. 6 is a diagram of an example method 600 of identifying and presenting a digital supplement based on an image, in accordance with implementations described herein.
  • This method 600 may, for example, be performed by the application 122 of the client computing device 102 to identify and access a digital supplement based on a visual-content query.
  • an image is captured.
  • the image may be captured by the camera assembly 112 of the client computing device 102.
  • a sequence of images (i.e., a video) may be captured.
  • a visual-content query that is based on the image is transmitted to a server computing device such as the search server 152.
  • the operation 604 may be similar to the operation 502.
  • the visual-content query may include multiple images or a sequence of images.
  • the sequence of images may be streamed to the server computing device.
  • a response to the visual-content query that identifies multiple digital supplements is received.
  • the operation 606 may be similar to the previously described operation 504.
  • a user interface screen that includes user-actuatable controls to select a digital supplement from the multiple digital supplements is displayed.
  • a digital supplement selection panel may be displayed.
  • the digital supplement selection panel may include multiple user-actuatable controls each of which is associated with one of the multiple digital supplements identified in the response.
  • the digital supplement selection panel may arrange the user-actuatable controls based on an ordering or ranking of the digital supplements provided by the server computing device.
  • the digital supplement selection panel may arrange the user-actuatable controls vertically, horizontally, or otherwise.
  • the user-actuatable controls may be associated with or include information about the associated digital supplement that the user can consider when deciding whether to select the digital supplement.
  • the information that is displayed may include one or more of a name, a description, an image, and a publisher name for a digital supplement.
  • a user input to select a digital supplement is received.
  • the user input may be a click using a mouse or other device.
  • the user input may also be a touch input from a stylus or finger.
  • Another example of a user input is a near-touch input (e.g., holding a finger or pointing device proximate to the screen).
  • the user input can also include a hand gesture, a head motion, an action with an eye, or a spoken input.
  • information is provided to a resource associated with the selected digital supplement.
  • information about a user of the client computing device may be transmitted to a server that provides the digital supplement (if permission to provide the information has been provided).
  • the information may also be provided to an application that provides the digital supplement.
  • Various types of information may be provided.
  • the information may include user information such as a user name, user preferences, or a location.
  • the information may also include information related to the visual-content query such as an image or sequence of images.
  • the information may also include identifiers and/or positions of one or more supplement anchors in the image. This information may be used to provide the digital supplement to the user. For example, AR content of a digital supplement may be sized and positioned based on the image.
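One possible way to use an identified supplement anchor's position to size and place AR content is sketched below; the bounding-box format (x, y, width, height in pixels) and the covering strategy are assumptions, not requirements of this disclosure.

```python
# Illustrative sketch: fitting an AR overlay to a supplement anchor's
# bounding box while preserving the overlay's aspect ratio.

def fit_overlay_to_anchor(anchor_box, overlay_aspect_ratio):
    """Return an (x, y, width, height) rectangle that covers the anchor."""
    x, y, w, h = anchor_box
    overlay_w = w
    overlay_h = overlay_w / overlay_aspect_ratio
    if overlay_h < h:                      # ensure the anchor region is covered
        overlay_h = h
        overlay_w = overlay_h * overlay_aspect_ratio
    # Center the overlay on the anchor.
    cx, cy = x + w / 2, y + h / 2
    return (cx - overlay_w / 2, cy - overlay_h / 2, overlay_w, overlay_h)

print(fit_overlay_to_anchor((120, 80, 200, 100), overlay_aspect_ratio=1.5))
```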
  • the information may be transmitted directly to the resource associated with the digital supplement (e.g., the digital supplement server 172) by the client computing device 102.
  • the information is provided to the resource associated with the digital supplement by the search server 152 (e.g., so the client computing device does not need to transmit as much data).
  • the client computing device 102 may transmit selection information to the search server 152 that identifies a selected digital supplement.
  • the search server 152 may then transmit information to the resource that provides the digital supplement.
  • the client computing device 102 may also prompt the user to permit sharing the information.
  • the search server 152 may determine the information to transmit to the resource based on a digital supplement data structure instance (which may be based on metadata associated with the digital supplement).
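The following sketch illustrates one way the search server might decide which fields to forward, driven by metadata on the digital supplement data structure instance; the `requested_fields` metadata key is a hypothetical name used only for this example.

```python
# Illustrative sketch: forwarding only the fields that the selected
# supplement's data structure instance asks for via its metadata.

def build_forwarded_payload(selection_info, supplement_instance):
    requested = supplement_instance.get("metadata", {}).get("requested_fields", [])
    return {field: selection_info[field]
            for field in requested if field in selection_info}

selection_info = {"image": b"...", "anchors": [{"id": "document:receipt"}],
                  "location": (37.42, -122.08), "user_name": "example"}
instance = {"metadata": {"requested_fields": ["anchors", "location"]}}
print(build_forwarded_payload(selection_info, instance))
```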
  • the user interface is updated based on the selected digital supplement.
  • the operation 614 may be similar to the operation 506.
  • FIGS. 7A-7C are schematic diagrams of user interface screens displayed by embodiments of the client computing device 102 to conduct a visual-content search and display a digital supplement.
  • a user interface screen 700a is shown.
  • the user interface screen 700a includes an image display panel 708 and an information panel 730.
  • the image display panel 708 is displaying an image of a shelf full of wine bottles (e.g., as you might find in a store).
  • the image display panel 708 also includes an indicator 740 and an indicator 742. Each of these indicators indicates that the wine bottle shown in the image beneath the indicator has been recognized as a supplement anchor (e.g., in this case as a recognized product).
  • the indicator 740 and the indicator 742 are examples of user-actuatable controls.
  • instructions are provided to “Tap on what you’re interested in.”
  • in FIG. 7B, a user interface screen 700b is shown after a user has actuated the indicator 740. After actuation, an annotation 744 from a digital supplement is displayed.
  • the annotation 744 includes information on the rating of the wine, which may help the user select a bottle of wine to purchase.
  • in FIG. 7C, another user interface screen 700c is shown after a user has actuated the indicator 740.
  • the user interface screen 700c may be shown instead of or in addition to the user interface screen 700b (e.g., after actuation of the annotation 744 or if the user swipes up on the information panel 730 in FIG. 7B).
  • an expanded information panel 732 is shown in FIG. 7C.
  • the expanded information panel 732 takes up more of the user interface screen 700c than the information panel 730 took up in FIGS. 7A and 7B.
  • the expanded information panel 732 includes a digital supplement selection panel 710 and a digital supplement content display panel 734.
  • the digital supplement selection panel 710 includes a user-actuatable control 712, a user-actuatable control 714, and a user-actuatable control 716 (which is only partially visible).
  • additional user-actuatable controls may be displayed.
  • the user-actuatable controls of the digital supplement selection panel 710 may be arranged in a ranked order.
  • the user-actuatable control 712 is associated with a digital supplement for meal pairing.
  • a digital supplement that displays food and meal pairing information for the selected wine may be displayed.
  • the user-actuatable control 714 is associated with a digital supplement that saves a photo.
  • an application that saves photos may be activated and provided with the image. Additional information may be saved along with the photo such as the identified supplement anchors.
  • the digital supplement content display panel 734 may display content from a digital supplement.
  • the digital supplement content display panel 734 may display a default digital supplement or a highest-ranked digital supplement that is associated with the identified supplement anchor.
  • the digital supplement content display panel 734 includes product information about the product associated with the selected supplement anchor. In this case, a wine name, rating, location of origin, image, and comments are provided.
  • FIGS. 8A-8C are schematic diagrams of user interface screens displayed by embodiments of the client computing device 102 to conduct a visual-content search and display a digital supplement.
  • the visual-content search is based on an image of a receipt.
  • in FIG. 8A, a user interface screen 800a is shown.
  • the user interface screen 800a includes an image display panel 808 and an information panel 830.
  • the image display panel 808 is displaying an image of a receipt from a restaurant.
  • the image display panel 808 also includes an indicator 840, an indicator 842, an annotation 844, and a highlight overlay 846.
  • the indicator 840 is associated with the receipt as a document, and the indicator 842 is associated with a specific restaurant named on the receipt.
  • the identified receipt document and the identified restaurant name are both examples of supplement anchors.
  • the annotation 844 is associated with a digital supplement that provides a tip calculator.
  • an example tip calculation is included on the annotation 844 and is overlaid at the appropriate position on the image display panel 808.
  • a digital supplement may be selected by default and displayed upon identifying an appropriate supplement anchor.
  • the highlight overlay 846 is overlaid over a portion of the receipt document that includes information used by the tip calculator digital supplement.
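As a concrete illustration of the overlaid calculation, a tip-calculator supplement might compute tip options from the total recognized in the highlighted region; the recognized total and the percentages below are assumed example values.

```python
# Illustrative sketch of the calculation a tip-calculator supplement might
# overlay once the receipt total has been recognized in the image.

def tip_options(total, percentages=(15, 18, 20)):
    return {p: round(total * p / 100, 2) for p in percentages}

recognized_total = 42.80  # e.g., extracted from the highlighted receipt region
print(tip_options(recognized_total))   # {15: 6.42, 18: 7.7, 20: 8.56}
```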
  • the items displayed in the information panel 830 relate to the receipt as a document, as though the indicator 840 had been actuated.
  • identified supplement anchors are ranked based on the likely relevance or interests of the user based, for example, on the user’s past actions, other users’ actions for similar images, confidence scores for the supplement anchors, or the position or size of the portion of the image to which the supplement anchors relate.
  • the information panel 830 may then display items related to the highest-ranked supplement anchor in at least some implementations.
  • for example, if the indicator 842 were actuated, the information panel 830 might include items about the specific restaurant.
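A minimal sketch of the anchor ranking described above, combining a confidence score, the relative size of the image region, and a per-user interest signal; the weights and field names are assumptions made for illustration.

```python
# Illustrative sketch: ranking identified supplement anchors by a weighted
# combination of confidence, region size, and user interest.

def rank_anchors(anchors, image_area, user_interest=None):
    user_interest = user_interest or {}
    def score(anchor):
        x, y, w, h = anchor["box"]
        size_fraction = (w * h) / image_area
        interest = user_interest.get(anchor["type"], 0.0)   # e.g., from past actions
        return 0.5 * anchor["confidence"] + 0.3 * size_fraction + 0.2 * interest
    return sorted(anchors, key=score, reverse=True)

anchors = [{"type": "document", "confidence": 0.92, "box": (0, 0, 600, 800)},
           {"type": "text:restaurant", "confidence": 0.85, "box": (40, 60, 300, 40)}]
print([a["type"] for a in rank_anchors(anchors, image_area=600 * 800)])
```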
  • the information panel 830 includes a digital supplement selection panel 810.
  • the digital supplement selection panel 810 includes a user-actuatable control 812, a user-actuatable control 814, and a user-actuatable control 816.
  • the user-actuatable control 812 is associated with a tip calculator digital supplement, the user-actuatable control 814 is associated with a check splitting digital supplement, and the user-actuatable control 816 is associated with an expense report digital supplement.
  • user interface controls for adjusting parameters of the tip calculator may be displayed (e.g., to adjust the percentage).
  • in FIG. 8B, a user interface screen 800b is shown after a user has actuated the user-actuatable control 814.
  • an expanded information panel 832 is shown that includes items to help a user calculate how to split a check. For example, the number of people splitting the check can be entered to determine the amount each should pay.
  • a user interface screen 800c is shown after a user has actuated the user-actuatable control 816.
  • an expanded information panel 834 is shown that includes items to help a user store the receipt to an expense report.
  • the user can select an expense report with which the receipt should be associated (e.g.,“Sydney trip 2018”).
  • an image of the receipt may be uploaded to an expense report submission or management system.
  • the full image that is shown on the image display panel 808 is uploaded.
  • a portion of the image is uploaded (e.g., the image is cropped to include only the receipt).
  • FIGS. 9A and 9B are schematic diagrams of user interface screens displayed by embodiments of the client computing device 102 to conduct a visual-content search and display a digital supplement.
  • the visual-content search is based on an image of a face.
  • in FIG. 9A, a user interface screen 900a is shown.
  • the user interface screen 900a includes an image display panel 908 and an information panel 930.
  • the image display panel 908 is displaying an image of a face.
  • the face is an example of a supplement anchor.
  • the information panel 930 includes a user-actuatable control 912 for a digital supplement that was identified for the supplement anchor in the image (i.e., the face).
  • the user-actuatable control 912 is associated with a digital supplement for trying on glasses.
  • a user interface screen 900b is shown after a user has actuated the user-actuatable control 912.
  • an expanded information panel 932 is shown that includes items to help a user visually try glasses on the face in the image.
  • multiple glasses styles are displayed and the user can select a pair to try on.
  • AR content 960 is overlaid on the image display panel 908.
  • the AR content 960 corresponds to the selected glasses and is sized to match the face in the image.
  • the image shown in the image display panel 908 is transmitted to a server that provides the digital supplement so that the image can be analyzed to determine where and how to position and size the AR content 960 or to recommend glasses to try on.
  • FIGS. 10A-10C are schematic diagrams of user interface screens displayed by embodiments of the client computing device 102 to conduct a visual-content search and display a digital supplement.
  • the visual-content search is based on an image of furniture in a catalog.
  • a user interface screen 1000a is shown.
  • the user interface screen 1000a includes an image display panel 1008.
  • the image display panel 1008 is displaying an image of a portion of a page of a furniture catalog.
  • the image display panel also includes an indicator 1040, an indicator 1042, and an indicator 1044.
  • the indicator 1040 is associated with a bed, the indicator 1042 is associated with a decorative item, and the indicator 1044 is associated with a rug.
  • the images of the bed, the decorative item, and the rug in the catalog are examples of supplement anchors.
  • in FIG. 10B, a user interface screen 1000b is shown after a user has selected the indicator 1040 (e.g., by touching the screen at or near where the indicator 1040 is displayed).
  • the user interface screen 1000b includes a digital supplement selection panel 1010 and an information panel 1030.
  • the information panel 1030 includes information (e.g., a product name, description, and image) about the supplement anchor associated with the selected indicator.
  • the digital supplement selection panel 1010 includes a user-actuatable control 1012 and a user-actuatable control 1014.
  • the user-actuatable control 1012 is associated with a digital supplement that provides an in-home view.
  • the user-actuatable control 1014 is associated with another digital supplement (e.g., a digital supplement for posting to a social media site).
  • a user interface screen 1000c is shown after actuation of the user-actuatable control 1012.
  • the user interface screen 1000c includes the image display panel 1008, a digital supplement selection panel 1010 and a reduced information panel 1032.
  • the reduced information panel 1032 may include a user-actuatable control that when actuated may cause the information panel to pop up and be displayed.
  • the image display panel 1008 now displays an image of a room and includes AR content 1060.
  • the AR content 1060 includes a 3D model of the bed associated with the indicator 1040 overlaid on the image panel. The user may be able to adjust the position of the AR content 1060 within the room to see how the bed would fit in the room.
  • the image shown in the image display panel 1008 is transmitted to a server that provides the digital supplement so that the image can be analyzed to determine where and how to position and size the AR content 1060.
  • the AR content 1060 may be provided at a later time than the visual-content query.
  • FIGS. 11A-11C are schematic diagrams of user interface screens displayed by embodiments of the client computing device 102 to conduct various visual-content searches within a store.
  • the visual-content searches are based on images of products captured within a store.
  • a user interface screen 1100a is shown.
  • the user interface screen 1100a includes an image display panel 1108 and an information panel 1130.
  • the image display panel 1108 is displaying an image captured within a store.
  • the image display panel 1108 also includes an indicator 1140 that is associated with a vase.
  • the vase displayed on the image display panel 1108 is an example of a supplement anchor.
  • the information panel 1130 is displaying a digital supplement that includes product information about the vase and functionality to buy the vase.
  • the digital supplement may, for example, include a workflow to initiate a purchase of the vase.
  • the digital supplement is identified based on the image content and the location of the client computing device so that a digital supplement published by the store (or associated with the store) in which the image was captured can be identified and provided as a high-ranking result to a visual-content query when a client computing device is in the store.
  • a different digital supplement would be provided for the same image if the location of the client computing device were changed.
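One way a location signal could produce that behavior is to boost supplements published by a store when the device's reported location falls within (or near) that store; the distance threshold and field names below are assumptions for illustration.

```python
# Illustrative sketch: boosting store-published supplements when the client
# device appears to be located in the publishing store.

import math

def distance_m(a, b):
    # Rough planar approximation, adequate for "is the device in this store?"
    lat1, lng1 = a
    lat2, lng2 = b
    dlat = (lat2 - lat1) * 111_320
    dlng = (lng2 - lng1) * 111_320 * math.cos(math.radians(lat1))
    return math.hypot(dlat, dlng)

def location_boost(supplement, device_location, threshold_m=75):
    store = supplement.get("store_location")
    if store and device_location and distance_m(store, device_location) <= threshold_m:
        return 1.0   # large boost: the device appears to be in the publishing store
    return 0.0

supplement = {"name": "In-store product info", "store_location": (37.4220, -122.0841)}
print(location_boost(supplement, device_location=(37.4221, -122.0842)))
```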
  • in FIG. 11B, a user interface screen 1100b is shown.
  • the user interface screen 1100b includes an image display panel 1108 and an information panel 1130.
  • the image display panel 1108 is displaying another image captured within a store.
  • the image display panel 1108 also includes an indicator 1142 that is associated with a rug.
  • the rug displayed on the image display panel 1108 is an example of a supplement anchor.
  • the information panel 1130 is displaying a digital supplement that includes product information about the rug and functionality to select a size and buy the rug. Like in FIG. 11A, the digital supplement is identified based on the image content and the location of the client computing device.
  • in FIG. 11C, a user interface screen 1100c is shown.
  • the user interface screen 1100c includes an image display panel 1108 and an information panel 1130.
  • the image display panel 1108 is displaying another image captured within a store.
  • the image display panel 1108 also includes an indicator 1144 that is associated with a vase.
  • the vase displayed on the image display panel 1108 is an example of a supplement anchor.
  • the information panel 1130 is displaying a digital supplement that includes product information about the vase.
  • the information panel 1130 also includes a coupon indicator 1132 and functionality to redeem the coupon. Redeeming the coupon may include purchasing the item at a discounted price from a website associated with the store.
  • a coupon code is presented that can be used to secure a discount during checkout.
  • the digital supplement is identified based on the image content and the location of the client computing device.
  • FIGS. 12A-12C are schematic diagrams of user interface screens displayed by embodiments of the client computing device 102 during various visual-content searches.
  • the visual-content searches are based on images of movie posters (e.g., as might be captured at a movie theatre).
  • a user interface screen 1200a is shown.
  • the user interface screen 1200a includes an image display panel 1208.
  • the image display panel 1208 is displaying an image of movie posters.
  • the image display panel 1208 also includes an indicator 1240 that is associated with a movie poster identified in the image.
  • the movie poster is an example of a supplement anchor.
  • the indicator 1240 may include a user-actuatable control that when actuated will display a digital supplement or a menu to select a digital supplement.
  • in FIG. 12B, a user interface screen 1200b is shown.
  • the image display panel 1208 also includes a preview digital supplement 1242 that is associated with the movie poster identified in the image.
  • the preview digital supplement 1242 may be shown after actuation of the indicator 1240 (of FIG. 12A).
  • the preview digital supplement 1242 may overlay an image or video from a movie associated with the identified movie poster on the image of the movie poster.
  • in FIG. 12C, a user interface screen 1200c is shown.
  • the image display panel 1208 also includes a rating indicator 1244 and a rating indicator 1246.
  • the rating indicator 1244 and the rating indicator 1246 may be generated by one or more digital supplements in response to a visual-content query that includes movie posters.
  • the digital supplement may, for example, overlay ratings information for the movies associated with the movie posters in the image.
  • the rating indicator 1244 and the rating indicator 1246 may include user-actuatable controls that when actuated cause additional information about the ratings and the associated movie to be shown.
  • FIG. 13 shows an example of a computer device 1300 and a mobile computer device 1350, which may be used with the techniques described here (e.g., to implement the client computing device 102, the search server 152, and the digital supplement server 172).
  • the computing device 1300 includes a processor 1302, memory 1304, a storage device 1306, a high-speed interface 1308 connecting to memory 1304 and high-speed expansion ports 1310, and a low-speed interface 1312 connecting to low-speed bus 1314 and storage device 1306.
  • Each of the components 1302, 1304, 1306, 1308, 1310, and 1312 are interconnected using various busses, and may be mounted on a common motherboard or in other manners as appropriate.
  • the processor 1302 can process instructions for execution within the computing device 1300, including instructions stored in the memory 1304 or on the storage device 1306 to display graphical information for a GUI on an external input/output device, such as display 1316 coupled to high-speed interface 1308.
  • multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory.
  • multiple computing devices 1300 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
  • the memory 1304 stores information within the computing device 1300.
  • the memory 1304 is a volatile memory unit or units.
  • the memory 1304 is a non-volatile memory unit or units.
  • the memory 1304 may also be another form of computer-readable medium, such as a magnetic or optical disk.
  • the storage device 1306 is capable of providing mass storage for the computing device 1300.
  • the storage device 1306 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations.
  • a computer program product can be tangibly embodied in an information carrier.
  • the computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above.
  • the information carrier is a computer- or machine- readable medium, such as the memory 1304, the storage device 1306, or memory on processor 1302.
  • the high-speed controller 1308 manages bandwidth-intensive operations for the computing device 1300, while the low-speed controller 1312 manages lower bandwidth-intensive operations. Such allocation of functions is exemplary only.
  • the high-speed controller 1308 is coupled to memory 1304, display 1316 (e.g., through a graphics processor or accelerator), and to high-speed expansion ports 1310, which may accept various expansion cards (not shown).
  • low-speed controller 1312 is coupled to storage device 1306 and low-speed expansion port 1314.
  • the low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth, Ethernet, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, e.g., through a network adapter.
  • the computing device 1300 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a standard server 1320, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 1324. In addition, it may be implemented in a personal computer such as a laptop computer 1322. Alternatively, components from computing device 1300 may be combined with other components in a mobile device (not shown), such as device 1350. Each of such devices may contain one or more of computing device 1300, 1350, and an entire system may be made up of multiple computing devices 1300, 1350 communicating with each other.
  • Computing device 1350 includes a processor 1352, memory 1364, an input/output device such as a display 1354, a communication interface 1366, and a transceiver 1368, among other components.
  • the device 1350 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage.
  • Each of the components 1350, 1352, 1364, 1354, 1366, and 1368 are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.
  • the processor 1352 can execute instructions within the computing device 1350, including instructions stored in the memory 1364.
  • the processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors.
  • the processor may provide, for example, for coordination of the other components of the device 1350, such as control of user interfaces, applications run by device 1350, and wireless communication by device 1350.
  • Processor 1352 may communicate with a user through control interface 1358 and display interface 1356 coupled to a display 1354.
  • the display 1354 may be, for example, a TFT LCD (Thin-Film-Transistor Liquid Crystal Display), an LED (Light Emitting Diode) display, or an OLED (Organic Light Emitting Diode) display, or other appropriate display technology.
  • the display interface 1356 may include appropriate circuitry for driving the display 1354 to present graphical and other information to a user.
  • the control interface 1358 may receive commands from a user and convert them for submission to the processor 1352.
  • an external interface 1362 may be provided in communication with processor 1352, so as to enable near area communication of device 1350 with other devices.
  • External interface 1362 may provide, for example, for wired communication in some implementations, or for wireless communication in other implementations, and multiple interfaces may also be used.
  • the memory 1364 stores information within the computing device 1350.
  • the memory 1364 can be implemented as one or more of a computer-readable medium or media, a volatile memory unit or units, or a non-volatile memory unit or units.
  • Expansion memory 1374 may also be provided and connected to device 1350 through expansion interface 1372, which may include, for example, a SIMM (Single In-Line Memory Module) card interface.
  • expansion memory 1374 may provide extra storage space for device 1350, or may also store applications or other information for device 1350.
  • expansion memory 1374 may include instructions to carry out or supplement the processes described above, and may include secure information also.
  • expansion memory 1374 may be provided as a security module for device 1350, and may be programmed with instructions that permit secure use of device 1350.
  • secure applications may be provided via the SIMM cards, along with additional information, such as placing identifying information on the SIMM card in a non-hackable manner.
  • the memory may include, for example, flash memory and/or NVRAM memory, as discussed below.
  • a computer program product is tangibly embodied in an information carrier.
  • the computer program product contains instructions that, when executed, perform one or more methods, such as those described above.
  • the information carrier is a computer- or machine-readable medium, such as the memory 1364, expansion memory 1374, or memory on processor 1352, that may be received, for example, over transceiver 1368 or external interface 1362.
  • Device 1350 may communicate wirelessly through communication interface 1366, which may include digital signal processing circuitry where necessary. Communication interface 1366 may provide for communications under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through radio-frequency transceiver 1368. In addition, short-range communication may occur, such as using a Bluetooth, Wi-Fi, or other such transceiver (not shown). In addition, GPS (Global Positioning System) receiver module 1370 may provide additional navigation- and location- related wireless data to device 1350, which may be used as appropriate by applications running on device 1350.
  • Device 1350 may also communicate audibly using audio codec 1360, which may receive spoken information from a user and convert it to usable digital information. Audio codec 1360 may likewise generate audible sound for a user, such as through a speaker, e.g., in a handset of device 1350. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.) and may also include sound generated by applications operating on device 1350.
  • the computing device 1350 may be implemented in a number of different forms, as shown in the figure. For example, it may be implemented as a cellular telephone 1380. It may also be implemented as part of a smartphone 1382, personal digital assistant, or other similar mobile device.
  • implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
  • the systems and techniques described here can be implemented on a computer having a display device (e.g., an LED (light-emitting diode), OLED (organic LED), or LCD (liquid crystal display) monitor/screen) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer.
  • Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
  • the systems and techniques described here can be implemented in a computing system that includes a back end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back end, middleware, or front end components.
  • the components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), and the Internet.
  • the computing system can include clients and servers.
  • a client and server are generally remote from each other and typically interact through a communication network.
  • the relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
  • the computing devices depicted in FIG. 13 can include sensors that interface with an AR headset/HMD device 1390 to generate an augmented environment for viewing inserted content within the physical space.
  • sensors included on a computing device 1350 or other computing device depicted in FIG. 13 can provide input to the AR headset 1390 or in general, provide input to an AR space.
  • the sensors can include, but are not limited to, a touchscreen, accelerometers, gyroscopes, pressure sensors, biometric sensors, temperature sensors, humidity sensors, and ambient light sensors.
  • the computing device 1350 can use the sensors to determine an absolute position and/or a detected rotation of the computing device in the AR space that can then be used as input to the AR space.
  • the computing device 1350 may be incorporated into the AR space as a virtual object, such as a controller, a laser pointer, a keyboard, a weapon, etc.
  • Positioning of the computing device/virtual object by the user when incorporated into the AR space can allow the user to position the computing device so as to view the virtual object in certain manners in the AR space.
  • when the virtual object represents a laser pointer, the user can manipulate the computing device as if it were an actual laser pointer.
  • the user can move the computing device left and right, up and down, in a circle, etc., and use the device in a similar fashion to using a laser pointer.
  • the user can aim at a target location using a virtual laser pointer.
  • one or more input devices included on, or connected to, the computing device 1350 can be used as input to the AR space.
  • the input devices can include, but are not limited to, a touchscreen, a keyboard, one or more buttons, a trackpad, a touchpad, a pointing device, a mouse, a trackball, a joystick, a camera, a microphone, earphones or buds with input functionality, a gaming controller, or other connectable input device.
  • a user interacting with an input device included on the computing device 1350 when the computing device is incorporated into the AR space can cause a particular action to occur in the AR space.
  • a touchscreen of the computing device 1350 can be rendered as a touchpad in AR space.
  • a user can interact with the touchscreen of the computing device 1350.
  • the interactions are rendered, in AR headset 1390 for example, as movements on the rendered touchpad in the AR space.
  • the rendered movements can control virtual objects in the AR space.
  • one or more output devices included on the computing device 1350 can provide output and/or feedback to a user of the AR headset 1390 in the AR space.
  • the output and feedback can be visual, tactile, or audio.
  • the output and/or feedback can include, but is not limited to, vibrations, turning on and off or blinking and/or flashing of one or more lights or strobes, sounding an alarm, playing a chime, playing a song, and playing of an audio file.
  • the output devices can include, but are not limited to, vibration motors, vibration coils, piezoelectric devices, electrostatic devices, light emitting diodes (LEDs), strobes, and speakers.
  • the computing device 1350 may appear as another object in a computer-generated, 3D environment. Interactions by the user with the computing device 1350 (e.g., rotating, shaking, touching a touchscreen, swiping a finger across a touch screen) can be interpreted as interactions with the object in the AR space.
  • the computing device 1350 appears as a virtual laser pointer in the computer-generated, 3D environment.
  • as the user manipulates the computing device 1350, the user in the AR space sees movement of the laser pointer.
  • the user receives feedback from interactions with the computing device 1350 in the AR environment on the computing device 1350 or on the AR headset 1390.
  • the user’s interactions with the computing device may be translated to interactions with a user interface generated in the AR environment for a controllable device.
  • a computing device 1350 may include a touchscreen.
  • a user can interact with the touchscreen to interact with a user interface for a controllable device.
  • the touchscreen may include user interface elements such as sliders that can control properties of the controllable device.
  • Computing device 1300 is intended to represent various forms of digital computers and devices, including, but not limited to laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers.
  • Computing device 1350 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices.
  • the components shown here, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed in this document.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

This disclosure relates to systems and methods for identifying and retrieving content for a visual search. An example method includes receiving data specifying a digital supplement. The data may identify a digital supplement and a supplement anchor for associating the digital supplement with visual content. The method may also include generating a data structure instance that specifies the digital supplement and the supplement anchor and, after generating the data structure instance, allowing the digital supplement to be triggered by an image based at least on storing the data structure instance in a database that includes a plurality of other data structure instances. The other data structure instances may each specify a digital supplement and one or more supplement anchors.
PCT/US2019/036542 2018-06-21 2019-06-21 Association et récupération de complément numérique pour recherche visuelle WO2019245801A1 (fr)

Priority Applications (6)

Application Number Priority Date Filing Date Title
KR1020227044320A KR20230003388A (ko) 2018-06-21 2019-06-21 시각적 검색을 위한 디지털 보충물 연관 및 검색
CN201980022269.0A CN112020712B (zh) 2018-06-21 2019-06-21 视觉搜索的数字补充关联和检索
EP19735444.2A EP3811238A1 (fr) 2018-06-21 2019-06-21 Association et récupération de complément numérique pour recherche visuelle
JP2020570146A JP7393361B2 (ja) 2018-06-21 2019-06-21 ビジュアルサーチのためのデジタル補足関連付けおよび検索
KR1020207031107A KR20200136030A (ko) 2018-06-21 2019-06-21 시각적 검색을 위한 디지털 보충물 연관 및 검색
JP2022077546A JP2022110057A (ja) 2018-06-21 2022-05-10 ビジュアルサーチのためのデジタル補足関連付けおよび検索

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US16/014,520 2018-06-21
US16/014,512 2018-06-21
US16/014,512 US10579230B2 (en) 2018-06-21 2018-06-21 Digital supplement association and retrieval for visual search
US16/014,520 US10878037B2 (en) 2018-06-21 2018-06-21 Digital supplement association and retrieval for visual search

Publications (1)

Publication Number Publication Date
WO2019245801A1 true WO2019245801A1 (fr) 2019-12-26

Family

ID=68983041

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/036542 WO2019245801A1 (fr) 2018-06-21 2019-06-21 Association et récupération de complément numérique pour recherche visuelle

Country Status (5)

Country Link
EP (1) EP3811238A1 (fr)
JP (2) JP7393361B2 (fr)
KR (2) KR20230003388A (fr)
CN (1) CN112020712B (fr)
WO (1) WO2019245801A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117708680A (zh) * 2024-02-06 2024-03-15 青岛海尔科技有限公司 一种用于提升分类模型准确度的方法及装置、存储介质、电子装置

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231385B (zh) * 2020-12-11 2021-06-01 湖南新云网科技有限公司 一种数据收集方法、装置、设备及存储介质
US11983691B2 (en) * 2022-07-27 2024-05-14 Bank Of America Corporation System and methods for detecting and implementing resource allocation in an electronic network based on non-contact instructions

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170161382A1 (en) * 2015-12-08 2017-06-08 Snapchat, Inc. System to correlate video data and contextual data
US20170351710A1 (en) * 2016-06-07 2017-12-07 Baidu Usa Llc Method and system for evaluating and ranking images with content based on similarity scores in response to a search query

Family Cites Families (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7680324B2 (en) * 2000-11-06 2010-03-16 Evryx Technologies, Inc. Use of image-derived information as search criteria for internet and other search engines
CN101777064A (zh) * 2009-01-12 2010-07-14 鸿富锦精密工业(深圳)有限公司 图片搜索系统及方法
US8429173B1 (en) * 2009-04-20 2013-04-23 Google Inc. Method, system, and computer readable medium for identifying result images based on an image query
JP4981109B2 (ja) * 2009-08-25 2012-07-18 東芝テック株式会社 仮想試着装置及びプログラム
US9710491B2 (en) * 2009-11-02 2017-07-18 Microsoft Technology Licensing, Llc Content-based image search
US8811742B2 (en) * 2009-12-02 2014-08-19 Google Inc. Identifying matching canonical documents consistent with visual query structural information
US8903166B2 (en) * 2010-01-20 2014-12-02 Microsoft Corporation Content-aware ranking for visual search
US8489589B2 (en) * 2010-02-05 2013-07-16 Microsoft Corporation Visual search reranking
JP5280475B2 (ja) * 2010-03-31 2013-09-04 新日鉄住金ソリューションズ株式会社 情報処理システム、情報処理方法及びプログラム
US20130238467A1 (en) * 2011-01-13 2013-09-12 Rakuten, Inc. Object display server, object display method, object display program, and computer-readable recording medium for storing the program
JP5014494B2 (ja) * 2011-01-21 2012-08-29 パナソニック株式会社 情報処理装置、拡張現実感システム、情報処理方法、及び情報処理プログラム
US8543521B2 (en) * 2011-03-30 2013-09-24 Microsoft Corporation Supervised re-ranking for visual search
US9036925B2 (en) * 2011-04-14 2015-05-19 Qualcomm Incorporated Robust feature matching for visual search
US20130129142A1 (en) * 2011-11-17 2013-05-23 Microsoft Corporation Automatic tag generation based on image content
CN103959284B (zh) * 2011-11-24 2017-11-24 微软技术许可有限责任公司 使用置信图像样本进行重新排名
US8935246B2 (en) * 2012-08-08 2015-01-13 Google Inc. Identifying textual terms in response to a visual query
US9671941B1 (en) * 2013-05-09 2017-06-06 Amazon Technologies, Inc. Graphical behaviors for recognition interfaces
US20160117391A1 (en) * 2013-05-16 2016-04-28 Yandex Europe Ag Presentation of ranked image query results to a client
US20160224837A1 (en) * 2013-10-25 2016-08-04 Hyperlayer, Inc. Method And System For Facial And Object Recognition Using Metadata Heuristic Search
JP2015207258A (ja) * 2014-04-23 2015-11-19 キヤノン株式会社 情報出力装置、情報出力方法及びプログラム、情報提供装置、情報提供方法及びプログラム
US9652543B2 (en) * 2014-12-22 2017-05-16 Microsoft Technology Licensing, Llc Task-oriented presentation of auxiliary content to increase user interaction performance
CN106156063B (zh) * 2015-03-30 2019-10-01 阿里巴巴集团控股有限公司 用于图片对象搜索结果排序的相关方法及装置
US9489401B1 (en) * 2015-06-16 2016-11-08 My EyeSpy PTY Ltd. Methods and systems for object recognition
US10235387B2 (en) * 2016-03-01 2019-03-19 Baidu Usa Llc Method for selecting images for matching with content based on metadata of images and content in real-time in response to search queries
JP2017194848A (ja) * 2016-04-21 2017-10-26 大日本印刷株式会社 画像認識サービスシステム

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170161382A1 (en) * 2015-12-08 2017-06-08 Snapchat, Inc. System to correlate video data and contextual data
US20170351710A1 (en) * 2016-06-07 2017-12-07 Baidu Usa Llc Method and system for evaluating and ranking images with content based on similarity scores in response to a search query

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117708680A (zh) * 2024-02-06 2024-03-15 青岛海尔科技有限公司 一种用于提升分类模型准确度的方法及装置、存储介质、电子装置

Also Published As

Publication number Publication date
JP2021522614A (ja) 2021-08-30
JP2022110057A (ja) 2022-07-28
KR20230003388A (ko) 2023-01-05
EP3811238A1 (fr) 2021-04-28
CN112020712A (zh) 2020-12-01
CN112020712B (zh) 2024-06-25
KR20200136030A (ko) 2020-12-04
JP7393361B2 (ja) 2023-12-06

Similar Documents

Publication Publication Date Title
US11023106B2 (en) Digital supplement association and retrieval for visual search
US11640431B2 (en) Digital supplement association and retrieval for visual search
US12002169B2 (en) System and method for selecting targets in an augmented reality environment
JP6502923B2 (ja) コンピューティングデバイスのための認識インターフェース
JP2022110057A (ja) ビジュアルサーチのためのデジタル補足関連付けおよび検索
WO2012105069A1 (fr) Dispositif de fourniture d'informations
US20210042809A1 (en) System and method for intuitive content browsing
US20200167433A1 (en) Relevance of Search Results
US20220155940A1 (en) Dynamic collection-based content presentation
WO2021173147A1 (fr) Système et procédé pour la lecture d'un contenu de réalité augmentée déclenchée par une reconnaissance d'image
EP4224339A1 (fr) Systèmes et procédés intelligents pour des requêtes de recherche visuelle
WO2014024533A1 (fr) Dispositif de traitement d'informations et support d'enregistrement
US10621237B1 (en) Contextual overlay for documents
US9529936B1 (en) Search results using query hints
WO2017210610A1 (fr) Navigateur à trace rapide
US12032633B2 (en) Digital supplement association and retrieval for visual search
US10437902B1 (en) Extracting product references from unstructured text
US11514082B1 (en) Dynamic content selection
JP7382847B2 (ja) 情報処理方法、プログラム、及び情報処理装置
US11734379B1 (en) Dynamic search results interface based on predicted user intent
US11810577B2 (en) Dialogue management using lattice walking
WO2017123746A1 (fr) Système et procédé d'exploration intuitive de contenu

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19735444

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20207031107

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2020570146

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2019735444

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2019735444

Country of ref document: EP

Effective date: 20210121