CN112020712A - Digital supplement association and retrieval for visual search - Google Patents


Info

Publication number
CN112020712A
CN112020712A
Authority
CN
China
Prior art keywords
digital
computing device
supplemental
image
computer
Prior art date
Legal status
Pending
Application number
CN201980022269.0A
Other languages
Chinese (zh)
Inventor
艾伦·乔伊斯
埃德加·琼
杨哲
伊恩·梅萨
约瑟夫·奥尔森
Current Assignee
Google LLC
Original Assignee
Google LLC
Priority date
Priority claimed from US 16/014,512 (US10579230B2)
Priority claimed from US 16/014,520 (US10878037B2)
Application filed by Google LLC
Publication of CN112020712A

Classifications

    • G - PHYSICS; G06 - COMPUTING; CALCULATING OR COUNTING; G06F - ELECTRIC DIGITAL DATA PROCESSING; G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/41 - Indexing; Data structures therefor; Storage structures (multimedia data)
    • G06F 16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually (multimedia data)
    • G06F 16/51 - Indexing; Data structures therefor; Storage structures (still image data)
    • G06F 16/532 - Query formulation, e.g. graphical querying (still image data)
    • G06F 16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually (still image data)
    • G06F 16/78 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually (video data)
    • G06F 16/90335 - Query processing
    • G06F 16/95 - Retrieval from the web
    • G06F 16/9535 - Search customisation based on user profiles and personalisation
    • G06N 20/00 - Machine learning (G06N - Computing arrangements based on specific computational models)

Abstract

Systems and methods for identification and retrieval of content for visual search are provided. An example method includes receiving data specifying a digital supplement, the data identifying the digital supplement and a supplemental anchor for associating the digital supplement with visual content. The method may further include generating a data structure instance specifying the digital supplement and the supplemental anchor and, after generating the data structure instance, enabling triggering of the digital supplement by an image based at least on storing the data structure instance in a database that includes a plurality of other data structure instances. The other data structure instances may each specify a digital supplement and one or more supplemental anchors.

Description

Digital supplement association and retrieval for visual search
Cross Reference to Related Applications
This application is a continuation of and claims priority to U.S. Non-Provisional Patent Application No. 16/014,520, entitled "DIGITAL SUPPLEMENT ASSOCIATION AND RETRIEVAL FOR VISUAL SEARCH," filed on June 21, 2018, the disclosure of which is incorporated herein by reference in its entirety.
Background
Mobile computing devices, such as smartphones, typically include a camera. These cameras may be used to capture images of entities in the environment surrounding the computing device. Various types of content or experiences related to those entities may be made available to the user via the mobile computing device.
Disclosure of Invention
The present disclosure describes systems and methods for digital supplement association and retrieval for visual search. For example, the systems and techniques described herein may be used to provide a digital supplement, such as Augmented Reality (AR) content or an AR experience, in response to a visual search. The visual search may be based on, for example, an image or an entity identified within the image. The digital supplement may, for example, provide information or functionality associated with the image.
One aspect is a computer-implemented method comprising: receiving data specifying a digital supplement, the data identifying the digital supplement and a supplemental anchor for associating the digital supplement with visual content. The method also includes generating a data structure instance specifying the digital supplement and the supplemental anchor. The method further includes, after generating the data structure instance, enabling triggering of the digital supplement by an image based at least on storing the data structure instance in a database that includes a plurality of other data structure instances. Each of the other data structure instances specifies a digital supplement and one or more supplemental anchors.
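The flow in this aspect (receive data, generate a data structure instance, and store it so that images can trigger the supplement) can be sketched in Python. The record fields, anchor identifiers, and in-memory database below are illustrative assumptions, not the claimed data structure.

```python
from dataclasses import dataclass, field

# Hypothetical shape of a data structure instance; field names are
# illustrative, not taken from the patent claims.
@dataclass
class SupplementRecord:
    supplement_url: str   # identifies the digital supplement
    anchors: list         # supplemental anchors (e.g., entity identifiers)
    context: dict = field(default_factory=dict)  # optional context info

# Stand-in for the database of data structure instances, keyed by anchor
# so that an image matching an anchor can trigger the supplement.
database = {}

def register_supplement(record):
    """Store the instance under each of its supplemental anchors."""
    for anchor in record.anchors:
        database.setdefault(anchor, []).append(record)

def supplements_for(anchor):
    """Image-triggered lookup: supplements associated with an anchor."""
    return database.get(anchor, [])

register_supplement(
    SupplementRecord("https://example.com/ar-tour", ["landmark/eiffel-tower"]))
```

Once the record is stored, any later query that resolves to the anchor `landmark/eiffel-tower` would retrieve the registered supplement.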
Another aspect is a computing device that includes at least one processor and a memory storing instructions. When executed by the at least one processor, the instructions cause the computing device to receive data specifying a digital supplement, the data identifying the digital supplement, a supplemental anchor for associating the digital supplement with visual content, and contextual information. The instructions also cause the computing device to generate a data structure instance specifying the digital supplement, the supplemental anchor, and the contextual information. The instructions further cause the computing device, after generating the data structure instance, to enable triggering of the digital supplement by an image based at least on storing the data structure instance in a database that includes a plurality of other data structure instances. Each of the other data structure instances specifies a digital supplement and one or more supplemental anchors.
Yet another aspect is a computer-implemented method that includes receiving a visual content query from a client computing device and identifying a supplemental anchor based on the visual content query. The method also includes generating an ordered list of digital supplements based on the identified supplemental anchor and transmitting the ordered list to the client computing device.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.
Drawings
Fig. 1 is a block diagram illustrating a system according to an example embodiment.
Fig. 2 is a third-person perspective view of an example physical space in which an embodiment of the client computing device of Fig. 1 is accessing a digital supplement.
Fig. 3 is a diagram of an example method of enabling triggering of a digital supplement, according to implementations described herein.
Fig. 4 is a diagram of another example method of enabling triggering of a digital supplement, according to implementations described herein.
FIG. 5 is a diagram of an example method of searching for and presenting a digital supplement according to embodiments described herein.
FIG. 6 is a diagram of an example method of image-based recognition and presentation of digital supplements, according to embodiments described herein.
Figs. 7A-7C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of Fig. 1 for conducting a visual content search and displaying a digital supplement.
Figs. 8A-8C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of Fig. 1 for conducting a visual content search and displaying a digital supplement.
Figs. 9A and 9B are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of Fig. 1 for conducting a visual content search and displaying a digital supplement.
Figs. 10A-10C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of Fig. 1 for conducting a visual content search and displaying a digital supplement.
Figs. 11A-11C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of Fig. 1 for conducting various visual content searches within a store.
Figs. 12A-12C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device of Fig. 11 during various visual content searches.
FIG. 13 is a schematic diagram of an example of a computer device and a mobile computer device that may be used to implement the techniques described herein.
Reference will now be made in detail to non-limiting examples of the present disclosure, examples of which are illustrated in the accompanying drawings. Examples are described below with reference to the drawings, wherein like reference numerals refer to like elements. When the same reference numerals are shown, the corresponding description is not repeated and the interested reader may refer to the previously discussed figures to obtain a description of the same elements.
Detailed Description
The present disclosure describes technological improvements that simplify the identification and presentation of digital supplements based on visual content. Some implementations of the technology described herein generate an index of digital supplements related to particular types of visual content and provide those digital supplements in response to a visual content query received from a client computing device. The index may allow the user to access related digital supplements provided by network-accessible resources (e.g., web pages) located throughout the world. This may provide a functional data structure that allows information to be retrieved more efficiently.
For example, a client computing device, such as a smartphone, may capture an image of a supplemental anchor, such as an entity. The client computing device may then transmit a visual content query based on the image to the server computing device to retrieve the digital supplements associated with the identified supplemental anchor. In some implementations, the supplemental anchor is based on the physical environment surrounding the client computing device, and the digital supplement is virtual content that can supplement the user's experience of that physical environment.
The visual content query may include the image or data determined from the image (e.g., an indicator such as an identified supplemental anchor). An example of data determined from an image is text extracted from the image using, for example, optical character recognition. Other examples include values read from a barcode, QR code, or similar code in the image, and an identifier or description of an entity, product, or entity type identified in the image.
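A visual content query built from an image and/or the data determined from it might be assembled as follows; the field names and JSON encoding are assumptions for illustration, not a format defined by the disclosure.

```python
import json

def build_visual_content_query(image_bytes=None, extracted_text=None,
                               code_value=None, entity_ids=None):
    """Assemble a visual content query from an image and/or data derived
    from it. All field names here are hypothetical."""
    query = {}
    if image_bytes is not None:
        query["image"] = image_bytes.hex()  # image payload, hex-encoded
    if extracted_text:
        query["text"] = extracted_text      # e.g., OCR output
    if code_value:
        query["code"] = code_value          # e.g., barcode or QR value
    if entity_ids:
        query["entities"] = entity_ids      # identified entities/anchors
    return json.dumps(query)

# A query carrying only data determined from the image, with no image
# payload, as the text notes some visual content queries do.
payload = build_visual_content_query(extracted_text="Mona Lisa",
                                     entity_ids=["artwork/mona-lisa"])
```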
For example, a neural network system, such as a convolutional neural network system, may be used to identify entities, products, or entity types in an image. The identifier or description of the entity, product, or entity type may include metadata or a reference to a record in a database related to the entity, product, or entity type. Non-limiting examples of entities include buildings, works of art, products, books, posters, photographs, catalogs, signs, documents (e.g., business cards, receipts, coupons, catalogs), people, and body parts.
Various types of digital supplements may be available in connection with a supplemental anchor. A digital supplement may be provided through a network-accessible resource, such as a web page available on the Internet. A way is needed to locate and provide these digital supplements in response to a visual content query. Some embodiments generate and maintain an index of digital supplements associated with supplemental anchors for use in responding to visual content queries. For example, the index may be populated by crawling network-accessible resources to determine whether they include or provide any digital supplements, and to determine the supplemental anchors associated with those digital supplements.
For example, a network-accessible resource may include metadata identifying a supplemental anchor (e.g., text, a code, an entity, or an entity type) associated with a digital supplement. The network-accessible resource may provide the metadata in response to a Hypertext Transfer Protocol (HTTP) request. The metadata may be provided in various formats, such as Extensible Markup Language (XML), JavaScript Object Notation (JSON), or another format.
Metadata for a digital supplement may include one or more of the following: a type indicator, an anchor indicator, a name, a description, a content segment (i.e., a snippet or preview of a portion of the content), an associated image, a link (such as a URL) to the digital supplement, and an identifier of an application associated with the digital supplement. The metadata may also include information about the publisher of the digital supplement, such as a publisher name, a publisher description, and an image or icon associated with the publisher. In some implementations, the metadata includes contextual information related to providing the digital supplement. For example, the metadata may include conditions (e.g., geographic conditions, required applications) associated with providing or accessing the digital supplement.
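As a concrete illustration, JSON metadata of this kind could look like the snippet below and be keyed by its anchor indicator. The schema, field names, and values are hypothetical, not a published format.

```python
import json

# Illustrative metadata a network-accessible resource might expose for a
# digital supplement; this JSON schema is an assumption, not a standard.
raw = """{
  "type": "ar_experience",
  "anchor": {"kind": "entity", "id": "product/widget-3000"},
  "name": "Widget 3000 AR preview",
  "description": "Place the Widget 3000 in your room.",
  "snippet": "See it at full size before you buy.",
  "link": "https://example.com/supplements/widget-3000",
  "application": "com.example.viewer",
  "publisher": {"name": "Example Corp"},
  "conditions": {"geo": "US"}
}"""

metadata = json.loads(raw)

def anchor_key(meta):
    """Derive the supplemental anchor used to index this supplement."""
    return (meta["anchor"]["kind"], meta["anchor"]["id"])
```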
The identified digital supplement may be added to an index stored in memory. In at least some embodiments, the digital supplement's associated supplemental anchor is used as a key into the index. A digital supplement may also be associated with various scores. For example, it may be associated with a reputation score based on how many links referencing the digital supplement (or its associated network-accessible resource) were found when crawling, and on the reputation of the network-accessible resources providing those links. As another example, a digital supplement may be associated with one or more relevance scores corresponding to the relevance of the digital supplement (or its associated network-accessible resource) to a particular anchor. Relevance scores may also be associated with keywords or topics. A relevance score may be determined based on one or more of: the content of the digital supplement, the content of the network-accessible resource, the content of sites linking to the network-accessible resource, and the content (e.g., text) of the links to the network-accessible resource.
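A toy version of the reputation score described above might count inbound links weighted by the linking resource's own reputation. The weighting scheme, default weight, and data layout are assumptions for illustration.

```python
def reputation_score(target_url, link_graph, source_reputation):
    """Toy reputation score: sum the reputations of resources that link
    to the supplement. link_graph maps source URL -> list of target URLs;
    weights are illustrative assumptions."""
    score = 0.0
    for source, targets in link_graph.items():
        if target_url in targets:
            # Sources without a known reputation contribute a small default.
            score += source_reputation.get(source, 0.1)
    return score

link_graph = {
    "https://popular.example.com": ["https://example.com/ar-bridge"],
    "https://niche.example.net": ["https://example.com/ar-bridge"],
}
source_reputation = {"https://popular.example.com": 0.9}

score = reputation_score("https://example.com/ar-bridge",
                         link_graph, source_reputation)  # 0.9 + 0.1 = 1.0
```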
Fig. 1 is a block diagram illustrating a system 100 according to an example embodiment. The system 100 can associate digital supplements with entities or entity types and can retrieve digital supplements in response to a visual search. A visual search is a search based on visual content. For example, a visual search may be performed based on a visual content query, which is a query based on an image or other visual content. The visual content query may include an image. In some implementations, the visual content query can include text or data derived from an image, for example by recognizing one or more entities in the image. Some visual content queries do not include an image at all (e.g., a query may include only data or text generated from an image). In some implementations, the system 100 includes the client computing device 102, the search server 152, and the digital supplement server 172. Also shown is a network 190 over which the client computing device 102, the search server 152, and the digital supplement server 172 may communicate.
The client computing device 102 may include a processor component 104, a communication module 106, a sensor system 110, and a memory 120. The sensor system 110 may include various sensors, such as a camera assembly 112, an inertial motion unit (IMU) 114, and a Global Positioning System (GPS) receiver 116. Embodiments of the sensor system 110 may also include other sensors, such as light sensors, audio sensors, image sensors, distance and/or proximity sensors, contact sensors (e.g., capacitive sensors), timers, and other sensors or combinations of sensors. In some implementations, the client computing device 102 is a mobile device (e.g., a smartphone).
The camera component 112 captures images or video of the physical space surrounding the client computing device 102. The camera component 112 may include one or more cameras. The camera assembly 112 may also include an infrared camera. Images captured with the camera component 112 can be used to identify supplemental anchors and form a visual content query.
In some implementations, the images captured with the camera component 112 can also be used to determine the location and orientation of the client computing device 102 within a physical space, such as an interior space, based on a representation of the physical space received from the memory 120 or an external computing device. In some implementations, the representation of the physical space can include visual features of the physical space (e.g., features extracted from an image of the physical space). The representation may also include location determination data associated with those features that may be used by the visual positioning system to determine a location and/or position within the physical space based on one or more images of the physical space. The representation may also include a three-dimensional model of at least some structures within the physical space. In some embodiments, the representation does not include a three-dimensional model of the physical space.
The IMU 114 may detect motion, movement, and/or acceleration of the client computing device. The IMU 114 may include a variety of different types of sensors such as, for example, accelerometers, gyroscopes, magnetometers, and other such sensors. The orientation of the client computing device 102 may be detected and tracked based on data provided by the IMU 114 or the GPS receiver 116.
The GPS receiver 116 may receive signals transmitted by GPS satellites. The signal includes the time and location of the satellite. Based on the signals received from several satellites (e.g., at least four), the GPS receiver 116 may determine the global position of the client computing device 102.
The memory 120 may include an application 122, other applications 140, and a device location system 142. The other applications 140 include any other applications installed on, or otherwise available for execution on, the client computing device 102. The application 122 may cause one of the other applications 140 to launch to provide a digital supplement. In some embodiments, certain digital supplements are available only if the other applications 140 include a specific application that is associated with, or required to provide, the digital supplement.
The device location system 142 determines the location of the client computing device 102. The device location system 142 may use the sensor system 110 to determine the location and orientation of the client computing device 102 within a global or physical space. In some implementations, the device location system 142 determines the location of the client computing device 102 based on, for example, cellular triangulation.
In some implementations, the client computing device 102 may include a visual positioning system that compares images captured by the camera component 112 (or features extracted from those images) to known arrangements of features within the representation of the physical space to determine a six degree-of-freedom pose (e.g., position and orientation) of the client computing device 102 within the physical space.
The application 122 may include a supplemental anchor identification engine 124, a digital supplemental retrieval engine 126, a digital supplemental presentation engine 128, and a user interface engine 130. Some embodiments of the application 122 may include fewer, more, or other components.
The supplemental anchor identification engine 124 identifies supplemental anchors based on, for example, images captured by the camera component 112. In some implementations, the supplemental anchor identification engine 124 analyzes an image to recognize text, which can then be used to identify an anchor. For example, the text may be mapped to a node in a knowledge graph, or recognized as the name of an entity such as a person, place, product, building, artwork, or movie. The text may also be recognized as a phrase that is generally associated with, or that describes, a particular entity. The text may then be identified as an anchor associated with that entity.
In some implementations, the supplemental anchor recognition engine 124 recognizes one or more codes within the image, such as a barcode, a QR code, or another type of code. The code may then be mapped to a supplemental anchor.
The supplemental anchor identification engine 124 may include a machine learning module that can recognize at least some types of entities within an image. For example, the machine learning module may include a neural network system. Neural networks are computational models used for machine learning and are composed of nodes organized in layers with weighted connections. Training a neural network uses training examples, each pairing an input with a desired output, to determine, over a series of iterative rounds, weight values for the connections between layers that increase the likelihood that the network will produce the desired output for a given input. In each training round, the weights are adjusted to correct erroneous output values. Once trained, the neural network can be used to predict an output from a provided input.
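The training procedure just described (iterative rounds that adjust weights to reduce output error) can be illustrated with a minimal single-neuron network learning logical OR. This toy is far simpler than the entity-recognition networks the text contemplates; the learning rate, round count, and sigmoid activation are illustrative choices.

```python
import math
import random

# Four (input, desired output) training examples for logical OR.
random.seed(0)
examples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
            ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]  # weights
b = 0.0   # bias
lr = 0.5  # learning rate (an illustrative choice)

def predict(x):
    """Weighted sum squashed by a sigmoid, giving an output in (0, 1)."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

# Iterative training rounds: each round nudges the weights to shrink the
# erroneous difference between the prediction and the desired output.
for _ in range(2000):
    for x, target in examples:
        error = predict(x) - target
        w[0] -= lr * error * x[0]
        w[1] -= lr * error * x[1]
        b -= lr * error
```

After training, `predict` rounds to the desired output for each example, matching the text's description of increasing the likelihood of the desired output.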
In some embodiments, the neural network system includes a convolutional neural network (CNN). A CNN is a neural network in which at least one layer is a convolutional layer: a layer whose values are computed by applying a kernel function to a subset of the values of the previous layer. Training the network involves adjusting the weights of the kernel function based on the training examples. Typically, each value in a convolutional layer is computed using the same kernel function. As a result, far fewer weights must be learned when training a convolutional layer than a fully connected layer (i.e., a layer in which each value is computed as an independently weighted combination of every value in the previous layer). Because convolutional layers typically have fewer weights, training and using them may require less memory, fewer processor cycles, and less time than an equivalent fully connected layer.
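A one-dimensional convolution makes the weight-sharing point concrete: the same small kernel produces every output value, so only the kernel weights need to be learned. This is a sketch of the layer computation only, not of a full CNN.

```python
def conv1d(values, kernel):
    """Convolutional layer: each output is the same kernel applied to a
    sliding window of the previous layer's values (valid padding)."""
    k = len(kernel)
    return [sum(kernel[j] * values[i + j] for j in range(k))
            for i in range(len(values) - k + 1)]

signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]           # 3 shared weights, regardless of input size
out = conv1d(signal, kernel)  # [1-3, 2-4, 3-5] = [-2, -2, -2]

# By contrast, a fully connected layer mapping these 5 inputs to the same
# 3 outputs would need 5 * 3 = 15 independently learned weights.
```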
After supplemental anchor identification engine 124 identifies an entity or entity type in the image, a textual description of the entity or entity type may be generated. Additionally, entities or entity types can be mapped to supplemental anchors. In some embodiments, the supplemental anchor is associated with one or more digital supplements.
In some implementations, the supplemental anchor identification engine 124 determines a confidence score for an identified anchor. A higher confidence score indicates a greater likelihood that the content from the image (e.g., the image itself, extracted text, a barcode, or a QR code) is associated with the determined anchor.
Although the example of fig. 1 illustrates supplemental anchor recognition engine 124 as a component of application 122 on client computing device 102, some embodiments include a supplemental anchor recognition engine on search server 152. For example, the client computing device 102 can send the image captured by the camera component 112 to the search server 152, which search server 152 can then identify the supplemental anchor within the image.
In some implementations, the supplemental anchor identification engine 124 identifies potential supplemental anchors. For example, the supplemental anchor identification engine 124 can identify various entities within an image. Identifiers of those entities may then be transmitted to the search server 152, which can determine whether any of them are associated with supplemental anchors. In some implementations, the search server 152 can use an identified entity as contextual information even if that entity is not a supplemental anchor.
The digital supplement retrieval engine 126 retrieves digital supplements. For example, the digital supplement retrieval engine 126 can retrieve the digital supplement associated with a supplemental anchor identified by the supplemental anchor identification engine 124. In some implementations, the digital supplement retrieval engine 126 retrieves the digital supplement from the search server 152 or the digital supplement server 172.
For example, after a supplemental anchor is identified, the digital supplement retrieval engine 126 can retrieve one or more digital supplements associated with it. The digital supplement retrieval engine 126 can generate a visual content query that includes the image (or an identifier of a supplemental anchor or entity within the image) and transmit the query to the search server 152. The visual content query may also include contextual information, such as the location of the client computing device 102. In some implementations, data related to each digital supplement, such as a name, image, or description, is retrieved and presented to the user (e.g., by the user interface engine 130). If multiple digital supplements are presented, the user may select one via a user interface generated by the user interface engine 130.
The digital supplement presentation engine 128 presents, or causes presentation of, a digital supplement on the client computing device 102. In some implementations, the digital supplement presentation engine 128 causes the client computing device to launch one of the other applications 140. In some implementations, it causes information or content to be displayed; for example, it can direct the user interface engine 130 to generate a user interface that includes information or content from the digital supplement for display by the client computing device 102. In some implementations, the digital supplement presentation engine 128 is triggered when the digital supplement retrieval engine 126 retrieves a digital supplement, and then triggers the display device 108 to display content associated with the supplement. In other implementations, the digital supplement is displayed at a different time than when it is retrieved. For example, a digital supplement may be retrieved in response to a visual content query at a first time and presented at a second time: a supplement may be retrieved based on an image of home furnishings or furniture from a catalog or store (e.g., while the user is browsing the catalog or is at the store), and AR content showing that furniture may be presented later, when the user is in a room where the furniture could be placed.
The user interface engine 130 generates a user interface. The user interface engine 130 may also cause the client computing device 102 to display the generated user interface. The generated user interface may, for example, display information or content from a digital supplement. In some implementations, the user interface engine 130 generates a user interface that includes a plurality of user-actuatable controls, each control associated with a digital complement. For example, the user may actuate one of the user-actuatable controls (e.g., by touching the control on a touch screen, clicking on the control using a mouse or another input device, or otherwise actuating the control).
Search server 152 is a computing device. The search server 152 may respond to search requests, such as visual content queries. The response may include one or more digital supplements that are potentially relevant to the visual content query. In some embodiments, search server 152 includes memory 160, processor component 154, and communication module 156. The memory 160 may include a content crawler 162, a digital supplemental search engine 164, and a digital supplemental data store 166.
The content crawler 162 may crawl network-accessible resources to identify digital supplements. For example, the content crawler 162 may access web pages accessible over the Internet, such as web pages provided by the digital supplemental server 172. Crawling a network-accessible resource may include requesting the resource from a web server and parsing at least a portion of the resource. A digital supplement may be identified based on metadata provided by the network-accessible resource, such as XML or JSON data that provides information about the digital supplement. In some implementations, the crawler identifies network-accessible resources based on extracting links from previously crawled network-accessible resources. The content crawler 162 may also identify network-accessible resources to crawl based on user-submitted input. For example, a user may submit a URL (or other information) for a network-accessible resource that includes a digital supplement through a web form or an Application Programming Interface (API). In some implementations, the content crawler 162 generates an index of the identified digital supplements. The content crawler 162 may also generate a score associated with each digital supplement, such as a relevance score or a popularity (e.g., reputation) score.
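The crawl loop described above can be sketched as a breadth-first traversal over network-accessible resources. In this sketch, `fetch_page` and `extract_links` are simplified stand-ins: a real crawler would issue HTTP requests and use a proper HTML parser, and the example site is hypothetical.

```python
# Minimal sketch of recursive crawling via link extraction. fetch_page is a
# stub standing in for an HTTP fetch from a web server.
def fetch_page(url, pages):
    return pages.get(url, "")

def extract_links(html):
    """Naive href extraction for the sketch; a real crawler would parse HTML."""
    links, rest = [], html
    while 'href="' in rest:
        rest = rest.split('href="', 1)[1]
        link, rest = rest.split('"', 1)
        links.append(link)
    return links

def crawl(seed_urls, pages):
    """Breadth-first crawl that visits each network-accessible resource once."""
    frontier, visited = list(seed_urls), set()
    while frontier:
        url = frontier.pop(0)
        if url in visited:
            continue
        visited.add(url)
        for link in extract_links(fetch_page(url, pages)):
            frontier.append(link)
    return visited

# A hypothetical two-page site with a link cycle between the pages.
site = {
    "https://a.example": '<a href="https://b.example">b</a>',
    "https://b.example": '<a href="https://a.example">a</a>',
}
crawled = crawl(["https://a.example"], site)
```

The `visited` set is what keeps a recursive crawl over cyclic links from looping forever.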
The digital supplemental search engine 164 receives a search query and generates a response that may include one or more potentially relevant digital supplements. For example, the digital supplemental search engine 164 may receive a visual content query from the client computing device 102. The visual content query may include an image. The digital supplemental search engine 164 may identify supplemental anchors in the image and identify relevant or potentially relevant digital supplements based on the identified supplemental anchors. The digital supplemental search engine 164 may transmit a response to the client computing device 102 that includes the digital supplement or information that may be used to access the digital supplement. In some implementations, the digital supplemental search engine 164 can return information associated with a plurality of digital supplements. For example, a list of digital supplements may be included in a response to a query. The list may be ordered based on relevance to the supplemental anchors, popularity, or other properties of the digital supplements.
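The lookup step above can be illustrated with a minimal in-memory index mapping supplemental anchors to digital supplements; the index contents, field names, and scores below are hypothetical examples, not values defined by this description.

```python
# Hypothetical inverted index from supplemental anchors to digital supplements.
SUPPLEMENT_INDEX = {
    "painting-42": [
        {"name": "Museum Audio Guide", "url": "https://example.com/audio", "score": 0.9},
        {"name": "Artist Biography", "url": "https://example.com/bio", "score": 0.7},
    ],
}

def respond_to_visual_query(anchors):
    """Collect supplements for every identified anchor, highest-scoring first."""
    results = []
    for anchor in anchors:
        results.extend(SUPPLEMENT_INDEX.get(anchor, []))
    # Order the list before returning it in the response.
    return sorted(results, key=lambda s: s["score"], reverse=True)

response = respond_to_visual_query(["painting-42"])
```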
The visual content query may, for example, include an image captured by the camera component 112 or text or other data associated with the image captured by the camera component 112. The visual content query may also include other information, such as a location of the client computing device 102 or an identifier of a user of the client computing device 102. In some implementations, the search server 152 can determine the likely location of the client computing device 102 from the user identifier (e.g., if the user has enabled a location service on the client computing device 102 that associates information about the user's location with the user account).
The digital supplemental data store 166 stores information about digital supplements. In some embodiments, the digital supplemental data store 166 includes an index of digital supplements. For example, the index may be generated by the content crawler 162. The digital supplemental search engine 164 can respond to search queries using the index.
The digital supplemental server 172 is a computing device. The digital supplemental server 172 provides digital supplements. In some embodiments, the digital supplemental server 172 includes a memory 180, a processor component 174, and a communication module 176. Memory 180 may include digital supplements 182 and metadata 184. In some embodiments, memory 180 may also include other network-accessible resources, such as web pages that are not themselves digital supplements. For example, memory 180 may store web pages that include metadata providing details about one or more digital supplements and how to access those digital supplements. Additionally, the memory 180 may include a resource serving engine, such as a web server, that responds to requests (e.g., HTTP requests) for network-accessible resources such as web pages and digital supplements.
Digital supplement 182 is any type of content that can be provided as a supplement to something in the physical environment surrounding the user. Digital supplement 182 may also include any type of content that may supplement a stored image (e.g., an image of a physical environment that previously surrounded the user). For example, the digital supplement may be associated with a supplemental anchor, such as an image, an object or product identified in the image, or a location. The digital supplement 182 may include one or more images, audio content, text data, video, games, data files, applications, or structured text documents. Examples of structured text documents include Hypertext Markup Language (HTML) documents, XML documents, and other types of structured text documents.
The digital supplement 182 may enable an application to be launched and may define parameters for the application. Digital supplement 182 may also cause a request (e.g., an HTTP request) to be transmitted to a server and may define parameters of the request. In some implementations, the digital supplement 182 initiates a workflow for completing an activity (such as a workflow for completing a purchase). For example, the digital supplement 182 can transmit an HTTP request to the server to add a particular product to the user's shopping cart, add a coupon code, and retrieve a purchase confirmation page.
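A purchase workflow of this kind could be represented as an ordered sequence of request descriptors that the client (or a workflow runner) executes in turn. The endpoints, parameter names, and helper function below are hypothetical illustrations, not part of the described system.

```python
# Hypothetical declarative model of a purchase workflow: each step describes
# one HTTP request a client would transmit, in order.
purchase_workflow = [
    {"method": "POST", "url": "https://shop.example.com/cart",
     "params": {"product_id": "chair-1138", "quantity": 1}},
    {"method": "POST", "url": "https://shop.example.com/cart/coupon",
     "params": {"code": "SAVE10"}},
    {"method": "GET", "url": "https://shop.example.com/checkout/confirmation",
     "params": {}},
]

def describe_workflow(steps):
    """Render the workflow as one human-readable line per request."""
    return ["{method} {url}".format(**step) for step in steps]

plan = describe_workflow(purchase_workflow)
```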
Metadata 184 is data describing a digital supplement. The metadata 184 may describe one or more digital supplements provided by the digital supplemental server 172 or elsewhere. Metadata 184 for a digital supplement may include one or more of the following: a type indicator, an anchor indicator, a name, a description, a preview segment or snippet, an associated image, access data such as a URL for the digital supplement, and an identifier of an application associated with the digital supplement. The metadata may also include information about the publisher of the digital supplement, such as the publisher name, a publisher description, and an image or icon associated with the publisher. In some embodiments, the metadata also includes contextual information about the digital supplement or contextual information that must be satisfied to provide the digital supplement. For example, the metadata may include conditions that must be met to access the digital supplement (e.g., geographic conditions, client computing device requirements, required applications). Exemplary context information includes a location, an entity identified within an image, or multiple entities identified within an image (e.g., some digital supplements may require a combination of entities to be recognized within an image). The identified entity may be a supplemental anchor. In some implementations, the identified entity is not a supplemental anchor but provides context information. The metadata 184 may also include a supplemental anchor (e.g., text, a code, an entity, or an entity type) associated with the digital supplement.
The metadata 184 may be stored in various formats. In some implementations, the metadata 184 is stored in a database. The metadata 184 may also be stored as an XML file, a JSON file, or a file in another format. In some implementations, the digital supplemental server 172 retrieves the metadata 184 from a database and formats the metadata 184 as XML, JSON, or another format to provide a response to a request from a client or from the search server 152. For example, search server 152 may access metadata 184 to generate data stored in the digital supplemental data store 166 and used to respond to search requests from the client computing device 102.
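For example, metadata 184 for a single digital supplement might be serialized as JSON along the lines of the following sketch; the field names and values are illustrative assumptions rather than a schema defined by this description.

```python
import json

# Hypothetical metadata record combining the fields described above:
# descriptive data, access data, publisher information, a supplemental
# anchor, and conditions that must be satisfied to provide the supplement.
metadata = {
    "type": "ar_content",
    "anchor": "painting-42",                  # supplemental anchor
    "name": "Museum Audio Guide",
    "description": "Narrated tour for gallery 3.",
    "url": "https://example.com/guide",       # access data
    "application": "com.example.museum",
    "publisher": {"name": "Example Museum",
                  "icon": "https://example.com/icon.png"},
    "conditions": {"location": "gallery-3"},  # context that must be satisfied
}

# The server can format the record as JSON for a response, and a client or
# crawler can parse it back without loss.
serialized = json.dumps(metadata, sort_keys=True)
restored = json.loads(serialized)
```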
The communication module 106 includes one or more devices for communicating with other computing devices, such as the search server 152 or the digital supplemental server 172. The communication module 106 may communicate via a wireless or wired network, such as the network 190. The communication module 156 of the search server 152 and the communication module 176 of the digital supplemental server 172 may be similar to the communication module 106.
The display device 108 may, for example, comprise an LCD (liquid crystal display) screen, an LED (light emitting diode) screen, an OLED (organic light emitting diode) screen, a touch screen, or any other screen or display for displaying images or information to a user. In some implementations, the display device 108 includes a light projector arranged to project light onto a portion of the user's eye.
Memory 120 may include one or more non-transitory computer-readable storage media. The memory 120 may store instructions and data that may be used by the client computing device 102 to implement the techniques described herein, such as generating a visual content query based on a captured image, transmitting the visual content query, receiving a response to the visual content query, and presenting a digital supplement identified in response to the visual content query. Memory 160 of the search server 152 and memory 180 of the digital supplemental server 172 may be similar to memory 120 and may store data and instructions that may be used to implement the techniques of the search server 152 and the digital supplemental server 172, respectively.
The processor component 104 includes one or more devices capable of executing instructions, such as instructions stored by the memory 120, to perform various tasks associated with digital supplement association and retrieval for visual search. For example, the processor component 104 may include a Central Processing Unit (CPU) and/or a Graphics Processing Unit (GPU). For example, if a GPU is present, some image/video rendering tasks (such as generating and displaying a user interface or displaying portions of a digital supplement) may be offloaded from the CPU to the GPU. In some embodiments, some image recognition tasks may also be offloaded from the CPU to the GPU.
Although not shown in fig. 1, some embodiments include a head mounted display device (HMD). The HMD may be a separate device from the client computing device 102, or the client computing device 102 may include an HMD. In some implementations, the client computing device 102 communicates with the HMD via a cable. For example, the client computing device 102 may transmit video signals and/or audio signals to the HMD for display to the user, and the HMD may transmit motion, position, and/or orientation information to the client computing device 102.
The client computing device 102 may also include various user input components (not shown), such as a controller that communicates with the client computing device 102 using a wireless communication protocol. In some implementations, the client computing device 102 may communicate with the HMD (not shown) via a wired connection (e.g., a Universal Serial Bus (USB) cable) or via a wireless communication protocol (e.g., any WiFi protocol, any bluetooth protocol, zigbee, etc.). In some implementations, the client computing device 102 is a component of the HMD and may be contained within a housing of the HMD.
The network 190 may be the internet, a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and/or any other network. The client computing device 102 may receive, for example, audio/video signals via a network, which may be provided as part of a digital supplement in the illustrative example embodiments.
Fig. 2 is a third-person view of an example physical space 200 in which an embodiment of the client computing device 102 is accessing a digital supplement. In this example, the physical space 200 includes an object 222. Here, the object 222 is a piece of artwork on a wall of the physical space 200. The object 222 is contained within the field of view 204 of the camera component 112 of the client computing device 102.
An example user interface screen 206 is also shown. The user interface screen 206 may be generated, for example, by the user interface engine 130 of the client computing device 102. User interface screen 206 includes an image display panel 208 and a digital supplemental selection panel 210. The image display panel 208 shows an image. For example, the image display panel 208 may show images corresponding to a live feed from the camera component 112 of the client computing device 102. In some implementations, the image display panel 208 shows previously captured images or information that has been retrieved from the memory 120 of the client computing device 102.
In some implementations, the user interface screen 206 is displayed to the user on a display device of the client computing device 102. In some implementations, the user interface screen 206 can be overlaid on an image of the physical space (or a video feed captured by a camera of the computing device). Further, the user interface screen 206 may be displayed as AR content on the user's field of view using an HMD worn by the user.
The image display panel 208 may also include annotations or user interface elements that may be related to the image. For example, image display panel 208 may include an indicator that an object in the image (e.g., object 222) has been recognized as a supplemental anchor. The indicator may include a user-actuatable control to access or view information about the digital supplement associated with the identified supplemental anchor. In some cases, the image displayed in the image display panel 208 may include a plurality of objects recognized as supplemental anchors, and the image display panel 208 may include a plurality of annotations overlaying the image to identify those supplemental anchors.
The supplemental anchor may be recognized by a supplemental anchor recognition engine of the client computing device 102. In some implementations, the supplemental anchor is identified by transmitting the image to the search server 152. Search server 152 may then analyze the image and identify supplemental anchors in the image. In some implementations, the search server 152 can transmit one or more of the location (e.g., image coordinates) or size of any identified objects associated with the supplemental anchor to the client computing device 102. The client computing device 102 may then update the user interface screen to show annotations that identify supplemental anchors (or associated objects) in the image. In some implementations, the client computing device 102 can track the location of supplemental anchors (or associated objects) in a video stream (e.g., a sequence of captured images) captured by the camera component 112 (e.g., the supplemental anchor identification engine 124 can track the supplemental anchors identified by the search server 152).
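If the server reports an anchor's location as normalized image coordinates (an assumed convention for this sketch, not one specified above), the client must scale those coordinates to display pixels before drawing an annotation over the image:

```python
# Convert a normalized bounding box returned for a supplemental anchor into
# pixel coordinates for the client's display. The (x, y, w, h) layout is an
# illustrative assumption.
def anchor_to_pixels(bbox, width, height):
    """bbox = (x, y, w, h) with each value in [0, 1]; returns integer pixels."""
    x, y, w, h = bbox
    return (round(x * width), round(y * height),
            round(w * width), round(h * height))

# e.g. an anchor near the middle of a 1080x1920 portrait display
pixel_box = anchor_to_pixels((0.25, 0.4, 0.5, 0.2), 1080, 1920)
```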
The digital supplement selection panel 210 allows a user to select a digital supplement for presentation. For example, the digital supplement selection panel 210 may include a menu of user-actuatable controls that are each associated with a digital supplement. In this example, the digital supplement selection panel 210 includes a user-actuatable control 212 and a user-actuatable control 214, each of which includes information about an associated digital supplement. For example, the user-actuatable control may display one or more of a name (or title), a brief description, and an image associated with a digital supplement, which may be received from the search server 152. Upon actuation of the user-actuatable control 212 or the user-actuatable control 214, the associated digital supplement may be presented to the user. Presenting the digital supplement to the user may include causing the client computing device 102 to display a user interface screen that includes images, videos, text, other content from the digital supplement, or a combination thereof. In some implementations, content of the digital supplement is displayed as an overlay over the image or camera feed on the image display panel 208. The content of the digital supplement may be three-dimensional augmented reality content.
In some embodiments, presenting the digital supplement includes activating an application (e.g., one of the other applications 140) installed on the client computing device 102. Presenting the digital supplement can also include transmitting a request to a URL associated with the digital supplement. The request may include a parameter associated with the digital supplement, such as an identifier of a product or object identified within the image. In some implementations, the image (or other content) from the visual content query is passed as a parameter with the request. The image may also be provided via an API associated with the digital supplemental server 172. In some implementations, the client computing device 102 transmits the image to the digital supplemental server 172. In some implementations, the search server 152 can transmit the image to the digital supplemental server 172. For example, in response to a user selecting a digital supplement, the client computing device 102 may transmit an indicator of the selection to the search server 152, and the search server 152 may then transmit the image to the corresponding digital supplemental server. The client computing device 102 may also transmit a URL for a location on the search server 152 that the digital supplemental server 172 may use to access the image. Advantageously, these embodiments may reduce the amount of data that needs to be transmitted by the client computing device.
The digital supplement associated with the user-actuatable control 212 may cause information about the object 222, such as information from a museum, to be displayed. The digital supplement associated with the user-actuatable control 214 may cause information related to a museum tour to be displayed. For example, presentation of the digital supplement may cause a stop on the museum tour to be marked as completed and information about the next stop to be displayed.
Fig. 3 is a diagram of an example method 300 of enabling a digital supplement to be triggered, according to embodiments described herein. The method 300 may be performed, for example, by the content crawler 162 of the search server 152 to allow a user to access a digital supplement based on a visual content query.
At operation 302, data specifying a digital supplement is received. The data may identify the digital supplement and the circumstances in which the digital supplement should be provided. The data specifying the digital supplement may be received in various ways. For example, data specifying a digital supplement may be received from a network-accessible resource, such as a web page that includes metadata about the digital supplement. Data specifying the digital supplement may also be received via, for example, an API or a form provided by the search server 152. Data specifying the digital supplement may also be received from a memory location or a data store.
The data regarding the digital supplement may include access data usable by the client computing device to access the digital supplement. For example, the access data may include a URL for the digital supplement and parameters to pass to the URL. The access data may also include an application identifier and parameters for the application. The data about the digital supplement may also include descriptive data about the digital supplement. The client computing device can use the descriptive data to present information about the digital supplement to the user (e.g., in a menu from which the user can select the digital supplement). The descriptive data may include, for example, a name (or title), a description, a publisher name, and an image. The data about the digital supplement may also include an identifier of the supplemental anchor.
At operation 304, a data structure instance based on the received data is generated. The data structure may be, for example, a record in a database. The database may be a relational database, and the data structure instance may be linked (e.g., via foreign keys) with one or more records associated with the supplemental anchor.
At operation 306, after generating the data structure instance, the digital supplement is enabled to be retrieved through visual content queries. For example, a database field associated with the data structure instance may be set to active so that the digital supplemental search engine 164 may access and return the associated digital supplement. In some embodiments, enabling the digital supplement may include saving or committing a database record. In some implementations, enabling retrieval of the digital supplement includes enabling the digital supplement to be triggered by the client computing device. For example, after generating the instance, the digital supplement may be returned to the client computing device in response to a search and activated or presented by the client computing device.
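Operations 304 and 306 can be sketched with an in-memory relational database, where the data structure instance is a row and retrieval is enabled by setting an active flag that the search engine checks; the schema and values are illustrative assumptions.

```python
import sqlite3

# Operation 304: generate a data structure instance as a database record.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE supplements (
    id INTEGER PRIMARY KEY,
    name TEXT,
    anchor TEXT,
    url TEXT,
    active INTEGER DEFAULT 0)""")
conn.execute(
    "INSERT INTO supplements (name, anchor, url) VALUES (?, ?, ?)",
    ("Museum Audio Guide", "painting-42", "https://example.com/guide"))

# Operation 306: enable retrieval by marking the record active and committing.
conn.execute("UPDATE supplements SET active = 1 WHERE anchor = ?",
             ("painting-42",))
conn.commit()

# The search engine would only return supplements whose flag is active.
active = conn.execute(
    "SELECT name FROM supplements WHERE anchor = ? AND active = 1",
    ("painting-42",)).fetchall()
```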
Fig. 4 is a diagram of an example method 400 of enabling a digital supplement to be triggered, according to an implementation described herein. The method 400 may be performed, for example, by the content crawler 162 of the search server 152 to allow a user to access a digital supplement based on a visual content query.
At operation 402, a network accessible resource is analyzed. In some implementations, the network-accessible resource is, for example, a web page served by the digital supplemental server 172. In some implementations, a set of network-accessible resources is analyzed. The set of network-accessible resources may be generated based on submissions via a form or API. In some implementations, the set of network-accessible resources can be generated by crawling other network-accessible resources to identify URLs. The crawling process may be performed recursively.
At operation 404, metadata associated with the digital supplement within the network-accessible resource is identified. In some implementations, the network-accessible resource can include an indicator of metadata associated with the digital supplement. For example, the network-accessible resource may include a tag that identifies a portion of the network-accessible resource that includes metadata. The tags may be XML tags having a particular type or attribute. The tags may be HTML tags, such as script tags, which include JSON data structures that contain metadata.
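Identifying metadata embedded in such a tag can be sketched as follows; the tag convention (`class="supplement-metadata"`) and the page content are hypothetical examples, not a format defined by this description.

```python
import json
from html.parser import HTMLParser

class MetadataExtractor(HTMLParser):
    """Finds script tags marked as supplement metadata and parses their JSON."""
    def __init__(self):
        super().__init__()
        self.supplements = []   # parsed metadata records
        self._in_metadata = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Hypothetical convention: metadata lives in
        # <script type="application/json" class="supplement-metadata">.
        if tag == "script" and attrs.get("class") == "supplement-metadata":
            self._in_metadata = True

    def handle_data(self, data):
        if self._in_metadata:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_metadata:
            self._in_metadata = False
            self.supplements.append(json.loads("".join(self._buffer)))
            self._buffer = []

page = """
<html><body>
  <script type="application/json" class="supplement-metadata">
    {"name": "Museum Guide", "anchor": "painting-42",
     "url": "https://example.com/guide"}
  </script>
</body></html>
"""

extractor = MetadataExtractor()
extractor.feed(page)
```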
At operation 406, a digital supplemental data structure instance is generated based on the metadata. Operation 406 may be similar to operation 304.
At operation 408, a visual content query is received. The visual content query may be sent, for example, by a client computing device, such as client computing device 102. In some implementations, the visual content query includes an image. The visual content query may also include text data describing the image. For example, the text data may include an identifier of the supplemental anchor within an image captured by a camera component of the client computing device. In some implementations, the visual content query also includes other information, such as a location of the client computing device or an identifier of a user account associated with the client computing device.
At operation 410, a plurality of digital supplemental data structure instances are identified based on the visual content query. In some implementations, a supplemental anchor is identified within an image provided in the visual content query. The supplemental anchor can then be used to query an index or database for relevant digital supplements. In some implementations, other data provided with the query may also be used to identify the digital supplements, such as the location of the client computing device or information associated with the user account. In some embodiments, multiple supplemental anchors are used to identify relevant digital supplements.
At operation 412, an ordering of the plurality of instances of the digital supplemental data structure is determined. The ranking may be based on various scores associated with the digital supplements or the relevance of the digital supplements to the visual content query. In some implementations, relevance scores corresponding to the relevance of the digital supplements to the visual content query are used to rank the plurality of digital supplement data structure instances.
The relevance score may be determined from a plurality of factors, such as one or more of: the content of the digital supplement, the content of a network-accessible resource linked to the digital supplement (or otherwise associated with the digital supplement), link text, or content near links to the digital supplement on other network-accessible resources.
The score may also be based on a popularity metric. A reputation metric is one example of a popularity metric. The reputation metric may be based on how many other network-accessible resources link to the digital supplement and on a combination of the reputation scores of those other network-accessible resources. In some implementations, the popularity score may be based on how often the digital supplement is or has been selected. In some implementations, the popularity score may correspond to how often the digital supplement is selected in response to visual content queries.
The score may be computed or may be retrieved from a data store or an API. In some implementations, an API is accessed to retrieve the scores for the digital supplements. For example, scores may be retrieved from a search engine that has determined the relevance and/or popularity of the digital supplements with respect to search terms based on the supplemental anchor.
The plurality of digital supplemental data structures may also be ordered based on a frequency of use by a particular user (e.g., a user of the client computing device) or a recency of use by the particular user. In some embodiments, the plurality of digital supplemental data structures are randomly ordered.
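One plausible way to combine the relevance and popularity factors above into a single ordering is a weighted sum; the weights and candidate data below are illustrative assumptions, not values specified by this description.

```python
# Rank digital supplements by a weighted combination of relevance and
# popularity scores. The 0.7/0.3 weighting is a hypothetical choice.
def rank_supplements(supplements, relevance_weight=0.7, popularity_weight=0.3):
    def score(s):
        return (relevance_weight * s["relevance"]
                + popularity_weight * s["popularity"])
    return sorted(supplements, key=score, reverse=True)

candidates = [
    {"name": "Artist Biography", "relevance": 0.6, "popularity": 0.9},
    {"name": "Museum Audio Guide", "relevance": 0.9, "popularity": 0.5},
]
ranked = rank_supplements(candidates)
```

With these weights, the more relevant supplement outranks the more popular one (0.78 vs. 0.69), which matches the text's emphasis on relevance to the visual content query.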
At operation 414, a response to the visual content query is generated based on the plurality of digital supplemental data structure instances. For example, information associated with the plurality of digital supplemental data structure instances may be transmitted to the client computing device in the order determined at operation 412. In some implementations, the information includes descriptive data that may be shown in a menu or another type of user interface configured to receive a user selection of a digital supplement. The information may also include access data that may be used by the client computing device to access or present the digital supplement.
FIG. 5 is a diagram of an example method 500 of searching for and presenting a digital supplement, according to an embodiment described herein. The method 500 may be performed, for example, by the application 122 of the client computing device 102 to identify and access a digital supplement based on a visual content query.
At operation 502, an image-based visual content query is transmitted to a server computing device (e.g., search server 152). For example, an image may be captured with the camera component 112 of the client computing device 102. The image may also be a stored image, such as an image previously captured by the camera component 112. In some implementations, the visual content query contains only images. In some implementations, the visual content query includes additional information. For example, the visual content query may include information such as a location of the client computing device 102 or an identifier of an account associated with a user of the client computing device 102. The application 122 may also identify anchors in the image (e.g., using a supplemental anchor identification engine 124). The visual content query may include an identifier (e.g., a text, numeric, or other type of identifier) of the identified anchor. In at least some implementations, the visual content query does not include an image.
In some implementations, transmitting the visual content query to the server includes calling an API. In some implementations, transmitting the visual content query to the server includes calling an API provided by the server. In some implementations, transmitting the visual content query to the server includes submitting a form (e.g., submitting a GET or POST request) using the HTTP protocol.
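Assembling the visual content query as an HTTP request body might look like the following sketch; the field names and the endpoint mentioned in the comment are hypothetical, and the image bytes stand in for a capture from the camera component.

```python
import base64
import json

def build_visual_content_query(image_bytes, location=None, user_id=None,
                               anchors=None):
    """Build a JSON request body containing the image and optional extras."""
    query = {"image": base64.b64encode(image_bytes).decode("ascii")}
    if location is not None:
        query["location"] = location   # e.g. {"lat": ..., "lng": ...}
    if user_id is not None:
        query["user_id"] = user_id
    if anchors:
        query["anchors"] = anchors     # anchor identifiers found on-device
    return json.dumps(query)

body = build_visual_content_query(
    b"\x89PNG...",  # placeholder image bytes
    location={"lat": 40.0, "lng": -74.0},
    anchors=["painting-42"])
# A client would then POST `body` to the search server, e.g. via
# urllib.request with a hypothetical URL such as
# "https://search.example/visual-query".
```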
At operation 504, a response to the visual content query identifying the digital supplement is received. The response may be received from search server 152 via network 190. The response may include one or more digital supplements identified by the search server 152 based on the visual content query. For example, the response may include an array of data associated with the digital complement. In some implementations, the data associated with the digital supplement can include descriptive data that can be used to present digital supplement options for user selection. For example, the descriptive data may include a name, a brief description, a publisher name, and an image. The data may also include access data, such as a URL and parameters included with the request via the URL or an application name and related parameters. The data may also include the location, coordinates, or dimensions of the supplemental anchor in the image transmitted with the visual content query (e.g., if the supplemental anchor is identified by the search server 152).
At operation 506, a user interface screen including information associated with the digital supplement is displayed. In some implementations, the user interface screen includes annotations that overlay the identified supplemental anchors (e.g., based on the provided coordinates). The annotation can provide information about an object in the image associated with the identified supplemental anchor. The annotation may include a user-actuatable control that can be actuated to present or activate the digital supplement. The user interface screen may also include a digital supplement selection panel that may be used to select from among a plurality of digital supplements identified in the response received at operation 504. In some implementations, the user interface screen can be generated by a web browser that opens a URL specified by the digital supplement. The user interface screen may also be generated by another application that is launched to present the digital supplement.
Fig. 6 is a diagram of an example method 600 of image-based recognition and presentation of digital supplements, according to an embodiment described herein. The method 600 may be performed, for example, by the application 122 of the client computing device 102 to identify and access a digital supplement based on a visual content query.
At operation 602, an image is captured. For example, the image may be captured by the camera component 112 of the client computing device 102. In some implementations, a sequence of images (i.e., video) can be captured by the camera component 112.
At operation 604, the image-based visual content query is transmitted to a server computing device, such as the search server 152. Operation 604 may be similar to operation 502. In embodiments where a sequence of images is captured, the visual content query may include a plurality of images or a sequence of images. In some implementations, the sequence of images can be streamed to a server computing device.
At operation 606, a response to the visual content query identifying a plurality of digital supplements is received. Operation 606 may be similar to operation 504 previously described.
At operation 608, a user interface screen is displayed that includes user-actuatable controls for selecting a digital supplement from the plurality of digital supplements. For example, a digital supplement selection panel may be displayed. The digital supplement selection panel can include a plurality of user-actuatable controls, each associated with one of the plurality of digital supplements identified in the response. The digital supplement selection panel can arrange the user-actuatable controls based on a ranking or ordering of the digital supplements provided by the server computing device. The digital supplement selection panel may arrange the user-actuatable controls vertically, horizontally, or otherwise. A user-actuatable control may be associated with, or include, information about the associated digital supplement that the user may consider in deciding whether to select that digital supplement. For example, the displayed information may include one or more of a name, a description, an image, and a publisher name of the digital supplement.
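Arranging the controls according to a server-provided ranking can be sketched as below; the `rank` field (lower values ranking first) is an illustrative assumption, not a defined part of the response format.

```python
def arrange_controls(supplements):
    """Order user-actuatable controls by a hypothetical server-provided
    'rank' field; entries without a rank sort last."""
    return sorted(supplements, key=lambda s: s.get("rank", float("inf")))

panel = arrange_controls([
    {"name": "Expense report", "rank": 3},
    {"name": "Tip calculator", "rank": 1},
    {"name": "Split payment", "rank": 2},
])
```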
At operation 610, a user input selecting a digital supplement is received. The user input may be a click using a mouse or other device. The user input may also be touch input from a stylus or finger. Another example of a user input is a near-touch input (e.g., holding a finger or pointing device in proximity to a touchscreen). In some implementations, the user input can also include gestures, head movements, eye movements, or voice input.
At operation 612, information is provided to a resource associated with the selected digital supplement. For example, information about the user of the client computing device may be transmitted to a server that provides the digital supplement (if permission to provide the information has been granted). The information may also be provided to an application that provides the digital supplement. Various types of information may be provided. For example, the information may include user information, such as a user name, user preferences, or a location.
The information may also include information related to the visual content query, such as an image or a sequence of images. The information may also include identifiers and/or locations of one or more supplemental anchors in the image. This information can be used to provide the digital supplement to the user. For example, AR content of the digital supplement may be sized and positioned based on the image.
This information may be transmitted by the client computing device 102 directly to a resource associated with the digital supplement (e.g., the digital supplemental server 172). In some implementations, the information is provided to the resource associated with the digital supplement by the search server 152 (e.g., so that the client computing device does not need to transmit as much data). In at least some of these implementations, the client computing device 102 can transmit selection information identifying the selected digital supplement to the search server 152. After receiving the selection and verifying that the user has authorized information sharing, the search server 152 may then transmit the information to the resource providing the digital supplement. The client computing device 102 may also prompt the user for permission to share the information. In some implementations, the search server 152 can determine the information to transmit to the resource based on a digital supplement data structure instance (which can be based on metadata associated with the digital supplement).
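The server-mediated path described above, in which information is forwarded to the supplement's resource only after authorization is verified, could be sketched as follows. The function and field names are hypothetical.

```python
def forward_selection(selection, user_authorized, send):
    """Relay query information to the resource providing the selected
    digital supplement, but only if the user has authorized sharing.

    'send' stands in for whatever transport delivers the payload to the
    supplement's resource; nothing is shared without permission.
    """
    if not user_authorized:
        return None
    payload = {
        "supplement_id": selection["supplement_id"],
        "image": selection.get("image"),
        "anchors": selection.get("anchors", []),
    }
    return send(payload)

sent = []
result = forward_selection(
    {"supplement_id": "tip-calc", "anchors": [{"x": 1, "y": 2}]},
    user_authorized=True,
    send=lambda p: sent.append(p) or p,
)
```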
At operation 614, the user interface is updated based on the selected digital supplement. Operation 614 may be similar to operation 506.
Figs. 7A-7C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 to conduct a visual content search and display a digital supplement. In fig. 7A, a user interface screen 700a is shown. User interface screen 700a includes an image display panel 708 and an information panel 730. In this example, the image display panel 708 is displaying an image of a rack full of wine bottles (e.g., such as might be found in a store). Image display panel 708 also includes indicator 740 and indicator 742. Each of these indicators indicates that the wine bottle shown in the image below the indicator has been recognized as a supplemental anchor (in this case, a recognized product). Indicators 740 and 742 are examples of user-actuatable controls. Within the information panel 730, the instruction "Tap on what you're interested in" is provided.
In fig. 7B, user interface screen 700b is shown after the user has actuated the indicator 740. Upon actuation, an annotation 744 from the digital supplement is displayed. The annotation 744 includes information about the rating of the wine, which may assist the user in selecting a bottle of wine to purchase.
In fig. 7C, another user interface screen 700c is shown after the user has actuated indicator 740. User interface screen 700c may be shown instead of or in addition to user interface screen 700b (e.g., after actuation of annotation 744, or if the user swipes up on information panel 730 in fig. 7B). In fig. 7C, an expanded information panel 732 is shown. The expanded information panel 732 occupies more of the user interface screen 700c than the information panel 730 in figs. 7A and 7B.
The expanded information panel 732 includes a digital supplement selection panel 710 and a digital supplement content display panel 734. The digital supplement selection panel 710 includes user-actuatable control 712, user-actuatable control 714, and user-actuatable control 716 (only partially visible). In some implementations, additional user-actuatable controls can be displayed when the user swipes across the digital supplement selection panel 710. The user-actuatable controls of the digital supplement selection panel 710 may be arranged in a ranked order. The user-actuatable control 712 is associated with a digital supplement for meal pairing. Upon actuation of the user-actuatable control 712, a digital supplement displaying food and meal pairing information for the selected wine may be presented. The user-actuatable control 714 is associated with a digital supplement for saving the photograph. Upon actuation, an application that saves the photograph may be activated and provided with the image. Other information, such as an identification of the supplemental anchor, may be saved with the photograph.
The digital supplement content display panel 734 may display content from the digital supplement. The digital supplement content display panel 734 can display a default digital supplement or the highest-ranked digital supplement associated with the identified supplemental anchor. In this example, the digital supplement content display panel 734 includes product information about the product associated with the selected supplemental anchor. In this case, a wine name, rating, place of origin, image, and review are provided.
Figs. 8A-8C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for conducting a visual content search and displaying digital supplements. In this example, the visual content search is based on an image of a receipt.
In fig. 8A, a user interface screen 800a is shown. User interface screen 800a includes image display panel 808 and information panel 830. In this example, image display panel 808 is displaying an image of a receipt from a restaurant. Image display panel 808 also includes indicator 840, indicator 842, annotation 844, and highlight overlay 846. In this case, indicator 840 is associated with the receipt as a document, and indicator 842 is associated with the particular restaurant named on the receipt. The identified receipt document and the identified restaurant name are examples of supplemental anchors.
The annotation 844 is associated with a digital supplement that provides a tip calculator. In this example, an exemplary tip calculation is included on the annotation 844 and overlaid at the appropriate location on the image display panel 808. In some embodiments, the digital supplement may be selected and displayed by default upon identification of the appropriate supplemental anchor. Highlight overlay 846 overlays the portion of the receipt document that includes the information used by the tip calculator digital supplement.
In this example, the items displayed in the information panel 830 relate to the receipt as a document, as if the indicator 840 had been actuated. In some implementations, the identified supplemental anchors are ranked by likely relevance or user interest based on, for example, a user's past actions, other users' actions for similar images, confidence scores for the supplemental anchors, or the positions or sizes of the portions of the image to which the supplemental anchors relate. Then, in at least some implementations, the information panel 830 can display the items related to the highest-ranked supplemental anchor. If indicator 842 were instead actuated, information panel 830 might include items regarding the particular restaurant.
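One way the anchor-ranking signals mentioned above might be combined is sketched below; the weights and field names are illustrative assumptions rather than values from the disclosure, and a fuller ranker could also incorporate past user actions.

```python
def rank_anchors(anchors):
    """Rank supplemental anchors by likely relevance, combining a
    confidence score with the relative size of the anchored image
    region. Weights (0.7 / 0.3) are illustrative only."""
    def score(a):
        area_term = min(a["width"] * a["height"] / 10000.0, 1.0)
        return 0.7 * a["confidence"] + 0.3 * area_term
    return sorted(anchors, key=score, reverse=True)

ranked = rank_anchors([
    {"name": "restaurant name", "confidence": 0.6, "width": 40, "height": 10},
    {"name": "receipt document", "confidence": 0.9, "width": 100, "height": 100},
])
```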
Here, the information panel 830 includes a digital supplement selection panel 810. The digital supplement selection panel 810 includes user-actuatable control 812, user-actuatable control 814, and user-actuatable control 816. In this example, user-actuatable control 812 is associated with a tip calculator digital supplement, user-actuatable control 814 is associated with a payment-splitting digital supplement, and user-actuatable control 816 is associated with an expense report digital supplement. For example, upon actuation of user-actuatable control 812, a user interface control for adjusting a parameter of the tip calculator may be displayed (e.g., to adjust the tip percentage).
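At its core, a tip calculator digital supplement of the kind associated with control 812 could perform a calculation like the following sketch; the default percentage is an arbitrary illustrative choice, and in the described interface it would be the user-adjustable parameter.

```python
def calculate_tip(bill_total, percentage=18.0):
    """Tip amount for a receipt total at an adjustable percentage."""
    return round(bill_total * percentage / 100.0, 2)

def total_with_tip(bill_total, percentage=18.0):
    """Bill total including the computed tip."""
    return round(bill_total + calculate_tip(bill_total, percentage), 2)
```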
In fig. 8B, user interface screen 800b is shown after the user has actuated user-actuatable control 814. After actuation, an expanded information panel 832 is shown that includes items that help the user calculate how to split the payment. For example, the number of people splitting the payment may be entered to determine the amount each person should pay.
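The split calculation could be sketched as follows. Working in integer cents and assigning any remainder to the first payers is one reasonable convention, not necessarily the disclosed method.

```python
def split_payment(total, people):
    """Split a payment evenly among 'people' payers, in cents, giving
    any leftover cents to the first payers so the shares sum exactly
    to the total."""
    cents = round(total * 100)
    base, remainder = divmod(cents, people)
    return [(base + (1 if i < remainder else 0)) / 100.0
            for i in range(people)]
```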
In fig. 8C, user interface screen 800c is shown after the user has actuated user-actuatable control 816. After actuation, an expanded information panel 834 is shown that includes items that help the user store the receipt to an expense report. For example, the user may select the expense report (e.g., "Sydney trip 2018") that should be associated with the receipt. Once the expense report is selected, an image of the receipt may be uploaded to an expense report submission or management system. In some implementations, the full image shown on the image display panel 808 is uploaded. In some implementations, a portion of the image is uploaded (e.g., the image is cropped to include only the receipt).
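Cropping the image to include only the receipt, as described above, amounts to extracting the supplemental anchor's bounding box. A sketch with illustrative field names, treating the image as a row-major grid of pixels:

```python
def crop_to_anchor(image, anchor):
    """Crop a row-major image (a list of rows of pixels) to a
    supplemental anchor's bounding box, e.g. so only the receipt
    portion of the image is uploaded."""
    x, y = anchor["x"], anchor["y"]
    w, h = anchor["width"], anchor["height"]
    return [row[x:x + w] for row in image[y:y + h]]

# Toy 6x6 image whose "pixels" record their (row, column) position.
image = [[(r, c) for c in range(6)] for r in range(6)]
receipt_only = crop_to_anchor(image, {"x": 1, "y": 2, "width": 3, "height": 2})
```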
Figs. 9A and 9B are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for conducting a visual content search and displaying digital supplements. In this example, the visual content search is based on an image of a face.
In fig. 9A, a user interface screen 900a is shown. The user interface screen 900a includes an image display panel 908 and an information panel 930. In this example, the image display panel 908 is displaying an image of a face. Here, the face is an example of a supplemental anchor. The information panel 930 includes a user-actuatable control 912 for a digital supplement identified for the supplemental anchor in the image (i.e., the face). The user-actuatable control 912 is associated with a digital supplement for trying on eyewear.
In fig. 9B, user interface screen 900b is shown after the user has actuated user-actuatable control 912. After actuation, an expanded information panel 932 is shown that includes items that help the user virtually try on glasses over the face in the image. Here, a plurality of eyeglass styles are displayed, and the user can select a pair to try on. Upon selection of a pair of glasses, AR content 960 is overlaid on the image display panel 908. Here, the AR content 960 corresponds to the selected glasses and is sized to match the face in the image. In some embodiments, when the digital supplement for trying on glasses is selected, the image shown in the image display panel 908 is transmitted to a server that provides the digital supplement so that the image can be analyzed to determine where and how to position and size the AR content 960, or to recommend glasses that fit.
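Positioning and sizing AR content such as the glasses overlay could follow a simple heuristic like the sketch below. The eye-line fraction and field names are illustrative assumptions, not the analysis the server actually performs.

```python
def fit_overlay(face_box, model_size):
    """Position and scale an AR glasses model over a detected face box.

    Assumes the glasses should span the face width and sit at roughly
    eye height (about 35% down the face box), an illustrative heuristic.
    """
    scale = face_box["width"] / model_size["width"]
    x = face_box["x"]
    y = face_box["y"] + int(0.35 * face_box["height"])  # approximate eye line
    return {"x": x, "y": y, "scale": scale}

placement = fit_overlay({"x": 10, "y": 20, "width": 100, "height": 120},
                        {"width": 50})
```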
Figs. 10A-10C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for conducting a visual content search and displaying digital supplements. In this example, the visual content search is based on an image of furniture in a catalog.
In fig. 10A, a user interface screen 1000A is shown. The user interface screen 1000a includes an image display panel 1008. In this example, image display panel 1008 is displaying an image of a portion of a page of a furniture catalog. The image display panel further includes an indicator 1040, an indicator 1042, and an indicator 1044. In this example, indicator 1040 is associated with a bed, indicator 1042 is associated with a decorative item, and indicator 1044 is associated with a carpet. The images of the bed, decorative items and carpet in the catalog are examples of supplemental anchors.
In fig. 10B, user interface screen 1000b is shown after the user has selected indicator 1040 (e.g., by touching the screen at or near the display of indicator 1040). User interface screen 1000b includes a digital supplement selection panel 1010 and an information panel 1030. The information panel 1030 includes information (e.g., a product name, description, and image) about the supplemental anchor associated with the selected indicator.
The digital supplement selection panel 1010 includes a user-actuatable control 1012 and a user-actuatable control 1014. User-actuatable control 1012 is associated with a digital supplement that provides an in-home view of the product. User-actuatable control 1014 is associated with another digital supplement (e.g., a digital supplement for posting to a social media site).
In fig. 10C, user interface screen 1000c is shown after actuation of user-actuatable control 1012. The user interface screen 1000c includes an image display panel 1008, a digital supplement selection panel 1010, and a condensed information panel 1032. The condensed information panel 1032 may include user-actuatable controls that, when actuated, expand to display the full information panel.
Here, the image display panel 1008 now displays an image of a room and includes AR content 1060. The AR content 1060 includes a 3D model of the bed associated with the indicator 1040, overlaid on the image display panel. The user may be able to adjust the position of the AR content 1060 within the room to see how the bed fits into the room. In some implementations, when the digital supplement for viewing at home is selected, the image shown in the image display panel 1008 is transmitted to a server that provides the digital supplement so that the image can be analyzed to determine where and how to position and size the AR content 1060. In some implementations, the AR content 1060 may be provided at a later time than the visual content query.
Figs. 11A-11C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 for conducting various visual content searches within a store. In this example, the visual content searches are based on images of products captured within the store.
In fig. 11A, a user interface screen 1100a is shown. User interface screen 1100a includes an image display panel 1108 and an information panel 1130. In this example, the image display panel 1108 is displaying an image captured within a store. The image display panel 1108 also includes an indicator 1140 associated with a vase. The vase displayed on the image display panel 1108 is an example of a supplemental anchor. The information panel 1130 is displaying a digital supplement that includes product information about the vase and functionality to purchase the vase. The digital supplement may, for example, include a workflow for initiating a purchase of the vase. In this example, the digital supplement is identified based on both the image content and the location of the client computing device, so that, while the client computing device is in the store, a digital supplement published by (or associated with) that store can be identified as a highly ranked result and provided in response to the visual content query. In some implementations, if the location of the client computing device changes, a different digital supplement will be provided for the same image.
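Location-aware identification of digital supplements, as in this example, can be sketched as a re-ranking step applied to the candidates for a visual content query; the `store_id` and `rank` fields are hypothetical.

```python
def rank_by_location(supplements, device_store_id):
    """Re-rank candidate digital supplements using device location:
    supplements published by (or associated with) the store the device
    is currently in are boosted ahead of the others; ties fall back to
    the hypothetical server-provided 'rank' field."""
    def key(s):
        in_store = 0 if (s.get("store_id") is not None
                         and s.get("store_id") == device_store_id) else 1
        return (in_store, s.get("rank", 99))
    return sorted(supplements, key=key)

results = rank_by_location([
    {"name": "Generic product info", "rank": 1},
    {"name": "Store coupon", "rank": 2, "store_id": "store-42"},
], device_store_id="store-42")
```

With a different device location, the same image would yield a different ordering, matching the behavior described above.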
In fig. 11B, a user interface screen 1100b is shown. User interface screen 1100b includes an image display panel 1108 and an information panel 1130. In this example, the image display panel 1108 is displaying another image captured within the store. The image display panel 1108 also includes an indicator 1142 associated with a carpet. The carpet displayed on the image display panel 1108 is an example of a supplemental anchor. The information panel 1130 is displaying a digital supplement that includes product information about the carpet as well as functionality to select a size and purchase the carpet. As in fig. 11A, the digital supplement is identified based on the image content and the location of the client computing device.
In fig. 11C, a user interface screen 1100c is shown. User interface screen 1100c includes an image display panel 1108 and an information panel 1130. In this example, the image display panel 1108 is displaying another image captured within the store. The image display panel 1108 also includes an indicator 1144 associated with a vase. The vase displayed on the image display panel 1108 is an example of a supplemental anchor. The information panel 1130 is displaying a digital supplement including product information about the vase. Information panel 1130 also includes a coupon indicator 1132 and coupon redemption functionality. Redeeming the coupon may include purchasing the item at the discounted price from a website associated with the store. In some embodiments, a coupon code is presented that can be used to apply the discount during checkout. As in figs. 11A and 11B, the digital supplement is identified based on the image content and the location of the client computing device.
Figs. 12A-12C are schematic diagrams of user interface screens displayed by an embodiment of the client computing device 102 during various visual content searches. In this example, the visual content search is based on an image of a movie poster (e.g., an image that may be captured at a movie theater).
In fig. 12A, a user interface screen 1200a is shown. The user interface screen 1200a includes an image display panel 1208. In this example, the image display panel 1208 is displaying an image of a movie poster. The image display panel 1208 also includes an indicator 1240 associated with the movie poster identified in the image. The movie poster is an example of a supplemental anchor. Indicator 1240 may include a user-actuatable control that, when actuated, will display a digital supplement or a menu for selecting a digital supplement.
In fig. 12B, a user interface screen 1200b is shown. The image display panel 1208 also includes a preview digital supplement 1242 associated with the movie poster identified in the image. For example, the preview digital supplement 1242 can be shown after actuation of indicator 1240 (of fig. 12A). The preview digital supplement 1242 may overlay an image or video from the movie associated with the identified movie poster over the image of the movie poster.
In fig. 12C, a user interface screen 1200c is shown. Image display panel 1208 also includes a rating indicator 1244 and a rating indicator 1246. Rating indicator 1244 and rating indicator 1246 may be generated by one or more digital supplements in response to a visual content query that includes the movie poster. A digital supplement may, for example, overlay rating information for a movie associated with the movie poster in the image. Rating indicator 1244 and rating indicator 1246 may include user-actuatable controls that, when actuated, cause additional information to be displayed regarding the rating and the associated movie.
Fig. 13 illustrates an example of a computer device 1300 and a mobile computer device 1350 that can be used with the techniques described herein (e.g., to implement the client computing device 102, the search server 152, and the digital supplemental server 172). Computing device 1300 includes a processor 1302, memory 1304, a storage device 1306, a high-speed interface 1308 connecting to memory 1304 and high-speed expansion ports 1310, and a low-speed interface 1312 connecting to low-speed bus 1314 and storage device 1306. Each of the components 1302, 1304, 1306, 1308, 1310, and 1312 is interconnected using various buses, and may be mounted on a common motherboard or in other manners as appropriate. The processor 1302 can process instructions for execution within the computing device 1300, including instructions stored in the memory 1304 or on the storage device 1306, to display graphical information for a GUI on an external input/output device, such as display 1316 coupled to high-speed interface 1308. In other implementations, multiple processors and/or multiple buses may be used, as appropriate, along with multiple memories and types of memory. Additionally, multiple computing devices 1300 may be connected, with each device providing portions of the necessary operations (e.g., as a server bank, a group of blade servers, or a multi-processor system).
The memory 1304 stores information within the computing device 1300. In one implementation, the memory 1304 is a volatile memory unit or units. In another implementation, the memory 1304 is one or more non-volatile memory units. The memory 1304 may also be another form of computer-readable medium, such as a magnetic or optical disk.
The storage device 1306 is capable of providing mass storage for the computing device 1300. In one implementation, the storage device 1306 may be or contain a computer-readable medium, such as a floppy disk device, a hard disk device, an optical disk device, or a tape device, a flash memory or other similar solid state memory device, or an array of devices, including devices in a storage area network or other configurations. The computer program product may be tangibly embodied in an information carrier. The computer program product may also contain instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-or machine-readable medium, such as the memory 1304, the storage device 1306, or memory on processor 1302.
The high speed controller 1308 manages bandwidth-intensive operations for the computing device 1300, while the low speed controller 1312 manages lower bandwidth-intensive operations. This allocation of functionality is merely exemplary. In one embodiment, the high-speed controller 1308 is coupled to memory 1304, display 1316 (e.g., through a graphics processor or accelerator), and to high-speed expansion ports 1310, which may accept various expansion cards (not shown). In this embodiment, low-speed controller 1312 is coupled to storage device 1306 and low-speed expansion port 1314. The low-speed expansion port, which may include various communication ports (e.g., USB, bluetooth, ethernet, wireless ethernet), is coupled to one or more input/output devices, such as a keyboard, a pointing device, a scanner, or a networking device such as a switch or router, for example, through a network adapter.
As shown in the figure, computing device 1300 may be implemented in a number of different forms. For example, it may be implemented as a standard server 1320, or multiple times in a group of such servers. It may also be implemented as part of a rack server system 1324. Further, it may be implemented in a personal computer such as a laptop computer 1322. Alternatively, components from computing device 1300 may be combined with other components in a mobile device (not shown), such as device 1350. Each of such devices may contain one or more of computing device 1300, 1350, and an entire system may be made up of multiple computing devices 1300, 1350 communicating with each other.
Computing device 1350 includes a processor 1352, memory 1364, input/output devices such as a display 1354, communication interface 1366 and transceiver 1368, among other components. The device 1350 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 1350, 1352, 1364, 1354, 1366, and 1368 are interconnected using various buses, and several of the components may be mounted on a common motherboard or in other manners as appropriate.
The processor 1352 may execute instructions within the computing device 1350, including instructions stored in the memory 1364. The processor may be implemented as a chipset of chips that include separate and multiple analog and digital processors. For example, the processor may provide coordination for the other components of the device 1350, such as control of user interfaces, applications run by device 1350, and wireless communication by device 1350.
The processor 1352 may communicate with a user through a control interface 1358 and a display interface 1356 coupled to a display 1354. The display 1354 may be, for example, a TFT LCD (thin film transistor liquid crystal display) and an LED (light emitting diode) or OLED (organic light emitting diode) display or other suitable display technology. The display interface 1356 may include appropriate circuitry for driving the display 1354 to present graphical and other information to a user. The control interface 1358 may receive commands from a user and convert them for submission to the processor 1352. In addition, an external interface 1362 may be provided in communication with the processor 1352, enabling near area communication of the device 1350 with other devices. External interface 1362 may, for example, be provided for wired communication in some embodiments, or for wireless communication in other embodiments, and multiple interfaces may also be used.
Memory 1364 stores information within computing device 1350. The memory 1364 may be implemented as one or more of one or more computer-readable media, one or more volatile memory units, or one or more non-volatile memory units. Expansion memory 1374 may also be provided and connected to device 1350 via an expansion interface 1372, which may include, for example, a SIMM (Single In-Line Memory Module) card interface. Such expansion memory 1374 may provide additional storage space for device 1350, or may also store applications or other information for device 1350. Specifically, expansion memory 1374 may include instructions to perform or supplement the processes described above, and may also include secure information. Thus, for example, expansion memory 1374 may be provided as a security module for device 1350, and may be programmed with instructions that allow device 1350 to be used securely. In addition, secure applications may be provided via the SIMM card, as well as additional information, such as placing identifying information on the SIMM card in a non-hackable manner.
As discussed below, the memory may include, for example, flash memory and/or NVRAM memory. In one embodiment, a computer program product is tangibly embodied in an information carrier. The computer program product contains instructions that, when executed, perform one or more methods, such as those described above. The information carrier is a computer-or machine-readable medium, such as the memory 1364, expansion memory 1374, or memory on processor 1352, that may be received, for example, by transceiver 1368 or external interface 1362.
Device 1350 may communicate wirelessly through communication interface 1366, which communication interface 1366 may include digital signal processing circuitry as necessary. Communication interface 1366 may provide for communicating under various modes or protocols, such as GSM voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA, CDMA2000, or GPRS, among others. Such communication may occur, for example, through radio-frequency transceiver 1368. Further, short-range communication may occur, such as using Bluetooth, Wi-Fi, or other such transceivers (not shown). In addition, GPS (global positioning system) receiver module 1370 may provide additional navigation-and location-related wireless data to device 1350, which may be used as appropriate by applications running on device 1350.
The device 1350 may also communicate audibly using the audio codec 1360, which may receive spoken information from a user and convert it to usable digital information. Audio codec 1360 may likewise generate audible sound for a user, such as through speakers, e.g., in a headset of device 1350. Such sound may include sound from voice telephone calls, may include recorded sound (e.g., voice messages, music files, etc.) and may also include sound generated by applications operating on device 1350.
As shown in the figure, computing device 1350 may be implemented in a number of different forms. For example, it may be implemented as a cellular telephone 1380. It may also be implemented as part of a smart phone 1382, personal digital assistant, or other similar mobile device.
Various embodiments of the systems and techniques described here can be implemented in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations can include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium," "computer-readable medium" refers to any computer program product, apparatus and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (an LED (light emitting diode), or OLED (organic LED), or LCD (liquid crystal display) monitor/screen) to display information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), and the internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
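The client-server relationship described above can be illustrated with a small sketch: two programs on (here, the same) machine whose relationship arises purely from the roles they play over a network connection. This is an illustrative toy using Python's standard library, not part of the patent's disclosure:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# A minimal back-end component: a data server that answers a single query.
class QueryHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "ok", "path": self.path}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):  # silence default request logging
        pass

def run_demo():
    # Port 0 asks the OS for any free port, so the demo never collides.
    server = HTTPServer(("127.0.0.1", 0), QueryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    # The "client" is simply another program talking to the server over the network.
    with urlopen(f"http://127.0.0.1:{port}/query") as resp:
        reply = json.load(resp)
    server.shutdown()
    return reply
```

Running `run_demo()` returns the server's JSON reply; the two roles exist only by virtue of the programs running and communicating, mirroring the description above.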
In some implementations, the computing device depicted in fig. 13 may include sensors that interface with the AR headset/HMD device 1390 to generate an augmented environment for viewing inserted content within the physical space. For example, one or more sensors included on computing device 1350 or other computing devices depicted in fig. 13 may provide input to AR headset 1390 or, in general, to the AR space. The sensors may include, but are not limited to, touch screens, accelerometers, gyroscopes, pressure sensors, biometric sensors, temperature sensors, humidity sensors, and ambient light sensors. Computing device 1350 may use the sensors to determine an absolute position and/or a detected rotation of the computing device in the AR space, which may then be used as input to the AR space. For example, computing device 1350 may be incorporated into the AR space as a virtual object, such as a controller, laser pointer, keyboard, weapon, and so forth. When incorporated into the AR space, the user's positioning of the computing device/virtual object allows the user to position the computing device so as to view the virtual object in certain manners in the AR space. For example, if the virtual object represents a laser pointer, the user may manipulate the computing device as if it were an actual laser pointer, moving it left and right, up and down, or in a circle, and otherwise using the device much as one would use a laser pointer. In some embodiments, the user may aim at a target location using the virtual laser pointer.
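As a concrete illustration of using a detected rotation as AR input, the sketch below maps a device's yaw and pitch (e.g., integrated from gyroscope readings) to the point a virtual laser pointer's ray would reach in the scene. The function and field names, and the choice of −z as the forward axis, are illustrative assumptions, not part of the disclosure:

```python
import math
from dataclasses import dataclass

@dataclass
class Orientation:
    yaw: float    # rotation about the vertical axis, in radians
    pitch: float  # rotation about the horizontal axis, in radians

def pointer_target(orientation: Orientation, distance: float):
    """Project a 'laser pointer' ray from the device along its forward axis.

    Returns the (x, y, z) point the ray reaches at the given distance, so the
    AR scene can draw the virtual pointer where the physical device aims.
    """
    cp = math.cos(orientation.pitch)
    x = distance * cp * math.sin(orientation.yaw)
    y = distance * math.sin(orientation.pitch)
    z = -distance * cp * math.cos(orientation.yaw)  # -z is "forward" here
    return (x, y, z)
```

With this mapping, sweeping the physical device left and right changes yaw and moves the rendered pointer dot across the AR scene, as the paragraph above describes.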
In some implementations, one or more input devices included on or connected to computing device 1350 can be used as input to the AR space. The input devices may include, but are not limited to, a touch screen, a keyboard, one or more buttons, a touch pad, a pointing device, a mouse, a trackball, a joystick, a camera, a microphone, earphones or earbuds having input capabilities, a game controller, or other connectable input devices. When the computing device is incorporated into the AR space, a user interacting with an input device included on computing device 1350 may cause a particular action to occur in the AR space.
In some implementations, the touchscreen of computing device 1350 can be rendered as a touchpad in the AR space. The user may interact with the touchscreen of computing device 1350, and the interactions are rendered, in AR headset 1390 for example, as movements on the touchpad rendered in the AR space. The rendered movements may control virtual objects in the AR space.
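A minimal sketch of rendering touchscreen interaction as touchpad movement might map a drag gesture to a positional delta on a virtual object. The scale factor and the screen-coordinate convention (y grows downward) below are illustrative assumptions:

```python
def apply_drag(object_pos, touch_start, touch_end, scale=0.01):
    """Map a drag on the device touchscreen to movement of a virtual object.

    touch_start / touch_end are (x, y) screen pixels; the delta is scaled
    into AR-space units and applied to the object's (x, y) position.
    """
    dx = (touch_end[0] - touch_start[0]) * scale
    dy = (touch_end[1] - touch_start[1]) * scale
    return (object_pos[0] + dx, object_pos[1] - dy)  # screen y grows downward
```

For example, dragging a finger 100 pixels to the right would move the controlled virtual object one AR-space unit to the right at the assumed scale.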
In some implementations, one or more output devices included on computing device 1350 may provide output and/or feedback to a user of AR headset 1390 in the AR space. The output and feedback may be visual, tactile, or audible. The output and/or feedback may include, but is not limited to, vibrations; turning one or more lights or strobes on or off, blinking, and/or flashing them; sounding an alarm; playing a ringtone; playing a song; and playing an audio file. Output devices may include, but are not limited to, vibration motors, vibration coils, piezoelectric devices, electrostatic devices, light emitting diodes (LEDs), strobe lights, and speakers.
In some implementations, the computing device 1350 can appear as another object in a computer-generated 3D environment. User interaction with computing device 1350 (e.g., rotating, shaking, touching a touchscreen, sliding a finger on a touchscreen) may be interpreted as interaction with an object in AR space. In the example of a laser pointer in AR space, computing device 1350 appears as a virtual laser pointer in a computer-generated 3D environment. As the user manipulates computing device 1350, the user in AR space sees the movement of the laser pointer. The user receives feedback from interaction with computing device 1350 in an AR environment on computing device 1350 or AR headset 1390. User interaction with the computing device may be transformed into interaction with a user interface generated for the controllable device in the AR environment.
In some implementations, the computing device 1350 can include a touch screen. For example, a user may interact with a touch screen to interact with a user interface of a controllable device. For example, a touch screen may include user interface elements, such as sliders that may control properties of a controllable device.
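Such a slider element might be modeled as below, with the slider's normalized position scaled into a device property's range. The `Slider` and `Lamp` classes and their members are hypothetical stand-ins for illustration, not part of the disclosed system:

```python
class Slider:
    """A touchscreen slider that maps its position to a device property.

    The slider's normalized position (0.0-1.0) is scaled into the
    controlled property's range, e.g. a lamp's brightness.
    """
    def __init__(self, minimum, maximum):
        self.minimum = minimum
        self.maximum = maximum
        self.position = 0.0  # normalized 0..1

    def set_position(self, position):
        # Clamp so out-of-range touch input cannot exceed the property range.
        self.position = min(1.0, max(0.0, position))

    @property
    def value(self):
        return self.minimum + self.position * (self.maximum - self.minimum)

class Lamp:
    """A stand-in 'controllable device' with one adjustable property."""
    def __init__(self):
        self.brightness = 0

    def on_slider_change(self, slider):
        self.brightness = slider.value
```

Dragging the rendered slider would call `set_position` and then notify the device, so the touchscreen gesture directly controls the device property.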
Computing device 1300 is intended to represent various forms of digital computers and devices, including but not limited to laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Computing device 1350 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smart phones, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit embodiments of the inventions described and/or claimed herein.
Various embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the description.
Moreover, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other embodiments are within the scope of the following claims.
While certain features of the described embodiments have been illustrated as described herein, many modifications, substitutions, changes, and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the scope of the embodiments. It is to be understood that they have been presented by way of example only, not limitation, and various changes in form and details may be made. Any portions of the devices and/or methods described herein may be combined in any combination, except mutually exclusive combinations. The embodiments described herein may include various combinations and/or subcombinations of the functions, features, and/or properties of the different embodiments described.

Claims (51)

1. A computer-implemented method, the method comprising:
receiving data specifying a digital supplement, the data identifying the digital supplement and a supplement anchor for associating the digital supplement with visual content;
generating a data structure instance specifying the digital supplement and the supplement anchor; and
after generating the data structure instance, enabling image triggering of the digital supplement based at least on storing the data structure instance in a database comprising a plurality of other data structure instances, wherein each of the plurality of other data structure instances specifies a digital supplement and one or more supplement anchors.
2. The computer-implemented method of claim 1, wherein receiving the data specifying the digital supplement comprises:
analyzing the web page; and
identifying metadata associated with the digital supplement represented within the web page.
3. The computer-implemented method of claim 2, wherein receiving the data specifying the digital supplement further comprises:
crawling a plurality of web pages including the web page before analyzing the web page.
4. The computer-implemented method of claim 2 or claim 3, wherein receiving the data specifying the digital supplement further comprises:
receiving an identifier of the web page prior to analyzing the web page.
5. The computer-implemented method of any of claims 2 to 4, further comprising associating the data structure instance with a popularity score of the web page.
6. The computer-implemented method of any of claims 2 to 5, further comprising associating the data structure instance with a relevance score based on the supplement anchor of the web page.
7. The computer-implemented method of any of the preceding claims, wherein the digital supplement comprises a name, a description, an image, and a uniform resource locator.
8. The computer-implemented method of any of the preceding claims, wherein the digital supplement comprises an identifier of an application.
9. The computer-implemented method of any of the preceding claims, further comprising, after enabling triggering of the digital supplement:
receiving a visual content query comprising an image;
determining that the image matches the supplement anchor specified by the generated data structure instance; and
in response to determining that the image matches the supplement anchor, providing the digital supplement from the data structure instance in response to the visual content query.
10. The computer-implemented method of claim 9, wherein determining that the image matches the supplement anchor comprises identifying an entity within the image.
11. The computer-implemented method of claim 9 or claim 10, wherein the data structure instance further specifies contextual information, and wherein providing the digital supplement from the data structure instance comprises providing the data from the data structure instance based on determining that the contextual information has been satisfied.
12. The computer-implemented method of any of the preceding claims, wherein providing the digital supplement from the data structure instance in response to the visual content query comprises providing a list of digital supplements that includes the digital supplement from the data structure instance and a digital supplement from one of the other data structure instances and that is ordered based on one or more of a popularity score and a relevance score.
13. A computing device, the computing device comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the computing device to:
receiving data specifying a digital supplement, the data identifying the digital supplement, a supplement anchor for associating the digital supplement with visual content, and contextual information;
generating a data structure instance specifying the digital supplement, the supplement anchor, and the contextual information; and
after generating the data structure instance, enabling triggering of the digital supplement by an image based at least on storing the data structure instance in a database comprising a plurality of other data structure instances, wherein each of the plurality of other data structure instances specifies a digital supplement and one or more supplement anchors.
14. The computing device of claim 13, wherein the contextual information comprises a location.
15. The computing device of claim 13 or claim 14, wherein the context information comprises an entity identified within the image.
16. The computing device of any of claims 13 to 15, wherein the contextual information includes a plurality of entities identified within the image.
17. A computer-implemented method, the method comprising:
receiving a visual content query from a computing device;
identifying a supplement anchor based on the visual content query;
generating an ordered list of digital supplements based on the identified supplement anchor; and
transmitting the ordered list to the computing device.
18. The computer-implemented method of claim 17, wherein the visual content query comprises an image and the digital supplement comprises a URL to a web page presenting information about an entity identified in the image.
19. The computer-implemented method of claim 17 or claim 18, wherein the digital supplement comprises a URL to a web page presenting a coupon code for an entity identified in the image.
20. The computer-implemented method of any of claims 17 to 19, wherein the digital supplement comprises video content identified based on the image.
21. The computer-implemented method of any of claims 17 to 20, wherein the digital supplement comprises a request for a server and parameters of the request.
22. The computer-implemented method of claim 21, wherein the parameter initiates a workflow.
23. The computer-implemented method of any of claims 17 to 22, wherein the digital supplements in the list are ordered based on one or more of a popularity score and a relevance score.
24. A computing device, the computing device comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the computing device to perform the method of any of claims 1-12 or 17-23.
25. A computer program product comprising instructions which, when executed by one or more processors, cause the one or more processors to carry out the method according to any one of claims 1 to 12 or 17 to 23.
26. A computer-readable storage medium comprising instructions that, when executed by one or more processors, cause the one or more processors to perform the method of any one of claims 1-12 or 17-23.
27. A computer-implemented method, the method comprising:
transmitting a visual content query to a server computing device;
receiving a response to the visual content query identifying a digital supplement; and
causing a user interface to be displayed that includes information associated with the digital supplement.
28. The computer-implemented method of claim 27, wherein transmitting a visual content query comprises transmitting an image to the server computing device.
29. The computer-implemented method of claim 27 or claim 28, wherein transmitting a visual content query comprises transmitting an identifier of an entity within an image to the server computing device.
30. The computer-implemented method of any of claims 27 to 29, further comprising transmitting context information to the server computing device.
31. The computer-implemented method of claim 30, wherein the contextual information comprises a location.
32. The computer-implemented method of any of claims 27 to 31, wherein receiving a response to the visual content query comprises receiving a URL associated with the digital supplement.
33. The computer-implemented method of any of claims 27 to 32, wherein receiving a response to the visual content query comprises receiving an identifier of an application associated with the digital supplement.
34. The computer-implemented method of any of claims 27 to 33, wherein receiving a response to the visual content query comprises receiving a name, a description, and an image associated with the digital supplement.
35. The computer-implemented method of any of claims 27 to 34, wherein receiving a response to the visual content query comprises receiving a list of digital supplements, the list of digital supplements comprising the digital supplement.
36. The computer-implemented method of claim 35, wherein the user interface includes information about a plurality of digital supplements from the list of digital supplements.
37. The computer-implemented method of claim 36, further comprising:
determining an order of the list of digital supplements based on whether an application associated with the digital supplements is installed.
38. The computer-implemented method of any of claims 27 to 37, further comprising capturing an image, and wherein the visual content query is based on the image.
39. A computing device, the computing device comprising:
at least one processor; and
a memory storing instructions that, when executed by the at least one processor, cause the computing device to:
capturing an image;
transmitting a visual content query based on the image to a server computing device;
receiving a response to the visual content query identifying a digital supplement; and
causing a user interface to be displayed that includes information associated with the digital supplement.
40. The computing device of claim 39, wherein the instructions that cause the computing device to receive a response to the visual content query comprise instructions that cause the computing device to receive an ordered list of digital supplements that includes the digital supplement.
41. The computing device of claim 40, wherein the user interface comprises a digital supplement selection panel comprising user-actuatable controls associated with a plurality of digital supplements from the ordered list.
42. The computing device of claim 41, wherein the user-actuatable controls are ordered on the digital supplement selection panel based on an order provided by the ordered list.
43. The computing device of any of claims 39-42, wherein the digital supplement is associated with a digital supplement server, and the instructions, when executed by the at least one processor, further cause the computing device to:
transmitting information to the digital supplement server; and
receiving digital supplement content from the digital supplement server, the digital supplement content being based on the transmitted information.
44. The computing device of claim 43, wherein the transmitted information comprises contextual information.
45. A computer-implemented method, the method comprising:
capturing an image;
transmitting a visual content query based on the image to a search server;
receiving a response to the visual content query identifying a digital supplement server;
causing the image to be transmitted to the digital supplement server;
receiving digital supplement content from the digital supplement server; and
causing the digital supplement content to be displayed.
46. The computer-implemented method of claim 45, wherein the visual content query includes the image, and causing the image to be transmitted to the digital supplement server includes transmitting instructions to the search server to provide the image to the digital supplement server.
47. The computer-implemented method of any of claims 45 to 46, wherein causing the image to be transmitted to the digital supplement server comprises transmitting the image to the digital supplement server.
48. The computer-implemented method of any of claims 45 to 47, wherein the digital supplement includes augmented reality content that is resized and positioned based on the image.
49. A computing device, the computing device comprising:
at least one processor; and
memory storing instructions that, when executed by the at least one processor, cause the computing device to perform the method of any of claims 27-38 or 45-48.
50. A computer program product comprising instructions which, when executed by one or more processors, cause the one or more processors to perform the method of any one of claims 27 to 38 or 45 to 48.
51. A computer-readable storage medium comprising instructions that, when executed by one or more processors, cause the one or more processors to perform the method of any one of claims 27-38 or 45-48.
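Claims 1, 9, and 12, read together, describe storing data structure instances that pair digital supplements with supplement anchors and returning an ordered list of supplements when a query image matches an anchor. A minimal sketch of that flow follows; all class and field names are illustrative assumptions, and entity recognition from the image (the visual-search step) is elided:

```python
from dataclasses import dataclass

@dataclass
class DigitalSupplement:
    """Supplement metadata per claim 7: name, description, image, and URL."""
    name: str
    description: str
    image_url: str
    url: str

@dataclass
class DataStructureInstance:
    supplement: DigitalSupplement
    supplement_anchors: list  # entity labels that trigger this supplement
    popularity_score: float = 0.0
    relevance_score: float = 0.0

class SupplementDatabase:
    def __init__(self):
        self.instances = []

    def store(self, instance):
        # Per claim 1, storing the instance is what enables image triggering.
        self.instances.append(instance)

    def query(self, entities):
        """Return supplements whose anchors match entities identified in an
        image, ordered by combined popularity and relevance scores (claim 12)."""
        matches = [i for i in self.instances
                   if any(e in i.supplement_anchors for e in entities)]
        matches.sort(key=lambda i: i.popularity_score + i.relevance_score,
                     reverse=True)
        return [m.supplement for m in matches]
```

A visual content query would first extract entity labels from the image and then call `query`, receiving the ordered list that the client renders in its selection panel.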
CN201980022269.0A 2018-06-21 2019-06-21 Digital supplemental association and retrieval for visual search Pending CN112020712A (en)

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
US16/014,512 US10579230B2 (en) 2018-06-21 2018-06-21 Digital supplement association and retrieval for visual search
US16/014,512 2018-06-21
US16/014,520 2018-06-21
US16/014,520 US10878037B2 (en) 2018-06-21 2018-06-21 Digital supplement association and retrieval for visual search
PCT/US2019/036542 WO2019245801A1 (en) 2018-06-21 2019-06-21 Digital supplement association and retrieval for visual search

Publications (1)

Publication Number Publication Date
CN112020712A true CN112020712A (en) 2020-12-01

Family

ID=68983041

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980022269.0A Pending CN112020712A (en) 2018-06-21 2019-06-21 Digital supplemental association and retrieval for visual search

Country Status (5)

Country Link
EP (1) EP3811238A1 (en)
JP (2) JP7393361B2 (en)
KR (2) KR20230003388A (en)
CN (1) CN112020712A (en)
WO (1) WO2019245801A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231385A (en) * 2020-12-11 2021-01-15 湖南新云网科技有限公司 Data collection method, device, equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240037529A1 (en) * 2022-07-27 2024-02-01 Bank Of America Corporation System and methods for detecting and implementing resource allocation in an electronic network based on non-contact instructions

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009505288A (en) * 2005-08-15 2009-02-05 Evryx Technologies, Inc. Use information from images as search criteria for the Internet and other search engines
CN101777064A (en) * 2009-01-12 2010-07-14 鸿富锦精密工业(深圳)有限公司 Image searching system and method
US20110176724A1 (en) * 2010-01-20 2011-07-21 Microsoft Corporation Content-Aware Ranking for Visual Search
US20110196859A1 (en) * 2010-02-05 2011-08-11 Microsoft Corporation Visual Search Reranking
US20120128251A1 (en) * 2009-12-02 2012-05-24 David Petrou Identifying Matching Canonical Documents Consistent with Visual Query Structural Information
CN102576372A (en) * 2009-11-02 2012-07-11 微软公司 Content-based image search
US20120254076A1 (en) * 2011-03-30 2012-10-04 Microsoft Corporation Supervised re-ranking for visual search
US8429173B1 (en) * 2009-04-20 2013-04-23 Google Inc. Method, system, and computer readable medium for identifying result images based on an image query
CN103582884A (en) * 2011-04-14 2014-02-12 高通股份有限公司 Robust feature matching for visual search
CN103959284A (en) * 2011-11-24 2014-07-30 微软公司 Reranking using confident image samples
CN104685501A (en) * 2012-08-08 2015-06-03 谷歌公司 Identifying textual terms in response to a visual query
US20160098426A1 (en) * 2013-05-16 2016-04-07 Yandex Europe Ag Method and system for presenting image information to a user of a client device
US20160224837A1 (en) * 2013-10-25 2016-08-04 Hyperlayer, Inc. Method And System For Facial And Object Recognition Using Metadata Heuristic Search
CN106156063A (en) * 2015-03-30 2016-11-23 阿里巴巴集团控股有限公司 Correlation technique and device for object picture search results ranking
US20170161382A1 (en) * 2015-12-08 2017-06-08 Snapchat, Inc. System to correlate video data and contextual data
CN107111640A (en) * 2014-12-22 2017-08-29 微软技术许可有限责任公司 Method and user interface for auxiliary content to be presented together with image search result
US20170351710A1 (en) * 2016-06-07 2017-12-07 Baidu Usa Llc Method and system for evaluating and ranking images with content based on similarity scores in response to a search query
US20180165370A1 (en) * 2015-06-16 2018-06-14 My EyeSpy Pty Ltd Methods and systems for object recognition

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5014494B2 (en) * 2011-01-21 2012-08-29 パナソニック株式会社 Information processing apparatus, augmented reality system, information processing method, and information processing program
US20130129142A1 (en) * 2011-11-17 2013-05-23 Microsoft Corporation Automatic tag generation based on image content
US9927949B2 (en) * 2013-05-09 2018-03-27 Amazon Technologies, Inc. Recognition interfaces for computing devices
US10235387B2 (en) * 2016-03-01 2019-03-19 Baidu Usa Llc Method for selecting images for matching with content based on metadata of images and content in real-time in response to search queries


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
彭绍武; 刘乐元; 杨雄; 桑农: "Visual knowledge representation and annotated image database based on graph grammar", Application Research of Computers, no. 02, pages 353-357
齐云飞; 赵宇翔; 朱庆华: "Research on the application of linked data in a digital library mobile visual search system", Data Analysis and Knowledge Discovery, no. 01, pages 87-96


Also Published As

Publication number Publication date
EP3811238A1 (en) 2021-04-28
KR20230003388A (en) 2023-01-05
JP2022110057A (en) 2022-07-28
KR20200136030A (en) 2020-12-04
JP2021522614A (en) 2021-08-30
JP7393361B2 (en) 2023-12-06
WO2019245801A1 (en) 2019-12-26

Similar Documents

Publication Publication Date Title
US11023106B2 (en) Digital supplement association and retrieval for visual search
US11417066B2 (en) System and method for selecting targets in an augmented reality environment
US11640431B2 (en) Digital supplement association and retrieval for visual search
US10540378B1 (en) Visual search suggestions
JP6502923B2 (en) Recognition interface for computing devices
WO2020092093A1 (en) Visual attribute determination for content selection
CN105009113A (en) Queryless search based on context
US20210042809A1 (en) System and method for intuitive content browsing
US20220155940A1 (en) Dynamic collection-based content presentation
JP2022110057A (en) Digital supplement association and retrieval for visual search
EP4224339A1 (en) Intelligent systems and methods for visual search queries
WO2021173147A1 (en) System and method for playback of augmented reality content triggered by image recognition
US10621237B1 (en) Contextual overlay for documents
WO2017210610A1 (en) Quick trace navigator
US10437902B1 (en) Extracting product references from unstructured text
US11514082B1 (en) Dynamic content selection
JP7382847B2 (en) Information processing method, program, and information processing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination