CN110083739B - System and method for addressing media databases using distance associative hashing

Info

Publication number: CN110083739B
Application number: CN201811395356.4A
Authority: CN
Inventors: 泽夫·纽梅尔; 布莱恩·里德
Original assignee: Vizio Inscape Technologies LLC
Current assignee: Inscape Data Inc
Priority date: 2013-03-15
Filing date: 2014-03-17
Publication date: 2024-04-30
Anticipated expiration: 2034-03-17
Also published as: BR112015023369A2; MX2015012510A; MX356884B; CN110083739A; CN105144141B; BR112015023369B1; CA2906199A1; CN105144141A; CL2015002621A1; WO2014145929A1; HK1217794A1; CA2906199C; MX2020001441A

Abstract

A system, method and computer program utilize a distance associative hashing algorithm approach to provide an efficient means of quickly addressing large databases. The indexing means may be readily subdivided into a plurality of independently addressable segments, wherein each such segment may address a portion of the relevant data of the database, wherein an index to the subdivision of said portion resides entirely within the main memory of each of a plurality of server devices. The resulting cluster of server devices, each hosting one addressable sector of a larger database of searchable audio or video information, provides significant improvements in latency and scalability of the automated content recognition system, among other uses.

Description

System and method for addressing media databases using distance associative hashing

This application is a divisional of the patent application filed on March 17, 2014, with international application number PCT/US2014/030782 and national application number 201480017043.9, entitled "System and method for addressing media databases using distance associative hashing".

Priority claim

The present application constitutes a continuation-in-part of U.S. patent application Ser. No. 12/788,721, entitled "method for identifying video clips and displaying contextual targeted content on a connected television," filed May 27, 2010 and issued as U.S. Patent No. 8,595,781, which is a non-provisional application claiming the benefit of U.S. provisional patent application Ser. No. 61/182,334, entitled "SYSTEM FOR PROCESSING CONTENT INFORMATION IN A TELEVISION VIDEO SIGNAL," filed May 29, 2009, and claiming the benefit of U.S. provisional patent application Ser. No. 61/290,714, entitled "contextual targeting based on data received from a television system," filed December 29, 2009; the present application further constitutes a continuation-in-part of U.S. patent application Ser. No. 12/788,748, entitled "METHODS FOR DISPLAYING CONTEXTUALLY TARGETED CONTENT ON A CONNECTED TELEVISION," filed on May 27 and issued as U.S. Patent No. 8,769,584 on July 1, 2014; the present application further constitutes a continuation-in-part of U.S. patent application Ser. No. 14/089,003, entitled "METHODS FOR IDENTIFYING VIDEO SEGMENTS AND DISPLAYING CONTEXTUALLY TARGETED CONTENT ON A CONNECTED TELEVISION," filed on November 25, 2013 and issued as U.S. Patent No. 8,898,714; the present application further constitutes a continuation-in-part of the U.S. patent application entitled "SYSTEMS AND METHODS FOR IDENTIFYING VIDEO SEGMENTS FOR DISPLAYING CONTEXTUALLY RELEVANT CONTENT," filed on March 17, 2014; the present application further constitutes a continuation-in-part of U.S. patent application Ser. No. 14/217,094, entitled "SYSTEMS AND METHODS FOR REAL-TIME TELEVISION AD DETECTION USING AN AUTOMATED CONTENT RECOGNITION DATABASE," filed on March 17, 2014 and issued as U.S. Patent No. 8,930,980 on January 6; the present application further constitutes a continuation-in-part of U.S. patent application Ser. No. 14/217,375, entitled "SYSTEMS AND METHODS FOR ON-SCREEN GRAPHICS DETECTION," filed on March 17, 2014; the present application further constitutes a continuation-in-part of U.S. patent application Ser. No. 14/217,425, entitled "SYSTEMS AND METHODS FOR IMPROVING SERVER AND CLIENT PERFORMANCE IN FINGERPRINT ACR SYSTEMS," filed on March 17, 2014; the present application further constitutes a continuation-in-part of U.S. patent application Ser. No. 14/217,435, entitled "SYSTEMS AND METHODS FOR MULTI-BROADCAST DIFFERENTIATION," filed on March 17, 2014; and the present application further constitutes a continuation-in-part of U.S. patent application No. 61/791,578, entitled "SYSTEMS AND METHODS FOR IDENTIFYING VIDEO SEGMENTS BEING DISPLAYED ON REMOTELY LOCATED TELEVISIONS," filed on March 15, 2013. The foregoing applications are either currently co-pending or are applications of which a currently co-pending application is entitled to the benefit of the filing date.

Technical Field

The present invention relates generally to matching unknown media data (e.g., video or audio clips) to a massive database of reference media files.

Background

Systems for Automated Content Recognition (ACR) of audio or video media are well known to those skilled in the art. However, such ACR systems present a number of technical challenges, including managing a potentially very large database of encoded audio or video information and managing the large index needed to address the information in the database.

It is also well known to those skilled in the art that large database indexes (as may be used in the present invention) may be generated using certain hash functions. Another method of addressing databases may be by applying a binary tree structure, also known as a b-tree. Both methods are commonly used in data management systems.

Regardless of the method employed to index a large database, the index is often too large to reside in its entirety in the main memory of a computer server as used in a typical ACR system. When the database cannot be fully assembled in the memory of a computer system, it is typically stored on disk storage, and then portions of the database are read into memory in blocks corresponding to index values that provide addresses. The means of recalling portions of database information is also known to those skilled in the art as "paging," a process common to many different computer software systems.

The present invention is an extension of the invention cited above and is a system and method for matching unknown digital media (such as television programming) to a database of known media using signal processing means employing a modified path tracking algorithm (as described in the first invention).

Another novel aspect of the systems and methods disclosed herein is a distance associative hash index that may be subdivided into a plurality of independently addressable segments, wherein each segment may address a portion of a database and each segment may reside in its entirety in the main memory of a server device. In the resulting cluster of servers, each server hosts a sector of the index that addresses the associated data of a larger database of searchable audio or video information. This indexing approach results in a significant improvement in the speed and accuracy of an ACR system, which is thereby enabled to identify unknown media even while the user of the television is changing channels, rewinding, fast forwarding, or even pausing video from a digital video recorder.

Disclosure of Invention

In some embodiments, an exemplary method involving addressing a media database using distance associative hashing may include receiving one or more indications of a sample of a video segment; determining, for at least one tile of the sample of the video segment comprising one or more pixels, an algorithmically derived value of the one or more pixels of each tile; subtracting a median point value established for each tile from the average value for each tile; transforming the values resulting from the subtraction using a function pre-derived to distribute the values evenly; constructing a hash value from the transformed values; referencing a plurality of most significant bits of the constructed hash value to determine a database sector; and storing at least the hash value on the determined database sector.

In some embodiments, at least one of the receiving, determining, subtracting, transforming, constructing, referencing, or storing of the foregoing exemplary method is at least partially implemented using one or more processing devices. In some embodiments of the foregoing exemplary method, receiving one or more indications of a sample of a video segment may include receiving one or more indications of at least one of a frame or a still image. In some embodiments of the foregoing exemplary method, receiving one or more indications of a sample of a video segment may include receiving one or more indications of a sample of a video segment associated with at least one indication of a channel, at least one indication of the video segment, and at least one indication of a time code offset from the beginning of the video segment.

In some embodiments of the foregoing exemplary method, determining, for at least one tile of the sample of the video segment comprising one or more pixels, an algorithmically derived value of the one or more pixels of each tile may include determining, for the at least one tile, an average value of the one or more pixels of each tile. In some embodiments of the foregoing exemplary method, subtracting the median point value established for each tile from the average value for each tile may include subtracting a median point value that has been previously determined for each tile using data from that tile across a plurality of channels over at least one time period.

In some embodiments of the foregoing exemplary method, transforming the values resulting from the subtraction using a function pre-derived to distribute the values evenly may include forming a variable matrix including at least the values resulting from the subtraction; obtaining a static matrix that, when crossed with the variable matrix, will more evenly distribute the transformed values; and calculating a dot product of the variable matrix and the static matrix, the dot product comprising at least the more evenly distributed transformed values. In some embodiments of the foregoing exemplary method, obtaining the static matrix may include determining the static matrix based at least in part on one or more previously obtained hash values using locality-sensitive hashing.

In some embodiments of the foregoing exemplary method, constructing the hash values from the transformed values may include constructing the hash values from the transformed values, including reducing fidelity of the transformed values by at least reducing each transformed value to a binary representation. In some embodiments of the foregoing exemplary method, reducing the fidelity of each transformed value by reducing the transformed value to a binary representation may include determining for each transformed value whether the transformed value is a positive number and if the transformed value is a positive number, assigning a one to the hash value and otherwise assigning a zero to the hash value.

In some embodiments of the foregoing exemplary method, referencing a plurality of most significant bits of the constructed hash value to determine the database sector may comprise referencing a plurality of most significant bits of the constructed hash value to determine a database server, wherein the number of most significant bits is predetermined to address a plurality of database servers, and wherein the number of database servers associated with those most significant bits is established such that at least one index associated with a database sector is capable of residing entirely in the memory of the corresponding database server. In some embodiments of the foregoing exemplary method, storing at least the hash value on the determined database sector may include storing at least one indication of a channel, at least one indication of a video segment, and at least one indication of a time code offset from the beginning of the video segment at a database location based at least in part on the hash value.
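The sector addressing described above can be sketched as follows. This is only an illustration under stated assumptions: the 16-bit hash width, the use of 4 most significant bits (16 sectors), and the names `sector_for_hash` and `store` are hypothetical and are not specified by the disclosure.

```python
HASH_BITS = 16    # assumed hash width, for illustration only
SECTOR_BITS = 4   # assumed number of most significant bits used for addressing


def sector_for_hash(hash_value, hash_bits=HASH_BITS, sector_bits=SECTOR_BITS):
    """Return the sector (server) index held in the top bits of the hash."""
    if not 0 <= hash_value < (1 << hash_bits):
        raise ValueError("hash_value does not fit in hash_bits")
    return hash_value >> (hash_bits - sector_bits)


def store(sectors, hash_value, channel, segment, offset):
    """Store (channel, segment, time code offset) under the hash in its sector.

    `sectors` is a dict standing in for a cluster of servers: each key is a
    sector index and each value is that sector's in-memory index.
    """
    sector = sector_for_hash(hash_value)
    bucket = sectors.setdefault(sector, {}).setdefault(hash_value, [])
    bucket.append((channel, segment, offset))
    return sector
```

Because only the top bits choose the sector, each server needs to hold only the portion of the index whose hashes share its prefix, which is what allows that portion to stay in main memory.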

In one or more alternative embodiments of the foregoing exemplary methods, a plurality of related systems include, but are not limited to, circuitry and/or programming for implementing the method embodiments referenced herein; the circuitry and/or programming may be virtually any combination of hardware, software, and/or firmware configured to effect the herein-referenced method aspects depending upon the design choices of the system designer.

In various embodiments, an exemplary method involving addressing a media database using distance-associative hashing may include receiving a hint, the hint constructed by one or more operations associated with a media storage operation; referencing a plurality of most significant bits of the hint received to determine a database sector; and returning at least one indication of at least one candidate from the database sector based at least in part on the received hint.

In some embodiments of the foregoing exemplary methods, receiving a hint (the hint being constructed by one or more operations associated with a media storage operation) may include receiving a hint associated with a sample of a video buffer of a client system, including receiving at least one or more indications related to a time of day associated with the sample of the video buffer of the client system. In some embodiments of the foregoing exemplary methods, receiving a hint (the hint being constructed by one or more operations associated with a media storage operation) may include receiving a hint associated with a sample of a video buffer of a client system, the hint determined at least in part by hashing at least some values associated with the video buffer.

In some embodiments of the foregoing exemplary method, receiving a hint associated with a sample of a video buffer of a client system, the hint being determined at least in part by hashing at least some values associated with the video buffer, may include receiving a hint for which the hashing is based at least in part on one or more of at least one operand or at least one algorithm that is also used in an associated media storage operation. In some embodiments of the foregoing exemplary method, receiving a hint (the hint being constructed by one or more operations associated with a media storage operation) may include receiving a hint determined by one or more operations including at least: receiving one or more indications of at least one item of content of a video buffer of a client system; determining, for at least one tile of the at least one item of content of the video buffer comprising one or more pixels, an algorithmically derived value of the one or more pixels of each tile; subtracting a median point value from the average value for each tile; transforming the values resulting from the subtraction; constructing a hash value from the transformed values; and associating the hint at least in part with the constructed hash value, wherein at least one of the determining, subtracting, transforming, or constructing operations utilizes one or more of at least one operand or at least one algorithm that is also used in the associated media storage operation.

In some embodiments of the foregoing exemplary method, returning at least one indication of at least one candidate from the database sector based at least in part on the received hint may include returning the at least one indication based at least in part on a probabilistic point location in equal balls ("PPLEB") algorithm as a function of the received hint. In some embodiments of the foregoing exemplary method, returning at least one indication of at least one candidate from the database sector based at least in part on the received hint may include returning at least one indication of at least one candidate that is within a predetermined inverse percentage distribution radius of the received hint.

In various embodiments, an exemplary method involving addressing a media database using distance associative hashing may include receiving at least one indication of at least one candidate and at least one indication of at least one hint; adding a token to a bin associated with the at least one received candidate; determining whether the number of tokens in the bin exceeds a value associated with a probability that the client system is displaying a particular video segment associated with the at least one hint; and, if the number of tokens in the bin exceeds that value, returning at least some data associated with the particular video segment based at least in part on the bin.

In some embodiments of the foregoing exemplary method, adding a token to the bin associated with the at least one received candidate may include adding a token to a time bin associated with the at least one received candidate. In some embodiments of the foregoing exemplary method, adding a token to the bin associated with the at least one received candidate may include determining a relative time, including at least subtracting a candidate time associated with the at least one candidate from an arbitrary time associated with the at least one hint; and adding a token to a time bin associated with the candidate based at least in part on the determined relative time. In some embodiments of the foregoing exemplary method, the method may include removing one or more tokens from the time bin based at least in part on an elapsed time period.
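The token and time-bin scoring above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the class name `TimeBins`, the one-second bin width, the token-expiry rule, and the threshold values are all assumptions introduced here.

```python
from collections import defaultdict


class TimeBins:
    """Hypothetical leaky time bins for scoring candidates against hints."""

    def __init__(self, threshold, max_age):
        self.threshold = threshold      # tokens needed before declaring a match
        self.max_age = max_age          # seconds a token survives in a bin
        self.bins = defaultdict(list)   # (candidate id, relative time) -> arrival times

    def add_token(self, candidate_id, candidate_time, hint_time, now, bin_width=1.0):
        # Relative time: candidate time subtracted from the time of the hint.
        relative = hint_time - candidate_time
        key = (candidate_id, round(relative / bin_width))
        self.bins[key].append(now)
        return key

    def leak(self, now):
        # Remove tokens older than max_age; drop bins that become empty.
        for key in list(self.bins):
            kept = [t for t in self.bins[key] if now - t <= self.max_age]
            if kept:
                self.bins[key] = kept
            else:
                del self.bins[key]

    def matches(self):
        # Bins whose token count exceeds the threshold.
        return [k for k, tokens in self.bins.items() if len(tokens) > self.threshold]
```

Successive hints from the same program played at normal speed yield the same relative time, so their tokens accumulate in a single bin, while unrelated candidates scatter across many bins and leak away.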

In various embodiments, an exemplary system involving addressing a media database using distance associative hashing may include, but is not limited to: one or more computing devices; and one or more instructions that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receive at least one rasterized video stream; create at least one hash value associated with at least one sample of the at least one received rasterized video stream; determine at least one database sector for storing the created at least one hash value; and store the created at least one hash value on the determined at least one database sector.

In various embodiments, an exemplary system involving addressing a media database using distance associative hashing may include, but is not limited to: one or more computing devices; and one or more instructions that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receive one or more indications associated with at least one video buffer of at least one client system; determine a hint based at least in part on the at least one video buffer and at least one time associated with the at least one video buffer, wherein one or more of at least one operand or at least one function associated with determining the hint is also used in an associated media storage operation; reference a plurality of most significant bits of the determined hint to determine a database sector; and return at least one indication of at least one candidate from the determined database sector based at least in part on the determined hint.

In various embodiments, an exemplary system involving addressing a media database using distance associative hashing may include, but is not limited to: one or more computing devices; and one or more instructions that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receive at least one indication of at least one candidate and at least one indication of at least one hint; add a token to a bin associated with the at least one received candidate; determine whether the number of tokens in the bin exceeds a value associated with a probability that the client system is receiving a particular video segment associated with the received at least one hint; and, if the number of tokens in the bin exceeds that value, return at least some data associated with the particular video segment based at least in part on the bin.

In addition to the foregoing, various other method, system, and/or program product embodiments are set forth and described in the teachings of this disclosure, such as the text (e.g., claims and/or detailed description) and/or the drawings.

The foregoing is a summary and thus contains, by necessity, simplifications, generalizations and omissions of detail; those skilled in the art will recognize that this summary is illustrative only and is not intended to be in any way limiting. Other aspects, embodiments, features, and advantages of the devices and/or processes and/or other subject matter described herein will become apparent in the teachings set forth herein.

Drawings

Certain embodiments of the invention are described in more detail below with reference to the following figures.

Fig. 1 illustrates the construction of a sectorized video match database as taught by the present invention beginning with an initial video ingest or capture process that is then continuously updated. A television display system 101 and its corresponding television display memory buffer 103 are shown for a potential embodiment of the system. The assignment of pixel tiles 102 and the computation of values 105 using some algorithmic means known to those skilled in the art is done for each pixel tile, and the resulting data structure is created and then time stamped to produce a "hint" 106, which may also have additional metadata associated with it.

Fig. 2 illustrates the use of a distance associative hashing process to process hint data 201 and generate a hash index 202, and further illustrates a sectorized addressing scheme 203 to store data in an associated group (bucket) 206.

Fig. 3 illustrates the real-time capture of unknown television content for identification from a connected television monitor or the like 301. Pixel tiles are generally defined as square pixel regions of the video buffer 303, approximately ten pixels wide by ten pixels high 304; however, any reasonable shape and size may be used. The number of pixel tile locations may be anywhere between ten and fifty within the video buffer, and the tiles are processed 305 to send hint data 306 to a central server device.

Fig. 4 illustrates the extraction of a plurality of candidate hint values 401 from a reference (matching) database memory bucket 404 and the application of the hint values 403 to a path trace content matching process 402 as taught in the first invention referenced above.

Fig. 5 illustrates a data structure of bins holding tokens for scoring candidate values from the matching database. As the search process proceeds through time, the bin is "leaky" and the token expires over time.

FIG. 6 illustrates a typical memory paging scheme for accessing large databases as taught by the prior art.

Fig. 7 illustrates the creation of a hash value involving several steps starting with calculating the median value for each of a number of points that make up those samples from a video frame.

Fig. 8 shows how the hash value is calculated.

Fig. 9 illustrates the beneficial results of using the median of the pixel locations as part of the process of calculating the hash value.

Fig. 9a illustrates the problem of not using a median value when partitioning a multi-dimensional dataset.

Fig. 9b demonstrates the benefit of finding the median value of the dataset.

FIG. 10 illustrates an operational flow representative of a number of example operations related to addressing a media database using distance-associative hashing.

Fig. 11 illustrates an alternative embodiment of the operational flow of fig. 10.

Fig. 12 illustrates an alternative embodiment of the operational flow of fig. 10.

Fig. 13 illustrates an alternative embodiment of the operational flow of fig. 10.

Fig. 14 illustrates an alternative embodiment of the operational procedure of fig. 10.

Fig. 15 illustrates an alternative embodiment of the operational flow of fig. 10.

Fig. 16 illustrates an alternative embodiment of the operational flow of fig. 10.

Fig. 17 illustrates an alternative embodiment of the operational flow of fig. 10.

Fig. 18 illustrates an alternative embodiment of the operational flow of fig. 10.

Fig. 19 illustrates an alternative embodiment of the operational flow of fig. 10.

FIG. 20 illustrates different operational flows representing a number of example operations relating to addressing a media database using distance-associative hashing.

Fig. 21 illustrates an alternative embodiment of the operational procedure of fig. 20.

Fig. 22 illustrates an alternative embodiment of the operational procedure of fig. 20.

Fig. 23 illustrates an alternative embodiment of the operational procedure of fig. 20.

FIG. 24 illustrates another operational flow representing example operations related to addressing a media database using distance-associative hashing.

Fig. 25 illustrates an alternative embodiment of the operational procedure of fig. 24.

Fig. 26 illustrates an alternative embodiment of the operational procedure of fig. 24.

FIG. 27 illustrates a system that addresses a media database using distance-associative hashing.

FIG. 28 illustrates another system for addressing a media database using distance-associative hashing.

FIG. 29 illustrates yet another system for addressing a media database using distance-associative hashing.

Detailed Description

As described in the foregoing disclosure, the first invention, to which the present invention relates, is a system and method for matching unknown video to a database of known video using novel signal processing means employing a modified path tracking algorithm, as well as other means.

The novel approach of the new invention is its distance associative hashing, accompanied by database access using sectorized indexes. This indexing means provides a computationally efficient way of matching unknown media segments to a reference database of known media (e.g., audio or video content).

This indexing approach results in a significant improvement in the speed and accuracy of the ACR system, which is thereby enabled to track the identity of the media even while the user of the television is changing channels, rewinding, fast forwarding, or even pausing video from a digital video recorder.

The creation, updating, and subsequent access of the media matching database are described below for a system that can generate and address a sectorized database, so that each database sector can reside in the main memory of a corresponding one of a number of server devices without resorting to paging within each of those server devices. This common approach of addressing a sectorized database by locality-sensitive hashing provides a significant improvement in operating efficiency.

Construction of the sectorized video match database begins with the process shown in fig. 1. Television system 101 decodes the television signal and places the content of each video frame into a video frame buffer in preparation for display or further processing of the pixel information for that video frame. The television system may be any television decoding system that can decode a television signal either from baseband or from a modulated television source and fill a video frame buffer with decoded RGB values at a corresponding frame size specified by the video signal. Such systems are well known to those skilled in the art.

The system of the present invention first builds, and then continuously updates, a reference database of television program fingerprints, described in the original application as hints or hint values. To build this reference database of video hints, the invention acquires one or more video tiles 102 read from the video frame buffer 103. The video tiles may be of any shape or pattern, but for the purposes of this example are 10 pixels in the horizontal direction by 10 pixels in the vertical direction. Also for purposes of this example, it is assumed that there are 25 pixel tile locations in the video frame buffer, evenly distributed within the boundaries of the buffer, although they need not be evenly distributed. Each pixel is composed 104 of red, green, and blue values, typically represented by an eight-bit binary value for each color, for a total of 24 bits or three bytes per pixel.

This composite data structure is populated with average pixel values from multiple pixel tile locations of the video buffer. Pixel tiles are defined as generally square regions of pixels of a video buffer, approximately ten pixels wide by ten pixels high 304. The number of pixel tile locations may typically be between ten and fifty within the video buffer.

These average pixel values 305 are combined with a time code 306 referencing the "time of day" from the processor means of the television system. Time of day is defined as the time in seconds, including fractional seconds, that has elapsed since midnight on January 1, 1970, which is an accepted convention in computing systems, particularly Unix (or Linux) based systems.

Furthermore, as taught in the parent application, metadata may be included in the data structure 106, which is referred to as a fingerprint, "hint", or "point". Such metadata attributes may be derived from closed caption data of the video program currently being displayed, or they may be keywords extracted by a speech recognition system running within the processor means of the television system that converts audio from the corresponding television program into text information. The text information may then be searched for relevant keywords or sent in its entirety, as part of the hint data structure, to a central server device for further processing.
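As an illustration of the hint data structure described above, the following sketch combines tile averages, a time-of-day stamp, and optional metadata. The class and field names (`Hint`, `averages`, `time_of_day`, `metadata`) and the `make_hint` helper are hypothetical; the disclosure does not prescribe a concrete layout.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Hint:
    averages: list       # flattened per-tile average (R, G, B) values
    time_of_day: float   # seconds, with fraction, since midnight on January 1, 1970
    metadata: dict = field(default_factory=dict)  # e.g. closed-caption keywords


def make_hint(averages, clock=time.time, metadata=None):
    """Build a hint, time stamping it with the supplied clock (assumed helper)."""
    return Hint(averages=list(averages),
                time_of_day=clock(),
                metadata=dict(metadata or {}))
```

The clock is passed as a parameter so that a fixed time source can be substituted, for example when replaying recorded frames.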

Those hint records 201 are passed, in FIG. 2, to a hash function 202 that generates a hash value 203 using a locality sensitive hashing algorithm based on the probabilistic point location in equal balls ("PPLEB") algorithm. This hash value is calculated from the average pixel values of the hint record (fingerprint) 207, and the process associates 206 data with similar values into groups called buckets.

The ten by ten pixel tile 302 shown in this particular example will have one hundred pixels and mathematically taking the average yields an average pixel value 305 for the red, green, and blue values, respectively. Alternatively, any averaging function may be used instead of a simple average.

A plurality of such pixel tiles are extracted from the video frame. For example, if 25 such tiles of pixels are extracted from a video frame, the result will be a point representing a position within 75-dimensional space. The skilled person will appreciate that such a large search space may require a significant amount of computational resources to locate any one value, even approximately, together with the other 74 values representing a video frame.
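The reduction of a frame to a 75-dimensional point can be sketched as follows. This is a minimal illustration assuming a frame stored as rows of (red, green, blue) tuples and a simple arithmetic mean; the function names, the synthetic frame, and the 5 by 5 grid of tile origins are hypothetical.

```python
def average_tile(frame, top, left, size=10):
    """Return the mean (r, g, b) of a size x size tile of an RGB frame.

    `frame` is a list of rows; each row is a list of (r, g, b) tuples.
    """
    sums = [0, 0, 0]
    for y in range(top, top + size):
        for x in range(left, left + size):
            for c in range(3):
                sums[c] += frame[y][x][c]
    count = size * size
    return tuple(s / count for s in sums)


def frame_to_point(frame, tile_origins, size=10):
    """Concatenate the per-tile averages into one flat point."""
    point = []
    for top, left in tile_origins:
        point.extend(average_tile(frame, top, left, size))
    return point


# Synthetic 50 x 50 frame and 25 tile origins (assumed layout).
frame = [[(x % 256, y % 256, (x + y) % 256) for x in range(50)]
         for y in range(50)]
tile_origins = [(row * 10, col * 10) for row in range(5) for col in range(5)]
point = frame_to_point(frame, tile_origins)  # 25 tiles x 3 colors
```

With 25 tiles and three color averages per tile, `point` has 75 coordinates, one per dimension of the search space.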

The systems and methods of distance-associative hashing described herein have the advantage of reducing computational load and improving the accuracy of matching unknown video frames to a database of known video frames.

Creating a hash value involves several steps, starting with calculating an algorithmically derived value for each point, as shown at 701 through 775 in fig. 7. A useful means of algorithmically deriving the median value has been found: summing each point of each frame of each program stream or video channel maintained by the matching database over a period of about 24 hours. The median value for each point is found from the summation process. The next step in deriving the final hash value is to subtract the median value from the point value for each corresponding point: row 801 minus row 802 equals row 803. The result is a positive or negative value to which the pre-derived hash function is applied. Typically, the results of subtracting the median value of the corresponding points from the point values are arranged in a matrix, and the dot product of that matrix with a similar matrix of pre-derived hash values (or hash keys) is calculated. Each element of the resulting product matrix is then further transformed to one or zero based on its sign. Typically, the skilled person will set a positive value to one and a negative value to zero.
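The steps just described (subtract the per-point median, take dot products with a pre-derived matrix, keep only the sign) may be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the small `hash_matrix` and `medians` values are made up for the example rather than derived from any real data.

```python
def hash_bits(point, medians, hash_matrix):
    """Return one bit per row of hash_matrix.

    A bit is 1 when the dot product of the median-centered point with
    that row is positive, and 0 otherwise.
    """
    centered = [p - m for p, m in zip(point, medians)]
    bits = []
    for row in hash_matrix:
        dot = sum(c * w for c, w in zip(centered, row))
        bits.append(1 if dot > 0 else 0)
    return bits


def bits_to_int(bits):
    """Pack bits into an integer, first bit most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


# Toy 3-dimensional example; a real point might have 75 dimensions.
medians = [128, 128, 128]
hash_matrix = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, -1, 0],
]
point = [200, 100, 128]
bits = hash_bits(point, medians, hash_matrix)
hash_value = bits_to_int(bits)
```

Here the centered point is (72, -28, 0), so the four dot products are 72, -28, 0 and 100, giving the bit string 1001.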

The generated hash values point to values that are more or less evenly distributed across the data storage area. The hash value 203 may be further partitioned (FIG. 2) such that the 'n' most significant bits 205 address one of the 2^n sectors of the database. The remaining bits 206 address the respective 'buckets' of the addressed sector of the database, as will be described in more detail below.
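The partitioning of a hash value into a sector address and a bucket address can be illustrated with simple bit operations. The function name `split_hash` and the 16-bit hash width are assumptions chosen for the example.

```python
def split_hash(hash_value, total_bits, sector_bits):
    """Split a hash into (sector_id, bucket_id).

    The `sector_bits` most significant bits select one of
    2**sector_bits sectors; the remaining low-order bits select the
    bucket within that sector.
    """
    bucket_bits = total_bits - sector_bits
    sector_id = hash_value >> bucket_bits
    bucket_id = hash_value & ((1 << bucket_bits) - 1)
    return sector_id, bucket_id


# A 16-bit hash whose two most significant bits address one of 4 sectors.
sector_id, bucket_id = split_hash(0b1011010011110000,
                                  total_bits=16, sector_bits=2)
```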

The division points of the hash values that define the respective sector address spaces are calculated such that the index to the data of each database sector fits within the memory boundaries of the processor system hosting that sector. Otherwise, the database would be subject to paging, which would reduce the effectiveness of this process.

To compare the system and method taught by the present invention with those known to persons skilled in the relevant art, fig. 6 illustrates a typical paging scheme. In fig. 6, assume that an example system is attempting to match unknown data to a database of known data. The index 602 is used to address only the portion of the data 605 that fits in the main memory of the CPU 606. This data is searched and, if the result is negative, another piece of data is fetched into main memory 603 and the search continues.

This access means using paging is common but significantly reduces the efficiency of the computer system. In practice, this approach cannot be used with ACR systems that search large media databases because the read/write speed of the hard disk drive is not sufficient to keep up with the task. Many different algorithmic approaches have been developed over the years to address this problem of splitting a search into multiple smaller portions and distributing the smaller searches to multiple computer server systems.

A well-known example might be the very large Google search engine. The skilled person knows that this system is one of the largest computer systems built so far. The Google search process is excellent in speed and accuracy. However, the Google search means is significantly different and is completely unsuitable for matching unknown media against databases of known media, even if the two databases are of the same very large size. This is because the Google search means employs a map-reduce algorithm designed to search a large database of substantially uncorrelated data. While more advanced than paging systems, map-reduce is a computationally intensive process that also requires significant data communication bandwidth between participating computer systems. In contrast, the present invention is efficient in its use of processing resources and communication resources.

In the present invention, the distance-associative hash function provides a means of addressing databases by sector, such that the data of the addressing means is assembled in the main memory of the individual server devices of the server farm. The grouping is accomplished by using a distance-associative hashing step to place data that is related by distance in a multi-dimensional array into the same sector. The sector identity used to address a data element is calculated from the hash index generated by that process: a subset of the total bits of the hash function is extracted and used to address the desired sector in which the data is stored in the reference database.

In this way, the hash index subset is the address of the sector that contains the distance-associated hash values (referred to as buckets in the first invention). The remainder of the hash address is then used to address the bucket of the sector for storing new data. Alternatively, the sector address may be found by hashing the first hash value again.

Such a database addressing system and method by multiple hash indexing steps results in an efficient database access scheme with significant performance benefits and improved efficiency over the conventional database access methods described above.

Distance-associative hashing provides a means to rapidly address very complex (multidimensional) databases by finding data that is not a perfect match but that is within a predetermined radius of the sought value (distance-associative). Importantly, this addressing means will sometimes result in no match at all. Whereas a business-oriented database cannot tolerate inaccuracy, a media matching system can easily tolerate missed matches and will simply continue the matching process when the next data arrives, as taught in the first patent. Of course, data from an unknown source to be identified by the ACR system arrives periodically, but the system of the present invention may command it to arrive at different intervals based on accuracy requirements or on requirements imposed by the state of the system (e.g., when the system is approaching overload, the sending clients are commanded to send at a lower sample rate). For example, a typical data reception rate may be one sample every 1/10 second.

For a reference media database, a set of pixel values is derived from each video frame from each video source to be part of the reference database. Then, the broadcast time of the video program and certain metadata, which is information of the program such as a content Identification (ID), a program title, an actor name, a broadcast time, a brief outline, etc., are attached to the pixel value group. The metadata is typically obtained from a commercial electronic program guide source.

Then, the array of processed pixel values with the time code plus metadata added is stored in a reference database, and then the address of the stored data is added to the hash index at the corresponding hash value and sector ID value. In addition, a second database index is created and maintained by using the content ID from the metadata as another means for addressing the reference database.
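The two addressing paths described above (a hash index keyed by sector and bucket, plus a second index keyed by content ID) may be sketched with a toy in-memory store. The class `ReferenceStore`, its method names, and the record layout are hypothetical and serve only to illustrate the indexing idea; a real system would distribute sectors across servers.

```python
from collections import defaultdict


class ReferenceStore:
    """Toy in-memory store with a hash index and a content-ID index."""

    def __init__(self):
        self.records = []                       # stored hint records
        self.hash_index = defaultdict(list)     # (sector, bucket) -> addresses
        self.content_index = defaultdict(list)  # content_id -> addresses

    def ingest(self, sector_id, bucket_id, values, media_time, metadata):
        """Store a record, then add its address to both indexes."""
        address = len(self.records)
        self.records.append({"values": values,
                             "media_time": media_time,
                             "metadata": metadata})
        self.hash_index[(sector_id, bucket_id)].append(address)
        self.content_index[metadata["content_id"]].append(address)
        return address

    def candidates(self, sector_id, bucket_id):
        """Return all records that share the given sector and bucket."""
        addresses = self.hash_index.get((sector_id, bucket_id), [])
        return [self.records[a] for a in addresses]


store = ReferenceStore()
a0 = store.ingest(2, 7, [1, 2, 3], 12.0, {"content_id": "show-1"})
a1 = store.ingest(2, 7, [1, 2, 4], 12.1, {"content_id": "show-1"})
```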

The process of creating and continuously updating the database is continuous and the number of days of data maintained by the database is based on the needs of the user but may range from one day to one month, for example.

The process of identifying unknown video segments from data received from a large number of client devices begins with a procedure similar to the procedure used to build the reference database above. In fig. 3, this program relates to a television monitor 301, such as a popular flat screen HDTV, typically of the type known as a smart TV, where the TV contains a processing device with a memory capable of executing applications of a type similar to those found on commonly used smart phones today. The system of the present invention samples the region 302 of the video frame buffer 301 in a generally large number of locations. The samples are of exactly the same size, shape and location as those pixel tiles used in the process of building the reference database. Each of the collected pixel tiles is then algorithmically processed to produce calculated values for each tile's red, green, and blue values in exactly the same manner as the method used to create the reference database.

The system of the present invention then calculates a distance-associated hash index of the collected average that is exactly the same as the content ingestion function described above. Also exactly the same as the ingestion system described above, the generated sector Identification (ID) value is extracted as a subset of the total bits of the hash index. The remainder of the hash index is used to address the desired sector to search for all candidate (potential) matches therein that belong to the same bucket as the unknown data point.

Alternatively, if a good guess for a match (a successful match) is available from the above process, the system of the present invention will also collect candidates from the database sector responsible for the potential content ID, as created during ingestion as described above, using the content ID index to address the multiple reference cues within the time radius r' of the timestamp (of the successfully matched candidate). Next, as taught in the first patent, duplicate candidates are removed, as are candidates that are too far (beyond radius r) from the unknown point.
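The pruning step may be sketched as follows. This is a minimal sketch assuming Euclidean distance and assuming that two candidates are duplicates when they share a content ID and media time; both choices, and the dictionary layout of a candidate, are assumptions made for illustration.

```python
import math


def prune_candidates(unknown, candidates, radius):
    """Drop duplicates and candidates farther than `radius` from `unknown`."""
    seen = set()
    kept = []
    for cand in candidates:
        key = (cand["content_id"], cand["media_time"])
        if key in seen:
            continue  # duplicate of a candidate already examined
        seen.add(key)
        if math.dist(unknown, cand["values"]) <= radius:
            kept.append(cand)
    return kept


candidates = [
    {"content_id": "show-1", "media_time": 10.0, "values": [1.0, 1.0]},
    {"content_id": "show-1", "media_time": 10.0, "values": [1.0, 1.0]},
    {"content_id": "show-2", "media_time": 55.0, "values": [90.0, 90.0]},
]
kept = prune_candidates([0.0, 0.0], candidates, radius=5.0)
```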

To test for matches of unknown video segments to a reference database of known video data, a list of candidates from the previous step is assumed, where each candidate (i.e., each possible match) has associated with it: content ID, media time, inverse percentage distribution radius calculated as distance from the current unknown point (from the unknown video stream), where 100% represents the exact value of the unknown point and 0% is a value beyond the radius r (distribution) from the unknown point.
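One possible reading of the inverse percentage distance is sketched below. The text fixes only the endpoints (100% for an exact match, 0% at or beyond radius r); the linear falloff in between and the use of Euclidean distance are assumptions of this sketch, and the function name is hypothetical.

```python
import math


def inverse_percentage(unknown, candidate, radius):
    """Score 100.0 for an exact match, falling linearly to 0.0 at `radius`."""
    distance = math.dist(unknown, candidate)
    if distance >= radius:
        return 0.0
    return 100.0 * (1.0 - distance / radius)


exact = inverse_percentage([0.0, 0.0], [0.0, 0.0], radius=10.0)
halfway = inverse_percentage([0.0, 0.0], [3.0, 4.0], radius=10.0)
outside = inverse_percentage([0.0, 0.0], [30.0, 40.0], radius=10.0)
```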

Each matching candidate 501 is assigned a data structure 502 in the memory of the matching system of the present invention. The data structure consists of, among other things, an arbitrary number of time bins, each grouping an arbitrary amount of time (e.g., about one second). For purposes of example, it is assumed that the data structure consists of one hundred bins representing ten seconds of video cue points. The bins are typically not equally spaced in time.

For each candidate found in the reference (match) database: first, the relative time is calculated by subtracting the candidate time from an arbitrary time of the unknown video. The candidate time is the play time of each video cue associated with the candidate during the reference programming.

The arbitrary time of the unknown video is generated internally to the television monitor by the application of the invention, running in the memory of the television or in a set-top box attached to the television, and is sent by the application to the central server device of the invention together with the sampled video cue points. Time of day is well known to the skilled person and is commonly used in computer systems. The time is the current number of time units elapsed since January 1, 1970.

For example, if the difference between the arbitrary time from a television (in the home) and the real media time is 100 seconds, then the relative time of those candidates that actually match should be close to that value. Conversely, candidates that are not good matches are unlikely to have a relative time of approximately 100 seconds in this example.

In the candidate data structure, when the cue point of the unknown video matches the reference cue point, the system of the present invention adds a token into the corresponding bin of the candidate data structure. The system then repeats the process for the next candidate as described in the previous paragraph.
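The relative-time binning may be illustrated as follows. This is a hedged sketch: the class name `CandidateBins`, the use of a dictionary keyed by bin index, and the fixed one-second bin width are assumptions, since the text leaves the bin layout open.

```python
from collections import defaultdict


class CandidateBins:
    """Time bins for one candidate; each bin counts matching cue points."""

    def __init__(self, bin_width=1.0):
        self.bin_width = bin_width
        self.bins = defaultdict(float)  # bin index -> token count

    def add_token(self, unknown_time, candidate_time):
        """Add a token to the bin for unknown_time - candidate_time."""
        relative = unknown_time - candidate_time
        index = int(relative // self.bin_width)
        self.bins[index] += 1.0
        return index


cb = CandidateBins()
# Two true matches share a relative time near 100 seconds ...
cb.add_token(unknown_time=1000.5, candidate_time=900.0)
cb.add_token(unknown_time=1001.5, candidate_time=901.0)
# ... while a spurious match lands in an unrelated bin.
cb.add_token(unknown_time=1000.5, candidate_time=400.0)
```

True matches accumulate in one bin because their relative time stays constant as the video plays, whereas spurious matches scatter across bins.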

Another important step in scoring the results is to apply a time discount to all bins. This is a relatively simple process that decrements the values in all bins by a small amount for each time period. The skilled artisan will recognize this as a "leaky bucket" scoring method. By definition, a bin that is no longer being filled by matching hint points will eventually decrement to zero over multiple cycles of the process. Likewise, bins that are slowly filled by random noise in the system will be decremented. Therefore, the time discount eventually clears bins filled with false positive matches and random noise. The skilled artisan will also clearly see that, without time discount binning, all bins would eventually fill to capacity and no result could be obtained from the process.
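The time discount can be sketched in a few lines. The leak amount of 0.25 per cycle and the function name are illustrative assumptions; the text specifies only that every bin is decremented by a small amount each time period.

```python
def apply_time_discount(bins, leak=0.25):
    """Decrement every bin by `leak`, never going below zero."""
    return [max(0.0, level - leak) for level in bins]


# An empty bin, a bin holding one stray token, and a well-filled bin.
bins = [0.0, 1.0, 5.0]
for _ in range(4):
    bins = apply_time_discount(bins)
```

After four cycles the stray token has leaked away entirely, while the well-filled bin retains most of its level.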

The time discount also decrements any bin (e.g., 503) having a level above the match threshold 504 to zero when the video stream from the client television monitor is changed in any way by any of: changing channels, fast rewinding, fast forwarding, pausing video, etc.

If any bin of the candidate data structure (e.g., bin 503) is above a certain threshold 504, then the content is declared a match. Further means of defining matches may include testing for consecutive matches of candidate segments in a time greater than a determined number of seconds (e.g., three seconds).
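The threshold test may be sketched as follows. Returning the index of the fullest bin is a choice made for this illustration; the text requires only that some bin exceed the threshold. The function name is hypothetical.

```python
def find_match(bins, threshold):
    """Return the index of the fullest bin above `threshold`, else None."""
    if not bins:
        return None
    best = max(range(len(bins)), key=lambda i: bins[i])
    return best if bins[best] > threshold else None


matched = find_match([0.0, 2.0, 7.0, 1.0], threshold=5.0)
unmatched = find_match([0.0, 2.0, 3.0], threshold=5.0)
```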

Fig. 8 shows how the hash value is calculated. First, the median value for each pixel location contributing to the video fingerprint is found by summing those values of the location over time over a number of days from the acquisition values in the plurality of television channels representing the typical television program to be identified by the present invention at the location. Once the median value is determined, it can be used as a constant indefinitely without further calculation or adjustment. The pixel values sent from the client to the server matching system are first processed by subtracting the median of the pixel locations. The generated result is stored in a matrix with other pixel locations of the video frame and an appropriate hash function is applied to the matrix. Then, a plurality of hash values are derived from the generated dot product.

Fig. 9 illustrates the beneficial results of using the median of the pixel locations as part of the process of calculating the hash value. Graph 901 shows a resulting curve of the output of a typical non-optimized hash function having a relatively small number of hash values occupying a relatively narrow range on the left side of the curve. The resulting median 902 is lower. Graph 903 shows the advantageous redistribution of hash values resulting from calculating the median value for each pixel location that participates in the matching process and applying the median value as part of the hash function. The distribution of hash values is more diffuse, accompanied by a rise in the median of all hash keys 904.

Fig. 9a shows what happens to a dataset when the median is not found prior to partitioning the dataset. If the system samples sixteen pixel locations per video frame and each pixel location has red, green, and blue pixel values, the graph will have 48 dimensions (or axes). For purposes of illustration, in this example the data set includes only two sample points 906 and 908 of a single video frame. Further, this example assumes that only a single luminance value is obtained at each pixel point. When the data set is divided in diagonal direction 907 into clockwise and counter-clockwise sectors 907c and 907cc, and the vertical and horizontal axes 908 and 906 intersect at zero value 905, only two sectors, 910 and 911, of the eight sectors between the two said pixel locations contain data.

Fig. 9b demonstrates the benefit of finding the median value for each pixel location. The present example continues to use the assumption that the pixel value is a single luminance value from zero to 255, although the absolute value is not important to the present method. This example demonstrates a simplified assumption that the median is 128 for both pixel locations. Now, by shifting the partition point to 905', the vertical and horizontal axes are shifted to 908' and 906', respectively. The diagonal slice 909 moves to 909'. It is clear from the illustration that now all eight sectors contain data.

In partitioning a data set in this manner, the calculated median value is not necessarily, and need not be, exactly in the middle of the data set. The desired result is to spread out the data so that, when the data is partitioned and assigned to individual servers, the system accesses the servers more uniformly. In contrast, the non-optimized data of fig. 9a, if partitioned among eight servers as shown, would result in only two of the eight servers being accessed. In the method illustrated in fig. 9b, the actual calculation applies 48 medians computed over a 48-dimensional graph, taking the example of 16 pixel locations with red, green, and blue values at each pixel location. Further, the data may be sliced more than once around each median point of the 48-dimensional graph as desired, so that the data set produced by the slicing fits within the operating constraints of an individual computer server of the system. In any case, most of the time data will be found on both the clockwise and counterclockwise sides of each partition slice.
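The effect illustrated in figs. 9a and 9b can be reproduced with a small two-dimensional experiment. This is a sketch under stated assumptions: the eight sectors are formed here from the signs of the two coordinates and a comparison of their magnitudes, the median is taken to be 128 for both pixel locations as in the simplified example, and the sample luminance values are invented.

```python
def sector(x, y):
    """Sector index 0..7 from the signs of x and y and from |x| vs |y|."""
    return (x > 0) * 4 + (y > 0) * 2 + (abs(x) > abs(y))


# Luminance pairs (0..255) sampled at two pixel locations.
levels = [10, 100, 150, 250]
samples = [(a, b) for a in levels for b in levels]

# Partitioning around zero: every raw value is positive.
raw_sectors = {sector(a, b) for a, b in samples}

# Partitioning around the assumed median of 128 for both locations.
centered_sectors = {sector(a - 128, b - 128) for a, b in samples}
```

Without the median shift the samples occupy only two of the eight sectors; after the shift all eight are populated, so the servers assigned to those sectors would be accessed more uniformly.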

FIG. 10 illustrates an operational flow 1000 representative of a number of example operations related to addressing a media database using distance-associative hashing. In fig. 10, and in the following figures, which include various examples of operational flows, discussion and explanation will be provided with respect to the above-described examples of fig. 1-9 and/or with respect to other examples and contexts. However, it should be understood that the operational flows may be performed in many other environments and contexts and/or in modified versions of fig. 1-9. Moreover, while the various streams of operations are presented in a sequence illustrated, it should be appreciated that the various operations may be performed in other sequences than the illustrated sequence or may be performed concurrently.

After starting the operation, the operational flow 1000 moves to operation 1002. Operation 1002 depicts receiving one or more indications of samples of a video clip. For example, as shown in and/or described with respect to fig. 1-9, these indications may be associated with one or more pixel tiles from the ingestion system.

Operation 1004 then depicts determining algorithmically derived values of the one or more pixels of each slice for at least one slice of the sample of the video segment that includes the one or more pixels of the at least one slice. For example, as shown in and/or described with respect to fig. 1-9, the average of those red pixels in each tile, those green pixels in each tile, and those blue pixels in each tile may be calculated.

Operation 1006 then depicts subtracting the established midpoint value for each tile from the average value for each tile. For example, as shown in and/or described with respect to fig. 1-9, the median value for each pixel location contributing to a video fingerprint may be found by summing those values of the location over time for a number of days from the acquired values in a plurality of television channels at the location.

Operation 1008 then depicts transforming the values resulting from the subtraction using a pre-derived function to evenly distribute the values. For example, as shown in and/or described with respect to fig. 1-9, these values resulting from the subtraction fill the matrix. The dot product of that matrix with the static matrix derived in advance can be calculated. The pre-derived static matrix may be determined prior to instantiating the operational flow 1000 and may be algorithmically optimized based on data taken in the past such that multiple matrices intersecting it will produce more uniform results than the results distribution directly from the subtraction operation.

Operation 1010 then depicts constructing a hash value from the transformed values. For example, as shown in and/or described with respect to fig. 1-9, the value capable of maintaining RGB values is reduced to a bit form such that the hash value may be a bit string.

Operation 1012 then depicts referencing the most significant bits of the constructed hash value to determine a database sector. For example, as shown in and/or described with respect to fig. 1-9, a plurality of bits may be predetermined such that the predetermined plurality of bits of the hash value are used to address one or more database sectors.

Operation 1014 then depicts storing at least the hash value on the determined database sector. For example, as shown in and/or described with respect to fig. 1-9, the hash value may be stored in a bucket that includes other hash values that are mathematically close, where the hash values are associated with at least a particular video segment and offset.

Fig. 11 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 11 illustrates an example embodiment in which an operational procedure 1000 may include at least one additional operation. Additional operations may include an operation 1102.

Operation 1102 illustrates using one or more processing devices to at least partially implement at least one of receiving operation 1002, determining operation 1004, subtracting operation 1006, transforming operation 1008, constructing operation 1010, referencing operation 1012, or storing operation 1014. In some examples, one or more computer processors may be used to at least partially implement one of the foregoing operations. Other processing means may include an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), or any other circuit configured to implement the results of at least one of the foregoing operations.

Fig. 12 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 12 illustrates an example embodiment, wherein operation 1002 may include at least one additional operation. Additional operations may include operation 1202 and/or operation 1204.

Operation 1202 illustrates receiving one or more indications of at least one of a frame or a still image. For example, as shown in and/or described with respect to fig. 1-9, samples of a video clip may include individual frames of a video stream. Such a frame may be one frame of 30 fps video. In various embodiments, the sample of a video clip may be a still image or a portion of a video clip that is imaged at a rate other than 30 times per second.

Further, operation 1204 illustrates receiving one or more indications of samples of a video clip, the one or more indications of samples of the video clip being associated with at least one indication of a channel, at least one indication of a video clip, and at least one indication of a time code offset from a start of the video clip. For example, as shown in and/or described with respect to fig. 1-9, data associated with a video clip (which may be a program title and/or other metadata associated with the video clip), a channel from which a program was ingested, and a time offset from the beginning of the program may be received from, for example, a channel guide associated with a channel being monitored by an ingestion system.

Fig. 13 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. Fig. 13 illustrates an example embodiment, wherein operation 1004 may include at least one additional operation 1302.

Operation 1302 illustrates determining an average value of at least one or more pixels of each slice for at least one slice of a sample of a video segment comprising the one or more pixels of the at least one slice. For example, as shown in and/or described with respect to fig. 1-9, the algorithmic operation for reducing the one or more pixels in a tile to a single value may be, for example, an arithmetic average.

Fig. 14 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. Fig. 14 illustrates an example embodiment, wherein operation 1006 may include at least one additional operation 1402.

Operation 1402 illustrates subtracting the midpoint value established for each tile from the average value for each tile, the midpoint value established for each tile having been previously determined for a plurality of channels using data from each tile over at least one time period. For example, as shown in and/or described with respect to fig. 1-9, a median value may be determined for each tile, wherein the tiles used at ingestion are the same tiles used in the operation of identifying a segment on the client system, the median value being established as a constant value derived from monitoring the same tile across many channels over a long period of time (one month, one year, etc.).

Fig. 15 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. Fig. 15 illustrates an example embodiment, wherein operation 1008 may include at least one additional operation. Additional operations may include operation 1502, operation 1504, and/or operation 1506.

Operation 1502 illustrates forming a variable matrix comprising at least the values resulting from the subtracting. For example, as shown in and/or described with respect to fig. 1-9, the values are arranged in a matrix, the values resulting from a subtraction operation that subtracts the median established over time for each slice from the average of the instant frames being ingested.

Operation 1504 illustrates obtaining a static matrix that will more evenly distribute the transformed values when intersected by the variable matrix. For example, as illustrated in and/or described with respect to fig. 1-9, the matrix may be determined based on a mathematical analysis of a previously obtained dataset with respect to hash values. The matrix may be mathematically optimized such that when used as an operand in a dot-product operation with a continuous variable matrix, the corresponding continuous result matrix will include a plurality of values that are more evenly distributed along the distribution curve than the variable matrix prior to the dot-product operation.

Operation 1506 illustrates computing a dot product of the variable matrix and the static matrix, the dot product including at least the more uniformly distributed transformed values. For example, as shown in and/or described with respect to fig. 1-9, a variable matrix containing a plurality of values resulting from a subtraction operation may be intersected by a static matrix that has been predetermined for more evenly distributing data represented by the variable matrix, such that the resulting matrix is more spread apart rather than gathered around a particular portion of the distribution.

Fig. 16 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. Fig. 16 illustrates an example embodiment, wherein operation 1504 may include at least one additional operation 1602.

Operation 1602 illustrates determining a static matrix of transformed values of a variable matrix that will more evenly distribute the variable matrix when intersected by the variable matrix based at least in part on one or more previously obtained hash values using position-sensitive hashing. For example, as shown in and/or described with respect to fig. 1-9, previously taken video samples may be analyzed using a position-sensitive hashing technique to generate a matrix such that when used as an operand in a dot-product operation with a continuous variable matrix, the corresponding continuous result matrix will include a plurality of values that are more evenly distributed along a distribution curve than the variable matrix prior to the dot-product operation.

Fig. 17 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. FIG. 17 illustrates an example embodiment, wherein operation 1010 may include at least one additional operation. Additional operations may include operation 1702 and/or operation 1704.

Operation 1702 shows constructing a hash value from the transformed values, including reducing fidelity of the transformed values by at least reducing each transformed value to a binary representation. For example, as shown in and/or described with respect to fig. 1-9, each value from the result matrix of the dot product operation may be reduced from an 8-bit value of, for example, 0 to 255 (or from -127 to 128) to a single bit, either one or zero.

Operation 1702 may include operation 1704. Operation 1704 shows determining, for each transformed value, whether the transformed value is a positive number, and if the transformed value is a positive number, assigning a one to the hash value, and otherwise assigning a zero to the hash value. For example, as shown in and/or described with respect to fig. 1-9, each value between 1 and 128 from the result matrix of the dot-product operation may be reduced to a bit value of 1, and each value between -127 and 0 from the result matrix of the dot-product operation may be reduced to a bit value of 0.

Fig. 18 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. Fig. 18 illustrates an example embodiment, wherein operation 1012 may include at least one additional operation 1802.

Operation 1802 illustrates referencing a plurality of most significant bits of the constructed hash value to determine a database server, wherein the plurality of most significant bits are predetermined to address a plurality of database servers, wherein a plurality of database servers associated with the plurality of most significant bits are established such that at least one index associated with a database sector can reside entirely in a memory of a corresponding database server. For example, as shown in and/or described with respect to fig. 1-9, the two most significant bits may be selected, whereby the 2 bits provide four different values (00, 01, 10, and 11), each of which may be assigned to a different database sector. The number of most significant bits of the hash value may be chosen to provide enough servers that the content associated with the plurality of hash values fits entirely within the memory of a particular database sector, which may be a database server, a cluster partner, a virtual machine, and/or another type of database node. The number of bits may, but need not, exactly represent the maximum number of database sectors at any given time (i.e., because 6 bits may be selected to provide addressing of up to 64 database sectors, the system may operate with the maximum of 64 sectors or with fewer servers, e.g., 60 sectors).

Fig. 19 illustrates an alternative embodiment of the example operational flow 1000 of fig. 10. Fig. 19 illustrates an example embodiment, wherein operation 1014 may include at least one additional operation 1902.

Operation 1902 shows storing at least the hash value on the determined database sector, including storing at least one indication of a channel, at least one indication of a video segment, and at least one indication of a time code offset from a start of the video segment at a database location based at least in part on the hash value. For example, as shown in and/or described with respect to fig. 1-9, data associated with a video clip (which may be a program title and/or other metadata associated with the video clip), a channel from which a program was ingested, and a time offset from the beginning of the program may be stored at or in a location associated with and/or referenceable by a hash value, which may be in the same or a different sector, server, or database as the hash value.
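The storage step of operation 1902 might be pictured as below. This is a sketch under stated assumptions: each sector is modeled as an in-memory dictionary keyed by hash value, and the `Record` fields, the 16-bit hash width, and the 2-bit sector prefix are hypothetical choices, not values taken from the description.

```python
from collections import defaultdict
from dataclasses import dataclass

HASH_BITS = 16    # assumed hash width
SECTOR_BITS = 2   # assumed number of addressing bits (four sectors)


@dataclass(frozen=True)
class Record:
    channel: str          # channel from which the program was ingested
    video_segment: str    # program title or other segment metadata
    time_offset_s: float  # offset from the start of the segment, in seconds


# One in-memory index per sector: hash value -> list of records.
sectors = [defaultdict(list) for _ in range(1 << SECTOR_BITS)]


def ingest(hash_value, record):
    """Store the record on the sector addressed by the hash's leading bits."""
    sector_id = hash_value >> (HASH_BITS - SECTOR_BITS)
    sectors[sector_id][hash_value].append(record)
    return sector_id


stored_on = ingest(0b1100000000000001, Record("ch7", "Example Program", 12.5))
```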

FIG. 20 illustrates an operational flow 2000 representative of a number of example operations related to addressing a media database using distance-associative hashing. In fig. 20, and in the following figures, which include various examples of operational flows, discussion and explanation will be provided with respect to the above-described examples of fig. 1-9 and/or with respect to other examples and contexts. However, it should be understood that the operational flows may be performed in many other environments and contexts and/or in modified versions of fig. 1-9. Moreover, while the various operational flows are presented in the sequence illustrated, it should be appreciated that the various operations may be performed in sequences other than the illustrated sequence or may be performed concurrently.

After starting the operation, the operational flow 2000 moves to operation 2002. Operation 2002 depicts receiving a hint, the hint constructed by one or more operations associated with a media storage operation. For example, as shown in and/or described with respect to fig. 1-9, at least some data associated with samples of video data taken from a particular client system is received. The data may be associated with the same pixel tiles of the client system as those defined by the ingestion operation. The data may be algorithmically processed using the same operations as the ingestion operation to obtain the hash value. Accordingly, if a particular frame associated with a particular time offset of a particular program on a particular channel is ingested and hashed, resulting in a hash value associated with that particular frame, and that particular frame is also sampled while being displayed on the client system, then applying the same hash operation to the sampled frame will result in the same hash value as that produced for the ingested frame. In contrast to the hash values prepared during ingestion, however, the hint of operation 2002 represents data associated with a sample of video data from a particular client system. The hint may be received, for example, via an HTTP request.

Operation 2004 then depicts referencing a plurality of most significant bits of the received hint to determine the database sector. For example, as shown in and/or described with respect to fig. 1-9, the same most significant bits of the hint as were used to reference a database sector during ingestion are examined. For example, if the first two bits of the hash value at ingestion are used to store the hash value at a particular database sector, the same first two bits of the hint associated with the sample of video data from the client system are used to address the particular database sector.

Operation 2006 then depicts returning at least one indication of at least one candidate from the database sector based at least in part on the received hint. For example, as shown in and/or described with respect to fig. 1-9, hash values that exactly match or are near the hint are returned as one or more of a plurality of suspects or candidates. Candidates may be returned within a specific percentage radius. The candidates may be returned according to a nearest neighbor algorithm or a modified nearest neighbor algorithm.
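A minimal sketch of the candidate lookup in operation 2006 follows. It assumes, for illustration only, that closeness is measured as Hamming distance between bit strings and that a sector can be scanned linearly; the description itself names nearest neighbor and modified nearest neighbor algorithms without fixing either choice. The index contents and payload strings are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions in which two hash values differ."""
    return bin(a ^ b).count("1")


def candidates(sector_index, hint_hash, radius):
    """Return (distance, hash, payload) for stored hashes within `radius` of the hint."""
    hits = []
    for stored_hash, payload in sector_index.items():
        distance = hamming(stored_hash, hint_hash)
        if distance <= radius:
            hits.append((distance, stored_hash, payload))
    return sorted(hits, key=lambda hit: hit[0])  # closest candidates first


# Hypothetical contents of the sector addressed by the hint's leading bits "10".
sector_index = {
    0b10110010: "segment A @ 10 s",
    0b10110011: "segment A @ 11 s",
    0b10001101: "segment B @ 42 s",
}
found = candidates(sector_index, hint_hash=0b10110010, radius=1)
```

Here the exact match and the hash one bit away are returned as candidates, while the entry six bits away is not.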

Fig. 21 illustrates an alternative embodiment of the example operational flow 2000 of fig. 20. FIG. 21 illustrates an example embodiment in which operation 2002 may include at least one additional operation. Additional operations may include operation 2102, operation 2104, and/or operation 2106.

Operation 2102 illustrates receiving a hint associated with a sample of a video buffer of a client system, including at least receiving one or more indications related to a time of day associated with the sample of the video buffer of the client system. For example, as shown in and/or described with respect to fig. 1-9, the hint may include or be associated with a time offset from an arbitrary time. For example, the time offset may be calculated from January 1, 1970.

Operation 2104 shows receiving a hint associated with a sample of a video buffer of a client system, the hint determined at least in part by hashing at least some values associated with the video buffer. For example, as shown in and/or described with respect to fig. 1-9, one or more constants, derived beforehand through operations such as those described elsewhere herein with respect to hashing, may be used as operands to reduce the plurality of tiles associated with the video buffer to a bit string through one or more mathematical operations or algorithms.

Operation 2106 shows receiving a hint associated with a sample of a video buffer of a client system, the hint being determined at least in part by hashing at least some values associated with the video buffer, the hashing being based at least in part on one or more of at least one operand or at least one algorithm that is also used in an associated media storage operation. For example, as shown in and/or described with respect to fig. 1-9, at least some data associated with a sample of the video buffer representing what the television screen displays at a particular instant is processed by operations utilized by and/or in conjunction with data locations common to the ingestion process and/or involving constant values for operands utilized by the ingestion process. For example, the number of tiles analyzed at ingestion may also be used to provide hints associated with a particular client system. The size of the pixel tiles analyzed at ingestion may also be used to provide hints associated with a particular client system. The same pre-derived static matrix used to more evenly distribute hash values at ingestion may also be used during hashing of data associated with a particular client system.

Fig. 22 illustrates an alternative embodiment of the example operational flow 2000 of fig. 20. FIG. 22 illustrates an example embodiment in which operation 2002 may include at least one additional operation. Additional operations may include operation 2202, operation 2204, operation 2206, operation 2208, operation 2210, operation 2212, and/or operation 2214.

Operation 2202 illustrates receiving one or more indications of at least one item of content of a video buffer of a client system. For example, as shown in and/or described with respect to fig. 1-9, the pixel values of the red, green, and blue pixels at each pixel location at each predefined tile for the video buffer of the client system may be read for each frame, or for every three frames, or for every ten frames, or for every second, or at some other interval. These indications (pixel values or other data) may be received by controls on the television, by control logic on the television, by a system coupled with a media server, or elsewhere.

Operation 2204 illustrates determining, for at least one tile of the at least one item of content of the video buffer, the tile including one or more pixels, an algorithmically derived value of the one or more pixels of the tile. For example, as shown in and/or described with respect to fig. 1-9, the pixel values of the red, green, and blue pixels at each pixel location at each predefined tile of the video buffer for the client system may be averaged.
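The averaging in operation 2204 could look like the sketch below, which assumes the frame is available as rows of (red, green, blue) tuples and that a tile is a rectangle given by its top-left corner and size. The function name, the data layout, and the decision to keep one mean per color channel are illustrative assumptions.

```python
def tile_mean(frame, top, left, height, width):
    """Average the red, green, and blue values over one rectangular tile."""
    sums = [0, 0, 0]
    for row in frame[top:top + height]:
        for red, green, blue in row[left:left + width]:
            sums[0] += red
            sums[1] += green
            sums[2] += blue
    pixel_count = height * width
    return tuple(total / pixel_count for total in sums)


# A hypothetical 2-by-2 frame treated as a single tile.
frame = [
    [(10, 20, 30), (30, 40, 50)],
    [(50, 60, 70), (70, 80, 90)],
]
mean_rgb = tile_mean(frame, top=0, left=0, height=2, width=2)
```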

Operation 2206 shows subtracting the midpoint value from the average for each tile. For example, as shown in and/or described with respect to fig. 1-9, the midpoint value at each tile established by analysis of the ingested content is determined. Once determined by the systems associated with the media database and ingestion system, the midpoint value for each tile may be provided to the client system, for example. These midpoint values may be updated from time to time (hourly, daily, monthly, yearly). The midpoint values provided for hashing data associated with the video buffer of the client system may be the same midpoint values used to hash incoming content at ingestion.

Operation 2208 illustrates transforming the values resulting from the subtraction. For example, as shown in and/or described with respect to fig. 1-9, the values resulting from the subtraction are filled into a matrix and multiplied with a predefined static matrix. The dot product operation on the two matrices may be performed at the client system during the process of converting pixel-tile data associated with frames within the video buffer into hints, such that hints rather than actual pixel-tile data are sent in HTTP requests, resulting in a compact HTTP message. The predefined static matrix may be provided to the client system prior to transformation and may be the same matrix that is generated to more evenly distribute the plurality of hashed values upon ingestion. The predefined static matrix may be updated at the client system from time to time. Alternatively, the tile data (with or without other metadata) may be sent from a client system (e.g., a television) to a different system for processing and/or hashing.
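Operations 2206 and 2208 together amount to centering each tile average on its midpoint and then taking dot products with the rows of the static matrix. The sketch below assumes one average per tile and uses small made-up midpoints and a made-up 4-by-3 matrix; real midpoints and matrix entries would be derived from ingested content as described elsewhere herein.

```python
def center(tile_means, midpoints):
    """Subtract each tile's midpoint value from that tile's average."""
    return [mean - midpoint for mean, midpoint in zip(tile_means, midpoints)]


def project(centered, static_matrix):
    """Take the dot product of the centered values with each row of the static matrix."""
    return [sum(w * x for w, x in zip(row, centered)) for row in static_matrix]


# Hypothetical values for three tiles.
tile_means = [130, 90, 200]
midpoints = [128, 100, 180]
static_matrix = [
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, -1],
    [-1, 0, 1],
]

centered = center(tile_means, midpoints)
transformed = project(centered, static_matrix)
```

Because the client and the ingestion system share the same midpoints and static matrix, the same frame yields the same transformed values on both sides.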

Operation 2210 shows constructing a hash value from the transformed values. For example, as shown in and/or described with respect to fig. 1-9, the values in the matrix resulting from multiplying a matrix having values associated with the video buffer with a pre-derived static matrix may be reduced in bit length, with a single bit replacing each 8-bit value in the matrix. In other embodiments, the hash value constructed may include a different number of bits for each value in the matrix. In different embodiments, the hash values constructed may have the same number of bits as the values in the matrix, or may be a direct representation of the values in the matrix.

Operation 2212 shows associating the hint at least partially with the hash value constructed. For example, as shown in and/or described with respect to fig. 1-9, the bit string constructed from the transformed matrix may be a hint, or the constructed bit string may be associated with a time (e.g., time of day) to form a hint, or other data (e.g., an IP address or other identifier associated with a client television or a control of a client television) may be associated to form a hint. Alternatively, the hint may comprise or otherwise be associated with any other metadata associated with the audiovisual content at the client system.
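One way to picture the hint of operation 2212 is as a small record combining the bit string, a timestamp, and a client identifier, encoded for an HTTP request. The field names `hash`, `time`, and `client`, the use of whole seconds since January 1, 1970, and the query-string encoding are all assumptions made for this sketch.

```python
import time
from urllib.parse import urlencode


def build_hint(hash_value, hash_bits, client_id, sampled_at=None):
    """Bundle the hash bit string with a sample time and a client identifier."""
    if sampled_at is None:
        sampled_at = time.time()  # seconds since January 1, 1970
    return {
        "hash": format(hash_value, f"0{hash_bits}b"),
        "time": int(sampled_at),
        "client": client_id,
    }


hint = build_hint(0b0101, hash_bits=4, client_id="tv-001", sampled_at=1700000000.9)
query_string = urlencode(hint)  # compact payload for an HTTP request
```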

Operation 2214 illustrates that at least one of the determining operation 2204, subtracting operation 2206, transforming operation 2208, or constructing operation 2210 utilizes one or more of at least one operand or at least one algorithm that is also used in the associated media storage operation. For example, as shown in and/or described with respect to fig. 1-9, one or more parameters including one or more of a definition of a number of tiles of pixels, a definition of a size of tiles of pixels, a predefined median value associated with tiles of pixels, or a predefined static matrix may be provided to the client TV, which are also utilized by the ingestion process such that multiple operations applied to samples from the video buffer will result in the same hash value as that generated when that frame (e.g., the same video segment and time offset) is ingested and hashed.

Fig. 23 illustrates an alternative embodiment of the example operational flow 2000 of fig. 20. FIG. 23 illustrates an example embodiment, wherein operation 2006 may include at least one additional operation. Additional operations may include operation 2302 and/or operation 2304.

Operation 2302 illustrates returning at least one indication of at least one candidate from the database sector based at least in part on a probabilistic point location in equal balls ("PPLEB") algorithm as a function of the hint received. For example, as shown in and/or described with respect to fig. 1-9, at least one of the candidates or suspects representing a waypoint (e.g., neighbor, nearest neighbor, within radius, from within the same bucket, belonging to the same ring, etc.) proximate to the hint is returned from the media database constructed and/or modified by the ingestion process.

Operation 2304 shows returning at least one indication of at least one candidate from the database sector based at least in part on the received hint, the at least one candidate being within a predetermined inverse percentage distribution radius of the received hint. For example, as shown in and/or described with respect to fig. 1-9, at least one candidate or suspect associated with locality-sensitive hashing with respect to at least one of the hint and hash values is returned.

FIG. 24 illustrates an operational flow 2400 representative of a plurality of example operations related to addressing a media database using distance-associative hashing. In fig. 24, and in the following figures, which include various examples of operational flows, discussion and explanation will be provided with respect to the above-described examples of fig. 1-9 and/or with respect to other examples and contexts. However, it should be understood that the operational flows may be performed in many other environments and contexts and/or in modified versions of fig. 1-9. Moreover, while the various operational flows are presented in the sequence illustrated, it should be appreciated that the various operations may be performed in sequences other than the illustrated sequence or may be performed concurrently.

After starting the operation, the operational flow 2400 moves to operation 2402. Operation 2402 depicts receiving at least one indication of at least one candidate and at least one indication of at least one hint. For example, as shown in and/or described with respect to fig. 1-9, a hash value relating to a video buffer of a client system is determined along with one or more associated candidates or suspects.

Operation 2404 then depicts adding a token to a bin associated with the at least one received candidate. For example, as shown in and/or described with respect to fig. 1-9, scoring of the candidates is performed by adding tokens to the bin corresponding to each candidate/suspect, e.g., by incrementing a value each time a token is added.

Then, operation 2406 depicts determining whether the number of tokens within the bin exceeds a value associated with a probability that the client system is displaying a particular video segment associated with the at least one hint, and if the number of tokens within the bin exceeds that value, returning at least some data associated with the particular video segment based at least in part on the bin. For example, as shown in and/or described with respect to fig. 1-9, a particular video clip and a particular offset within that video clip are probabilistically determined from the scores associated with the bins.
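The token counting of operations 2404 and 2406 can be sketched with one counter per candidate. The class name, the string candidate labels, and the threshold of 2 are hypothetical; in practice the threshold would be chosen to reflect the desired probability of a correct identification.

```python
from collections import Counter


class Scorer:
    """Keeps one bin of tokens per candidate and reports when a bin exceeds a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.bins = Counter()

    def add_token(self, candidate):
        """Add a token to the candidate's bin; return the candidate once it exceeds the threshold."""
        self.bins[candidate] += 1
        if self.bins[candidate] > self.threshold:
            return candidate
        return None


scorer = Scorer(threshold=2)
results = [scorer.add_token(c) for c in ["segment A", "segment B", "segment A", "segment A"]]
```

In this run the first three tokens report nothing, and the fourth token pushes the bin for "segment A" past the threshold.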

Fig. 25 illustrates an alternative embodiment of the example operational flow 2400 of fig. 24. FIG. 25 illustrates an example embodiment, wherein operation 2404 may include at least one additional operation 2502.

Operation 2502 illustrates adding a token to a time bin associated with at least one received candidate. For example, as shown in and/or described with respect to fig. 1-9, the data structures associated with the candidates/suspects may include arbitrary time bins grouped by an arbitrary time.

Fig. 26 illustrates an alternative embodiment of the example operational flow 2400 of fig. 24. FIG. 26 illustrates an example embodiment, wherein operation 2404 may include at least one additional operation. Additional operations may include operation 2602 and/or operation 2604. Further, operational flow 2400 can include at least one additional operation 2606.

Operation 2602 shows determining a relative time, including subtracting a candidate time associated with the at least one candidate from at least an arbitrary time associated with the at least one hint. For example, as shown in and/or described with respect to fig. 1-9, the time offset of the video clip associated with the candidate is subtracted from the arbitrary time associated with the hint received from the client system (a television, a set-top box, or an article, machine, or composition of matter displaying and/or providing and/or receiving video content).

Operation 2604 illustrates adding a token to a time bin associated with the candidate based at least in part on the determined relative time. For example, as shown in and/or described with respect to fig. 1-9, when a hint associated with a client system matches or nearly matches a reference hint associated with a media database, a token may be added to a bin, which may include incrementing a value associated with the bin or other means of tracking bin operations.

Operation 2606 illustrates removing one or more tokens from the time bin based at least in part on the elapsed time period. For example, as shown in and/or described with respect to fig. 1-9, the bin may be leaky, such that data and/or tokens associated with old suspects/candidates may be released from the bin, which may include subtracting one from a value associated with the bin or other means of tracking bin operations.
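Operations 2602 through 2606 can be sketched together as time bins keyed by relative time, with old tokens leaking out. The reasoning is that while a client keeps displaying one segment, the hint time minus the candidate's offset stays roughly constant, so true matches pile up in one bin while stray matches scatter. The class name, the bin width, the maximum token age, and the choice to store one timestamp per token are assumptions of this sketch.

```python
from collections import defaultdict


class TimeBins:
    """Tokens grouped by (segment, relative-time bin); old tokens are removed by leak()."""

    def __init__(self, bin_width_s, max_age_s):
        self.bin_width_s = bin_width_s
        self.max_age_s = max_age_s
        self.tokens = defaultdict(list)  # key -> arrival times of the tokens in that bin

    def add(self, segment, hint_time, candidate_offset, now):
        """Add a token to the bin chosen by hint time minus candidate offset."""
        relative = hint_time - candidate_offset
        key = (segment, int(relative // self.bin_width_s))
        self.tokens[key].append(now)
        return key

    def leak(self, now):
        """Remove tokens older than max_age_s and drop bins that become empty."""
        for key in list(self.tokens):
            fresh = [t for t in self.tokens[key] if now - t <= self.max_age_s]
            if fresh:
                self.tokens[key] = fresh
            else:
                del self.tokens[key]

    def count(self, key):
        return len(self.tokens.get(key, []))


bins = TimeBins(bin_width_s=2, max_age_s=60)
# Three consistent matches: relative time is 990 each time, so they share one bin.
key = None
for hint_time, offset in [(1000, 10), (1001, 11), (1002, 12)]:
    key = bins.add("segment A", hint_time, offset, now=hint_time)
stray = bins.add("segment A", 1003, 500, now=1003)  # inconsistent match, different bin

count_before_leak = bins.count(key)
bins.leak(now=1062)
count_after_leak = bins.count(key)
```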

In different embodiments, a pixel location may relate to one or a number of colors and/or color spaces/models (e.g., red, blue, green, and yellow; cyan, magenta, yellow, and black; a single pixel value uniquely identifying a color, e.g., a 24-bit value associated with the pixel location; hue, saturation, and brightness; etc.). Different numbers of pixels in a tile may be used, and the tile is not necessarily a square tile. Further, the resolution of the video buffer of the client system may vary. Resolution and/or color density at the client system and ingestion system may vary. The system may operate at different raster resolutions including, but not limited to, 1920 by 1080, 3840 by 2160, 1440 by 1080, 1366 by 768, or other resolutions. It is expected that over the next twenty years, an increase in pixel resolution of common programs, televisions, and/or client systems will occur; the same basic operations may be utilized, although the number, size, sampling rate, or other aspects of the pixel tiles may vary. Further, up-conversion, down-conversion, or other conversion operations associated with resolution and/or color density may occur and/or be interposed between other operations described herein.

FIG. 27 illustrates an example system 2700 in which embodiments can be implemented. The system 2700 includes one or more computing devices 2702. The system 2700 also illustrates a framework 2704 for facilitating communications between the one or more computing devices and the one or more client devices 2706. The system 2700 also illustrates one or more client devices 2706. In some embodiments, the one or more client devices may be among the one or more computing devices. The system 2700 also illustrates at least one non-transitory computer-readable medium 2708. In some embodiments, the medium 2708 may include one or more instructions 2710 that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receive at least one rasterized video stream; create at least one hash value associated with at least one sample of the at least one received rasterized video stream; determine at least one database sector for storing the created at least one hash value; and store the created at least one hash value on the determined at least one database sector. In various embodiments, the one or more instructions may be executed on a single computing device. In other embodiments, some portions of the one or more instructions may be executed by a first plurality of the one or more computing devices, while other portions of the one or more instructions may be executed by a second plurality of the one or more computing devices.

FIG. 28 illustrates an example system 2800 in which embodiments may be implemented. The system 2800 includes one or more computing devices 2802. The system 2800 also illustrates a framework 2804 for facilitating communications between the one or more computing devices and the one or more client devices 2806. The system 2800 also illustrates one or more client devices 2806. In some embodiments, the one or more client devices may be among the one or more computing devices. The system 2800 also illustrates at least one non-transitory computer-readable medium 2808. In some embodiments, the medium 2808 may include one or more instructions 2810 that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receive one or more instructions associated with at least one video buffer of at least one client system; determine a hint based at least in part on the at least one video buffer and at least one time associated with the at least one video buffer, wherein one or more of at least one operand or at least one function associated with determining the hint is also used in an associated media storage operation; reference a plurality of most significant bits of the determined hint to determine a database sector; and return at least one indication of at least one candidate from the determined database sector based at least in part on the determined hint. In various embodiments, the one or more instructions may be executed on a single computing device. In other embodiments, some portions of the one or more instructions may be executed by a first plurality of the one or more computing devices, while other portions of the one or more instructions may be executed by a second plurality of the one or more computing devices.

FIG. 29 illustrates an example system 2900 in which embodiments may be implemented. The system 2900 includes one or more computing devices 2902. The system 2900 also illustrates a framework 2904 for facilitating communications between the one or more computing devices and the one or more client devices 2906. The system 2900 also illustrates one or more client devices 2906. In some embodiments, the one or more client devices may be among the one or more computing devices. The system 2900 also illustrates at least one non-transitory computer-readable medium 2908. In some embodiments, the medium 2908 may include one or more instructions 2910 that, when executed on at least some of the one or more computing devices, cause at least some of the one or more computing devices to at least: receive at least one indication of at least one candidate and at least one indication of at least one hint; add a token to a bin associated with the at least one received candidate; and determine whether the number of tokens within the bin exceeds a value associated with a probability that the client system is receiving a particular video segment associated with the received at least one hint, and, if the number of tokens within the bin exceeds that value, return at least some data associated with the particular video segment based at least in part on the bin. In various embodiments, the one or more instructions may be executed on a single computing device. In other embodiments, some portions of the one or more instructions may be executed by a first plurality of the one or more computing devices, while other portions of the one or more instructions may be executed by a second plurality of the one or more computing devices.

Certain aspects of the invention include process steps and instructions described herein in the form of algorithms. It should be noted that these process steps and instructions of the present invention may be embodied in software, firmware or hardware, and when embodied in software, may be downloaded to reside on and be operated from different platforms used by real time network operating systems.

The present invention also relates to an apparatus for performing the operations herein. The apparatus may be specially constructed for the required purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored on a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application-specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.

In addition, the computers or computing devices referred to in this specification may include a single processor or may employ multi-processor designs for increased computing capability.

The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may also be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the required method steps. The required structure for a variety of these systems will appear from the description above. In addition, the present invention is not described with reference to any particular programming language or operating system. It will be appreciated that a variety of programming languages and operating systems may be used to implement the teachings of the invention as described herein.

The systems and methods, flowcharts, and block diagrams described in this specification can be implemented in a computer processing system including program code comprising program instructions executable by the computer processing system. Other implementations may also be used. Furthermore, the flowcharts and block diagrams described herein describe particular methods and/or corresponding acts for supporting various steps and corresponding functions (which support the disclosed structural means), and may also be used to implement corresponding software structures and algorithms, and equivalents thereof.

Embodiments of the subject matter disclosed in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible program carrier for execution by, or to control the operation of, data processing apparatus. The computer storage medium may be a machine-readable storage device, a machine-readable storage substrate, a memory device, or a combination of one or more of them.

A computer program (also known as a program, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. The computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a suitable communication network.

The processes or logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions of operating on input data and generating output. These processes or logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).

The essential elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such a device. Processors suitable for the execution of a computer program include, by way of example only and not by way of limitation, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both.

To provide for interaction with a user or administrator of the system described herein, embodiments of the subject matter described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user. For example, feedback provided to the user may be any form of sensory feedback, such as visual feedback, auditory feedback, or tactile feedback; and input from the user may be received in any form, including acoustic, speech, or tactile input.

Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component (e.g., one or more data servers), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user or an administrator can interact with certain implementations of the subject matter described in this specification), or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). The computing system may include clients and servers. A client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment.

Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In some cases, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

This written description sets forth the best mode of the invention and provides examples to describe the invention and to enable a person of ordinary skill in the art to make and use the invention. The written description does not limit the invention to the precise terms set forth. Thus, while the present invention has been described in detail with reference to the examples set forth above, variations, modifications, and alterations to these examples may be practiced by those of ordinary skill in the art without departing from the scope of the present invention.

Claims

1. A system for addressing a media database using distance associative hashing, comprising:

one or more processors;

one or more non-transitory machine-readable storage media containing instructions that, when executed on the one or more processors, cause the one or more processors to:

receive a rasterized video stream;

create a hash value associated with the rasterized video stream, wherein the hash value is created by performing a hash function on pixel values associated with samples of the rasterized video stream, wherein the hash value comprises a first plurality of bits and a second plurality of bits, wherein the first plurality of bits is predetermined to correspond to a plurality of media database sectors of the media database, and wherein the second plurality of bits is predetermined to correspond to a plurality of buckets in one or more of the plurality of media database sectors;

determine a media database sector of the media database for storing the created hash value, wherein the media database sector is determined by referencing the first plurality of bits of the created hash value; and

store the created hash value in the determined media database sector,

wherein the plurality of media database sectors associated with the first plurality of bits are established such that an index associated with a media database sector can reside entirely in memory of the media database sector, thereby eliminating the need for paging when accessing the media database.
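By way of illustration only and not limitation, the sector and bucket addressing recited in claim 1 might be sketched as follows. The bit widths (8 sector bits, 24 bucket bits), the names split_hash and SectoredIndex, and the use of in-process dictionaries to stand in for separate server devices are all assumptions made for this sketch and are not taken from the specification. The sketch also assumes that the first plurality of bits occupies the most significant positions of the hash value.

```python
from typing import Dict, List, Tuple

SECTOR_BITS = 8    # assumed width of the "first plurality of bits"
BUCKET_BITS = 24   # assumed width of the "second plurality of bits"


def split_hash(hash_value: int) -> Tuple[int, int]:
    """Split a hash into (sector_id, bucket_id).

    The most significant SECTOR_BITS select the sector; the remaining
    BUCKET_BITS select a bucket inside that sector.
    """
    if not 0 <= hash_value < (1 << (SECTOR_BITS + BUCKET_BITS)):
        raise ValueError("hash value does not fit the assumed bit width")
    sector_id = hash_value >> BUCKET_BITS
    bucket_id = hash_value & ((1 << BUCKET_BITS) - 1)
    return sector_id, bucket_id


class SectoredIndex:
    """Hypothetical stand-in for a cluster: one in-memory dict per sector."""

    def __init__(self) -> None:
        self.sectors: List[Dict[int, List[str]]] = [
            {} for _ in range(1 << SECTOR_BITS)
        ]

    def store(self, hash_value: int, payload: str) -> int:
        """Store payload under its hash and return the sector that holds it."""
        sector_id, bucket_id = split_hash(hash_value)
        self.sectors[sector_id].setdefault(bucket_id, []).append(payload)
        return sector_id

    def lookup(self, hash_value: int) -> List[str]:
        """Return all payloads stored in the bucket addressed by the hash."""
        sector_id, bucket_id = split_hash(hash_value)
        return list(self.sectors[sector_id].get(bucket_id, []))
```

In this sketch each sector indexes only the buckets addressed to it, so the number of sector bits could be chosen such that a sector's index fits in the main memory of the device hosting it.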

2. The system of claim 1, further comprising instructions that when executed on the one or more processors cause the one or more processors to:

facilitate identification of an unknown video segment associated with a client device, wherein the identification is performed using pixel data received from the client device and the created hash value stored in the determined media database sector.

3. The system of claim 1, wherein creating the hash value comprises:

determining an algorithmically derived value associated with one or more pixels of a slice of the sample of the rasterized video stream; and

creating the hash value using the determined algorithmically derived value.

4. The system of claim 1, wherein creating the hash value comprises:

subtracting a midpoint value established for a slice of the sample of the rasterized video stream from an algorithmically derived value associated with the slice; and

creating the hash value using the value resulting from the subtracting step.

5. The system of claim 4, wherein creating the hash value comprises:

transforming the values resulting from said subtracting step, said transforming using a pre-derived function to evenly distribute said values resulting from said subtracting step and one or more other values; and

creating the hash value from the transformed values.

6. The system of claim 1, wherein creating the hash value comprises:

transforming values resulting from subtracting midpoint values established for a slice from an average of the slice, the transforming using a pre-derived function to evenly distribute the resulting values and one or more other values; and

creating the hash value from the transformed values.

7. The system of claim 1, wherein the hash value is created using evenly distributed values of slices derived from the samples of the rasterized video stream.

8. The system of claim 7, wherein the hash value is created using the evenly distributed values of the slices and data related to the values of the slices over time.

9. The system of claim 1, wherein creating the hash value comprises:

determining, for a slice of the sample of the rasterized video stream, an algorithmically derived value of one or more pixels of the slice;

subtracting a midpoint value established for the slice from the algorithmically derived value;

transforming the value resulting from said subtracting step using a pre-derived function to evenly distribute said value resulting from said subtracting step and one or more other values; and

creating the hash value from the transformed value.

10. The system of claim 1, wherein the hash value is created using the following elements: one or more average values associated with one or more pixels associated with a slice of the sample, one or more midpoint values associated with pixel data associated with the slice over time, and a pre-derived transformation matrix.
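By way of illustration only and not limitation, the hash creation recited in claims 9 and 10 might be sketched as follows. The function names patch_mean and create_hash are hypothetical. The sketch assumes that the algorithmically derived value is the mean pixel value of each slice and that one hash bit is taken from the sign of each transformed value; that quantization step is an assumption made for this sketch and is not recited in the claims. The transformation matrix would be pre-derived in practice; a small hand-written matrix is used here as a placeholder.

```python
from typing import List, Sequence


def patch_mean(pixels: Sequence[float]) -> float:
    """Mean pixel value of one slice of a video sample."""
    return sum(pixels) / len(pixels)


def create_hash(
    means: Sequence[float],
    midpoints: Sequence[float],
    matrix: Sequence[Sequence[float]],
) -> int:
    """Build a hash from per-slice means, midpoints, and a transform matrix.

    Steps:
      1. subtract each slice's midpoint from its mean;
      2. multiply the centered vector by the transformation matrix;
      3. emit one bit per matrix row (1 if the product is positive),
         with the first row becoming the most significant bit.
    Step 3 is an assumed quantization, chosen only for this sketch.
    """
    if len(means) != len(midpoints):
        raise ValueError("need one midpoint per slice")
    centered: List[float] = [m - c for m, c in zip(means, midpoints)]
    hash_value = 0
    for row in matrix:
        if len(row) != len(centered):
            raise ValueError("matrix row length must match slice count")
        projection = sum(w * x for w, x in zip(row, centered))
        hash_value = (hash_value << 1) | (1 if projection > 0 else 0)
    return hash_value


# Two slices, placeholder midpoints, and an identity matrix as the transform.
means = [patch_mean([9, 11]), patch_mean([4, 6])]   # [10.0, 5.0]
example_hash = create_hash(means, [8, 6], [[1, 0], [0, 1]])
```

Under these assumptions, samples whose centered slice values lie on the same side of each projection receive the same bits, so similar samples tend to produce identical or nearby hash values.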

11. The system of claim 1, wherein the hash function comprises a distance associative hash.

12. The system of claim 1, wherein referencing the first plurality of bits of the created hash value comprises referencing the most significant bits of the created hash value.

13. A computer-implemented method for addressing a media database using distance-associative hashing, comprising:

receiving a rasterized video stream;

storing the created hash value in the determined media database sector,

14. The method of claim 13, further comprising:

facilitating identification of an unknown video segment associated with a client device, wherein the identification is performed using pixel data received from the client device and the created hash value stored in the determined media database sector.

15. The method of claim 13, wherein creating the hash value comprises:

creating the hash value using the determined algorithmically derived value.

16. The method of claim 13, wherein creating the hash value comprises:

creating the hash value using the value resulting from the subtracting step;

creating the hash value from the transformed value.

17. A computer-program product for addressing a media database using distance-associative hashing, the computer-program product tangibly embodied in a non-transitory machine-readable storage medium of a computing device, the computer-program product comprising instructions configured to cause one or more data processors to:

receive a rasterized video stream;

store the created hash value in the determined media database sector,

18. The computer program product of claim 17, the instructions further configured to cause the one or more data processors to:

19. The computer program product of claim 17, wherein creating the hash value comprises:

creating the hash value using the value resulting from the subtracting step;

creating the hash value from the transformed value.