AU2013203820B2

AU2013203820B2 - Methods and Apparatus to Extract Data Encoded in Media

Info

Publication number: AU2013203820B2
Application number: AU2013203820A
Authority: AU
Inventors: Venugopal Srinivasan; Alexander Pavlovich Topchy
Original assignee: Nielsen Co US LLC
Current assignee: Nielsen Co US LLC
Priority date: 2008-10-24
Filing date: 2013-04-11
Publication date: 2016-08-04
Anticipated expiration: 2029-10-23
Also published as: AU2013203820A1

Abstract

Methods and apparatus to extract data encoded in media content are disclosed. An example method includes receiving a media content signal, sampling the media content signal to generate digital samples, determining a frequency domain representation of the digital samples, determining a first rank of a first frequency in the frequency domain representation, determining a second rank of a second frequency in the frequency domain representation, combining the first rank and the second rank with a set of ranks to create a combined set of ranks, comparing the combined set of ranks to a set of reference sequences, determining a data represented by the combined set of ranks based on the comparison, and storing the data in a tangible memory.

Description

METHODS AND APPARATUS TO EXTRACT DATA ENCODED IN MEDIA

RELATED APPLICATIONS

[0001] This patent application is a divisional application of Australian Patent Application No. 2009308256 which claimed priority to US Provisional Patent Application Serial No. 61/108,380, "STACKING METHOD FOR ENHANCED WATERMARK DETECTION," filed on October 24, 2008, the disclosure of each of which is incorporated by reference in their entirety.

FIELD OF THE DISCLOSURE

[0002] The present disclosure pertains to monitoring media content and, more particularly, to methods and apparatus to extract data encoded in media content.

BACKGROUND

[0003] Identifying media information and, more specifically, audio streams (e.g., audio information) is useful for assessing audience exposure to television, radio, or any other media. For example, in television audience metering applications, a code may be inserted into the audio or video of media, wherein the code is later detected at monitoring sites when the media is presented (e.g., played at monitored households). The information payload of the code/watermark embedded into the original signal can consist of unique source identification, time of broadcast information, transactional information, or additional content metadata.

[0004] Monitoring sites typically include locations such as, for example, households where the media consumption of audience members or audience member exposure to the media is monitored. For example, at a monitoring site, codes from the audio and/or video are captured and may be associated with audio or video streams of media associated with a selected channel, radio station, media source, etc. The collected codes may then be sent to a central data collection facility for analysis. However, the collection of data pertinent to media exposure or consumption need not be limited to in-home exposure or consumption.

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] FIG. 1 is a block diagram of an example system for encoding data in a media content signal to transmit the data to a location where the media content signal is decoded to extract the data.

[0006] FIG. 2 is a graph of an example frequency spectrum and code indexing.

[0007] FIG. 3 illustrates an example sequence that may be encoded in an audio signal by the example encoder of FIG. 1.

[0008] FIG. 4 illustrates an example message thread.

[0009] FIG. 5 is a block diagram of an example apparatus to implement the decoder of FIG. 1 that includes stack and rank functionality.

[0010] FIG. 6 is a flowchart of an example process to decode a message in audio.

[0011] FIG. 7 is a flowchart of an example process to decode a message in audio using stacking.

[0012] FIG. 8 is a schematic illustration of an example processor platform that may be used and/or programmed to perform any or all of the example machine accessible instructions of FIGS. 6-7 to implement any or all of the example systems, example apparatus and/or example methods described herein.

DETAILED DESCRIPTION

[0013] FIG. 1 is a block diagram of an example system 100 for encoding data in a media content signal to transmit the data to a location where the media content signal is decoded to extract the data. The example system 100 includes an encoder 102 and a decoder 104 with stack and rank functionality. According to the illustrated example, the encoder 102 encodes a received audio signal with received data by amplifying or attenuating frequencies of interest as described in detail herein. The encoded audio signal is transported to another location where it is received by the decoder 104. The decoder 104 includes a stack functionality to stack consecutively received portions of the audio signal. In addition, the decoder 104 includes rank functionality to assign ranks to frequencies that may have been amplified or attenuated by the encoder 102. For example, where frequencies are grouped in neighborhoods of five frequencies, a rank of 0 to 4 may be assigned to each frequency. The decoder 104 then extracts the data from the stacked audio signal as described in detail herein. Stacking the encoded audio signal will, for example, improve the detection reliability of the decoder 104 when stacked portions include redundant or semi-redundant encoded data. While not shown in the illustrated example, the audio signal may also be output by the decoder 104 to be presented on a media presentation device (e.g., a radio, a television, etc.). Alternatively, the encoded audio signal may be transmitted to a media presentation device in parallel with the example decoder 104.

[0014] According to the example of FIG. 1, the encoder 102 receives as input an audio signal and data. The encoder 102 further divides the audio signal into frames, which are blocks of digital audio samples. As described in detail below, the encoder 102 encodes (embeds) the data into the framed audio signal and the encoded frame of audio is tested by the encoder 102 to determine if the modifications to the framed audio signal are significant enough to cause the encoding to be audibly perceptible by a human when the framed audio signal is presented to a viewer (e.g., using psychoacoustic masking). If the modifications to the framed audio signal are too significant and would result in an audible change in the audio, the framed audio is transmitted (e.g., broadcast, delivered to a broadcaster, etc.) without being encoded. Conversely, if the encoded audio frame has audio characteristics that are imperceptibly different from the un-encoded audio frame, the encoded audio frame is transmitted.

[0015] The encoder 102 inserts a unique or semi-unique 15-bit pseudorandom number (PN) synchronization sequence at the start of each message packet. To signal to the decoder 104 that a synchronization sequence is to be transmitted, the first code block of the synchronization sequence uses a triple tone. The triple tone is an amplification of three frequencies causing those frequencies to be maxima in their spectral neighborhoods. Thus, by looking for the triple tone, the decoder 104 can detect that a synchronization sequence is about to be sent without the need for decoding the entire synchronization sequence. An example implementation of a triple tone is described in United States Patent 6,272,176 (‘176 patent’), which is hereby incorporated by reference in its entirety. The example synchronization sequence is one approach for enabling the decoder 104 to detect the start of a new message packet. However, any other indication, signal, flag, or approach may be used.

[0016] The example encoder 102 transmits as many as ten 15-bit PN sequences of message data following the synchronization. Thus, each message in the illustrated example comprises 11 groups: one 15-bit synchronization sequence followed by ten 15-bit message data sequences. However, any number of message data sequences may be transmitted between synchronization sequences. The example message data is transmitted in 15-bit PN sequences having ten error correction bits and five message data bits. In other words, message data is divided into groups of five bits each (e.g., ten 5-bit groups for a 50-bit message). Alternatively, any combination of message data bits and error correction bits may be included in a message data sequence. Each bit of the 15-bit PN sequence is encoded into a 512-sample block of audio. In the example system 100, one bit is transmitted at a time. Each 5 bits of payload data that is encoded as a 15-bit sequence uses 15 blocks of 512 samples (i.e., 7680 samples total). The example encoder 102 includes a 16th block called the null block after the 15 blocks representing the 15-bit sequence. Thus, each message in the illustrated example uses 176 audio blocks: 16 blocks per sequence and 11 sequences per message. In the illustrated example, each message is followed by 11 unencoded blocks to adjust the total message duration to be approximately two seconds in the example encoding. While example encoding and block sizes are described, any desired encoding and block sizes may be used.
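The block counts above can be checked with a short calculation. The following Python sketch is illustrative only: the constant and function names are hypothetical, and the 48 kHz rate is an assumption taken from the example sampling frequency given for the sampler 502 elsewhere in this description.

```python
# Illustrative arithmetic only; all names here are hypothetical.
BLOCK_SAMPLES = 512          # samples per audio block
BITS_PER_SEQUENCE = 15       # bits in each PN sequence
NULL_BLOCKS = 1              # one null block follows each sequence
SEQUENCES_PER_MESSAGE = 11   # one synch sequence plus ten data sequences
TRAILING_UNENCODED = 11      # unencoded blocks that follow each message
SAMPLE_RATE_HZ = 48_000      # assumption: example sampling rate of the decoder


def message_layout():
    """Return (blocks per sequence, encoded blocks, total blocks, seconds)."""
    blocks_per_sequence = BITS_PER_SEQUENCE + NULL_BLOCKS
    encoded_blocks = blocks_per_sequence * SEQUENCES_PER_MESSAGE
    total_blocks = encoded_blocks + TRAILING_UNENCODED
    duration_s = total_blocks * BLOCK_SAMPLES / SAMPLE_RATE_HZ
    return blocks_per_sequence, encoded_blocks, total_blocks, duration_s


blocks_per_sequence, encoded_blocks, total_blocks, duration_s = message_layout()
print(blocks_per_sequence, encoded_blocks, total_blocks, round(duration_s, 3))
```

Under these assumptions the 176 encoded blocks plus 11 unencoded blocks come to 187 blocks, or just under two seconds of audio.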

[0017] To insert a data bit (e.g., one bit of a 15-bit sequence) into an audio frame, the example encoder 102 makes a first selected frequency of the audio frame a local maximum and makes a second selected frequency of the audio frame a local minimum. For example, as shown in FIG. 2, the encoder 102 uses two audio frequency bands or neighborhoods 202 and 204, each including five frequencies or residents. One of the neighborhoods 202 and 204 is encoded to include a resident that is a local maximum and the other neighborhood 202 and 204 is encoded to include a resident that is a local minimum. The residents that are selected to be local maximum and local minimum are based on the coding block on which the example encoder 102 is operating and the value of the data bit to be transmitted. For example, to encode a logical “1” in the fifth encoding block, a resident having index number 50 in the neighborhood 202 is made a local maximum and a resident having index number 60 in the neighborhood 204 is made a local minimum. Conversely, to encode a logical “0” for the same encoding block, the resident having index number 50 in the neighborhood 202 would be made a local minimum and the resident having index number 60 in the neighborhood 204 would be made a local maximum. In other words, the frequencies that are selected do not represent the bit to be sent; the amplitudes at the selected frequencies represent the value of the bit, because the same frequencies may be used whether the bit is a logical “1” or a logical “0”. After encoding, the audio signal may be broadcast to a consumer location, may be transmitted to a broadcaster for broadcasting, may be stored to a storage media, etc.
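The local maximum and local minimum rule can be sketched on a list of spectral magnitudes. This is a simplified illustration, not the encoder 102 itself: the function name, the gain factor, and the exact extents of the two neighborhoods (five indices centered on 50 and 60) are assumptions, and no psychoacoustic masking check is modeled.

```python
# Sketch only: works on a list of magnitudes rather than real audio frames.
def encode_bit(spectrum, hood_1, hood_2, resident_1, resident_2, bit, gain=1.5):
    """Return a copy of spectrum with one resident raised and one lowered.

    For bit 1, resident_1 becomes the maximum of hood_1 and resident_2 the
    minimum of hood_2; for bit 0 the roles are swapped. gain is a
    hypothetical scaling factor and magnitudes are assumed to be positive.
    """
    out = list(spectrum)
    high, low = (resident_1, resident_2) if bit == 1 else (resident_2, resident_1)
    high_hood = hood_1 if high in hood_1 else hood_2
    low_hood = hood_1 if low in hood_1 else hood_2
    out[high] = gain * max(out[i] for i in high_hood)
    out[low] = min(out[i] for i in low_hood) / gain
    return out


HOOD_202 = range(48, 53)   # assumed extent of neighborhood 202
HOOD_204 = range(58, 63)   # assumed extent of neighborhood 204
flat = [1.0 + 0.01 * i for i in range(64)]   # made-up magnitudes

one = encode_bit(flat, HOOD_202, HOOD_204, 50, 60, bit=1)
zero = encode_bit(flat, HOOD_202, HOOD_204, 50, 60, bit=0)
```

The same two indices are modified for either bit value; only which one is raised and which one is lowered changes.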

[0018] The example system 100 may be configured to perform stacking and ranking in a system that is implemented with the Nielsen Audio Encoding System (NAES) described in the ‘176 patent. While this disclosure makes reference to encoding and decoding techniques of the NAES system described in the ‘176 patent by way of example, the methods and apparatus described herein are not limited to operation in conjunction with the techniques of the ‘176 patent. To the contrary, the example methods and apparatus may be implemented in conjunction with any type of encoding or decoding system. For example, the data rates, data grouping, message lengths, parameter lengths, parameter order in messages, number of parameters, etc. may vary based on the implemented encoding system.

[0019] FIG. 3 illustrates an example sequence 300 that may be encoded in an audio signal by the example encoder 102 of FIG. 1. The example sequence 300 includes 15 bits that are encoded in 15 blocks of audio data (e.g., 512 sample blocks). The message bits 302 convey five bits of message data. The message bits 302 are the payload data to be conveyed by the encoding. The error correction bits 304 convey ten bits of error correction data that may be used by the decoder 104 to verify and correct a received message. Each bit of the sequence 300 is encoded in a block of audio data. As described in conjunction with FIG. 1, for each block of audio data, a first selected frequency is made a local maximum and a second selected frequency is made a local minimum.

[0020] FIG. 4 illustrates an example message thread 400. The message thread 400 of the illustrated example includes a synch sequence 402, a first sequence 406, a second sequence 410, a third sequence 414, and no mark blocks 404, 408, and 412. The example synch sequence 402 is a 15 bit sequence that indicates the start of a new message thread. The first sequence 406, the second sequence 410, and the third sequence 414 of the illustrated example are 15 bit sequences that each convey five message payload bits and ten error correction bits as described in conjunction with FIG. 3. The no mark blocks 404, 408, and 412 are single blocks that include no encoding (e.g., 512 samples of audio data in which no frequencies are amplified or attenuated by the encoder 102). While the example message thread 400 is formatted as described, any other formatting may be used. For example, more or fewer sequences may be included in a message thread 400, sequences 406, 410, and 414 may contain more or fewer data bits and/or error correction bits, the no mark blocks 404, 408, and 412 may include multiple blocks, more or fewer no mark blocks 404, 408, and 412 may be included, etc.

[0021] FIG. 5 is a block diagram of an example apparatus to implement the decoder 104 of FIG. 1 that includes stack and rank functionality. The example decoder 104 includes a sampler 502, a time domain to frequency domain converter 504, a ranker 506, a rank buffer 508, a stacker 510, a stacker control 512, a comparator 514, and a reference sequence datastore 516. The example decoder 104 receives an input audio (e.g., an audio portion of a television program) and processes the audio to extract and output data encoded in the audio.

[0022] The sampler 502 of the illustrated example samples the incoming audio. The sampler 502 may be implemented using an analog to digital converter (A/D) or any other suitable technology, to which encoded audio is provided in analog format. The sampler 502 samples the encoded audio at, for example, a sampling frequency of 48 kHz. Of course, other sampling frequencies may be selected in order to increase resolution or reduce the computational load at the time of decoding. Alternatively, the sampler 502 may be eliminated if audio is provided in digitized format.

[0023] The time domain to frequency domain converter 504 of the illustrated example may be implemented using a discrete Fourier transformation (DFT), or any other suitable technique to convert time-based information into frequency-based information. In one example, the time domain to frequency domain converter 504 may be implemented using a sliding DFT in which a spectrum of the code frequencies of interest (e.g., frequency indices 1 to N in FIG. 5) is calculated each time four new samples are provided to the example time domain to frequency domain converter 504. In other words, four new samples are shifted into the analysis window, four old samples are shifted out of the analysis window, and the DFT of the analysis window is computed. Because the boundaries of blocks are not known when decoding, a sliding DFT may operate by sliding 4 samples at a time to give 128 distinct message threads to analyze per 512 samples of audio that are received. Thus, at the end of 128 slides (of four samples each), all 512 samples (i.e., one block worth of samples) will have been processed and analyzed. The resolution of the spectrum produced by the time domain to frequency domain converter 504 increases as the number of samples (e.g., 512 or more) used to generate the spectrum increases. Thus, the number of samples processed by the time domain to frequency domain converter 504 should match the resolution used to select the residents shown in FIG. 2. The finer the frequency spacing between the residents, the more samples that will be used to generate the spectrum for detection of the residents.
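The four-sample slide can be sketched with the standard recursive sliding-DFT update, in which each tracked bin is corrected for the sample leaving the window and the sample entering it, then rotated by one twiddle factor. The sketch below is an assumption about one way to realize the converter 504, not a description of its actual implementation; the function names and the test tone are hypothetical.

```python
import cmath
import math


def dft_bin(window, k):
    """Directly compute DFT bin k of a window (used to seed and to verify)."""
    n = len(window)
    return sum(x * cmath.exp(-2j * cmath.pi * k * m / n)
               for m, x in enumerate(window))


def sliding_bins(samples, bins, window_len=512, hop=4):
    """Yield {bin: complex value} for the first window and after every hop."""
    twiddle = {k: cmath.exp(2j * cmath.pi * k / window_len) for k in bins}
    state = {k: dft_bin(samples[:window_len], k) for k in bins}
    yield dict(state)
    start = 0
    while start + hop + window_len <= len(samples):
        for _ in range(hop):
            old, new = samples[start], samples[start + window_len]
            for k in bins:
                # Drop the oldest sample, add the newest, rotate the phase.
                state[k] = (state[k] - old + new) * twiddle[k]
            start += 1
        yield dict(state)


# Hypothetical test tone with exactly 50 cycles per 512-sample window.
samples = [math.sin(2 * math.pi * 50 * n / 512) for n in range(1024)]
spectra = list(sliding_bins(samples, bins=[50, 60]))
print(len(spectra) - 1)   # number of four-sample slides across one block
```

With 1024 input samples the generator produces the initial spectrum plus 128 slid spectra, matching the 128 candidate alignments per 512-sample block described above.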

[0024] The spectrum produced by the time domain to frequency domain converter 504 passes to the ranker 506. The ranker 506 of the illustrated example ranks the amplitude of each frequency of interest (e.g., RANK 1 to RANK N for the 1 to N frequency indices of interest in FIG. 5) in neighborhoods in the received spectrum relative to the amplitude of the other frequencies in the neighborhood. For example, when there are five frequencies in each neighborhood, the amplitude of each frequency may be ranked on a scale of 0 to 4, where 0 is the lowest amplitude and 4 is the greatest amplitude. While the foregoing example describes ranking each spectrum frequency, any subset of frequencies may alternatively be ranked such as, for example, only frequencies of interest that may have been amplified or attenuated to embed information in the audio data. The ranker 506 outputs a set of rank values to the rank buffer 508.
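The 0-to-4 ranking within a neighborhood can be illustrated as follows. The function name and the amplitude values are hypothetical, and breaking ties by position is an assumption, since the description does not say how equal amplitudes are handled.

```python
def rank_neighborhood(amplitudes):
    """Rank each amplitude from 0 (lowest) to len(amplitudes) - 1 (greatest).

    Ties are broken by position, which is an assumption of this sketch.
    """
    order = sorted(range(len(amplitudes)), key=lambda i: amplitudes[i])
    ranks = [0] * len(amplitudes)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


# Made-up amplitudes for one five-resident neighborhood.
ranks = rank_neighborhood([0.3, 1.2, 0.1, 0.9, 0.5])
print(ranks)  # [1, 4, 0, 3, 2]
```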

[0025] The rank buffer 508 stores the set of rank values in a circular buffer such that once the buffer has been filled, each new set of ranks will replace the oldest set of ranks in the buffer. The rank buffer 508 of the illustrated example stores the 128 sets of ranks (e.g., 128 sets of ranks 1 to N) corresponding to each slide of the time domain to frequency domain converter 504. In addition, the rank buffer 508 may store multiple messages worth of ranks. For example, as described in detail below, the rank buffer 508 may store five messages worth of ranks so that the blocks of messages may be averaged. While the rank buffer 508 is described as a circular buffer, any type of data structure and storage may be used. For example, the rank buffer 508 may comprise one or more registers, one or more files, one or more databases, one or more buffers of any type, etc.
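The replace-the-oldest behavior of a circular buffer can be sketched with a bounded deque. The class name, its methods, and the tiny capacity used in the example are hypothetical; a buffer sized as described above would hold far more rank sets.

```python
from collections import deque


class RankBuffer:
    """Fixed-capacity store in which a new rank set evicts the oldest one."""

    def __init__(self, capacity):
        self._sets = deque(maxlen=capacity)

    def push(self, rank_set):
        # When the deque is full, append() silently discards the oldest entry.
        self._sets.append(tuple(rank_set))

    def oldest(self):
        return self._sets[0]

    def newest(self):
        return self._sets[-1]

    def __len__(self):
        return len(self._sets)


buf = RankBuffer(capacity=3)      # tiny capacity, for illustration only
for n in range(5):
    buf.push((n, 4 - n))          # made-up rank pairs
```

After five pushes into a three-entry buffer, only the three most recent rank sets remain.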

[0026] An example set of ranks may be:

[0027] The stacker 510 takes advantage of message-to-message redundancy to improve the detection of data encoded in audio signals. In particular, when enabled by the stacker control 512, the stacker 510 retrieves the ranks of consecutive messages from the rank buffer 508 and adds the ranks of corresponding blocks of the consecutive messages. The stacker 510 then divides the sums by the number of messages added together. Accordingly, the stacker 510 determines an average of the ranks for consecutive blocks. When messages include redundancy, the ranks will average in order to eliminate errors introduced by noise or host audio. For example, an encoded message may be 50 bits including a broadcaster identifier (e.g., a 16-bit station identifier) followed by a timestamp (e.g., a 32-bit timestamp that denotes time elapsed in seconds since, for example, January 1, 1995), followed by a level specification that allows multiple levels of messages to be included (e.g., a 2-bit level specification). In the example 50 bit message, all but the least significant bits of the message will be repeated for several messages in a row. In the example encoding where messages are divided into ten groups and include one synch group (e.g., 11 total groups), it is expected that the first ten groups will repeat from message to message and the last group (e.g., that contains the three least significant bits of the timestamp and two level specification bits) will change from message to message. Because the three least significant bits can represent eight seconds and messages in the example encoding are encoded into approximately two seconds of audio each, the fourth least significant bit of the message will change after four messages. Accordingly, the synchronization group and the first nine data groups are expected to repeat for four messages (approximately eight seconds).
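The redundancy argument can be made concrete by packing the example 50-bit message and splitting it into ten 5-bit groups. In the sketch below the field order follows the description (station identifier, then timestamp, then level), while the most-significant-bit-first layout, the function names, and the sample values are assumptions.

```python
def pack_message(station_id, timestamp, level):
    """Pack a 16-bit station id, 32-bit timestamp, and 2-bit level into 50 bits."""
    assert 0 <= station_id < 2**16
    assert 0 <= timestamp < 2**32
    assert 0 <= level < 4
    return (station_id << 34) | (timestamp << 2) | level


def five_bit_groups(message):
    """Split a 50-bit message into ten 5-bit groups, most significant first."""
    return [(message >> shift) & 0b11111 for shift in range(45, -1, -5)]


# Hypothetical station id and timestamps, two seconds apart and then eight.
a = five_bit_groups(pack_message(0x1234, 1000, 1))
b = five_bit_groups(pack_message(0x1234, 1002, 1))
c = five_bit_groups(pack_message(0x1234, 1008, 1))
print(a[:9] == b[:9], a[9] != b[9], a[:9] != c[:9])
```

Two messages sent two seconds apart within the same eight-second window share their first nine data groups and differ only in the last group, which is why those nine groups (and the synch group) can be averaged across messages.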

[0028] The stacking process may be performed according to the following formulas:

where p is a message index (e.g., 0 ≤ p < 5 when five consecutive messages are to be averaged), k is a block index (e.g., 0 ≤ k < 16 when there are 16 blocks per sequence), S is the number of consecutive messages to be averaged (e.g., 5 when five consecutive messages are to be averaged), r1_{k,m_n} is the average rank of the first frequency of interest in the k-th block of a message m_n, and r2_{k,m_n} is the average rank of the second frequency of interest in the k-th block of message m_n. For example, a message may be a station identifier and a timestamp that are encoded every 2 seconds. While the least significant bits of the timestamp (e.g., seconds) may change from message to message, the other bits (e.g., more significant bits of a timestamp) will not change between every message. Accordingly, when the ranks of the current message are added to the ranks of the previous four messages, the average ranking can improve detection by reducing the effect of any noise that may have been present for less than all of the messages. When the stacker 510 is enabled, the stacker 510 outputs the stacked set of ranks (e.g., RANK_S 1 to RANK_S N in FIG. 5) to the comparator 514. When the stacker 510 is not enabled, the stacker 510 outputs the set of ranks (e.g., RANK_S 1 to RANK_S N) retrieved from the rank buffer 508 to the comparator 514.
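Consistent with the variable definitions above and with the add-then-divide procedure of the stacker 510, the averaging can be written as shown below. This is a reconstruction from the surrounding text rather than a verbatim reproduction of the formulas; the overline marks the averaged rank for block k of message m_n, and the terms inside the sums are the unaveraged ranks of the same block in the current message and the S − 1 messages preceding it.

```latex
\overline{r1}_{k,m_n} = \frac{1}{S} \sum_{p=0}^{S-1} r1_{k,m_{n-p}},
\qquad
\overline{r2}_{k,m_n} = \frac{1}{S} \sum_{p=0}^{S-1} r2_{k,m_{n-p}}
```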

[0029] In an example, the following ranks may be determined for corresponding packets that are repetitions of the same message:

The sum of the ranks is:

The average of the ranks is:

As shown in the example, even when Block 0 of Message 4 has been ranked in a manner that suggests the opposite data bit as the previous four messages (i.e., 4,2 would suggest a bit value of 1, while the other values suggest a bit value of 0), averaging of the ranking results in an average that suggests a bit value of 0. Accordingly, even when error due to noise is introduced, averaging of the ranks can result in ranking that more closely matches the encoded data.
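The averaging step can be sketched in a few lines. The rank values below are made up for illustration and are not the values of the original example tables; only the 4,2 pair for Block 0 of the last message is taken from the text. The function name is hypothetical.

```python
def stack_ranks(messages):
    """Average (r1, r2) rank pairs block by block across consecutive messages.

    messages is a list of messages; each message is a list of (r1, r2) pairs,
    one pair per block.
    """
    count = len(messages)
    stacked = []
    for block in zip(*messages):
        r1 = sum(pair[0] for pair in block) / count
        r2 = sum(pair[1] for pair in block) / count
        stacked.append((r1, r2))
    return stacked


# Hypothetical ranks for two blocks of five repeated messages. The last
# message has a noisy Block 0 (4, 2) that on its own would suggest a 1.
messages = [
    [(1, 4), (4, 0)],
    [(0, 3), (3, 1)],
    [(1, 4), (4, 0)],
    [(0, 4), (4, 1)],
    [(4, 2), (3, 0)],
]
stacked = stack_ranks(messages)
print(stacked)
```

For Block 0 the averaged first-frequency rank stays well below the averaged second-frequency rank, so the single noisy message does not flip the suggested bit value.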

[0030] The stacker control 512 controls when the stacker 510 is enabled or disabled. For example, when the stacker 510 is disabled, messages may be processed one at a time without any averaging of the ranks. When the stacker 510 is enabled by the stacker control 512, stacking of messages is performed as described herein or using any other process. The stacker control 512 may enable stacking based on any criteria. For example, the stacker control 512 may provide selective stacking by automatically enabling stacking when noise is detected, when a poor quality audio connection is present (e.g., when a microphone is used rather than a physical connection), when the decoder 104 is at a distance from an audio source (e.g., a mobile device across the room from an audio source), etc. Additionally or alternatively, the stacker control 512 may be manually controlled to enable stacking when requested by a user and/or may be remotely controlled by a message from a central location, the encoder 102, etc.

[0031] The comparator 514 of the illustrated example receives the set of ranks or stacked ranks (“set of ranks”) for a sequence from the stacker 510 and determines if a synch sequence has been recognized. If a synch sequence has not been detected, the comparator 514 compares the received set of ranks to a reference synch sequence and sets a synch detected flag if the set of ranks is determined to correspond to a synch sequence. If a synch sequence has previously been detected, the comparator 514 compares the set of ranks to a reference set of sequences stored in the reference sequence datastore 516. The reference set of sequences comprises a listing of possible ranks and associated high or low indications for the frequencies of interest for each block. For example, when each sequence includes 5 data bits, 10 error correction bits, and one blank block, there would be 2^5 possible Bose and Ray-Chaudhuri (BCH) codewords of 15 bits, each bit having an indication of whether each of two frequencies of interest were attenuated or amplified (i.e., 30 indications). To determine the sequence corresponding to the set of ranks, the set of ranks is compared to each of the reference sequences. The reference sequence with the smallest difference from the set of ranks is identified as the received sequence.

[0032] For example, when the received set of ranks provided by the stacker 510 is:

The closest reference sequence may be the following set for data bits 0,0,1,1,0:

When compared by determining the distance or absolute value of the difference of the reference ranks and the received set of ranks, the difference is:

The numerical difference (e.g., Hamming distance) is the sum of the difference row, which equals 20. This difference would be compared to the difference for all other possible sequences. If this difference is less than all other distances, then the reference sequence is determined to be the closest match.

[0033] In addition to determining the closest sequence from the reference set of sequences, the comparator 514 may also determine if the difference for the closest sequence exceeds a threshold. For example, the comparator 514 may discard the result if the difference is greater than a threshold, meaning that the closest reference sequence was significantly different than the received set of ranks. In other words, the comparator 514 may ensure that the received set of ranks are close enough to the determined reference sequence before outputting the sequence.
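The nearest-reference search with a rejection threshold can be sketched as follows. The sum of absolute rank differences is used as the distance, as in the worked example above. The four-block reference patterns, the idealized reference ranks (4 for an amplified frequency, 0 for an attenuated one), the threshold values, and all names are assumptions; in particular, the patterns are not actual BCH codewords.

```python
def expected_ranks(bits, high=4, low=0):
    """Idealized (r1, r2) reference ranks for a bit pattern (an assumption)."""
    return [(high, low) if bit else (low, high) for bit in bits]


def distance(received, reference):
    """Sum of absolute rank differences over all blocks and both frequencies."""
    return sum(abs(a1 - b1) + abs(a2 - b2)
               for (a1, a2), (b1, b2) in zip(received, reference))


def closest_sequence(received, references, max_distance):
    """Return (label, distance) of the nearest reference, or None if too far."""
    label, ref = min(references.items(),
                     key=lambda item: distance(received, item[1]))
    d = distance(received, ref)
    return (label, d) if d <= max_distance else None


# Hypothetical four-block reference set.
references = {
    "0011": expected_ranks([0, 0, 1, 1]),
    "0101": expected_ranks([0, 1, 0, 1]),
    "1100": expected_ranks([1, 1, 0, 0]),
}
received = [(1.2, 3.4), (0.6, 3.8), (3.6, 0.4), (2.0, 2.0)]  # made-up ranks

accepted = closest_sequence(received, references, max_distance=10.0)
rejected = closest_sequence(received, references, max_distance=5.0)
print(accepted, rejected)
```

With the looser threshold the pattern "0011" is returned together with its distance; with the tighter threshold the same best match is discarded because it is not close enough.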

[0034] The example comparator 514 is further configured to reconstruct the least significant bits (LSB) of a detected sequence. The LSB may need to be reconstructed when the stacker is enabled and several messages are averaged. Such averaging will cause the LSB (or other rapidly changing data) that varies among the averaged messages to be recreated. Any method for reconstructing the data may be used. For example, if the data to be reconstructed is the LSB of a timestamp, one message may be detected without the use of stacking and a timer may be used to determine the difference in time between the known LSB and the current message so that the LSB of the timestamp can be recreated and the determined message modified to include the correct LSB.
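One way to picture the timer-based approach is to overwrite the low bits of a stacked timestamp with the low bits of a known timestamp advanced by the elapsed time. This is a minimal sketch under stated assumptions: the function name and parameters are hypothetical, the timestamp counts seconds, three low bits are treated as unreliable, and the more significant bits of the stacked result are assumed to be correct.

```python
def reconstruct_timestamp(stacked_timestamp, reference_timestamp,
                          seconds_since_reference, lsb_bits=3):
    """Replace the low bits of a stacked timestamp using a timer.

    reference_timestamp is a timestamp decoded earlier without stacking;
    seconds_since_reference is the timer reading since that decode.
    """
    mask = (1 << lsb_bits) - 1
    expected_now = reference_timestamp + seconds_since_reference
    return (stacked_timestamp & ~mask) | (expected_now & mask)


# Hypothetical values: a clean decode of 1000, then 6 seconds elapse and the
# stacked decode yields 1001, whose three low bits are not trustworthy.
fixed = reconstruct_timestamp(1001, reference_timestamp=1000,
                              seconds_since_reference=6)
print(fixed)  # 1006
```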

[0035] The reference sequence datastore 516 of the illustrated example may be implemented by any type of data storage. For example, the reference sequence datastore 516 may be a file, a database, a table, a list, an array, or any other type of datastore. While the example reference sequence datastore 516 stores the 32 possible BCH sequences, any number of sequences may be stored. For example, a partial set of sequences may be stored.

[0036] Flowcharts representative of example processes that may be executed to implement some or all of the elements of the system 100 and the decoder 104 are shown in FIGS. 6-7.

[0037] In these examples, the process represented by each flowchart may be implemented by one or more programs comprising machine readable instructions for execution by: (a) a processor, such as the microprocessor 812 shown in the example computer 800 discussed below in connection with FIG. 8, (b) a controller, and/or (c) any other suitable device. The one or more programs may be embodied in software stored on a tangible medium such as, for example, a flash memory, a CD-ROM, a floppy disk, a hard drive, a DVD, or a memory associated with the processor 812, but the entire program or programs and/or portions thereof could alternatively be executed by a device other than the microprocessor 812 and/or embodied in firmware or dedicated hardware (e.g., implemented by an application specific integrated circuit (ASIC), a programmable logic device (PLD), a field programmable logic device (FPLD), discrete logic, etc.). For example, any one, some, or all of the example mobile communications system components could be implemented by any combination of software, hardware, and/or firmware. Also, some or all of the processes represented by the flowcharts of FIGS. 6-7 may be implemented manually.

[0038] Further, although the example processes are described with reference to the flowcharts illustrated in FIGS. 6-7, many other techniques for implementing the example methods and apparatus described herein may alternatively be used. For example, with reference to the flowcharts illustrated in FIGS. 6-7, the order of execution of the blocks may be changed, and/or some of the blocks described may be changed, eliminated, combined, and/or subdivided into multiple blocks. While the processes of FIGS. 6-7 are described in conjunction with the decoder 104, any apparatus or system may implement the processes of FIGS. 6-7.

[0039] FIG. 6 is a flowchart of an example process to decode a message in audio. The process of FIG. 6 begins when the sampler 502 updates a current audio block by sampling 4 samples and discarding 4 samples from an analysis window (block 602). The example time domain to frequency domain converter 504 performs a sliding FFT to convert the sampled audio from the time domain to the frequency domain (block 604). The ranker 506 ranks the code frequencies in the converted audio (block 606). For example, as described above, frequencies of interest may be ranked on a scale of 0 to 4 when there are five frequencies in each neighborhood. The determined ranks are stored in the rank buffer 508 (block 608). When the rank buffer 508 is a circular buffer, the addition of the determined ranks will eliminate a previously stored rank. In addition, when the rank buffer 508 is a circular buffer, an index indicating the point at which the next set of ranks should be inserted into the rank buffer 508 is incremented (block 610).

[0040] The comparator 514 then generates a rank distribution array across the number of blocks in a sequence (e.g., 15 blocks) (block 612). Next, the comparator 514 determines if a synch sequence has previously been detected (block 614). The synch sequence indicates the start of a message. Therefore, when the synch has previously been detected, a message thread has started. When a synch sequence has not previously been detected, control proceeds to block 624, which is described below.

[0041] When a synch sequence has previously been detected (block 614), the comparator 514 generates match scores against all potential sequences (e.g., 32 possible BCH sequences) (block 616). For example, the comparator 514 may determine a distance between the rank distribution and each of the potential sequences. The comparator 514 then selects the potential sequence with the greatest score (e.g., smallest distance) (block 618). The comparator 514 determines if the selected score exceeds a threshold (block 620). For example, if the score is a distance, the comparator 514 determines if the distance is less than a threshold distance. When the score does not exceed the threshold, control proceeds to block 602 to continue processing.

[0042] When the score exceeds the threshold (block 620), the comparator 514 assigns the value to the sequence (block 622). Control then proceeds to block 602 to continue processing.

[0043] Returning to block 624, when a synch sequence has not been previously detected (block 614), the comparator 514 generates a match score for the synch sequence (block 624). For example, as described above, the comparator 514 may determine a distance between the rank distribution and the reference synch sequence. The comparator 514 determines if the score exceeds a threshold (block 626). When the score does not exceed the threshold, control proceeds to block 602 to continue processing. When the score exceeds the threshold, a flag is set indicating that a synch has been detected (block 628). Control then proceeds to block 602 to continue processing. While a flag is described above, any indication that a synch has been detected may be used. For example, a variable may be stored, the synch sequence may be stored in a table, etc. In addition, while the example process includes a separate branch for detecting a synch sequence, synch sequences may be detected in the same branch as other sequences and processing may later be performed to identify a synch sequence that indicates the start of a message thread. Further, while the process of FIG. 6 is illustrated as a continuous loop, any flow may be utilized.
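The two branches of FIG. 6 (synch search, then sequence matching) can be sketched as a small state machine. This is a simplified illustration, not the decoder 104: rank distributions are flattened to short lists, the score is a distance that passes the threshold test when it is small enough, the class and method names are hypothetical, and the flag is never cleared because the description does not state when a new synch search begins.

```python
def distance(a, b):
    """Sum of absolute differences between two flat rank lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


class DecodeLoop:
    """Tracks the synch detected flag and collects accepted sequences."""

    def __init__(self, synch_reference, references, max_distance):
        self.synch_reference = synch_reference
        self.references = references
        self.max_distance = max_distance
        self.synch_detected = False
        self.decoded = []

    def step(self, rank_distribution):
        """Handle one rank distribution array (roughly blocks 614 to 628)."""
        if not self.synch_detected:
            # Blocks 624-628: score against the synch reference only.
            if distance(rank_distribution, self.synch_reference) <= self.max_distance:
                self.synch_detected = True
            return None
        # Blocks 616-622: score against every reference, keep the closest.
        label, ref = min(self.references.items(),
                         key=lambda item: distance(rank_distribution, item[1]))
        if distance(rank_distribution, ref) <= self.max_distance:
            self.decoded.append(label)
            return label
        return None


# Hypothetical four-value references and inputs.
loop = DecodeLoop(synch_reference=[4, 0, 4, 0],
                  references={"A": [0, 4, 0, 4], "B": [4, 4, 0, 0]},
                  max_distance=2)
results = [loop.step(d) for d in ([0, 4, 0, 4], [4, 0, 3, 0],
                                  [0, 4, 1, 4], [2, 2, 2, 2])]
print(results, loop.decoded)
```

The first input matches a data reference but is ignored because no synch has been seen yet; the second sets the flag; the third is accepted; the fourth is too far from every reference and is discarded.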

[0044] FIG. 7 is a flowchart of an example process to decode a message in audio. The process of FIG. 7 utilizes stacking to improve decoding accuracy. The process of FIG. 7 begins when the sampler 502 updates a current audio block by sampling 4 samples and discarding 4 samples from an analysis window (block 702). The example time domain to frequency converter 504 performs a sliding FFT to convert the sampled audio from the time domain to the frequency domain (block 704). The ranker 506 ranks the code frequencies in the converted audio (block 706). For example, as described above, frequencies of interest may be ranked on a scale of 0 to 4 when there are five frequencies in each neighborhood. The stacker 510 then adds the determined ranks to the ranks of corresponding blocks of previous messages and divides the sum by the number of messages to determine an average rank (block 707). For example, the determined ranks may be added to the corresponding ranks of the previous 4 messages.
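The averaging performed at block 707 can be sketched as follows. The function name `stack_ranks` and the sample rank values are assumptions for illustration; the sketch assumes the ranks of the corresponding block from each previous message are already available as lists of equal length.

```python
from typing import List, Sequence


def stack_ranks(
    current_ranks: Sequence[int],
    previous_message_ranks: Sequence[Sequence[int]],
) -> List[float]:
    """Average the current block's ranks with the ranks of the corresponding
    block in each previous message."""
    message_count = 1 + len(previous_message_ranks)
    averaged = []
    for position, rank in enumerate(current_ranks):
        total = rank + sum(prev[position] for prev in previous_message_ranks)
        averaged.append(total / message_count)
    return averaged


# Ranks on a scale of 0 to 4 for three code frequencies, stacked with the
# corresponding block of the previous 4 messages (hypothetical values).
current = [4, 0, 2]
previous = [[4, 1, 2], [3, 0, 2], [4, 0, 1], [0, 4, 3]]
average = stack_ranks(current, previous)
```

In this example the fourth previous message disagrees strongly with the others, yet the averaged ranks still favor the pattern common to the remaining messages, which is the intended benefit of stacking.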

[0045] The average ranks are stored in the rank buffer 508 (block 708). When the rank buffer 508 is a circular buffer, the addition of the average ranks will eliminate a previously stored rank. In addition, when the rank buffer 508 is a circular buffer, an index indicating the point at which the next set of ranks should be inserted to the rank buffer 508 is incremented (block 710). Alternatively, the ranks may be stored in the rank buffer 508 after block 706 and may be retrieved from the rank buffer 508 as part of block 707.
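A circular rank buffer with an insertion index, as described for blocks 708 and 710, might look like the following sketch. The class name `CircularRankBuffer`, its methods, and the capacity used in the example are hypothetical.

```python
from typing import List, Optional, Sequence


class CircularRankBuffer:
    """Fixed-size buffer in which each insert overwrites the oldest entry."""

    def __init__(self, capacity: int):
        self._slots: List[Optional[List[float]]] = [None] * capacity
        self._next = 0  # index at which the next set of ranks is inserted

    def insert(self, ranks: Sequence[float]) -> None:
        self._slots[self._next] = list(ranks)            # block 708
        self._next = (self._next + 1) % len(self._slots)  # block 710

    def latest(self, count: int) -> List[Optional[List[float]]]:
        """Return the most recent `count` entries, oldest first."""
        size = len(self._slots)
        if count > size:
            raise ValueError("count exceeds buffer capacity")
        start = (self._next - count) % size
        return [self._slots[(start + offset) % size] for offset in range(count)]


buffer = CircularRankBuffer(capacity=3)
for value in range(4):
    buffer.insert([value])  # the fourth insert overwrites the first entry
recent = buffer.latest(3)
```

The `latest` method is one way the comparator could read back the most recent blocks when building a rank distribution array.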

[0046] The comparator 514 then generates a rank distribution array across the number of blocks in a sequence (e.g., 15 blocks) (block 712). Next, the comparator 514 determines if a synch sequence has previously been detected (block 714). The synch sequence indicates the start of a message. Therefore, when the synch has previously been detected, a message thread has started. When a synch sequence has not previously been detected, control proceeds to block 724, which is described below.

[0047] When a synch sequence has previously been detected (block 714), the comparator 514 generates match scores against all potential sequences (e.g., 32 possible BC sequences) (block 716). For example, the comparator 514 may determine a distance between the rank distribution and each of the potential sequences. The comparator 514 then selects the potential sequence with the greatest score (e.g., smallest distance) (block 718). The comparator 514 determines if the selected score exceeds a threshold (block 720). For example, if the score is a distance, the comparator 514 determines if the distance is less than a threshold distance. When the score does not exceed the threshold, control proceeds to block 702 to continue processing.

[0048] When the score exceeds the threshold (block 720), the comparator 514 assigns the value to the sequence (block 722). The comparator 512 then reconstructs any data that may have been corrupted by the stacking process. For example, the comparator 512 may determine a corrupted portion of a timestamp (e.g., a second indication) by decoding one message and tracking the amount of time that passes between the decoded message and a currently detected message. Control then proceeds to block 702 to continue processing.
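The timestamp reconstruction mentioned above can be sketched under stated assumptions: that the corrupted portion is a seconds field wrapping at 60, and that the decoder has a local clock from which elapsed time can be measured. The function name `reconstruct_seconds` and its parameters are hypothetical and are not taken from the description.

```python
def reconstruct_seconds(
    anchor_seconds: int,
    anchor_clock: float,
    current_clock: float,
) -> int:
    """Estimate the seconds field of the current message from one fully
    decoded (anchor) message plus the time elapsed since that message."""
    elapsed = current_clock - anchor_clock
    # Assumption: the seconds field wraps at 60.
    return (anchor_seconds + round(elapsed)) % 60


# An anchor message decoded with seconds = 58 at local clock 100.0 s; a later
# message is detected at local clock 106.0 s (hypothetical values).
estimated = reconstruct_seconds(58, 100.0, 106.0)
```

Because stacking averages blocks across several messages, fields that change from message to message cannot be read directly from the stacked ranks, which is why an elapsed-time estimate of this kind is useful.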

[0049] Returning to block 724, when a match has not been previously detected (block 714), the comparator 514 generates a match score for the synch sequence (block 724). For example, as described above, the comparator 514 may determine a distance between the rank distribution and the reference synch sequence. The comparator 514 determines if the score exceeds a threshold (block 726). When the score does not exceed the threshold, control proceeds to block 702 to continue processing. When the score exceeds the threshold, a flag is set indicating that a synch has been detected (block 728). Control then proceeds to block 702 to continue processing. While a flag is described above, any indication that a synch has been detected may be used. For example, a variable may be stored, the synch sequence may be stored in a table, etc. In addition, while the example process includes a separate branch for detecting a synch sequence, synch sequences may be detected in the same branch as other sequences and processing may later be performed to identify a synch sequence that indicates the start of a message thread. Further, while the process of FIG. 7 is illustrated as a continuous loop, any flow may be utilized.

[0050] FIG. 8 is a schematic diagram of an example processor platform 800 that may be used and/or programmed to implement any or all of the example system 100 and the decoder 104, and/or any other component described herein. For example, the processor platform 800 can be implemented by one or more general purpose processors, processor cores, microcontrollers, etc. Additionally, the processor platform 800 may be implemented as a part of a device having other functionality. For example, the processor platform 800 may be implemented using processing power provided in a mobile telephone, or any other handheld device.

[0051] The processor platform 800 of the example of FIG. 8 includes at least one general purpose programmable processor 805. The processor 805 executes coded instructions 810 and/or 812 present in main memory of the processor 805 (e.g., within a RAM 815 and/or a ROM 820). The processor 805 may be any type of processing unit, such as a processor core, a processor and/or a microcontroller. The processor 805 may execute, among other things, example machine accessible instructions implementing the processes described herein. The processor 805 is in communication with the main memory (including ROM 820 and/or the RAM 815) via a bus 825. The RAM 815 may be implemented by DRAM, SDRAM, and/or any other type of RAM device, and ROM may be implemented by flash memory and/or any other desired type of memory device. Access to the memory 815 and 820 may be controlled by a memory controller (not shown).

[0052] The processor platform 800 also includes an interface circuit 830. The interface circuit 830 may be implemented by any type of interface standard, such as a USB interface, a Bluetooth interface, an external memory interface, serial port, general purpose input/output, etc. One or more input devices 835 and one or more output devices 840 are connected to the interface circuit 830.

[0053] Although certain example apparatus, methods, and articles of manufacture are described herein, other implementations are possible. The scope of coverage of this patent is not limited to the specific examples described herein. On the contrary, this patent covers all apparatus, methods, and articles of manufacture falling within the scope of the invention.

Claims

CLAIMS:

1. A method to extract information from media, the method comprising: sampling a media signal to generate digital samples; determining a frequency domain representation of the digital samples; determining a first rank of a first frequency in the frequency domain representation; determining a second rank of a second frequency in the frequency domain representation; combining, via a processor, the first rank and the second rank with a set of ranks to create a combined set of ranks; comparing the combined set of ranks to a set of reference sequences including determining a set of distances between the combined set of ranks and one or more of the sequences in the reference set of sequences; determining information represented by the combined set of ranks based on the comparison including selecting a sequence in the reference set of sequences that has a smallest distance; and storing the information in a memory device.
2. The method as defined in claim 1, wherein the first rank indicates an amplitude of the first frequency relative to other frequencies in a neighborhood.
3. The method as defined in claim 2, wherein a number of frequencies in the neighborhood is equal to a number of possible rank values.
4. The method as defined in claim 1, further comprising determining a first average rank for the first frequency in the frequency domain representation and determining a second average rank for the second frequency in the frequency domain representation.
5. The method as defined in claim 4, wherein the information is encoded in the media signal after T seconds and wherein determining a first average rank for the first frequency comprises determining a third rank for a first frequency approximately T seconds before determining the first rank and adding the first rank and the third rank.
6. An apparatus to extract information from media, the apparatus comprising: a sampler to sample a media signal to generate digital samples; a time domain to frequency domain converter to determine a frequency domain representation of the digital samples; a ranker to determine a first rank of a first frequency in the frequency domain representation and to determine a second rank of a second frequency in the frequency domain representation; and a comparator to combine the first rank and the second rank with a set of ranks to create a combined set of ranks, to compare the combined set of ranks to a set of reference sequences including determining a set of distances between the combined set of ranks and one or more of the sequences in the reference set of sequences, and to determine information represented by the combined set of ranks based on the comparison including selecting a sequence in the reference set of sequences that has a smallest distance.
7. The apparatus as defined in claim 6, wherein the first rank indicates an amplitude of the first frequency relative to other frequencies in a neighborhood.
8. The apparatus as defined in claim 7, wherein a number of frequencies in the neighborhood is equal to a number of possible rank values.
9. The apparatus as defined in claim 6, further comprising a stacker to determine a first average rank for the first frequency in the frequency domain representation and to determine a second average rank for the second frequency in the frequency domain representation.
10. The apparatus as defined in claim 9, wherein the information is encoded in the media signal after T seconds and wherein the ranker is to determine a first average rank for the first frequency by determining a third rank for a first frequency approximately T seconds before determining the first rank and by adding the first rank and the third rank.
11. A tangible computer readable medium excluding propagating signals and storing instructions that, when executed, cause a machine to extract information from media by at least: sampling a media signal to generate digital samples; determining a frequency domain representation of the digital samples; determining a first rank of a first frequency in the frequency domain representation; determining a second rank of a second frequency in the frequency domain representation; combining the first rank and the second rank with a set of ranks to create a combined set of ranks; comparing the combined set of ranks to a set of reference sequences including determining a set of distances between the combined set of ranks and one or more of the sequences in the reference set of sequences; and determining information represented by the combined set of ranks based on the comparison including selecting a sequence in the reference set of sequences that has the smallest distance.
12. The computer readable medium as defined in claim 11, wherein the first rank indicates an amplitude of the first frequency relative to other frequencies in a neighborhood.
13. The computer readable medium as defined in claim 12, wherein a number of frequencies in the neighborhood is equal to a number of possible rank values.
14. The computer readable medium as defined in claim 11, wherein the instructions cause the machine to determine a first average rank for the first frequency in the frequency domain representation and determine a second average rank for the second frequency in the frequency domain representation.
15. The computer readable medium as defined in claim 14, wherein the instructions cause the machine to encode the information in the media signal after T seconds and wherein the instructions cause the machine to determine a first average rank for the first frequency by determining a third rank for a first frequency approximately T seconds before determining the first rank and adding the first rank and the third rank.