GB2552349A - Sample synchronisation - Google Patents

Publication number
GB2552349A
Authority
GB
United Kingdom
Prior art keywords
hash
block
audio stream
point
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1612560.1A
Other versions
GB2552349B (en)
GB201612560D0 (en)
Inventor
Law Malcolm
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to GB1612560.1A
Publication of GB201612560D0
Priority to PCT/GB2017/052129 (WO2018015752A1)
Publication of GB2552349A
Application granted
Publication of GB2552349B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0017 Lossless audio signal coding; Perfect reconstruction of coded audio signal by transmission of coding error
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/167 Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L7/00 Arrangements for synchronising receiver with transmitter
    • H04L7/0054 Detection of the synchronisation error by features other than the received signal transition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L7/00 Arrangements for synchronising receiver with transmitter
    • H04L7/04 Speed or phase control by synchronisation signals
    • H04L7/048 Speed or phase control by synchronisation signals using the properties of error detecting or error correcting codes, e.g. parity as synchronisation signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/436 Interfacing a local distribution network, e.g. communicating with another STB or one or more peripheral devices inside the home
    • H04N21/43615 Interfacing a Home Network, e.g. for connecting the client to a plurality of peripherals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams

Abstract

A method of specifying a point 5 in a Pulse-Code Modulation (PCM) audio stream transmitted over a communication channel comprises transmitting a message 6 containing a hash 10 of a block of samples of the audio stream related to the point, such as a hash of a block of n≥8 samples of the PCM audio stream having a predetermined relationship to the point. The method may also comprise including in the message a command to take effect at the point, e.g. a volume control change to take effect at a particular point. Specifying the point at which a command takes effect overcomes the problems of sending data and control signals over respective links with potentially differing and variable latency. The hash may comprise a Cyclic Redundancy Check (CRC). A corresponding value of a sample counter can also be included in the message, allowing synchronisation of the counter. Conversely, a method of identifying a point in a PCM audio stream received over a communication channel comprises successively updating a hash 20 of overlapping blocks in the audio stream until the updated hash matches 21 one received in a message. A transmitter and a receiver for performing the methods are also proposed.

Description

(71) Applicant(s):
Malcolm Law
Hills Road, Steyning, West Sussex, BN44 3QG, United Kingdom
(72) Inventor(s):
Malcolm Law
(56) Documents Cited:
GB 2356758 A
EP 2434756 A
US 20070157070 A1
EP 2880809 A1
JP 2000349701 A
US 20050226601 A1
(58) Field of Search:
INT CL G10L, H04L
Other: WPI, EPODOC, INSPEC, TXTA, XPESP, XPI3E, XPLNCS
(74) Agent and/or Address for Service:
Gill Jennings & Every LLP
The Broadgate Tower, 20 Primrose Street, LONDON, EC2A 2ES, United Kingdom
(54) Title of the Invention: Sample synchronisation
Abstract Title: Specifying and identifying a point in a PCM audio stream
[Drawings, sheets 1/2 and 2/2: Fig. 1 and Fig. 2 (transmitter and receiver); images not reproduced]
SAMPLE SYNCHRONISATION
Field of the Invention
The invention relates to an efficient method of identifying a specific point in an audio stream, particularly for applying commands thereto.
Background to the Invention
Common data links, for example USB and Bluetooth (RTM), are sometimes used as a means of transporting audio from one device to another. An example of such a system would be a personal computer acting as a media server connected to a digital to analogue converter (DAC), which in turn may be connected to a number of loudspeakers. Due to the variety of software and hardware components such a system may take, latency along the audio link, whether by USB or otherwise, is often both unknown and variable compared to control messages which may be communicated by different paths, or different protocols on the same wire.
Audio processing, such as volume control, may be performed at various points along the system. Figure 1 illustrates this situation, where both audio 4 and control 7 data are sent from a transmitter 1 to a receiver 2, which performs some action 8 on the audio governed by the control data. However, the relative delay between the audio 4 and control 7 paths is unknown and may be variable.
Sometimes altering the processing configuration will be in response to a user request, and the precise point in relation to the audio where the change is applied does not really matter.
However, there are also situations where an early device in the chain wishes a change to be precisely timed, and a later device in the chain implements that change. An example would be if a DAC was to change volume level in response to ReplayGain information. The change should be implemented precisely on the track boundary, as it can involve a substantial change in volume, and if the timing of the change is approximate then it could result in the end of the first track or the beginning of the second being erroneously loud or quiet. However, the location of the track boundary is not known in the continuous audio stream received by the DAC.
There is therefore a need for a practical method to effect sample accurate configuration changes across such a link, for example volume changes over a USB connection.
Summary of the Invention
According to a first aspect of the present invention there is provided a method of specifying a point in a PCM audio stream losslessly transmitted over a communication channel, the method comprising the steps of:
choosing the point in the PCM audio stream; and, transmitting a message containing a hash of a block of n > 8 samples of the PCM audio stream having a predetermined relationship to the point.
In this way, a specific point in the PCM audio stream can be identified and its location transmitted with sufficient accuracy and in a computationally efficient manner. The hash contained in the message allows the point to be subsequently determined, such that a receiver may apply a related command to a precise position in the audio specified by the transmitter.
In some embodiments, the method comprises the further step of including the command in the message, and the point would be the position where the command is to take effect. In this way both the identifying information about the point and the related command are conveyed in a single message.
In some embodiments the method comprises the further step of maintaining a sample counter, and including the value of the sample counter at the point in the message. In this way, the transmitter and receiver establish a synchronised sample counter, allowing subsequent messages to specify precise positions in the audio where commands are to take effect by referencing the value of the sample counter at that position.
In some embodiments the message may be transmitted over a communication channel that is asynchronous to the PCM audio stream.
In some preferred embodiments the hash has the property that for any hash value H, block of audio sample values X, and value $2^b$ corresponding to a bitplane in the PCM audio, there exists a block of audio sample values X’ such that the hash of X’ computes as H and each audio sample value in X’ is equal to the corresponding value in X plus one of 0, $+2^b$ and $-2^b$. In this way, a hash collision which might lead to the receiver mistakenly identifying the point is unlikely even for repetitive audio.
In some embodiments the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the property that $h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$, $x_1$, $x_{n-1}$ and $x_n$. In this way, the computational cost for the receiver scanning the audio to find a block with the correct hash is kept low and independent of the block size n.
Preferably, the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the stronger property that $h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$ and $x_n$. In some of these embodiments, the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the property that $h(x_1, x_2, \ldots, x_{n-1}, 0)$ is a bit rotation of $h(0, x_1, \ldots, x_{n-1})$. Linear hash functions of this form lead to extremely simple and computationally efficient update formulae for the receiver.
In some embodiments the hash comprises a cyclic redundancy check (CRC) and the order of the polynomial defining the CRC divides the size of the data block in bits.
The method may comprise the further step of checking if the block of n samples of the PCM audio stream matches another block of n samples. In the event of a match, the method may further comprise choosing a different point. In this way the transmitter can avoid ambiguity which might give rise to the receiver misidentifying the point.
In some embodiments, the method may further comprise altering the block of n samples of the PCM audio stream before transmitting the message. In this way the transmitter can minimise the chance of the receiver mistakenly identifying the point even on exactly repeating audio, typically by making a small pseudorandom alteration. Preferably the transmitter also includes in the message information directing how the receiver is to alter the block of n samples of the PCM audio stream. In this way the transmitter can specify the inverse alteration so the receiver can restore the exact original audio before modification by the transmitter.
According to a second aspect of the present invention there is provided a method of identifying a point in a PCM audio stream losslessly received over a communication channel, the method comprising the steps of:
receiving a message containing a hash value; and, computing a hash of successive overlapping blocks of the PCM audio stream until a block is found whose hash matches the received hash value.
In this way, a specific chosen point can be identified in a received PCM audio stream with sufficient accuracy. The received message containing the hash allows the point to be so determined, such that a related command may then be applied at a precise position in the audio requested by the transmitter.
In some embodiments the method of the second aspect comprises the further steps of:
decoding a command from the received message; and, implementing the command at a point in the PCM audio stream having a predetermined relationship to the block whose hash matches the received hash value.
In some embodiments the method comprises the further steps of:
decoding a value for a sample counter from the received message; and, setting a sample counter to the decoded value at a point in the PCM audio stream having a predetermined relationship to the block whose hash matches the received hash value.
In some embodiments the message is received over a communication channel that is asynchronous to the PCM audio stream.
The hash may have the property that for any hash value H, block of audio sample values X, and value $2^b$ corresponding to a bitplane in the PCM audio, there exists a block of audio sample values X’ such that the hash of X’ computes as H and each audio sample value in X’ is equal to the corresponding value in X plus one of 0, $+2^b$ and $-2^b$.
Alternatively, the hash $h(x_0, x_1, \ldots, x_{n-1})$ may have the property that $h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$, $x_1$, $x_{n-1}$ and $x_n$. In this case the hash $h(x_0, x_1, \ldots, x_{n-1})$ may also have the stronger property that $h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$ and $x_n$. Additionally, the hash $h(x_0, x_1, \ldots, x_{n-1})$ may have the property that $h(x_1, x_2, \ldots, x_{n-1}, 0)$ is a bit rotation of $h(0, x_1, \ldots, x_{n-1})$.
In some embodiments, the hash may comprise a cyclic redundancy check ‘CRC’ and the order of the polynomial defining the CRC divides the size of the data block in bits.
The method of the second aspect may comprise the further step of computing the hash of a successive block of the PCM audio stream using the hash value computed for the preceding overlapping block of the PCM audio stream. In this way, the audio stream can be scanned for a block which hashes to the received value in a computationally efficient way, especially if the hash function has some of the properties disclosed above.
In some embodiments, the method may comprise the further steps of retrieving information from the message directing how the block of the PCM audio stream is to be altered, and altering the block of the PCM audio stream whose hash matches the hash value. In this way a modification made by the transmitter as described above, can be undone to give end-to-end lossless operation.
According to a third aspect of the present invention there is provided a transmitter adapted to specify a point in a PCM audio stream losslessly transmitted over a communication channel by performing the method of the first aspect.
According to a fourth aspect of the present invention there is provided a receiver adapted to identify a point in a PCM audio stream losslessly received over a communication channel by performing the method of the second aspect.
According to a fifth aspect of the present invention there is provided a codec comprising a transmitter according to the third aspect in combination with a receiver according to the fourth aspect.
According to a sixth aspect of the present invention there is provided a computer program product comprising instructions that when executed by a processor cause said processor to perform the method of the first or second aspects.
As will be appreciated by those skilled in the art, the present invention provides techniques and devices for identifying and communicating the location of a specific point in a PCM audio stream between a transmitter and a receiver with sufficient accuracy and in a computationally efficient manner. This allows a command to be related to the point so that it may subsequently be applied at the point in the PCM audio stream. Further variations and embellishments will become apparent to the skilled person in light of this disclosure.
Brief Description of the Drawings
Examples of the present invention will be described in detail with reference to the accompanying drawings, in which:
Figure 1 shows a transmitter and receiver where a PCM audio stream is communicated over a communications link but commands are communicated over an asynchronous communications link with unknown and possibly variable delay relative to the audio path; and
Figure 2 shows a transmitter and receiver according to the invention where a PCM audio stream is communicated over a communications link as is a message containing a hash of a block of the audio stream.
Detailed Description
Except in certain circumstances (particular test signals, artificial complete silence and clipping), audio signals carry some degree of noise. Even though audio waveforms are often repetitive, it is thus unlikely that two nearby blocks of audio exactly match one another, and increasingly unlikely as the block size increases.
Thus, a particular point in the audio can (with high probability) be uniquely defined by a sufficiently large block of audio samples - for example the n preceding samples, where n > 8 to ensure that even quiet audio contains sufficient randomness to make repetition of identical blocks unlikely. However, it is excessively verbose to include a verbatim copy of a block of audio in a message to specify exactly where volume is to change.
Figure 2 shows a transmitter 1 and receiver 2 where a PCM audio stream 3 is communicated over a communications link 4. It is desired to identify a specific point 5 in the audio stream 3 that the transmitter 1 wishes to communicate to the receiver 2 by means of a message 6 sent across an asynchronous communications link 7.
The specific point 5 might be the point where a configuration change is specified to occur or, preferably, the message might specify the value of a sample counter at that point. Having established a common sample counter synchronised to the audio stream on both sides of the link, any subsequent messages requesting configuration changes need only specify at what value of the sample counter they should take effect.
According to the invention, this is done by computing a hash 10 of a block of audio with a known relationship to the specific point 5, and including the hash value in the message 6. A hash function maps a block of data (containing audio sample values) onto a smaller domain - perhaps a 32 bit integer. No cryptographic properties are required of the hash function, but a good choice of hash function for this purpose will satisfy certain properties, as discussed below. In the receiver, the specific point is found by looking for the block of audio that hashes to the same value 21. This is efficiently done by repeatedly updating 20 a hash value 22 using a new value of the audio and a delayed value 23 of the audio.
The hash function’s first desirable property is updateability. In order to find the block of audio which hashes to the value received in the message, the receiver needs to examine all plausible blocks of audio and compute their hash. It is onerous for the receiver to compute each hash from scratch, and far more computationally efficient if it can compute each hash by updating the preceding hash to take into account the change as the block moves by one sample, bringing in a new sample and dropping an old sample.
A simple hash function that illustrates this property is to represent each PCM audio sample value at time t in a 32 bit word as $x_t$ and then to sum them over the block. If we denote the hash function at time t as $H_t$, then:
$$H_t = \sum_{i=0}^{n-1} x_{t-i} = \sum_{i=0}^{n-1} x_{t-1-i} + x_t - x_{t-n} = H_{t-1} + x_t - x_{t-n}$$
So the hash of each block can be updated from the hash of the last one by adding the new value and subtracting the old one that no longer contributes to the result.
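A minimal sketch of this update follows, written in Python purely for illustration. The names `sum_hash` and `sliding_sum_hashes`, the block length of 9 and the reduction modulo 2^32 are assumptions made for the example and are not taken from the text.

```python
MASK32 = 0xFFFFFFFF  # assumption: the hash is held in a 32 bit word (arithmetic modulo 2**32)

def sum_hash(block):
    """Hash a block from scratch by summing its samples modulo 2**32."""
    return sum(block) & MASK32

def sliding_sum_hashes(samples, n):
    """Yield (t, H_t) for every block of n samples ending at index t."""
    h = sum_hash(samples[:n])
    yield n - 1, h
    for t in range(n, len(samples)):
        # add the new value and subtract the one that dropped out of the block
        h = (h + samples[t] - samples[t - n]) & MASK32
        yield t, h

samples = [3, -1, 4, 1, -5, 9, 2, -6, 5, 3, 5, -8, 9, 7]
n = 9
hashes = dict(sliding_sum_hashes(samples, n))
```

Each step costs one addition and one subtraction whatever the value of n, whereas recomputing `sum_hash` for every block would cost n additions per sample.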
More generally, the updateability property can be expressed as requiring that there exists a function $f()$ such that:
$$H_t = f(H_{t-1}, x_{t-n}, x_t)$$
If the hash function admits of such a formulation then, no matter how big the block is, the receiver can compute the next hash value from just 3 values: the previous hash value, the new audio value that comes into the block and the old audio value that drops out of the block.
Referring back to Figure 2, we see how the receiver uses the updatability property to efficiently identify the block of audio with the correct hash value. As every audio sample comes in, it maintains the hash value 22 corresponding to the block ending at that point. No matter what the length of the block is, the hash is updated 20 using just 3 values: the incoming audio sample, a delayed 23 audio sample and the previous hash value. The updated hash value can then be compared to the hash value in the received message to find a match and thus identify the specific point 5 in the audio received by the receiver 2.
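The receiver loop just described might be sketched as below. This is only an illustration under stated assumptions: the function names `find_point` and `update_sum`, the use of `collections.deque` as the delay element 23, and the choice of the simple sum hash as the update function are all hypothetical.

```python
from collections import deque

MASK32 = 0xFFFFFFFF

def update_sum(prev_hash, old_sample, new_sample):
    """f(H_{t-1}, x_{t-n}, x_t) for the simple sum hash."""
    return (prev_hash + new_sample - old_sample) & MASK32

def find_point(stream, n, target_hash, update=update_sum):
    """Return the index of the first sample at which the block of the n most
    recent samples hashes to target_hash, or None if no block matches."""
    delay = deque([0] * n, maxlen=n)  # delay line holding the last n samples
    h = 0                             # assumes an all-zero block hashes to 0
    for t, x in enumerate(stream):
        old = delay[0]                # sample about to drop out of the block
        delay.append(x)               # maxlen discards the oldest entry
        h = update(h, old, x)         # uses only three values per sample
        if t >= n - 1 and h == target_hash:
            return t
    return None

stream = [7, -2, 5, 0, 3, -9, 4, 1, 8, -6, 2, 11, -3, 6]
n = 9
point = 11                            # position chosen by the transmitter
target = sum(stream[point - n + 1 : point + 1]) & MASK32
found = find_point(stream, n, target)
```

The per-sample work is constant. The only state that grows with n is the delay line itself.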
However, although it is updateable, summing over the block is not a good choice of hash function because if the audio is quiet, exercising few values, then there will not be much variability in the sum over a block either. Consequently, hash collisions (where two blocks of audio hash to the same value despite not being identical) are going to be quite common.
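To put a number on this, the following sketch enumerates every block of 9 samples of very quiet audio and counts how many distinct sums occur. The three-level signal and the block length are assumptions chosen only to keep the enumeration small.

```python
from itertools import product

n = 9
values = (-1, 0, 1)   # assumption: quiet audio exercising only three values
blocks = list(product(values, repeat=n))
distinct_sums = {sum(block) for block in blocks}

num_blocks = len(blocks)         # 3**9 different blocks
num_hashes = len(distinct_sums)  # sums can only range from -9 to +9
```

Here 19683 different blocks share just 19 hash values, so most pairs of distinct blocks that the receiver might confuse are not separated by the sum at all.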
We also need the hash function to do a good job of mixing up the bits from the audio samples in the block so that two similar blocks that differ in low level noise have minimal chance of a hash collision, ideally not much more than $2^{-w}$, where w is the word width of the hash value in bits.
We shall call this criterion “randomising”, and define a hash function to be randomising if given any block X of audio values, bitplane $2^b$ in the PCM audio sample values and hash value H then there exists a block X’ of PCM audio values which hashes to H where every audio value in X’ is equal to the corresponding value in X plus 0 or $+2^b$.
The rationale behind this definition is that with a randomising hash, we take the bitplane $2^b$ to match the least significant active bit in the audio, which will change with the smallest noise. The randomising property tells us that small amounts of noise in this lsb will exercise all possible hash values.
Although this is not sufficient to prove that the probability of a hash collision is $2^{-w}$, it should suffice to give reasonable confidence that it does not greatly exceed this ideal value.
A simple example of a randomising hash is:
$$H_t = \sum_{i=0}^{n-1} \mathrm{ROTATE}(x_{t-i}, i)$$
where ROTATE is a w-bit bit rotation (either left or right) and w divides n. This can be seen to be randomising, by noting that in the first w terms of the summation there is one that maps the bit $2^b$ to every bit in the hash word. If we now consider the binary representation of $(H - h(X))$ modulo $2^w$, for every set bit we can add $2^b$ to that audio value whilst leaving the others unchanged. This generates the required hash value H, proving that the hash is randomising.
This hash is also updateable, since:
$$H_t = \mathrm{ROTATE}(H_{t-1}, 1) + x_t - x_{t-n}$$
Another example is to replace the summation by an XOR. This is also randomising (using a similar proof) and updateable with
$$H_t = \mathrm{ROTATE}(H_{t-1}, 1) \;\mathrm{XOR}\; x_t \;\mathrm{XOR}\; x_{t-n}$$
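The XOR form can be checked numerically with a short sketch such as the one below. The XOR variant is used because bit rotation is exactly linear over XOR. The word width w = 32, block length n = 32 (so that w divides n), left rotation, the function names and the pseudorandom test samples are all assumptions made for the example.

```python
import random

W = 32                      # hash word width in bits
MASK = (1 << W) - 1
N = 32                      # block length; W must divide N for the update to hold

def rotl(v, r):
    """Rotate the W-bit word v left by r bits."""
    r %= W
    return ((v << r) | (v >> (W - r))) & MASK

def xor_rotate_hash(block):
    """From-scratch hash: XOR over i of ROTATE(x_{t-i}, i), newest sample at i = 0."""
    h = 0
    for i, x in enumerate(reversed(block)):
        h ^= rotl(x & MASK, i)
    return h

def xor_rotate_update(prev_hash, old_sample, new_sample):
    """H_t = ROTATE(H_{t-1}, 1) XOR x_t XOR x_{t-n}."""
    return rotl(prev_hash, 1) ^ (new_sample & MASK) ^ (old_sample & MASK)

rng = random.Random(0)      # stand-in for noisy 16 bit audio
samples = [rng.randrange(-2**15, 2**15) for _ in range(100)]
mismatches = sum(
    xor_rotate_update(xor_rotate_hash(samples[t - N:t]), samples[t - N], samples[t])
    != xor_rotate_hash(samples[t - N + 1:t + 1])
    for t in range(N, len(samples))
)
```

The update agrees with the from-scratch hash at every position because the outgoing sample has been rotated by exactly n places, which is a whole number of turns when w divides n.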
This method of receiver operation is illustrated in Figure 2, where the hash value is shown to be updated by a function taking the previous value of the hash function, the input audio and a delayed version of the input audio corresponding to the data dropping out of the hash block. Each updated value is then compared with the hash value received in a message to identify the point in time where a match occurs.
Whilst the above hash function satisfies our test for randomisability, one might be concerned that the hash update function is overly simple. Perhaps a more complex update function would give better scrambling of the audio bits, and so make better use of all of the bits in the hash word for avoiding collisions.
Some processors include built in instructions for computing cyclic redundancy checks (CRCs) and we can attempt to build an updateable hash function out of a CRC.
A d bit CRC is defined by a generator polynomial $P(y) = y^d + a_{d-1}y^{d-1} + \cdots + a_1 y + 1$ of degree d over the binary field GF(2) (we use y as the indeterminate to avoid confusion with our audio sample values $x_t$). Such a polynomial P(y) is said to have an order m, defined as the smallest positive integer m such that $y^m \equiv 1$ modulo P(y).
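The order of a small generator polynomial can be found by brute force, as in the sketch below. Polynomials over GF(2) are packed into Python integers with bit k holding the coefficient of y^k. The helper names `poly_mod` and `poly_order` and the toy polynomials are assumptions for illustration, and the loop is only practical when the order is modest.

```python
def poly_mod(a, p):
    """Remainder of a modulo p for GF(2) polynomials packed into ints
    (bit k is the coefficient of y**k)."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)   # cancel the leading term of a
    return a

def poly_order(p):
    """Smallest m > 0 with y**m = 1 modulo p (p needs a constant term of 1)."""
    assert p & 1, "constant term must be 1, otherwise y has no order"
    v, m = poly_mod(0b10, p), 1           # v holds y**m modulo p
    while v != 1:
        v = poly_mod(v << 1, p)           # multiply by y
        m += 1
    return m

order_toy = poly_order(0b10011)           # y**4 + y + 1
order_small = poly_order(0b111)           # y**2 + y + 1
```

For the toy generator y^4 + y + 1 this gives an order of 15, and for y^2 + y + 1 an order of 3.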
Supposing each audio sample to be represented as a 32 bit word, we can define the hash function in a similar way to a CRC of the data block:
$$H_t = \left( \sum_{i=0}^{n-1} x_{t-i}\, y^{32i} \right) \bmod P(y)$$
where all arithmetic in the above expression is in the ring of polynomials over GF(2).
Now if we express $H_t$ in terms of $H_{t-1}$ we get:
$$H_t = \left( H_{t-1}\, y^{32} + x_t - x_{t-n}\, y^{32n} \right) \bmod P(y)$$
If the order m of P(y) divides 32n, then $y^{32n} \equiv 1$ and this simplifies to:
$$H_t = \left( H_{t-1}\, y^{32} + (x_t - x_{t-n}) \right) \bmod P(y)$$
which would be extremely simple to implement if the processor provides a built in operation for cycling a CRC by 32 bits (the addition and subtraction in the ring are both implemented by computer XOR instructions).
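The simplified update can be exercised in software with a sketch like the following. It is a toy under stated assumptions: the generator y^4 + y + 1 (order 15) stands in for a real 32 bit CRC polynomial, the block length n = 15 is chosen so that 32n = 480 is a multiple of that order, and the names `crc_hash` and `crc_update` are hypothetical. A plain bitwise remainder replaces the processor's CRC instruction.

```python
import random

P = 0b10011        # assumption: toy generator y**4 + y + 1, which has order 15
BITS = 32          # bits per audio sample word
N = 15             # block length in samples; 32 * 15 = 480 is a multiple of 15
M32 = 0xFFFFFFFF

def poly_mod(a, p):
    """Remainder of a modulo p for GF(2) polynomials packed into ints."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)
    return a

def crc_hash(block):
    """(sum over i of x_{t-i} * y**(32 i)) modulo P, newest sample at i = 0."""
    packed = 0
    for i, x in enumerate(reversed(block)):
        packed ^= (x & M32) << (BITS * i)
    return poly_mod(packed, P)

def crc_update(prev_hash, old_sample, new_sample):
    """H_t = (H_{t-1} * y**32 + (x_t - x_{t-n})) modulo P, relying on y**(32 n) = 1."""
    diff = (new_sample ^ old_sample) & M32   # + and - over GF(2) are both XOR
    return poly_mod((prev_hash << BITS) ^ diff, P)

rng = random.Random(0)
samples = [rng.randrange(-2**15, 2**15) for _ in range(60)]
mismatches = sum(
    crc_update(crc_hash(samples[t - N:t]), samples[t - N], samples[t])
    != crc_hash(samples[t - N + 1:t + 1])
    for t in range(N, len(samples))
)
```

A 4 bit hash is far too short to be useful in practice. The point of the sketch is only that the update reproduces the from-scratch value when the order of P(y) divides 32n.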
There are degree 32 polynomials whose order is divisible by 32. An example is:
$$P(y) = y^{32} + y^{31} + y^{28} + y^{26} + y^{25} + y^{24} + y^{22} + y^{21} + y^{20} + y^{18} + y^{12} + y^{10} + y^{9} + y^{8} + y^{6} + y^{5} + y^{4} + y^{2} + 1$$
which has order 1056 = 32 × 33.
However, it is very highly composite with:
$$P(y) = (y + 1)^{18}(y^2 + y + 1)^2(y^{10} + y^7 + y^5 + y^3 + 1)$$
which suggests that we may not be getting as much shuffling benefit as we hoped for in moving to using a CRC, and so we may prefer a polynomial whose order is not divisible by 32.
An example of such a polynomial is:
$$P(y) = y^{32} + y^{30} + y^{28} + y^{27} + y^{26} + y^{22} + y^{21} + y^{18} + y^{16} + y^{14} + y^{13} + y^{12} + y^{11} + y^{9} + y^{8} + y^{6} + y^{5} + y + 1$$
which has order 1127 = 35 × 32 + 7.
Using this polynomial, we would want the data over which the hash is calculated to contain 1127 bits (or a multiple of 1127 bits) to make the hash easy to update, but this is not a whole number of 32 bit samples. This has the effect of complicating the update formula, which needs to add in the new 32 bit word $x_t$ but take away 32 bits of data, 7 bits of which come from $x_{t-36}$ and 25 from $x_{t-35}$. Alternatively, it could take away 32 bits of data from $x_{t-36}$ and add in 7 bits from $x_t$ combined with 25 bits from $x_{t-1}$.
Neither of these modes of operation satisfies the desirable updateability test, but they do satisfy a slightly weakened updatability test:
$$H_t = f(H_{t-1}, x_{t-n}, x_{t-(n-1)}, x_{t-1}, x_t)$$
Here the hash is updated by computing some function of 5 values: the previous hash value, the newest 2 sample values coming into the hash block and the oldest two sample values falling out of the hash block. The updateability property has been weakened to accommodate the hash block length in bits not being an integer multiple of the size of each sample value.
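The first of the two modes described above (add a whole new word, remove 32 bits that straddle the two oldest words) might be sketched as follows. Everything specific here is an assumption for illustration: the toy generator y^4 + y + 1 of order 15, a hashed window of 45 bits (a multiple of 15 that equals 32 + 13, so n = 2), the convention that the low-order bits of the oldest word are the ones inside the window, and the function names.

```python
import random

P = 0b10011              # assumption: toy generator y**4 + y + 1, order 15
L = 45                   # window length in bits: a multiple of 15, not of 32
BITS = 32
R = L % BITS             # 13 bits of the oldest word remain inside the window
M32 = 0xFFFFFFFF

def poly_mod(a, p):
    """Remainder of a modulo p for GF(2) polynomials packed into ints."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)
    return a

def window_hash(samples, t):
    """Hash of the most recent L bits of the stream ending with sample t."""
    packed = 0
    for i in range(L // BITS + 1):
        packed |= (samples[t - i] & M32) << (BITS * i)
    return poly_mod(packed & ((1 << L) - 1), P)

def window_update(prev_hash, x_oldest, x_second_oldest, x_new):
    """Update from the previous hash, the two oldest words and the new word.
    The 32 departing bits are the top 32 - R bits of x_{t-(n-1)} followed by
    the low R bits of x_{t-n}; they leave at y**L, which is 1 modulo P."""
    dropped = ((x_second_oldest & M32) >> R) | ((x_oldest & ((1 << R) - 1)) << (BITS - R))
    return poly_mod((prev_hash << BITS) ^ (x_new & M32) ^ dropped, P)

rng = random.Random(0)
samples = [rng.randrange(-2**15, 2**15) for _ in range(50)]
mismatches = sum(
    window_update(window_hash(samples, t - 1), samples[t - 2], samples[t - 1], samples[t])
    != window_hash(samples, t)
    for t in range(2, len(samples))
)
```

With the 1127 bit window of the text the same structure would apply with R = 7, the departing bits coming from $x_{t-36}$ and $x_{t-35}$.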
We have discussed how a block of audio may be uniquely identified on both sides of a communication link by means of an updateable hash. The purpose of doing so is so that messages between the two ends of the communication link can identify exactly where in the audio some event relating to the audio (such as a volume change) is to occur.
This can be done by sending a message conveying that an event is to occur when the audio hashes to a particular value, but more generally useful is to split the process into two stages. The first is to establish a common sample counter using a message conveying that when the audio hashes to a particular value, then the sample counter should have a specified value. The second is to say that an event is to occur at a particular value of the sample counter.
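The two stages could be modelled as in the sketch below. The message layouts `SyncMessage` and `EventMessage`, the `Receiver` class and the example hash values are hypothetical; the text does not define a message format. The receiver is assumed to be given the updated block hash once per incoming sample.

```python
from dataclasses import dataclass

@dataclass
class SyncMessage:            # stage 1 (hypothetical layout)
    block_hash: int           # hash of the block identifying the point
    counter_value: int        # value the sample counter should take at that point

@dataclass
class EventMessage:           # stage 2 (hypothetical layout)
    counter_value: int        # sample counter value at which the event occurs
    command: str

class Receiver:
    def __init__(self):
        self.counter = None   # unknown until a sync message has been matched
        self.pending_sync = None
        self.pending_events = []
        self.applied = []

    def on_message(self, msg):
        if isinstance(msg, SyncMessage):
            self.pending_sync = msg
        else:
            self.pending_events.append(msg)

    def on_sample(self, block_hash):
        """Call once per sample with the hash of the block ending at that sample."""
        if self.counter is not None:
            self.counter += 1
        sync = self.pending_sync
        if sync is not None and block_hash == sync.block_hash:
            self.counter = sync.counter_value     # counters are now aligned
            self.pending_sync = None
        due = [e for e in self.pending_events if e.counter_value == self.counter]
        for event in due:
            self.applied.append((self.counter, event.command))
            self.pending_events.remove(event)

rx = Receiver()
rx.on_message(SyncMessage(block_hash=0xBEEF, counter_value=1000))
rx.on_message(EventMessage(counter_value=1002, command="set volume"))
for h in [0x1111, 0x2222, 0xBEEF, 0x3333, 0x4444]:
    rx.on_sample(h)
```

Once the counter is aligned, later event messages need carry no hash at all. The sketch ignores events whose counter value has already passed.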
On various test signals (e.g. an undithered sine wave whose frequency divides the audio sampling rate), on completely silent sections where the audio is precisely zero, or where heavy clipping occurs and the audio spends a significant amount of time at the maximum (or minimum) representable values, hashing a block of audio fails to uniquely identify a position in the stream. This is because the audio is exactly repetitive and so nearby blocks of audio have the same hash value because they are identical. This issue can be addressed by various techniques.
Firstly, in many situations it may not matter. When the audio is completely silent, it may be unimportant exactly when a change occurs. And operation on an artificial test signal is an artificial situation which may not be important in real world operation.
However, constant audio can easily be identified in the transmitter, which can decline to signal positions where the block is mostly zero or clipped and so does not have adequate randomness. Instead the transmitter can wait till a block occurs which is neither silent nor clipped.
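A transmitter-side check of this kind might look like the sketch below. The 24 bit sample width, the 50% threshold for "mostly" and the function names are assumptions made for illustration only.

```python
def block_is_usable(block, bits=24, max_constant_fraction=0.5):
    """Return False if too many samples are zero or sit at a clipping limit.

    bits and max_constant_fraction are illustrative assumptions.
    """
    hi = (1 << (bits - 1)) - 1   # maximum representable signed value
    lo = -(1 << (bits - 1))      # minimum representable signed value
    constant = sum(1 for x in block if x == 0 or x == hi or x == lo)
    return constant <= max_constant_fraction * len(block)

def first_usable_block_end(samples, n, start=0):
    """Index of the last sample of the first usable n-sample block at or
    after start, or None if the transmitter must keep waiting."""
    for end in range(max(start, n - 1), len(samples)):
        if block_is_usable(samples[end - n + 1 : end + 1]):
            return end
    return None
```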
Exactly repetitive signals are harder to spot in the transmitter, but the transmitter could keep a record of recent hash values to ensure that a hash it proposes to transmit does not collide with another nearby section of prior audio. After transmitting a hash value it could continue to update the hash and look for a subsequent collision. If it finds one, then it could take remedial action. In the case where a sample clock is synchronised, a suitable remedial action would be to send another synchronisation message to correct any mistaken synchronisation the receiver established.
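One way to keep such a record is sketched below. The class name, the history length and the helper function are hypothetical; the hash values can come from any block hash.

```python
from collections import deque

class RecentHashes:
    """Bounded record of the most recent block hash values."""

    def __init__(self, history=4096):
        self._history = history
        self._order = deque()   # hashes in arrival order
        self._counts = {}       # hash value -> occurrences within the window

    def seen(self, h):
        """True if h already occurs in the recent history."""
        return h in self._counts

    def add(self, h):
        """Record h and forget the oldest entry once the window is full."""
        self._order.append(h)
        self._counts[h] = self._counts.get(h, 0) + 1
        if len(self._order) > self._history:
            old = self._order.popleft()
            self._counts[old] -= 1
            if self._counts[old] == 0:
                del self._counts[old]

def first_later_collision(sent_hash, later_hashes):
    """Position of the first subsequent hash equal to sent_hash, or None."""
    for i, h in enumerate(later_hashes):
        if h == sent_hash:
            return i
    return None
```

Before proposing a hash, the transmitter would call `seen` and skip the block on a hit; after sending, a non-`None` result from `first_later_collision` would trigger the remedial message.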
Another possibility is for the transmitter to make a small perturbation to the audio over which the hash is computed. A good choice would be to XOR the least significant bits of some or all of the block of n samples with a pseudorandom sequence. This reduces the probability of a hash collision to an insignificant level at the expense of introducing a tiny amount of audio noise for a very short duration. On the face of it, this strategy would not appear suitable to situations where lossless operation is required. But the transmitted message could also contain a field directing the receiver to alter the audio in such a way as to invert the transmitter’s alterations, thus restoring overall lossless operation. In the above example, a suitable field might contain a seed for the pseudorandom bit generator, so that the receiver can regenerate the pseudorandom sequence and XOR the lsbs back to their original values.
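Because XOR is its own inverse, the same routine can serve both ends of the link. The sketch below assumes that transmitter and receiver share an identical pseudorandom generator; Python's `random.Random` is used purely as a stand-in, and the function name is hypothetical.

```python
import random

def perturb_lsbs(block, seed):
    """XOR the least significant bit of each sample with a pseudorandom bit.

    Calling this again with the same seed restores the original block.
    """
    rng = random.Random(seed)  # stand-in for an agreed bit generator
    return [x ^ rng.getrandbits(1) for x in block]
```

The transmitter would hash and send `perturb_lsbs(block, seed)` and carry `seed` in the message; the receiver applies the same call to the matching block to recover the original samples.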
It will be appreciated that the use of 32 bits for the size of the sample value and the size of the hash value is purely illustrative and other choices of sizes may be chosen. It will also be appreciated that there are many ways a 32 bit value representing a sample can be constructed from a multichannel sample of perhaps 24 bits per sample. One channel could be selected and zero/sign extended to 32 bits or multiple channels could be added or XORed, perhaps with bit shifts/rotations. Different choices will have different properties but the choice does not affect the nature of the invention.
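As one illustration of such a construction, the sketch below sign-extends each 24 bit channel sample to 32 bits and XORs the channels together after a per-channel rotation. The rotation step of 8 bits and the function names are arbitrary assumptions, not choices made by this document.

```python
MASK32 = 0xFFFFFFFF

def rotl32(value, r):
    """Rotate a 32-bit word left by r bits."""
    r %= 32
    return ((value << r) | (value >> (32 - r))) & MASK32

def sign_extend_24_to_32(sample):
    """Two's-complement 32-bit word for a signed 24-bit sample value."""
    return sample & MASK32

def combine_channels(frame, rotate_step=8):
    """Form one 32-bit value from a multichannel sample (one int per channel)."""
    word = 0
    for channel, sample in enumerate(frame):
        word ^= rotl32(sign_extend_24_to_32(sample), channel * rotate_step)
    return word
```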
Audio samples may also be grouped together, so that $x_t$ is a value representing a group of samples, and so the sample rate of $x_t$ is a submultiple of that of the underlying audio. This is natural if the receiver has some other means of establishing group boundaries.

Claims (24)

1. A method of specifying a point in a PCM audio stream losslessly transmitted over a communication channel, the method comprising the steps of:
choosing the point in the PCM audio stream; and, transmitting a message containing a hash of a block of n > 8 samples of the PCM audio stream having a predetermined relationship to the point.
2. A method according to claim 1, the method comprising the further step of: including a command in the message to take effect at the point.
3. A method according to claim 1 or claim 2, the method comprising the further step of maintaining a sample counter; and, including a value of the sample counter at the point in the message.
4. A method according to any of claims 1 to 3, wherein the message is transmitted over an asynchronous communication channel to the PCM audio stream.
5. A method according to any of claims 1 to 4, wherein the hash has the following property:
for any hash value H, block of audio sample values X, and value $2^b$ corresponding to a bitplane in the PCM audio, there exists a block of audio sample values X’ such that the hash of X’ computes as H and each audio sample value in X’ is equal to the corresponding value in X plus one of 0, $+2^b$ and $-2^b$.
6. A method according to any of claims 1 to 5, wherein the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the following property:
$h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$, $x_1$, $x_{n-1}$ and $x_n$.
7. A method according to claim 6, wherein the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the following property:
$h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$ and $x_n$.
8. A method according to claim 6 or claim 7, wherein the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the following property:
$h(x_1, x_2, \ldots, x_{n-1}, 0)$ is a bit rotation of $h(0, x_1, \ldots, x_{n-1})$.
9. A method according to claim 6, wherein the hash comprises a cyclic redundancy check ‘CRC’ and the order of the polynomial defining the CRC divides the size of the data block in bits.
10. A method according to any of claims 1 to 9, the method comprising the further step of:
checking if the block of n samples of the PCM audio stream matches another block of n samples.
11. A method according to claim 10, the method comprising the further step of: in the event of a match, choosing a different point.
12. A method according to any of claims 1 to 11, the method comprising the further step of:
altering the block of the n samples of the PCM audio stream before transmitting the message.
13. A method according to claim 12, the method comprising the further step of: including in the message information directing how the receiver is to alter the block of n samples of the PCM audio stream.
14. A method of identifying a point in a PCM audio stream losslessly received over a communication channel, the method comprising the steps of:
receiving a message containing a hash value; and, computing a hash of successive overlapping blocks of the PCM audio stream until a block is found whose hash matches the received hash value.
15. A method according to claim 14, the method comprising the further steps of:
decoding a command from the received message; and, implementing the command at a point in the PCM audio stream having a predetermined relationship to the block whose hash matches the received hash value.
16. A method according to claim 14 or claim 15, the method comprising the further steps of:
decoding a value for a sample counter from the received message; and, setting a sample counter to the decoded value at a point in the PCM audio stream having a predetermined relationship to the block whose hash matches the received hash value.
17. A method according to any of claims 14 to 16, wherein the message is received over an asynchronous communication channel to the PCM audio stream.
18. A method according to any of claims 14 to 17, wherein the hash has the following property:
for any hash value H, block of audio sample values X, and value $2^b$ corresponding to a bitplane in the PCM audio, there exists a block of audio sample values X’ such that the hash of X’ computes as H and each audio sample value in X’ is equal to the corresponding value in X plus one of 0, $+2^b$ and $-2^b$.
19. A method according to any of claims 14 to 17, wherein the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the following property:
$h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$, $x_1$, $x_{n-1}$ and $x_n$.
20. A method according to claim 19, wherein the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the following property:
$h(x_1, x_2, \ldots, x_n)$ is a function of $h(x_0, x_1, \ldots, x_{n-1})$, $x_0$ and $x_n$.
21. A method according to claim 19 or claim 20, wherein the hash $h(x_0, x_1, \ldots, x_{n-1})$ has the following property:
$h(x_1, x_2, \ldots, x_{n-1}, 0)$ is a bit rotation of $h(0, x_1, \ldots, x_{n-1})$.
22. A method according to claim 19, wherein the hash comprises a cyclic redundancy check ‘CRC’ and the order of the polynomial defining the CRC divides the size of the data block in bits.
23. A method according to any one of claims 14 to 22, the method comprising the further step of:
computing the hash of a successive block of the PCM audio stream using the hash value computed for the preceding overlapping block of the PCM audio stream.
Intellectual Property Office, Application No: GB1612560.1, Examiner: Adam Tucker
24. A method according to any one of claims 14 to 23, the method comprising the further steps of:
retrieving information from the message directing how the block of the PCM audio stream is to be altered; and, altering the block of the PCM audio stream whose hash matches the hash value.
25. A transmitter adapted to specify a point in a PCM audio stream losslessly transmitted over a communication channel by performing the method of any one of claims 1 to 13.
26. A receiver adapted to identify a point in a PCM audio stream losslessly received over a communication channel by performing the method of any of claims 14 to 24.
27. A codec comprising a transmitter according to claim 25 in combination with a receiver according to claim 26.
28. A computer program product comprising instructions that, when executed by a processor, cause said processor to perform the method of any one of claims 1 to
GB1612560.1A 2016-07-20 2016-07-20 Sample synchronisation Active GB2552349B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
GB1612560.1A GB2552349B (en) 2016-07-20 2016-07-20 Sample synchronisation
PCT/GB2017/052129 WO2018015752A1 (en) 2016-07-20 2017-07-19 Sample synchronisation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1612560.1A GB2552349B (en) 2016-07-20 2016-07-20 Sample synchronisation

Publications (3)

Publication Number Publication Date
GB201612560D0 GB201612560D0 (en) 2016-08-31
GB2552349A true GB2552349A (en) 2018-01-24
GB2552349B GB2552349B (en) 2019-05-22

Family

ID=56890443

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1612560.1A Active GB2552349B (en) 2016-07-20 2016-07-20 Sample synchronisation

Country Status (2)

Country Link
GB (1) GB2552349B (en)
WO (1) WO2018015752A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111261194A (en) * 2020-04-29 2020-06-09 浙江百应科技有限公司 Volume analysis method based on PCM technology

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000349701A (en) * 1999-06-04 2000-12-15 Nec Corp Device and method corresponding to radio reception data deviation
GB2356758A (en) * 1999-09-30 2001-05-30 Ibm User controlled selection of audio and video data streams
US20050226601A1 (en) * 2004-04-08 2005-10-13 Alon Cohen Device, system and method for synchronizing an effect to a media presentation
US20070157070A1 (en) * 2006-01-04 2007-07-05 Stephan Wenger Method for checking of video encoder and decoder state integrity
EP2434756A1 (en) * 2006-06-22 2012-03-28 TiVo, Inc. Insertion of tags in a multimedia content stream to a location defined by a sequence of hash values of the content
EP2880809A1 (en) * 2012-07-29 2015-06-10 Qualcomm Technologies, Inc. Frame sync across multiple channels

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6802019B1 (en) * 2000-06-15 2004-10-05 Genesys Conferencing, Ltd. Method and system for synchronizing data
US8233648B2 (en) * 2008-08-06 2012-07-31 Samsung Electronics Co., Ltd. Ad-hoc adaptive wireless mobile sound system
WO2014125285A1 (en) * 2013-02-13 2014-08-21 Meridian Audio Limited Versatile music distribution

Also Published As

Publication number Publication date
WO2018015752A1 (en) 2018-01-25
GB2552349B (en) 2019-05-22
GB201612560D0 (en) 2016-08-31
