GB2370955A - Encoding and decoding of data in audio signals - Google Patents

Info

Publication number
GB2370955A
Authority
GB
United Kingdom
Prior art keywords
audio signal
data
predetermined
amplitude
waveform
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB0100543A
Other versions
GB2370955B (en)
GB0100543D0 (en)
Inventor
William Ferguson Moultrie
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INDEPENDENT MEDIA DISTRIB PLC
Original Assignee
INDEPENDENT MEDIA DISTRIB PLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by INDEPENDENT MEDIA DISTRIB PLC
Priority to GB0100543A
Publication of GB0100543D0
Publication of GB2370955A
Application granted
Publication of GB2370955B
Anticipated expiration
Legal status: Expired - Fee Related

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H20/00 Arrangements for broadcast or for distribution combined with broadcast
    • H04H20/28 Arrangements for simultaneous broadcast of plural pieces of information
    • H04H20/30 Arrangements for simultaneous broadcast of plural pieces of information by a single channel
    • H04H20/31 Arrangements for simultaneous broadcast of plural pieces of information by a single channel using in-band signals, e.g. subsonic or cue signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H20/00 Arrangements for broadcast or for distribution combined with broadcast
    • H04H20/12 Arrangements for observation, testing or troubleshooting
    • H04H20/14 Arrangements for observation, testing or troubleshooting for monitoring programmes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04H BROADCAST COMMUNICATION
    • H04H20/00 Arrangements for broadcast or for distribution combined with broadcast
    • H04H20/28 Arrangements for simultaneous broadcast of plural pieces of information
    • H04H20/30 Arrangements for simultaneous broadcast of plural pieces of information by a single channel

Abstract

A method of encoding data into an audio signal so that the data is indistinguishable to a listener comprises the steps of: producing respective measures of at least frequency and amplitude of waveform elements A-D in the audio signal; and checking whether the produced measures for such a waveform element satisfy a set of predetermined criteria which take account of psychoacoustic masking. The waveform elements are, for example, half-cycle portions. When a waveform element satisfying the set of predetermined criteria is found, an item of data is encoded into the audio signal by modifying a preselected portion of that waveform element so that, in that portion, its amplitude has a predetermined constant value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes. A corresponding decoding method is also described. The encoding and decoding methods may be used in a broadcast monitoring method in which an audio signal to be broadcast by a radio station is encoded to permit verification of its transmission. The form of the encoding/decoding ensures that the data can survive compression.

Description

ENCODING AND DECODING OF DATA IN AUDIO SIGNALS
The present invention relates to encoding and decoding of data in audio signals. In particular, the present invention relates to encoding of data in audio signals such that the audio signal after encoding is indistinguishable to a human listener from the original (non-encoded) audio signal.
There are many instances in which it is desirable to be able to encode data inaudibly into an audio signal. For example, when a radio station broadcasts a particular song it may be required to make a royalty payment to the owner of the copyright in the song. To enable such payments to be determined accurately, copyright owners (e.g. record companies) wish to collect information regarding radio station broadcasts so as to identify the number of times each particular song subject to such payments has been broadcast.
Although such information could be provided by the radio stations themselves, it is desired by copyright owners to be able to verify the information independently. It is therefore desirable to encode predetermined data in a song distributed to a radio station for broadcast thereby so that, in a suitably-equipped receiver capable of decoding the data, any broadcast of the song concerned can be detected and registered independently of the radio station. In such a case, it is of course essential that the song (including the encoded data) as broadcast by the radio station be indistinguishable to a human listener from the original (non-encoded) song.
Similarly, it is also desirable for advertisers or distributors of radio advertisements to be broadcast by commercial radio stations to be able to collect information regarding broadcasts of the advertisements by individual radio stations. In particular, in this case, it is desirable to be able to collect information on the exact times at which the advertisements are broadcast. The time of broadcast is an important factor in determining the likely audience for the advertisement, so that usually the advertiser or distributor will agree with the radio station a particular "slot" in which the advertisement is to be broadcast. Again, therefore, it is desirable to encode, in an advertisement to be broadcast by a radio station, predetermined data so that, in a suitably-equipped receiver capable of decoding the data, any broadcast by the radio station of the advertisement concerned can be detected and registered. The encoded data should of course be inaudible to a human listener.
It is also becoming prevalent in the radio industry for songs and advertisements to be distributed by electronic means, for example by downloading from a record company server or media distribution company server of digital data representing the song or advertisement. In order to reduce downloading times, various techniques have been developed for compressing the digital data representing the original audio signal whilst leaving the sound quality upon final reproduction virtually unchanged to a human listener.
Most of these techniques take advantage of a phenomenon referred to as psychoacoustic masking.
When an audio signal includes two tones of similar frequencies, the louder tone masks the quieter tone to the human ear. Several compression techniques have been developed that take advantage of this psychoacoustic masking phenomenon. Some techniques involve splitting the entire frequency spectrum of the audio signal into a number of narrow frequency bands (e.g. 32 bands for MPEG2 compression and 576 bands for MPEG3 compression). MPEG stands for the Moving Pictures Expert Group of the International Organisation for Standardisation. The suffix 2 denotes layer 2 compression, and the suffix 3 denotes layer 3 compression. After the audio signal frequency spectrum has been split into the narrow frequency bands, the compression system analyses which of the frequency bands would be masked by other bands. Only those frequency bands whose sounds would not be masked (i.e. would be audible) are then coded.
For example, MPEG2 compression uses sub-band adaptive pulse code modulation (APCM) and splits the original audio signal into 32 equally-spaced frequency bands (sub-bands). The bit allocation for each sub-band is dynamically controlled by information derived from a psychoacoustic modeller. A 1024-point fast Fourier transform (FFT) unit provides data to the psychoacoustic modeller, which has temporal and spectral filters that are designed taking into account the aural sensitivities of the human ear. The filtering removes signals that would not be audible ultimately upon reproduction of the decompressed signal.
Other compression systems in use in the music industry include APT-X 1000, MUSICAM (Masking-pattern-adapted Universal Sub-band Integrated Coding and Multiplexing), and MP3 (MPEG-1 audio layer 3). APT is a hardware-based fixed-algorithm compression system which uses linear prediction, i.e. it removes redundancy by subtracting a predicted signal derived from coder lookup tables.
When digital data representing an audio signal such as a song or advertisement must be subject to compression, for example as part of the distribution process prior to broadcast or as part of the broadcasting process itself, it is desirable that any specially-encoded data included in the digital audio data for broadcast monitoring purposes should survive the compression and hence be available reliably for decoding in the receiver. However, when devising an encoding technique to work successfully with whatever compression technique is used, a particular problem arises in that the encoding technique will itself wish to take advantage of the same basic psychoacoustic masking phenomenon on which the compression technique relies. This means that if the data needed for broadcast monitoring purposes is encoded into the digital audio data in such a way that it would be psychoacoustically masked, the compression technique is likely to regard the data as being psychoacoustically redundant data which can be removed by the compression technique.
According to a first aspect of the present invention there is provided a method of encoding data into an audio signal, comprising the steps of: producing respective measures of at least frequency and amplitude of waveform elements in the said audio signal; checking whether the produced measures for such a waveform element satisfy a set of predetermined criteria; and when a waveform element satisfying the said set of predetermined criteria is found, encoding an item of data into the audio signal by modifying a preselected portion of that waveform element so that, in that portion, its amplitude has a predetermined constant value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes.
According to a second aspect of the present invention there is provided a method of decoding data from an encoded audio signal, comprising the steps of: analysing waveform elements in the encoded audio signal to identify a waveform element therein whose amplitude, in a preselected portion of the waveform element, has a substantially constant value equal or close to a predetermined value, or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes; and when such a waveform element is identified, outputting an item of decoded data.
According to a third aspect of the present invention there is provided encoding apparatus, for encoding data into an audio signal, including: measuring means for producing respective measures of at least frequency and amplitude of waveform elements in the said audio signal; checking means for checking whether the produced measures for such a waveform element satisfy a set of predetermined criteria; and encoding means operable, when a waveform element satisfying the said set of predetermined criteria is found, to encode an item of data into the said audio signal by modifying a preselected portion of that waveform element so that in that portion its amplitude has a predetermined constant value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes.
According to a fourth aspect of the present invention there is provided decoding apparatus for decoding data from an encoded audio signal, including: identifying means for analysing waveform elements in the encoded audio signal to identify a waveform element therein whose amplitude, in a preselected portion of the waveform element, has a substantially constant value equal or close to a predetermined value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes; and decoding means operable, when such a waveform element is identified, to output an item of decoded data.
According to a fifth aspect of the present invention there is provided a method of monitoring broadcasts by a radio station, including the steps of: encoding an audio signal to be broadcast by the radio station with predetermined data using a method embodying the aforesaid first aspect of the present invention; receiving the audio signal as broadcast by the radio station; decoding the received audio signal using a method embodying the aforesaid second aspect of the present invention so as to extract therefrom the said predetermined data; and using the extracted predetermined data to collect information relating to broadcast of the audio signal by the radio station.
According to a sixth aspect of the present invention there is provided a computer program which, when run on a computer, causes the computer to encode data into an audio signal, the program comprising: a measuring code portion for producing respective measures of at least frequency and amplitude of waveform elements in the said audio signal; a checking code portion for checking whether the produced measures for such a waveform element satisfy a set of predetermined criteria; and an encoding code portion which, when a waveform element satisfying the said set of predetermined criteria is found, encodes an item of data into the said audio signal by modifying a preselected portion of that waveform element so that in that portion its amplitude has a predetermined constant value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes.
According to a seventh aspect of the present invention there is provided a computer program which, when run on a computer, causes the computer to decode data from an encoded audio signal, the program comprising: an identifying code portion for analysing waveform elements in the encoded audio signal to identify a waveform element therein whose amplitude, in a preselected portion of the waveform element, has a substantially constant value equal or close to a predetermined value, or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes; and an outputting code portion for outputting an item of decoded data when such a waveform element is identified.
Reference will now be made, by way of example, to the accompanying drawings in which: Fig. 1 shows parts of audio encoding and decoding apparatus embodying the present invention; Fig. 2A is a schematic diagram for use in explaining how a waveform element is selected for encoding in the Fig. 1 apparatus; Fig. 2B shows the waveform element of Fig. 2A after encoding; Figs. 3A and 3B show a flowchart relating to a basic encoding procedure in the Fig. 1 apparatus; Figs. 4A and 4B show a flowchart relating to a basic decoding procedure in the Fig. 1 apparatus; Fig. 5 is a schematic diagram of a waveform element subjected to the Fig. 4 decoding procedure; Fig. 6 shows a flowchart relating to an overall decoding procedure in the Fig. 1 apparatus; Figs. 7A and 7B are schematic diagrams of waveform elements for illustrating the effects of compression of an encoded audio signal; Figs. 8A and 8B are further schematic diagrams of waveform elements for illustrating the effects of compression of an encoded audio signal; Fig. 9 is a schematic diagram for explaining how adjacent waveform elements are analysed in a preferred embodiment of the invention; Fig. 10 shows a flowchart relating to a procedure for analysing the Fig. 9 adjacent waveform elements; Figs. 11A and 11B show example waveform elements for use in explaining another procedure for analysing the Fig. 9 adjacent waveform elements; Fig. 12 shows parts of broadcast monitoring apparatus embodying the present invention; and Figs. 13A to 13D are schematic diagrams for use in explaining the encoding of waveform elements in other embodiments of the present invention.
The audio encoding and decoding apparatus 1 of Fig. 1 includes an analog circuitry portion 2 and a digital circuitry portion 3. The analog circuitry portion 2 and digital circuitry portion 3 are preferably arranged on physically separate circuit boards to reduce crosstalk from the digital circuitry portion 3 to the analog circuitry portion 2.
The analog circuitry portion 2 has an audio input 4 for receiving an analog audio signal, and a digital output 6 for outputting digital audio data to the digital circuitry portion 3. The analog circuitry portion 2 also has an analog audio output 5 for outputting an analog output signal, and a digital input 7 for receiving digital audio data from the digital circuitry portion 3. In this embodiment, the audio input 4 and audio output 5 are each in the form of a balanced pair of signals. Similarly, in this embodiment the digital audio data output from the output 6 and the digital audio data input to the input 7 are each in serial form.
The analog circuitry portion 2 includes an amplifier 10, a filter 12, and an analog-to-digital converter (ADC) 14. As will be explained later in more detail, the audio signal applied to the audio input 4 may be an original audio signal that is to be encoded, or an already-encoded audio signal which is to be decoded.
The analog audio signal applied to the input 4 is amplified by the amplifier 10, subjected to predetermined filtering (for example, bandpass filtering for excluding frequencies outside a preselected audio range such as 50 Hz to 18 kHz) by the filter 12 and then digitised by the ADC 14. The sampling frequency of the ADC 14 is, for example, 44.1 kHz and each sample is made up of, for example, 18 bits, with positive amplitude values of the analog signal being represented by digital values from 000000H to 01FFFFH, and negative amplitude values of the analog signal being represented by digital values from 020000H to 03FFFFH. The 18 bits of each sample are output serially from the output 6 at a bit rate of 793.8 kbits/s.
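The figures quoted for the sample format can be checked with a short Python sketch. The function names are hypothetical and not part of the described apparatus; the sketch only assumes, as the stated value ranges imply, that bit 17 of the 18-bit word acts as the sign bit.

```python
SAMPLE_RATE_HZ = 44_100   # sampling frequency given in the text
BITS_PER_SAMPLE = 18      # sample width given in the text


def serial_bit_rate(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bit rate of the bit-serial link in bits per second."""
    return sample_rate_hz * bits


def is_negative18(word):
    """True if an 18-bit sample word lies in the negative range 020000H-03FFFFH."""
    if not 0 <= word <= 0x3FFFF:
        raise ValueError("not an 18-bit word")
    return bool(word & 0x20000)   # bit 17 is set for every negative word
```

With these values, `serial_bit_rate()` gives 793800 bits/s, which matches the quoted 793.8 kbits/s.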
The analog circuitry portion 2 also comprises a digital-to-analog converter (DAC) 16, a filter 18, and an amplifier 20. As described later in more detail, the digital audio data applied to the input 7 by the digital circuitry portion 3 represents in digital form the encoded audio signal to be output by the apparatus 1. The digital audio data applied to the input 7 is made up of 18-bit samples in bit-serial form at a bit rate of 793.8 kbits/s. The bit-serial samples are received by the DAC 16 and converted at a conversion rate of 44.1 ksamples/sec into a corresponding analog signal which is then subjected to predetermined filtering by the filter 18, for example the same filtering as carried out by the filter 12. The filtered analog signal is then amplified by the amplifier 20 and output via the audio output 5.
The digital circuitry portion 3 has a digital input 24 connected to the digital output 6 of the analog circuitry portion 2 for receiving therefrom the digital audio data produced thereby and also has a digital output 26 connected to the digital input 7 of the analog circuitry portion 2 for applying thereto the digital audio data produced by the digital circuitry portion. The digital circuitry portion 3 also has an input/output port 28 by which the apparatus 1 is connected to a host unit 50, for example a personal computer (PC). Data DATAENCODE generated by the host unit 50 to be encoded into an original audio signal is input to the apparatus 1 via the input/output port 28, and data DATADECODE, representing decoded data extracted from an encoded audio signal, is output from the apparatus 1 to the host unit 50 via the input/output port 28. The data DATAENCODE and DATADECODE are transferred using a predetermined communications protocol, for example an RS-232 communications protocol at a data rate (Baud rate) of 9600 bits per second.
The digital circuitry portion 3 includes an input data conversion unit 30, an output data conversion unit 32, a data capture unit 34, an encoder/decoder unit 36, a host interface unit 38, and a memory unit 40. Each of the units 34, 36 and 38 has its own dedicated microprocessor in this embodiment. The microprocessor in each of the units 34 and 36 is, for example, a PIC 17C44 microprocessor operating at a frequency of 33 MHz. The microprocessor in the unit 38 is, for example, a PIC 16C74 microprocessor operating at a lower frequency such as 16 MHz.
The input data conversion unit 30 includes a serial-to-parallel converter. The serial-to-parallel converter converts into 18-bit parallel form each sample of the digital audio data received in bit-serial form from the analog circuitry portion 2. The resulting parallel data is transferred to the data capture unit 34 which causes the data to be stored in the memory unit 40 as a block of 3 bytes. The 3-byte block contains 24 bits, and the data capture unit 34 maps the 18 bits of each sample into the 18 most significant bits (MSBs) of the 3-byte blocks stored in the memory unit 40. Thus, samples representing positive amplitude values have stored values from 000000H to 7FFFC0H. Samples having negative amplitude values, on the other hand, have stored values in the range from 800000H to FFFFC0H.
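The mapping of an 18-bit sample into the 18 most significant bits of a 3-byte block amounts to a left shift by six bit positions. The following Python sketch illustrates this; the function names are hypothetical, and the big-endian byte order within the block is an assumption that the text does not specify.

```python
def pack_sample18(word):
    """Left-align an 18-bit sample word in a 24-bit (3-byte) block."""
    if not 0 <= word <= 0x3FFFF:
        raise ValueError("not an 18-bit word")
    # Shifting by 6 leaves the 6 least significant bits of the block at zero.
    return (word << 6).to_bytes(3, "big")   # byte order is an assumption


def unpack_sample18(block):
    """Recover the 18 most significant bits of a 3-byte block."""
    if len(block) != 3:
        raise ValueError("expected a 3-byte block")
    return int.from_bytes(block, "big") >> 6
```

The extreme 18-bit values 01FFFFH, 020000H and 03FFFFH map to 7FFFC0H, 800000H and FFFFC0H respectively, in agreement with the stored ranges given above.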
In this embodiment, the apparatus is designed to carry out encoding of an original audio signal in "real time". To this end, the memory unit 40 is provided with sufficient capacity to store a predetermined number N of samples of the original audio signal. For example, with a memory capacity of 1 Mbyte, N=349525, corresponding to a period of just under eight seconds in the audio signal.
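The value of N and the corresponding buffer duration follow from simple arithmetic, sketched below in Python. The sketch assumes that 1 Mbyte means 2**20 bytes, which is the interpretation that reproduces N=349525; the variable names are illustrative only.

```python
MEMORY_BYTES = 1 << 20     # 1 Mbyte, assumed to mean 2**20 bytes
BYTES_PER_BLOCK = 3        # one 3-byte block per sample
SAMPLE_RATE_HZ = 44_100    # sampling rate given in the text

# Number of whole 3-byte blocks that fit in the memory unit.
n_blocks = MEMORY_BYTES // BYTES_PER_BLOCK

# Duration of audio represented by a full buffer, in seconds.
buffer_seconds = n_blocks / SAMPLE_RATE_HZ

print(n_blocks, round(buffer_seconds, 2))
```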
The data capture unit 34 performs a series of operating cycles. Each operating cycle is of duration 22.7 µs, corresponding to the above-mentioned sampling rate of the audio signal of 44.1 kHz.
In each operating cycle, one sample of the digital audio data is stored as a 3-byte block of data in the memory unit 40 at a storage location identified by a write pointer WP maintained by the data capture unit 34. Also, in each operating cycle, a previously-stored block of data is read out from the memory unit 40 by the data capture unit 34 from a storage location identified by a read pointer RP maintained by the data capture unit 34. Each of the read and write pointers RP and WP move sequentially in circular fashion through the memory-unit storage locations, and the amount by which the write pointer WP is ahead of the read pointer RP defines the amount of digital audio data that is available to the encoder/decoder unit 36 for manipulation at any time. When, as indicated above, the memory unit has a storage capacity of 349525 blocks, and assuming the maximum separation between the write and read pointers WP and RP, nearly eight seconds of digital audio data will be available for manipulation.
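The behaviour of the write and read pointers can be illustrated with a minimal Python sketch of a circular buffer. The class and method names are hypothetical, and the sketch assumes a fixed lead of WP over RP with the memory initially filled with zero-valued blocks; the text does not say how the pointers are initialised.

```python
class CaptureBuffer:
    """Circular store of 3-byte blocks with separate write and read pointers."""

    def __init__(self, n_blocks, lead):
        if not 0 < lead < n_blocks:
            raise ValueError("lead must be between 1 and n_blocks - 1")
        self.mem = [bytes(3)] * n_blocks   # assumed to start as silence
        self.wp = lead                     # write pointer WP starts ahead
        self.rp = 0                        # read pointer RP

    def available(self):
        """Number of blocks by which WP is ahead of RP."""
        return (self.wp - self.rp) % len(self.mem)

    def cycle(self, new_block):
        """One operating cycle: read one stored block, store one new block."""
        out = self.mem[self.rp]
        self.mem[self.wp] = new_block
        # Both pointers advance by one location and wrap at the end of memory.
        self.wp = (self.wp + 1) % len(self.mem)
        self.rp = (self.rp + 1) % len(self.mem)
        return out
```

Each block written emerges again after `lead` cycles, and in the meantime it is available in memory for manipulation.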
When a block of data is read out from the storage location pointed to by the read pointer, the 18 most significant bits thereof are passed by the data capture unit 34 to the output data conversion unit 32. The output data conversion unit 32 comprises a parallel-to-serial converter which converts the 18 bits into bit-serial form and outputs them via the output 26 to the analog circuitry portion 2.
In this embodiment, the encoder/decoder unit 36 is capable selectively of performing encoding of data DATAENCODE supplied from the host unit 50 into an original audio signal, and decoding of an already-encoded audio signal to supply the host unit 50 with the decoded data DATADECODE. To perform either the encoding or decoding, the encoder/decoder unit 36 has access to the blocks of digital audio data stored in the memory unit 40 at any given time, and has its own processing pointer PP, independent of the write and read pointers WP and RP, identifying the storage location, or group of storage locations, currently being accessed.
Incidentally, for the sake of simplicity the above description has assumed that there is a single audio channel. To deal with a stereo audio signal one analog circuitry portion 2 would be provided per channel. In this case the digital input 24 of the digital circuitry portion 3 would receive a left-channel sample and a right-channel sample in bit-serial form, with the left-channel samples being produced at the sampling rate of 44.1 kHz and the right-channel samples also being produced at that rate. In the digital circuitry portion 3 the data capture unit 34 stores a pair of samples (left-channel sample and right-channel sample) in each operating cycle, and reads out another pair of samples (left-channel sample and right-channel sample) in each operating cycle. Only the samples of a preselected one of the channels (e.g. the left channel) are manipulated by the encoder/decoder unit 36.
Firstly, operation of the encoder/decoder unit 36 when performing an encoding operation will be described with reference to Figures 2A and 2B.
Figure 2A shows a portion of the stored digital audio data held in the memory unit 40. When such stored digital audio data is to be encoded, it is firstly necessary to analyse the stored data to identify therein candidate waveform elements for encoding purposes. Such candidate waveform elements must satisfy a set of predetermined criteria established taking into account psychoacoustic masking so that, to a human listener, the encoded audio signal will be indistinguishable from the original (non-encoded) audio signal. In systems in which it is expected that the encoded audio signal will be subject to compression, such as MPEG2 or MPEG3 compression, different or further predetermined criteria may be applied to reflect further considerations needed to ensure that the compression applied to the encoded audio signal will not result in the loss of the encoded data.
Referring to Figure 2A, the basic criteria for identifying a candidate waveform element will be explained. In this embodiment, a candidate waveform element is a portion of the digital audio data between two successive zero-amplitude crossover points, for example the points A and D in Figure 2A. In this case, the first crossover point A is a crossover point from negative to positive amplitude and the second crossover point D is a crossover from positive to negative amplitude, but candidate waveform elements in this embodiment also include negative-going waveform elements that begin with a positive-to-negative crossover point and end with a negative-to-positive crossover point.
In this embodiment, positive-going candidate waveform elements that meet the relevant criteria are used to encode a bit "1" of the data DATAENCODE to be encoded, and negative-going candidate waveform elements that meet the relevant criteria are used to encode a bit "0" of the data DATAENCODE to be encoded. Thus, for each successive bit of data DATAENCODE to be encoded, the encoder/decoder unit 36 first determines whether the bit concerned is a "1" or a "0". If it is a bit "1", then a positive-going candidate waveform element meeting all of the relevant criteria must be searched for in the stored digital audio data, whereas if the bit concerned is a bit "0" a negative-going candidate waveform element that meets all the relevant criteria must be searched for in the stored digital audio data.
In the case in which a bit "1" is to be encoded, a positive-going candidate waveform element must be searched for. Thus, referring to Figure 3, which shows a flowchart of the steps performed by the encoder/decoder unit 36 to identify a candidate positive-going waveform element, in a first step S10 the next negative-to-positive crossover in the stored digital audio data is located, i.e. the next point at which the amplitude of one sample is negative and the amplitude of the next sample is positive. The most significant bit of any positive sample is 0, whereas the most significant bit of any negative sample is 1, so that to detect the crossover it is only necessary to look at the most significant bit of each sample to determine when, from one sample to the next, it changes from 1 to 0.
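The crossover search of step S10 can be sketched in Python as a scan of the most significant bit of successive stored samples. The function names are hypothetical, samples are represented as 24-bit stored values held in a list, and returning the index of the first positive sample is a convention chosen for this sketch.

```python
def is_negative(stored):
    """True if the most significant bit of a 24-bit stored sample is 1."""
    return bool(stored & 0x800000)


def next_neg_to_pos(samples, start=0):
    """Index of the first positive sample that follows a negative one, or None."""
    for i in range(max(start, 0), len(samples) - 1):
        # A crossover is a change of the most significant bit from 1 to 0.
        if is_negative(samples[i]) and not is_negative(samples[i + 1]):
            return i + 1
    return None
```

For example, in the sequence `[0x000100, 0xFFFF00, 0xFFF000, 0x000040]` the only negative-to-positive crossover is reported at index 3.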
When a negative-to-positive crossover point (e.g. point A in Fig. 2A) is found, in step S15 a first test criterion, which is a frequency test criterion, is applied to the positive-going waveform element starting from the crossover point. In step S15 a measure FCWE of frequency of the candidate waveform element is calculated. In this embodiment, the frequency measure is simply determined by identifying when the stored digital audio data has its next crossover point from positive to negative (point D in Fig. 2A). Again, this simply involves identifying the first sample following the point-A sample at which the most significant bit becomes a 1. The required frequency measure is then simply the number of samples between the point-A crossover and the subsequent point-D crossover.
In step S20, the frequency measure FCWE is then tested to see whether it is between a desired minimum frequency FMIN and a desired maximum frequency FMAX. For example, FMIN may be 44 samples and FMAX may be 19 samples when the sampling rate is 44.1 kHz. If the audio signal is a pure sinewave, for example, the minimum value of 44 samples corresponds to a frequency of approximately 500 Hz, and the maximum value of 19 samples corresponds to a frequency of approximately 1.2 kHz.
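Because FCWE counts the samples in one half-cycle, a pure sinewave with a half-cycle of n samples has a frequency of fs/(2n). The following Python sketch shows the conversion and the step S20 range test; the function names are hypothetical, and treating both limits as inclusive is an assumption.

```python
SAMPLE_RATE_HZ = 44_100
F_MIN_SAMPLES = 44   # longest half-cycle accepted (lowest frequency)
F_MAX_SAMPLES = 19   # shortest half-cycle accepted (highest frequency)


def half_cycle_to_hz(n_samples, fs=SAMPLE_RATE_HZ):
    """Frequency of a pure sinewave whose half-cycle spans n_samples samples."""
    return fs / (2 * n_samples)


def frequency_ok(fcwe):
    """Step S20: is the half-cycle length within the desired range?"""
    # Limits assumed inclusive; note that FMAX is the smaller sample count.
    return F_MAX_SAMPLES <= fcwe <= F_MIN_SAMPLES
```

A half-cycle of 44 samples gives about 501 Hz and one of 19 samples gives about 1161 Hz, consistent with the approximate figures of 500 Hz and 1.2 kHz.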
In step S20, if the frequency measure FCWE is not within the desired range, processing returns to step S10 to continue the search.
If the frequency measure FCWE is within the desired range in step S20, in step S25 the first point (point B) following point A which has an amplitude within a desired amplitude range is located. In this embodiment, the desired amplitude range is set as a desired range of the most significant byte of the sample. This desired range is the range from 19H to 1CH for a positive-going candidate waveform. This corresponds to an amplitude range of 20% to 23% of the maximum positive amplitude value.
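The search for point B in step S25 can be sketched as a scan of the most significant byte of each stored sample. In this Python sketch the function names are hypothetical, samples are 24-bit stored values, and the range limits 19H and 1CH are assumed to be inclusive.

```python
B_LOW, B_HIGH = 0x19, 0x1C   # desired range of the most significant byte


def top_byte(stored):
    """Most significant byte of a 24-bit stored sample."""
    return (stored >> 16) & 0xFF


def find_point_b(samples, start, end):
    """Index of the first sample in [start, end) within the desired range, or None."""
    for i in range(start, end):
        if B_LOW <= top_byte(samples[i]) <= B_HIGH:
            return i
    return None
```

Here `start` would be the index of point A and `end` the index of point D, so that only the candidate waveform element itself is searched.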
If no sample (point B) within the desired amplitude range can be found in step S25, then in step S30, processing returns to S10 to start the search for a suitable positive-going candidate waveform element again.
In step S35, which is carried out if point B is found, the amplitude of the sample at point B is stored for subsequent use. Also, the samples following the point-B sample are analysed to locate the first point (point C) after the point B at which the amplitude decreases below the point-B amplitude. Then, in step S40 it is determined whether the time period from point B to point C is greater than a predetermined minimum time interval TMIN. TMIN may be measured in terms of the number of samples between the point-B sample and the point-C sample. In this embodiment (sampling rate of 44.1 kHz), TMIN is set at 16 samples, corresponding to approximately 0.35 milliseconds. Thus, TMIN in this embodiment is a minimum of 36% of FCWE (when FCWE = FMIN) and a maximum of 84% of FCWE (when FCWE = FMAX). If the period from point B to point C is shorter than TMIN, processing moves to step S65.
In step S65 it is determined whether the time from point C to point D is greater than or equal to TMIN. If not, processing returns to step S10 and the search for a new positive-going candidate waveform element begins again.
If in step S65 it is determined that the time between points C and D is greater than or equal to TRIN, the in step S70 an attempt is made to locate a new point B. This new point B is the first sample (if any) after the current point C at which the amplitude is within the desired amplitude range (the range used in step S25). Processing then jumps to step S30.
Steps S65 and S70 are used to cater for candidate waveform elements which have two or more portions in the desired amplitude range, the first of which fails to satisfy the criteria of steps S40 and 845. In this case, each of the second and subsequent portions is examined to see if it meets those criteria instead.
If the time interval in step S40 is greater than or equal to the minimum interval TMIN, then in step S45 it is determined whether the amplitude of the maximum-amplitude sample between points B and C is less than or equal to a desired maximum amplitude AMAX. In this embodiment, the desired maximum amplitude AMAX is set as a desired maximum value of 20H of the most significant byte of the sample (approximately 25% of the full-scale amplitude). If the maximum amplitude between points B and C exceeds the desired maximum amplitude value AMAX, then processing returns to step S10 to search for the next candidate positive-going waveform element.
Otherwise, it is determined that all of the basic criteria for the candidate waveform element to be used for encoding have been satisfied.
In systems in which no compression of the audio signal will be carried out, this set of basic criteria, which relate to the characteristics of the candidate waveform element alone, are the only criteria that need to be satisfied. However, in general, it must be checked (steps S50 and S55) whether certain further predetermined criteria relating to waveform elements adjacent to the candidate waveform element itself are also satisfied. These further criteria will be described later in reference to Fig. 6.
If the further criteria are found to have been satisfied in step S55 then, in step S60 the candidate waveform element is encoded by setting all of the samples between points B and C, as stored in the memory unit 40, to the point-B amplitude value stored in step S35. As shown in Fig. 2B, this has the effect of making the waveform element of constant amplitude, i.e. flat, between points B and C.
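The basic criteria and the flattening operation for a single positive-going element may be sketched as below. This is a minimal illustration rather than the implementation of the embodiment: it assumes 16-bit samples held in a Python list, treats a zero-valued sample as positive, omits the further criteria of steps S50 and S55 (as in a system without compression), and uses hypothetical function names.

```python
F_MIN, F_MAX = 44, 19        # frequency-measure limits in samples (step S20)
B_LOW, B_HIGH = 0x19, 0x1C   # point-B range of the most significant byte (step S25)
T_MIN = 16                   # minimum B-to-C length in samples (step S40)
A_MAX = 0x20                 # maximum most significant byte between B and C (step S45)


def msb(sample):
    """Most significant byte of a non-negative sample (16-bit samples assumed)."""
    return sample >> 8


def try_encode_positive(samples, a):
    """Try to encode a bit "1" in the positive-going element starting at index a.

    `a` is the first non-negative sample after a negative one (point A).
    On success the list is modified in place and (b, c) is returned;
    otherwise None is returned and the caller resumes the search (step S10).
    """
    # S15: point D is the next positive-to-negative crossover.
    d = a
    while d < len(samples) and samples[d] >= 0:
        d += 1
    if d == len(samples):
        return None  # element runs off the end of the buffer
    # S20: frequency criterion.
    if not (F_MAX <= d - a <= F_MIN):
        return None
    start = a
    while True:
        # S25 / S70: first sample at or after `start` inside the point-B range.
        b = next((i for i in range(start, d)
                  if B_LOW <= msb(samples[i]) <= B_HIGH), None)
        if b is None:
            return None  # S30: no point B, restart the search
        # S35: point C is the first later sample below the point-B amplitude.
        c = next((i for i in range(b + 1, d) if samples[i] < samples[b]), d)
        if c - b >= T_MIN:  # S40
            # S45: the peak between B and C must not exceed A_MAX.
            if max(msb(s) for s in samples[b:c]) > A_MAX:
                return None
            # S60: flatten everything between B and C to the point-B amplitude.
            samples[b:c] = [samples[b]] * (c - b)
            return b, c
        # S65: is there still room for another attempt before point D?
        if d - c < T_MIN:
            return None
        start = c + 1  # S70: look for a new point B after the current point C
```

The loop reflects steps S65 and S70: when the first portion in the amplitude range is too short, later portions of the same element are examined before the element is abandoned.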
It will be appreciated that the basic criteria which must be satisfied by the candidate waveform element, as set out in steps S10 to S45, are intended to ensure that the candidate waveform element, after encoding, will have the appropriate psychoacoustic masking properties for the expected type of audio signal to be encoded. The frequency criterion of steps S15 and S20 is designed to select, as candidate waveform elements, only elements within a frequency range that in the expected audio signal is generally "noisy", i.e. contains a relatively large number of high-amplitude frequency components. Such components will have a psychoacoustic masking effect on the candidate waveform element. For example, it has been determined empirically that the frequency range from 500Hz to 1.2kHz used in this embodiment tends to be noisy in radio advertisements.
Similarly, the amplitude range criterion for point B is designed to select, as candidate waveform elements, only elements of moderately high amplitude, such elements tending to be high enough in amplitude to be robust on reproduction, whilst not "standing out" so much that, after the inevitable distorting effect of the encoding process, the distortion will become audible.
The maximum amplitude criterion of step S45 is designed to ensure that candidate waveform elements that are selected for encoding have amplitudes within a suitably-narrow range for the full time interval from points B to C. This means that, when the amplitude is altered (made flat in this embodiment) by the encoding process, the differences between the original waveform element and the encoded waveform element are relatively small and therefore do not lead to discernible distortion to a human listener.
Incidentally, the time interval criterion of step S40 is not so much a criterion devised with psychoacoustic masking considerations in mind. This time interval criterion is primarily intended to ensure that the altered (e.g. flat) portion of the waveform element after encoding is sufficiently long that no naturally-occurring waveform element within the original audio signal can be mistaken by the decoder for the encoded data.
The encoding of a bit "0" is essentially the same as that of a bit "1", except that, instead of encoding the bit in a positive-going candidate waveform element, a negative-going candidate waveform element must be found. Thus, instead of searching for a negative-to-positive crossover point (point A) as in step S10, a positive-to-negative crossover point is located. The frequency criterion is applied in the same way (steps S15 and S20). The desired amplitude range for a negative-going candidate waveform element (step S25) is from E6H to E3H. This corresponds to an amplitude range of 20% to 23% of the maximum negative amplitude value.
Similarly, for a negative-going candidate waveform element, point C is located at the first sample after point B when the amplitude increases above the point-B amplitude. The time interval criterion (step S40) is the same as for the positive-going case. Instead of testing for maximum amplitude of samples between B and C as in step S45 for the positive-going case, in the negative-going case the minimum sample value between the two points is tested to see that it is always above a desired minimum amplitude value AMIN = DFH. The further criteria of steps S50 and S55 are the same for the negative-going case, as is the "flattening" applied in step S60.
In this embodiment the altered part of the candidate waveform element is made flat. This makes the altered part sufficiently "unnatural", i.e. there will be no naturally-occurring waveform elements having such a flat characteristic for the required minimum duration TMIN of the altered part.
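The hexadecimal limits for the negative-going case can be related to those of the positive-going case with a short calculation. The sketch below assumes two's-complement samples, which the text does not state explicitly, and the helper names are hypothetical. Under that assumption each negative-going limit is the bitwise complement of the corresponding positive-going limit.

```python
def signed_byte(value: int) -> int:
    """Interpret an unsigned byte (0..255) as a two's-complement value."""
    return value - 0x100 if value & 0x80 else value


def complement(value: int) -> int:
    """Bitwise complement of a byte."""
    return value ^ 0xFF


def percent_of_negative_full_scale(value: int) -> float:
    """Magnitude of a negative most significant byte as % of -128."""
    return 100 * -signed_byte(value) / 128


# Positive-going limits (19H, 1CH, 20H) and their negative-going counterparts.
PAIRS = {0x19: 0xE6, 0x1C: 0xE3, 0x20: 0xDF}

if __name__ == "__main__":
    for pos, neg in PAIRS.items():
        print(f"{pos:02X}H -> {neg:02X}H = {signed_byte(neg)}")
```

On this reading E6H and E3H correspond to -26 and -29, i.e. roughly 20% to 23% of the maximum negative amplitude, and DFH corresponds to -33.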
In the encoding process in this embodiment, bits of the data DATAENCODE are encoded in the original audio signal at intervals, with the minimum interval between successive bits being 100 milliseconds. This minimum interval is chosen so that the repetition rate of the encoded bits is less than 10Hz, so that any potentially-significant lower-order harmonics of the bit repetition rate will not be audible to the human listener. Accordingly, after a bit has been encoded (step S60 in Fig. 3) the encoder/decoder unit 36 skips a number of samples corresponding to the minimum interval between encoded bits, i.e. 4410 samples in this embodiment.
In this embodiment, for error checking purposes, the encoder/decoder unit encodes each bit of the data DATAENCODE to be encoded twice in succession (still with the minimum interval between the successive bits).
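The spacing and repetition of the encoded bits may be illustrated as follows. This is a sketch only; the function names are hypothetical and the data is represented simply as a list of 0 and 1 values.

```python
SAMPLE_RATE_HZ = 44_100
MIN_INTERVAL_MS = 100  # minimum interval between successive encoded bits


def min_interval_in_samples(sample_rate_hz: int = SAMPLE_RATE_HZ,
                            interval_ms: int = MIN_INTERVAL_MS) -> int:
    """Number of samples skipped after each encoded bit."""
    return sample_rate_hz * interval_ms // 1000


def repeat_each_bit(bits, times: int = 2):
    """Sequence actually written to the signal: every bit encoded `times` in succession."""
    return [bit for bit in bits for _ in range(times)]


if __name__ == "__main__":
    print(min_interval_in_samples())
    print(repeat_each_bit([1, 0, 1]))
```

Because every bit is written twice, a message of N bits occupies at least 2N encoded waveform elements, each separated from the next by at least the minimum interval.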
In order to enable the decoder to identify the start of a sequence of encoded bits in a received audio signal, in this embodiment, before encoding such a sequence of bits into the signal the encoder unit 36 searches the stored digital audio data for two directly adjacent candidate waveform elements, one of which is positive-going and the other of which is negative-going. If the first bit of the sequence to be encoded is a "1" then the positive-going element must be the first of the two consecutive elements and the negative-going element must be the second of the two consecutive elements. If, on the other hand, the first bit of the sequence to be encoded is a "0" then the positive-going element must be the second of the two consecutive elements and the negative-going element must be the first of the two consecutive elements.
Ideally, each of the two candidate waveform elements must satisfy all the relevant criteria for its type of waveform element. However, in practice, finding two such consecutive waveforms may be impossible in a short section of the audio signal (e.g. less than 500ms). Accordingly, the criteria for one or both types of waveform element may be relaxed slightly when searching for the two consecutive waveform elements. For example, the search may be confined to an initial portion (e.g. the initial 500ms) of the audio signal. In that portion, if no "perfect match" is found (i.e. no two consecutive waveform elements both meeting all the relevant criteria are present) then "near misses" are considered (i.e. two consecutive waveform elements one or both of which just fail(s) to meet all the relevant criteria) and the best near miss selected.
When two consecutive waveform elements have been identified as a perfect match, or best near miss, as the case may be, the encoder unit 36 then encodes the appropriate one of the waveform elements as bit "1" and the other element as bit "0", so as to provide a reference or "signature" waveform for the decoder to search for to identify the start of a sequence of encoded bits.
As well as using the signature waveform to identify the start of a sequence of encoded bits, the decoder can also employ the signature waveform as received to calibrate itself automatically (self-calibration) to take account of level changes or other changes due, for example, to compression as the encoded audio signal is transmitted from the encoder to the decoder.
The signature waveform need not itself embody an item of data; instead, it could simply always be a bit "1" followed by a bit "0", with the encoded sequence of bits following the signature waveform at intervals.
Next, basic operations for decoding an audio signal encoded as described above with reference to Fig. 3 will be described.
In the decoding process, an encoded audio signal is received by the analog circuitry portion 2, and, as described previously in relation to the encoding process, at any given time samples of the digital audio data corresponding to just under eight seconds of the encoded audio signal are available in the memory unit 40 for processing by the encoder/decoder unit 36.
As in the case of encoding, in the decoding process candidate waveform elements satisfying a set of predetermined decoding criteria must be searched for in the stored digital audio data. The set of predetermined decoding criteria corresponds generally to the set of predetermined encoding criteria, although the criteria for decoding may be relaxed somewhat compared to the corresponding encoding criteria so as to allow for the effects of compression and other signal processing which occurs between the encoder and the decoder.
Also, as in the case of the encoding process, positive-going candidate waveform elements (encoding the bit "1") and negative-going candidate waveform elements (encoding the bit "0") are treated separately.
Fig. 4 shows a flowchart of the basic decoding process for a positive-going candidate waveform element.
In a first step S100, after finding a negative-to-positive crossover point (point A in Fig. 5) a measure FCWE of frequency of the candidate waveform element is calculated. As described previously in relation to Fig. 2A, in this embodiment the frequency measure is simply determined by identifying when the stored digital audio data has its next crossover point from positive to negative (point D in Fig. 5). The required frequency measure is then simply the number of samples between the point-A crossover and the subsequent point D crossover in Fig. 5.
In step S105, the frequency measure FCWE is then tested to see whether it is between a predetermined minimum frequency FMIN and a predetermined maximum frequency FMAX. For example, FMIN may be 44 samples and FMAX may be 19 samples when the sampling rate is 44.1kHz, as in the encoding process described earlier (step S20 in Fig. 3). As mentioned previously, if the audio signal is a pure sine wave, the minimum frequency value of 44 samples corresponds to a frequency of approximately 500Hz, and the maximum frequency value of 19 samples corresponds to a frequency of approximately 1.2kHz.
If in step S105 the frequency measure FCWE is not within the specified range, processing proceeds to step S195, in which it is determined that the candidate waveform element fails to satisfy the criteria to be an encoded waveform element. The procedure then ends.
If the frequency measure FCWE is within the desired range in step S105, in step S110 the first point (point B) following point A which has an amplitude that exceeds a predetermined amplitude threshold value is located. In this embodiment, the threshold amplitude value is set at 15% of the full-scale amplitude value.
If no sample (point B) exceeding the predetermined amplitude threshold value can be found in step S110, then in step S115 processing jumps to step S195 and the candidate waveform element is rejected.
In step S120, which is carried out if point B is found, the first sample after the point-B sample where the amplitude drops below the predetermined amplitude threshold value again is located (point C). In step S125 the maximum amplitude value among the samples between points B and C is calculated. Then, in step S130 it is determined whether or not the calculated maximum amplitude value is less than a preselected upper limit value. In this embodiment, the preselected upper limit value is 28% of the full-scale amplitude value. If not, processing jumps to step S150 (described below). In this way, a portion of the waveform element is identified, bounded by the points B and C, in which all the samples have amplitudes within a predetermined amplitude range. In this embodiment that range is from 15% to 28% of the full scale positive amplitude value. This range has a greater span than the range of amplitude values (20% to 23%) used for encoding, and includes margins on either side of the encoding range, so as to allow for amplitude changes arising from processing of the encoded signal.
Otherwise, in step S135 the differences in amplitude, between one stored sample and the next, for all of the samples in the identified portion between points B and C are examined. In step S140 it is checked whether, for a run of X consecutive samples between points B and C, all of the differences in amplitude are within a predetermined allowed range. In this embodiment, the predetermined allowed range is expressed in terms of a difference between the respective most significant bytes of the two consecutive samples being considered. In this embodiment, this difference must be either 0 or -1. In this embodiment, the number X of samples in the run is 10, which, when the sampling rate is 44.1kHz, corresponds to a time period of 0.227ms. This time period is set based on the predetermined minimum time TMIN applied during encoding (e.g. 0.35ms) but is made shorter (e.g. 30% shorter) than that time to provide a suitable operating margin.
If X consecutive samples for which all the differences are in the allowed range cannot be found in step S140, processing moves to step S195.
Otherwise, in step S145 it is determined that the candidate waveform element is an encoded waveform element carrying the data item (bit) "1". The procedure then ends.
In step S150 it is checked whether there are X or more samples following point C in the candidate waveform element. If there are not, the candidate waveform element must be rejected in step S195, and the procedure ends. If, however, X or more further samples exist, then in step S150 an attempt is made to locate a new point B after the current point C and processing returns to step S110. This caters for a candidate waveform element in which there is a first identified portion that fails the criteria of steps S130 and S140 but there is a second or subsequent identified portion that does meet those criteria.
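The Fig. 4 procedure for a positive-going element may be sketched as follows. This is an illustration under stated assumptions, not the decoder of the embodiment: samples are assumed to be 16-bit values in a Python list, a run of X samples is taken to mean X - 1 consecutive differences, the new search for point B in step S150 is started at the current point C, and the function names are hypothetical.

```python
FULL_SCALE = 0x7FFF              # assumption: 16-bit signed samples
F_MIN, F_MAX = 44, 19            # frequency-measure limits in samples (step S105)
THRESHOLD = 0.15 * FULL_SCALE    # point-B amplitude threshold (step S110)
UPPER_LIMIT = 0.28 * FULL_SCALE  # preselected upper limit (step S130)
X = 10                           # required run length in samples (step S140)


def has_flat_run(portion, x=X):
    """S135/S140: is there a run of x samples whose MSB steps are all 0 or -1?"""
    run = 1  # number of samples in the current run
    for prev, cur in zip(portion, portion[1:]):
        if (cur >> 8) - (prev >> 8) in (0, -1):
            run += 1
            if run >= x:
                return True
        else:
            run = 1
    return False


def decode_positive(samples, a):
    """Return True if the positive-going element starting at index a carries bit "1"."""
    # S100: point D is the next positive-to-negative crossover.
    d = a
    while d < len(samples) and samples[d] >= 0:
        d += 1
    # S105: frequency criterion (an incomplete element is also rejected).
    if d == len(samples) or not (F_MAX <= d - a <= F_MIN):
        return False  # S195
    start = a
    while True:
        # S110: first sample exceeding the amplitude threshold.
        b = next((i for i in range(start, d) if samples[i] > THRESHOLD), None)
        if b is None:
            return False  # S115 -> S195
        # S120: first later sample that drops below the threshold again.
        c = next((i for i in range(b + 1, d) if samples[i] < THRESHOLD), d)
        # S125/S130: peak of the identified portion against the upper limit.
        if max(samples[b:c]) < UPPER_LIMIT:
            return has_flat_run(samples[b:c])  # S140 -> S145 or S195
        # S150: are X or more samples left in the element after point C?
        if d - c < X:
            return False  # S195
        start = c  # look for a new point B after the current point C
```

A slowly varying natural waveform can also produce long runs of small most-significant-byte steps, which is one reason the decoder additionally relies on the frequency, amplitude and repetition checks described in the text.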
The decoding process for a negative-going candidate waveform element is essentially the same as for a positive-going candidate waveform element, but with the following differences.
Firstly, the point-B amplitude threshold is a negative amplitude value such as 15% of the full-scale negative amplitude value. In step S125 the minimum (most negative) amplitude value between points B and C is calculated, and then in step S130 it is determined whether or not the calculated minimum amplitude value is less negative than a preselected lower limit value such as 28% of the full-scale negative amplitude value.
In this way the predetermined amplitude range for a negative-going candidate waveform element is, for example, from 15% to 28% of the full-scale negative amplitude value.
Secondly, in step S140 the differences in the respective most-significant-bytes must be either 0 or +1 from one sample to the next for the X or more samples.
Next, the overall decoding process will be described with reference to Fig. 6.
In a first step S200 the decoder unit 36 analyses the stored digital audio data representing a received encoded audio signal to try to find therein the signature waveform referred to above. This signature waveform provides a temporal reference needed by the decoder unit to achieve synchronisation.
The signature waveform, when found, may also be used for self-calibration purposes by the decoder unit. In particular, the predetermined amplitude threshold applied in step S110 of the Fig. 4 procedure and/or the predetermined amplitude range applied in step S130 of the Fig. 4 procedure may be set "automatically" by the decoder unit in dependence upon the amplitude values of relevant parts of the signature waveform. This can enable the decoder unit to compensate for the effects of level changes and other signal processing in the signal propagation path of the encoded audio signal.
After the signature waveform has been found, in step S205 the decoder unit skips a number Y of samples of the stored digital audio data currently held in the memory unit 40. The number Y is based on the minimum interval between successive bits of encoded data in the encoded audio signal. When, for example, the minimum interval is 100ms, the number Y may be set to 4000, corresponding to a time period of 90ms. The skipping of Y samples is used to reduce the number of samples which need to be analysed so that the decoding process can be performed in "real time", if desired.
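The relationship between the skip count Y and the minimum bit interval can be checked with the following sketch, which assumes the 44.1kHz sampling rate of this embodiment. The function names are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100
Y = 4000  # samples skipped by the decoder after each detection


def samples_to_ms(n_samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration of n_samples samples in milliseconds."""
    return 1000 * n_samples / sample_rate_hz


def ms_to_samples(duration_ms: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of samples (rounded) covering duration_ms milliseconds."""
    return round(duration_ms * sample_rate_hz / 1000)


if __name__ == "__main__":
    margin = ms_to_samples(100) - Y
    print(f"Y covers {samples_to_ms(Y):.1f} ms; margin {margin} samples")
```

With these figures Y is a little shorter than the 4410-sample minimum interval, so the decoder resumes its search slightly before the earliest position at which the next encoded bit can occur.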
In step S210 the next zero-amplitude crossover point is found. Then, in step S215 the candidate waveform element starting from that crossover point is subjected to the predetermined decoding criteria for its particular type of waveform element. Thus, in the case of a positive-going candidate waveform element, the decoding criteria set out in Fig. 4 are applied. In the case of a negative-going candidate waveform element, the different decoding criteria described above are applied.
In step S220 it is determined whether the candidate waveform element subjected to the Fig. 4 procedure has met the decoding criteria. If not, it is checked in step S225 whether there are still enough samples left to analyse in which a candidate waveform element representing the current bit of data to be decoded could be present. For example, if the minimum interval between bits is 100ms, but already more than 200ms-worth of samples of the encoded audio signal have been analysed without finding a candidate waveform element meeting all of the relevant decoding criteria, it is determined in step S230 that a decoding error has occurred, and the procedure ends. Otherwise, processing returns to step S210 to continue the search for a candidate waveform element satisfying the relevant criteria.
When, in step S220, a candidate waveform element that meets the relevant criteria is found, processing passes to step S235. In step S235, Y samples are again skipped. From that point onwards, each candidate waveform element of the same type as the candidate waveform element found in step S215 (the first candidate waveform element) is subjected to the relevant decoding criteria (e.g. the Fig. 4 procedure in the case in which the first candidate waveform element found is a positive-going candidate waveform element).
In step S240 it is determined whether a second candidate waveform element of the same type as the first candidate waveform element and meeting all of the relevant decoding criteria has been found. If so, this means that two consecutive candidate waveform elements of the same type have both been found to meet the relevant decoding criteria. Accordingly, in S245 the validity of the bit corresponding to the two candidate waveform elements is confirmed. In other words, if both candidate waveform elements are positive-going waveform elements, the bit is confirmed as being a "1". If the two consecutive candidate waveform elements are both negative-going waveform elements, the bit is confirmed as being a "0".
In step S250 it is checked whether or not all of the bits making up the data encoded into the audio signal have been found. For example, the length of the encoded data (total number of bits making up the data) may be preset at a particular value, such as 40. In this case, the predetermined data can represent integer values in the range from 0 to approximately 1.1 x 10^12.
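The range of values available with a 40-bit data length follows from simple arithmetic, as the sketch below shows. The most-significant-bit-first ordering used when assembling the decoded bits into an integer is an assumption made for the example, as is the function name.

```python
DATA_LENGTH_BITS = 40  # example preset length of the encoded data


def bits_to_int(bits) -> int:
    """Assemble confirmed bits into an integer, first bit most significant (assumed)."""
    value = 0
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("each bit must be 0 or 1")
        value = (value << 1) | bit
    return value


MAX_VALUE = 2 ** DATA_LENGTH_BITS - 1  # largest value representable in 40 bits

if __name__ == "__main__":
    print(f"{MAX_VALUE:,} (about {MAX_VALUE:.1e})")
```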
If in step S250 it is determined that all of the bits of the encoded data have been found, the data is output via the input/output port 28 to the host unit 50 in step S255, and the procedure ends. Otherwise, processing returns to step S205 to search for the next pair of encoded bits.
If in step S240 a second "qualifying" candidate waveform of the same type as the first candidate waveform has not been found, processing proceeds to step S225 to resume the search for a suitable pair of candidate waveform elements.
Figs. 7 and 8 are schematic diagrams for illustrating the effects of compression of the encoded audio signal. The first example (Figure 7) shows a signature waveform made up of an encoded positive-going waveform element followed immediately by an encoded negative-going waveform element. The second example (Figure 8) shows an individual encoded positive-going waveform element.
In each example, the first plot (Fig. 7A or Fig. 8A) shows the effect of transmitting the encoded audio signal through a system in which no compression is employed. As is clear, in this case (no compression), the shape of the encoded candidate waveform element is maintained accurately through the transmission process, so that reliable decoding is possible.
However, the second plot (Fig. 7B or 8B) shows the results of the transmission of the same encoded audio signal through a system in which the signal is subjected twice to MPEG2 compression. It can be seen that in the case of Fig. 7B each encoded waveform element is quite distorted, making decoding unreliable.
In the Fig. 8B case, on the other hand, the encoded waveform element survives the compression process well, and is decodable reliably.
In view of the results shown in Figs. 7 and 8, it is desirable in systems in which compression is performed to further restrict the candidate waveform elements selected for encoding purposes so as to reject candidate waveform elements which, whilst meeting all of the relevant basic criteria as described above, will not survive the compression process well. The restriction of candidate waveform elements for this purpose may be achieved by setting further predetermined criteria relating to waveform elements adjacent to (before and/or after) the candidate waveform element itself. One example of the further predetermined criteria relating to the adjacent waveform elements will now be described with reference to Figs. 9 and 10.
Referring to Fig. 9, when a candidate waveform element CWE has been found that meets all of the basic criteria for that element by itself, the waveform element AWE1 immediately before the candidate waveform element CWE and the waveform element AWE2 immediately after the candidate waveform element CWE are analysed to see if each of them meets further predetermined criteria. In this example, the same criteria are applied to both the immediately-preceding and immediately-following waveform elements AWE1 and AWE2, and accordingly the procedure shown in the Fig. 10 flowchart is applied to both elements. Only if both elements satisfy all of the further predetermined criteria is the candidate waveform element CWE selected for encoding purposes.
Referring to Fig. 10, in a first step S300, a measure FAWE of frequency of the adjacent waveform element (immediately-preceding or immediately-following waveform element AWE1 or AWE2) being analysed is calculated. As described previously with reference to step S15 in Fig. 3, this frequency measure may simply be the number of samples between the two crossover points marking the beginning and end of the adjacent waveform element concerned. In step S305 the calculated frequency measure FAWE is tested to determine whether it is within a desired range of frequencies FMIN to FMAX. When, for example, the frequency measure FAWE is a number of samples between the beginning and end crossover points of the adjacent waveform element concerned, it may be tested in S305 whether that number of samples is between 3 and 17, which in the case of a pure sine wave corresponds to a minimum frequency FMIN of 1.3kHz and a maximum frequency FMAX of 8kHz.
If the adjacent waveform element fails to satisfy the frequency criterion then, in step S320, it is determined that the adjacent waveform element concerned has failed to satisfy the further criteria and the procedure ends.
If, on the other hand, the frequency criterion is found to have been satisfied in step S305, in step S310 it is determined whether or not the maximum absolute amplitude of the adjacent waveform element AWE1 or AWE2 between its beginning and end crossover points exceeds a predetermined minimum absolute amplitude value ABSMIN. This minimum absolute amplitude value is a relatively large value, for example requiring the most significant byte of the highest-absolute-amplitude sample to exceed 50H (approximately 60% of the full-scale amplitude value) in the case of a positive-going adjacent waveform element. If the adjacent waveform element fails the amplitude criterion of step S310 then, in step S320, it is determined that the adjacent waveform element concerned has failed to satisfy the relevant further criteria and the procedure ends. Otherwise, in step S315 it is determined that the adjacent waveform element AWE1 or AWE2 has satisfied the further predetermined criteria and the procedure ends.
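The Fig. 10 procedure may be sketched as follows. The sketch assumes that each adjacent waveform element is supplied as a list of the 16-bit samples lying between its two crossover points, and that a negative-going element is tested on the magnitude of its samples in the same way as a positive-going one; the text gives the 50H figure only for the positive-going case. The function names are hypothetical.

```python
AWE_MIN_SAMPLES, AWE_MAX_SAMPLES = 3, 17  # frequency-measure limits (step S305)
ABS_MIN_MSB = 0x50                        # most significant byte to be exceeded (step S310)


def adjacent_element_ok(element) -> bool:
    """Apply the further criteria of Fig. 10 to one adjacent waveform element."""
    # S300/S305: the frequency measure is the number of samples in the element.
    if not (AWE_MIN_SAMPLES <= len(element) <= AWE_MAX_SAMPLES):
        return False  # S320
    # S310: the highest-absolute-amplitude sample must exceed the minimum.
    peak = max(abs(sample) for sample in element)
    return (peak >> 8) > ABS_MIN_MSB  # S315 if True, S320 if False


def candidate_passes_further_criteria(awe1, awe2) -> bool:
    """The candidate is selected only if both neighbours satisfy the criteria."""
    return adjacent_element_ok(awe1) and adjacent_element_ok(awe2)
```

For instance, a five-sample neighbour peaking at 25000 passes both tests, whereas a neighbour of 20 samples fails the frequency test regardless of its amplitude.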
It will be appreciated that the further predetermined criteria in this case are being employed to avoid undesirable corruption of the candidate waveform element when subjected to particular types of compression and decompression. For example, in the case of MPEG2 compression, which uses sub-band adaptive pulse-code modulation (APCM), the incoming signal is split into 32 equally-spaced sub-bands. The bit allocation for each sub-band is dynamically controlled by information derived from a psychoacoustic modeller.
A 1024 point fast Fourier transform (FFT) unit provides data to the psychoacoustic modeller which has temporal and spectral filters which are designed around the aural sensitivities of the human ear. This filtering removes signals that would not be audible upon reproduction to a human listener. In this case, if a candidate waveform element were selected at a position in the audio spectrum where the spectral filter of the compression system would have effect, that is where a high-level tone would mask a low-level tone at a nearby frequency, then the candidate waveform element could be corrupted. For this reason, the further predetermined criteria applied to the adjacent waveform elements are designed to identify situations where the adjacent waveform elements represent such high-level tones that would tend to mask a low-level tone (candidate waveform element) at a nearby frequency.
It will be appreciated that the further predetermined criteria will be different for different compression systems. For example, in the case of APT X100 compression (a hardware-based fixed algorithm system which uses linear prediction), the losses are mainly in the higher sub-bands and therefore the further predetermined criteria in this case should be chosen to avoid candidate waveform elements whose adjacent waveform elements are in such higher-frequency sub-bands.
Another example of the predetermined criteria relating to the adjacent waveform elements will now be described with reference to Figs. 11(A) and 11(B).
Figs. 11(A) and 11(B) both show examples of candidate waveform elements CWE that meet the relevant basic criteria described above. However, it has been found empirically that candidate waveform elements for which the adjacent waveform elements are non-smooth (have a relatively high rate of change) do not survive the compression process well. On the other hand, candidate waveform elements for which the adjacent waveform elements are smooth (have relatively low rates of change) tend to survive the compression process satisfactorily.
Taking this into account, it is possible in another embodiment of the present invention for the further predetermined criteria relating to the waveform elements to include a rate-of-change criterion. For example, in each adjacent waveform element to be analysed, the respective absolute values of the changes in amplitude from one sample to the next could be summed. Then, the sum of the changes could be divided by the number of samples in the adjacent waveform element concerned to arrive at a rate-of-change (or volatility) measure for the adjacent waveform element concerned. This rate-of-change measure could be tested against a preselected threshold value so that, when the measure is higher than the threshold value, the candidate waveform element is rejected for encoding purposes. In this case, as described previously, one or more adjacent waveform elements before and/or after the candidate waveform element may be subjected to the rate-of-change analysis.
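The rate-of-change (volatility) measure described above may be sketched as follows. The threshold value shown is purely hypothetical, since the text does not give one, and the function names are likewise illustrative.

```python
RATE_THRESHOLD = 2000.0  # hypothetical preselected threshold, not from the text


def rate_of_change(element) -> float:
    """Sum of absolute sample-to-sample changes divided by the number of samples."""
    if not element:
        raise ValueError("element must contain at least one sample")
    total_change = sum(abs(cur - prev) for prev, cur in zip(element, element[1:]))
    return total_change / len(element)


def neighbours_smooth_enough(adjacent_elements, threshold: float = RATE_THRESHOLD) -> bool:
    """Reject the candidate if any analysed neighbour exceeds the threshold."""
    return all(rate_of_change(e) <= threshold for e in adjacent_elements)
```

A smooth neighbour such as [0, 1000, 2000, 1000] gives a measure of 750, whereas a jagged one such as [0, 9000, 0, 9000] gives 6750 and would cause the candidate to be rejected under the hypothetical threshold.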
The rate-of-change criterion may be supplemented by one or more further criteria relating, for example, to the amplitude in the adjacent waveform elements.
Furthermore, the criteria in Fig. 10 may be used in combination with the rate-of-change criterion.
Next, a practical application of encoding and decoding apparatus embodying the present invention will be described with reference to Fig. 12. As shown in Fig. 12, a media centre 200 includes a monitoring unit 220 and a distribution unit 240. The media centre 200 is, for example, operated by a media distribution company which distributes media products such as songs and advertisements to be broadcast by radio stations.
The media centre 200 is located, for example, at one of the business premises of such a media distribution company.
The distribution unit 240 includes a product storage section 245 for storing one or more media products (songs, advertisements, etc) to be distributed to radio stations. The distribution unit 240 also includes an encoder section 250 including encoding apparatus embodying the present invention. The distribution unit 240 is connected via a connection 260 to one or more radio stations (only one radio station 300 is shown in Fig. 12 by way of example). The connection 260 between the distribution unit 240 and the radio station 300 may be of any suitable form. For example, the connection may be via the Internet, via the public-switched telephone network (PSTN) or in any other form.
When a new product is to be distributed by the distribution unit 240 to the radio station 300, the product is retrieved from the product storage section 245 and is then encoded with predetermined data by the encoder section 250. The predetermined data is, for example, information identifying the product concerned and/or the radio station to which the product is to be distributed. Other information, such as for example the identity of the media distribution company operating the distribution unit 240, may also be included in the predetermined data.
After the product has been encoded, it is then transmitted via the connection 260 to the radio station 300. The product is preferably transmitted to the radio station as digital data, for example by a data downloading operation. The distribution unit 240 may transmit the digital data of the product to the radio station in compressed form using, for example, MPEG2 or MP3 compression, so as to reduce the downloading time.
The product is then broadcast by the radio station 300 from time to time. In the case of advertisements, the broadcast times will normally be agreed in advance with the media distribution company, for example so as to reach a particular target audience listening at particular times. In the case of songs, although no particular times of broadcast may be insisted upon by the media distribution company, the number of occasions on which the product is broadcast needs to be known in order for the correct royalty payments to be made to the owner of the copyright in the product.
The monitoring unit 220 in the media centre 200 includes a radio receiver 225 which continuously receives the radio signals broadcast by the radio station 300 and outputs audio signals derived from the broadcast radio signals. The monitoring unit 220 also includes a decoder section 230, including decoding apparatus embodying the present invention, which is connected to the radio receiver 225 for receiving the audio signals therefrom and for extracting therefrom any data encoded therein by the distribution unit 240.
Thus, whenever a product in which predetermined data has been encoded by the encoder section 250 in the distribution unit 240 is broadcast by the radio station 300, the decoder section 230 in the monitoring unit 220 decodes the predetermined data from the received audio signals and outputs the decoded data.
The monitoring unit 220 further includes a broadcast logging section 235 which derives items of information relating to the broadcasts of products containing encoded data and logs the items of information. The items of information logged for a given product may, for example, include the number of occasions on which the product has been broadcast and the exact time of each such broadcast. The logged items of information can then be made available for analysis by the media distribution company, for example for royalty payment calculation purposes or for the purpose of verifying that actual broadcast times comply with broadcast times agreed between the media distribution company and the radio station.
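The logging described above can be sketched, purely by way of example, as a small in-memory log keyed by the decoded product identifier. The class name, the method names and the tolerance-based comparison against agreed broadcast times are assumptions made for the illustration; the specification does not say how the broadcast logging section 235 stores or compares its items of information.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class BroadcastLog:
    """Hypothetical log of broadcast times, keyed by decoded product identifier."""

    def __init__(self):
        self._times = defaultdict(list)

    def record(self, product_id, when):
        """Log one detected broadcast of the product at time `when`."""
        self._times[product_id].append(when)

    def count(self, product_id):
        """Number of occasions on which the product has been broadcast."""
        return len(self._times.get(product_id, []))

    def unscheduled(self, product_id, agreed, tolerance):
        """Broadcast times not within `tolerance` of any agreed time."""
        return [
            t for t in self._times.get(product_id, [])
            if all(abs(t - a) > tolerance for a in agreed)
        ]


log = BroadcastLog()
slot = datetime(2001, 1, 9, 8, 0)
log.record("AD-1", slot + timedelta(minutes=1))   # close to the agreed slot
log.record("AD-1", slot + timedelta(hours=3))     # far from any agreed slot
late = log.unscheduled("AD-1", [slot], timedelta(minutes=5))
```

The count would feed a royalty calculation, and the list of unscheduled times would support verification of agreed advertisement slots.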
Many variations and modifications of the embodiments described above are possible.
In an encoding method embodying the present invention, it is not necessary for the altered part of an encoded waveform element to be made "flat", i.e. be made to have a constant amplitude value. In other embodiments, as shown by way of example in Figs. 13A to 13D, the data may be encoded by modifying the amplitude of the waveform element so that, in a preselected portion thereof, its amplitude has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes AA. In Fig. 13A, for example, the amplitude of the waveform element in the preselected portion rises at a constant rate, i.e. has the form of a ramp. In Fig. 13B, the amplitude of the waveform element falls at a constant rate whilst remaining within the narrow range of amplitudes AA.
In Fig. 13C, the amplitude of the waveform element is modified in the preselected portion to have a sinusoidal variation whilst remaining within the narrow range of amplitudes AA. In the sinusoidal variation case of Fig. 13C the frequency of the sinusoidal variation may be selected with psychoacoustic masking properties and/or resilience to a particular compression technique in mind. For example, where a particular type of compression such as MPEG2 compression is expected to be applied to the signal, it may be desirable to make a frequency of the sinusoidal variation higher than a frequency of the candidate waveform element in which it is encoded, for example of the order of twice as high in frequency.
The narrow range of amplitudes AA may be, for example, from 30% to 35% of the full-scale amplitude.
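The flat, ramp and sinusoidal shapes of the modified portion can be sketched as follows. This is a minimal illustration only: amplitudes are expressed as fractions of full scale, the default range of 30% to 35% is the example range given above, and the function name, the `shape` labels and the `cycles` parameter (number of sinusoid periods across the portion) are assumptions introduced for the example.

```python
import math


def modified_portion(n_samples, shape, low=0.30, high=0.35, cycles=2.0):
    """Return n_samples amplitudes (fractions of full scale) within [low, high].

    shape is one of "flat", "rising", "falling" or "sine" (labels assumed
    for this sketch; "rising" and "falling" are constant-rate ramps).
    """
    if n_samples < 2:
        raise ValueError("need at least two samples in the portion")
    span = high - low
    mid = (low + high) / 2.0
    values = []
    for i in range(n_samples):
        t = i / (n_samples - 1)          # position in the portion, 0.0 to 1.0
        if shape == "flat":
            v = mid
        elif shape == "rising":
            v = low + span * t
        elif shape == "falling":
            v = high - span * t
        elif shape == "sine":
            v = mid + (span / 2.0) * math.sin(2.0 * math.pi * cycles * t)
        else:
            raise ValueError("unknown shape: " + shape)
        values.append(v)
    return values
```

In each case the values stay inside the narrow range, so a decoder looking for a portion confined to that range would in principle accept any of the shapes.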
In another example, shown in Fig. 13D, the amplitude of the waveform element in the preselected portion is modified to have a non-sinusoidal but regular variation imposed on the original amplitude variation, which leaves the modified portion conforming more closely to the original waveform shape. The additional variation may be imposed, for example, by repeatedly making minor amplitude adjustments in groups of adjacent samples to ensure that the amplitude differences between successive pairs of samples are alternately positive and negative and are always small (e.g. 1 or 2% of full-scale amplitude). Incidentally, the imposed variation is shown in exaggerated form in Fig. 13D for the purposes of clarity. The frequency of the variation may be, for example, half the sampling rate.
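One possible way of imposing such an alternating variation is sketched below. It is a simple greedy pass, offered as an assumption rather than as the method of the embodiment: each output sample is moved relative to the previous output sample by a step whose sign alternates and whose size is kept between `min_step` and `max_step` (1% and 2% of full scale here). Because the sign flips on every sample, the imposed variation has a period of two samples, i.e. half the sampling rate.

```python
def impose_alternation(samples, min_step=0.01, max_step=0.02):
    """Return samples adjusted so successive differences alternate in sign.

    Amplitudes are fractions of full scale. The first difference is positive,
    the second negative, and so on. Step sizes are clamped to
    [min_step, max_step]; these limits are illustrative assumptions.
    """
    if not samples:
        return []
    out = [samples[0]]
    for i in range(1, len(samples)):
        sign = 1.0 if i % 2 == 1 else -1.0
        wanted = samples[i] - out[-1]     # step that would follow the original
        if wanted * sign > 0:
            # The original already moves the right way: keep it, within limits.
            size = min(max(abs(wanted), min_step), max_step)
        else:
            # Wrong direction (or no movement): take the smallest allowed step.
            size = min_step
        out.append(out[-1] + sign * size)
    return out
```

The output follows the original portion only loosely when the original changes quickly; a practical encoder would presumably restrict this treatment to portions that already satisfy the amplitude criteria.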
It will be appreciated that encoding and decoding apparatus embodying the present invention is not limited to being used for broadcast monitoring purposes. Embodiments of the present invention can be used in any situation in which it is desired to be able to encode predetermined data into an audio signal without the encoded audio signal being distinguishable from the original audio signal to a human listener.
For example, much attention has recently been directed to so-called "digital watermarking" techniques which will assist in preventing or restricting unauthorised distribution and/or copying of songs and other copyright products in digital form. Using digital watermarking techniques, detection of trafficking in unauthorised (pirate) copies of audio products may be facilitated.
It will also be appreciated that, although in the embodiment described above with reference to Fig. 1, the apparatus is capable selectively of carrying out encoding and decoding, it is not necessary for apparatus embodying the present invention to be capable of performing both functions. For example, embodiments capable only of encoding and only of decoding could be produced as separate units.
It is also not necessary for the apparatus to be capable of performing the encoding or decoding in "real time" as in the Fig. 1 embodiment. If the original audio signal to be encoded, or the encoded audio signal to be decoded, is stored in digital form, the encoding and decoding operations can be performed other than in real time. This could be useful, for example, to enable more sophisticated encoding and decoding processes to be applied to the stored data, for example to provide for improved error checking.
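As one simple illustration of error checking that becomes convenient when the decoded data is available in stored form, suppose that each item of data was encoded twice in succession (as contemplated in claim 26). The two copies of each bit can then be compared and any disagreement flagged. The function name and the use of `None` to mark an unreliable bit are assumptions made for this sketch.

```python
def check_repeated_bits(decoded):
    """Compare pairs of decoded bits where each payload bit was sent twice.

    Returns (bits, errors): `bits` holds one entry per payload bit, with
    None where the two copies disagree; `errors` lists the positions of
    those disagreements.
    """
    if len(decoded) % 2 != 0:
        raise ValueError("expected an even number of decoded items")
    bits, errors = [], []
    for k in range(0, len(decoded), 2):
        first, second = decoded[k], decoded[k + 1]
        if first == second:
            bits.append(first)
        else:
            bits.append(None)
            errors.append(k // 2)
    return bits, errors
```

A repetition check of this kind detects a single corrupted copy but cannot say which copy is correct; stronger schemes would need additional redundancy.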
In the Fig. 1 embodiment, the processing of the stored digital audio data is carried out by microprocessors but it will be understood by those skilled in the art that dedicated hardware circuitry could be used in place of such microprocessors in other embodiments of the present invention.
It will also be appreciated that encoding and/or decoding methods and apparatus embodying the present invention can be implemented by a general-purpose computer operating according to a program. In particular, the functions of the encoder/decoder unit 36 in Fig. 1 can be implemented by such a computer. In this case, the computer program may be provided in any suitable form. For example, the program may be provided by itself, or may be carried by a carrier medium. The carrier medium may be a recording medium such as a disk or CD-ROM. Alternatively, the carrier medium may be a signal such as a signal downloaded from a remote server via the Internet. The appended claims in the computer program category are to be interpreted as covering all of these possibilities.

Claims (48)

  1. A method of encoding data into an audio signal, comprising the steps of: producing respective measures of at least frequency and amplitude of waveform elements in the said audio signal; checking whether the produced measures for such a waveform element satisfy a set of predetermined criteria; and when a waveform element satisfying the said set of predetermined criteria is found, encoding an item of data into the audio signal by modifying a preselected portion of that waveform element so that, in that portion, its amplitude has a predetermined constant value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes.
  2. A method as claimed in claim 1, wherein each waveform element is a half-cycle portion of the audio signal between consecutive zero-amplitude crossover points in the audio signal.
  3. A method as claimed in claim 1 or 2, wherein the data to be encoded includes at least first and second items, and only first ones of the waveform elements, satisfying a first such set of predetermined criteria, are used for encoding the first items of data, whilst only second ones of the waveform elements, different from the said first waveform elements and satisfying a second such set of predetermined criteria, are used for encoding the second items of data.
  4. A method as claimed in claim 3 when read as appended to claim 2, wherein the said first items of data are encoded using respective positive-going half-cycle portions in which the audio-signal amplitude is positive, and the second items of data are encoded using respective negative-going half-cycle portions in which the said audio-signal amplitude is negative.
  5. A method as claimed in claim 3 or 4, wherein each of the said items is a single bit of data, and each said first item is a bit having one binary value and each said second item is a bit having the other binary value.
  6. A method as claimed in any preceding claim, wherein one of the said predetermined criteria is that the said frequency measure represents a frequency in the range from 500Hz to 1.2kHz.
  7. A method as claimed in any preceding claim, wherein one of the said predetermined criteria is that the amplitude of the waveform element is within a preselected limited range of amplitudes for at least a preselected minimum time.
  8. A method as claimed in claim 7, wherein a difference between respective upper and lower limits of the said limited range of amplitudes is 5% or less of a full-scale amplitude value of the said audio signal.
  9. A method as claimed in claim 7 or 8, wherein the said limited range of amplitudes is centred on an amplitude value of approximately 20% of a full-scale amplitude value of the said audio signal.
  10. A method as claimed in any one of claims 7 to 9, wherein the said preselected minimum time is at least 0.25 milliseconds.
  11. A method as claimed in any one of claims 7 to 10, wherein the said preselected minimum time is in the range from 30% to 90% of the total duration of the waveform element.
  12. A method as claimed in any one of claims 7 to 11, wherein the said preselected portion begins where an absolute value of the audio-signal amplitude first reaches a value within the said preselected limited range and ends where the said absolute value of the amplitude first decreases below the value first reached within that range.
  13. A method as claimed in claim 12, wherein the said predetermined constant value is made substantially equal to the said value first reached within the said preselected limited range.
  14. A method as claimed in any one of claims 7 to 13, wherein the said predetermined narrow range of amplitudes is no larger than the said preselected limited range of amplitudes.
  15. A method as claimed in any preceding claim, wherein, when a waveform element that satisfies the or such a set of predetermined criteria is found, one or more adjacent waveform elements before and/or after the waveform element concerned is/are analysed to determine whether the or each such adjacent waveform element satisfies a further set of predetermined criteria and, if not, the waveform element found is not used for encoding.
  16. A method as claimed in claim 15, wherein one of the said predetermined criteria of the said further set is that a frequency of the adjacent waveform element falls within a preselected frequency range.
  17. A method as claimed in claim 16, wherein the said preselected frequency range is from 1.3kHz to 8kHz.
  18. A method as claimed in any one of claims 15 to 17, wherein one of the said predetermined criteria of the said further set is that a maximum absolute value of the amplitude of the adjacent waveform element concerned is less than a maximum absolute value of the amplitude of the said preselected portion of the waveform element after it has been modified.
  19. A method as claimed in any one of claims 15 to 18, wherein one of the said predetermined criteria of the said further set is that an absolute amplitude of the adjacent waveform element concerned is less than 60% of a full-scale amplitude value of the said audio signal.
  20. A method as claimed in any one of claims 15 to 19, wherein one of the said predetermined criteria of the said further set is that a rate of change of amplitude of the adjacent waveform is less than a predetermined threshold value.
  21. A method as claimed in any preceding claim, wherein there is an interval of at least a preselected minimum time between successive waveform elements used for encoding.
  22. A method as claimed in claim 21, wherein the said preselected minimum time is at least 100 milliseconds.
  23. A method as claimed in any preceding claim, wherein a reference item is encoded in the said audio signal by identifying therein two consecutive waveform elements that each satisfy the set, or such a set, of predetermined criteria, and subjecting both the elements to such encoding.
  24. A method as claimed in claim 23, wherein the said reference item constitutes an initial one of said items of encoded data in the said audio signal.
  25. A method as claimed in claim 23 or 24 when read as appended to claim 3, wherein one of the said two successive waveform elements is such a first waveform element and the other of the said two successive waveform elements is such a second waveform element.
  26. A method as claimed in any preceding claim, wherein each said item of data to be encoded is encoded in the audio signal twice in succession.
  27. A method as claimed in any preceding claim, wherein the encoding of data into the audio signal is carried out in real time.
  28. A method of decoding data from an encoded audio signal, comprising the steps of: analysing waveform elements in the encoded audio signal to identify a waveform element therein whose amplitude, in a preselected portion of the waveform element, has a substantially constant value equal or close to a predetermined value, or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes; and when such a waveform element is identified, outputting an item of decoded data.
  29. A method as claimed in claim 28, further comprising the step of: determining whether, for at least a predetermined time period within the waveform element, an absolute value of the waveform-element amplitude stays the same or reduces by less than a predetermined amount per unit time.
  30. A method as claimed in claim 28 or 29, wherein the encoded audio signal contains items of encoded data at intervals of at least a preselected minimum time, further comprising the step of: after identifying one such waveform element, not subjecting to the said analysis waveform elements that follow the identified waveform element within a period of time set in dependence upon the said preselected minimum time.
  31. A method as claimed in any one of claims 28 to 30, wherein the said predetermined value, or the said predetermined narrow range of amplitudes, is set in dependence upon an amplitude value of a reference waveform element included in the encoded audio signal prior to waveform elements encoded with items of data.
  32. A method as claimed in any one of claims 28 to 31, wherein the decoding of data from the encoded audio signal is carried out in real time.
  33. Encoding apparatus, for encoding data into an audio signal, including: measuring means for producing respective measures of at least frequency and amplitude of waveform elements in the said audio signal; checking means for checking whether the produced measures for such a waveform element satisfy a set of predetermined criteria; and encoding means operable, when a waveform element satisfying the said set of predetermined criteria is found, to encode an item of data into the said audio signal by modifying a preselected portion of that waveform element so that in that portion its amplitude has a predetermined constant value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes.
  34. Decoding apparatus for decoding data from an encoded audio signal, including: identifying means for analysing waveform elements in the encoded audio signal to identify a waveform element therein whose amplitude, in a preselected portion of the waveform element, has a substantially constant value equal or close to a predetermined value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes; and decoding means operable, when such a waveform element is identified, to output an item of decoded data.
  35. A method of monitoring broadcasts by a radio station, including the steps of: encoding an audio signal to be broadcast by the radio station with predetermined data using a method as claimed in any one of claims 1 to 27; receiving the audio signal as broadcast by the radio station; decoding the received audio signal using a method as claimed in any one of claims 28 to 32 so as to extract therefrom the said predetermined data; and using the extracted predetermined data to collect information relating to broadcast of the audio signal by the radio station.
  36. A method as claimed in claim 35, wherein the said audio signal that is encoded is part of a copyright work or of a radio advertisement.
  37. A computer program which, when run on a computer, causes the computer to encode data into an audio signal, the program comprising: a measuring code portion for producing respective measures of at least frequency and amplitude of waveform elements in the said audio signal; a checking code portion for checking whether the produced measures for such a waveform element satisfy a set of predetermined criteria; and an encoding code portion which, when a waveform element satisfying the said set of predetermined criteria is found, encodes an item of data into the said audio signal by modifying a preselected portion of that waveform element so that in that portion its amplitude has a predetermined constant value or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes.
  38. A computer program which, when run on a computer, causes the computer to decode data from an encoded audio signal, the program comprising: an identifying code portion for analysing waveform elements in the encoded audio signal to identify a waveform element therein whose amplitude, in a preselected portion of the waveform element, has a substantially constant value equal or close to a predetermined value, or has a predetermined variation with time whilst remaining within a predetermined narrow range of amplitudes; and an outputting code portion for outputting an item of decoded data when such a waveform element is identified.
  39. A computer program as claimed in claim 37 or 38, carried on or by a carrier medium.
  40. A computer program as claimed in claim 39, wherein the carrier medium is a recording medium.
  41. A computer program as claimed in claim 39, wherein the carrier medium is a signal.
  42. A method of encoding data into an audio signal substantially as hereinbefore described with reference to the accompanying drawings.
  43. A method of decoding data from an encoded audio signal substantially as hereinbefore described with reference to the accompanying drawings.
  44. Encoding apparatus substantially as hereinbefore described with reference to the accompanying drawings.
  45. Decoding apparatus substantially as hereinbefore described with reference to the accompanying drawings.
  46. A method of monitoring broadcasts by a radio station substantially as hereinbefore described with reference to the accompanying drawings.
  47. A computer program which, when run on a computer, causes the computer to encode data into an audio signal, substantially as hereinbefore described with reference to the accompanying drawings.
  48. A computer program which, when run on a computer, causes the computer to decode data from an encoded audio signal, substantially as hereinbefore described with reference to the accompanying drawings.
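
The candidate-selection criteria recited in the method claims (half-cycle waveform elements between zero crossings, a frequency measure in the range 500Hz to 1.2kHz, and an amplitude that stays within a limited range for at least 0.25 milliseconds) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: samples are floats scaled to full scale, the frequency measure is taken as the reciprocal of twice the half-cycle duration, and the amplitude window of 17.5% to 22.5% of full scale and all function names are chosen for the example.

```python
import math


def half_cycles(samples):
    """Yield (start, end) index pairs for runs of same-sign samples.

    The first and last runs may be partial half-cycles; a fuller
    implementation would probably discard them.
    """
    start = 0
    for i in range(1, len(samples)):
        if (samples[i] >= 0) != (samples[start] >= 0):
            yield start, i
            start = i
    if samples:
        yield start, len(samples)


def is_candidate(samples, start, end, sample_rate,
                 f_lo=500.0, f_hi=1200.0,
                 a_lo=0.175, a_hi=0.225, min_time=0.00025):
    """Check the frequency and amplitude-dwell criteria for one half-cycle."""
    n = end - start
    freq = sample_rate / (2.0 * n)        # a half-cycle of n samples
    if not f_lo <= freq <= f_hi:
        return False
    longest = run = 0
    for x in samples[start:end]:
        if a_lo <= abs(x) <= a_hi:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest / sample_rate >= min_time


# A 600 Hz tone with peak amplitude 0.22, sampled at 48 kHz (assumed values).
rate = 48000
half = [0.22 * math.sin(math.pi * (i + 0.5) / 40) for i in range(40)]
signal = half + [-x for x in half]
elements = list(half_cycles(signal))
```

In this example each half-cycle lasts 40 samples, giving a frequency measure of 600Hz, and the amplitude remains within the assumed window for 16 consecutive samples (about 0.33 milliseconds), so both elements qualify.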
GB0100543A 2001-01-09 2001-01-09 Encoding and decoding of data in audio signals Expired - Fee Related GB2370955B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB0100543A GB2370955B (en) 2001-01-09 2001-01-09 Encoding and decoding of data in audio signals

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB0100543A GB2370955B (en) 2001-01-09 2001-01-09 Encoding and decoding of data in audio signals

Publications (3)

Publication Number Publication Date
GB0100543D0 GB0100543D0 (en) 2001-02-21
GB2370955A true GB2370955A (en) 2002-07-10
GB2370955B GB2370955B (en) 2005-04-20

Family

ID=9906511

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0100543A Expired - Fee Related GB2370955B (en) 2001-01-09 2001-01-09 Encoding and decoding of data in audio signals

Country Status (1)

Country Link
GB (1) GB2370955B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997031440A1 (en) * 1996-02-26 1997-08-28 Nielsen Media Research, Inc. Simultaneous transmission of ancillary and audio signals by means of perceptual coding
WO1997037448A2 (en) * 1996-04-03 1997-10-09 Aris Technologies, Inc. Apparatus and method for encoding and decoding supplementary data in analog signals

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
IEEE TRANS. MULTIMEDIA, Vol. 3, No. 2, June 2001, pp 232-241 *

Also Published As

Publication number Publication date
GB2370955B (en) 2005-04-20
GB0100543D0 (en) 2001-02-21

Similar Documents

Publication Publication Date Title
CA2405179C (en) Multi-band spectral audio encoding
US6879652B1 (en) Method for encoding an input signal
US8396705B2 (en) Extraction and matching of characteristic fingerprints from audio signals
CA3124234C (en) Methods and apparatus to perform audio watermarking and watermark detection and extraction
AU2009308305B2 (en) Methods and apparatus to perform audio watermarking and watermark detection and extraction
AU2001251274A1 (en) System and method for adding an inaudible code to an audio signal and method and apparatus for reading a code signal from an audio signal
EP2210252B1 (en) Methods and apparatus to perform audio watermarking and watermark detection and extraction
JP4478183B2 (en) Apparatus and method for stably classifying audio signals, method for constructing and operating an audio signal database, and computer program
AU2004201423B2 (en) System and method for encoding an audio signal, by adding an inaudible code to the audio signal, for use in broadcast programme identification systems
KR20020035116A (en) Scalable coding method for high quality audio
US20040039913A1 (en) Method and system for watermarking digital content and for introducing failure points into digital content
EP1497935B1 (en) Feature-based audio content identification
Petrovic et al. Data hiding within audio signals
US7466742B1 (en) Detection of entropy in connection with audio signals
GB2370955A (en) Encoding and decoding of data in audio signals
US10819884B2 (en) Method and device for processing multimedia data
AU2008201526A1 (en) System and method for adding an inaudible code to an audio signal and method and apparatus for reading a code signal from an audio signal

Legal Events

Date Code Title Description
PCNP Patent ceased through non-payment of renewal fee

Effective date: 20050720