CN100431355C - Modulating one or more parameters of an audio or video perceptual coding system in response to supplemental information - Google Patents

Modulating one or more parameters of an audio or video perceptual coding system in response to supplemental information

Info

Publication number
CN100431355C
Authority
CN
China
Prior art keywords
parameter
signal
watermark
parameters
decoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CNB018140629A
Other languages
Chinese (zh)
Other versions
CN1672418A (en
Inventor
马修·奥布雷·沃特森
迈克尔·米德·特鲁曼
斯蒂芬·德克尔·维尔农
布雷特·格拉汉姆·克罗克特
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dolby Laboratories Licensing Corp
Original Assignee
Dolby Laboratories Licensing Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corp
Publication of CN1672418A
Application granted
Publication of CN100431355C
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0021 Image watermarking
    • G06T1/0028 Adaptive watermarking, e.g. Human Visual System [HVS]-based watermarking
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/002 Dynamic bit allocation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46 Embedding additional information in the video signal during the compression process
    • H04N19/467 Embedding additional information in the video signal during the compression process characterised by the embedded information being invisible, e.g. watermarking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2201/00 General purpose image data processing
    • G06T2201/005 Image watermarking
    • G06T2201/0064 Image watermarking for copy protection or copy management, e.g. CGMS, copy only once, one-time copy

Abstract

A method of modifying the operation of the encoder function and/or the decoder function of a perceptual coding system in accordance with supplemental information, such as a watermark, so that the supplemental information may be detectable in the output of the decoder function. One or more parameters are modulated in the encoder function and/or the decoder function in response to the supplemental information.

Description

Modulating one or more parameters of an audio or video perceptual coding system in response to supplemental information
Technical field
The present invention relates to steganography in the context of audio or video signals. More particularly, the invention relates to modifying the operation of the encoder and/or decoder of an audio or video perceptual coding system in accordance with supplemental information so that the supplemental information is detectable in the decoder output. Such supplemental information is often called a "watermark". Watermarking is one form of steganography.
Background art
Steganography and watermarking
Steganography is the science of hiding one signal within another. The combining algorithm or process may be robust or "fragile"; in other words, destroying the hidden signal may be very difficult or very easy. In an audio application, a very fragile steganographic technique would use the least significant bits of a PCM channel to carry a data stream that is independent of the audio program carried in the upper bits. The hidden data channel carried in the least significant bits does not significantly distort the audio program; its effect is that of a low-level dither signal. The technique is fragile in the sense that simple audio processing, such as a gain change or digital-to-analog conversion, can destroy the data signal.
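The fragility of least-significant-bit hiding can be illustrated with a minimal sketch. The sample values, the hidden bit pattern, and the function names below are illustrative assumptions, not part of the patent; the sketch only shows that the hidden bits survive an untouched channel and are scrambled by a simple gain change.

```python
def embed_lsb(samples, bits):
    """Overwrite the least significant bit of each integer PCM sample with one hidden bit."""
    return [(s & ~1) | b for s, b in zip(samples, bits)]


def extract_lsb(samples):
    """Read the hidden bits back from the least significant bit of each sample."""
    return [s & 1 for s in samples]


def apply_gain(samples, gain):
    """Simple audio processing: scale every sample and round back to an integer."""
    return [int(round(s * gain)) for s in samples]


# Illustrative 16-bit PCM samples and hidden data (assumed values).
pcm = [1000, -2001, 3002, -4003, 5004, -6005, 7006, -8007]
hidden = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(pcm, hidden)
intact = extract_lsb(marked)                      # equals `hidden`
after_gain = extract_lsb(apply_gain(marked, 0.5))  # no longer equals `hidden`
```

Each marked sample differs from the original by at most one quantization step, which is why the hidden channel behaves like low-level dither rather than audible distortion.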
Watermarking is a form of steganography in which the signal-hiding technique is typically robust against destruction caused by normal processing or deliberate attack. Watermarking is therefore valuable in security-related applications such as copy protection or identification of content ownership. In such applications a watermark may carry, for example, copy control status, copyright information, and information about how the main program material is to be released. Even if the main program is later stolen or illegally copied, the watermark ideally remains embedded in the program material and provides evidence for establishing ownership.
One or more watermarks may be inserted at many points along the distribution path of the "content" (for example, an audio or video work). Information added to the signal at the start of the path may include copyright information or region control, while information added at the end of the signal chain may include playback information, such as a date stamp and/or a machine serial number. So that content can be traced to its source, watermarks may be embedded at various locations along the distribution path.
An important consideration in watermarking audio and video signals is that the hidden signal should not degrade the quality of the signal in which it is hidden. Ideally, a watermark is completely transparent; that is, the difference between the watermarked signal and the original signal is imperceptible to a human observer. The difference must, of course, be detectable by some means; otherwise the watermark is unrecoverable. For some applications, however, a deliberately perceptible watermark is useful. For example, an image may carry a visible watermark to prevent commercial use, and paper may be watermarked to convey a perceptible seal of authenticity.
The goals of watermarking may thus be summarized as follows:
A signal is modified by adding a secondary signal or supplemental information, resulting in a modified signal,
The difference between the original signal and the modified signal should be detectable but not perceptible, and
The modification should be difficult to remove or obscure.
Perceptual coding
Perceptual coding is the science of removing perceptual irrelevancy from a signal in order to reduce it to a more efficient representation. In some applications, for example, perceptual coding is used to reduce the data rate of a digital audio or video signal so that it fits within a predetermined channel capacity. Perceptual coding of audio and video signals is a well-established discipline that allows such signals to be reduced to quite low data rates for efficient storage and transmission.
Perceptual coders typically operate by analyzing the content of the original signal and identifying the perceptual relevance of each signal component. A modified version of the original signal is then generated such that the modified version can be represented at a lower data rate than the original. Ideally, the difference between the original signal and the modified signal is imperceptible. Note that quantization noise or other distortion is usually introduced in a controlled manner in order to lower the signal's data rate. The noise or other distortion is generated with the nature of human perception in mind, so that it remains imperceptible or minimally perceptible.
Perceptual coders employ a masking model intended to reflect human perception with some degree of accuracy. The masking model provides a perceptual masking threshold that establishes the boundary of perceptibility. The solid line in Fig. 1 represents the sound pressure level at which a sound such as a sine wave or a narrow band of noise is just audible, that is, the threshold of hearing. Sounds at levels above the curve are audible; sounds below it are not. The threshold is clearly frequency dependent. A person can hear a much quieter sound at about 4 kHz than at 50 Hz or 15 kHz. At 25 kHz the threshold is off the scale: no matter how loud the sound, it cannot be heard.
Consider the threshold, shown as the dashed line in Fig. 1, in the presence of a relatively loud signal at one frequency, for example a 500 Hz sine wave, shown as the vertical line in the figure. The threshold rises sharply in the immediate vicinity of 500 Hz, more modestly farther away in frequency, and not at all in remote parts of the audible range.
This rise in the threshold is called masking. In the presence of a loud 500 Hz sine wave signal (the "masking signal" or "masker"), signals below this threshold, which may be called the "masking threshold", are hidden, or masked, by the loud signal. Farther away, other signals may rise somewhat above the no-signal threshold yet still lie below the new masking threshold and thus remain inaudible. However, in remote parts of the spectrum where the no-signal threshold is unchanged, any sound that was audible without the 500 Hz masker remains just as audible with it. Thus, masking does not depend on the mere presence of one or more masking signals; it depends on where they lie in the spectrum. Some musical passages, for example, contain many spectral components distributed throughout the audible frequency range and therefore provide a masking threshold curve that is raised everywhere relative to the no-signal threshold curve. Other musical passages consist, for example, of a relatively loud solo instrument whose spectral components are confined to a small part of the spectrum, giving a masking curve more like the sine-wave masking example of Fig. 1.
Masking also has a temporal aspect that depends on the time relationship between the masker(s) and the masked signal(s). Some masking signals provide masking essentially only while the masking signal is present ("simultaneous masking"). Other masking signals provide masking not only while the masker is present but also slightly earlier in time ("backward masking" or "pre-masking") and slightly later in time ("forward masking" or "post-masking"). A "transient", that is, a sudden, brief, and significant increase in signal level, may exhibit all three "types" of masking: backward masking, simultaneous masking, and forward masking, whereas a steady-state or quasi-steady-state signal may exhibit only simultaneous masking.
All noise and distortion added by the perceptual coding process should remain below the masking threshold in order to avoid perceptible impairment. If the noise and distortion added by the coding process reach but do not exceed the masking threshold, the signal is said to be coded at the level of the "just noticeable difference". The "coding margin" of a system may be defined as the amount by which the noise or distortion it adds lies below the masking threshold: a zero coding margin means that the signal is coded at the just noticeable difference, a positive coding margin means that the added noise or distortion is imperceptible with some room to spare, and a negative coding margin means that there is perceptible impairment.
Note that different aspects of a signal (for example, bandwidth, temporal resolution, spatial accuracy) may be coded to different degrees of accuracy, so that different signal attributes have different coding margins. If a source signal is coded so that the coding margin is non-negative for all signal attributes, it may be said to be perceptually equivalent to the source.
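The coding margin defined above is a simple difference, which the following sketch works through. It assumes that the masking threshold and the added noise are both expressed in decibels for each signal attribute; the numeric values and function names are illustrative assumptions only.

```python
def coding_margin_db(masking_threshold_db, added_noise_db):
    """Amount (in dB) by which the added noise lies below the masking threshold."""
    return masking_threshold_db - added_noise_db


def describe(margin_db):
    """Interpret the sign of a coding margin as described in the text."""
    if margin_db > 0:
        return "imperceptible, with room to spare"
    if margin_db == 0:
        return "coded at the just noticeable difference"
    return "perceptible impairment"


def perceptually_equivalent(margins_db):
    """A coded signal is perceptually equivalent to its source if no margin is negative."""
    return all(m >= 0 for m in margins_db)


# (masking threshold, added noise) pairs for three signal attributes; assumed values.
attributes = [(42.0, 36.0), (30.0, 30.0), (25.0, 28.5)]
margins = [coding_margin_db(t, n) for t, n in attributes]
labels = [describe(m) for m in margins]
```

With these assumed numbers the third attribute has a negative margin, so the coded signal as a whole would not count as perceptually equivalent to the source.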
A perceptual coding system consists of an encoder and a decoder; the encoder may convey bit allocation information or perceptual model information to the decoder along with the coded data. There are three main types of perceptual coding system: forward adaptive, backward adaptive, and a hybrid of the two. In a forward adaptive system, the encoder explicitly sends bit allocation information to the decoder. A backward adaptive system sends no bit allocation or perceptual model information to the decoder; the decoder regenerates the bit allocation from the coded data. A hybrid system allows some allocation information, such as a less-than-full-resolution version of the perceptual model, to be included in the coded data, but much less than in a fully forward adaptive system. A more detailed discussion of these three types of perceptual coding system is given in "AC-3: Flexible Perceptual Coding for Audio Transmission and Storage," by Craig C. Todd et al., Preprint 3796, 96th Convention of the Audio Engineering Society, February 26-March 1, 1994. The perceptual coding systems developed by Dolby Laboratories, such as the Dolby Digital and Dolby E coding systems identified further below, are examples of hybrid forward/backward adaptive systems, while the MPEG-2 AAC coding system, also identified further below, is an example of a forward adaptive system.
The function of a perceptual coder may be summarized as follows:
A signal is modified, resulting in a modified signal,
The difference between the original signal and the modified signal should be imperceptible, and
The representation of the modified signal should be more efficient than the representation of the original signal.
Security
A watermark is only as strong a security measure as its ability to withstand direct attack. Many watermarking schemes in current use attempt to protect themselves from successful attack by keeping the details of the watermark secret, on the assumption that an attacker who has not been told how the watermark works will not know how to modify the watermarked signal to obscure the watermark data. This is the principle known as "security through obscurity". In the field of cryptography, security through obscurity is generally rejected as an unsound principle. If a system, algorithm, or process derives its security from secrecy, it takes only one person disclosing the secret details for the security of the whole system to be compromised.
The goals of security may be summarized as follows:
Protect content in such a way that theft of the content is impossible, or that evidence of subsequent misuse, and traceability to the source of the misuse, can be obtained,
Be robust against attack, and
Maintain a high level of security even at the weakest link in the system.
Disclosure of the invention
The present invention is directed to a method of modifying the operation of the encoder and/or decoder of a perceptual coding system in response to supplemental information so that the supplemental information is detectable in the output of the decoder. One or more parameters in the encoder and/or decoder are modulated in response to the supplemental information.
According to the present invention, supplemental information, such as watermark information, is conveyed by modulating one or more parameters in the encoder and/or decoder of a perceptual coding system so as to cause detectable, but preferably imperceptible, changes in the decoder output. The information is "supplemental" in that it supplements primary information, such as the audio or video information carried by the coding system. Typically, though not necessarily, such supplemental information has the nature of a "watermark". Modulating one or more parameters may be described as "embedding" supplemental or watermark information in the coded signal (when parameters in the perceptual encoder are modulated) and in the decoded signal (when parameters in the perceptual encoder and/or perceptual decoder are modulated).
When implemented at least partly in the encoder, the present invention is not intended to modify directly the bitstream data representing the primary information (nor to modify the primary information that becomes bitstream data after quantization in the perceptual encoder), although certain implementations of the invention may modify such bitstream data indirectly. The invention contemplates detecting the supplemental information in the output of the perceptual decoder rather than in the undecoded bitstream (regardless of whether the information is conveyed as a result of actions in the encoder and/or the decoder).
"Modulating" means changing the value of a parameter between or among two or more values (states), where those values may include a "default value", the value the parameter would have had but for the action of the invention. For example, a parameter's value may be changed between or among its default value and one or more other values (where a parameter has only two possible values, it is sometimes called a "flag", and it may be changed between those two values), or it may be changed between or among values that do not include the default value.
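The two cases described above (a two-valued flag that includes the default value, and a set of values that excludes it) can be sketched as a mapping from supplemental-information symbols to per-frame parameter values. The parameter values and the helper name `modulate` are hypothetical and do not correspond to any real codec parameter.

```python
DEFAULT_VALUE = 3  # hypothetical default value of some coding parameter


def modulate(symbols, values):
    """Map each supplemental-information symbol to the parameter value used for one frame."""
    return [values[s] for s in symbols]


# Case 1: a two-valued "flag" that alternates between its default and one other value.
flag_track = modulate([0, 1, 1, 0], values=(DEFAULT_VALUE, 4))

# Case 2: a four-symbol alphabet whose parameter values do not include the default.
multi_track = modulate([2, 0, 3, 1], values=(1, 2, 4, 5))
```

In the first case a detector only has to distinguish default from non-default behaviour; in the second, every frame departs from the default and the symbol is carried by which alternative value was used.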
"Modulating in response to" supplemental information or a watermark signal or sequence means that the modulation of the parameter is controlled, directly or indirectly, by the supplemental information or watermark signal or sequence, as when the control is modified by a function of one or more other signals, including, for example, a set of instructions such as a deterministic sequence, or the input signal applied to the coding system.
A "parameter" means a variable in the perceptual coding system that is not bitstream data representing the primary information. Examples of Dolby Digital (AC-3), MPEG audio, and MPEG video parameters suitable for modulation according to aspects of the invention are shown in the tables of Figs. 6, 7, and 8, respectively, below. The invention also contemplates the modulation of parameters, including one or more parameters not recognized in published perceptual coder standards.
"Bitstream data representing the primary information" means data bits in the coded bitstream that are produced by the perceptual encoder, have not been decoded, and carry the primary information, such as audio or video information. Bitstream data representing primary information includes, for example, exponents and mantissas in the case of a Dolby Digital (AC-3) system, and scale factors and Huffman-coded coefficients in the case of an MPEG-2 AAC system.
In complex perceptual coding systems (for example, Dolby Digital audio, MPEG audio, MPEG video), a large number of independent coding parameters provide considerable coding flexibility. "Dolby", "Dolby Digital", and "Dolby E" are trademarks of Dolby Laboratories Licensing Corporation.
Details of Dolby Digital coding are set forth in "Digital Audio Compression Standard (AC-3)," Advanced Television Systems Committee (ATSC), Document A/52, December 20, 1995 (available on the World Wide Web at www.atsc.org/Standards/A52/a_52.doc). See also the Errata Sheet of July 22, 1999 (available on the World Wide Web at www.dolby.com/tech/ATSC err.pdf).
Details of Dolby E coding are set forth in "Efficient Bit Allocation, Quantization, and Coding in an Audio Distribution System," AES Preprint 5068, 107th AES Conference, August 1999, and "Professional Audio Coder Optimized for Use with Video," AES Preprint 5033, 107th AES Conference, August 1999.
Details of MPEG-2 AAC coding are set forth in ISO/IEC 13818-7:1997(E) "Information technology - Generic coding of moving pictures and associated audio information - Part 7: Advanced Audio Coding (AAC)," International Standards Organization (April 1997); "MP3 and AAC Explained" by Karlheinz Brandenburg, AES 17th International Conference on High Quality Audio Coding, August 1999; and "ISO/IEC MPEG-2 Advanced Audio Coding" by Bosi et al., AES Preprint 4382, 101st AES Convention, October 1996.
An overview of various perceptual coders, including Dolby encoders, MPEG encoders, and others, is given in "Overview of MPEG Audio: Current and Future Standards for Low-Bit-Rate Audio Coding," by Karlheinz Brandenburg and Marina Bosi, J. Audio Eng. Soc., vol. 45, no. 1/2, January/February 1997.
Perceptual coding parameters are generally selected by the particular default settings of the coding system on the basis of input signal characteristics. However, there is usually more than one way to choose coding parameter values that yield decoded signals with no perceptible differences, and such variations in coding parameter values may result in decoded signals that have detectable but imperceptible differences. Note that imperceptibility refers to human perception, whereas detectability is based on the capability of a non-human detector.
The supplemental signal, or watermark, detector recovers the information embedded in the reproduced (decoded) signal. In the case of an audio signal, for example, detection may be accomplished acoustically in some cases and may require electronic detection in others. Electronic detection may be in the digital or the analog domain. Electronic detection in the digital domain may operate on the decoded output in the time domain or the frequency domain, or in the frequency domain within the decoder before the frequency-to-time transform. Extracting a watermark after acoustic processing is regarded as the more difficult challenge because of the addition of room noise, loudspeaker and microphone characteristics, and playback volume.
Many practical perceptual coding systems do not satisfy the requirement that the added noise remain below the just noticeable difference. The imperceptibility requirement in a perceptual coding system is often relaxed in order to meet bit rate targets or complexity constraints. In such cases, although the added noise may be perceptible, there are still values other than the default values to which coding parameters may be set during perceptual coding without making the perceptible noise any more perceptible. Although modulating a parameter may result in essentially imperceptible changes in the perceived noise, it may result in detectable changes in the decoded signal.
According to aspects of the invention, one or more parameters are preferably modulated so that the effect of the modulation causes the noise and distortion added by perceptual coding to be close to, but below, the just noticeable difference in all or part of the spectrum ("distortion" here means any difference between the coded signal and the original signal, which may or may not cause audible artifacts). Consequently, it is difficult to remove or obscure the effects of modulating one or more parameters without exceeding the masking threshold and producing audible impairment. On the other hand, if an attack stays below the masking threshold, part of the effect of the parameter modulation may not be preserved.
As noted above, aspects of the invention may also be employed when the encoder does not code a source signal such that noise and distortion are below the just noticeable difference. In that case, the source signal is coded in such a way that it is impaired relative to the source, and the parameter modulation causes an impairment of the decoded signal that is different from a detection standpoint but preferably essentially the same perceptually. As in the case above, it is difficult to remove or obscure the effects of parameter modulation in the decoded signal without extending the impairment to a more perceptible degree or introducing additional impairments.
The method of the present invention differs fundamentally from techniques that apply a watermark before perceptual coding. In those techniques, even though the coding system may have sufficient coding margin to convey the watermark, there is no assurance that the particular method chosen a priori to convey the watermark will coincide with the coding margin of the perceptual coding system. Because such a priori systems operate independently, they may inadvertently interact badly, causing perceptible impairment or causing the watermark to be obscured.
As noted above, a perceptual encoder reduces the data rate of an input signal by removing perceptually redundant information. A constant data rate encoder, for example, reduces a fixed input information rate to a lower fixed information rate. Part of this data reduction requires a function sometimes characterized as "rate control", which ensures that the encoder output does not exceed the final fixed information rate. Rate control reduces information until it reaches the target coded amount.
In some perceptual encoders, a distortion measure is paired with rate control to ensure that the correct information is discarded. The distortion measure compares the original input signal with the coded signal (the output of rate control). The distortion measure may be used to control coding parameters so as to change the outcome of the rate control process.
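The pairing of rate control with a distortion measure can be sketched with a toy quantizer. This is not the rate control of any real codec: the uniform quantizer, the crude bit-cost model, the step growth factor, and all names are assumptions made for illustration. The loop coarsens the quantizer until the coded size fits the bit budget and then reports the distortion between the original and the coded signal.

```python
def quantize(coeffs, step):
    """Uniform quantization of spectral coefficients with a given step size."""
    return [round(c / step) for c in coeffs]


def dequantize(indices, step):
    return [i * step for i in indices]


def bit_cost(indices):
    """Crude cost model (assumed): magnitude bits plus one sign bit per coefficient."""
    return sum(abs(i).bit_length() + 1 for i in indices)


def distortion(original, coded):
    """Distortion measure: mean squared error between original and coded signal."""
    return sum((x - y) ** 2 for x, y in zip(original, coded)) / len(original)


def rate_control(coeffs, bit_budget, step=1.0, growth=1.25, max_iter=100):
    """Discard information by enlarging the step until the bit budget is met."""
    for _ in range(max_iter):
        indices = quantize(coeffs, step)
        if bit_cost(indices) <= bit_budget:
            return step, indices, distortion(coeffs, dequantize(indices, step))
        step *= growth
    raise ValueError("bit budget cannot be met with this cost model")


coeffs = [12.0, -7.5, 3.2, 0.4, -20.1, 5.5]
final_step, indices, mse = rate_control(coeffs, bit_budget=20)
```

In a fuller system the reported distortion would feed back into the choice of coding parameters, which is the hook that a watermark-driven parameter modulation could use.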
The distortion and rate control aspect of the invention seeks to solve the problem of how to embed a watermark in a perceptual encoder so as to maximize the strength of the embedded signal while also maximizing its imperceptibility. In one embodiment, the invention also allows the user to select the strength or energy of the embedded signal by adjusting a parameter in the watermark embedding process.
In addition to parameter modulation, aspects of the invention also employ an instruction set, such as a deterministic sequence, to vary certain aspects of the parameter modulation and thereby vary the characteristics of the resulting watermark. A deterministic sequence is a sequence of binary ones and zeros produced by a mathematical process given a defining equation (the generator equation) and an initial condition (the key). Several alternative aspects of the invention that employ deterministic sequences are disclosed. These techniques can improve the imperceptibility of the watermark and can also improve its robustness, which is significant and useful because many other techniques that improve imperceptibility tend to reduce robustness. Finally, these techniques can improve security in the sense that all aspects of the watermarking system (except the deterministic sequence key) may be disclosed without sacrificing the robustness of the system.
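One familiar way to obtain a sequence of ones and zeros from a generator equation and a key is a linear feedback shift register. The patent text does not name a specific generator, so the 16-bit register and the tap polynomial x^16 + x^14 + x^13 + x^11 + 1 used below are assumptions chosen purely for illustration.

```python
def deterministic_sequence(key, count):
    """Generate `count` bits from a 16-bit Fibonacci LFSR seeded with `key`.

    The feedback taps play the role of the generator equation and the
    nonzero 16-bit seed plays the role of the key (both are assumptions).
    """
    state = key & 0xFFFF
    if state == 0:
        raise ValueError("the key must be a nonzero 16-bit value")
    bits = []
    for _ in range(count):
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        bits.append(state & 1)                    # output the low bit
        state = (state >> 1) | (feedback << 15)   # shift and insert feedback
    return bits


embedder_bits = deterministic_sequence(0xACE1, 64)
detector_bits = deterministic_sequence(0xACE1, 64)   # same key, same sequence
other_bits = deterministic_sequence(0xBEEF, 64)      # different key
```

Because the sequence is fully determined by the generator and the key, an embedder and a detector that share the key can reproduce it independently, while the generator itself does not need to be kept secret.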
The deterministic sequence aspects of the invention may include one or more of the following actions:
Using a deterministic sequence to modify the rate at which the parameter modulation state changes, and thus the transition rate of the watermark symbols (see Table 1 below),
Using a deterministic sequence to select the parameter(s) to be modulated (see Table 2 below), and
Using a deterministic sequence to modify the rate at which the choice of parameter to be modulated changes (see Table 3 below).
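The first two actions in the list above can be sketched as a keyed schedule that decides, for each watermark symbol, which parameter carries it and for how many frames it is held. Python's seeded `random.Random` stands in for the deterministic sequence here, and the parameter names and hold lengths are hypothetical; none of this is taken from the tables the text refers to.

```python
import random


def build_schedule(symbols, key, parameters, hold_choices=(2, 3, 4)):
    """Return one (parameter_name, symbol) entry per frame.

    A generator seeded with `key` acts as the deterministic sequence: it
    selects which parameter is modulated for each symbol and how many frames
    the symbol is held, which varies the symbol transition rate.
    """
    rng = random.Random(key)
    schedule = []
    for symbol in symbols:
        name = rng.choice(parameters)   # deterministic parameter selection
        hold = rng.choice(hold_choices)  # deterministic symbol duration in frames
        schedule.extend((name, symbol) for _ in range(hold))
    return schedule


# Hypothetical parameter names, used only for illustration.
params = ["param_a", "param_b", "param_c"]
embed_schedule = build_schedule([1, 0, 1, 1], key=1234, parameters=params)
detect_schedule = build_schedule([1, 0, 1, 1], key=1234, parameters=params)
```

A detector holding the same key can rebuild the frame-by-frame plan and therefore knows where to look for each symbol, whereas an observer without the key sees an irregular pattern.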
In addition, alternative aspects of the invention include the use of characteristics of the source signal to control parameter modulation and/or to select the parameter to be modulated. The source-signal-responsive aspects of the invention may include one or more of the following actions:
Using source signal characteristics to variably modify the rate of parameter modulation, and thus the watermark symbol transition rate (see Table 4 below),
Using source signal characteristics to variably modify the rate at which the choice of parameter to be modulated changes (see Table 5 below), and
Using source signal characteristics to variably select, from a set of available parameters, the parameter(s) to be modulated (see Table 6 below).
As further explained below, in alternative aspects of the invention, both a deterministic sequence and characteristics of the source signal are used in connection with parameter modulation. See Tables 6, 7, and 8 below.
For some implementations of the invention, watermark detection in the perceptual decoder output may require access to the primary information applied to the encoder. For other implementations, watermark detection may be performed without access to the original primary information, at the cost of significantly greater complexity in detection.
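Detection with access to the original primary information can be sketched with a toy codec. The sketch assumes a single modulated parameter (a quantizer step size with two hypothetical values) and a detector that re-codes the original with each candidate value and keeps the one that best matches the received decoder output. It is an illustration of the idea, not the detector described in the figures.

```python
import math

# Hypothetical step sizes: the default carries bit 0, the alternative carries bit 1.
STEPS = {0: 0.10, 1: 0.11}


def codec(frame, step):
    """Toy encode/decode round trip: quantize and immediately dequantize."""
    return [round(x / step) * step for x in frame]


def embed(frames, bits):
    """Modulate the step size frame by frame in response to the watermark bits."""
    return [codec(frame, STEPS[bit]) for frame, bit in zip(frames, bits)]


def detect(original_frames, decoded_frames):
    """Re-code the original with every candidate value and pick the closest match."""
    recovered = []
    for original, decoded in zip(original_frames, decoded_frames):
        errors = {
            bit: sum((x - y) ** 2 for x, y in zip(codec(original, step), decoded))
            for bit, step in STEPS.items()
        }
        recovered.append(min(errors, key=errors.get))
    return recovered


# Six frames of eight samples each, generated deterministically for the example.
frames = [[math.sin(0.37 * (i * 8 + n)) for n in range(8)] for i in range(6)]
bits = [1, 0, 0, 1, 1, 0]
decoded = embed(frames, bits)
recovered_bits = detect(frames, decoded)
```

Without the original frames the detector would have to infer the step size from the decoded samples alone, which illustrates why blind detection is described as considerably more complex.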
It is often desirable to apply a unique or "serialized" watermark (for example, a serial number) at the point where the signal is delivered to the audience. According to aspects of the invention, supplemental information or a watermark is embedded during the perceptual decoding process. One or more parameters in the decoder are modulated just before inverse quantization.
Imperceptibility is preserved if the noise or distortion added by the decoder parameter modulation process does not exceed the perceptual threshold. To embed a watermark imperceptibly as part of the decoding process, the watermarking process requires knowledge of the perceptual threshold. Many perceptual coders convey the perceptual model in some form from the encoding process to the decoding process; other decoders, however, are provided with only an approximate or coarse representation of the perceptual threshold. The most accurate perceptual threshold is derived from the unquantized source spectral coefficients, but conveying such data to the decoder would increase the data rate significantly. Alternatively, the perceptual threshold provided to the decoder in a perceptual coding system may take the form of logarithmic exponents, where each exponent represents the spectral sample having the maximum energy in a critical band (as in the Dolby Digital system). To improve the accuracy of the perceptual threshold in the decoder, the encoder may be changed from one whose exponents are based on the maximum energy in a band to one whose exponents are based on the average energy of the samples in the band.
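The difference between an exponent based on the maximum energy in a band and one based on the average energy can be shown with a small worked example. The exponent definition below (a power-of-two exponent derived from a magnitude below 1.0) is a simplification assumed for illustration and is not the exact AC-3 exponent coding; the sample values are likewise assumed.

```python
import math


def exponent(magnitude):
    """Power-of-two exponent e with 2**-(e + 1) < magnitude <= 2**-e (simplified)."""
    return math.floor(-math.log2(magnitude))


def max_energy_exponent(band):
    """Exponent derived from the sample with the largest energy in the band."""
    return exponent(max(abs(x) for x in band))


def average_energy_exponent(band):
    """Exponent derived from the average energy (RMS level) of the band."""
    rms = math.sqrt(sum(x * x for x in band) / len(band))
    return exponent(rms)


# One band dominated by a single strong spectral sample (assumed values).
band = [0.4, 0.01, 0.02, 0.01]
exp_max = max_energy_exponent(band)      # follows the 0.4 peak
exp_avg = average_energy_exponent(band)  # follows the overall band energy
```

For this band the peak-based exponent reports a higher level than the average-based one, which suggests how a threshold estimate built from peak exponents could overstate the energy actually present across the band.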
Although modulating parameters in the decoder is in many respects similar to modulating parameters in the encoder, there is less flexibility. For example, modulating one or more parameters in the decoding system may require care when the bit allocation information is re-derived from the coding parameters. In addition, it is more difficult to make the effect of parameter modulation imperceptible in the decoder. One reason is that, at least in the case of an ideal encoder, the encoding process has already added quantization error up to the threshold of perceptibility. This is not always the case, however, for example because of imperfections in the perceptual model, a positive signal-to-noise ratio offset, or signal conditions under which coding margin may exist.
Brief description of the drawings
Fig. 1 is an idealized graph showing (solid line) the sound pressure level at which a sound is just audible in the absence of a masking signal (the threshold of hearing), and (dashed line) the threshold of audibility in the presence of a 500 Hz sine wave.
Fig. 2 is a functional block diagram showing the basic principle of the present invention, in which supplemental information modulates one or more parameters of the perceptual encoder function and/or the perceptual decoder function of a perceptual coding system.
Fig. 3A is a functional block diagram showing an aspect of the present invention that includes a supplemental information detector function receiving the output of the coding system.
Fig. 3B is a functional block diagram showing in more detail the detector function of the aspect of the present invention in which the supplemental information detector function receives the output of the coding system.
Fig. 4 is a functional block diagram showing an aspect of the present invention that includes a supplemental information detector function receiving both the output of the coding system and the input to the coding system.
Fig. 5 is a functional block diagram showing an aspect of the present invention in which the supplemental information detector function includes not only a comparator function but also a perceptual encoder function and a perceptual decoder function whose parameters are not modulated.
Fig. 6 is a table showing parameters suitable for modulation in certain hybrid forward/backward adaptive perceptual audio coders.
Fig. 7 is a table showing parameters suitable for modulation in certain forward adaptive perceptual audio coders.
Fig. 8 is a table showing parameters suitable for modulation in certain perceptual video coders.
Fig. 9 is an illustration of certain parameters that model the spectral masking curves of the human ear in certain perceptual audio coders (masking spectrum model parameters).
Fig. 10 is an illustration of the masking spectrum model parameters that can be modulated in one class of perceptual audio coders.
Fig. 11A is an idealized representation showing modulation of the SNR offset parameter (a masking threshold parameter) in a certain perceptual audio coder in the presence of a sine wave signal.
Fig. 11B is an idealized representation showing, for a bit-constrained coding system, the effect on the output of the perceptual decoder when the SNR offset parameter is modulated in the manner shown in Fig. 11A.
Fig. 11C is an idealized representation showing, for a coding system that is not bit-constrained, the effect on the output of the perceptual decoder when the SNR offset parameter is modulated in the manner shown in Fig. 11A.
Fig. 11D shows the legend used in Figs. 11A-C and Figs. 12A-C.
Fig. 12A is an idealized representation showing modulation of the fast gain code parameter (a masking threshold parameter) in a certain perceptual audio coder in the presence of a sine wave signal.
Fig. 12B is an idealized representation showing, for a bit-constrained coding system, the effect on the output of the perceptual decoder when the fast gain code parameter is modulated in the manner shown in Fig. 12A.
Fig. 12C is an idealized representation showing, for a coding system that is not bit-constrained, the effect on the output of the perceptual decoder when the fast gain code parameter is modulated in the manner shown in Fig. 12A.
Fig. 13 is an idealized representation showing the effect of modulating parameters other than masking parameters in a certain perceptual audio coder, namely the "coupling in use" flag, the "rematrixing in use" flag, and the coupling begin frequency code.
Fig. 14 is an idealized representation showing the effect of modulating a parameter other than a masking parameter in a certain perceptual audio coder, namely the phase flag.
Fig. 15 is a series of idealized waveforms showing time-domain window shapes for embedding supplemental information during encoding.
Fig. 16 is a series of idealized waveforms showing time-domain window shapes for embedding supplemental information during decoding.
Fig. 17 is an idealized temporal envelope response, plotting sound pressure level (SPL) against time, showing the temporal masking effect of a masking signal.
Fig. 18 is an idealized representation showing the type of modulation that can be used so that its effect on the signal is confined within the temporal masking envelope.
Fig. 19 is a series of idealized amplitude-versus-frequency plots showing how a 2-bit symbol can be represented by four different bandwidths.
Fig. 20 is an idealized frequency-versus-time plot showing an example of an audio signal containing an embedded signal, in which the bandwidth of the signal is used to represent different symbols.
Fig. 21 is an idealized amplitude-versus-frequency plot showing added noise, shaped to lie near the human threshold of hearing, in the presence of a sine wave signal.
Fig. 22 is an idealized energy-versus-frequency plot showing the three different energy levels required to detect the four different bandwidths that make up a 2-bit symbol.
Fig. 23 is an idealized amplitude-versus-energy plot showing exemplary histograms of the distribution of the "high" and "low" states.
Figs. 24-26 are logic flow diagrams showing a process for embedding a watermark using the threshold of perceptibility.
Fig. 24 is a logic flow diagram showing the inner iteration loop portion of the process for embedding a watermark using the threshold of perceptibility.
Fig. 25 is a logic flow diagram showing the outer iteration loop portion of the process for embedding a watermark using the threshold of perceptibility, in which the spectral coefficients are amplified in the outer loop.
Fig. 26 is a logic flow diagram showing a modification of the process of Fig. 25 that satisfies the psychoacoustic model, or perceptual threshold, as far as possible while also embedding the supplemental information or watermark signal.
Fig. 27 shows a series of idealized plots, across the spectrum, of the perceptual threshold, the quantizer error, and the modified quantizer error, illustrating how a watermark can be embedded using the distortion measurement process when modulating a parameter that affects the quantization error within a critical band.
Fig. 28 shows a series of idealized plots, across the spectrum, of the perceptual threshold, the quantizer error, and the modified quantizer error, illustrating how a watermark can be embedded using the distortion measurement process when modulating a parameter that affects the signal-to-noise ratio offset across the spectrum.
Fig. 29 is a logic flow diagram showing a process for embedding a watermark during decoding in accordance with an aspect of the present invention.
Fig. 30 is a functional block diagram showing further aspects of the present invention, in which the control of modulation by the supplemental information or watermark is modified by a function of one or more other signals or data sequences, including, for example, a deterministic sequence and/or the input signal applied to the coding system.
Best mode for carrying out the invention
Fig. 2 is a functional block diagram showing the basic principle of the present invention. A perceptual encoder function 2 and a perceptual decoder function 4 make up a perceptual coding system. Primary information, such as audio or video information, is applied to perceptual encoder function 2. Encoder function 2 produces a digital bitstream that is received by perceptual decoder function 4. One or more parameters in the encoder function and/or the decoder function are modulated in response to supplemental information (for example, a watermark signal or sequence). Because the supplemental information can be applied to the encoder function, to the decoder function, or to both, dashed lines are shown from the supplemental information to the encoder function and to the decoder function, respectively. The output of the perceptual decoder function is the primary information with the supplemental information embedded in it. The supplemental information is detectable in the output of the decoder function.
If supplemental information is applied to both encoder function 2 and decoder function 4, the information applied to one is typically different from the information applied to the other. For example, the supplemental information controlling one or more encoder function parameters may be a watermark identifying the owner of the audio or video content, while the supplemental information controlling one or more decoder function parameters may be a serial number identifying the device that presents the audio or video content to one or more consumers. Typically, the supplemental information is applied to the encoder function and to the decoder function at different times.
Figs. 3-5 are functional block diagrams showing the basic principles of aspects of the present invention that include a detector function for detecting the supplemental information in the output of the decoder function. Detection can be performed at the decoder function output in the digital domain or in the analog domain (electrical or acoustic). Detection can also be performed in the digital domain within the decoder, after encoding but before the frequency-domain to time-domain conversion.
Fig. 3A is similar to Fig. 2, except that it includes a detector function 6, which receives the output of decoder function 4 and detects the supplemental information in that output. The output of detector function 6 is the supplemental information. Fig. 4 is similar to Fig. 3A, except that it includes a detector function 8 that receives not only the output of decoder function 4 but also the same primary information that is applied to the encoder function. The main task of detector function 8 is to compare the original input information applied to the encoder function with the output of the decoder function so as to provide the supplemental information as its output. Fig. 5 is a variation of the Fig. 4 arrangement. As in Fig. 4, detector function 10 in Fig. 5 receives the output of decoder function 4 and the primary information applied to encoder function 2. Detector function 10 differs from detector function 8, however, in that it includes not only a comparator function 12 but also a perceptual encoder function 14 and a perceptual decoder function 16. Encoder function 14 is similar to encoder function 2, except that its parameters are not modulated. Decoder function 16 is similar to decoder function 4, except that its parameters are not modulated. Thus, detecting the supplemental information in the decoder output is accomplished by one of the following actions:
observing the decoded signal,
comparing the decoded signal with the signal applied to the encoder function, and
comparing the decoded signal with the decoded signal from a substantially identical perceptual coding system in which no parameter of the encoder function or decoder function is modulated in response to supplemental information.
The detection arrangement of Fig. 3A is best suited to detecting the effects of certain types of parameter modulation, such as modulation of a bandwidth parameter (bandwidth parameter modulation is described in detail below). To detect the effects of modulating most parameters, it is necessary, as in the arrangements of Figs. 4 and 5, to compare the primary information applied to the encoder with the information, carrying the embedded supplemental information, that the decoder provides. The Fig. 5 arrangement permits a stricter comparison, because the only difference between the compared information is that caused by modulating the parameters. In the Fig. 4 arrangement, the difference also includes other effects that may be introduced by the perceptual encoding and decoding processes.
Because the Fig. 3A detection arrangement does not need access to the primary information applied to the perceptual encoder, it can operate in real time or near real time, depending on which encoder and/or decoder parameter is modulated. For example, modulating a bandwidth parameter permits real-time or near-real-time detection by analyzing only the decoder output. In particular, detector function 6 of the Fig. 3A arrangement can include one or more delay functions so that the output of decoder function 4 can be compared with itself. For example, as shown in Fig. 3B, detector function 6 can include a comparator function 12' and one or more delay functions 7, 7', etc., so that observing the decoded signal includes comparing the decoded signal with time-delayed versions of itself. The energy states of one or more previous blocks are applied to a comparator function that uses a threshold in order to determine a symbol, for example in the bandwidth modulation detection scheme described below. The block length is known to the detector, and some form of synchronization must take place so that the expected symbol rate matches the actual symbol rate. Modulation of other parameters may not permit real-time or near-real-time detection, or may require comparing the decoder output with the encoder input signal, as in the arrangements of Figs. 4 and 5.
In arrangements such as those of Figs. 4 and 5, in which the decoder output is compared with the encoder input, it is important to synchronize the input and output signals. Depending on which parameter or parameters are selected for modulation and on the data rate of the supplemental information, a high degree of synchronization between these signals may be required. One way to achieve this is to embed a deterministic sequence, such as a PRN sequence, in the input signal so that the sequence is also embedded in the decoder output. By comparing the sequences in the input and output signals, fine-grained synchronization can be achieved.
Detection can be performed manually or, in some cases, automatically. Using a PRN sequence in the signal can facilitate automatic detection. If detection is performed manually, visual aids such as spectral analysis of the compared signals can be used.
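The PRN-based synchronization described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the seed, the embedding amplitude of 0.25 (exaggerated for clarity, far above any perceptual threshold) and the plain cross-correlation search are all assumptions made for illustration. A known pseudo-random sequence is added to the signal, and the detector finds the delay of the decoder output by locating the lag with the largest correlation.

```python
import math
import random

def make_prn(length, seed=1234):
    """Pseudo-random +/-1 sequence; the fixed seed lets the detector regenerate it."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def estimate_delay(prn, received, max_lag):
    """Return the lag at which the known PRN sequence best matches the received signal."""
    span = len(prn) - max_lag  # number of samples correlated at every lag
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag):
        score = sum(prn[n] * received[n + lag] for n in range(span))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

length, true_delay = 4096, 37                      # hypothetical values
prn = make_prn(length)
primary = [math.sin(0.05 * n) for n in range(length)]
marked = [p + 0.25 * s for p, s in zip(primary, prn)]
received = [0.0] * true_delay + marked             # decoder output, delayed
estimated = estimate_delay(prn, received, max_lag=64)
```

Once the delay is known, the input and output signals can be aligned sample by sample before they are compared.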
Some examples of coding parameters that can be modulated to embed a watermark are listed in tables: a first table, shown in Fig. 6 (Dolby audio coder parameters); a second table, shown in Fig. 7 (MPEG audio coder parameters); and a third table, shown in Fig. 8 (MPEG video coder parameters). For each category of parameters (for example, "masking model and bit allocation"), each table indicates the type of parameter (for example, "SNR offset"), the specific parameters (for example, "csnroffst", "fsnroffst", etc.), whether the parameter or parameters are amenable to modulation in the encoder and/or the decoder, and the resulting change in signal characteristic that is detected as the watermark when the parameter is modulated. The first column of the table shown in Fig. 6 lists six categories of parameters: masking model and bit allocation, coupling between channels, frequency bandwidth, dither control, phase relationship, and time/frequency transform window. Note that, in the first table, rematrixing can be performed during decoding only if rematflg is "0" (no rematrixing in the encoder), and that, in the second table, M/S coding can be performed during decoding only if ms_used is "0" (no M/S in the encoder).
Where a parameter type has one or more parameters in a coding system, the recognized abbreviation of each parameter is shown in parentheses. Thus, for example, the "SNR offset" parameter type includes four parameters in Dolby Digital: "csnroffst" (coarse SNR offset), "fsnroffst" (channel fine SNR offset), "cplfsnroffst" (coupling fine SNR offset), and "lfesfsnroffst" (low-frequency effects channel fine SNR offset). These and other Dolby Digital coding parameters are explained further in the A/52 Document cited above. Although most of the listed Dolby audio coder parameters are common to the Dolby Digital and Dolby E coding systems and are explained in the A/52 Document, a few parameters are unique to Dolby E (for example, Back gain code (backgain) and Back decay code (backleak)). Further information on back gain and back leak is provided below.
The first column of the table shown in Fig. 7 lists four parameter categories: masking model and bit allocation, coupling between or among channels, temporal noise shaping filter coefficients, and time/frequency transform window. Similarly, the first column of the table shown in Fig. 8 lists two parameter categories: frame type and motion control. Further information about the listed MPEG audio coder and video coder parameters is set out in the ISO/IEC documents cited above, in the MPEG-2 AAC article, and in other publications. Aspects of the present invention are applicable not only to the Dolby and MPEG perceptual coding systems but also to other perceptual coding systems in which parameters in the encoder and/or decoder can be modulated. Examples of other perceptual coders are discussed in the Brandenburg and Bosi journal article (J. Audio Eng. Soc., 1997) cited above.
Modulating perceptual hearing model parameters
Perceptual audio coding systems such as Dolby Digital and Dolby E use parameters that represent a perceptual hearing model, or masking model, in the bit allocation process. In particular, certain parameters model the masking curves of the human ear across the spectrum: a downward masking curve that decays steeply with frequency, an upward masking curve that decays steeply with frequency, and an upward masking curve that decays gradually with frequency. These are shown in Fig. 9. Although the masking spectrum is a frequency-domain concept, the standard names of these masking decay parameters use time-domain terms (for example, "slow" and "fast").
Referring to Fig. 9, the coding parameter elements corresponding to the level and slope of the masking spectrum model (gain and leak, respectively) relative to a masking signal are defined as follows:
Downward masking curve: back gain / back leak
Upward masking curve (fast): fast gain / fast leak
Upward masking curve (slow): slow gain / slow leak
Note that back gain and back leak are defined parameters in Dolby E coding but not in Dolby Digital coding. In Dolby Digital, as described in the A/52 Document cited above, the fast gain parameters are the fast gain codes (fgaincod, cplfgaincod and lfegaincod); the fast leak parameters are the fast decay codes (fdcycod and cplfleak); the slow gain parameter is the slow gain code (sgaincod); and the slow leak parameters are the slow decay codes (sdycod and clpsleak).
Each of the parameters defined above is suitable for modulation in order to convey a watermark during perceptual coding. Modulating any one of them slightly changes the masking spectrum model and thereby affects the bit allocation process. The masking model parameters are thus tightly coupled to the input signal, which makes the watermark robust. Fig. 10 illustrates the masking spectrum model parameters that can be modulated.
Certain other parameters in the Dolby Digital and Dolby E coding systems control the overall signal-to-noise ratio (SNR). In Dolby Digital these are the SNR offset parameters: "csnroffst", "fsnroffst", "cplfsnroffst", and "lfesfsnroffst". The SNR parameters maintain the required minimum signal-to-noise headroom between the signal and the quantization noise. These parameters affect the entire spectrum uniformly, unlike the masking spectrum model parameters, which primarily affect only the part of the spectrum around the masking signal.
Still other parameters act as a fine SNR adjustment on a critical-band basis, called "banded SNR" or delta bit allocation: deltba and cpldeltba in Dolby Digital coding.
Figs. 11A to 11C and 12A to 12C illustrate modulation of the masking threshold of a perceptual coding system (modulation of the SNR offset in Fig. 11A and of the fast gain code in Fig. 12A), the resulting effect when the coding system is bit-constrained (Figs. 11B and 12B, respectively), and the resulting effect when the coding system is not bit-constrained (Figs. 11C and 12C, respectively). Fig. 11D identifies the legend used in Figs. 11A-11C and 12A-12C. A bit constraint arises when the encoder is limited to producing coded blocks of equal length, as many transmission channels require. When the encoder can vary the number of bits from block to block, there is effectively no constraint on the number of bits used to represent the signal. As shown in Figs. 11B and 12B, in a bit-constrained encoder the quantizer error of the decoded signal does not exactly match the masking threshold at all frequencies; the example shows a case in which more bits than necessary are available (the gap between the threshold and the decoded signal), resulting in a positive margin between the masking threshold and the quantizer error at some frequencies. When there is no bit constraint, the encoder can make the quantizer error match the masking threshold exactly across the whole band. For the default parameter value, the intended watermark symbol may be the bit value "0". For the modulated parameter value, the intended symbol may be, as in this example, the bit value "1". Figs. 11A and 12A show the masking threshold before and after modulation. Figs. 11B, 11C, 12B and 12C show the resulting coded signals. The modulated masking threshold is superimposed in Figs. 11B/12B and 11C/12C to permit comparison with the modulated coded signal spectrum.
Modulating non-masking parameters
Figs. 13 and 14 show the signal characteristics that result from modulating non-masking parameters in a Dolby coder. In each figure, the signal characteristic is shown for the default parameter value and for the modulated parameter value. Fig. 13 shows the effect of modulating the coupling parameters. For each block, with time on the horizontal axis, the flags are shown for two channels, left and right. When the coupling-in-use flag is "0", each channel is processed independently. When the coupling-in-use flag is "1", the two channels are combined into a single coupling channel above a certain frequency, indicated by the cplbegf parameter. When coupling is in use, the coupling begin frequency can also be modulated, as is also shown in Fig. 13.
Fig. 14 shows the effect of modulating the phase flag. When the phase flag equals "0", the phase is not changed; when the flag equals "1", the signal is phase-shifted by 180 degrees.
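As a simplified illustration of phase flag modulation, the Python sketch below treats each block as a plain list of samples and applies a 180 degree phase shift by negating the block when its flag is "1". This is an assumption made for clarity: a real coder applies such flags inside the decoder rather than to whole time-domain blocks, and the function names are hypothetical. Detection follows the comparison approach of Fig. 4, recovering each flag from the sign of the correlation between the original and decoded blocks.

```python
import math

def apply_phase_flags(blocks, flags):
    """Invert (shift by 180 degrees) every block whose phase flag is 1."""
    return [[-s for s in block] if flag else list(block)
            for block, flag in zip(blocks, flags)]

def detect_phase_flags(original_blocks, decoded_blocks):
    """Recover the flags from the sign of the correlation with the original blocks."""
    flags = []
    for orig, dec in zip(original_blocks, decoded_blocks):
        correlation = sum(o * d for o, d in zip(orig, dec))
        flags.append(1 if correlation < 0 else 0)
    return flags

# Four hypothetical blocks of eight samples each, carrying the symbols 0, 1, 1, 0.
blocks = [[math.sin(0.3 * (8 * b + n)) for n in range(8)] for b in range(4)]
flags = [0, 1, 1, 0]
decoded = apply_phase_flags(blocks, flags)
recovered = detect_phase_flags(blocks, decoded)
```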
Modulating TDAC window parameters
As noted above, perceptual audio coders reduce the data rate of an input signal by removing perceptually redundant information. These systems begin by decomposing the input signal into one or more components and then use perceptual analysis to determine how much accuracy each of these components requires so that, after the quantized components are decoded, the difference between the source material and the coded material is imperceptible (or reaches an acceptable level of perceptibility). An example of such a system is a transform coder that uses the time-domain aliasing cancellation (TDAC) transform to convert time samples into a frequency-based representation. To ensure perfect reconstruction, the time-domain samples are processed with overlapping windows before the transform. After the transform, the frequency samples are quantized and coded in a manner that reduces the bit rate yet is perceptually unobtrusive on decoding. To maintain perfect reconstruction after the inverse transform in the decoder, the time-domain samples are windowed using parameters that match those used in the encoder, overlapped, and summed. In general, the window parameters for the encoding and decoding windows are chosen so that aliasing is minimized or eliminated when the windows are applied around the forward and inverse TDAC transforms. Details of transform coding using the TDAC transform are set out in the following documents: "Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation" by Princen and Bradley, IEEE Trans. on Acoustics, Speech, and Signal Processing, Vol. ASSP-34, No. 5, October 1986, pp. 1153-1161, and "Subband/Transform Coding Using Filter Bank Design Based on Time Domain Aliasing Cancellation" by Princen et al., Proceedings: ICASSP 87, 1987 Intl. Conf. on Acoustics, Speech, and Signal Processing, April 1987, Dallas, Texas, pp. 2161-2164.
A watermark can be applied by modulating the parameters used in forming or reconstructing the transformed time-domain signal. For example, a mismatch between the slope, or alpha (α), of the time-domain windows used during encoding and decoding results in time-domain aliasing when a critically sampled transform is used. The result of this aliasing is noise or distortion that is unique in both the time domain and the frequency domain. Thus, the window parameters in either the encoder or the decoder can be modulated so as to convey a watermark that is detectable in the decoder output. Distortion in this sense is defined as the difference between the coded signal and the original signal, and it may or may not result in audible artifacts. In a preferred embodiment, the alpha (slope) value of the time-domain window is modulated. Because the noise or distortion that is introduced is imperceptible yet correlated with the source signal, or masked by it, the resulting watermark is very difficult to remove or obscure without creating perceptible impairments.
Another time-domain window parameter that can be changed in order to convey a watermark is the window type itself. For example, a Kaiser-Bessel derived window can be used to embed a watermark bit of "0", and a Hanning window can be used to embed a watermark bit of "1". The window modulation can be carried out in either the encoder or the decoder.
In addition, to improve detectability and minimize perceptibility, the window parameters can be modulated over time in accordance with signal characteristics. For example, transient signals can mask the watermark signal, so it is preferable to detect such signals and modulate the windows so as to position the watermark signal where it benefits from psychoacoustic temporal masking effects. Furthermore, the strength of the modulation, and hence the strength of the watermark signal in the subsequently decoded signal, can be adapted to the source signal characteristics. The amount of window parameter mismatch directly affects the strength of the added distortion. The psychoacoustic masking characteristics of the input signal can therefore be analyzed and signaled to the watermark embedding process in order to vary the amount of mismatch for a watermark symbol so that it is masked as much as possible by the signal content.
The direct-form forward TDAC transform equation is given by:
$$X(k) = -\frac{2}{N}\sum_{n=0}^{N-1} x(n)\,w(n)\cos\!\left(\frac{2\pi}{N}\left(k+\frac{1}{2}\right)(n+n_0)\right), \qquad 0 \le k < N/2$$
where
$n$ = sample number
$k$ = frequency bin number
$x(n)$ = input PCM sequence
$w(n)$ = window sequence
$X(k)$ = resulting transform coefficient sequence
$N$ = total number of samples in the transform
$n_0$ = half the total number of samples in the transform
The TDAC transform window sequence using the Kaiser-Bessel derived (KBD) window is defined by the following equation:
$$W_{KBD}(n,\alpha,N) = \sqrt{\frac{\sum_{p=0}^{n} W_{KB}(p,\alpha,N)}{\sum_{p=0}^{N/2} W_{KB}(p,\alpha,N)}}$$
where $W_{KB}$ is the Kaiser-Bessel kernel window function, defined as:
$$W_{KB}(p,\alpha,N) = \frac{I_0\!\left[\pi\alpha\sqrt{1-\left(\frac{p-N/4}{N/4}\right)^{2}}\right]}{I_0(\pi\alpha)}$$
and $I_0$ is the zeroth-order modified Bessel function, defined as:
$$I_0(x) = \sum_{k=0}^{\infty}\left[\frac{(x/2)^{k}}{k!}\right]^{2}$$
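The window definitions above can be evaluated directly. The following Python sketch is one possible reading of them, not code from the patent: the function names are hypothetical, the Bessel series is truncated at an assumed 50 terms, the square roots follow the usual Kaiser-Bessel derived definition, and the second half of the window is assumed to be the mirror image of the first half.

```python
import math

def bessel_i0(x, terms=50):
    """Zeroth-order modified Bessel function, summed from its power series."""
    total, term = 0.0, 1.0          # term holds (x/2)^k / k!
    for k in range(terms):
        total += term * term
        term *= (x / 2.0) / (k + 1)
    return total

def kaiser_kernel(p, alpha, N):
    """Kaiser-Bessel kernel W_KB(p, alpha, N) for 0 <= p <= N/2."""
    r = (p - N / 4.0) / (N / 4.0)
    inner = math.sqrt(max(0.0, 1.0 - r * r))   # guard against rounding below zero
    return bessel_i0(math.pi * alpha * inner) / bessel_i0(math.pi * alpha)

def kbd_window(alpha, N):
    """Kaiser-Bessel derived window of length N (N even)."""
    half = N // 2
    kernel = [kaiser_kernel(p, alpha, N) for p in range(half + 1)]
    denominator = sum(kernel)
    rising, running = [], 0.0
    for n in range(half):
        running += kernel[n]                    # cumulative sum for p = 0..n
        rising.append(math.sqrt(running / denominator))
    return rising + rising[::-1]                # mirror for the second half

window = kbd_window(3.0, 256)
```

With this construction the squared window values half a block apart sum to one, which is the condition that lets matching encoder and decoder windows cancel time-domain aliasing.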
Fig. 15 shows five overlapping encoder windows of length 256. Window number 5 uses a value of α = 4, so that the watermark is inserted at the encoding stage. Note that windows 4 and 6 are hybrid windows that combine the α = 3 and α = 4 windows in order to provide smooth transitions between the α = 3 and α = 4 windows. In the figure, the decoder uses the α = 3 window for all transforms. This mismatch in window type introduces time-domain aliasing artifacts into the resulting signal. As the difference between the encoder α value (α = 4) and the decoder α value (α = 3) increases, the amount of time-domain aliasing introduced into the decoded audio increases; the aliasing is present only in the portion of the audio processed by encoder window number 5. This method of changing α in order to convey a watermark signal requires no modification of the decoder and can be used to watermark the signal at its distribution source.
Fig. 16 again shows five overlapping windows of length 256; in this example, however, the α value of the window is changed during the inverse TDAC windowing (decoding) process. Time-domain aliasing again occurs, injecting a watermark signal into the decoded signal. In this example the embedded signal is injected in the decoder, which allows watermark information to be introduced for a specific user or device. This α modification allows the decoder to embed serialized information in the signal data.
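The effect of a window mismatch such as that of Figs. 15 and 16 can be reproduced numerically. The Python sketch below is illustrative only and makes several assumptions: a short block length of 32 rather than 256, the same α mismatch applied to every block (no hybrid transition windows), hypothetical function names, an inverse transform scaled to undo the -2/N forward scaling, and a phase offset n0 = (N/2 + 1)/2, the value commonly used for alias cancellation. It codes a test signal with matching windows and with an α = 4 encoder window against an α = 3 decoder window, then measures the reconstruction error.

```python
import math

def bessel_i0(x, terms=50):
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term * term
        term *= (x / 2.0) / (k + 1)
    return total

def kbd_window(alpha, N):
    # The 1/I0(pi*alpha) kernel normalization cancels in the ratio and is omitted.
    half = N // 2
    kernel = [bessel_i0(math.pi * alpha *
                        math.sqrt(max(0.0, 1.0 - ((p - N / 4.0) / (N / 4.0)) ** 2)))
              for p in range(half + 1)]
    total, running, rising = sum(kernel), 0.0, []
    for n in range(half):
        running += kernel[n]
        rising.append(math.sqrt(running / total))
    return rising + rising[::-1]

def tdac_forward(block, window):
    N = len(block)
    n0 = (N / 2 + 1) / 2.0   # assumed phase offset needed for alias cancellation
    return [-2.0 / N * sum(block[n] * window[n] *
                           math.cos(2.0 * math.pi / N * (k + 0.5) * (n + n0))
                           for n in range(N))
            for k in range(N // 2)]

def tdac_inverse(coeffs, window):
    N = 2 * len(coeffs)
    n0 = (N / 2 + 1) / 2.0
    return [-2.0 * window[n] * sum(coeffs[k] *
                                   math.cos(2.0 * math.pi / N * (k + 0.5) * (n + n0))
                                   for k in range(N // 2))
            for n in range(N)]

def code_and_decode(signal, enc_window, dec_window):
    """Window, transform, inverse transform, window again, and overlap-add."""
    N = len(enc_window)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - N + 1, N // 2):
        coeffs = tdac_forward(signal[start:start + N], enc_window)
        for n, value in enumerate(tdac_inverse(coeffs, dec_window)):
            out[start + n] += value
    return out

def max_error(signal, decoded, hop):
    # The first and last half blocks have no overlap partner, so skip them.
    return max(abs(d - s) for s, d in list(zip(signal, decoded))[hop:-hop])

N = 32
signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(5 * N)]
matched = max_error(signal, code_and_decode(signal, kbd_window(3, N), kbd_window(3, N)), N // 2)
mismatched = max_error(signal, code_and_decode(signal, kbd_window(4, N), kbd_window(3, N)), N // 2)
```

With matching windows the error stays at the level of floating-point rounding, while the mismatched pair leaves a residual that is correlated with the signal; that residual is the aliasing the watermark relies on.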
It may be useful to use short transform windows when applying a watermark, because they reduce the duration of the aliasing distortion and are commonly used during transients (in audio coding). The temporal masking properties of a transient signal can be exploited in order to use alpha values that differ more from the "correct" value, thereby producing a more robust watermark.
TDAC window modulation detector
Modifying the alpha value of the TDAC window introduces a time-domain aliasing signal that is correlated with the encoded signal. This aliasing can be measured as the introduction of spectral noise into, or distortion of the spectral components of, the encoded signal.
One possible detection method, in the manner of the configurations of Figures 4 and 5, compares the difference between the source material and the watermarked data. This method searches the difference signal for spectral distortion in the portions where a watermark-modified window was used. Spectral distortion above a threshold indicates a '1' symbol for that watermarked portion of the data. Spectral distortion below the threshold is detected as a '0' symbol.
This method is sensitive to broadband noise introduced to mask the watermark signal. Another detection method tracks the spectral peaks of the watermarked signal and looks for amplitude modulation of the frequency bins both before and after each spectral peak, which is introduced by the time-domain aliasing when the watermark is applied. Like the general spectral distortion method described above, this detection method compares the frequency bins surrounding a dominant spectral component against a threshold. This threshold, however, depends on the strength of the source signal's spectral component. Spectral skirts below the threshold are interpreted as a '0' symbol, and spectral skirts above it as a '1' symbol.
Modulating the TNS filter coefficients
Temporal noise shaping is a coding technique that can help prevent pre-echo artifacts in perceptual audio coding. It is described in "Enhancing the Performance of Perceptual Audio Coder by Temporal Noise Shaping (TNS)" by Jurgen Herre and James Johnston, 101st AES (Audio Engineering Society) Convention, Preprint 4384, November 8-11, 1996. Predictive coding in the frequency domain is used to shape the quantization noise in the time domain. Prediction helps control where the quantization noise is placed in the time domain. In audio coding, the noise is confined within the amplitude envelope of the time-domain masking signal to prevent pre-echo. Pre-echo is an artifact that occurs during transients when the applied frequency transform lacks sufficient time resolution to prevent quantization noise from appearing before the transient in the output signal.
Although temporal noise shaping (TNS) is a feature of the MPEG-2 AAC perceptual coding system, it can be used in other systems, such as Dolby Digital, thereby providing another way of modulating a parameter in such other systems.
According to this aspect of the invention, one or more TNS filter parameters are modulated. In particular, as explained further below, the TNS noise shaping filter order and the TNS noise shaping filter shape can be modulated.
The TNS process involves the following steps:
1. Decompose the signal into spectral coefficients using a time-to-frequency transform,
2. Apply a standard linear prediction procedure by forming the windowed autocorrelation matrix and using recursion, and
3. If the prediction gain exceeds a certain threshold, apply the noise shaping filter to the spectral coefficients.
This aspect of the invention relies on the properties of the noise shaping filter applied during TNS processing. The spectral-domain filter can be modified so as to shape the noise to any of a number of different temporal responses. By changing certain parameters of this temporal envelope through the spectral-domain filter, a watermark can be embedded in the signal. In other words, adjusting the noise shaping filter in the spectral or frequency domain changes the quantization noise in the time domain.
An exemplary temporal envelope response, shown in Figure 7, plots sound pressure level (SPL) against time.
The temporal masking model is quite similar to the spectral masking model used in some perceptual audio coders. In particular, the downward and upward envelopes used for spectral masking resemble the backward and forward temporal masking envelopes. To identify more specifically the TNS parameters that can be modulated according to this aspect of the invention, consider part of the temporal noise shaping process in more detail. After a time-to-frequency transform decomposes the signal into spectral coefficients, a linear predictive coding (LPC) calculation is performed on the spectral data to determine whether the prediction gain exceeds a certain threshold and to derive the envelope of the signal. The prediction coefficients for each TNS filter of each block are then calculated as follows:
$$h = R_{xx}^{-1}\, r_{xx}$$
where
$$r_{xx}^{T} = \{R_{xx}(i,j)\};\quad R_{xx}(i,j) = \mathrm{AutoCorr}(|i-j|);\quad i,j = 1,2,\ldots,N;\quad r_{xx}' = r_{xx} * \mathrm{win}$$
where Rxx is the N-by-N autocorrelation matrix, N is the TNS prediction order, and h is the vector of optimal prediction coefficients. These equations are based on the well-known orthogonality principle, which states that the minimum prediction error is orthogonal to all the data used in the prediction.
At initialization time, the autocorrelation matrix window is calculated according to the following equation:
$$\mathrm{win}(i) = e^{\left(i+\frac{1}{2}\right)^{2}\cdot\,\mathrm{gaussExp}},\qquad i = 0,\ldots,31$$
where
$$\mathrm{gaussExp} = -\frac{1}{2}\left(\frac{\pi\cdot F_{SAMP}\cdot 0.001\cdot \mathrm{timeResolution}}{\mathrm{transformResolution}}\right)^{2}$$
where
FSAMP = the signal sampling rate.
The timeResolution variable depends on the bit rate and the number of channels. Similarly, the transform block length defines the transformResolution variable.
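The window calculation can be sketched as follows. This is an illustration only: it assumes the exponent term reads as minus one half times the square of (π · FSAMP · 0.001 · timeResolution / transformResolution), and the numeric values passed in are hypothetical, since the text does not give concrete values for timeResolution or transformResolution.

```python
import math

def gauss_window(f_samp, time_resolution, transform_resolution, size=32):
    """Gaussian window applied to the autocorrelation values, for i = 0 .. size-1."""
    g = math.pi * f_samp * 0.001 * time_resolution / transform_resolution
    gauss_exp = -0.5 * g * g  # assumed reading of the gaussExp formula
    return [math.exp((i + 0.5) ** 2 * gauss_exp) for i in range(size)]

# Hypothetical inputs: 48 kHz sampling, timeResolution 0.6, transformResolution 1024.
win = gauss_window(48000, 0.6, 1024)
```

Because gaussExp is negative, the window decays with increasing lag, so higher-lag autocorrelation values are weighted less before the prediction coefficients are solved for.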
The optimal order of the noise shaping filter is determined by removing, from the end of the coefficient array, the reflection coefficients that fall below a certain threshold. One parameter that can be modulated to convey a watermark is the order of the noise shaping filter. For example, a watermark bit of one sense can be represented by the optimal filter order, and a watermark bit of the other sense by a non-optimal filter order (either higher or lower). Another parameter that can be changed to convey a watermark is the shape of the noise shaping filter itself. For example, a watermark bit of one sense can be indicated by the optimal coefficients determined by the LPC calculation, and a watermark bit of the other sense by modifying those coefficients and hence the shape of the noise shaping filter.
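The order-based signaling described above might be sketched as follows. The function names, the rule for choosing a non-optimal order (one step up, or one step down when already at the maximum), and the sample coefficients are all hypothetical; the text only says that the non-optimal order may be higher or lower.

```python
def optimal_order(reflection_coeffs, threshold):
    """Strip trailing reflection coefficients whose magnitude is below threshold."""
    order = len(reflection_coeffs)
    while order > 0 and abs(reflection_coeffs[order - 1]) < threshold:
        order -= 1
    return order

def watermarked_order(reflection_coeffs, threshold, bit, max_order):
    """Bit 0 keeps the optimal order; bit 1 selects a non-optimal one (hypothetical rule).

    Assumes max_order >= 1 so that a different order is always available.
    """
    best = optimal_order(reflection_coeffs, threshold)
    if bit == 0:
        return best
    return best + 1 if best < max_order else best - 1

coeffs = [0.8, -0.5, 0.2, 0.05, 0.01]  # hypothetical reflection coefficients
```

A detector that can estimate the optimal order from the decoded signal would then read the bit by checking whether the transmitted order matches it.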
By modulating the TNS parameters (filter order or filter coefficients), the noise is modulated within the temporal envelope of the input signal so that it can be detected in the decoded output signal. Figure 18 shows an example of a temporal masking envelope and the range over which the quantizer error can be modulated within that envelope. For each block in time, the TNS parameters can be modulated to convey the watermark.
Practical embodiments of this aspect of the invention can provide a very robust watermarking solution. Because the noise added by the TNS process is tightly coupled to the envelope of the source signal, it is difficult to remove or characterize the watermark without degrading the original signal.
The transparency of the watermark described here can be controlled using an adaptive distortion process of the type described below. In that case, once the temporal envelope of the signal has been modified using TNS, the result is compared iteratively with a temporal or spectral representation of the temporal masking threshold. If the threshold is exceeded, the temporal masking parameters are adjusted and the process is repeated to ensure the required balance between the robustness and the perceptibility of the watermark signal.
The temporal masking characteristic shown in Figure 18 can be applied to sub-bands of the signal. This allows watermarks to be layered and potentially more bits to be embedded.
Modulating the bandwidth
It is known that reducing the bandwidth of an audio signal causes minimal degradation of subjective quality, provided the bandwidth stays at or above a minimum of about 16 kHz. Experiments have also shown minimal degradation when the bandwidth changes dynamically, provided it stays at or above that minimum. If the bandwidth is modulated in the encoder or decoder according to a supplemental or watermark signal, that signal can be derived from the decoded audio. For example, a single bit can be embedded in the audio signal by letting a bandwidth of 16 kHz represent a "0" symbol and a bandwidth of 20 kHz represent a "1" symbol. This can be extended to multiple bandwidths representing multi-bit symbols, yielding a higher embedded signal data rate. Figure 19 illustrates 2-bit symbols using four different bandwidths. This strategy can be used where a non-robust, inaudible watermark is required. The inaudibility criterion can be met as described above. The strategy is non-robust because the watermark is easily removed by low-pass filtering the decoded audio signal.
Figure 20 shows an example of an audio signal containing an embedded signal in which the bandwidth of the signal represents the different symbols.
One problem with the bandwidth watermarking technique described above is that it depends on the presence of signal content above the minimum bandwidth. In the extreme case, no signal content exists above the minimum bandwidth. Without high-frequency signal content, a constant embedded signal data rate cannot be obtained. For example, if the audio content consists of a single sine wave at 1 kHz, the only way to convey embedded data in the signal would be to reduce the bandwidth to below 1 kHz. This would be clearly audible and would destroy the original signal.
A method that provides a constant watermark embedding rate must guarantee that the audio signal contains high-frequency energy. One way to achieve this is to add noise to the upper frequencies of the audio signal such that the listener cannot perceive the noise. If the added noise is at or below the threshold of human hearing, it is imperceptible. With this added noise, the embedded signal can use the audio bandwidth as a signaling mechanism that provides a constant data rate. Note that the noise need only be added in the signaling band. The signaling band is defined as the band between the lowest and highest frequencies used to place the watermark. The signaling band can be divided into smaller portions where more than two bandwidths are used to generate the watermark.
Figure 21 shows the addition of noise shaped to approximate the level of the threshold of hearing. It is added to a signal consisting only of a single sine wave, and only in the signaling band. The noise added to the signaling band need not be limited to the threshold of hearing, but if its energy is above that threshold it may be heard. A further signaling dimension can be added by modulating the amplitude of the noise below the threshold of hearing. For example, if the energy in a signaling band region can take more than just one energy state, for instance by adding a half-energy state, additional data can be hidden or inserted. Such amplitude signaling increases the data rate of the embedded signal.
As long as some signal content is guaranteed just below the upper bandwidth, the signal can be detected. It is important to note that the signals added in the signaling band are similar in each channel. In many cases these signals are mixed electrically or acoustically, and it is important that they do not cancel one another. If sine waves are added to multiple channels in phase for signaling, they can cancel when the acoustic summation depends on position. This reduces the reliability of the watermark. Using independent random noise is a better solution, because it does not cancel when mixed.
Because signal content may already be present in the signaling band, and shaped noise is added to the signaling band to guarantee a constant embedding rate, the energy in the signaling band is the sum of the two signals and is sometimes increased. This variability in energy makes the detection process more difficult. In an embodiment of this aspect of the invention, a low-pass filter is applied to the source signal before the shaped noise is added, to eliminate any interaction with source signal content in the signaling band.
In the Dolby Digital algorithm or coding process, a coarse power spectrum is transmitted in the bitstream even when the content in the upper band is judged insignificant; the decoder can use it to add shaped random noise at the level of that power spectrum. This is a feature that the decoder switches on when the dither flag in the bitstream is enabled. Even if the encoder has judged the content perceptually insignificant, the noise added in the decoder regenerates the watermark in the decoded audio. The watermark can be inserted during either the encoding or the decoding process.
A Dolby Digital audio coder can vary its bandwidth according to one of two bandwidth parameters (the chwcod and cplendf codes listed in the table of Figure 21). This yields an efficient way of implementing a watermark. However, modulating these codes to produce detectable changes in the decoded signal places some restrictions on the embedded signal data rate:
1. All channels should have the same bandwidth so that downmixing the signal does not destroy the embedded data. This limits the embedded data rate to that of a monophonic equivalent.
2. For best sound quality, the bandwidth code should be set no more than once per frame, which limits the embedded data rate to that given by the symbol depth and the embedding sample rate. If the bandwidth code is changed more than once per frame, the overall sound quality of the coded audio is reduced.
3. The number of symbols is limited to the number of available bandwidth codes above the minimum bandwidth.
For example, if the encoder embeds data using just two different bandwidth states at 48 kHz, the embedded data rate is 31.25 bps (31.25 frames per second, each frame carrying one bit of information). If four bandwidth states are used at 48 kHz, the data rate is 62.5 bps. These numbers follow from the fact that each Dolby Digital frame contains 1536 audio samples. If another coder with 2048 audio samples per frame is used, the data rate for single-bit symbols is approximately 23.4 bps.
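The arithmetic behind these figures is the frame rate multiplied by the number of bits each frame can carry. The sketch below reproduces it; the function name is illustrative, and it assumes one bandwidth symbol per frame, as the restrictions above require.

```python
import math

def embedded_bit_rate(sample_rate_hz, samples_per_frame, bandwidth_states):
    """Bits per second when each frame carries one symbol chosen from bandwidth_states."""
    frames_per_second = sample_rate_hz / samples_per_frame
    bits_per_frame = math.log2(bandwidth_states)
    return frames_per_second * bits_per_frame

rate_two_states = embedded_bit_rate(48000, 1536, 2)   # 31.25
rate_four_states = embedded_bit_rate(48000, 1536, 4)  # 62.5
rate_long_frame = embedded_bit_rate(48000, 2048, 2)   # 23.4375
```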
The Dolby Digital encoder sends an approximation of the energy spectral density of each audio frame in the encoded bitstream. It is updated when the audio spectrum changes significantly. The energy spectral density information is sent as exponents that are linearly spaced in frequency. In a Dolby Digital decoder, dither is added to any portion of the spectrum that has received no quantization bits. The dither, which is essentially random noise, is scaled to the exponent level. This adds signal energy to that portion of the spectrum. If the exponents in the signaling band are shaped to be at or below the threshold of hearing, the dither guarantees signal energy.
The following steps summarize the present method of guaranteeing energy in the signaling band of a Dolby Digital coded signal.
1. Random noise, shaped to be at or below the threshold of hearing, is added over the minimum signaling bandwidth. This yields a minimum energy that follows the shape of the threshold of hearing.
2. The exponents computed after the noise is added capture this minimum energy level.
3. Even if no bits have been allocated to the minimum signaling bandwidth, the decoder regenerates spectral energy from the transmitted exponents, because dither is normally added. This guarantees that signal content is available for embedded signaling.
The two techniques above (bandwidth change and dither) can be used to integrate a low-complexity, fixed-bit-rate watermark into a Dolby Digital encoder or decoder. Such a system is robust against "normal use" attacks, including encode/decode chains involving downmixing, dynamic range expansion, volume normalization, matrix surround decoding, and so on.
Thus, embodiments of this aspect of the invention may include the following steps:
1. Modulating the bandwidth to embed a hidden data signal.
2. Using the bandwidth codes of the Dolby Digital coding/decoding system to modulate the bandwidth to embed a hidden data signal.
3. Adding noise in the signaling band to guarantee that signal content is available for embedding data at a constant rate.
4. Shaping the added noise to be at or below the human threshold of hearing, to prevent the added noise from being audible.
5. Modulating the amplitude of the added noise to add another signaling dimension and increase the data rate of the embedded signal.
6. Integrating the shaped noise with a Dolby Digital encoder to guarantee signal content in the signaling band.
A watermark detector interprets the embedded information contained in the reproduced audio signal. Preferably, the information can be extracted both electrically and acoustically, although this capability is not required for all applications. Extracting the watermark after acoustic processing is considered the more difficult challenge because of added room noise, loudspeaker and microphone characteristics, and overall playback volume.
The goal of the detector is to determine whether energy is present in a given signaling band, and thereby find the audio bandwidth. This requires a frequency decomposition of the audio, computed by a Fourier transform, a bank of band-pass filters analyzing the signaling bands, or the like. The energy in each signaling band can be obtained from this decomposition. The detector can use this energy information to determine the embedded symbol.
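As a rough illustration of the Fourier-transform option, the sketch below measures the energy in one signaling band with a direct DFT over the relevant bins. The block length, the 16 to 20 kHz band edges, and the test tones are hypothetical choices made for the example; a practical detector would more likely use an FFT or a filter bank.

```python
import cmath
import math

def band_energy(samples, sample_rate, f_lo, f_hi):
    """Sum of squared DFT magnitudes (scaled by 1/N) for bins between f_lo and f_hi."""
    n = len(samples)
    k_lo = math.ceil(f_lo * n / sample_rate)
    k_hi = min(math.floor(f_hi * n / sample_rate), n // 2)
    energy = 0.0
    for k in range(k_lo, k_hi + 1):
        bin_value = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                        for t, s in enumerate(samples))
        energy += abs(bin_value) ** 2
    return energy / n

FS = 48000  # sample rate in Hz
N = 256     # analysis block length (hypothetical)
tone_1k = [math.sin(2 * math.pi * 1000 * t / FS) for t in range(N)]
tone_18k = [math.sin(2 * math.pi * 18000 * t / FS) for t in range(N)]

low = band_energy(tone_1k, FS, 16000, 20000)    # only leakage from the 1 kHz tone
high = band_energy(tone_18k, FS, 16000, 20000)  # tone sits inside the band
```

The 1 kHz tone leaves almost nothing in the band, while the 18 kHz tone produces a clearly larger energy, which is the distinction the detector relies on.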
One possible detection method compares the energy in each signaling band against a fixed threshold to determine the embedded symbol. The threshold can be set at an energy level just above the noise floor. Any energy above this level is considered to contain signal. Figure 22 shows the three different energy levels needed to detect the four different bandwidths that produce 2-bit symbols. Any energy above the detection threshold is considered 'high', and any energy below it is considered 'low'.
This fixed threshold works well only if the noise floor of the system is always known and the peak signal level is never attenuated by environmental effects. For example, if additional noise raises the noise floor in the above diagram, the third energy level will be incorrectly judged 'high' and an incorrect symbol will be decoded.
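A fixed-threshold decision for the four-bandwidth case could look like the sketch below. It assumes the band energies are ordered from the lowest signaling band to the highest and that the symbol is the number of consecutive 'high' bands counted from the lowest; the function name and the numeric energies are hypothetical.

```python
def decode_symbol(band_energies, threshold):
    """Return the index of the detected bandwidth (0 = lowest) from signaling-band energies."""
    symbol = 0
    for energy in band_energies:
        if energy <= threshold:
            break  # the first 'low' band marks the top of the audio bandwidth
        symbol += 1
    return symbol

clean = decode_symbol([0.9, 0.8, 0.05], threshold=0.2)  # third band is 'low'
noisy = decode_symbol([0.9, 0.8, 0.25], threshold=0.2)  # raised noise floor
```

The second call shows the failure mode described above: once added noise lifts the third band over the fixed threshold, the detector reports the wrong bandwidth.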
A fixed threshold can be used if the energy levels are equalized or normalized before the threshold calculation. One technique for doing this is to apply an AGC algorithm or process to the signaling band before determining the energy level. The AGC normalizes the level, making the 'low' and 'high' levels more consistent. Because the levels are normalized, a fixed threshold can then be applied.
An adaptive threshold is considered best in any environment where the noise level and signal energy change constantly. One possible detection method employing an adaptive threshold uses previous energy states to calculate the threshold for the current state. The premise of this detector is that, for a given energy band, a finite number of previous states should include some energy levels in the 'high' state and some in the 'low' state.
The largest energies can be considered 'high' and the smallest 'low'. These 'high' and 'low' states can be regarded as two different groups. Figure 23 contains several exemplary histograms of 'high' and 'low' distributions. The threshold can be placed somewhere between these two 'clusters'.
If the number of 'high' states is assumed to equal the number of 'low' states in a predetermined finite set, then the largest half belong to the 'high' group and the smallest half to the 'low' group. If the average energy level, or mean, of each group is found, a simple threshold can be computed as the average of the two means. This can easily be made more sophisticated by assuming different distributions for the two groups and the threshold, and by taking into account more statistics of each group, such as mean and variance.
Another consideration that may be included is improving the separation into 'high' and 'low' groups. When more than two bandwidths are used in the embedding process, the energy levels in the signaling bands are correlated. When the highest bandwidth is 'on', the energy levels in every signaling band should be detected as 'high'. When the next-highest bandwidth is 'on', all signaling bands below that bandwidth should be detected as 'high'. This changes the energy distribution for each signaling band.
For example, suppose the watermark encoder uses four different bandwidths to produce two-bit symbols. Let A, B, C and D denote the bandwidths, where A is the lowest bandwidth and D is the highest. Three different energy bands are needed to distinguish these bandwidths. If the three energy bands are denoted 1, 2 and 3, they correspond to the energy between bandwidths A-B, B-C, and C-D. The following table lists the probability that each energy band is in the 'high' state if the symbols are evenly distributed.
Energy band    P('high')
1              3/4
2              1/2
3              1/4
The probabilities are unequal because each energy band is correlated with the bandwidth. For example, the probability of signal content in energy band 1 is the sum of the probabilities that symbols B, C and D occur. Each symbol occurs with probability 1/4, so the probability of signal content in energy band 1 is 3/4.
If the 40 previous states are used to calculate the current threshold for each energy band, the 30 highest states are assumed to represent signal content in energy band 1. The remaining ten samples represent no signal content. The current threshold is then determined by averaging the means of these two groups.
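The adaptive threshold with unequal group sizes can be sketched as follows. The function names and the sample history are hypothetical; the split between 'high' and 'low' groups follows the probabilities tabulated above, under the stated assumption of uniformly distributed symbols.

```python
from fractions import Fraction

def p_high(band, n_bandwidths):
    """Probability that energy band `band` (1 = lowest) holds energy, for uniform symbols."""
    return Fraction(n_bandwidths - band, n_bandwidths)

def adaptive_threshold(previous_energies, high_fraction):
    """Average of the means of the assumed 'high' and 'low' groups of previous states."""
    ordered = sorted(previous_energies)
    n_high = round(len(ordered) * high_fraction)
    if not 0 < n_high < len(ordered):
        raise ValueError("both groups must be non-empty")
    low_group = ordered[:len(ordered) - n_high]
    high_group = ordered[len(ordered) - n_high:]
    mean_low = sum(low_group) / len(low_group)
    mean_high = sum(high_group) / len(high_group)
    return (mean_low + mean_high) / 2

history = [1.0] * 10 + [5.0] * 30  # 40 hypothetical previous energies for band 1
threshold = adaptive_threshold(history, p_high(1, 4))
```

With the 3/4 split the threshold falls midway between the two clusters. Using an equal split on the same history would place ten 'high' states in the 'low' group and pull the threshold upward, which is why the group sizes are adjusted for each band.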
Adding channel coding is important for this detector, to ensure that the symbol distribution is essentially uniform. If the encoder input consists only of the highest-bandwidth symbol for an extended period, this detector will have difficulty decoding the encoded data. The closer the symbol distribution is to the assumed probabilities, the more accurate the detection of the embedded data.
One method of channel coding is to ensure that each symbol occurs exactly once within a finite period. For example, if there are four different bandwidth codes, each symbol may be required to occur once in every group of four symbols. This yields 24 distinct symbols, each of which is a group of four bandwidth codes. Twenty-four (four factorial) is the maximum number of permutations of four bandwidth codes. If A, B, C and D denote the four bandwidth codes, the symbols look like ABCD, BACD, ABDC, BADC, BCAD, and so on. Note that this reduces the embedded data rate.
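The permutation codebook can be generated directly. In this sketch the mapping from integers to groups is simply the lexicographic order produced by the standard library, which is an arbitrary choice for illustration and not something the text specifies.

```python
import itertools

BANDWIDTH_CODES = "ABCD"
# All 24 orderings in which each bandwidth code appears exactly once.
CODEBOOK = ["".join(p) for p in itertools.permutations(BANDWIDTH_CODES)]

def encode(value):
    """Map an integer 0..23 to a group of four bandwidth codes, each used once."""
    return CODEBOOK[value]

def decode(group):
    """Inverse of encode; raises ValueError for a group that is not a permutation."""
    return CODEBOOK.index(group)
```

Each group of four frames then carries one of 24 values, about 4.58 bits, compared with 8 bits if the four 2-bit symbols were unconstrained. That is the price paid for a guaranteed uniform symbol distribution.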
Thus, a watermark detector according to this aspect of the invention may include:
1. An embedded-signal detector that uses an adaptive threshold calculated by examining previous states. The previous states are divided into groups based on energy level. The threshold is based on statistics of each group and attempts to separate the groups as well as possible.
2. When multiple groups are involved, adjusting the number of elements in each group based on the correlation arising from the bandwidth modulation.
3. A channel coder that ensures the distribution of symbols over a finite time is close to uniform. This ensures that the assumptions of the watermark detector described above hold.
Controlling the strength of parameter modulation
Adaptive distortion control
One object of the invention is to embed a watermark with maximum detectability and minimum perceptibility. Perceptual coders use a perceptual threshold to determine how to reduce the redundancy of the input signal. This same threshold can be used to modulate the watermark signal so that it is detectable while remaining essentially imperceptible.
As mentioned above, in some perceptual coders a distortion measure is paired with rate control to ensure that the correct information is discarded. The distortion measure compares the original input signal with the coded signal (the output of rate control). The distortion measure can be useful for controlling certain coding parameters so as to change the result of the rate control process. This yields the nested-loop configuration described below, in which the outer loop contains the distortion measure and the inner loop is the rate control. The coding parameters are modified iteratively, checking the distortion measure, until some criterion is satisfied. The same method can be used with variable-data-rate coders by removing the rate loop.
A process for embedding a watermark using a perceptibility threshold according to one aspect of the invention is shown in Figures 24-26. The process is similar to that defined in the MPEG-2 AAC perceptual coder, in which two nested loops are used to determine the optimum quantization. The inner iteration loop, shown in Figure 24, modifies the quantizer step size until the spectral data can be coded with the number of available bits (rate control). The outer iteration loop, shown in Figure 25, amplifies the spectral coefficients in all spectral bands so that the demands of the psychoacoustic model are satisfied as far as possible (distortion control). The process of Figure 25 is modified (as shown in Figure 26) to modulate a perceptual coding parameter or parameters so as to satisfy the psychoacoustic model, or perceptual threshold, as far as possible while also embedding the watermark signal. All of the parameters listed in Figures 6, 7 and 8 can be modulated in this way, although some parameters are more difficult than others to change within the bit allocation process.
The rate control process of Figure 24 attempts to represent the signal with a smaller, fixed amount of information. The input signal is quantized according to the perceptual threshold (step 20), and the bits used by the quantized result are counted (step 22). If the number of bits used does not exceed the number available, the process ends (step 24). Otherwise, the iterative process continues until the number of bits used matches the number of bits available as closely as possible. This is achieved by adjusting the perceptual threshold, usually by modifying the quantizer step size, until enough information has been discarded (step 26).
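The rate loop can be illustrated with a toy model. Everything specific here is an assumption made for the sketch: the bit-cost function, the uniform quantizer, the step growth factor, and the sample coefficients do not come from the text or from any particular coder.

```python
def count_bits(quantized):
    """Hypothetical cost model: one flag bit plus the magnitude bits of each value."""
    return sum(1 + abs(q).bit_length() for q in quantized)

def rate_loop(coeffs, available_bits, step=1.0, growth=1.25):
    """Coarsen the quantizer step until the quantized data fits in available_bits."""
    if available_bits < len(coeffs):
        raise ValueError("budget too small for this cost model")
    while True:
        quantized = [int(round(c / step)) for c in coeffs]  # step 20: quantize
        if count_bits(quantized) <= available_bits:         # steps 22 and 24: count, test
            return quantized, step
        step *= growth                                      # step 26: discard more

coeffs = [100.0, -37.5, 12.25, 3.0, -0.5]  # hypothetical spectral coefficients
quantized, final_step = rate_loop(coeffs, available_bits=20)
```

The loop always terminates under this cost model because a large enough step quantizes every coefficient to zero, at which point the cost equals the number of coefficients.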
The distortion measurement process shown in Figure 25 can be added to the quantizer step-size process to ensure that the simplifications made by the rate-controlled coding process do not cause easily perceived errors. The distortion measure allows coding parameters to be fine-tuned to reduce such errors as far as possible. In the first step of this process, the rate loop, or inner loop, is executed to quantize the input signal according to the rate constraint (step 28). A distortion evaluation then calculates how much distortion is present (step 30) and determines whether the distortion is acceptable relative to the perceptual threshold (step 32). If the distortion is not acceptable, the spectral coefficients are amplified (step 34) and the process is repeated. If the distortion is acceptable, the result of the quantization is applied to the input signal (step 36) and the process is complete. "Distortion" in this sense is the difference between the coded and original signals, and may or may not result in audible noise.
In aspects of the invention, the distortion measurement process shown in Figure 26 is used to determine how far a coding parameter value can be changed from its default value while remaining within the bounds of the perceptual threshold. This maximizes the likelihood of watermark detection, because it produces as much distortion as possible while the perceptual threshold constraint keeps the distortion imperceptible. The rate control (step 28), distortion control (step 32) and coding parameter modulation (step 38) steps are repeated until an acceptable compromise is reached.
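In outline, the search for the largest acceptable parameter change might look like the sketch below. It is a simplification under stated assumptions: the callable distortion_for stands in for "modulate the coding parameter, rerun the rate loop, and measure the distortion", distortion is assumed not to decrease as the offset grows, and the quadratic distortion model in the usage lines is purely hypothetical.

```python
def max_parameter_offset(distortion_for, perceptual_threshold, candidates):
    """Return the largest candidate offset whose distortion stays within the threshold."""
    best = 0
    for offset in sorted(candidates):
        if distortion_for(offset) <= perceptual_threshold:  # step 32: acceptable?
            best = offset                                   # step 38: push further
        else:
            break  # assumes distortion does not decrease with larger offsets
    return best

# Hypothetical distortion model: grows with the square of the parameter offset.
strongest = max_parameter_offset(lambda d: 0.1 * d * d, 1.0, range(0, 8))
```

The result is the strongest modulation that still sits under the perceptual threshold, which is the compromise between detectability and imperceptibility that the text describes.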
Some coding systems, such as Dolby Digital, use a rate control process during encoding but do not apply distortion control. Thus, for such a coding system to employ this aspect of the invention, a distortion measure is added. Other coders, such as MPEG-2 AAC, have an integrated distortion control process for coding purposes, and therefore need less modification to implement watermarking according to this aspect of the invention. Note that in a variable-rate coding system the rate loop is not required, so the parameter modulation process provides an optimal solution while also reducing complexity.
Figure 27 illustrates how a distortion measurement process of the type just described can be used to embed a watermark according to the invention. The goal is to maximize robustness by forcing the effect of the parameter modulation, shown as a change in quantizer error in pass 2, as close as possible to the perceptual threshold. In pass 1, the perceptual threshold is calculated. In pass 2, the quantizer error is shown. Note that some margin is available within which the quantizer error can be modified imperceptibly. In pass 3, the selected watermark coding parameter, in this example a delta bit allocation type parameter (that is, the Delta or cpdelta parameter, which affects the quantizer error in a reference band), has been modulated, and the result is a modified quantizer error. The quantizer error could be modified further and still remain imperceptible. Note that the coding parameter modulation results in a slightly different quantization error across the entire spectrum, because the number of available bits is affected. That is, modulating the coding parameter, and the resulting quantizer resolution in a certain band, causes error not only in the band where the parameter is modulated but across the entire spectrum. In pass 4, the degree of coding parameter modulation has been adjusted again using the information from pass 3, and the resulting quantization error is as close as possible to the perceptual threshold. Although, when modulating one or more parameters that affect the quantizer error, it is preferable to bring the quantizer error as close as possible to, but below, the perceptual threshold, the invention also contemplates modulating one or more parameters such that the quantizer error is below but not close to the perceptual threshold, as shown for example in pass 3 of Figure 27.
Figure 28 illustrates a watermark embedding process in which the selected watermarking coding parameter is a global SNR offset type parameter (i.e., the csnroffst, fsnroffst, cplfsnoffst or lfesnroffst parameter). Note that in this example the result of modulating the global SNR offset parameter is an exact match with the perceptual threshold. This is because an SNR offset type parameter offsets the perceptual threshold uniformly across the whole spectrum. Thus, with an SNR offset type parameter, fitting the quantizer error to the perceptual threshold requires only a single step.
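The single-step property can be illustrated with a small sketch. It assumes per-band thresholds and quantizer errors expressed in dB, and it models the SNR offset simply as a uniform shift of the error curve in every band. The function name and the sample values are hypothetical and are not taken from the patent or from any codec.

```python
def snr_offset_to_reach_threshold(threshold_db, quant_error_db):
    """Return the largest uniform shift (in dB) that raises the quantizer
    error without exceeding the perceptual threshold in any band."""
    margins = [t - e for t, e in zip(threshold_db, quant_error_db)]
    return min(margins)

# Hypothetical per-band values; the error curve runs parallel to the threshold.
threshold_db = [-40.0, -35.0, -50.0, -45.0]
quant_error_db = [-52.0, -47.0, -62.0, -57.0]

offset = snr_offset_to_reach_threshold(threshold_db, quant_error_db)
shifted = [e + offset for e in quant_error_db]
```

Because the shift is the same in every band, one computation suffices; when the two curves are not parallel, the same shift only brings the tightest band up to the threshold.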
A further aspect of the invention allows the user to control an offset of the perceptual threshold, and thereby to control the possible "gain" or energy of the watermark. The offset can be a linear offset of the perceptual threshold, or a more complex function that allows more distortion in particular frequency bands. This lets the user trade off the difficulty of detection against the audibility of the watermark in the final embedded signal. It can be achieved by raising the perceptual threshold curve by a fixed amount. In addition, by modifying the perceptual threshold, the user can embed a watermark in places where the watermarking coder margin is negative.
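A minimal sketch of such a user-controlled offset follows, assuming a per-band threshold in dB. The function name is hypothetical; a scalar argument stands for the fixed-amount case and a list stands for a band-dependent function.

```python
def offset_threshold(threshold_db, offset_db):
    """Raise a per-band perceptual threshold by a scalar offset or by one
    offset per band (both in dB)."""
    if isinstance(offset_db, (int, float)):
        offset_db = [offset_db] * len(threshold_db)
    if len(offset_db) != len(threshold_db):
        raise ValueError("one offset per band is required")
    return [t + o for t, o in zip(threshold_db, offset_db)]

uniform = offset_threshold([-40.0, -35.0], 3.0)            # fixed amount
per_band = offset_threshold([-40.0, -35.0], [0.0, 6.0])    # more distortion in band 2
```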
In perceptual coders such as Dolby Digital, Dolby E, and MPEG-2 AAC coders, the quantization or bit allocation process is calculated from the number of bits available to the coder and the global signal-to-noise ratio. The perceptual threshold is then compared with the quantizer error. If the distortion (the difference between the perceptual threshold and the quantizer error) does not satisfy the termination requirement, the selected coding parameter is adjusted based on the distortion, and the process is repeated until the distortion is acceptable.
In a preferred embodiment of this aspect of the invention, the distortion is calculated from banded groups of coefficients (i.e., coefficients grouped by critical band), which form the basis of the perceptual threshold. It should be noted that the perceptual threshold could also be based on the quantization error of each spectral coefficient, at the cost of increased complexity.
Once the threshold has been established, the distortion control part of this aspect of the invention begins. The coding parameter under test is adjusted on successive iterations of the distortion process. Adjusting the coding parameter under test affects the result of the spectral band bit allocation performed in the rate control process. The threshold resulting from the bit allocation (the masking threshold) is compared with the original perceptual threshold, and the coding parameter is adjusted repeatedly until the termination requirement is satisfied. If the termination requirement is not satisfied, the masking threshold is recalculated using the adjusted parameter.
In the preferred embodiment of this aspect of the invention, the adaptive distortion process can terminate when the perceptual threshold and the masking threshold are equal in any given relevant frequency band and the masking threshold does not exceed the perceptual threshold in any band. If the perceptual and masking thresholds never converge, termination logic can be employed that stops the process provided the masking threshold does not exceed the perceptual threshold. The termination requirement exists in order to limit complexity.
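The iteration and its termination conditions can be sketched as follows. This is an illustration under stated assumptions, not the codec's actual rate or distortion loop: the coding parameter is an integer, `masking_fn` is a hypothetical stand-in for recomputing the per-band masking threshold from the parameter, and the iteration cap is an added safeguard.

```python
def fit_parameter(perceptual_db, masking_fn, start=0, step=1, max_iter=32):
    """Raise a coding parameter until the masking threshold it produces
    meets the perceptual threshold in some band without exceeding it in any."""
    value = start
    for _ in range(max_iter):                  # iteration cap limits complexity
        candidate = value + step
        masking = masking_fn(candidate)
        if any(m > p for m, p in zip(masking, perceptual_db)):
            break                              # next step would exceed the threshold
        value = candidate
        if any(m == p for m, p in zip(masking, perceptual_db)):
            break                              # thresholds meet in a band
    return value

# Hypothetical toy model: each parameter step raises every band by a fixed gain.
perceptual_db = [-40.0, -35.0, -50.0]
base_db = [-52.0, -41.0, -60.0]

converged = fit_parameter(perceptual_db, lambda v: [b + 2.0 * v for b in base_db])
no_exact_match = fit_parameter(perceptual_db, lambda v: [b + 4.0 * v for b in base_db])
```

In the first call the thresholds meet exactly in the second band; in the second call they never become equal, so the loop stops at the last value that keeps the masking threshold below the perceptual threshold.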
Decoder Parameter Modulation
Figure 29 illustrates an aspect of the invention in which a parameter of a perceptual decoder is modulated. In this example, the decoder employs hybrid bit allocation (i.e., the perceptual model is conveyed from the encoder to the decoder). In the decoder, the received perceptually coded bitstream 40 is separated into coding parameters 42 (representing the bit allocation model) and formatted data 44 (i.e., the quantized data). Bit allocation 46 and inverse quantization 48 are performed. At the next step, 50, a decision is made ("perceptual threshold calculated?"). If the threshold has not yet been calculated (i.e., on the first pass through the loop), the perceptual threshold is calculated (step 52) based on the bitstream from the encoder. If the perceptual threshold exists (i.e., after the first pass through the loop), the inverse-quantized signal is compared with the threshold (step 54). A decision is then made at step 56 ("distortion acceptable?"). If the distortion is unacceptable, the coding parameter being modulated is adjusted (step 58), and the bit allocation, inverse quantization, and perceptual threshold comparison are repeated. The coding parameter is initially modulated based on the watermark symbol (i.e., supplemental information) input 60, and is subsequently adjusted based on the comparison with the perceptual threshold.
A similar process can be employed in perceptual decoder systems that use forward adaptive bit allocation (i.e., the perceptual model is generated in the encoder and sent explicitly to the decoder). The signal data is reformatted using the transmitted perceptual model. The perceptual model can then be modified by a parameter in order to embed the watermark. The watermarked version of the audio is compared with the unmarked signal. If the distortion measurement does not satisfy the predetermined termination requirement(s), the signal is recalculated using a revised parameter modulation value.
Controlling Parameter Modulation in Response to a Watermark Sequence and/or a Deterministic Sequence
In other aspects of the invention, the modulation of one or more parameters is controlled indirectly by the supplemental information or watermark signal or sequence. For example, the control of modulation by the watermark is itself modified as a function of one or more other signals or data sequences, such as a set of instructions (for example, a deterministic sequence) and/or the input signal applied to the coding system. Figure 30 is a functional block diagram showing this aspect of the invention. As in the basic arrangement of Figure 2, the main information is applied to a perceptual encoder function 2. In this aspect of the invention, the supplemental information is applied to a parameter controller function 62. The parameter controller function 62 also receives the main information, or one or more deterministic sequences, or both the main information and one or more deterministic sequences. The parameter controller 62 modifies the way in which the supplemental information modulates encoder function or decoder function parameters. As described below, it does so by modifying the supplemental information in one or more ways, each as a function of the main information and/or as a function of one or more deterministic sequences. Because the modified supplemental information from the parameter controller function can be applied to the encoder function, to the decoder function, or to both, dashed lines are shown from the modified supplemental information to each of the encoder and decoder functions. As in the arrangement of Figure 2, the output of the perceptual decoder function is the main information with the supplemental information embedded in it. The supplemental information can be detected in the decoder function output.
If the modified supplemental information controls parameter modulation in both the encoder function 2 and the decoder function 4, the information applied to one will differ from the information applied to the other. For example, the supplemental information that controls one or more encoder function parameters may represent a watermark identifying the owner of the audio or video content, whereas the supplemental information that controls one or more decoder function parameters may be a serial number identifying the equipment that presents the audio or video content to one or more consumers.
When the parameter controller 62 uses a deterministic sequence to modify the way in which the supplemental information modulates one or more parameters, detecting the supplemental information or watermark in the decoder function output requires that the generator equation and the key of the deterministic sequence be known to the detector function. The generator equation may be publicly known, may be known a priori to the detector (but not public), or may be conveyed to the detector over a secure channel. Similarly, the key may be publicly known, may be known a priori to the detector (but not public), or may be conveyed to the detector over a secure channel. For a secure system, the only requirement is that the key not be made public.
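As an illustration of a deterministic sequence defined by a generator equation and a key, the sketch below uses a linear-feedback shift register (LFSR). The text does not specify how the sequence is generated, so the LFSR, the tap positions (standing in for the generator equation), and the initial register contents (standing in for the key) are all assumptions made for this example.

```python
def lfsr_sequence(taps, key_bits, length):
    """Generate `length` bits from a Fibonacci-style LFSR.

    taps:     1-based register positions XORed to form the feedback bit
              (plays the role of the generator equation).
    key_bits: initial register contents (plays the role of the key).
    """
    state = list(key_bits)
    if not any(state):
        raise ValueError("an all-zero key produces an all-zero sequence")
    out = []
    for _ in range(length):
        out.append(state[-1])                  # output the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]        # shift, inserting feedback first
    return out

# Encoder and detector regenerate the same bits from the same taps and key.
encoder_ds = lfsr_sequence([3, 4], [1, 0, 0, 0], 30)
detector_ds = lfsr_sequence([3, 4], [1, 0, 0, 0], 30)
other_key_ds = lfsr_sequence([3, 4], [0, 1, 0, 0], 30)
```

A detector that knows the taps but not the key regenerates a different bit pattern, which is consistent with the statement that only the key needs to be kept private.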
When the parameter controller 62 uses the input signal to modify the way in which the supplemental information modulates one or more parameters, detecting the supplemental information or watermark in the decoder function output requires that the detector function know at least the source signal or certain information about the source signal (for example, the source signal characteristics to which the parameter controller is programmed to respond). This can be accomplished by conveying the source signal or, preferably, the source signal characteristics to which the parameter controller is programmed to respond. If the source signal is conveyed rather than its relevant characteristics, the detector function can derive the relevant characteristics independently, based on an analysis of the source signal and of the decoder function output. However, where a characteristic is not determined from the original source signal, errors may arise because of quantization error.
Controlling Parameter Modulation in Response to a Deterministic Sequence
Modifying the Watermark Symbol Transition Rate
One variation of this aspect of the invention uses a deterministic sequence to control the rate at which parameter modulation states change, and thereby the watermark symbol transition rate. In particular, the duration of the parameter modulation state, and thus the duration of the watermark symbol, is varied in response to the deterministic sequence. If watermark symbol transitions are embedded at a constant rate, repetitive sequences in the watermark symbol pattern may be perceptible. Varying the duration of the parameter modulation state, and thus the symbol duration, minimizes such repetition. Table 1 shows an example in which the duration of the parameter modulation state, and thus the watermark symbol duration, depends on a deterministic sequence; the result is the pattern shown as the modified sequence. In this particular example, if the deterministic sequence (DS) value is "1", the watermark symbol is repeated. If the DS value is "0", the watermark symbol is not repeated. It should be noted that the period of the watermark symbol pattern depends on the occurrences of the value "1" in the deterministic sequence, each of which causes a repetition. A finite sequence that is reset appropriately should therefore be used, so that synchronization is possible during detection.
Sequence type                   Sequence
Deterministic sequence (DS)     10110010
Watermark sequence (WS)         01011100
Modified sequence               001001111000
Table 1
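The modified sequence in Table 1 can be reproduced with a short sketch. The function name is hypothetical; the rule it applies is the one stated for this example, namely that a watermark symbol is emitted twice where the DS value is "1" and once where it is "0".

```python
def repeat_where_set(symbols, control):
    """Emit each symbol twice where the control bit is '1', otherwise once."""
    if len(symbols) != len(control):
        raise ValueError("sequences must have equal length")
    return "".join(s * 2 if c == "1" else s for s, c in zip(symbols, control))

WS = "01011100"   # watermark sequence from Table 1
DS = "10110010"   # deterministic sequence from Table 1
modified = repeat_where_set(WS, DS)
```

The output is longer than the watermark sequence by the number of "1" values in DS, which is why the detector needs the same DS, suitably reset, to stay synchronized.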
Selecting the Parameter Used for Watermark Embedding
According to a further variation of this aspect of the invention, the deterministic sequence selects the parameter or parameters used for watermark embedding. In general, any one of several parameters can be used to embed a watermark. For example, modulating one parameter may modify the spectral energy in a particular frequency range, whereas modulating another parameter may reduce the bandwidth of the decoded signal. If only one parameter is modulated, the resulting watermark may be more perceptible to people who are sensitive to spectral energy modulation. If, on the other hand, the embedding technique switches between modulating one parameter and modulating another, the resulting artifacts may be relatively less perceptible. This effect becomes more pronounced as the number of parameters used for embedding increases (the impairments introduced by the artifacts become more noise-like).
Table 2 shows two ways in which the choice of coding parameter can be switched. In the first example, shown in part "a" of Table 2, parameters 1 and 2 take watermark sequence (WS) values in dependence on the deterministic sequence (DS). For example, if the DS value is "0", parameter 1 is modulated to a state reflecting the WS value; otherwise it is modulated to a state reflecting the value "0" (either state may be, but need not be, the default value of the parameter). Likewise, if the DS value is "1", parameter 2 is modulated to a state reflecting the WS value; otherwise it is modulated to a state reflecting the value "0" (either state may be, but need not be, the default value of the parameter). In this example, detecting WS requires both parameters and the DS sequence. In the second example, shown in part "b" of Table 2, parameters 1 and 2 are modulated to states that reflect only values related to WS itself. For example, parameter 1 is modulated from its default state to a state reflecting the WS value "0", and parameter 2 is modulated from its default state to a state reflecting the WS value "1". In this way, either parameter can be detected independently, since both carry WS.
Table 2 [image C0181406200421 in the original publication; contents not available in this text]
Modifying the Rate at Which the Selection of the Parameter Used for Modulation Changes
According to a further variation of this aspect of the invention, the selection of the parameter used for modulation can change in dependence on a deterministic sequence. This in turn can reduce the perceptibility of the watermark, because it eliminates the periodic artifacts introduced when the embedding technique is changed at a constant rate. An example of this embodiment is shown in Table 3. In this example, parameter 1 is modulated to a state reflecting the inverse of WS (either state may be, but need not be, the default value of the parameter), and the symbol is repeated when the DS value is "1"; otherwise it is not repeated. Parameter 2 is modulated to a state reflecting the WS value (either state may be, but need not be, the default value of the parameter), and the symbol is repeated when the DS value is "1"; otherwise it is not repeated. As in the example of part "b" of Table 2, both parameters carry the watermark.
Table 3 [image C0181406200431 in the original publication; contents not available in this text]
Controlling Parameter Modulation in Response to Characteristics of the Source Signal
Modifying the Watermark Symbol Transition Rate Using Source Signal Analysis
Another variation of this aspect of the invention analyzes characteristics of the source signal and, based on the result of that analysis, adaptively controls the rate at which parameter modulation states change, and thereby the watermark symbol transition rate. Specifically, the duration of the parameter modulation state, and thus the duration of the watermark symbol state, is varied in response to characteristics of the source signal. For example, rapidly changing signal conditions can provide a useful degree of temporal masking, which can be used to reduce the perceptibility of watermark symbol transitions. If the amplitude of the time-domain source signal changes by more than a predetermined threshold from frame 1 to frame 2 (assuming that the source signal has been formatted as a digital signal stream with frames), the watermark symbol value in frame 1 can be allowed to change to another value in frame 2. In frame 3, if the change in the source signal characteristic from the previous frame or frames does not exceed the threshold, the symbol is not allowed to change value. By associating watermark symbol transitions with masking events or other "change-friendly" conditions in the source signal, the imperceptibility of the watermark can be improved.
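The frame-to-frame threshold test can be sketched as follows. The use of peak amplitude as the measured characteristic, the function name, and the sample frames are assumptions made for illustration; the text only requires that the change in some amplitude measure be compared with a predetermined threshold.

```python
def signal_defined_sequence(frames, threshold):
    """Return one bit per frame: 1 if the peak amplitude changed by more than
    `threshold` relative to the previous frame, else 0. The first frame has
    no predecessor and is marked 0."""
    peaks = [max(abs(x) for x in frame) for frame in frames]
    bits = [0]
    for prev, cur in zip(peaks, peaks[1:]):
        bits.append(1 if abs(cur - prev) > threshold else 0)
    return bits

# Hypothetical frames: quiet, loud onset, sustained, decay.
frames = [[0.1, -0.1], [0.9, -0.8], [0.85, 0.9], [0.1, 0.05]]
sds = signal_defined_sequence(frames, threshold=0.5)
```

A symbol transition would then be permitted only in frames whose bit is 1, so that it coincides with a change in the signal that may help mask it.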
In Table 4, the signal-defined sequence (SDS) represents the output of a thresholding process such as transient detection. In this example, an SDS value of "0" indicates that no transient condition occurs in the block, and a value of "1" indicates that a transient is present. In part "a" of Table 4, the WS value is repeated if the SDS value is "1". If the SDS value is "0", the watermark symbol is not repeated. This example assumes that a single coding parameter carries the watermark.
Modifying the Rate at Which the Parameter Selection Changes Using Source Signal Analysis
In another aspect of the invention, the approach just described is modified so that the source signal characteristics are used to vary the rate at which the selection of the parameter used for modulation changes, as opposed to the rate of parameter modulation. As in the approach just described, the benefit is that transitions can be less perceptible when the source signal provides temporal masking or other "change-friendly" conditions. An example of this embodiment is shown in part "b" of Table 4. In this example, parameter 1 is modulated to a state reflecting the inverse of WS (either state may be, but need not be, the default value of the parameter), and the symbol is repeated when the SDS value is "1"; otherwise it is not repeated. Parameter 2 is modulated to a state reflecting the WS value (either state may be, but need not be, the default value of the parameter), and the symbol is repeated when the SDS value is "1"; otherwise it is not repeated. As in the example of part "b" of Table 2, both parameters carry the watermark. This method is similar to the case shown in Table 3; the only difference is that here the transition rate is defined by the SDS.
Table 4 [image C0181406200441 in the original publication; contents not available in this text]
Selecting the Parameters Used for Watermark Embedding Using Source Signal Analysis
In another aspect of the invention, the set of parameters available for modulation is modified based on characteristics of the source signal. Suppose that a particular watermarking system can embed a watermark by modulating any one of several different parameters (for example, parameters that cause a spectral energy increase, temporal noise insertion, bandwidth reduction, and so on). Depending on the current characteristics of the source signal, not all of these parameters will cause imperceptible changes in the decoded signal. For example, if the source signal is stationary, temporal noise insertion may be more perceptible than a spectral energy increase in a perceptually masked frequency range. Consequently, it is preferable to reduce the set of available parameters so as to disable those parameters that would produce more perceptible results for the current signal characteristics.
Table 5 shows an example of a signal-defined sequence (SDS) based on the same thresholding process described previously (transient detection). An SDS value of "1" indicates that a transient condition exists in the block, and an SDS value of "0" indicates that no transient condition exists. In Table 5, when no transient condition exists (SDS=0), parameters 1 and 2 nominally carry the watermark: parameter 1 has a modulation state reflecting the value "1" for the WS value "0" and a modulation state reflecting the value "0" otherwise, and parameter 2 has a modulation state reflecting the value "1" for the WS value "1" and a modulation state reflecting the value "0" otherwise. If a transient condition exists (SDS=1), parameters 3 and 4 are modulated instead; these parameters preferably cause temporal distortion, whereas parameters 1 and 2 cause spectral distortion. If the number of parameters has been reduced, a deterministic sequence can still be used to select parameters from the smaller set, so that the benefit of switching between or among parameters is retained while the parameters best suited to the current source signal characteristics are selected adaptively.
Sequence type                       Sequence
Signal-defined sequence (SDS)       00101110
Watermark sequence (WS)             01011100
Parameter 1=1, WS(0), SDS(0)        10000001
Parameter 2=1, WS(1), SDS(0)        01010000
Parameter 3=1, WS(0), SDS(1)        00100010
Parameter 4=1, WS(1), SDS(1)        00001100
Table 5
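The four parameter rows of Table 5 follow mechanically from SDS and WS, as the sketch below shows. The function name is hypothetical; each row marks with "1" the positions where WS and SDS both take the values given in the row label.

```python
def indicator(ws, sds, ws_value, sds_value):
    """Mark the positions where WS and SDS both take the requested values."""
    return "".join(
        "1" if (w == ws_value and s == sds_value) else "0"
        for w, s in zip(ws, sds)
    )

SDS = "00101110"   # signal-defined sequence from Table 5
WS = "01011100"    # watermark sequence from Table 5

rows = {
    "Parameter 1": indicator(WS, SDS, "0", "0"),
    "Parameter 2": indicator(WS, SDS, "1", "0"),
    "Parameter 3": indicator(WS, SDS, "0", "1"),
    "Parameter 4": indicator(WS, SDS, "1", "1"),
}
```

Exactly one of the four parameters is active in each position, so a detector that observes all four can read both the watermark symbol and the transient state.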
Controlling Parameter Modulation in Response to a Deterministic Sequence and Source Signal Characteristics
In addition to using only a deterministic sequence or only input signal characteristics to control parameters, the invention also contemplates controlling parameter modulation in response to both a deterministic sequence and characteristics of the input signal.
There are several ways to use a deterministic sequence and source signal characteristics in combination to control parameter modulation. Doing so can further improve imperceptibility and/or robustness. In one such method, the deterministic sequence selects which subset of coding parameters is used for different states of the signal characteristic. More specifically, using the example of Table 5 above, when no transient exists (SDS=0), two parameters are selected for modulation, and those parameters are selected based on the deterministic sequence DS. Table 6 illustrates this method.
Sequence type                            Sequence
Signal-defined sequence (SDS)            00101110
Deterministic sequence (DS)              10110010
Watermark sequence (WS)                  01011100
Parameter 1=1, SDS(0), DS(0), WS(0)      00000001
Parameter 2=1, SDS(0), DS(0), WS(1)      01000000
Parameter 3=1, SDS(0), DS(1), WS(0)      10000000
Parameter 4=1, SDS(0), DS(1), WS(1)      00010000
Parameter 5=1, SDS(1), DS(0), WS(0)      00000000
Parameter 6=1, SDS(1), DS(0), WS(1)      00001100
Parameter 7=1, SDS(1), DS(1), WS(0)      00100010
Parameter 8=1, SDS(1), DS(1), WS(1)      00000000
Table 6
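Table 6 can also be read as a per-position lookup: the triple (SDS, DS, WS) selects one of eight parameters. The sketch below takes that view, with hypothetical function names, and numbers the parameters in the row order of Table 6.

```python
def select_parameter(sds_bit, ds_bit, ws_bit):
    """Map one (SDS, DS, WS) triple to a parameter number from 1 to 8,
    following the row order of Table 6."""
    return 1 + 4 * int(sds_bit) + 2 * int(ds_bit) + int(ws_bit)

def parameter_rows(sds, ds, ws, count=8):
    """Build one indicator row per parameter from the three sequences."""
    rows = [["0"] * len(ws) for _ in range(count)]
    for i, triple in enumerate(zip(sds, ds, ws)):
        rows[select_parameter(*triple) - 1][i] = "1"
    return ["".join(r) for r in rows]

SDS, DS, WS = "00101110", "10110010", "01011100"
table6 = parameter_rows(SDS, DS, WS)
```

Rows 5 and 8 are all zeros for these particular sequences because the corresponding combinations of SDS, DS and WS never occur.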
In another example, the deterministic sequence modifies the transition rate of a watermark sequence that has already been modified by the signal-defined sequence. Table 7 illustrates this method. The second column shows the first step, in which the embedding technique is changed based on SDS, and the third column shows the second step, in which the sequence is changed further based on DS. As in the earlier example of Table 1, a sequence value is repeated where DS has the value "1" and is not repeated where DS has the value "0", which is consistent with the values in the third column.
Sequence type                            Sequence (SDS)    Sequence (SDS/DS)
Signal-defined sequence (SDS)            00101110
Deterministic sequence (DS)              10110010
Watermark sequence (WS)                  01011100
Parameter 1=1, SDS(0), DS(0), WS(0)      10000001          110000000001
Parameter 2=1, SDS(0), DS(0), WS(1)      01010000          001001100000
Parameter 3=1, SDS(0), DS(1), WS(0)      00100010          000110000110
Parameter 4=1, SDS(0), DS(1), WS(1)      00001100          000000011000
Table 7
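The values in Table 7 can be reproduced with the following sketch. It rests on a reading of the table rather than on an explicit statement in the text: the first-step values match indicator sequences over WS and SDS (the same values as the parameter rows of Table 5), and the second-step values match a repetition of each value where DS is "1". The function names are hypothetical.

```python
SDS, DS, WS = "00101110", "10110010", "01011100"

def indicator(ws, sds, ws_value, sds_value):
    """Mark the positions where WS and SDS both take the requested values."""
    return "".join(
        "1" if (w == ws_value and s == sds_value) else "0"
        for w, s in zip(ws, sds)
    )

def repeat_where_set(symbols, control):
    """Emit each symbol twice where the control bit is '1', otherwise once."""
    return "".join(s * 2 if c == "1" else s for s, c in zip(symbols, control))

# Step 1: one row per parameter, ordered as parameters 1 to 4.
step1 = [indicator(WS, SDS, w, s) for s in "01" for w in "01"]
# Step 2: stretch every row with the same deterministic sequence.
step2 = [repeat_where_set(row, DS) for row in step1]
```

Because every row is stretched by the same DS, the rows stay aligned with one another after the second step.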
For each of the examples in which multiple coding parameters carry the embedded sequence, there is also the possibility of adding redundancy by applying the same watermark sequence to several coding parameters, in order to increase resilience to errors caused by attacks or processing. To facilitate lower-complexity detection, such coding parameters can have a constrained relationship or a predetermined hierarchy, so that if one parameter is corrupted, the detector can recover the message from the other coding parameters.
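One simple way a detector might exploit such redundancy is a per-position majority vote over the copies read from several parameters. Majority voting is an assumption chosen for illustration; the text mentions only a constrained relationship or predetermined hierarchy among the parameters. The function name and the corrupted copy are hypothetical.

```python
def majority_vote(copies):
    """Recover a bit string from equally long copies by per-position majority."""
    if len({len(c) for c in copies}) != 1:
        raise ValueError("all copies must have the same length")
    recovered = []
    for bits in zip(*copies):
        ones = bits.count("1")
        recovered.append("1" if 2 * ones > len(bits) else "0")
    return "".join(recovered)

# Three parameters carry the same sequence; one copy has a single bit error.
copies = ["01011100", "01011000", "01011100"]
recovered = majority_vote(copies)
```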
In addition, a deterministic sequence can be used to modulate one or more other coding parameters at the same time, making it difficult for an attacker to infer which parameter carries the watermark. In the example shown in Table 8, parameter 1 carries the watermark sequence, and the deterministic sequence dictates which of parameter 2 and parameter 3 changes based on the watermark sequence. Parameters 2 and 3 do not carry the watermark in this case, but act as decoys. In this example, the decoy selected by the DS state equals WS, and the other decoy is "0".
Sequence type                   Sequence
Deterministic sequence (DS)     10110010
Watermark sequence (WS)         01011100
Parameter 1=WS                  01011100
Parameter 2=WS, DS(0)           01001100
Parameter 3=WS, DS(1)           00010000
Table 8
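The decoy rows of Table 8 can be reproduced with a short sketch. The function name is hypothetical; each decoy copies WS in the positions where DS takes the value in the row label and is "0" elsewhere.

```python
def gate(ws, ds, ds_value):
    """Copy WS where DS equals ds_value; emit '0' elsewhere."""
    return "".join(w if d == ds_value else "0" for w, d in zip(ws, ds))

DS = "10110010"   # deterministic sequence from Table 8
WS = "01011100"   # watermark sequence from Table 8

parameter_1 = WS                  # carries the watermark itself
parameter_2 = gate(WS, DS, "0")   # decoy, active where DS is 0
parameter_3 = gate(WS, DS, "1")   # decoy, active where DS is 1
```

Since the two decoys are active in complementary positions, combining them position by position gives back WS, while each decoy observed alone shows only part of the pattern.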
Conclusion
It should be understood that implementations of other variations and modifications of the invention and its various aspects will be apparent to those skilled in the art, and that the invention is not limited by the specific embodiments described. It is therefore intended that the invention cover any and all modifications, variations, or equivalents that fall within the spirit and scope of the basic underlying principles disclosed and claimed herein.
The invention and its various aspects can be implemented as software functions executed in digital signal processors, programmed general-purpose digital computers, and/or special-purpose digital computers. Interfaces between analog and digital signal streams can be implemented in appropriate hardware and/or as functions in software and/or firmware.

Claims (17)

1. A method of modifying, in accordance with supplemental information, the operation of a perceptual encoder function and/or a perceptual decoder function of a perceptual coding system, such that the supplemental information is detectable in the output of the perceptual decoder function, comprising
modulating one or more parameters in said perceptual encoder function and/or said perceptual decoder function in response to the information content of said supplemental information.
2. The method according to claim 1, wherein said perceptual encoder is an audio coder employing hybrid forward/backward adaptive bit allocation.
3. The method according to claim 2, wherein said one or more parameters include one or more parameters belonging to one or more of the following categories:
Masking model and bit allocation,
Coupling between channels,
Frequency bandwidth,
Dither control,
Phase relationship, and
Time/frequency transform window.
4. The method according to claim 1, wherein said perceptual encoder is an audio coder employing forward adaptive bit allocation.
5. The method according to claim 4, wherein said one or more parameters include one or more parameters belonging to one or more of the following categories:
Masking model and bit allocation,
Coupling between channels,
Temporal noise shaping filter coefficients, and
Time/frequency transform window.
6. The method according to claim 1, wherein said perceptual encoder is a video coder, and wherein said one or more parameters include one or more parameters belonging to one or more of the following categories:
Frame type, and
Motion control.
7. The method according to claim 1, wherein said one or more parameters are selected from parameters that affect one or more of the following in the decoded output signal:
Signal-to-noise ratio,
Quantizer noise,
Time relationship between channels,
Frequency bandwidth,
Shaped noise,
Phase relationship between channels, and
Wide-spectrum time-aliasing noise.
8. The method according to claim 1, wherein said one or more parameters are modulated by performing one of the following actions:
Changing a binary-valued parameter between its two values,
Changing the parameter between its default value and one or more other values, and
Changing the parameter among values different from its default value.
9. The method according to claim 1, wherein the degree of modulation of said one or more parameters is controlled.
10. The method according to claim 9, wherein the degree of modulation of said one or more parameters is controlled so as to limit the perceptibility of artifacts produced in the encoded output signal by modulating said one or more parameters.
11. The method according to claim 1, wherein the modulation of parameters is controlled indirectly in accordance with the supplemental information, such that one or more of the following modulation characteristics:
The selection of the one or more parameters to be modulated,
The rate of parameter selection, and
The rate of parameter state transitions,
are determined in response to the supplemental information and as a function of one or more other signals or sequences.
12. The method according to claim 11, wherein said one or more other signals or sequences include either or both of the following:
A set of instructions, and
Characteristics of the signal input to the perceptual encoder of the perceptual coding system.
13. The method according to claim 12, wherein said set of instructions includes a deterministic sequence.
14. The method according to claim 13, wherein said deterministic sequence is a pseudo-random number sequence.
15. The method according to claim 1, for modifying the operation of the perceptual encoder and/or the perceptual decoder of a perceptual coding system in accordance with supplemental information and for detecting the supplemental information in the output of the perceptual decoder, further comprising
detecting the supplemental information in the output of the perceptual decoder function.
16. The method according to claim 15, wherein the action of detecting the supplemental information in the output of the perceptual decoder function is accomplished by one of the following actions:
Observing the decoded signal,
Comparing the decoded signal with the signal applied to the perceptual encoder function, and
Comparing the decoded signal with the decoded signal from a second perceptual coding system substantially identical to said perceptual coding system, in which second perceptual coding system no parameter in the perceptual encoder function or the perceptual decoder function is modulated in response to supplemental information.
17. The method according to claim 16, wherein the action of observing the decoded signal includes comparing the decoded signal with a time-delayed version of itself.
CNB018140629A 2000-08-16 2001-08-15 Modulating one or more parameters of an audio or video perceptual coding system in response to supplemental information Expired - Lifetime CN100431355C (en)

Applications Claiming Priority (16)

Application Number Priority Date Filing Date Title
US22604400P 2000-08-16 2000-08-16
US22615100P 2000-08-16 2000-08-16
US60/226,151 2000-08-16
US60/226,044 2000-08-16
US25596500P 2000-12-15 2000-12-15
US25607800P 2000-12-15 2000-12-15
US25600000P 2000-12-15 2000-12-15
US25596400P 2000-12-15 2000-12-15
US25615700P 2000-12-15 2000-12-15
US25596700P 2000-12-15 2000-12-15
US60/256,078 2000-12-15
US60/255,964 2000-12-15
US60/255,965 2000-12-15
US60/256,157 2000-12-15
US60/256,000 2000-12-15
US60/255,967 2000-12-15

Publications (2)

Publication Number Publication Date
CN1672418A CN1672418A (en) 2005-09-21
CN100431355C true CN100431355C (en) 2008-11-05

Family

ID=27575224

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB018140629A Expired - Lifetime CN100431355C (en) 2000-08-16 2001-08-15 Modulating one or more parameters of an audio or video perceptual coding system in response to supplemental information

Country Status (12)

Country Link
US (1) US7395211B2 (en)
EP (1) EP1310099B1 (en)
JP (1) JP2004506947A (en)
KR (1) KR100898879B1 (en)
CN (1) CN100431355C (en)
AT (1) ATE308858T1 (en)
AU (2) AU8491001A (en)
BR (1) BRPI0113271B1 (en)
CA (1) CA2418722C (en)
DE (1) DE60114638T2 (en)
HK (1) HK1080243B (en)
WO (1) WO2002015587A2 (en)

KR20050026641A (en) * 2003-09-09 2005-03-15 삼성전자주식회사 Method for adaptively inserting karaoke information into audio signal and apparatus therefor, method for reproducing karaoke information from audio data and apparatus therefor, and recording medium for recording programs for realizing the same
SG120118A1 (en) * 2003-09-15 2006-03-28 St Microelectronics Asia A device and process for encoding audio data
KR20050028193A (en) * 2003-09-17 2005-03-22 삼성전자주식회사 Method for adaptively inserting additional information into audio signal and apparatus therefor, method for reproducing additional information inserted in audio data and apparatus therefor, and recording medium for recording programs for realizing the same
EP2065885B1 (en) 2004-03-01 2010-07-28 Dolby Laboratories Licensing Corporation Multichannel audio decoding
JP4230953B2 (en) * 2004-03-31 2009-02-25 株式会社ケンウッド Baseband signal generation apparatus, baseband signal generation method, and program
WO2005099385A2 (en) * 2004-04-07 2005-10-27 Nielsen Media Research, Inc. Data insertion apparatus and methods for use with compressed audio/video data
US7451093B2 (en) * 2004-04-29 2008-11-11 Srs Labs, Inc. Systems and methods of remotely enabling sound enhancement techniques
DE102004021403A1 (en) 2004-04-30 2005-11-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Information signal processing by modification in the spectral / modulation spectral range representation
DE102004021404B4 (en) * 2004-04-30 2007-05-10 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Watermark embedding
CN1993700B (en) 2004-07-02 2012-03-14 尼尔逊媒介研究股份有限公司 Methods and apparatus for mixing compressed digital bit streams
US7508947B2 (en) 2004-08-03 2009-03-24 Dolby Laboratories Licensing Corporation Method for combining audio signals using auditory scene analysis
BRPI0515343A8 (en) * 2004-09-17 2016-11-29 Koninklijke Philips Electronics Nv AUDIO ENCODER AND DECODER, METHODS OF ENCODING AN AUDIO SIGNAL AND DECODING AN ENCODED AUDIO SIGNAL, ENCODED AUDIO SIGNAL, STORAGE MEDIA, DEVICE, AND COMPUTER READABLE PROGRAM CODE
US7668715B1 (en) * 2004-11-30 2010-02-23 Cirrus Logic, Inc. Methods for selecting an initial quantization step size in audio encoders and systems using the same
TW200638335A (en) * 2005-04-13 2006-11-01 Dolby Lab Licensing Corp Audio metadata verification
CN101228575B (en) 2005-06-03 2012-09-26 杜比实验室特许公司 Sound channel reconfiguration with side information
JP2006337851A (en) * 2005-06-03 2006-12-14 Sony Corp Speech signal separating device and method
JP4896455B2 (en) * 2005-07-11 2012-03-14 株式会社エヌ・ティ・ティ・ドコモ Data embedding device, data embedding method, data extracting device, and data extracting method
KR100695158B1 (en) * 2005-08-03 2007-03-14 삼성전자주식회사 Image encoding apparatus and method and image decoding apparatus and method thereof
US8184817B2 (en) 2005-09-01 2012-05-22 Panasonic Corporation Multi-channel acoustic signal processing device
GB2431839B (en) * 2005-10-28 2010-05-19 Sony Uk Ltd Audio processing
US7949131B2 (en) * 2005-12-19 2011-05-24 Sigmatel, Inc. Digital security system
US8332216B2 (en) * 2006-01-12 2012-12-11 Stmicroelectronics Asia Pacific Pte., Ltd. System and method for low power stereo perceptual audio coding using adaptive masking threshold
FR2898458B1 (en) * 2006-03-10 2008-05-16 Medialive METHOD FOR THE SECURE DISTRIBUTION OF AUDIOVISUAL SEQUENCES, DECODER AND SYSTEM FOR IMPLEMENTING SAID METHOD
RU2417514C2 (en) 2006-04-27 2011-04-27 Долби Лэборетериз Лайсенсинг Корпорейшн Sound amplification control based on particular volume of acoustic event detection
DE102006022346B4 (en) * 2006-05-12 2008-02-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Information signal coding
US7940833B2 (en) * 2006-06-21 2011-05-10 Telefonaktiebolaget Lm Ericsson (Publ) Transmitter with intelligent preconditioning of amplifier signal
JP5011849B2 (en) * 2006-06-29 2012-08-29 大日本印刷株式会社 Information embedding device for sound signal and device for extracting information from sound signal
KR101393298B1 (en) * 2006-07-08 2014-05-12 삼성전자주식회사 Method and Apparatus for Adaptive Encoding/Decoding
EP2095560B1 (en) * 2006-10-11 2015-09-09 The Nielsen Company (US), LLC Methods and apparatus for embedding codes in compressed audio data streams
DK2082527T3 (en) * 2006-10-18 2015-07-20 Destiny Software Productions Inc Methods for watermarking media data
JP5306358B2 (en) * 2007-09-28 2013-10-02 ドルビー ラボラトリーズ ライセンシング コーポレイション Multimedia encoding and decoding with additional information capabilities
WO2009086174A1 (en) 2007-12-21 2009-07-09 Srs Labs, Inc. System for adjusting perceived loudness of audio signals
US20090285402A1 (en) * 2008-05-16 2009-11-19 Stuart Owen Goldman Service induced privacy with synchronized noise insertion
CN101588341B (en) * 2008-05-22 2012-07-04 华为技术有限公司 Lost frame hiding method and device thereof
US9667365B2 (en) * 2008-10-24 2017-05-30 The Nielsen Company (Us), Llc Methods and apparatus to perform audio watermarking and watermark detection and extraction
US8359205B2 (en) 2008-10-24 2013-01-22 The Nielsen Company (Us), Llc Methods and apparatus to perform audio watermarking and watermark detection and extraction
US8508357B2 (en) * 2008-11-26 2013-08-13 The Nielsen Company (Us), Llc Methods and apparatus to encode and decode audio for shopper location and advertisement presentation tracking
US8515239B2 (en) * 2008-12-03 2013-08-20 D-Box Technologies Inc. Method and device for encoding vibro-kinetic data onto an LPCM audio stream over an HDMI link
TWI459375B (en) * 2009-01-28 2014-11-01 Fraunhofer Ges Forschung Audio encoder, audio decoder, digital storage medium comprising an encoded audio information, methods for encoding and decoding an audio signal and computer program
US8626516B2 (en) * 2009-02-09 2014-01-07 Broadcom Corporation Method and system for dynamic range control in an audio processing system
CA3094520A1 (en) 2009-05-01 2010-11-04 The Nielsen Company (Us), Llc Methods, apparatus and articles of manufacture to provide secondary content in association with primary broadcast media content
EP3352168B1 (en) * 2009-06-23 2020-09-16 VoiceAge Corporation Forward time-domain aliasing cancellation with application in weighted or original signal domain
FR2948484B1 (en) * 2009-07-23 2011-07-29 Parrot METHOD FOR FILTERING NON-STATIONARY SIDE NOISES FOR A MULTI-MICROPHONE AUDIO DEVICE, IN PARTICULAR A "HANDS-FREE" TELEPHONE DEVICE FOR A MOTOR VEHICLE
WO2011013983A2 (en) 2009-07-27 2011-02-03 Lg Electronics Inc. A method and an apparatus for processing an audio signal
US8538042B2 (en) 2009-08-11 2013-09-17 Dts Llc System for increasing perceived loudness of speakers
CN101651837B (en) * 2009-09-10 2011-03-02 北京航空航天大学 Reversible video frequency watermark method based on interframe forecast error histogram modification
US20110087494A1 (en) * 2009-10-09 2011-04-14 Samsung Electronics Co., Ltd. Apparatus and method of encoding audio signal by switching frequency domain transformation scheme and time domain transformation scheme
ES2888804T3 (en) * 2009-10-15 2022-01-07 Voiceage Corp Simultaneous noise shaping in the time domain and the frequency domain for TDAC transformations
US8638851B2 (en) * 2009-12-23 2014-01-28 Apple Inc. Joint bandwidth detection algorithm for real-time communication
US9093066B2 (en) 2010-01-13 2015-07-28 Voiceage Corporation Forward time-domain aliasing cancellation using linear-predictive filtering to cancel time reversed and zero input responses of adjacent frames
WO2012026039A1 (en) * 2010-08-27 2012-03-01 富士通株式会社 Digital watermark embedding device, digital watermark embedding method, computer program for digital watermark embedding, and digital watermark detection device
JP5724338B2 (en) * 2010-12-03 2015-05-27 ソニー株式会社 Encoding device, encoding method, decoding device, decoding method, and program
PL2485488T3 (en) 2011-02-02 2017-09-29 Nagravision S.A. A media decoder and a decoding method allowing for the media decoder to be traced
RU2648439C2 (en) 2011-02-15 2018-03-26 Аллерган, Инк. Pharmaceutical cream composition of oxymetazoline for treating symptoms of rosacea
JP5752324B2 (en) * 2011-07-07 2015-07-22 ニュアンス コミュニケーションズ, インコーポレイテッド Single channel suppression of impulsive interference in noisy speech signals.
EP2549400A1 (en) * 2011-07-22 2013-01-23 Thomson Licensing Method for protecting an unprotected sound effect program
CN103548079B (en) * 2011-08-03 2015-09-30 Nds有限公司 Audio frequency watermark
US9164724B2 (en) 2011-08-26 2015-10-20 Dts Llc Audio adjustment system
US8527264B2 (en) * 2012-01-09 2013-09-03 Dolby Laboratories Licensing Corporation Method and system for encoding audio data with adaptive low frequency compensation
US9312829B2 (en) 2012-04-12 2016-04-12 Dts Llc System for adjusting loudness of audio signals in real time
US8909517B2 (en) * 2012-08-03 2014-12-09 Palo Alto Research Center Incorporated Voice-coded in-band data for interactive calls
US9401153B2 (en) * 2012-10-15 2016-07-26 Digimarc Corporation Multi-mode audio recognition and auxiliary data encoding and decoding
US9305559B2 (en) 2012-10-15 2016-04-05 Digimarc Corporation Audio watermark encoding with reversing polarity and pairwise embedding
US9269363B2 (en) * 2012-11-02 2016-02-23 Dolby Laboratories Licensing Corporation Audio data hiding based on perceptual masking and detection based on code multiplexing
WO2014072260A2 (en) 2012-11-07 2014-05-15 Dolby International Ab Reduced complexity converter snr calculation
US9317872B2 (en) 2013-02-06 2016-04-19 Muzak Llc Encoding and decoding an audio watermark using key sequences comprising of more than two frequency components
KR101764726B1 (en) * 2013-02-20 2017-08-14 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Apparatus and method for generating an encoded signal or for decoding an encoded audio signal using a multioverlap portion
US9679053B2 (en) 2013-05-20 2017-06-13 The Nielsen Company (Us), Llc Detecting media watermarks in magnetic field data
US20150025894A1 (en) * 2013-07-16 2015-01-22 Electronics And Telecommunications Research Institute Method for encoding and decoding of multi channel audio signal, encoder and decoder
EP2830059A1 (en) 2013-07-22 2015-01-28 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Noise filling energy adjustment
US9711152B2 (en) 2013-07-31 2017-07-18 The Nielsen Company (Us), Llc Systems apparatus and methods for encoding/decoding persistent universal media codes to encoded audio
US20150039321A1 (en) 2013-07-31 2015-02-05 Arbitron Inc. Apparatus, System and Method for Reading Codes From Digital Audio on a Processing Device
JP5761318B2 (en) * 2013-11-29 2015-08-12 ヤマハ株式会社 Identification information superimposing device
US8918326B1 (en) 2013-12-05 2014-12-23 The Telos Alliance Feedback and simulation regarding detectability of a watermark message
US8768005B1 (en) * 2013-12-05 2014-07-01 The Telos Alliance Extracting a watermark signal from an output signal of a watermarking encoder
US9824694B2 (en) 2013-12-05 2017-11-21 Tls Corp. Data carriage in encoded and pre-encoded audio bitstreams
US8768714B1 (en) 2013-12-05 2014-07-01 The Telos Alliance Monitoring detectability of a watermark message
US8768710B1 (en) 2013-12-05 2014-07-01 The Telos Alliance Enhancing a watermark signal extracted from an output signal of a watermarking encoder
EP2930717A1 (en) * 2014-04-07 2015-10-14 Thomson Licensing Method and apparatus for determining in a 2nd screen device whether the presentation of watermarked audio content received via an acoustic path from a 1st screen device has been stopped
US20160171987A1 (en) * 2014-12-16 2016-06-16 Psyx Research, Inc. System and method for compressed audio enhancement
US9130685B1 (en) 2015-04-14 2015-09-08 Tls Corp. Optimizing parameters in deployed systems operating in delayed feedback real world environments
US10043527B1 (en) * 2015-07-17 2018-08-07 Digimarc Corporation Human auditory system modeling with masking energy adaptation
US9454343B1 (en) 2015-07-20 2016-09-27 Tls Corp. Creating spectral wells for inserting watermarks in audio signals
US10115404B2 (en) 2015-07-24 2018-10-30 Tls Corp. Redundancy in watermarking audio signals that have speech-like properties
US9626977B2 (en) 2015-07-24 2017-04-18 Tls Corp. Inserting watermarks into audio signals that have speech-like properties
WO2018026299A1 (en) * 2016-08-04 2018-02-08 Huawei Technologies Co., Ltd. Method and apparatus for data hiding in prediction parameters
EP3758385A4 (en) * 2018-02-23 2021-10-27 Evixar Inc. Content reproduction program, content reproduction method, and content reproduction system
US11763832B2 (en) * 2019-05-01 2023-09-19 Synaptics Incorporated Audio enhancement through supervised latent variable representation of target speech and noise
WO2023212753A1 (en) * 2022-05-02 2023-11-09 Mediatest Research Gmbh A method for embedding or decoding audio payload in audio content
CN115546652B (en) * 2022-11-29 2023-04-07 城云科技(中国)有限公司 Multi-temporal target detection model, and construction method, device and application thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1166224A (en) * 1995-10-04 1997-11-26 菲利浦电子有限公司 Marking a digitally encoded video and/or audio signal
CN1183693A (en) * 1996-10-28 1998-06-03 国际商业机器公司 Protecting images with image watermark
WO1999029114A1 (en) * 1997-12-03 1999-06-10 At & T Corp. Electronic watermarking in the compressed domain utilizing perceptual coding

Family Cites Families (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4636088A (en) 1984-05-21 1987-01-13 Therma-Wave, Inc. Method and apparatus for evaluating surface conditions of a sample
EP0200301A1 (en) 1985-03-01 1986-11-05 Therma-Wave Inc. Method and apparatus for evaluating surface and subsurface features in a semiconductor
NL8901032A (en) 1988-11-10 1990-06-01 Philips Nv CODER FOR INCLUDING ADDITIONAL INFORMATION IN A DIGITAL AUDIO SIGNAL WITH A PREFERRED FORMAT, A DECODER FOR DERIVING THIS ADDITIONAL INFORMATION FROM THIS DIGITAL SIGNAL, AN APPARATUS FOR RECORDING A DIGITAL SIGNAL ON A CODE OF RECORD. OBTAINED A RECORD CARRIER WITH THIS DEVICE.
FR2660131B1 (en) 1990-03-23 1992-06-19 France Etat DEVICE FOR TRANSMITTING DIGITAL DATA WITH AT LEAST TWO LEVELS OF PROTECTION, AND CORRESPONDING RECEPTION DEVICE.
US5121204A (en) * 1990-10-29 1992-06-09 General Electric Company Apparatus for scrambling side panel information of a wide aspect ratio image signal
US5254843A (en) 1991-08-07 1993-10-19 Hynes John E Securing magnetically encoded data using timing variations in encoded data
US5539812A (en) * 1992-07-29 1996-07-23 Kitchin; Dwight W. Method and apparatus for detecting an attempted three-way conference call on a remote telephone
DE4241068C2 (en) 1992-12-05 2003-11-13 Thomson Brandt Gmbh Method for transmitting, storing or decoding a digital additional signal in a digital audio signal
US5748763A (en) 1993-11-18 1998-05-05 Digimarc Corporation Image steganography system featuring perceptually adaptive and globally scalable signal embedding
US6122403A (en) 1995-07-27 2000-09-19 Digimarc Corporation Computer system linked by using information in data objects
US5748783A (en) 1995-05-08 1998-05-05 Digimarc Corporation Method and apparatus for robust information coding
US5404377A (en) 1994-04-08 1995-04-04 Moses; Donald W. Simultaneous transmission of data and audio signals by means of perceptual coding
US5774452A (en) 1995-03-14 1998-06-30 Aris Technologies, Inc. Apparatus and method for encoding and decoding information in audio signals
US6041295A (en) 1995-04-10 2000-03-21 Corporate Computer Systems Comparing CODEC input/output to adjust psycho-acoustic parameters
US5680462A (en) * 1995-08-07 1997-10-21 Sandia Corporation Information encoder/decoder using chaotic systems
EP0766468B1 (en) 1995-09-28 2006-05-03 Nec Corporation Method and system for inserting a spread spectrum watermark into multimedia data
WO1997022206A1 (en) 1995-12-11 1997-06-19 Philips Electronics N.V. Marking a video and/or audio signal
US5956430A (en) 1996-02-19 1999-09-21 Fuji Xerox Co., Ltd. Image information coding apparatus and method using code amount of a selected pixel block for changing coding parameter
US6035177A (en) 1996-02-26 2000-03-07 Donald W. Moses Simultaneous transmission of ancillary and audio signals by means of perceptual coding
US6584138B1 (en) 1996-03-07 2003-06-24 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Coding process for inserting an inaudible data signal into an audio signal, decoding process, coder and decoder
US5828416A (en) 1996-03-29 1998-10-27 Matsushita Electric Corporation Of America System and method for interfacing a transport decoder to a elementary stream video decorder
US5812976A (en) 1996-03-29 1998-09-22 Matsushita Electric Corporation Of America System and method for interfacing a transport decoder to a bitrate-constrained audio recorder
US6229924B1 (en) * 1996-05-16 2001-05-08 Digimarc Corporation Method and apparatus for watermarking video images
US6078664A (en) 1996-12-20 2000-06-20 Moskowitz; Scott A. Z-transform implementation of digital watermarks
US5889868A (en) 1996-07-02 1999-03-30 The Dice Company Optimization methods for the insertion, protection, and detection of digital watermarks in digitized data
JP3982836B2 (en) 1996-07-16 2007-09-26 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Method for detecting watermark information embedded in an information signal
US6061793A (en) 1996-08-30 2000-05-09 Regents Of The University Of Minnesota Method and apparatus for embedding data, including watermarks, in human perceptible sounds
US6031914A (en) 1996-08-30 2000-02-29 Regents Of The University Of Minnesota Method and apparatus for embedding data, including watermarks, in human perceptible images
US6069914A (en) 1996-09-19 2000-05-30 Nec Research Institute, Inc. Watermarking of image data using MPEG/JPEG coefficients
US5809139A (en) 1996-09-13 1998-09-15 Vivo Software, Inc. Watermarking method and apparatus for compressed digital video
US5915027A (en) 1996-11-05 1999-06-22 Nec Research Institute Digital watermarking
US5845251A (en) 1996-12-20 1998-12-01 U S West, Inc. Method, system and product for modifying the bandwidth of subband encoded audio data
BR9804764A (en) 1997-01-13 1999-08-17 Koninkl Philips Electronics Nv Processes and sets for embedding and decoding supplementary data into a video signal and encoded video signal with supplementary data embedded
JP3412117B2 (en) * 1997-04-25 2003-06-03 日本電信電話株式会社 Digital watermark creation method using coding parameter of quantization and readout method thereof
US5940135A (en) 1997-05-19 1999-08-17 Aris Technologies, Inc. Apparatus and method for encoding and decoding information in analog signals
US5960081A (en) 1997-06-05 1999-09-28 Cray Research, Inc. Embedding a digital signature in a video sequence
JP3662398B2 (en) * 1997-09-17 2005-06-22 パイオニア株式会社 Digital watermark superimposing device and digital watermark detecting device
US6037984A (en) 1997-12-24 2000-03-14 Sarnoff Corporation Method and apparatus for embedding a watermark into a digital image or image sequence
US6064748A (en) * 1998-01-16 2000-05-16 Hewlett-Packard Company Method and apparatus for embedding and retrieving additional data in an encoded data stream
US6145081A (en) * 1998-02-02 2000-11-07 Verance Corporation Method and apparatus for preventing removal of embedded information in cover signals
US6064764A (en) 1998-03-30 2000-05-16 Seiko Epson Corporation Fragile watermarks for detecting tampering in images
GB2340351B (en) 1998-07-29 2004-06-09 British Broadcasting Corp Data transmission
JP3354880B2 (en) * 1998-09-04 2002-12-09 日本電信電話株式会社 Information multiplexing method, information extraction method and apparatus
US6219634B1 (en) 1998-10-14 2001-04-17 Liquid Audio, Inc. Efficient watermark method and apparatus for digital signals
US6128736A (en) 1998-12-18 2000-10-03 Signafy, Inc. Method for inserting a watermark signal into data

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Swanson M D et al., "Multiresolution video watermarking using perceptual models and scene segmentation", Proceedings of the International Conference on Image Processing, Vol. 2, 1997 *
Painter T et al., "Perceptual coding of digital audio", Proceedings of the IEEE, Vol. 88, No. 4, 2000 *

Also Published As

Publication number Publication date
DE60114638D1 (en) 2005-12-08
DE60114638T2 (en) 2006-07-20
AU2001284910B2 (en) 2007-03-22
BR0113271A (en) 2003-09-23
AU8491001A (en) 2002-02-25
WO2002015587A3 (en) 2003-03-13
CN1672418A (en) 2005-09-21
HK1080243B (en) 2009-05-15
KR100898879B1 (en) 2009-05-25
EP1310099B1 (en) 2005-11-02
KR20030064381A (en) 2003-07-31
CA2418722A1 (en) 2002-02-21
BRPI0113271B1 (en) 2016-01-26
WO2002015587A2 (en) 2002-02-21
JP2004506947A (en) 2004-03-04
US7395211B2 (en) 2008-07-01
ATE308858T1 (en) 2005-11-15
EP1310099A2 (en) 2003-05-14
CA2418722C (en) 2012-02-07
HK1080243A1 (en) 2006-04-21
US20040024588A1 (en) 2004-02-05

Similar Documents

Publication Publication Date Title
CN100431355C (en) Modulating one or more parameters of an audio or video perceptual coding system in response to supplemental information
Lemma et al. A temporal domain audio watermarking technique
AU2001284910A1 (en) Modulating one or more parameters of an audio or video perceptual coding system in response to supplemental information
US6061793A (en) Method and apparatus for embedding data, including watermarks, in human perceptible sounds
US8306811B2 (en) Embedding data in audio and detecting embedded data in audio
US6345100B1 (en) Robust watermark method and apparatus for digital signals
CA2527011C (en) Audio encoding/decoding apparatus having watermark insertion/abstraction function and method using the same
Takahashi et al. Multiple watermarks for stereo audio signals using phase-modulation techniques
WO2000022605A1 (en) Efficient watermark method and apparatus for digital signals
Nishimura Audio watermarking based on subband amplitude modulation
Malik et al. Robust audio watermarking using frequency-selective spread spectrum
Singh et al. Audio watermarking based on quantization index modulation using combined perceptual masking
CA2993192A1 (en) Creating spectral wells for inserting watermarks in audio signals
Lei et al. Perception-based audio watermarking scheme in the compressed bitstream
He et al. A high capacity watermarking technique for stereo audio
Lin et al. Audio watermarking techniques
Trivedi et al. An algorithmic digital audio watermarking in perceptual domain using direct sequence spread spectrum
Cvejic et al. Audio watermarking: Requirements, algorithms, and benchmarking
Zhao et al. A spread spectrum audio watermarking system with high perceptual quality
Lai et al. Robust Audio Watermarking based on empirical mode decomposition and group differential relations
Vimal et al. Real Steganography in Non Voice Part of the Speech
Mitrea et al. Informed audio watermarking in the wavelet domain
Singh et al. Audio Watermarking Scheme in MDCT Domain
Lei et al. Digital Watermarking Techniques for AVS Audio
Thanuja et al. Schemes for evaluating signal processing properties of audio watermarking

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1080243; Country of ref document: HK)

C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code (Ref country code: HK; Ref legal event code: GR; Ref document number: 1080243; Country of ref document: HK)

CX01 Expiry of patent term

Granted publication date: 20081105