KR100685974B1

KR100685974B1 - Apparatus and method for watermark insertion/detection

Info

Publication number: KR100685974B1
Application number: KR1020050059696A
Authority: KR
Inventors: 오현오
Original assignee: 엘지전자 주식회사
Priority date: 2005-07-04
Filing date: 2005-07-04
Publication date: 2007-02-26
Also published as: KR20070004248A

Abstract

The present invention relates to an apparatus and method for inserting and detecting a watermark during audio or video encoding and decoding. In particular, watermark information is inserted during Huffman encoding of the audio or video signal and transmitted, and is extracted during Huffman decoding of the signal, so that the watermark can be inserted and detected effectively. An audio or video decoding apparatus having the watermark extraction function can decode the audio or video signal while simultaneously extracting the watermark information from a bitstream in which a watermark is embedded, and can also decode a conventional bitstream in which no watermark is embedded.

Watermark, bitstream, Huffman coding, codebook

Description

Apparatus and method for watermark insertion/detection

FIG. 1 is a schematic diagram of a general digital watermark insertion and detection system.

FIG. 2 is a schematic diagram of a digital watermark insertion and detection system according to the present invention.

FIG. 3 is a block diagram of an embodiment of an MP3 encoding apparatus including a watermark insertion unit according to the present invention.

FIG. 4 shows an example of a Huffman table used for MP3 encoding.

FIGS. 5A and 5B show an example, according to the present invention, in which the Huffman codebooks in the Huffman table of FIG. 4 are divided into two groups according to the watermark information to be inserted.

FIG. 6 is a block diagram of an embodiment of an MP3 decoding apparatus including a watermark extraction unit according to the present invention.

FIG. 7 is a block diagram of an embodiment of an AAC encoding apparatus including a watermark insertion unit according to the present invention.

FIG. 8 shows an example of a Huffman table used for AAC encoding.

FIGS. 9A and 9B show an example, according to the present invention, in which the Huffman codebooks in the Huffman table of FIG. 8 are divided into two groups according to the watermark information to be inserted.

FIG. 10 is a block diagram of an embodiment of an AAC decoding apparatus including a watermark extraction unit according to the present invention.

Explanation of reference numerals for the main parts of the drawings

210: audio encoder    230: audio decoder

211: watermark insertion unit    231: watermark extraction unit

310: subband filter bank    320: MDCT unit

330: distortion prevention unit    331: quantization unit

332: Huffman encoding and watermark insertion unit

333: bit allocation unit    334: side information encoding unit

340: FFT unit    350: psychoacoustic model unit

360: bitstream generator    710: psychoacoustic model unit

720: preprocessor    730: filter bank

740: TNS unit    750: intensity/coupling unit

760: prediction unit    770: M/S stereo processing unit

780: data restoration unit    781: compression ratio/distortion control unit

782: scale factor extraction unit    783: quantization unit

784: Huffman encoding and watermark insertion unit

790: bitstream generator

The present invention relates to digital watermarking, a type of data hiding technique, and more particularly to an apparatus and method for inserting and detecting watermark information during audio or video encoding and decoding.

Watermarking generally refers to hiding secret information, called a watermark, inside media such as video, images, audio, or text. The hidden watermark information can be extracted only by those who know of it, and ordinary users perceive the watermarked medium as identical to an ordinary medium.

Digital media are easier to access, deliver, edit, and store than analog media, and suffer no data degradation when distributed over broadcast or communication networks. These advantages have created a new problem, the protection of copyright, and digital watermarking is drawing attention as a means of protecting it.

Besides copyright protection, in which information identifying the owner is embedded, digital watermarking is used for copy protection, distribution tracking, broadcast monitoring, and the like by embedding control information. It may also be embedded in real-time media such as audio and video to carry information such as playback time control information, audio/video synchronization (lip-sync), content information, and lyrics.

The properties required of digital watermarking differ according to these purposes, but imperceptibility and robustness are fundamentally essential.

Imperceptibility means that people must not be able to distinguish, by sight or hearing, the original medium from the medium after the watermark is inserted; it is the most basic requirement of watermarking. Robustness means that the inserted watermark must survive even if the watermarked medium undergoes modifications such as filtering, compression, noise addition, or degradation during distribution and transmission. Watermarking for copyright protection and copy protection must in particular be robust against deliberate attacks that try to remove the watermark, whereas watermarking for forgery prevention may instead insert a watermark that is easily destroyed when the medium is altered or manipulated. Watermarking that hides auxiliary information such as playback time control, lip-sync, content information, or lyrics in the medium has comparatively low requirements for robustness against intentional attack or distortion.

FIG. 1 shows a general apparatus for inserting and extracting a digital watermark.

Referring to FIG. 1, the watermark insertion apparatus hides watermark data in the digital medium to be watermarked (for example, at least one of audio, video, image, and text) and transmits it. Depending on the algorithm, a secret or public key may additionally be used for security.

The watermark extraction apparatus basically extracts the watermark from the watermarked medium. Depending on the algorithm, the original medium may be required, and in some cases decoding is possible only with the secret key that was used when the watermark was inserted.

A system that does not need the original during watermark extraction is called blind watermarking.

Various watermark insertion methods exist, including the LSB (Least Significant Bit) coding method, the echo hiding method, and methods based on spread spectrum communication.

The LSB coding method inserts the desired information by modifying the least significant bits of quantized audio or video samples. It exploits the fact that changing the least significant bit of an audio or video signal has almost no effect on quality. Insertion and detection are simple and distortion is small, but the method is very vulnerable to signal processing such as lossy compression and filtering.
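The LSB idea described above can be illustrated with a minimal sketch. The function names `lsb_embed` and `lsb_extract` and the sample values are assumptions made for illustration only; they do not come from the patent.

```python
def lsb_embed(samples, bits):
    """Overwrite the least significant bit of each sample with one payload bit."""
    if len(bits) > len(samples):
        raise ValueError("payload is longer than the cover signal")
    marked = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def lsb_extract(samples, count):
    """Read back the lowest bit of the first `count` samples."""
    return [s & 1 for s in samples[:count]]


# Hypothetical quantized samples and payload.
cover = [1000, -37, 512, 255, 4, -2]
payload = [1, 0, 1, 1]
marked = lsb_embed(cover, payload)
recovered = lsb_extract(marked, len(payload))
```

Each marked sample differs from its original by at most one quantization step, which is why the change is hard to perceive. Any later requantization of the samples overwrites those bits, which is the fragility the text points out.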

The echo hiding method is applied mainly to audio signals and inserts an echo too small to be heard by the human ear. The audio signal is divided into segments at fixed time intervals, and an echo whose delay depends on the binary watermark information is inserted into each segment; the decoder recovers the binary information by detecting the echo delay in each segment. Because the added signal is not noise but the audio signal itself, with the same characteristics as the original, it is not perceived as distortion even if it is audible and can even be expected to enrich the timbre, which makes the method suitable for high-quality audio watermarking. However, because the watermark is detected through a cepstrum computation, decoding is computationally very expensive, and decoding fails if synchronization with the segment boundaries in the time domain is lost.

The spread-spectrum method is the most widely studied watermarking method for audio and video. The A/V signal is transformed into the frequency domain by a DCT (discrete cosine transform), a DFT (discrete Fourier transform), or the like, and the binary watermark information is spread with a PN (pseudo-noise) sequence and added to the transformed signal. The inserted watermark can be detected by a correlator using the strong autocorrelation property of the PN sequence, and it is resistant to interference and offers good secrecy. However, inserting the watermark with high energy to improve robustness degrades the quality of the original signal, insertion and detection are computationally very expensive, and robustness against compression coding is not complete.
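A minimal sketch of spread-spectrum embedding and correlation detection follows. It assumes the frequency-domain coefficients are already available as a plain list and embeds a single bit. The names `pn_sequence`, `ss_embed`, and `ss_detect`, the embedding strength `alpha`, and the synthetic host coefficients are all illustrative assumptions.

```python
import random


def pn_sequence(length, seed):
    """Pseudo-noise sequence of +1/-1 values; the seed acts as the shared key."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def ss_embed(coeffs, bit, pn, alpha):
    """Add the PN sequence, signed by the watermark bit, to the coefficients."""
    sign = 1 if bit else -1
    return [c + alpha * sign * p for c, p in zip(coeffs, pn)]


def ss_detect(coeffs, pn):
    """Correlate with the same PN sequence; the sign of the result gives the bit."""
    corr = sum(c * p for c, p in zip(coeffs, pn)) / len(pn)
    return 1 if corr > 0 else 0


# Synthetic stand-in for transform coefficients of a host signal.
host_rng = random.Random(1)
host = [host_rng.gauss(0.0, 5.0) for _ in range(4096)]
pn = pn_sequence(len(host), seed=2005)
marked_one = ss_embed(host, 1, pn, alpha=2.0)
marked_zero = ss_embed(host, 0, pn, alpha=2.0)
```

The host signal is nearly uncorrelated with the PN sequence, so the correlation is dominated by the `alpha` term. Raising `alpha` makes detection more reliable but adds more energy to every coefficient, which is the quality trade-off noted in the text.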

In summary, the prior art in audio and video watermarking generally inserts the watermark information into the original signal before it is compression-encoded. As a result, implementations are complex and computationally expensive, and the watermark is easily altered during compression.

The present invention has been made to solve these problems. An object of the present invention is to provide an apparatus and method that insert and detect watermark information during the compression encoding and decoding of a digital audio or video signal, so that the watermark information is easy to insert and detect and distortion of both the original signal and the inserted watermark is prevented.

Another object of the present invention is to provide an apparatus and method that insert and detect watermark information during Huffman encoding and decoding, so that the watermark is inserted and detected efficiently without introducing additional noise or distortion into the encoding and decoding of the digital audio or video signal.

To achieve these objects, the apparatus and method for watermark insertion/detection according to the present invention insert a watermark by modifying part of the Huffman encoding process and extract the inserted watermark by modifying part of the Huffman decoding process.

In particular, the watermarked audio or video bitstream can be decoded without distortion by a conventional audio or video decoder, which does not recognize whether a watermark has been inserted. Conversely, the decoding method and apparatus having the watermark extraction function described above can decode the audio signal while extracting the watermark information from a watermarked bitstream, and can also decode a conventional bitstream in which no watermark is inserted.

An apparatus for watermark insertion/detection according to the present invention that embodies these features comprises: an encoder that classifies the plurality of Huffman codebooks used in Huffman encoding into two or more groups, selects one of the groups according to the watermark information to be inserted, selects from the selected group a Huffman codebook to be used for Huffman encoding of the input signal, performs the Huffman encoding, and transmits the result; and a decoder that Huffman-decodes the transmitted signal using the Huffman codebook that was used to encode it and, at the same time, extracts the watermark information based on the group to which that Huffman codebook belongs.

The group and Huffman codebook information selected by the encoder is transmitted to the decoder as side information.
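The grouping scheme can be sketched as follows, assuming a toy set of six codebooks split into two groups by even and odd identifier. The codebook identifiers, their maximum values, the grouping rule, and the function names are assumptions chosen for illustration; they are not the tables or the grouping of FIGS. 4, 5A, 5B, 8, 9A, or 9B.

```python
# Hypothetical codebooks: identifier -> largest magnitude the codebook can represent.
CODEBOOK_MAX = {0: 1, 1: 1, 2: 3, 3: 3, 4: 7, 5: 7}

# Assumed grouping rule: even identifiers carry bit 0, odd identifiers carry bit 1.
GROUP_OF = {cb: cb % 2 for cb in CODEBOOK_MAX}


def choose_codebook(values, watermark_bit):
    """Pick the smallest adequate codebook from the group selected by the bit."""
    peak = max(abs(v) for v in values)
    candidates = [
        cb for cb, limit in CODEBOOK_MAX.items()
        if GROUP_OF[cb] == watermark_bit and limit >= peak
    ]
    if not candidates:
        raise ValueError("no codebook in the selected group can represent the values")
    return min(candidates, key=lambda cb: CODEBOOK_MAX[cb])


def extract_bit(codebook_id):
    """Decoder side: the group of the signalled codebook reveals the bit."""
    return GROUP_OF[codebook_id]


values = [0, 2, -3, 1]
cb_for_one = choose_codebook(values, 1)
cb_for_zero = choose_codebook(values, 0)
```

Because the codebook identifier already travels in the side information, a decoder that ignores `GROUP_OF` still decodes the values normally. In this toy setup both groups contain a codebook for every magnitude range, so either bit can always be embedded; real tables may not be so symmetric.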

The encoder may be an MP3 encoder that classifies the Huffman codebooks into a plurality of groups using at least one of the maximum value of each Huffman codebook, the statistical characteristics of each Huffman codebook, and the linbits value of the Huffman codebook.

The encoder may be an AAC encoder that classifies the Huffman codebooks into a plurality of groups based on the maximum absolute value of each Huffman codebook.

When Huffman-encoding an input signal that cannot be represented by a Huffman codebook, the AAC encoder does not insert the watermark through the selected Huffman codebook itself, but inserts it while performing exception (ESC) processing.

A method for watermark insertion/detection according to the present invention comprises: a transmitting step of classifying the plurality of Huffman codebooks used in Huffman encoding into two or more groups, selecting one of the groups according to the watermark information to be inserted, selecting from the selected group a Huffman codebook to be used for Huffman encoding of the input signal, performing the Huffman encoding, and transmitting the result; and a receiving step of Huffman-decoding the transmitted signal using the Huffman codebook that was used to encode it and, at the same time, extracting the watermark information based on the group to which that Huffman codebook belongs.

The group and Huffman codebook information selected in the transmitting step is transmitted as side information.

The transmitting step may perform MP3 encoding on the input signal and classify the Huffman codebooks into a plurality of groups using at least one of the maximum value of each Huffman codebook, the statistical characteristics of each Huffman codebook, and the linbits value of the Huffman codebook.

The transmitting step may perform AAC encoding and classify the Huffman codebooks into a plurality of groups based on the maximum absolute value of each Huffman codebook.

Other objects, features, and advantages of the present invention will become apparent from the detailed description of the embodiments with reference to the accompanying drawings.

Preferred embodiments of the present invention that can achieve the above objects are described below with reference to the accompanying drawings. The configuration and operation of the present invention shown in and described with reference to the drawings are presented as at least one embodiment, and they do not limit the technical idea of the present invention or its core configuration and operation.

The present invention relates to an apparatus and method that insert a watermark during compression encoding in a video or audio encoder that uses Huffman coding, and extract the inserted watermark during video or audio decoding. In particular, the present invention inserts watermark information into the bitstream during Huffman encoding and detects it during Huffman decoding.

FIG. 2 is a schematic diagram of an embodiment of encoding and decoding apparatuses to which the watermark insertion and extraction apparatus according to the present invention is applied. In this embodiment, watermark insertion and extraction are combined with a high-quality audio encoder and decoder.

Referring to FIG. 2, the encoder 210, which performs audio encoding and watermark insertion, simultaneously encodes the high-quality audio signal to be compressed and the watermark information to be inserted. The watermark insertion 211 is implemented by modifying part of the conventional Huffman encoding process.

The decoder 230, which performs audio decoding and watermark extraction, decodes the compressed bitstream to restore the high-quality audio signal and, at the same time, extracts the watermark inserted by the encoder 210. The watermark extraction 231 is implemented by modifying part of the Huffman decoding process in the decoder.

A conventional high-quality audio decoder without the watermark extraction apparatus can still decode the audio bitstream normally and obtain the output audio signal. Likewise, when a bitstream that has not been watermarked is input, the decoder 230 simply decodes the audio signal.

As embodiments, the present invention describes watermark insertion and detection based on MPEG Layer 3 (hereinafter MP3) encoding and decoding and on AAC (Advanced Audio Coding) encoding and decoding. That is, the encoder 210 containing the watermark insertion apparatus performs MP3 or AAC encoding, and the decoder 230 containing the watermark extraction apparatus performs MP3 or AAC decoding.

Like other high-quality audio coding techniques, MP3 and AAC use a psychoacoustic model based on the characteristics of human hearing to remove the perceptual redundancy of the audio signal, and combine it with conventional data compression to remove the statistical redundancy of the signal.

FIG. 3 is a block diagram of an embodiment of an MP3 encoding apparatus including a watermark insertion unit according to the present invention, in which the watermark is inserted during Huffman encoding.

Referring to FIG. 3, the apparatus comprises a subband filter bank 310, an MDCT (Modified Discrete Cosine Transform) unit 320, a distortion control unit 330, an FFT (Fast Fourier Transform) unit 340, a psychoacoustic model unit 350, and a bitstream generator 360.

The distortion control unit 330 is an iterative loop comprising a nonlinear quantization unit 331, a bit allocation unit 332, a Huffman encoding and watermark insertion unit 333, and a side information encoding unit 334.

The audio signal is first input to the subband filter bank 310 to remove statistical redundancy and, at the same time, to the FFT unit 340 to remove perceptual redundancy arising from the characteristics of human hearing.

The subband filter bank 310 consists of 32 equally spaced bands implemented as a weighted overlap-add filter bank, and converts the audio signal into subband samples. The subband samples, from which statistical redundancy has been removed, are output to the MDCT unit 320, where the MDCT is applied; this hybrid transform coding scheme increases the frequency resolution.

That is, to exploit the auditory characteristics efficiently, the signal must be divided into frequency components. For this reason the subband filter bank 310 and the MDCT unit 320 convert the input signal into MDCT coefficients with a frequency resolution of up to 576 lines.

The FFT unit 340 transforms the input audio signal into the frequency domain and outputs it to the psychoacoustic model unit 350. To remove perceptual redundancy arising from human hearing, the psychoacoustic model unit 350 derives from the FFT spectrum the masking threshold, which is the level below which noise is inaudible.

Masking is one of the important properties of sound perception: a loud sound hides a quieter sound that lies below a certain threshold, that is, one sound suppresses the perception of another. Frequency masking deals with the case where two sounds are present simultaneously. When a pure tone at one frequency masks sound at another frequency, the masked sound is audible only if its energy exceeds a certain threshold. This threshold is called the masking threshold, to distinguish it from the absolute threshold of hearing, which is the threshold at which a sound can be perceived at all.

The bit allocation unit 332 of the distortion control unit 330 allocates quantization bits to the MDCT coefficients based on the masking threshold of each frame and outputs the allocation to the quantization unit 331. A minimum number of bits is allocated to each scale factor band such that the quantization noise is masked by the audio signal.

Bit allocation, quantization, and Huffman encoding in the distortion control unit 330 are performed in an iterative loop. When the noise cannot be masked completely, the bit allocation unit 332 allocates bits per scale factor band so as to minimize the subjective noise. The MDCT coefficients, nonlinearly quantized by the quantization unit 331 according to the allocated bits, are input to the Huffman encoding and watermark insertion unit 333, where they are Huffman-encoded and the watermark is inserted, and are then output to the bitstream generator 360. The bitstream generator 360 assembles the Huffman-encoded, watermarked audio data together with the side information into a bitstream, which is then transmitted.

Huffman coding is a statistical compression method that assigns the number of bits used to represent each unit of information according to how often that unit occurs. Frequent units are represented with few bits and rare units with many bits, which reduces the total number of bits needed to represent the data. It is therefore a variable-length code, in which the code length depends on the unit of information, and it is widely used for audio and video compression. Because Huffman decoding restores the original information without error, Huffman coding is a lossless (noiseless) coding method.
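The principle can be illustrated with a small sketch that builds a Huffman code from symbol frequencies. MP3 and AAC use fixed, predefined codebooks rather than building a code per signal, so this only demonstrates the variable-length idea; the function names and the sample data are illustrative assumptions.

```python
import heapq
from collections import Counter


def build_huffman_code(symbols):
    """Return {symbol: bit string}, giving shorter codes to more frequent symbols."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(symbols, code):
    return "".join(code[s] for s in symbols)


def huffman_decode(bits, code):
    """Decode by accumulating bits until they match a codeword (prefix-free)."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Hypothetical quantized values in which 0 dominates.
data = [0, 0, 0, 0, 0, 1, 1, -1, 2]
code = build_huffman_code(data)
bits = huffman_encode(data, code)
restored = huffman_decode(bits, code)
```

For this data the most frequent value, 0, receives a one-bit codeword, and the decoded list is identical to the input, which is the lossless property the text describes.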

In MP3 and AAC encoding, Huffman coding is applied to the quantized signal. The encoder holds Huffman codebooks for this purpose, looks up the codebook index corresponding to each unit of information, and transmits the corresponding codeword.

First, the Huffman coding process of MP3 is described in detail.

Huffman coding in MP3 is performed inside an iterative loop. The iterative loop corresponding to the distortion control unit 330 consists of an outer loop, which allocates bits on the basis of the masking threshold for the input signal, and an inner loop, which performs nonlinear quantization of the MDCT coefficients according to the allocated bits and Huffman-codes the quantized MDCT coefficients.

The Huffman coding in the inner loop divides the quantized MDCT coefficients into three regions according to their magnitude and applies a different variable-length coding method to each: the rzero region, in which all MDCT coefficients are 0; the count1 region, in which only the values 0, 1, and -1 occur; and the big_value region, which holds the remaining, larger values.

The rzero region, that is, the run of all-zero MDCT coefficients extending from the highest-frequency spectral coefficient down to the first nonzero value, is not coded.

Next, in the count1 region, where only 0, 1, and -1 occur, the MDCT coefficients are grouped four at a time (quadruples) and one codeword from a Huffman table is transmitted per quadruple. The Huffman tables used here are, for example, the Huffman code table for quadruples (A) and the Huffman code table for quadruples (B) among the Huffman codes for Layer III. Whichever of tables A and B is better suited to the data may be selected for the count1 region, and the selected table is transmitted in the bit string as separate side information.

The big_value region, which contains values whose absolute value is greater than 1, is Huffman-coded two MDCT coefficients at a time (pairs). The big_value region is further divided into three subregions. For each subregion, the Huffman table that codes it in the fewest bits is selected from tables 0 to 31, for example Huffman code tables 0 to 31 among the Huffman codes for Layer III, and used for Huffman coding. The Huffman table selected for each subregion is likewise transmitted in the bit string as side information.
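The three-region split described above can be sketched as follows. This is a simplified illustration, assuming the coefficients are ordered from low to high frequency and that the zero run is trimmed in pairs; the function name `partition_regions` and the sample values are hypothetical, and the further split of big_value into three subregions is not shown.

```python
def partition_regions(coeffs):
    """Split quantized MDCT coefficients (low to high frequency) into
    big_value, count1 and rzero regions.
    Returns (big_end, count1_end) so that:
      big_value = coeffs[:big_end]
      count1    = coeffs[big_end:count1_end]
      rzero     = coeffs[count1_end:]
    """
    n = len(coeffs)
    # rzero: trailing run of zeros at the high-frequency end, taken in pairs.
    count1_end = n
    while count1_end >= 2 and coeffs[count1_end - 1] == 0 and coeffs[count1_end - 2] == 0:
        count1_end -= 2
    # count1: quadruples below rzero whose values are all 0, 1 or -1.
    big_end = count1_end
    while big_end >= 4 and all(abs(c) <= 1 for c in coeffs[big_end - 4:big_end]):
        big_end -= 4
    return big_end, count1_end

# Hypothetical spectrum: large values, then one +/-1 quadruple, then zeros.
spectrum = [7, -3, 2, 4, 5, -2, 1, 0, -1, 1, 0, 0, 0, 0]
big_end, count1_end = partition_regions(spectrum)
```

With this sample, the first six coefficients fall in big_value (coded as three pairs), the next four form one count1 quadruple, and the last four are rzero and are not coded.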

The maximum (absolute) value that can be represented differs among Huffman codebooks 0 to 31, which are used to code the MDCT coefficients of the big_value region. The codebook for each subregion must therefore be chosen from those that can represent the largest MDCT coefficient in that subregion.

FIG. 4 summarizes, for each Huffman codebook used in MP3, properties such as its maximum absolute value.

As FIG. 4 shows, the largest value that an MP3 Huffman codebook can represent is 15, as in codebooks 13 to 31. Quantized MDCT coefficients can exceed this value, however, and such values are handled as an exception. When a value of 15 or more occurs, the coefficient is Huffman-coded with the codeword for the maximum value 15, and the excess over 15 is expressed in linear PCM and appended after the codeword. The number of bits used for this linear PCM part is called linbits.

For example, to represent the value 18, codebook 17 (linbits = 2) is selected, the codeword for 15 is placed at the position of that MDCT coefficient, and 18 - 15 = 3 is written as the 2-bit binary number 11 and appended after the codeword. Codebooks 16 to 23 all use the same Huffman table (#16) and differ only in linbits. Likewise, codebooks 24 to 31 all use table #24 and differ only in linbits.
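The worked example with the value 18 can be reproduced by a short sketch. The function name `split_with_linbits` is an assumption made for illustration; it returns the value that would be looked up in the Huffman table and the linbits string appended after the codeword, not the Huffman codeword itself.

```python
def split_with_linbits(value, linbits):
    """Split a coefficient magnitude into (table value, appended linbits string).
    Magnitudes of 15 or more are coded as 15 plus an excess of `linbits` bits."""
    if value < 15:
        return value, ""
    excess = value - 15
    if excess >= (1 << linbits):
        raise ValueError("excess does not fit in the given number of linbits")
    bits = format(excess, f"0{linbits}b") if linbits else ""
    return 15, bits

# The example from the text: 18 coded with codebook 17 (linbits = 2).
table_value, appended = split_with_linbits(18, 2)
```

Choosing a codebook with too few linbits for the largest magnitude in a subregion raises an error in this sketch, which mirrors the requirement that the codebook must be able to represent the subregion's maximum.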

The Huffman encoding and watermark insertion unit 333 performs watermark encoding at the same time as Huffman coding. It receives the watermark signal as an additional input, performs watermarking together with Huffman coding, and outputs the result to the bit string generator 360.

In other words, the Huffman encoding and watermark insertion unit 333 embeds the desired binary watermark by modifying the step of the conventional Huffman coding process in which a Huffman codebook is selected for each region.

The principle is as follows.

As described above, Huffman coding in MP3 encoding divides the MDCT coefficients by magnitude into three regions, rzero, count1, and big_value, and applies a different lossless coding to each. For the count1 and big_value regions, one of the predefined Huffman codebooks is selected and used for Huffman coding. The codebook selected for each region is transmitted as side information in the MP3 bit string, and the MP3 decoder refers to it when Huffman-decoding the MDCT coefficients.

Among the Huffman codebooks available for each region (see FIG. 4), there are two or three codebooks with the same maximum absolute value. This allows the encoder to choose the codebook that best fits the statistics of the spectral distribution, that is, the one that codes the data in the fewest bits.

For example, if the maximum absolute value of the MDCT coefficients in a given subregion of the big_value region is 5, Huffman coding is tried with each of codebooks 7, 8, and 9, and the codebook that yields the fewest bits is selected. In fact, when the maximum absolute value is 5, any codebook with a larger maximum value may also be used: coding is equally possible with codebooks 10, 11, 12, or higher.

In the present invention, the Huffman codebooks that can represent the same data are divided into two groups. Watermarking is performed by first selecting a group according to the binary watermark bit to be embedded and then choosing a codebook from within that group. For example, for a subregion whose maximum absolute value is 5, the codebook is chosen from codebooks 9, 10, and 11 when the watermark bit '1' is to be embedded, and from codebooks 7, 8, and 12 when the watermark bit '0' is to be embedded.
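A minimal sketch of this group-then-codebook selection is given below. The two groups are the ones named in the example above; the per-codebook bit counts are invented for illustration, and `select_codebook` is a hypothetical helper rather than a function from any encoder.

```python
# Groups from the example in the text (maximum absolute value 5):
# watermark bit 0 -> codebooks 7, 8, 12; watermark bit 1 -> codebooks 9, 10, 11.
GROUP_FOR_BIT = {0: (7, 8, 12), 1: (9, 10, 11)}

def select_codebook(watermark_bit, bit_cost):
    """Pick the cheapest codebook inside the group that encodes `watermark_bit`.
    `bit_cost` maps codebook number -> bits needed to code the subregion."""
    candidates = GROUP_FOR_BIT[watermark_bit]
    return min(candidates, key=lambda cb: bit_cost[cb])

# Hypothetical trial-encoding results for one subregion (bits per codebook).
cost = {7: 41, 8: 38, 9: 36, 10: 44, 11: 43, 12: 45}
unconstrained = min(cost, key=cost.get)
chosen_for_0 = select_codebook(0, cost)
chosen_for_1 = select_codebook(1, cost)
```

With these made-up costs, embedding '1' costs nothing extra because the overall best codebook is already in that group, while embedding '0' costs two extra bits for the subregion.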

FIGS. 5a and 5b show an example in which the selectable Huffman codebooks are classified into two groups according to the binary watermark bit, '0' or '1', that the Huffman encoding and watermark insertion unit 333 according to the present invention is to embed.

This is one example of splitting the Huffman codebooks in two so that the choice follows the watermark bit '0' or '1'. Other combinations based on the same principle are also possible.

In the present invention, the Huffman codebooks may also be classified into three, four, or more groups according to the binary watermark information to be embedded. In that case as well, watermarking is performed by first selecting a group according to the binary watermark information and then choosing a codebook from within that group.

The classification in FIGS. 5a and 5b divides the codebooks so that their maximum values, their linbits, and their statistical characteristics are distributed evenly between binary '0' and '1'.

No watermark is embedded in regions that must use codebook 0 or codebook 1.

Codebooks 16 to 23 and 24 to 31, which share a Huffman table and are distinguished only by linbits, are likewise divided between FIG. 5a and FIG. 5b according to their linbits values.

Restricting the Huffman codebooks available in each region in order to embed the watermark can lower the compression efficiency of Huffman coding. Distributing the codebooks that share a Huffman table between FIGS. 5a and 5b, with their statistical characteristics and linbits sizes taken into account, keeps this loss to a minimum.

Watermark insertion according to the present invention can therefore be described as follows: when Huffman coding is performed during MP3 encoding, the Huffman codebook for each subregion and for the count1 region is selected from whichever of FIG. 5a and FIG. 5b corresponds to the binary watermark bit.

When the encoder 210 embeds watermark information during Huffman coding and transmits the result, the decoder 230 uses the transmitted boundaries of each region and the information on the Huffman table selected for each region to restore the quantized MDCT coefficients from the codewords in the received bit string. At the same time, it extracts the watermark information from the Huffman codebooks used for the restoration.

FIG. 6 is a block diagram of an embodiment of an MP3 decoder to which the watermark extraction apparatus according to the present invention is applied. It operates in the reverse order of the MP3 encoder of FIG. 3, to which the watermark insertion apparatus is applied.

Referring to FIG. 6, the decoder comprises a bit string demultiplexing and watermark extraction unit 610, a Huffman decoding unit 620, an inverse quantization unit 630, an IMDCT unit 640, and a subband filter bank 650.

The bit string demultiplexing and watermark extraction unit 610 extracts the binary watermark information during demultiplexing, the step in which the information needed for decoding is unpacked from the received bit string.

Extracting the watermark information in the bit string demultiplexing and watermark extraction unit 610 is very simple. From the transmitted side information, the unit reads the Huffman table selected for each region (table_select) and compares the codebook with FIGS. 5a and 5b to obtain the binary watermark bit '0' or '1'. For example, if the Huffman codebook for a subregion is number 10, it appears in FIG. 5b, so the watermark bit embedded at that position is determined to be '1'.
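The lookup described above amounts to a table from codebook number to watermark bit. The sketch below is a partial illustration: the mapping contains only the codebooks that the text assigns explicitly (7, 8, 12 to '0' and 9, 10, 11 to '1'), since the full contents of FIGS. 5a and 5b are not reproduced here, and `extract_watermark` is a hypothetical name.

```python
# Partial mapping taken from the examples in the text; the complete
# assignment is given by FIGS. 5a and 5b and is not reproduced here.
BIT_FOR_CODEBOOK = {7: 0, 8: 0, 12: 0, 9: 1, 10: 1, 11: 1}

def extract_watermark(table_selects):
    """Return the watermark bits implied by a sequence of table_select values.
    Codebooks absent from the mapping (e.g. 0 and 1, which carry no
    watermark) contribute no bit."""
    bits = []
    for codebook in table_selects:
        bit = BIT_FOR_CODEBOOK.get(codebook)
        if bit is not None:
            bits.append(bit)
    return bits

# Hypothetical table_select values read from the side information.
recovered = extract_watermark([10, 8, 0, 12, 9])
```

No audio data needs to be decoded for this step; the side information alone determines the bits.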

For two channels, the MP3 bit string of one frame (1152 time samples) contains 12 table_select values and 4 count1table_select values, so a binary watermark of up to 16 bits can be embedded per frame.
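The per-frame capacity can be checked with a few lines of arithmetic. The breakdown into 2 granules, 2 channels, and 3 big_value subregions reflects the usual MPEG-1 Layer III frame layout; the 44.1 kHz sampling rate used for the bits-per-second figure is an assumption chosen only for illustration.

```python
GRANULES_PER_FRAME = 2      # 1152 samples = 2 granules of 576 samples
CHANNELS = 2
BIG_VALUE_SUBREGIONS = 3    # one table_select per subregion
SAMPLES_PER_FRAME = 1152
SAMPLE_RATE_HZ = 44100      # assumed sampling rate, not stated in the text

table_select_fields = GRANULES_PER_FRAME * CHANNELS * BIG_VALUE_SUBREGIONS
count1table_select_fields = GRANULES_PER_FRAME * CHANNELS
max_bits_per_frame = table_select_fields + count1table_select_fields

frames_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME
max_bits_per_second = max_bits_per_frame * frames_per_second
```

This is an upper bound: subregions that must use codebook 0 or 1 carry no watermark bit, so the usable capacity of a given frame can be lower.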

The side information and the audio bit string separated by the bit string demultiplexing and watermark extraction unit 610 are output to the Huffman decoding unit 620.

The Huffman decoding unit 620 recovers the quantized MDCT coefficients from the Huffman-coded audio bit string and outputs them to the inverse quantization unit 630. The inverse quantization unit 630 dequantizes the quantized MDCT coefficients to obtain real-valued MDCT coefficients. These pass through the IMDCT unit 640 and the subband filter bank 650 to produce the output PCM signal in the time domain.

The decoded PCM audio signal is indistinguishable from one decoded from an ordinary MP3 bit string in which no watermark is embedded.

FIG. 7 is a block diagram of an embodiment of an AAC encoding apparatus to which the watermark insertion apparatus according to the present invention is applied. As in MP3 encoding, the watermark is embedded during Huffman coding.

Referring to FIG. 7, the encoder comprises a psychoacoustic model unit 710, a preprocessor 720, a subband filter bank 730, a TNS (Temporal Noise Shaping) unit 740, an intensity/coupling unit 750, a prediction unit 760, an M/S (Mid/Side) stereo processing unit 770, a bit allocation and quantization unit 780, and a bit string generator 790.

The bit allocation and quantization unit 780 has an iterative loop structure and consists of a compression rate/distortion control unit 781, a scale factor extraction unit 782, a quantization unit 783, and a Huffman encoding and watermark insertion unit 784.

An audio signal in PCM (Pulse Code Modulation) form is input to the psychoacoustic model unit 710 and the preprocessor 720. The preprocessor 720 varies the sampling frequency of the input audio signal according to the bit rate to be transmitted and outputs the result.

The psychoacoustic model unit 710 groups the input audio signal into suitable scale factor bands and, using the masking that arises from the interaction of the signals, calculates the masking threshold in each scale factor band. The output of the psychoacoustic model unit 710 is input to the filter bank 730, the TNS unit 740, the intensity/coupling unit 750, and the M/S stereo processing unit 770.

The filter bank 730 uses the signal from the preprocessor 720 and the masking threshold from the psychoacoustic model unit 710 to remove the statistical redundancy of the audio signal and outputs the result to the TNS unit 740. Specifically, the filter bank 730 applies the MDCT to generate MDCT coefficients with a frequency resolution of up to 1024 coefficients.

The MDCT coefficients generated by the filter bank 730 pass through the TNS unit 740, the intensity/coupling unit 750, the prediction unit 760, and the M/S stereo processing unit 770 and are input to the bit allocation and quantization unit 780. The encoder may use the TNS unit 740, the intensity/coupling unit 750, the prediction unit 760, and the M/S stereo processing unit 770 selectively.

The TNS unit 740 receives the outputs of the filter bank 730 and the psychoacoustic model unit 710 and controls the temporal shape of the quantization noise within each transform window. Noise shaping in the time domain is achieved by applying a filtering process to the frequency data.

The intensity/coupling unit 750 is a module for processing stereo signals more efficiently. It receives the outputs of the psychoacoustic model unit 710 and the TNS unit 740, codes the quantized information for a scale factor band of only one of the two channels, and transmits only relative magnitude information for the other channel. The encoder is not required to use the intensity/coupling unit 750 and may decide, taking various factors into account, whether to use it for each scale factor band.

The prediction unit 760 predicts the MDCT coefficient values of the current frame (that is, the output values of the intensity/coupling unit 750) from the MDCT coefficients of previous frames.

Quantizing and coding the difference between the predicted value and the actual frequency component reduces the number of bits generated. The prediction unit 760 may also be used selectively, frame by frame. Because predicting the next frequency coefficients increases complexity, the prediction unit 760 may be left unused. In some cases the prediction difference is likely to be larger than the original signal, so that prediction generates more bits than coding without prediction; in such cases the prediction unit 760 is not used.

The M/S stereo processing unit 770 is also a module for processing stereo signals more efficiently. It receives the outputs of the psychoacoustic model unit 710 and the prediction unit 760, converts the left and right channel signals into a sum signal and a difference signal, and processes those signals. The encoder is not required to use the M/S stereo processing unit 770 either and may decide, taking various factors into account, whether to use it for each scale factor band.
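The sum/difference conversion can be sketched in a few lines. The scaling by one half is an assumption made so that the transform inverts exactly with a plain sum and difference; the text only states that the two channels are converted into a sum signal and a difference signal, and the function names are hypothetical.

```python
def ms_encode(left, right):
    """Convert left/right samples into mid (sum) and side (difference) signals.
    The 1/2 scaling is an illustrative assumption."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Recover left/right samples from mid/side signals."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Hypothetical coefficients for one scale factor band.
left_in = [4.0, 2.0, -1.0]
right_in = [2.0, 2.0, 1.0]
mid, side = ms_encode(left_in, right_in)
left_out, right_out = ms_decode(mid, side)
```

When the two channels are similar, the side signal is close to zero and costs few bits, which is why the choice is made per scale factor band.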

The output of the M/S stereo processing unit 770 is input to the scale factor extraction unit 782 in the bit allocation and quantization unit 780.

Under the control of the compression rate/distortion control unit 781, the scale factor extraction unit 782 extracts the scale factors, normalizes the subband samples output from the M/S stereo processing unit 770, and outputs them to the quantization unit 783.

The quantization unit 783 quantizes the subband samples normalized by the scale factor extraction unit 782 and outputs them to the Huffman encoding and watermark insertion unit 784. The quantization unit 783 quantizes the subband samples of each band so that the quantization noise in that band stays below the masking threshold and is therefore inaudible to a human listener.

The signal quantized by the quantization unit 783 is Huffman-coded and watermarked in the Huffman encoding and watermark insertion unit 784 and then output to the compression rate/distortion control unit 781 and the bit string generator 790. The bit string generator 790 collects the information produced by the modules 720 to 780, forms a bit string, and transmits it.

The AAC encoding process of FIG. 7 thus has the same basic structure as MP3 encoding: a masking threshold is obtained in the frequency domain from the psychoacoustic model, and bits are allocated on that basis so that the quantization noise is inaudible. AAC differs markedly from MP3, however, in that its frequency transform (the filter bank) is a 2048-point MDCT with higher frequency resolution than MP3, it uses TNS (Temporal Noise Shaping) to shape the noise in the time domain, and it uses prediction to remove statistical redundancy between frames. The AAC encoding process also includes the preprocessor 720, which can vary the sampling frequency according to the bit rate to be transmitted.

Within this structure, AAC Huffman coding resembles MP3 in that it uses Huffman codebooks to code groups (n-tuples) of two or four MDCT coefficients. It differs in that the regions sharing a codebook are delimited in units called sections, and there is no division into regions such as rzero and count1. The AAC encoder divides the MDCT spectral coefficients into sections so that coefficients with similar statistical characteristics are grouped together, then selects the most suitable codebook for each section and Huffman-codes it. AAC uses 12 Huffman codebooks in total; properties such as the maximum value each codebook can represent are shown in FIG. 8.

In AAC as well, the largest value that can be represented by a Huffman table is 15, and larger values require escape coding (ESC). The escape method used in AAC differs from that of MP3 and works as follows.

To code a value of 16 or more, the codeword corresponding to 16 is first Huffman-coded, and an escape_sequence constructed as in Equation 1 below is appended after it.

[Equation 1] escape_sequence = <escape_prefix><escape_separator><escape_word>

Here, <escape_prefix> consists of N binary '1' bits, <escape_separator> is a single binary '0', and <escape_word> is the coefficient expressed as an unsigned integer of N+4 bits. N is therefore chosen as the smallest value for which 2^(N+4) can represent the MDCT coefficient to be placed in <escape_word>.
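The construction of Equation 1 can be sketched as follows. The sketch follows the description in this text, in which <escape_word> holds the coefficient itself in N+4 bits, rather than the bitstream syntax of any particular standard; the function names are hypothetical and bit strings are shown as Python strings of '0' and '1' for readability.

```python
def build_escape_sequence(value):
    """Build <escape_prefix><escape_separator><escape_word> for a value >= 16,
    using the smallest N such that the value fits in N+4 bits."""
    if value < 16:
        raise ValueError("escape coding is only used for values of 16 or more")
    n = value.bit_length() - 4          # smallest N with value < 2**(N+4)
    prefix = "1" * n                    # N ones
    separator = "0"
    word = format(value, f"0{n + 4}b")  # value as an (N+4)-bit unsigned integer
    return prefix + separator + word

def parse_escape_sequence(bits):
    """Return (value, N) from a bit string that starts with an escape_sequence."""
    n = bits.index("0")                 # number of leading ones
    word = bits[n + 1:n + 1 + n + 4]
    return int(word, 2), n

seq_16 = build_escape_sequence(16)
seq_100 = build_escape_sequence(100)
```

The decoder learns N by counting the leading ones, so no separate length field is needed.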

In MP3, the number of bits used for escape coding is fixed by the choice of Huffman codebook, whereas in AAC the number of bits is determined separately for each escaped coefficient.

The watermark insertion process in the Huffman encoding and watermark insertion unit 784 is similar to that in the MP3 encoder. The full set of AAC Huffman codebooks in FIG. 8 is divided into two groups, a group is selected first according to the binary watermark bit to be embedded, and the watermark is embedded by finding a suitable Huffman codebook within that group and coding with it.

In the Huffman codebooks of FIG. 8, apart from codebooks 0 and 11, there are two codebooks for each maximum absolute value, so they can be classified to correspond to binary '0' and '1'.
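The pairing can be illustrated with a small sketch. The maximum absolute values below are those commonly listed for AAC spectral codebooks 1 to 10 and are stated here as background, with FIG. 8 remaining the authoritative table. The assignment of the lower-numbered codebook of each pair to '0' and the higher-numbered one to '1' is purely hypothetical, as is the simplification of always using the smallest sufficient pair.

```python
from collections import defaultdict

# Largest absolute value per AAC spectral codebook (codebooks 0 and 11 excluded).
MAX_ABS = {1: 1, 2: 1, 3: 2, 4: 2, 5: 4, 6: 4, 7: 7, 8: 7, 9: 12, 10: 12}

def pair_by_max_abs(max_abs_table):
    """Group codebook numbers that share the same maximum absolute value."""
    pairs = defaultdict(list)
    for codebook in sorted(max_abs_table):
        pairs[max_abs_table[codebook]].append(codebook)
    return dict(pairs)

def codebook_for(section_max, watermark_bit, pairs):
    """Pick a codebook for a section: smallest sufficient pair, then the member
    indexed by the watermark bit (hypothetical assignment: lower = 0, higher = 1)."""
    limit = min(v for v in pairs if v >= section_max)
    return pairs[limit][watermark_bit]

PAIRS = pair_by_max_abs(MAX_ABS)
cb_bit0 = codebook_for(3, 0, PAIRS)
cb_bit1 = codebook_for(3, 1, PAIRS)
```

A practical encoder would compare bit costs across every codebook in the selected group, including those with a larger range, as described for MP3.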

FIGS. 9a and 9b show the AAC Huffman codebooks classified according to the binary watermark bits '0' and '1' used for watermark insertion in the AAC encoding process according to the present invention. No watermark is embedded when Huffman codebook 0 is used.

When codebook 11 is used, no watermark is embedded in the choice of codebook itself. Instead, the watermark is embedded during escape coding, as described in detail below.

In the escape_sequence that AAC uses to code values of 16 or more, the value of N is determined, when <escape_prefix> is set, by the watermark bit to be embedded. The conventional method chooses the smallest N for which 2^(N+4) is sufficient to represent the <escape_word> value. To embed watermark information according to the present invention, N is not simply the smallest such value: when the binary watermark bit to be embedded is '0', N is the smallest even number for which 2^(N+4) can represent <escape_word>, and when the bit is '1', N is the smallest odd number for which 2^(N+4) can represent <escape_word>.

In other words, the coefficient is coded so that the embedded binary watermark bit can be read from N, the number of bits in <escape_prefix>.
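The parity rule can be sketched as follows. As in the text, <escape_word> is assumed to hold the coefficient itself in N+4 bits, so a larger N only adds leading zeros; the function names are hypothetical and bit strings are shown as Python strings for readability.

```python
def choose_n(value, watermark_bit):
    """Smallest N with value < 2**(N+4) whose parity equals the watermark bit
    (even N for '0', odd N for '1')."""
    n_min = max(value.bit_length() - 4, 0)
    return n_min if n_min % 2 == watermark_bit else n_min + 1

def embed_in_escape(value, watermark_bit):
    """Build an escape_sequence for `value` that also carries one watermark bit."""
    n = choose_n(value, watermark_bit)
    return "1" * n + "0" + format(value, f"0{n + 4}b")

def extract_from_escape(bits):
    """Return (value, watermark bit) from a bit string starting with an escape_sequence."""
    n = bits.index("0")                     # count the leading ones
    value = int(bits[n + 1:n + 1 + n + 4], 2)
    return value, n % 2

seq_bit1 = embed_in_escape(20, 1)   # minimal N = 1 is already odd
seq_bit0 = embed_in_escape(20, 0)   # N is raised to 2 to make it even
```

When the parity of the minimal N does not match the watermark bit, N grows by one and the sequence becomes two bits longer (one more prefix bit and one more word bit), which is the cost of embedding in this sketch.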

In the AAC encoder, the number of sections, which determines how many Huffman codebooks are used, is not fixed. The number of escape_sequences used also varies with the characteristics of the input audio signal. The number of watermark bits that can be embedded therefore differs from frame to frame.

FIG. 10 is a block diagram of an embodiment of an AAC decoder to which the watermark extraction apparatus according to the present invention is applied. It operates in the reverse order of the AAC encoder of FIG. 7, to which the watermark insertion apparatus is applied.

Referring to FIG. 10, the decoder comprises a bitstream demultiplexing and watermark extraction unit 910, a Huffman decoding and watermark extraction unit 920, an inverse quantization unit 930, an M/S stereo processing unit 940, a prediction unit 950, an intensity/coupling unit 960, a TNS unit 970, a filter bank 980, and an AAC gain control unit 990.

That is, when an AAC bitstream is input, the information needed for decoding is extracted through bitstream demultiplexing, and the PCM audio signal is then decoded through Huffman decoding, inverse quantization, M/S stereo reconstruction, prediction, intensity/coupling stereo reconstruction, TNS filtering, and the filter bank. This process is identical to the conventional AAC decoding process.

When a bitstream with an inserted watermark enters the bitstream demultiplexing and watermark extraction unit 910, the watermark can first be extracted, during bitstream demultiplexing, from the transmitted per-section Huffman codebook information, with reference to FIGS. 9A and 9B. That is, for a signal that can be represented by a Huffman codebook, the watermark information can be extracted from the Huffman codebook used for Huffman coding and the group to which that codebook belongs. For a signal that cannot be represented by a Huffman codebook, the watermark information can be extracted from N, the number of <escape_prefix> bits used in the escape processing.

Accordingly, the watermark information extracted by the bitstream demultiplexing and watermark extraction unit 910 is output to the Huffman decoding and watermark extraction unit 920. The watermark inserted in the escape_sequence is then extracted during Huffman decoding by decoding N, the number of <escape_prefix> bits. Watermark information is thus decoded at two points in the AAC decoding process, and the actual processing is very simple.
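The two extraction points described above can be sketched in Python as follows. The actual codebook grouping is defined in FIGS. 9A and 9B, which are not reproduced here, so the rule in `codebook_group` (odd-numbered codebooks 1 to 9 form group '0', even-numbered codebooks 2 to 10 form group '1') is purely a hypothetical stand-in. The function names and the list-based inputs are likewise assumptions for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def codebook_group(codebook: int) -> Optional[int]:
    """Hypothetical grouping: return the watermark bit implied by a codebook.

    Codebook 0 and codebook 11 carry no bit through codebook selection,
    so None is returned for them (and for any other index).
    """
    if 1 <= codebook <= 10:
        return (codebook - 1) % 2  # odd-numbered -> 0, even-numbered -> 1
    return None


def extract_watermark(
    section_codebooks: Sequence[int],
    escape_prefix_lengths: Sequence[int],
) -> Tuple[List[int], List[int]]:
    """Return (bits from codebook groups, bits from escape_prefix lengths)."""
    # First extraction point: per-section codebook information read while
    # demultiplexing the bitstream.
    first = []
    for cb in section_codebooks:
        bit = codebook_group(cb)
        if bit is not None:
            first.append(bit)
    # Second extraction point: the parity of N for each escape_sequence met
    # during Huffman decoding (even N -> '0', odd N -> '1').
    second = [n % 2 for n in escape_prefix_lengths]
    return first, second


if __name__ == "__main__":
    print(extract_watermark([1, 2, 0, 11, 6], [1, 2, 3]))
    # ([0, 1, 1], [1, 0, 1])
```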

Meanwhile, a bitstream with a watermark inserted according to the present invention can be decoded by a conventional AAC decoder, and the decoded PCM audio signal is indistinguishable from that obtained when no watermark is inserted. In addition, the AAC decoder according to the present invention can decode the audio signal from a bitstream without a watermark in the same way as the conventional method.

The embodiments of the present invention have been described with reference to MP3 and AAC encoding among high-quality audio encoding methods. However, the same principle can be applied to insert a watermark in other audio and video encoding methods that Huffman-encode the data to be transmitted using codebooks.

Meanwhile, the terms used in the present invention are defined in consideration of their functions in the present invention and may vary according to the intention or practice of those working in the field. Their definitions should therefore be based on the content of the present invention as a whole.

The present invention is not limited to the embodiments described above. As can be seen from the appended claims, it can be modified by those of ordinary skill in the field to which the invention pertains, and such modifications fall within the scope of the present invention.

The effects of the apparatus and method for watermark insertion/detection according to the present invention described above are as follows.

First, by inserting and detecting watermark information in the Huffman encoding and decoding processes, a watermark can be inserted and detected without affecting the quality of the original signal. Information separate from the original signal can therefore be transmitted through a compressed audio or video bitstream that remains compatible with conventional systems.

Second, watermark information inserted and transmitted in this way can be used for copyright protection of the content, as control information for controlling access such as decoding, copying, and playback, as identification information for monitoring, as synchronization information between audio and video signals (LipSync), and for transmitting additional information such as song titles, lyrics, and subtitles. It can also be used to determine whether the bitstream was damaged during channel transmission or whether it has been forged. Furthermore, if the watermark extraction method is disclosed only to specific persons, the watermark can be used for secret communication.

Third, a bitstream with an inserted watermark can be decoded into an undistorted signal by a conventional decoder. Because the conventional decoder does not recognize whether a watermark has been inserted, compatibility is maintained. That is, a separate information transmission channel is obtained while maintaining compatibility with conventional decoders, and this channel can be used in a variety of applications.

Fourth, the watermark insertion and extraction processes can be implemented by adding only a negligible amount of computation to conventional audio and video encoding. The watermark insertion and extraction method according to the present invention therefore requires very little computation and is easy to implement.

From the above description, those skilled in the art will appreciate that various changes and modifications can be made without departing from the technical spirit of the present invention.

Therefore, the technical scope of the present invention should not be limited to the content described in the embodiments, but should be determined by the claims.

Claims

An apparatus for watermark insertion/detection, comprising:

an encoder that classifies a plurality of Huffman codebooks used for Huffman coding into a plurality of groups, selects one of the classified groups according to the watermark information to be inserted, selects from the selected group a Huffman codebook to be used for Huffman coding of an input signal, and performs Huffman encoding and transmits the result; and

a decoder that performs Huffman decoding of the transmitted signal using the Huffman codebook that was used for Huffman coding of the signal transmitted from the encoder, and extracts the watermark information based on the group to which that Huffman codebook belongs.

The apparatus of claim 1,

wherein the group and Huffman codebook information selected by the encoder is transmitted to the decoder as side information.

The apparatus of claim 1,

wherein the encoder is an MP3 encoder and classifies the Huffman codebooks into a plurality of groups using at least one of the maximum value of each Huffman codebook, the statistical characteristics of each Huffman codebook, and the linbits value of each Huffman codebook.

The apparatus of claim 3,

wherein the MP3 encoder does not insert a watermark when it Huffman-encodes the input signal using Huffman codebook 0 or 1.

The apparatus of claim 1,

wherein the encoder is an AAC encoder and classifies the Huffman codebooks into a plurality of groups based on the maximum absolute value of each Huffman codebook.

The apparatus of claim 5,

wherein the AAC encoder does not insert a watermark when it Huffman-encodes the input signal using Huffman codebook 0.

The apparatus of claim 5,

wherein, when the AAC encoder Huffman-encodes the input signal using Huffman codebook 11, it does not insert a watermark through the selection of the Huffman codebook itself, but inserts the watermark in the course of escape (ESC) processing.

The apparatus of claim 7,

wherein, for an input signal whose magnitude cannot be represented by the Huffman codebook, the AAC encoder appends an escape_sequence (= <escape_prefix> <escape_separator> <escape_word>) to the codeword corresponding to the maximum absolute value, and determines N, the number of <escape_prefix> bits, according to the watermark information.

Here, <escape_prefix> is N binary '1' bits, <escape_separator> is a binary '0', and <escape_word> is the coefficient expressed as an unsigned integer of N + 4 bits.

The apparatus of claim 8,

wherein, if the binary watermark bit to be inserted is '0', N is determined as the smallest even number for which 2^(N+4) can represent <escape_word>, and if the bit is '1', N is determined as the smallest odd number for which 2^(N+4) can represent <escape_word>.

The apparatus of claim 8,

wherein the decoder extracts the watermark information from the group to which the Huffman codebook used for Huffman coding of the signal transmitted from the encoder belongs, and from N, the number of <escape_prefix> bits.

A method for watermark insertion/detection, comprising:

a transmitting step of classifying a plurality of Huffman codebooks used for Huffman coding into a plurality of groups, selecting one of the classified groups according to the watermark information to be inserted, selecting from the selected group a Huffman codebook to be used for Huffman coding of an input signal, and performing Huffman encoding and transmitting the result; and

a receiving step of performing Huffman decoding of the transmitted signal using the Huffman codebook that was used for Huffman coding of the transmitted signal, and extracting the watermark information based on the group to which that Huffman codebook belongs.

The method of claim 11,

wherein the group and Huffman codebook information selected in the transmitting step is transmitted as side information.

The method of claim 11,

wherein the transmitting step performs MP3 encoding on the input signal and classifies the Huffman codebooks into a plurality of groups using at least one of the maximum value of each Huffman codebook, the statistical characteristics of each Huffman codebook, and the linbits value of each Huffman codebook.

The method of claim 11,

wherein the transmitting step performs AAC encoding and classifies the Huffman codebooks into a plurality of groups based on the maximum absolute value of each Huffman codebook.

The method of claim 14,

wherein, in the transmitting step, when the input signal is Huffman-encoded using Huffman codebook 11, the watermark is not inserted through the selection of the Huffman codebook itself, but is inserted in the course of escape (ESC) processing.

The method of claim 15,

wherein, in the transmitting step, an input signal whose magnitude cannot be represented by the Huffman codebook is Huffman-encoded by appending an escape_sequence (= <escape_prefix> <escape_separator> <escape_word>) to the codeword corresponding to the largest representable absolute value, and N, the number of <escape_prefix> bits, is determined according to the watermark information.

The method of claim 16, wherein the receiving step comprises:

extracting first watermark information from the group to which the Huffman codebook used for Huffman coding of the signal transmitted in the transmitting step belongs; and

extracting second watermark information from N, the number of <escape_prefix> bits.