WO2007025476A1

WO2007025476A1 - Multimedia communication transport protection method

Info

Publication number: WO2007025476A1
Application number: PCT/CN2006/002232
Authority: WO
Inventors: Zhong Luo; Fuzheng Yang; Shuai Wan; Yilin Chang
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2005-09-02
Filing date: 2006-08-30
Publication date: 2007-03-08
Also published as: CN1893663A

Abstract

A transport protection method for multimedia video or still-image communication in the multimedia communication field improves the quality of multimedia communication services without increasing the burden on the communication system or network. Key data is hidden, using digital watermarking, in the encoded non-key data of the compressed images. The key data backup protected by the digital watermark is used to check whether the normally transmitted key data is correct, so that transmission errors can be detected. Blocks at the same position in preceding and following block groups back each other up cyclically. The motion vector is encoded and then embedded as a digital watermark into relatively unimportant DCT coefficients, while ensuring that the video data is affected as little as possible. When an error occurs, if the key data backup is correct, it directly replaces the original key data; otherwise, the average of the key data of neighboring blocks in the same frame is used as an approximate replacement.

Description

Transmission protection method for multimedia communication

Technical field

The present invention relates to a multimedia communication method, and more particularly to a transmission protection method for multimedia communication.

Background technique

Multimedia communication, especially digital video technology, is widely used in communications, computing, radio, and television. It has brought a series of applications such as conference TV, videophone, digital TV, and media storage, which in turn has led to the emergence of many video coding standards. The International Telecommunication Union Telecommunication Standardization Sector ("ITU-T") and the Moving Picture Experts Group ("MPEG") of the International Organization for Standardization ("ISO") and the International Electrotechnical Commission ("IEC") are the two organizations that develop video coding standards. The ITU-T standards include H.261, H.263, H.263+, H.263++, H.264, and other video compression coding standards, used mainly in real-time video communication such as conference television. The MPEG series of standards, such as MPEG-1, MPEG-2, and MPEG-4, are used mainly in video storage, broadcast TV, and streaming media over the Internet or wireless networks. The two organizations have also developed a number of standards jointly: the H.262 standard is equivalent to the MPEG-2 video coding standard, and the latest H.264 standard is included as Part 10 of MPEG-4.

H.261 was developed by ITU-T for two-way audio and video services (video telephony, video conferencing) on the Integrated Services Digital Network (ISDN) at rates that are multiples of 64 kb/s. Each frame in H.261 is divided into an image frame layer, a Group of Blocks ("GOB") layer, a Macro Block ("MB") layer, and a Block layer. H.261 is the earliest moving-image compression standard. It specifies the various parts of video coding, including motion-compensated interframe prediction, the discrete cosine transform ("DCT"), quantization, entropy coding, and rate control adapted to a fixed-rate channel. H.263, based on H.261, was the earliest ITU-T standard for low-bit-rate video coding. The later second edition (H.263+) and H.263++ added many options, giving the standard a wider range of applicability.

The unrestricted motion vector mode of H.263 allows motion vectors to point to areas outside the image. When the reference macroblock pointed to by a motion vector lies outside the coded image, it is replaced by the pixel values at the image edge, which yields a large coding gain. The advanced prediction mode allows each of the four 8×8 luma blocks in a macroblock to have its own motion vector, thereby improving the prediction accuracy; the motion vector of the two chroma blocks is the average of the four luma block motion vectors. For compensation, overlapped block motion compensation is used: the compensated value of each pixel of an 8×8 luminance block is obtained as a weighted average of 3 prediction values. This mode can produce significant coding gains, and overlapped block motion compensation in particular reduces blockiness and improves subjective quality.

At present, H.261 and H.263 are widely used in video communication, and there are many mature products. Compared with H.261, H.263 adds several options, provides a more flexible coding method, greatly improves compression efficiency, and is more suitable for network transmission. The introduction of the H.264 standard is an important advance in video coding. Compared with the existing MPEG-2, MPEG-4, and H.263, it has obvious advantages, especially in coding efficiency, which make it usable in many new areas. Although the algorithmic complexity of H.264 is more than four times that of the existing compression coding standards, the rapid development of integrated circuit technology will make the application of H.264 a reality.

In video communications, key data protection and error concealment are very important means of guaranteeing end-to-end Quality of Service (QoS). Networks, especially the Internet or other IP or packet-switched networks without QoS guarantees, and wireless networks, often lose packets for various reasons. Part of the compressed video data is then lost, and the receiver cannot decode correctly. Because portions of the compressed video stream may depend on one another, lost data affects not only the correct decoding of the information it contains but also the correct decoding of other information that depends on it. Therefore, the necessary error concealment must be performed to ensure correct decoding. Error concealment means approximately replacing missing information with information that has been correctly received and correctly decoded, or extrapolating the missing information from it.

Digital media and the Internet have brought great convenience to people's lives. Digital media is easy to access, copy, transfer, and edit, but this also leads to violations of digital media copyright and tampering with digital media content. The popularity of the network has made the exchange and transmission of digital media a relatively simple process, and the sharing of information has reached a new level, but at the same time the chances of information being exposed and attacked have greatly increased. This gave rise to digital watermarking technology, which was first used for digital media copyright protection.

The digital watermarking technology embeds a series of meaningful or meaningless information in the original media data, so that the watermark information embedded in the original media data always coexists with the original media data, thereby protecting the original media data copyright and content integrity. With the development of technology, in addition to copyright protection, digital watermarking technology has important applications in many other places.

Figure 1 shows the block diagram of digital watermarking. The host media $I_0$ in the figure is generally original or compressed multimedia data such as video or audio, and the data to be hidden, $b_0$, is small relative to $I_0$. The difference between the watermarked media $I_1$ and $I_0$ is the distortion produced by embedding the watermark, which is generally required to be difficult for humans to perceive. After certain processing, such as data compression, noise pollution, and intentional attacks on the watermark, $I_1$ becomes the media $I_2$. These processes can all be regarded uniformly as noise. Therefore, the watermark $b_1$ extracted from $I_2$ may be somewhat distorted relative to the original watermark $b_0$; if $I_2$ is the same as $I_1$, the extracted watermark $b_1$ should also be the same as the original watermark $b_0$.

The general mathematical model for watermark embedding and extraction is as follows. Let $I_0$ and $I_1$ denote the original data and the data after embedding the watermark, respectively, let $I_2$ denote the data after processing or transmission, and let $b_0$ be the original watermark. The embedding process can be expressed as $I_1 = I_0 + f(I_0, b_0)$, where $f(\cdot, \cdot)$ represents the watermark embedding algorithm. The detection process can be expressed as a hypothesis test: if hypothesis $H_0: I_2 - I_0 = N$ holds, there is no watermark; if hypothesis $H_1: I_2 - I_0 = b_0 + N$ holds, there is a watermark. Here $N$ is noise, caused for example by data compression, noise pollution, or intentional attacks on the watermark. Because the watermarked data is distorted to some degree by such processing, the watermark detected from the processed data may differ somewhat from the original watermark.

Watermark detection is generally implemented with classical signal detection techniques. Signal detection studies how to determine whether a target signal is present in noise, for example whether a radar echo contains a reflection from a target, and, if so, how to use statistical principles to extract the signal optimally. Whether a signal is present in the noise is determined with a statistical hypothesis test. In watermark detection, two hypotheses are stated first; then, according to the result of the test, it is determined which hypothesis holds, and thus whether a watermark is present.
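The hypothesis test above can be illustrated with a minimal sketch. It assumes the simplest additive embedding, in which the watermark is added directly to the original data, a watermark with entries of +1 or -1, and a correlation detector with a fixed threshold. The function names, the threshold, and the noise level are illustrative assumptions and are not taken from the source.

```python
import random

def embed(i0, b0):
    # Assumed embedding function: f(I0, b0) = b0, so I1 = I0 + b0.
    return [x + b for x, b in zip(i0, b0)]

def watermark_present(i2, i0, b0, threshold=0.5):
    # The difference I2 - I0 is N under H0 and b0 + N under H1.
    diff = [a - b for a, b in zip(i2, i0)]
    # Normalised correlation with b0: near 0 under H0, near 1 under H1
    # when b0 has entries of +1/-1 and the noise is zero-mean.
    corr = sum(d * b for d, b in zip(diff, b0)) / len(b0)
    return corr > threshold

rng = random.Random(0)
n = 1000
i0 = [rng.uniform(0, 255) for _ in range(n)]          # original data
b0 = [rng.choice((-1, 1)) for _ in range(n)]          # original watermark
i2_marked = [x + rng.gauss(0, 0.5) for x in embed(i0, b0)]  # watermark + noise
i2_plain = [x + rng.gauss(0, 0.5) for x in i0]              # noise only
print(watermark_present(i2_marked, i0, b0), watermark_present(i2_plain, i0, b0))
```

A correlation detector is only one common way to decide between the two hypotheses; the source does not prescribe a particular test statistic.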

Many techniques for protecting key video data currently exist. They fall roughly into the following categories:

Unequal Error Protection ("UEP") applies a variety of active measures against packet loss and bit errors, such as Forward Error Correction ("FEC") and erasure codes, to the key data in the code stream, so that the key data is protected differently from ordinary data;

Using a custom section in the communication protocol to back up key information. This method varies with the specific protocol. For example, for the H.263/H.263+ international standards, the Picture Enhancement Information ("PEI") field can be used for key data backup;

Data Partition refers to the use of a separate stream for key data.

Many error concealment techniques also exist. They fall roughly into the following categories:

The time-domain concealment method uses the information of adjacent frames on the time axis to estimate the missing data. The estimate may be obtained by simply using the data at the same position in the adjacent frame in place of the missing data, or, taking motion into account, by performing motion prediction from the adjacent frame data. There are also more complicated concealment strategies, but their computational cost is very large;

The spatial-domain concealment method uses the spatially adjacent area of the lost data area to perform error concealment. Typical methods are: simple replacement by a neighboring region; data fusion, in which multiple spatially neighboring regions are used to estimate the missing data, for example by spatial interpolation; and algebraic inversion, in which the packet loss process is modeled by a linear model whose input is the data before packet loss and whose output is the correctly received data, the input is recovered from the output by an algebraic inversion method such as least squares, and the inversion result is used in place of the erroneous data. The last method is computationally intensive;

The joint space-time concealment method combines spatial and temporal error concealment. For example, depending on the characteristics of the lost data and the condition of the temporally and spatially adjacent data, a strategy decides whether spatial-domain or time-domain concealment is better and then applies the better one; alternatively, spatial and temporal data are fused and used together for concealment.

In fact, the premise of error concealment is error detection and localization: an error can be concealed correctly only if it has been accurately detected and located. Existing error detection methods either use the characteristics of the video signal, or perform a syntax check on the video code stream; for example, the occurrence of an illegal Variable Length Code ("VLC") codeword, a motion vector pointing beyond the image boundary, or a recovered DCT coefficient that is out of range is taken as an error caused by a bit error. Error detection based on video signal characteristics rests on the assumption that the video signal is stationary, but this assumption usually does not hold in practical systems, so false detections often occur. The syntax check of the code stream cannot determine the error location accurately. The accuracy of error localization with these methods is therefore relatively low, typically 5% to 15%. Accurate error detection is thus the prerequisite and primary task of error concealment, especially on wireless channels.

In addition, there is currently no way to effectively combine key data protection with error concealment.

In practical applications, the above solutions have the following problems. In terms of key data protection, using UEP to protect key data requires additional overhead and increases the bit rate. Packet loss generally occurs because the network is congested and the available bandwidth has narrowed; protecting the data then increases the traffic further. This is the logical contradiction in the method, and its effect in use is therefore poor. The custom-section method in the communication protocol is ingenious, but it depends on the specific protocol and lacks generality. The data partitioning method is too complicated to be practical. Replacing or extrapolating the current frame's data directly from the key data of the previous frame suits only some key data and lacks generality.

Other error concealment methods can only temporarily mask the distortion caused by bit errors. The simple methods produce poor results, while the complicated methods require a large amount of computation and place high demands on the processing capability of the terminal. A more serious problem is that the accuracy of the existing error detection methods is too low, which directly limits the effect of error concealment.

The main reason for this situation is that standalone key data protection methods either require additional overhead and cannot solve the underlying network congestion problem, or are too complicated to implement, or lack generality; standalone error concealment methods do not improve video quality enough, consume resources, and have no error detection mechanism with high accuracy.

Summary of the invention

In view of this, the main object of the present invention is to provide a transmission protection method for multimedia communication, which improves the quality of multimedia communication services without increasing the burden on the communication system or the network.

To achieve the above object, the present invention provides a transmission protection method for multimedia communication, including the following steps:

A. At the transmitting end, using a digital watermark to protect a backup of the key data;

B. At the receiving end, extracting the digital watermark to obtain the key data backup, and using it to detect errors in the multimedia data;

C. Concealing the errors in the erroneous multimedia data.

The step A includes the following sub-steps:

Performing block processing on the multimedia data, performing backup encoding on the key data of the current block; embedding the backup code of the key data of the current block into the non-critical data encoding of the protection block corresponding to the current block by digital watermarking;

The protection block is different from but corresponding to the current block.

Further in the method, the step B comprises the following sub-steps:

Extracting a digital watermark from all of the protection blocks to obtain a key data backup of the corresponding protected block;

First, whether the current block is correct is judged according to the following first criterion: if the key data backup of the current block is consistent with the key data transmitted by the current block itself, or the key data backup of the protected block corresponding to the current block is consistent with the key data transmitted by that protected block itself, the current block is correct.

Secondly, for a data block that is not determined to be correct by the first criterion, whether the current block is erroneous is judged according to the following second criterion: if the key data backup of the current block is inconsistent with the key data transmitted by the current block itself and the protection block corresponding to the current block is correct, or the key data backup of the protected block corresponding to the current block is inconsistent with the key data transmitted by that protected block itself and the protected block is correct, the current block is erroneous.

In addition, in the method, the multimedia communication is transmitted by using a motion compensation coding method, and the step B further includes the following sub-steps:

For a data block that is not determined to be correct or erroneous by the first criterion and the second criterion, it is determined whether the current block is erroneous according to the following third criterion: If the reference code block of the current block is erroneous, the current block is erroneous.
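The three criteria can be sketched as a small classification routine. This is only an illustration under stated assumptions: the dictionary-based block representation, the function name `classify_blocks`, and the label `"unknown"` for blocks that no criterion decides are hypothetical and do not come from the source. Here `pair_ok[k]` records whether the key data transmitted by block `k` is consistent with its backup, `protection[k]` is the block carrying the backup of block `k`, and `reference[k]` is the block that block `k` is predicted from.

```python
def classify_blocks(pair_ok, protection, reference):
    """Return a dict mapping each block to 'correct', 'error' or 'unknown'."""
    # Inverse mapping: protected[p] is the block whose backup p carries.
    protected = {p: k for k, p in protection.items()}
    status = {}

    # First criterion: a consistent (key data, backup) pair vouches for both
    # the block that sent the key data and the block that carried the backup.
    for k in pair_ok:
        q = protected.get(k)
        if pair_ok[k] or (q is not None and pair_ok[q]):
            status[k] = "correct"

    # Second criterion: an inconsistent pair whose other member is known to
    # be correct means the current block is the erroneous one.
    for k in pair_ok:
        if k in status:
            continue
        p, q = protection.get(k), protected.get(k)
        own_pair_bad = not pair_ok[k] and status.get(p) == "correct"
        carried_pair_bad = (q is not None and not pair_ok[q]
                            and status.get(q) == "correct")
        if own_pair_bad or carried_pair_bad:
            status[k] = "error"

    # Third criterion: a block predicted from an erroneous block is erroneous.
    for k in pair_ok:
        if k not in status and status.get(reference.get(k)) == "error":
            status[k] = "error"

    return {k: status.get(k, "unknown") for k in pair_ok}


# Five blocks backed up cyclically; blocks 2, 3 and 4 were hit by errors.
pair_ok = {0: True, 1: False, 2: False, 3: False, 4: False}
protection = {0: 1, 1: 2, 2: 3, 3: 4, 4: 0}
reference = {1: 0, 2: 1, 3: 2, 4: 3}
result = classify_blocks(pair_ok, protection, reference)
```

In this example, blocks 0 and 1 pass the first criterion, blocks 2 and 4 are caught by the second, and block 3 is caught only by the third. The single pass over the third criterion does not follow chains of references; a fuller implementation might iterate until nothing changes.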

In addition, in the method, the step C includes the following sub-step: when the current block is erroneous and the corresponding protection block is correct, the key data backup of the current block is used as its key data for decoding.

In addition, in the method, the step C includes the following sub-step: when the current block is erroneous and the corresponding protection block is also erroneous, the average value of the key data of one or more data blocks adjacent to the current block is used as the key data of the current block for decoding.

Further in the method, the average value is one of:

The arithmetic mean, weighted average, geometric mean, harmonic mean, or median; or any one of these averages computed after removing the maximum and minimum values from the set of values being averaged.
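The listed averages, with and without removal of the extremes, can be sketched with Python's standard `statistics` module. The helper names `drop_extremes`, `average`, and `weighted_average` are illustrative assumptions. Note that the geometric mean is defined only for positive values and `statistics.harmonic_mean` rejects negative values, so for signed quantities such as motion vector components only the arithmetic mean, weighted average, and median apply directly.

```python
import statistics

AVERAGES = {
    "arithmetic": statistics.mean,
    "geometric": statistics.geometric_mean,  # positive values only
    "harmonic": statistics.harmonic_mean,    # non-negative values only
    "median": statistics.median,
}

def drop_extremes(values):
    """Remove one maximum and one minimum; needs at least three values."""
    ordered = sorted(values)
    return ordered[1:-1] if len(ordered) >= 3 else ordered

def average(values, kind="arithmetic", trim=False):
    """Average `values`, optionally after removing the maximum and minimum."""
    data = drop_extremes(values) if trim else list(values)
    return AVERAGES[kind](data)

def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)
```

For example, `average([1, 2, 3, 10], trim=True)` averages only 2 and 3, so a single outlying neighbor has no influence on the result.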

In addition, in the method, the transmission method of the multimedia communication is any one of H.261, H.263, H.263+, H.263++, H.264, Moving Picture Experts Group Standard 1 (MPEG-1), Moving Picture Experts Group Standard 2 (MPEG-2), and Part 2 and Part 10 of Moving Picture Experts Group Standard 4 (MPEG-4).

In addition, in the method, the key data includes:

The motion vector of the macroblock, the video sequence structure parameter, the structure parameter of the image frame, the block group structure parameter, the image enhancement information, or the supplemental enhancement information.

In addition, in the method, the non-critical data is any two coefficients with sequence numbers between 7 and 12 among the discrete cosine transform AC coefficients of the luminance component signal of a color image or of the gray signal of a grayscale image, for example coefficients 8 and 9.

In addition, in the method, the backup encoding method of the motion vector includes the following step: encoding the horizontal component and the vertical component of the motion vector with 4 bits each, where each 4-bit code corresponds to one of 16 discrete values of the horizontal or vertical component.

In addition, in the method, the first criterion and the second criterion of the step B determine whether the key data backup is consistent with the key data according to the following fourth criterion:

If the value represented by the 8 bits of the motion vector backup code (its horizontal and vertical components) is consistent with the value of the motion vector transmitted by the corresponding data block itself, the motion vector backup is consistent with the motion vector.

In addition, in the method, the multimedia data is an image frame sequence;

Each of the image frames is divided into at least two data block groups;

Each of the data block groups is divided into at least two of the data blocks;

Each of the data blocks includes at least four luminance component signal blocks;

The protection block corresponding to each of the data blocks is a data block in the subsequent data block group that satisfies a preset correspondence;

The protected block corresponding to each of the data blocks is a data block in the previous data block group that satisfies the preset correspondence;

The reference data block corresponding to each of the data blocks is the previous data block in the same data block group.

Further, in the method, in the preset correspondence, the protection block corresponding to each of the data blocks is the data block at the same position in the next data block group, and the protected block corresponding to each of the data blocks is the data block at the same position in the previous data block group.
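The same-position correspondence can be sketched as two index mappings. The sketch assumes that each data block is addressed by a (group index, position) pair and that the correspondence wraps around from the last group to the first, which matches the cyclic mutual backup described later in this document; the function names are hypothetical.

```python
def protection_block(group, position, num_groups):
    """Block that carries the backup of block (group, position)."""
    return ((group + 1) % num_groups, position)

def protected_block(group, position, num_groups):
    """Block whose backup is carried by block (group, position)."""
    return ((group - 1) % num_groups, position)
```

For a frame with 9 block groups, `protection_block(8, 3, 9)` gives `(0, 3)`, and the two mappings are inverses of each other, so every block both protects and is protected by exactly one other block.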

In addition, in the method, when the digital watermarking of the key data backup is performed in the step A, the 8-bit backup code of the motion vector is distributed over the four luminance component or gray signal blocks of the corresponding protection block, and is inserted in each block into the codes of any two discrete cosine transform AC coefficients with sequence numbers 7 to 12, for example coefficients 8 and 9.

Further in the method, the rules for inserting the motion vector backup encoded bits into the corresponding encoding of the discrete cosine transform coefficients are as follows:

According to the motion vector backup bit to be inserted, the coded value of the discrete cosine transform coefficient is changed to the even or odd value that is closest to its value before encoding.
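A minimal sketch of this parity rule follows. It rests on several assumptions that the text above does not fix: a bit of 1 is mapped to an odd value and a bit of 0 to an even value, the sequence numbers 8 and 9 are treated as plain list indices into each block's coefficient list, the 8 bits are written most significant bit first with two bits per block, and the coefficient values passed in are the scaled values before rounding to an integer level. All function names are hypothetical.

```python
import math

COEFF_POSITIONS = (8, 9)  # example positions; any two of 7..12 are allowed

def nearest_with_parity(value, bit):
    """Nearest integer to `value` whose parity equals `bit` (1 = odd)."""
    base = math.floor(value)
    return base if base % 2 == bit else base + 1

def embed_byte(blocks, byte):
    """Embed 8 bits into four coefficient lists, two bits per block."""
    out = [list(block) for block in blocks]
    for n in range(8):
        bit = (byte >> (7 - n)) & 1          # most significant bit first
        block_index, slot = divmod(n, 2)
        pos = COEFF_POSITIONS[slot]
        out[block_index][pos] = nearest_with_parity(out[block_index][pos], bit)
    return out

def extract_byte(blocks):
    """Read the 8 bits back from the parities of the marked coefficients."""
    byte = 0
    for block in blocks:
        for pos in COEFF_POSITIONS:
            byte = (byte << 1) | (int(block[pos]) % 2)
    return byte


# Four luminance blocks of 64 coefficients each, with arbitrary small values.
blocks = [[0.0] * 64 for _ in range(4)]
blocks[0][8], blocks[0][9] = 2.3, -1.2
blocks[3][9] = 4.0
marked = embed_byte(blocks, 0xA7)
```

Because each marked coefficient moves by at most one quantization level, the distortion added to these mid-frequency coefficients stays small, which is the property the rule is designed to preserve.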

In addition, in the method, in the step C, when the current block is erroneous and the corresponding protection block is correct, the motion vector backup of the current block is used to derive the position of the reference block of the current block in the previous image frame, and the current block is replaced with that reference block.

In addition, in the method, in the step C, when the current block is erroneous and the corresponding protection block is also erroneous, the average of the motion vectors of one or more correct data blocks adjacent to the current block in the current image frame is used to derive the position of the reference block of the current block in the previous image frame, and the current block is replaced with that reference block.
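The two concealment cases can be sketched together. The sketch makes several simplifying assumptions that are not taken from the source: frames are plain lists of pixel rows, the motion vector is rounded to whole pixels (no half-pixel interpolation), positions outside the frame are clamped to the edge, the arithmetic mean is used for the neighbor average, and a zero vector is used when no correct neighbor exists. The function names and the sign convention of the motion vector are hypothetical.

```python
def concealment_mv(backup_mv, protection_ok, neighbour_mvs):
    """Pick the motion vector (x, y) used to conceal an erroneous block."""
    if protection_ok:
        return backup_mv                      # backup is trustworthy
    if not neighbour_mvs:
        return (0.0, 0.0)                     # assumed fallback
    n = len(neighbour_mvs)
    return (sum(x for x, _ in neighbour_mvs) / n,
            sum(y for _, y in neighbour_mvs) / n)

def copy_reference_block(prev_frame, top, left, mv, size=16):
    """Fetch the block the motion vector points to in the previous frame."""
    height, width = len(prev_frame), len(prev_frame[0])
    dx, dy = int(round(mv[0])), int(round(mv[1]))
    return [[prev_frame[min(max(top + r + dy, 0), height - 1)]
                       [min(max(left + c + dx, 0), width - 1)]
             for c in range(size)]
            for r in range(size)]


# Toy 8x8 "previous frame" whose pixel value encodes its row and column.
prev = [[r * 10 + c for c in range(8)] for r in range(8)]
mv = concealment_mv(None, False, [(1.0, -2.0), (1.0, 0.0)])
patch = copy_reference_block(prev, 2, 2, mv, size=2)
```

The returned patch would then be written into the current frame at the position of the erroneous block.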

By comparison, it can be found that the main differences between the technical solution of the present invention and the prior art are as follows. Digital watermarking technology is used to hide key data in the non-critical data encoding of compressed images, so that the key data of multimedia communication is effectively protected without increasing the communication burden, improving the quality of multimedia communication services;

The key data backup protected by the digital watermark is used to check whether the normally transmitted key data is correct, thereby detecting transmission errors;

The macroblocks at the same position in preceding and following macroblock groups back each other up cyclically. The motion vector is encoded and then embedded into relatively unimportant DCT coefficients to implement the digital watermark, while ensuring that the video data is affected as little as possible;

When an error occurs, if the key data backup is correct, it directly replaces the original key data; otherwise the original key data is approximated by the average value of the key data of adjacent macroblocks in the same frame.

These differences bring clear beneficial effects. Using well-designed digital watermarking to protect the backup of key data does not increase the burden on the communication system or network and does not affect the transmission quality. Key data protection is realized conveniently and efficiently, and on this basis error detection and error concealment are combined, greatly improving the quality of multimedia communication services and thereby the market competitiveness of video communication products such as videophones, third-generation mobile terminals, video conferencing, and Internet TV.

DRAWINGS

Figure 1 is a schematic diagram of the principle of digital watermarking technology;

Figure 2 is a schematic diagram of a transmission system of multimedia communication according to a first embodiment of the present invention;

Figure 3 is a flowchart of a transmission protection method for multimedia communication according to first and second embodiments of the present invention;

Figure 4 is a schematic diagram of an H.263 video data blocking scheme according to a first embodiment of the present invention;

Figure 5 is a schematic diagram of the video quality comparison of experimental results according to a third embodiment of the present invention;

Figure 6 is a schematic comparison of the PSNR of experimental results according to a third embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

In order to make the objects, technical solutions and advantages of the present invention more comprehensible, the present invention will be further described in detail with reference to the accompanying drawings.

The key idea of the invention is to use a digital watermark to embed watermark data in the multimedia data in order to protect the key data of multimedia communication, such as the motion vector data used in many motion-prediction coding standards. During multimedia compression encoding at the transmitting end, the key data is embedded in the multimedia data as a watermark and becomes a redundant protective backup of the key data. In addition to being transmitted in the multimedia code stream itself, the key data is thus sent at the same time as a backup in the form of watermark data. After receiving the data, the receiver can obtain both the key media data and its backup by extracting the watermark data, which achieves the purpose of key data protection. When the key data is lost, the backup can be used to recover it.

Through the use of digital watermarks, key data is protected without increasing the communication burden or reducing the quality of multimedia communication. On the basis of this watermark-based protection of key data, the invention provides an effective error detection method: errors are detected by comparing the motion vector extracted from the watermark with the motion vector obtained by video decoding, which greatly improves the accuracy of error detection. The motion vector extracted from the watermark is also used to conceal the erroneous block, which improves the quality of the recovered video.

Using the key data backup for error detection allows the occurrence of bit errors to be detected accurately. Finally, the present invention improves the quality of multimedia communication services through error concealment. Error concealment can be achieved by replacing the key data with its backup, or by simple substitution in the spatial domain.

It can be seen that the present invention is basically implemented in three steps. First, a digital watermark is used to embed the key data backup at the transmitting end, so that the receiving end can extract the digital watermark and obtain the protected key data. Secondly, the backup of the key data is compared with the transmitted key data to judge whether the relevant multimedia data contains a bit error, so that errors are detected efficiently. Finally, error concealment is performed on the erroneous data.

The following describes the embodiments of the present invention in detail, taking as an example the family of video compression standards based on block motion compensation, in particular the H.263 standard. The invention can be extended similarly to other existing standards, such as the aforementioned H.261, H.263, H.263+, H.263++, H.264, MPEG-1, MPEG-2, and MPEG-4 Part 2 and Part 10, or to future standards that use the same mechanism.

Fig. 2 is a block diagram showing a communication system of a first embodiment of the present invention. During the encoding process performed by the encoder, the motion vectors obtained in motion prediction are sent both to the VLC module and to the watermark embedding module. The motion vector sent to the VLC module is coded in the normal way and multiplexed into the output code stream (that is, the normal protocol processing flow). The motion vector sent to the watermark embedding module is processed and superimposed on the quantized DCT coefficients, which are then VLC-encoded and multiplexed into the output code stream.

Embodiments of the present invention embed the key data, namely the motion vectors, into the DCT coefficients of a compressed image. At the decoding end, error detection and concealment are performed according to the extracted digital watermark. It will be understood by those skilled in the art that other non-critical data can also be used as the carrier in which to embed a watermark backup of key data, thereby achieving the object of the invention without affecting the essence and scope of the present invention. In the first embodiment of the present invention, the digital watermark is embedded in the DCT coefficients of the transform domain because the high-frequency components of the DCT coefficients have less influence on human vision, so such unimportant encoded data is better suited to carrying the watermark. The quality of the original video transmission is thus preserved.

In fact, according to the way of watermark signal embedding, digital watermarking technology can be divided into spatial domain digital watermarking technology and transform domain digital watermarking technology. The digital watermarking technique in the spatial domain embeds watermark information directly in the spatial domain of the media, such as embedding information directly in the image pixels. The digital watermarking technique of the transform domain first transforms the media, such as discrete Fourier transform, discrete cosine transform or discrete wavelet transform, and then embeds the watermark information in the transform domain. The transform domain watermarking technique has many advantages over the spatial domain watermarking. For example, the additional image energy caused by the watermark addition can be evenly distributed to the various parts of the embedded image, so that the visible influence of the watermark addition is reduced to a minimum. Thus embodiments of the present invention employ a method of embedding a watermark in a transform domain.

Fig. 3 shows the overall flow of the transmission protection method of video communication in the first embodiment of the present invention.

First, in step 301, the multimedia data is processed in blocks, and the key data of the current block is backup-encoded.

As mentioned above, the first step is to back up the key data at the originating end by using a digital watermark. In line with existing motion-compensation-based or other video compression coding standards, video data is processed in blocks. For example, in the H.263 standard, the video stream is divided into a sequence of image frames, each image frame is divided into multiple groups of blocks (GOBs), each GOB corresponds to one row of macroblocks (MBs), and each MB contains four 8 × 8 luminance blocks B1...B4 and two color-difference blocks. Figure 4 shows this blocking scheme. Each MB is identified by two subscripts: the first is the row number, i.e. the GOB number, and the second is the column number, i.e. the serial number within the GOB.

The backup coding of the key data is the coding performed specifically before the watermark is embedded, and its coding mode may differ from the normal coding mode. In the first embodiment of the present invention, the backup coding of the motion vector (MV) is implemented by quantization. The motion vector is divided into a horizontal (X) component and a vertical (Y) component, which represent the horizontal and vertical displacement of the current macroblock relative to its reference macroblock in the previous frame. For backup coding, each of the two components is encoded with 4 bits. A 4-bit code can represent 16 codewords, which correspond to the following 16 values of the horizontal or vertical component: less than -3, -3, -2.5, -2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5, 3, 3.5, greater than 3.5.

Here, each of the two cases that fall outside the quantization range is represented by a single codeword; this quantization method offers good performance and coding efficiency. Tests on a large number of actual video sequences show that the values of the individual MV components are concentrated near zero, mainly in the interval [-2, 2]. In order to minimize the influence of embedding the motion vector on the image, the value of each component of the motion vector is uniformly quantized within the interval [-3, 3.5], and the range beyond each end of the interval is represented by one codeword. This greatly reduces the amount of MV backup data, improves the feasibility of watermark embedding, and keeps the loss in the key data coding small.

In fact, with the above quantization (uniform inside the interval, with one codeword for each tail), for each component, when $|MV| \le 3$ the MV of the previous row can be completely recovered from a correctly embedded MV backup. When $|MV| > 3.5$, the damaged MV cannot be recovered; however, by comparing value ranges, the extent affected by the error can still be determined, which achieves the purpose of error detection.

Here, a fixed-length (4-bit) code is used for each embedded MV component to facilitate subsequent error detection. For example, the codes 0000 to 1111 of each component correspond, in order, to the 16 values listed above, as shown in Table 1. This table is only one particular case; in a specific application, the values can be chosen according to the situation.
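The 4-bit backup coding of one MV component described above can be sketched as follows. This is a minimal illustration, not the patent's reference implementation: the function names, the use of codeword 0 for "less than -3" and 15 for "greater than 3.5", and the packing of the horizontal component into the high nibble are assumptions made only for the example.

```python
# Hypothetical sketch of the 4-bit backup coding of a motion vector.
# Assumed codeword order: 0 -> "less than -3", 1..14 -> -3.0, -2.5, ..., 3.5,
# 15 -> "greater than 3.5".

STEP = 0.5            # half-pel quantization step inside the uniform range
LOW, HIGH = -3.0, 3.5  # ends of the uniform quantization range


def encode_component(v: float) -> int:
    """Map one half-pel MV component to a 4-bit codeword (0..15)."""
    if v < LOW:
        return 0
    if v > HIGH:
        return 15
    return 1 + round((v - LOW) / STEP)


def decode_component(code: int):
    """Return the component value, or None for the two out-of-range codewords."""
    if code in (0, 15):
        return None
    return LOW + (code - 1) * STEP


def encode_mv(mvx: float, mvy: float) -> int:
    """Pack both components into 8 bits (assumed layout: high nibble = horizontal)."""
    return (encode_component(mvx) << 4) | encode_component(mvy)
```

A component inside [-3, 3.5] survives an encode/decode round trip unchanged, while the two tail codewords only record that the component was out of range.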

Then in step 302, the backup code of the key data of the current block is embedded into the non-critical data code of the protection block corresponding to the current block by digital watermarking. The protection block is different from but corresponding to the current block, that is to say, each data block has another data block as its protection block, and it is itself a protection block of another data block.

Taking the coding scheme of H.263 shown in Figure 3 as an example, each layer in the code stream has a corresponding start code, and the start code of the GOB provides synchronization during transmission. By checking the start code of the GOB, a bit error can be confined to the GOB in which it occurs, without affecting the next GOB. Therefore, the motion vector of an MB in one row, for example $MB_{1,m}$ in GOB1, can be protected in the MB at the same position in the next row, $MB_{2,m}$ in GOB2. That is, the MB at the corresponding position in the next row is the protection block of the current MB, and the MB at the corresponding position in the previous row is the protected block of the current MB; the last row is protected by the first row. Through such cyclic protection between rows, effective backup can be realized. For example, when there is no error in GOB2, even if the data in GOB1 is wrong, the key data of GOB1 can be restored and used for effective error concealment.
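The cyclic row-to-row protection relationship can be written down in a few lines. The sketch below assumes 0-based GOB and column indices and hypothetical function names; it only illustrates the mapping described in the text.

```python
def protection_block(n: int, m: int, num_gobs: int):
    """Block that carries the MV backup of MB(n, m): same column, next GOB row (cyclic)."""
    return ((n + 1) % num_gobs, m)


def protected_block(n: int, m: int, num_gobs: int):
    """Block whose MV backup is carried by MB(n, m): same column, previous GOB row (cyclic)."""
    return ((n - 1) % num_gobs, m)
```

With this mapping, every block is protected by exactly one other block and protects exactly one other block, and the last GOB row wraps around to the first.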

It has been mentioned above that the content of the digital watermark is the key data, that is, the motion vector, but the carrier of the digital watermark has not yet been specified. Since the coding of the key data is itself important, embedding the backup watermark of the key data into the coding of the key data would be self-contradictory. In the first embodiment of the present invention, taking H.263 as an example, relatively unimportant data coding is used as the carrier of the watermark so that the transmission quality of the video stream itself is not impaired. As mentioned above, the high-frequency DCT coefficients are selected as the carrier of the watermark in the first embodiment for the following reason.

According to the characteristics of human vision, the human eye is sensitive to the DC and low-frequency components of the video, that is, to the direct current (DC) coefficient and the low-frequency alternating current (AC) coefficients among the DCT coefficients, and is insensitive to noise or distortion in the high-frequency AC components; it is therefore not appropriate to embed motion vectors in the DC or low-frequency AC coefficients. However, the inter-frame coding mode used in video coding makes the DCT coefficients of the frame-difference signal relatively small, and the high-frequency coefficients in particular are almost all zero. In order to avoid an excessive increase in bit rate, the present invention chooses to embed the motion vector information in the transform coefficients numbered 8 and 9 of the luminance signal, namely AC8 and AC9. Experiments show that selecting these two coefficients as the watermark carrier gives the best improvement in video quality. How the 8 bits of watermark information are hidden in the AC8 and AC9 codes of the protection block must also be considered. In the first embodiment of the present invention, the 8 bits are embedded exactly in the AC8 and AC9 coefficients of the four luminance blocks of the protection block, 8 coefficients in total, so that one watermark bit is hidden in each quantized AC8 or AC9 coefficient. The specific rule for inserting a bit of the MV backup code into the code of the corresponding DCT coefficient is as follows: according to whether the bit is 0 or 1, the quantized DCT value is changed to the nearest even or odd value, respectively; if it already has the required parity, it is left unchanged.

The mathematical model of the watermark embedding method is described as follows:

Let $b \in \{0, 1\}$ be the bit to be embedded, let $LEVEL$ be the quantized value of the coefficient $C_i$, $i = 8, 9$, before $b$ is embedded, let $MLEVEL$ be the value after embedding, and let $\Delta LEVEL$ be the error introduced by embedding, so that $MLEVEL = LEVEL + \Delta LEVEL$.

According to the parity correspondence principle described above, the correspondence between the embedded bit $b$ and $MLEVEL$ is required to be as follows:

$b = 0$: $MLEVEL$ is even; $b = 1$: $MLEVEL$ is odd.

In order to reduce the influence of the watermark embedding on the DCT coefficient, $\Delta LEVEL$ should be as small as possible, that is, the pre-quantization DCT coefficients corresponding to $LEVEL$ and $MLEVEL$ should be as close as possible. Therefore, in the first embodiment of the present invention, the specific embedding algorithm is expressed as:

When $b = 0$ and $LEVEL$ is even, no change is needed: $MLEVEL = LEVEL$, $\Delta LEVEL = 0$. When $b = 0$ and $LEVEL$ is odd, $\Delta LEVEL$ takes the value given by (*), which in turn determines $MLEVEL$:

$\Delta LEVEL = \ldots \qquad (*)$

where $level = (|COF| - QP/2) / (2 \cdot QP)$, $\mathrm{sign}(\cdot)$ is the sign function, $COF$ is the value of $C_i$, $i = 8, 9$, before quantization, $QP$ is the quantization factor, and $/$ denotes integer division.

When $b = 1$ and $LEVEL$ is odd, no change is needed: $MLEVEL = LEVEL$, $\Delta LEVEL = 0$. When $b = 1$ and $LEVEL$ is even but not 0, $\Delta LEVEL$ takes the same value as in (*), which determines $MLEVEL$.

When $b = 1$ and $LEVEL = 0$, $\Delta LEVEL = \mathrm{sign}(COF)$, which determines $MLEVEL$.
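The parity rule above can be illustrated with a short sketch. Because the exact expression (*) is not reproduced here, the example substitutes a plainly different selection rule: when the parity must change and LEVEL is not zero, it picks whichever of LEVEL - 1 and LEVEL + 1 reconstructs closest to the unquantized coefficient. The `reconstruct` helper is a simplified, hypothetical dequantizer, and the fallback used when COF is exactly zero is likewise an assumption.

```python
def sign(x) -> int:
    """Sign function: -1, 0 or +1."""
    return (x > 0) - (x < 0)


def reconstruct(level: int, qp: int) -> int:
    """Simplified dequantizer, assumed only for this illustration."""
    if level == 0:
        return 0
    return sign(level) * qp * (2 * abs(level) + 1)


def embed_bit(level: int, cof: float, qp: int, bit: int) -> int:
    """Return MLEVEL whose parity carries `bit` (0 -> even, 1 -> odd)."""
    if abs(level) % 2 == bit:
        return level  # parity already matches, so the change is zero
    if level == 0:
        # bit == 1 and LEVEL == 0: step one level in the direction of COF
        # (the +1 fallback for COF == 0 is an assumption)
        return sign(cof) if cof != 0 else 1
    # Assumed stand-in for (*): choose the neighbouring level that
    # reconstructs closest to the unquantized coefficient COF.
    candidates = (level - 1, level + 1)
    return min(candidates, key=lambda c: abs(cof - reconstruct(c, qp)))
```

Whatever selection rule is used, the result always differs from LEVEL by at most one quantization step, which is what keeps the distortion and the bit-rate increase small.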

It can be seen that with this embedding method the quantized DCT values before and after embedding are the same in nearly 50% of cases, so the embedding has little effect on the bit rate. After embedding, VLC encoding is performed using $MLEVEL$ as the quantized value of $C_i$, $i = 8, 9$. After the watermark embedding protection at the transmitting end is completed, the digital watermark is extracted at the receiving end to obtain the key data backup, and errors in the multimedia data are detected by means of this backup.

Therefore, in step 303, the digital watermark is extracted from all the protection blocks to obtain the key data backup of the corresponding protected blocks. Taking H.263 as an example, extracting the digital watermark is simple: the value of each watermark bit is determined from the parity of the quantized AC8 or AC9 value obtained in video decoding, which is expressed by the following formula:

$b = 0$ if $MLEVEL$ is even; $b = 1$ if $MLEVEL$ is odd.

After the 8 bits have been extracted from the 4 luminance blocks, they are arranged in the order of embedding to obtain the codewords of the motion vector of the corresponding block in the previous row, and the corresponding motion vector value is obtained by looking up Table 1.
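The extraction step can be sketched as below. The bit order (B1 AC8, B1 AC9, B2 AC8, ..., B4 AC9, with the first four bits forming the horizontal codeword) and the codeword-to-value mapping are assumptions standing in for the embedding order and Table 1; the function names are hypothetical.

```python
def extract_bits(levels):
    """levels: four (ac8, ac9) quantized-level pairs, one per luminance block B1..B4.

    Each watermark bit is the parity of the decoded level (even -> 0, odd -> 1).
    """
    bits = []
    for ac8, ac9 in levels:
        bits.append(abs(ac8) % 2)
        bits.append(abs(ac9) % 2)
    return bits


def bits_to_codewords(bits):
    """Split 8 extracted bits into (horizontal, vertical) 4-bit codewords."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value >> 4, value & 0xF


def codeword_to_value(code: int):
    """Assumed stand-in for Table 1: 1..14 -> -3.0 .. 3.5, tails -> None."""
    if code in (0, 15):
        return None
    return -3.0 + (code - 1) * 0.5
```

For example, the level pairs `[(0, 1), (3, -1), (2, 4), (-5, 6)]` give the bits 0, 1, 1, 1, 0, 0, 1, 0, which split into the codewords 7 and 2.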

Then, in step 304, the error condition is further detected based on the comparison of the recovered key data backup and the key data transmitted by the normal channel itself.

The second embodiment of the present invention determines the error condition by four criteria on the basis of the first embodiment:

First, whether the current block is correct is determined according to the following first criterion: if the key data backup of the current block is consistent with the key data transmitted by the current block itself, or the key data backup of the protected block corresponding to the current block is consistent with the key data transmitted by the protected block itself, then the current block is correct.

Secondly, for a data block not determined to be correct by the first criterion, whether the current block is erroneous is determined according to the following second criterion: if the key data backup of the current block is inconsistent with the key data transmitted by the current block itself and the protection block corresponding to the current block is correct, or the key data backup of the protected block corresponding to the current block is inconsistent with the key data transmitted by the protected block itself and the protected block is correct, then the current block is erroneous.

For multimedia communication transmitted with motion-compensated coding, for a data block determined to be neither correct nor erroneous by the first and second criteria, whether the current block is erroneous is determined according to the following third criterion: if the reference coding block of the current block is erroneous, the current block is erroneous. This is because each data block is encoded relative to its reference block, so the current block cannot be decoded correctly when its reference block is in error.

When the motion vector is the protected key data as described above, whether the key data backup is consistent with the key data in the first and second criteria is judged according to the following fourth criterion: if the backup code of the motion vector, 8 bits in total for the horizontal and vertical components, is consistent with the code of the motion vector value transmitted by the corresponding data block itself, the motion vector backup is consistent with the motion vector.

The specific implementation of these guidelines is described in detail below by taking H.263 as an example.

In video decoding, each MB is checked one by one, and the motion vector obtained from VLC decoding is compared with the motion vector extracted from the embedded watermark to detect whether there is a bit error. Let the motion vector of $MB_{n,m}$ in $GOB_n$ obtained in the VLC decoding process be $MV_{n,m}$, and let the motion vector backup of $MB_{n,m}$ extracted from the watermark information of the corresponding protection block be $MV'_{n,m}$. Since the motion vector has two components, these can be written as $MV_{n,m} = [MV^{x}_{n,m}, MV^{y}_{n,m}]^T$ and $MV'_{n,m} = [{MV'}^{x}_{n,m}, {MV'}^{y}_{n,m}]^T$, where $x$ and $y$ denote the horizontal and vertical components, respectively. In the following description, the criteria are written with logic symbols, where the operations "$\cap$" and "$\cup$" denote logical AND and logical OR.

For $MB_{n,m}$, the protection block is $MB_{n+1,m}$ and the protected block is $MB_{n-1,m}$. The MV transmitted by $MB_{n,m}$ itself is $MV_{n,m}$, and the backup of this MV held in the protection block $MB_{n+1,m}$ is $MV'_{n,m}$. The MV backup that $MB_{n,m}$ holds for the block it protects, $MB_{n-1,m}$, is $MV'_{n-1,m}$, and the MV transmitted by the protected block itself is $MV_{n-1,m}$. The reference MB of each MB is the previous MB in the same GOB, that is, the reference MB of $MB_{n,m}$ is $MB_{n,m-1}$. The four criteria are described as follows:

The first criterion: if $(MV_{n,m} = MV'_{n,m}) \cup (MV_{n-1,m} = MV'_{n-1,m})$ is satisfied, then $MB_{n,m} = \mathrm{True}$ is judged, that is, $MB_{n,m}$ is correct and contains no bit error;

The second criterion: if $[(MV_{n,m} \ne MV'_{n,m}) \cap (MB_{n+1,m} = \mathrm{True})] \cup [(MV_{n-1,m} \ne MV'_{n-1,m}) \cap (MB_{n-1,m} = \mathrm{True})]$ is satisfied, then $MB_{n,m} = \mathrm{False}$ is judged, that is, $MB_{n,m}$ is erroneous and a bit error has occurred;

The third criterion: if $MB_{n,m-1} = \mathrm{False}$ is satisfied, then $MB_{n,m} = \mathrm{False}$ is judged;

The fourth criterion: if $(MV^{x}_{n,m} = {MV'}^{x}_{n,m}) \cap (MV^{y}_{n,m} = {MV'}^{y}_{n,m})$ is satisfied, then $MV_{n,m} = MV'_{n,m}$ is judged; otherwise $MV_{n,m} \ne MV'_{n,m}$ is judged. Here the rule for $MV^{x}_{n,m} = {MV'}^{x}_{n,m}$ or $MV^{y}_{n,m} = {MV'}^{y}_{n,m}$ is: if the backup code values of $MV^{x}_{n,m}$ and ${MV'}^{x}_{n,m}$, or of $MV^{y}_{n,m}$ and ${MV'}^{y}_{n,m}$, that is, their codewords in Table 1, are equal, then the corresponding equality is judged to hold; otherwise it does not hold.

It should be noted here that the above four criteria would conflict with one another if no priority order were defined. In the first embodiment of the present invention, the priority order, from highest to lowest, is: the first criterion, the second criterion, the third criterion, and the fourth criterion. A conclusion reached by a higher-priority criterion must not be overturned by a lower-priority criterion. For example, if the first criterion determines that the current block is correct, the current block is not judged again under the third criterion.

In addition, in the fourth criterion, whether MV components are equal is decided from the codewords of the backup coding, which is treated as equivalent to deciding whether the MV values themselves are equal. This matters in particular at the two ends of the quantization range: an MV component falling outside the uniform quantization range [-3, 3.5] is itself a low-probability event, so if the transmitted component and its backup are both out of range with the same codeword, they can be considered equal with high probability.
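The way the criteria interact under this priority order can be sketched as three passes over a grid of macroblocks. The data layout is an assumption made for the example: `mv_code[n][m]` is the 8-bit backup-style code of the MV decoded from block (n, m) itself, `backup_code[n][m]` is the code of its backup extracted from the protection block in row n + 1, rows wrap cyclically, and the reference block is the left neighbour. Comparing the two codes is the role played by the fourth criterion.

```python
def detect_errors(mv_code, backup_code):
    """Return status[n][m]: True (correct), False (erroneous) or None (undecided)."""
    rows, cols = len(mv_code), len(mv_code[0])
    # Fourth criterion: MV and backup agree when their codewords agree.
    match = [[mv_code[n][m] == backup_code[n][m] for m in range(cols)]
             for n in range(rows)]
    status = [[None] * cols for _ in range(rows)]

    # First criterion: own MV matches its backup, or the backup carried for
    # the protected block (row n - 1) matches that block's own MV.
    for n in range(rows):
        for m in range(cols):
            if match[n][m] or match[(n - 1) % rows][m]:
                status[n][m] = True

    # Second criterion, applied only to blocks still undecided.
    for n in range(rows):
        for m in range(cols):
            if status[n][m] is None:
                up, down = (n - 1) % rows, (n + 1) % rows
                if ((not match[n][m] and status[down][m] is True) or
                        (not match[up][m] and status[up][m] is True)):
                    status[n][m] = False

    # Third criterion: an undecided block with an erroneous reference block
    # (assumed to be the left neighbour) is erroneous.
    for n in range(rows):
        for m in range(1, cols):
            if status[n][m] is None and status[n][m - 1] is False:
                status[n][m] = False
    return status
```

Because each pass only touches blocks that are still undecided, a verdict from an earlier criterion is never overturned by a later one.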

In step 305, it is determined whether the current block is erroneous. If yes, the process proceeds to step 306; otherwise, the process ends.

In step 306, it is judged whether the corresponding protection block of the current block is correct. If it is correct, the process proceeds to step 307; otherwise, the process proceeds to step 308.

In step 307, that is, when the current block is erroneous and the corresponding protection block is correct, the key data backup is correct and can be used to restore the original key data. Therefore, when the current block is erroneous and the corresponding protection block is correct, the key data backup of the current block is used as its key data for decoding.

The position of the reference block of the current block in the previous image frame is derived from the motion vector backup of the current block, and the current block is replaced with that reference block. Taking H.263 as an example, when $MB_{n,m}$ is detected to contain a bit error and its protection block is correct, that is, $MV'_{n,m}$ is correct, $MV'_{n,m}$ is used to conceal the error of $MB_{n,m}$. That is, with $MV'_{n,m}$ as the motion vector, the position of the reference macroblock of $MB_{n,m}$ in the previous frame is derived from it; the reference macroblock is denoted $MB^{ref}_{n,m}$, where the superscript ref denotes the reference frame, and the data of $MB_{n,m}$ is then replaced with the data of $MB^{ref}_{n,m}$.
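Replacing an erroneous block with its motion-compensated reference block can be sketched as follows. The example makes several simplifying assumptions that are not stated in the text: frames are lists of rows of sample values, the motion vector is rounded to whole pixels instead of interpolating half-pel positions, the reference position is the current position plus the vector, and positions outside the frame are clamped to its border.

```python
def conceal_with_mv(prev_frame, cur_frame, top, left, mv, size=16):
    """Overwrite the size x size block at (top, left) of cur_frame with the
    block of prev_frame displaced by mv = (dx, dy)."""
    height, width = len(prev_frame), len(prev_frame[0])
    dx, dy = int(round(mv[0])), int(round(mv[1]))  # integer-pel approximation
    for r in range(size):
        for c in range(size):
            # Clamp the source position to the frame (assumed border handling).
            src_r = min(max(top + r + dy, 0), height - 1)
            src_c = min(max(left + c + dx, 0), width - 1)
            cur_frame[top + r][left + c] = prev_frame[src_r][src_c]
    return cur_frame
```

The same routine serves both concealment branches; only the source of the motion vector differs.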

In step 308, that is, when the current block is erroneous and the corresponding protection block is also erroneous, the average of the key data of one or more correct data blocks adjacent to the current block is used as the key data of the current block for decoding.

The average here can be computed by any of various generalized averaging algorithms, for example: the arithmetic mean ($(a+b)/2$); the weighted mean ($w_1 a + w_2 b$, with $w_1 + w_2 = 1$ and $w_1, w_2 > 0$); the geometric mean ($\sqrt{ab}$); the harmonic mean ($2ab/(a+b)$); and the median (for $n$ numbers $a_1, a_2, \ldots, a_n$ arranged in order of size, the median is $a_{(n+1)/2}$, which generally requires $n$ to be odd); as well as any of these averages computed after the maximum and minimum values have been removed from the array being averaged.
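A few of these averaging variants are easy to express with the Python standard library. The helper names below are hypothetical; the geometric and harmonic means are left out of the sketch because motion vector components can be zero or negative, where those means are not well defined.

```python
from statistics import mean, median


def weighted_mean(values, weights):
    """Weighted mean; the weights are normalized so they need not sum to 1."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def trimmed(values):
    """Drop one maximum and one minimum before averaging (needs at least 3 values)."""
    ordered = sorted(values)
    return ordered[1:-1] if len(ordered) >= 3 else ordered
```

For instance, `mean(trimmed(xs))` gives the arithmetic mean after the extremes are discarded, and `median(xs)` gives the middle value of an odd-length list.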

If the motion vector backup is also in error, the position of the reference block of the current block in the previous image frame is derived from the average of the motion vectors of one or more correct data blocks adjacent to the current block in the current image frame, and the current block is replaced with that reference block. Taking H.263 as an example, among the 8 neighboring macroblocks (up, down, left, right, top left, top right, bottom left, bottom right), those with correct data are selected (their number may be less than 8). By averaging the motion vectors of these adjacent macroblocks, a new motion vector is obtained. The new vector is used as the motion vector of $MB_{n,m}$ to derive the position of the reference macroblock of $MB_{n,m}$ in the previous frame; the reference macroblock is denoted $MB^{ref}_{n,m}$, and the data of $MB^{ref}_{n,m}$ is then used to replace the data of $MB_{n,m}$.
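The neighbour-averaging fallback can be sketched as below, using the arithmetic mean as one of the permitted averages. The grid layout (`mv[n][m]` as an (x, y) tuple, `status[n][m]` True when the block was judged correct) and the zero-vector fallback when no neighbour is usable are assumptions made for the example.

```python
def average_neighbour_mv(mv, status, n, m):
    """Average the MVs of the correct blocks among the 8 neighbours of (n, m)."""
    rows, cols = len(mv), len(mv[0])
    xs, ys = [], []
    for dn in (-1, 0, 1):
        for dm in (-1, 0, 1):
            if dn == 0 and dm == 0:
                continue  # skip the erroneous block itself
            r, c = n + dn, m + dm
            if 0 <= r < rows and 0 <= c < cols and status[r][c] is True:
                xs.append(mv[r][c][0])
                ys.append(mv[r][c][1])
    if not xs:
        return (0.0, 0.0)  # assumed fallback: no usable neighbour
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```

Blocks at the frame border simply have fewer candidate neighbours, which matches the remark that the number of usable macroblocks may be less than 8.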

Finally, the third embodiment of the present invention applies the video transmission protection method to H.263 video transmission on the basis of the first embodiment, and performs experiments using the international standard image sequences "Foreman" and "Claire". The experiments demonstrate the effectiveness of the present invention.

Using a standard image sequence, 400 frames (repeated 10 times) were used for the experimental study. The image format was QCIF, and Y:U:V was 4:1:1. In the experiment, the target frame rate is 15 frames/s, and the H.263 encoder uses a quantization factor (QP) of 5.

The left and right images in Fig. 5 compare the restored 16th frame of the Foreman sequence obtained with the general error concealment method and with the method of the present invention, respectively. It can be seen that with the method of the present invention, the subjective quality of the restored image is significantly improved.

The two graphs in Figure 6 show the average peak signal-to-noise ratio (PSNR) of the video recovered at the decoder in the Foreman and Claire experiments at different bit error rates, for both the general error concealment method and the method of the present invention. As can be seen from the figure, when the bit error rate is less than $10^{-3}$, error concealment with the method of the present invention yields an image PSNR that is on average 2-3 dB higher than that of the general concealment method, thus effectively ensuring the quality of the reconstructed video.

In addition, in terms of video bit rate, if the digital watermarking method of the present invention is not employed and the motion vectors are instead transmitted separately for error concealment, the bit rate increases by as much as 8.8% to 35.2%. In comparison, the method of the present invention performs better than the general error concealment algorithm.

It will be understood by those skilled in the art that although H.263 is taken as an example in the above description of the embodiments of the present invention, the transmission protection method can be directly applied to other standards, such as H.261, H.263, H.263+, H.263++, H.264, MPEG-1, MPEG-2, MPEG-4, and other block-based DCT ("B-DCT") standard or non-standard multimedia transmission technologies; any of these feasible technologies can be used to achieve the purpose of the invention without affecting its essence and scope.

It will also be understood by those skilled in the art that although motion vectors are used as the key data for protection and error concealment in the above description of the embodiments of the present invention, the method is also applicable to the protection of video key data other than motion vectors, for example video sequence structure parameters, image frame structure parameters, group of blocks (GOB) structure parameters, PEI information, Supplemental Enhancement Information (SEI), etc., which still achieves the purpose of the invention without affecting its essence and scope.

Similarly, other specific parameters or schemes in the above description of the embodiments of the present invention, such as using DCT coefficients as the watermark carrier and encoding motion vectors with 8 bits, can be replaced by other feasible parameters or schemes while still achieving the purpose of the invention without affecting its essence and scope.

Although the present invention has been illustrated and described with reference to the preferred embodiments of the present invention, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention.

Claims

1. A transmission protection method for multimedia communication, comprising the following steps: A: performing backup protection on key data by using a digital watermark at the originating end;

C: performing error concealment on the erroneous multimedia data.

2. The transmission protection method for multimedia communication according to claim 1, wherein the step A comprises the following sub-steps:

performing block processing on the multimedia data, and performing backup encoding on the key data of the current block; embedding the backup code of the key data of the current block into the non-critical data code of the protection block corresponding to the current block by digital watermarking;

wherein the protection block is different from but corresponds to the current block.

3. The transmission protection method for multimedia communication according to claim 2, wherein the step B comprises the following sub-steps:

first, determining whether the current block is correct according to the following first criterion: if the key data backup of the current block is consistent with the key data transmitted by the current block itself, or the key data backup of the protected block corresponding to the current block is consistent with the key data transmitted by the protected block itself, then the current block is correct; secondly, for a data block not determined to be correct by the first criterion, determining whether the current block is erroneous according to the following second criterion: if the key data backup of the current block is inconsistent with the key data transmitted by the current block itself and the protection block corresponding to the current block is correct, or the key data backup of the protected block corresponding to the current block is inconsistent with the key data transmitted by the protected block itself and the protected block is correct, then the current block is erroneous.

4. The transmission protection method for multimedia communication according to claim 3, wherein the multimedia communication is transmitted by using a motion compensation coding method, and the step B further includes the following sub-step:

for a data block determined to be neither correct nor erroneous by the first criterion and the second criterion, determining whether the current block is erroneous according to the following third criterion: if the reference coding block of the current block is erroneous, the current block is erroneous.

5. The transmission protection method for multimedia communication according to claim 3, wherein the step C includes a sub-step in which, when the current block is erroneous and the corresponding protection block is correct, the key data backup of the current block is used as its key data for decoding.

6. The transmission protection method for multimedia communication according to claim 3, wherein the step C includes a sub-step in which, when the current block is erroneous and the corresponding protection block is also erroneous, the average of the key data of one or more correct data blocks adjacent to the current block is used as the key data of the current block for decoding.

7. The transmission protection method for multimedia communication according to claim 6, wherein the average value may be one of the following:

the arithmetic mean, weighted mean, geometric mean, harmonic mean, or median; or the arithmetic mean, weighted mean, geometric mean, harmonic mean, or median computed after the maximum and minimum values have been removed from the array being averaged.

8. The transmission protection method for multimedia communication according to any one of claims 1-7, wherein the transmission method of the multimedia communication is one of H.261, H.263, H.263+, H.263++, H.264, Moving Picture Experts Group Standard 1, Moving Picture Experts Group Standard 2, and Moving Picture Experts Group Standard 4 Part 2 and Part 10.

9. The transmission protection method for multimedia communication according to claim 8, wherein the key data comprises:

10. The transmission protection method for multimedia communication according to claim 8, wherein the non-critical data is any two coefficients with serial numbers between 7 and 12 among the discrete cosine transform AC coefficients of the luminance component signal of a color image or of the gray signal of a grayscale image.

11. The transmission protection method for multimedia communication according to claim 10, wherein the non-critical data is the two coefficients with serial numbers 8 and 9 among the discrete cosine transform AC coefficients of the luminance component signal of a color image or of the gray signal of a grayscale image.

12. The transmission protection method for multimedia communication according to claim 9, wherein the backup encoding method of the motion vector comprises the following steps: encoding the horizontal component and the vertical component of the motion vector with 4 bits each;

each 4-bit code corresponds to one of 16 mutually distinct discrete values representing the horizontal component or the vertical component.

13. The transmission protection method for multimedia communication according to claim 12, wherein in the first criterion and the second criterion of step B, whether the key data backup and the key data are consistent is determined according to the following fourth criterion:

14. The transmission protection method for multimedia communication according to claim 8, wherein the multimedia data is an image frame sequence;

each of the image frames is divided into at least two data block groups;

each of the data block groups is divided into at least two data blocks;

each of the data blocks includes at least 4 luminance component or gray signal blocks;

15. The transmission protection method for multimedia communication according to claim 14, wherein in the preset correspondence relationship,

the protection block corresponding to each data block is the data block at the same position in the next data block group;

the protected block corresponding to each data block is the data block at the same position in the previous data block group.

16. The transmission protection method for multimedia communication according to claim 14, wherein when the digital watermark backup of the key data is performed in the step A, the bits of the 8-bit backup code of the motion vector are respectively inserted into the codes of any two coefficients with serial numbers 7 to 12 among the discrete cosine transform AC coefficients of the four luminance component or gray signal blocks of the corresponding protection block.

17. The transmission protection method for multimedia communication according to claim 16, wherein the bits of the 8-bit backup code of the motion vector are respectively inserted into the codes of the coefficients with serial numbers 8 and 9 among the discrete cosine transform AC coefficients of the four luminance component or gray signal blocks of the corresponding protection block.

18. The transmission protection method for multimedia communication according to claim 16, wherein the rules for inserting the motion vector backup coded bits into the coding of the corresponding discrete cosine transform coefficients are as follows:

19. The transmission protection method for multimedia communication according to claim 14, wherein, in the step C, when the current block is erroneous and the corresponding protection block is correct, the motion vector backup of the current block is used to derive the position of the reference block of the current block in the previous image frame, and the current block is replaced with the reference block.

20. The transmission protection method for multimedia communication according to claim 14, wherein, in the step C, when the current block is erroneous and the corresponding protection block is also erroneous, the average of the motion vectors of one or more correct data blocks adjacent to the current block in the current image frame is used to derive the position of the reference block of the current block in the previous image frame, and the current block is replaced with the reference block.