CN103177725A

CN103177725A - Method and device for transmitting aligned multi-channel audio

Info

Publication number: CN103177725A
Application number: CN2013100831335A
Authority: CN
Inventors: A. R. Jones
Original assignee: Telefonaktiebolaget LM Ericsson AB
Current assignee: Telefonaktiebolaget LM Ericsson AB
Priority date: 2008-10-06
Filing date: 2008-10-06
Publication date: 2013-06-26
Anticipated expiration: 2028-10-06
Also published as: CN103177725B

Abstract

The invention relates to a method and a device for transmitting aligned multi-channel audio. In the method, audio is encoded and the encoded audio is included in a digital transport stream. The method comprises receiving a plurality of temporally co-located audio signals at an encoder input, assigning identical timestamps per unit time to all of the temporally co-located audio signals, and incorporating the identically timestamped audio signals into the digital transport stream. The invention further provides a method for decoding the encoded data, an encoding device and a decoding device.

Description

Method and apparatus for delivery of aligned multi-channel audio

Technical field

The present invention relates generally to audio coding, and in particular to a method and apparatus for delivery of aligned multi-channel audio.

Background technology

Modern audio-visual coding standards such as MPEG-1 and MPEG-2 provide means for transporting multiple audio and video components within a single transport stream. Individual, separate audio components can be aligned with selected video components. Synchronized multi-channel audio such as surround sound is provided for only as a single pre-mixed surround sound audio component, for example a single Dolby 5.1 audio component. However, there is currently no means of transporting individual multi-channel audio components in synchronized form.

In particular, the MPEG-1 and MPEG-2 audio standards (ISO/IEC 11172-3 and ISO/IEC 13818-3 respectively) describe means of coding and packetizing digital audio signals. These include schemes that specify various forms of multi-channel sound carried in a single MPEG-2 transport stream component. These provisions are backward compatible with the earlier MPEG-1 audio system. In the prior art, the required synchronization of the channels can be guaranteed only by gathering several audio channels into such a single transport component. These schemes all require one of the following:

[a] the use of a surround sound compression method (for example, Dolby 5.1), or

[b] the use of a proprietary compression technique, or

[c] the use of uncompressed audio.

Surround sound compression methods reduce the bit rate required for multiple channels by exploiting the redundancy that exists between the channels, and by exploiting certain spatial characteristics of the human auditory system that make some features of the sound undetectable so that they can be masked in processing. These sophisticated schemes are appropriate where only a single coding stage, with one encode and one decode operation, is expected. They are not ideal where, for practical and operational reasons (for example, feeding a source from a remote location to a central editing facility), the signal may need to be re-encoded several times for transport across a network. Encoding operations performed one after another produce a concatenation problem that degrades audio quality. This is especially so where capacity is limited, so that the bit rate is reduced significantly and little headroom is left to absorb the degradation from concatenated coding and transport.

Proprietary compression techniques generally require additional dedicated external equipment, leading to greater expense and operational complexity. This approach may also suffer the same quality degradation caused by concatenating more than one encode/decode stage.

If, on the other hand, the audio is sent in uncompressed form (for example, as uncompressed linear PCM samples), the required data rate is high (for example, approximately 3 Mbit/s per dual-channel pair).
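By way of illustration only, the order of magnitude of the uncompressed figure can be checked with simple arithmetic. The sketch below assumes a 48 kHz sample rate and two cases that are not stated in the text: 24-bit PCM payload only, and carriage in 32-bit subframes as in an AES3-style interface. The function name `pcm_bitrate` is hypothetical.

```python
def pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the bit rate in bit/s of uncompressed linear PCM."""
    return sample_rate_hz * bits_per_sample * channels


# Assumed parameters (not taken from the source text): 48 kHz, one channel pair.
payload_only = pcm_bitrate(48_000, 24, 2)   # 24-bit audio words only
with_framing = pcm_bitrate(48_000, 32, 2)   # 32-bit subframes, AES3-style

print(payload_only / 1e6, "Mbit/s")  # 2.304
print(with_framing / 1e6, "Mbit/s")  # 3.072
```

Under these assumptions a single pair needs between about 2.3 and 3.1 Mbit/s, which is consistent with the approximate figure quoted above.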

Although the above is generally not a problem when finished audio-visual media is delivered to consumers, it does pose a problem for the audio-visual production industry. The industry increasingly uses the ubiquitous modern high-speed data networks to send live "raw" audio-visual media (that is, the source material used to make television, film and other media) in compressed form between production facilities, or indeed from a production facility to a television or audio network distribution point, for example a terrestrial transmitter, satellite uplink or cable head end.

For example, a film crew on location typically feeds audio-visual material to a television studio for editing and distribution to affiliate television stations, and ultimately for broadcast to viewers. The audio-visual coding standards described above do not allow synchronized multi-channel audio to be transported without pre-mixing, which either increases the complexity of the crew's field equipment or prevents them from providing multi-channel audio at all.

There is a particular need to transport multi-channel audio that requires accurate channel-to-channel alignment, so that, where multi-channel time alignment is important, the audio signals can subsequently be encoded as surround sound audio. This transport should use the above MPEG standards, because most production equipment is already set up for use with them.

Accordingly, the present invention proposes a method and apparatus that provide a cost-effective and convenient mechanism for delivering multi-channel audio while maintaining accurate inter-channel time alignment and sound quality.

Summary of the invention

Embodiments of the present invention provide a method of encoding audio and including the encoded audio in a digital transport stream, comprising receiving a plurality of temporally co-located audio signals at an encoder input, assigning identical timestamps per unit time to all of the plurality of temporally co-located audio signals, and incorporating the identically timestamped audio signals into the digital transport stream.

Optionally, the step of receiving further comprises sampling the temporally co-located audio signals to form frames of audio data of a predetermined size, and aligning the frames of audio data to keep the audio signals temporally co-located, wherein the step of assigning identical timestamps is carried out on the aligned frames of audio data.

Optionally, the method further comprises compressing the aligned frames of audio data with identically configured audio encoders before the timestamps are assigned, and assigning the compressed, identically timestamped audio data to a plurality of mono channels of the transport stream.

Optionally, the plurality of mono channels comprises one or more conventional dual mono audio components.

Optionally, the predetermined size is the size of an Access Unit in the MPEG standards, and the video transport stream is an MPEG-1 or MPEG-2 transport stream.

Optionally, the timestamp is a Presentation Time Stamp.

Optionally, in the method of any preceding claim, the step of incorporating the audio into the digital video stream comprises multiplexing the compressed, identically timestamped audio data into the transport stream.

Embodiments of the present invention also provide a method of decoding a digital transport stream that includes audio encoded according to any of the above encoding methods, comprising receiving a plurality of identically timestamped audio signals representing temporally co-located individual audio channels, detecting the timestamps to determine which timestamps are shared, and outputting the temporally co-located audio channels as a plurality of channels according to the detected timestamps.

Optionally, the plurality of identically timestamped audio signals have been sampled and aligned to form aligned frames of audio data, and the identical timestamps have been applied to the aligned frames of audio data.

Optionally, the aligned frames of audio data were compressed before the timestamps were assigned, and the method further comprises decompressing the frames of audio data to produce the individual audio signals for output.

Optionally, the step of outputting the temporally co-located audio channels comprises presenting the audio using the timestamp of only one of the temporally co-located audio signals.

Optionally, the digital transport stream is a digital video transport stream, and the aligned frames of audio data comprise PES packets.

Embodiments of the present invention also provide an encoding device adapted to carry out any of the above encoding methods.

Embodiments of the present invention also provide a decoding device adapted to carry out any of the above decoding methods.

Embodiments of the present invention also provide a digital transmission system comprising at least one such encoding device, at least one such decoding device, and a communication link between them.

Embodiments of the present invention also provide a computer-readable medium carrying instructions which, when executed, cause computer logic to carry out any of the described encoding methods, decoding methods, or both.

Embodiments of the present invention also provide an encoding device for encoding audio and producing a transport stream from a plurality of temporally co-located audio channels, the encoding device comprising: at least one encoder for encoding the audio according to a predetermined compression scheme; one packetizer per encoder, for packetizing the encoded audio into predetermined portions of audio; a framing function adapted to provide identical timestamps to the packetizers for inclusion in a plurality of the predetermined portions of audio data, so that the encoded audio indicates that the audio channels are temporally co-located; and a multiplexer for multiplexing together the outputs of the at least one encoder and packetizer pair.

Description of drawings

The method and apparatus for delivery of aligned multi-channel audio will now be described, by way of example only, with reference to the accompanying drawings, in which:

Fig. 1 shows a schematic block diagram of part of an analog or digital mono encoding device according to the prior art;

Fig. 2 shows a schematic block diagram of part of an analog or digital mono decoding device according to the prior art;

Fig. 3 shows a schematic block diagram of part of an analog or digital stereo or dual mono encoding device according to the prior art;

Fig. 4 shows a schematic block diagram of part of an analog or digital stereo or dual mono decoding device according to the prior art;

Fig. 5 shows a flow chart of the encoding portion of a method for delivery of aligned multi-channel audio according to an embodiment of the invention;

Fig. 6 shows a flow chart of the decoding portion of a method for delivery of aligned multi-channel audio according to an embodiment of the invention;

Fig. 7 shows a schematic block diagram of part of a multi-channel analog or digital encoding device according to an embodiment of the invention;

Fig. 8 shows a schematic block diagram of part of a multi-channel analog or digital decoding device according to an embodiment of the invention.

Embodiment

An embodiment of the invention will now be described with reference to the accompanying drawings, in which the same or similar parts or steps are given the same or similar reference numerals.

The following is based on the MPEG-2 standard. However, it will be appreciated that the underlying invention is equally applicable to other compressed audio standards that support dual mono coding, such as Advanced Audio Coding (AAC) or Dolby Digital.

The MPEG-1 and MPEG-2 audio specifications describe means of coding and packetizing digital audio signals. The processed audio signals are passed to the MPEG systems layer (ISO/IEC 13818-1) for further packetization into a transport stream (TS), which is then carried over a communication network such as a telecommunications or broadcast system. These MPEG packetization rules define a syntax that gives structure to the bit stream. In particular, the bit stream includes timestamps, which the decoder uses to control the timing of the decoded and recovered output audio. These timestamps are used to time the audio and video components accurately relative to one another.

The MPEG standards define two types of timestamp: the Decode Time Stamp (DTS), which defines when the received coded data is to be presented to the decoder, and the Presentation Time Stamp (PTS), which defines when the decoded audio or video is to be output by the system to be heard or seen respectively. The latter type is the one most commonly used.
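As a minimal sketch of how a Presentation Time Stamp relates to audio frames: in the MPEG-2 systems layer the PTS is a 33-bit count of a 90 kHz clock. The example below assumes a 48 kHz sample rate and frames of 1152 samples; the function name `frame_pts` and the parameter defaults are illustrative assumptions, not part of the described method.

```python
PTS_CLOCK_HZ = 90_000      # PTS/DTS tick rate in MPEG-2 systems
PTS_MODULUS = 1 << 33      # PTS is a 33-bit counter and wraps around


def frame_pts(frame_index: int,
              samples_per_frame: int = 1152,   # assumed frame size
              sample_rate_hz: int = 48_000,    # assumed sample rate
              start_pts: int = 0) -> int:
    """Return the PTS, in 90 kHz ticks, of the first sample of a frame."""
    samples_elapsed = frame_index * samples_per_frame
    ticks = samples_elapsed * PTS_CLOCK_HZ // sample_rate_hz
    return (start_pts + ticks) % PTS_MODULUS


print(frame_pts(0), frame_pts(1), frame_pts(10))
```

With these assumed parameters each frame advances the PTS by 2160 ticks (24 ms), so two frames cut from the same sample instant on different channels naturally map to the same PTS value.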

By managing these timestamps as described in more detail below, an audio-visual transport system according to an embodiment of the present invention can present several individual audio signals of an encoded or decoded multi-channel set at exactly the same time, thereby achieving the required synchronization across the multi-channel set.

Fig. 1 shows a schematic block diagram of part of an analog or digital mono encoding device according to the prior art, illustrating the system flow of audio data through an encoding process such as MPEG-2. The decoding process is its inverse and is shown in Fig. 2.

All examples in the figures show dual analog 110 and digital 105 inputs. The analog inputs are digitized by analog-to-digital (A/D) converters 120 before entering the encoder 130, while the digital audio 105 is fed directly into the encoder 130. The individual channels are denoted by the labels a-d. It will be appreciated, however, that the invention is not limited to any set number of channels and is fully scalable, and that the audio inputs may be analog only, digital only, or dual format as shown.

Where the input is in analog form, it is digitally sampled, for example as linear pulse code modulation (PCM), before entering the encoder 130, where it is converted into a bit-rate-reduced form.

The encoder 130 outputs a plurality of coded digital bit streams (one bit stream per individual audio channel) to the packetizer 140, which packetizes the audio into groups of audio samples. A defined group of audio samples is collected together and associated, in the coded domain, with a block of bits called an Access Unit. Each Access Unit is a self-contained packet of part of the audio, for example a frame of 1152 audio samples.

Subsequently, the packetized channels are multiplexed together by the multiplexer 150 to form the transport stream 160.

The decoding device is shown in Fig. 2 and is essentially the inverse. The transport stream 160 is demultiplexed by the demultiplexer 250, which provides the individual packetized audio channels for depacketization by the depacketizer 240. They are then decoded in the decoder stage 235 and output either as a direct digital stream 105 or, via a digital-to-analog converter 220, in analog form 110.

Figs. 3 and 4 show encoding and decoding devices for the dual mono or synchronized stereo case. Multiple stereo or dual mono pairs can be added to the system, but these pairs will not be locked together, because the MPEG standards make no provision for this (unlike the surround sound option, which suffers the problems described in the Background section). They therefore remain separate entities with separate timestamps, each reconstructed individually at the decoder output.

Any given transport stream may contain a plurality of independent audio channels, such as different language sound tracks, each encoded separately.

Depending on the number of channels required and on the bit rate allocation and quality criteria the system operator selects for each channel, a number of different associations exist between an input audio group and its coded counterpart. The normal mode of operation is for these audio channels to be coded independently, with no particular requirement to lock them together.

Some of these channels may be associated with an accompanying video signal (that is, where the audio is video or television sound), and the system will use timestamps common to the audio and video streams so that these signals are properly aligned with their respective video. Audio alignment is not very precise in this case: it only needs to ensure that lip-sync requirements are met. This level of alignment is not as accurate as multi-channel surround sound requires.

Therefore, each independent mono, dual mono or stereo audio signal (see Fig. 3) normally has a separate identity (that is, an elementary stream) in the multiplexed output stream, and so each has its own timestamps, generated independently by the encoding device during the packetization stage and used separately at the decoder.

In brief overview, the proposed solution to the above shortcomings of the prior art adapts the common MPEG-2 transport format used for standard mono or two-channel stereo by taking the timing control provided for those cases and extending it to the multi-channel case. A decoder according to an embodiment of the invention can therefore present a plurality of accurately aligned audio channels, which solves the synchronization problem and avoids concatenation of coding systems and the accompanying degradation.

This solution is fully compatible with the existing MPEG-2 syntax, so a common compliant decoder can present the multi-channel audio in the conventional time relationship, although not with the same degree of alignment accuracy as a decoder according to an embodiment of the invention. The method can also be repeated in concatenated systems without fear of degradation.

In more detail, in the proposed multi-channel synchronization method, the several input audio signals that need to be handled independently yet synchronously are processed under the same timing control, so that identical timestamps are assigned in the transport syntax and the decoder will also maintain alignment.

Fig. 5 shows part of an encoding method 500 according to an embodiment of the invention.

In step 510, a predetermined number (N) of independent audio channels, which are to be transported synchronously over a single transport stream without being converted into a single component, are input to the encoding device. Per unit time, the encoding device forms K aligned audio samples, taking one sample from each input audio channel, where the samples correspond to the same instant.

Per unit time, the encoding device forms N/2 frames of K aligned audio samples (step 520), where each frame corresponds to the same start time but to its respective audio channels, ready to be compressed in step 530 with the selected compression method to form Access Units, typically using dual mono audio compression for each pair of audio channels.

Subsequently, in step 540, identical timestamps are assigned to the compressed frames of audio samples (that is, the Access Units), typically in the form of a header field.
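For illustration, the header field in question can be pictured as the five-byte PTS field of an MPEG-2 PES header, in which the 33-bit value is split into 3, 15 and 15 bits separated by marker bits. The sketch below shows the PTS-only form (4-bit prefix 0010); the helper names `encode_pts` and `decode_pts` are hypothetical, and the source does not prescribe any particular implementation.

```python
def encode_pts(pts: int) -> bytes:
    """Pack a 33-bit PTS into the 5-byte PES header field (PTS-only form)."""
    if not 0 <= pts < (1 << 33):
        raise ValueError("PTS must fit in 33 bits")
    return bytes([
        0x20 | ((pts >> 29) & 0x0E) | 0x01,  # '0010', PTS[32..30], marker
        (pts >> 22) & 0xFF,                  # PTS[29..22]
        ((pts >> 14) & 0xFE) | 0x01,         # PTS[21..15], marker
        (pts >> 7) & 0xFF,                   # PTS[14..7]
        ((pts << 1) & 0xFE) | 0x01,          # PTS[6..0], marker
    ])


def decode_pts(field: bytes) -> int:
    """Recover the 33-bit PTS from the 5-byte PES header field."""
    return ((((field[0] >> 1) & 0x07) << 30)
            | (field[1] << 22)
            | ((field[2] >> 1) << 15)
            | (field[3] << 7)
            | (field[4] >> 1))


# The same five bytes would be written into the PES header of every
# Access Unit cut from the same sample instant.
shared_field = encode_pts(2160)
assert decode_pts(shared_field) == 2160
```

Because the field is byte-for-byte identical for every co-timed Access Unit, a receiver can detect shared timestamps with a simple equality comparison.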

In step 550, the timestamped compressed frames of audio samples are packaged (that is, packetized) into PES packets which, in the case of the MPEG-2 standard for example, contain the respective standard dual mono pairs. The remainder of the encoding process is the same as in the normal case: the packetized audio is transport-packetized and multiplexed into the output transport stream 160 together with any associated video (if applicable) and other channels.
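Steps 510 to 550 can be sketched, under stated assumptions, as follows. The sketch takes N equal-rate channels already sampled as lists of integers, uses K = 1152 and a 48 kHz sample rate, and leaves the samples uncompressed as a stand-in for the real audio encoder. The names `AccessUnit` and `encode_aligned` are hypothetical, and N is assumed to be even.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class AccessUnit:
    component: int                                  # index of the dual mono pair
    pts: int                                        # shared timestamp, 90 kHz ticks
    payload: Tuple[Tuple[int, ...], Tuple[int, ...]]  # stand-in for compressed data


def encode_aligned(channels: Sequence[Sequence[int]],
                   k: int = 1152,
                   sample_rate_hz: int = 48_000) -> List[AccessUnit]:
    """Form N/2 co-timed frames per unit time and stamp them identically."""
    n = len(channels)
    if n == 0 or n % 2:
        raise ValueError("this sketch assumes a non-zero, even channel count")
    usable = min(len(ch) for ch in channels)
    units: List[AccessUnit] = []
    for frame_index in range(usable // k):
        start = frame_index * k                      # same start for all channels
        pts = start * 90_000 // sample_rate_hz       # one PTS per frame set (step 540)
        for pair in range(n // 2):                   # N/2 frames (step 520)
            left = tuple(channels[2 * pair][start:start + k])
            right = tuple(channels[2 * pair + 1][start:start + k])
            # A real encoder would compress here (step 530); omitted in this sketch.
            units.append(AccessUnit(pair, pts, (left, right)))
    return units


four_channels = [[c * 10_000 + i for i in range(2304)] for c in range(4)]
for unit in encode_aligned(four_channels):
    print(unit.component, unit.pts)
```

The design point illustrated is that the timestamp is computed once per frame set and then copied to every pair, rather than being generated independently by each packetizer.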

Fig. 6 shows the inverse decoding process according to an embodiment of the invention.

In particular, the decoding method comprises receiving N/2 pairs of mono audio channels 610, detecting the timestamps 620, determining which pairs share a timestamp 630, decompressing those pairs into N Access Units of mono audio samples relating to the same presentation time 640, and then outputting the decompressed audio so that the N samples are presented exactly simultaneously according to the single common timestamp 650.
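The decoding steps can be sketched in the same spirit. The example below assumes each received unit is a tuple of (component index, PTS, (left samples, right samples)) that has already been decompressed, and it ignores PTS wrap-around for brevity. The function name `present_aligned` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Unit = Tuple[int, int, Tuple[Sequence[int], Sequence[int]]]


def present_aligned(units: Sequence[Unit],
                    n_pairs: int) -> List[Tuple[int, List[Sequence[int]]]]:
    """Group units by shared PTS and emit all N channels of a group together."""
    by_pts: Dict[int, Dict[int, Tuple[Sequence[int], Sequence[int]]]] = defaultdict(dict)
    for component, pts, payload in units:            # detect timestamps (620)
        by_pts[pts][component] = payload
    presented = []
    for pts in sorted(by_pts):                       # wrap-around not handled here
        group = by_pts[pts]
        if len(group) != n_pairs:                    # not all pairs share this PTS (630)
            continue
        channels: List[Sequence[int]] = []
        for component in range(n_pairs):
            left, right = group[component]
            channels += [left, right]
        presented.append((pts, channels))            # one timestamp drives output (650)
    return presented


received = [(0, 0, ([1], [2])), (1, 0, ([3], [4])), (0, 2160, ([5], [6]))]
print(present_aligned(received, n_pairs=2))
```

In this sketch a group is released only once every pair carrying that timestamp has arrived, so the single PTS of the group governs the presentation of all N channels.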

It will be appreciated that the alignment, compression and timestamping can be carried out by a single hardware component of the encoding device, and the inverse process by a single hardware component of the decoding device.

An encoding device for carrying out the above encoding method according to an embodiment of the invention is shown in Fig. 7. It can be seen that there is an additional processing stage, the multi-channel framing stage 770, which is provided to align the several audio signals and to arrange for a common timestamp to be used across the independent but synchronized audio channels in the packetization stage 140.

The described method and apparatus preferably operate by carrying the independent but synchronized audio channels in dual mono channels. The encoding device 700 of Fig. 7 (and its corresponding decoding device 800 of Fig. 8) is therefore shown with a separate encoder/decoder and packetizer/depacketizer for each pair of audio channels.

Fig. 7 shows an example with four synchronized independent audio channels, with dual (analog/digital) input capability. The A/D converters 120 (a-d) digitize the analog channels before they are provided to the framing stage 770. The digital inputs are fed directly into the framing stage 770.

The framing stage 770 creates blocks of temporally co-located audio samples from all the audio channels and marks them for processing with the same timestamp as is used for all the other temporally co-located audio samples. This generally takes the form of a timestamp synchronization signal 780, which is passed to the packetization stage 140 further down the processing pipeline.

Meanwhile, the audio samples formed in the framing stage 770 are provided as co-timed frames, for example as dual mono sample pairs, to the standard encoding stage 730, which in turn provides the coded audio samples to the packetization stage 140, where they are packetized according to the timestamp synchronization signal 780 provided by the framing stage 770.

A preferred embodiment uses sample blocks of Access Unit size and associated Presentation Time Stamps (PTS). The Access Units belong to multi-channel pairs compressed using a single digital signal processor, producing a set of PES packets with identical PTS values that contain the compressed audio relating to the original, exactly co-timed samples of audio data.

Where there is an odd number of input channels and dual mono is used as the transport mechanism, one channel of one dual mono pair may simply be filled with silence.
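A minimal sketch of this padding, assuming the channels are equal-length lists of integer PCM samples and that silence is represented by zero-valued samples (the function name `pair_channels` is hypothetical):

```python
from typing import List, Sequence, Tuple


def pair_channels(channels: Sequence[Sequence[int]]) -> List[Tuple[List[int], List[int]]]:
    """Group channels into dual mono pairs, padding an odd count with silence."""
    padded = [list(ch) for ch in channels]
    if len(padded) % 2:
        # Odd channel count: add one silent channel of matching length.
        padded.append([0] * len(padded[0]))
    return [(padded[i], padded[i + 1]) for i in range(0, len(padded), 2)]


pairs = pair_channels([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(pairs)  # [([1, 2, 3], [4, 5, 6]), ([7, 8, 9], [0, 0, 0])]
```

The silent channel keeps every dual mono component the same shape, so the framing and timestamping stages need no special case for an odd channel count.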

Subsequently, the outputs of the dual mono chains (encoder and packetizer pairs) are multiplexed together in the normal way by the multiplexer 150 to provide the output transport stream 160.

A decoding device 800 according to an embodiment of the invention is shown in Fig. 8.

The decoding operation decompresses the discrete Access Units of audio relating to the plurality of dual mono audio components, retaining their Presentation Time Stamps 835. The frames of decoded samples are then presented at the same time by the frame presentation stage 870, according to the common timestamp shared between them. The plurality of sample pairs relating to the exactly co-timed sample instant are thus presented together, achieving the aim of maintaining accurate channel-to-channel audio alignment across multiple channel pairs through the whole encode/decode processing chain.

Therefore, the complete scheme for synchronizing several channels of audio uses the following features at the encoding device:

Temporally co-located samples input across the plurality of audio channels are formed into aligned frames of audio samples whose size matches the Access Unit size of the compression;

The aligned audio frames are compressed with identically configured audio encoders, preferably assigning two mono channels (as a pair) to each compressed audio component. However, stereo channels or individual mono channels may also be used instead of, or as well as, dual mono pairs;

The compressed Access Units are preferably assigned identical Presentation Time Stamp values, or Decode Time Stamps (DTS) with a predetermined delay;

The compressed audio components are carried in the MPEG-2 transport stream as a plurality of conventional two-channel compressed audio components.

At the decoding device (that is, at the receiving location):

The plurality of compressed audio components are decoded, with the result that, at any given point in the respective streams, the multiple sets of decompressed frames of audio samples (that is, the decoded channels) have identical timestamps across the channels;

The decompressed audio frames for the plurality of channels are presented to the output using the Presentation Time Stamp of only one component, so that the output audio samples are temporally co-located (or follow the DTS by a predetermined period).

The above method and apparatus provide a means by which several channels of audio can be carried over a communication system such that they remain synchronized with one another to sample accuracy at all times. Previously, achieving this synchronization was limited to surround sound coding, which degrades when several coding stages are concatenated. The described method and apparatus avoid the degradation of prior-art systems without requiring more complex, and sometimes proprietary, surround sound solutions.

Embodiments of the invention therefore provide a means of sending "raw" multi-channel audio (that is, audio not yet mixed into a surround sound format) across the same transport stream as its associated video, reducing the degradation of sound quality caused by concatenation and the other problems associated with previously known audio transport methods. This also avoids the use of lossy surround sound processing before transport, or of very high bandwidth uncompressed linear PCM.

The invention is particularly suited to carrying multi-channel audio, without converting it to a single component (for example, 5.1 surround sound), together with broadcast-quality video. However, it will be appreciated that embodiments of the invention are equally applicable to audio-only transport streams, such as those used to carry multi-channel radio sound or the like.

The invention is particularly useful in systems where compressed audio is sent for surround sound processing at another location. This is because, when such compressed sources are used in a mix, misalignment of the compressed audio samples can give rise to compression artifacts, which in turn can cause unwanted audio impairments in the final surround audio mix.

A typical implementation will comprise an encoding device according to an embodiment of the invention at one end of a communication link and a decoding device according to an embodiment of the invention at the other end. If required, such encoder/decoder pairs can be repeated across multiple communication links.

The above method may be carried out by any suitably adapted or designed hardware. Parts of the method may also be embodied in a set of instructions stored on a computer-readable medium which, when loaded into a computer, digital signal processor (DSP) or similar device, causes that device to carry out the method.

Similarly, the method may be embodied as a specially programmed or hardware-designed integrated circuit, which operates to carry out the method on audio data loaded into the integrated circuit. The integrated circuit may form part of a general-purpose computing device such as a PC or the like, or of a more specialized device such as a games console, mobile phone, portable computing device or hardware audio/video encoder/decoder.

An exemplary hardware embodiment is a field programmable gate array (FPGA) programmed to carry out an embodiment of the described method and/or to provide the described apparatus. The FPGA is located on a daughterboard of a rack-mounted video server housed in a data center, for use, for example, in an IPTV television system and/or a television studio, or in a location video uplink van supporting a field news crew.

Another exemplary hardware embodiment of the invention is an audio and video transmission arrangement comprising a transmitter and receiver pair, in which the transmitter comprises the encoding device and the receiver comprises the decoding device, and in which each encoding device is implemented as an application-specific integrated circuit (ASIC).

It will be apparent to the skilled person that the exact order and content of the steps carried out in the methods described herein may be altered according to the requirements of a particular set of execution parameters, such as speed of encoding and the like. Furthermore, it will be appreciated that different embodiments of the disclosed apparatus may selectively implement certain features of the invention in different combinations, according to the requirements of a particular implementation of the invention as a whole. Accordingly, the claim numbering is not to be construed as a strict limitation on the ability to move features between claims, and portions of dependent claims may therefore be utilized freely.

Claims

1. A method of encoding audio and including the encoded audio in a digital transport stream, comprising:

receiving a plurality of temporally co-located audio signals at an encoder input;

sampling the temporally co-located audio signals to form aligned frames of audio data of a predetermined size;

assigning identical timestamps per unit time to the aligned frames of audio signals; and

incorporating the identically timestamped audio signals into the digital transport stream.

2. the method for claim 1 also comprises:

Arranged the described aligned frame of audio compressed data by identical audio coder configuration before assigning described timestamp; And

The voice data that compresses and be added with identical time stamp is assigned to a plurality of single channels of transport stream.

3. method as claimed in claim 2, wherein said a plurality of single channels comprise the two single audio frequency components of one or more routines.

4. as the described method of arbitrary front claim, wherein said pre-sizing is the size of addressed location in mpeg standard, and described video transmission stream is MPEG-1 or mpeg 2 transport stream.

5. as the described method of arbitrary front claim, wherein said timestamp is the presentative time stamp.

6. as the described method of arbitrary front claim, the step that wherein described audio frequency is attached in digital video frequency flow comprises:

The voice data that compresses and be added with identical time stamp is multiplexed in transport stream.

7. A method of decoding a digital transport stream, the method comprising:

receiving a digital transport stream including encoded audio;

obtaining from the transport stream frames of audio samples representing a plurality of temporally co-located individual audio channels;

detecting the timestamp of each frame to determine which frames are identically timestamped; and

presenting the identically timestamped frames at the same time, using the timestamp of only one of the temporally co-located audio signals.

8. The method of claim 7, wherein the encoded audio has been sampled and aligned to form aligned frames of audio data, and wherein the identical timestamps have been applied to the aligned frames of audio data.

9. The method of claim 8, wherein the aligned frames of audio data were compressed before the timestamps were assigned, the method further comprising:

decompressing the frames of audio data to produce the individual audio signals for presentation.

10. The method of any preceding claim, wherein the digital transport stream is a digital video transport stream and the aligned frames of audio data comprise PES packets.

11. An encoder for encoding audio and including the encoded audio in a digital transport stream, the encoder being arranged to:

receive a plurality of temporally co-located audio signals at an input;

12. A decoder for decoding a digital transport stream, the decoder being arranged to:

13. A digital transmission system comprising at least one encoder and at least one decoder, the encoder being arranged to:

incorporate the identically timestamped audio signals into the digital transport stream;

the decoder being arranged to:

14. A computer-readable medium carrying instructions which, when executed, cause computer logic to carry out the method of any one of claims 1 to 10.