WO2015107372A1 - Method and apparatus for determining synchronisation of audio signals - Google Patents
Method and apparatus for determining synchronisation of audio signals
- Publication number
- WO2015107372A1 (PCT/GB2015/050119)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- samples
- audio signal
- timing information
- signal
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/04—Synchronising
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/4302—Content synchronisation processes, e.g. decoder synchronisation
- H04N21/4307—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
- H04N21/43076—Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of the same content streams on multiple devices, e.g. when family members are watching the same movie on different devices
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/233—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
- H04N21/2368—Multiplexing of audio and video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/238—Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
- H04N21/2381—Adapting the multiplex stream to a specific network, e.g. an Internet Protocol [IP] network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/242—Synchronization processes, e.g. processing of PCR [Program Clock References]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/434—Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
- H04N21/4341—Demultiplexing of audio and video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
- H04N21/4394—Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/643—Communication protocols
- H04N21/6437—Real-time Transport Protocol [RTP]
Definitions
- This invention relates to timing of digital audio and/or video signals and to methods and apparatus for determining synchronisation of such signals.
- SDI: Serial Digital Interface
- WO2013/117889 of British Broadcasting Corporation describes a system by which signals may be converted between protocols such as SDI and infrastructure standards of IP (Internet Protocol), more specifically RTP (Real Time Protocol).
- The device used for converting between such protocols is referred to as a "Stagebox" and is a marked departure from the broadcast standards of SDI.
- The Stagebox builds on the concept of sending and receiving video and audio across broadcast centres, and looks at the tools required by camera operators and in studios. Based lower down the 'food chain', the Stagebox aims to commoditise IT equipment and standards in the professional broadcast arena.
- The invention provides a method, an encoder/decoder, and a transmitter or receiver provided with functionality to alter an audio and/or video component of a signal so as to allow for testing of synchronisation.
- The invention also provides a device that may be provided as an addition to a camera or to studio equipment.
- In a first aspect, samples of an audio component of a signal are altered such that the contents represent time information, such as a timecode, sample number and channel, instead of representing the original audio data.
- This first aspect may be referred to as a "test mode" as the quality of the audio signal is at least impaired.
- In a second aspect, a repeated code is provided within lower significant bits of samples of an audio component such that a receiver may align a local version of that code with the received code to determine relative synchronisation. This mode may be referred to as a "live mode" as the quality of the audio signal is minimally impacted.
- The invention may also be delivered by way of a method of operating any of the functionality described above, and as a system incorporating multiple cameras, studio equipment and apparatus as described above.
BRIEF DESCRIPTION OF THE DRAWINGS
- Fig. 1 is a schematic diagram of a system embodying the invention
- Fig. 2 shows the structure of SDI signalling
- Fig. 3 shows the sampling of an audio waveform
- Fig. 4 shows the insertion of audio samples into ancillary data space
- Fig. 5 shows the layout of audio samples within audio packets
- Fig. 6 shows an example audio sample modified according to a "test mode"
- Fig. 7 shows an example audio sample modified according to a "live mode"
- Fig. 8 is a schematic diagram of an encoder/ decoder embodying the invention.
- An embodiment of the invention comprises a device that is connectable to a camera to provide modification to audio-video signals such as conversion from signalling required by the camera to IP data streams and from IP data streams to signalling for the camera.
- The same device may also be used at studio equipment for converting IP streams received from cameras for use by the studio equipment.
- A single type of device may be deployed at existing items of television production equipment such that transmission between devices may use IP.
- An advantage of an embodiment of the invention is that it allows camera equipment, of the type used in a studio environment or used remotely in conjunction with a production facility, to take advantage of transmission of data to and from the device over packet-based networks.
- Such a system may include multiple cameras, studio equipment and potentially one or more central servers for control, each optionally having a device embodying the invention.
- The embodiment may additionally provide further functionality; for example, converting coders will be automatically set depending upon connectivity factors such as how many cameras are detected in the system, what editing system is used, and so on.
- The server within a system can send instructions back to each device to change various settings using return packets.
- The cameras may be anywhere in the world and the instructions may include corrective information or other control data such as a "tally light".
- The device may be implemented as an integral part of future cameras and studio equipment.
- The main embodiment that will be described, though, is a separate device that may be used as an add-on to existing equipment such as cameras, mixing desks and other studio equipment.
- FIG. 1 shows a schematic end to end example embodying the invention.
- An AV device 10 such as a camera provides audio-video signals for delivery to a destination device 16 such as a studio monitor.
- The camera typically provides data in SDI format.
- A network 2 provides the delivery mechanism but, as discussed above, may degrade the synchronisation of the signals.
- A converter 12 on the transmitter side and a converter 14 on the receiver side (which may be identical converters) are provided to modify the signals and, optionally, provide conversion as will be described.
- The converters are implemented as "Stageboxes" as noted above.
- Methods and devices embodying the invention preferably operate using signals according to existing standards, such as SDI, and this standard will briefly be described by way of background for ease of understanding. For the avoidance of doubt, other standards are possible as are other formats within such standards. Other examples of line numbers, frame rates, sampling rates and so on may be used and the following is just one example.
- FIG. 2 shows the features of an SDI signal relevant to this disclosure.
- HD-SDI is defined in SMPTE ST 292-2008, and contains three main elements: video, audio and ancillary data.
- A video frame comprises 1125 lines of pixels, of which 1080 are active video lines 20 having start of active video (SAV) blocks 26 and end of active video (EAV) blocks 28 which separate the active portions of lines from a horizontal blanking interval 22 (the SAV and EAV are also present in the VANC discussed below).
- The active portions comprise 1920 pixels.
- The horizontal blanking interval 22 carries horizontal ancillary data (HANC) and the vertical blanking interval 24 carries vertical ancillary data (VANC).
- The horizontal ancillary data (HANC) contains packets 30 which sit between the EAV and SAV and are structured as discussed below.
- The following standards describe further aspects of SDI: SMPTE ST 12-2008 for timecode, SMPTE ST 272-2004 relating to placing of audio data into the video ancillary space, SMPTE ST 274-2008 relating to how the video waveforms are made up, SMPTE 291 relating to how ancillary data packets are formed (used to carry the audio and timecode data), and SMPTE ST 299-2004 relating to how 24-bit digital audio is carried.
- The Stagebox fully supports these standards with regard to the different frame rates and resolutions for video, and handles all of the main elements of the signal.
- Audio data packets carry all the information in the AES bit stream.
- The audio data packet 30 is located in the ancillary data space 22 of the digital video on most of the television lines in a field.
- An audio control packet is transmitted once per field in an interlaced system and once per frame in a progressive system.
- The audio control packet is optional for the default case of 48-kHz synchronous audio (20 or 24 bits), and is required for all other modes of operation.
- Auxiliary data are carried in an extended data packet corresponding to and immediately following the associated audio data packet.
- The 1920x1080 image structure defined in this standard is mapped onto an interface that contains 1125 total lines.
- A frame comprises the indicated total number of lines; each line at the interface is of equal duration determined by the interface sampling frequency and the luminance samples per total line (S/TL).
- Raster pixel representation at the interface is presented from left to right, and in the raster shall be presented from top to bottom. Lines are numbered in time sequence according to the raster structure. Each line is represented by a number of samples, equally spaced.
- A progressive system shall convey 1080 active picture lines per frame in order from top to bottom.
- Ancillary data packets and space formatting described by this Standard reside in an Ancillary space defined by the interconnecting interface document.
- Ancillary space in a serial interface is a space not used by the main data stream and is used as a transport for data associated with the main data stream.
- The type of payload data carried in the ancillary space is then defined in separate application documents.
- The ancillary space that is located between the EAV and SAV markers is called horizontal ancillary space (HANC space).
- SMPTE ST 299-2004 defines the mapping of 24-bit AES digital audio data and associated control information into the ancillary data space of a serial digital video conforming to SMPTE ST 292-2008.
- Audio data derived from two channel pairs are configured in an audio data packet.
- Two types of ancillary data packets carrying AES audio information are defined and formatted per SMPTE ST 292-2008.
- Each audio data packet carries all of the information in the AES bit stream as defined by AES3.
- The audio data packet shall be located in the horizontal ancillary data space of the Cb/Cr data stream.
- An audio control packet shall be transmitted once per field in an interlaced system and once per frame in a progressive system in the horizontal ancillary data space of the second line after the switching point of the Y data stream.
- Data IDs are defined for four separate packets of each packet type. This allows for up to eight channel pairs. In this standard, the audio groups are numbered 1 through 4 and the channels are numbered 1 through 16.
- An SDI formatter places the audio data packet in the horizontal ancillary space following the video line during which the audio sample occurred. Following a switching point, the audio data packet is delayed one additional line to prevent data corruption.
- An analogue audio source, e.g. a microphone, is typically sampled at a rate of 48 kHz (approximately every 20 microseconds) with a bit depth of 24 bits for each sample.
- Figure 2 shows two such sampling points A and B. There are 25 frames of video per second, each having 1125 lines (approximately 35 microseconds per line). There will therefore be one or two samples of audio per line of video.
- Figure 3 shows the sampling of a waveform and Figure 4 shows how those samples are then inserted in packets within the horizontal blanking interval at the end of a respective line.
- Synchronisation of the audio and video is maintained by virtue of the fact that the data is serial and synchronous.
- Decoding the audio samples as they are received within the data stream ensures that they are presented to a digital-to-analogue converter (DAC) within a tolerance of one line of video, namely approximately 35 microseconds. Any variation by this amount is inaudible in the resulting audio signal.
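- The relationship between these figures can be checked with a few lines of arithmetic. The following is a minimal sketch using only the values stated above:
```python
# Worked numbers for the raster described above: 1125 total lines,
# 25 frames per second, 48 kHz / 24-bit audio.
SAMPLE_RATE_HZ = 48_000   # audio samples per second
FRAME_RATE_HZ = 25        # video frames per second
TOTAL_LINES = 1125        # total lines per frame (1080 of them active)

samples_per_frame = SAMPLE_RATE_HZ // FRAME_RATE_HZ      # 1920
line_duration_us = 1e6 / (FRAME_RATE_HZ * TOTAL_LINES)   # ~35.6 microseconds
samples_per_line = samples_per_frame / TOTAL_LINES       # ~1.71

print(samples_per_frame, round(line_duration_us, 1), round(samples_per_line, 2))
# -> 1920 35.6 1.71: one or two audio samples fall within each video line,
#    so decoding on arrival keeps audio within one line (~35 us) of the video.
```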
- FIG. 5 shows the structure of the audio packets within the SDI standards and will be described briefly for ease of reference, though this aspect is known fully to the skilled person.
- An audio packet is shown in the upper part of Figure 5 and comprises a payload of audio samples within words of audio data UDW2 - UDW17, as well as header and footer words indicating packet structure.
- The header words include ADF words comprising a start code indicating a data packet.
- A DID word indicates that the packet comprises audio data.
- A DBN word provides a sequential count.
- A DC word provides the packet length.
- CLK words provide a clock or timing indication as to where in relation to the video line the sample was taken.
- The footer words include error correction words UDW18 - UDW23 and a checksum word CS.
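- For illustration, the word layout just described can be summarised in a simple structure. The sketch below paraphrases the description above; it is not a bit-exact rendering of SMPTE ST 299:
```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AudioDataPacket:
    """Illustrative summary of the SDI audio data packet words described above."""
    adf: List[int] = field(default_factory=list)  # ADF: start code marking an ancillary data packet
    did: int = 0                                  # DID: identifies the packet as audio data
    dbn: int = 0                                  # DBN: sequential count of packets
    dc: int = 0                                   # DC: packet length
    clk: List[int] = field(default_factory=list)  # CLK: sample position relative to the video line
    udw: List[int] = field(default_factory=list)  # UDW2-UDW17: audio sample payload
    ecc: List[int] = field(default_factory=list)  # UDW18-UDW23: error correction words
    cs: int = 0                                   # CS: checksum word
```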
- A camera may be attached to the so-called "Stagebox" for conversion of its output to an IP stream, and a remote control remote from the camera may be attached to a second such Stagebox for converting between IP and control signals.
- Each of the camera and the remote control needs to be unaware of the intermediary IP network and to send and receive appropriate timing signals in the manner of a synchronous network, although the intermediary is an asynchronous open-standard IP network.
- Each device attached to an IP network requires functionality to provide timing. For this purpose a timing arrangement is provided.
- A device embodying the invention incorporates a new arrangement for providing timing.
- The "Stagebox" device can operate as an SDI to IP and IP to SDI bridge on a local network, and may be used as part of the wider IP Studio environment.
- This disclosure describes concepts addressing the problems of timing synchronisation in an IP network environment.
- AV material is captured, translated into an on-the-wire format, and then transmitted to a receiving device, which then translates it back to the original format.
- The media data arrive with the same timing relationship as they are sent, so the signals themselves effectively carry their own timing.
- In any point-to-point IP audio-visual (AV) link, the receiving end must employ a buffer of data which is written to as data arrive and read from at a fixed frequency for content output.
- The transmitter will transmit data at a fixed frequency and, except in cases of extreme network congestion, the frequency at which the data arrive will, when averaged out over time, be equal to the frequency at which the transmitter sends them.
- If the write and read frequencies differ, the receive buffer will either start to fill faster than it is emptied or empty faster than it is filled. If, over time, the rate of reception averages out to be the same as the rate of processing at the receive end, this will be a temporary effect; if the two frequencies are notably different, however, the buffer will eventually either empty entirely or overflow, causing disruptions in the stream of media.
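- This failure mode can be made concrete with a toy model. The 48 kHz rates, 1 ppm offset and 1000-sample buffer below are illustrative assumptions, not figures from the description:
```python
def buffer_level(seconds: float, write_hz: float, read_hz: float, start: float = 1000.0) -> float:
    """Receive-buffer occupancy in samples: filled at write_hz, drained at read_hz."""
    return start + (write_hz - read_hz) * seconds

# A mismatch of ~1 ppm at 48 kHz moves the buffer by ~0.048 samples per second:
# negligible over minutes, but a 1000-sample buffer underruns after ~6 hours.
print(buffer_level(6 * 3600, write_hz=48_000.000, read_hz=48_000.048))  # -> ~-36.8 (underrun)
```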
- The first improvement may be referred to as a "test mode", as the signalling is particularly beneficial for testing synchronisation.
- The improvement is not, however, limited to testing purposes and can be used with a live audio-video signal.
- The concept behind the first improvement is that one or more of the samples of audio data within audio packets are modified to include timing information in place of audio information. Such modification may be applied to every sample, selected samples or potentially to random samples.
- The preferred approach is to provide the timing information in place of all of the sample bits for all audio packets such that the packets no longer convey audio information, but instead convey the timing information. It is for this reason that this improvement is referred to as a test mode.
- FIG. 6 shows an example SDI signal modified according to the first improvement of an embodiment of the invention.
- An SDI signal comprises an audio component and a video component, with the audio component being arranged as samples with one or more samples provided for each line of video.
- A typical implementation will have audio samples approximately every 20 microseconds, which happens to provide 1920 samples per frame, each sample having 24 bits and each being provided in the horizontal ancillary space following each video line.
- Such an arrangement intends that the receiver decode the audio samples such that synchronisation is maintained with each line of video.
- Synchronisation can be lost even when the signal is transmitted over a synchronous network, but can particularly be lost when the signal is converted to Internet Protocol and transmitted over an asynchronous network.
- The first improvement modifies the audio component by providing a timestamp in place of audio data bits in at least some, and preferably all, audio samples.
- As a result, the signal can no longer provide an audio output.
- However, as the structure of the signal is retained, it can still be processed by any standard studio equipment using the SDI protocol in the transmission chain.
- The structure of the inserted timing information could take a variety of forms. Common to these forms is that the timing information can identify timing relative to another timing source, in particular a timecode of an associated video stream.
- A timecode may be in the form HH:MM:SS:FF, where the value is described in terms of hours, minutes, seconds and frames, with the frame number being in the range 0-24.
- The timing information comprises a most significant digit (MSD) 42 (value 0, 1 or 2) taking 2 bits and a least significant digit (LSD) 40 (value 0 to 9) taking 4 bits of a frame number count having a range 0-24.
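- A minimal sketch of that packing follows, assuming the frame count occupies six contiguous bits of a sample; the exact bit positions of fields 40 and 42 are shown in Figure 6 and are not reproduced here:
```python
def pack_frame_number(frame: int) -> int:
    """Pack a frame count (0-24) as a 2-bit MSD and a 4-bit LSD of its decimal digits."""
    if not 0 <= frame <= 24:
        raise ValueError("frame number out of range for a 25 fps timecode")
    msd, lsd = divmod(frame, 10)   # msd in 0..2 (2 bits), lsd in 0..9 (4 bits)
    return (msd << 4) | lsd        # 6 bits total

def unpack_frame_number(bits: int) -> int:
    """Inverse of pack_frame_number, as a receiver would apply it."""
    return ((bits >> 4) & 0b11) * 10 + (bits & 0b1111)

assert all(unpack_frame_number(pack_frame_number(f)) == f for f in range(25))
```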
- A receiver can decode the timing information and use this to determine whether the audio signal originally provided by the signal has slipped in relation to another source of timing, in particular a video timecode. Whilst some original samples of audio data could be retained in the signal, this technique negatively impacts the quality of the signal and is likely only to be used in a "test mode"; in that situation the entirety of the available audio bits may as well be used, replacing each sample with a corresponding time indication as described.
- The above timing information may be considered a tag, label or code that indicates time relative to a video signal.
- Other options include setting a counter to zero for the first sample occurring every few seconds, say every 20 seconds. Assuming that the audio does not drift by more than plus or minus 10 seconds, then the sample is uniquely identified. Again, this provides accuracy to the nearest audio sample, but not the precise position of that sample, and so this could be considered a "coarse" test mode.
- This type of code could be sent periodically amongst the coarse test type codes above, such as every few codes, or even alternating between coarse type and precise type codes, so as to continually provide synchronisation to both sample and pixel-local accuracy.
- This improvement also proposes providing an optional additional timestamp by way of a visible label within the video frame itself.
- This is preferably related to the timecode of the video data.
- Figure 7 shows modification of audio samples according to a second improvement embodying the invention.
- This improvement provides a permanent way of verifying synchronisation of an audio signal with the accompanying video signal or indeed the audio signal with other audio channels in the same signal, or more generally synchronisation of the audio signal with some other signal.
- As this improvement may be used with a live audio signal without notable impairment of quality, we will refer to it as a "live mode".
- The second improvement appreciates that the existing structure of audio samples should be retained for compatibility with existing equipment. In addition, the improvement appreciates that any modification having a detrimental effect on the audible signal at a decoder should be avoided.
- As described above, an SDI signal comprises an audio component with 1920 audio samples for each video frame.
- The second improvement provides a modification that alters only a small number of bits of the audio samples. For example, the least significant bit could be altered for the sample at the start of each video frame, or of every Nth video frame; alternatively, a least significant bit could be altered for every sample.
- A bit or bits of lower significance may be periodically altered so as to provide a digital code spread over multiple samples of the audio. As bits of lower significance are used, the effect of such changes would be inaudible at a decoder.
- The advantage provided is that a repeated code buried in this manner within the audio samples and spread over many samples can be analysed by comparison with a local version of that code at the receiver, thereby aligning the audio samples with another signal, such as the accompanying video signal, another channel of the audio signal, or the like.
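- A sketch of the embedding step follows, assuming one code bit is written into the LSB of the first audio sample of each video frame; the function name and the one-bit-per-frame layout are illustrative choices rather than requirements of the description:
```python
from typing import List, Sequence

SAMPLES_PER_FRAME = 1920  # 48 kHz audio at 25 frames per second, as described above

def embed_code(samples: List[int], code: Sequence[int]) -> List[int]:
    """Overwrite the LSB of the first 24-bit sample of each frame with the next code bit."""
    out = list(samples)
    for frame, bit in enumerate(code):
        idx = frame * SAMPLES_PER_FRAME         # first audio sample of frame number `frame`
        if idx >= len(out):
            break
        out[idx] = (out[idx] & ~1) | (bit & 1)  # clear the LSB, then set it to the code bit
    return out
```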
- Figure 7 shows that within a 24-bit field one bit is the LSB 50 of the audio sample (refer again to Figure 5 to determine the bit position of the LSB as b4 of UDW2; this is not necessarily the bit shown in Figure 7) and that this position is easily determined by the receiver.
- The number of audio bits transmitted comprises 1920 samples per frame multiplied by the 250 frames over which the code is spread, at 24 bits per sample, which is 11.52 million bits. At first sight, this could create a large overhead for discovering the repeated code within the audio signal. However, this is not the case.
- To discover the code, the least significant bit of the samples of audio for each video frame could be extracted.
- Discovering the relative position of the extracted bit sequence against a local version of that bit sequence then simply requires a bitwise compare of the bits (in this case 250 bits) at each of the possible 250 relative positions for each of the 1920 sample positions.
- That is, a code of 250 bits is to be discovered within a 1920 x 250 bit sequence at each of 250 possible positions. As it is unlikely that the bit sequence would have been damaged in any way, this can be a simple logical compare that can be performed very quickly.
- Once aligned, the relative timing offset of the local code and the received code is known. As the alignment is to precise bit positions, the relative offset of the audio and video can be determined to sample-position accuracy.
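- The receiver-side comparison can be sketched as follows, assuming the carrying sample position within each frame is known and the sequence arrives undamaged, so that exact bitwise equality can be tested at each of the 250 cyclic offsets:
```python
import random
from typing import Optional, Sequence

def find_offset(received: Sequence[int], local: Sequence[int]) -> Optional[int]:
    """Return the cyclic shift that aligns the received code with the local copy, if any."""
    n = len(local)
    for shift in range(n):
        if all(received[(shift + i) % n] == local[i] for i in range(n)):
            return shift
    return None

rng = random.Random(42)
local = [rng.randrange(2) for _ in range(250)]  # stand-in for a real spreading code
received = local[-37:] + local[:-37]            # the same code, arriving 37 frames late
assert find_offset(received, local) == 37       # recovered shift gives the relative timing
```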
- In summary, bits of lower significance within the audio stream are modulated according to a code that is spread across multiple samples of the audio in such a manner that a local version of that code can be aligned with the received code so as to determine relative timing. By using only bits of lower significance, the audio is not materially altered and no audible difference can be determined at a receiver.
- The code can be chosen to balance speed of acquisition at the coder and receiver against other constraints. For example, a different code may be selected for each channel, and it may be desired that the codes are sufficiently distinct between channels so that the receiver could accurately determine whether an audio signal somehow became associated with the wrong video signal.
- The codes may be PRBS (pseudo-random binary sequence) codes, Gold codes or other sequences selected to have desired properties for acquisition at the receiver.
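- For instance, a PRBS can be produced by a maximal-length linear-feedback shift register. The sketch below uses one well-known 8-bit primitive polynomial, x^8 + x^6 + x^5 + x^4 + 1 (an illustrative choice; its sequence repeats every 255 bits):
```python
def prbs(n: int, state: int = 0x01) -> list:
    """Generate n bits from an 8-bit maximal-length LFSR (taps 8, 6, 5, 4)."""
    out = []
    for _ in range(n):
        out.append(state & 1)  # emit the low bit of the register
        # Feedback is the XOR of tap bits 8, 6, 5 and 4 (numbering taps from 1).
        fb = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | fb) & 0xFF
    return out

code = prbs(250)  # e.g. one bit per video frame: ten seconds of code at 25 fps
```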
- The first improvement and second improvement described may be used in combination.
- The first improvement may be used to establish relative timing of signals, and the second improvement then used to continuously track the relative timing.
- Other variations are possible such as periodically using the first improvement technique to establish pixel position accuracy of relative timing and tracking using the second improvement.
- A transmitter-side converter 12 (which may be part of other equipment) comprises an AV separator 60 that provides an audio component to a code inserter 62, and a video component.
- An AV combiner 64 recombines the audio and video components. These components could then be transmitted over a serial network.
- Here, a de-serialiser is provided so as to provide an IP signal instead.
- At the receiver, a serialiser takes the IP packets and reconverts them to a synchronous serial signal.
- An AV separator 70 then provides separate audio and video components. The audio is provided to a synchroniser that extracts the code of the first or second improvement and determines synchronisation.
- An AV combiner 74 can then re-insert the audio component into the AV signal with correct synchronisation established.
Abstract
The invention resides in two aspects. In a first aspect, samples of an audio component of a signal are altered such that the contents represent time information, such as a timecode, sample number and channel, instead of representing the original audio data. This first aspect may be referred to as a "test mode" as the quality of the audio signal is at least impaired. In a second aspect, a repeated code is provided within lower significant bits of samples of an audio component such that a receiver may align a local version of that code with the received code to determine relative synchronisation. This mode may be referred to as a "live mode" as the quality of the audio signal is minimally impacted.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1400944.3A GB2522260A (en) | 2014-01-20 | 2014-01-20 | Method and apparatus for determining synchronisation of audio signals |
GB1400944.3 | 2014-01-20 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015107372A1 (fr) | 2015-07-23 |
Family
ID=50239201
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/GB2015/050119 WO2015107372A1 (fr) | 2014-01-20 | 2015-01-20 | Procédé et appareil pour déterminer la synchronisation de signaux audio |
Country Status (2)
Country | Link |
---|---|
GB (1) | GB2522260A (fr) |
WO (1) | WO2015107372A1 (fr) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107484010A (zh) * | 2017-10-09 | 2017-12-15 | 武汉斗鱼网络科技有限公司 | 一种视频资源解码方法及装置 |
WO2020024980A1 (fr) * | 2018-08-01 | 2020-02-06 | 北京微播视界科技有限公司 | Procédé et appareil de traitement de données |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040148159A1 (en) * | 2001-04-13 | 2004-07-29 | Crockett Brett G | Method for time aligning audio signals using characterizations based on auditory events |
US20070092200A1 (en) * | 2003-04-05 | 2007-04-26 | Black David R | Method and apparatus for synchronizing audio and video streams |
US20070220561A1 (en) * | 2006-03-20 | 2007-09-20 | Girardeau James W Jr | Multiple path audio video synchronization |
US7359006B1 (en) * | 2003-05-20 | 2008-04-15 | Micronas Usa, Inc. | Audio module supporting audio signature |
US20130128115A1 (en) * | 2003-07-25 | 2013-05-23 | Gracenote, Inc. | Method and device for generating and detecting fingerprints for synchronizing audio and video |
WO2013117889A2 (fr) * | 2012-02-10 | 2013-08-15 | British Broadcasting Corporation | Procédé et appareil destinés à convertir des signaux audio, des signaux vidéo et des signaux de commande |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR100499037B1 (ko) * | 2003-07-01 | 2005-07-01 | 엘지전자 주식회사 | 디지털 텔레비젼 수신기의 립 싱크 테스트 방법 및 장치 |
US20050219366A1 (en) * | 2004-03-31 | 2005-10-06 | Hollowbush Richard R | Digital audio-video differential delay and channel analyzer |
EP1766962A4 (fr) * | 2004-06-22 | 2009-03-25 | Sarnoff Corp | Procede et appareil permettant de mesurer et/ou de corriger une synchronisation audiovisuelle |
CA2541560C (fr) * | 2006-03-31 | 2013-07-16 | Leitch Technology International Inc. | Systeme et methode de synchronisation labiale |
- 2014-01-20: GB GB1400944.3A patent/GB2522260A/en not_active Withdrawn
- 2015-01-20: WO PCT/GB2015/050119 patent/WO2015107372A1/fr active Application Filing
Also Published As
Publication number | Publication date |
---|---|
GB201400944D0 (en) | 2014-03-05 |
GB2522260A (en) | 2015-07-22 |
Legal Events
Date | Code | Title | Description
---|---|---|---
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15700793; Country of ref document: EP; Kind code of ref document: A1
 | NENP | Non-entry into the national phase | Ref country code: DE
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 15700793; Country of ref document: EP; Kind code of ref document: A1