CN114510212B

CN114510212B - Data transmission method, device and equipment based on serial digital audio interface

Info

Publication number: CN114510212B
Application number: CN202111675350.4A
Authority: CN
Inventors: 吴健
Original assignee: Saiyinxin Micro Beijing Electronic Technology Co ltd
Current assignee: Saiyinxin Micro Beijing Electronic Technology Co ltd
Priority date: 2021-12-31
Filing date: 2021-12-31
Publication date: 2023-08-08
Anticipated expiration: 2041-12-31
Also published as: CN114510212A

Abstract

The application relates to a data transmission method, a device and equipment based on a serial digital audio interface, wherein the method comprises the following steps: acquiring data stream data for generating data bursts; wherein the data stream data comprises audio data and/or audio model metadata; according to the data mode of the data stream data, placing the audio samples of the data stream data in the payload field of the sub-frame of the data burst; setting a field other than the payload field in the data burst; the data bursts are transmitted over a serial digital audio interface. The technical scheme of the application can use the AES3 serial digital audio interface to transmit data in professional application.

Description

Data transmission method, device and equipment based on serial digital audio interface

Technical Field

The present disclosure relates to the field of audio processing technologies, and in particular, to a data transmission method, apparatus, and device based on a serial digital audio interface.

Background

As technology has developed, audio has become increasingly complex. In the transition from early mono audio to stereo, the main concern was processing the left and right channels correctly. Processing became more complicated once surround sound appeared. A surround 5.1 speaker system imposes an ordering constraint on multiple channels, and surround 6.1, surround 7.1 and similar speaker systems further diversify audio processing, since the correct signal must be delivered to the appropriate speaker for the speakers to work together. Thus, as sound becomes more immersive and interactive, the complexity of audio processing increases greatly.

An audio channel (or soundtrack) refers to mutually independent audio signals that are collected or played back at different spatial locations of sound when recorded or played. The number of channels is the number of sound sources when recording sound or the corresponding number of loudspeakers when playing back. For example, in a surround 5.1 speaker system, including 6 audio signals at different spatial locations, each individual audio signal is used to drive a speaker at a corresponding spatial location; in a surround 7.1 speaker system, 8 audio signals at different spatial locations are included, each individual audio signal being used to drive a speaker at a corresponding spatial location.

The AES3 interface is widely used in industry to transfer linear PCM audio between digital audio devices.

Disclosure of Invention

The application aims to provide a data transmission method, device and equipment based on a serial digital audio interface, so as to transmit data by using an AES3 serial digital audio interface in professional application.

The first aspect of the present application provides a data transmission method based on a serial digital audio interface, including:

acquiring data stream data for generating data bursts; wherein the data stream data comprises audio data and/or audio model metadata;

According to the data mode of the data stream data, placing the audio samples of the data stream data in the payload field of the sub-frame of the data burst;

setting a field other than the payload field in the data burst;

the data bursts are transmitted over a serial digital audio interface.

A second aspect of the present application provides a data transmission device based on a serial digital audio interface, including:

a data acquisition module for acquiring data stream data for generating data bursts; wherein the data stream data comprises audio data and/or audio model metadata;

a data placement module, configured to place an audio sample of the data stream data in a payload field of a subframe of the data burst according to a data pattern of the data stream data;

a field setting module, configured to set a field other than the payload field in the data burst;

and the data transmission module is used for transmitting the data burst through a serial digital audio interface.

A third aspect of the present application provides an electronic device, comprising: a memory and one or more processors;

the memory is used for storing one or more programs;

The one or more programs, when executed by the one or more processors, cause the one or more processors to implement a serial digital audio interface based data transmission method as provided by any of the embodiments.

A fourth aspect of the present application provides a storage medium containing computer-executable instructions that, when implemented by a computer processor, implement a method of data transmission based on a serial digital audio interface as provided by any of the embodiments.

As can be seen from the above, the data transmission method based on the serial digital audio interface places audio data stream data in data bursts and sets relevant fields of the data bursts to transmit the data stream data using the AES3 serial digital audio interface.

Drawings

FIG. 1 is a schematic diagram of a three-dimensional acoustic audio model according to an embodiment of the present application;

FIG. 2 is a flow chart of a data transmission method based on a serial digital audio interface in an embodiment of the present application;

FIG. 3 is a schematic structural diagram of a data transmission device based on a serial digital audio interface according to an embodiment of the present application;

FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application;

FIG. 5 is a diagram of a mode switch decision process of a receiver in an embodiment of the present application.

Detailed Description

The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.

Examples

As shown in FIG. 1, the three-dimensional acoustic audio model is composed of a set of elements, each describing one stage of the audio, and includes a content production section and a format production section.

The content production section includes: an audio program element, an audio content element, an audio object element, and an audio track unique identification element. The format production section includes: an audio packet format element, an audio channel format element, an audio stream format element, and an audio track format element.

The audio program element references at least one audio content element; the audio content element references at least one audio object element; the audio object element references the corresponding audio packet format element and the corresponding audio track unique identification element; the audio track unique identification element references the corresponding audio track format element and the corresponding audio packet format element.

The audio packet format element references at least one audio channel format element; the audio stream format element references the corresponding audio channel format element and the corresponding audio packet format element; the audio track format element and the corresponding audio stream format element reference each other. The reference relationships between elements are indicated by arrows in FIG. 1.

An audio program may include, but is not limited to, narration, sound effects, and background music. The audio program element describes a program that includes at least one piece of content, and each audio content element describes one piece of content within the corresponding audio program element. An audio program element may reference one or more audio content elements, which are combined to construct the complete audio program.

The audio content element describes the content of a component of the audio program (e.g., background music) and references one or more audio object elements to relate the content to its format.

The audio object element is used to link content, format, and other useful information, and to determine the audio track unique identification of the actual audio track.

The format making section includes: an audio packet format element, an audio channel format element, an audio stream format element, and an audio track format element.

The audio packet format element may be used to describe the format in which the audio object element and the original audio data are packed in accordance with a channel packet.

The audio channel format element may be used to represent a single sequence of audio samples and preset operations performed thereon, such as rendering movement of objects in a scene. The audio channel format element may contain at least one audio block format element. The audio block format element may be regarded as a sub-element of the audio channel format element, so that there is an inclusion relationship between the audio channel format element and the audio block format element.

An audio stream is a combination of audio tracks required to render channels, objects, higher order ambient sound components or packages. The audio stream format element is used for establishing a relation between the audio track format element set and the audio channel format element set or a relation between the audio track format set and the audio packet format.

The audio track format elements correspond to a set of samples or data in a single track, are used to describe the format of the original audio data, and the decoded signal of the renderer, and are also used to identify the track combination required to successfully decode the track data.

After the original audio data is produced through the three-dimensional acoustic audio model, synthesized audio data containing metadata is generated.

Metadata is information that describes the characteristics of data. The functions it supports include indicating storage locations, recording history data, looking up resources, and keeping file records.

After the synthesized audio data is transmitted to the far end in a communication mode, the far end analyzes the synthesized audio data based on the metadata, and the original sound scene is restored or rendered into a new sound scene in real time.

The division between the content creation section, the format creation section and the BW64 (Broadcast Wave-64bit, 64-bit Broadcast Wave) file is shown in fig. 1. Both the content production portion and the format production portion constitute metadata in XML format, which is typically contained in a block ("axml" block) of the BW64 file. The BW64 file section at the bottom contains a "channel assignment (chna)" block, which is a lookup table used to link metadata to audio programs in the file.

The content production section describes the technical content of the audio, for example whether it contains dialog or a specific language, and loudness metadata. The format production section describes the channel types of the audio tracks and how they are combined, e.g. left and right channels in a stereo pair. The metadata of the content production section is typically unique to the audio and program, while the elements of the format production section may be reused.

The audio model is an open, compatible, general-purpose metadata model. However, audio model metadata is suited to local file storage rather than to real-time production and streaming audio applications. When metadata must be transferred remotely in real time together with digital audio, a serial audio metadata schema is required so that existing audio and its associated audio model metadata files can be sliced into frames and streamed.

A frame of serial audio metadata contains a set of audio model metadata describing the audio frames associated with that frame over a certain period of time. Serial audio metadata has the same structure, attributes, and elements as audio model metadata, plus additional attributes that specify the frame format. Serial audio metadata frames do not overlap and follow one another contiguously, each with a specified start time and duration. Metadata contained in a serial audio metadata frame may describe audio beyond the duration of that frame.
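The non-overlapping, contiguous framing described above can be illustrated with a minimal sketch. The function name `slice_into_frames` and the use of `Fraction` for exact time arithmetic are assumptions made for this example, not part of the application; a real implementation would also attach the metadata relevant to each frame.

```python
from fractions import Fraction


def slice_into_frames(total, frame_len):
    """Split the interval [0, total) into contiguous (start, duration) frames.

    Hypothetical helper: frames never overlap, and each frame starts exactly
    where the previous one ends. The last frame may be shorter.
    """
    total, frame_len = Fraction(total), Fraction(frame_len)
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    frames = []
    start = Fraction(0)
    while start < total:
        duration = min(frame_len, total - start)
        frames.append((start, duration))
        start += duration
    return frames


# Example: 2.5 seconds of programme sliced into 1-second frames.
frames = slice_into_frames(Fraction(5, 2), Fraction(1))
for start, duration in frames:
    print(f"start={start}s duration={duration}s")
```

Exact rational arithmetic is used here only so that frame boundaries line up without floating-point drift; sample counts would serve the same purpose.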

The parent element of the serial audio metadata is a frame (frame), which comprises a frame header (frameHeader) and an audio format extension (audioFormatExtended). The frame header includes 2 sub-elements: frame format (frameFormat) and transport track format (transportTrackFormat).

The audio format extension includes 8 sub-elements: an audio program (audioProgramme), an audio content (audioContent), an audio object (audioObject), an audio track unique identifier (audioTrackUID), an audio packet format (audioPackFormat), an audio channel format (audioChannelFormat), an audio stream format (audioStreamFormat), and an audio track format (audioTrackFormat).
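As an illustration of the element hierarchy just described, the following sketch builds an empty frame skeleton with Python's standard `xml.etree.ElementTree`. The tag spellings are assumptions based on the camelCase names given in the text (with `audioProgramme` spelled as in the published audio definition model); attributes, identifiers, and nesting rules within the audio format extension are deliberately omitted.

```python
import xml.etree.ElementTree as ET

# Assumed tag names, taken from the element names listed in the text.
HEADER_CHILDREN = ["frameFormat", "transportTrackFormat"]
FORMAT_EXTENDED_CHILDREN = [
    "audioProgramme", "audioContent", "audioObject", "audioTrackUID",
    "audioPackFormat", "audioChannelFormat", "audioStreamFormat",
    "audioTrackFormat",
]


def build_frame_skeleton():
    """Return an empty <frame> element with the two top-level children."""
    frame = ET.Element("frame")
    header = ET.SubElement(frame, "frameHeader")
    for tag in HEADER_CHILDREN:
        ET.SubElement(header, tag)
    extended = ET.SubElement(frame, "audioFormatExtended")
    for tag in FORMAT_EXTENDED_CHILDREN:
        ET.SubElement(extended, tag)
    return frame


frame = build_frame_skeleton()
print(ET.tostring(frame, encoding="unicode"))
```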

The audio model metadata is composed of a content portion (e.g., audio program elements) and a format portion (e.g., audio channel format elements). Only three elements, audio program elements, audio object elements and audio block format elements, have time-dependent parameters stored. In the content part, the start time, end time and duration of an audio program element or audio object element are used to determine the start time, end time or duration of the element, these parameters typically being fixed. In the format part, all parameters in the audio block format element are time-varying parameters.

The audio model metadata can be divided into two groups: namely dynamic metadata (e.g., audio block format elements in an audio channel format element) and static metadata (e.g., audio program elements and audio content elements).

A serial audio metadata frame consists of one or more metadata blocks.

The application provides a data transmission method based on a serial digital audio interface, as shown in fig. 2, the method comprises the following steps:

s210, acquiring data stream data for generating data bursts; wherein the data stream data comprises audio data and/or audio model metadata;

s220, according to the data mode of the data stream data, placing the audio sample of the data stream data in the payload field of the subframe of the data burst;

s230, setting a field except the payload field in the data burst;

s240, transmitting the data burst through a serial digital audio interface.

The serial audio mode is applied to real-time production and streaming audio: serial audio files are sliced into frames, or encoded and then formed into frames, and are transmitted through the serial digital audio interface using the data transmission method provided by this embodiment. The serial digital audio interface may be an AES3 interface, which can carry linear PCM audio, non-PCM audio and data, or linear PCM audio and non-PCM data in separate channels. The AES3 interface used to transfer data in this embodiment allows non-PCM data and serial audio to be exchanged between different devices based on the physical and logical specifications of the existing AES3 format. The standard accommodates multiple non-PCM audio and data formats and allows multiple data streams to be transmitted over a single interface.

The AES3 interface is widely used in industry to transfer linear PCM audio between digital audio devices, but a single AES3 interface is limited to two channels. Significant problems can occur when multiple AES3 interfaces are used to transfer related audio of more than two channels.

The existing AES3 format is modified in this embodiment to transfer non-PCM data, including non-PCM audio bitstreams, typically (but not necessarily) reduced bit rate bitstreams. This allows carrying a single audio program or multiple audio programs of more than 2 channels on a single AES3 interface, each audio program possibly containing more than 2 channels.

The present embodiment makes partial modifications to the existing AES3 standard logic and is compatible with existing AES3 interfaces that transmit linear PCM audio. Thus, interconnection of devices that may work with linear PCM or non-PCM audio and data may be facilitated. This may allow some existing devices capable of recording linear PCM to also record non-PCM data. The present interface supports the independent use of AES3 channels to allow one linear PCM audio channel and one non-PCM data channel to be carried in a single AES3 signal.

The AES3 interface in this embodiment accommodates a synchronization method for reconstructing the original source audio signal encoded in a non-PCM audio stream and for time alignment with other information streams, such as an associated video stream. However, other standards and recommended practices may contain important and necessary information about the synchronization requirements of a particular data type. In general, such information must be consulted in order to transmit and receive non-PCM streams accurately over the interface.

The present interface does not require a global synchronization scheme, because of the wide variety of data types that may be transferred over it. However, synchronization of the non-PCM data content is very important for correct use of the interface, both in the relationship between the encoded audio sample rate and the AES3 frame rate (when transmitting non-PCM audio) and in time synchronization with other information streams. Furthermore, the synchronization requirements of particular data types may require additional buffering in devices that support those data types. Other documents containing the synchronization requirements of particular data types therefore need to be consulted to maintain compatibility with those data types.

There is a special data type, i.e. a timestamp data type, to support the synchronization method. Many data types may utilize information contained in time-stamped data bursts, including SMPTE 12M time code information, to maintain time synchronization with other information streams.

The interface is intended for professional audio equipment only. An existing standard (IEC 60958) covers the transmission of non-PCM data in consumer environments. The present interface takes into account some interoperability between professional and consumer equipment and describes specific compatibility requirements.

The logical format of the AES3 interface consists of a sequence of subframes. Each subframe is intended to carry one linear PCM sample and contains 32 time slots; each time slot (excluding the four time slots used for synchronization) may carry a single bit of information. A pair of subframes, each containing the PCM word of one audio channel, constitutes one AES3 frame, which contains two PCM words, one from channel 1 and one from channel 2. A sequence of 192 frames constitutes a block. The 192 channel state bits of each channel during a block constitute the 192-bit (24-byte) channel state word of that channel. The standard usage of the 32 AES3 time slots is modified when transmitting non-PCM data. Table 1 shows the subframe bit field usage for AES3 non-PCM data.
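The subframe and block structure can be sketched as follows. The slot roles used here (slots 0-3 synchronization preamble, slots 4-27 data, slot 28 validity, slot 29 user, slot 30 channel state, slot 31 even parity over slots 4-31) are taken from the AES3 specification as an assumption, since Table 1 is not reproduced in this text. Representing a subframe as an integer whose bit n stands for slot n, and mapping channel state bit 0 to the least significant bit of each byte, are conventions chosen only for this illustration.

```python
def build_subframe(data24, validity=0, user=0, channel_status=0):
    """Pack one AES3 subframe into an int (bit n represents time slot n).

    Slots 0-3 (synchronization preamble) are left as 0 because they are
    coding violations on the wire, not ordinary data bits.
    """
    if not 0 <= data24 < (1 << 24):
        raise ValueError("data24 must fit in 24 bits")
    word = data24 << 4                      # slots 4..27
    word |= (validity & 1) << 28            # V
    word |= (user & 1) << 29                # U
    word |= (channel_status & 1) << 30      # C
    parity = bin(word >> 4).count("1") & 1  # make slots 4..31 even parity
    word |= parity << 31                    # P
    return word


def collect_channel_status(c_bits):
    """Gather the 192 C bits of one block (one per frame) into 24 bytes."""
    if len(c_bits) != 192:
        raise ValueError("a block carries exactly 192 channel state bits")
    block = bytearray(24)
    for index, bit in enumerate(c_bits):
        block[index // 8] |= (bit & 1) << (index % 8)
    return bytes(block)


subframe = build_subframe(0xFFFFFF, channel_status=1)
status = collect_channel_status([1, 1] + [0] * 190)
print(f"{subframe:032b}", status.hex())
```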

The non-PCM data streams to be transmitted are formed into data bursts, each data burst comprising a burst preamble and a burst payload. The data burst is placed in the audio sample word/auxiliary data field of the AES3 subframe in one of two modes. In frame mode, the data space of the two subframes within an AES3 frame is combined, allowing up to 48 bits of data to be placed in each frame. In subframe mode, each channel is processed independently and data is not shared between the subframes of a frame. In this mode, each subframe may contain linear PCM audio or non-PCM data. This allows the AES3 interface to simultaneously transfer two linear PCM channels, or one linear PCM channel and one set of data bitstreams, or two sets of data bitstreams.

The data bursts are marked with a number indicating to which data stream they belong. Up to seven different non-PCM data streams, along with additional stream types dedicated to time-stamped data bursts, may be time multiplexed together to form a set of data bit streams. In the subframe mode, this allows multiplexing up to 14 independent non-PCM data streams within a single AES3 interface.

The data burst is placed in the audio sample word/auxiliary data field of the AES3 subframe using 16, 20 or 24 bits of available space within each subframe. While the 24-bit mode allows for more efficient use of AES3 data capacity, 16-bit and 20-bit modes may be required when interfacing with existing devices limited to 16-bit or 20-bit operation.
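The efficiency trade-off between the three word lengths is simple arithmetic, sketched below. The function names are hypothetical, and the 48 kHz frame rate in the usage lines is only an example value; the result is raw capacity before preamble and padding overhead.

```python
def payload_bits_per_frame(word_bits, frame_mode=True):
    """Data bits available per AES3 frame for a given word length."""
    if word_bits not in (16, 20, 24):
        raise ValueError("word_bits must be 16, 20 or 24")
    # Frame mode combines both subframes; subframe mode uses one channel.
    return word_bits * (2 if frame_mode else 1)


def raw_capacity_bps(word_bits, frame_rate_hz, frame_mode=True):
    """Raw data capacity in bit/s, ignoring burst preambles and padding."""
    return payload_bits_per_frame(word_bits, frame_mode) * frame_rate_hz


# Example frame rate of 48 kHz (an assumption for illustration only).
for bits in (16, 20, 24):
    print(bits, raw_capacity_bps(bits, 48_000), "bit/s in frame mode")
```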

Optionally, the data patterns include a 16-bit pattern, a 20-bit pattern, and a 24-bit pattern;

the step of placing the audio samples of the data stream data in the payload field of the sub-frame of the data burst according to the data mode of the data stream data comprises the following steps:

if the data mode is a 16-bit mode or a 20-bit mode, placing an audio sample of the data stream data in an audio sample field of a subframe of the data burst;

if the data mode is a 24-bit mode, the audio samples of the data stream data are placed in the audio sample field and the auxiliary data field of the sub-frame of the data burst.
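A minimal sketch of this placement rule follows. It assumes the slot positions stated elsewhere in this description (the most significant bit in slot 27, and the least significant bit in slot 12, 8 or 4 for the 16-, 20- and 24-bit modes), and it assumes that the auxiliary data field corresponds to slots 4-7, as in AES3. The helper names are hypothetical.

```python
# Slot that holds the least significant bit of a data word in each mode.
LSB_SLOT = {16: 12, 20: 8, 24: 4}


def place_word(word, mode_bits):
    """Return the 24-bit content of slots 4..27 (bit 0 = slot 4).

    The word is left-justified so its MSB always lands in slot 27;
    unused low-order slots are left at 0.
    """
    if mode_bits not in LSB_SLOT:
        raise ValueError("mode_bits must be 16, 20 or 24")
    if not 0 <= word < (1 << mode_bits):
        raise ValueError("word does not fit the selected mode")
    return word << (LSB_SLOT[mode_bits] - 4)


def uses_auxiliary_field(mode_bits):
    """Only the 24-bit mode spills into the auxiliary field (slots 4-7)."""
    return mode_bits == 24


print(hex(place_word(0xFFFF, 16)))    # low 8 bits (slots 4-11) stay 0
print(hex(place_word(0xFFFFF, 20)))   # low 4 bits (slots 4-7) stay 0
print(hex(place_word(0xABCDEF, 24)))  # all 24 slots carry data
```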

Optionally, the format of the data stream data is PCM data or non-PCM data;

if the data stream data is non-PCM data, the setting a field other than the payload field in the data burst includes:

and setting a preset byte in a channel state field in the data burst.

Optionally, the setting a preset byte in a channel status field in the data burst includes:

setting preset bytes in the channel state bit field in the data burst, the preset bytes including byte 0, byte 1, byte 2 and byte 23; wherein:

bit 0 of byte 0 is set to 1 to indicate professional use of the channel state block; bit 1 of byte 0 is set to 1 to indicate a non-audio mode; bits 2-4 of byte 0 are set to 000; bit 5 of byte 0 is set according to section 4 of AES3 to indicate the source frame frequency lock status; bits 6-7 of byte 0 are set according to section 4 of AES3 to represent the frame rate of the AES3 interface;

bits 0-3 of byte 1 are set to 0000; bits 4-7 of byte 1 are set according to section 4 of AES3;

bits 0-2 of byte 2 are set according to section 4 of AES3 to indicate the use of the auxiliary sample bits; bits 3-5 of byte 2 are set according to section 4 of AES3 to represent the non-PCM data word length; bits 6-7 of byte 2 are set to 00;

bits 0-7 of byte 23 are set according to section 4 of AES3 to indicate a valid CRCC value of the channel state block.
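The byte settings listed above can be sketched as a 24-byte block builder. This is a hypothetical helper, not the claimed implementation: the fields that are "set according to section 4 of AES3" are supplied by the caller as tuples of bit values in bit-position order, and the CRCC is passed in rather than computed, because their encodings are defined by AES3 and are not reproduced here. Mapping bit 0 of each byte to the least significant bit is a convention assumed for this example.

```python
def _put(block, byte_index, first_bit, values, width):
    """Write `values` (a tuple of 0/1) into consecutive bits of one byte."""
    if len(values) != width or any(v not in (0, 1) for v in values):
        raise ValueError(f"expected {width} bit values of 0 or 1")
    for offset, value in enumerate(values):
        block[byte_index] |= value << (first_bit + offset)


def build_channel_status(lock_bit, frame_rate_bits, byte1_bits_4_7,
                         aux_use_bits, word_length_bits, crcc_bits):
    """Build the 24-byte channel state block for a non-PCM channel."""
    block = bytearray(24)                      # undefined bytes stay 0
    _put(block, 0, 0, (1,), 1)                 # byte 0 bit 0: professional use
    _put(block, 0, 1, (1,), 1)                 # byte 0 bit 1: non-audio mode
    # byte 0 bits 2-4 remain 000
    _put(block, 0, 5, (lock_bit,), 1)          # source frame frequency lock
    _put(block, 0, 6, frame_rate_bits, 2)      # AES3 interface frame rate
    # byte 1 bits 0-3 remain 0000
    _put(block, 1, 4, byte1_bits_4_7, 4)       # per AES3 section 4
    _put(block, 2, 0, aux_use_bits, 3)         # use of auxiliary sample bits
    _put(block, 2, 3, word_length_bits, 3)     # non-PCM data word length
    # byte 2 bits 6-7 remain 00
    _put(block, 23, 0, crcc_bits, 8)           # CRCC, computed elsewhere
    return bytes(block)


# Placeholder bit values for illustration only, not real AES3 codes.
status_block = build_channel_status(
    lock_bit=0, frame_rate_bits=(0, 0), byte1_bits_4_7=(0, 0, 0, 0),
    aux_use_bits=(0, 0, 0), word_length_bits=(0, 0, 0),
    crcc_bits=(1, 0, 0, 0, 0, 0, 0, 0),
)
print(status_block.hex())
```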

Optionally, after said setting a field other than said payload field in said data burst, further comprising:

marking the data burst with a numeric identifier that indicates the data stream to which the data burst belongs.

Optionally, the setting a field other than the payload field in the data burst includes:

and setting a preamble field of the data burst, wherein the preamble field is arranged at the beginning of each data burst and is positioned in front of the payload field, and the preamble field comprises 4 words and respectively represents a first synchronous word, a second synchronous word, a burst information value and a length code.

Optionally, the setting the preamble field of the data burst includes:

in a frame mode, setting 4 words contained in the preamble in 2 continuous frames, wherein a previous frame in the 2 continuous frames is a frame for starting the data burst, the first synchronization word is contained in a first subframe of the previous frame, the second synchronization word is contained in a second subframe of the previous frame, the burst information value is contained in a first subframe of a subsequent frame, and the length code is contained in a second subframe of the subsequent frame;

in a subframe mode, 4 words contained in the preamble are arranged in 4 sequence subframes of a single channel for transmitting non-PCM data, a first subframe in the 4 sequence subframes is a subframe for starting the data burst, and the first synchronization word, the second synchronization word, the burst information value and the length code are sequentially arranged in the 4 sequence subframes.

Optionally, in frame mode, the payload field is set to transmit a set of non-PCM data streams using two AES3 channels;

in subframe mode, the payload field is set to each AES3 channel alone to carry a set of non-PCM burst sequences or linear PCM audio streams.
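The difference between the two modes can be made concrete with a small layout sketch. The function `layout_burst` is hypothetical; it treats the four preamble words as opaque labels and simply reports which frame and channel each word of a burst would occupy.

```python
def layout_burst(payload_words, frame_mode, channel=1):
    """Return (frame_index, channel, word) for every word of one burst.

    Frame mode alternates channel 1 and channel 2 within each frame, so the
    preamble spans 2 frames. Subframe mode stays on one channel, so the
    preamble spans 4 sequential subframes of that channel.
    """
    words = ["Pa", "Pb", "Pc", "Pd"] + list(payload_words)
    if frame_mode:
        return [(i // 2, 1 + i % 2, w) for i, w in enumerate(words)]
    if channel not in (1, 2):
        raise ValueError("channel must be 1 or 2")
    return [(i, channel, w) for i, w in enumerate(words)]


frame_layout = layout_burst(["w0", "w1"], frame_mode=True)
subframe_layout = layout_burst(["w0", "w1"], frame_mode=False, channel=2)
print(frame_layout)
print(subframe_layout)
```

In subframe mode the other channel of each frame is left free, which is what allows it to carry linear PCM audio or a second, independent set of bursts.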

In the modified AES3 interface format of this embodiment, the logical interface format conforms to the AES3 specification unless otherwise specified. The electrical and mechanical properties of the interface conform to AES3 or ANSI/SMPTE 276M. When one channel is used to transmit linear PCM in subframe mode, that channel should be used in accordance with AES3.

As described in this embodiment, non-PCM data should be placed in the available AES3 data space in bursts. The non-PCM data should occupy some or all of bit positions 4-27 of the AES3 subframe. Unused bit positions within a subframe, and in subframes between bursts, should be set to 0.

Channel state mode

For an AES3 channel transmitting non-PCM data, byte 0, byte 1, byte 2, and byte 23 of the channel state word should be used as described in this embodiment. For channels that transmit non-PCM data, the usage of the remaining bytes of the channel state word is undefined. Each bit of the undefined channel status byte is set to 0.

AES3 defines three implementation levels for the use of channel state features: minimum, standard, and enhanced. To be compatible with existing implementations of AES3, the standard implementation must be used. Although the specific interpretation of certain bit fields differs from the AES3 specification, the use of channel state bytes 0, 1, 2 and 23 defined in this interface is consistent with the standard implementation described in AES3.

Bit 0 of byte 0 of the channel state word should be set to 1, indicating professional use of the channel state block, irrespective of how the AES3 bitstream is used.

Byte 0 bit 1 should be set to 1 to indicate a non-audio mode.

Bits 2-4 of byte 0 should be set to 000.

Bit 5 of byte 0 should be set according to section 4 of AES3, and should indicate the source frame frequency lock state. The source frame frequency should be interpreted as the source rate of the AES3 interface frame rate. This bit is not necessarily used to indicate that the source sampling rate of an audio signal encoded in a non-PCM audio stream in the AES3 signal is locked to the AES3 frame rate, although such use may be specified by some data stream types.

Bits 6-7 of byte 0 should be set according to section 4 of AES3, and should represent the frame rate of the AES3 interface. These bits are not necessarily used to indicate the source sampling rate of the audio signal encoded in the non-PCM audio stream in the AES3 signal, although such use may be specified by some data stream types.

State 00 (interpreted as frame rate not indicated) is allowed in this embodiment, but indicating the actual frame rate is recommended.

Table 2 summarizes the channel state bit settings in byte 0 of the channel state word when non-PCM data is transferred.

Bits 0-3 of byte 1 should be set to 0000. Bits 4-7 of byte 1 should be set according to section 4 of AES3.

Table 3 summarizes the channel state bit settings in byte 1 of the channel state word when non-PCM data is transferred.

Bits 0-2 of byte 2 are set according to section 4 of AES3 to indicate the use of the auxiliary sample bits. In this case, the audio sample word length is the non-PCM data word length (the number of bits used to transmit non-PCM data) defined by the data mode parameter. If multiple non-PCM data streams are carried within the AES3 interface and multiple data modes are in use, the data mode corresponding to the maximum non-PCM data word length is used as the reference.

Bits 3-5 of byte 2 are set according to section 4 of AES3 to represent the non-PCM data word length (the number of bits used to transfer non-PCM data) defined by the data mode parameter. If multiple non-PCM data streams are carried within the AES3 interface and multiple data modes are in use, the data mode corresponding to the maximum non-PCM data word length is used as the reference.

State 000 (interpreted as non-PCM data word length not indicated) is allowed, but indicating the actual non-PCM data word length is recommended. Bits 6-7 of byte 2 should be set to 00.

Table 4 summarizes the channel state bit settings in byte 2 of the channel state word when non-PCM data is transferred.

Bits 0-7 of byte 23 are set according to section 4 of AES3 to indicate a valid CRCC value for the channel state block (the default state of 0 is not allowed). Table 5 shows the channel state bits in byte 23.

Sample rate synchronization: the present interface does not require synchronization between the AES3 interface rate (frame rate) and the sample rate of the encoded audio in the non-PCM data stream. However, other standards or recommended practices may dictate a fixed relationship between the AES3 interface rate and the encoded audio sample rate for a particular data type.

In the data burst format, the non-PCM data streams to be transmitted are formed into data bursts consisting of data words in a continuous sequence of AES3 frames. Each data burst includes a burst preamble and a burst payload. When there are multiple streams, the bursts from each stream are placed in the AES3 stream in a time-division multiplexed manner.
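Time-division multiplexing of bursts from several streams can be sketched as an interleaving step. The text does not specify a scheduling policy, so the round-robin order used below is purely an assumption for illustration, and `round_robin` is a hypothetical helper. Each burst is kept whole; only the order of bursts on the interface is decided.

```python
from itertools import zip_longest


def round_robin(streams):
    """Interleave whole bursts from several streams, one burst per turn.

    `streams` maps a stream number to its list of bursts in time order.
    Returns a list of (stream_number, burst) in transmission order.
    """
    tagged = [
        [(number, burst) for burst in bursts]
        for number, bursts in sorted(streams.items())
    ]
    order = []
    for turn in zip_longest(*tagged):
        # Streams that have run out of bursts contribute None; skip them.
        order.extend(item for item in turn if item is not None)
    return order


schedule = round_robin({0: ["a0", "a1"], 1: ["b0"]})
print(schedule)
```

A real multiplexer would also respect per-data-type timing requirements and fill the gaps between bursts with zero-valued words.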

The burst preamble occurs at the beginning of each data burst and is followed by the burst payload. The burst preamble occupies 16, 20 or 24 bits in each of 4 consecutive subframes, arranged in one of two ways depending on whether frame mode or subframe mode is in use. The preamble consists of four words: Pa, Pb, Pc and Pd. When a preamble word is placed in an AES3 subframe, its MSB is placed in slot 27 of that subframe and its LSB is placed in slot 12, 8 or 4, according to the data mode. In 16-bit mode, slots 11-8 are set to 0 in each subframe containing a preamble word. In 16-bit and 20-bit modes, slots 7-4 are also set to 0 in each subframe containing a preamble word. Table 6 shows the preamble words.

In frame mode, the 4 preamble words are contained in 2 consecutive frames. The frame that starts the data burst contains Pa in the channel 1 subframe and Pb in the channel 2 subframe. The next frame contains Pc in channel 1 and Pd in channel 2.

In subframe mode, the 4 preamble words are contained in 4 sequential subframes of the single channel (channel 1 or channel 2) used to transmit non-PCM data. The subframe (of the channel in use) in which the data burst starts contains Pa, the next subframe of that channel contains Pb, and so on.

The burst info value should contain information of the burst payload content specified in the CELT static allocation table. bit 15, bit 19 or bit 23 of burst_info should be regarded as MSB, depending on the data pattern (16, 20 or 24 bits). Note that burst_info msb will always be located in slot 27 of the AES3 subframe.

Data type (data_type): the 5-bit data_type field indicates the type of data contained in the burst payload. The MSB of the data_type field is placed in bit 4, 8 or 12 of the burst information word, depending on the data pattern. Note that the data_type MSB will always be located in slot 16 of the AES3 subframe. The data_type value applies to only a single data burst. When multiple data streams are carried within the AES3 interface, the data type may vary between different stream numbers. The supported data types and the mapping of data_type values to specific data types are defined in SMPTE 338M. Other standards may include format requirements that depend on the specific data type. Table 7 shows burst information values:

Data mode (data_mode): the 2-bit data_mode field indicates the mode in which burst_payload data is placed in the AES3 subframe. The MSB of the data_mode field should be placed in bit 6, 10 or 14 of the burst information word, depending on the data pattern. Note that the MSB of data_mode will always be located in slot 18 of the AES3 subframe. In each data pattern, the burst_payload data word should occupy the subframe slots shown in Table 8. In the 16-bit and 20-bit modes, the unused slots should contain a value of 0. The data_mode value applies to only a single data burst. The data pattern may vary between consecutive data bursts of a given data_stream_number, or between data bursts of different stream numbers when multiple data streams are carried within the AES3 interface. Table 8 shows data pattern values:

Error flag (error_flag): this bit provides an error indication for the data in burst_payload. If the data in burst_payload is known to be error-free, or if the data contains no errors, the value of this bit is set to 0. If the data in burst_payload is known to contain errors, the bit is set to 1. Note that error_flag will always be located in slot 19 of the AES3 subframe.

Data type dependent (data_type_dependent): this field contains 5 bits, the meaning of which depends on the value of data_type. The particular encoding of this field may be found in other standards or recommended practices applicable to the particular data type. A reference to the document describing this field for a particular data type may be found in SMPTE 338M.

Data stream number (data_stream_number): the 3-bit data_stream_number field represents the number of the data stream to which the burst belongs. The MSB of data_stream_number should be placed in bit 15, 19 or 23 of the burst information word, according to the data pattern. The data_stream_number MSB will always be located in slot 27 of the AES3 subframe.
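The field positions given above for burst_info can be combined into one packing routine. The sketch below derives the layout from the MSB positions stated in the text (in 16-bit mode: data_type in bits 0-4, data_mode in bits 5-6, error_flag in bit 7, data_type_dependent in bits 8-12, data_stream_number in bits 13-15). Shifting the whole word up by 4 or 8 bits in the 20-bit and 24-bit patterns and zero-filling the low-order bits is an assumption, as are the function names. The numeric data_mode value is passed in by the caller and is not derived here.

```python
# Left shift that moves the 16-bit layout up to the top of the word.
SHIFT = {16: 0, 20: 4, 24: 8}


def pack_burst_info(data_type, data_mode, error_flag,
                    data_type_dependent, data_stream_number, bits=16):
    """Pack the burst_info fields into one 16-, 20- or 24-bit word."""
    if bits not in SHIFT:
        raise ValueError("data pattern must be 16, 20 or 24 bits")
    fields = [
        ("data_type", data_type, 5),
        ("data_mode", data_mode, 2),
        ("error_flag", error_flag, 1),
        ("data_type_dependent", data_type_dependent, 5),
        ("data_stream_number", data_stream_number, 3),
    ]
    value, offset = 0, 0
    for name, field, width in fields:
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        value |= field << offset
        offset += width
    # Low-order bits stay 0 in 20/24-bit patterns (assumption).
    return value << SHIFT[bits]


def slot_of_bit(bit, bits):
    """AES3 slot carrying bit `bit` of a word whose MSB is in slot 27."""
    return 27 - (bits - 1 - bit)
```

With this layout, `slot_of_bit(4, 16)`, `slot_of_bit(8, 20)` and `slot_of_bit(12, 24)` all evaluate to 16, which matches the statement that the data_type MSB always lands in slot 16.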

Each independent data stream uses a unique value of data_stream_number. Eight data stream numbers (0-7) are available. Data stream number 7 is reserved for the time stamp data type: all time stamp data bursts are encoded with the data stream number set to 7. Data stream numbers 0-6 may be used for all data types except the time stamp data type. Thus, in frame mode, up to 7 independent data streams may be time multiplexed in the AES3 interface. In subframe mode, each AES3 channel is handled separately, and the requirement for a unique data stream number per data stream applies only within a given AES3 channel. In this mode, up to 14 independent data streams (7 in each channel) may be time multiplexed in the AES3 interface.

It is noted that a single time stamp data burst applies to a specific data burst of another data type. Although all time stamp data bursts are identified by data stream number 7, they should not be considered a single stream of related time stamp values. When time code information is carried within time stamp data bursts, multiple time code streams may be transmitted within the data bursts identified by data stream number 7. Other standards or recommended practices contain further information about the time stamp data type.

Length code (length_code): the length code represents the length of burst_payload in bits. The length_code occupies 16, 20 or 24 bits of the AES3 subframe, depending on the data pattern. The length_code MSB is always located in slot 27 of the AES3 subframe.

The size of the burst_payload field is limited according to the data pattern: 0 to 65535 bits in 16-bit mode, 0 to 1048575 bits in 20-bit mode, and 0 to 16777215 bits in 24-bit mode. The size of burst_preamble is not counted in the value of length_code.
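These limits are simply the largest values an unsigned 16-, 20- or 24-bit length_code can hold, that is $2^{n}-1$ for an $n$-bit data pattern. A short sketch (hypothetical helper names) that checks a payload length against the limit and counts the data words needed in subframe mode:

```python
def max_payload_bits(bits):
    """Largest burst_payload size, in bits, for a given data pattern."""
    if bits not in (16, 20, 24):
        raise ValueError("data pattern must be 16, 20 or 24 bits")
    return (1 << bits) - 1


def encode_length_code(payload_bits, bits):
    """Return length_code for a payload; the preamble is not counted."""
    if not 0 <= payload_bits <= max_payload_bits(bits):
        raise ValueError("payload too long for this data pattern")
    return payload_bits


def words_needed(payload_bits, bits):
    """Number of `bits`-wide data words needed to carry the payload."""
    return -(-payload_bits // bits)  # ceiling division
```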

The burst payload is partitioned into data words and placed in a continuous sequence of AES3 frames in one of two modes:

In frame mode, both AES3 channels are used together to transmit a set of non-PCM data streams. When data bursts are packed into a continuous sequence of frames, the available data space of the two subframes within each AES3 frame is combined. This mode allows up to 32, 40 or 48 data bits to be placed in a single AES3 frame, depending on the data_mode setting in the burst_preamble. The burst payload is treated as a serial bit stream: the first bit of the first data word of the payload occupies the MSB position of subframe 1 (slot 27), and the last bit of the first data word occupies the LSB position of subframe 2 (determined by the data pattern). The last data bits of the burst payload may occupy only part of the last frame. Any unused bits in the last frame are set to 0.

In subframe mode, each AES3 channel should be used separately to carry a set of non-PCM data streams or linear PCM audio. When packaging data bursts into a continuous sequence of frames, the sub-frames of each AES3 channel within a frame should be considered separately. This mode will allow a maximum of 16, 20 or 24 data bits per channel to be placed in a single AES3 frame, depending on the data_mode setting in the burst_preamble.

Regarding burst_payload as a serial bit stream, the first bit of the first data word of the payload in the burst should occupy the MSB bit position of the subframe (slot 27), and the last bit of the first data word should occupy the LSB bit position of the subframe (according to the data_mode setting). The last data bit of burst_payload may occupy only a small portion of the last frame. Any unused bits in the last frame should be set to 0.
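The two packing modes can be illustrated with a small sketch that cuts a payload bit stream into subframe-sized words, MSB first, and zero-pads the final frame. The function name and the representation of the payload as a string of '0' and '1' characters are assumptions made for readability. In frame mode the returned words alternate between channel 1 and channel 2, while in subframe mode they all belong to the one channel in use.

```python
def pack_payload(bitstring, bits, frame_mode):
    """Split a payload bit stream into subframe data words.

    bitstring  -- payload bits in transmission order, e.g. "1011..."
    bits       -- data pattern: 16, 20 or 24 bits per subframe
    frame_mode -- True: both channels are combined, so the payload is
                  padded to a whole frame (2 * bits);
                  False: one channel only, padded to a single subframe
    """
    if bits not in (16, 20, 24):
        raise ValueError("data pattern must be 16, 20 or 24 bits")
    if set(bitstring) - {"0", "1"}:
        raise ValueError("bitstring may contain only '0' and '1'")
    unit = 2 * bits if frame_mode else bits
    padded_len = -(-len(bitstring) // unit) * unit  # round up to a unit
    padded = bitstring.ljust(padded_len, "0")       # unused bits are 0
    return [int(padded[i:i + bits], 2)
            for i in range(0, padded_len, bits)]
```

For an 8-bit payload of all ones in the 16-bit pattern, frame mode yields two words (`0xFF00` for channel 1 and `0x0000` for channel 2), whereas subframe mode yields the single word `0xFF00`.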

In the subframe mode, the channel status word for each channel should be processed separately. Any channel transmitting non-PCM data sets the channel status bit according to the present interface. Any channel that transmits linear PCM audio sets the channel state bit according to AES 3.

Burst interval: any sequence of 4096 or more AES3 frames (frame mode) or subframes (subframe mode) that contains at least one data burst must contain the beginning of at least one data burst that is preceded by four AES3 subframes whose slots 8-27 are all 0. This requirement ensures that the extended synchronization code (four zero words followed by Pa, Pb) occurs. Data bursts from a given non-PCM data stream are placed in the AES3 interface in sequential order. If multiple non-PCM data streams are placed in the AES3 interface (or in a single channel in subframe mode), the data bursts from each stream should be interleaved in a time-multiplexed manner. The present interface has no other requirements for the multiplexing method as long as each stream meets the above requirements. Other standards or recommended practices may define specific multiplexing techniques that may depend on the data type.

The present interface does not require that data bursts be placed at fixed intervals or positions in the AES3 stream, although such placement may be used to convey synchronization information. Other standards or recommended practices may specify fixed reference positions and repetition rates for particular data types for this purpose. The time_stamp data type is also used to transmit specific synchronization information for certain data types.

Data type dependent fields: the format of the data contained in the data_type_dependent and burst_payload fields depends on the data_type field and is beyond the scope of the present specification. The specific encoding of these fields can be found in other standards or recommended practices applicable to the specific data type.

Consumer format compatibility: the present interface is intended only for professional devices. Some users in a professional environment may wish to be compatible with consumer devices. For explicit compatibility with consumer equipment, professional devices should implement an interface, or a specific interface mode, conforming to the consumer format specification IEC 60958. However, it may be desirable for some professional devices to be compatible at the level of bitstream formatting while still using the AES3 interface in professional mode. This embodiment lists the specific compatibility requirements and issues that need to be considered in terms of bitstream format.

Receiving devices: the consumer bitstream format is generally a subset of the professional bitstream format defined by the present specification. Professional equipment implementing the present interface can read the consumer burst preamble and correctly extract consumer burst data carried in the AES3 interface. This does not guarantee that a professional receiver will be able to correctly receive and decode all data types. The present interface may not define certain consumer data types. In operation, a professional receiver will discard any data bursts containing undefined data types.

Source devices: for a professional device to generate an AES3 output bitstream compatible with the consumer format, data bursts should be formatted in the following way:

All data bursts should be limited to 16-bit frame mode (data_mode = 0, both AES3 subframes are used).

There should be at least one bitstream with data_stream_number = 0. This bitstream should contain the encoded audio information that is considered to be the primary audio service. Consumer equipment may not be able to receive a bitstream with a data stream number greater than 0.

Some professional data types defined by the present interface may not be defined in the consumer format. Consumer equipment can be expected (but is not guaranteed) to ignore these data types.

Consumer equipment should be able to read the burst preamble and correctly extract data bursts formatted according to the above recommendations. This does not guarantee that a consumer receiver can correctly receive and decode all data types. Professional devices should also consider the consumer synchronization requirements for certain data types, which may affect the ability of consumer equipment to receive and decode the encoded data. Many consumer receivers rely on channel state information to detect non-PCM encoded data. In an AES3 signal with the channel state set according to the present interface, a consumer receiver may not be able to detect the bitstream.

Data type related issues: for data types common to both professional and consumer formats, issues related to the specific data type may still affect bitstream compatibility between professional and consumer devices. For certain data types, the data_type_dependent field may differ between the professional specification and the consumer specification. Such differences may prevent proper exchange of encoded data and/or synchronization information. The encoding of burst_payload may also differ for certain data types. Furthermore, the synchronization method required by the consumer format for certain data types may impose restrictions on the data burst interval, which in some cases may differ from what the professional format allows. Data type specific format issues are outside the scope of the present interface. Compatibility for a particular data type may be addressed by other standards and recommended practices. References to documents containing specific data type format requirements may be found in SMPTE 338M.

Automatic detection of audio/data mode: the AES3 interface may transmit PCM audio, non-PCM data, or both in separate channels. A receiving device capable of receiving an AES3 stream containing PCM and non-PCM data needs to identify whether the AES3 content is to be treated as PCM audio, non-PCM data, or both. This information can be conveyed by setting bit 1 of the channel state word to indicate data. In some applications, it may be useful for the receiver to be able to determine whether the AES3 content is PCM audio or non-PCM data without reference to bit 1 of the channel state word. Because the synchronization code formed by the first two words of the preamble (Pa, Pb) is unlikely to occur frequently in natural PCM audio, this can be achieved very reliably.

By looking for an extended sync code consisting of six words (four zero words followed by Pa, Pb; in the 16-bit pattern, 0x0000, 0x0000, 0x0000, 0x0000, 0xF872, 0x4E1F), the probability of a false synchronization becomes very small. The decision process that can be followed is shown in fig. 5. In fig. 5, the mode of the receiver may be switched between PCM and data. The SYNC function indicates whether an extended synchronization code is found within a range of 4096 AES3 frames. Note that if the AES3 stream is idle (all zero), the auto-detector will enter PCM mode and switch back to data mode only when a data burst occurs. If such behavior is not desired, it can be prevented by inserting null data bursts at least once every 4096 AES3 frames.
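The search for the extended sync code can be sketched as a scan over a sequence of received 16-bit data words. This is a simplified illustration, not the decision process of fig. 5: it assumes the words have already been extracted from the subframes in transmission order, uses the 16-bit values of Pa and Pb quoted above, and ignores the 4096-frame window. The names `find_extended_sync` and `detect_mode` are hypothetical.

```python
# Extended sync code in the 16-bit pattern: four zero words, Pa, Pb.
EXTENDED_SYNC_16 = (0x0000, 0x0000, 0x0000, 0x0000, 0xF872, 0x4E1F)


def find_extended_sync(words):
    """Return the index of Pa in the first extended sync code, or -1."""
    n = len(EXTENDED_SYNC_16)
    for i in range(len(words) - n + 1):
        if tuple(words[i:i + n]) == EXTENDED_SYNC_16:
            return i + 4  # Pa follows the four zero words
    return -1


def detect_mode(words):
    """Classify a block of words as 'data' or 'pcm' (simplified)."""
    return "data" if find_extended_sync(words) >= 0 else "pcm"
```

An all-zero (idle) block is classified as `"pcm"` by this sketch, which mirrors the idle-stream behavior noted above.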

Fig. 3 shows a data transmission device based on a serial digital audio interface according to an embodiment of the present application, including:

a data acquisition module 310 for acquiring data stream data for generating a data burst; wherein the data stream data comprises audio data and/or audio model metadata;

a data placement module 320, configured to place audio samples of the data stream data in a payload field of a subframe of the data burst according to a data pattern of the data stream data;

a field setting module 330, configured to set a field other than the payload field in the data burst;

a data transmission module 340 for transmitting the data bursts over a serial digital audio interface.

the data placement module is specifically used for:

Optionally, the format of the data stream data is PCM data or non-PCM data;

if the data stream data is non-PCM data, the field setting module is specifically configured to:

and setting a preset byte in a channel state field in the data burst.

Optionally, the data transmission device based on the serial digital audio interface further comprises:

a numeric identifier marking module, configured to mark the data burst with a numeric identifier after the fields other than the payload field in the data burst have been set, so as to indicate the data stream to which the data burst belongs.

Optionally, the field setting module is specifically configured to:

the setting the preamble field of the data burst includes:

in subframe mode, the payload field is set to each AES3 channel alone to carry a set of non-PCM data burst sequences or linear PCM audio streams.

The data transmission device based on the serial digital audio interface provided by the embodiment of the invention can execute the data transmission method based on the serial digital audio interface provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 4, the electronic device includes: processor 410, memory 420, input device 430, and output device 440. The number of processors 410 in the electronic device may be one or more, one processor 410 being illustrated in fig. 4. The number of memories 420 in the electronic device may be one or more, one memory 420 being illustrated in fig. 4. The processor 410, memory 420, input device 430, and output device 440 of the electronic device may be connected by a bus or by other means; connection by a bus is taken as the example in fig. 4. The electronic device may be a computer, a server, etc. In the embodiment of the application, the electronic device is described in detail as a server, and the server can be an independent server or a cluster server.

Memory 420, as a computer readable storage medium, may be used to store software programs, computer executable programs, and modules, such as the program instructions/modules corresponding to the serial digital audio interface based data transmission device described in any embodiment. Memory 420 may primarily include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application program required for a function, and the data storage area may store data created according to the use of the device, etc. In addition, memory 420 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, memory 420 may further include memory located remotely from processor 410, which may be connected to the device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 430 may be used to receive input digital or character information and to generate key signal inputs related to user settings and function control of the electronic device, and may also include a camera for capturing images and a pickup device for capturing audio data. The output device 440 may include an audio device such as a speaker. It should be noted that the specific composition of the input device 430 and the output device 440 may be set according to the actual situation.

The processor 410 performs various functional applications of the device and data processing, i.e., implements a data transmission method based on a serial digital audio interface, by running software programs, instructions, and modules stored in the memory 420.

Embodiments of the present application also provide a storage medium containing computer-executable instructions that, when executed by a computer processor, implement the serial digital audio interface-based data transmission method provided by any of the embodiments.

Of course, the computer-executable instructions contained in the storage medium provided in the embodiments of the present application are not limited to the operations of the method described above, but may also perform the related operations in the method provided in any embodiment of the present application, with the corresponding functions and beneficial effects.

From the above description of embodiments, it will be clear to a person skilled in the art that the present application may be implemented by means of software and the necessary general purpose hardware, and of course also by means of hardware alone, although in many cases the former is the preferred embodiment. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The software product may be stored in a computer readable storage medium, such as a floppy disk, a read-only memory (ROM), a random access memory (RAM), a flash memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for causing a computer device (which may be a robot, a personal computer, a server, or a network device, etc.) to execute the method according to any embodiment of the present application.

It should be noted that, in the above electronic device, each unit and module included are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the present application.

It is to be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques well known in the art may be used: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, Programmable Gate Arrays (PGAs), Field Programmable Gate Arrays (FPGAs), and the like.

In the description of the present specification, reference to the term "in one embodiment," "in another embodiment," "exemplary," or "in a particular embodiment," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

While the application has been described in detail with respect to the general description, the detailed description and the experiments, it will be apparent to those skilled in the art that modifications or improvements can be made thereto based on the application. Accordingly, such modifications or improvements may be made without departing from the spirit of the application and are intended to be within the scope of the invention as claimed.

Claims

1. A data transmission method based on a serial digital audio interface, comprising:

setting a field other than the payload field in the data burst;

wherein, the format of the data stream data is PCM data or non-PCM data;

setting a preset byte in a channel state field in the data burst;

The setting the preset byte in the channel state field in the data burst includes:

setting preset bytes in a channel state bit field in the data burst to comprise bytes 0, 1, 2 and 23; wherein,

bits 0-7 of byte 23 are set according to section 4 of AES3 to indicate a valid CRCC value of the channel state block;

wherein said setting a field other than said payload field in said data burst comprises:

setting a preamble field of the data burst, wherein the preamble field is arranged at the beginning of each data burst and is positioned in front of the payload field, and the preamble field comprises 4 words and respectively represents a first synchronous word, a second synchronous word, a burst information value and a length code;

The setting the preamble field of the data burst includes:

in a subframe mode, setting 4 words contained in the preamble in 4 sequence subframes of a single channel for transmitting non-PCM data, wherein a first subframe in the 4 sequence subframes is a subframe for starting the data burst, and the first synchronous word, the second synchronous word, the burst information value and the length code are sequentially set in the 4 sequence subframes;

the data bursts are transmitted over a serial digital audio interface.

2. The method of claim 1, wherein the data patterns comprise a 16-bit pattern, a 20-bit pattern, and a 24-bit pattern;

3. The method of claim 1, further comprising, after said setting a field other than said payload field in said data burst:

4. The method of claim 1, wherein,

in frame mode, the payload field is set to transmit a set of non-PCM data streams using two AES3 channels;

5. A data transmission device based on a serial digital audio interface, comprising:

wherein, the format of the data stream data is PCM data or non-PCM data;

setting a preset byte in a channel state field in the data burst;

the field setting module is specifically configured to:

the setting the preamble field of the data burst includes:

6. An electronic device, comprising: a memory and one or more processors;

the memory is used for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-4.

7. A storage medium containing computer executable instructions which, when executed by a computer processor, implement the method of any one of claims 1-4.