WO2016006915A1

WO2016006915A1 - Method and apparatus for sending multimedia data

Info

Publication number: WO2016006915A1
Application number: PCT/KR2015/007014
Authority: WO
Inventors: 정경훈; 오은미; 정종훈; 황선호
Original assignee: 삼성전자 주식회사
Priority date: 2014-07-08
Filing date: 2015-07-07
Publication date: 2016-01-14

Abstract

Provided are a method and an apparatus for sending multimedia data. A first device, which provides an audio signal to a second device, comprises: a control unit for dividing an audio signal inputted to the first device into a plurality of audio frames, comparing at least one previous audio frame pre-stored in a memory of the first device with a current audio frame among a plurality of audio frames, and selecting one from among the pre-stored previous audio frames on the basis of a similarity between the pre-stored previous audio frame and the current audio frame; and a communication unit for sending an identification value of the selected previous audio frame to the second device.

Description

Method and apparatus for transmitting multimedia data

The present disclosure relates to a method and apparatus for transmitting multimedia data using wireless communication and a data network.

Methods for solving the problems of data transmission in wireless communication and networks are as follows. For example, to reduce interference between two networks in an environment with multiple wireless networks, several methods monitor the network status using various parameters that indicate the transmission status between the networks and reduce the interference accordingly. Most of these methods monitor network conditions using the RSSI (Received Signal Strength Indication), system noise level information, and the power spectrum of the transmitted signal. Based on the results, interference between different data networks is minimized by adjusting transmitter power in the RF stage or by changing the channel map used for data transmission. Other methods adjust priority levels or antenna isolation between a plurality of wireless networks, or, when the same antenna is shared by a plurality of networks, configure the antenna system with an antenna switch so that the networks can coexist.

In addition, there are methods that adjust the data rate using packet-loss information, and methods that use a scalable codec to adjust the amount of multimedia data transmitted according to the channel condition and the bandwidth.

When a packet is lost or distorted, a frame error occurs, and an error concealment method is used to conceal it. Error concealment methods commonly used when voice or audio information is transmitted include: a muting method, which reduces the effect of the error on the output signal by reducing the volume of the sound in the frame where the error occurs; a repetition method, which restores the signal of the erroneous frame by repeatedly playing the previous good frame; and an interpolation method, which predicts the parameters of the erroneous frame by interpolating the parameters of the previous good frame and the next good frame.
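The three concealment strategies named above can be illustrated with a minimal sketch. It works directly on lists of samples for simplicity, whereas the text describes interpolation of codec parameters; the function names and the default gain are assumptions made for illustration only.

```python
from typing import List

Frame = List[float]


def conceal_by_muting(frame: Frame, gain: float = 0.0) -> Frame:
    """Attenuate the samples played in the error frame (gain is an assumed knob)."""
    return [gain * s for s in frame]


def conceal_by_repetition(previous_good: Frame) -> Frame:
    """Replay the previous good frame in place of the error frame."""
    return list(previous_good)


def conceal_by_interpolation(previous_good: Frame, next_good: Frame) -> Frame:
    """Average the neighbouring good frames; needs the next frame, so it adds delay."""
    return [(a + b) / 2.0 for a, b in zip(previous_good, next_good)]
```

Only the interpolation variant needs the frame that follows the error, which is why it costs one frame of delay.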

Meanwhile, data transmission methods such as DTX (Discontinuous Transmission) and DRX (Discontinuous Reception) are generally used in order to improve power efficiency of a device for transmitting data and frequency efficiency of a network.

In addition, when data is transmitted through a wireless network, since data is transmitted through an unstable channel as compared to a wired network, an error may inevitably occur in data transmission. Accordingly, in order to reduce the error rate of data transmission, the device must transmit data with high transmission power, and in this case, there is a problem that the power efficiency of the device is reduced.

There are methods for reducing data interference and efficiently transmitting data in a plurality of wireless networks. However, the existing methods are mainly applied to minimize the interference between networks in the RF stage, for example by monitoring the network environment and using the results to adjust signal strength, or by adjusting antenna isolation to minimize the interference between channels.

In the case of adjusting the bit rate according to the situation of the data network, the conventional method uses the packet loss information to estimate the available bandwidth and adjust the bit rate accordingly. However, when interference between a plurality of wireless networks occurs, a method of adjusting the bit rate with one parameter may not accurately reflect the situation due to the interference of the network.

When using real-time audio or video streaming service using a wireless network, robustness against latency and channel error is an important factor. In this case, delay and robustness are generally determined by the size of the input buffer of the receiver. In other words, if the size of the input buffer is large, the robustness against the error is improved, but delay occurs. If the size of the input buffer is small, the delay is short, but the robustness is low. For audio data transmitted with video, such as a TV, sync and delay are very important. However, for data with only an audio signal, synchronization and delay are less important than for video. However, since the conventional method uses a fixed buffer size without considering the type of the input data, it does not effectively cope with the synchronization and delay caused by the data type.

In the case of error concealment, the repetition method mentioned above, which reconstructs an error signal using the previous normal frame, causes problems such as discontinuity and phase mismatch between frames when simple repetition is used, so good performance is hard to expect. Interpolation using a previous normal frame and a subsequent normal frame requires a delay of one frame, which is not suitable for delay-sensitive real-time audio or video streaming services.

In some embodiments, a problem to be solved is to analyze the wireless network environment, including a plurality of network status parameters for detecting network interference and the bandwidth change caused by a plurality of wireless peripheral devices using the same network. In addition, some embodiments provide a method for efficiently adjusting the bit rate of multimedia data such as audio and video according to the analysis and constructing a packet using the same. That is, data is encoded losslessly or at a high bit rate in a strong electric field, where the network environment is good, and data is transmitted at a low bit rate in a weak electric field, where the network environment is bad.

In addition, the problem to be solved in some embodiments is to provide a method of effectively adjusting the delay according to the media type (audio, video) so as to minimize the delay in case of synchronization-critical data and to be robust to errors otherwise.

In addition, a problem to be solved in some embodiments is to provide a low-complexity frame error concealment method and apparatus that requires no additional delay, by using an error concealment method based on a previous normal frame when an error frame occurs. In doing so, sound quality degradation can be minimized by minimizing the discontinuity and phase mismatch between frames that occur during error concealment.

In addition, some embodiments may provide a method and apparatus capable of reducing power consumption of a device in transmitting audio data between devices connected via a wireless communication network.

In addition, some embodiments provide a method and apparatus by which devices transmitting and receiving audio data can manage and synchronize memory storing audio frames.

In addition, some embodiments provide a method and apparatus for selectively transmitting audio frames and identification values of audio frames.

FIG. 1 is a schematic block diagram of an apparatus for controlling a bit rate and latency and configuring a packet to be transmitted according to a wireless network situation in some embodiments.

FIG. 2 is a flowchart of a method of determining a packet size, a frame number, and a frame size according to a network channel according to some embodiments.

FIG. 3 is a diagram illustrating an example of a method of determining a packet size, a frame number, and a frame size according to a network channel, according to some embodiments.

FIG. 4 is a diagram illustrating an embodiment of a method of adjusting a predetermined frame size and the number of frames to be optimized for a packet size.

FIG. 5 is a flowchart of a method of determining a lossy or lossless mode in accordance with some embodiments.

FIG. 6 is a diagram illustrating a relationship between goodness of a plurality of network state parameters and a set bit rate.

FIG. 7 is a block diagram illustrating in detail the coding mode / bit rate determination module 140, the media encoding module 160, and the media packet generation module 170 in FIG. 1.

FIG. 8A is a diagram illustrating an embodiment of generating a packet according to a predefined number of frames and a frame size according to a channel mode and a bit rate.

FIG. 8B is a diagram illustrating an embodiment of configuring a bitstream when input data is audio data and a coding mode is lossless coding.

FIG. 9 illustrates an input buffer control method of the receiving end 950 according to a media type.

FIG. 10 is a schematic block diagram of a data reproducing apparatus 1000 for processing and decoding a packet input at a receiving end of some embodiments.

FIG. 11 is a diagram illustrating a problem of phase mismatch or discontinuity that occurs during simple data repetition.

FIG. 12 is a diagram illustrating a process of applying an overlap-and-add method in a first error frame as an embodiment of a method of generating repetitive data used for error concealment.

FIG. 13A is a diagram illustrating a process of applying an overlap-and-add method at one end of the repetitive data.

FIG. 13B is a diagram illustrating a process of applying an overlap-and-add method at the other end of the repetitive data.

FIG. 14 is a diagram illustrating an error concealment method using a modified repetition buffer according to some embodiments.

FIG. 15 is a diagram illustrating an example in which the data transmission apparatus 100 provides audio data to the data reproduction apparatus 1000 according to some embodiments.

FIG. 16 is a flowchart of a method of providing, by the data transmission apparatus 100, the audio data to the data reproducing apparatus 1000 according to some embodiments.

FIG. 17 is a flowchart illustrating a method of compressing a current audio frame and transmitting the current audio frame to the data reproducing apparatus 1000 according to some embodiments.

FIG. 18 is a flowchart of a method of managing an audio frame by the data transmission apparatus 100 according to some embodiments.

FIG. 19 is a flowchart of a method of storing and managing an audio frame received from the data transmission apparatus 100 by the data reproducing apparatus 1000, according to some embodiments.

FIGS. 20 and 21 are block diagrams of a data transmission apparatus 100 according to some embodiments.

As a technical means for achieving the above-described technical problem, a first aspect of the present disclosure may provide a first device that provides an audio signal to a second device, the first device comprising: a memory for storing at least one previous audio frame; a controller for dividing an audio signal input to the first device into a plurality of audio frames, comparing a current audio frame among the plurality of audio frames with the at least one previous audio frame previously stored in the memory of the first device, and selecting one of the previously stored previous audio frames based on a similarity between the previously stored previous audio frame and the current audio frame; and a communication unit which transmits an identification value of the selected previous audio frame to the second device.

The controller may select one of the previously stored previous audio frames according to the similarity between the previously stored previous audio frame and the current audio frame being greater than a preset threshold.

The controller may compress the current audio frame when the similarity between the previously stored previous audio frame and the current audio frame is smaller than a preset threshold, and the communication unit may transmit the compressed current audio frame to the second device.

When the compressed current audio frame is successfully transmitted to the second device, the controller may delete at least one of the previously stored previous audio frames and store the current audio frame in the memory.

The controller may delete the previous audio frame stored first in the memory among the previously stored previous audio frames.

The controller may delete one of the previously stored previous audio frames based on the number of times the identification value of the previously stored previous audio frame is transmitted to the second device.

The communication unit may retransmit the identification value of the selected previous audio frame to the second device within a preset transmission time limit as the transmission of the identification value of the selected previous audio frame fails.

The controller may determine a method of transmitting a next audio frame as transmission of the identification value of the selected previous audio frame fails within the preset transmission time limit.
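The sender-side selection described in this aspect can be sketched as follows. The similarity measure (normalized correlation), the cache capacity, and every name here (`FrameCache`, `select_previous_frame`) are assumptions for illustration; the disclosure does not fix a particular similarity metric in this passage. Eviction follows the "stored first" deletion policy mentioned above.

```python
import math
from collections import OrderedDict
from typing import List, Optional

Frame = List[float]


def similarity(a: Frame, b: Frame) -> float:
    """Normalized correlation in [-1, 1]; an assumed stand-in for the similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


class FrameCache:
    """Memory of previous audio frames, keyed by identification value."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.frames = OrderedDict()  # identification value -> frame, oldest first
        self.next_id = 0

    def store(self, frame: Frame) -> int:
        """Store a frame, deleting the frame stored first when the memory is full."""
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)
        frame_id = self.next_id
        self.frames[frame_id] = list(frame)
        self.next_id += 1
        return frame_id


def select_previous_frame(cache: FrameCache, current: Frame,
                          threshold: float) -> Optional[int]:
    """Return the identification value to send, or None to compress and send the frame."""
    best_id, best_sim = None, -1.0
    for frame_id, frame in cache.frames.items():
        sim = similarity(frame, current)
        if sim > best_sim:
            best_id, best_sim = frame_id, sim
    return best_id if best_sim > threshold else None
```

When `select_previous_frame` returns `None`, the caller would compress and transmit the current frame and call `cache.store(current)` only after the transmission succeeds, so that both devices keep the same set of frames.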

In addition, a second aspect of the present disclosure may provide a method by which a first device provides an audio signal to a second device, the method comprising: dividing an audio signal input to the first device into a plurality of audio frames; comparing a current audio frame among the plurality of audio frames with at least one previous audio frame previously stored in a memory of the first device; selecting one of the previously stored previous audio frames based on a similarity between the previously stored previous audio frame and the current audio frame; and transmitting an identification value of the selected previous audio frame to the second device.

In addition, the selected previous audio frame may have been stored in the memory of the first device and in a memory of the second device, respectively, upon being successfully transmitted to the second device.

The selecting of one of the previously stored previous audio frames may include selecting one of the previously stored previous audio frames when the similarity between the previously stored previous audio frame and the current audio frame is greater than a preset threshold.

The method may further include: compressing the current audio frame according to a similarity between the previously stored previous audio frame and the current audio frame being less than a preset threshold; And transmitting the compressed current audio frame to the second device.

The method may further include deleting at least one of the previously stored previous audio frames as the compressed current audio frame is successfully transmitted to the second device; And storing the current audio frame in the memory as at least one of the previously stored previous audio frames is deleted.

In the deleting of one of the previously stored previous audio frames, the previous audio frame stored first in the memory may be deleted.

The deleting of one of the previously stored previous audio frames may include deleting one of the previously stored previous audio frames based on the number of times the identification value of the previously stored previous audio frame is transmitted to the second device.

The method may further include retransmitting the identification value of the selected previous audio frame to the second device within a preset transmission time limit when the transmission of the identification value of the selected previous audio frame fails.

The method may further include determining a transmission method of a next audio frame as transmission of the identification value of the selected previous audio frame fails within the preset transmission time limit.

In addition, a third aspect of the present disclosure may provide a device that receives an audio signal from another device, the device comprising: a communication unit for receiving, from the other device, data related to an audio signal input to the other device; a control unit that determines the type of the received data, extracts the audio frame corresponding to an identification value from a memory of the device if the determined type of the data is an identification value of a pre-stored audio frame, and expands (decompresses) the bitstream into the current audio frame if the determined type of the data is a bitstream of a current audio frame divided from the audio signal input to the other device; and an output unit configured to output an audio signal of the extracted audio frame or an audio signal of the expanded current audio frame.

Further, a fourth aspect of the present disclosure may provide a method by which a device receives an audio signal from another device, the method comprising: receiving, from the other device, data related to an audio signal input to the other device; determining a type of the received data; extracting an audio frame corresponding to an identification value from a memory of the device when the determined type of the data is an identification value of a pre-stored audio frame; expanding (decompressing) the bitstream into the current audio frame when the determined type of the data is a bitstream of a current audio frame divided from the audio signal input to the other device; and outputting an audio signal of the extracted audio frame or an audio signal of the expanded current audio frame.
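The receiver-side branching of the third and fourth aspects can be sketched as below. The payload tagging (`"id"` or `"bitstream"`), the `decode` callback, and the function name `receive` are hypothetical; the disclosure states only that the type of the received data is determined and handled accordingly.

```python
from typing import Callable, Dict, List, Tuple, Union

Frame = List[float]
Payload = Tuple[str, Union[int, bytes]]


def receive(payload: Payload, memory: Dict[int, Frame],
            decode: Callable[[bytes], Frame]) -> Frame:
    """Return the audio frame to output for one received payload."""
    kind, value = payload
    if kind == "id":
        # Identification value: fetch the matching pre-stored frame from memory.
        return memory[value]
    if kind == "bitstream":
        # Compressed current frame: expand it with the supplied decoder.
        return decode(value)
    raise ValueError(f"unknown payload type: {kind!r}")
```

A real receiver would also store each newly decoded frame so that its memory stays synchronized with the sender's; that bookkeeping is omitted here.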

Further, the fifth aspect of the present disclosure can provide a computer readable recording medium having recorded thereon a program for executing the method of the second aspect on a computer.

Advantages and features of the present invention, and methods of achieving them, will be apparent with reference to the embodiments described below in conjunction with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and can be implemented in various different forms. The embodiments are provided only to make the disclosure of the present invention complete and to fully inform a person having ordinary skill in the art to which the present invention belongs of the scope of the invention, which is defined only by the scope of the claims.

Terms used herein will be briefly described and the present invention will be described in detail.

The terms used in the present invention have been selected as widely used general terms as possible in consideration of the functions in the present invention, but this may vary according to the intention or precedent of the person skilled in the art, the emergence of new technologies and the like. In addition, in certain cases, there is also a term arbitrarily selected by the applicant, in which case the meaning will be described in detail in the description of the invention. Therefore, the terms used in the present invention should be defined based on the meanings of the terms and the contents throughout the present invention, rather than the names of the simple terms.

When a part of the specification is said to "include" a component, this means that it may further include other components, rather than excluding other components, unless otherwise stated. In addition, the term "part" (or "unit") as used herein refers to software or a hardware component such as an FPGA or ASIC, and a "part" performs certain roles. However, a "part" is not limited to software or hardware. A "part" may be configured to reside in an addressable storage medium and may be configured to run on one or more processors. Thus, as an example, a "part" includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functionality provided within the components and "parts" may be combined into a smaller number of components and "parts" or further separated into additional components and "parts".

DETAILED DESCRIPTION

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings so that those skilled in the art may easily implement the present invention. In the drawings, parts irrelevant to the description are omitted in order to clearly describe the present invention.

FIG. 1 is a schematic block diagram of an apparatus for controlling a bit rate and latency and configuring a packet to be transmitted according to a wireless network situation in some embodiments. The block diagram of FIG. 1 may be used to describe not only a data transmission apparatus but also a data transmission method.

Referring to FIG. 1, the data transmission apparatus 100 includes a channel mode initialization module 110, a wireless network state analysis module 120, a media type analysis module 130, a coding mode / bit rate determination module 140, a packetization condition control module 150, a media encoding module 160, and a media packet generation module 170. Here, a 'module' may also be referred to as a 'part'. For example, the channel mode initialization module 110 may be referred to as the channel mode initialization unit 110. The same applies to identification numbers 120 to 170. In addition, the media encoding module 160 may be referred to as the media encoder 160, and the media packet generation module 170 may be referred to as the media packet generator 170. When the data transmission method is described using the block diagram of FIG. 1, a module may be read as a step. For example, the channel mode initialization module 110 may be read as the channel mode initialization step 110. The same applies to identification numbers 120 to 170.

The channel mode initialization module 110 sets the channel mode of the network, determines the packet size according to the set channel mode, and, according to the determined size, sets the bit rate of the encoded data and the number of encoded data frames to be put in the packet. As an embodiment, Bluetooth supports various modes according to an error processing level and a packet configuration. Among them, EDR (Enhanced Data Rate) mode enables a data rate of up to 2.1 Mbps in asymmetric mode, and BR (Basic Rate) mode enables transmission of up to 723.2 kbps in asymmetric mode. Therefore, the packet size is determined according to the transmission speed supported by the selected channel. When the packet size is determined, the number of frames that can be included in the packet and the size of each frame are determined for each bit rate supported by the media codec. Equation 1 is an example of a method of determining the number of frames to be included in one packet according to the determined packet size. In this case, the frame size is determined according to the bit rate supported by the media codec. The round function is a rounding function. For example, the fractional part may be rounded down so that the number of frames is a natural number.
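Equation 1 is not reproduced in this text. As a hedged sketch of the computation the paragraph describes, the following assumes the number of frames is the packet size divided by the frame size with the fractional part rounded down; the function name and byte units are assumptions.

```python
def frames_per_packet(packet_size_bytes: int, frame_size_bytes: int) -> int:
    """Number of whole encoded frames that fit in one packet (rounded down)."""
    if frame_size_bytes <= 0:
        raise ValueError("frame size must be positive")
    return packet_size_bytes // frame_size_bytes
```

For instance, with a 512-byte packet and 120-byte frames this gives 4 frames, leaving 32 bytes unused.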

In step 210, the channel mode of the network is selected.

In step 220, the packet size is determined according to the bandwidth supported by the selected channel mode.

In step 230, a frame for each bit rate to be included in one packet is configured according to the packet size determined in step 220. Specifically, the number of data frames and the size of the frames to be put in the packet are set.

In step 240, the number of frames and the frame size are readjusted to fit the packet size. The readjustment of the number of frames and the frame size is described later with reference to FIG. 4.

In step 250, the determined frame number and frame size are stored in a bit rate table.

The method of FIG. 2 described above may be performed by a data transmission apparatus, a processor, or the like, and may be performed by the channel mode initialization module 110, for example.

In the embodiment of FIG. 3, the packet size is 512 bytes, and the frame size and the number of frames are determined for each of the bit rates supported by the media codec: 180, 229, 320, and 352.8 kbps. In this case, the amount of data determined according to each bit rate may be larger or smaller than the packet size determined according to the network channel mode.

FIG. 4 is a diagram illustrating an embodiment of a method of adjusting a predetermined frame size and the number of frames to be optimized for a packet size. FIG. 4 may be an embodiment that specifically describes the processing performed in step 240 of FIG. 2 described above.

As shown in FIG. 4, when the data amount of the encoded frames is smaller than the packet size, an empty space, for example, a residual data area R, is created, and transmission capacity is wasted because that much less data is transmitted. To prevent this inefficiency, the number of encoded data frames, the frame size, and the bit rate are adjusted to fit the packet size. If the frame size is L, the number of frames in one packet is K, and the size of the residual data area is R, the adjusted length L' of the frame can be L + R/K. In addition to adjusting the frame size, the number of frames and the bit rate may be adjusted.

In addition, if the data amount of the encoded frames exceeds the packet size, for example, when an excess data area R exists, the excess frame (frame 5 in FIG. 4) is excluded, or the encoded frame is split and transmitted in the next packet. In this case, since a delay of one packet is generated when the receiving end decodes the data, the number of frames, the size of the frame, and the bit rate are adjusted so that the encoded frame is not fragmented. If the frame size is L, the number of frames in one packet is K, and the size of the excess data area is R, the adjusted length L' of the frame can be L - R/K. In addition to adjusting the frame size, the number of frames and the bit rate may be adjusted.
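The two adjustment rules, L' = L + R/K for a residual area and L' = L - R/K for an excess area, can be checked with a short sketch. The function name and the choice to keep K fixed while adjusting only the frame size are assumptions; the text notes that the number of frames and the bit rate may be adjusted as well.

```python
def adjust_frame_size(frame_size: float, frames: int, packet_size: float) -> float:
    """Return the adjusted frame length L' so that K frames exactly fill the packet."""
    if frames <= 0:
        raise ValueError("number of frames must be positive")
    used = frame_size * frames
    if used <= packet_size:
        residual = packet_size - used          # R: unused space in the packet
        return frame_size + residual / frames  # L' = L + R/K
    excess = used - packet_size                # R: data that overflows the packet
    return frame_size - excess / frames        # L' = L - R/K
```

With a 512-byte packet, four 120-byte frames leave R = 32 and grow to 128 bytes each, while four 130-byte frames exceed the packet by R = 8 and shrink to 128 bytes each.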

The number of frames and the bit rate information determined as described above are transmitted to the coding mode / bit rate determination module 140 as shown in FIG. 1.

As another embodiment of the channel mode initialization module 110, a channel mode of a network may be set, a packet size may be determined according to the set channel mode, and a lossy or lossless coding mode of multimedia data may be determined according to the determined size. Coding may include compression. For example, lossy coding may include lossy compression. Lossless coding may also include lossless compression. If the bandwidth of the selected network channel is greater than or equal to a predetermined width, the channel mode initialization module 110 adds the frame number and frame size according to the lossless coding mode to the bit rate table.

In step 510, the channel mode of the network is selected.

In operation 520, the packet size is determined according to the bandwidth supported by the selected channel mode.

In step 530, it is determined whether the packet size determined in step 520 is greater than a data threshold. The data threshold is the minimum amount of data needed to apply a lossless codec.

As a result of the determination in step 530, if the packet size is not larger than the data threshold value, the lossless codec cannot be applied. Therefore, in step 532, a frame for each bit rate to be included in one packet is configured according to the packet size. Specifically, the number of data frames and the size of the frame to be put in the packet are set.

In step 534, the number of frames and the frame size are readjusted to fit the packet size. The readjustment of the number of frames and the frame size is as described above with reference to FIG. 4.

As a result of the determination in step 530, if the packet size is larger than the data threshold, a lossless codec may be applied. Thus, in step 536, a lossless coding mode is added to the output bit rate information.

In step 540, information for determining a coding mode and a bit rate is generated.

In step 550, the bit rate information is stored in the bit rate table. As shown in FIG. 5, the table for storing bit rate information stores information about the frame size and the number of frames for each of the bit rates 1, 2, ..., N or for the lossless coding mode.

The aforementioned method of FIG. 5 may be performed by a data transmission apparatus, a processor, or the like, and may be performed by, for example, the channel mode initialization module 110 and the coding mode / bit rate determination module 140 of FIG. 1.
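The flow of FIG. 5 can be sketched as a table-building routine. This is one possible reading: it keeps the lossy entries in every case and adds a lossless entry only when the packet size exceeds the data threshold. The function name, the table layout, and the use of simple integer division in place of the FIG. 4 readjustment are all assumptions.

```python
from typing import Dict, Tuple


def build_bit_rate_table(packet_size: int, data_threshold: int,
                         lossy_frame_sizes: Dict[str, int],
                         lossless_frame_size: int) -> Dict[str, Tuple[int, int]]:
    """Map each coding entry to (frame size, number of frames per packet)."""
    table: Dict[str, Tuple[int, int]] = {}
    # Steps 532-534 (simplified): configure frames for each lossy bit rate.
    for bit_rate, frame_size in lossy_frame_sizes.items():
        table[bit_rate] = (frame_size, packet_size // frame_size)
    # Steps 530 and 536: add the lossless mode only if the packet is large enough.
    if packet_size > data_threshold:
        table["lossless"] = (lossless_frame_size,
                             packet_size // lossless_frame_size)
    return table
```

The resulting dictionary plays the role of the bit rate table consulted later when the coding method is chosen.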

The generated table of coding modes, frame sizes and number of frames is used to determine the coding method of the media encoder in the coding mode / bit rate determination module 140.

The analysis wireless network status module 120 of FIG. 1 analyzes information according to a wireless network environment. At this time, a representative example of information used includes a receiver signal strength index (RSSI), a link quality indicator (LQI), and adaptive frequency hopping (AFH) channel masking information. RSSI indicates the strength of the received signal. LQI is a parameter that indicates the quality of the connected communication state and expresses the BER (bit error rate) as an integer between 0 and 255. Adaptive Frequency Hopping (AFH) is a hopping scheme used in Bluetooth and randomly hops between 79 available channels and masks unavailable channels. At this time, if the network environment is good, the number of masking channels is small. However, when the interference is severe, the number of masking channels increases. When a plurality of wireless networks coexist using the above-described parameters, it is possible to check the network situation caused by the influence of interference between channels, diffraction of obstacles, walls and the like. It also utilizes information such as the number of wireless peripherals on the same network. As peripherals use the same network, as the number of devices increases, the available bandwidth per device decreases and the bit rate decreases. Each piece of information indicative of the state of the channel is passed to the bit rate determination module 140 and used to determine the bit rate.

The coding mode / bit-rate decision module 140 of FIG. 1 selects a bit rate of the media encoder using the network state parameter analyzed by the wireless network state analysis module 120.

6 is a diagram illustrating a relationship between goodness of a plurality of network state parameters and a set bit rate. Bit rate setting according to the network state parameter may be made, for example, in the coding mode / bit rate determination module 140 of FIG. 1.

The network state parameter transmitted as shown in FIG. 6 determines the bit rate in a combination of two or more. By analyzing a plurality of parameters, the better the values of all the parameters, the more the network environment is determined by the strong electric field and the higher the bit rate is selected. On the contrary, if the value of the network condition parameter to be compared is bad, the network environment is regarded as a weak electric field, and a low bit rate is selected. In the case of RSSI, the stronger the received signal, the better it can be determined. In the case of LQI, goodness may be determined according to a bit error rate (BER) as a parameter representing the quality of a connected communication state. In the case of the AFH masking channel, the smaller the number of the masking channels, the better. For the number of connected devices, the smaller the number of devices, the better the value.
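One way to combine several network state parameters into a bit rate choice is sketched below. The normalization ranges (for example, RSSI between -90 and -40 dBm), the equal weighting, the assumption that a larger LQI value is better, and all names are illustrative assumptions; the disclosure states only that better parameter values lead to a higher bit rate.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NetworkState:
    rssi_dbm: float         # received signal strength
    lqi: int                # link quality, 0..255 (assumed: larger is better)
    masked_channels: int    # AFH channels masked out of 79
    connected_devices: int  # peripherals sharing the network


def goodness(state: NetworkState) -> float:
    """Average of four per-parameter scores, each clamped to [0, 1]."""
    def clamp(x: float) -> float:
        return max(0.0, min(1.0, x))

    rssi = clamp((state.rssi_dbm + 90.0) / 50.0)  # assumed range -90..-40 dBm
    lqi = clamp(state.lqi / 255.0)
    afh = clamp(1.0 - state.masked_channels / 79.0)
    devices = clamp(1.0 / max(1, state.connected_devices))
    return (rssi + lqi + afh + devices) / 4.0


def select_bit_rate(state: NetworkState, bit_rates_kbps: Sequence[float]) -> float:
    """Pick a higher bit rate for a better (stronger-field) network state."""
    rates = sorted(bit_rates_kbps)
    index = min(len(rates) - 1, int(goodness(state) * len(rates)))
    return rates[index]
```

Averaging is only one choice of combination; a stricter design could take the minimum score so that a single bad parameter forces a low bit rate.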

The selected bit rate information is transmitted to the media encoding module 160, which encodes the input data according to the selected bit rate.

According to another embodiment of the coding mode / bit rate determination module 140, the network channel environment may be identified using the network state parameters, and a lossy or lossless coding mode may be selected according to the result. If the analyzed network channel environment is a strong electric field and lossless coding can be supported in that channel mode, the coding mode / bit rate determination module 140 selects lossless coding and transfers this information to the media encoding module 160. If lossless coding is not supported, a higher bit rate is selected for a stronger electric field and a lower bit rate for a weaker electric field, according to the network state analysis result. In addition, the selected bit rate information, together with the number of frames and the frame size determined according to the bit rate, is transmitted to the packetizing condition control module 150 and used to generate packets from the encoded frame data.

The media encoding module 160 encodes the input signal according to the input coding mode (lossy or lossless) and the determined bit rate, and transmits the encoded data to the media packet generation module 170 for packet generation.

FIG. 7 is a block diagram illustrating in detail the coding mode / bit rate determination module 140, the media encoding module 160, and the media packet generation module 170 of FIG. 1. FIG. 7 illustrates how the media encoding module 160 operates according to the coding mode and the bit rate.

The coding mode / bit rate determination module 140 may be divided into a bit rate determination module 142 and a coding mode determination module 144.

The media encoding module 160 is divided into a lossy encoding module 162, which performs conventional lossy coding, and a residual encoding module 166 for lossless encoding. This division maintains compatibility with existing codecs when the encoded bitstream is decoded at the receiving end. Accordingly, the media encoding module 160 performs encoding with the existing lossy encoding module 162 according to the input bit rate and, when the coding mode is lossless coding, encodes the extra data required for lossless coding through the residual encoding module 166. For example, when the coding mode determination module 144 determines lossy coding, the switch 164 is turned off and residual encoding is not performed; when the coding mode determination module 144 determines lossless coding, the switch 164 is turned on to perform residual encoding.
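The split into a lossy core and a residual can be illustrated with a toy codec. In this sketch, coarse integer quantization stands in for the conventional lossy coder and the `lossless` flag stands in for the switch 164; the function names and the quantization step are assumptions for illustration, not the codec used in the embodiments.

```python
def lossy_encode(samples, step=16):
    # Stand-in for a real lossy codec: coarse quantization of integer samples.
    return [s // step for s in samples]

def lossy_decode(codes, step=16):
    return [c * step for c in codes]

def encode(samples, lossless, step=16):
    """Return (core, residual); residual is None when the switch is off."""
    core = lossy_encode(samples, step)
    residual = None
    if lossless:  # corresponds to the switch being turned on
        decoded = lossy_decode(core, step)
        # The residual is whatever the lossy core failed to represent.
        residual = [s - d for s, d in zip(samples, decoded)]
    return core, residual

def decode(core, residual, step=16):
    """A decoder without residual support can simply pass residual=None."""
    decoded = lossy_decode(core, step)
    if residual is None:
        return decoded
    return [d + r for d, r in zip(decoded, residual)]
```

Decoding the core alone yields an approximation, while adding the residual restores the input exactly, which is why an existing lossy decoder remains usable.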

The media packet generation module 170 generates a packet according to a predefined packet construction method using the encoded bitstream.

FIG. 8A is a diagram illustrating an embodiment of generating a packet according to a number of frames and a frame size that are predefined according to the channel mode and the bit rate. Since the frame size and the number of frames are predefined according to the network channel condition and the packet size, the packet is configured so that the unused data space remaining in it is minimized. In the example of FIG. 8A, when the bit rate is 1, the frame size is determined as n and the number of frames as m, and the packet data is configured to include m frames, numbered 0 to m-1.
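A minimal sketch of packet construction from a predefined table is shown below. The table contents, the name `PACKET_TABLE`, and the function `packetize` are hypothetical; the actual values of n and m for each bit rate are not given in this text.

```python
# Hypothetical table: bit-rate index -> (frame size n in bytes, frames per packet m).
PACKET_TABLE = {1: (100, 6), 2: (150, 4), 3: (200, 3)}

def packetize(frames, bit_rate_index):
    """Group encoded frames of size n into packets of m frames each."""
    n, m = PACKET_TABLE[bit_rate_index]
    if any(len(frame) != n for frame in frames):
        raise ValueError("every frame must have the predefined size n")
    # Frames 0..m-1 form the first packet, m..2m-1 the second, and so on.
    return [b"".join(frames[i:i + m]) for i in range(0, len(frames), m)]
```

In this made-up table every row fills the same 600-byte payload, which is one way to keep the leftover space in a packet at zero.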

FIG. 8B is a diagram illustrating an embodiment of configuring a bitstream when the input data is audio data and the coding mode is lossless coding. Lossless coding should be designed with compatibility with existing encoders in mind. That is, lossy coding is performed on the input data using the lossy encoding module 162, and the extra data for lossless coding is separately encoded and added to the bitstream. In this way, even when the decoder does not support lossless coding, the lossy coded region can be decoded using only the existing decoder and the lossless coded region can be skipped. Information for distinguishing the lossy coding region from the lossless coding region may be included in the bitstream. In this embodiment, the syncword values of the lossy bitstream and the lossless bitstream are set differently so that the lossless coding region can be distinguished within the bitstream.
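One way to distinguish the two regions by syncword is sketched below. The syncword values, the 2-byte length field, and the function names are assumptions chosen for illustration; the actual bitstream layout of FIG. 8B is not specified in this text.

```python
import struct

# Hypothetical syncwords; each region is [syncword:2][length:2][payload].
SYNC_LOSSY = 0xFFF1
SYNC_LOSSLESS = 0xFFF2

def pack_region(syncword, payload):
    return struct.pack(">HH", syncword, len(payload)) + payload

def build_frame(lossy_payload, residual_payload):
    """Lossy region first, then the extra data needed for lossless decoding."""
    return (pack_region(SYNC_LOSSY, lossy_payload)
            + pack_region(SYNC_LOSSLESS, residual_payload))

def split_regions(stream):
    """Return a list of (syncword, payload) pairs found in the stream."""
    regions, pos = [], 0
    while pos + 4 <= len(stream):
        syncword, length = struct.unpack_from(">HH", stream, pos)
        pos += 4
        regions.append((syncword, stream[pos:pos + length]))
        pos += length
    return regions

def legacy_payloads(stream):
    """A decoder without lossless support keeps only the lossy regions."""
    return [p for sync, p in split_regions(stream) if sync == SYNC_LOSSY]
```

A decoder that supports lossless coding would read both regions, while `legacy_payloads` shows how an existing decoder could skip the region it does not recognize.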

The media type analysis module 130 of FIG. 1 selects, at the transmitter, the media type (audio or video) and the degree of delay of the input data.

In operation 910 of FIG. 9, the media type may be determined from the information of the input data, the metadata, and the media player, or the user may directly select the degree of delay using a user interface (UI). The selected media type is used to determine the delay of the transmitted data. In general, content such as video should have as little delay as possible, because synchronization between audio and video is very important. In the case of audio-only data, however, a slight delay does not cause much of a problem. In general, if the receiver's input buffer is large, the delay increases but robustness to errors is strong. Conversely, if the input buffer is small, the delay is small but robustness to errors is weak. Therefore, synchronization and robustness can be balanced by adjusting the size of the receiver's input buffer according to the input media type.

In step 920, the media type information or delay information selected in step 910 is inserted into the packet and delivered to the receiving end 950 in a wireless network environment.

The receiving end 950 parses the packet data in step 960.

In step 970, the media type information or delay information is extracted from the packet.

In step 980, using the extracted information, the buffer size is set to be large when the transmitted encoded data is audio only, and is set to be small when video data is included along with the audio data.
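Steps 920 to 980 can be sketched as a one-byte media type field in the packet that the receiver maps to an input buffer size. The header layout, the constants, and the buffer sizes are hypothetical and only illustrate the idea of a larger buffer for audio and a smaller one when video is present.

```python
MEDIA_AUDIO, MEDIA_VIDEO = 0, 1

# Hypothetical input buffer sizes in frames: audio tolerates more delay.
BUFFER_FRAMES = {MEDIA_AUDIO: 16, MEDIA_VIDEO: 4}

def make_packet(media_type, payload):
    """Transmitter side: prepend the media type to the encoded payload."""
    return bytes([media_type]) + payload

def parse_packet(packet):
    """Receiver side: split the packet into media type and payload."""
    return packet[0], packet[1:]

def input_buffer_frames(packet):
    """Choose the receiver's input buffer size from the media type field."""
    media_type, _ = parse_packet(packet)
    return BUFFER_FRAMES[media_type]
```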

The receiver parses the packet transmitted from the transmitter and transmits the media bitstream to the decoder.

FIG. 10 is a schematic block diagram of a data reproducing apparatus 1000 according to some embodiments, which processes and decodes a packet input at a receiving end. The block diagram of FIG. 10 may be used to describe not only the data reproducing apparatus 1000 but also a data reproducing method.

Referring to FIG. 10, the data reproducing apparatus 1000 includes a packet parsing module 1010, a media type analysis module 1020, an input / output buffer control module 1030, a media decoding module 1040, and an error concealment module 1050. Here, a 'module' may also be referred to as a 'unit'. For example, the packet parsing module 1010 may be referred to as the packet parsing unit 1010. The same applies to reference numerals 1020 to 1050. In addition, the media decoding module 1040 may be referred to as the media decoder 1040. When the data reproducing method is described using the block diagram of FIG. 10, a 'module' may be read as a 'step'. For example, the packet parsing module 1010 may be read as a packet parsing step 1010. The same applies to reference numerals 1020 to 1050.

The packet parsing module 1010 performs de-packetizing (reconstructing data from packets) on the input packet and transmits the extracted bitstream to the media decoding module 1040. In addition, the media type information or delay information inserted in the packet is extracted and transmitted to the media type analysis module 1020.

The media type analysis module 1020 analyzes the input media type and delay information.

The input / output buffer control module 1030 adjusts the size of the input buffer according to the media type (audio or video) and delay information analyzed by the media type analysis module 1020.

The media decoding module 1040 decodes the media data by using the bitstream extracted by the packet parsing module 1010 and an input buffer having a size adjusted by the input / output buffer control module 1030.

The error concealment module 1050 is a module that restores the signal when some packets are lost or distorted due to a transmission error while an encoded audio signal is transmitted through a wireless network. If an error generated in a frame is not properly handled, the sound quality of the audio signal is degraded in the frame section in which the error occurs, so the decoding apparatus restores the signal by a data repetition method. In some embodiments, error concealment is performed in the time domain on the decoded audio data, with low complexity and without additional delay.

As the basic method of error concealment, some embodiments repeat previously decoded audio data, which keeps complexity low. Normally decoded data is stored, starting from the most recent data, in a buffer of predetermined length. When an error occurs, the audio data is read from the buffer and repeated.

FIG. 11 is a diagram illustrating the phase mismatch and discontinuity that occur during simple data repetition. Simple data repetition creates phase mismatch or discontinuity at the boundaries of the repeated data, as shown in FIG. 11, and these portions degrade the sound quality of the audio signal. In addition, when the error section is long, the same data stored in the buffer is repeated again and again, and the same problems occur between the repeated blocks of data. Therefore, to solve this problem, in some embodiments an overlap-and-add method is applied: when an error occurs, the audio data stored in the buffer used for restoration is modified at the boundary between a normal frame and an error frame and at the boundary between consecutive error frames.

FIG. 12 is a diagram illustrating a process of applying the overlap-and-add method in the first error frame, as an embodiment of a method of generating the repetition data used for error concealment. The repetition buffer stores the N most recent pieces of audio data decoded during normal operation. When the first error frame occurs, as shown in Equation 2, data copying starts from the latest data in the repetition buffer and continues for as many pieces of data as need to be repeated.

After the first error frame, data is sequentially copied as much as the number of data to be copied from the repetitive buffer. When the read pointer of the buffer reaches the end of the buffer, the data is copied in the opposite direction of the buffer as shown in Equation (3). When the read pointer reaches both ends of the buffer, the above process is repeated to copy the data.
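The read pattern described above can be sketched as a read pointer that starts at the newest sample and reverses direction whenever it reaches either end of the buffer. Equations 2 and 3 are not reproduced in this text, so details such as whether the end sample is repeated at a turn are assumptions; the class name `RepetitionBuffer` is also hypothetical.

```python
class RepetitionBuffer:
    """Holds the N most recently decoded samples and replays them back and forth."""

    def __init__(self, samples):
        if len(samples) < 2:
            raise ValueError("need at least two samples to reverse direction")
        self.samples = list(samples)
        self.pos = len(self.samples) - 1  # start at the newest sample (index N-1)
        self.step = -1                    # assumed: first pass runs backwards in time

    def read(self, count):
        """Copy `count` samples, turning around at both ends of the buffer."""
        out = []
        for _ in range(count):
            out.append(self.samples[self.pos])
            nxt = self.pos + self.step
            if nxt < 0 or nxt >= len(self.samples):
                self.step = -self.step    # reached an end: copy in the opposite direction
                nxt = self.pos + self.step
            self.pos = nxt
        return out
```

Because the pointer state is kept between calls, consecutive error frames continue from where the previous frame stopped instead of restarting at the newest sample.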

When performing data iterations, simple iterations will introduce phase mismatches and discontinuities as mentioned above. Therefore, when the first error frame starts, the data stored in the repetition buffer is modified to minimize the problem by creating an overlapping interval of a certain length.

Equation 4 shows the overlap-and-add method applied when the first error frame occurs. 'reversed_copy_data' is the data to be copied from the repetition buffer; it is taken from the most recent value (N-1) of the repetition buffer, over the overlap size M. 'overlap_data' is the data that is overlapped to minimize errors; as shown in Equation 5, it is obtained by inverting the data from the latest value (N-1) of the repetition buffer over the overlap size M, with respect to the last value of the repetition buffer.

In some embodiments, a sine window such as that of Equation 6 is used as the overlap window.
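Equations 4 to 6 are not reproduced in this text, so the following is only one possible reading of the description, not the equations themselves. It assumes that 'overlap_data' is the point reflection of the newest M samples about the last buffer value, and that the window is a squared-sine fade so that the two weights always sum to 1. The function names are hypothetical.

```python
import math

def sine_fade_in(m):
    # Squared-sine ramp from 0 toward 1; (1 - w) and w always sum to 1.
    return [math.sin(math.pi * i / (2 * m)) ** 2 for i in range(m)]

def overlap_add(outgoing, incoming):
    """Crossfade from `outgoing` to `incoming` over their common length."""
    w = sine_fade_in(len(outgoing))
    return [(1 - w[i]) * outgoing[i] + w[i] * incoming[i]
            for i in range(len(outgoing))]

def first_error_frame_overlap(buffer, m):
    """Build the first M concealment samples from the repetition buffer."""
    last = buffer[-1]
    # Newest M samples read backwards in time, starting at index N-1.
    reversed_copy = [buffer[-1 - i] for i in range(m)]
    # Assumed inversion: mirror those samples about the last value.
    overlap = [2 * last - x for x in reversed_copy]
    return overlap_add(overlap, reversed_copy)
```

Under these assumptions the first concealment sample equals the last good sample, and the output then fades smoothly into the time-reversed data, which avoids a jump at the frame boundary.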

When data iteration is performed, if the read pointer of the repeat buffer reaches both ends of the buffer, data is copied in the opposite direction. At this time, problems such as phase mismatch and discontinuity occur at the buffer boundary as above.

FIG. 13A is a diagram illustrating a process of applying the overlap-and-add method at one end of the repetition data, and FIG. 13B is a diagram illustrating a process of applying the overlap-and-add method at the other end of the repetition data. Through the processes of FIGS. 13A and 13B, a modified repetition buffer as shown in FIG. 13B is obtained.

In FIG. 13A, problems caused by overlapping and adding using data in the repeating buffer are minimized. In this case, reversed data is a value obtained by inverting data of a repeating buffer referred to for overlapping. Overlap data uses the value of the repeating buffer referred to for overlapping.

The repeating buffer is converted into a new repeating buffer by performing an overlapping and adding process at both ends of the buffer as shown in FIGS. 13A and 13B through the above process.

Once created, the repeated buffer can be used for data replication for error concealment as shown in FIG. 14 without any additional work. By doing so, the method proposed in some embodiments can reduce the complexity of error concealment and minimize the sound quality degradation.

According to some embodiments, when the channel environment of the wireless network is a strong electric field, the sound quality may be improved by encoding multimedia data losslessly or at a high bit rate. In addition, sound interruption can be minimized by efficiently changing the bit rate between low and high according to the channel environment.

In addition, it is possible to recover a lost frame by applying a low complexity error concealment method to an error frame generated by the encoder due to packet loss or an error, and natural signal recovery is possible by minimizing phase mismatch and discontinuity.

In addition, according to the present invention, the size of the input buffer of the receiver according to the data type (audio or video) can be adjusted to effectively control synchronization and delay according to the data type.

Referring to FIG. 15, the data transmission apparatus 100 may selectively transmit, to the data reproducing apparatus 1000, either an audio frame divided from an audio signal or an identification value of an audio frame. The data transmission apparatus 100 may divide an audio signal into a plurality of audio frames and compare the fourth audio frame, which is the current audio frame, with the first audio frame, the second audio frame, and the third audio frame, which are previous audio frames stored in the data transmission apparatus 100. According to the comparison result, the data transmission apparatus 100 may compress the fourth audio frame and transmit it to the data reproducing apparatus 1000, or may transmit to the data reproducing apparatus 1000 the identification value of a previous audio frame similar to the fourth audio frame.

In addition, the data reproducing apparatus 1000 may decompress and output the compressed fourth audio frame received from the data transmission apparatus 100, or may extract the audio frame corresponding to the identification value received from the data transmission apparatus 100 from the memory of the data reproducing apparatus 1000 and output it.

In operation S1600, the data transmission apparatus 100 may divide the audio signal input to the data transmission apparatus 100 into a plurality of audio frames. The data transmission device 100 may receive a voice input spoken by a user or an audio signal provided from another device (not shown). In addition, when the audio signal received by the data transmission device 100 is an analog signal, the data transmission device 100 may convert the audio signal into a digital signal. In addition, the data transmission apparatus 100 may divide the received audio signal into a plurality of audio frames. The data transmission apparatus 100 may continuously divide an audio signal according to time.

In operation S1610, the data transmission apparatus 100 may compare the current audio frame with at least one audio frame previously stored in the memory 1700 of the data transmission apparatus 100. The audio frame previously stored in the memory 1700 may be a previous audio frame of the current audio frame. In addition, the audio frame previously stored in the memory 1700 may be a previous audio frame successfully transmitted to the data reproducing apparatus 1000.

The data transmission apparatus 100 may determine the similarity between the previous audio frame and the current audio frame by comparing the previous audio frame previously stored in the memory 1700 with the current audio frame. For example, the data transmission apparatus 100 may determine the similarity between the previous audio frame and the current audio frame by calculating a correlation value between the previous audio frame and the current audio frame. The correlation value between the previous audio frame and the current audio frame may be calculated using a technique such as Mean Square Error (MSE). However, the present invention is not limited thereto and the similarity between the previous audio frame and the current audio frame may be determined through various techniques.
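A minimal sketch of an MSE-based comparison follows. MSE is a distance (smaller means more alike), so the sketch converts it into a similarity score; that conversion and the function names are assumptions, since the text does not say how the MSE value is turned into a similarity.

```python
def mean_square_error(frame_a, frame_b):
    """Average squared sample difference between two equal-length frames."""
    if len(frame_a) != len(frame_b) or not frame_a:
        raise ValueError("frames must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)

def similarity(frame_a, frame_b):
    # Hypothetical mapping: 1.0 for identical frames, approaching 0 as MSE grows.
    return 1.0 / (1.0 + mean_square_error(frame_a, frame_b))
```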

In operation S1620, the data transmission apparatus 100 may determine whether the determined similarity is greater than a preset threshold. The preset threshold may be variously set depending on, for example, the type of the audio signal, the method of dividing the audio signal, the specification of the data transmission apparatus 100, and the specification of the data reproducing apparatus 1000.

As a result of the determination in operation S1620, when the determined similarity is greater than the preset threshold, in operation S1630 the data transmission apparatus 100 may transmit an identification value of the previous audio frame previously stored in the memory 1700 to the data reproducing apparatus 1000. That is, the data transmission apparatus 100 may transmit the identification value of the previous audio frame having the determined similarity to the data reproducing apparatus 1000.

If there are a plurality of previous audio frames having a similarity above the threshold among the previous audio frames previously stored in the memory 1700, the data transmission apparatus 100 may transmit the identification value of the previous audio frame having the highest similarity to the data reproducing apparatus 1000.

In this case, the previous audio frame is already stored in the memory of the data reproducing apparatus 1000, and the data reproducing apparatus 1000 may extract the previous audio frame from its memory by using the received identification value and output it.

In addition, when transmission of the identification value of the selected previous audio frame fails, the data transmission apparatus 100 may retransmit the identification value of the selected previous audio frame to the data reproducing apparatus 1000 within a preset transmission time limit.

In addition, when retransmission of the identification value of the selected previous audio frame fails within the preset transmission time limit, the data transmission apparatus 100 may compress the current audio frame and transmit it to the data reproducing apparatus 1000. Alternatively, when retransmission of the identification value of the selected previous audio frame fails within the preset transmission time limit, the data transmission apparatus 100 may proceed to determine a transmission method for the next audio frame.

As a result of the determination in operation S1620, when the determined similarity is smaller than the preset threshold, in operation S1640, the data transmission apparatus 100 may compress the current audio frame.

In operation S1650, the data transmission apparatus 100 may transmit the compressed current audio frame to the data reproduction apparatus 1000.
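Operations S1610 to S1650 can be sketched as a single decision function. Here `zlib` stands in for an audio codec, frames are `bytes` objects, and the similarity mapping, the threshold, and the name `choose_transmission` are assumptions made for illustration.

```python
import zlib

def similarity(frame_a, frame_b):
    # Hypothetical score derived from MSE: 1.0 means identical frames.
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    return 1.0 / (1.0 + mse)

def choose_transmission(current, stored, threshold):
    """Return ("id", frame_id) or ("frame", compressed_bytes).

    `stored` maps identification values to previously transmitted frames.
    """
    best_id, best_sim = None, -1.0
    for frame_id, previous in stored.items():
        sim = similarity(current, previous)
        if sim > best_sim:  # keep the most similar previous frame
            best_id, best_sim = frame_id, sim
    if best_id is not None and best_sim > threshold:
        return ("id", best_id)  # S1630: send only the identification value
    return ("frame", zlib.compress(current))  # S1640 and S1650
```

Sending an identification value instead of a compressed frame is what saves bandwidth when the signal repeats itself.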

In operation S1700, the data transmission apparatus 100 may compress the current audio frame. The data transmission apparatus 100 may compress the current audio frame into an audio bitstream using various codec algorithms in order to transmit the current audio frame to the data reproducing apparatus 1000.

In operation S1705, the data transmission apparatus 100 may transmit the compressed current audio frame to the data reproducing apparatus 1000. The data transmission apparatus 100 may check the time at which the compressed audio frame is transmitted.

In operation S1710, the data transmission apparatus 100 may determine whether an ACK signal is received from the data reproduction apparatus 1000. When the data reproducing apparatus 1000 successfully receives the compressed audio frame, the data reproducing apparatus 1000 may transmit an ACK signal to the data transmitting apparatus 100. In addition, when the data reproducing apparatus 1000 does not successfully receive the compressed audio signal, the data reproducing apparatus 1000 may transmit a NACK signal to the data transmitting apparatus 100.

As a result of the determination in operation S1710, when it is determined that the data transmission apparatus 100 has received the ACK signal from the data reproducing apparatus 1000, in operation S1715 the data transmission apparatus 100 may decompress the successfully transmitted compressed current audio frame.

In operation S1720, the data transmission apparatus 100 may delete at least one of the previous audio frames previously stored in the memory 1700. The data transmission apparatus 100 may select at least one of the previously stored previous audio frames according to a preset criterion and delete the selected previous audio frame from the memory 1700. As at least one of the previous audio frames previously stored in the memory 1700 is deleted, space in which the decompressed current audio frame is to be stored may be secured in the memory 1700.

In operation S1730, the data transmission apparatus 100 may store the decompressed current audio frame in the memory 1700.

As a result of the determination in operation S1710, when it is determined that the data transmission apparatus 100 has not received the ACK signal from the data reproducing apparatus 1000, in operation S1730 the data transmission apparatus 100 may determine whether the transmission time of the compressed current audio frame has exceeded the transmission time limit. The transmission time limit may be a time limit set for transmitting a compressed current audio frame and may be set according to various criteria.

If it is determined in operation S1730 that the transmission time of the compressed current audio frame exceeds the transmission time limit, the data transmission device 100 may compress the next audio frame. Thereafter, in operation S1705, the data transmission apparatus 100 may transmit the compressed next audio frame to the data reproduction apparatus 1000.
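The ACK check and the transmission time limit can be sketched as a small loop. The text does not state what happens when no ACK arrives but the limit has not yet been exceeded; this sketch assumes the same frame is simply sent again. The `send` callable, the injected `clock`, and the function name are hypothetical.

```python
import time

def send_with_deadline(send, payload, time_limit_s, clock=time.monotonic):
    """Send until acknowledged or until the time limit is exceeded.

    `send(payload)` is assumed to return True when an ACK is received and
    False otherwise. Returns True on ACK, False when the limit is exceeded,
    in which case the caller moves on to the next audio frame.
    """
    start = clock()
    while True:
        if send(payload):
            return True   # ACK received
        if clock() - start > time_limit_s:
            return False  # limit exceeded: give up on this frame
```

Passing the clock as a parameter makes the timing logic testable without real delays.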

Referring to FIG. 18, the first audio frame 1840, the second audio frame 1850, and the third audio frame 1860, which are previous audio frames, may be stored in the memory 1700 of the data transmission apparatus 100. The data transmission apparatus 100 may store the fourth audio frame 1870, which is the current audio frame, in the memory 1700.

In operation S1800, the data transmission apparatus 100 may decompress the compressed fourth audio frame 1870 that has been successfully transmitted. The fourth audio frame 1870 may be the current audio frame, and when the compressed fourth audio frame 1870 has been successfully transmitted to the data reproducing apparatus 1000, the data transmission apparatus 100 may decompress the compressed fourth audio frame 1870.

In operation S1810, the data transmission apparatus 100 may determine a deletion order of previous audio frames. The data transmission apparatus 100 may determine the deletion order of the first audio frame 1840, the second audio frame 1850, and the third audio frame 1860.

The data transmission apparatus 100 may determine the deletion order of previous audio frames based on the time at which the previous audio frames were stored in the memory 1700. For example, the data transmission apparatus 100 may determine the deletion order such that the previous audio frame stored earliest in the memory 1700 is deleted first.

Alternatively, the data transmission apparatus 100 may determine the deletion order of the previous audio frames based on the number of times the identification value of the previous audio frame is transmitted to the data reproducing apparatus 1000. For example, the data transmission apparatus 100 may determine the deletion order such that the previous audio frame having a small number of times that an identification value has been transmitted to the data reproduction apparatus 1000 is deleted from the memory 1700 first.

However, the criterion for determining the deletion order of the data transmission device 100 is not limited to the above, and the data transmission device 100 may determine the deletion order according to various criteria in consideration of the type of the audio signal.

In addition, the data transmission apparatus 100 may provide the data reproducing apparatus 1000 with information about the determined deletion order. Accordingly, the data reproducing apparatus 1000 may delete the previous audio frames stored in the data reproducing apparatus 1000 according to the determined deletion order. In addition, the previous audio frame stored in the data transmission apparatus 100 and the previous audio frame stored in the data reproducing apparatus 1000 may be synchronized.

In operation S1820, the data transmission apparatus 100 may select a previous audio frame to be deleted. For example, the data transmission apparatus 100 may select the first audio frame 1840 from among the first audio frame 1840, the second audio frame 1850, and the third audio frame 1860 stored in the memory 1700.

In operation S1830, the data transmission apparatus 100 may delete the selected previous audio frame from the memory 1700. The data transmission device 100 may delete the selected first audio frame from the memory 1700.

In operation S1840, the data transmission apparatus 100 may store the decompressed fourth audio frame in the memory 1700.
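The two deletion criteria described above (storage time and the number of times an identification value has been transmitted) can be sketched as a small frame store. The class name `FrameStore`, the policy strings, and the tie-breaking rule are assumptions for illustration.

```python
from collections import OrderedDict

class FrameStore:
    """Keeps up to `capacity` previous frames and evicts by a chosen policy."""

    def __init__(self, capacity, policy="oldest"):
        self.capacity = capacity
        self.policy = policy          # "oldest" or "least_used"
        self.frames = OrderedDict()   # id -> frame, in storage order
        self.id_sent = {}             # id -> times its identification value was sent

    def mark_id_sent(self, frame_id):
        self.id_sent[frame_id] += 1

    def victim(self):
        """Select the previous frame that would be deleted next."""
        if self.policy == "oldest":
            return next(iter(self.frames))
        # Fewest identification-value transmissions; ties go to the oldest.
        return min(self.frames, key=lambda fid: self.id_sent[fid])

    def store(self, frame_id, frame):
        """Store a new frame, deleting one first if the memory is full."""
        deleted = None
        if len(self.frames) >= self.capacity:
            deleted = self.victim()
            del self.frames[deleted]
            del self.id_sent[deleted]
        self.frames[frame_id] = frame
        self.id_sent[frame_id] = 0
        return deleted
```

If the transmitter and the receiver run the same policy on the same sequence of events, their memories stay synchronized without exchanging extra messages.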

In operation S1900, the data reproducing apparatus 1000 may receive data related to an audio signal from the data transmitting apparatus 100. The data reproducing apparatus 1000 may receive a compressed current audio frame from the data transmission apparatus 100 or receive an identification value of a previous audio frame.

In operation S1905, the data reproducing apparatus 1000 may identify a data type of the received data. The data reproducing apparatus 1000 may determine whether data related to the audio signal received from the data transmitting apparatus 100 is a compressed current audio frame or an identification value of a previous audio frame.

If the type of data identified in step S1905 is a compressed current audio frame, the data reproducing apparatus 1000 may decompress the compressed current audio frame in step S1910.

In operation S1915, the data reproducing apparatus 1000 may delete a previous audio frame previously stored in the memory of the data reproducing apparatus 1000 based on the preset deletion order. In this case, the data reproducing apparatus 1000 may receive a setting value related to the deletion order from the data transmission apparatus 100 in advance and store it, so that the same setting value for the deletion order is stored in each of the data transmission apparatus 100 and the data reproducing apparatus 1000. However, the present disclosure is not limited thereto, and the data reproducing apparatus 1000 may generate a setting value for the deletion order based on a user input and provide the generated setting value to the data transmission apparatus 100.

In addition, the data reproducing apparatus 1000 may delete at least one of the previous audio frames pre-stored in the memory according to a setting value of the deletion order. As the previous audio frame is deleted from the memory, a storage space for storing the current audio frame in the memory may be secured.

In operation S1920, the data reproducing apparatus 1000 may store the decompressed current audio frame in the memory.

In operation S1925, the data reproducing apparatus 1000 may transmit an ACK signal to the data transmission apparatus 100 to inform the data transmission apparatus 100 that the compressed current audio frame has been successfully received. Accordingly, by receiving the ACK signal, the data transmission apparatus 100 may determine that the compressed current audio frame has been successfully transmitted to the data reproducing apparatus 1000. In addition, the data transmission apparatus 100 may delete a previous audio frame previously stored in the memory 1700 of the data transmission apparatus 100, decompress the compressed current audio frame, and store it in the memory 1700. In this case, the data transmission apparatus 100 may delete the previous audio frame according to the same deletion order as that of the data reproducing apparatus 1000.

In operation S1930, the data reproducing apparatus 1000 may output an audio signal of the decompressed current audio frame.

If the type of data identified in operation S1905 is the identification value of a previous audio frame, in operation S1935 the data reproducing apparatus 1000 may extract the previous audio frame corresponding to the received identification value from the memory of the data reproducing apparatus 1000. That is, the data reproducing apparatus 1000 may extract the previous audio frame having the received identification value.

In operation S1910, the data reproducing apparatus 1000 may output an audio signal of the extracted previous audio frame.
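The receiver-side flow can be sketched as a small class that handles both kinds of received data. `zlib` again stands in for an audio codec, deletion is oldest-first, and the class name, message tags, and return convention are assumptions for illustration.

```python
import zlib

class Receiver:
    """Stores decoded frames by identification value, oldest deleted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = {}  # id -> decoded frame; dicts keep insertion order

    def receive(self, kind, value, frame_id=None):
        """Return (frame_to_output, reply), where reply is "ACK" or None."""
        if kind == "frame":
            frame = zlib.decompress(value)                # decompress the frame
            if len(self.memory) >= self.capacity:
                del self.memory[next(iter(self.memory))]  # make room, oldest first
            self.memory[frame_id] = frame                 # store for later reuse
            return frame, "ACK"
        if kind == "id":
            return self.memory[value], None               # look up a stored frame
        raise ValueError("unknown data type")
```

The lookup only works if the receiver's memory holds the same frames as the transmitter's, which is why both sides must apply the same deletion order.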

Meanwhile, in the above description, the data transmission apparatus 100 and the data reproducing apparatus 1000 delete the previous audio frame using the same setting value for the deletion order so that both store and delete the same previous audio frames; however, the present disclosure is not limited thereto. The data reproducing apparatus 1000 may delete a previous audio frame and provide the identification value of the deleted previous audio frame to the data transmission apparatus 100, and the data transmission apparatus 100 may use the identification value received from the data reproducing apparatus 1000 to delete, from the memory 1700 of the data transmission apparatus 100, the same previous audio frame as the one deleted by the data reproducing apparatus 1000.

Meanwhile, the data reproducing apparatus 1000 may transmit the ACK signal to the data transmission apparatus 100 before operation S1915, and the data transmission apparatus 100 may receive the ACK signal and delete a previous audio frame from the memory 1700 of the data transmission apparatus 100. In addition, the data transmission apparatus 100 may provide the identification value of the deleted previous audio frame to the data reproducing apparatus 1000, and the data reproducing apparatus 1000 may use the identification value received from the data transmission apparatus 100 to delete, from the memory of the data reproducing apparatus 1000, the same previous audio frame as the one deleted by the data transmission apparatus 100.

Accordingly, the same audio frame may be stored and deleted in the memory of the data reproducing apparatus 1000 and the memory 1700 of the data transmitting apparatus 100.

As illustrated in FIG. 20, the data transmission apparatus 100 may include a user input unit 1100, an output unit 1200, a controller 1300, and a communicator 1500. However, not all the components shown in FIG. 6 are essential components of the data transmission apparatus 100. The data transmission apparatus 100 may be implemented by more components than those illustrated in FIG. 20, or the data transmission apparatus 100 may be implemented by fewer components than the components illustrated in FIG. 20.

For example, as illustrated in FIG. 21, the data transmission apparatus 100 according to some embodiments may further include a sensing unit 1400, an A/V input unit 1600, and a memory 1700 in addition to the user input unit 1100, the output unit 1200, the control unit 1300, and the communication unit 1500.

The user input unit 1100 refers to a means by which a user inputs data for controlling the data transmission apparatus 100. For example, the user input unit 1100 may include a key pad, a dome switch, a touch pad (capacitive contact type, pressure resistive film type, infrared sensing type, surface ultrasonic conduction type, integral tension measurement type, piezo effect type, etc.), a jog wheel, a jog switch, and the like, but is not limited thereto.

The user input unit 1100 may receive a user input for setting a threshold to be compared with a similarity between a current audio frame and a previous audio frame, a user input for setting a transmission time limit of the current audio frame, a user input for setting a deletion order of previous audio frames, and the like.

The output unit 1200 may output an audio signal, a video signal, or a vibration signal, and may include a display unit 1210, a sound output unit 1220, and a vibration motor 1230.

The display unit 1210 displays and outputs information processed by the data transmission apparatus 100. For example, the display unit 1210 may display information related to data transmission between the data transmission apparatus 100 and the data reproducing apparatus 1000. For example, the display 1210 may display a transmission rate and power efficiency of the audio frame identification value. In addition, the display unit 1210 may display information on the status of synchronization between the previous audio frame stored in the data transmission apparatus 100 and the previous audio frame stored in the data reproducing apparatus 1000.

Meanwhile, when the display unit 1210 and the touch pad form a layer structure and are configured as a touch screen, the display unit 1210 may be used as an input device in addition to an output device. The display unit 1210 may include at least one of a liquid crystal display, a thin film transistor-liquid crystal display, an organic light-emitting diode, a flexible display, a three-dimensional (3D) display, and an electrophoretic display. In addition, the data transmission apparatus 100 may include two or more display units 1210 according to the implementation form of the data transmission apparatus 100. In this case, the two or more display units 1210 may be disposed to face each other using a hinge.

The sound output unit 1220 outputs audio data received from the communication unit 1500 or stored in the memory 1700. In addition, the sound output unit 1220 outputs a sound signal related to a function (for example, a call signal reception sound, a message reception sound, and a notification sound) performed by the data transmission device 100. The sound output unit 1220 may include a speaker, a buzzer, and the like.

The vibration motor 1230 may output a vibration signal. For example, the vibration motor 1230 may output a vibration signal corresponding to the output of audio data or video data (e.g., a call signal reception sound, a message reception sound, etc.). In addition, the vibration motor 1230 may output a vibration signal when a touch is input to the touch screen.

The controller 1300 generally controls the overall operation of the data transmission apparatus 100. For example, by executing programs stored in the memory 1700, the controller 1300 may control the user input unit 1100, the output unit 1200, the sensing unit 1400, the communication unit 1500, and the A/V input unit 1600 as a whole.

In detail, the controller 1300 may divide the audio signal input to the data transmission apparatus 100 into a plurality of audio frames. The data transmission device 100 may receive a voice input spoken by a user or receive an audio signal provided from another device (not shown), and the controller 1300 may continuously divide the audio signal according to time.
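As a non-limiting illustration, the division of an audio signal into frames over time may be sketched as follows. The function name `split_into_frames`, the fixed frame size, and the zero-padding of a short final frame are assumptions made for the example; the disclosure does not fix a particular division method.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], frame_size: int) -> List[List[float]]:
    """Divide a sample sequence into consecutive fixed-size frames.

    Assumption: the final frame is zero-padded so every frame has equal length.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    frames = []
    for start in range(0, len(samples), frame_size):
        frame = list(samples[start:start + frame_size])
        frame.extend([0.0] * (frame_size - len(frame)))  # pad a short final frame
        frames.append(frame)
    return frames


frames = split_into_frames([1.0, 2.0, 3.0, 4.0, 5.0], frame_size=2)
```

Equal-length frames keep a later sample-by-sample comparison between a current frame and a stored frame well defined.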

The controller 1300 may compare the current audio frame with at least one audio frame previously stored in the memory 1700 of the data transmission apparatus 100. The audio frame previously stored in the memory 1700 may be a previous audio frame of the current audio frame. In addition, the audio frame previously stored in the memory 1700 may be a previous audio frame successfully transmitted to the data reproducing apparatus 1000.

The controller 1300 may determine the similarity between the previous audio frame and the current audio frame by comparing the previous audio frame previously stored in the memory 1700 with the current audio frame. For example, the controller 1300 may determine a similarity degree between the previous audio frame and the current audio frame by calculating a correlation value between the previous audio frame and the current audio frame. The correlation value between the previous audio frame and the current audio frame may be calculated using a technique such as Mean Square Error (MSE). However, the present invention is not limited thereto and the similarity between the previous audio frame and the current audio frame may be determined through various techniques.
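As a non-limiting illustration, a similarity based on Mean Square Error may be computed as follows. MSE itself measures difference rather than similarity, so the mapping `1 / (1 + MSE)` used here is an assumption chosen only so that identical frames score 1.0 and larger errors score lower; the function names are hypothetical.

```python
from typing import Sequence


def mean_square_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Average of squared sample differences between two equal-length frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Map MSE to a score in (0, 1]; 1.0 means the frames are identical."""
    return 1.0 / (1.0 + mean_square_error(a, b))
```

Any monotonically decreasing function of the error, or a normalized cross-correlation, could serve the same role.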

The controller 1300 may determine whether the determined similarity is greater than a preset threshold. The preset threshold may be variously set depending on, for example, the type of the audio signal, the method of dividing the audio signal, the specification of the data transmission apparatus 100, and the specification of the data reproducing apparatus 1000.

In addition, when the determined similarity is greater than the preset threshold, the controller 1300 may provide the data reproducing apparatus 1000 with an identification value of a previous audio frame previously stored in the memory 1700. The controller 1300 may provide the data reproducing apparatus 1000 with an identification value of the previous audio frame having the determined similarity.

If a plurality of the previous audio frames previously stored in the memory 1700 have a similarity greater than the threshold, the controller 1300 may provide the data reproducing apparatus 1000 with the identification value of the previous audio frame having the highest similarity.

Meanwhile, when the determined similarity is smaller than the preset threshold, the controller 1300 may compress the current audio frame. In addition, the controller 1300 may provide the compressed current audio frame to the data reproducing apparatus 1000.
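As a non-limiting illustration, the choice between sending an identification value and sending a compressed frame may be sketched as follows. The function `choose_payload`, the dictionary of stored frames, and the MSE-based `similarity` helper are hypothetical; treating a similarity exactly equal to the threshold as "compress" is also an assumption, since the description addresses only the greater-than and smaller-than cases.

```python
from typing import Dict, List, Optional, Tuple

Frame = List[float]


def similarity(a: Frame, b: Frame) -> float:
    """Assumed similarity score: 1 / (1 + mean square error)."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + mse)


def choose_payload(current: Frame, stored: Dict[int, Frame],
                   threshold: float) -> Tuple[str, Optional[int]]:
    """Return ("id", frame_id) for the best match above the threshold,
    or ("compressed", None) when no stored frame is similar enough."""
    best_id, best_sim = None, threshold
    for frame_id, previous in stored.items():
        score = similarity(previous, current)
        if score > best_sim:  # keep only the highest similarity above the threshold
            best_id, best_sim = frame_id, score
    if best_id is not None:
        return ("id", best_id)
    return ("compressed", None)


stored_frames = {7: [0.0, 0.0], 9: [1.0, 1.0]}
match = choose_payload([1.0, 1.1], stored_frames, threshold=0.9)
no_match = choose_payload([5.0, 5.0], stored_frames, threshold=0.9)
```

Here `match` selects frame 9 because it is the only stored frame whose score exceeds the threshold, while `no_match` falls back to compression.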

Meanwhile, the controller 1300 may compress the current audio frame and provide the same to the data reproducing apparatus 1000 and manage the previous audio frame stored in the memory 1700.

The controller 1300 may compress the current audio frame. The controller 1300 may compress the current audio frame into an audio bitstream using various codec algorithms to provide the current audio frame to the data reproducing apparatus 1000.

The controller 1300 may provide the compressed current audio frame to the data reproducing apparatus 1000. The controller 1300 may check the time at which the compressed audio frame was transmitted.

The controller 1300 may determine whether an ACK signal is received from the data reproducing apparatus 1000. When the data reproducing apparatus 1000 successfully receives the compressed audio frame, the data reproducing apparatus 1000 may transmit an ACK signal to the data transmitting apparatus 100. In addition, when the data reproducing apparatus 1000 does not successfully receive the compressed audio signal, the data reproducing apparatus 1000 may transmit a NACK signal to the data transmitting apparatus 100.

If it is determined that the data transmission apparatus 100 receives the ACK signal from the data reproduction apparatus 1000, the controller 1300 may decompress the successfully transmitted compressed current audio frame.

The controller 1300 may delete at least one of the previous audio frames previously stored in the memory 1700. The controller 1300 may select at least one of the previously stored previous audio frames according to a preset criterion and delete the selected previous audio frame from the memory 1700. As at least one of the previous audio frames previously stored in the memory 1700 is deleted, a space in which the decompressed current audio frame is to be stored may be secured in the memory 1700.

The controller 1300 may store the decompressed current audio frame in the memory 1700.
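As a non-limiting illustration, deleting a stored frame according to a preset criterion and then storing the new frame may be sketched as follows. The class `FrameBuffer`, its capacity, and the two policy names are hypothetical; the "least_used" policy follows the criterion in the claims of counting how often a frame's identification value has been transmitted, and the "oldest" policy is an assumed example of another preset criterion.

```python
from collections import OrderedDict
from typing import List, Optional


class FrameBuffer:
    """Fixed-capacity store of previous audio frames keyed by identification value."""

    def __init__(self, capacity: int, policy: str = "oldest") -> None:
        if policy not in ("oldest", "least_used"):
            raise ValueError("unknown deletion policy")
        self.capacity = capacity
        self.policy = policy
        self.frames: "OrderedDict[int, List[float]]" = OrderedDict()  # storage order
        self.sent_count = {}  # frame_id -> times its ID was transmitted

    def record_id_sent(self, frame_id: int) -> None:
        """Count one transmission of this frame's identification value."""
        self.sent_count[frame_id] += 1

    def _pick_victim(self) -> int:
        if self.policy == "oldest":
            return next(iter(self.frames))  # first stored frame
        # least_used: fewest ID transmissions; ties resolve to the older frame
        return min(self.frames, key=lambda fid: self.sent_count[fid])

    def store(self, frame_id: int, samples: List[float]) -> Optional[int]:
        """Store a frame, deleting one first if the buffer is full.

        Returns the identification value of the deleted frame, or None.
        """
        deleted = None
        if len(self.frames) >= self.capacity:
            deleted = self._pick_victim()
            del self.frames[deleted]
            del self.sent_count[deleted]
        self.frames[frame_id] = list(samples)
        self.sent_count[frame_id] = 0
        return deleted
```

If both apparatuses run the same policy on the same sequence of events, they delete the same frames without exchanging any extra messages.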

In addition, if it is determined that the data transmission apparatus 100 has not received the ACK signal from the data reproducing apparatus 1000, the controller 1300 may determine whether the transmission time of the compressed current audio frame exceeds the transmission time limit. The transmission time limit may be a time limit set for transmitting a compressed current audio frame and may be set according to various criteria.

If it is determined that the transmission time of the compressed current audio frame exceeds the transmission time limit, the controller 1300 may compress the next audio frame. Thereafter, the controller 1300 may provide the compressed audio frame to the data reproducing apparatus 1000.
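As a non-limiting illustration, the ACK check and the transmission time limit may be sketched as follows. The function `send_until_ack` and its callback parameters are hypothetical, and retransmitting the same payload until the limit is reached is an assumption consistent with the retransmission recited in the claims; the injectable `clock` exists only so the example can run without real delays.

```python
import time
from typing import Callable


def send_until_ack(payload: bytes,
                   send: Callable[[bytes], None],
                   ack_received: Callable[[], bool],
                   time_limit_s: float,
                   clock: Callable[[], float] = time.monotonic) -> bool:
    """Send the payload repeatedly until an ACK arrives or the time limit passes.

    Returns True on ACK. Returns False once the limit is exceeded, in which
    case the caller would move on to the next audio frame.
    """
    start = clock()
    while True:
        send(payload)
        if ack_received():
            return True
        if clock() - start > time_limit_s:
            return False


# Example with a simulated link that acknowledges the third attempt.
sent = []
acks = iter([False, False, True])
ticks = iter([0.0, 1.0, 2.0, 3.0])
delivered = send_until_ack(b"frame", sent.append, lambda: next(acks),
                           time_limit_s=10.0, clock=lambda: next(ticks))
```

A real implementation would also wait between attempts and distinguish a NACK from silence; both are omitted here for brevity.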

The sensing unit 1400 may detect a state of the data transmission device 100 or a state around the data transmission device 100 and transmit the detected information to the control unit 1300.

The sensing unit 1400 may include at least one of a geomagnetic sensor 1410, an acceleration sensor 1420, a temperature/humidity sensor 1430, an infrared sensor 1440, a gyroscope sensor 1450, a position sensor (e.g., GPS) 1460, a barometric pressure sensor 1470, a proximity sensor 1480, and an RGB sensor (illuminance sensor) 1490, but is not limited thereto. Since the functions of the respective sensors can be intuitively inferred by those skilled in the art from their names, detailed descriptions thereof are omitted.

The communication unit 1500 may include one or more components that allow communication between the data transmission apparatus 100 and the data reproducing apparatus 1000. For example, the communicator 1500 may include a short range communicator 1510, a mobile communicator 1520, and a broadcast receiver 1530.

The short-range wireless communication unit 1510 may include a Bluetooth communication unit, a Bluetooth Low Energy (BLE) communication unit, a Near Field Communication (NFC) unit, a WLAN (Wi-Fi) communication unit, a Zigbee communication unit, an Infrared Data Association (IrDA) communication unit, a Wi-Fi Direct (WFD) communication unit, an ultra wideband (UWB) communication unit, an Ant+ communication unit, and the like, but is not limited thereto.

The mobile communication unit 1520 transmits and receives a radio signal to and from at least one of a base station, an external terminal, and a server on a mobile communication network. Here, the radio signal may include various types of data according to transmission and reception of a voice call signal, a video call signal, or a text/multimedia message.

The broadcast receiving unit 1530 receives a broadcast signal and / or broadcast related information from the outside through a broadcast channel. The broadcast channel may include a satellite channel and a terrestrial channel. According to an implementation example, the data transmission apparatus 100 may not include the broadcast reception unit 1530.

In addition, the communicator 1500 may receive standard audio frames in order to pre-store commonly used audio frame samples in the memory 1700.

The A/V (Audio/Video) input unit 1600 is for inputting an audio signal or a video signal, and may include a camera 1610 and a microphone 1620. The camera 1610 may obtain an image frame such as a still image or a moving image through an image sensor in a video call mode or a photographing mode. The image captured by the image sensor may be processed by the controller 1300 or a separate image processor (not shown).

The image frame processed by the camera 1610 may be stored in the memory 1700 or transmitted to the outside through the communication unit 1500. Two or more cameras 1610 may be provided according to the configuration aspect of the terminal.

The microphone 1620 receives an external sound signal and processes the external sound signal into electrical voice data. For example, the microphone 1620 may receive an acoustic signal from an external device or speaker. The microphone 1620 may use various noise removing algorithms for removing noise generated in the process of receiving an external sound signal.

The memory 1700 may store a program for processing and controlling the controller 1300, and may store data input to the data transmission device 100 or output from the data transmission device 100. In addition, the memory 1700 may be used as a buffer memory for storing previous audio frames.

The memory 1700 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card type memory (for example, SD or XD memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, and an optical disk.

Programs stored in the memory 1700 may be classified into a plurality of modules according to their functions. For example, the programs stored in the memory 1700 may be classified into a UI module 1710, a touch screen module 1720, a notification module 1730, and the like.

The UI module 1710 may provide a specialized UI, GUI, or the like that interoperates with the data transmission apparatus 100 for each application. The touch screen module 1720 may detect a user's touch gesture on the touch screen and transmit information about the touch gesture to the controller 1300. The touch screen module 1720 according to some embodiments may recognize and analyze a touch code. The touch screen module 1720 may be configured as separate hardware including a controller.

Various sensors may be provided inside or near the touch screen to detect a touch or a proximity touch on the touch screen. An example of a sensor for sensing a touch on the touch screen is a tactile sensor. The tactile sensor refers to a sensor that senses the contact of a specific object with at least the sensitivity with which a person can feel it. The tactile sensor may sense various information such as the roughness of the contact surface, the rigidity of the contact object, the temperature of the contact point, and the like.

In addition, an example of a sensor for sensing a touch of a touch screen is a proximity sensor.

The proximity sensor refers to a sensor that detects the presence or absence of an object approaching a predetermined detection surface or an object present in the vicinity without using a mechanical contact by using an electromagnetic force or infrared rays. Examples of the proximity sensor include a transmission photoelectric sensor, a direct reflection photoelectric sensor, a mirror reflection photoelectric sensor, a high frequency oscillation proximity sensor, a capacitive proximity sensor, a magnetic proximity sensor, and an infrared proximity sensor. The user's touch gesture may include tap, touch and hold, double tap, drag, pan, flick, drag and drop, and swipe.

The notification module 1730 may generate a signal for notifying the occurrence of an event in the data transmission apparatus 100. Examples of events occurring in the data transmission apparatus 100 include call signal reception, message reception, key signal input, schedule notification, and the like. The notification module 1730 may output a notification signal in the form of a video signal through the display unit 1210, in the form of an audio signal through the sound output unit 1220, or in the form of a vibration signal through the vibration motor 1230.

Meanwhile, the data reproducing apparatus 1000 according to some embodiments may include a configuration similar to that of the data transmitting apparatus 100 illustrated in FIGS. 20 and 21.

The controller of the data reproducing apparatus 1000 may receive data related to an audio signal from the data transmitting apparatus 100 through the communication unit of the data reproducing apparatus 1000.

The controller of the data reproducing apparatus 1000 may identify a data type of the received data. The controller of the data reproducing apparatus 1000 may determine whether data related to the audio signal received from the data transmitting apparatus 100 is a compressed current audio frame or an identification value of a previous audio frame.

If the type of the identified data is a compressed current audio frame, the controller of the data reproducing apparatus 1000 may expand the compressed current audio frame.

The controller of the data reproducing apparatus 1000 may delete a previous audio frame pre-stored in the memory of the data reproducing apparatus 1000 based on the preset deletion order. In this case, the controller of the data reproducing apparatus 1000 may receive a setting value of the deletion order from the data transmission apparatus 100 in advance and store it, so that the same setting value of the deletion order is stored in each of the data transmission apparatus 100 and the data reproducing apparatus 1000. However, the present invention is not limited thereto, and the controller of the data reproducing apparatus 1000 may generate a setting value related to the deletion order based on a user input and provide the generated setting value to the data transmission apparatus 100.

The controller of the data reproducing apparatus 1000 may delete at least one of the previous audio frames pre-stored in the memory according to a setting value of the deletion order. As the previous audio frame is deleted from the memory, a storage space for storing the current audio frame in the memory may be secured.

The controller of the data reproducing apparatus 1000 may store the decompressed current audio frame in the memory.

The controller of the data reproducing apparatus 1000 may provide an ACK signal to the data transmission apparatus 100 to inform the data transmission apparatus 100 that the compressed current audio frame has been successfully received. Accordingly, by receiving the ACK signal, the data transmission apparatus 100 may determine that the compressed current audio frame has been successfully transmitted to the data reproducing apparatus 1000. In addition, the data transmission apparatus 100 may delete a previous audio frame previously stored in the memory 1700 of the data transmission apparatus 100, decompress the compressed current audio frame, and store it in the memory 1700. In this case, the data transmission apparatus 100 may delete the previous audio frame according to the same deletion order as that of the data reproducing apparatus 1000.

The controller of the data reproducing apparatus 1000 may output the audio signal of the decompressed current audio frame through the output unit of the data reproducing apparatus 1000.

If the type of the identified data is the identification value of the previous audio frame, the controller of the data reproducing apparatus 1000 may extract the previous audio frame corresponding to the received identification value from the memory of the data reproducing apparatus 1000. The controller of the data reproducing apparatus 1000 may extract a previous audio frame having the received identification value.

The controller of the data reproducing apparatus 1000 may output an audio signal of the extracted previous audio frame.
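As a non-limiting illustration, the handling of received data by the data reproducing apparatus may be sketched as follows. The function `handle_message`, the `"id"` and `"bitstream"` type tags, and the assumption that a bitstream message carries the identification value under which the frame is to be stored are all hypothetical. The standard `zlib` module is used only as a lossless stand-in for an audio codec.

```python
import zlib
from typing import Dict


def handle_message(kind: str, frame_id: int, body: bytes,
                   store: Dict[int, bytes]) -> bytes:
    """Return the audio frame to output for one received message.

    kind == "id":        look up the pre-stored frame by identification value.
    kind == "bitstream": decompress the frame, store it, and return it.
    """
    if kind == "id":
        return store[frame_id]
    if kind == "bitstream":
        frame = zlib.decompress(body)  # stand-in for audio decoding
        store[frame_id] = frame
        return frame
    raise ValueError(f"unknown data type: {kind}")


# Example: a compressed frame arrives first, then only its identification value.
receiver_store: Dict[int, bytes] = {}
pcm = b"\x00\x01" * 4
first = handle_message("bitstream", 1, zlib.compress(pcm), receiver_store)
second = handle_message("id", 1, b"", receiver_store)
```

The second message carries no audio data at all, which is where the reduction in transmitted data comes from.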

Meanwhile, the present invention can be embodied as computer readable codes on a computer readable recording medium. The computer-readable recording medium includes all kinds of recording devices in which data that can be read by a computer system is stored.

Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, optical data storage devices, and the like, and also include media implemented in the form of carrier waves (for example, transmission over the Internet). The computer-readable recording medium can also be distributed over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion. In addition, functional programs, codes, and code segments for implementing the present invention can be easily inferred by programmers in the art to which the present invention belongs.

The present invention has been described above with reference to its preferred embodiments. Those skilled in the art will understand that the present invention can be embodied in a modified form without departing from the essential characteristics of the present invention. Therefore, the disclosed embodiments should be considered in a descriptive sense only and not for purposes of limitation. The scope of the present invention is defined not by the above description but by the claims, and all differences within the equivalent scope should be construed as being included in the present invention.

Claims

A first device for providing an audio signal to a second device, the first device comprising:

a memory for storing at least one previous audio frame;

a control unit configured to divide an audio signal input to the first device into a plurality of audio frames, compare a current audio frame among the plurality of audio frames with the at least one previous audio frame previously stored in the memory of the first device, and select one of the previously stored previous audio frames based on a similarity between the previously stored previous audio frame and the current audio frame; and

a communication unit configured to transmit an identification value of the selected previous audio frame to the second device.
The device of claim 1, wherein the control unit selects one of the previously stored previous audio frames when the similarity between the previously stored previous audio frame and the current audio frame is greater than a preset threshold.
The device of claim 1, wherein the control unit compresses the current audio frame when the similarity between the previously stored previous audio frame and the current audio frame is smaller than a preset threshold, and the communication unit transmits the compressed current audio frame to the second device.
The device of claim 3, wherein the control unit deletes at least one of the previously stored previous audio frames when the compressed current audio frame is successfully transmitted to the second device, and stores the current audio frame in the memory as the at least one of the previously stored previous audio frames is deleted.
The device of claim 4, wherein the control unit deletes, from among the previously stored previous audio frames, the previous audio frame previously stored in the memory.
The device of claim 4, wherein the control unit deletes one of the previously stored previous audio frames based on the number of times the identification value of the previously stored previous audio frame has been transmitted to the second device.
The device of claim 1, wherein the communication unit retransmits the identification value of the selected previous audio frame to the second device within a preset transmission time limit when transmission of the identification value of the selected previous audio frame fails.
The device of claim 7, wherein the control unit determines a transmission method of a next audio frame when the transmission of the identification value of the selected previous audio frame fails within the preset transmission time limit.
A method in which a first device provides an audio signal to a second device, the method comprising:

Dividing an audio signal input to the first device into a plurality of audio frames;

Comparing a current audio frame among the plurality of audio frames with at least one previous audio frame previously stored in a memory of the first device;

Selecting one of the previously stored previous audio frames based on a similarity between the previously stored previous audio frame and the current audio frame; and

Transmitting an identification value of the selected previous audio frame to the second device.
The method of claim 9, wherein the selected previous audio frame is stored in each of the memory of the first device and a memory of the second device as a result of having been successfully transmitted to the second device.
The method of claim 9, wherein the selecting of one of the previously stored previous audio frames comprises selecting one of the previously stored previous audio frames when the similarity between the previously stored previous audio frame and the current audio frame is greater than a preset threshold.
The method of claim 9, further comprising:

Compressing the current audio frame when the similarity between the previously stored previous audio frame and the current audio frame is smaller than a preset threshold; and

Transmitting the compressed current audio frame to the second device.
The method of claim 12, further comprising:

Deleting at least one of the previously stored previous audio frames when the compressed current audio frame is successfully transmitted to the second device; and

Storing the current audio frame in the memory as the at least one of the previously stored previous audio frames is deleted.
A device that receives an audio signal from another device, the device comprising:

a communication unit configured to receive, from the other device, data related to an audio signal input to the other device;

a control unit configured to determine a type of the received data, extract an audio frame corresponding to an identification value from a memory of the device when the determined type of the data is the identification value of a pre-stored audio frame, and decompress a bitstream into a current audio frame when the determined type of the data is the bitstream of the current audio frame divided from the audio signal input to the other device; and

an output unit configured to output an audio signal of the extracted audio frame or an audio signal of the decompressed current audio frame.
A method in which a device receives an audio signal from another device, the method comprising:

Receiving, from the other device, data related to an audio signal input to the other device;

Determining a type of the received data;

Extracting an audio frame corresponding to an identification value from a memory of the device when the determined type of the data is the identification value of a pre-stored audio frame;

Decompressing a bitstream into a current audio frame when the determined type of the data is the bitstream of the current audio frame divided from the audio signal input to the other device; and

Outputting an audio signal of the extracted audio frame or an audio signal of the decompressed current audio frame.