WO2015078359A1 - Method and system for measuring audio transmission delay - Google Patents

Method and system for measuring audio transmission delay

Info

Publication number
WO2015078359A1
WO2015078359A1 PCT/CN2014/092198 CN2014092198W WO2015078359A1 WO 2015078359 A1 WO2015078359 A1 WO 2015078359A1 CN 2014092198 W CN2014092198 W CN 2014092198W WO 2015078359 A1 WO2015078359 A1 WO 2015078359A1
Authority
WO
WIPO (PCT)
Prior art keywords
codebook
sending
audio
original audio
receiving
Prior art date
Application number
PCT/CN2014/092198
Other languages
English (en)
French (fr)
Inventor
邹连平
张文婷
何航
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Priority to US15/038,861 (US9755933B2)
Publication of WO2015078359A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays
    • H04L43/0858 One way delays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/309 Measuring or estimating channel quality parameters
    • H04B17/364 Delay profiles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/10 Architectures or entities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L7/00 Arrangements for synchronising receiver with transmitter
    • H04L7/0079 Receiver details
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L7/00 Arrangements for synchronising receiver with transmitter
    • H04L7/0091 Transmitter details
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M7/00 Arrangements for interconnection between switching centres
    • H04M7/006 Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M7/00 Arrangements for interconnection between switching centres
    • H04M7/006 Networks other than PSTN/ISDN providing telephone service, e.g. Voice over Internet Protocol (VoIP), including next generation networks with a packet-switched transport layer
    • H04M7/0081 Network operation, administration, maintenance, or provisioning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/50 Testing arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/70 Media network packetisation

Definitions

  • The present invention relates to the field of communications, and in particular to a method and system for measuring audio transmission delay.
  • End-to-end delay is the total delay a voice signal accumulates across capture, pre-processing, encoding, packing, network transmission, unpacking, and final playback.
  • Excessive delay degrades the listener's subjective perception of a voice product, so measuring and evaluating the delay of a voice system is necessary.
  • Current delay measurement methods are either intrusive or non-intrusive.
  • Intrusive measurements, which operate deep inside the voice system under test, have the following characteristic:
  • the measurement data is usually carried in the data frames or data packets of the system under test, so it is inevitably encoded, compressed, encapsulated, decapsulated, and decoded, and it may be lost or damaged during compression and decompression.
  • The measurement method shown in FIG. 1 is based on single-ended acquisition and averaging over a bidirectional transmission. Its main steps are: (1) an audio signal is played locally and collected by a local measurement device, and the collection timestamp is recorded as T1; (2) the same audio signal is simultaneously captured by the local end of the system under test and transmitted to the far end of the system under test; (3) the sound played by the far end is collected again by the far end and transmitted back to the local end through the intermediate network; (4) the signal played by the local end is collected by the measurement device, and the collection timestamp is recorded as T2.
  • The time difference between the two collections, (T2 - T1), is divided by 2 to obtain the delay value.
  • This scheme uses a two-way transmission to obtain the timestamps of the two collected signals and takes half of their difference as the one-way delay estimate. It has the following deficiency:
  • the end-to-end delay is the total delay that the voice experiences on a single communication link from initial capture to playback. Because the system under test is a black box, the upload and download links of most communication links are not completely symmetrical, and the processing stages between the voice and the test equipment are not necessarily equal in the two directions. The delay experienced by the voice on a single path is therefore not necessarily the simple arithmetic mean of the delays on the two links.
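  • To make this deficiency concrete, the following minimal sketch uses made-up numbers; the uplink and downlink delays are assumptions chosen for illustration, not measurements from the source. It shows how halving the round-trip interval misestimates a one-way delay when the two directions are asymmetric.

```python
def round_trip_estimate(t1: float, t2: float) -> float:
    """Prior-art estimate: half the interval between the two captures."""
    return (t2 - t1) / 2.0


# Hypothetical one-way delays in seconds (illustrative values only).
uplink_delay = 0.120    # local end -> far end
downlink_delay = 0.300  # far end -> local end

t1 = 10.0                                # timestamp of the first capture
t2 = t1 + uplink_delay + downlink_delay  # timestamp of the second capture

estimate = round_trip_estimate(t1, t2)
# Error made when the estimate is reported as the uplink (one-way) delay.
uplink_error = estimate - uplink_delay
```

  • With these assumed values the estimate lies between the two true one-way delays and matches neither of them, which is the asymmetry problem described above.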
  • Embodiments of the present invention provide a method and system for measuring audio transmission delay, so as to at least solve the technical problem of inaccurate calculation of audio transmission delay in the prior art.
  • A method for measuring an audio transmission delay includes: synchronizing the transmission of an original audio codebook to be tested between a transmitting end and a receiving end to obtain transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information for the original audio codebook; the transmitting end sends the original audio codebook to the receiving end in response to the transmission start indication information, and stops sending it in response to the transmission end indication information; the receiving end starts collecting the original audio codebook sent by the transmitting end in response to the reception start indication information, and stops collecting it in response to the reception end indication information; and the audio transmission delay is obtained from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end.
  • Further, the transmission start indication information includes a transmission start time, the transmission end indication information includes a transmission end time, the reception start indication information includes a reception start time, and the reception end indication information includes a reception end time. In this case: the transmitting end sends the original audio codebook to the receiving end from the transmission start time; the transmitting end stops sending the original audio codebook to the receiving end at the transmission end time; the receiving end collects the original audio codebook sent by the transmitting end from the reception start time; and the receiving end stops collecting it at the reception end time.
  • Further, the times satisfy one of the following: the transmission start time is the same as the reception start time, and the transmission end time is the same as the reception end time; or the transmission start time is the same as the reception start time, and the difference between the transmission end time and the reception end time is less than a first predetermined threshold; or the difference between the transmission start time and the reception start time is less than a second predetermined threshold, and the transmission end time is the same as the reception end time; or the difference between the transmission start time and the reception start time is less than a third predetermined threshold, and the difference between the transmission end time and the reception end time is less than a fourth predetermined threshold.
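  • The four alternatives above reduce to one check per pair of times: either the two times are equal, or they differ by less than a threshold. A minimal sketch follows; the function names and the use of `None` to mean "must be equal" are assumptions made for illustration and do not come from the source.

```python
from typing import Optional


def is_synchronized(send_time: float, recv_time: float,
                    threshold: Optional[float] = None) -> bool:
    """True if the times are equal, or differ by less than `threshold` when given."""
    if threshold is None:
        return send_time == recv_time
    return abs(send_time - recv_time) < threshold


def schedule_is_valid(send_start: float, recv_start: float,
                      send_end: float, recv_end: float,
                      start_threshold: Optional[float] = None,
                      end_threshold: Optional[float] = None) -> bool:
    """Check the start pair and the end pair independently.

    Passing neither, either, or both thresholds covers the four
    alternatives listed in the text.
    """
    return (is_synchronized(send_start, recv_start, start_threshold)
            and is_synchronized(send_end, recv_end, end_threshold))
```

  • For example, `schedule_is_valid(0.0, 0.004, 5.0, 5.0, start_threshold=0.01)` corresponds to the third alternative: start times within a threshold, end times identical.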
  • Synchronizing the transmission of the original audio codebook to be tested between the transmitting end and the receiving end further includes: exchanging information between the transmitting end and the receiving end so that the order in which the transmitting end sends multiple original audio codebooks is the same as the order in which the receiving end receives them.
  • Synchronizing the transmission of the original audio codebook to be tested between the transmitting end and the receiving end comprises: a first GPS synchronization control unit disposed at the transmitting end and a second GPS synchronization control unit disposed at the receiving end synchronize the transmission of the original audio codebook between the transmitting end and the receiving end. The first GPS synchronization control unit and the second GPS synchronization control unit each comprise a GPS device, and the GPS device includes a GPS antenna and a GPS receiving module. The GPS antenna is configured to transmit at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time. The GPS receiving module is configured to receive at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time.
  • Further, the transmission start indication information includes first instruction information indicating that the receiving end is ready to receive; the transmission end indication information includes second instruction information indicating that playback of the original audio codebook is complete; and the reception start indication information includes third instruction information indicating that the receiving end should start receiving. In this case: the transmitting end starts sending the original audio codebook to the receiving end when it receives the first instruction information; the transmitting end stops sending the original audio codebook to the receiving end when it receives the second instruction information; the receiving end starts collecting the original audio codebook sent by the transmitting end when it receives the third instruction information; and the receiving end determines whether the collecting duration has exceeded the length of the original audio codebook sent by the transmitting end and, if so, stops collecting it.
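  • The instruction-driven variant can be sketched as two small state machines. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical, and the stop condition is read as "collection has run longer than the codebook itself", which is an interpretation of the text rather than a confirmed detail.

```python
from enum import Enum, auto


class Instruction(Enum):
    RECEIVER_READY = auto()   # "first instruction information"
    PLAYBACK_DONE = auto()    # "second instruction information"
    START_RECEIVING = auto()  # "third instruction information"


class Sender:
    """Transmitting end: plays the codebook between the first and second instruction."""

    def __init__(self) -> None:
        self.sending = False

    def on_instruction(self, instruction: Instruction) -> None:
        if instruction is Instruction.RECEIVER_READY:
            self.sending = True
        elif instruction is Instruction.PLAYBACK_DONE:
            self.sending = False


class Receiver:
    """Receiving end: collects from the third instruction until the duration check fires."""

    def __init__(self, codebook_duration: float) -> None:
        self.codebook_duration = codebook_duration  # length of the codebook, seconds
        self.collecting = False
        self.collected = 0.0  # seconds of audio collected so far

    def on_instruction(self, instruction: Instruction) -> None:
        if instruction is Instruction.START_RECEIVING:
            self.collecting = True
            self.collected = 0.0

    def on_audio(self, chunk_duration: float) -> None:
        if not self.collecting:
            return
        self.collected += chunk_duration
        # Assumed stop rule: stop once collection has outlasted the codebook.
        if self.collected > self.codebook_duration:
            self.collecting = False
```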
  • Obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end includes computing $R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\,y(t+\tau)$, where $R_{xy}(\tau)$ is the value of the cross-correlation function of an original audio codebook and the corresponding test audio codebook, $t_s$ is the time at which the receiving end starts collecting the original audio codebook sent by the transmitting end, $t_e$ is the time at which the receiving end stops collecting it, $t$ is the time corresponding to each sampling point, $x(t)$ is the energy value of the sampling point at time $t$ in the original audio codebook, $\tau$ is the sampling-point offset applied to the test audio codebook when it is correlated with $x(t)$, and $y(t+\tau)$ is the energy value of the sampling point at time $t+\tau$ in the test audio codebook. The value of $\tau$ that maximizes the cross-correlation function represents the audio transmission delay.
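  • As a hedged illustration of the cross-correlation step, the sketch below evaluates $R_{xy}(\tau)$ by direct summation over a range of candidate lags and returns the lag with the largest value. The function names, the `max_lag` search bound, and the toy signals are assumptions made for this example; a real measurement would use the collected audio samples.

```python
from typing import Sequence


def cross_correlation(x: Sequence[float], y: Sequence[float], tau: int) -> float:
    """R_xy(tau) = sum over t of x(t) * y(t + tau); indices outside y are skipped."""
    total = 0.0
    for t, x_t in enumerate(x):
        if 0 <= t + tau < len(y):
            total += x_t * y[t + tau]
    return total


def estimate_delay_samples(x: Sequence[float], y: Sequence[float],
                           max_lag: int) -> int:
    """Return the lag tau in [0, max_lag] that maximizes R_xy(tau)."""
    return max(range(max_lag + 1), key=lambda tau: cross_correlation(x, y, tau))


# Toy data: the "test" signal is the original delayed by 4 samples.
original = [0.0, 1.0, 3.0, -2.0, 0.5, 0.0]
test = [0.0] * 4 + original + [0.0] * 2

lag = estimate_delay_samples(original, test, max_lag=6)  # -> 4
```

  • Direct summation costs on the order of len(x) times max_lag operations; for long recordings an FFT-based correlation would typically be preferred, but the result being maximized is the same.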
  • Obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end further includes computing $Delay_i = \frac{1}{m}\sum_{k=1}^{m} TestValue(k)$, where $TestValue(k)$ is the delay value obtained in the $k$th measurement of original audio codebook $i$, corresponding to the maximum cross-correlation function value between original audio codebook $i$ and the corresponding test audio codebook $i$. This delay value is the value of $\tau$ at the maximum cross-correlation in the $k$th measurement divided by the sampling rate used by the receiving end in that measurement, which gives a time-domain value; the sampling rate is the one recorded in the header format information of original audio codebook $i$. $Delay_i$ is the average audio transmission delay of original audio codebook $i$, and $m$ is an integer greater than or equal to 1.
  • Obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end further includes computing $Avg\_Delay = \frac{1}{n}\sum_{i=1}^{n} Delay_i$, where $Avg\_Delay$ is the average audio transmission delay of the $n$ original audio codebooks, and $n$ is an integer greater than or equal to 1.
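  • The two averaging steps can be sketched as follows. The lag values and the 16 kHz sampling rate are hypothetical numbers chosen for illustration; the helper names are likewise assumptions and not part of the source.

```python
from typing import Sequence


def lag_to_seconds(lag_samples: int, sample_rate_hz: int) -> float:
    """Convert a lag in sampling points to a time-domain delay (TestValue)."""
    return lag_samples / sample_rate_hz


def mean(values: Sequence[float]) -> float:
    """Arithmetic mean; the text requires at least one value (m, n >= 1)."""
    if not values:
        raise ValueError("at least one value is required")
    return sum(values) / len(values)


# Hypothetical measured lags (in samples) for two codebooks.
sample_rate_hz = 16000
lags_per_codebook = [[1600, 1680, 1640], [1760, 1760]]

# Delay_i: mean of the m per-measurement delay values for codebook i.
codebook_delays = [
    mean([lag_to_seconds(lag, sample_rate_hz) for lag in lags])
    for lags in lags_per_codebook
]

# Avg_Delay: mean of Delay_i over the n codebooks.
avg_delay = mean(codebook_delays)
```

  • Note that Avg_Delay as defined weights every codebook equally, regardless of how many measurements each one contributed.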
  • A system for measuring an audio transmission delay comprises: a first synchronization unit at a transmitting end and a second synchronization unit at a receiving end, configured to synchronize the transmission of the original audio codebook to be tested between the transmitting end and the receiving end and to obtain the transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information of the original audio codebook; a first response unit at the transmitting end, configured to send the original audio codebook to be tested to the receiving end in response to the transmission start indication information; a second response unit at the transmitting end, configured to stop sending the original audio codebook to the receiving end in response to the transmission end indication information; a third response unit at the receiving end, configured to start collecting the original audio codebook sent by the transmitting end in response to the reception start indication information; a fourth response unit at the receiving end, configured to stop collecting the original audio codebook sent by the transmitting end in response to the reception end indication information; and a calculating unit at the receiving end, configured to obtain the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end.
  • Further, the first response unit includes a first response sub-module, configured to send the original audio codebook to the receiving end from the transmission start time, where the transmission start indication information includes the transmission start time; the second response unit includes a second response sub-module, configured to stop sending the original audio codebook to the receiving end at the transmission end time, where the transmission end indication information includes the transmission end time; the third response unit includes a third response sub-module, configured to collect the original audio codebook sent by the transmitting end from the reception start time, where the reception start indication information includes the reception start time; and the fourth response unit includes a fourth response sub-module, configured to stop collecting the original audio codebook sent by the transmitting end at the reception end time, where the reception end indication information includes the reception end time.
  • Further, the first synchronization unit includes a first synchronization module and the second synchronization unit includes a second synchronization module, where the first synchronization module and the second synchronization module are configured to perform the synchronization operation to obtain one of the following results: the transmission start time is the same as the reception start time, and the transmission end time is the same as the reception end time; or the transmission start time is the same as the reception start time, and the difference between the transmission end time and the reception end time is less than the first predetermined threshold; or the difference between the transmission start time and the reception start time is less than the second predetermined threshold, and the transmission end time is the same as the reception end time; or the difference between the transmission start time and the reception start time is less than the third predetermined threshold, and the difference between the transmission end time and the reception end time is less than the fourth predetermined threshold.
  • Further, the first synchronization unit includes a third synchronization module and the second synchronization unit includes a fourth synchronization module, where the third synchronization module and the fourth synchronization module are configured to exchange information between the transmitting end and the receiving end so that the order in which the transmitting end sends multiple original audio codebooks is the same as the order in which the receiving end receives them.
  • Further, the first synchronization unit includes a first GPS synchronization control unit and the second synchronization unit includes a second GPS synchronization control unit, where the first GPS synchronization control unit and the second GPS synchronization control unit are configured to synchronize the transmission of the original audio codebook between the transmitting end and the receiving end. The first GPS synchronization control unit and the second GPS synchronization control unit each comprise a GPS device, and the GPS device includes a GPS antenna and a GPS receiving module. The GPS antenna is configured to transmit at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time. The GPS receiving module is configured to receive at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time.
  • Further, the first response unit includes a sending sub-module, configured to start sending the original audio codebook to the receiving end when the transmitting end receives the first instruction information, where the first instruction information indicates that the receiving end is ready to receive; the second response unit includes a terminating sub-module, configured to stop sending the original audio codebook to the receiving end when the transmitting end receives the second instruction information, where the second instruction information indicates that playback of the original audio codebook is complete; the third response unit includes a collection sub-module, configured to start collecting the original audio codebook sent by the transmitting end when the receiving end receives the third instruction information, where the third instruction information instructs the receiving end to start receiving; and the fourth response unit includes a determining sub-module, configured to determine, at the receiving end, whether the collecting duration has exceeded the length of the original audio codebook sent by the transmitting end and, if so, to stop collecting it.
  • The calculating unit includes a first calculating module, configured to calculate the audio transmission delay using $R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\,y(t+\tau)$, where $R_{xy}(\tau)$ is the value of the cross-correlation function of an original audio codebook and the corresponding test audio codebook, $t_s$ is the time at which the receiving end starts collecting the original audio codebook sent by the transmitting end, $t_e$ is the time at which the receiving end stops collecting it, $t$ is the time corresponding to each sampling point, $x(t)$ is the energy value of the sampling point at time $t$ in the original audio codebook, $\tau$ is the sampling-point offset applied to the test audio codebook when it is correlated with $x(t)$, and $y(t+\tau)$ is the energy value of the sampling point at time $t+\tau$ in the test audio codebook. The value of $\tau$ that maximizes the cross-correlation function represents the audio transmission delay.
  • The calculating unit includes a second calculating module, configured to calculate the audio transmission delay using $Delay_i = \frac{1}{m}\sum_{k=1}^{m} TestValue(k)$, where $TestValue(k)$ is the delay value obtained in the $k$th measurement of original audio codebook $i$, corresponding to the maximum cross-correlation function value between original audio codebook $i$ and the corresponding test audio codebook $i$. This delay value is the value of $\tau$ at the maximum cross-correlation in the $k$th measurement divided by the sampling rate used by the receiving end in that measurement, which gives a time-domain value; the sampling rate is the one recorded in the header format information of original audio codebook $i$. $Delay_i$ is the average audio transmission delay of original audio codebook $i$, and $m$ is an integer greater than or equal to 1.
  • The calculating unit further includes a third calculating module, configured to calculate the audio transmission delay using $Avg\_Delay = \frac{1}{n}\sum_{i=1}^{n} Delay_i$, where $Avg\_Delay$ is the average audio transmission delay of the $n$ original audio codebooks, and $n$ is an integer greater than or equal to 1.
  • In the embodiments of the present invention, the transmitting end and the receiving end are operated synchronously, which avoids the echo problem and the asymmetry of the bidirectional path, achieves the technical effect of accurately calculating the transmission delay, and thereby solves the technical problem of inaccurate calculation of audio transmission delay in the prior art.
  • FIG. 1 is a schematic diagram of an audio transmission delay measurement according to the prior art
  • FIG. 2 is a flow chart of an optional audio transmission delay measurement method according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an alternative audio transmission delay measurement implementation in accordance with an embodiment of the present invention.
  • FIG. 4 is a flow chart of another optional audio transmission delay measurement method according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of another alternative audio transmission delay measurement implementation in accordance with an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of still another alternative audio transmission delay measurement implementation in accordance with an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of still another alternative audio transmission delay measurement implementation in accordance with an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of still another alternative audio transmission delay measurement implementation in accordance with an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of an optional audio transmission delay measuring apparatus according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of another optional audio transmission delay measuring apparatus according to an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of still another optional audio transmission delay measuring apparatus according to an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of still another alternative audio transmission delay measuring apparatus according to an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of still another alternative audio transmission delay measuring apparatus in accordance with an embodiment of the present invention.
  • a method for measuring an audio transmission delay includes:
  • S202 Synchronize the transmission of the original audio codebook to be tested between the transmitting end and the receiving end, and obtain sending start indication information, sending end indication information, receiving start indication information, and receiving end indication information of the original audio codebook.
  • the indication information for controlling the start and end of the transmission and reception of the original audio codebook is obtained.
  • The devices that perform the synchronization in this embodiment include, but are not limited to, a GPS-based synchronization control device and a synchronization control device based on a signaling control server.
  • The synchronization operation is a process of negotiating when the transmitting end starts and stops audio playback and when the receiving end starts and stops audio collection; that is, it controls the transmitting end to start or stop playing the codebook and notifies the receiving end to start or stop audio collection.
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • The synchronization control units at the two ends perform a synchronization operation to obtain the transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information for the original audio codebook.
  • The transmitting end sends the original audio codebook to be tested to the receiving end in response to the transmission start indication information and stops sending it in response to the transmission end indication information; the receiving end starts collecting the original audio codebook sent by the transmitting end in response to the reception start indication information and stops collecting it in response to the reception end indication information.
  • When the local audio application end receives the transmission start indication information, it starts sending the original audio codebook to the remote audio application end; for example, the synchronization control unit controls the local audio application end to start playing audio (for example, Audio Play). When the local audio application end receives the transmission end indication information, it stops sending the original audio codebook to the remote audio application end; for example, the synchronization control unit controls the local audio application end to stop playing the audio.
  • When the remote audio application end receives the reception start indication information, it starts collecting the original audio codebook sent by the local audio application end; for example, the synchronization control unit controls the remote audio application end to start collecting the audio played by the local audio application end (for example, Audio Capture). When the remote audio application end receives the reception end indication information, it stops collecting the original audio codebook sent by the local audio application end; for example, the synchronization control unit controls the remote audio application end to stop collecting the audio played by the local audio application end.
  • The receiving end then compares the collected test audio codebook with the pre-stored original audio codebook to estimate the transmission delay of the audio.
  • The sending operation at the transmitting end is accurately synchronized with the collecting operation at the receiving end, so that the original audio codebook used for the delay calculation is aligned with the test audio codebook that was collected after the delayed transmission.
  • the transmission start indication information includes a transmission start time
  • the transmission end indication information includes a transmission end time
  • the reception start indication information includes a reception start time
  • the reception end indication information includes a reception end time
  • The transmitting end sending the original audio codebook to the receiving end in response to the transmission start indication information includes: the transmitting end sends the original audio codebook to the receiving end from the transmission start time, where the transmission start time may be, but is not limited to, the moment at which the audio starts playing.
  • When the local audio application end receives the transmission start indication information, it starts playing the original audio (for example, Audio Play) to the remote audio application end at the indicated transmission start time.
  • In response to the sending end indication information, the transmitting end stops sending the original audio codebook to the receiving end at the sending end time. The sending end time may be, but is not limited to, the moment at which the audio stops playing.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the local audio application end receives the transmission end indication information, it stops playing the original audio to the remote audio application end at the indicated transmission end time.
  • the receiving end in response to the receiving start indication information, starting to collect the original audio codebook sent by the sending end, includes: receiving, by the receiving end, the original audio codebook sent by the sending end, starting from the receiving start time;
  • the reception start time is but not limited to: the time at which the audio collection is started.
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • the local audio application receives the reception start indication information
  • the original audio played by the local audio application end is started to be collected at the indicated reception start time.
  • In response to the reception end indication information, the receiving end stops collecting the original audio codebook sent by the transmitting end at the reception end time. The reception end time may be, but is not limited to, the moment at which audio collection stops.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the remote audio application end receives the reception end indication information, it stops collecting the original audio played by the local audio application end at the indicated reception end time.
  • The synchronization operation between the transmitting end and the receiving end in this embodiment may be determined in any of four optional manners:
  • First manner: the transmission start time is the same as the reception start time, and the transmission end time is the same as the reception end time. The start times and the stop times of the two ends are respectively identical, which implements a synchronous operation on the audio codebook.
  • For example, the transmission start time is T1, the reception start time is also T1, the transmission end time is T2, and the reception end time is also T2.
  • Second manner: the transmission start time is the same as the reception start time, and the difference between the transmission end time and the reception end time is less than a first predetermined threshold. This also implements a synchronous operation on the audio codebook.
  • For example, the transmission start time is T1, the reception start time is also T1, the transmission end time is T2, the reception end time is T3, and the first predetermined threshold is A1, with T3 - T2 < A1. In this case the transmitting end and the receiving end are likewise judged to be operating synchronously.
  • Third manner: the difference between the transmission start time and the reception start time is less than a second predetermined threshold, and the transmission end time is the same as the reception end time. This also implements a synchronous operation on the audio codebook.
  • For example, the transmission start time is T1, the reception start time is T4, the transmission end time is T2, the reception end time is also T2, and the second predetermined threshold is A2, with T4 - T1 < A2. In this case the transmitting end and the receiving end are likewise judged to be operating synchronously.
  • Fourth manner: the difference between the transmission start time and the reception start time is less than a third predetermined threshold, and the difference between the transmission end time and the reception end time is less than a fourth predetermined threshold. This also implements a synchronous operation on the audio codebook.
  • For example, the transmission start time is T1, the reception start time is T5, the transmission end time is T2, the reception end time is T6, and the third and fourth predetermined thresholds are A3 and A4, with T5 - T1 < A3 and T6 - T2 < A4.
  • Synchronous operation of the transmitting end and the receiving end is therefore not limited to identical times: when the difference between the two times falls within the allowable range, the operation can also be judged to be synchronous.
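The four determining manners above can be folded into a single check. The following is a minimal sketch, not the patent's implementation: the function name, the use of millisecond integers, and the use of an absolute difference (the text only shows the reception time being later) are assumptions made for illustration.

```python
def is_synchronized(send_start, send_end, recv_start, recv_end,
                    start_threshold=0, end_threshold=0):
    """Check whether the two ends count as synchronized.

    A threshold of 0 demands identical times; a positive threshold
    accepts any difference strictly smaller than it. Taking the
    absolute difference is an assumption of this sketch.
    """
    def close_enough(a, b, threshold):
        diff = abs(a - b)
        return diff == 0 if threshold == 0 else diff < threshold

    return (close_enough(send_start, recv_start, start_threshold)
            and close_enough(send_end, recv_end, end_threshold))


# First manner: identical start and end times (values in milliseconds).
print(is_synchronized(1000, 5000, 1000, 5000))
# Fourth manner: both differences below their thresholds (A3 = A4 = 5 ms).
print(is_synchronized(1000, 5000, 1002, 5003,
                      start_threshold=5, end_threshold=5))
```

Leaving one threshold at 0 and setting the other reproduces the second and third manners.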
  • The synchronous operation on the transmission of the original audio codebook to be tested between the transmitting end and the receiving end further includes:
  • S402: Perform information interaction between the transmitting end and the receiving end, so that the order in which the transmitting end sends the plurality of original audio codebooks is the same as the order in which the receiving end receives them.
  • There may be one or more original audio codebooks. When there are several, the order in which they are sent by the transmitting end is the same as the order in which they are received by the receiving end.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. If the audio sequence played by the local audio application end is S1, S2, S3, the sequence of audio collected by the remote audio application end is also S1, S2, S3.
  • Keeping the sending and receiving order the same makes it easier for the local and remote audio application ends to synchronize accurately, so that the transmission delay can be calculated accurately.
  • Synchronizing the transmission of the original audio codebook to be tested between the transmitting end and the receiving end comprises: synchronizing the transmission of the original audio codebook between the two ends by means of a first GPS synchronization control unit disposed at the transmitting end and a second GPS synchronization control unit disposed at the receiving end.
  • The first GPS synchronization control unit and the second GPS synchronization control unit each include a GPS device, which includes a GPS antenna and a GPS receiving module. The GPS antenna is configured to transmit at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time. The GPS receiving module is configured to receive at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, the local audio application end transmits the original audio codebook to the remote audio application end through the transmission network, and the synchronization control unit at each end is a GPS synchronization control unit.
  • The local audio application end controls the starting and stopping of codebook playback (for example, Audio play) through its GPS synchronization control unit, and the remote audio application end controls the starting and stopping of audio acquisition (for example, Audio Capture) through its GPS synchronization control unit.
  • The GPS device is composed of an antenna and a GPS receiving module. Its hardware circuit and processing software process the received signal and extract and output two kinds of signals. One is a pulse signal with an interval of 1 s (1 PPS), whose leading edge is synchronized with the international standard time (GMT) to within 1 µs. The other is the international standard "year, month, day, hour, minute, second" information corresponding to that pulse leading edge.
  • The first kind of signal is delivered through a callback of the GPS SDK development kit and notifies the synchronization control unit to read the GPS time information. The second kind of signal is used, through the GPS SDK development kit, to provide a precise time that controls whether the corresponding audio playback and collection are started.
  • The specific synchronization processing procedure based on the GPS synchronization control device is shown in FIG. 6, in which the local audio application end and the remote audio application end implement audio playback and collection through the test app. The specific steps are as follows:
  • The local and remote audio application ends run the voice system under test and initialize the test information of each codebook, including the codebook number, the duration of each codebook, the time interval between codebooks, and the start test time of each codebook.
  • The test initiator sends a signal to the GPS synchronization control unit according to the codebook number and reads the time provided by the GPS. When the time read from the GPS device reaches the start test time of that codebook, the GPS synchronization control unit sends a command to the local test app to start playing the audio codebook, which is then sent through the processing of the system under test.
  • At the receiving end, the GPS synchronization control unit queries the GPS device interface, and when the time provided by the GPS device reaches the start test time, it sends a command to the test app to start collecting the output of the audio system under test at the remote end. When recording the audio file, the receiving end uses the sampling rate of the audio codebook file that corresponds, in the local codebook index table, to the audio codebook number received from the originating end.
  • Once the acquisition is stopped at the receiving end, the collected test audio codebook and the original audio codebook are sent to the delay measurement module.
  • In this way, long-distance or short-distance synchronization of sending and receiving is realized based on GPS, and because acquisition is one-way, the factors that degrade delay accuracy are avoided and the accuracy of the delay measurement is improved.
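The GPS-driven steps above come down to "poll a shared clock, then trigger playback or capture when the scheduled start time is reached". The sketch below illustrates only that control flow. The function `run_at`, its parameters, and the use of `time.time` as a stand-in for the GPS time source are hypothetical; a real system would read the time through the GPS device's SDK, whose API is not described in the text.

```python
import time


def run_at(start_time, action, read_clock=time.time,
           poll_interval=0.001, sleep=time.sleep):
    """Poll the clock until start_time is reached, then run action.

    read_clock stands in for "read the time provided by the GPS device";
    it is injectable so both ends can share the same time source.
    """
    while read_clock() < start_time:
        sleep(poll_interval)
    return action()


# Usage with a fake clock that advances by one tick per read.
ticks = iter(range(10))
result = run_at(3, lambda: "start playing codebook",
                read_clock=lambda: next(ticks), sleep=lambda _: None)
print(result)
```

With the same `start_time` and the same clock source at both ends, the transmitting end would pass a "start playback" action and the receiving end a "start capture" action.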
  • In the method for measuring the audio transmission delay, the sending start indication information includes first instruction information indicating that the receiving end is ready to receive; the sending end indication information includes second instruction information indicating that playback of the original audio codebook has finished; the reception start indication information includes third instruction information indicating that the receiving end is to start collecting; and the reception end indication information includes the collection duration carried in the second instruction information.
  • The instruction information may also be referred to as signaling information, and is implemented based on a signaling control server (SyncServer).
  • The signaling control server can implement short-distance synchronization of sending and receiving.
  • The transmitting end starting to send the original audio codebook to the receiving end in response to the sending start indication information comprises: the transmitting end starting to send the original audio codebook to the receiving end when it receives the first instruction information.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. The first instruction information received by the local audio application end indicates that the remote audio application end is ready to collect audio.
  • The transmitting end stopping the sending of the original audio codebook to the receiving end in response to the sending end indication information comprises: the transmitting end stopping the sending of the original audio codebook to the receiving end when it receives the second instruction information.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. The second instruction information received by the local audio application end indicates that playback of the original audio to the remote audio application end has finished.
  • the receiving end in response to the receiving start indication information, starting to collect the original audio codebook sent by the sending end, comprises: when the receiving end receives the third instruction information, starting to collect the original audio codebook sent by the sending end;
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • the remote audio application end instructs the remote audio application end to start collecting the original audio according to the received third instruction information.
  • The receiving end stopping the collection of the original audio codebook sent by the transmitting end in response to the reception end indication information comprises: the receiving end determining whether the time it has spent collecting the original audio codebook exceeds the collection duration and, if so, stopping the collection.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. The remote audio application end receives the reception end indication information, which includes the collection duration Tt carried in the second instruction information.
  • The local and remote audio application ends run the voice system under test, and the corresponding synchronization test clients log in to the SyncServer. Once both ends have logged in successfully, the SyncServer creates a test session, in which the two ends are identified as the A end and the B end.
  • Either end (for example, the A end) initiates an audio test session request, SyncRequest, which carries the codebook number information. The request is forwarded through the SyncServer to the other end of the test session (the B end).
  • On receiving the SyncRequest, the other end initializes and opens the audio collection resource device and, according to the codebook number, sets up the degraded codebook file name, audio sample rate, number of channels, sample bit depth, and so on.
  • When the test session initiator (A end) receives, via the SyncServer, the signaling indicating that the peer is ready, it sends signaling to start playing the audio codebook (Ok Begin Play) to the other end (B end) and immediately starts playing the reference codebook signal.
  • The reference codebook audio signal played at this time enters the audio system under test, passes through its entire processing chain (front-end processing, encoding, packing, network transmission, unpacking, decoding, post-processing, playback), is output at the other end, and is collected there by the test control client.
  • When playback ends, Play Ended signaling (which carries the duration of the test codebook) is sent to the other end (B end). On receiving it, the other end checks whether the collection duration has been reached; it keeps collecting the output signal of the audio system under test until that duration is reached, then stops and outputs the finally recorded codebook signal.
  • Instruction-based synchronization control thus realizes the synchronous operation of the transmitting end and the receiving end. Because collection is one-way, the path asymmetry and echo problems that affect delay accuracy are avoided, the accuracy of the delay measurement is improved, and the measurement remains simple.
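The signaling exchange above (SyncRequest, Ok Begin Play, Play Ended) can be pictured as a small state machine at the collecting (B) end. The following is an in-memory sketch under stated assumptions: the class `CaptureEnd`, its state names, and the `"Ready"` reply are hypothetical, and no real SyncServer transport or audio device is involved.

```python
class CaptureEnd:
    """B end of a test session, modelled as a tiny state machine."""

    def __init__(self):
        self.state = "idle"
        self.codebook_number = None
        self.captured_seconds = 0.0
        self.required_seconds = None

    def on_message(self, kind, payload=None):
        if kind == "SyncRequest" and self.state == "idle":
            self.codebook_number = payload   # request carries the codebook number
            self.state = "ready"             # capture device initialized and opened
            return "Ready"                   # hypothetical reply name
        if kind == "OkBeginPlay" and self.state == "ready":
            self.state = "capturing"
        elif kind == "PlayEnded" and self.state == "capturing":
            self.required_seconds = payload  # duration of the test codebook
            self._stop_if_done()
        return None

    def capture(self, seconds):
        """Account for `seconds` of recorded output while capturing."""
        if self.state == "capturing":
            self.captured_seconds += seconds
            self._stop_if_done()

    def _stop_if_done(self):
        # Stop only once the announced duration has been collected.
        if (self.required_seconds is not None
                and self.captured_seconds >= self.required_seconds):
            self.state = "stopped"


b_end = CaptureEnd()
b_end.on_message("SyncRequest", 3)
b_end.on_message("OkBeginPlay")
b_end.capture(2.0)
b_end.on_message("PlayEnded", 3.0)   # only 2.0 s collected so far
b_end.capture(1.0)
print(b_end.state)
```

The point of the design is that Play Ended does not stop capture by itself: the B end keeps recording until the announced duration is covered, so audio still in flight through the system under test is not cut off.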
  • Obtaining the audio transmission delay according to the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end includes computing the cross-correlation function:
$$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$
  • R_xy(τ) is the value of the cross-correlation function of the original audio codebook and the corresponding test audio codebook
  • t_s is the time at which the receiving end starts collecting the original audio codebook sent by the transmitting end
  • t_e is the time at which the receiving end stops collecting the original audio codebook sent by the transmitting end
  • t is the time information corresponding to each sampling point
  • x(t) is the energy value of the sampling point at time t in the original audio codebook
  • τ is the offset of the sampling point in the test audio codebook that is correlated with x(t), and y(t+τ) is the energy value of the sampling point at time t+τ in the test audio codebook
  • the value of τ corresponding to the largest cross-correlation function value represents the audio transmission delay.
  • The delay value in time can then be estimated by dividing this τ by the sampling rate of the reference audio codebook.
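As an illustration of the cross-correlation search described above, the following sketch scans every admissible offset τ, keeps the one with the largest sum of products x(t)·y(t+τ), and converts it to seconds by dividing by the sampling rate. The function name and the brute-force pure-Python loop are choices made for this example; it assumes the recorded (test) signal is at least as long as the original and that the delay is non-negative.

```python
def estimate_delay(original, test, sample_rate):
    """Return (lag_in_samples, delay_in_seconds) maximizing R_xy(tau).

    original: samples x(t) of the reference codebook
    test:     samples y(t) recorded at the receiving end
    """
    max_lag = len(test) - len(original)
    if max_lag < 0:
        raise ValueError("test signal must be at least as long as original")

    best_lag, best_value = 0, float("-inf")
    for lag in range(max_lag + 1):
        # R_xy(lag) = sum over t of x(t) * y(t + lag)
        value = sum(x * test[t + lag] for t, x in enumerate(original))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag, best_lag / sample_rate


# Usage: the reference appears in the recording after 5 samples of silence.
reference = [0, 1, 3, 1, 0, -2, 0]
recording = [0] * 5 + reference + [0] * 4
print(estimate_delay(reference, recording, sample_rate=1000))
```

For real audio lengths an FFT-based correlation would be far faster; the quadratic loop is kept here only because it mirrors the formula term by term.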
  • The delay calculation obtains the audio delay by solving the cross-correlation of the audio signals. The resulting audio delay consists of two parts: the overall coarse audio delay, Delay_crude, plus the delay within audio segments, Delay_internal.
  • Delay_crude is the delay value at which the cross-correlation between the reference codebook and the audio output codebook recorded under the synchronization control unit is maximal.
  • Delay_internal is computed on the basis of the overall coarse delay: the audio signal in the codebook is divided into audio sub-segments and aligned, and the delay of each reference sub-segment relative to the corresponding sub-segment of the recorded audio output codebook is then solved.
  • The final delay value is Delay_crude + Delay_internal.
  • The cross-correlation function value may be normalized by the following formula to obtain the normalized maximum cross-correlation value ρ_xy(τ) and the corresponding time τ:
  • Because a codebook file contains a large amount of data, for ease of processing the audio envelope of the codebook audio file is taken using a small window of T ms. The maximum cross-correlation value between the envelopes is then found, together with the corresponding delay value t. The specific steps are as follows:
  • The window applied in this embodiment includes at least one of the following: a rectangular window, a Hanning window, a Hamming window, a triangular window, a Bartlett window, a Kaiser window, and the like.
  • the rectangular window function is as follows:
  • the energy average of the kth frame signal Xk(n) is represented by E(k):
  • One envelope value is obtained for each frame of T ms. The envelope is the logarithm of the normalized voice energy and characterizes the short-term energy variation of the voice. The envelope of the k-th frame of the voice signal is represented by Env(k):
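The framing and envelope steps can be sketched as follows. The exact formulas for E(k) and Env(k) are not reproduced in the text, so this example assumes a common form: non-overlapping rectangular frames of T ms, E(k) as the mean squared sample value of frame k, normalization by the largest frame energy, and a base-10 logarithm with a small floor to avoid log(0). All of these choices, and the function name, are assumptions of this sketch.

```python
import math


def frame_envelope(samples, sample_rate, window_ms, floor=1e-12):
    """Return one log-energy envelope value per T ms frame.

    Assumed definitions (not taken from the source formulas):
      E(k)   = mean of squared samples in frame k (rectangular window)
      Env(k) = log10(max(E(k) / max_k E(k), floor))
    """
    frame_len = max(1, int(sample_rate * window_ms / 1000))
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)

    peak = max(energies, default=0.0)
    if peak <= 0.0:
        # Entirely silent input: every frame sits at the floor.
        return [math.log10(floor)] * len(energies)
    return [math.log10(max(e / peak, floor)) for e in energies]


# Usage: three 4 ms frames at 1 kHz with decreasing energy.
signal = [1.0] * 4 + [0.5] * 4 + [0.0] * 4
print(frame_envelope(signal, sample_rate=1000, window_ms=4))
```

Cross-correlating two such envelopes instead of the raw samples reduces the data by a factor of the frame length, at the cost of a delay resolution of one frame.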
  • Obtaining the audio transmission delay according to the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end further includes averaging over repeated measurements:
$$\mathrm{Delay}_i = \frac{1}{m}\sum_{k=1}^{m} \mathrm{TestValue}(k)$$
  • TestValue(k) is the delay value corresponding to the maximum cross-correlation function value between original audio codebook i and its test audio codebook in the k-th measurement of original audio codebook i. This delay value is the τ corresponding to the maximum cross-correlation function value in the k-th measurement, divided by the sampling rate used by the receiving end in that measurement, which yields a time-domain value. The sampling rate is the one given in the header format information of original audio codebook i.
  • Delay_i is the average audio transmission delay of original audio codebook i, and m is an integer greater than or equal to 1.
  • Obtaining the audio transmission delay further includes computing the overall average delay of the audio system:
$$\mathrm{Avg\_Delay} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Delay}_i$$
  • Avg_Delay is the average audio transmission delay of the n original audio codebooks, and n is an integer greater than or equal to 1.
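The two averaging stages can be written directly from the definitions of TestValue(k), Delay_i, and Avg_Delay. This is a minimal sketch; the function names and the sample numbers are illustrative only.

```python
def codebook_delay(lags_in_samples, sample_rate):
    """Delay_i: mean over m measurements of TestValue(k) = tau_k / sample_rate."""
    if not lags_in_samples:
        raise ValueError("at least one measurement is required (m >= 1)")
    test_values = [lag / sample_rate for lag in lags_in_samples]
    return sum(test_values) / len(test_values)


def overall_delay(per_codebook_delays):
    """Avg_Delay: mean of Delay_i over the n original audio codebooks."""
    if not per_codebook_delays:
        raise ValueError("at least one codebook is required (n >= 1)")
    return sum(per_codebook_delays) / len(per_codebook_delays)


# Usage: two codebooks sampled at 16 kHz, two measurements each.
delay_1 = codebook_delay([4000, 8000], sample_rate=16000)
delay_2 = codebook_delay([8000, 8000], sample_rate=16000)
print(delay_1, delay_2, overall_delay([delay_1, delay_2]))
```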
  • the energy value of the sampling point is calculated by the cross-correlation function, thereby realizing accurate calculation of the audio transmission delay.
  • The method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation.
  • The part of the technical solution of the present invention that is essential, or that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a cell phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.
  • the apparatus includes:
  • Through the synchronization operation, the indication information that controls the start and end of the transmission and reception of the original audio codebook is obtained.
  • The devices that perform the synchronization operation in this embodiment include, but are not limited to, a GPS-based synchronization control device and a synchronization control device based on the signaling control server.
  • The synchronization operation is used to negotiate the starting and stopping of audio playback at the transmitting end and the starting and stopping of audio collection at the receiving end; that is, to control the transmitting end to start or stop playing the codebook and to notify the receiving end to start or stop audio collection.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. The synchronization control units at both ends perform a synchronization operation to obtain the transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information for the original audio codebook.
  • The first response unit 904, located at the transmitting end, is configured to start sending the original audio codebook to be tested to the receiving end in response to the sending start indication information.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the local audio application end receives the transmission start indication information, it starts sending the original audio codebook to be tested to the receiving end; for example, the synchronization control unit controls the local audio application end to start playing audio (for example, Audio play).
  • The second response unit 906, located at the transmitting end, is configured to stop sending the original audio codebook to the receiving end in response to the sending end indication information.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the local audio application end receives the transmission end indication information, it stops sending the original audio codebook to the receiving end; for example, the synchronization control unit controls the local audio application end to stop playing the audio.
  • The third response unit 908, located at the receiving end, is configured to start collecting the original audio codebook sent by the transmitting end in response to the reception start indication information.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the remote audio application end receives the reception start indication information, it starts collecting the original audio codebook sent by the local audio application end; for example, the synchronization control unit starts the collection of the audio played by the local audio application end (for example, Audio Capture).
  • The fourth response unit 910, located at the receiving end, is configured to stop collecting the original audio codebook sent by the transmitting end in response to the reception end indication information.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the remote audio application end receives the reception end indication information, it stops collecting the original audio codebook sent by the local audio application end; for example, the synchronization control unit stops the collection of the audio played by the local audio application end.
  • The computing unit 912, located at the receiving end, is configured to calculate the audio transmission delay according to the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. The collected test audio codebook is compared with the original audio codebook to estimate the transmission delay of the audio.
  • The sending operation at the transmitting end is accurately synchronized with the collecting operation at the receiving end, so that the original audio codebook fed into the delay calculation is synchronized with the collected audio codebook that has been transmitted with delay.
  • the system further includes:
  • The first response unit 904 includes a first response sub-module 1002, configured to start sending the original audio codebook to the receiving end from the sending start time, where the sending start indication information includes the sending start time. The sending start time may be, but is not limited to, the moment at which the audio starts playing.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the local audio application end receives the transmission start indication information, it starts playing the original audio (for example, Audio play) to the remote audio application end at the indicated transmission start time.
  • The second response unit 906 includes a second response sub-module 1004, configured to stop sending the original audio codebook to the receiving end at the sending end time, where the sending end indication information includes the sending end time. The sending end time may be, but is not limited to, the moment at which the audio stops playing.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the local audio application end receives the transmission end indication information, it stops playing the original audio to the remote audio application end at the indicated transmission end time.
  • The third response unit 908 includes a third response sub-module 1006, configured to start collecting the original audio codebook sent by the transmitting end from the reception start time, where the reception start indication information includes the reception start time. The reception start time may be, but is not limited to, the moment at which audio collection starts.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through the transmission network. After the remote audio application end receives the reception start indication information, it starts collecting the original audio played by the local audio application end at the indicated reception start time.
  • The fourth response unit 910 includes a fourth response sub-module 1008, configured to stop collecting the original audio codebook sent by the transmitting end at the reception end time, where the reception end indication information includes the reception end time. The reception end time may be, but is not limited to, the moment at which audio collection stops.
  • For example, the transmitting end is a local audio application end, the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. After the remote audio application end receives the reception end indication information, it stops collecting the original audio played by the local audio application end at the indicated reception end time.
  • the system further includes: the first synchronization unit 902 includes a first synchronization module 1102, and the second synchronization unit 903 includes a second synchronization module 1104, wherein the first synchronization module 1102 and the second synchronization module 1104 are configured to perform a synchronization operation to obtain one of the following results:
  • the start and stop times of the sending end and the receiving end are respectively the same, thereby implementing a synchronous operation on the audio codebook.
  • the transmission start time is T1
  • the reception start time is also T1
  • the transmission end time is T2
  • the reception end time is also T2.
  • the start time of the sending end and the receiving end are the same, and the difference between the ending time of the sending end and the receiving end is less than a first predetermined threshold, thereby implementing a synchronous operation on the audio codebook.
  • the transmission start time is T1
  • the reception start time is also T1
  • the transmission end time is T2
  • the reception end time is T3
  • the first predetermined threshold is A1, and T3-T2<A1. From this it can likewise be determined that the transmitting end and the receiving end are synchronized.
  • the difference between the start time of the sending end and the receiving end is less than a second predetermined threshold, and the sending end is the same as the ending time of the receiving end, thereby implementing a synchronous operation on the audio codebook.
  • the transmission start time is T1
  • the reception start time is T4
  • the transmission end time is T2
  • the reception end time is also T2
  • the second predetermined threshold is A2, and T4-T1<A2. From this it can likewise be determined that the transmitting end and the receiving end are synchronized.
  • the difference between the start time of the sending end and the receiving end is less than a third predetermined threshold, and the difference between the ending time of the sending end and the receiving end is less than a fourth predetermined threshold, thereby implementing a synchronous operation on the audio codebook.
  • the transmission start time is T1
  • the reception start time is T5
  • the transmission end time is T2
  • the reception end time is T6, where T5-T1<A3 and T6-T2<A4 (A3 and A4 being the third and fourth predetermined thresholds). From this it can likewise be determined that the transmitting end and the receiving end are synchronized.
  • the transmitting end and the receiving end need not start and stop at exactly the same times to be judged synchronized: if the difference between the two times is within the allowed range, the synchronous operation is also considered to be achieved.
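The four judgment cases above can be folded into one check. The following is only a minimal sketch: the function name, the use of seconds as the unit, and the use of an absolute difference (the text writes the differences as T3-T2, T4-T1, and so on) are assumptions made for illustration.

```python
def is_synchronized(send_start, recv_start, send_end, recv_end,
                    start_tolerance=0.0, end_tolerance=0.0):
    """Return True if the start times and the end times each match.

    Two times "match" when they are identical or when their difference
    is smaller than the given tolerance (the predetermined threshold).
    Taking the absolute difference is an assumption of this sketch.
    """
    start_ok = (send_start == recv_start
                or abs(recv_start - send_start) < start_tolerance)
    end_ok = (send_end == recv_end
              or abs(recv_end - send_end) < end_tolerance)
    return start_ok and end_ok
```

With both tolerances left at zero this reduces to the first case (identical start and end times); setting only one tolerance gives the second or third case, and setting both gives the fourth.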
  • the first synchronization unit 902 further includes: a third synchronization module 1106, and the second synchronization unit 903 includes: a fourth synchronization module 1108, wherein the third synchronization module 1106 and the fourth synchronization module 1108 exchange information between the transmitting end and the receiving end, so that the order in which the transmitting end sends the plurality of original audio codebooks is the same as the order in which the receiving end receives the plurality of original audio codebooks.
  • there may be, but is not limited to, one or more original audio codebooks.
  • when there are multiple original audio codebooks, the order in which the transmitting end sends them is the same as the order in which the receiving end receives them.
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • the audio sequence played by the local audio application is S1, S2, S3.
  • the sequence of audio collection by the remote audio application is also S1, S2, S3.
  • the sending and receiving sequences are the same, which makes it easier for the local audio application end and the remote audio application end to achieve accurate synchronization, and thus to calculate the transmission delay accurately.
  • the first synchronization unit 902 further includes: a first GPS synchronization control unit 1202
  • the second synchronization unit 903 includes: a second GPS synchronization control unit 1204, wherein the first GPS synchronization control unit 1202 and the second GPS synchronization control unit 1204 are configured to perform a synchronization operation on the transmission of the original audio codebook between the transmitting end and the receiving end.
  • the first GPS synchronization control unit and the second GPS synchronization control unit each include a GPS device; the GPS device includes a GPS antenna and a GPS receiving module; the GPS antenna is configured to transmit at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time; the GPS receiving module is configured to receive at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time.
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through the transmission network, and the synchronization control units at both ends are GPS synchronization control units.
  • the local audio application controls the start/stop playback of the codebook (for example, Audio play) through the GPS synchronization control unit
  • the remote audio application controls the start/stop acquisition of the audio (for example, Audio Capture) through the GPS synchronization control unit.
  • the GPS device is composed of an antenna and a GPS receiving module. Its hardware circuit and processing software decode and process the received signal, and extract and output two kinds of signals. One is a pulse signal with an interval of 1 s, whose leading edge is synchronized with international standard GMT to within 1 us, that is, 1 pps; the other is the international standard "year, month, day, hour, minute and second" information corresponding to the pulse leading edge.
  • the first type of signal is delivered through a GPS SDK callback and notifies the synchronization control unit to read the time information of the GPS; the second type of signal is delivered through a GPS SDK callback and provides the precise time used to control whether playback and capture of the corresponding audio start.
  • the specific synchronization processing procedure based on the GPS synchronization control device is shown in FIG. 6 , wherein the local audio application end and the remote audio application end implement audio playback and collection through the test app, and the specific steps are as follows:
  • the local audio application end and the remote audio application end both run the voice system to be tested and initialize the test information of each codebook, including the codebook number (here, a codebook is a voice/audio file with audio header format information, where the header format information includes the sampling frequency, the number of channels, the number of samples, etc.; the voice/audio file can be in any format that carries an audio header, such as wav, mp3, wma, etc.), the duration of each codebook, the interval between codebooks, and the start test time of each codebook;
  • the test initiator signals the GPS synchronization control unit according to the codebook number and reads the time provided by the GPS. When the time provided by the GPS device reaches the start test time of the corresponding codebook, the GPS synchronization control unit sends a command to the local test app to start playing the audio codebook, which is sent out after being processed by the system under test;
  • the GPS synchronization control unit queries the GPS device and, once the time provided by the GPS device reaches the test time, sends a command to the test app to start remote capture of the output of the audio system under test. When the acquisition duration reaches the pre-agreed time, the receiving end stops collecting; the collected test audio codebook and the original audio codebook are sent to the delay measuring module.
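The time-driven steps above can be sketched as a shared schedule that both ends derive from the initialized test information and compare against the GPS time. This is a hypothetical illustration only: `CodebookSlot`, `build_schedule`, and `due_action` are invented names, times are assumed to be seconds on a common GPS-derived clock, and no real GPS SDK call is shown.

```python
from dataclasses import dataclass


@dataclass
class CodebookSlot:
    number: int    # codebook number
    start: float   # start test time, in seconds on the shared clock
    stop: float    # start time plus the codebook duration


def build_schedule(first_start, durations, gap):
    """Lay out one slot per codebook, separated by a fixed interval."""
    slots = []
    start = first_start
    for number, duration in enumerate(durations, start=1):
        slots.append(CodebookSlot(number, start, start + duration))
        start += duration + gap
    return slots


def due_action(slot, now):
    """Decide what a control unit should do at GPS time `now`."""
    if now < slot.start:
        return "wait"   # start test time not reached yet
    if now < slot.stop:
        return "run"    # play (transmitting end) or capture (receiving end)
    return "stop"       # agreed duration has elapsed
```

Because both ends evaluate the same schedule against the same time source, the transmitting end starts playing and the receiving end starts capturing without exchanging any message over the network under test.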
  • long-distance or short-distance sending and receiving synchronization can be realized based on the GPS; one-way acquisition avoids the loss of delay accuracy caused by asymmetric upload/download paths, and it also keeps echo from interfering with the delay calculation, which improves the accuracy of the delay measurement.
  • the system further includes:
  • the first response unit 904 includes: a sending submodule 1302, configured to start sending an original audio codebook to the receiving end when receiving the first instruction information, where the first instruction information is used to indicate that the receiving end is ready to receive;
  • the instruction information may also be referred to as signaling information, where the instruction information is implemented based on a signaling control server (SyncServer).
  • the signaling control server can implement short-distance transmission and reception synchronization.
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • the local audio application end indicates that the remote audio application is ready to collect audio according to the received first instruction information.
  • the second response unit 906 includes: a termination sub-module 1304, configured to stop sending the original audio codebook to the receiving end when receiving the second instruction information, wherein the second instruction information is used to indicate that the original audio codebook has been completely played;
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • the local audio application end indicates that the remote audio application end has finished playing the original audio according to the received second instruction information.
  • the third response unit 908 includes: a collection sub-module 1306, configured to start collecting the original audio codebook sent by the sending end when receiving the third instruction information, where the third instruction information is used to indicate that the receiving end starts receiving ;
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • the remote audio application end instructs the remote audio application end to start collecting the original audio according to the received third instruction information.
  • the fourth response unit 910 includes: a determining sub-module 1308, configured to determine whether the duration of the collection of the original audio codebook sent by the transmitting end exceeds the collection duration, and if yes, stop performing the original audio codebook sent by the transmitting end. collection.
  • the transmitting end is a local audio application end
  • the receiving end is a remote audio application end
  • the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network.
  • the remote audio application end receives the reception end indication information, where the reception end indication information includes the collection duration T t carried in the second instruction information.
  • the specific synchronization process of the above-mentioned instruction-based synchronization control device is as follows, wherein the local audio application end and the remote audio application end implement audio playback and collection through the test app:
  • the local audio application end and the remote audio application end both run the voice system to be tested, and the corresponding synchronization test clients log in to the SyncServer. Once both ends have logged in successfully, a test session is created by the SyncServer.
  • the two ends are identified as the A end and the B end respectively.
  • an audio test session request SyncRequest is initiated by either end (for example, the A end); the request carries the codebook number information and is forwarded by the SyncServer to the other end of the test session (the B end).
  • the other end receives the test session request SyncRequest, initializes/opens the audio collection device, and prepares the degraded codebook file name, audio sample rate, number of channels, sample digits, etc. according to the codebook number.
  • the other end then responds to the test session request SyncRequest.
  • the test session initiator (the A end) receives, through the SyncServer, the signaling that the peer is ready, sends the signaling to start playing the audio codebook (Ok Begin Play) to the other end (the B end), and immediately starts playing the reference codebook signal.
  • the reference codebook audio signal played at this time enters the audio system under test, passes through its entire processing chain (front-end processing, encoding, packing, network transmission, unpacking, decoding, post-processing, playback), is output at the other end, and is collected there by the test control client.
  • a Play Ended signaling (which carries the duration of the test codebook) is sent to the other end (the B end). On receiving it, the other end checks whether the collection duration has been reached; once it has, the other end stops collecting the output signal of the audio system under test and outputs the finally recorded codebook signal.
  • instruction-based synchronous control synchronizes the transmitting end and the receiving end and uses one-way collection, thereby avoiding the path asymmetry and echo problems that affect delay accuracy and improving the accuracy of the delay measurement.
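The B-end side of the signaling exchange above can be pictured as a small state machine. The sketch below is hypothetical: the class and method names are invented, the message names mirror SyncRequest, Ok Begin Play, and Play Ended from the text, and the "Ready" reply is an assumed name for the signaling that tells the A end the peer is prepared.

```python
class CaptureEnd:
    """Hypothetical B-end controller driven by SyncServer messages."""

    def __init__(self):
        self.state = "idle"
        self.captured = 0.0    # seconds of audio collected so far
        self.expected = None   # duration carried by Play Ended

    def on_message(self, name, payload=None):
        if name == "SyncRequest" and self.state == "idle":
            # Open the capture device for the requested codebook number.
            self.state = "ready"
            return "Ready"     # assumed name of the reply signaling
        if name == "OkBeginPlay" and self.state == "ready":
            self.state = "capturing"
        elif name == "PlayEnded" and self.state == "capturing":
            self.expected = payload
            self._maybe_stop()
        return None

    def on_audio(self, seconds):
        """Account for a newly collected chunk of audio."""
        if self.state == "capturing":
            self.captured += seconds
            self._maybe_stop()

    def _maybe_stop(self):
        # Stop only once the collected duration covers the codebook.
        if self.expected is not None and self.captured >= self.expected:
            self.state = "done"
```

The point of the design is that Play Ended does not stop the capture by itself: the B end keeps collecting until the duration carried in the signaling has been reached, so audio still in flight through the system under test is not cut off.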
  • the calculating unit 912 includes: a first calculating module, configured to calculate the audio transmission delay by using the following formula:

$$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$

  • R_xy(τ) is the value of the cross-correlation function of the original audio codebook and the corresponding test audio codebook
  • t_s is the time at which the receiving end starts collecting the original audio codebook sent by the transmitting end
  • t_e is the time at which the receiving end stops collecting the original audio codebook sent by the transmitting end
  • t is the time information corresponding to each sampling point
  • x(t) is the energy value corresponding to the sampling point at time t in the original audio codebook
  • τ is the offset of the sampling point in the test audio codebook that is convolved with x(t), and y(t+τ) is the energy value of the sampling point at time t+τ in the test audio codebook
  • the value of τ corresponding to the largest cross-correlation function value represents the audio transmission delay.
  • the delay value in the time domain can be estimated by dividing this τ by the sampling rate of the audio codebook.
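The search for the offset τ with the largest cross-correlation value can be sketched in a few lines. This is an illustrative sketch rather than the patented implementation: the function names, the restriction to non-negative lags up to an assumed `max_lag`, and the use of plain Python lists of sample values are all assumptions.

```python
def cross_correlation(x, y, lag):
    """Sum of x[t] * y[t + lag] over the indices present in both lists."""
    total = 0.0
    for t, value in enumerate(x):
        if 0 <= t + lag < len(y):
            total += value * y[t + lag]
    return total


def estimate_delay(x, y, sample_rate, max_lag):
    """Return (lag in samples, delay in seconds) maximizing the correlation.

    x is the original codebook, y the captured test codebook; only
    non-negative lags up to max_lag are searched (an assumption).
    """
    best_lag = max(range(max_lag + 1),
                   key=lambda lag: cross_correlation(x, y, lag))
    return best_lag, best_lag / sample_rate
```

For example, if `y` is `x` preceded by five zero samples and the sampling rate is 1000 Hz, the search picks a lag of 5 samples, which corresponds to 5 ms. The brute-force search costs on the order of `len(x) * max_lag` multiplications; an FFT-based correlation would be the usual choice for long recordings.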
  • the delay calculation solves the cross-correlation of the audio signals to obtain the audio delay, and the audio delay is split into the overall coarse audio delay Delay_crude plus the delay within the audio segments Delay_internal.
  • the overall coarse delay Delay_crude is the delay value at which the cross-correlation between the reference codebook and the audio output codebook recorded by the synchronous control unit is maximal.
  • the delay within the audio segments Delay_internal builds on the overall coarse delay: the audio signal in the codebook is divided into audio sub-segments and aligned, and the delay of each sub-segment of the reference codebook relative to the corresponding sub-segment of the audio output codebook recorded by the synchronous control unit is then solved.
  • the final delay value is the overall coarse audio delay Delay_crude plus the delay within the audio segments Delay_internal.
  • the cross-correlation function value may be normalized by the following formula to obtain the normalized maximum cross-correlation value ρ_xy(τ) and the corresponding time offset τ:
  • the codebook audio file is windowed into short frames of T ms to obtain the audio envelope; the maximum cross-correlation value between the envelopes is then found, together with the corresponding delay value t. The specific steps are as follows:
  • the window added in this embodiment includes at least one of the following: a Hamming window, a Hanning window, a triangular window, a Bartlett window, and a Kaiser window.
  • the rectangular window function is as follows:
  • the energy average of the kth frame signal Xk(n) is represented by E(k):
  • an envelope value is obtained for each frame of T ms. The envelope is the logarithm of the normalized voice energy and characterizes the short-term energy change of the voice; the envelope of the k-th frame of the voice signal is represented by Env(k):
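Since the formulas for the window, E(k), and Env(k) are not reproduced here, the following is only a rough sketch of a frame-energy envelope under stated assumptions: non-overlapping rectangular frames of `frame_ms` milliseconds, mean squared sample value as the frame energy, a base-10 logarithm, and no normalization step. The function name and the `floor` parameter are hypothetical.

```python
import math


def frame_envelope(samples, sample_rate, frame_ms, floor=1e-12):
    """Log of the mean energy of each non-overlapping frame.

    `floor` keeps the logarithm defined for silent frames; it is an
    implementation convenience, not something the text specifies.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    envelope = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        envelope.append(math.log10(max(energy, floor)))
    return envelope
```

Correlating two envelopes instead of the raw samples reduces the number of points by a factor of the frame length, which is why an envelope-based search is a natural fit for a coarse delay estimate.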
  • the calculating unit 912 includes: a second calculating module, configured to calculate the audio transmission delay by using the following formula:

$$\mathrm{Delay}_i = \frac{1}{m}\sum_{k=1}^{m} \mathrm{TestValue}(k)$$

  • TestValue(k) is the delay value corresponding to the maximum cross-correlation function value obtained for the original audio codebook i and the corresponding test audio codebook i collected in the k-th measurement of the original audio codebook i
  • the delay value is the time-domain value obtained by dividing the value of τ corresponding to the maximum cross-correlation function value of the k-th measurement by the sampling rate used by the receiving end in the k-th measurement, and the sampling rate is the sampling rate in the header format information of the original audio codebook i
  • Delay_i is the average audio transmission delay of the original audio codebook i, and m is an integer greater than or equal to 1.
  • the calculating unit 912 includes: a third calculating module, configured to calculate the audio transmission delay by using the following formula:

$$\mathrm{Avg\_Delay} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Delay}_i$$

  • Avg_Delay is the average audio transmission delay of n original audio codebooks, and n is an integer greater than or equal to 1.
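The two averaging stages can be sketched as follows, assuming each codebook's measurements are already available as a list of delay values in seconds; the function names are illustrative only.

```python
def codebook_delay(test_values):
    """Average of the m delay values measured for one codebook."""
    return sum(test_values) / len(test_values)


def average_delay(per_codebook_values):
    """Average of the per-codebook averages over n codebooks."""
    delays = [codebook_delay(values) for values in per_codebook_values]
    return sum(delays) / len(delays)
```

For instance, measurements of 0.10 s, 0.12 s, and 0.14 s for one codebook and 0.20 s and 0.22 s for another give per-codebook averages of about 0.12 s and 0.21 s, and an overall average of about 0.165 s. Note that this is an average of averages, so a codebook measured fewer times carries the same weight as one measured more often.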
  • the energy value of the sampling point is calculated by the cross-correlation function, thereby realizing accurate calculation of the audio transmission delay.
  • the system for measuring the audio transmission delay described above may be adapted for short-range communication.
  • the disclosed client may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • the software product includes a number of instructions that cause a computer device (which may be a personal computer, server or network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • Electromagnetism (AREA)
  • Multimedia (AREA)
  • Synchronisation In Digital Transmission Systems (AREA)
  • Telephone Function (AREA)

Abstract

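The present invention discloses a method and system for measuring audio transmission delay. The method includes: performing a synchronization operation between a transmitting end and a receiving end on the transmission of an original audio codebook to be tested, to obtain transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information for the original audio codebook; the transmitting end starts sending the original audio codebook to be tested to the receiving end in response to the transmission start indication information and stops sending it in response to the transmission end indication information, while the receiving end starts collecting the original audio codebook sent by the transmitting end in response to the reception start indication information and stops collecting it in response to the reception end indication information; and obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end. The invention solves the problem of inaccurate audio transmission delay calculation in the prior art.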

Description

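Method and system for measuring audio transmission delay
This application claims priority to Chinese Patent Application No. 201310616487.1, entitled "Method and system for measuring audio transmission delay", filed with the Chinese Patent Office on November 27, 2013, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of communications, and in particular to a method and system for measuring audio transmission delay.
Background
Delay is a critical factor in voice communication. End-to-end delay is the delay of the whole process that speech goes through: capture, preprocessing, encoding, packing, network transmission, unpacking, and final playback. Excessive delay degrades the listener's subjective experience of a voice product, so it is necessary to measure and evaluate the delay of a voice system. Current delay measurement methods are either intrusive or non-intrusive.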
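Intrusive measurement reaches inside the voice system under test. Intrusive systems have the following characteristics:
First, because measurement data is usually carried in the data frames or data packets of the system under test, it inevitably passes through encoding and compression, encapsulation, decapsulation, and decoding, and may be lost or corrupted during compression and decompression;
Second, because the data format, encapsulation format, and encoding/compression and decoding algorithms of the system under test are not necessarily public, it is difficult for testers to design matching measurement methods and measurement signals.
In addition, some intrusive measurement methods require measurement tool software to run on the terminal of the system under test and perform the timing through that software. This may affect the normal operation of the terminal itself.
Most current non-intrusive measurement systems are implemented as delay measurement based on a single-ended request with two-way averaging.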
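The measurement method shown in FIG. 1 is based on single-ended capture with two-way transmission and averaging. Its main steps are: (1) an audio signal is played locally and captured by the local measuring device, and the capture timestamp is recorded as T1; (2) at the same time, the played audio signal is captured by the local side of the system under test and transmitted through the system under test to its remote side, where it is played out; (3) the sound played at the remote side is captured again by the remote side of the system under test and transmitted back over the intermediate network to the local side of the system under test, where it is played; (4) the signal played by the local side of the system under test is captured by the measuring device again, and the capture timestamp is recorded as T2. The difference between the timestamps of the two captured audio signals is computed, and (T2-T1) divided by 2 gives the delay value.
This scheme uses two-way transmission to obtain the timestamps of two captured signals and then takes their difference to estimate the one-way delay, but it has shortcomings:
First, during two-way transmission both ends have audio playback and audio capture devices at the same time. In this scenario echo (direct or indirect) is unavoidable. Echo, especially indirect echo, easily interferes with the delay calculation result and complicates the calculation, seriously affecting its accuracy.
Second, in the two-way averaging process above, the end-to-end delay is all the delay that speech experiences on a single communication link from initial capture to playback. Because the processing of the system under test is a black box and the upload and download links of most communication paths are not fully symmetric, the processing stages the speech passes through on one link are not necessarily equivalent to those on the other link between the test devices. The delay experienced by speech on a single communication path is therefore not necessarily the simple arithmetic mean of the delays of the two links.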
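A small worked example makes the asymmetry argument concrete. The sketch below is purely illustrative: the function names and the 80 ms / 200 ms link delays are made-up values, not measurements from the text.

```python
def round_trip_estimate(t1, t2):
    """One-way delay estimate of the prior-art scheme: (T2 - T1) / 2."""
    return (t2 - t1) / 2


def estimation_errors(uplink, downlink, t1=0.0):
    """Return the estimate and its error against each true one-way delay."""
    t2 = t1 + uplink + downlink   # timestamp of the second capture
    estimate = round_trip_estimate(t1, t2)
    return estimate, estimate - uplink, estimate - downlink
```

With an assumed uplink delay of 0.08 s and a downlink delay of 0.20 s, the halved round trip is about 0.14 s, which overestimates the uplink by roughly 0.06 s and underestimates the downlink by the same amount. The estimate is exact only when the two links happen to have equal delay.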
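No effective solution to the above problems has yet been proposed.
Summary of the Invention
Embodiments of the present invention provide a method and system for measuring audio transmission delay, to at least solve the technical problem of inaccurate audio transmission delay calculation in the prior art.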
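According to one aspect of the embodiments of the present invention, a method for measuring audio transmission delay is provided, including: performing a synchronization operation between a transmitting end and a receiving end on the transmission of an original audio codebook to be tested, to obtain transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information for the original audio codebook; the transmitting end starting to send the original audio codebook to be tested to the receiving end in response to the transmission start indication information and stopping sending the original audio codebook to the receiving end in response to the transmission end indication information, and the receiving end starting to collect the original audio codebook sent by the transmitting end in response to the reception start indication information and stopping collecting it in response to the reception end indication information; and obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end.
As an optional solution, in the method: the transmission start indication information includes a transmission start time, the transmission end indication information includes a transmission end time, the reception start indication information includes a reception start time, and the reception end indication information includes a reception end time; the transmitting end starting to send the original audio codebook to be tested to the receiving end in response to the transmission start indication information includes: the transmitting end sending the original audio codebook to the receiving end from the transmission start time; the transmitting end stopping sending the original audio codebook to the receiving end in response to the transmission end indication information includes: the transmitting end stopping sending the original audio codebook to the receiving end at the transmission end time; the receiving end starting to collect the original audio codebook sent by the transmitting end in response to the reception start indication information includes: the receiving end collecting the original audio codebook sent by the transmitting end from the reception start time; and the receiving end stopping collecting the original audio codebook sent by the transmitting end in response to the reception end indication information includes: the receiving end stopping collecting the original audio codebook sent by the transmitting end at the reception end time.
As an optional solution, in the method: the transmission start time is the same as the reception start time, and the transmission end time is the same as the reception end time; or the transmission start time is the same as the reception start time, and the difference between the transmission end time and the reception end time is less than a first predetermined threshold; or the difference between the transmission start time and the reception start time is less than a second predetermined threshold, and the transmission end time is the same as the reception end time; or the difference between the transmission start time and the reception start time is less than a third predetermined threshold, and the difference between the transmission end time and the reception end time is less than a fourth predetermined threshold.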
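As an optional solution, performing the synchronization operation between the transmitting end and the receiving end on the transmission of the original audio codebook to be tested further includes: exchanging information between the transmitting end and the receiving end so that the order in which the transmitting end sends multiple original audio codebooks is the same as the order in which the receiving end receives them.
As an optional solution, performing the synchronization operation between the transmitting end and the receiving end on the transmission of the original audio codebook to be tested includes: performing the synchronization operation on the transmission of the original audio codebook between the transmitting end and the receiving end through a first GPS synchronization control unit disposed at the transmitting end and a second GPS synchronization control unit disposed at the receiving end, wherein the first GPS synchronization control unit and the second GPS synchronization control unit each include a GPS device, the GPS device includes a GPS antenna and a GPS receiving module, the GPS antenna is configured to transmit at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time; and the GPS receiving module is configured to receive at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time.
As an optional solution, in the method: the transmission start indication information includes first instruction information used to indicate that the receiving end is ready to receive; the transmission end indication information includes second instruction information used to indicate that playback of the original audio codebook is complete; the reception start indication information includes third instruction information used to indicate that the receiving end starts receiving; and the reception end indication information includes a collection duration carried in the second instruction information; the transmitting end starting to send the original audio codebook to be tested to the receiving end in response to the transmission start indication information includes: the transmitting end starting to send the original audio codebook to the receiving end when it receives the first instruction information; the transmitting end stopping sending the original audio codebook to the receiving end in response to the transmission end indication information includes: the transmitting end stopping sending the original audio codebook to the receiving end when it receives the second instruction information; the receiving end starting to collect the original audio codebook sent by the transmitting end in response to the reception start indication information includes: the receiving end starting to collect the original audio codebook sent by the transmitting end when it receives the third instruction information; and the receiving end stopping collecting the original audio codebook sent by the transmitting end in response to the reception end indication information includes: the receiving end determining whether the time spent collecting the original audio codebook sent by the transmitting end exceeds the collection duration and, if so, stopping the collection.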
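As an optional solution, obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end includes computing

$$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$

where R_xy(τ) is the value of the cross-correlation function of one original audio codebook and the corresponding test audio codebook, t_s is the time at which the receiving end starts collecting the original audio codebook sent by the transmitting end, t_e is the time at which the receiving end stops collecting the original audio codebook sent by the transmitting end, t is the time information corresponding to each sampling point, x(t) is the energy value of the sampling point at time t in the original audio codebook, τ is the offset of the sampling point in the test audio codebook that is convolved with x(t), and y(t+τ) is the energy value of the sampling point at time t+τ in the test audio codebook; the value of τ corresponding to the largest cross-correlation function value represents the audio transmission delay.
As an optional solution, obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end further includes:

$$\mathrm{Delay}_i = \frac{1}{m}\sum_{k=1}^{m} \mathrm{TestValue}(k)$$

where TestValue(k) is the delay value corresponding to the maximum cross-correlation function value obtained for the original audio codebook i and the corresponding test audio codebook i collected in the k-th measurement of the original audio codebook i; the delay value is the time-domain value obtained by dividing the value of τ corresponding to the maximum cross-correlation function value of the k-th measurement by the sampling rate used by the receiving end in the k-th measurement; the sampling rate is the sampling rate in the header format information of the original audio codebook i; Delay_i is the average audio transmission delay of the original audio codebook i; and m is an integer greater than or equal to 1.
As an optional solution, obtaining the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end further includes:

$$\mathrm{Avg\_Delay} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Delay}_i$$

where Avg_Delay is the average audio transmission delay of n original audio codebooks, and n is an integer greater than or equal to 1.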
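According to another aspect of the embodiments of the present invention, a system for measuring audio transmission delay is further provided, including: a first synchronization unit located at the transmitting end and a second synchronization unit located at the receiving end, configured to perform a synchronization operation between the transmitting end and the receiving end on the transmission of the original audio codebook to be tested, to obtain transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information for the original audio codebook; a first response unit located at the transmitting end, configured to start sending the original audio codebook to be tested to the receiving end in response to the transmission start indication information; a second response unit located at the transmitting end, configured to stop sending the original audio codebook to the receiving end in response to the transmission end indication information; a third response unit located at the receiving end, configured to start collecting the original audio codebook sent by the transmitting end in response to the reception start indication information; a fourth response unit located at the receiving end, configured to stop collecting the original audio codebook sent by the transmitting end in response to the reception end indication information; and a calculating unit located at the receiving end, configured to calculate the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end.
As an optional solution, in the system: the first response unit includes a first response sub-module, configured to send the original audio codebook to the receiving end from the transmission start time, where the transmission start indication information includes the transmission start time; the second response unit includes a second response sub-module, configured to stop sending the original audio codebook to the receiving end at the transmission end time, where the transmission end indication information includes the transmission end time; the third response unit includes a third response sub-module, configured to collect the original audio codebook sent by the transmitting end from the reception start time, where the reception start indication information includes the reception start time; and the fourth response unit includes a fourth response sub-module, configured to stop collecting the original audio codebook sent by the transmitting end at the reception end time, where the reception end indication information includes the reception end time.
As an optional solution, in the system: the first synchronization unit includes a first synchronization module and the second synchronization unit includes a second synchronization module, where the first synchronization module and the second synchronization module are configured to perform the synchronization operation to obtain one of the following results: the transmission start time is the same as the reception start time, and the transmission end time is the same as the reception end time; or the transmission start time is the same as the reception start time, and the difference between the transmission end time and the reception end time is less than a first predetermined threshold; or the difference between the transmission start time and the reception start time is less than a second predetermined threshold, and the transmission end time is the same as the reception end time; or the difference between the transmission start time and the reception start time is less than a third predetermined threshold, and the difference between the transmission end time and the reception end time is less than a fourth predetermined threshold.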
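As an optional solution, in the system: the first synchronization unit includes a third synchronization module and the second synchronization unit includes a fourth synchronization module, where the third synchronization module and the fourth synchronization module are configured to exchange information between the transmitting end and the receiving end so that the order in which the transmitting end sends multiple original audio codebooks is the same as the order in which the receiving end receives them.
As an optional solution, in the system: the first synchronization unit includes a first GPS synchronization control unit and the second synchronization unit includes a second GPS synchronization control unit, where the first GPS synchronization control unit and the second GPS synchronization control unit are configured to perform the synchronization operation on the transmission of the original audio codebook between the transmitting end and the receiving end, and where the first GPS synchronization control unit and the second GPS synchronization control unit each include a GPS device, the GPS device includes a GPS antenna and a GPS receiving module, the GPS antenna is configured to transmit at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time; and the GPS receiving module is configured to receive at least one of the following: the transmission start time, the transmission end time, the reception start time, and the reception end time.
As an optional solution, in the system: the first response unit includes a sending sub-module, configured to start sending the original audio codebook to the receiving end when the transmitting end receives the first instruction information, where the first instruction information is used to indicate that the receiving end is ready to receive; the second response unit includes a termination sub-module, configured to stop sending the original audio codebook to the receiving end when the transmitting end receives the second instruction information, where the second instruction information is used to indicate that playback of the original audio codebook is complete; the third response unit includes a collection sub-module, configured to start collecting the original audio codebook sent by the transmitting end when the receiving end receives the third instruction information, where the third instruction information is used to indicate that the receiving end starts receiving; and the fourth response unit includes a determining sub-module, configured to determine at the receiving end whether the time spent collecting the original audio codebook sent by the transmitting end exceeds the collection duration and, if so, to stop the collection.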
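As an optional solution, the calculating unit includes a first calculating module, configured to calculate the audio transmission delay by using the following formula:

$$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$

where R_xy(τ) is the value of the cross-correlation function of one original audio codebook and the corresponding test audio codebook, t_s is the time at which the receiving end starts collecting the original audio codebook sent by the transmitting end, t_e is the time at which the receiving end stops collecting the original audio codebook sent by the transmitting end, t is the time information corresponding to each sampling point, x(t) is the energy value of the sampling point at time t in the original audio codebook, τ is the offset of the sampling point in the test audio codebook that is convolved with x(t), and y(t+τ) is the energy value of the sampling point at time t+τ in the test audio codebook; the value of τ corresponding to the largest cross-correlation function value represents the audio transmission delay.
As an optional solution, the calculating unit includes a second calculating module, configured to calculate the audio transmission delay by using the following formula:

$$\mathrm{Delay}_i = \frac{1}{m}\sum_{k=1}^{m} \mathrm{TestValue}(k)$$

where TestValue(k) is the delay value corresponding to the maximum cross-correlation function value obtained for the original audio codebook i and the corresponding test audio codebook i collected in the k-th measurement of the original audio codebook i; the delay value is the time-domain value obtained by dividing the value of τ corresponding to the maximum cross-correlation function value of the k-th measurement by the sampling rate used by the receiving end in the k-th measurement; the sampling rate is the sampling rate in the header format information of the original audio codebook i; Delay_i is the average audio transmission delay of the original audio codebook i; and m is an integer greater than or equal to 1.
As an optional solution, the calculating unit further includes a third calculating module, configured to calculate the audio transmission delay by using the following formula:

$$\mathrm{Avg\_Delay} = \frac{1}{n}\sum_{i=1}^{n} \mathrm{Delay}_i$$

where Avg_Delay is the average audio transmission delay of n original audio codebooks, and n is an integer greater than or equal to 1.
In the embodiments of the present invention, the transmitting end and the receiving end operate synchronously, which avoids the echo problem and the asymmetry of the two-way paths, thereby achieving the technical effect of calculating the transmission delay accurately and solving the technical problem of inaccurate audio transmission delay calculation in the prior art.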
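Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a schematic diagram of audio transmission delay measurement according to the prior art;
FIG. 2 is a flowchart of an optional audio transmission delay measurement method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an optional audio transmission delay measurement implementation according to an embodiment of the present invention;
FIG. 4 is a flowchart of another optional audio transmission delay measurement method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another optional audio transmission delay measurement implementation according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of yet another optional audio transmission delay measurement implementation according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of yet another optional audio transmission delay measurement implementation according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of yet another optional audio transmission delay measurement implementation according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of an optional audio transmission delay measurement apparatus according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of another optional audio transmission delay measurement apparatus according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of yet another optional audio transmission delay measurement apparatus according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of yet another optional audio transmission delay measurement apparatus according to an embodiment of the present invention; and
FIG. 13 is a schematic diagram of yet another optional audio transmission delay measurement apparatus according to an embodiment of the present invention.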
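Detailed Description
First, some of the nouns or terms that appear in the description of the embodiments of the present invention are subject to the following explanations:
To help those skilled in the art better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data used in this way may be interchanged where appropriate, so that the embodiments of the present invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have" and any variations of them are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Embodiment 1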
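According to an embodiment of the present invention, a method for measuring audio transmission delay is provided. As shown in FIG. 2, the method includes:
S202: perform a synchronization operation between the transmitting end and the receiving end on the transmission of the original audio codebook to be tested, to obtain transmission start indication information, transmission end indication information, reception start indication information, and reception end indication information for the original audio codebook;
Optionally, when the synchronization operation is performed on the transmission of the original audio codebook to be tested, indication information is obtained for controlling the start and end of sending and receiving the original audio codebook.
Optionally, in this embodiment the devices that perform the synchronization operation include but are not limited to: a GPS-based synchronization control device and a synchronization control device based on a signaling control server.
It should be noted that the synchronization operation is the process of negotiating the start and stop of audio playback at the transmitting end and the start and stop of audio collection at the receiving end, that is, controlling the transmitting end to start or stop playing the codebook and notifying the receiving end to start or stop audio collection.
For example, as shown in FIG. 3, the transmitting end is a local audio application end and the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. A synchronization operation is performed on the synchronization control units at both ends to obtain the transmission start indication information, the transmission end indication information, the reception start indication information, and the reception end indication information for the original audio codebook.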
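S204: the transmitting end starts sending the original audio codebook to be tested to the receiving end in response to the transmission start indication information and stops sending the original audio codebook to the receiving end in response to the transmission end indication information; the receiving end starts collecting the original audio codebook sent by the transmitting end in response to the reception start indication information and stops collecting it in response to the reception end indication information;
For example, as shown in FIG. 3, the transmitting end is a local audio application end and the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. When the local audio application end receives the transmission start indication information, it sends the original audio codebook to be tested to the receiving end; for example, the synchronization control unit controls the local audio application end to start playing the audio (for example, Audio play). When the local audio application end receives the transmission end indication information, it stops sending the original audio codebook to the receiving end; for example, the synchronization control unit controls the local audio application end to stop playing the audio. When the remote audio application end receives the reception start indication information, it starts collecting the original audio codebook sent by the local audio application end; for example, the synchronization control unit starts the collection of the audio played by the local audio application end (for example, Audio Capture). When the remote audio application end receives the reception end indication information, it stops collecting the original audio codebook sent by the local audio application end; for example, the synchronization control unit stops the collection of the audio played by the local audio application end.
S206: obtain the audio transmission delay from the test audio codebook collected by the receiving end and the original audio codebook pre-stored on the receiving end.
For example, as shown in FIG. 3, the transmitting end is a local audio application end and the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. The audio collected by the remote audio application end is compared with the pre-stored original audio to estimate the transmission delay of the audio.
With the embodiment provided in this application, the sending action at the transmitting end and the collection action at the receiving end are precisely synchronized, so that the original audio codebook used for the delay calculation is synchronized with the test audio codebook that has been transmitted with delay and collected.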
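As an optional solution, the transmission start indication information includes a transmission start time, the transmission end indication information includes a transmission end time, the reception start indication information includes a reception start time, and the reception end indication information includes a reception end time.
Optionally, the transmitting end starting to send the original audio codebook to be tested to the receiving end in response to the transmission start indication information includes: the transmitting end sending the original audio codebook to the receiving end from the transmission start time, where the transmission start time is, but is not limited to, the moment at which audio playback starts.
For example, as shown in FIG. 3, the transmitting end is a local audio application end and the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. When the local audio application end receives the transmission start indication information, it starts playing the original audio to the remote audio application end at the indicated transmission start time (for example, Audio play).
Optionally, the transmitting end stopping sending the original audio codebook to the receiving end in response to the transmission end indication information includes: the transmitting end stopping sending the original audio codebook to the receiving end at the transmission end time, where the transmission end time is, but is not limited to, the moment at which audio playback stops.
For example, as shown in FIG. 3, the transmitting end is a local audio application end and the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. When the local audio application end receives the transmission end indication information, it stops playing the original audio to the remote audio application end at the indicated transmission end time.
Optionally, the receiving end starting to collect the original audio codebook sent by the transmitting end in response to the reception start indication information includes: the receiving end collecting the original audio codebook sent by the transmitting end from the reception start time, where the reception start time is, but is not limited to, the moment at which audio collection starts.
For example, as shown in FIG. 3, the transmitting end is a local audio application end and the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. When the local audio application end receives the reception start indication information, collection of the original audio played by the local audio application end starts at the indicated reception start time.
The receiving end stopping collecting the original audio codebook sent by the transmitting end in response to the reception end indication information includes: the receiving end stopping collecting the original audio codebook sent by the transmitting end at the reception end time, where the reception end time is, but is not limited to, the moment at which audio collection stops.
For example, as shown in FIG. 3, the transmitting end is a local audio application end and the receiving end is a remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end through a transmission network. When the local audio application end receives the reception end indication information, collection of the original audio played by the local audio application end stops at the indicated reception end time.
With the embodiment provided in this application, indicating the start and end times to the transmitting end and the receiving end achieves precise synchronization of the two ends and improves the accuracy of the delay calculation.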
作为一种可选的方案,在本实施例中发送端与接收端的同步操作包括四种可选的判断方式:
作为一种可选的判断方式,发送开始时刻与接收开始时刻相同,以及发送结束时刻与接收结束时刻相同。
可选地,发送端及接收端的开始与停止的时刻分别相同,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻也为T1,发送结束时刻为T2,接收结束时刻也为T2
作为另一种可选的判断方式,发送开始时刻与接收开始时刻相同,以及发送结束时刻与接收结束时刻之间的差值小于第一预定阈值;
可选地,发送端与接收端的开始时刻相同,发送端与接收端的结束时刻之间的差值小于第一预定阈值,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻也为T1,发送结束时刻为T2,接收结束时刻为T3,其中,第一预定阈值为A1,T3-T2<A1。由上述也可判断得出,发送端与接收端实现了同步操作。
作为又一种可选的判断方式,发送开始时刻与接收开始时刻之间的差值小于第二预定阈值,以及发送结束时刻与接收结束时刻相同;
可选地,发送端与接收端的开始时刻之间的差值小于第二预定阈值,发送端与接收端的结束时刻相同,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻为T4,发送结束时刻为T2,接收结束时刻也为T2,其中,第二预定阈值为A2,T4-T1<A2。由上述也可判断得出,发送端与接收端实现了同步操作。
作为又一种可选的判断方式,发送开始时刻与接收开始时刻之间的差值小于第三预定阈值,以及发送结束时刻与接收结束时刻之间的差值小于第四预定阈值。
可选地,发送端与接收端的开始时刻之间的差值小于第三预定阈值,发送端与接收端的结束时刻之间的差值小于第四预定阈值,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻为T5,发送结束时刻为T2,接收结束时刻为T6,T5-T1<A3,T6-T2<A4,由上述也可判断得出,发送端与接收端实现了同步操作。
通过本申请提供的实施例,发送端与接收端的同步操作的判断方式不仅限于时刻完全相同,在两时刻之差小于允许的数值范围内的情况下,也可判断为实现同步操作。
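The four judgment modes above reduce to one check with two tolerances. The following is a minimal sketch, not the applicant's implementation: the function name `is_synchronized` and its parameters are hypothetical, a tolerance of 0 stands for "the times are identical", and the absolute difference is used where the text writes T3-T2<A1 (an assumption, so that an early receiver is treated the same as a late one).

```python
def is_synchronized(send_start, recv_start, send_end, recv_end,
                    start_tolerance=0.0, end_tolerance=0.0):
    """Return True when the start and end times agree within the tolerances.

    A tolerance of 0.0 requires identical times (mode 1 behaviour);
    a positive tolerance requires |difference| < tolerance.
    """
    def close(a, b, tolerance):
        if tolerance == 0.0:
            return a == b
        return abs(a - b) < tolerance

    return (close(send_start, recv_start, start_tolerance)
            and close(send_end, recv_end, end_tolerance))
```

Mode 1 corresponds to both tolerances left at 0, modes 2 and 3 to exactly one positive tolerance (A1 or A2), and mode 4 to both positive (A3 and A4).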
作为一种可选的方案,在发送端和接收端之间对待测的原始音频码本的传输进行同步操作还包括:
S402,在发送端和接收端之间进行信息交互,使得发送端发送多个原始音频码本的顺序与接收端接收多个原始音频码本的顺序相同。
可选地,在本实施例中原始音频码本包括但不限于:一个或多个。其中,当原始音频码本为多个时,发送端发送的顺序与接收端接收的顺序相同。
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。本地音频应用端所播放的音频顺序为S1,S2,S3,远端音频应用端采集音频的顺序也为S1,S2,S3,收发顺序相同,更便于使本地音频应用端与远端音频应用端实现精确同步,进而准确计算传输延时。
作为一种可选的方案,在发送端和接收端之间对待测的原始音频码本的传输进行同步操作包括:通过设置在发送端的第一GPS同步控制单元和设置在接收端的第二GPS同步控制单元在发送端和接收端之间对原始音频码本的传输进行同步操作。
可选地,在本实施例中第一GPS同步控制单元和第二GPS同步控制单元均包括:GPS设备,GPS设备包括GPS天线和GPS接收模块,GPS天线用于传输以下至少之一:发送开始时刻、发送结束时刻、接收开始时刻、接收结束时刻;GPS接收模块用于接收以下至少之一:发送开始时刻、发送结束时刻、接收开始时刻、接收结束时刻。
例如,结合图5所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输,两端的同步控制单元为GPS同步控制单元。本地音频应用端通过GPS同步控制单元控制开始/停止播放码本(例如,Audio play),远端音频应用端通过GPS同步控制单元控制开始/停止对音频的采集(例如,Audio Capture)。
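To elaborate, the GPS device consists of an antenna and a GPS receiving module. Its hardware circuitry and processing software decode and process the received signal and extract two output signals. The first is a pulse signal with a 1 s interval (1 pps) whose leading edge is synchronized to the international standard Greenwich time with an error of no more than 1 µs. The second is the international standard year-month-day hour-minute-second information corresponding to that leading edge. The first signal is delivered through a GPS SDK callback and notifies the synchronization control unit to read the GPS time information; the second signal is delivered through a GPS SDK callback and provides the precise time used to decide whether to start playing and capturing the corresponding audio.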
结合图6所示为基于GPS同步控制装置的具体同步处理流程,其中,本地音频应用端及远端音频应用端通过测试App实现音频的播放与采集,具体步骤如下:
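S1: Both the local audio application end and the remote audio application end run the voice system under test and initialize the test information for each codebook, including the codebook number, the duration of each codebook, the interval between codebooks, and the test start time of each codebook.
S2 (remote sending): The test initiator signals the GPS synchronization control unit according to the codebook number and reads the time provided by GPS. Once the time provided by the GPS device reaches the test start time of that codebook, the GPS synchronization control unit commands the local test App to start playing the audio codebook, which is processed by the system under test and then sent out.
S3 (remote receiving): Once the GPS synchronization control unit finds, through the GPS SDK interface, that the time provided by the GPS device has reached the test time, it commands the test App to start capturing the output of the audio system under test at the remote end. When recording the audio file, the receiving end uses the sampling rate of the audio codebook file that the codebook number received from the sending end maps to in the local codebook index table. When the capture duration reaches the agreed time, the receiving end stops capturing, and the captured test audio codebook and the original audio codebook are fed to the delay measurement module.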
通过本申请提供的实施例,基于GPS实现远距离或近距离的收发同步,并通过单向采集避免了路径不对称而影响延时准确性的问题,提高了延时测量的准确性。
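The per-codebook test information initialized in step S1 (number, duration, interval, start time) can be pictured as a small schedule that both ends compute identically and then compare against GPS time. This is an illustrative sketch only: `CodebookTest`, `build_schedule`, and `is_due` are hypothetical names, and times are plain seconds rather than values read from a real GPS SDK.

```python
from dataclasses import dataclass


@dataclass
class CodebookTest:
    number: int          # codebook number
    duration_s: float    # how long the codebook plays
    start_time_s: float  # agreed test start time (GPS time base)


def build_schedule(first_start_s, durations_s, gap_s):
    """Assign each codebook a start time: previous start + duration + gap."""
    schedule = []
    start = first_start_s
    for number, duration in enumerate(durations_s, start=1):
        schedule.append(CodebookTest(number, duration, start))
        start += duration + gap_s
    return schedule


def is_due(test, gps_now_s):
    """True once the GPS-provided time has reached the codebook's start time."""
    return gps_now_s >= test.start_time_s
```

Because both ends derive the same schedule and read the same time base, the sender can start playback and the receiver can start capture for a given codebook without exchanging any message at test time.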
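As an optional solution, in the above method for measuring audio transmission delay: the send start indication information includes first instruction information indicating that the receiving end is ready to receive; the send end indication information includes second instruction information indicating that playback of the original audio codebook is complete; the receive start indication information includes third instruction information instructing the receiving end to start receiving; and the receive end indication information includes the capture duration carried in the second instruction information.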
可选地,在本实施例中指令信息也可称为信令信息,其中,上述指令信息是基于信令控制服务器(SyncServer)实现传输的。可选的,基于信令控制服务器可以实现近距离的收发同步。
可选地,发送端响应发送开始指示信息开始向接收端发送待测的原始音频码本包括:发送端接收到第一指令信息时开始向接收端发送原始音频码本;
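For example, with reference to FIG. 7, the sending end is the local audio application end, the receiving end is the remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end over the transmission network. On receiving the send start indication information, that is, the first instruction information, which indicates that the remote audio application end is ready to capture audio, the local audio application end starts sending the original audio codebook.
Optionally, the sending end stopping the sending of the original audio codebook to the receiving end in response to the send end indication information includes: the sending end stops sending the original audio codebook to the receiving end on receiving the second instruction information.
For example, with reference to FIG. 7, the sending end is the local audio application end, the receiving end is the remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end over the transmission network. On receiving the send end indication information, that is, the second instruction information, which indicates that playback of the original audio is complete, the local audio application end stops sending the original audio codebook.
Optionally, the receiving end starting to capture the original audio codebook sent by the sending end in response to the receive start indication information includes: the receiving end starts capturing the original audio codebook sent by the sending end on receiving the third instruction information.
For example, with reference to FIG. 7, the sending end is the local audio application end, the receiving end is the remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end over the transmission network. On receiving the receive start indication information, that is, the third instruction information, the remote audio application end starts capturing the original audio as instructed.
Optionally, the receiving end stopping the capture of the original audio codebook sent by the sending end in response to the receive end indication information includes: the receiving end determines whether the time spent capturing the original audio codebook sent by the sending end exceeds the capture duration and, if so, stops capturing.
For example, with reference to FIG. 7, the sending end is the local audio application end, the receiving end is the remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end over the transmission network. The remote audio application end receives the receive end indication information, which includes the capture duration Tt carried in the second instruction information; once its capture time exceeds Tt, it stops capturing.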
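To elaborate, FIG. 8 shows the synchronization flow of the instruction-based synchronization control apparatus, in which the local audio application end and the remote audio application end play and capture audio through a test App. The steps are as follows:
S1: Both the local audio application end and the remote audio application end run the voice system under test, and each starts a synchronization test control client that logs in to SyncServer. After both ends have logged in, SyncServer creates a test session in which the two ends are identified as end A and end B.
S2: Either end (for example, end A) initiates an audio test session request, SyncRequest, which carries the codebook number. SyncServer relays the request to the other end of the test session (end B).
S3: On receiving SyncRequest, end B initializes and opens the audio capture device, creates the degraded codebook file name and the header information (audio sampling rate, number of channels, sample bit depth, and so on) according to the codebook number so that it can record the audio output of the system under test, and returns a Sync Ok acknowledgement, which SyncServer relays to the session initiator (end A).
S4: On receiving the peer-ready signaling relayed by SyncServer, the session initiator (end A) sends an Ok Begin Play signaling message to end B and immediately starts playing the reference codebook signal. The played reference codebook audio is captured at the input of the audio system under test, passes through its full pipeline (front-end processing, encoding, packetization, network transmission, depacketization, decoding, post-processing, playback), is played out at the other end, and is captured there by the test control client.
S5: On receiving Ok Begin Play, end B immediately starts internal recording of the output of the audio system under test and sends an Is Inner Recording signaling message back to the initiator (end A).
S6: After it finishes playing the reference audio codebook, the session initiator (end A) sends a Play Ended signaling message, which carries the duration of the test codebook, to end B. End B then checks whether the capture duration has been reached; when it has, end B stops capturing the output of the audio system under test and outputs the final recorded codebook signal.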
通过本申请提供的实施例,基于指令的同步控制实现对发送端与接收端的同步操作,并采用了单向采集的方法,避免了影响延时准确性的路径不对称及回声的问题,提高了延时测量的准确性。
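The signaling flow above can be pictured as a small state machine at end B. The sketch below is a simplification under stated assumptions: the message names (SyncRequest, Sync Ok, Ok Begin Play, Is Inner Recording, Play Ended) come from the text, while the `Receiver` class, `run_session`, the in-process message passing (no SyncServer relay), and the simulated capture clock are all hypothetical.

```python
class Receiver:
    """End B: prepares capture, records, and stops after the agreed duration."""

    def __init__(self):
        self.state = "idle"
        self.codebook_id = None
        self.captured_s = 0.0
        self.target_s = None  # capture duration carried by Play Ended

    def handle(self, message, payload=None):
        if message == "SyncRequest" and self.state == "idle":
            self.codebook_id = payload       # codebook number from the request
            self.state = "ready"
            return "Sync Ok"
        if message == "Ok Begin Play" and self.state == "ready":
            self.state = "recording"
            return "Is Inner Recording"
        if message == "Play Ended" and self.state == "recording":
            self.target_s = payload          # duration of the test codebook
            self._stop_if_done()
            return None
        raise ValueError(f"unexpected {message!r} in state {self.state!r}")

    def capture(self, seconds):
        """Simulate recording `seconds` of output from the system under test."""
        if self.state == "recording":
            self.captured_s += seconds
            self._stop_if_done()

    def _stop_if_done(self):
        if self.target_s is not None and self.captured_s >= self.target_s:
            self.state = "stopped"


def run_session(codebook_id, duration_s, step_s=0.5):
    """Drive end B through S2..S6 as end A would; return final state and replies."""
    b = Receiver()
    replies = [b.handle("SyncRequest", codebook_id), b.handle("Ok Begin Play")]
    played = 0.0
    while played < duration_s:               # A plays while B records
        b.capture(step_s)
        played += step_s
    b.handle("Play Ended", duration_s)
    return b.state, replies
```

Rejecting out-of-order messages with an exception keeps a lost or reordered signaling message from silently producing a misaligned recording.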
作为一种可选的方案,根据接收端采集得到的测试音频码本以及接收端上预存的原始音频码本得到音频传输延时包括:
$$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$
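where Rxy(τ) is the cross-correlation function value of one original audio codebook and the corresponding test audio codebook; ts is the time at which the receiving end starts capturing the original audio codebook sent by the sending end; te is the time at which the receiving end stops capturing it; t is the time information of each sampling point; x(t) is the energy value of the sampling point at time t in the original audio codebook; τ is the offset of the sampling point in the test audio codebook that is correlated with x(t); y(t+τ) is the energy value of the sampling point at time t+τ in the test audio codebook; and the value of τ at which the cross-correlation function is largest represents the audio transmission delay.
The delay value is estimated by finding the maximum of the cross-correlation function Rxy(τ) between the original audio codebook and the captured test audio codebook, taking the corresponding index τ, and dividing it by the sampling rate of the reference audio codebook.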
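The estimate described above can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the applicant's code: `cross_correlation` and `estimate_delay` are hypothetical names, samples outside the captured range are skipped, and only non-negative lags up to `max_lag` are searched because the test signal cannot arrive before it is sent.

```python
def cross_correlation(x, y, lag):
    """R_xy(lag): sum over t of x[t] * y[t + lag], skipping out-of-range samples."""
    total = 0.0
    for t, xt in enumerate(x):
        if 0 <= t + lag < len(y):
            total += xt * y[t + lag]
    return total


def estimate_delay(x, y, sample_rate_hz, max_lag):
    """Return (best_lag_in_samples, delay_in_seconds) for reference x and capture y."""
    best_lag = max(range(max_lag + 1),
                   key=lambda lag: cross_correlation(x, y, lag))
    return best_lag, best_lag / sample_rate_hz
```

For example, with `x = [0, 1, 3, -2, 1, 0]` and a capture `y` that is `x` preceded by three zero samples, `estimate_delay(x, y, 1000, 5)` picks a lag of 3 samples, that is, 3 ms at a 1 kHz sampling rate. A direct search costs O(len(x) × max_lag); for long recordings an FFT-based correlation would be the usual replacement.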
可选地,在本实施例中延时计算采用的是求解音频信号互相关性的方法来求得音频延时,求解音频延时分为音频整体粗略延时Delay_crude+音频段内延时Delay_internal。整体粗略延时Delay_crude是对参考码本与经同步控制单元录制得到的音频输出码本整体最大互相关所得到的延时值,Delay_internal音频子段延时是在求得整体粗略延时的基础上,对码本中的音频信号进行音频子段划分和对齐,再求解参考码本中各音频子段对应于经同步控制单元录制得到的音频输出码本中音频子段的延时,最终求解的延时值为音频整体粗略延时Delay_crude+音频段内延时Delay_internal。
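Optionally, the cross-correlation function values can also be normalized with the following formula to obtain the maximum normalized cross-correlation coefficient ρxy(τ) and the corresponding lag index τ: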
Figure PCTCN2014092198-appb-000006
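The formula itself is only available as a figure here, so the sketch below uses the textbook definition of the normalized cross-correlation coefficient, which may differ in detail from the applicant's: the raw correlation divided by the square root of the product of the two signal energies over the overlapping samples. The function name `normalized_xcorr` is hypothetical.

```python
import math


def normalized_xcorr(x, y, lag):
    """Textbook normalized cross-correlation of x[t] and y[t + lag] over the overlap."""
    pairs = [(xt, y[t + lag]) for t, xt in enumerate(x) if 0 <= t + lag < len(y)]
    numerator = sum(a * b for a, b in pairs)
    denominator = math.sqrt(sum(a * a for a, _ in pairs)
                            * sum(b * b for _, b in pairs))
    return numerator / denominator if denominator else 0.0
```

Normalization bounds the coefficient to [-1, 1], so the peak is not biased toward loud passages and a gain change in the system under test does not move it.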
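For delay evaluation in scenarios where the played codebook has a high sampling rate (44.1 kHz or above, for example 48 kHz or 96 kHz), the amount of data in a codebook file can be too large to process conveniently. In that case, first compute the audio envelope of the codebook audio file over small windows of T ms, then find the maximum cross-correlation value between the envelopes and obtain the corresponding delay value t. The steps are as follows: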
S1,以Tms对语音/音频信号加窗;
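Optionally, in this embodiment the applied window includes at least one of the following: Hamming window, Hanning window, triangular window, Bartlett window, Kaiser window, and so on.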
例如,以窗函数为矩形窗为例,矩形窗函数如下:
$$w(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}$$
第k帧加窗语音信号为:Xk(n)=w(n)*x(k*N+n)。其中第k帧信号Xk(n)的能量均值用E(k)表示:
$$E(k) = \frac{1}{N} \sum_{n=0}^{N-1} X_k^2(n)$$
S2,每Tms帧求取该帧的包络信息值,包络信息就是对语音能量信号开方并归一化之后取对数,是语音短时能量变化的一种标识,第k帧语音信号的包络用Env(k)表示:
Figure PCTCN2014092198-appb-000009
S3,求取播放码本信号与录制得到的被测系统的降级信号间的包络的最大的互相关函数值及对应的时间τ;当高音质音频测量时,上述互相关函数或相关系数中的x(t)或y(t)对应置换成参考码本与测试码本加窗后求得的包络序列值,求得对应的延时样本位置,根据采样频率换算成时间即得延时值。
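Steps S1 and S2 can be sketched as follows. The description gives the recipe (frame the signal with a T ms rectangular window, take the mean energy, take the square root, normalize, take the logarithm) but the exact formula is only available as a figure, so the details here are assumptions: `frame_envelope` is a hypothetical name, normalization divides by the largest per-frame RMS value, the logarithm is base 10, and `floor` guards against the logarithm of zero for silent frames.

```python
import math


def frame_envelope(samples, sample_rate_hz, window_ms, floor=1e-12):
    """Per-frame log envelope: log10 of RMS energy normalized by the peak RMS."""
    n = max(1, int(sample_rate_hz * window_ms / 1000))   # samples per frame
    frames = [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]
    rms = [math.sqrt(sum(s * s for s in frame) / n) for frame in frames]
    peak = max(rms, default=0.0) or 1.0                  # avoid dividing by zero
    return [math.log10(max(r / peak, floor)) for r in rms]
```

The envelope sequences of the reference and test codebooks would then take the place of x(t) and y(t) in the correlation of step S3, with each lag step worth `window_ms` milliseconds. At 48 kHz and a 10 ms window this shrinks the sequences by a factor of 480, at the cost of a delay resolution of one window.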
作为一种可选的方案,根据接收端采集得到的测试音频码本以及接收端上预存的原始音频码本得到音频传输延时包括:
$$Delay_i = \frac{1}{m} \sum_{k=1}^{m} TestValue(k)$$
其中,TestValue(k)为对原始音频码本i与原始音频码本i第k次测量得到的对应的测试音频码本i求解得到的最大的互相关函数值所对应的延时值,延时值为第k次测量得到的最大的互相关函数值对应的τ取值除以第k次测量接收端采用的采样率信息所得到的时域值,采样率信息为原始音频码本i的头格式信息中的采样率,Delayi为原始音频码本i的平均音频传输延时,m为大于等于1的整数。
作为一种可选的方案,根据接收端采集得到的测试音频码本以及接收端上预存的原始音频码本得到音频传输延时,还包括音频系统的整体平均延时:
$$Avg\_Delay = \frac{1}{n} \sum_{i=1}^{n} Delay_i$$
其中,Avg_Delay为n个原始音频码本的平均音频传输延时,n为大于等于1的整数。
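The two averages defined above can be illustrated with a short sketch: the delay of one codebook is the mean of its m measurements, and the system-wide figure is the mean of the n per-codebook delays. The function names are hypothetical and the numbers in the usage note are made up for illustration.

```python
def codebook_delay(test_values):
    """Delay_i: mean of the m measured delay values for one codebook."""
    return sum(test_values) / len(test_values)


def system_average_delay(per_codebook_values):
    """Avg_Delay: mean of Delay_i over the n codebooks."""
    delays = [codebook_delay(values) for values in per_codebook_values]
    return sum(delays) / len(delays)
```

With measurements in milliseconds, `system_average_delay([[100, 120], [200]])` averages the per-codebook delays 110 and 200 to give 155. Note that each codebook carries equal weight regardless of how many times it was measured.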
通过本申请提供的实施例,由互相关函数计算采样点的能量值,进而实现了对音频传输延时的精确计算。
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本发明并不受所描述的动作顺序的限制,因为依据本发明,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本发明所必须的。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到根据上述实施例的方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,或者网络设备等)执行本发明各个实施例所述的方法。
实施例2
根据本发明实施例,还提供了一种用于实施上述音频传输延时的测量的系统,如图9所示,该装置包括:
1)位于发送端的第一同步单元902和位于接收端的第二同步单元903,用于在发送端和接收端之间对待测的原始音频码本的传输进行同步操作,得到原始音频码本的发送开始指示信息、发送结束指示信息、接收开始指示信息、接收结束指示信息;
可选地,在对待测的原始音频码本的传输进行同步操作时,会得到用于控制原始音频码本发送及接收的开始与结束的指示信息。
可选地,在本实施例中同步操作的装置包括但不限于:GPS的同步控制装置、信令控制服务器的同步控制装置。
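It should be noted that the synchronization operation is the process of negotiating the start and stop of audio playback at the sending end and the start and stop of audio capture at the receiving end; that is, it controls the sending end to start or stop playing the codebook and notifies the receiving end to start or stop audio capture.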
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。对两端的同步控制单元进行同步操作,得到对原始音频码本的发送开始指示信息、发送结束指示信息、接收开始指示信息、接收结束指示信息。
2)位于发送端的第一响应单元904,用于响应发送开始指示信息开始向接收端发送待测的原始音频码本;
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。当本地音频应用端接收到发送开始指示信息时,则向接收端发送待测的原始音频码本,例如,同步控制单元控制本地音频应用端开始播放音频(例如,Audio play)。
3)位于发送端的第二响应单元906,用于响应发送结束指示信息停止向接收端发送原始音频码本;
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。当本地音频应用端接收到发送结束指示信息,则停止向接收端发送原始音频码本,例如,同步控制单元控制本地音频应用端停止播放音频。
4)位于接收端的第三响应单元908,用于响应接收开始指示信息开始对发送端发送的原始音频码本进行采集;
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。当远端音频应用端接收到开始指示信息,则开始对本地音频应用端发送的原始音频码本进行采集,例如,同步控制单元控制开始采集本地音频应用端所播放的音频(例如,Audio Capture)。
5)位于接收端的第四响应单元910,用于响应接收结束指示信息停止对发送端发送的原始音频码本进行采集;
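For example, with reference to FIG. 3, the sending end is the local audio application end, the receiving end is the remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end over the transmission network. On receiving the receive end indication information, the remote audio application end stops capturing the original audio codebook sent by the local audio application end; for example, the synchronization control unit stops the capture of the audio played by the local audio application end.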
6)位于接收端的计算单元912,用于根据接收端采集得到的测试音频码本以及接收端上预存的原始音频码本计算音频传输延时。
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。根据远端音频应用端所采集到的音频及预存的原始音频,进行比较估算,得到该音频的传输延时。
通过本申请提供的实施例,使发送端音频的发送动作与接收端音频的采集动作精确同步,以使送入延时计算的音频原始音频码本与经过延迟传输过来的采集到的音频码本同步。
作为一种可选的方案,如图10所示,该系统还包括:
1)第一响应单元904包括:第一响应子模块1002,用于从发送开始时刻开始向接收端发送原始音频码本,其中,发送开始指示信息包括发送开始时刻;
可选地,发送开始时刻为但不限于:音频开始播放的时刻。
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。本地音频应用端接收到发送开始指示信息,则在所指示的发送开始时刻,向远端音频应用端开始播放原始音频(例如,Audio play)。
2)第二响应单元906包括:第二响应子模块1004,用于在发送结束时刻停止向接收端发送原始音频码本,其中,发送结束指示信息包括发送结束时刻;
可选地,发送结束时刻为但不限于:音频停止播放的时刻。
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。本地音频应用端接收到发送结束指示信息,则在所指示的发送结束时刻,向远端音频应用端停止播放原始音频。
3)第三响应单元908包括:第三响应子模块1006,用于从接收开始时刻开始对发送端发送的原始音频码本进行采集,其中,接收开始指示信息包括接收开始时刻;
可选地,接收开始时刻为但不限于:开始对音频采集的时刻。
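For example, with reference to FIG. 3, the sending end is the local audio application end, the receiving end is the remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end over the transmission network. On receiving the receive start indication information, the remote audio application end starts, at the indicated receive start time, to capture the original audio played by the local audio application end.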
4)第四响应单元910包括:第四响应子模块1008,用于在接收结束时刻停止对发送端发送的原始音频码本进行采集,其中,接收结束指示信息包括接收结束时刻。
可选地,其中,接收结束时刻为但不限于:停止对音频采集的时刻。
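For example, with reference to FIG. 3, the sending end is the local audio application end, the receiving end is the remote audio application end, and the local audio application end transmits the original audio codebook to the remote audio application end over the transmission network. On receiving the receive end indication information, the remote audio application end stops, at the indicated receive end time, capturing the original audio played by the local audio application end.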
通过本申请提供的实施例,通过对发送端与接收端的开始时刻与结束时刻的指示,实现两端的精确同步,提高了延时计算的准确性。
作为一种可选的方案,如图11所示,上述系统还包括:第一同步单元902包括:第一同步模块1102,第二同步单元903包括:第二同步模块1104,其中,第一同步模块1102和第二同步模块1104用于执行同步操作,以得到以下结果之一:
可选地,发送端及接收端的开始与停止的时刻分别相同,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻也为T1,发送结束时刻为T2,接收结束时刻也为T2
可选地,发送端与接收端的开始时刻相同,发送端与接收端的结束时刻之间的差值小于第一预定阈值,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻也为T1,发送结束时刻为T2,接收结束时刻为T3,其中,第一预定阈值为A1,T3-T2<A1。由上述也可判断得出,发送端与接收端实现了同步操作。
可选地,发送端与接收端的开始时刻之间的差值小于第二预定阈值,发送端与接收端的结束时刻相同,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻为T4,发送结束时刻为T2,接收结束时刻也为T2,其中,第二预定阈值为A2,T4-T1<A2。由上述也可判断得出,发送端与接收端实现了同步操作。
可选地,发送端与接收端的开始时刻之间的差值小于第三预定阈值,发送端与接收端的结束时刻之间的差值小于第四预定阈值,进而实现对音频码本的同步操作。例如,发送开始时刻为T1,接收开始时刻为T5,发送结束时刻为T2,接收结束时刻为T6,T5-T1<A3,T6-T2<A4,由上述也可判断得出,发送端与接收端实现了同步操作。
通过本申请提供的实施例,发送端与接收端的同步操作的判断方式不仅限于时刻完全相同,在两时刻之差小于允许的数值范围内的情况下,也可判断为实现同步操作。
作为一种可选的方案,如图11所示,第一同步单元902还包括:第三同步模块1106,第二同步单元903包括:第四同步模块1108,其中,第三同步模块1106和第四同步模块1108,在发送端和接收端之间进行信息交互,使得发送端发送多个原始音频码本的顺序与接收端接收多个原始音频码本的顺序相同。
可选地,在本实施例中原始音频码本包括但不限于:一个或多个。其中,当原始音频码本为多个时,发送端发送的顺序与接收端接收的顺序相同。
例如,结合图3所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。本地音频应用端所播放的音频顺序为S1,S2,S3,远端音频应用端采集音频的顺序也为S1,S2,S3,收发顺序相同,更便于使本地音频应用端与远端音频应用端实现精确同步,进而准确计算传输延时。
作为一种可选的方案,如图12所示,第一同步单元902还包括:第一GPS同步控制单元1202,第二同步单元903包括:第二GPS同步控制单元1204,其中,第一GPS同步控制单元1202和第二GPS同步控制单元1204用于在发送端和接收端之间对原始音频码本的传输进行同步操作。
可选地,在本实施例中第一GPS同步控制单元和第二GPS同步控制单元均包括:GPS设备,GPS设备包括GPS天线和GPS接收模块,GPS天线用于传输以下至少之一:发送开始时刻、发送结束时刻、接收开始时刻、接收结束时刻;GPS接收模块用于接收以下至少之一:发送开始时刻、发送结束时刻、接收开始时刻、接收结束时刻。
例如,结合图5所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输,两端的同步控制单元为GPS同步控制单元。本地音频应用端通过GPS同步控制单元控制开始/停止播放码本(例如,Audio play),远端音频应用端通过GPS同步控制单元控制开始/停止对音频的采集(例如,Audio Capture)。
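To elaborate, the GPS device consists of an antenna and a GPS receiving module. Its hardware circuitry and processing software decode and process the received signal and extract two output signals. The first is a pulse signal with a 1 s interval (1 pps) whose leading edge is synchronized to the international standard Greenwich time with an error of no more than 1 µs. The second is the international standard year-month-day hour-minute-second information corresponding to that leading edge. The first signal is delivered through a GPS SDK callback and notifies the synchronization control unit to read the GPS time information; the second signal is delivered through a GPS SDK callback and provides the precise time used to decide whether to start playing and capturing the corresponding audio.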
结合图6所示为基于GPS同步控制装置的具体同步处理流程,其中,本地音频应用端及远端音频应用端通过测试App实现音频的播放与采集,具体步骤如下:
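S1: Both the local audio application end and the remote audio application end run the voice system under test and initialize the test information for each codebook, including the codebook number, the duration of each codebook, the interval between codebooks, and the test start time of each codebook. Here a codebook is a voice/audio file with audio header format information (sampling frequency, number of channels, sample bit depth, and so on); the file may be in wav, mp3, wma, or another format that carries an audio header.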
S2,远程发送:测试发起端根据码本编号,向GPS同步控制单元发信号,读取GPS提供的时间,若读取到GPS设备提供的时间到了码本对应开始测试时间后,GPS同步控制单元向本地测试App发命令开始播放音频码本经由被测系统的处理过程后发送出去;
S3,远程接收:GPS同步控制单元根据GPS SDK接口查询到GPS设备提供的时间到测试时间以后,向测试App发送命令,打开远端采集被测音频系统的输出,当采集持续时间达到预先约定的时间后,接收端停止采集;将所采集到的测试音频码本与原始音频码本送入延时测量模块。
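With the embodiments provided in this application, GPS-based control achieves send/receive synchronization over long or short distances. One-way capture avoids the loss of delay accuracy caused by asymmetric upload and download paths, and also keeps echo from interfering with the delay calculation, which improves the accuracy of the delay measurement.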
作为一种可选的方案,如图13所示,该系统还包括:
1)第一响应单元904包括:发送子模块1302,用于在接收到第一指令信息时开始向接收端发送原始音频码本,其中,第一指令信息用于指示接收端准备好接收;
可选地,在本实施例中指令信息也可称为信令信息,其中,上述指令信息是基于信令控制服务器(SyncServer)实现传输的。可选的,基于信令控制服务器可以实现近距离的收发同步。
例如,结合图7所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。本地音频应用端接收到发送开始指示信息,即第一指令信息时,则根据所收到的第一指令信息,指示远端音频应用端准备好采集音频。
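2) The second response unit 906 includes a termination submodule 1304, configured to stop sending the original audio codebook to the receiving end on receiving the second instruction information, where the second instruction information indicates that playback of the original audio codebook is complete;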
例如,结合图7所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。本地音频应用端接收到发送结束指示信息,即第二指令信息时,则根据所收到的第二指令信息,指示远端音频应用端已完成播放原始音频。
3)第三响应单元908包括:采集子模块1306,用于在接收到第三指令信息时开始对发送端发送的原始音频码本进行采集,其中,第三指令信息用于指示接收端开始接收;
例如,结合图7所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。远端音频应用端接收到接收开始指示信息,即第三指令信息时,则根据所收到的第三指令信息,指示远端音频应用端开始采集原始音频。
4)第四响应单元910包括:判断子模块1308,用于判断对发送端发送的原始音频码本进行采集的时长是否超过采集时长,若超过,则停止对发送端发送的原始音频码本进行采集。
例如,结合图7所示,发送端为本地音频应用端,接收端为远端音频应用端,本地音频应用端通过传输网络实现对远端音频应用端的原始音频码本的传输。远端音频应用端接收到接收结束指示信息,其中,接收结束指示信息包括第二指令信息中所携带的采集时长Tt
进一步说明,结合图8所示,上述基于指令控制的同步控制装置的具体同步流程,其中,本地音频应用端及远端音频应用端通过测试App实现音频的播放与采集,具体步骤如下:
S1,本地音频应用端到远端音频应用端都运行待测语音系统,对应都开启同步测试控制客户端都成功登录到SyncServer,两端都登录成功后由SyncServer创建一个测试会话,会话中的两端分别用A端和B端标识。
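S2: Either end (for example, end A) initiates an audio test session request, SyncRequest, which carries the codebook number. SyncServer relays the request to the other end of the test session (end B).
S3: On receiving SyncRequest, end B initializes and opens the audio capture device, creates the degraded codebook file name and the header information (audio sampling rate, number of channels, sample bit depth, and so on) according to the codebook number so that it can record the audio output of the system under test, and returns a Sync Ok acknowledgement, which SyncServer relays to the session initiator (end A).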
S4,测试会话发起端(A端)收到SyncServer中转过来的对端准备好信令后,发送一条开始播放音频码本(Ok Begin Play)的信令给另一端(B端),并立即开启播放参考码本信号。此时播放的参考码本音频信号经被测音频系统的输入采集经其全流程(前端处理,编码,打包,网络传输,解包,解码,后处理,播放)处理后到另一端播放输出后,被另一端的测试控制客户端采集。
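S5: On receiving Ok Begin Play, end B immediately starts internal recording of the output of the audio system under test and sends an Is Inner Recording signaling message back to the initiator (end A).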
S6,测试会话发起端(A端)播放完参考音频码本后即发一条Play Ended信令(其中携带测试码本的持续时长)给另一端(B端),另一端收到之后比对采集持续时长是否已到,采集时长到即停止对被测音频系统输出信号的采集,将输出最终录得的码本信号。
通过本申请提供的实施例,基于指令的同步控制实现对发送端与接收端的同步操作,并采用了单向采集的方法,避免了影响延时准确性的路径不对称及回声的问题,提高了延时测量的准确性。
作为一种可选的方案,计算单元912包括:第一计算模块,用于通过以下公式计算所述音频传输延时:
$$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$
其中,Rxy(τ)为一个原始音频码本与对应的测试音频码本的互相关函数值,ts为接收端开始对发送端发送的原始音频码本进行采集的时刻,te为接收端停止对发送端发送的原始音频码本进行采集的时刻,t为每个采样点对应的时间信息,x(t)为原始音频码本中时刻为t时的采样点对应的能量值,τ为与x(t)中进行卷积的测试音频码本中的采样点的偏移量,y(t+τ)为测试音频码本中时刻为t+τ时的采样点的能量值,使用最大的互相关函数值对应的τ的取值表示音频传输延时。
通过求解原始音频码本与所得测试音频码本之间最大互相关函数Rxy(τ)及其对应的下标τ值,对应除以音频码本的采样率信息即可估计得出延时值。
可选地,在本实施例中延时计算采用的是求解音频信号互相关性的方法来求得音频延时,求解音频延时分为音频整体粗略延时Delay_crude+音频段内延时 Delay_internal。整体粗略延时Delay_crude是对参考码本与经同步控制单元录制得到的音频输出码本整体最大互相关所得到的延时值,Delay_internal音频子段延时是在求得整体粗略延时的基础上,对码本中的音频信号进行音频子段划分和对齐,再求解参考码本中各音频子段对应于经同步控制单元录制得到的音频输出码本中音频子段的延时,最终求解的延时值为音频整体粗略延时Delay_crude+音频段内延时Delay_internal。
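Optionally, the cross-correlation function values can also be normalized with the following formula to obtain the maximum normalized cross-correlation coefficient ρxy(τ) and the corresponding lag index τ: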
Figure PCTCN2014092198-appb-000013
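For delay evaluation in scenarios where the played codebook has a high sampling rate (44.1 kHz or above, for example 48 kHz or 96 kHz), the amount of data in a codebook file can be too large to process conveniently. In that case, first compute the audio envelope of the codebook audio file over small windows of T ms, then find the maximum cross-correlation value between the envelopes and obtain the corresponding delay value t. The steps are as follows: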
S1,以Tms对语音/音频信号加窗;
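Optionally, in this embodiment the applied window includes at least one of the following: Hamming window, Hanning window, triangular window, Bartlett window, Kaiser window.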
例如,以窗函数为矩形窗为例,矩形窗函数如下:
$$w(n) = \begin{cases} 1, & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}$$
第k帧加窗语音信号为:Xk(n)=w(n)*x(k*N+n)。其中第k帧信号Xk(n)的能量均值用E(k)表示:
$$E(k) = \frac{1}{N} \sum_{n=0}^{N-1} X_k^2(n)$$
S2,每Tms帧求取该帧的包络信息值,包络信息就是对语音能量信号开方并归一化之后取对数,是语音短时能量变化的一种标识,第k帧语音信号的包络用Env(k)表示:
Figure PCTCN2014092198-appb-000016
S3,求取播放码本信号与录制得到的被测系统的降级信号间的包络的最大的互相关函数值及对应的时间τ;当播放码本信号为高音质时,上述互相关函数或相关系数中的x(t)或y(t)对应置换成参考码本与测试码本加窗后的包络值,求得对应的延时样本位置,根据采样频率换算成时间即得延时值。
作为一种可选的方案,计算单元912包括:第二计算模块,用于通过以下公式计算所述音频传输延时:
$$Delay_i = \frac{1}{m} \sum_{k=1}^{m} TestValue(k)$$
其中,TestValue(k)为对原始音频码本i与原始音频码本i第k次测量得到的对应的测试音频码本i求解得到的最大的互相关函数值所对应的延时值,延时值为第k次测量得到的最大的互相关函数值对应的τ取值除以第k次测量接收端采用的采样率信息所得到的时域值,采样率信息为原始音频码本i的头格式信息中的采样率,Delayi为原始音频码本i的平均音频传输延时,m为大于等于1的整数。
作为一种可选的方案,计算单元912包括:第三计算模块,用于通过以下公式计算所述音频传输延时:
$$Avg\_Delay = \frac{1}{n} \sum_{i=1}^{n} Delay_i$$
其中,Avg_Delay为n个原始音频码本的平均音频传输延时,n为大于等于1的整数。
通过本申请提供的实施例,由互相关函数计算采样点的能量值,进而实现了对音频传输延时的精确计算。
可选地,在上述实施例中,上述音频传输延时的测量的系统可以适用于近距离通信。
上述本发明实施例序号仅仅为了描述,不代表实施例的优劣。
在本发明的上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
在本申请所提供的几个实施例中,应该理解到,所揭露的客户端,可通过其它的方式实现。其中,以上所描述的装置实施例仅仅是示意性的,例如所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,单元或模块的间接耦合或通信连接,可以是电性或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本发明各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本发明各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述仅是本发明的优选实施方式,应当指出,对于本技术领域的普通技术人员来说,在不脱离本发明原理的前提下,还可以做出若干改进和润饰,这些改进和润饰也应视为本发明的保护范围。

Claims (18)

  1. 一种音频传输延时的测量方法,其特征在于,包括:
    在发送端和接收端之间对待测的原始音频码本的传输进行同步操作,得到所述原始音频码本的发送开始指示信息、发送结束指示信息、接收开始指示信息、接收结束指示信息;
    所述发送端响应所述发送开始指示信息开始向所述接收端发送所述待测的原始音频码本,响应所述发送结束指示信息停止向所述接收端发送所述原始音频码本,所述接收端响应所述接收开始指示信息开始对所述发送端发送的所述原始音频码本进行采集,并响应所述接收结束指示信息停止对所述发送端发送的所述原始音频码本进行采集;
    根据所述接收端采集得到的测试音频码本以及所述接收端上预存的所述原始音频码本得到音频传输延时。
  2. 根据权利要求1所述的方法,其特征在于,
    所述发送开始指示信息包括发送开始时刻、所述发送结束指示信息包括发送结束时刻、所述接收开始指示信息包括接收开始时刻、所述接收结束指示信息包括接收结束时刻;
    所述发送端响应所述发送开始指示信息开始向所述接收端发送所述待测的原始音频码本包括:所述发送端从所述发送开始时刻开始向所述接收端发送所述原始音频码本;
    所述发送端响应所述发送结束指示信息停止向所述接收端发送所述原始音频码本包括:所述发送端在所述发送结束时刻停止向所述接收端发送所述原始音频码本;
    所述接收端响应所述接收开始指示信息开始对所述发送端发送的所述原始音频码本进行采集包括:所述接收端从所述接收开始时刻开始对所述发送端发送的所述原始音频码本进行采集;
    所述接收端响应所述接收结束指示信息停止对所述发送端发送的所述原始音频码本进行采集包括:所述接收端在所述接收结束时刻停止对所述发送端发送的所述原始音频码本进行采集。
  3. 根据权利要求2所述的方法,其特征在于,
    所述发送开始时刻与所述接收开始时刻相同,以及所述发送结束时刻与所述接收结束时刻相同;或者,
    所述发送开始时刻与所述接收开始时刻相同,以及所述发送结束时刻与所述接收结束时刻之间的差值小于第一预定阈值;或者
    所述发送开始时刻与所述接收开始时刻之间的差值小于第二预定阈值,以及所述发送结束时刻与所述接收结束时刻相同;或者
    所述发送开始时刻与所述接收开始时刻之间的差值小于第三预定阈值,以及所述发送结束时刻与所述接收结束时刻之间的差值小于第四预定阈值。
  4. 根据权利要求1所述的方法,其特征在于,在发送端和接收端之间对待测的原始音频码本的传输进行同步操作还包括:
    在所述发送端和所述接收端之间进行信息交互,使得所述发送端发送多个所述原始音频码本的顺序与所述接收端接收所述多个所述原始音频码本的顺序相同。
  5. 根据权利要求2所述的方法,其特征在于,在发送端和接收端之间对待测的原始音频码本的传输进行同步操作包括:
    通过设置在所述发送端的第一GPS同步控制单元和设置在所述接收端的第二GPS同步控制单元在所述发送端和所述接收端之间对所述原始音频码本的传输进行同步操作,其中,所述第一GPS同步控制单元和第二GPS同步控制单元均包括:GPS设备,所述GPS设备包括GPS天线和GPS接收模块,所述GPS天线用于传输以下至少之一:所述发送开始时刻、所述发送结束时刻、所述接收开始时刻、所述接收结束时刻;所述GPS接收模块用于接收以下至少之一:所述发送开始时刻、所述发送结束时刻、所述接收开始时刻、所述接收结束时刻。
  6. 根据权利要求1所述的方法,其特征在于,
    所述发送开始指示信息包括用于指示所述接收端准备好接收的第一指令信息;所述发送结束指示信息包括用于指示完成播放所述原始音频码本的第二指令信息、所述接收开始指示信息包括用于指示所述接收端开始接收的第三指令信息、所述接收结束指示信息包括所述第二指令信息中携带的采集时长;
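    the sending end starting to send the to-be-tested original audio codebook to the receiving end in response to the send start indication information comprises: the sending end starts sending the original audio codebook to the receiving end on receiving the first instruction information;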
    所述发送端响应所述发送结束指示信息停止向所述接收端发送所述原始音频码本包括:所述发送端接收到所述第二指令信息时停止向所述接收端发送所述原始音频码本;
    所述接收端响应所述接收开始指示信息开始对所述发送端发送的所述原始音频码本进行采集包括:所述接收端接收到所述第三指令信息时开始对所述发送端发送的所述原始音频码本进行采集;
    所述接收端响应所述接收结束指示信息停止对所述发送端发送的所述原始音频码本进行采集包括:所述接收端判断对所述发送端发送的所述原始音频码本进行采集的时长是否超过所述采集时长,若超过,则停止对所述发送端发送的所述原始音频码本进行采集。
  7. 根据权利要求1至6中任一项所述的方法,其特征在于,根据所述接收端采集得到的测试音频码本以及所述接收端上预存的所述原始音频码本得到音频传输延时包括:
    $$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$
    其中,Rxy(τ)为一个所述原始音频码本与对应的所述测试音频码本的互相关函数值,ts为所述接收端开始对所述发送端发送的所述原始音频码本进行采集的时刻,te为所述接收端停止对所述发送端发送的所述原始音频码本进行采集的时刻,t为每个采样点对应的时间信息,x(t)为所述原始音频码本中时刻为t时的采样点对应的能量值,τ为与x(t)中进行卷积的所述测试音频码本中的采样点的偏移量,y(t+τ)为所述测试音频码本中时刻为t+τ时的采样点的能量值,使用最大的所述互相关函数值对应的τ的取值表示所述音频传输延时。
  8. 根据权利要求7所述的方法,其特征在于,根据所述接收端采集得到的测试音频码本以及所述接收端上预存的所述原始音频码本得到音频传输延时还包括:
    $$Delay_i = \frac{1}{m} \sum_{k=1}^{m} TestValue(k)$$
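    where TestValue(k) is the delay value corresponding to the maximum cross-correlation function value obtained for original audio codebook i and the corresponding test audio codebook i from the k-th measurement of original audio codebook i; the delay value is the time-domain value obtained by dividing the value of τ corresponding to the maximum cross-correlation function value of the k-th measurement by the sampling rate information used by the receiving end in the k-th measurement; the sampling rate information is the sampling rate in the header format information of original audio codebook i; Delayi is the average audio transmission delay of original audio codebook i; and m is an integer greater than or equal to 1.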
  9. 根据权利要求8所述的方法,其特征在于,根据所述接收端采集得到的测试音频码本以及所述接收端上预存的所述原始音频码本得到音频传输延时还包括:
    $$Avg\_Delay = \frac{1}{n} \sum_{i=1}^{n} Delay_i$$
    其中,Avg_Delay为n个所述原始音频码本的平均音频传输延时,n为大于等于1的整数。
  10. 一种音频传输延时的测量系统,其特征在于,包括:
    位于发送端的第一同步单元和位于接收端的第二同步单元,用于在所述发送端和所述接收端之间对待测的原始音频码本的传输进行同步操作,得到所述原始音频码本的发送开始指示信息、发送结束指示信息、接收开始指示信息、接收结束指示信息;
    位于所述发送端的第一响应单元,用于响应所述发送开始指示信息开始向所述接收端发送所述待测的原始音频码本;
    位于所述发送端的第二响应单元,用于响应所述发送结束指示信息停止向所述接收端发送所述原始音频码本;
    位于所述接收端的第三响应单元,用于响应所述接收开始指示信息开始对所述发送端发送的所述原始音频码本进行采集;
    位于所述接收端的第四响应单元,用于响应所述接收结束指示信息停止对所述发送端发送的所述原始音频码本进行采集;
    位于所述接收端的计算单元,用于根据采集得到的测试音频码本以及所述接收端上预存的所述原始音频码本计算音频传输延时。
  11. 根据权利要求10所述的系统,其特征在于,
    第一响应单元包括:第一响应子模块,用于从发送开始时刻开始向所述接收端发送所述原始音频码本,其中,所述发送开始指示信息包括所述发送开始时刻;
    第二响应单元包括:第二响应子模块,用于在发送结束时刻停止向所述接收端发送所述原始音频码本,其中,所述发送结束指示信息包括所述发送结束时刻;
    第三响应单元包括:第三响应子模块,用于从接收开始时刻开始对所述发送端发送的所述原始音频码本进行采集,其中,所述接收开始指示信息包括所述接收开始时刻;
    第四响应单元包括:第四响应子模块,用于在接收结束时刻停止对所述发送端发送的所述原始音频码本进行采集,其中,所述接收结束指示信息包括所述接收结束时刻。
  12. 根据权利要求11所述的系统,其特征在于,所述第一同步单元包括:第一同步模块,所述第二同步单元包括:第二同步模块,其中,所述第一同步模块和所述第二同步模块用于执行所述同步操作,以得到以下结果之一:
    所述发送开始时刻与所述接收开始时刻相同,以及所述发送结束时刻与所述接收结束时刻相同;或者,
    所述发送开始时刻与所述接收开始时刻相同,以及所述发送结束时刻与所述接收结束时刻之间的差值小于第一预定阈值;或者
    所述发送开始时刻与所述接收开始时刻之间的差值小于第二预定阈值,以及所述发送结束时刻与所述接收结束时刻相同;或者
    所述发送开始时刻与所述接收开始时刻之间的差值小于第三预定阈值,以及所述发送结束时刻与所述接收结束时刻之间的差值小于第四预定阈值。
  13. 根据权利要求10所述的系统,其特征在于,所述第一同步单元包括:第三同步模块,所述第二同步单元包括:第四同步模块,其中,所述第三同步模块和所述第四同步模块用于在所述发送端和所述接收端之间进行信息交互,使得所述发送端发送多个所述原始音频码本的顺序与所述接收端接收所述多个所述原始音频码本的顺序相同。
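  14. The system according to claim 11, wherein the first synchronization unit comprises a first GPS synchronization control unit and the second synchronization unit comprises a second GPS synchronization control unit, the first GPS synchronization control unit and the second GPS synchronization control unit being configured to perform the synchronization operation on the transmission of the original audio codebook between the sending end and the receiving end, wherein the first GPS synchronization control unit and the second GPS synchronization control unit each comprise a GPS device, the GPS device comprises a GPS antenna and a GPS receiving module, the GPS antenna is configured to transmit at least one of: the send start time, the send end time, the receive start time, and the receive end time; and the GPS receiving module is configured to receive at least one of: the send start time, the send end time, the receive start time, and the receive end time.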
  15. 根据权利要求10所述的系统,其特征在于,
    第一响应单元包括:发送子模块,用于在接收到第一指令信息时开始向所述接收端发送所述原始音频码本,其中,所述第一指令信息用于指示所述接收端准备好接收;
    第二响应单元包括:终止子模块,用于在接收到第二指令信息时停止向所述接收端发送所述原始音频码本,其中,所述第二指令信息用于指示完成播放所述原始音频码本;
    第三响应单元包括:采集子模块,用于在接收到第三指令信息时开始对所述发送端发送的所述原始音频码本进行采集,其中,所述第三指令信息用于指示所述接收端开始接收;
    第四响应单元包括:判断子模块,用于判断对所述发送端发送的所述原始音频码本进行采集的时长是否超过采集时长,若超过,则停止对所述发送端发送的所述原始音频码本进行采集。
  16. 根据权利要求10至15中任一项所述的系统,其特征在于,所述计算单元包括:第一计算模块,用于通过以下公式计算所述音频传输延时:
    $$R_{xy}(\tau) = \sum_{t=t_s}^{t_e} x(t)\, y(t+\tau)$$
    Rxy(τ)为一个所述原始音频码本与对应的所述测试音频码本的互相关函数值,ts为所述接收端开始对所述发送端发送的所述原始音频码本进行采集的时刻,te为所述接收端停止对所述发送端发送的所述原始音频码本进行采集的时刻,t为每个采样点对应的时间信息,x(t)为所述原始音频码本中时刻为t时的采样点对应的能量值,τ为与x(t)中进行卷积的所述测试音频码本中的采样点的偏移量,y(t+τ)为所述测试音频码本中时刻为t+τ时的采样点的能量值,使用最大的所述互相关函数值对应的τ的取值表示所述音频传输延时。
  17. 根据权利要求16所述的系统,其特征在于,所述计算单元包括:第二计算模块,用于通过以下公式计算所述音频传输延时:
    $$Delay_i = \frac{1}{m} \sum_{k=1}^{m} TestValue(k)$$
    其中,TestValue(k)为对所述原始音频码本i与所述原始音频码本i第k次测量得到的对应的所述测试音频码本i求解得到最大的所述互相关函数值所对应的延时值,所述延时值为所述第k次测量得到最大的所述互相关函数值对应的τ取值除以所述第k次测量所述接收端采用的采样率信息所得到的时域值,所述采样率信息为所述原始音频码本i的头格式信息中的采样率,Delayi为所述原始音频码本i的平均音频传输延时,m为大于等于1的整数。
  18. 根据权利要求17所述的系统,其特征在于,所述计算单元包括:第三计算模块,用于通过以下公式计算所述音频传输延时:
    $$Avg\_Delay = \frac{1}{n} \sum_{i=1}^{n} Delay_i$$
    其中,Avg_Delay为n个所述原始音频码本的平均音频传输延时,n为大于等于1的整数。
PCT/CN2014/092198 2013-11-27 2014-11-25 音频传输延时的测量方法及系统 WO2015078359A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/038,861 US9755933B2 (en) 2013-11-27 2014-11-25 Method and system for measuring audio transmission delay

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310616487.1A CN104125022B (zh) 2013-11-27 2013-11-27 音频传输延时的测量方法及系统
CN201310616487.1 2013-11-27

Publications (1)

Publication Number Publication Date
WO2015078359A1 true WO2015078359A1 (zh) 2015-06-04

Family

ID=51770298

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/092198 WO2015078359A1 (zh) 2013-11-27 2014-11-25 音频传输延时的测量方法及系统

Country Status (3)

Country Link
US (1) US9755933B2 (zh)
CN (1) CN104125022B (zh)
WO (1) WO2015078359A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108449630A (zh) * 2018-04-09 2018-08-24 歌尔股份有限公司 音频同步方法及系统

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104125022B (zh) * 2013-11-27 2016-03-23 腾讯科技(成都)有限公司 音频传输延时的测量方法及系统
CN104978982B (zh) * 2015-04-02 2018-01-05 广州酷狗计算机科技有限公司 一种流媒体版本对齐方法,及设备
CN105790854B (zh) * 2016-03-01 2018-11-20 济南中维世纪科技有限公司 一种基于声波的短距离数据传输方法及装置
US10394518B2 (en) * 2016-03-10 2019-08-27 Mediatek Inc. Audio synchronization method and associated electronic device
FR3059507B1 (fr) * 2016-11-30 2019-01-25 Sagemcom Broadband Sas Procede de synchronisation d'un premier signal audio et d'un deuxieme signal audio
CN107995060B (zh) * 2017-11-29 2021-11-16 努比亚技术有限公司 移动终端音频测试方法、装置以及计算机可读存储介质
CN110085259B (zh) * 2019-05-07 2021-09-17 国家广播电视总局中央广播电视发射二台 音频比对方法、装置和设备
CN110365555B (zh) * 2019-08-08 2021-12-10 广州虎牙科技有限公司 音频延时测试方法、装置、电子设备及可读存储介质
CN110609769B (zh) * 2019-09-20 2021-06-11 广州华多网络科技有限公司 信号采集延迟的测量方法及相关装置
CN110933240B (zh) * 2019-10-16 2021-03-16 福建星网智慧软件有限公司 一种VoIP终端的音频自动化测试装置以及方法
CN113382119B (zh) * 2020-02-25 2022-12-06 北京字节跳动网络技术有限公司 消除回声的方法、装置、可读介质和电子设备
CN113381899B (zh) * 2020-02-25 2023-09-22 福建天泉教育科技有限公司 一种投屏技术中第一声延迟测试的方法及其系统
CN112489686A (zh) * 2020-12-02 2021-03-12 公安部第三研究所 一种端到端的音频信号延时测试方法
CN113409817B (zh) * 2021-06-24 2022-05-13 浙江松会科技有限公司 一种基于声纹技术的音频信号实时追踪比对方法
CN116761030B (zh) * 2023-08-11 2023-10-27 南京汉卫教育科技有限公司 一种基于图像识别算法的多机位同步音影录播系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102137414A (zh) * 2010-06-25 2011-07-27 华为技术有限公司 一种移动视频业务时延的评估方法和装置
CN102325059A (zh) * 2011-09-09 2012-01-18 华南理工大学 非介入式单端采集的音频端到端时延测量方法及装置
CN104125022A (zh) * 2013-11-27 2014-10-29 腾讯科技(成都)有限公司 音频传输延时的测量方法及系统

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020193999A1 (en) * 2001-06-14 2002-12-19 Michael Keane Measuring speech quality over a communications network
US20080060281A1 (en) * 2002-02-01 2008-03-13 Mayle Steven R Apparatus and method for sealing a vertical protrusion on a roof
JP2004297287A (ja) * 2003-03-26 2004-10-21 Agilent Technologies Japan Ltd 通話品質評価システム、および、該通話品質評価のための装置
US7180537B2 (en) * 2004-02-18 2007-02-20 Tektronix, Inc. Relative channel delay measurement
US7965639B2 (en) * 2005-03-14 2011-06-21 Sharp Laboratories Of America, Inc. Dynamic adaptation of MAC-layer retransmission value
US7801168B2 (en) * 2006-06-21 2010-09-21 Intel Corporation Systems and methods for multi-slotted power saving multiple polling in wireless communications
US8456530B2 (en) * 2009-08-18 2013-06-04 Arcom Digital, Llc Methods and apparatus for detecting and locating leakage of digital signals
US8116208B2 (en) * 2009-10-19 2012-02-14 Litepoint Corporation System and method for testing multiple digital signal transceivers in parallel
US8441620B2 (en) * 2010-04-05 2013-05-14 Hewlett-Packard Development Company, L.P. Determining distance between nodes
CN102045121B (zh) * 2010-11-12 2013-07-03 中国科学院长春光学精密机械与物理研究所 光电经纬仪无线通讯系统数据传输延迟时间的检测方法

Also Published As

Publication number Publication date
US20170005896A1 (en) 2017-01-05
CN104125022A (zh) 2014-10-29
CN104125022B (zh) 2016-03-23
US9755933B2 (en) 2017-09-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14865390

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 15038861

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC ( EPO FORM 1205A DATED 17/10/2016 )

122 Ep: pct application non-entry in european phase

Ref document number: 14865390

Country of ref document: EP

Kind code of ref document: A1