WO2021229828A1 - Signal processing method, signal processing device, and program - Google Patents

Signal processing method, signal processing device, and program

Info

Publication number
WO2021229828A1
WO2021229828A1 PCT/JP2020/023199 JP2020023199W WO2021229828A1 WO 2021229828 A1 WO2021229828 A1 WO 2021229828A1 JP 2020023199 W JP2020023199 W JP 2020023199W WO 2021229828 A1 WO2021229828 A1 WO 2021229828A1
Authority
WO
WIPO (PCT)
Prior art keywords
delay amount
acoustic signal
transmission
time
calculated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2020/023199
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
陽 前澤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yamaha Corp
Original Assignee
Yamaha Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yamaha Corp
Priority to CN202080100307.2A (patent CN115462058B, zh)
Priority to JP2022522496A (patent JP7497755B2, ja)
Publication of WO2021229828A1 (patent WO2021229828A1, ja)
Priority to US17/979,974 (patent US12119885B2, en)
Anticipated expiration
Priority to JP2024061057A (patent JP7694758B2, ja)
Ceased (current legal status)

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04B: TRANSMISSION
    • H04B11/00: Transmission systems employing ultrasonic, sonic or infrasonic waves
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
    • H04R27/00: Public address systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; ELECTRIC HEARING AIDS; PUBLIC ADDRESS SYSTEMS
    • H04R5/00: Stereophonic arrangements
    • H04R5/04: Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/72: Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724: User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/72442: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality for playing music files

Definitions

  • The present invention relates to signal processing methods, signal processing devices, and programs. This application claims priority based on No. 63/022,591 filed in the United States on May 11, 2020, the contents of which are incorporated herein by reference.
  • Patent Document 1 discloses a technique for increasing the number of participants in a music session while suppressing delay.
  • Patent Document 2 discloses a technique for realizing high-quality and real-time online performance.
  • Patent Document 3 discloses a technique for synchronizing a plurality of images.
  • An object of the present invention is to provide a signal processing method, a signal processing device, and a program capable of transmitting high-quality sound by using a popular call application.
  • The signal processing method of the present invention processes a first acoustic signal and a second acoustic signal obtained by picking up the same sound source. In this method, a first receiving unit receives the first acoustic signal transmitted via a first transmission line; a second receiving unit receives the second acoustic signal transmitted via a second transmission line whose delay time related to transmission is larger than that of the first transmission line; a delay amount calculation unit calculates a transmission delay amount, which is the relative delay amount between the first acoustic signal and the second acoustic signal; and a delay amount addition unit delays the first acoustic signal based on the transmission delay amount and outputs the delayed first acoustic signal.
  • The signal processing device of the present invention performs signal processing of a first acoustic signal and a second acoustic signal obtained by picking up the same sound source. The device includes: a first receiving unit that receives the first acoustic signal transmitted via a first transmission line; a second receiving unit that receives the second acoustic signal transmitted via a second transmission line whose delay time related to transmission is larger than that of the first transmission line; a delay amount calculation unit that calculates a transmission delay amount, which is the relative delay amount between the first acoustic signal and the second acoustic signal; and a delay amount addition unit that delays the first acoustic signal based on the transmission delay amount and outputs the delayed first acoustic signal.
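  • The program of the present invention causes a computer of a signal processing device, which performs signal processing of a first acoustic signal and a second acoustic signal obtained by picking up the same sound source, to execute: a first receiving step of receiving the first acoustic signal transmitted via a first transmission line; a second receiving step of receiving the second acoustic signal transmitted via a second transmission line whose delay time related to transmission is larger than that of the first transmission line; a delay amount calculation step of calculating a transmission delay amount, which is the relative delay amount between the first acoustic signal and the second acoustic signal; and a delay amount addition step of delaying the first acoustic signal based on the transmission delay amount and outputting the delayed first acoustic signal.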
  • FIG. 1 is a block diagram showing a configuration example of the transmission system 1 according to the first embodiment.
  • the transmission system 1 includes, for example, a transmitting side terminal 10, a receiving side terminal 20, a microphone 30, and a speaker 40.
  • the transmitting side terminal 10 and the receiving side terminal 20 are connected to each other so as to be able to communicate with each other via a general-purpose communication line such as the Internet.
  • the transmission system 1 is applied when performing remote music communication such as online music lessons.
  • the transmitting side terminal 10 and the receiving side terminal 20 mutually transmit acoustic signals.
  • a case where the acoustic signal x (t) is transmitted from the transmitting side terminal 10 to the receiving side terminal 20 will be described as an example.
  • a similar method can be applied when an acoustic signal is transmitted from the receiving terminal 20 to the transmitting terminal 10.
  • the session application (hereinafter referred to as a dedicated application) is a session application dedicated to sound. With the dedicated app, it is possible to transmit high-quality sound while minimizing the delay caused by transmission.
  • the calling application (hereinafter referred to as a general-purpose application) is a general-purpose calling application that is widely used, such as Skype (registered trademark). However, in a general-purpose application, the transmission state fluctuates depending on the band condition, and high-quality sound is not transmitted.
  • the transmitting side terminal 10 uses both the session application and the calling application to transmit the acoustic signal x (t) from which the same sound source is picked up to the receiving side terminal 20.
  • the receiving terminal 20 receives the acoustic signal x (t) transmitted from each of the session application and the calling application.
  • the receiving terminal 20 receives the acoustic signal x (t) (hereinafter referred to as the received acoustic signal xN (t)) transmitted via the dedicated application. Further, the acoustic signal x (t) (hereinafter referred to as received acoustic signal xS (t)) transmitted via the general-purpose application is received.
  • When the dedicated application is used, transmission is performed via the transmission line ND.
  • When the general-purpose application is used, transmission is performed via the transmission line SD.
  • The transmission lines ND and SD generally follow different paths. Therefore, the receiving side terminal 20 receives the received acoustic signal xN (t) and the received acoustic signal xS (t) at different timings.
  • In the following, it is assumed that the received acoustic signal xN (t) is received by the receiving terminal 20 with a smaller delay than the received acoustic signal xS (t).
  • the received acoustic signal xN (t) is an example of the “first acoustic signal”.
  • the transmission line ND is an example of the “first transmission line”.
  • the received acoustic signal xS (t) is an example of the “second acoustic signal”.
  • the transmission line SD is an example of a “second transmission line”.
  • the receiving terminal 20 synchronizes and outputs two acoustic signals received at different timings.
  • the receiving terminal 20 calculates the relative transmission delay amount of the two acoustic signals.
  • the transmission delay amount is simply referred to as a delay amount.
  • The receiving terminal 20 uses the calculated delay amount τ to generate a received acoustic signal xN (t−τ), in which the received acoustic signal xN (t), the acoustic signal with the smaller delay, is delayed by the delay amount τ. The receiving side terminal 20 then outputs the received acoustic signal xN (t−τ) to the speaker 40 instead of the received acoustic signal xS (t). This makes it possible to output high-quality sound at the timing when the general-purpose application outputs sound on the receiving side. Therefore, it is possible to make a call via a general-purpose application with high-quality sound.
  • the transmitting side terminal 10 is a computer device on the transmitting side, and is, for example, a smartphone, a personal computer, a mobile phone, a tablet terminal, a wearable terminal, or the like.
  • the transmitting terminal 10 includes, for example, a communication unit 11, a storage unit 12, a control unit 13, and an input / output unit 14.
  • the communication unit 11 communicates with the receiving terminal 20. For example, the communication unit 11 transmits the acoustic signal x (t) to the receiving terminal 20 under the control of the dedicated application. Further, the communication unit 11 transmits the acoustic signal x (t) to the receiving terminal 20 under the control of the general-purpose application.
  • The storage unit 12 is composed of a storage medium, for example, an HDD (Hard Disk Drive), a flash memory, an EEPROM (Electrically Erasable Programmable Read Only Memory), a RAM (Random Access read/write Memory), a ROM (Read Only Memory), or any combination of these storage media.
  • the storage unit 12 stores a program for executing various processes of the transmitting terminal 10 and temporary data used when performing various processes.
  • The functions of the control unit 13 are realized when a processing unit such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit), included as hardware in the transmitting terminal 10, executes a program stored in the storage unit 12.
  • the control unit 13 includes, for example, a dedicated application 130, a general-purpose application 131, and a device control unit 132.
  • The dedicated application 130 is a functional unit that realizes the functions of the dedicated application, and transmits an acoustic signal x (t) acquired from the microphone 30 to a preset communication destination device (here, the receiving terminal 20).
  • the general-purpose application 131 is a functional unit that realizes the functions of the general-purpose application, and transmits an acoustic signal x (t) acquired from the microphone 30 to a preset communication destination device (here, a receiving terminal 20).
  • the device control unit 132 comprehensively controls the transmission side terminal 10.
  • The device control unit 132 outputs the acoustic signal x (t) input from the microphone 30 to the dedicated application 130 and the general-purpose application 131. Further, the device control unit 132 outputs to the communication unit 11 the control signals that the dedicated application 130 and the general-purpose application 131 issue to establish communication with the communication destination device, so that these signals are transmitted to the receiving side terminal 20.
  • the input / output unit 14 is a functional unit that mediates the input / output of signals with an external device connected to the transmitting side terminal 10.
  • the acoustic signal x (t) picked up by the microphone 30 is input to the input / output unit 14.
  • the receiving side terminal 20 is a computer device on the receiving side, and is, for example, a smartphone, a personal computer, a mobile phone, a tablet terminal, a wearable terminal, or the like.
  • the receiving side terminal 20 has the same configuration as the transmitting side terminal 10.
  • Of the functions of the receiving side terminal 20, only those that differ from the transmitting side terminal 10 are described below; detailed description of functions equivalent to those of the transmitting side terminal 10 may be omitted.
  • the receiving terminal 20 includes, for example, a communication unit 21, a storage unit 22, a control unit 23, and an input / output unit 24.
  • the communication unit 21 communicates with the transmitting terminal 10. For example, the communication unit 21 receives the received acoustic signal xN (t) from the transmitting side terminal 10. The communication unit 21 receives the received acoustic signal xS (t) from the transmitting side terminal 10.
  • the storage unit 22 is composed of a storage medium, for example, an HDD, a flash memory, an EEPROM, a RAM, a ROM, or any combination of these storage media.
  • the storage unit 22 stores a program for executing various processes of the receiving terminal 20 and temporary data used when performing various processes.
  • The functions of the control unit 23 are realized when a processing unit such as a CPU or a GPU, included as hardware in the receiving terminal 20, executes a program stored in the storage unit 22.
  • the input / output unit 24 is a functional unit that mediates the input / output of signals with an external device connected to the receiving terminal 20.
  • The input / output unit 24 outputs the received acoustic signal xN (t−τ) to the speaker 40.
  • the control unit 23 includes, for example, a delay estimation unit 230, a delay unit 231, a device control unit 232, a dedicated application 233, and a general-purpose application 234.
  • the delay estimation unit 230 is an example of a “delay amount calculation unit”.
  • the delay unit 231 is an example of a "delay amount addition unit”.
  • the device control unit 232 controls the receiving terminal 20 in an integrated manner.
  • the dedicated application 233 is a functional unit equivalent to the dedicated application 130.
  • the general-purpose application 234 is a functional unit equivalent to the general-purpose application 131.
  • FIG. 2 is a diagram illustrating a process performed by the control unit 23.
  • The delay estimation unit 230 acquires the received acoustic signal xN (t) and the received acoustic signal xS (t), and calculates the relative delay amount τ between the two signals.
  • The delay estimation unit 230 outputs the calculated delay amount τ to the delay unit 231.
  • The delay unit 231 uses the delay amount τ acquired from the delay estimation unit 230 to generate a received acoustic signal xN (t−τ), in which the received acoustic signal xN (t) is delayed, and outputs the generated received acoustic signal xN (t−τ).
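The role of the delay unit 231 can be illustrated with a minimal sketch. It assumes that the received acoustic signal xN (t) is held as a list of samples and that the delay amount τ is a whole number of samples; neither assumption is stated in the text, and the function name `delay_signal` is hypothetical.

```python
def delay_signal(samples, tau):
    """Return `samples` delayed by `tau` samples, keeping the same length.

    Assumption: the start of the block is padded with silence (0.0).
    A streaming implementation would instead carry over earlier samples.
    """
    if tau < 0:
        raise ValueError("tau must be non-negative")
    n = len(samples)
    pad = min(tau, n)  # never pad beyond the block length
    return [0.0] * pad + list(samples[: n - pad])


x_n = [1.0, 2.0, 3.0, 4.0]
x_n_delayed = delay_signal(x_n, 2)  # corresponds to xN (t - tau) with tau = 2
```

Padding with silence keeps the sketch self-contained; it is an illustrative choice rather than something the description specifies.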
  • the delay estimation unit 230 buffers the received acoustic signal xN (t) for a certain time interval (for example, 1 second). That is, the delay estimation unit 230 sequentially stores the received acoustic signal xN (t) received by the communication unit 21, and temporarily stores the signal corresponding to a certain time interval in, for example, the storage unit 22.
  • the delay estimation unit 230 buffers the received acoustic signal xS (t) for a certain time interval (for example, 1 second). That is, the delay estimation unit 230 sequentially stores the received acoustic signal xS (t) received by the communication unit 21, and temporarily stores the signal corresponding to a certain time interval in, for example, the storage unit 22.
  • the delay estimation unit 230 takes a cross-correlation value between the received acoustic signal xN (t) in the buffered fixed time interval T and the received acoustic signal xS (t). For example, the delay estimation unit 230 calculates the cross-correlation value R using the following equation (1).
  • R (n) indicates a cross-correlation value when the delay amount is n.
  • T indicates a fixed time interval.
  • t indicates time.
  • the delay estimation unit 230 calculates the cross-correlation value R (n) while changing n.
  • The delay estimation unit 230 sets the delay amount n at which the absolute value |R (n)| of the cross-correlation value is maximized as the delay amount τ.
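Equation (1) is not reproduced in this text, so the following is only a sketch of one common form of the calculation. It assumes R (n) is the sum over the buffered interval of xN (t−n) · xS (t), so that R (n) peaks where xS is a copy of xN delayed by n samples. The function names and the search range `max_delay` are hypothetical.

```python
def cross_correlation(x_n, x_s, n):
    """Assumed form of R(n): sum over t of x_n[t - n] * x_s[t] inside the buffer."""
    total = 0.0
    for t in range(len(x_s)):
        if 0 <= t - n < len(x_n):
            total += x_n[t - n] * x_s[t]
    return total


def estimate_delay(x_n, x_s, max_delay):
    """Return the n in [0, max_delay] that maximizes |R(n)|."""
    best_n, best_abs = 0, -1.0
    for n in range(max_delay + 1):
        r_abs = abs(cross_correlation(x_n, x_s, n))
        if r_abs > best_abs:
            best_n, best_abs = n, r_abs
    return best_n


# x_s is x_n delayed by 3 samples, so the estimate should be 3.
x_n = [0.0, 1.0, 0.0, -2.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0]
x_s = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, -2.0, 3.0, 0.0, 0.0]
tau = estimate_delay(x_n, x_s, max_delay=5)
```

Only non-negative lags are searched here because the text assumes xN (t) always arrives earlier than xS (t).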
  • The delay estimation unit 230 may instead use, as the cross-correlation value R (n), a weighted sum that includes a previously obtained cross-correlation value R'(n), as shown in the following equation (2).
  • The cross-correlation value R'(n) here is the cross-correlation value obtained most recently (for example, in the previous time interval).
  • R'(n) is the most recent cross-correlation value and indicates the cross-correlation value when the delay amount is n.
  • T indicates a fixed time interval.
  • t indicates time.
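Equation (2) is likewise not reproduced in this text. As a hedged illustration, the sketch below assumes the weighted sum is a convex combination with a single weight applied to the previous values R'(n). The parameter `weight` and both function names are hypothetical.

```python
def smooth_correlation(r_current, r_previous, weight=0.5):
    """Blend the current R(n) with the previous R'(n), lag by lag.

    Assumption: R(n) = weight * R'(n) + (1 - weight) * R_current(n).
    """
    if len(r_current) != len(r_previous):
        raise ValueError("both correlation arrays must cover the same lags")
    return [weight * prev + (1.0 - weight) * cur
            for cur, prev in zip(r_current, r_previous)]


def argmax_abs(r_values):
    """Index (lag) of the largest |R(n)|; the first one wins on ties."""
    return max(range(len(r_values)), key=lambda n: abs(r_values[n]))


r_previous = [0.0, 10.0, 0.0]  # previous interval peaked at lag 1
r_current = [0.0, 2.0, 4.0]    # current interval alone would pick lag 2
r_blended = smooth_correlation(r_current, r_previous, weight=0.5)
lag = argmax_abs(r_blended)
```

Blending in the previous interval damps a one-off spurious peak, which in this example keeps the estimate at lag 1.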
  • The amount of delay due to the transmission line ND or the transmission line SD is not constant; it usually fluctuates from moment to moment depending on band conditions and on the congestion of the signals being transmitted and received. Therefore, even if the received acoustic signal xN (t) is delayed using a delay amount calculated once, the received acoustic signal xN (t) and the received acoustic signal xS (t) may gradually drift out of synchronization.
  • The delay estimation unit 230 therefore recalculates the delay amount at each calculation timing. The calculation timing may be set arbitrarily. For example, the delay amount may be calculated each time the received acoustic signal xN (t) or the received acoustic signal xS (t) is received. Alternatively, the calculation timing may arrive once per buffering time interval T (or (1/2)T, (1/4)T, etc.), or at random.
  • Based on the delay amount calculated by the delay estimation unit 230, the delay unit 231 may determine whether or not to update the delay amount, that is, whether or not to delay the received acoustic signal xN (t) by the newly calculated delay amount.
  • The delay unit 231 acquires a reference value nf, which is a reference delay amount stored in advance in the storage unit 22 or the like. Using the acquired reference value nf and the calculated delay amount n (k), the delay unit 231 determines whether or not to update the delay amount of the received acoustic signal xN (t) to the calculated delay amount n (k).
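  • For example, the delay unit 231 determines that the delay amount is to be updated when the absolute value |n (k) − nf| of the difference between the calculated delay amount n (k) and the reference value nf exceeds an allowable range.
  • The delay unit 231 determines that the delay amount is not to be updated when the absolute value |n (k) − nf| is within the allowable range.
  • The delay unit 231 may set the delay amount to the reference value nf when the absolute value |n (k) − nf| is within the allowable range.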
  • In other words, the delay unit 231 determines that the delay amount is to be updated when the delay amount n (k) is larger than (reference value nf + allowable range) or smaller than (reference value nf − allowable range).
  • The delay unit 231 determines that the delay amount is not to be updated when the delay amount n (k) is smaller than (reference value nf + allowable range) and larger than (reference value nf − allowable range).
  • (reference value nf + allowable range) is an example of the “first threshold value”.
  • the delay amount can be updated only when the calculated delay amount n (k) exceeds the allowable range from the currently set value. Therefore, it is possible to follow the actual change in the delay amount while suppressing the slight change in the delay amount and reducing the occurrence of acoustic discomfort.
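The update decision described above can be sketched as follows. The function and parameter names are hypothetical. The text does not say what happens when n (k) lies exactly on a boundary, so this sketch assumes no update in that case.

```python
def should_update(n_k, n_ref, tolerance):
    """True when n(k) lies outside [n_ref - tolerance, n_ref + tolerance]."""
    return n_k > n_ref + tolerance or n_k < n_ref - tolerance


def next_delay(current_delay, n_k, n_ref, tolerance):
    """Adopt the newly calculated delay only when it leaves the allowable range."""
    if should_update(n_k, n_ref, tolerance):
        return n_k
    return current_delay


# With reference value 100 and allowable range 5, small jitter is ignored.
kept = next_delay(current_delay=100, n_k=103, n_ref=100, tolerance=5)
changed = next_delay(current_delay=100, n_k=110, n_ref=100, tolerance=5)
```

The dead band around the reference value trades a small synchronization error for fewer audible adjustments.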
  • The delay estimation unit 230 may, for example, set the reference value nf to the average of the delay amounts n (1), n (2), ..., n (N) calculated at each calculation timing.
  • The average may be a simple arithmetic mean over the calculation timings 1 to N, or a weighted average.
  • Alternatively, it may be the most recent moving average.
  • The most recent moving average is, when the latest delay amount is n (N), the average of n (N) and several immediately preceding delay amounts (for example, the delay amounts n (N−1) and n (N−2) at the calculation timings N−1 and N−2).
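The three ways of deriving the reference value nf mentioned above can be sketched as follows. The function names, the weights, and the window length of 3 are illustrative assumptions rather than values given in the text.

```python
def simple_average(delays):
    """Arithmetic mean of n(1) .. n(N)."""
    return sum(delays) / len(delays)


def weighted_average(delays, weights):
    """Weighted mean; the weights are an illustrative assumption."""
    if len(delays) != len(weights):
        raise ValueError("one weight per delay amount is required")
    return sum(d * w for d, w in zip(delays, weights)) / sum(weights)


def recent_moving_average(delays, window=3):
    """Mean of the latest `window` delay amounts, e.g. n(N), n(N-1), n(N-2)."""
    recent = delays[-window:]
    return sum(recent) / len(recent)


history = [10, 20, 30, 40]  # delay amounts n(1) .. n(4), in samples
nf_simple = simple_average(history)
nf_recent = recent_moving_average(history, window=3)
```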
  • When updating the delay amount, the delay unit 231 may change it stepwise so that the change is smooth. For example, the delay unit 231 first calculates the difference between the current delay amount and the updated delay amount as the change amount. Based on the calculated change amount, the delay unit 231 then limits the rate of change (the change amount per unit time) so that it does not exceed a predetermined threshold value. As a result, the delay unit 231 can change the delay amount gradually and can suppress sound interruptions and skips.
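A minimal sketch of such a rate-limited update follows, assuming the delay amount is adjusted once per processing block and that `max_step` is the largest change allowed per block. Both the function name and that parameter are hypothetical.

```python
def step_toward(current, target, max_step):
    """Move `current` toward `target` by at most `max_step` per call."""
    change = target - current  # change amount still outstanding
    if abs(change) <= max_step:
        return target
    return current + max_step if change > 0 else current - max_step


# Moving from a delay of 100 to 130 in steps of at most 10 per block.
delay = 100
trajectory = []
for _ in range(4):
    delay = step_toward(delay, 130, max_step=10)
    trajectory.append(delay)
```

In this example the delay reaches the target after three blocks and then stays there.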
  • The delay estimation unit 230 sets, as the delay amount τ, the delay amount n at which the absolute value |R (n)| of the cross-correlation value is maximized. However, that maximum is not always reliable, for example while the input is silent.
  • In a silent state, the received acoustic signal has a smaller overall signal amplitude than when it is not silent.
  • In that case, no prominent peak appears in the cross-correlation value R (n), and similar cross-correlation values are obtained regardless of the delay amount n.
  • The value of n that maximizes the cross-correlation value R (n) is then unreliable as the actual delay amount.
  • Therefore, when the peak value of the cross-correlation value R (n) is less than a predetermined threshold value, the delay estimation unit 230 determines that the corresponding delay amount n is not adopted as the actual delay amount. That is, the delay estimation unit 230 adopts the corresponding delay amount n as the actual delay amount only when the peak value of the cross-correlation value R (n) is equal to or higher than the predetermined threshold value. As a result, the delay estimation unit 230 avoids adopting unreliable estimates and can calculate the delay amount accurately.
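The reliability check can be sketched as follows, assuming the cross-correlation values for lags 0, 1, 2, ... are already available as a list. The function name, the threshold value, and the use of `None` to mean "no reliable estimate" are illustrative assumptions.

```python
def reliable_delay(r_values, peak_threshold):
    """Return the lag with the largest |R(n)|, or None if the peak is too weak."""
    best = max(range(len(r_values)), key=lambda n: abs(r_values[n]))
    if abs(r_values[best]) >= peak_threshold:
        return best
    return None  # e.g. silence: keep using the previously set delay amount


flat = [0.1, 0.2, 0.1]    # no prominent peak, as in a silent interval
peaked = [0.1, 5.0, 0.3]  # clear peak at lag 1
silent_result = reliable_delay(flat, peak_threshold=1.0)
active_result = reliable_delay(peaked, peak_threshold=1.0)
```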
  • The delay estimation unit 230 may determine whether or not a transmission line is interrupted and, if it determines that one of the transmission lines is interrupted, refrain from calculating the cross-correlation value R (n).
  • To do so, the delay estimation unit 230 calculates the respective powers (maximum levels) of the buffered received acoustic signal xS (t) and the buffered received acoustic signal xN (t).
  • The power here is an index showing the strength of the buffered signal.
  • The power may be, for example, the sum of the absolute values of the signal amplitudes of the buffered signal, or the largest absolute value of the signal amplitude in the buffered signal.
  • When one of the calculated powers of the received acoustic signal xS (t) and the received acoustic signal xN (t) is equal to or higher than a predetermined threshold value and the other is below the threshold value, the delay estimation unit 230 determines that the transmission line whose power is below the threshold value is interrupted.
  • the threshold value here is an example of the “second threshold value”.
  • When the power of the received acoustic signal xN (t) is less than the threshold value and the power of the received acoustic signal xS (t) is equal to or higher than the threshold value, the delay estimation unit 230 does not calculate the cross-correlation value R (n) and determines that the delay amount is not to be calculated. Likewise, when the power of the received acoustic signal xN (t) is equal to or higher than the threshold value and the power of the received acoustic signal xS (t) is less than the threshold value, the delay estimation unit 230 does not calculate the cross-correlation value R (n) and determines that the delay amount is not to be calculated.
  • When one of the transmission lines is interrupted, the delay estimation unit 230 may reproduce the acoustic signal that is not interrupted. Specifically, when the power of the received acoustic signal xN (t) is less than the threshold value and the power of the received acoustic signal xS (t) is equal to or higher than the threshold value, the delay estimation unit 230 outputs the received acoustic signal xS (t) to the speaker 40 via the input / output unit 24.
  • Conversely, when the power of the received acoustic signal xN (t) is equal to or higher than the threshold value and the power of the received acoustic signal xS (t) is less than the threshold value, the delay estimation unit 230 outputs the received acoustic signal xN (t−τ) to the speaker 40 via the input / output unit 24.
  • The delay amount τ here is the currently set delay amount. As a result, the delay estimation unit 230 can continue to reproduce the sound even when one of the transmission lines is interrupted.
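The interruption handling described above can be sketched as follows. The function names and the returned flag are hypothetical. The sketch also assumes that when both signals are above, or both below, the threshold, the delayed xN is played and delay estimation is attempted as usual; the text does not spell out the both-below case.

```python
def signal_power(buffer, mode="max"):
    """Strength index of a buffered signal: max |amplitude| or sum of |amplitude|."""
    if mode == "sum":
        return sum(abs(v) for v in buffer)
    return max((abs(v) for v in buffer), default=0.0)


def choose_output(x_n_delayed, x_s, x_n, threshold):
    """Return (signal_to_play, estimate_delay) for one buffered interval.

    x_n_delayed is xN already delayed by the currently set delay amount.
    """
    p_n = signal_power(x_n)
    p_s = signal_power(x_s)
    if p_n < threshold <= p_s:
        # Line ND looks interrupted: fall back to xS and skip estimation.
        return x_s, False
    if p_s < threshold <= p_n:
        # Line SD looks interrupted: keep playing delayed xN, skip estimation.
        return x_n_delayed, False
    # Assumed default: normal output, with estimation attempted as usual.
    return x_n_delayed, True


x_s = [0.0, 0.5, -0.4]
out, estimate = choose_output([0.0, 0.0, 0.0], x_s, [0.0, 0.0, 0.0], threshold=0.1)
```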
  • FIG. 3 is a sequence diagram showing a processing flow of the transmission system 1 in the first embodiment.
  • the transmitting terminal 10 acquires the acoustic signal x (t) from the microphone 30 (step S10).
  • the transmitting side terminal 10 transmits an acoustic signal x (t) to the receiving side terminal 20 by each of the dedicated application and the general-purpose application (step S11).
  • the acoustic signal x (t) via the dedicated application reaches the receiving terminal 20 as a received acoustic signal xN (t) via the transmission line ND.
  • the acoustic signal x (t) via the general-purpose application reaches the receiving terminal 20 as the received acoustic signal xS (t) via the transmission line SD.
  • the receiving terminal 20 receives the received acoustic signal xN (t) (step S12).
  • the receiving terminal 20 receives the received acoustic signal xS (t) (step S13).
  • The receiving terminal 20 estimates (calculates) the delay amount τ using the received acoustic signal xN (t) and the received acoustic signal xS (t) (step S14).
  • The receiving terminal 20 delays the received acoustic signal xN (t) using the estimated delay amount τ, and outputs the delayed received acoustic signal xN (t−τ) to the speaker 40 (step S16).
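  • As described above, the transmission method of the transmission system 1 of the first embodiment is a method of processing the received acoustic signal xN (t) and the received acoustic signal xS (t), both of which are obtained by picking up the same sound source as the acoustic signal x (t).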
  • the transmission method is an example of a "signal processing method”.
  • the received acoustic signal xN (t) is an example of the “first acoustic signal”.
  • the received acoustic signal xS (t) is an example of the “second acoustic signal”.
  • the communication unit 21 receives the received acoustic signal xN (t) transmitted via the transmission line ND.
  • the communication unit 21 is an example of the “first receiving unit”.
  • the transmission line ND is an example of the “first transmission line”.
  • the communication unit 21 receives the received acoustic signal xS (t) transmitted via the transmission line SD.
  • the communication unit 21 is an example of the “second receiving unit”.
  • the transmission line SD is a transmission line having a delay time related to transmission larger than that of the transmission line ND, and is an example of a “second transmission line”.
  • The delay estimation unit 230 calculates the relative delay amount τ between the received acoustic signal xN (t) and the received acoustic signal xS (t).
  • The delay estimation unit 230 is an example of a “delay amount calculation unit”.
  • The delay amount τ is an example of the “transmission delay amount”.
  • The delay unit 231 delays the received acoustic signal xN (t) based on the calculated delay amount τ, and outputs the delayed received acoustic signal xN (t−τ).
  • With the transmission system 1 of the first embodiment, it is possible to output high-quality sound at a timing synchronized with the sound output from the general-purpose application. Therefore, it is possible to transmit high-quality sound by using a popular calling application.
  • The delay estimation unit 230 calculates the delay amount τ every time the predetermined calculation timing k arrives. Each time the delay amount τ is calculated, the delay unit 231 delays the received acoustic signal xN (t) by the delay amount τ calculated this time. The delay unit 231 outputs the delayed received acoustic signal xN (t−τ). As a result, in the transmission method of the transmission system 1 of the embodiment, even if the delay amount changes from moment to moment, the change can be followed, and the received acoustic signal xN (t−τ), synchronized with the received acoustic signal xS (t), can continue to be output.
  • The delay unit 231 delays the received acoustic signal xN (t) by the delay amount τ calculated this time.
  • The delay unit 231 outputs the delayed received acoustic signal xN (t−τ).
  • the delay unit 231 delays the received acoustic signal xN (t) by a predetermined delay amount (for example, the reference value nf).
  • the delay unit 231 outputs the delayed received acoustic signal xN (t-nf).
  • the threshold value is the average of the delay amounts n (k) calculated each time the predetermined calculation timing k arrives.
  • Each time the delay amount τ is calculated, the delay unit 231 delays the received acoustic signal xN (t) stepwise, based on the delay amount calculated last time and the delay amount calculated this time.
  • As a result, the delay amount can be changed gradually, and sound interruptions and skips can be suppressed.
  • The delay estimation unit 230 calculates the delay amount n that maximizes the cross-correlation value R (n) between the time-series change of the received acoustic signal xN (t) and the time-series change of the received acoustic signal xS (t) received in the predetermined time interval T.
  • The delay estimation unit 230 sets the calculated delay amount n as the delay amount τ.
  • When the maximum cross-correlation value R (n) is equal to or higher than a predetermined threshold value, the delay estimation unit 230 sets the calculated delay amount n as the delay amount τ.
  • When the maximum cross-correlation value R (n) is less than the threshold value, the delay estimation unit 230 does not set the calculated delay amount n as the delay amount τ. This makes it possible to prevent an unreliable delay amount from being applied when the calculated delay amount is unreliable, such as when the silent state continues.
  • When the maximum level of the received acoustic signal xN (t) received in the predetermined time interval T is equal to or higher than the threshold value (second threshold value) and the maximum level of the received acoustic signal xS (t) is less than the threshold value (second threshold value), the delay amount is not calculated.
  • When the maximum level of the received acoustic signal xN (t) received in the predetermined time interval T is less than the threshold value (second threshold value) and the maximum level of the received acoustic signal xS (t) is equal to or higher than the threshold value (second threshold value), the delay amount is not calculated. As a result, when the calculated delay amount would be unreliable, such as when one of the transmission lines is interrupted, the unreliable delay amount is prevented from being applied.
  • the present embodiment differs from the above-described embodiment in that the general-purpose application transmits the video signal y (t) together with the acoustic signal x (t).
  • FIG. 4 is a block diagram showing a configuration example of the transmission system 1A according to the second embodiment.
  • the camera 50 is connected to the transmitting terminal 10.
  • a display 60 is connected to the receiving terminal 20.
  • the transmitting side terminal 10 acquires the acoustic signal x(t) from the microphone 30 via the input/output unit 14. Further, the transmitting side terminal 10 acquires the video signal y(t) from the camera 50 via the input/output unit 14. Using the dedicated application 130, the transmitting side terminal 10 transmits the acoustic signal x(t) to the receiving side terminal 20. Further, using the general-purpose application 131, the transmitting side terminal 10 transmits the acoustic signal x(t) and the video signal y(t) to the receiving side terminal 20.
  • the receiving terminal 20 receives the received acoustic signal xN(t) via the transmission line ND. Further, the receiving side terminal 20 receives the received acoustic signal xS(t) and the received video signal yS(t) via the transmission line SD. The delay estimation unit 230 calculates the relative delay amount τ between the received acoustic signal xN(t) and the received acoustic signal xS(t). The receiving terminal 20 outputs the received acoustic signal xN(t - τ), delayed by the calculated delay amount τ, to the speaker 40 via the input/output unit 24. The receiving terminal 20 outputs the received video signal yS(t) to the display 60 via the input/output unit 24.
  • FIG. 5 is a sequence diagram showing a processing flow of the transmission system 1A according to the second embodiment.
  • the transmitting terminal 10 acquires the acoustic signal x(t) from the microphone 30 and the video signal y(t) from the camera 50 (step S20).
  • the transmitting side terminal 10 transmits the acoustic signal x(t) to the receiving side terminal 20 by the dedicated application, and also transmits the acoustic signal x(t) and the video signal y(t) to the receiving side terminal 20 by the general-purpose application (step S21).
  • the acoustic signal x(t) sent via the dedicated application reaches the receiving terminal 20 as the received acoustic signal xN(t) via the transmission line ND.
  • the acoustic signal x(t) and the video signal y(t) sent via the general-purpose application reach the receiving terminal 20 as the received acoustic signal xS(t) and the received video signal yS(t) via the transmission line SD.
  • the receiving terminal 20 receives the received acoustic signal xN(t) (step S22).
  • the receiving terminal 20 receives the received acoustic signal xS(t) and the received video signal yS(t) (step S23).
  • the receiving terminal 20 estimates (calculates) the delay amount τ using the received acoustic signal xN(t) and the received acoustic signal xS(t) (step S24).
  • the receiving terminal 20 delays the received acoustic signal xN(t) using the estimated delay amount τ, outputs the delayed received acoustic signal xN(t - τ) to the speaker 40, and outputs the received video signal yS(t) to the display 60 (step S26).
  • the communication unit 21 receives the received acoustic signal xN(t) transmitted via the transmission line ND.
  • the communication unit 21 receives the received acoustic signal xS(t) and the received video signal yS(t) transmitted via the transmission line SD.
  • the delay estimation unit 230 calculates the relative delay amount τ between the received acoustic signal xN(t) and the received acoustic signal xS(t).
  • the delay unit 231 delays the received acoustic signal xN(t) based on the calculated delay amount τ, and outputs the delayed received acoustic signal xN(t - τ).
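
The delay estimation and delay steps described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment. The function names, the default threshold values, the normalization of R(n), the restriction of the search to non-negative lags (xS(t) is assumed to arrive later than xN(t)), and the reading of the reliability check as a comparison of the peak R(n) with a threshold are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def time_series_change(x: Sequence[float]) -> List[float]:
    """Sample-to-sample change of a signal (first difference)."""
    return [b - a for a, b in zip(x, x[1:])]


def cross_correlation(d_n: Sequence[float], d_s: Sequence[float], n: int) -> float:
    """Normalized R(n): d_n delayed by n samples, correlated against d_s."""
    pairs = [(d_n[i], d_s[i + n]) for i in range(len(d_n)) if 0 <= i + n < len(d_s)]
    num = sum(p * q for p, q in pairs)
    den = (sum(p * p for p, _ in pairs) * sum(q * q for _, q in pairs)) ** 0.5
    return num / den if den > 0.0 else 0.0  # silence yields R(n) = 0


def estimate_delay(
    x_n: Sequence[float],
    x_s: Sequence[float],
    max_lag: int,
    level_threshold: float = 0.01,  # hypothetical value for the level check
    corr_threshold: float = 0.5,    # hypothetical value for the peak check
) -> Optional[int]:
    """Return the lag n (in samples) maximizing R(n), or None if unreliable."""
    level_n = max((abs(v) for v in x_n), default=0.0)
    level_s = max((abs(v) for v in x_s), default=0.0)
    # Only one path carries sound (e.g. one line is interrupted): do not calculate.
    if (level_n >= level_threshold) != (level_s >= level_threshold):
        return None
    d_n, d_s = time_series_change(x_n), time_series_change(x_s)
    scores = [cross_correlation(d_n, d_s, n) for n in range(max_lag + 1)]
    best = max(range(max_lag + 1), key=lambda n: scores[n])
    # Assumed reliability check: a weak peak (e.g. during silence) is rejected.
    return best if scores[best] >= corr_threshold else None


def apply_delay(x: Sequence[float], tau: int) -> List[float]:
    """Return x(t - tau): x shifted later by tau samples, zero-padded at the start."""
    if tau <= 0:
        return list(x)
    return ([0.0] * tau + list(x))[: len(x)]
```

In this sketch a return value of None means that the caller keeps the previously applied delay amount instead of adopting a new one.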

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Telephone Function (AREA)
PCT/JP2020/023199 2020-05-11 2020-06-12 信号処理方法、信号処理装置、及びプログラム Ceased WO2021229828A1 (ja)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202080100307.2A CN115462058B (zh) 2020-05-11 2020-06-12 信号处理方法、信号处理装置及程序
JP2022522496A JP7497755B2 (ja) 2020-05-11 2020-06-12 信号処理方法、信号処理装置、及びプログラム
US17/979,974 US12119885B2 (en) 2020-05-11 2022-11-03 Signal processing method, signal processing device, and recording medium
JP2024061057A JP7694758B2 (ja) 2020-05-11 2024-04-04 信号処理方法、信号処理装置、及びプログラム

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063022591P 2020-05-11 2020-05-11
US63/022,591 2020-05-11

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/979,974 Continuation US12119885B2 (en) 2020-05-11 2022-11-03 Signal processing method, signal processing device, and recording medium

Publications (1)

Publication Number Publication Date
WO2021229828A1 (ja) 2021-11-18

Family

ID=78525559

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/023199 Ceased WO2021229828A1 (ja) 2020-05-11 2020-06-12 信号処理方法、信号処理装置、及びプログラム

Country Status (4)

Country Link
US (1) US12119885B2
JP (2) JP7497755B2
CN (1) CN115462058B
WO (1) WO2021229828A1

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021229828A1 (ja) * 2020-05-11 2021-11-18 ヤマハ株式会社 信号処理方法、信号処理装置、及びプログラム

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005151083A (ja) * 2003-11-14 2005-06-09 Ntt Docomo Inc 会話音声中継方法および会話音声中継システム
JP2008193561A (ja) 2007-02-07 2008-08-21 Casio Comput Co Ltd 画像同期システム及び画像同期方法
JP2009282896A (ja) * 2008-05-26 2009-12-03 Takamasa Takahashi 情報処理端末および情報提供システム
US20140328485A1 (en) * 2013-05-06 2014-11-06 Nvidia Corporation Systems and methods for stereoisation and enhancement of live event audio
JP2015138040A (ja) 2014-01-20 2015-07-30 ヤマハ株式会社 音楽セッションシステム、方法及び端末装置
US20170017461A1 (en) * 2015-07-16 2017-01-19 Power Chord Group Limited Personal Audio Mixer
US10182093B1 (en) 2017-09-12 2019-01-15 Yousician Oy Computer implemented method for providing real-time interaction between first player and second player to collaborate for musical performance over network

Family Cites Families (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7027593B2 (en) * 2002-05-22 2006-04-11 Avaya Technology Corp. Apparatus and method for echo control
WO2004010695A1 (ja) 2002-07-18 2004-01-29 Sharp Kabushiki Kaisha 伝送システム、伝送装置、並びに、そのプログラムおよび記録媒体
JP2005079614A (ja) 2003-08-29 2005-03-24 Toshiba Corp 移動型音声出力装置、コンテンツ再生装置、無線チャネル制御方法及び同期制御方法
JP2007096971A (ja) 2005-09-29 2007-04-12 Toshiba Corp 無線送信装置および無線受信装置
CN101731011B (zh) * 2007-05-11 2014-05-28 奥迪耐特有限公司 用于设置接收器延迟时间的方法
JP2008294599A (ja) * 2007-05-23 2008-12-04 Yamaha Corp 放収音装置、および放収音システム
JP2010011274A (ja) 2008-06-30 2010-01-14 Toshiba Corp 映像音声出力装置及び映像音声出力方法
US20110142244A1 (en) * 2008-07-11 2011-06-16 Pioneer Corporation Delay amount determination device, sound image localization device, delay amount determination method and delay amount determination processing program
WO2012048299A1 (en) * 2010-10-07 2012-04-12 Clair Brothers Audio Enterprises, Inc. Method and system for enhancing sound
JP2014003432A (ja) * 2012-06-18 2014-01-09 Sharp Corp 信号処理装置、コンテンツ出力装置、コンテンツ視聴システム
JP6127476B2 (ja) * 2012-11-30 2017-05-17 ヤマハ株式会社 ネットワーク音楽セッションにおける遅延測定方法及び装置
CN109996166B (zh) * 2014-01-16 2021-03-23 索尼公司 声音处理装置和方法、以及程序
JP5792877B1 (ja) * 2014-07-25 2015-10-14 日本電信電話株式会社 遅延時間調整装置及び方法及びプログラム
JP6201949B2 (ja) * 2014-10-08 2017-09-27 株式会社Jvcケンウッド エコーキャンセル装置、エコーキャンセルプログラム及びエコーキャンセル方法
JP6377557B2 (ja) * 2015-03-20 2018-08-22 日本電信電話株式会社 通信システム、通信方法、およびプログラム
US10191715B2 (en) * 2016-03-25 2019-01-29 Semiconductor Components Industries, Llc Systems and methods for audio playback
US9916840B1 (en) * 2016-12-06 2018-03-13 Amazon Technologies, Inc. Delay estimation for acoustic echo cancellation
JP6849055B2 (ja) * 2017-03-24 2021-03-24 ヤマハ株式会社 収音装置および収音方法
US10481859B2 (en) * 2017-12-07 2019-11-19 Powerchord Group Limited Audio synchronization and delay estimation
US10931909B2 (en) * 2018-09-18 2021-02-23 Roku, Inc. Wireless audio synchronization using a spread code
WO2021229828A1 (ja) * 2020-05-11 2021-11-18 ヤマハ株式会社 信号処理方法、信号処理装置、及びプログラム
JP7643113B2 (ja) * 2021-03-19 2025-03-11 ヤマハ株式会社 音信号処理方法および音信号処理装置
CN117501359A (zh) * 2021-06-29 2024-02-02 雅马哈株式会社 传送系统、音输出方法及程序

Also Published As

Publication number Publication date
US20230059829A1 (en) 2023-02-23
JP7497755B2 (ja) 2024-06-11
CN115462058B (zh) 2024-09-24
US12119885B2 (en) 2024-10-15
JPWO2021229828A1 2021-11-18
JP2024086794A (ja) 2024-06-28
CN115462058A (zh) 2022-12-09
JP7694758B2 (ja) 2025-06-18

Similar Documents

Publication Publication Date Title
US9942119B2 (en) Adaptive jitter buffer
AU2007349607B2 (en) Method of transmitting data in a communication system
US9773510B1 (en) Correcting clock drift via embedded sine waves
US20080114606A1 (en) Time scaling of multi-channel audio signals
CN105099795A (zh) 抖动缓冲器水平估计
US6990084B2 (en) Echo cancellation with dynamic latency adjustment
JP7694758B2 (ja) 信号処理方法、信号処理装置、及びプログラム
CN107113283B (zh) 在低延迟多媒体流式传输环境中处理有问题的模式的方法
JP4076981B2 (ja) 通信端末装置およびバッファ制御方法
CN110289013A (zh) 多音频采集源检测方法、装置、存储介质和计算机设备
JP2015012557A (ja) 映像音声処理装置、映像音声処理システム、映像音声同期方法、プログラム
US9129607B2 (en) Method and apparatus for combining digital signals
JP2021536207A (ja) 聴覚装置の環境音声信号を強化するための方法、システム、および聴覚装置
US11741933B1 (en) Acoustic signal cancelling
JP5210788B2 (ja) 音声信号通信システム、音声合成装置、音声合成処理方法、音声合成処理プログラム、並びに該プログラムを格納した記録媒体
US12155789B2 (en) Adjusting transmit audio at near-end device based on background noise at far-end device
JP2010016449A (ja) グループ通信装置及びグループ通信プログラム
US20250392632A1 (en) Method for media stream processing, electronic device, and medium
JP6103718B2 (ja) 効率的な2段構成の非同期式サンプルレートコンバータ
CN113079267B (zh) 房间内的音频会议
WO2023170677A1 (en) Acoustic signal cancelling
AU2012200349A1 (en) Method of transmitting data in a communication system

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20935595

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022522496

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020935595

Country of ref document: EP

Effective date: 20221212

122 EP: PCT application non-entry in European phase

Ref document number: 20935595

Country of ref document: EP

Kind code of ref document: A1