WO2022227625A1 - Signal processing method and device - Google Patents

Signal processing method and device

Info

Publication number
WO2022227625A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
microphone
terminal
echo cancellation
sound
Prior art date
Application number
PCT/CN2021/139274
Other languages
English (en)
French (fr)
Inventor
张晨
郑羲光
Original Assignee
北京达佳互联信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2022227625A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20: Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21: Server components or server architectures
    • H04N21/218: Source of audio or video content, e.g. local disk arrays
    • H04N21/2187: Live feed
    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63F: CARD, BOARD, OR ROULETTE GAMES; INDOOR GAMES USING SMALL MOVING PLAYING BODIES; VIDEO GAMES; GAMES NOT OTHERWISE PROVIDED FOR
    • A63F13/00: Video games, i.e. games using an electronically generated display having two or more dimensions
    • A63F13/85: Providing additional services to players
    • A63F13/87: Communicating with other players during game play, e.g. by e-mail or chat
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47: End-user applications
    • H04N21/478: Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788: Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering
    • G10L2021/02082: Noise filtering the noise being echo, reverberation of the speech

Definitions

  • the present disclosure relates to the field of communications, and in particular, to a signal processing method and device.
  • to share such content, the host usually plays it through an external speaker and captures the played content with the microphone; the captured content is then passed through an echo cancellation algorithm and transmitted to the microphone terminal (the connected-mic, or Lianmai, end) and the audience, so that the microphone terminal does not hear its own voice.
  • the present disclosure provides a signal processing method and device.
  • the present disclosure provides a signal processing method, including: calling an internal recording interface of a terminal to obtain an internal recording signal, wherein the internal recording signal includes a sound played on the terminal; performing echo cancellation processing on a microphone signal of the terminal based on the internal recording signal, wherein the microphone signal is the sound collected by the microphone of the terminal; mixing the internal recording signal and the microphone signal after echo cancellation processing to obtain a first mixed signal; and sending the first mixed signal to an external device.
  • the method further includes: receiving a sound of a first external device communicatively connected to the terminal.
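  • performing echo cancellation processing on the microphone signal of the terminal based on the internal recording signal includes: adjusting the parameters of a filter based on the internal recording signal; obtaining the echo signal corresponding to the internal recording signal through the filter with the adjusted parameters; and eliminating the echo signal from the microphone signal of the terminal.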
  • sending the first mixed signal to the external device includes: encoding the first mixed signal; and sending the encoded first mixed signal to the external device.
  • encoding the first mixed signal includes: using a standard audio encoder to encode the first mixed signal.
  • the present disclosure further provides a signal processing apparatus, including: an acquisition unit configured to call an internal recording interface of a terminal to acquire an internal recording signal, wherein the internal recording signal includes a sound played on the terminal; an echo cancellation processing unit configured to perform echo cancellation processing on the microphone signal of the terminal based on the internal recording signal, wherein the microphone signal is the sound collected by the microphone of the terminal; a mixing unit configured to mix the internal recording signal and the microphone signal after echo cancellation processing to obtain a first mixed signal; and a sending unit configured to send the first mixed signal to an external device.
  • when the internal recording signal further includes sound from a first external device communicatively connected to the terminal: the echo cancellation processing unit is further configured to perform echo cancellation processing on the internal recording signal based on the sound of the first external device; the mixing unit is further configured to mix the internal recording signal after echo cancellation processing and the microphone signal after echo cancellation processing to obtain a second mixed signal; and the sending unit is further configured to send the second mixed signal to the first external device.
  • the acquisition unit is further configured to receive the sound of the first external device communicatively connected to the terminal.
  • the echo cancellation processing unit is further configured to adjust the parameters of the filter based on the internal recording signal; obtain the echo signal corresponding to the internal recording signal through the filter with the adjusted parameters; and eliminate the echo signal from the microphone signal of the terminal.
  • the sending unit is further configured to encode the first mixed signal; and send the encoded first mixed signal to an external device.
  • the transmitting unit includes a standard audio encoder by which the first mixed signal is encoded.
  • the present disclosure also provides an electronic device, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the instructions to implement the signal processing method according to the present disclosure.
  • the present disclosure also provides a computer-readable storage medium; when the instructions in the computer-readable storage medium are executed by at least one processor, they cause the at least one processor to perform the signal processing method according to the present disclosure.
  • the present disclosure also provides a computer program product comprising computer instructions that, when executed by a processor, implement the signal processing method according to the present disclosure.
  • according to the signal processing method and device of the present disclosure, the sound played by the terminal is recorded internally through the internal recording interface of the terminal and, combined with echo cancellation based on the internally recorded sound, the sound to be shared during the live broadcast is delivered to the microphone terminal and the audience terminal. This ensures that the sound quality of the original live broadcast sound is not damaged and improves the audience's listening experience.
  • FIG. 1 is a schematic diagram of an implementation scenario of a signal processing method according to an exemplary embodiment
  • FIG. 2 is a flowchart of a signal processing method according to an exemplary embodiment
  • FIG. 3 is a schematic diagram showing a principle of echo cancellation according to an exemplary embodiment
  • FIG. 4 is an architecture diagram of a live broadcast system according to an exemplary embodiment
  • FIG. 5 is a structural diagram of a built-in echo cancellation according to an exemplary embodiment
  • FIG. 6 is a block diagram of a signal processing apparatus according to an exemplary embodiment
  • FIG. 7 is a block diagram of an electronic device 700 according to an embodiment of the present disclosure.
  • in addition to sharing game content such as soundtracks, sound effects, and commentary, anchors sometimes need to connect with fans by microphone. The audience needs to receive everything the anchor shares, such as game sound effects, music, and even microphone calls. In this case, anchors usually share the sound by playing it through an external speaker. However, external playback cannot guarantee that the sound quality of the shared original sound is preserved, which easily degrades the audio the anchor shares during the live broadcast. Specifically, the shared content is played through an external speaker, picked up again by the microphone, and only then transmitted to the audience; as a result, the audience's listening experience is poor.
  • the present disclosure provides a signal processing method, which can ensure that the sound quality of the original sound to be shared is not damaged.
  • FIG. 1 is a schematic diagram of an implementation scenario of a signal processing method according to an exemplary embodiment.
  • the implementation scenario includes a server 100, a live broadcast terminal 110, a microphone terminal 120, and a viewer terminal 130. There may be multiple microphone terminals and viewer terminals, and the terminals include but are not limited to mobile phones, personal computers, and other devices.
  • the live broadcast terminal 110, the microphone terminal 120, and the viewer terminal 130 can each have a live broadcast APP installed. The server can be a single server, a server cluster formed by several servers, a cloud computing platform, or a virtualization center.
  • the live broadcast APP calls the internal recording interface of the live broadcast terminal 110 to internally record the game soundtrack, sound effects, commentary, and other sound that needs to be sent to the viewer terminal 130, that is, the internal recording signal. Then, based on the internal recording signal, the live broadcast APP performs echo cancellation processing on the signal collected by the microphone of the live broadcast terminal 110. The echo-cancelled microphone signal and the internal recording signal are mixed and sent to the viewer terminal 130 through the server 100, so that the sound quality of the original live broadcast sound sent to the viewer terminal 130 is not damaged. This improves the audience's listening experience and solves the problem in the related art that sharing live broadcast sound by external playback damages the sound quality of the original sound.
  • when a viewer terminal 130 establishes a microphone connection with the live broadcast terminal 110, that viewer terminal becomes the microphone terminal 120.
  • the live broadcast APP continues to call the internal recording interface of the live broadcast terminal 110 to internally record the sound that needs to be sent to the viewer terminal 130 and the microphone terminal 120.
  • the live broadcast APP performs echo cancellation processing on the signal collected by the microphone of the live broadcast terminal 110 based on the internal recording signal, and then performs echo cancellation processing on the internal recording signal based on the voice signal sent by the microphone terminal 120. The echo-cancelled microphone signal is mixed with the echo-cancelled internal recording signal and sent to the microphone terminal 120 through the server 100, while the original internal recording signal is mixed with the echo-cancelled microphone signal and sent to the viewer terminal 130 through the server 100. This ensures that the sound quality of the original live broadcast sound sent to the viewer terminal 130 and the microphone terminal 120 is not damaged, improves the audience's listening experience, and solves the problem in the related art that sharing live broadcast sound by external playback damages the sound quality of the original sound.
  • Fig. 2 is a flowchart of a signal processing method according to an exemplary embodiment. As shown in Fig. 2 , the signal processing method includes the following steps S201-S204.
  • in step S201, an internal recording interface of the terminal is invoked to acquire an internal recording signal, wherein the internal recording signal includes a sound played on the terminal.
  • the live APP can obtain the sounds emitted by multiple APPs through the internal recording interface, such as the mixed signals of music, game sound effects, voice connected to the microphone, etc.
  • the above mixed signal also includes the sound played by the live APP.
  • in step S202, echo cancellation processing is performed on the microphone signal of the terminal based on the internal recording signal, wherein the microphone signal is the sound collected by the microphone of the terminal.
  • the echo signal corresponding to the internal recording signal is eliminated from the microphone signal of the terminal, so that the processed microphone signal contains only the host's voice.
  • performing echo cancellation processing on the microphone signal of the terminal based on the internal recording signal may be implemented as follows: the parameters of a filter are adjusted based on the internal recording signal, the echo signal corresponding to the internal recording signal is obtained through the filter with the adjusted parameters, and that echo signal is eliminated from the microphone signal of the terminal. Through this embodiment, the echo signal can be eliminated effectively.
  • the echo cancellation processing can use different adaptive filtering algorithms to adjust the weight vector of the filter, estimating an echo path that approximates the real echo path so as to obtain the estimated echo signal corresponding to the internal recording signal; the estimated echo signal is then removed from the mixture of clean speech and echo to achieve echo cancellation.
  • a common adaptive filter is the least mean square (LMS) filter, whose parameters can be obtained by gradient descent.
  • the W coefficients for the LMS filter are updated as follows:
  • x(n) is the signal to be eliminated, such as the internal recording signal in the above embodiment
  • y(n) is the estimated echo signal (that is, the internal recording signal collected by the microphone)
  • d(n) is the real echo signal.
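As a minimal sketch of the LMS idea described above, assuming the standard textbook form (filter output y(n) = w(n)·x(n), residual e(n) = mic(n) - y(n), update w(n+1) = w(n) + mu * e(n) * x(n)), the Python below adapts a short FIR filter on the internal recording signal and subtracts its output from the microphone signal. The function name, the tap count, and the step size mu are illustrative assumptions, not values taken from the disclosure.

```python
def lms_echo_cancel(reference, mic, taps=4, mu=0.1):
    """Remove the echo of `reference` from `mic` with a plain LMS filter.

    reference: x(n), e.g. the internal recording signal (list of floats)
    mic:       microphone samples, i.e. near-end voice plus the real echo d(n)
    Returns the residual e(n) = mic(n) - y(n), where y(n) is the estimated echo.
    """
    w = [0.0] * taps      # filter weights: current estimate of the echo path
    buf = [0.0] * taps    # latest reference samples, newest first
    out = []
    for x, m in zip(reference, mic):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))        # estimated echo y(n)
        e = m - y                                         # echo-cancelled sample
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]  # gradient-descent step
        out.append(e)
    return out
```

For a stationary reference, mu has to stay below roughly 2 divided by (taps × reference power) for the update to remain stable; practical echo cancellers often use normalized or frequency-domain variants instead of this plain form.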
  • in step S203, the internal recording signal and the microphone signal after echo cancellation processing are mixed to obtain a first mixed signal.
  • the echo-cancelled microphone signal and the internally recorded signal can be mixed together through the mixing module of the live broadcast APP.
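The mixing step can be sketched as a sample-wise weighted sum. This is only an illustration under stated assumptions: both signals are float samples in [-1.0, 1.0] at the same sample rate, and the gains and hard clipping are choices made here, not details given in the disclosure.

```python
def mix(internal, mic_clean, gain_internal=1.0, gain_mic=1.0):
    """Sample-wise weighted sum of two signals, clipped to [-1.0, 1.0]."""
    n = min(len(internal), len(mic_clean))  # mix only the overlapping part
    return [
        max(-1.0, min(1.0, gain_internal * internal[i] + gain_mic * mic_clean[i]))
        for i in range(n)
    ]
```

Hard clipping keeps the result in range but distorts loud passages; lowering the gains or applying a limiter are common alternatives.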
  • in step S204, the first mixed signal is sent to an external device.
  • transmitting the first mixed signal to the external device may first encode the first mixed signal, and then transmit the encoded first mixed signal to the external device.
  • the data transmission efficiency can be improved, the bit error rate can be reduced, and the reliability of communication can be increased.
  • encoding the first mixed signal may include: encoding the first mixed signal using a standard audio encoder.
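The disclosure does not name the encoder. As a hedged sketch of the step just before encoding, assuming the chosen encoder consumes fixed-size frames of 16-bit PCM (common for encoders such as Opus or AAC, but an assumption here), the mixed float signal could be framed as below. The helper name `to_pcm16_frames` and the frame size are hypothetical, and the encoder call itself is deliberately left out.

```python
import struct


def to_pcm16_frames(signal, frame_size):
    """Split float samples in [-1.0, 1.0] into little-endian 16-bit PCM frames."""
    frames = []
    for start in range(0, len(signal), frame_size):
        chunk = list(signal[start:start + frame_size])
        chunk += [0.0] * (frame_size - len(chunk))  # zero-pad the last frame
        # Clamp, scale to the int16 range, and truncate toward zero.
        ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in chunk]
        frames.append(struct.pack("<%dh" % frame_size, *ints))
    return frames
```

Each returned `bytes` object holds exactly `frame_size` samples, so it can be handed to whatever frame-based encoder the application uses.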
  • when the internal recording signal also includes sound from a first external device communicatively connected to the terminal (that is, the first external device sends its voice to the live broadcast APP through the server), the terminal records the sound of the first external device along with everything else it plays. In this case, echo cancellation processing is first performed on the microphone signal based on the internal recording signal; then echo cancellation processing is performed on the internal recording signal based on the sound of the first external device, and the echo-cancelled internal recording signal and the echo-cancelled microphone signal are mixed and sent to the first external device.
  • accordingly, the above signal processing method may further include: performing echo cancellation processing on the internal recording signal based on the sound of the first external device; mixing the internal recording signal after echo cancellation processing and the microphone signal after echo cancellation processing to obtain a second mixed signal; and sending the second mixed signal to the first external device.
  • the above-mentioned signal processing method further includes receiving a sound of a first external device communicatively connected to the terminal.
  • for example, when a member of the audience requests a microphone connection with the anchor of the live broadcast APP, the live broadcast APP establishes a call with that viewer based on the request; the viewer who initiated the request is referred to here as the microphone terminal.
  • the live broadcast APP receives the sound transmitted by the microphone terminal through the server and calls the internal recording interface of the terminal, which records the sound of the microphone terminal along with the other played sound. It then performs echo cancellation processing on the microphone signal of the terminal based on the internal recording signal.
  • next, echo cancellation processing is performed on the internal recording signal based on the sound from the microphone terminal, and the echo-cancelled internal recording signal is mixed with the echo-cancelled microphone signal and sent to the microphone terminal. Meanwhile, the internal recording signal before echo cancellation processing and the echo-cancelled microphone signal are sent to the audience.
  • FIG. 4 is an architecture diagram of a live broadcast system according to an exemplary embodiment.
  • the live broadcast system includes a microphone, a system mixing module, an internal recording module, a hardware output module, an algorithm processing module, a mixing module, an encoding module, a live server, a live broadcast end, a viewer end, and a microphone end. The live broadcast end includes the live broadcast APP and other APPs, and the hardware output module includes the mobile phone's speaker, headphones, Bluetooth, etc.
  • Microphone: used to capture sound.
  • System mixing module: used to mix the sound from the APPs on the mobile phone and the voice from the microphone end. For example, when multiple APPs on the mobile phone produce sound, the played music, game sound effects, and even the voice from the microphone end can be obtained and then mixed.
  • Internal recording module: uses the internal recording interface of the mobile phone to record all the sound mixed by the system, including the sound played by this APP (such as the voice from the microphone terminal) and the sound played by other applications.
  • Hardware output module: plays sound through the mobile phone's speaker, earphones, Bluetooth, etc.
  • Algorithm processing module: performs echo cancellation so that neither end of the call hears its own echo.
  • the architecture of the internal recording echo cancellation is shown in FIG. 5, taking the processing of the signal to be sent to the audience as an example.
  • the above-mentioned echo signal y_l(n) is eliminated from the microphone signal of the terminal to obtain the microphone signal processed by the algorithm, that is, x_1(n) - y_l(n). Therefore, the signal subsequently sent to the viewers contains the original internal recording signal only once, without a second copy picked up by the live broadcast microphone.
  • the echo cancellation processing process for the internally recorded signal is similar to the echo cancellation processing process for the terminal microphone signal, and will not be discussed here.
  • Mixing module: used to mix the internal recording signal recorded by the internal recording module with the microphone signal processed by the algorithm to obtain mixed signal 1, and to mix the internal recording signal processed by the algorithm with the microphone signal processed by the algorithm to obtain mixed signal 2; both mixed signals are then sent to the encoding module.
  • mixed signal 1 can be obtained as follows:
  • Encoding module: encodes mixed signal 1 and mixed signal 2 and sends the encoded signals to the live server.
  • Live server: receives the encoded mixed signal 1 and the encoded mixed signal 2, forwards the encoded mixed signal 1 to the viewer end, and forwards the encoded mixed signal 2 to the microphone (Lianmai) terminal.
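The two output paths described above (mixed signal 1 for the viewers, mixed signal 2 for the microphone terminal) can be sketched end to end as follows. This is an illustration under assumptions, not the disclosed implementation: a plain LMS filter stands in for the algorithm processing module, signals are lists of float samples, and all function names and parameters are hypothetical.

```python
def lms_cancel(reference, observed, taps=4, mu=0.1):
    """Plain LMS: subtract the adaptively filtered `reference` from `observed`."""
    w, buf, out = [0.0] * taps, [0.0] * taps, []
    for x, d in zip(reference, observed):
        buf = [x] + buf[:-1]                              # newest sample first
        e = d - sum(wi * bi for wi, bi in zip(w, buf))    # residual after cancelling
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]  # LMS weight update
        out.append(e)
    return out


def mix(a, b):
    """Sample-wise sum clipped to [-1.0, 1.0]."""
    return [max(-1.0, min(1.0, p + q)) for p, q in zip(a, b)]


def build_mixes(internal, mic, remote):
    """Return (mixed_1, mixed_2) for the viewers and the microphone terminal.

    internal: internal recording (everything played on the terminal)
    mic:      raw microphone signal of the live broadcast terminal
    remote:   voice received from the microphone terminal
    """
    mic_clean = lms_cancel(internal, mic)          # host voice without playback echo
    internal_clean = lms_cancel(remote, internal)  # playback without the remote voice
    mixed_1 = mix(internal, mic_clean)             # viewers: full original sound
    mixed_2 = mix(internal_clean, mic_clean)       # mic terminal: own voice removed
    return mixed_1, mixed_2
```

Using the unprocessed internal recording for mixed signal 1 is what keeps the shared sound at its original quality for viewers; only the microphone terminal's copy has the remote voice removed.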
  • the above embodiments share the live audio content by means of internal recording.
  • live broadcasting with the internal recording scheme preserves the original sound quality: the user terminal receives the original audio signal undamaged, which improves the quality of the content shared by live broadcast applications on the platform.
  • the combination of echo cancellation technology ensures that the microphone end will not hear its own voice, and the audience end will not hear the repeated internal recording signal.
  • Fig. 6 is a block diagram of a signal processing apparatus according to an exemplary embodiment.
  • the apparatus includes an acquisition unit 60 , an echo cancellation processing unit 62 , a mixing unit 64 and a sending unit 66 .
  • the acquisition unit 60 is configured to call the internal recording interface of the terminal to acquire the internal recording signal, wherein the internal recording signal includes the sound played on the terminal; the echo cancellation processing unit 62 is configured to perform echo cancellation processing on the microphone signal of the terminal based on the internal recording signal, wherein the microphone signal is the sound collected by the microphone of the terminal; the mixing unit 64 is configured to mix the internal recording signal and the microphone signal after echo cancellation processing to obtain the first mixed signal; and the sending unit 66 is configured to send the first mixed signal to an external device.
  • the echo cancellation processing unit 62 is further configured to perform echo cancellation processing on the internal recording signal based on the sound of the first external device; the mixing unit 64 is further configured to mix the internal recording signal after echo cancellation processing and the microphone signal after echo cancellation processing to obtain a second mixed signal; and the sending unit 66 is further configured to send the second mixed signal to the first external device.
  • the acquiring unit 60 is further configured to receive the sound of the first external device communicatively connected to the terminal.
  • the echo cancellation processing unit 62 is further configured to adjust the parameters of the filter based on the internal recording signal; obtain the echo signal corresponding to the internal recording signal through the filter with the adjusted parameters, and eliminate the echo signal from the microphone signal of the terminal .
  • the sending unit 66 is further configured to encode the first mixed signal; and send the encoded first mixed signal to an external device.
  • the sending unit 66 includes a standard audio encoder, and encodes the first mixed signal through the standard audio encoder.
  • FIG. 7 is a block diagram of an electronic device 700 according to an embodiment of the present disclosure.
  • the electronic device includes at least one memory 701 and at least one processor 702.
  • the at least one memory stores a set of computer-executable instructions. When the instruction set is executed by at least one processor, the signal processing method according to the embodiment of the present disclosure is executed.
  • the electronic device 700 may be a PC computer, a tablet device, a personal digital assistant, a smart phone, or other device capable of executing the above set of instructions.
  • the electronic device 700 is not necessarily a single electronic device, but can also be any collection of devices or circuits capable of executing the above-mentioned instructions (or instruction sets) individually or jointly.
  • Electronic device 700 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
  • processor 702 may include a central processing unit (CPU), graphics processing unit (GPU), programmable logic device, special purpose processor system, microcontroller, or microprocessor.
  • processor 702 may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
  • Processor 702 may execute instructions or code stored in memory, where memory 701 may also store data. Instructions and data may also be sent and received over a network via a network interface device, which may employ any known transport protocol.
  • the memory 701 may be integrated with the processor 702, e.g., RAM or flash memory arranged within an integrated circuit microprocessor or the like. Additionally, the memory 701 may comprise a separate device, such as an external disk drive, storage array, or any other storage device that may be used by a database system. The memory 701 and the processor 702 may be operatively coupled, or may communicate with each other, e.g., through I/O ports, network connections, etc., to enable the processor 702 to read files stored in the memory 701.
  • the electronic device 700 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device can be connected to each other via a bus and/or a network.
  • a computer-readable storage medium can also be provided, wherein, when the instructions in the computer-readable storage medium are executed by at least one processor, the at least one processor is caused to perform the signal processing method of the embodiment of the present disclosure.
  • Examples of the computer-readable storage medium herein include: Read Only Memory (ROM), Random Access Programmable Read Only Memory (PROM), Electrically Erasable Programmable Read Only Memory (EEPROM), Random Access Memory (RAM) , dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM , DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or Optical Disc Storage, Hard Disk Drive (HDD), Solid State Hard disk (SSD), card memory (such as a multimedia card, Secure Digital (SD) card, or Extreme Digital (XD) card), magnetic tape, floppy disk, magneto-optical data storage device, optical data storage device, hard disk, solid state disk, and any other apparatuses configured to store, in a non-transitory manner, a
  • the computer program in the above-mentioned computer readable storage medium can be executed in an environment deployed in a computer device such as a client, a host, a proxy device, a server, etc.
  • the computer program and any associated data, data files and data structures are distributed over networked computer systems so that the computer programs and any associated data, data files and data structures are stored, accessed and executed in a distributed fashion by one or more processors or computers.
  • a computer program product which includes computer instructions, and when the computer instructions are executed by a processor, implements the signal processing method of the embodiment of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Telephone Function (AREA)

Abstract

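The present disclosure relates to a signal processing method and device. The signal processing method includes: calling an internal recording interface of a terminal to obtain an internal recording signal, wherein the internal recording signal includes sound played on the terminal; performing echo cancellation processing on a microphone signal of the terminal based on the internal recording signal, wherein the microphone signal is sound collected by the microphone of the terminal; mixing the internal recording signal and the echo-cancelled microphone signal to obtain a first mixed signal; and sending the first mixed signal to an external device.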

Description

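Signal processing method and apparatus
This application claims priority to Chinese Patent Application No. 202110469586.6, filed with the China Patent Office on April 28, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of communications, and in particular to a signal processing method and apparatus.
Background
With the rise of live-streaming culture, a wide variety of live-streaming formats have emerged, and interaction between streamers, or between streamers and viewers, has become increasingly frequent; common examples include PK co-streaming sessions and karaoke rooms. During a call, the microphone usually picks up an echo of the other party's voice caused by reflections in the room, which seriously degrades call quality. Echo cancellation is therefore indispensable in two-way call scenarios, and its use in calls is already very mature. In some terminal live-streaming scenarios, however, there is a viewer terminal in addition to the streamer terminal and the co-streaming terminal, and the viewer terminal needs to receive everything the streamer shares, including game sound effects, music and the co-streaming call. To share this content, the streamer usually plays it through a loudspeaker and captures it with the microphone; the captured content is then passed through an echo cancellation algorithm and transmitted to the co-streaming terminal and the viewer terminal, so that the co-streaming terminal does not hear its own voice.
Summary
The present disclosure provides a signal processing method and apparatus.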
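In some embodiments, the present disclosure provides a signal processing method, including: calling an internal recording interface of a terminal to acquire an internal recording signal, where the internal recording signal includes sound played on the terminal; performing echo cancellation on a microphone signal of the terminal based on the internal recording signal, where the microphone signal is the sound captured by the microphone of the terminal; mixing the internal recording signal with the echo-cancelled microphone signal to obtain a first mixed signal; and sending the first mixed signal to an external apparatus.
In some embodiments, the method further includes: receiving sound from a first external apparatus communicatively connected to the terminal.
In some embodiments, performing echo cancellation on the microphone signal of the terminal based on the internal recording signal includes: adjusting parameters of a filter based on the internal recording signal; obtaining, through the filter with the adjusted parameters, an echo signal corresponding to the internal recording signal; and removing the echo signal from the microphone signal of the terminal.
In some embodiments, sending the first mixed signal to the external apparatus includes: encoding the first mixed signal; and sending the encoded first mixed signal to the external apparatus.
In some embodiments, encoding the first mixed signal includes: encoding the first mixed signal with a standard audio encoder.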
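In some embodiments, the present disclosure further provides a signal processing apparatus, including: an acquisition unit configured to call an internal recording interface of a terminal to acquire an internal recording signal, where the internal recording signal includes sound played on the terminal; an echo cancellation unit configured to perform echo cancellation on a microphone signal of the terminal based on the internal recording signal, where the microphone signal is the sound captured by the microphone of the terminal; a mixing unit configured to mix the internal recording signal with the echo-cancelled microphone signal to obtain a first mixed signal; and a sending unit configured to send the first mixed signal to an external apparatus.
In some embodiments, when the internal recording signal further includes sound from a first external apparatus communicatively connected to the terminal: the echo cancellation unit is further configured to perform echo cancellation on the internal recording signal based on the sound of the first external apparatus; the mixing unit is further configured to mix the echo-cancelled internal recording signal with the echo-cancelled microphone signal to obtain a second mixed signal; and the sending unit is further configured to send the second mixed signal to the first external apparatus.
In some embodiments, the acquisition unit is further configured to receive sound from the first external apparatus communicatively connected to the terminal.
In some embodiments, the echo cancellation unit is further configured to adjust parameters of a filter based on the internal recording signal; obtain, through the filter with the adjusted parameters, an echo signal corresponding to the internal recording signal; and remove the echo signal from the microphone signal of the terminal.
In some embodiments, the sending unit is further configured to encode the first mixed signal and send the encoded first mixed signal to the external apparatus.
In some embodiments, the sending unit includes a standard audio encoder and encodes the first mixed signal with the standard audio encoder.
In some embodiments, the present disclosure further provides an electronic device, including: a processor; and a memory for storing instructions executable by the processor; where the processor is configured to execute the instructions to implement the signal processing method according to the present disclosure.
In some embodiments, the present disclosure further provides a computer-readable storage medium; when instructions in the computer-readable storage medium are run by at least one processor, they cause the at least one processor to perform the signal processing method according to the present disclosure as described above.
In some embodiments, the present disclosure further provides a computer program product including computer instructions which, when executed by a processor, implement the signal processing method according to the present disclosure.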
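According to the signal processing method and apparatus of the present disclosure, the sound played by the terminal is recorded through the terminal's internal recording interface, and echo cancellation is applied on the basis of the recorded sound, so that the sound to be shared during a live stream is delivered to the co-streaming terminal and the viewer terminal. This keeps the sound quality of the original live-stream audio intact and improves the viewers' listening experience.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory and do not limit the present disclosure.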
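Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure; they do not unduly limit the present disclosure.
FIG. 1 is a schematic diagram of an implementation scenario of a signal processing method according to an exemplary embodiment;
FIG. 2 is a flowchart of a signal processing method according to an exemplary embodiment;
FIG. 3 is a schematic diagram of the principle of echo cancellation according to an exemplary embodiment;
FIG. 4 is an architecture diagram of a live-streaming system according to an exemplary embodiment;
FIG. 5 is an architecture diagram of internal-recording echo cancellation according to an exemplary embodiment;
FIG. 6 is a block diagram of a signal processing apparatus according to an exemplary embodiment;
FIG. 7 is a block diagram of an electronic device 700 according to an embodiment of the present disclosure.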
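Detailed Description
To enable those of ordinary skill in the art to better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings.
It should be noted that the terms "first", "second" and the like in the specification, the claims and the above drawings of the present disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present disclosure described herein can be implemented in orders other than those illustrated or described herein. The implementations described in the following embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
It should also be noted that "at least one of several items" in the present disclosure covers three parallel cases: "any one of the items", "a combination of any several of the items", and "all of the items". For example, "including at least one of A and B" covers the following three parallel cases: (1) including A; (2) including B; (3) including A and B. Likewise, "performing at least one of step one and step two" covers the following three parallel cases: (1) performing step one; (2) performing step two; (3) performing step one and step two.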
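At present, in some live-streaming scenarios (such as game streaming), in addition to sharing game content such as the soundtrack, sound effects and commentary, the streamer sometimes also needs to co-stream with fans. The co-streaming terminal needs to receive the audio and video content shared by the streamer, while the viewer terminal needs to receive everything the streamer shares, including game sound effects, music and the co-streaming call. In this situation streamers usually share sound through loudspeaker playback, but loudspeaker playback cannot guarantee that the sound quality of the original shared audio is preserved, which tends to degrade the quality of audio sharing during the stream. Specifically, when the shared content is played through a loudspeaker and then picked up by the microphone before being transmitted to the viewer terminal, it passes through both the loudspeaker and the microphone; its sound quality is severely degraded, the quality of audio sharing suffers, and the viewers' listening experience is poor.
To address the above problem, the present disclosure provides a signal processing method that keeps the sound quality of the original shared audio intact. The following description takes a game live-streaming scenario as an example.
FIG. 1 is a schematic diagram of an implementation scenario of a signal processing method according to an exemplary embodiment. As shown in FIG. 1, the implementation scenario includes a server 100, a streamer terminal 110, a co-streaming terminal 120 and a viewer terminal 130. There may be multiple co-streaming terminals and viewer terminals, which include but are not limited to devices such as mobile phones and personal computers. A live-streaming app may be installed on the streamer terminal 110, the co-streaming terminal 120 and the viewer terminal 130. The server may be a single server, a server cluster composed of several servers, a cloud computing platform, or a virtualization center.
During a game live stream, the live-streaming app calls the internal recording interface of the streamer terminal 110 to record internally the game soundtrack, sound effects, commentary and so on that need to be sent to the viewer terminal 130; this is the internal recording signal. The app then performs echo cancellation on the signal captured by the microphone of the streamer terminal 110 based on the internal recording signal, mixes the echo-cancelled microphone signal with the internal recording signal, and sends the result to the viewer terminal 130 via the server 100. The original live-stream audio sent to the viewer terminal 130 therefore suffers no loss of sound quality, which improves the viewers' listening experience and solves the problem in the related art that sharing live-stream sound through loudspeaker playback degrades the original audio.
In addition, if a viewer terminal requests to co-stream with the streamer terminal 110 during the game live stream and the streamer terminal 110 establishes a co-streaming call with it, that viewer terminal becomes the co-streaming terminal 120. The live-streaming app continues to call the internal recording interface of the streamer terminal 110 to record internally the game soundtrack, sound effects and commentary together with the voice signal sent by the co-streaming terminal 120, all of which need to be sent to the viewer terminal 130 and the co-streaming terminal 120; this is the internal recording signal. The app then performs echo cancellation on the signal captured by the microphone of the streamer terminal 110 based on the internal recording signal, and performs echo cancellation on the internal recording signal based on the voice signal sent by the co-streaming terminal 120. The echo-cancelled microphone signal is mixed with the echo-cancelled internal recording signal and sent to the co-streaming terminal 120 via the server 100, while the unprocessed internal recording signal is mixed with the echo-cancelled microphone signal and sent to the viewer terminal 130 via the server 100. The original live-stream audio sent to the viewer terminal 130 and the co-streaming terminal 120 therefore suffers no loss of sound quality, which improves the viewers' listening experience and solves the problem in the related art that sharing live-stream sound through loudspeaker playback degrades the original audio.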
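The signal processing method and apparatus according to exemplary embodiments of the present disclosure are described in detail below with reference to FIGS. 2 to 7.
FIG. 2 is a flowchart of a signal processing method according to an exemplary embodiment. As shown in FIG. 2, the signal processing method includes the following steps S201 to S204.
In step S201, the internal recording interface of the terminal is called to acquire an internal recording signal, where the internal recording signal includes sound played on the terminal. For example, when multiple apps on the terminal are producing sound, the live-streaming app can obtain, through the internal recording interface, the mixed signal of the sounds produced by these apps, such as music, game sound effects and the co-streaming terminal's voice. Note that this mixed signal also contains the sound played by the live-streaming app itself.
Returning to FIG. 2, in step S202, echo cancellation is performed on the microphone signal of the terminal based on the internal recording signal, where the microphone signal is the sound captured by the microphone of the terminal. This step removes from the microphone signal the echo signal corresponding to the internal recording signal, so that the processed microphone signal contains only the streamer's speech.
According to an exemplary embodiment of the present disclosure, echo cancellation of the microphone signal of the terminal based on the internal recording signal may be implemented as follows: the parameters of a filter are adjusted based on the internal recording signal; the filter with the adjusted parameters is then used to obtain the echo signal corresponding to the internal recording signal; and the echo signal is removed from the microphone signal of the terminal. This embodiment removes the echo signal effectively.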
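For example, echo cancellation may use different adaptive filtering algorithms to adjust the weight vector of a filter, estimating an approximate echo path that approaches the real echo path. This yields an estimated echo signal corresponding to the internal recording signal, and removing this estimate from the mixture of clean speech and echo cancels the echo. Specifically, the principle of echo cancellation is shown in FIG. 3. A common adaptive filter is the least mean square (LMS) adaptive filter; during echo cancellation, the filter parameters can be obtained by gradient descent. The coefficients W of the LMS filter are updated as follows:
Filter function modelling the echo path:
[Equation (1): image PCTCN2021139274-appb-000001]
Difference: $e(n) = d(n) - y(n)$    (2)
Mean squared error: $F[e(n)] = E[e^2(n)] = E[d^2(n) - 2d(n)y(n) + y^2(n)]$    (3)
Here, x(n) is the signal to be cancelled, such as the internal recording signal in the above embodiment; y(n) is the estimated echo signal (that is, the internal recording signal as captured by the microphone); and d(n) is the real echo signal. W is updated iteratively using formulas (1)-(3) until the mean squared error is minimized, and the W corresponding to the minimum mean squared error is taken as the final filter parameters. Once W has been trained, the internal recording signal is fed into the filter parameterized by W to obtain the estimated echo signal, which is then removed from the microphone signal of the terminal to cancel the echo.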
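The LMS procedure described above can be illustrated with a minimal sketch. The image for equation (1) is not available in this text, so the sketch assumes the usual FIR form $y(n) = W^T X(n)$ with $X(n) = [x(n), x(n-1), \dots]$ and the standard stochastic-gradient update $W \leftarrow W + \mu\, e(n) X(n)$; this may differ in detail from the formula in the original figure. Because the real echo is not observable separately, the sketch uses the microphone sample as $d(n)$. The function names, tap count, step size and toy echo path are all assumptions made for illustration.

```python
import math
import random


def lms_echo_cancel(reference, mic, num_taps=8, step=0.05):
    """Cancel the echo of `reference` in `mic` with an LMS adaptive FIR filter.

    reference: internal recording signal x(n), a list of floats
    mic:       microphone signal (near-end sound plus an echo of x(n))
    Returns (residual, weights), where residual[n] = mic[n] - y(n).
    """
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # X(n) = [x(n), x(n-1), ..., x(n-num_taps+1)]
    residual = []
    for x_n, d_n in zip(reference, mic):
        window = [x_n] + window[:-1]
        y_n = sum(w * x for w, x in zip(weights, window))  # estimated echo y(n)
        e_n = d_n - y_n                                     # difference, cf. equation (2)
        # Stochastic gradient step on e(n)^2: W <- W + step * e(n) * X(n)
        weights = [w + step * e_n * x for w, x in zip(weights, window)]
        residual.append(e_n)
    return residual, weights


def simulate_microphone(reference, echo_path, near_end):
    """Toy microphone model: near-end sound plus `reference` filtered by `echo_path`."""
    mic = []
    for n, voice in enumerate(near_end):
        echo = sum(h * reference[n - k] for k, h in enumerate(echo_path) if n >= k)
        mic.append(voice + echo)
    return mic


if __name__ == "__main__":
    random.seed(0)
    length = 4000
    internal = [random.uniform(-1.0, 1.0) for _ in range(length)]       # x(n)
    voice = [0.05 * math.sin(0.02 * math.pi * n) for n in range(length)]  # stand-in for speech
    mic = simulate_microphone(internal, [0.6, 0.3, -0.2], voice)
    cleaned, w = lms_echo_cancel(internal, mic)
    print([round(v, 2) for v in w[:3]])  # should approach the toy echo path
```

In this toy setup the learned weights approximate the simulated echo path, and the residual approaches the near-end signal. A production canceller would typically add step-size normalization and double-talk handling, neither of which is described in the source.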
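Returning to FIG. 2, in step S203, the internal recording signal is mixed with the echo-cancelled microphone signal to obtain a first mixed signal. For example, the mixing module of the live-streaming app may mix the echo-cancelled microphone signal with the internally recorded signal.
In step S204, the first mixed signal is sent to an external apparatus.
According to an exemplary embodiment of the present disclosure, sending the first mixed signal to the external apparatus may involve first encoding the first mixed signal and then sending the encoded first mixed signal to the external apparatus. This embodiment can improve data transmission efficiency, reduce the bit error rate and increase the reliability of communication.
According to an exemplary embodiment of the present disclosure, encoding the first mixed signal may include: encoding the first mixed signal with a standard audio encoder.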
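The disclosure does not name a particular encoder. Frame-based audio encoders generally consume fixed-size blocks of PCM samples, so a mixed signal held as floating-point samples would typically be quantized and framed before encoding. The following is a minimal sketch of that preparation step only; the function name, the 48 kHz sample rate, the 20 ms frame length and the zero-padding of the last frame are assumptions, and the encoder call itself is deliberately left out.

```python
import struct


def to_pcm16_frames(samples, sample_rate=48000, frame_ms=20):
    """Quantize float samples in [-1, 1] to 16-bit PCM and split them into frames.

    Returns a list of bytes objects (little-endian int16), one per frame.
    Out-of-range samples are clipped; a trailing partial frame is zero-padded.
    """
    frame_len = sample_rate * frame_ms // 1000
    pcm = [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]
    if len(pcm) % frame_len:
        pcm.extend([0] * (frame_len - len(pcm) % frame_len))
    return [
        struct.pack("<%dh" % frame_len, *pcm[i:i + frame_len])
        for i in range(0, len(pcm), frame_len)
    ]
```

Each returned frame could then be handed to whichever encoder the application uses.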
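It should be noted that the present disclosure also covers the case where the internal recording signal further includes sound from a first external apparatus communicatively connected to the terminal, that is, the case where the first external apparatus sends voice to the live-streaming app through the server. In this case the terminal records the sound of the first external apparatus as part of the internal recording and then performs echo cancellation on the microphone signal based on that internal recording signal. At the same time, to prevent the first external apparatus from hearing its own sound, echo cancellation is also performed on the internal recording signal based on the sound of the first external apparatus. The echo-cancelled internal recording signal and the echo-cancelled microphone signal are then mixed and sent to the first external apparatus.
According to an exemplary embodiment of the present disclosure, when the internal recording signal further includes sound from a first external apparatus communicatively connected to the terminal, the above signal processing method may further include: performing echo cancellation on the internal recording signal based on the sound of the first external apparatus; mixing the echo-cancelled internal recording signal with the echo-cancelled microphone signal to obtain a second mixed signal; and sending the second mixed signal to the first external apparatus. This embodiment ensures that the signal sent to the first external apparatus does not include the first external apparatus's own sound, which avoids echo at the first external apparatus.
According to an exemplary embodiment of the present disclosure, the above signal processing method further includes receiving sound from the first external apparatus communicatively connected to the terminal.
For example, when one of the many viewer terminals sends a request to co-stream with the streamer using the live-streaming app, the app establishes a call with that viewer terminal based on the request; the requesting viewer terminal is then referred to as the co-streaming terminal. The live-streaming app receives the sound transmitted by the co-streaming terminal through the server and calls the internal recording interface of the terminal so that the co-streaming terminal's sound is recorded as part of the internal recording. It then performs echo cancellation on the microphone signal of the terminal based on the internal recording signal and, at the same time, performs echo cancellation on the internal recording signal based on the co-streaming terminal's sound. The echo-cancelled internal recording signal and the echo-cancelled microphone signal are then mixed and sent to the co-streaming terminal. During this process, the internal recording signal before echo cancellation and the echo-cancelled microphone signal are also sent to the viewer terminal.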
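The above embodiment is described below taking a mobile-phone-based live-streaming system as an example. FIG. 4 is an architecture diagram of a live-streaming system according to an exemplary embodiment. As shown in FIG. 4, the live-streaming system includes a microphone, a system mixing module, an internal recording module, a hardware output module, an algorithm processing module, a mixing module, a live-streaming server, an encoding module, a streamer terminal, a viewer terminal and a co-streaming terminal. The streamer terminal includes the live-streaming app and other apps, and the hardware output module includes the phone loudspeaker, earphones, Bluetooth and so on.
Microphone: captures sound.
System mixing module: mixes together the sounds produced by the apps on the phone, the co-streaming terminal's voice and so on. For example, when multiple apps on the phone are producing sound, the live-streaming app can obtain the music being played and the game sound effects, and can also obtain the co-streaming terminal's voice; the obtained content is then mixed.
Internal recording module: uses the phone's internal recording interface to record all sound after system mixing, including sound played by this app (such as the co-streaming terminal's voice) and sound played by other applications.
Hardware output module: plays sound through the phone loudspeaker, earphones, Bluetooth and so on.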
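Algorithm processing module: performs echo cancellation so that neither party to the call hears its own echo. The framework of internal-recording echo cancellation is shown in FIG. 5. Taking the processing of the signal to be sent to the viewer terminal as an example, let $x_1(n)$ be the streamer-side microphone signal, $x_2(n)$ the co-streaming-side microphone signal, $x_3(n)$ the input signal from other applications, and $x_l(n) = x_2(n) + x_3(n)$ the internal recording signal. Passing the internal recording signal through AEC (adaptive echo cancellation) gives the echo signal $y_l(n)$ corresponding to the internal recording signal, where $y_l(n)$ is obtained by the following formula:
[Equation (4): image PCTCN2021139274-appb-000002]
The echo signal $y_l(n)$ is then removed from the microphone signal of the terminal to obtain the algorithm-processed microphone signal, $x_1(n) - y_l(n)$. This removes the internal recording signal picked up by the streamer-side microphone, so the signal subsequently sent to the viewer terminal contains only the original internal recording signal and does not additionally contain the copy captured by the streamer's microphone. Echo cancellation of the internal recording signal is similar to that of the terminal microphone signal and is not elaborated here.
Mixing module: mixes the internal recording signal recorded by the internal recording module with the algorithm-processed microphone signal to obtain mixed signal 1, and mixes the algorithm-processed internal recording signal with the algorithm-processed microphone signal to obtain mixed signal 2; it then sends mixed signal 1 and mixed signal 2 to the encoding module. For example, mixed signal 1 can be obtained as follows:
$y(n) = x_1(n) - y_l(n) + x_2(n) + x_3(n)$    (5)
Encoding module: encodes mixed signal 1 and mixed signal 2 and sends the encoded mixed signal 1 and the encoded mixed signal 2 to the live-streaming server.
Live-streaming server: receives the encoded mixed signal 1 and the encoded mixed signal 2, forwards the encoded mixed signal 1 to the viewer terminal, and forwards the encoded mixed signal 2 to the co-streaming terminal.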
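The two mixes produced by the mixing module can be sketched as follows, assuming the echo-cancelled signals are already available as equal-length lists of float samples. Mixed signal 1 follows equation (5); mixed signal 2 replaces the internal recording with its echo-cancelled version. The function names and the clipping to [-1, 1] are assumptions added for illustration; the source does not specify how overflow is handled.

```python
def mix(*signals, limit=1.0):
    """Sum equal-length signals sample by sample and clip to [-limit, limit]."""
    if len({len(s) for s in signals}) != 1:
        raise ValueError("all signals must have the same length")
    return [max(-limit, min(limit, sum(frame))) for frame in zip(*signals)]


def build_mixes(mic_clean, internal, internal_clean):
    """Return (mix_1, mix_2) for the viewer terminal and the co-streaming terminal.

    mic_clean:      microphone signal after echo cancellation, x1(n) - yl(n)
    internal:       internal recording signal, xl(n) = x2(n) + x3(n)
    internal_clean: internal recording after cancelling the co-streamer's voice
    """
    mix_1 = mix(mic_clean, internal)        # equation (5): viewers hear everything
    mix_2 = mix(mic_clean, internal_clean)  # the co-streamer does not hear itself
    return mix_1, mix_2
```

With ideal cancellation, `internal_clean` would equal the other-application signal $x_3(n)$, so mixed signal 2 carries the streamer's voice and the shared content but not the co-streamer's own voice.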
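In summary, the above embodiment shares live-stream audio content by internal recording. Streaming with the internal recording scheme preserves the original sound quality: the user end receives the undamaged original audio signal, which improves the experience of sharing content through live-streaming applications on the platform. Combined with echo cancellation, the scheme also ensures that the co-streaming terminal does not hear its own voice and that the viewer terminal does not hear a duplicated internal recording signal.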
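FIG. 6 is a block diagram of a signal processing apparatus according to an exemplary embodiment. Referring to FIG. 6, the apparatus includes an acquisition unit 60, an echo cancellation unit 62, a mixing unit 64 and a sending unit 66.
The acquisition unit 60 is configured to call the internal recording interface of the terminal to acquire an internal recording signal, where the internal recording signal includes sound played on the terminal. The echo cancellation unit 62 is configured to perform echo cancellation on the microphone signal of the terminal based on the internal recording signal, where the microphone signal is the sound captured by the microphone of the terminal. The mixing unit 64 is configured to mix the internal recording signal with the echo-cancelled microphone signal to obtain a first mixed signal. The sending unit 66 is configured to send the first mixed signal to an external apparatus.
According to an embodiment of the present disclosure, when the internal recording signal further includes sound from a first external apparatus communicatively connected to the terminal, the echo cancellation unit 62 is further configured to perform echo cancellation on the internal recording signal based on the sound of the first external apparatus; the mixing unit 64 is further configured to mix the echo-cancelled internal recording signal with the echo-cancelled microphone signal to obtain a second mixed signal; and the sending unit 66 is further configured to send the second mixed signal to the first external apparatus.
According to an embodiment of the present disclosure, the acquisition unit 60 is further configured to receive sound from the first external apparatus communicatively connected to the terminal.
According to an embodiment of the present disclosure, the echo cancellation unit 62 is further configured to adjust parameters of a filter based on the internal recording signal; obtain, through the filter with the adjusted parameters, an echo signal corresponding to the internal recording signal; and remove the echo signal from the microphone signal of the terminal.
According to an embodiment of the present disclosure, the sending unit 66 is further configured to encode the first mixed signal and send the encoded first mixed signal to the external apparatus.
According to an embodiment of the present disclosure, the sending unit 66 includes a standard audio encoder and encodes the first mixed signal with the standard audio encoder.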
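One way to picture how the four units of FIG. 6 cooperate on the first mixed signal is a small class that composes them. This is only an illustrative sketch: the class name, the callable-based interfaces and the argument order are assumptions and are not taken from the disclosure.

```python
class SignalProcessingApparatus:
    """Composes the four units of FIG. 6, each injected as a plain callable."""

    def __init__(self, acquire, cancel_echo, mix, send):
        self.acquire = acquire          # acquisition unit 60: () -> internal recording
        self.cancel_echo = cancel_echo  # echo cancellation unit 62: (reference, target) -> cleaned
        self.mix = mix                  # mixing unit 64: (signal_a, signal_b) -> mixed
        self.send = send                # sending unit 66: (signal) -> None

    def process(self, mic):
        """Run the acquire, cancel, mix and send steps once; return the first mixed signal."""
        internal = self.acquire()
        mic_clean = self.cancel_echo(internal, mic)
        first_mix = self.mix(internal, mic_clean)
        self.send(first_mix)
        return first_mix
```

Injecting the units as callables keeps the sketch independent of any particular recording API, echo canceller or transport.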
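According to an embodiment of the present disclosure, an electronic device may be provided. FIG. 7 is a block diagram of an electronic device 700 according to an embodiment of the present disclosure. The electronic device includes at least one memory 701 and at least one processor 702, the at least one memory storing a set of computer-executable instructions which, when executed by the at least one processor, performs the signal processing method according to the embodiment of the present disclosure.
As an example, the electronic device 700 may be a PC, a tablet device, a personal digital assistant, a smartphone, or another device capable of executing the above set of instructions. The electronic device 700 does not have to be a single electronic device; it may also be any collection of devices or circuits capable of executing the above instructions (or instruction sets) individually or jointly. The electronic device 700 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (for example, via wireless transmission).
In the electronic device 700, the processor 702 may include a central processing unit (CPU), a graphics processing unit (GPU), a programmable logic device, a special-purpose processor system, a microcontroller or a microprocessor. By way of example and not limitation, the processor 702 may also include an analog processor, a digital processor, a microprocessor, a multi-core processor, a processor array, a network processor and the like.
The processor 702 may run instructions or code stored in the memory, and the memory 701 may also store data. Instructions and data may also be sent and received over a network via a network interface device, which may use any known transmission protocol.
The memory 701 may be integrated with the processor 702, for example by arranging RAM or flash memory within an integrated circuit microprocessor or the like. In addition, the memory 701 may include a separate device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The memory 701 and the processor 702 may be operatively coupled, or may communicate with each other, for example through an I/O port or a network connection, so that the processor 702 can read files stored in the memory 701.
In addition, the electronic device 700 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse or touch input device). All components of the electronic device may be connected to one another via a bus and/or a network.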
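According to an embodiment of the present disclosure, a computer-readable storage medium may also be provided; when instructions in the computer-readable storage medium are run by at least one processor, they cause the at least one processor to perform the signal processing method of the embodiment of the present disclosure. Examples of the computer-readable storage medium here include: read-only memory (ROM), random access programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disc storage, hard disk drive (HDD), solid state drive (SSD), card memory (such as a multimedia card, Secure Digital (SD) card or Extreme Digital (XD) card), magnetic tape, floppy disk, magneto-optical data storage device, optical data storage device, hard disk, solid state disk, and any other apparatus configured to store, in a non-transitory manner, a computer program and any associated data, data files and data structures, and to provide them to a processor or computer so that the processor or computer can execute the computer program. The computer program in the above computer-readable storage medium can run in an environment deployed on a computer device such as a client, a host, a proxy device or a server. In addition, in one example, the computer program and any associated data, data files and data structures are distributed over networked computer systems, so that they are stored, accessed and executed in a distributed fashion by one or more processors or computers.
According to an embodiment of the present disclosure, a computer program product is provided, including computer instructions which, when executed by a processor, implement the signal processing method of the embodiment of the present disclosure.
All embodiments of the present disclosure may be carried out alone or in combination with other embodiments, and all such cases are regarded as falling within the scope of protection claimed by the present disclosure.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.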

Claims (20)

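  1. A signal processing method, comprising:
    calling an internal recording interface of a terminal to acquire an internal recording signal, wherein the internal recording signal includes sound played on the terminal;
    performing echo cancellation on a microphone signal of the terminal based on the internal recording signal, wherein the microphone signal is the sound captured by the microphone of the terminal;
    mixing the internal recording signal with the echo-cancelled microphone signal to obtain a first mixed signal; and
    sending the first mixed signal to an external apparatus.
  2. The signal processing method according to claim 1, wherein the method further comprises:
    in response to determining that the internal recording signal further includes sound from a first external apparatus communicatively connected to the terminal, performing echo cancellation on the internal recording signal based on the sound of the first external apparatus;
    mixing the echo-cancelled internal recording signal with the echo-cancelled microphone signal to obtain a second mixed signal; and
    sending the second mixed signal to the first external apparatus.
  3. The signal processing method according to claim 2, wherein the method further comprises:
    receiving the sound of the first external apparatus communicatively connected to the terminal.
  4. The signal processing method according to claim 1, wherein performing echo cancellation on the microphone signal of the terminal based on the internal recording signal comprises:
    adjusting parameters of a filter based on the internal recording signal;
    obtaining, through the filter with the adjusted parameters, an echo signal corresponding to the internal recording signal; and
    removing the echo signal from the microphone signal of the terminal.
  5. The signal processing method according to claim 1, wherein sending the first mixed signal to the external apparatus comprises:
    encoding the first mixed signal; and
    sending the encoded first mixed signal to the external apparatus.
  6. The signal processing method according to claim 5, wherein encoding the first mixed signal comprises:
    encoding the first mixed signal with a standard audio encoder.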
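  7. A signal processing apparatus, comprising:
    an acquisition unit configured to call an internal recording interface of a terminal to acquire an internal recording signal, wherein the internal recording signal includes sound played on the terminal;
    an echo cancellation unit configured to perform echo cancellation on a microphone signal of the terminal based on the internal recording signal, wherein the microphone signal is the sound captured by the microphone of the terminal;
    a mixing unit configured to mix the internal recording signal with the echo-cancelled microphone signal to obtain a first mixed signal; and
    a sending unit configured to send the first mixed signal to an external apparatus.
  8. The signal processing apparatus according to claim 7, wherein, in response to determining that the internal recording signal further includes sound from a first external apparatus communicatively connected to the terminal, the echo cancellation unit is further configured to perform echo cancellation on the internal recording signal based on the sound of the first external apparatus;
    the mixing unit is further configured to mix the echo-cancelled internal recording signal with the echo-cancelled microphone signal to obtain a second mixed signal; and
    the sending unit is further configured to send the second mixed signal to the first external apparatus.
  9. The signal processing apparatus according to claim 8, wherein the acquisition unit is further configured to receive the sound of the first external apparatus communicatively connected to the terminal.
  10. The signal processing apparatus according to claim 7, wherein the echo cancellation unit is further configured to adjust parameters of a filter based on the internal recording signal; obtain, through the filter with the adjusted parameters, an echo signal corresponding to the internal recording signal; and remove the echo signal from the microphone signal of the terminal.
  11. The signal processing apparatus according to claim 7, wherein the sending unit is further configured to encode the first mixed signal and send the encoded first mixed signal to the external apparatus.
  12. The signal processing apparatus according to claim 11, wherein the sending unit includes a standard audio encoder and encodes the first mixed signal with the standard audio encoder.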
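  13. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to execute the instructions to implement the following steps:
    calling an internal recording interface of the electronic device to acquire an internal recording signal, wherein the internal recording signal includes sound played on the electronic device;
    performing echo cancellation on a microphone signal of the electronic device based on the internal recording signal, wherein the microphone signal is the sound captured by the microphone of the electronic device;
    mixing the internal recording signal with the echo-cancelled microphone signal to obtain a first mixed signal; and
    sending the first mixed signal to an external apparatus.
  14. The electronic device according to claim 13, wherein the processor is further configured to:
    in response to determining that the internal recording signal further includes sound from a first external apparatus communicatively connected to the electronic device, perform echo cancellation on the internal recording signal based on the sound of the first external apparatus;
    mix the echo-cancelled internal recording signal with the echo-cancelled microphone signal to obtain a second mixed signal; and
    send the second mixed signal to the first external apparatus.
  15. The electronic device according to claim 14, wherein the processor is further configured to:
    receive the sound of the first external apparatus communicatively connected to the electronic device.
  16. The electronic device according to claim 13, wherein, for performing echo cancellation on the microphone signal of the electronic device based on the internal recording signal, the processor is configured to:
    adjust parameters of a filter based on the internal recording signal;
    obtain, through the filter with the adjusted parameters, an echo signal corresponding to the internal recording signal; and
    remove the echo signal from the microphone signal of the electronic device.
  17. The electronic device according to claim 13, wherein, for sending the first mixed signal to the external apparatus, the processor is configured to:
    encode the first mixed signal; and
    send the encoded first mixed signal to the external apparatus.
  18. The electronic device according to claim 17, wherein, for encoding the first mixed signal, the processor is configured to:
    encode the first mixed signal with a standard audio encoder.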
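  19. A non-volatile computer-readable storage medium, wherein instructions in the computer-readable storage medium, when run by at least one processor, cause the at least one processor to perform the signal processing method according to any one of claims 1 to 6.
  20. A computer program product including computer instructions, wherein the computer instructions, when executed by a processor, implement the signal processing method according to any one of claims 1 to 6.
PCT/CN2021/139274 2021-04-28 2021-12-17 Signal processing method and apparatus WO2022227625A1 (zh)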

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110469586.6A CN113225574B (zh) 2021-04-28 2021-04-28 Signal processing method and apparatus
CN202110469586.6 2021-04-28

Publications (1)

Publication Number Publication Date
WO2022227625A1 true WO2022227625A1 (zh) 2022-11-03

Family

ID=77089817

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/139274 WO2022227625A1 (zh) 2021-04-28 2021-12-17 Signal processing method and apparatus

Country Status (2)

Country Link
CN (1) CN113225574B (zh)
WO (1) WO2022227625A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113225574B (zh) * 2021-04-28 2023-01-20 北京达佳互联信息技术有限公司 Signal processing method and apparatus

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001097045A1 (en) * 2000-06-09 2001-12-20 Veazy Inc. Application specific live streaming multimedia mixer apparatus, systems and methods
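CN109166589A (zh) * 2018-08-13 2019-01-08 深圳市腾讯网络信息技术有限公司 Application sound suppression method, apparatus, medium and device
CN110956969A (zh) * 2019-11-28 2020-04-03 北京达佳互联信息技术有限公司 Live-streaming audio processing method and apparatus, electronic device and storage medium
CN111445901A (zh) * 2020-03-26 2020-07-24 北京达佳互联信息技术有限公司 Audio data acquisition method and apparatus, electronic device and storage medium
CN111583952A (zh) * 2020-05-19 2020-08-25 北京达佳互联信息技术有限公司 Audio processing method and apparatus, electronic device and storage medium
CN113225574A (zh) * 2021-04-28 2021-08-06 北京达佳互联信息技术有限公司 Signal processing method and apparatus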

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004274681A (ja) * 2003-03-12 2004-09-30 Matsushita Electric Ind Co Ltd Echo cancellation device, echo cancellation method, program and recording medium
US20140133648A1 (en) * 2008-03-06 2014-05-15 Andrzej Czyzewski Method and apparatus for acoustic echo cancellation in voip terminal
CN109767777A (zh) * 2019-01-31 2019-05-17 迅雷计算机(深圳)有限公司 Audio mixing method for live-streaming software
CN111372121A (zh) * 2020-03-16 2020-07-03 北京文香信息技术有限公司 Echo cancellation method and apparatus, storage medium and processor

Also Published As

Publication number Publication date
CN113225574B (zh) 2023-01-20
CN113225574A (zh) 2021-08-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21939084

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE